Thank you for this; it helped me understand some of the potential issues. I've changed my code to the following, taking on board your suggestions to name the target region and to reduce the number of jobs (I've tried running it with anywhere from 1 to 8 jobs), but I'm still running into the same error:
qiime rescript get-ncbi-data \
  --p-query 'txid6656[ORGN] AND (cytochrome c oxidase subunit 1[Title] OR cytochrome c oxidase subunit I[Title] OR cytochrome oxidase subunit 1[Title] OR cytochrome oxidase subunit I[Title] OR COX1[Title] OR CO1[Title] OR COI[Title]) NOT environmental sample[Title] NOT environmental samples[Title] NOT environmental[Title] NOT uncultured[Title] NOT unclassified[Title] NOT unidentified[Title] NOT unverified[Title]' \
  --p-n-jobs 2 \
  --o-sequences NCBI_Arthropoda/ncbi-refseqs-unfiltered.qza \
  --o-taxonomy NCBI_Arthropoda/ncbi-refseqs-taxonomy-unfiltered.qza
Error from running this:
WARNING:2025-04-12 13:48:11,243:MainProcess:This query could result in more than 100 requests to NCBI. If you are not running it on the weekend or between 9 pm and 5 am Eastern Time weekdays, it may result in NCBI blocking your IP address. See Policies and Disclaimers - NCBI for details.
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/joblib/externals/loky/process_executor.py", line 661, in wait_result_broken_or_wakeup
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/multiprocessing/connection.py", line 250, in recv
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/multiprocessing/connection.py", line 421, in _recv_bytes
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/multiprocessing/connection.py", line 386, in _recv
MemoryError
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 530, in __call__
results = self._execute_action(
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 602, in _execute_action
results = action(**arguments)
File "", line 2, in get_ncbi_data
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 299, in bound_callable
outputs = self.callable_executor(
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 570, in callable_executor
output_views = self._callable(**view_args)
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/rescript/ncbi.py", line 83, in get_ncbi_data
seqs, taxa = _get_ncbi_data(query, accession_ids, ranks, rank_propagation,
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/rescript/ncbi.py", line 122, in _get_ncbi_data
seqs, taxids = get_data_for_query(
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/rescript/ncbi.py", line 397, in get_data_for_query
chunky = parallel(delayed(_get_query_chunk)(chunk, params, entrez_delay,
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/joblib/parallel.py", line 2007, in __call__
return output if self.return_generator else list(output)
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/joblib/parallel.py", line 1650, in _get_outputs
yield from self._retrieve()
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/joblib/parallel.py", line 1754, in _retrieve
self._raise_error_fast()
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/joblib/parallel.py", line 1789, in _raise_error_fast
error_job.get_result(self.timeout)
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/joblib/parallel.py", line 745, in get_result
return self._return_or_raise()
File "/home/klunn94/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/joblib/parallel.py", line 763, in _return_or_raise
raise self._result
joblib.externals.loky.process_executor.BrokenProcessPool: A result has failed to un-serialize. Please ensure that the objects returned by the function are always picklable.
Do you have any further suggestions?