Hi everyone!
I'm trying to use the cutadapt plugin to demultiplex sequence data for 7500 samples from a 175 GiB file imported with "qiime tools import". The paired-end reads are dual-barcoded. I used the following command line:
qiime cutadapt demux-paired \
  --i-seqs multiplexed-seqs.qza \
  --m-forward-barcodes-file sample-metadata.tsv \
  --m-forward-barcodes-column Forward_Barcode \
  --m-reverse-barcodes-file sample-metadata.tsv \
  --m-reverse-barcodes-column Reverse_Barcode \
  --p-error-rate 0 \
  --p-batch-size 500 \
  --o-per-sample-sequences per_sample_sequences.qza \
  --o-untrimmed-sequences untrimmed_sequences.qza \
  --p-cores 8
The first run of this command failed because the disk ran out of storage space. When I re-ran it, I got the error message "Plugin error from cutadapt:". The last 50 lines of the log file are:
WARNING: Adapter 'AACTCCATT' (regular 5') was specified multiple times! Please make sure that this is what you want.
WARNING: Adapter 'AACTCCATT' (regular 5') was specified multiple times! Please make sure that this is what you want.
WARNING: Adapter 'AACTCCATT' (regular 5') was specified multiple times! Please make sure that this is what you want.
WARNING: Adapter 'AACTCCATT' (regular 5') was specified multiple times! Please make sure that this is what you want.
WARNING: Adapter 'AACTCCATT' (regular 5') was specified multiple times! Please make sure that this is what you want.
WARNING: Adapter 'AACTCCATT' (regular 5') was specified multiple times! Please make sure that this is what you want.
WARNING: Adapter 'AACTCCATT' (regular 5') was specified multiple times! Please make sure that this is what you want.
Processing paired-end reads on 8 cores ...
Traceback (most recent call last):
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/bin/cutadapt", line 8, in <module>
sys.exit(main_cli())
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/cutadapt/__main__.py", line 1014, in main_cli
main(sys.argv[1:])
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/cutadapt/__main__.py", line 1100, in main
stats = r.run()
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/cutadapt/pipeline.py", line 946, in run
workers, connections = self._start_workers()
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/cutadapt/pipeline.py", line 941, in _start_workers
worker.start()
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 54, in _launch
child_r, parent_w = os.pipe()
OSError: [Errno 24] Too many open files
Traceback (most recent call last):
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/q2cli/commands.py", line 339, in __call__
results = action(**arguments)
File "<decorator-gen-304>", line 2, in demux_paired
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/qiime2/sdk/action.py", line 234, in bound_callable
outputs = self._callable_executor_(scope, callable_args,
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/qiime2/sdk/action.py", line 381, in _callable_executor_
output_views = self._callable(**view_args)
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/q2_cutadapt/_demux.py", line 229, in demux_paired
untrimmed = _demux(
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/q2_cutadapt/_demux.py", line 180, in _demux
run_command(cmd)
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/q2_cutadapt/_demux.py", line 37, in run_command
subprocess.run(cmd, check=True)
File "/Users/albert_wu/opt/miniconda3/envs/qiime2-2022.8/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['cutadapt', '--front', 'file:/var/folders/yp/bctrpfn15qdg9wwwlwxkdhrh0000gn/T/tmpc4fao8nx', '--error-rate', '0', '--minimum-length', '1', '-o', '/var/folders/yp/bctrpfn15qdg9wwwlwxkdhrh0000gn/T/q2-CasavaOneEightSingleLanePerSampleDirFmt-me9lnaq2/{name}.1.fastq.gz', '--untrimmed-output', '/var/folders/yp/bctrpfn15qdg9wwwlwxkdhrh0000gn/T/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-dbeefimn/forward.fastq.gz', '--pair-adapters', '-G', 'file:/var/folders/yp/bctrpfn15qdg9wwwlwxkdhrh0000gn/T/tmppm6tkq2m', '-p', '/var/folders/yp/bctrpfn15qdg9wwwlwxkdhrh0000gn/T/q2-CasavaOneEightSingleLanePerSampleDirFmt-me9lnaq2/{name}.2.fastq.gz', '--untrimmed-paired-output', '/var/folders/yp/bctrpfn15qdg9wwwlwxkdhrh0000gn/T/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-dbeefimn/reverse.fastq.gz', '/var/folders/yp/bctrpfn15qdg9wwwlwxkdhrh0000gn/T/qiime2/albert_wu/data/4fec7665-f9f0-4440-84f7-5c72dd15accc/data/forward.fastq.gz', '/var/folders/yp/bctrpfn15qdg9wwwlwxkdhrh0000gn/T/qiime2/albert_wu/data/4fec7665-f9f0-4440-84f7-5c72dd15accc/data/reverse.fastq.gz', '-j', '8']' returned non-zero exit status 1.
I then lowered "--p-batch-size" from 500 to 400, 300, 200, 100, 50, and so on. Only when it reached 5 did the command run successfully.
I even removed the qiime2-2022.8 environment on my Mac with "conda remove -n qiime2-2022.8 --all" and re-created it, but the command still only works with a batch size of 5.
I've checked my MacBook's resource limits, including the number of available file descriptors:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 2560
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 2784
virtual memory (kbytes, -v) unlimited
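To put the "open files" limit above next to the batch sizes I tried, here is a minimal sketch in Python. It is only my own rough estimate, not something taken from cutadapt or QIIME 2. The helper name min_output_files is hypothetical, and it only counts the files visible in the cutadapt command in the traceback: two per-sample outputs ({name}.1.fastq.gz and {name}.2.fastq.gz), two untrimmed outputs, and two inputs. It deliberately ignores pipes, worker processes, and anything else cutadapt opens internally, so the real number is presumably higher (the traceback fails in os.pipe() while starting workers).

```python
import resource


def min_output_files(batch_size: int) -> int:
    """Lower bound on files open for one batch of paired-end demultiplexing.

    Assumption: one forward and one reverse output file per sample in the
    batch, plus the two untrimmed outputs and the two input files. Pipes and
    worker-process overhead are NOT counted.
    """
    per_sample = 2 * batch_size  # {name}.1.fastq.gz and {name}.2.fastq.gz
    untrimmed = 2                # forward.fastq.gz and reverse.fastq.gz
    inputs = 2                   # multiplexed forward and reverse reads
    return per_sample + untrimmed + inputs


# Soft and hard limits on open file descriptors for the current process
# (the soft limit corresponds to "open files (-n)" in the listing above).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-files limit: soft={soft}, hard={hard}")

for batch_size in (500, 100, 5):
    needed = min_output_files(batch_size)
    print(f"batch size {batch_size}: at least {needed} files")
```

By this count, a batch size of 500 needs at least 1004 open files, which is still below my limit of 2560, so this simple estimate alone does not explain the failure.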
Could anyone help me figure out what went wrong with my command, and why it could run with a batch size of 500 at first but now only runs with 5?
Thanks a lot!