Missing index files: which demultiplexing method should we choose?

We just received two batches of 16S V4 region sequencing data (PE150 on an Illumina NovaSeq) from a collaborator. The sequencing followed the EMP protocol. However, the index files are missing: for reasons unknown they were never transferred to us, and the originals have since been deleted.

Since the barcode information may still be present in the headers of the forward and reverse reads from the Illumina platform, we see two options for dealing with this.

  1. Demultiplex with cutadapt demux-paired
    demux-paired: Demultiplex paired-end sequence data with barcodes in-sequence (QIIME 2 2021.2.0 documentation)
    I've never tried this before. It has no --p-rev-comp-barcodes or --p-rev-comp-mapping-barcodes argument, so I'm not sure: does this command handle reverse-complemented barcodes?

  2. Use extract_barcodes.py in QIIME 1 to pull out a barcodes.fastq.gz, then demultiplex in QIIME 2 with qiime demux emp-paired
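
For reference, here is a minimal Python sketch of the idea behind option 2, not a replacement for extract_barcodes.py. It assumes the reads carry Casava 1.8+ style headers whose last colon-separated field is the index sequence, and that every FASTQ record is exactly four lines. The function names are hypothetical, and the quality string is a placeholder because the header holds no quality scores for the index read.

```python
import gzip


def barcode_from_header(header: str) -> str:
    """Return the index field of a Casava 1.8+ style FASTQ header.

    Assumed layout: "@<read id> <read>:<filtered>:<control>:<index>".
    """
    parts = header.rstrip().split(" ", 1)
    if len(parts) != 2 or ":" not in parts[1]:
        raise ValueError(f"no index field in header: {header!r}")
    return parts[1].rsplit(":", 1)[1]


def write_barcodes(fastq_lines, out):
    """Write one barcode record per read from an iterable of FASTQ lines.

    Assumes strict four-line records. The header carries no quality
    scores for the index read, so a placeholder quality ("I") is used.
    """
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:  # header line of each record
            barcode = barcode_from_header(line)
            out.write(f"{line.rstrip()}\n{barcode}\n+\n{'I' * len(barcode)}\n")


def extract_to_file(forward_path: str, barcodes_path: str) -> None:
    """Read a gzipped forward FASTQ and write a gzipped barcodes FASTQ."""
    with gzip.open(forward_path, "rt") as fin, gzip.open(barcodes_path, "wt") as fout:
        write_barcodes(fin, fout)
```

Calling extract_to_file("forward.fastq.gz", "barcodes.fastq.gz") would produce the third file that emp-paired expects. If the last header field turns out to be a sample number rather than a sequence, the barcodes cannot be recovered this way.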

For now I prefer the second option. I'd appreciate your suggestions and comments.
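
On the reverse-complement question under option 1: I don't know how the plugin treats barcode orientation, but one workaround that doesn't depend on the answer is to add a reverse-complemented barcode column to the metadata and point --m-forward-barcodes-column at whichever column matches. A rough sketch follows. The column name barcode-sequence-rc and the function names are made up for illustration, and reading and writing the TSV file is left out.

```python
# Translation table for DNA bases, including N and lowercase letters.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def add_revcomp_column(rows, src="barcode-sequence", dst="barcode-sequence-rc"):
    """Return copies of metadata rows with an extra reverse-complemented column.

    `rows` is a list of dicts, one per sample; the input rows are not modified.
    """
    return [{**row, dst: reverse_complement(row[src])} for row in rows]
```

Running the demultiplexing once per column and comparing how many reads end up unmatched should show which orientation the reads actually carry.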

I tried the first option with the command below, but it returned an error message:

qiime cutadapt demux-paired \
  --i-seqs multiplexed-seqs.qza \
  --m-forward-barcodes-file metadata.txt \
  --m-forward-barcodes-column barcode-sequence \
  --o-per-sample-sequences demux_seqs_lane1.qza \
  --o-untrimmed-sequences demux_seqs_lane1_unmatched_sequence.qza

Error message:

Command '['cutadapt', '--front', 'file:/tmp/tmputu16oe9', '--error-rate', '0.1', '--minimum-length', '1', '-o', '/tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-2a32zl31/{name}.1.fastq.gz', '--untrimmed-output', '/tmp/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-izjzicef/forward.fastq.gz', '-p', '/tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-2a32zl31/{name}.2.fastq.gz', '--untrimmed-paired-output', '/tmp/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-izjzicef/reverse.fastq.gz', '/tmp/qiime2-archive-dhlu_h_z/94ec6b95-924f-46a3-9f32-712951b0c2c1/data/forward.fastq.gz', '/tmp/qiime2-archive-dhlu_h_z/94ec6b95-924f-46a3-9f32-712951b0c2c1/data/reverse.fastq.gz']' returned non-zero exit status 1.

Forward.fastq.gz

Reverse.fastq.gz

@Nicholas_Bokulich may I have your suggestions on how to solve this issue?

@Claire010,

I think using cutadapt demux-paired is the right approach based on what you have posted so far. Can you try running your command again, but this time add the --verbose flag to it and post the results here?

@Keegan-Evans thanks for your reply. Below is the output.

qiime cutadapt demux-paired \
  --i-seqs multiplexed-seqs.qza \
  --m-forward-barcodes-file metadata.txt \
  --m-forward-barcodes-column barcode-sequence \
  --o-per-sample-sequences demux_seqs_lane1.qza \
  --o-untrimmed-sequences demux_seqs_lane1_unmatched_sequence.qza \
  --verbose

Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: cutadapt --front file:/tmp/tmpceiha8pf --error-rate 0.1 --minimum-length 1 -o /tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-mbi4su5z/{name}.1.fastq.gz --untrimmed-output /tmp/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-o5xiv04p/forward.fastq.gz -p /tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-mbi4su5z/{name}.2.fastq.gz --untrimmed-paired-output /tmp/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-o5xiv04p/reverse.fastq.gz /tmp/qiime2-archive-nyzr0wuv/5521c098-365e-442f-808d-1941dc0cba34/data/forward.fastq.gz /tmp/qiime2-archive-nyzr0wuv/5521c098-365e-442f-808d-1941dc0cba34/data/reverse.fastq.gz

This is cutadapt 3.2 with Python 3.6.13
Command line parameters: --front file:/tmp/tmpceiha8pf --error-rate 0.1 --minimum-length 1 -o /tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-mbi4su5z/{name}.1.fastq.gz --untrimmed-output /tmp/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-o5xiv04p/forward.fastq.gz -p /tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-mbi4su5z/{name}.2.fastq.gz --untrimmed-paired-output /tmp/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-o5xiv04p/reverse.fastq.gz /tmp/qiime2-archive-nyzr0wuv/5521c098-365e-442f-808d-1941dc0cba34/data/forward.fastq.gz /tmp/qiime2-archive-nyzr0wuv/5521c098-365e-442f-808d-1941dc0cba34/data/reverse.fastq.gz
Traceback (most recent call last):
File "/home/jxu/miniconda3/envs/qiime2-2021.2/bin/cutadapt", line 10, in <module>
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/cutadapt/main.py", line 845, in main_cli
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/cutadapt/main.py", line 899, in main
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/cutadapt/main.py", line 438, in open_output_files
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/cutadapt/main.py", line 504, in open_demultiplex_out
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/cutadapt/utils.py", line 167, in xopen
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/xopen/__init__.py", line 609, in xopen
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/xopen/__init__.py", line 519, in _open_gz
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/gzip.py", line 53, in open
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/gzip.py", line 163, in __init__
OSError: [Errno 24] Too many open files: '/tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-mbi4su5z/020.66154.48M.1.fastq.gz'
Traceback (most recent call last):
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/q2cli/commands.py", line 329, in __call__
results = action(**arguments)
File "", line 2, in demux_paired
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
output_types, provenance)
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 390, in callable_executor
output_views = self._callable(**view_args)
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/q2_cutadapt/_demux.py", line 226, in demux_paired
error_rate, mux_fmt, batch_size, minimum_length)
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/q2_cutadapt/_demux.py", line 177, in _demux
run_command(cmd)
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/q2_cutadapt/_demux.py", line 37, in run_command
subprocess.run(cmd, check=True)
File "/home/jxu/miniconda3/envs/qiime2-2021.2/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['cutadapt', '--front', 'file:/tmp/tmpceiha8pf', '--error-rate', '0.1', '--minimum-length', '1', '-o', '/tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-mbi4su5z/{name}.1.fastq.gz', '--untrimmed-output', '/tmp/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-o5xiv04p/forward.fastq.gz', '-p', '/tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-mbi4su5z/{name}.2.fastq.gz', '--untrimmed-paired-output', '/tmp/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-o5xiv04p/reverse.fastq.gz', '/tmp/qiime2-archive-nyzr0wuv/5521c098-365e-442f-808d-1941dc0cba34/data/forward.fastq.gz', '/tmp/qiime2-archive-nyzr0wuv/5521c098-365e-442f-808d-1941dc0cba34/data/reverse.fastq.gz']' returned non-zero exit status 1.

Plugin error from cutadapt:

Command '['cutadapt', '--front', 'file:/tmp/tmpceiha8pf', '--error-rate', '0.1', '--minimum-length', '1', '-o', '/tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-mbi4su5z/{name}.1.fastq.gz', '--untrimmed-output', '/tmp/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-o5xiv04p/forward.fastq.gz', '-p', '/tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-mbi4su5z/{name}.2.fastq.gz', '--untrimmed-paired-output', '/tmp/q2-MultiplexedPairedEndBarcodeInSequenceDirFmt-o5xiv04p/reverse.fastq.gz', '/tmp/qiime2-archive-nyzr0wuv/5521c098-365e-442f-808d-1941dc0cba34/data/forward.fastq.gz', '/tmp/qiime2-archive-nyzr0wuv/5521c098-365e-442f-808d-1941dc0cba34/data/reverse.fastq.gz']' returned non-zero exit status 1.

See above for debug info.

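The traceback ends in OSError: [Errno 24] Too many open files, which usually means cutadapt is writing more per-sample output files at once than your system allows. The default --p-batch-size of 0 processes all samples at once, so try setting it to a smaller positive number of samples, and let us know if it helps!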

--p-batch-size INTEGER  The number of samples cutadapt demultiplexes
    Range(0, None)        concurrently. Demultiplexing in smaller batches will
                          yield the same result with marginal speed loss, and
                          may solve "too many files" errors related to sample
                          quantity. Set to "0" to process all samples at once.
                                                                  [default: 0]
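
If you are unsure what batch size to start with, a rough estimate can be derived from the per-process limit on open files. The sketch below is only a heuristic. It assumes two output files stay open per sample (the {name}.1.fastq.gz and {name}.2.fastq.gz pair seen in the command above) and reserves an arbitrary 64 descriptors for everything else; both numbers and the function names are my own assumptions, not something the plugin documents.

```python
import resource  # Unix-only standard library module


def suggest_batch_size(soft_limit: int, files_per_sample: int = 2, reserved: int = 64) -> int:
    """Estimate how many samples fit under an open-file limit.

    `reserved` is a guessed allowance for input files, logs, and so on.
    Always returns at least 1.
    """
    usable = soft_limit - reserved
    return max(1, usable // files_per_sample)


def current_soft_limit() -> int:
    """Return the soft limit on open file descriptors for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    soft = current_soft_limit()
    if soft == resource.RLIM_INFINITY:
        print("no per-process limit on open files")
    else:
        print(f"soft limit {soft}: try --p-batch-size {suggest_batch_size(soft)}")
```

Treat the result as an upper bound and go lower if the error persists. Smaller batches give the same result with only a marginal speed loss, as the help text above notes.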
