Is this a bug in QIIME 2 (version 2019.10), in the plugin "qiime cutadapt demux-paired"?

I used QIIME 2, version 2019.10, to analyze my 16S data. The data are paired-end reads, and the file structure is as follows:


These two files have not been demultiplexed yet, so I want to use the plugin “qiime cutadapt demux-paired” to demultiplex all samples. The command I used is listed below.

qiime cutadapt demux-paired \
  --i-seqs 16s_paired-end-undemultiplexed.qza \
  --m-forward-barcodes-file barcode_16s \
  --m-forward-barcodes-column forward_barcode \
  --m-reverse-barcodes-file barcode_16s \
  --m-reverse-barcodes-column reverse_barcode \
  --o-per-sample-sequences demultiplexed_sequences_16s.qza \
  --o-untrimmed-sequences untrimmed_sequences_16s.qza

However, a problem occurred: although the log did not indicate any error or warning, the output shows that most of the reads (~80%) were discarded. I repeatedly confirmed that the barcodes are correct.

Then I tried a second method for comparison: I treated the two paired-end files as two single-end files and demultiplexed each one with its corresponding barcode (forward or reverse). The commands are listed below. With this method most of the reads were retained, approximately 80% - 90%.

# forward demultiplexing
qiime cutadapt demux-single \
  --i-seqs 16s_f_multiplexed-seqs.qza \
  --m-barcodes-file barcode_16s \
  --m-barcodes-column forward_barcode \
  --o-per-sample-sequences 16s_f_demultiplexed-seqs.qza \
  --o-untrimmed-sequences 16s_f_untrimmed.qza

# reverse demultiplexing
qiime cutadapt demux-single \
  --i-seqs 16s_r_multiplexed-seqs.qza \
  --m-barcodes-file barcode_16s \
  --m-barcodes-column reverse_barcode \
  --o-per-sample-sequences 16s_r_demultiplexed-seqs.qza \
  --o-untrimmed-sequences 16s_r_untrimmed.qza

So, what may be causing this problem? Is there something wrong with my parameters that led to this loss of reads? In addition, is the second method feasible?

Hello Kai,

Thank you for bringing this interesting question to the forums!

This issue is surprising, as you have confirmed that the barcodes are correct and work with single end reads, so now we have got to figure out what’s messing up the paired reads!

Have you (pre)processed your reads outside of Qiime before running qiime cutadapt demux-paired? I wonder if cutadapt needs the reads to be in the same order in both files, and if some sort of processing changed their order or labels…
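One way to test that guess is to compare the read IDs of the two files record by record. The following is only a minimal sketch, assuming uncompressed FASTQ files with exactly four lines per record; the function names `read_ids` and `first_mismatch` are made up for illustration and are not part of Qiime or cutadapt.

```python
from itertools import zip_longest


def read_ids(fastq_path):
    """Yield the read ID of every record in an uncompressed 4-line FASTQ file."""
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 0:  # header line of each record
                # Drop the leading '@' and anything after the first whitespace.
                fields = line[1:].split()
                read_id = fields[0] if fields else ""
                # Older headers mark the mate with a trailing /1 or /2.
                if read_id.endswith(("/1", "/2")):
                    read_id = read_id[:-2]
                yield read_id


def first_mismatch(forward_path, reverse_path):
    """Return (record_index, forward_id, reverse_id) for the first pair of
    records whose IDs differ, or None if both files agree throughout.
    A missing record in the shorter file is reported as None."""
    pairs = zip_longest(read_ids(forward_path), read_ids(reverse_path))
    for index, (forward_id, reverse_id) in enumerate(pairs):
        if forward_id != reverse_id:
            return index, forward_id, reverse_id
    return None
```

If `first_mismatch("forward.fastq", "reverse.fastq")` (hypothetical file names) returns `None`, the two files list the same reads in the same order; anything else points at the first record where they diverge.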

It’s a mystery! 🕵️‍♀️ 🔎


Thank you so much.
I cut the first 12 bp from the 5’ end of every read with a Python script. Then I removed the duplicates in these reads. I found roughly 1,000,000 or more different barcodes across all the reads. In addition to the target barcodes, other barcodes are also highly abundant. Considering that barcodes may contain errors, and given the result of the second method, I think the most likely cause is what you suggested: cutadapt needs the reads to be in the same order, and some sort of processing changed their order or labels.
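A barcode tally of this kind can be made without modifying or deduplicating the read files themselves. This is only a rough sketch, assuming an uncompressed 4-line FASTQ file and a barcode at the very start of each read; `count_barcodes` and its `barcode_length` parameter are hypothetical names, not the script actually used here.

```python
from collections import Counter


def count_barcodes(fastq_path, barcode_length=12):
    """Count how often each read prefix of `barcode_length` bases occurs.

    The input file is only read, never rewritten, so the original reads
    stay intact and in their original order.
    """
    counts = Counter()
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # sequence line of each record
                counts[line.strip()[:barcode_length]] += 1
    return counts
```

For example, `count_barcodes("forward.fastq").most_common(20)` (hypothetical file name) would list the 20 most abundant 12-bp prefixes, which can then be compared against the barcode column of the metadata file.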

So, as you said, how can I pre-process these data before running cutadapt so that the reads and labels are in the right order? I am not good at programming. The data came from the sequencing company without any processing.

Hello Kai,

Ah OK! I think we have found the issue.

Based on the process you have described, I think the issue could be removing duplicates. Deduplication would remove many reads, but is not recommended this early in the pipeline for 16S amplicon reads.
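To get a feel for how many reads deduplication would discard, you could measure the share of reads that are exact copies of another read. This is just an illustrative sketch; `duplicate_fraction` is a made-up helper, and it only looks at exact sequence matches.

```python
def duplicate_fraction(sequences):
    """Return the fraction of sequences that exact deduplication would drop.

    Example: four reads with two distinct sequences -> 2 of 4 reads dropped.
    """
    sequences = list(sequences)
    if not sequences:
        return 0.0
    return 1 - len(set(sequences)) / len(sequences)
```

In amplicon data many reads are expected to be identical because they come from the same organisms, so a high value here is normal and not a sign that those reads should be removed before demultiplexing.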

Try demultiplexing with your raw reads, then trimming inside of Qiime 2 as shown in this part of the Atacama soil microbiome tutorial.

“I cut the first 12 bp”

You can do that here too, when you run the qiime dada2 denoise-paired command, by passing these settings: --p-trim-left-f 12 --p-trim-left-r 12

Let me know what you find!

