High read counts when demultiplexing with cutadapt


I am using Cutadapt 3.4 with QIIME 2 version 2019.10 to demultiplex data. For one sample, I am getting an exceedingly high read count in the files generated via cutadapt (2618203 total reads, while the median across samples is 82629 reads per sample). In another dataset, I am getting similar results for a single sample (2385025 total reads against a dataset median of 87458 reads). Additionally, after taxonomic classification, the majority of these outlier samples' features are low-resolution and cannot be identified beyond the kingdom level, if at all.

I also demultiplexed one of the datasets with demux emp-single after extracting barcodes in QIIME 1. This produced much more even read counts across all samples, including the sample that had the inflated count via cutadapt (67370 reads, dataset median 68153).

Samples in both datasets were amplified using the F515 and R806 primers (Bokulich). The F515 primer was barcoded in both cases, and the outlier samples in the two datasets have different barcodes from each other. Both libraries were sequenced on an IonTorrent GeneStudio S5 system. I was wondering if anyone else had run into similar results using cutadapt.

qiime demux emp-single \
  --i-seqs /home/pf412/Desktop/2010_4_glucosamine/multiplexed_seqs.qza \
  --m-barcodes-file /home/pf412/Desktop/2010_4_glucoseamine/Mapfile.txt \
  --m-barcodes-column BarcodeSequence \
  --o-per-sample-sequences /home/pf412/Desktop/2010_4_glucoseamine/demuxempsingle/per-sample-sequences.qza

qiime cutadapt demux-single \
  --i-seqs /home/pf412/Desktop/2010_4_glucoseamine/multiplexed_seqs.qza \
  --m-barcodes-file /home/pf412/Desktop/2010_4_glucoseamine/Mapfile.txt \
  --m-barcodes-column BarcodeSequence \
  --p-error-rate 0 \
  --o-per-sample-sequences /home/pf412/Desktop/2010_4_glucoseamine/demultiplexed_seqs.qza \
  --o-untrimmed-sequences /home/pf412/Desktop/2010_4_glucoseamine/untrimmed_seqs.qza

Hi @pfinnegan, sorry for the slow reply.

What does your BarcodeSequence column look like? If you aren’t specifying the anchor, you might want to give that a shot:
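For context on why anchoring can matter: without an anchor, cutadapt will accept a barcode match anywhere in its 5' search region, so a read belonging to another sample that happens to contain one barcode's sequence internally can be misassigned to that barcode's sample, inflating its count. A minimal Python sketch with made-up sequences (not your data) illustrates the difference between "match anywhere" and "match only at the 5' end":

```python
# Hypothetical reads; barcode and sequences are invented for illustration.
reads = [
    "ACGTACGTTTTGGGCCC",  # genuinely starts with the barcode
    "TTTACGTACGTGGGCCC",  # barcode-like motif occurs internally only
]
barcode = "ACGTACGT"

# Unanchored: the barcode may match anywhere in the read.
unanchored = [r for r in reads if barcode in r]

# Anchored (like cutadapt's ^BARCODE): match only at the 5' end.
anchored = [r for r in reads if r.startswith(barcode)]

print(len(unanchored))  # 2 -- the internal hit is misassigned
print(len(anchored))    # 1 -- only the genuine read
```

One way to try anchoring here, assuming the plugin passes the column values through to cutadapt unchanged, is to prefix each barcode in the BarcodeSequence metadata column with `^`.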