Dada2 trimming question

I have a question about trimming in dada2. We received our demultiplexed paired-end data from our genomics core, which used the 16S Metagenomic Sequencing Library Preparation protocol.

We gave a presentation on our data, and somebody said that the primers are still in the sequence data from our core. Do we have to remove primers before running dada2? I checked several of the raw fastq files. The forward reads always start with CCTACGGG and the reverse reads always start with NACTAC. I do not see that they match the primers; am I right?

My last question: how should I set the trim and trunc parameters in dada2 to trim primers like the ones I found above?


If you compare the start of your sequences to the Illumina adapter/primer sequences, you will see that they match the second half of each sequence.



Those are the 341F and 805R primers, and they need to be removed. You can do so here with trim-left-f 17 and trim-left-r 21. These values are the lengths of the forward and reverse primers as they appear at the start of your reads.
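As a rough check of where 17 and 21 come from, here is a minimal Python sketch. The primer sequences in it are an assumption: they are the locus-specific 341F/805R sequences as I understand them from the Illumina 16S protocol, and they are not quoted anywhere in this thread. The helper name starts_with_primer is hypothetical. Degenerate IUPAC codes in the primer (N, W, H, V) are expanded, and an N in the read is treated as an uncalled base.

```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# ASSUMPTION: locus-specific primers from the Illumina 16S protocol,
# not taken from this thread. Check them against your own protocol.
PRIMER_F = "CCTACGGGNGGCWGCAG"      # 341F
PRIMER_R = "GACTACHVGGGTATCTAATCC"  # 805R


def starts_with_primer(read, primer):
    """Return True if the start of `read` is compatible with the start of `primer`.

    Only the overlapping prefix is compared, so a short snippet such as
    the first 8 bases of a read can be tested against a longer primer.
    """
    n = min(len(read), len(primer))
    if n == 0:
        return False
    for base, code in zip(read[:n].upper(), primer[:n]):
        if base == "N":  # uncalled base in the read: compatible with anything
            continue
        if base not in IUPAC[code]:
            return False
    return True


print(len(PRIMER_F), len(PRIMER_R))               # 17 21
print(starts_with_primer("CCTACGGG", PRIMER_F))   # True
print(starts_with_primer("NACTAC", PRIMER_R))     # True
```

The two read prefixes reported in the question are both compatible with these primers, and the primer lengths give the two trim-left values.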

This amplicon is longer than the reads, so they won’t read into the other primer, hence trunc-len is not needed for primer removal.



Thanks for answering the question. Can I run primer removal and quality control together, or do I have to run them separately? If I use the following command for denoise-paired, is it correct?

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trim-left-f 17 \
  --p-trunc-len-f 250 \
  --p-trim-left-r 21 \
  --p-trunc-len-r 250 \
  --o-table bing-dada2-table.qza \
  --o-representative-sequences bing-data2-rep-seqs.qza \
  --verbose


That looks good to me. You can revisit the quality filtering parameters depending on whether enough reads make it through.
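To make the arithmetic behind those parameters concrete, here is a small Python sketch. It relies on trunc-len being a position on the original read, so the bases kept per read are trunc-len minus trim-left. The function names are hypothetical, and the insert length used in the example is a made-up placeholder, not a measured value for this dataset.

```python
def retained_lengths(trim_left_f, trunc_len_f, trim_left_r, trunc_len_r):
    """Bases kept per read after truncating at trunc_len and trimming trim_left."""
    return trunc_len_f - trim_left_f, trunc_len_r - trim_left_r


def expected_overlap(len_f, len_r, insert_len):
    """Overlap between forward and reverse reads for a primer-free insert of insert_len bases."""
    return len_f + len_r - insert_len


len_f, len_r = retained_lengths(17, 250, 21, 250)
print(len_f, len_r)  # 233 229

# HYPOTHETICAL insert length, for illustration only; use your own amplicon size.
ASSUMED_INSERT_LEN = 425
print(expected_overlap(len_f, len_r, ASSUMED_INSERT_LEN))  # 37
```

If few reads merge, one thing to check is whether the truncation lengths still leave enough overlap between the forward and reverse reads.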


I just read the post about primer removal in dada2 on GitHub. Our data have primers as below:

The question is whether we need to remove the primers before running dada2 denoise in QIIME 2, or whether we can trim the primers in the dada2 denoise step. If we have to remove the primers before running dada2 in QIIME 2, how can we do it? Please guide us!


You can trim them with the trim-left parameters, but you need to know the length of the primer sequence that appears on your reads. So you either need to understand your amplicon sequencing setup (some don’t sequence primers, others do), or you can inspect your raw reads to determine whether primers are on the reads, and how long they are.

Just look at the first few reads and compare them to the primer sequences you are using. Do they match some part? How much? Determine trim-left-f and trim-left-r from that.
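One way to do that inspection without scrolling through files by hand is to tally the leading bases of the first reads. The following is a minimal Python sketch, assuming standard 4-line FASTQ records (no wrapped sequence lines); the function name leading_kmers and its defaults are hypothetical.

```python
import gzip
from collections import Counter
from itertools import islice


def leading_kmers(fastq_path, k=20, n_reads=1000):
    """Count the first k bases of the first n_reads records in a FASTQ file.

    Handles plain or gzip-compressed files, chosen by the .gz extension.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        # In a 4-line FASTQ record the sequence is the 2nd line (index 1, 5, 9, ...).
        seq_lines = islice(handle, 1, None, 4)
        for seq in islice(seq_lines, n_reads):
            counts[seq.strip()[:k]] += 1
    return counts
```

Calling leading_kmers on one forward and one reverse file and looking at most_common(5) on the result should show whether nearly all reads begin with the same primer-like prefix. Positions where the primer is degenerate will split the tally into a few closely related prefixes.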

