Hello,
I have a question about trimming in dada2. We received our demultiplexed paired-end data from our genomics core, which used the 16S Metagenomic Sequencing Library Preparation protocol.
We gave a presentation on our data, and somebody said that the primers are still in the sequence data from our core. Do we have to remove primers before running dada2? I checked several of the raw fastq files. The forward reads always start with CCTACGGG and the reverse reads always start with NACTAC. I don't see that they match the primers; am I right?
My last question: how should the trim and trunc parameters be set in dada2 to remove primers like the ones I found above?
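Those are the 341F and 805R primers, and they need to be removed. The leading N in NACTAC is just an uncalled first base; the rest matches the start of 805R. You can remove them here with trim-left-f 17 and trim-left-r 21, which are the lengths of the two primers.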
This amplicon is longer than the reads, so the reads won't run into the primer at the opposite end; hence trunc-len is not needed for primer removal.
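As a rough illustration of where 17 and 21 come from, the sketch below counts the bases in each primer and assembles a matching denoise-paired call. The primer sequences are the 341F/805R sequences as I understand them from the Illumina 16S protocol, so please check them against your own protocol. The artifact file names are hypothetical, and the trunc-len values of 0 (no truncation) are placeholders that you would choose from your own quality plots.

```python
import shlex

# Assumed primer sequences (verify against your library prep protocol).
PRIMER_341F = "CCTACGGGNGGCWGCAG"
PRIMER_805R = "GACTACHVGGGTATCTAATCC"

# trim-left simply removes this many bases from the 5' end of each read,
# so the value to use is the primer length.
trim_left_f = len(PRIMER_341F)
trim_left_r = len(PRIMER_805R)

# Hypothetical file names; trunc-len 0 means "do not truncate" and is
# only a placeholder here.
cmd = [
    "qiime", "dada2", "denoise-paired",
    "--i-demultiplexed-seqs", "demux.qza",
    "--p-trim-left-f", str(trim_left_f),
    "--p-trim-left-r", str(trim_left_r),
    "--p-trunc-len-f", "0",
    "--p-trunc-len-r", "0",
    "--o-table", "table.qza",
    "--o-representative-sequences", "rep-seqs.qza",
    "--o-denoising-stats", "denoising-stats.qza",
]

print(trim_left_f, trim_left_r)
print(shlex.join(cmd))
```

The script only prints the command; it does not run QIIME 2.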
Thanks for answering the question. Can I remove the primers and run quality control together, or do I have to run them separately? When I run denoise-paired for quality control, are the following commands correct?
The question is whether we need to remove the primers before running dada2 denoise in QIIME 2, or whether we can trim the primers in the dada2 denoise step. If we have to remove the primers before running dada2 in QIIME 2, how can we do it? Please guide us!
You can trim them with the trim-left parameters, but you need to know the length of the primer sequence that appears on your reads. So you either need to understand your amplicon sequencing setup (some don't sequence primers, others do), or you can inspect your raw reads to determine whether primers are on the reads and how long they are.
Just look at the first few reads and compare them to the primer sequences you are using. Do they match some part? How much? Determine trim-left-f and trim-left-r from that.
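If you would rather not compare by eye, here is a minimal sketch of that check in Python. It assumes you have already copied a few read sequences out of the fastq file, and the function names are my own, not part of dada2 or QIIME 2. Primers usually contain degenerate IUPAC codes (N, W, H, V, ...), so the comparison has to allow for those, and an N in the read itself is treated as a no-call that matches anything.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matched_prefix_length(read, primer):
    """Count how many leading bases of the read agree with the primer."""
    n = 0
    for base, code in zip(read.upper(), primer.upper()):
        # An N in the read is a no-call, so accept it at any position.
        if base != "N" and base not in IUPAC[code]:
            break
        n += 1
    return n


def suggest_trim_left(reads, primer):
    """Return the primer length if every read starts with the full primer, else None."""
    lengths = [matched_prefix_length(r, primer) for r in reads]
    if lengths and all(n == len(primer) for n in lengths):
        return len(primer)
    return None


# Made-up example reads, for illustration only (not real data).
forward_reads = ["CCTACGGGAGGCAGCAGTGGGGAATATTG"]
reverse_reads = ["NACTACTGGGGTATCTAATCCTGTTTG"]

print(suggest_trim_left(forward_reads, "CCTACGGGNGGCWGCAG"))
print(suggest_trim_left(reverse_reads, "GACTACHVGGGTATCTAATCC"))
```

A result of None means at least one read did not start with the complete primer, in which case it is worth looking at the reads by hand before settling on trim-left values.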