Dada2 trimming question

Bing · July 28, 2017, 8:55pm

Hello,
I have a question about dada2 trimming. We got our demultiplexed paired-end data from our genomics core, which used the 16S Metagenomic Sequencing Library Preparation protocol.

We gave a presentation on our data, and somebody said that the primers are still in the sequence data from our core. Do we have to remove primers before running dada2? I checked several of the raw fastq files. The forward sequences always start with CCTACGGG and the reverse sequences always start with NACTAC. I do not see that they match the primers. Am I right?

My last question: how should the trim and trunc parameters be set in dada2 to trim primers like the ones I found above?

Thanks,
Bing

benjjneb · July 31, 2017, 8:06pm

If you compare the start of your sequences to the Illumina adapter/primer sequences, you will see that they match the second half:

                                 CCTACGGG
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG

                                  NACTAC
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC

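Those are the 341F and 805R primers, and they need to be removed. You can do so here with trim-left-f 17 and trim-left-r 21, which you can read off the alignment above: 17 nt from CCTACGGG to the end of the forward sequence, and 21 nt from GACTAC to the end of the reverse sequence.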

This amplicon is longer than the reads, so the reads won't run into the primer at the other end. Hence trunc-len is not needed for primer removal.
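Reading the trim lengths off the alignment can also be sketched in code. The following is only an illustrative Python sketch, not part of dada2 or QIIME 2: the helper names `compatible` and `trim_left_from_prefix` are made up for this example. It slides the observed read prefix along the adapter/primer sequence, treats IUPAC ambiguity codes as matching any base they stand for, and reports how many bases remain from the match to the end.

```python
# IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def compatible(a, b):
    """Return True if two IUPAC codes share at least one base."""
    return bool(set(IUPAC[a]) & set(IUPAC[b]))


def trim_left_from_prefix(construct, read_prefix):
    """Hypothetical helper: find the first position where read_prefix
    matches construct and return the number of construct bases from
    that position to the end (the candidate trim-left value).
    Returns None if the prefix does not match anywhere."""
    for offset in range(len(construct) - len(read_prefix) + 1):
        window = construct[offset:offset + len(read_prefix)]
        if all(compatible(x, y) for x, y in zip(window, read_prefix)):
            return len(construct) - offset
    return None


# Adapter + primer sequences and read prefixes quoted in this thread.
forward = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG"
reverse = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC"

trim_left_f = trim_left_from_prefix(forward, "CCTACGGG")
trim_left_r = trim_left_from_prefix(reverse, "NACTAC")
print(trim_left_f, trim_left_r)  # 17 21
```

A short prefix can match by chance, so it is worth confirming the result against a few full reads rather than trusting a 6 to 8 nt match alone.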

Bing · July 31, 2017, 8:45pm

@benjjneb

Thanks for answering the question. Can I run the primer removal together with quality control, or do I have to run them separately? If I use the following command for denoise-paired in the quality control step, is it correct?

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trim-left-f 17 \
  --p-trunc-len-f 250 \
  --p-trim-left-r 21 \
  --p-trunc-len-r 250 \
  --o-table bing-dada2-table.qza \
  --o-representative-sequences bing-data2-rep-seqs.qza \
  --verbose

Bing

benjjneb · August 1, 2017, 1:45am

That looks good to me. You can revisit the quality filtering parameters depending on whether enough reads are making it through.
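When revisiting the truncation values, one thing to keep in mind is that paired reads still have to overlap to be merged. In DADA2, a read truncated to trunc-len and then trimmed by trim-left keeps trunc-len minus trim-left bases. The sketch below does that arithmetic for the command above. The region length of 425 nt between the primers is a placeholder assumption for illustration only, not a value measured from this data set, and the function names are hypothetical.

```python
def kept_length(trunc_len, trim_left):
    """Bases left in a read after truncating to trunc_len and then
    removing trim_left bases from the start."""
    return trunc_len - trim_left


def expected_overlap(trunc_f, trim_f, trunc_r, trim_r, region_len):
    """Overlap between forward and reverse reads for a region of
    region_len bases between the two primers (negative means a gap)."""
    kept_f = kept_length(trunc_f, trim_f)
    kept_r = kept_length(trunc_r, trim_r)
    return kept_f + kept_r - region_len


# ASSUMPTION: 425 is an illustrative length for the region between the
# primers; substitute the value appropriate for your own amplicon.
ASSUMED_REGION_LEN = 425

overlap = expected_overlap(250, 17, 250, 21, ASSUMED_REGION_LEN)
print(overlap)  # 233 + 229 - 425 = 37
```

Under that assumption, lowering either trunc-len value shrinks the overlap by the same number of bases, so aggressive truncation can cost reads at the merging step even while it helps them pass quality filtering.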

Bing · August 17, 2017, 8:13pm

Hi,
I just read the post about primer removal in dada2 on GitHub (https://github.com/qiime2/q2-dada2/issues/32). Our data have primers as below:

benjjneb:

                                 CCTACGGG
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG

                                  NACTAC
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC

The question is whether we need to remove the primers before running dada2 denoise in QIIME 2, or whether we can trim the primers in the dada2 denoise step itself. If we have to remove the primers before running dada2 in QIIME 2, how can we do it? Please guide us!

Thanks,
Bing

benjjneb · August 17, 2017, 8:28pm

You can trim them with the trim-left parameters, but you need to know the length of the primer sequence that appears on your reads. So you either need to understand your amplicon sequencing setup (some don't sequence primers, others do), or you can inspect your raw reads to determine whether primers are on the reads and how long they are.

Just look at the first few reads and compare them to the primer sequences you are using. Do they match some part? How much? Determine trim-left-f and trim-left-r from that.
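That inspection can be scripted. Below is a minimal Python sketch, assuming plain 4-line FASTQ records (optionally gzipped); the function names and the example file name are hypothetical. It reads the first few sequences and reports what fraction begin with a given primer, allowing IUPAC ambiguity codes in the primer and N base calls in the reads.

```python
import gzip
import re
from itertools import islice

# IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC_CLASSES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_regex(primer):
    """Compile a primer into a regex; an N base call in the read is
    accepted at any position."""
    return re.compile("".join("[" + IUPAC_CLASSES[c] + "N]" for c in primer))


def read_sequences(path, n_reads):
    """Yield the sequence line of the first n_reads FASTQ records."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Records are 4 lines long; the sequence is the 2nd line of each.
        for line in islice(handle, 1, 4 * n_reads, 4):
            yield line.strip().upper()


def fraction_starting_with(path, primer, n_reads=100):
    """Fraction of the first n_reads sequences that begin with primer."""
    pattern = primer_regex(primer)
    seqs = list(read_sequences(path, n_reads))
    if not seqs:
        return 0.0
    hits = sum(1 for s in seqs if pattern.match(s))
    return hits / len(seqs)


# Example call (hypothetical file name):
# fraction_starting_with("sample_R1.fastq.gz", "CCTACGGGNGGCWGCAG")
```

If nearly all reads start with the full primer, its length is the trim-left value for that read direction; a fraction near zero suggests the primers were not sequenced or were already removed.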
