dada2 denoise-paired + cutadapt trim-paired

Birong · June 10, 2021, 12:26am

Hi everyone,

I have 15 paired-end samples (300 bp, Illumina) amplified with the primers 515FB (19 bp) and 926R (20 bp).
Here are my scripts and my results:

Importing data Casava 1.8 paired-end demultiplexed fastq

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path data \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-paired-end.qza

Trim primers

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences demux-paired-end.qza \
  --p-front-f GTGYCAGCMGCCGCGGTAA \
  --p-front-r CCGYCAATTYMTTTRAGTTT \
  --p-error-rate 0 \
  --p-minimum-length 200 \
  --o-trimmed-sequences trimmed-seqs.qza \
  --verbose

dada2 denoise-paired

qiime dada2 denoise-paired \
  --p-n-threads 16 \
  --i-demultiplexed-seqs trimmed-seqs.qza \
  --p-trunc-len-f 243 \
  --p-trunc-len-r 226 \
  --p-trim-left-f 0 \
  --p-trim-left-r 5 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats stats.qza

And I have some questions:

1. Why did the read length remain unchanged after primer trimming? The reads are still 300 bp.
2. In the first figure, the interactive quality plot of the forward reads, why are some parts missing at the beginning (I only see some dotted lines)?
3. My overlap = 2*300 - (926 - 515) = 189 (I do not know whether I need to cut the primers here, because of question 1), and my trim = (300-246) + (300-226-5) = 123, so it seems okay. But the percentage of input that passed the filter and the percentage of input merged are very low. Even if I set --p-trunc-len-f 243 and --p-trunc-len-r 196, the percentage only increases from 70% to 75%, which is still very low, right?

I would appreciate any suggestions. Thanks!

Kind regards,
Birong

ChrisKeefe · June 10, 2021, 12:36am

Welcome to the forum, @Birong!

Cutadapt only removes primers where it finds them. It's powerful but complex software, with many parameters. My guess is that changing the error rate from the default of 0.1 to 0 kept cutadapt from matching primers that were a little faulty in some of your reads, but it could be something else. Because some of your reads are still 300 bp long, the quality plot still has to show 300 positions.
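One way to check whether the primers really sit at the start of the reads is to count them directly in a raw FASTQ file. The following is only a rough sketch: the function names iupac_to_regex and count_primer_starts are made up for illustration, it only finds exact matches at the very first base of each read, and it assumes gzip-compressed FASTQ files with four lines per record.

```shell
# Translate IUPAC degenerate bases in a primer into a grep -E regex.
iupac_to_regex() {
  printf '%s' "$1" | sed \
    -e 's/R/[AG]/g' -e 's/Y/[CT]/g' -e 's/M/[AC]/g' -e 's/K/[GT]/g' \
    -e 's/S/[CG]/g' -e 's/W/[AT]/g' -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
    -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/N/[ACGT]/g'
}

# Count reads whose sequence starts with an exact match to the primer.
# usage: count_primer_starts PRIMER FASTQ_GZ
count_primer_starts() {
  primer_re=$(iupac_to_regex "$1")
  # Line 2 of every 4-line FASTQ record is the sequence.
  gzip -dc "$2" | awk 'NR % 4 == 2' | grep -c -E "^${primer_re}" || true
}
```

Calling count_primer_starts GTGYCAGCMGCCGCGGTAA on one of the forward files in the data directory, and comparing the result with the total number of reads in that file, should give a first impression of how many reads carry a perfect primer at position 1.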

Try clicking and dragging a "selection" over the part of the plot you're wondering about in q2view. I expect you'll find box plots once you zoom in.

You seem to have reasonably good quality, so you could probably do better, but many of your samples are very deep. There are a lot of good discussions on this forum of how to set DADA2 truncation parameters. Please do some searching, and you'll learn a lot!
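When weighing truncation values, a quick estimate of the overlap left for merging can help. This is a back-of-the-envelope sketch, not a rule: the function name expected_overlap is made up, and the amplicon length of 411 bp is an assumption taken from the primer names (926 - 515, primers included). Real amplicons vary in length, so leave some margin above the 12 bases of overlap that DADA2 asks for by default.

```shell
# Estimated overlap (in bases) between truncated forward and reverse reads.
# Truncation lengths and amplicon length must be measured the same way,
# here both including the primers.
# usage: expected_overlap TRUNC_LEN_F TRUNC_LEN_R AMPLICON_LEN
expected_overlap() {
  echo $(( $1 + $2 - $3 ))
}

AMPLICON_LEN=$(( 926 - 515 ))   # 411, an assumption from the primer names

expected_overlap 243 226 "$AMPLICON_LEN"   # prints 58
expected_overlap 243 196 "$AMPLICON_LEN"   # prints 28
```

Bases removed with the trim-left parameters come off the 5' end of each read, away from the overlapping region, so they do not change this estimate.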

Side note: DADA2's --p-trim-left-f and --p-trim-left-r parameters will allow you to trim off your primers without cutadapt, so long as you know how many bp you need to trim.
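As a sketch of that side note, the trim-left values can be taken from the primer lengths. This assumes every read starts with its full primer at position 1 (no spacers or extra bases in front). The wrapper name denoise_without_cutadapt is hypothetical, and the truncation values are simply the ones from the original post, not a recommendation.

```shell
FWD_PRIMER=GTGYCAGCMGCCGCGGTAA     # 515FB
REV_PRIMER=CCGYCAATTYMTTTRAGTTT    # 926R
TRIM_LEFT_F=${#FWD_PRIMER}         # primer length: 19
TRIM_LEFT_R=${#REV_PRIMER}         # primer length: 20

# Hypothetical wrapper; call it only where QIIME 2 is installed.
# Input is the untrimmed import, since DADA2 removes the primers here.
denoise_without_cutadapt() {
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs demux-paired-end.qza \
    --p-trim-left-f "$TRIM_LEFT_F" \
    --p-trim-left-r "$TRIM_LEFT_R" \
    --p-trunc-len-f 243 \
    --p-trunc-len-r 226 \
    --p-n-threads 16 \
    --o-table table.qza \
    --o-representative-sequences rep-seqs.qza \
    --o-denoising-stats stats.qza
}
```

Note that DADA2 truncates first and then trims from the left, so the reads that reach denoising are trunc-len minus trim-left bases long.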

Happy :qiime2: -ing!
