Dada2 trunc-len 0 for ITS analyses

rahel_park · August 21, 2018, 5:34pm

Hi,

I am analyzing fungal ITS2 (ITS86F/ITS4 primers, PE300) sequences and I have some questions regarding the truncation of the reads as well as dealing with amplicons of different lengths.
On several sites I have seen the advice to set the truncation length in the DADA2 denoise step to "0".
(https://github.com/Joseph7e/ITS_metabarcoding_analyses
DADA2 Pipeline Tutorial (1.16) )
This makes sense, as ITS length varies a lot between species, so using one truncation length for all reads will bias the results (you lose all the shorter amplicons). At the same time, when I set the truncation length to 0, far fewer sequences pass the merging step. I also tried it on data that had been passed through cutadapt.
I found some explanation in this thread: DADA2, truncation lengths and features number - but in the end I still don't know how to proceed with my data so that I can keep all quality reads... Maybe ITSxpress will help (I posted another topic on my issues with that tool).
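To make the length trade-off concrete, here is a rough sketch of which amplicon lengths survive a given truncation setting. It assumes read-through has already been trimmed (so a read is never longer than its amplicon), that DADA2 discards reads shorter than the truncation length, and that merging needs at least 12 bp of overlap (the `mergePairs` default in DADA2). The function name `classify_amplicon` is made up for illustration, and the model looks at lengths only, not base quality.

```shell
READ_LEN=300     # PE300 run, as stated in the post
MIN_OVERLAP=12   # assumption: DADA2 mergePairs default minimum overlap

# classify_amplicon AMPLICON_LEN TRUNC_F TRUNC_R  (hypothetical helper)
classify_amplicon() {
  amplicon=$1
  trunc_f=$2
  trunc_r=$3
  # Assumption: after read-through trimming, a read is at most as long as the amplicon.
  if [ "$amplicon" -lt "$READ_LEN" ]; then read_len=$amplicon; else read_len=$READ_LEN; fi
  # A truncation length of 0 means "do not truncate": keep the whole read.
  if [ "$trunc_f" -eq 0 ]; then trunc_f=$read_len; fi
  if [ "$trunc_r" -eq 0 ]; then trunc_r=$read_len; fi
  # Reads shorter than the truncation length are discarded.
  if [ "$read_len" -lt "$trunc_f" ] || [ "$read_len" -lt "$trunc_r" ]; then
    echo "$amplicon bp: dropped (read shorter than truncation length)"
    return 0
  fi
  overlap=$((trunc_f + trunc_r - amplicon))
  if [ "$overlap" -lt "$MIN_OVERLAP" ]; then
    echo "$amplicon bp: unmerged (overlap $overlap bp)"
  else
    echo "$amplicon bp: merged (overlap $overlap bp)"
  fi
}

# Compare the two settings used in this post for a few example amplicon lengths.
for len in 200 400 500; do
  echo "trunc 270/220 -> $(classify_amplicon "$len" 270 220)"
  echo "trunc 0/0     -> $(classify_amplicon "$len" 0 0)"
done
```

Under these assumptions, fixed truncation drops short amplicons and fails to merge very long ones, while no truncation keeps both. Because quality is ignored, the sketch cannot by itself account for reads lost at the merging step.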

demux-paired-SB72.qzv (270.0 KB)
trimmed_sequencesSB72.qzv (274.4 KB)

Original data:

qiime dada2 denoise-paired \
  --verbose \
  --i-demultiplexed-seqs demux-paired-end_SB72.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 270 \
  --p-trunc-len-r 220 \
  --p-n-threads 40 \
  --o-table table_0SB72.qza \
  --o-representative-sequences rep-seqs_SB72.qza \
  --o-denoising-stats denoising-stats_truncSB72.qzv

denoising stats
sample-id input filtered denoised merged non-chimeric
SB72 18647 14620 14620 14513 14466

qiime dada2 denoise-paired \
  --verbose \
  --i-demultiplexed-seqs demux-paired-end_SB72.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 0 \
  --p-trunc-len-r 0 \
  --p-n-threads 40 \
  --o-table table_0SB72.qza \
  --o-representative-sequences rep-seqs_0SB72.qza \
  --o-denoising-stats denoising-stats_0SB72.qza

denoising stats
sample-id input filtered denoised merged non-chimeric
SB72 18647 11091 11091 298 298

Cutadapt-trimmed data (removed the part of the read that, for short amplicons, runs into the reverse primer):

qiime dada2 denoise-paired \
  --verbose \
  --i-demultiplexed-seqs trimmed_sequences.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 270 \
  --p-trunc-len-r 220 \
  --p-n-threads 40 \
  --o-table table_trimmedSB72.qza \
  --o-representative-sequences rep-seqs_trimmedSB72.qza \
  --o-denoising-stats denoising-stats_trimmedSB72.qza

denoising stats
sample-id input filtered denoised merged non-chimeric
SB72 18647 13822 13822 13770 13770

qiime dada2 denoise-paired \
  --verbose \
  --i-demultiplexed-seqs trimmed_sequences.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 0 \
  --p-trunc-len-r 0 \
  --p-n-threads 40 \
  --o-table table_0trimmedSB72.qza \
  --o-representative-sequences rep-seqs_0trimmedSB72.qza \
  --o-denoising-stats denoising-stats_0trimmedSB72.qza

denoising stats
sample-id input filtered denoised merged non-chimeric
SB72 18647 11488 11488 703 703

So in the end I am stuck: the recommendation not to truncate ITS reads is logical, but following it means losing most of my reads.
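To see where the reads go, the four sets of denoising stats above can be turned into retention percentages. This is only a small sketch: the run labels in the first column are made up here for readability, and the numbers are copied from the stats above.

```shell
# Input columns: label input filtered denoised merged non-chimeric
# Prints filtered reads as % of input and merged reads as % of denoised.
summarize_stats() {
  awk '{ printf "%-22s filtered %5.1f%% of input, merged %5.1f%% of denoised\n", $1, 100 * $3 / $2, 100 * $5 / $4 }'
}

# Labels are hypothetical; counts are the SB72 values reported in this post.
summarize_stats <<'EOF'
raw_trunc270_220 18647 14620 14620 14513 14466
raw_trunc0 18647 11091 11091 298 298
cutadapt_trunc270_220 18647 13822 13822 13770 13770
cutadapt_trunc0 18647 11488 11488 703 703
EOF
```

Splitting the loss by step like this shows whether reads disappear mainly at filtering or at merging.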

Thank you in advance!!!

Nicholas_Bokulich · August 22, 2018, 5:36pm

I think, as you noted above, this is because of read-through into the primer at the 3' ends of the reads. Untrimmed reads retain that read-through, which cannot align.

Are you sure cutadapt is removing everything? The reverse primers on the forward reads and the forward primers on the reverse reads? From those demux summaries it looks like only a fraction of sequences were actually trimmed, which seems surprising, though with ITS it's always questionable what the expected read length is if you don't know what species are present...
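As a rough way to check this, the sequences to trim from the 3' ends are the reverse complements of the opposite primers. The sketch below derives them and wraps a `qiime cutadapt trim-paired` call in a hypothetical helper, `trim_readthrough`. The primer sequences are an assumption (the commonly cited ITS86F and ITS4 sequences) and should be confirmed against your own primer order; `revcomp` handles plain A/C/G/T only.

```shell
# Reverse-complement a DNA sequence (plain A/C/G/T only; no IUPAC codes).
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Assumed primer sequences; confirm against your own primer order.
FWD_PRIMER=GTGAATCATCGAATCTTTGAA   # ITS86F
REV_PRIMER=TCCTCCGCTTATTGATATGC    # ITS4

# Read-through at the 3' end of a forward read is the reverse complement of
# the reverse primer, and vice versa for reverse reads.
ADAPTER_F=$(revcomp "$REV_PRIMER")
ADAPTER_R=$(revcomp "$FWD_PRIMER")

# Hypothetical wrapper: trim_readthrough INPUT.qza OUTPUT.qza
# Needs an activated QIIME 2 environment with the q2-cutadapt plugin.
trim_readthrough() {
  qiime cutadapt trim-paired \
    --i-demultiplexed-sequences "$1" \
    --p-adapter-f "$ADAPTER_F" \
    --p-adapter-r "$ADAPTER_R" \
    --o-trimmed-sequences "$2"
}

echo "3' adapter to trim from forward reads: $ADAPTER_F"
echo "3' adapter to trim from reverse reads: $ADAPTER_R"
```

Inside a QIIME 2 environment, something like `trim_readthrough demux-paired-end_SB72.qza trimmed_sequences.qza` would then run the trimming. The verbose cutadapt report should show how many reads of each direction contained the adapter.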

q2-ITSxpress should take care of the trimming issues.

But otherwise it looks like the trimmed reads are working quite well for you. Unless there's something wrong with that protocol that I'm missing, I'd say run with those if q2-ITSxpress does not fix things for you...

I hope that helps!

rahel_park · August 23, 2018, 8:36am

Hi,
I'm fairly sure that cutadapt removes everything. Only a fraction of the reads was trimmed because many of the amplicons were very long, so they didn't have the read-through issue.
But in the end I agree that ITSxpress is the way to go.
Thanks!
