Dada2 analysis queries

Hello, all

I ran DADA2 after demultiplexing, without any trimming or truncation. This is the command I used:

qiime dada2 denoise-paired --i-demultiplexed-seqs ITSpaired-end-demux.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 0 --p-trunc-len-r 0 --o-representative-sequences ITSrep-seqs-dada2.qza --o-table ITStable-dada2.qza --o-denoising-stats ITSstats-dada2.qza

Before the DADA2 analysis, my sequence counts were as follows

After Dada2 analysis (without trimming and truncation)

My sequence counts dropped sharply in all samples. For instance, Sample 12 lost nearly 129,806 reads after the DADA2 analysis.

My forward reads were of better quality than my reverse reads, so I truncated the low-quality tail of the reverse reads, using a Phred score of 20 as the cutoff. After truncation, my sequence counts dropped even further than in the previous DADA2 run (without truncation).

This was the command line I used

qiime dada2 denoise-paired --i-demultiplexed-seqs ITSpaired-end-demux.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 0 --p-trunc-len-r 223 --o-representative-sequences ITSrep-seqs-dada2.qza --o-table ITStable-dada2.qza --o-denoising-stats ITSstats-dada2.qza

  1. Will the reduced sequence count affect further analysis, such as taxonomy identification?

  2. I think I am losing many reads in the DADA2 analysis. How can I fix this?

  3. Based on my interactive quality plot, could you please suggest what truncation value I should use to retain a high sequence count?

I have shared my files at the link below for your perusal. Kindly have a look.

https://drive.google.com/drive/folders/1D_mloWpC3ptZZSkKldn2Wk8hRhjPwNCA?usp=sharing

Looking forward to your reply. Thanking you in advance.

Hey there,
ITS processing before dereplication is a little different from 16S, and so from a “regular pipeline”. DADA2 provides an ITS tutorial that you can have a look at.
https://benjjneb.github.io/dada2/ITS_workflow.html

Cheers

Shouldn’t I follow the QIIME 2 DADA2 workflow for ITS sequence analysis?

I don’t have much programming knowledge. Would the reduced sequence count affect further analysis, such as taxonomy identification?

Hi @Asha1,

Yes, we have a basic ITS tutorial that you can follow (though note that, as with 16S analysis, there are many possible QIIME 2 workflows for analyzing ITS, and you may want to check out plugins like q2-itsxpress and others to help with your ITS analysis):

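As for whether the reduced sequence count will affect downstream analysis: maybe yes and maybe no. You need to check the DADA2 stats file to see at which stage the reads are being lost.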
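One way to read the stats file is to compute, per sample, how many reads drop out at each stage. The sketch below is only an illustration: `stage_loss` is a hypothetical helper name, the demo numbers are made up, and it assumes the stats artifact has been exported to a tab-separated file with `input`, `filtered`, `denoised`, `merged` and `non-chimeric` columns.

```shell
#!/bin/sh
# Assumption: the stats were exported first, e.g. with
#   qiime tools export --input-path ITSstats-dada2.qza --output-path exported-stats
# which should write a tab-separated stats.tsv into exported-stats/.

# Hypothetical helper: print the reads lost at each DADA2 stage for every sample.
# Columns are located by header name, so extra percentage columns are ignored.
stage_loss() {
  awk -F '\t' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    /^#/    { next }   # skip the "#q2:types" directive line
    {
      printf "%s\tfilter=%d\tdenoise=%d\tmerge=%d\tchimera=%d\n", $1,
        $(col["input"])    - $(col["filtered"]),
        $(col["filtered"]) - $(col["denoised"]),
        $(col["denoised"]) - $(col["merged"]),
        $(col["merged"])   - $(col["non-chimeric"])
    }' "$1"
}

# Made-up numbers for illustration only (not taken from this thread).
demo=$(mktemp)
printf 'sample-id\tinput\tfiltered\tdenoised\tmerged\tnon-chimeric\n' > "$demo"
printf '#q2:types\tnumeric\tnumeric\tnumeric\tnumeric\tnumeric\n' >> "$demo"
printf 'demoA\t1000\t900\t880\t300\t290\n' >> "$demo"
stage_loss "$demo"
rm -f "$demo"
```

In the made-up sample, 580 of the 1000 input reads disappear at the merge step and only 100 at filtering. On a real file you would run `stage_loss exported-stats/stats.tsv` and look for the stage with the largest loss.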

  1. If you are losing these sequences at the filtering stage or any stage prior to read merging (joining) in the DADA2 pipeline, then the loss should be random and will not impact taxonomic composition profiles or diversity estimates unless the loss is extreme.

  2. If you are losing too many sequences at the merging step, then this loss will be non-random and will severely skew taxonomic profiles. ITS is hypervariable in length, and read pairs from taxa with longer ITS regions are the ones that fail to merge, so losses at the merging stage bias the results against those taxa. If this is the case, you either need to adjust your trimming settings or, quite possibly, process your ITS data as single-end reads only (e.g., use only the forward reads in your analysis).
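To make the merging point concrete, here is a rough overlap calculation. It is a sketch under stated assumptions, not a description of what DADA2 does internally: `pair_overlap` is a hypothetical helper, the 250 nt forward read length and the three amplicon lengths are invented for illustration, and the minimum overlap of 12 nt is assumed to be the DADA2 default. Only the reverse truncation length of 223 comes from the command earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: overlap (in nt) left between a read pair after truncation.
#   overlap = forward length + reverse length - amplicon length
pair_overlap() {
  echo $(( $1 + $2 - $3 ))
}

min_overlap=12   # assumed DADA2 default minimum overlap for merging

# Assumed values: forward reads of 250 nt, reverse reads truncated to 223 nt,
# and three made-up ITS amplicon lengths.
for amplicon in 300 450 500; do
  ov=$(pair_overlap 250 223 "$amplicon")
  if [ "$ov" -ge "$min_overlap" ]; then
    echo "amplicon ${amplicon}: overlap ${ov}, can merge"
  else
    echo "amplicon ${amplicon}: overlap ${ov}, cannot merge"
  fi
done
```

With these assumed lengths the 500 nt amplicon ends up with a negative overlap, so its reads would be dropped at merging while the shorter amplicons survive. That is the non-random loss described in point 2, and the shorter you truncate, the more of the longer ITS variants fall below the threshold.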

Thank you so much for your prompt reply and help, sir. I will follow your suggestions and let you know the outcome soon.
