ITS-metabarcoding, denoised data

Excuse me, I am quite new to metabarcoding analysis.
I have started analyzing 20 samples (endophyte and soil samples) and ran the following commands:
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-format PairedEndFastqManifestPhred33 \
  --input-path manifest.txt \
  --output-path sequences.qza

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences sequences.qza \
  --o-trimmed-sequences sequences-trimmed.qza

qiime demux summarize \
  --i-data sequences-trimmed.qza \
  --o-visualization sequences-trimmed.qzv

DADA2 denoising:

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs sequences-trimmed.qza \
  --p-trunc-len-f 0 \
  --p-trunc-len-r 0 \
  --p-max-ee-f 3 \
  --p-max-ee-r 3 \
  --p-n-threads 20 \
  --o-representative-sequences dada2-repseq_ITS.qza \
  --o-table dada2-table_ITS.qza \
  --o-denoising-stats dada2-stats_ITS.qza

I noticed that the denoising output was quite poor, with low read counts, as shown below.

My question is: what steps, if any, should I take to improve the results?

Thank you

Welcome to the forums! :qiime2:

Thank you for posting your full commands and DADA2 output stats. It looks like most reads are being removed during filtering, so let's work to improve that.

Have you viewed the sequences-trimmed.qzv file? This will show you quality throughout the read.
I usually trim off the low-quality ends of the read, which helps more reads to pass the quality filter.

See this post for more details: "Why do I have more sequences (therefore more taxonomic groups recovered) when I use only my forward sequences?" (#4, by SoilRotifer)

Let us know what you try next and if you have more questions!


If I changed:
--p-trunc-len-f 0 to --p-trunc-len-f 240
--p-trunc-len-r 0 to --p-trunc-len-r 200

would it help? And do you think it might affect the downstream diversity analysis?

Thank you
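
One thing worth checking before settling on truncation lengths: after truncation, the forward and reverse reads still need to overlap in order to merge. DADA2's default minimum overlap is 12 nt. The sketch below does that arithmetic for the 240/200 settings. The amplicon lengths in the loop are hypothetical, since ITS length varies a lot between taxa, and the helper names overlap and can_merge are made up for illustration. It also ignores read-through on very short amplicons, so treat it as a rough guide only.

```shell
#!/usr/bin/env bash
# Rough overlap check for paired-end truncation lengths.
# The amplicon lengths used below are hypothetical examples.

MIN_OVERLAP=12   # DADA2's default minimum overlap for merging read pairs

# overlap TRUNC_F TRUNC_R AMPLICON_LEN -> prints the overlap in nt (may be negative)
overlap() {
  echo $(( $1 + $2 - $3 ))
}

# can_merge TRUNC_F TRUNC_R AMPLICON_LEN -> succeeds if the overlap is long enough
can_merge() {
  [ "$(overlap "$1" "$2" "$3")" -ge "$MIN_OVERLAP" ]
}

for len in 300 400 450; do
  if can_merge 240 200 "$len"; then
    echo "amplicon ${len} nt: overlap $(overlap 240 200 "$len") nt, should merge"
  else
    echo "amplicon ${len} nt: overlap $(overlap 240 200 "$len") nt, too short to merge"
  fi
done
```

If the longer amplicons in your data fall into the "too short to merge" case, those taxa would be lost at the merging step, which is one way truncation can feed through to diversity results.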

Hello Colin,

Thank you :slight_smile:

Yes, I did, and I noticed that the quality is not that good. I will attach the figures here.

Those two graphs are very helpful!

Those new settings make a lot more sense to me!

When you run DADA2 with those settings, what happens?
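
For reference, a rerun with the proposed settings might look like the sketch below. It reuses the flags from the original command and only changes the two truncation lengths. The function name denoise_its and the output file names with the _f..._r... suffix are hypothetical, picked so that earlier results are not overwritten. The second command turns the stats artifact into a table you can open in QIIME 2 View.

```shell
#!/usr/bin/env bash
# Sketch: rerun DADA2 with explicit truncation lengths, then tabulate the stats.
# denoise_its and the suffixed output names are hypothetical, not from the thread.

denoise_its() {
  local trunc_f="$1" trunc_r="$2"
  local tag="f${trunc_f}_r${trunc_r}"

  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs sequences-trimmed.qza \
    --p-trunc-len-f "$trunc_f" \
    --p-trunc-len-r "$trunc_r" \
    --p-max-ee-f 3 \
    --p-max-ee-r 3 \
    --p-n-threads 20 \
    --o-representative-sequences "dada2-repseq_ITS_${tag}.qza" \
    --o-table "dada2-table_ITS_${tag}.qza" \
    --o-denoising-stats "dada2-stats_ITS_${tag}.qza" || return 1

  # Convert the denoising stats into a viewable table (.qzv)
  qiime metadata tabulate \
    --m-input-file "dada2-stats_ITS_${tag}.qza" \
    --o-visualization "dada2-stats_ITS_${tag}.qzv"
}

# Example call (needs an activated QIIME 2 environment):
# denoise_its 240 200
```

Calling the function with different pairs of values makes it easy to compare several truncation settings side by side.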

I tried something else:

  1. Truncating both forward and reverse reads at 200, and these were the results: Image 200
  2. Truncating both forward and reverse reads at 180, and these were the results: Image 180

I think truncating both ends at 200 improved the results more, while truncating at 180 was not really helpful and dropped some sequences compared with 200. What do you think?

Because the quality is different in forward and reverse, I think you should choose different truncation settings for forward and reverse :arrow_forward: :arrow_backward:

Or don't worry about it! Having >80% of the reads pass filter is great! Having >70% of the reads merge is also very good in general and especially for ITS data.
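
If you want to check those percentages yourself from the read counts in the DADA2 stats, here is a minimal sketch. The counts are hypothetical and not taken from this thread, and pct is just an illustrative helper name. Recent QIIME 2 releases should also list these percentages directly in the tabulated stats, so this is only for a quick sanity check.

```shell
#!/usr/bin/env bash
# pct PART TOTAL -> prints PART as a percentage of TOTAL with one decimal place
pct() {
  awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * part / total }'
}

# Hypothetical read counts for a single sample
input=10000
filtered=8400
merged=7200

echo "passed filter: $(pct "$filtered" "$input")%"
echo "merged:        $(pct "$merged" "$input")%"
```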