I have analyzed several environmental samples using dada2 and CLC in order to examine the differences between the two methods.
From CLC I get an OTU table using the OTU picking and Refseq scripts.
My question is regarding the alpha diversity of the samples.
When I analyze the abundance tables for evenness, I see a huge difference between the two methods: CLC gives a much less even community (around 0.4), while dada2 gives a much more even one (around 0.8).
The number of distinct species (richness) also differs (about 300 ASVs as opposed to more than 1000 OTUs); however, I understand that the idea of using dada2 is to get a more refined number of species, so I am OK with that.
This is the command that I use for the dada2 denoising:
qiime dada2 denoise-paired --i-demultiplexed-seqs /home/qiime2/Desktop/xxx_paired-end-demux.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 240 --p-trunc-len-r 240 --p-n-threads 0 --o-representative-sequences xxx_rep-seqs.qza --o-table xxxx_table.qza --o-denoising-stats xxx_dada2.qza --verbose
The amplicon reads (forward and reverse) are 250 nt long.
Thank you very much for your help,
Great question. This is the key to both:
There has been a lot of discussion about this on the forum. Essentially, OTU clustering leads to much noisier data and requires more stringent filtering to avoid inflating diversity estimates. The point of denoising is to identify and remove or correct these errors, leading to better estimates of diversity that preserve unique sequence variants.
So OTU clustering with CLC is expected to lead to both higher richness AND lower evenness, because the many extra spurious OTUs will be low abundance and hence skew the evenness metric. Try filtering OTUs based on abundance (as linked above) prior to calculating evenness… this should lead to more similar evenness estimates if you filter enough.
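To make the effect concrete, here is a minimal Python sketch, assuming Pielou's evenness (J = H / ln S) is the metric in question. The counts are invented purely for illustration (they are not your data and are not meant to reproduce the 0.4 and 0.8 values), and the function names and the `min_count` threshold are hypothetical. It compares evenness for a set of "real" features, for the same features plus many singleton OTUs, and again after a simple minimum-abundance filter.

```python
import math

def pielou_evenness(counts):
    """Pielou's J = H / ln(S), with Shannon H computed over nonzero counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    s = len(counts)
    if s < 2:
        # J is undefined for fewer than two features; 0.0 is a choice made here.
        return 0.0
    h = -sum((c / total) * math.log(c / total) for c in counts)
    return h / math.log(s)

def filter_min_count(counts, min_count):
    """Keep only features whose total count is at least min_count."""
    return [c for c in counts if c >= min_count]

# Invented example: 300 features with moderately uneven abundances
# (30 features each at 1000, 500, 333, ... down to 100 reads).
asv_like = [1000 // (i // 30 + 1) for i in range(300)]

# Same community plus 700 spurious singleton OTUs (assumed to be noise).
otu_like = asv_like + [1] * 700

# Hypothetical threshold; the right value depends on sequencing depth.
filtered = filter_min_count(otu_like, min_count=10)

print("features:", len(asv_like), len(otu_like), len(filtered))
print("evenness, ASV-like table:   ", round(pielou_evenness(asv_like), 3))
print("evenness, OTU-like table:   ", round(pielou_evenness(otu_like), 3))
print("evenness, filtered OTU table:", round(pielou_evenness(filtered), 3))
```

The singletons add very little to Shannon entropy but increase S from 300 to 1000, so the denominator ln(S) grows and J drops; removing them brings J back up. In QIIME 2, a comparable filter on a feature table would typically be done with `qiime feature-table filter-features` and its `--p-min-frequency` option, with the threshold chosen for your own data.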