Alpha diversity metrics change after dada2/deblur

pau · November 17, 2020, 10:26am

Dear all,
I recently ran into an issue while running two otherwise identical pipelines on the same sample groups, changing only the denoising step (dada2 vs. deblur).

For sample groups 1, 2 and 3 I ran either dada2 or deblur with the commands below (as suggested in the q2 tutorials). For deblur I ran:

qiime quality-filter q-score \
  --i-demux demux.qza \
  --o-filtered-sequences deblur_files/demux-filtered.qza \
  --o-filter-stats deblur_files/demux-filter-stats.qza

qiime deblur denoise-16S \
  --i-demultiplexed-seqs deblur_files/demux-filtered.qza \
  --p-trim-length 240 \
  --p-left-trim-len 5 \
  --p-jobs-to-start 16 \
  --p-sample-stats \
  --o-table deblur_files/table-deblur.qza \
  --o-representative-sequences deblur_files/rep-seqs-deblur.qza \
  --o-stats deblur_files/denoising-stats-deblur.qza

And for dada2 (I did this three times, once per group, since the groups came from three separate sequencing runs; I then merged the tables and rep-seqs):

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-B.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 5 \
  --p-trunc-len-f 250 \
  --p-trunc-len-r 240 \
  --o-table dada2-files/table-B.qza \
  --o-representative-sequences dada2-files/rep-seqs-B.qza \
  --o-denoising-stats dada2-files/denoising-stats-B.qza \
  --p-n-threads 24

For both options I followed the same downstream analysis to compute the alpha diversity metrics Observed OTUs and Chao index, and I observed significant differences depending on which denoising method was used (the example here is Chao, but the situation is the same with Observed OTUs). The richness of group 2 (in the middle) changes a lot, as does its p-value compared to the other groups. In the image, the top panel is the Chao index computed from the deblur files and the bottom panel is from the dada2 files.

On the other hand, when estimating alpha diversity with Shannon index (taking into account evenness) the situation is the same in both cases, so there's no problem.
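
To illustrate why Shannon can stay stable while richness metrics shift, here is a minimal Python sketch. The counts are hypothetical (not taken from this dataset), and the two helper functions are written for illustration only; they are not the QIIME 2 implementation. The idea is that dropping many low-abundance features changes the number of observed features a lot, but changes Shannon much less, because rare features carry little weight in the entropy.

```python
import math

def observed_features(counts):
    """Number of features with a non-zero count."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon entropy in bits (log base 2) of a vector of feature counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Hypothetical sample: 5 dominant features plus 20 rare ones (2 reads each).
dominant = [400, 300, 150, 100, 50]
with_rare = dominant + [2] * 20
without_rare = dominant

for label, counts in [("with rare", with_rare), ("without rare", without_rare)]:
    print(label, observed_features(counts), round(shannon(counts), 3))
```

In this toy case the observed richness drops five-fold when the rare features are removed, while Shannon changes by well under one bit.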

I also checked both sets of denoising stats and saw that samples from groups 1 and 3 had more reads after deblur than after dada2 (which I suppose is normal), but group 2 samples had more reads after dada2 than after deblur (I don't know whether this could cause the changes).

Any idea about what could be going on? I hope everything is clearly explained!
Thanks a lot for your support!

llenzi · November 17, 2020, 12:04pm

Hi @pau

Just a question: did you merge the reads before using deblur? If not, you denoised the forward reads only, which makes the comparison quite unfair in my view! In that case, please look at the 'Alternative methods of read-joining' tutorial (Alternative methods of read-joining in QIIME 2 — QIIME 2 2020.8.0 documentation).

Also, as a side note, I tend not to use the Chao1 index because it relies on singleton and doubleton ('doublet') counts, which may not be reliable given the way dada2 filters sequences out (for deblur, I am not sure). I am happy to hear other views on this!
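
As a rough illustration of that dependence, the sketch below computes the bias-corrected form of Chao1, S_obs + F1(F1-1) / (2(F2+1)), where F1 and F2 are the numbers of features seen exactly once and exactly twice. The `chao1` helper and the counts are hypothetical and written only for this example; this is not the code QIIME 2 runs, and whether a given denoiser actually removes singletons from a sample is an assumption here.

```python
def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1*(F1-1) / (2*(F2+1))."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical feature counts for one sample.
counts = [120, 80, 40, 10, 5, 2, 2, 1, 1, 1, 1]
# The same sample after a (hypothetical) filter that drops every singleton.
no_singletons = [c for c in counts if c != 1]

print(chao1(counts))         # observed richness plus a correction from F1 and F2
print(chao1(no_singletons))  # no singletons left, so Chao1 equals observed richness
```

When no singletons remain, the correction term is zero and Chao1 collapses to the observed feature count, which is one reason the estimate can behave differently across denoisers.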
Hope it helps
PS: I changed the category to 'user support' because it seems more appropriate to me!

pau · November 18, 2020, 1:56pm

Hi @llenzi , thanks for your quick answer!
That may well be what is happening, as I did not merge the reads. I have just started a new analysis that does so, so let's see if the results become more similar!
As I am quite new to deblur, would you recommend anything else apart from deblur denoising? Maybe chimera filtering, for example with uchime (vsearch)? I think de novo wouldn't be necessary, but perhaps uchime-ref after deblur (uchime-ref: Reference-based chimera filtering with vsearch. — QIIME 2 2020.8.0 documentation). Or another step?
And about qiime quality-filter q-score: the moving pictures tutorial recommends using it with default parameters, but isn't a PHRED score of 4 a very low value (compared to dada2)?
Thanks a lot for your help!!

llenzi · November 19, 2020, 9:13am

Hi @pau,
I don't use deblur much myself, but I would start by following the tutorial linked above. Chimera filtering should already be included as one of the deblur steps (as it is within dada2), so it should not be necessary. There are many interesting discussions on the deblur statistics (Deblur stats.qzv file meaning and interpretation - #3 by wasade) and on comparing dada2 and deblur results (Different taxa result from DADA2 and Deblur), so please also search the forum for more information.

On the quality filter, from the manual: the default minimum quality for quality-filter q-score is 4 (q-score: Quality filter based on sequence quality scores. — QIIME 2 2020.8.0 documentation), while in dada2 denoise-single the default truncation quality is 2 (denoise-single: Denoise and dereplicate single-end sequences — QIIME 2 2020.8.0 documentation). So the dada2 value seems even lower to me!
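
To put those two thresholds in perspective, a PHRED score Q corresponds to a per-base error probability of 10^(-Q/10). The short Python sketch below only applies that standard formula; the helper name is made up for this example, and since the two tools use their thresholds in different steps, this is a rough comparison only.

```python
def phred_to_error_prob(q):
    """Probability that a base call is wrong for PHRED score q: 10 ** (-q / 10)."""
    return 10 ** (-q / 10)

# Compare the two defaults discussed here with more typical "good" scores.
for q in (2, 4, 20, 30):
    print(q, round(phred_to_error_prob(q), 4))
```

Both 2 and 4 correspond to very high error probabilities (roughly 0.63 and 0.40), so both defaults act as a floor that only cuts clearly unreliable bases rather than as a strict quality cutoff.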

For the comparison, please consider that deblur pre-filters the sequences to exclude anything not loosely similar to 16S, so if you have other species in the samples, they will be included in the alpha diversity results from dada2 but not in those from deblur. That may be misleading too!
Hope it helps
