Does DADA2 output differ between QIIME 2 versions?

Hi everyone,
After running DADA2 with the following command, I see clear differences in the final number of sequences reported in “denoising-stats.qzv” depending on the version of QIIME 2 (2019.4 or 2019.7). Is this caused by the version, or are there other possible reasons? (The biggest difference is in the non-chimeric column.)
The only difference between the two runs (besides the QIIME 2 version) is the set of samples I ran, although the numbers I’m comparing are for the same sample, which is present in both runs.
Thank you in advance! :slightly_smiling_face:

The command:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs DADA2_files/paired-end-demux.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 250 \
  --p-trunc-len-r 250 \
  --o-table DADA2_files/table_full.qza \
  --o-representative-sequences DADA2_files/rep-seqs.qza \
  --o-denoising-stats DADA2_files/denoising-stats.qza \
  --p-n-threads 24

The denoising-stats.qzv output for the same sample (input --> filtered --> denoised --> merged --> non-chimeric):
version 2019.4: 430534 --> 222817 --> 222817 --> 219351 --> 56466
version 2019.7: 430534 --> 222817 --> 221058 --> 217861 --> 44665
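
To see where the two runs actually diverge, it can help to look at the fraction of reads kept at each step instead of the raw counts. The following is a minimal sketch that uses only the numbers quoted above; the function names `step_retention` and `largest_gap` are made up for this illustration and are not part of QIIME 2 or DADA2.

```python
# Read counts copied from the denoising stats quoted above.
STEPS = ["input", "filtered", "denoised", "merged", "non-chimeric"]
COUNTS = {
    "2019.4": [430534, 222817, 222817, 219351, 56466],
    "2019.7": [430534, 222817, 221058, 217861, 44665],
}


def step_retention(counts):
    """Fraction of reads kept at each step, relative to the previous step."""
    return [after / before for before, after in zip(counts, counts[1:])]


def largest_gap(counts_a, counts_b):
    """Name of the step where the two runs' retention differs the most."""
    ret_a, ret_b = step_retention(counts_a), step_retention(counts_b)
    gaps = [abs(a - b) for a, b in zip(ret_a, ret_b)]
    # gaps[0] belongs to STEPS[1] ("filtered"), hence the offset of 1.
    return STEPS[1 + gaps.index(max(gaps))]


if __name__ == "__main__":
    for version, counts in COUNTS.items():
        summary = ", ".join(
            f"{step}: {100 * r:.1f}%"
            for step, r in zip(STEPS[1:], step_retention(counts))
        )
        print(version, "->", summary)
    print("largest gap at:", largest_gap(COUNTS["2019.4"], COUNTS["2019.7"]))
```

Comparing per-step retention makes it easier to tell whether a difference first appears at denoising, merging, or chimera removal.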


Hi @pau,
Welcome to the forum!
If I understand correctly, you have done two DADA2 runs with identical parameters. The runs contain different samples but happen to share some duplicate samples, and those duplicate samples are being denoised slightly differently between the two QIIME 2 versions. Right?
I don’t think anything changed in q2-dada2 between those two QIIME 2 versions (if I’m wrong, one of the devs can correct me). What is likely causing this ‘slight’ change is that the two runs contain different samples, so the training subset of reads used to train the error model is being drawn from different samples. This is why you begin to see slight changes at the ‘denoising’ step in your stats output, and all the downstream steps are affected a bit as a result. In your case I don’t see this as an issue, as you are still getting lots of reads, but yes, there will be some slight changes from run to run if you change the samples included.
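
As a rough illustration of why the sample set matters, here is a toy sketch (not DADA2's actual code) of a training step that takes reads from samples in order until a read budget is reached. The sample IDs, read counts, budget, and the function `training_samples` are all hypothetical; in q2-dada2 the number of training reads is controlled by `--p-n-reads-learn`.

```python
def training_samples(read_counts, budget):
    """Return the samples a read-budgeted training step would draw from.

    read_counts: list of (sample_id, n_reads) in processing order.
    budget: stop adding samples once this many reads have been collected.
    This is a simplified stand-in for illustration only.
    """
    chosen, total = [], 0
    for sample_id, n_reads in read_counts:
        if total >= budget:
            break
        chosen.append(sample_id)
        total += n_reads
    return chosen


# Hypothetical runs: S1 and S2 appear in both, but the other samples differ.
run_a = [("S1", 400_000), ("S2", 400_000), ("S3", 400_000)]
run_b = [("S0", 600_000), ("S1", 400_000), ("S2", 400_000)]

if __name__ == "__main__":
    print("run A trains on:", training_samples(run_a, 1_000_000))
    print("run B trains on:", training_samples(run_b, 1_000_000))
```

Under these assumptions the two runs learn their error model from different reads, so a sample shared by both runs can still be denoised slightly differently in each.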
If this is a concern for your analysis, and you have a setup where you are routinely adding samples, Deblur might be a good option for you. Deblur uses a static error model, so regardless of which samples you put into it, it should behave exactly the same.
Hope this clarifies it.


Thank you very much for your answer!

