Dada2 error rate plots to compare accuracy of different calibrations

Lpoehlein · October 18, 2024, 12:57am

Hello all,

I am running into an issue while trying to compare the accuracy of different denoising runs using DADA2. I ran three configurations: a traditional DADA2 run; one with "pseudo" pooling instead of the default "independent"; and one with pseudo pooling plus the number of reads the error model learns from set to 15% of the demux frequency, the midpoint of the recommended 10-20% range. The lines I added for those are below:

--p-n-reads-learn 1340000 \
--p-pooling-method pseudo \
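As a minimal sketch of how the 15% figure could be turned into a value for `--p-n-reads-learn`: the total read count below is a hypothetical placeholder (not the value from this run), and the variable names are made up for illustration. In practice the total would come from your own demux summary.

```shell
#!/usr/bin/env bash
# Hypothetical total number of demultiplexed reads (placeholder value);
# replace it with the total reported by your own demux summary.
total_reads=1000000

# Percentage of reads the error model should learn from (15% in the post above).
learn_percent=15

# Integer arithmetic: number of reads to pass to --p-n-reads-learn.
n_reads_learn=$(( total_reads * learn_percent / 100 ))

# Print the two flags so they can be pasted into a denoise command.
printf '%s\n' \
  "--p-pooling-method pseudo \\" \
  "--p-n-reads-learn ${n_reads_learn}"
```

With the placeholder total of 1000000 reads, this prints the pooling flag followed by `--p-n-reads-learn 150000`.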

While these ran fine, my goal was to compare their error rate plots to see which run had the best fit between the observed and estimated error rates, and so whether these calibrations improve the resolution of the denoising step and the accuracy of the calls used for downstream taxonomy. Using the --verbose flag to create a log did not capture any data about the error rates, and extracting the denoising stats qza file did not reveal any saved error rate information either.

Has anyone been able to obtain this data (and save it to a file) using qiime2? From what I have read, this is possible by running dada2 in R, but I am looking for a workaround in qiime2. That said, if anyone has done this in R and has graphs they were able to visualize, I would love to see them and learn how. If anyone is curious, the link below shows the error rate plots I am referring to, about a third of the way down the page. Thanks.
[Metagenomics - Bioinformatics Workbook]

colinbrislawn · October 18, 2024, 1:06am

Welcome to the forums, Lance,

I have great news! This functionality is being added to the DADA2 plugin!

github.com/qiime2/q2-dada2

Error model output

qiime2:dev ← jordenrabasco:error_model_output · opened by jordenrabasco, 07:39PM - 22 Sep 24 UTC · +6891 -31

This pull request is to resolve issue #158.

Big changes:
- Q2-DADA2 'denoise-' commands output a collection[DADA2stats] rather than a DADA2STATS object
- New Q2-DADA2 action stats_viz will visualize all DADA2stats in the collection[DADA2stats] (denoised stats and error model stats) as a single visualization with a separate tab for each DADA2STATS object
- Tests were updated to accommodate the new output type
- Tests were added for the visualization to make sure that all support files are generated

For now, you would have to do some technical stuff to try it (q2-dev conda env, git pull from PR, etc.)

Soon, this will be included with the amplicon distribution!