I am currently trying to compare DADA2 with the OTU clustering methods. As we know, chimera removal is already included in DADA2, while for the OTU clustering method we have to do it 'manually'.
Therefore, I would like to get chimera statistics similar to what DADA2 provides, like below:
Hi @thermokarst - yes, I did that one as well, but it doesn't produce statistics like the ones from DADA2. The stats.qza file from uchime-denovo does not summarize the non-chimeric reads for each sample. I would like to find out the total number of non-chimeric reads across all samples.
I am thinking of using the value from table-nonchimeric-wo-borderline.qza as the number of non-chimeric reads.
However, it is reported as a feature count and not a read count, so I don't think that is correct, is it?
QIIME 2 doesn't have a way to map which features came from which reads ("OTUMap"), so answering this question isn't possible. Is it necessary, or can you get a good sense of what you're looking for based on the features?
My purpose is to compare the chimera removal step in the DADA2 and OTU clustering pipelines.
What I am trying to say is that the DADA2 pipeline already includes chimera removal, right? And at the end of this step, we can get stats-dada2.qza by running this:
From the stats-dada2.qza we can obtain information such as total non-chimeric (just like in the image in the first post).
Therefore, I would like to recreate this information using the OTU clustering pipeline.
For the OTU clustering method, we have to do dereplicate/deblur > otu filtering > chimera filtering > abundance filtering, right? However, neither the chimera filtering step nor the abundance filtering step provides any stats on what remains after chimera filtering, so we are unable to get the same information that stats-dada2.qza provides.
I am wondering if it is possible to obtain such information from the OTU clustering method.
Hope this makes sense. Thanks again.
Hi @afrinaad - as I said above, no, this isn't possible: there are no mechanisms currently in place for tracing the features back to their original reads. If you want to compare, you could compute the percentage of merged non-chimeric reads from the DADA2 results and compare it directly to the OTU clustering results.
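For what it's worth, here is a rough sketch of that percentage comparison in plain Python. It rests on several assumptions: that the DADA2 stats have been exported to a tab-separated file with `merged` and `non-chimeric` columns (check the column names in your own export), and that for the OTU pipeline you read the total frequency of the feature table before and after chimera filtering from the table summaries yourself. All numbers and helper names below are made up for illustration, and whether total frequency is a fair stand-in for reads is exactly the caveat discussed above.

```python
import csv
import io

# Made-up numbers, for illustration only. The column names are an assumption
# about what an exported DADA2 stats table looks like; check your own file.
DADA2_STATS_TSV = (
    "sample-id\tinput\tfiltered\tdenoised\tmerged\tnon-chimeric\n"
    "sampleA\t1000\t900\t880\t800\t760\n"
    "sampleB\t2000\t1800\t1750\t1600\t1200\n"
)


def percent_retained(before, after):
    """Return `after` as a percentage of `before` (0.0 if `before` is 0)."""
    return 100.0 * after / before if before else 0.0


def dada2_nonchimeric_summary(tsv_text):
    """Sum 'merged' and 'non-chimeric' over all samples.

    Returns (total_merged, total_nonchimeric, percent_nonchimeric_of_merged).
    """
    total_merged = 0
    total_nonchimeric = 0
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Skip comment/directive rows such as a '#q2:types' line, if present.
        if row["sample-id"].startswith("#"):
            continue
        total_merged += int(row["merged"])
        total_nonchimeric += int(row["non-chimeric"])
    pct = percent_retained(total_merged, total_nonchimeric)
    return total_merged, total_nonchimeric, pct


# DADA2 side: share of merged reads that survive chimera removal.
merged, nonchimeric, dada2_pct = dada2_nonchimeric_summary(DADA2_STATS_TSV)

# OTU side: hypothetical total frequencies read off the feature-table
# summaries before and after chimera filtering (values are placeholders).
otu_pct = percent_retained(before=5000, after=4500)

print(f"DADA2: {nonchimeric}/{merged} merged reads kept ({dada2_pct:.1f}%)")
print(f"OTU clustering: {otu_pct:.1f}% of total frequency kept")
```

The two percentages are only roughly comparable, since the pipelines count different things at different stages, but they give a sense of how aggressive each chimera removal step is.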