Denoising and Clustering from Tutorials

The tutorial says:

Denoising and clustering

Congratulations on getting this far! Denoising and clustering steps are slightly less confusing than importing and demultiplexing!

The names for these steps are very descriptive:

  1. We denoise our sequences to remove and/or correct noisy reads.
  2. We dereplicate our sequences to reduce repetition and file size/memory requirements in downstream steps (don’t worry! we keep count of each replicate).
  3. We cluster sequences to collapse similar sequences (e.g., those that are ≥ 97% similar to each other) into single replicate sequences. This process, also known as "OTU picking", was once a common procedure, used to simultaneously dereplicate but also perform a sort of quick-and-dirty denoising procedure (to capture stochastic sequencing and PCR errors, which should be rare and similar to more abundant centroid sequences). Use denoising methods instead if you can. Times have changed. Welcome to the future.
    (from "Overview of QIIME 2 Plugin Workflows", QIIME 2 2021.11.0 documentation)

The bold section is not clear to me.
My understanding is: "OTU picking performs a kind of denoising that captures stochastic sequencing and PCR errors, and these errors produce sequences that are rare but similar to the more abundant centroid sequences."
Please let me know whether I have understood this correctly.



I think this video from one of our workshops does a great job of explaining this. Watch it, and if it does not answer your question, come back here and I will try to answer any further questions you have 🙂
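
In the meantime, the idea in the quoted passage can be sketched in a few lines of Python. This is only a toy illustration, not what QIIME 2 or any of its plugins actually runs: the function names (`dereplicate`, `identity`, `greedy_cluster`) and the example reads are made up, and the similarity measure is a simple position-by-position comparison of equal-length sequences, whereas real tools use sequence alignment. It shows dereplication (collapse identical reads, keep counts) followed by greedy clustering at 97%, where a rare error variant is absorbed by the abundant centroid it resembles.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical reads, keeping a count of each unique sequence."""
    return Counter(reads)


def identity(a, b):
    """Fraction of matching positions (toy measure: equal lengths, no alignment)."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(counts, threshold=0.97):
    """Visit sequences from most to least abundant. Each one joins the first
    centroid it matches at >= threshold, otherwise it becomes a new centroid."""
    clusters = {}  # centroid sequence -> total read count
    for seq, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for centroid in clusters:
            if identity(seq, centroid) >= threshold:
                clusters[centroid] += n
                break
        else:
            clusters[seq] = n
    return clusters


# Hypothetical data: two real sequences plus a rare one-base error of the first.
true_a = "ACGT" * 10            # 40 nt, abundant
error_a = true_a[:-1] + "A"     # 1 mismatch in 40 nt = 97.5% identical
true_b = "TTGCA" * 8            # 40 nt, unrelated to true_a

reads = [true_a] * 50 + [error_a] * 2 + [true_b] * 30

counts = dereplicate(reads)
clusters = greedy_cluster(counts)

print(len(counts), "unique sequences after dereplication")
print(len(clusters), "clusters after 97% clustering")
```

Here dereplication leaves three unique sequences, and clustering folds the two error reads into `true_a`, which ends up with 52 reads. That is the "quick-and-dirty denoising" the passage describes: the error is rare and similar to an abundant centroid, so it gets merged. The cost is that a genuinely different organism that is 97.5% similar would be merged in exactly the same way, which is one reason denoising methods are now preferred.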
