I am analysing a dataset generated from amplicon sequencing of a DNA region corresponding to viral genes. More specifically, the PCR targeted the major capsid protein gene of dinornaviruses found in corals and their associated algae.
A previous study (https://doi.org/10.3389/fmicb.2017.01665) showed that the diversity in such data can be huge, and that clustering sequences at a lower similarity cutoff (below 100% or 98%; the paper in question also used 65% similarity) could be more meaningful.
I would therefore like to ask whether it is possible to use different cutoffs for the ASV clustering while processing the sequences in QIIME 2.
I think this is a great idea! And this would be a great way to make your modern analysis comparable with past papers.
So… sort of. ASVs don’t have cutoffs, as the denoising methods are designed to resolve as much diversity as possible.
The recommended way to do this is to denoise the reads into ASVs first, say with DADA2, and then perform de novo OTU clustering on those ASVs at various identity levels to match previous papers, say with vsearch. This gives you both the highest possible resolution and historically consistent OTU cutoffs.
I’m not sure if there is a tutorial for this, but the workflow should be:
reads
  |
  v
preprocessing
  |
  v
ASVs --> ASVs
  |----> 99% OTUs
  |----> 97% OTUs
  |----> 90% OTUs
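
To make the branching step of the workflow above more concrete, here is a minimal sketch in Python that only builds (and prints) one `qiime vsearch cluster-features-de-novo` command per identity threshold. The file names `table.qza` and `rep-seqs.qza` are assumptions standing in for whatever your denoising step produced, the output names are made up for illustration, and the helper `build_cluster_commands` is hypothetical, not part of QIIME 2.

```python
import shlex


def build_cluster_commands(table, rep_seqs, identities=(0.99, 0.97, 0.90)):
    """Build one de novo OTU clustering command per identity threshold.

    `table` and `rep_seqs` are paths to the ASV feature table and the ASV
    representative sequences (assumed names; use your own denoising outputs).
    Nothing is executed here; the commands are returned as argument lists.
    """
    commands = []
    for identity in identities:
        if not 0.0 < identity <= 1.0:
            raise ValueError(f"identity must be in (0, 1], got {identity}")
        # Label used only to keep the output files apart, e.g. "97" for 0.97.
        label = str(round(identity * 100))
        commands.append([
            "qiime", "vsearch", "cluster-features-de-novo",
            "--i-table", table,
            "--i-sequences", rep_seqs,
            "--p-perc-identity", str(identity),
            "--o-clustered-table", f"table-dn-{label}.qza",
            "--o-clustered-sequences", f"rep-seqs-dn-{label}.qza",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them in a shell.
    for cmd in build_cluster_commands("table.qza", "rep-seqs.qza"):
        print(shlex.join(cmd))
```

Each clustering run starts from the same ASV table and sequences, so the ASV-level results stay untouched and every OTU level is a separate pair of output artifacts. Other thresholds, such as the lower cutoffs used in the earlier paper, can be passed through the `identities` argument in the same way.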