Hi QIIMEers,
I am analyzing the gut microbiome of herptiles. The samples were spiked with specific fungal and bacterial species so that I can study fungal-bacterial interactions. I used the QIIME2 pipeline to analyze the 16S and ITS marker genes and initially denoised with DADA2, which yields ASVs. However, I ended up with many ASVs assigned to the spiked species (~300 for 16S and 12 for ITS). Because of this, I had to check pairwise phylogenetic distances and build several phylogenetic trees before merging the similar ones, calculating the spike-in scaling factors, and converting relative to absolute abundance.
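To make the merge-and-scale step concrete, here is a minimal sketch for a single sample. It assumes the scaling factor is the known number of spiked cells divided by the summed spike-in reads; that formula, the function name `merge_spike_and_scale`, and all IDs and numbers are hypothetical illustrations, not my real data or exact protocol.

```python
def merge_spike_and_scale(counts, spike_ids, spiked_cells):
    """Sum all ASVs assigned to the spike-in, then rescale the other ASVs.

    counts       -- dict mapping ASV ID to read count for one sample
    spike_ids    -- set of ASV IDs judged to belong to the spiked species
    spiked_cells -- known number of spike-in cells added (assumed input)
    """
    # Merge every spike-in ASV into a single read total.
    spike_reads = sum(n for asv, n in counts.items() if asv in spike_ids)
    if spike_reads == 0:
        raise ValueError("no spike-in reads found in this sample")

    # Assumed definition: cells represented by one read.
    scaling_factor = spiked_cells / spike_reads

    # Convert the non-spike ASVs from read counts to estimated absolute abundance.
    absolute = {
        asv: n * scaling_factor
        for asv, n in counts.items()
        if asv not in spike_ids
    }
    return scaling_factor, absolute


# Hypothetical sample: two native ASVs and two ASV variants of one spiked species.
counts = {"asv1": 600, "asv2": 300, "spike_a": 60, "spike_b": 40}
factor, absolute = merge_spike_and_scale(
    counts, {"spike_a", "spike_b"}, spiked_cells=1000
)
print(factor, absolute)
```

The key point is that the spike-in variants are summed before the factor is computed, so splitting one spiked species across several ASVs should not change the result as long as all of its variants are identified.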
To avoid these steps, I redid the analysis using VSEARCH to cluster into OTUs at 97% similarity. The results were substantially different. As expected, diversity was lower, but I did not expect the percentage of spiked species across the samples to change as well.
Here is what I did to achieve clustering to OTUs: I denoised the data using DADA2, then clustered the resulting representatives with VSEARCH, and finally assigned taxa to the clustered OTUs using the sklearn classifier.
With ASVs, the spiked percentage was acceptable across the spiked samples. With OTUs, however, more samples surprisingly failed the spiked-percentage criterion, which, based on Rao et al., 2021 ("Multi-kingdom ecological drivers of microbiota assembly in preterm infants", Nature), should be between 0.1% and 10% of the total reads per sample to be acceptable.
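The acceptance check I am describing can be sketched as follows. The 0.1% to 10% bounds are the ones cited above; the function names, sample IDs, and counts are hypothetical and only illustrate the calculation.

```python
def spike_percentage(counts, spike_ids):
    """Return the spike-in reads as a percentage of all reads in one sample."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    spike = sum(n for feature, n in counts.items() if feature in spike_ids)
    return 100.0 * spike / total


def passes_spike_qc(pct, low=0.1, high=10.0):
    """True if the spike-in percentage lies within the accepted range (inclusive)."""
    return low <= pct <= high


# Hypothetical feature tables (ASVs or OTUs) for three samples.
samples = {
    "S1": {"f1": 900, "spike": 100},   # 10% of reads
    "S2": {"f1": 500, "spike": 500},   # 50% of reads
    "S3": {"f1": 9995, "spike": 5},    # 0.05% of reads
}
spike_ids = {"spike"}

failed = [
    name
    for name, counts in samples.items()
    if not passes_spike_qc(spike_percentage(counts, spike_ids))
]
print(failed)
```

Running the same check on the ASV table and on the OTU table, with the same set of spike-in feature IDs in each, is how I compare which samples fail under each approach.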
So, my question is:
Should I switch to OTUs by clustering the representative sequences, or is it acceptable to keep the ASVs and merge only those assigned to the spiked species, even if they differ by 2 to 4 nucleotides, as I did before switching to OTUs?
Thank you so much in advance for your time and guidance.