I’m interested in your thoughts.
Originally, I had done all of my microbiome analysis with mitochondria, chloroplasts, and contaminant sequences removed.
However, after doing my analysis, I decided that I also wanted to filter my table to remove any features that were not annotated to at least the phylum level. I then re-did the analysis and re-computed the diversity metrics.
I am now concerned about how this has changed my diversity metrics, and I’m wondering what your thoughts on this process are. For example, previously I had a mean Jaccard distance of 0.95 across all samples, and now I have a mean Jaccard distance of 0.68. I understand that features that are not annotated at the phylum level are not very informative taxonomically, but I’m unclear whether these features are adding resolution or adding noise. In other words: which set of diversity metrics is more reliable?
In general, the comparisons that were significant before are still significant now, but one of my key takeaways from my original analysis was that my samples had a very high Jaccard distance. By filtering out these reads, am I artificially changing the diversity metrics?
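To make the concern concrete, here is a minimal sketch of what I think is happening, using made-up toy data rather than my real table. The sample names, the feature IDs, and the `u_` prefix marking features without a phylum annotation are all hypothetical. If the unannotated features are mostly unique to individual samples, removing them should lower the mean pairwise Jaccard distance, which is the direction of the change I see.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets of observed feature IDs."""
    union = a | b
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_jaccard(samples):
    """Mean Jaccard distance over all unordered pairs of samples."""
    pairs = list(combinations(samples.values(), 2))
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical presence/absence data: "p_" features have a phylum
# annotation, "u_" features do not (and here each is unique to one sample).
samples = {
    "S1": {"p_A", "p_B", "u_1", "u_2"},
    "S2": {"p_A", "p_B", "u_3", "u_4"},
    "S3": {"p_A", "p_C", "u_5", "u_6"},
}

# Drop every feature lacking a phylum-level annotation.
filtered = {
    name: {f for f in feats if not f.startswith("u_")}
    for name, feats in samples.items()
}

before = mean_pairwise_jaccard(samples)
after = mean_pairwise_jaccard(filtered)
print(f"mean Jaccard before filtering: {before:.3f}")
print(f"mean Jaccard after filtering:  {after:.3f}")
```

In this toy case the drop comes entirely from removing sample-specific features, so the sketch cannot say whether those features are real biology or noise. It only shows that a change of this kind is expected whenever the removed features are rarely shared between samples.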
Thanks so much!