I am working on a 16S dataset from 102 soil samples collected at 4 different sites. All of them are forward reads targeting the V3 region.
Quality drops from around base 260, so I decided to evaluate whether trimming reads at position 260 would improve the diversity analyses.
Here you can find a QC file:
reads.soil.QC.qzv (288.5 KB)
I ran denoising (DADA2) twice, once trimming at position 260 and once without trimming, both with --p-max-ee 1.5.
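For reference, here is a minimal sketch of how the expected-error filter behind --p-max-ee interacts with trimming, assuming the usual definition EE = sum(10^(-Q/10)) over the retained bases. The function names and the example read are hypothetical, and this is not the DADA2 code itself.

```python
def expected_errors(phred_scores, trunc_len=0):
    """Sum of per-base error probabilities, EE = sum(10^(-Q/10)).

    trunc_len=0 means "no truncation" (assumed convention for this sketch).
    """
    scores = phred_scores[:trunc_len] if trunc_len > 0 else phred_scores
    return sum(10 ** (-q / 10) for q in scores)


def passes_max_ee(phred_scores, max_ee=1.5, trunc_len=0):
    """Return True if the read would survive the expected-error filter."""
    # Reads shorter than the truncation length are discarded outright.
    if trunc_len > 0 and len(phred_scores) < trunc_len:
        return False
    return expected_errors(phred_scores, trunc_len) <= max_ee


# Hypothetical read: 260 good bases (Q35) followed by a 40-base low-quality tail (Q12).
read = [35] * 260 + [12] * 40

print("EE untrimmed:", expected_errors(read))
print("EE trimmed at 260:", expected_errors(read, trunc_len=260))
print("passes untrimmed:", passes_max_ee(read))
print("passes trimmed:", passes_max_ee(read, trunc_len=260))
```

With the same --p-max-ee value, the two runs can therefore retain different sets of reads, because the low-quality tail counts toward the expected errors only when it is kept.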
I then ran core-metrics-diversity and collected the alpha and beta diversity results, and this is where I would like some advice.
The number of features dropped from 9806 to 8315, which I don’t consider an issue. Nevertheless, the alpha diversity outputs diverge quite a bit for the Shannon index and number_of_observed_features: both rose after trimming.
Indeed, in my experience Shannon indexes from soil samples usually range from 6.5 to 8.5-9 rather than from 7.6 to 9.7, and those intervals match the untrimmed and trimmed datasets here, respectively.
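To make the two metrics concrete, here is a minimal sketch of how they can be computed from one sample's feature counts. The log base 2 is an assumption on my side (it is consistent with the value ranges above), and the function names and counts are hypothetical.

```python
import math


def shannon(counts, base=2):
    """Shannon index H = -sum(p_i * log(p_i)) over features with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


def observed_features(counts):
    """Number of features with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


# Hypothetical feature counts for a single sample.
sample = [120, 80, 40, 0, 10, 5, 0, 1]

print("Shannon:", shannon(sample))
print("Observed features:", observed_features(sample))
```

Both values depend only on the per-sample count vector after rarefaction, so any change in which reads survive denoising feeds directly into them.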
Here is a depiction of a few samples:
Beta diversity differences are strongly significant after PERMANOVA in both cases, and the PCoA plots of weighted UniFrac and Bray-Curtis dissimilarities also look similar for trimmed and untrimmed reads, so I don’t think trimming affects them.
However, there are differences in the adonis output.
Disclaimer: I don’t yet have access to the chemical measurements of soil minerals, so I simulated values for a factor called “measurement” and tested beta diversity against it.
I would expect the significance to be similar, but there are huge differences in the p-value and R2, as well as in the sums of squares…
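For anyone wanting to reason about these numbers, here is a minimal sketch of how I understand R2, the sums of squares, and the permutation p-value to be derived from a distance matrix and a single continuous factor. It is a simplified, pure-Python illustration and not the adonis implementation. All function names are hypothetical, and it assumes a symmetric distance matrix and one covariate only.

```python
import random


def gower_center(dist):
    """Return G = -1/2 * J * D^2 * J for a symmetric square distance matrix."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_means = [sum(row) / n for row in a]
    grand_mean = sum(row_means) / n
    # Symmetry means column means equal row means.
    return [[a[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def r2_single_covariate(dist, x):
    """Return (R2, SS_model, SS_total) for one continuous covariate x."""
    n = len(x)
    g = gower_center(dist)
    mean_x = sum(x) / n
    xc = [v - mean_x for v in x]  # centering x accounts for the intercept
    ss_total = sum(g[i][i] for i in range(n))  # trace of G
    ss_model = (sum(xc[i] * g[i][j] * xc[j] for i in range(n) for j in range(n))
                / sum(v * v for v in xc))
    return ss_model / ss_total, ss_model, ss_total


def permutation_p(dist, x, permutations=999, seed=0):
    """Permutation p-value: share of shuffled x with R2 >= the observed R2.

    With a single covariate, ranking by R2 is equivalent to ranking by pseudo-F.
    """
    rng = random.Random(seed)
    observed = r2_single_covariate(dist, x)[0]
    shuffled = list(x)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if r2_single_covariate(dist, shuffled)[0] >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)


# Hypothetical example: 6 samples on a line, factor equal to their position.
coords = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
dist = [[abs(a - b) for b in coords] for a in coords]
r2, ss_model, ss_total = r2_single_covariate(dist, coords)
print("R2:", r2, "SS model:", ss_model, "SS total:", ss_total)
print("p:", permutation_p(dist, coords))
```

In this sketch the total sum of squares comes from the distance matrix alone, while the model sum of squares and the p-value also depend on the factor values and on the permutations drawn.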
So, could someone be so kind as to help me understand why diversity differs so much after trimming the reads? Also, does it sound reliable to keep working with the trimmed reads, given the quality improvement, despite those differences?