I am working on 16S rRNA data from 25 samples collected from dead standing trees that had either white-rot or brown-rot fungi growing on them.
I did not see any significant difference in alpha diversity between the two categories. Beta diversity, however, did show some significance, and the PCoA plot showed a separation between the groups, although not a very clear one. The same samples were used by someone else for ITS-based analyses, which found that 8 of the 25 samples do not contain any ITS sequences representing the above-mentioned brown- or white-rot fungi. I therefore removed those 8 samples from my analyses altogether to see whether the 16S data would become less fuzzy and a bit more revealing.
I did the following:
1. Removed the raw reads of all 8 samples and re-ran the whole analysis on the remaining 17 samples.
The statistical significance of the alpha diversity differences did not change much. To my surprise, however, the beta diversity result changed: it was no longer statistically significant at all, and the PCoA plot was all over the place.
2. Instead of removing the raw reads and redoing the demux and DADA2 steps, I made a new metadata file containing only the 17 samples and ran the downstream analyses on the table.qza that represents all 25 samples (i.e. I took the table.qza generated by the DADA2 step on all 25 samples and filtered it so that only the 17 samples remained for the downstream analyses).
Here the alpha diversity significance again did not change, but the beta diversity was significant. The PCoA plot showed a very clear separation of the groups, and the data appeared good enough to show some clear differences.
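To make approach 2 concrete, the filtering step amounts to roughly the following. This is only a minimal Python sketch on a toy table, not the actual command I ran: the sample IDs, feature IDs, counts and the `filter_samples` helper are all made up for illustration, and dropping features that end up with zero reads is my assumption about what a sensible filter would do.

```python
from typing import Dict, Iterable

# Toy feature table: sample ID -> {feature ID -> read count}.
# All IDs and counts are invented for illustration.
Table = Dict[str, Dict[str, int]]


def filter_samples(table: Table, keep_ids: Iterable[str]) -> Table:
    """Keep only the samples in keep_ids, then drop features whose
    total count across the retained samples is zero."""
    keep = set(keep_ids)
    kept = {s: dict(counts) for s, counts in table.items() if s in keep}

    # Total reads per feature across the retained samples only.
    totals: Dict[str, int] = {}
    for counts in kept.values():
        for feature, n in counts.items():
            totals[feature] = totals.get(feature, 0) + n

    return {
        s: {f: n for f, n in counts.items() if totals[f] > 0}
        for s, counts in kept.items()
    }


full_table: Table = {
    "brown_1": {"ASV_a": 10, "ASV_b": 0, "ASV_c": 3},
    "white_1": {"ASV_a": 0, "ASV_b": 7, "ASV_c": 0},
    "white_2": {"ASV_a": 2, "ASV_b": 0, "ASV_c": 5},
}

# Hypothetical reduced metadata: white_1 stands in for an excluded sample.
retained_ids = ["brown_1", "white_2"]

filtered = filter_samples(full_table, retained_ids)
print(sorted(filtered))             # sample IDs that remain
print(sorted(filtered["brown_1"]))  # feature IDs that remain
```

The point of the sketch is that in approach 2 the features were still inferred by DADA2 from all 25 samples and only the table is subset afterwards, whereas in approach 1 DADA2 itself only ever saw the 17 samples.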
Note: the 25 samples comprised 12 brown-rot and 13 white-rot samples. After removing the 8 samples, the remaining 17 comprise 10 brown-rot and 7 white-rot samples. The DNA extractions, experimental design, sample collection, etc. were done by someone else; I only received the .fastq files of the sequences.
Which method is scientifically correct? What would be your advice to make sense of the data in the best possible way?
Thank you all,