Hello, I followed your tutorial and analyzed my data. I am studying the bacteria associated with a forest tree and a beetle, using qiime2-2020.2.
After reviewing the taxa bar plot and the data structure (mitochondria, unnecessary samples, etc.), I decided to clean my table.qza and rep-seqs.qza (commands below) so that I can run a new analysis on the filtered data, starting from the ‘tree generation for phylogenetic diversity analysis’ step.
Input: Feature table tableJ.qza and Representative sequences rep-seqsJ.qza
One difference I noticed is that my new feature table, “tableJ_idfilter_138.qza”, is not interactive, so I cannot directly assess the sampling depth (see screenshot).
My question is whether I should now set a low sampling depth (e.g. 100) in qiime diversity core-metrics-phylogenetic, since I want to keep all the samples. In the first analysis I set the sampling depth to 1164, and I am afraid that if I use this value now, some samples will be discarded.
Is this how you usually proceed to analyze the alpha and beta diversity of bacteria only, without bias from other organisms or organelles?
Sounds good! This is a common problem, especially when looking at host-associated communities, and you are taking the right approach.
You should use qiime feature-table summarize to get that information. It looks like you used qiime metadata tabulate, which shows you everything (too much in this case; what you need is a summary).
Use the summary visualization to select an appropriate sampling depth based on the filtered data. You could also use alpha rarefaction or another method to select an appropriate depth.
Yes, by filtering the way you have done. I would then select a sampling depth in the same way (as described above), whether or not I had done such filtering.
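To make the trade-off concrete, here is a minimal Python sketch of the reasoning behind picking a depth. It is not part of QIIME 2: the sample IDs and read counts are made up for illustration, and the helper names samples_retained and depth_report are hypothetical. In practice you would read the per-sample frequencies from the summary visualization of your filtered table.

```python
# Hypothetical per-sample frequencies (total reads per sample after filtering).
# These numbers are invented for illustration; take the real ones from the
# feature-table summary of your own filtered table.
sample_frequencies = {
    "tree_01": 5200,
    "tree_02": 1164,
    "beetle_01": 980,
    "beetle_02": 3100,
    "beetle_03": 150,
}


def samples_retained(frequencies, depth):
    """Return the sorted IDs of samples that have at least `depth` reads."""
    return sorted(s for s, n in frequencies.items() if n >= depth)


def depth_report(frequencies, candidate_depths):
    """For each candidate depth, return (depth, samples kept, total reads kept)."""
    rows = []
    for depth in candidate_depths:
        kept = samples_retained(frequencies, depth)
        # After rarefying, every retained sample contributes exactly `depth` reads.
        rows.append((depth, len(kept), depth * len(kept)))
    return rows


for depth, n_kept, reads_kept in depth_report(sample_frequencies, [100, 980, 1164]):
    print(f"depth={depth}: {n_kept} samples kept, {reads_kept} reads kept in total")
```

A very low depth keeps every sample but throws away most of the reads in the deeper samples, while a higher depth keeps more reads per sample at the cost of dropping the shallow ones. Which side of that trade-off matters more depends on your study.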
I have another question. Once the alpha and beta diversity analyses on the filtered data (no mitochondria, unwanted samples, etc.) are done, repeating the taxonomy step (i.e. classification) on the new filtered sequences is completely unnecessary, right?
You are correct: re-classifying the taxonomy is unnecessary. You have done nothing to the sequences themselves; you have only removed unwanted features. The original taxonomy therefore remains valid and usable for the features that remain, and it should not change even if you did reclassify the filtered sequences.
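As a rough illustration of why this holds, the sketch below treats the taxonomy as a lookup from feature ID to taxon and the table as counts per feature. All feature IDs, counts, and taxonomy strings are hypothetical, and this is plain Python rather than anything QIIME 2 does internally. The point is only that filtering removes rows from the table without touching the lookup.

```python
# Hypothetical taxonomy assignments, keyed by feature ID (illustration only).
taxonomy = {
    "feat_a": "k__Bacteria; p__Proteobacteria",
    "feat_b": "k__Bacteria; p__Firmicutes",
    "feat_c": "k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; "
              "o__Rickettsiales; f__mitochondria",
}

# Hypothetical feature table: feature ID -> {sample ID: read count}.
table = {
    "feat_a": {"tree_01": 40, "beetle_01": 12},
    "feat_b": {"tree_01": 5, "beetle_01": 30},
    "feat_c": {"tree_01": 900, "beetle_01": 2},
}

# Filtering drops whole features from the table; it never edits a sequence
# or a taxonomy assignment.
filtered_table = {
    feature_id: counts
    for feature_id, counts in table.items()
    if "mitochondria" not in taxonomy[feature_id]
}

# Every feature that survives filtering still has its original assignment.
for feature_id in filtered_table:
    print(feature_id, "->", taxonomy[feature_id])
```

The original taxonomy simply contains a few extra entries for features that are no longer in the table, and those entries are ignored downstream.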