A>
qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table table.qza \
  --p-sampling-depth 1103 \
  --m-metadata-file sample-metadata.tsv \
  --output-dir core-metrics-results
The default metrics are shannon, chao1, observed_otus, etc.? As far as I know, there are more than 20 metrics in QIIME 2. If I want to add one more metric, such as Good’s coverage, how can I add it to the core diversity script?
B> The core diversity analysis above rarefies the master feature table to an even depth of 1103, right? I don’t think the output of the core diversity analysis saves the subsampled feature table. Would it be possible for me to save it using this script? If I don’t save the subsampled table, I can’t use it for downstream analysis. I know I can rarefy the feature table manually and then calculate alpha diversity from the subsampled table, but the tutorial doesn’t seem to do this. I mostly use the subsampled feature table for taxonomic plots. Is this a good idea? When you make taxonomic plots for your research, do you use the total table or the subsampled table (equal depth)?
C> About the “observed_otus” output: I can generate an observed_otus_vector.qza file, and I use qiime tools export to export it to a csv file.
It looks something like this:
observed_OTUs
sample 1 1000
sample 2 200
I am confused about the header “observed_OTUs”. Does this mean observed ASVs? I used the DADA2 workflow and didn’t do any OTU clustering.
6> This is a general question. Do you normally filter out rare OTUs/ASVs in your research? If you do, are there any general rules for this? The tutorial removes low-abundance features, those with a frequency below 10. I am not sure if this is a general rule. I filtered my dataset at 50. Is this too high? I think it is a trade-off: if you remove too many, you get a good rarefaction curve, but you lose diversity.
The defaults are Shannon, observed OTUs, Faith's PD and evenness (Pielou).
Yes, just run qiime diversity alpha separately with the chosen metric.
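To make the Good's coverage metric from the question concrete, here is a minimal sketch of how it is usually defined: 1 - F1/N, where F1 is the number of features seen exactly once in a sample and N is the total count for that sample. The function name and the example counts are made up for illustration; in QIIME 2 itself you would request the metric by name (`goods_coverage`) through `qiime diversity alpha` rather than computing it by hand.

```python
from typing import Sequence


def goods_coverage(counts: Sequence[int]) -> float:
    """Good's coverage for one sample: 1 - (singletons / total observations).

    `counts` holds the per-feature counts of a single sample.
    This is an illustrative helper, not QIIME 2's implementation.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no observations")
    # Features observed exactly once in this sample
    singletons = sum(1 for c in counts if c == 1)
    return 1 - singletons / total


# Hypothetical sample: 20 reads in total, two features seen only once
sample_counts = [10, 5, 1, 1, 0, 3]
coverage = goods_coverage(sample_counts)
print(f"Good's coverage: {coverage:.2f}")
```

A value close to 1 means that few features were seen only once, so the sample was probably sequenced deeply enough to capture most of its diversity.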
You will find a rarefied table in the output folder
I did both and didn't find big differences between the plots in other analyses.
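For anyone unsure what rarefying to an even depth does to a sample, the idea can be sketched in a few lines: draw a fixed number of reads without replacement, and drop any sample that has fewer reads than the chosen depth. This is only a sketch of the concept with made-up names and counts, not the code QIIME 2 runs.

```python
import random
from collections import Counter
from typing import Dict, Optional


def rarefy_sample(counts: Dict[str, int], depth: int, seed: int = 0) -> Optional[Dict[str, int]]:
    """Subsample one sample to `depth` reads without replacement.

    Returns None when the sample has fewer than `depth` reads,
    mirroring the idea that shallow samples are dropped.
    Illustrative helper only.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand counts into one entry per read, then draw `depth` reads
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    drawn = random.Random(seed).sample(pool, depth)
    return dict(Counter(drawn))


# Hypothetical samples; depth chosen small to keep the example readable
deep_sample = {"asv_a": 60, "asv_b": 30, "asv_c": 10}
shallow_sample = {"asv_a": 3, "asv_b": 2}

rarefied = rarefy_sample(deep_sample, depth=20)
dropped = rarefy_sample(shallow_sample, depth=20)
print(rarefied, dropped)
```

Every retained sample ends up with exactly `depth` reads, which is why rare features can disappear from the rarefied table.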
Yes, they are ASVs unless you cluster them into OTUs.
I don't think there are specific rules. It depends entirely on the source of your samples, on the amount and frequency of the features, and on the number of repeats for each treatment.
I would take a look at the overall abundances of the ASVs or features among the samples. For example, in soil samples I had very high sequence frequencies, but in another niche they were much lower. So I decided to remove only sequences with abundances lower than 10. But if I had only soil samples with high frequencies, I would increase the minimum frequency to 25 or higher.
Also, let's say I have at least 10 repeats for each time point and treatment. Based on that, I would delete all features that are found in fewer than 4 samples. But if I had 20 samples, I could also increase this limit to 6 samples. Something like this.
1> “Also, let's say I have at least 10 repeats for each time point and treatment. Based on that, I would delete all features that are found in fewer than 4 samples. But if I had 20 samples, I could also increase this limit to 6 samples. Something like this.”
I think I understand your point. For example, you want every ASV to have at least 10 repeats in each sample. How do you do this?
Did you run qiime feature-table filter-features --p-min-frequency 10?
If I run this, I think min-frequency means that any ASV whose total count across all samples is less than 10 will be filtered out. If I have 100 samples and an ASV appears only in sample 1, 9 times, then it will be filtered. The cutoff applies to the total count, not to a per-sample count (it is not a matter of 9 reads in each sample, 900 in total).
I am not sure which parameter I should use for a repeats cutoff.
2> “You will find a rarefied table in the output folder”: I found it. I didn’t realize the files are given in .gz format.
After I unzipped it, there were two BIOM-format files: one named “table_even1100.biom” and the other “table_mc1000.biom”. Should I use the former? What does the “mc” table mean?
I used this command to remove features that have a frequency lower than 10, and features that are found in fewer than 5 samples. But I think you are asking about different settings, so check the tutorial for better examples.
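To show the difference between the two cutoffs discussed above, here is a small sketch of the filtering logic. It assumes that the frequency cutoff is applied to a feature's total count across all samples and that the sample cutoff counts the samples in which the feature appears at all, with a feature kept only if it passes both. The function and the toy table are hypothetical and only mimic the behaviour of the `--p-min-frequency` and `--p-min-samples` options of `qiime feature-table filter-features`.

```python
from typing import Dict, List


def filter_features(
    table: Dict[str, List[int]],
    min_frequency: int = 0,
    min_samples: int = 0,
) -> Dict[str, List[int]]:
    """Keep features whose total count and sample prevalence meet both cutoffs.

    `table` maps a feature ID to its per-sample counts.
    Illustrative helper only, not QIIME 2 code.
    """
    kept = {}
    for feature, counts in table.items():
        total = sum(counts)                           # summed over all samples
        n_samples = sum(1 for c in counts if c > 0)   # samples containing the feature
        if total >= min_frequency and n_samples >= min_samples:
            kept[feature] = counts
    return kept


# Toy table with 100 samples per feature (hypothetical counts)
table = {
    "asv_a": [9] + [0] * 99,       # 9 reads, all in one sample
    "asv_b": [2] * 6 + [0] * 94,   # 12 reads spread over 6 samples
    "asv_c": [50] + [0] * 99,      # 50 reads, all in one sample
}

by_frequency = filter_features(table, min_frequency=10)
by_both = filter_features(table, min_frequency=10, min_samples=5)
print(sorted(by_frequency), sorted(by_both))
```

In this toy table, the ASV with 9 reads in a single sample fails the frequency cutoff on its total count, while the ASV with 50 reads in a single sample passes the frequency cutoff but fails the sample cutoff.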