Alpha diversity core metrics

moonlight · February 7, 2020, 2:55am

The question is about the script “qiime diversity core-metrics” and its output. I am following the Moving Pictures tutorial (https://docs.qiime2.org/2019.10/tutorials/moving-pictures/#featuretable-and-featuredata-summaries).

A>
qiime diversity core-metrics-phylogenetic \
    --i-phylogeny rooted-tree.qza \
    --i-table table.qza \
    --p-sampling-depth 1103 \
    --m-metadata-file sample-metadata.tsv \
    --output-dir core-metrics-results

Are the default metrics shannon, chao1, observed_otus, etc.? As far as I know, there are more than 20 metrics in QIIME 2. If I want to add one more metric, such as Good’s coverage, how can I add it to the core diversity script?

Or do I have to do it manually, as suggested here (Alpha and Beta Diversity Explanations and Commands)?

B> In the core diversity analysis above, the master feature table is rarefied to an equal depth of 1103, right? I don’t think the output of the core diversity analysis saves the subsampled feature table. Would it be possible for me to save it using this script? If I don’t save this subsampled table, I can’t use it for downstream analysis. I know I can rarefy the feature table manually and then calculate alpha diversity from the subsampled table, but the tutorial doesn’t seem to do this. I mostly use the subsampled feature table for taxonomic plots. Is this a good idea? If you make taxonomic plots for your research, do you use the total table or the subsampled table (equal depth)?

C> About the “observed_otus” output: I can generate an observed_otus_vector.qza file, and I use qiime tools export to export it to a csv file.

It looks something like this:

            observed_OTUs
sample 1    1000
sample 2    200

I am confused about the header “observed_OTUs”. Does this mean observed ASVs? I used the DADA2 workflow and didn’t do any OTU clustering.

D> This is a general question. Do you normally filter out rare OTUs/ASVs in your research? If you do, are there any general rules for this? The tutorial removes low-abundance features, i.e. those with a frequency of less than 10. I am not sure if this is a general rule. I filtered my dataset at 50. Is this too high? I think this is a trade-off: if you remove too many, you get a good rarefaction curve, but you lose diversity.

What do you normally do?

Thanks in advance

timanix · February 7, 2020, 9:06am

Hi!

The defaults are Shannon, observed OTUs, Faith PD and evenness (Pielou).

Yes, just run it manually with the metric you want.

You will find a rarefied table in the output folder

I did both and didn't find a big difference between the plots in other analyses.

Yes, they will be ASVs unless you cluster them into OTUs.

I don't think there are specific rules; it entirely depends on the source of your samples, on the amount and frequency of the features, and on the number of repeats for each treatment.
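
For the Good's coverage question above, here is a minimal Python sketch of what that metric computes, assuming the usual definition: 1 minus the fraction of reads that belong to singleton features. The function name and the counts are made up for illustration, and this is not QIIME 2's own code.

```python
def goods_coverage(counts):
    """Good's coverage estimate for one sample: 1 - (singletons / total reads).

    `counts` is a list of per-feature read counts (hypothetical input format).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    # A singleton is a feature observed exactly once in this sample.
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / total


# Hypothetical sample: 20 reads in total, two features seen only once.
sample_counts = [10, 5, 1, 1, 3, 0]
coverage = goods_coverage(sample_counts)
print(round(coverage, 3))
```

A value close to 1 suggests that few of the reads come from features seen only once, i.e. the sample is probably sequenced deeply enough.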

moonlight · February 7, 2020, 8:50pm

I am working on temperate soil. Are there any rules for this environment? What is your system, and what would you do?

timanix · February 7, 2020, 9:25pm

I would take a look at the overall abundances of the ASVs or features among the samples. For example, in my soil samples I had very high sequence frequencies, but in other niches they were much lower, so I decided to remove only sequences with abundances lower than 10. But if I had only soil samples with high frequencies, I would increase the minimum frequency to 25 or higher.
Also, let's say I have at least 10 repeats for each time point and treatment. Based on that, I would delete all features that are found in fewer than 4 samples. But if I had 20 samples, I could also increase this limit to 6 samples. Something like this.

moonlight · February 10, 2020, 2:02am

Hi Timanix,

1> “Also, let's say I have at least 10 repeats for each time point and treatment. Based on that, I would delete all features that are found in fewer than 4 samples. But if I had 20 samples, I could also increase this limit to 6 samples. Something like this.”

I think I understand your point. For example, you want all ASVs to have at least 10 repeats in each sample. How do you do this?

Did you run qiime feature-table filter-features --p-min-frequency 10?

If I run this, I think min-frequency means that ASVs whose total count across all samples is less than 10 will be filtered out. If I have 100 samples and an ASV appears only in sample1, 9 times, then it will be filtered out. It is not a per-sample cutoff: an ASV with 9 reads in each of the 100 samples (900 in total) would be kept.

I am not sure which parameter I should use for a repeats cutoff.

2> “You will find a rarefied table in the output folder” I found it. I didn’t realize the tables are given in .gz format.

After I unzipped it, there were two BIOM-format files. One is named “table_even1100.biom” and the other “table_mc1000.biom”. Should I use the former? What does the mc table mean?

timanix · February 10, 2020, 2:39pm

Hi!

There should be a file rarefied_table.qza
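
To show roughly what "rarefied to a depth of 1103" means, here is a minimal Python sketch of subsampling without replacement. It is only an illustration under assumptions: the dict-of-dicts table layout, the function name and the counts are made up, and the real rarefied_table.qza is produced by QIIME 2 itself, not by this code.

```python
import random


def rarefy(table, depth, seed=0):
    """Subsample every sample to exactly `depth` reads, without replacement.

    `table` maps sample id -> {feature id: read count} (hypothetical layout).
    Samples with fewer than `depth` reads are dropped from the result.
    """
    rng = random.Random(seed)
    rarefied = {}
    for sample_id, counts in table.items():
        # Expand the counts into one list entry per read.
        reads = [feature for feature, n in counts.items() for _ in range(n)]
        if len(reads) < depth:
            continue  # too shallow for this sampling depth
        subsampled = {}
        for feature in rng.sample(reads, depth):
            subsampled[feature] = subsampled.get(feature, 0) + 1
        rarefied[sample_id] = subsampled
    return rarefied


# Hypothetical table: sample1 has 1200 reads, sample2 only 70.
table = {
    "sample1": {"asv1": 600, "asv2": 400, "asv3": 200},
    "sample2": {"asv1": 50, "asv2": 20},
}
rarefied = rarefy(table, depth=1103)
print(sorted(rarefied))                   # samples that survived
print(sum(rarefied["sample1"].values()))  # reads kept in sample1
```

Every sample that is kept ends up with the same number of reads, which is why a shallow sample such as sample2 disappears from the rarefied table.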

Here is a detailed tutorial on which commands to use in different cases:
https://docs.qiime2.org/2019.10/tutorials/filtering/

I used this command to remove features that have a total frequency lower than 10 or are found in fewer than 5 samples. But I think you are asking about different settings, so check the tutorial for better examples.

!qiime feature-table filter-features \
    --i-table table.qza \
    --p-min-frequency 10 \
    --p-min-samples 5 \
    --o-filtered-table filtered_table.qza
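
To make the two cutoffs concrete, here is a small Python sketch of the same logic, assuming that min-frequency is the total count of a feature across all samples and min-samples is the number of samples in which the feature is observed. The table layout, the function name and the counts are hypothetical; this only imitates the filter, it is not the QIIME 2 implementation.

```python
def filter_features(table, min_frequency=0, min_samples=0):
    """Keep features that pass BOTH cutoffs.

    `table` maps feature id -> {sample id: read count} (hypothetical layout).
    min_frequency: minimum total count summed over all samples.
    min_samples:   minimum number of samples with a non-zero count.
    """
    kept = {}
    for feature_id, per_sample in table.items():
        total = sum(per_sample.values())
        n_samples = sum(1 for count in per_sample.values() if count > 0)
        if total >= min_frequency and n_samples >= min_samples:
            kept[feature_id] = per_sample
    return kept


table = {
    # 9 reads in a single sample: fails the frequency cutoff of 10.
    "asv_a": {"s1": 9},
    # 14 reads spread over 5 samples: passes both cutoffs.
    "asv_b": {"s1": 3, "s2": 3, "s3": 2, "s4": 2, "s5": 4},
    # Abundant, but seen in only one sample: fails the samples cutoff.
    "asv_c": {"s1": 500, "s2": 0},
}
kept = filter_features(table, min_frequency=10, min_samples=5)
print(sorted(kept))
```

So the frequency cutoff alone would keep asv_c, and it is the samples cutoff that acts as the "found in at least N repeats" rule.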
