table summary question

Pablo_V · August 30, 2020, 7:24pm

Hi,

I am having problems interpreting the output of the table summary after denoising with DADA2.

The table summary says that there are a total of 7,000 features in the whole dataset and that there are 6 million reads processed, right?

Then the frequency per sample indicates the frequency of reads per sample? I don't think it is the frequency of features per sample since the mean frequency (21,000) is larger than the total number of features.

And how can I interpret the frequency per feature statistics? How can I put the mean frequency per feature (881) into words? Does it mean that a feature is found on average 881 times across all the samples?

Thank you for your support.

Cheers,
Pablo

thermokarst · September 1, 2020, 3:19pm

Hi @Pablo_V!

Basically, but I wouldn't say "6 million reads processed." Instead, the summary is saying that the feature table has a total frequency of 6.7 million, i.e. 6.7 million feature observations, after denoising/quality control.

No - reads are not part of the equation anymore - a feature table is just that, features. As with my correction above, this is the distribution of total feature frequency per sample, and apparently many features in this dataset are seen multiple times! Let's work through some examples: the minimum frequency is 5342. If you click the "Interactive Sample Detail" tab and scroll to the bottom, you will see that the sample with the lowest feature frequency has 5342 observations. Similarly, the sample with the highest frequency of features has 193491 total observations, meaning that many/most features were seen more than once.

Yes, this sounds like a reasonable interpretation to me. Similarly, the most "rare" features were seen only 3 times across all samples.
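To make the two sets of statistics concrete, here is a minimal sketch in plain Python using a made-up toy table (the ASV names, sample names, and counts are invented for illustration and are not from this dataset). It only shows how "frequency per sample" and "frequency per feature" are two different sums over the same table; it is not how QIIME 2 computes the summary internally.

```python
# Toy feature table: one row per feature (ASV), one column per sample.
# All names and counts are hypothetical.
table = {
    "ASV_1": {"sample_A": 10, "sample_B": 0, "sample_C": 5},
    "ASV_2": {"sample_A": 3, "sample_B": 7, "sample_C": 0},
    "ASV_3": {"sample_A": 0, "sample_B": 2, "sample_C": 1},
    "ASV_4": {"sample_A": 1, "sample_B": 1, "sample_C": 0},
}
samples = ["sample_A", "sample_B", "sample_C"]

# Frequency per sample: sum each column (all feature observations in a sample).
frequency_per_sample = {s: sum(row[s] for row in table.values()) for s in samples}

# Frequency per feature: sum each row (how often a feature is seen overall).
frequency_per_feature = {f: sum(row.values()) for f, row in table.items()}

# Both views add up to the same total frequency.
total_frequency = sum(frequency_per_feature.values())
number_of_features = len(table)

mean_frequency_per_sample = total_frequency / len(samples)
mean_frequency_per_feature = total_frequency / number_of_features

print(frequency_per_sample)
print(frequency_per_feature)
print(number_of_features, total_frequency)
print(mean_frequency_per_sample, mean_frequency_per_feature)
```

In this toy table there are 4 features but a total frequency of 30, which is the same reason the mean frequency per sample in the real summary can be larger than the number of features.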

Hope that helps! :qiime2:

Pablo_V · September 1, 2020, 6:15pm

Hi @thermokarst,

Thank you! I think it is clearer to me now. So this table summary is only about features, i.e. ASVs in this case. I have two follow-up questions:
(1) is there a way to check how many unique features (ASVs) have been picked up per sample?
(2) when trying to filter the table for a min. frequency of 0.1% mean sample depth, would that be 0.1% of the mean frequency in the frequency per sample section?

Thanks!
Pablo

thermokarst · September 2, 2020, 3:46pm

Alpha diversity analysis will help here, in particular you can compute the "observed features" of each sample. Check this part of the Moving Pictures tutorial for examples.
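As a rough illustration of what "observed features" counts, here is a small plain-Python sketch on a made-up table (all names and counts are hypothetical). It simply counts, for each sample, the features with a non-zero frequency; the actual analysis would be run with the alpha diversity tooling rather than by hand.

```python
# Toy feature table: one row per feature (ASV), one column per sample.
# All names and counts are hypothetical.
table = {
    "ASV_1": {"sample_A": 10, "sample_B": 0, "sample_C": 5},
    "ASV_2": {"sample_A": 3, "sample_B": 7, "sample_C": 0},
    "ASV_3": {"sample_A": 0, "sample_B": 2, "sample_C": 1},
    "ASV_4": {"sample_A": 1, "sample_B": 1, "sample_C": 0},
}
samples = ["sample_A", "sample_B", "sample_C"]

# Observed features: number of distinct features present (count > 0) in a sample.
observed_features = {
    s: sum(1 for row in table.values() if row[s] > 0) for s in samples
}

print(observed_features)
```

Note that this ignores how many times each feature was seen, so it answers a different question than the frequency per sample.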

If I am understanding your question correctly, the answer is "yes."

Pablo_V · September 3, 2020, 3:10pm

Hi @thermokarst,

Great! I will calculate the observed features per sample.

Regarding the 0.1% filtering, I am taking it from the Langille SOP for qiime2. The SOP states:
"... One possible choice would be to remove all ASVs that have a frequency of less than 0.1% of the mean sample depth"

Is that what you understood as well?

Cheers,
Pablo

thermokarst · September 9, 2020, 8:05pm

I am not familiar with that SOP, but my understanding is that if you wanted to compute this threshold, it would be 0.1% of 21,078.
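The arithmetic can be sketched as follows. The mean sample depth (21,078) is the value quoted in this thread; the per-feature frequencies are invented for illustration, and the sketch assumes that "frequency" in the SOP sentence means a feature's total frequency across all samples.

```python
# Mean frequency per sample, as reported in the table summary in this thread.
mean_sample_depth = 21078

# 0.1% of the mean sample depth (0.1% == 1/1000).
threshold = mean_sample_depth / 1000

# Hypothetical total frequencies per feature, for illustration only.
feature_frequencies = {"ASV_1": 881, "ASV_2": 22, "ASV_3": 21, "ASV_4": 3}

# Remove features with a frequency below the threshold, keep the rest.
kept = {f: n for f, n in feature_frequencies.items() if n >= threshold}
removed = sorted(set(feature_frequencies) - set(kept))

print(threshold)
print(sorted(kept))
print(removed)
```

Because frequencies are whole numbers, a cutoff of 21.078 keeps features seen 22 or more times in this sketch.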

Keep us posted!

:qiime2:

Pablo_V · September 20, 2020, 8:28pm

Hi @thermokarst,

Thanks a lot for your support! I did indeed calculate 0.1% of 21,078! I am glad that you back up this calculation.

Cheers,
Pablo