I am having trouble interpreting the output of the table summary after denoising with DADA2.
The table summary says that there are a total of 7,000 features in the whole dataset and that there are 6 million reads processed, right?
Then the frequency per sample indicates the frequency of reads per sample? I don't think it is the frequency of features per sample since the mean frequency (21,000) is larger than the total number of features.
And how can I interpret the frequency per feature statistics? How can I put the mean frequency per feature (881) into words? Does it mean that one feature is found on average 881 times across all the samples?
Basically, but I wouldn't say "6 million reads processed." Instead, the summary is saying that the feature table has a total frequency of 6.7 million, i.e. 6.7 million feature observations remain after denoising/quality control.
No, reads are not part of the equation any more. A feature table is just that: features. As with my correction above, the frequency per sample is the distribution of total feature frequency in each sample, and apparently in this dataset many features are seen multiple times! Let's work through some examples. The minimum frequency is 5342: if you click the "Interactive Sample Detail" tab and scroll to the bottom, you will see that the sample with the lowest feature frequency has 5342 observations. Similarly, the sample with the highest frequency of features has 193491 total observations, meaning that many/most features were seen more than once.
Yes, this sounds like a reasonable interpretation to me. Similarly, the most "rare" features were seen only 3 times across all samples.
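To make the distinction between the two summaries concrete, here is a minimal sketch on a made-up table. The sample names, feature names, and counts are all hypothetical and are not taken from your data; the point is only that frequency per sample is a row sum and frequency per feature is a column sum.

```python
from statistics import mean

# Hypothetical toy feature table: sample -> {feature (ASV) -> frequency}.
# Each value is how many times that feature was observed in that sample.
table = {
    "sample-A": {"asv1": 10, "asv2": 0, "asv3": 5},
    "sample-B": {"asv1": 3, "asv2": 7, "asv3": 0},
}

# Frequency per sample: total observations within each sample (row sums).
frequency_per_sample = {s: sum(counts.values()) for s, counts in table.items()}

# Frequency per feature: total observations of each feature across all samples (column sums).
features = sorted({f for counts in table.values() for f in counts})
frequency_per_feature = {
    f: sum(counts.get(f, 0) for counts in table.values()) for f in features
}

number_of_features = len(features)                      # "number of features"
total_frequency = sum(frequency_per_sample.values())    # "total frequency"

print(frequency_per_sample)
print(frequency_per_feature)
print(number_of_features, total_frequency)
print(mean(frequency_per_sample.values()), mean(frequency_per_feature.values()))
```

Both sets of sums add up to the same total frequency, which is why the mean frequency per sample can be far larger than the number of features.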
Thank you! I think it is clearer now for me. So this table summary is only about features, i.e. ASVs in this case. So then I have two follow up questions:
(1) Is there a way to check how many unique features (ASVs) have been picked up per sample?
(2) When filtering the table for a minimum frequency of 0.1% of the mean sample depth, would that be 0.1% of the mean frequency in the frequency per sample section?
Alpha diversity analysis will help here, in particular you can compute the “observed features” of each sample. Check this part of the Moving Pictures tutorial for examples.
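As a rough illustration of what "observed features" counts, here is a small sketch on a hypothetical table (the names and counts are invented). It simply counts, for each sample, the features with a nonzero frequency; it is not the QIIME 2 implementation.

```python
# Hypothetical toy feature table: sample -> {feature (ASV) -> frequency}.
table = {
    "sample-A": {"asv1": 10, "asv2": 0, "asv3": 5},
    "sample-B": {"asv1": 3, "asv2": 7, "asv3": 1},
}

# Observed features: number of distinct features with a nonzero count in each sample.
observed_features = {
    sample: sum(1 for count in counts.values() if count > 0)
    for sample, counts in table.items()
}

print(observed_features)
```

Note that this ignores how often each feature was seen: a feature observed once counts the same as one observed thousands of times.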
If I am understanding your question correctly, the answer is “yes.”
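A quick sketch of the arithmetic, assuming the rounded mean frequency per sample of 21,000 mentioned earlier in this thread (use the exact value from your own summary). The per-feature frequencies below are made up for illustration.

```python
# Rounded mean frequency per sample quoted in this thread; replace with the
# exact value from your own table summary.
mean_sample_depth = 21_000
percent = 0.1

# 0.1% of the mean sample depth.
min_frequency = mean_sample_depth * percent / 100

# Hypothetical total frequencies per feature (invented values).
frequency_per_feature = {"asv1": 881, "asv2": 40, "asv3": 3}

# Keep only features whose total frequency reaches the threshold.
kept = {f: n for f, n in frequency_per_feature.items() if n >= min_frequency}

print(min_frequency)
print(sorted(kept))
```

With these assumed numbers the cutoff comes out at about 21 observations. In practice you would round it to a whole number and pass it to `--p-min-frequency` of `qiime feature-table filter-features`.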
Great! I will calculate the observed features per sample.
Regarding the 0.1% filtering, I am taking it from the Langille SOP for QIIME 2. The SOP states:
“… One possible choice would be to remove all ASVs that have a frequency of less than 0.1% of the mean sample depth”