Hello QIIME community,
Can anyone help clear up my confusion regarding a few QIIME 2 functions, please? Thanks in advance.
I came across 3 functions:
qiime diversity alpha-rarefaction
qiime feature-table rarefy, and
qiime diversity core-metrics-phylogenetic
The core-metrics-phylogenetic also outputs a rarefied-table. My questions are:
- I normally run alpha-rarefaction to get an idea of the sampling depth, then use that depth for core-metrics-phylogenetic, where a sampling depth is required. I am assuming core-metrics-phylogenetic has its own built-in rarefaction, so no further rarefaction is needed before or after it. Right? What is the use of the rarefied-table from this function?
- I may be misunderstanding something. diversity alpha-rarefaction plots rarefaction curves and needs the --p-max-depth flag. But if this function is designed to estimate the sampling depth, how can I tell/estimate the max-depth before running it?
- What is the difference between the rarefaction performed in feature-table rarefy and in core-metrics-phylogenetic? According to the tutorial, feature-table rarefy subsamples frequencies from all samples ("without replacement" in the 2017.02 version) so that the sum of frequencies in each sample equals sampling-depth. Is there any replacement in the other rarefaction processes? What is the replacement?
- The rarefaction plot generated by the alpha-rarefaction function is actually equivalent to Good's coverage, as in this, right?
All the best~
Correct. core-metrics-phylogenetic is a pipeline, meaning it wraps several stand-alone plugins so you don’t have to run them one by one. It is a convenience. It does require a rarefaction depth, even though the stand-alone versions of those plugins don’t. So when you provide it with a sampling depth, it is actually running feature-table rarefy first and then performing several actions from various plugins. It also provides you with the rarefied table, again for convenience. Your original feature table will not be rarefied.
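To make the "rarefy first, original table untouched" point concrete, here is a minimal sketch in plain Python. It is not QIIME 2's implementation; the `rarefy` function, the toy table, and the sample and feature IDs are all made up for illustration. It assumes that samples with fewer reads than the sampling depth are dropped.

```python
import random

def rarefy(table, sampling_depth, seed=0):
    """Subsample every sample, without replacement, to exactly sampling_depth reads.

    table maps sample ID -> {feature ID: count}. Samples with fewer reads than
    sampling_depth are dropped. A new table is returned; the input is not modified.
    """
    rng = random.Random(seed)
    rarefied = {}
    for sample_id, counts in table.items():
        # Expand the counts into one list entry per read.
        reads = [feature for feature, n in counts.items() for _ in range(n)]
        if len(reads) < sampling_depth:
            continue  # too shallow for this depth, so the sample is dropped
        picked = rng.sample(reads, sampling_depth)  # sampling without replacement
        new_counts = {}
        for feature in picked:
            new_counts[feature] = new_counts.get(feature, 0) + 1
        rarefied[sample_id] = new_counts
    return rarefied

# Hypothetical toy feature table (sample -> feature counts).
table = {
    "S1": {"f1": 50, "f2": 30, "f3": 20},    # 100 reads
    "S2": {"f1": 10, "f2": 5},               # 15 reads, below the depth
    "S3": {"f1": 200, "f3": 100, "f4": 1},   # 301 reads
}
rarefied = rarefy(table, sampling_depth=100)
```

After this runs, every sample left in `rarefied` sums to 100 reads, S2 is gone, and `table` still holds the original counts.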
Not quite. From the help file:
--p-max-depth INTEGER The maximum rarefaction depth. Must be greater than
Basically, it is how “deep” you want your subsampling process to go. In your images it looks like your max-depth would have been somewhere between 40k and 50k. This plugin does not estimate a maximum depth at all. If you are wondering where you can find the maximum depth (the sample with the most sequences), check out the summary of your feature table using feature-table summarize. You’ll be able to find the sequence counts for all of your samples there.
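If it helps to see what "the sample with the most sequences" means in terms of the table itself, here is a tiny plain-Python sketch. The table and the `sample_depths` helper are hypothetical; in practice you would just read these numbers off the feature-table summarize output.

```python
def sample_depths(table):
    """Return the total number of reads per sample (sample ID -> total count)."""
    return {sample_id: sum(counts.values()) for sample_id, counts in table.items()}

# Hypothetical toy feature table (sample -> feature counts).
table = {
    "S1": {"f1": 50, "f2": 30, "f3": 20},
    "S2": {"f1": 10, "f2": 5},
    "S3": {"f1": 200, "f3": 100, "f4": 1},
}
depths = sample_depths(table)
deepest = max(depths.values())     # the largest max-depth that makes sense here
shallowest = min(depths.values())  # any depth above this starts dropping samples
```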
core-metrics-phylogenetic rarefies your table prior to the diversity metric calculations, and alpha-rarefaction creates rarefaction curves, which are somewhat related to, but mainly a different thing from, rarefying. Rarefying is just subsampling without replacement.
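To show how a rarefaction curve differs from a single rarefying step, here is a rough plain-Python sketch, assuming observed features as the metric. It is only an illustration of the idea, not how QIIME 2 computes its curves, and the function names, parameter names, and toy counts are my own.

```python
import random

def observed_features_at_depth(counts, depth, rng):
    """Subsample depth reads without replacement and count the distinct features."""
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    return len(set(rng.sample(reads, depth)))

def rarefaction_curve(counts, max_depth, steps=5, iterations=10, seed=0):
    """Return (depth, mean observed features) pairs for evenly spaced depths."""
    rng = random.Random(seed)
    total = sum(counts.values())
    curve = []
    for i in range(1, steps + 1):
        depth = max_depth * i // steps
        if depth < 1 or depth > total:
            continue  # cannot draw more reads than the sample contains
        values = [observed_features_at_depth(counts, depth, rng)
                  for _ in range(iterations)]
        curve.append((depth, sum(values) / iterations))
    return curve

# Hypothetical single sample with 1000 reads spread over 5 features.
sample = {"f1": 500, "f2": 300, "f3": 150, "f4": 40, "f5": 10}
curve = rarefaction_curve(sample, max_depth=1000)
```

Rarefying gives you one subsampled table at one depth. The curve repeats that subsampling at many depths and plots the metric against depth, which is why you pick the max-depth yourself rather than having the plugin estimate it.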
No, these are simply whatever alpha diversity measure you input on the Y axis and sampling depth on the X axis. Though you CAN use the Good's coverage score if that is what your input is. You’ll have to calculate that first using diversity alpha with --p-metric goods_coverage.
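For reference, Good's coverage is usually defined as 1 - F1/N, where F1 is the number of features seen exactly once (singletons) and N is the total number of reads in the sample. Here is a small plain-Python sketch of that formula with made-up counts; it is not the code QIIME 2 runs.

```python
def goods_coverage(counts):
    """Good's coverage for one sample: 1 - (singleton features / total reads)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("Good's coverage is undefined for a sample with no reads")
    singletons = sum(1 for c in counts if c == 1)
    return 1 - singletons / total

# Hypothetical per-feature counts for one sample: 20 reads, 2 singletons.
coverage = goods_coverage([10, 5, 1, 1, 3])
```

So it is its own metric, with values near 1 meaning few singletons relative to the number of reads, and not the same thing as an observed-features or Shannon rarefaction curve.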
Hope this helps
Thank you very much for your response. Would you be able to help explain a few things? Much appreciated.
What is the use of the rarefied table? Alpha diversity is calculated using the rarefied table (at least in core-metrics-phylogenetic), so should I use the rarefied table for further analysis, like making taxa bar plots, to keep the data consistent?
I am looking at the observed OTUs/Shannon diversity of 11 samples, which can be considered 11 biological replicates. But one of the samples has an unusually high number of observed OTUs compared with the others. Most of the time, I remove samples with extremely low reads and OTUs. Would it be reasonable to remove samples like this one, with significantly high feature counts and OTUs? It looks just like an outlier to me. Thanks again!
There are a few other discussions on here (here, here, and here) that might give you some in-depth insight into your questions. There are likely more on the forum, but those are the ones I found the fastest.
But brief answers to your questions:
This really depends on what it is you are doing. For some actions, like alpha group significance or beta group significance, you should, but for tools that are compositionally aware like ANCOM, ALDEx2, songbird, you shouldn’t.
Normally I would say yes, as long as you can justify this being an outlier within your experiment. For example, if these were 11 mice from the same treatment, then there is reason to think something happened to this 1 mouse, it is clearly an outlier, and you could justify removing it. In your case, however, I am a little concerned about your sampling depth. A depth of 700 sequences is not enough (for most common microbial communities we see, anyway) to really capture the full diversity. I would recommend re-running this with a much higher max depth, say 10-20k at least. I personally go even higher, but the benefits diminish rapidly thereafter.
Thanks a lot for your timely reply. Very helpful~