q2-longitudinal feature volatility analysis

Dear developers of the "q2-longitudinal feature volatility analysis" plugin,

We are not 100% sure how to interpret the "importance" values the plugin generates. For example, we have 4 groups of mice in the study (I wish I could attach a picture, but there's no such option).

  1. The longitudinal feature volatility analysis told us Turicibacter has the highest importance after comparing the 4 groups. Could you let us know whether this "importance" reflects Turicibacter's contribution after comparing all 4 groups, in the same sense as a repeated measures ANOVA? Or should we understand it in a different way?
  2. Turicibacter actually has very low abundance according to the "global mean" panel, but its "importance" is 54%. Can a low-abundance bacterium make such a large contribution to the importance? What statistical method is used for this analysis?
  3. Based on the analysis with this plugin, Turicibacter was increasing at the last time point compared with the starting time point. Can we also find out which significantly decreasing bacteria contributed to the importance?

Thank you so much for your patience and help!

Welcome to the forum, @George!
From the q2-longitudinal tutorial:

A supervised learning regressor is used to identify important features and assess their ability to predict sample states.

Which regressor is used will depend on which regressor you select with --p-estimator, but the RandomForestRegressor is used by default.
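For instance, here is a sketch of selecting the estimator explicitly. The file names below are placeholders, not from this thread; swap in your own table and metadata:

```shell
# Sketch only -- table.qza, metadata.tsv, and subject_id are placeholder
# names. --p-estimator chooses the scikit-learn regressor; if you omit
# the flag, RandomForestRegressor is used by default.
qiime longitudinal feature-volatility \
  --i-table table.qza \
  --m-metadata-file metadata.tsv \
  --p-state-column days \
  --p-individual-id-column subject_id \
  --p-estimator RandomForestRegressor \
  --output-dir volatility-results
```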

From the q2-sample-classifier tutorial:

Another really useful output of supervised learning methods is feature selection, i.e., they report which features (e.g., ASVs or taxa) are most predictive. A list of all features, and their relative importances (or feature weights or model coefficients, depending on the learning model used), will be reported.... Features with higher importance scores were more useful.... Feature importance scores are assigned directly by the scikit-learn learning estimator that was used; more details on individual estimators and their importance scores should refer to the scikit-learn documentation. Note that some estimators — notably K-nearest neighbors models — do not report feature importance scores, so this output will be meaningless if you are using such an estimator.

Imagine you're interested in trends in the flavors of ice cream. You run a longitudinal study of ingredients (features) using available flavors as the labels for your data set. Your dominant features are probably cream, sugar, egg, etc. They are the most abundant features, but give you little ability to predict trends in flavor production, because they show up in every sample.

Vanilla and rum are low-"frequency" features, but have much more predictive power. Of these, vanilla is less "important" than rum, because it is present in vanilla, chocolate, chocolate chip, butterscotch, and other flavors. Rum, on the other hand, is found only in Rum Raisin, and so is of high importance to a machine learning tool despite its low frequency.

This is metaphor, not science, but hopefully it communicates the important bit: abundance and predictive power are not necessarily linked, and different questions may prioritize different data. "Which ingredients tell us whether this is ice cream or creme brulee?" cares more about the ratios of high-abundance features like eggs and cream, while "Which features best predict which flavor of ice cream this is?" might show low-abundance features to be more impactful.
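To make the metaphor concrete, here is a toy sketch in plain Python (nothing QIIME 2-specific; the ingredient lists are invented) showing that a ubiquitous feature can be useless for prediction while a rare one is perfectly predictive:

```python
# Toy illustration: each sample is (set of ingredients, flavor label).
samples = [
    ({"cream", "sugar", "vanilla"}, "vanilla"),
    ({"cream", "sugar", "cocoa"}, "chocolate"),
    ({"cream", "sugar", "rum", "raisin"}, "rum raisin"),
    ({"cream", "sugar", "vanilla", "cocoa"}, "chocolate chip"),
]

def accuracy_of_rule(ingredient, flavor):
    """Accuracy of the rule 'ingredient present <=> this flavor'."""
    correct = sum(
        (ingredient in ingredients) == (label == flavor)
        for ingredients, label in samples
    )
    return correct / len(samples)

# Cream appears in every sample, so it cannot separate the flavors:
print(accuracy_of_rule("cream", "rum raisin"))  # -> 0.25
# Rum is rare, but its presence identifies Rum Raisin perfectly:
print(accuracy_of_rule("rum", "rum raisin"))    # -> 1.0
```

A real random forest scores importance differently, of course, but the intuition is the same: the rare, class-specific feature carries the predictive signal.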

Features can be important regardless of whether their abundance is increasing or decreasing. Looking at "net avg change" alongside importance can help you tease out which important features are decreasing in abundance.

One note - microbiome data is compositional, which confounds statements about changes in features' true abundance. Is the feature's abundance increasing, or is the apparent increase caused by decreases in the abundance of other features? Non-compositional methods can show useful trends, but if you plan to report on changes in feature abundance, you will want to consider compositionally aware approaches (e.g. ANCOM).
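A quick toy example of that trap (invented counts, not your data): taxon A's absolute count is constant across two time points, yet its relative abundance appears to rise simply because another taxon crashed:

```python
# Toy illustration of compositionality: counts for three taxa at two
# time points. Taxon A is unchanged in absolute terms, but taxon C's
# collapse inflates A's *relative* abundance.
t0 = {"A": 100, "B": 100, "C": 800}
t1 = {"A": 100, "B": 100, "C": 200}

def relative(counts):
    """Convert raw counts to relative abundances (fractions of total)."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

print(relative(t0)["A"])  # -> 0.1
print(relative(t1)["A"])  # -> 0.25, an apparent increase with no real change in A
```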

You can copy-paste pictures into the text box, or use the upload button in the menu bar.


Dear Chris,

Wow, thank you so much for your detailed answers! They are really clear!

One quick follow-up question:

Which regressor is used will depend on which regressor you select with --p-estimator, but the RandomForestRegressor is used by default.

We have 4 experimental groups. I uploaded all of them for the feature volatility analysis. Was the "importance" calculated across all 4 groups with the default "RandomForestRegressor", as you mentioned? If I only want to calculate the "importance" for 2 of the 4 experimental groups, should I re-upload my data with just those 2 groups of interest?

Thank you again for your patience and answer!

Not sure what you mean by this, sorry. If you can share the exact command you ran before, or a QIIME 2 artifact (.qza), that may help clarify things.

Hello Chris,

Yes! I attached a picture of my result. Also, the command I ran was below. Please correct me if I was wrong:

```shell
qiime longitudinal feature-volatility \
  --i-table genus_5k.qza \
  --m-metadata-file metadatabarcode2.tsv \
  --p-state-column days \
  --p-individual-id-column Sample \
  --p-n-estimators 50 \
  --p-random-state 50 \
  --output-dir genus_2nd_volatility
```

The idea is: we have 4 experimental groups of mice: WT(T), WT(UT), Per(T), Per(UT). Samples were collected at 4 time points (the 4 days on the x-axis in the attached picture), and we were trying to identify the feature bacteria (at the genus level).

Was my command identifying feature genera by comparing all 4 experimental groups? I don't understand what it means that "Turicibacter" is the most "important" feature genus. Could you let me know how I should interpret it?

Thank you so much Chris, for your help and patience!

This visualization lets you choose any categorical column from your sample metadata (e.g. metadatabarcode2.tsv), and plot a separate line for each class in that column.

Not really. The command was considering all four classes (WT(T), Per(UT), etc.), but I don't think it was "comparing" those groups per se. If I understand what it's doing correctly, the command you ran trains a random forest model to predict which class your samples belong to. The features that are most useful to the model in predicting whether a sample is WT(T), Per(T), etc. are the most "important" (see the ice cream analogy above). The accuracy-results.qzv that it produces will tell you how accurate those predictions were. If your accuracy is bad, these "important" features might not be very useful.

If you're interested in a broader category (like are these T or UT mice), you can just add a column to your metadata sheet where each sample is labeled with one of the two groups you're interested in and re-run.

If, on the other hand, you're only interested in visualizing samples from two of the four groups you mentioned above, you'll probably need to filter the other samples out.
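If you go the filtering route, something like the following sketch could work. It assumes your metadata has a column (here called "Group") holding the WT(T)/WT(UT)/Per(T)/Per(UT) labels; adjust the column and file names to match your own files:

```shell
# Sketch only -- "Group" is an assumed metadata column name.
# Keeps only samples from the two groups of interest.
qiime feature-table filter-samples \
  --i-table genus_5k.qza \
  --m-metadata-file metadatabarcode2.tsv \
  --p-where "[Group] IN ('WT(T)', 'Per(T)')" \
  --o-filtered-table genus_WT-T_Per-T.qza
```

You would then re-run feature-volatility on the filtered table.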

Good luck!

Dear Chris,

Wow, I am so impressed by how clearly you explained all of the difficult concepts! Thank you so much for the nice metaphor and incredible explanations! May I bother you with one last question?

Following your instructions, I picked out the 2 experimental groups, WT(T) and Per(T), and the feature analysis gave me the result in the new picture: the most important feature is still Turicibacter. Based on your "ice cream metaphor", may I interpret this as Turicibacter being the best predictor for differentiating WT(T) from Per(T), rather than as a significance test showing that Turicibacter is the most different feature between WT(T) and Per(T)? And also, what is the difference between "global mean" and "global CV%"?

Million thanks, Chris, for all your patience and time!

Hey Chris! This is the model accuracy. Does it look acceptable? Which part of the table is usually reported in a publication? Sorry, I am really unfamiliar with machine learning.

Great question, @George, but unfortunately one I can't answer. I'm not an ML expert, and how to interpret results is generally outside the scope of what we cover here. The terms listed in the screenshot you posted are pretty common, though.

I'd encourage you to spend some time with wikipedia, or better, a colleague with some experience in stats/ML. If you have more specific questions after that, feel free to open a new topic. Good luck!

p.s. In case it helps, the q2-longitudinal paper has a nice section on feature-volatility that explains what's going on at a high level.


Thank you so much again, Chris! Will do some further study ourselves.
You have helped us so much already!

