We would like to run the feature volatility analysis in q2-longitudinal multiple times on paired samples to obtain more accurate machine learning models. To check the overall accuracy of the final accumulated model, I would like to have, for each feature volatility run, the sample information used for training and the coordinate data used for the calculation of accuracy-results, but I cannot find a way to output them.
Could you tell me how to output the following data from the feature volatility analysis in q2-longitudinal?
1) Sample information used for training and test
2) The coordinate data used for the calculation of accuracy-results
Thank you for your help!
Hi @Shimpei,
feature-volatility is a pipeline that runs several different actions under the hood. It outputs only some of the results and discards the intermediate data (e.g., the specifics of which samples were used for training/testing). You should instead:
- use q2-sample-classifier directly. The regress-samples pipeline will give you what you need.
- you can then pass the outputs to q2-longitudinal's plot-feature-volatility action to obtain the plots of feature volatility and importance.
Regarding 1), the sample information used for training and test: regress-samples will output this information. You can also use q2-sample-classifier's split-table action directly if you want a little more control over this step.
Regarding 2), the coordinate data used for the calculation of accuracy-results: this is in your sample metadata file, whatever the target variable is (e.g., time). But regress-samples will also output this information: the true target values and predicted target values for each test sample.
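Once you have collected the true and predicted target values for the test samples of each run, pooling them across repeated runs could look roughly like the sketch below. This is only an illustration in plain Python, not part of q2-sample-classifier or q2-longitudinal: the function name `pooled_regression_accuracy`, the per-run dictionaries, and the numbers are all hypothetical, and it assumes you have already exported the values from each run yourself. The statistics it computes (MSE, RMSE, coefficient of determination) are not necessarily the same ones reported in the accuracy-results visualization.

```python
from math import sqrt


def pooled_regression_accuracy(runs):
    """Pool (true, predicted) pairs from several runs and score them together.

    `runs` is a hypothetical structure: one dict per run, mapping a test
    sample ID to a (true_value, predicted_value) tuple. A sample that is a
    test sample in several runs is counted once per appearance.
    """
    pairs = [pair for run in runs for pair in run.values()]
    if not pairs:
        raise ValueError("no test-sample predictions were supplied")

    n = len(pairs)
    mean_true = sum(true for true, _ in pairs) / n
    # Residual and total sums of squares over the pooled test samples.
    ss_res = sum((true - pred) ** 2 for true, pred in pairs)
    ss_tot = sum((true - mean_true) ** 2 for true, _ in pairs)
    mse = ss_res / n

    return {
        "n": n,
        "mse": mse,
        "rmse": sqrt(mse),
        # Coefficient of determination; undefined if all true values are equal.
        "r2": 1 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


# Made-up values for illustration only (e.g., target variable = time).
run_1 = {"s1": (0.0, 1.0), "s2": (2.0, 2.0)}
run_2 = {"s3": (4.0, 3.0), "s4": (6.0, 6.0)}

summary = pooled_regression_accuracy([run_1, run_2])
print(summary)
```

Keeping the sample IDs as dictionary keys also gives you a record of which samples served as test samples in each run.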
Hi @Nicholas_Bokulich,
Thank you so much for the quick response!
I would like to try the analysis using the method you gave me.
I have two more questions:
- Are feature abundances used as variables for machine learning in the feature-volatility pipeline in q2-longitudinal?
- Do the variables used for machine learning differ between the q2-sample-classifier pipeline and the feature-volatility pipeline in q2-longitudinal?
Thanks a lot!
q2-longitudinal wraps q2-sample-classifier to perform machine learning. What I outlined above is basically performing the underlying steps directly in q2-sample-classifier so that you can access the intermediate files that you need. The outputs can then be passed to q2-longitudinal to generate the same visualization.