Could q2-sample-classifier take relative abundance or CLR transformed abundance as input?

05dongkeren · May 22, 2021, 5:57am

Dear QIIME2 developer,

Thanks a lot for developing QIIME2 for us!

I would like to predict sample metadata from microbial data using the random forest algorithm. May I ask several questions about the q2-sample-classifier plugin?

1. Could I use relative abundance (percentages) or CLR-transformed abundance (with positive and negative values) as the input?
2. In addition to 16S data, I also have metagenomic data (genes, pathways) and metabolomic data (metabolites). Could I combine these datasets and use the combined file as the input for q2-sample-classifier? If so, do I need to normalize or scale these datasets before running q2-sample-classifier (because each dataset has specific data characteristics)? What do you suggest?
3. Could alpha-diversity indices (Shannon, observed features) also be used to predict the sample metadata through q2-sample-classifier?
4. Could the shifts of certain taxa (e.g., the difference in an ASV between pre-treatment and post-treatment) be used as the input to predict the sample metadata through q2-sample-classifier?

Many thanks for your time and guidance! I sincerely appreciate your help!

Warmest regards,
Nathan

Nicholas_Bokulich · May 25, 2021, 5:43am

Welcome @05dongkeren,

No. Right now the classify-samples pipeline only accepts frequency data (a FeatureTable[Frequency] artifact) as input. In theory, other feature table types should be acceptable, but this has not yet been tested or implemented.
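To make the distinction concrete, here is a minimal sketch in plain Python of how raw counts differ from the two transformed inputs asked about. The toy counts and the pseudocount of 1 are assumptions for illustration only, and this is not code from QIIME 2 itself.

```python
import math

def relative_abundance(counts):
    """Convert raw counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform.

    The pseudocount is an assumption here, added so zero counts
    do not produce log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical counts for four features in one sample.
counts = [120, 30, 0, 50]

rel = relative_abundance(counts)  # fractions between 0 and 1
clr_values = clr(counts)          # centered on 0, so some values are negative

print(rel)
print(clr_values)
```

The counts are non-negative integers, the relative abundances are fractions, and the CLR values include negatives, so neither transformed vector is frequency data any more.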

Sure, you can use the q2-feature-table plugin to merge these tables and use the merged table as input to classify-samples. Just make sure that you have the same set of samples in each table, i.e., that no data type is missing for some samples but present for others.
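The "same samples in every table" check can be sketched in plain Python. This is only an illustration of the idea, not how QIIME 2 merges tables; the function name `merge_by_sample`, the table names, and all values are hypothetical.

```python
def merge_by_sample(tables):
    """Merge {table_name: {sample_id: {feature_id: value}}} into one table.

    Raises ValueError if any sample is missing from any table, so every
    merged sample has every data type.
    """
    sample_sets = {name: set(table) for name, table in tables.items()}
    all_samples = set().union(*sample_sets.values())
    missing = {
        name: sorted(all_samples - samples)
        for name, samples in sample_sets.items()
        if all_samples - samples
    }
    if missing:
        raise ValueError(f"Samples missing from some tables: {missing}")

    merged = {}
    for sample in sorted(all_samples):
        row = {}
        for name, table in tables.items():
            # Prefix feature IDs with the table name to avoid ID collisions.
            for feature, value in table[sample].items():
                row[f"{name}:{feature}"] = value
        merged[sample] = row
    return merged

# Hypothetical toy tables: 16S counts and pathway counts for two samples.
tables = {
    "16S": {"s1": {"ASV1": 120, "ASV2": 30}, "s2": {"ASV1": 15, "ASV2": 60}},
    "pathways": {"s1": {"PWY-A": 7}, "s2": {"PWY-A": 2}},
}
merged = merge_by_sample(tables)
print(merged["s1"])
```

If one table lacked `s2`, the function would raise instead of silently producing a sample with a missing data type.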

Not for Random Forests. See discussion here:
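One reason per-feature scaling matters little for tree-based methods is that a split depends only on the ordering of a feature's values, not on their magnitude. The toy single-split search below illustrates this. It is an assumption-laden sketch with made-up values, not scikit-learn's implementation and not what q2-sample-classifier runs internally.

```python
import math

def gini(labels):
    """Gini impurity for binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split_partition(values, labels):
    """Return the set of sample indices sent left by the best single split."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    best_score, best_left = None, None
    for k in range(1, len(order)):
        if values[order[k - 1]] == values[order[k]]:
            continue  # cannot split between identical values
        left, right = order[:k], order[k:]
        score = (
            len(left) * gini([labels[i] for i in left])
            + len(right) * gini([labels[i] for i in right])
        ) / len(order)
        if best_score is None or score < best_score:
            best_score, best_left = score, frozenset(left)
    return best_left

# Hypothetical feature values on very different scales, with binary labels.
values = [3, 1200, 45, 800, 10, 950]
labels = [0, 1, 0, 1, 0, 1]

raw_split = best_split_partition(values, labels)
log_split = best_split_partition([math.log1p(v) for v in values], labels)
scaled_split = best_split_partition([v / 1000 for v in values], labels)

print(raw_split == log_split == scaled_split)
```

Note that this only covers order-preserving transforms applied to each feature separately. Per-sample transforms such as relative abundance or CLR can change the ordering of samples within a feature, so the argument does not extend to them.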

Sure! You can use the metatable action in q2-sample-classifier to merge alpha diversity (or even metadata) with your other feature data.
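As a rough illustration of what adding alpha diversity as extra features looks like, the sketch below computes Shannon diversity and observed features from toy counts and appends them as additional columns. This is not the metatable action itself; the counts are made up, and the log base of 2 is an assumption.

```python
import math

def shannon(counts, base=2.0):
    """Shannon diversity; the log base of 2 is an assumption here."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)

def observed_features(counts):
    """Number of features with a non-zero count."""
    return sum(1 for c in counts if c > 0)

# Hypothetical feature table: sample ID -> {feature ID: count}.
table = {
    "s1": {"ASV1": 10, "ASV2": 10, "ASV3": 0},
    "s2": {"ASV1": 5, "ASV2": 0, "ASV3": 0},
}

# Append the two alpha-diversity values as extra columns for each sample.
augmented = {}
for sample, row in table.items():
    counts = list(row.values())
    augmented[sample] = dict(
        row,
        shannon=shannon(counts),
        observed_features=observed_features(counts),
    )

print(augmented["s1"])
```

Each sample keeps its original feature counts and gains two predictors that a classifier can use like any other column.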

I am not sure how this would work out, or if the differences in relative abundances would be meaningful.

Good luck!
