Using qiime taxa collapse Before Logistic Regression: Is It Appropriate?

Greetings,

I have a question and I'm not sure how to proceed. I have the table with abundance calculations, and I have the taxonomic assignment done with the classifier.

I wanted to ask whether it would be appropriate to use qiime taxa collapse before using the data for a logistic regression, because I have noticed that the classifier has produced many repeated assignments. In other words, many feature IDs have been assigned to the same taxon at the same level, but in the table they still appear as separate entries.

Regards!

Hi @jau,

I think the answer is It Depends :tm: on several things.

On whether you can collapse: yes. You can absolutely collapse your data to genus level before doing... whatever. In doing so, you're making a set of assumptions, which are discussed in the linked posts below (these are a starting place).
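
To make the collapsing step concrete, here is a minimal Python sketch of the idea: counts for features that share the same taxonomy string down to a chosen level are summed per sample. It is an illustration only, not QIIME 2's implementation. The function name `collapse_by_taxon`, the feature IDs, the sample IDs, and the taxonomy strings are all hypothetical. In QIIME 2 itself, the equivalent step would be `qiime taxa collapse` with `--p-level 6` for genus, assuming a seven-rank taxonomy.

```python
from collections import defaultdict


def collapse_by_taxon(counts, taxonomy, level):
    """Sum per-sample counts of features that share the first `level` ranks.

    counts:   {feature_id: {sample_id: count}}
    taxonomy: {feature_id: "rank1; rank2; ..."} (semicolon-separated ranks)
    level:    number of ranks to keep (6 = genus in a seven-rank taxonomy)
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for feature_id, sample_counts in counts.items():
        ranks = [r.strip() for r in taxonomy[feature_id].split(";")]
        taxon = ";".join(ranks[:level])  # truncate to the requested level
        for sample_id, count in sample_counts.items():
            collapsed[taxon][sample_id] += count
    return {taxon: dict(samples) for taxon, samples in collapsed.items()}


# Hypothetical toy data: asv1 and asv2 share a genus, asv3 does not.
counts = {
    "asv1": {"S1": 10, "S2": 0},
    "asv2": {"S1": 5, "S2": 7},
    "asv3": {"S1": 1, "S2": 2},
}
taxonomy = {
    "asv1": "d__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; "
            "f__Lactobacillaceae; g__Lactobacillus",
    "asv2": "d__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; "
            "f__Lactobacillaceae; g__Lactobacillus; s__unclassified",
    "asv3": "d__Bacteria; p__Bacteroidota; c__Bacteroidia; o__Bacteroidales; "
            "f__Bacteroidaceae; g__Bacteroides",
}

collapsed = collapse_by_taxon(counts, taxonomy, level=6)
for taxon, samples in collapsed.items():
    print(taxon, samples)
```

Here the two Lactobacillus features end up as a single row, which is the assumption being made when collapsing: that features with the same genus label can be treated as one unit for the model.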

There's a second issue in your question, and that's about the appropriate use of logistic regression in microbiome data. I assume you're using the taxa to predict an outcome? This might also be something to consider. I appreciate that the standard assumption in these models is that your exposure (e.g. microbiome) predicts your outcome (e.g. disease), but this can be more complicated.

If helpful, there's also a discussion about that on here:

https://forum.qiime2.org/t/modeling-bacterial-features-as-indepedent-variables/25766/8

Again, happy to discuss further in general terms and as always happy to have others involved!

Best,
Justine
