I have a question and I'm not sure how to proceed. I have the table with abundance calculations, and I have the taxonomic assignment done with the classifier.
I wanted to ask if it would be appropriate to use qiime taxa collapse before using the data for a logistic regression because I have noticed that the classifier has produced many repeated assignments. In other words, many feature IDs have been assigned to the same taxon and level, but in the table, they appear as separate values.
I think the answer is that it depends on several things.
The answer to this is yes. You can absolutely collapse your data to genus level before doing... whatever. You're making a set of assumptions, which are discussed in the linked posts below (these are a starting place).
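To make concrete what collapsing does to a table, here is a minimal plain-Python sketch. It is not QIIME 2's implementation, only an illustration of the idea: features whose taxonomy strings match up to the chosen level have their counts summed per sample. The function name `collapse_to_level`, the toy feature IDs, the counts, and the choice to pad shallow assignments with `__` are all assumptions made for this example.

```python
from collections import defaultdict


def collapse_to_level(table, taxonomy, level):
    """Sum counts of features that share the same taxonomy up to `level`.

    table:    {feature_id: {sample_id: count}}   (hypothetical toy layout)
    taxonomy: {feature_id: "d__...; p__...; ..."} (semicolon-separated ranks)
    level:    number of ranks to keep (6 = genus in a 7-rank scheme)
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for feature_id, counts in table.items():
        ranks = [r.strip() for r in taxonomy[feature_id].split(";")]
        # Assumption: pad assignments shallower than `level` with "__"
        # so every collapsed label has the same number of ranks.
        ranks = (ranks + ["__"] * level)[:level]
        label = ";".join(ranks)
        for sample_id, count in counts.items():
            collapsed[label][sample_id] += count
    return {label: dict(samples) for label, samples in collapsed.items()}


# Toy data: asv1 and asv2 received the same genus-level assignment.
table = {
    "asv1": {"S1": 10, "S2": 0},
    "asv2": {"S1": 5, "S2": 7},
    "asv3": {"S1": 1, "S2": 2},
}
lacto = "d__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus"
taxonomy = {
    "asv1": lacto + "; s__A",
    "asv2": lacto + "; s__B",
    "asv3": "d__Bacteria; p__Bacteroidota",  # only resolved to phylum
}

collapsed = collapse_to_level(table, taxonomy, level=6)
for label, samples in collapsed.items():
    print(label, samples)
```

The two features sharing a genus end up as one row, which is the behaviour you are asking about. The trade-off is that any real differences between those features are no longer visible to the model.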
There's a second issue in your question, and that's about the appropriate use of logistic regression in microbiome data. I assume you're using the taxa to predict an outcome? This might also be something to consider. I appreciate that the standard assumption in these models is that your exposure (e.g. microbiome) should predict your outcome (e.g. disease), but this can be more complicated.
If helpful, there's also a discussion about that on here: