Hi QIIME community,
I am looking to use the taxa collapse plugin to collapse my feature table, but I have run into an issue with extra data at the genus level in my taxonomy strings. For example, I see things like
The genus is Aphanizomenon; however, there are extra strain IDs at the end, so these two Aphanizomenon features will not collapse into one genus. Is there any way to get around this without having to modify the whole taxonomy table manually? I have around 10,000 features. These were identified with a classifier trained on SILVA.
Thanks in advance for the help!
Unfortunately no, that scheme is declaring them to be two different genera.
I wonder if something went slightly wrong upstream of the classifier. It seems strange to me that there would be whitespace in the taxonomy label, so I wonder if the taxonomy used was mapping taxonomy strings to OTUs (rather than vice-versa).
Could you provide your reference FeatureData[Taxonomy], the one used to train the classifier?
I think that’s just a quirk with SILVA — lots of whitespace in the taxonomy labels, and lots of taxa with strain IDs in the taxonomy label.
@jjankowiak SILVA is better in other ways (e.g., it is updated more recently and more frequently), but this is one reason why I prefer Greengenes in many cases: the taxonomy labels are more uniform.
The best course of action would be to modify the taxonomy labels before training the classifier, e.g., to remove strain designations.
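As a rough illustration of that kind of cleanup, here is a minimal Python sketch that trims everything after the first word of the genus label. It is only a heuristic under several assumptions that are not confirmed in this thread: ranks are separated by `;`, the genus sits at zero-based position 5, rank prefixes end in `__`, and the strain ID is whatever follows the first whitespace. The function name `strip_strain_suffix`, the `keep_prefixes` parameter, and the sample strings are all hypothetical. Multi-word genus labels (other than the `Candidatus` case handled here) would be truncated too, so check the output before retraining.

```python
def strip_strain_suffix(taxon, level_index=5, keep_prefixes=("Candidatus",)):
    """Drop trailing strain-like tokens from one rank of a taxonomy string.

    Assumes ';'-separated ranks and an optional 'X__' rank prefix.
    level_index is zero-based (5 = genus in a 7-rank scheme).
    """
    ranks = [r.strip() for r in taxon.split(";")]
    if len(ranks) <= level_index:
        # Annotation is shallower than the target rank; nothing to trim.
        return ";".join(ranks)

    # Split "D_5__Aphanizomenon XYZ1" into prefix "D_5", "__", and the name.
    prefix, sep, name = ranks[level_index].rpartition("__")
    tokens = name.split()
    if not tokens:
        return ";".join(ranks)

    # Keep "Candidatus <name>" as two words; otherwise keep the first word only.
    if tokens[0] in keep_prefixes and len(tokens) > 1:
        kept = " ".join(tokens[:2])
    else:
        kept = tokens[0]

    ranks[level_index] = prefix + sep + kept
    return ";".join(ranks)


# Hypothetical labels, made up for illustration (not taken from SILVA).
a = strip_strain_suffix("D_0__A;D_1__B;D_2__C;D_3__D;D_4__E;D_5__Aphanizomenon XYZ1")
b = strip_strain_suffix("D_0__A;D_1__B;D_2__C;D_3__D;D_4__E;D_5__Aphanizomenon ABC22")
print(a == b)  # both labels now end in the bare genus name
```

You would apply something like this to the Taxon column of the reference taxonomy file, re-import it, and then retrain the classifier.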
I hope that helps!