[Feature Request] Warnings for Importing Taxonomies with Odd Levels

Based on conversations elsewhere, I wanted to suggest adding a warning when importing a taxonomy with only a single level, or when merging taxonomies that have different numbers of levels. These would appear when running qiime tools import or qiime feature-table merge-taxa, respectively. The warnings might read:

Warning! Only one level detected in the Taxon column. If this was unintentional, check the format of the file passed to --input-path.

or

Warning! Merging taxonomies with different numbers of levels may cause problems in the future (e.g. when creating a classifier). Check the levels of each of your taxonomies.

This would have been tremendously helpful for me. As you can see here, I unintentionally had a trailing semicolon in my imported taxonomy, which meant the entire Taxon string was interpreted as a single level. I didn't notice this while I dereplicated, extracted sequences, filtered, etc. When I used BLAST or vsearch to identify sequences, I noticed that the Taxon strings for sequences from the imported file differed from those in the other file, but I didn't understand why. It wasn't until rescript evaluate-fit-classifier failed and I asked on the forum that I realized the problem.

I agree with @Nicholas_Bokulich in the original issue that taxonomies should not be required to have certain levels or a certain number of levels. Still, a simple warning would be helpful.

This warning would’ve saved me a lot of time and confusion.
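
For what it's worth, here is a rough sketch of what the two checks could look like. It is not QIIME 2 code: the function names are hypothetical, a taxonomy is represented as a plain dict of feature ID to Taxon string, and I am assuming that levels are separated by semicolons and that empty fields (for example after a trailing semicolon) do not count as levels.

```python
import warnings
from typing import Mapping


def count_levels(taxon: str) -> int:
    """Count semicolon-delimited levels in one Taxon string.

    Assumption: empty fields (e.g. after a trailing semicolon) are ignored.
    """
    return len([part for part in taxon.split(";") if part.strip()])


def check_single_level(taxa: Mapping[str, str]) -> bool:
    """Warn (and return True) if every Taxon string has exactly one level."""
    depths = {count_levels(taxon) for taxon in taxa.values()}
    if depths == {1}:
        warnings.warn(
            "Only one level detected in the Taxon column. If this was "
            "unintentional, check the format of the file passed to --input-path."
        )
        return True
    return False


def check_merge_levels(*taxonomies: Mapping[str, str]) -> bool:
    """Warn (and return True) if the taxonomies differ in their deepest level count."""
    max_depths = {
        max(count_levels(taxon) for taxon in taxa.values())
        for taxa in taxonomies
        if taxa  # skip empty taxonomies
    }
    if len(max_depths) > 1:
        warnings.warn(
            "Merging taxonomies with different numbers of levels may cause "
            "problems in the future (e.g. when creating a classifier). "
            "Check the levels of each of your taxonomies."
        )
        return True
    return False
```

Both functions only warn and never raise, so a taxonomy with an unusual number of levels would still import or merge as it does now.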

Thanks for considering!


Hey @alexkrohn,

Thanks for this suggestion - I agree this is quite a reasonable request and would be straightforward to implement.

I've created a GitHub issue for this so our team can track this request and address it when bandwidth allows.

Cheers :lizard: