Reconciling 16S Databases in a Single Pipeline

SoilRotifer · August 28, 2024, 3:36pm

We've been having some discussion on this topic and here are some thoughts:

One thing that might be helpful for a comparison like this is that qiime feature-table tabulate-seqs now takes one or more optional FeatureData[Taxonomy] artifacts, which makes it straightforward to compare how different classifiers assigned taxonomy to the same set of sequences.
If you provide the per-feature frequency output generated by qiime feature-table summarize-plus as metadata to qiime feature-table tabulate-seqs, exploring the results gets easier, for example by letting you sort the table by the number of samples a feature appears in or by its total frequency.
Basically, you can follow something similar to this approach to append multiple taxonomies to a table, or to whatever other output you are working with.
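
To make the idea a bit more concrete, here is a rough Python sketch of what "multiple taxonomies side by side, sortable by total frequency" looks like. This is not the QIIME 2 implementation, just plain dictionaries. The names classifier_a, classifier_b, total_frequency and side_by_side are all made up for illustration, the toy labels are not real results, and the "(missing)" placeholder is my own assumption for features a classifier did not label.

```python
# Toy inputs (hypothetical): feature ID -> taxonomy string, one dict per classifier.
classifier_a = {
    "f1": "Bacteria; Firmicutes",
    "f2": "Bacteria; Proteobacteria",
    "f3": "Bacteria",
}
classifier_b = {
    "f1": "Bacteria; Bacillota",
    "f2": "Bacteria; Pseudomonadota",
}
# Toy per-feature total frequencies (hypothetical numbers).
total_frequency = {"f1": 120, "f2": 950, "f3": 15}


def side_by_side(taxonomies, frequencies):
    """Build one row per feature with a column for each taxonomy.

    Rows are sorted by descending total frequency, then by feature ID.
    Features absent from a taxonomy get the placeholder "(missing)".
    """
    feature_ids = set()
    for assignments in taxonomies.values():
        feature_ids.update(assignments)

    rows = []
    for fid in feature_ids:
        row = {"feature_id": fid, "total_frequency": frequencies.get(fid, 0)}
        for name, assignments in taxonomies.items():
            row[name] = assignments.get(fid, "(missing)")
        rows.append(row)

    rows.sort(key=lambda r: (-r["total_frequency"], r["feature_id"]))
    return rows


table = side_by_side(
    {"classifier_a": classifier_a, "classifier_b": classifier_b},
    total_frequency,
)
for row in table:
    print(row)
```

Sorting by frequency first means the most abundant features, where a disagreement between classifiers matters most, end up at the top.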

EDIT: Obviously this deals with amplicon data, but you can use similar approaches on the reference data. The only issue would be dealing with mismatched sequence IDs, or with sequence sets that only partially overlap between the databases.
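
As a quick sanity check before comparing reference databases, you could look at how much their sequence IDs overlap. A minimal sketch, assuming you have already pulled the IDs out of each database into plain Python collections (the function name compare_ids and the example IDs are hypothetical):

```python
def compare_ids(ids_a, ids_b):
    """Report which sequence IDs are shared and which are unique to each side."""
    a, b = set(ids_a), set(ids_b)
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "a_is_subset_of_b": a <= b,
        "b_is_subset_of_a": b <= a,
    }


# Hypothetical IDs, for illustration only.
report = compare_ids(["seq1", "seq2", "seq3"], ["seq2", "seq3", "seq4"])
for key, value in report.items():
    print(key, value)
```

This only tells you whether the IDs line up. If the two databases name the same sequence differently, the IDs will not match and you would need some other key to pair them.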

Anyone else, please feel free to jump in!

-Cheers!
-Mike