SILVA 138 Classifiers

I just wanted to let everyone know that I’ve cobbled together a simple pipeline for constructing classifiers based on the SILVA 138 release. I’ve been working on this as time permits, so I apologize in advance for the shortcuts :scissors: and clunkiness :hammer: of my approach, but I figured this would be useful for the community. At least in the short term :timer_clock: .

Anyway, the files will be temporarily available here, until I can find a longer-term hosting solution:

Do not be surprised if they suddenly disappear :wilted_flower: . If they do, I hope the pipeline I’ve linked above will be sufficient.

The classifiers, along with the reference sequences and taxonomy files used to build them, are available too. Note: I’ve made classifiers with and without the species labels. Leaving out the species labels not only helps to reduce the size of the classifiers, but also allows for faster classification, as there is less rank information. This may be ideal for those who typically do not trust species-level taxonomy. Either way, use what works best for you.
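For anyone curious what "without the species labels" amounts to, here is a minimal sketch and not the actual pipeline code. It assumes taxonomy strings use semicolon-separated ranks with prefixes such as `g__` and `s__`; the function name `drop_species` and the example lineage are purely illustrative.

```python
def drop_species(lineage: str, sep: str = ";") -> str:
    """Return the lineage with any species-level (s__) rank removed.

    Assumes rank labels are prefixed (e.g. 'g__', 's__') and separated
    by `sep`; this format is an assumption for illustration.
    """
    # Split into individual rank labels, ignoring empty pieces.
    ranks = [r.strip() for r in lineage.split(sep) if r.strip()]
    # Keep every rank except the species-level one.
    kept = [r for r in ranks if not r.startswith("s__")]
    return (sep + " ").join(kept)


# Illustrative lineage, not copied from the reference files.
example = (
    "d__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; "
    "f__Lactobacillaceae; g__Lactobacillus; s__Lactobacillus_iners"
)
genus_level = drop_species(example)
print(genus_level)
```

Fewer distinct labels at the deepest rank is what makes the species-free classifier smaller.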

Please let me know if these are useful. Otherwise happy :qiime2:-ing my friends!

-Mike

Awesome, thanks Mike.

A bit of a philosophical/operational question. Given all the changes in taxonomy, with groups moving between phyla, classes, orders, etc., it is becoming impossible to compare taxonomic analyses performed with different SILVA/classifier versions in QIIME2. Do you see a potential solution in the future, such as selecting which taxonomy flavor/vintage to use at the classification step, without selecting different classifier files and re-running all analyses?

Cheers,
Mircea

Hi @mpodar,

You’ve discovered one of the things that keeps me up at night! :scream: I would like to figure out a way to provide taxonomies from multiple sources (e.g. GTDB, SILVA, etc.) and be able to present those side-by-side. Something like a taxonomy-assignment ensemble approach, similar to what is available through the online version of SILVA. I know there are people linking DOIs to taxonomy, so that if your data is assigned to some record / lineage, and that record / lineage has its taxonomy updated, then you just pull that updated information via the DOI.

I do not necessarily think you’d have to rerun all of your analyses, unless you are collapsing your OTUs/ASVs by taxonomy. The patterns in your ASVs should be the same, unless the data has been parsed based on taxonomy.
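To make that concrete, here is a toy sketch in plain Python (not QIIME 2 code). The feature table, the two taxonomy versions, and the `collapse` helper are all made up for illustration. The ASV-level counts are identical in both cases; only the collapsed tables differ.

```python
from collections import defaultdict


def collapse(table, taxonomy):
    """Sum per-sample counts of all features that share a taxonomy label."""
    collapsed = defaultdict(lambda: defaultdict(int))
    for feature_id, sample_counts in table.items():
        label = taxonomy[feature_id]
        for sample, count in sample_counts.items():
            collapsed[label][sample] += count
    # Convert nested defaultdicts to plain dicts for easy comparison.
    return {label: dict(counts) for label, counts in collapsed.items()}


# Hypothetical ASV table: feature -> {sample: count}.
table = {
    "asv1": {"s1": 10, "s2": 0},
    "asv2": {"s1": 5, "s2": 7},
}

# Two hypothetical taxonomy versions; asv2 moves to another phylum.
tax_old = {"asv1": "p__A; c__X", "asv2": "p__A; c__X"}
tax_new = {"asv1": "p__A; c__X", "asv2": "p__B; c__Y"}

old_collapsed = collapse(table, tax_old)
new_collapsed = collapse(table, tax_new)
print(old_collapsed)
print(new_collapsed)
```

Anything computed directly on `table` is unaffected by the taxonomy change; only results derived from the collapsed tables would need to be redone.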

In a nutshell, I do not have a good answer to your inquiry. But this is something I have been thinking about quite often these days. Perhaps someone much smarter than I will have better insight into this. :slight_smile:

-Best wishes!
-Mike
