Can I use different versions of Silva for taxonomic classification and fragment insertion?

Irshad · December 6, 2021, 10:52pm

I used the fragment-insertion method with SILVA 128 and filtered the feature-table as you have reported here https://currentprotocols.onlinelibrary.wiley.com/doi/full/10.1002/cpbi.100. However, no features were removed. Does that mean all features were inserted onto the tree?

Also, would it be technically sound to use SILVA 138 with q2-feature-classifier when the tree (SEPP fragment insertion) is based on SILVA 128?

Thank you

yanxianl · December 7, 2021, 9:19am

Hi, I just want to upvote this question since I have the same doubt. All my representative sequences can be inserted into the SILVA 128 reference tree using SEPP fragment insertion, but I used SILVA 138 for the taxonomic assignment. Will doing so cause any problems?

jwdebelius · December 7, 2021, 10:10am

Hi @yanxianl and @Irshad,

I moved this over to a new topic, since the older one is two years old. I'd say unless you're using a method that explicitly requires the same version of the database, you're probably okay to mix and match. I've even mixed and matched across databases (Greengenes tree with Silva classification), although this is much more of a headache to explain to collaborators/reviewers.

I can foresee this being a problem in two semi-niche cases:

- If you're using Empress, your taxonomic annotation may not match your tree tips, so you may need to be more cautious about this issue (although I think you'll be okay).
- If you're using Sidle for multiple-region reconstruction, your database versions must all match.

Outside those two scenarios, you should be good!

The relationship between taxonomy and phylogeny has always been kind of hand-wavy anyway. We call things by names that don't hold up molecularly, and that makes annotation quite messy! (See the GTDB paper for more details.) So, given that the relationship is an inexact science, I tend to just go for it.

Best,
Justine

https://www.nature.com/articles/nbt.4229

yanxianl · December 7, 2021, 10:53am

Hi Justine,

Thanks for sharing your thoughts on this topic. I was considering whether I should build a de novo tree instead. This clears up my concerns.

Mehrbod_Estaki · December 8, 2021, 11:17pm

Thanks @jwdebelius! I agree with everything you said, including that even Empress should be ok with mixing and matching.

@Irshad, yes: if no features were removed by the filter, the implication is that all of them were inserted into the tree. I have actually never had any reads fail to insert when using mouse or human gut samples, so this is not uncommon at all.
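If you want to double-check that every feature really landed in the tree, a rough sketch is to compare the feature IDs in your table against the tip names of the insertion tree. The snippet below is plain Python with made-up toy data, not a QIIME 2 call: `newick_tip_names` and `split_by_insertion` are hypothetical helper names, and the regex only copes with simple unquoted Newick labels. It assumes you have already exported the tree as a Newick string and the table's feature IDs as a list.

```python
import re


def newick_tip_names(newick):
    """Return the set of tip labels in a simple Newick string.

    A tip label is taken to be a name that directly follows "(" or ",".
    Quoted labels and bracketed comments are NOT handled (assumption).
    """
    return set(re.findall(r"[(,]\s*([^():,;\s]+)", newick))


def split_by_insertion(feature_ids, tip_names):
    """Split feature IDs into (inserted, missing) relative to the tree tips."""
    features = set(feature_ids)
    return features & tip_names, features - tip_names


# Toy data, invented for illustration: two reference tips plus two inserted features.
tree = "((ref1:0.1,featA:0.2)n1:0.3,(ref2:0.1,featB:0.05)n2:0.2);"
table_ids = ["featA", "featB", "featC"]

inserted, missing = split_by_insertion(table_ids, newick_tip_names(tree))
print("inserted:", sorted(inserted))
print("missing:", sorted(missing))
```

With real data, an empty `missing` set would line up with the filter step removing nothing, i.e. every feature in the table has a matching tip in the tree.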

As for using a Silva backbone tree, I just want to give a heads up that this hasn't been thoroughly benchmarked and checked. While it is probably ok, I tend to use the vetted GG backbone tree for fragment insertion and then mix and match for taxonomy with whatever suits my needs best. I've also never been given any hassle from reviewers about this; only once did someone question it, and they were happy when I explained that it was a non-issue for all the reasons @jwdebelius mentioned.
