q2-sidle reference database meaning

Hi everyone!

I am starting to use q2-sidle to analyze microbiome data and I have some questions about it. The Database Preparation section (Database Preparation — q2-sidle 2020.8 documentation) uses a reference database, but I can't find any information about it. Its entries also have wonderful names such as «wonder woman» or «batman».

Does anyone know what this database is?

Thank you!


Hi @elsamdea,

Within the tutorial, the reference database is just an example sequence-taxonomy reference database. I think the one provided is from Greengenes. However, you can also use the pre-made SILVA or any other database of your liking.

Additionally, we’ve made it easier for users to construct their own sequence-taxonomy reference database with RESCRIPt. Check it out!

Does this help?



Hi @elsamdea,

@SoilRotifer is totally right!

The tutorial database is, I think, based on Greengenes, but it’s specifically curated to have characteristics that I wanted in the tutorial data. So, you need to prepare your own database from the full dataset.

If you plan to use SILVA, it will be a larger database and you’ll need to use SILVA 128 if you want a tree. You can get 128 pretty easily via RESCRIPt (which you already have installed with Sidle!) or you can run with Greengenes.
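
For anyone following along, here is a rough sketch of what fetching SILVA 128 through RESCRIPt might look like. It only builds and prints the `qiime rescript get-silva-data` command rather than running it. The output file names and the `build_get_silva_cmd` helper are made up for illustration, and the `SSURef_NR99` target is an assumption about which release you want. Please check the flags against `qiime rescript get-silva-data --help` in your own environment.

```python
import shlex
from pathlib import Path


def build_get_silva_cmd(version="128", target="SSURef_NR99", out_dir="silva"):
    """Build (but do not run) a `qiime rescript get-silva-data` command.

    The helper name and the output file names are hypothetical;
    the target default is an assumption, not a recommendation from the thread.
    """
    out = Path(out_dir)
    return [
        "qiime", "rescript", "get-silva-data",
        "--p-version", version,   # SILVA release to download
        "--p-target", target,     # which SILVA reference set to fetch
        "--o-silva-sequences", str(out / f"silva-{version}-rna-seqs.qza"),
        "--o-silva-taxonomy", str(out / f"silva-{version}-tax.qza"),
    ]


cmd = build_get_silva_cmd()
# Print a copy-pasteable command line for inspection.
print(shlex.join(cmd))
```

If the printed command looks right, it could be pasted into a shell, or the list could be passed to `subprocess.run(cmd, check=True)` inside an activated QIIME 2 environment.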

You can also check the avengers assemble repository to see if any of the databases suit your needs, but you likely need to prepare your own.



Thank you @SoilRotifer!

Yes, your answer really helps me! I have used RESCRIPt before to prepare the SILVA 138 database, but I did not think of it this time. I will follow your advice!


Hi @jwdebelius,

Thank you so much for your answer, Justine!

I will check the avengers assemble repository, but you are right. Right now I am not sure whether I will prepare my own database, because I do not want to miss any family/genus/species.



An off-topic reply has been split into a new topic: Finding alignment files for Sidle

Please keep replies on-topic in the future.