Unpublished sequences in SILVA 138 db; preparing personal database

I am using QIIME 2 to analyze NGS results from a study of apicomplexans in ruminants. I used the SILVA 138 classifier but found that some of the sequences are unpublished (so they can’t be used as reference sequences). Is there a way to exclude unpublished sequences from the database? Or should we build our own database of published sequences? If so, could you please provide a guideline or a link to one?


Hello Abdul,

While SILVA 138 might include some Candidatus taxonomy, all the entries have been curated and published in the SILVA release, and I think the SILVA database itself counts as a publication you can cite. You are good to go!

If you still want to construct your own database, you can see the full process here:



Thanks a lot, Colin.


Hi all,

I have spent some time re-curating the SILVA database (for Theileria/Babesia) and found some discrepancies. For example, some Babesia sequences were classified as Bacteria instead of Apicomplexa (e.g. KX881914). So it is better to re-curate the database before using it. I also found several N… bases in the published sequences. Re-curation is time-consuming, but in my experience it is really useful and a must-do step, since it can change the taxonomic output of your data.
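For anyone who wants to automate a first pass of this kind of check, here is a minimal Python sketch (not the procedure I actually followed, and not part of QIIME 2 or SILVA). It assumes you have exported the reference sequences as FASTA text and the taxonomy as a mapping from sequence ID to lineage string. The function names, the toy IDs (refA, refB, refC) and the toy lineages are all hypothetical and only illustrate the idea: flag references that contain N bases, and flag references whose lineage names a genus of interest without the expected higher-level taxon.

```python
def parse_fasta(text):
    """Yield (seq_id, sequence) pairs from FASTA-formatted text."""
    seq_id, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Keep only the first token of the header as the ID.
            seq_id, chunks = line[1:].split()[0], []
        else:
            chunks.append(line.upper())
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def screen_references(fasta_text, taxonomy, expected_taxon, genera, max_n=0):
    """Return {seq_id: [reasons]} for references that deserve a manual look.

    A sequence is flagged if it has more than `max_n` N bases, or if its
    lineage mentions one of `genera` but does not mention `expected_taxon`.
    """
    flagged = {}
    for seq_id, seq in parse_fasta(fasta_text):
        reasons = []
        n_count = seq.count("N")
        if n_count > max_n:
            reasons.append(f"{n_count} N bases")
        lineage = taxonomy.get(seq_id, "")
        if any(g in lineage for g in genera) and expected_taxon not in lineage:
            reasons.append(f"lineage lacks {expected_taxon}")
        if reasons:
            flagged[seq_id] = reasons
    return flagged


if __name__ == "__main__":
    # Toy data, invented for illustration only (not real SILVA records).
    fasta = ">refA\nACGTACGTAC\n>refB\nACGTNNNTAC\n>refC\nACGTACGTAA\n"
    taxonomy = {
        "refA": "d__Eukaryota; p__Apicomplexa; g__Babesia",
        "refB": "d__Eukaryota; p__Apicomplexa; g__Theileria",
        "refC": "d__Bacteria; g__Babesia",
    }
    hits = screen_references(fasta, taxonomy, "Apicomplexa", ["Babesia", "Theileria"])
    for seq_id, reasons in hits.items():
        print(seq_id, "; ".join(reasons))
```

The output is only a list of candidates; each flagged record still needs to be checked by hand against its source entry before you drop or relabel it.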

