taxonomical assignment questions: SILVA, PR2 and vsearch vs BLAST

SoilRotifer · March 1, 2021, 4:08pm

You can use RESCRIPt to make a 16S rRNA gene reference set from the SILVA database. This is the tool we currently use to generate the SILVA files on the data resources page.

I often prefer to keep the 18S rRNA gene sequences, because off-target amplification can occur. Leaving these 18S sequences in the reference set as "decoys" helps identify reads that are not 16S rRNA gene sequences. However, a 16S rRNA-only SILVA reference database can help limit memory usage. To build one, follow the tutorial linked above and keep only Bacteria and Archaea.
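To make the "keep only Bacteria and Archaea" step concrete, here is a minimal Python sketch of the filtering logic, applied to tab-separated `id<TAB>lineage` lines. It is not the RESCRIPt workflow itself. The function name `keep_prokaryotes`, the `d__` domain prefixes, and the example IDs and lineages are assumptions for illustration, so check them against your own taxonomy file first.

```python
def keep_prokaryotes(taxonomy_lines, domains=("d__Bacteria", "d__Archaea")):
    """Keep only id<TAB>lineage lines whose lineage starts with a wanted domain.

    Assumes lineages begin with a domain label such as "d__Bacteria".
    Lines without a tab (and header lines such as "Feature ID<TAB>Taxon")
    are dropped because their second field does not start with a domain.
    """
    kept = []
    for line in taxonomy_lines:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue
        feature_id, lineage = line.split("\t", 1)
        # str.startswith accepts a tuple, so both domains are checked at once
        if lineage.strip().startswith(domains):
            kept.append(f"{feature_id}\t{lineage}")
    return kept


# Made-up example records, not real SILVA entries
example = [
    "Feature ID\tTaxon",
    "seq1\td__Bacteria; p__Proteobacteria",
    "seq2\td__Eukaryota; p__Chlorophyta",
    "seq3\td__Archaea; p__Crenarchaeota",
]
filtered = keep_prokaryotes(example)
```

The IDs that survive this filter can then be used to subset the matching sequence file, so that the sequences and taxonomy stay in sync.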

Not currently, although other users may have made their own. Hopefully, others will chime in on this thread. Otherwise, you can also download 16S rRNA gene files from GTDB and re-format them for import into QIIME 2. You should be able to do the same for PR2.
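As a rough idea of what the re-formatting could involve, here is a minimal Python sketch that splits FASTA headers into a sequence ID and a lineage, giving the two-column `id<TAB>lineage` layout of a headerless TSV taxonomy file. The header layout (`>ID lineage [key=value] ...`) is an assumption, as are the function names and the example header, so inspect the headers in the files you actually download before relying on it.

```python
def split_header(header):
    """Split a FASTA header into (sequence_id, lineage).

    Assumes the ID is the first whitespace-delimited token, the lineage
    follows it, and any trailing annotations start with " [".
    """
    header = header.lstrip(">").strip()
    seq_id, _, rest = header.partition(" ")
    # Species names can contain spaces, so cut at the first " [" instead
    lineage = rest.split(" [", 1)[0].strip()
    return seq_id, lineage


def headers_to_taxonomy_rows(headers):
    """Turn FASTA headers into id<TAB>lineage rows, skipping headers with no lineage."""
    rows = []
    for header in headers:
        seq_id, lineage = split_header(header)
        if seq_id and lineage:
            rows.append(f"{seq_id}\t{lineage}")
    return rows


# Made-up header for illustration only
example_header = ">ID1 d__Bacteria;p__Proteobacteria;s__Escherichia coli [location=1..1500]"
rows = headers_to_taxonomy_rows([example_header])
```

Once the FASTA headers are reduced to the bare IDs, the sequences and the taxonomy table can be imported separately, as `FeatureData[Sequence]` and `FeatureData[Taxonomy]` respectively.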

What parameter settings did you use for vsearch and BLAST?