Make 12S reference database using RESCRIPt

SoilRotifer · January 17, 2021, 4:48pm

I've got quite a bit of experience making 12S rRNA gene databases (e.g. for Eukaryota and Metazoa). Hopefully, we can help.

I agree with @Nicholas_Bokulich: there are things beyond our control when it comes to internet connections. Your query does not appear to return that many sequences, but you can break it up into smaller downloadable chunks within the vertebrates.

Here is a query you can use to download Gnathostomata:

txid7776[ORGN] AND (12S OR 12S ribosomal RNA OR 12S rRNA) AND (mitochondrion[Filter] OR plastid[Filter]) NOT environmental sample[Title] NOT environmental samples[Title] NOT environmental[Title] NOT uncultured[Title] NOT unclassified[Title] NOT unidentified[Title] NOT unverified[Title]
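If you end up splitting the download into several chunks, it may be easier to generate the query strings than to edit them by hand. Here is a minimal sketch in Python that rebuilds the query above for any NCBI taxon ID. The function name `build_12s_query` and the `CHUNK_TAXIDS` table are my own illustration, and the Mammalia and Aves IDs are just example chunks, so double-check any ID against NCBI Taxonomy before using it.

```python
# Title terms to exclude, copied from the Gnathostomata query in this post.
EXCLUDED_TITLE_TERMS = [
    "environmental sample",
    "environmental samples",
    "environmental",
    "uncultured",
    "unclassified",
    "unidentified",
    "unverified",
]


def build_12s_query(taxid: int) -> str:
    """Return the 12S Entrez query from this post for one NCBI taxon ID."""
    exclusions = " ".join(f"NOT {term}[Title]" for term in EXCLUDED_TITLE_TERMS)
    return (
        f"txid{taxid}[ORGN] "
        "AND (12S OR 12S ribosomal RNA OR 12S rRNA) "
        "AND (mitochondrion[Filter] OR plastid[Filter]) "
        f"{exclusions}"
    )


# Hypothetical chunking scheme. Only 7776 (Gnathostomata) comes from the post;
# the other IDs are assumptions to verify against NCBI Taxonomy.
CHUNK_TAXIDS = {
    "Gnathostomata": 7776,
    "Mammalia": 40674,
    "Aves": 8782,
}

queries = {name: build_12s_query(taxid) for name, taxid in CHUNK_TAXIDS.items()}

for name, query in queries.items():
    print(f"{name}:\n{query}\n")
```

Each string can then be pasted into the NCBI search box, or passed as the `--p-query` value of `qiime rescript get-ncbi-data` if that is how you are downloading.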

One thing to note: always make sure you have some taxonomic "out groups", or "off target" taxa, in your reference database. This will better ensure that you do not under- or over-classify your data.

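Once you have downloaded your data as separate chunks, you can merge them using the standard QIIME 2 merge commands, such as `qiime feature-table merge-seqs` for the sequences and `qiime feature-table merge-taxa` for the taxonomy.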
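For reference, here is a rough sketch of what the merge step could look like, assuming the chunks were saved as sequence and taxonomy artifacts. It only assembles the `qiime feature-table merge-seqs` and `merge-taxa` command lines; the helper name `merge_command` and all of the `.qza` file names are hypothetical placeholders, so substitute your own.

```python
from typing import List


def merge_command(action: str, chunk_paths: List[str], merged_path: str) -> List[str]:
    """Build the argument list for a `qiime feature-table` merge action."""
    if action not in ("merge-seqs", "merge-taxa"):
        raise ValueError(f"unsupported action: {action}")
    argv = ["qiime", "feature-table", action]
    # Each input artifact gets its own --i-data flag.
    for path in chunk_paths:
        argv += ["--i-data", path]
    argv += ["--o-merged-data", merged_path]
    return argv


# Hypothetical file names for two downloaded chunks.
seq_chunks = ["mammalia-seqs.qza", "aves-seqs.qza"]
tax_chunks = ["mammalia-tax.qza", "aves-tax.qza"]

seqs_cmd = merge_command("merge-seqs", seq_chunks, "12s-seqs-merged.qza")
taxa_cmd = merge_command("merge-taxa", tax_chunks, "12s-tax-merged.qza")

# Print the commands so they can be checked before running them.
print(" ".join(seqs_cmd))
print(" ".join(taxa_cmd))
```

In an environment where QIIME 2 is installed, each list could be handed to `subprocess.run(cmd, check=True)`, or you can simply copy the printed commands into your terminal.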

Then you can proceed with RESCRIPt.

Keep us posted!