Hi @Peter_Kos,
The new SILVA v138 reference database recently became available. A QIIME-compatible version, with a less messy taxonomy, should follow soon. My recent attempt at making a cleaner taxonomy is available here:
Hi all, just an FYI:
Here are the new locations for the updated SILVA taxonomy (i.e. Greengenes-like) reference files for both the SSU and LSU data. To save space and bandwidth, these are the raw FASTA and TSV files, so you’ll have to import them and train the classifier yourself. Be sure to read the SILVA License.
Be wary of the species labels. For example, there are a few taxa annotated with a species label that corresponds not to the organism to which the sequence belongs, but to the source material from w…
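For anyone unsure what "import and train them yourself" involves, here is a rough sketch of the usual steps, written as a small Python helper that only builds and prints the QIIME 2 commands (it does not run them). The file names and the `prefix` argument are hypothetical placeholders, and I'm assuming the taxonomy TSV has no header row; adjust `--input-format` if yours does.

```python
import shlex


def build_import_and_train_commands(fasta, taxonomy_tsv, prefix="silva"):
    """Return QIIME 2 CLI commands as argument lists; nothing is executed."""
    # Output artifact names are hypothetical; pick whatever suits your project.
    seqs_qza = f"{prefix}-seqs.qza"
    tax_qza = f"{prefix}-tax.qza"
    classifier_qza = f"{prefix}-classifier.qza"
    return [
        # 1. Import the raw FASTA reference sequences.
        ["qiime", "tools", "import",
         "--type", "FeatureData[Sequence]",
         "--input-path", fasta,
         "--output-path", seqs_qza],
        # 2. Import the taxonomy TSV (assumed headerless here).
        ["qiime", "tools", "import",
         "--type", "FeatureData[Taxonomy]",
         "--input-format", "HeaderlessTSVTaxonomyFormat",
         "--input-path", taxonomy_tsv,
         "--output-path", tax_qza],
        # 3. Train a naive Bayes classifier on the imported reference.
        ["qiime", "feature-classifier", "fit-classifier-naive-bayes",
         "--i-reference-reads", seqs_qza,
         "--i-reference-taxonomy", tax_qza,
         "--o-classifier", classifier_qza],
    ]


if __name__ == "__main__":
    # Print shell-ready commands so they can be reviewed before running.
    for cmd in build_import_and_train_commands("silva-seqs.fasta", "silva-tax.tsv"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check the paths before committing to the training step, which can take a long time and a lot of memory on the full SILVA reference.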
Other details are outlined in the rest of the above thread, as well as in this one:
I can likely provide some insight here, as I am one of the contributors who helped format the SILVA database for QIIME. The D_X__ convention was chosen to be as unique and “safe” a text string as possible, considering the many bizarre taxonomy text annotations within the SILVA reference database. That is, it was meant as a quick fix to be able to search and parse these taxonomy strings.
The ‘D’ was a way of annotating the “taxonomic Depth”. At the time some of the code was written, th…
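To show why a distinctive prefix like D_X__ makes these strings easy to search and parse, here is a minimal Python sketch. It assumes the levels are separated by semicolons and that X is an integer depth starting at 0; the function name and the example lineage are mine, for illustration only, and are not taken from the formatting scripts themselves.

```python
import re

# Matches a single level such as "D_1__Proteobacteria":
# group 1 is the depth, group 2 is the taxon label.
PREFIX = re.compile(r"^D_(\d+)__(.*)$")


def parse_silva_taxonomy(taxonomy):
    """Split a D_X__-style taxonomy string into (depth, label) tuples."""
    levels = []
    for field in taxonomy.split(";"):  # assumed separator
        field = field.strip()
        if not field:
            continue  # tolerate a trailing semicolon
        match = PREFIX.match(field)
        if match is None:
            raise ValueError(f"level without a D_X__ prefix: {field!r}")
        levels.append((int(match.group(1)), match.group(2)))
    return levels


if __name__ == "__main__":
    example = "D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria"
    for depth, label in parse_silva_taxonomy(example):
        print(depth, label)
```

Because the regular expression is anchored to the start of each level, a label that happens to contain odd characters (brackets, spaces, underscores) is still captured whole in the second group.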
-Mike