I want to create a database from SILVA_132 using only 16S sequences.
I have read the RESCRIPt tutorial on building the database from SILVA, but I cannot figure out how to do it with SILVA 132 rather than the SILVA 138 shown in the example. If I download the Silva_132_release.zip archive from the Archive link, I cannot work out how to import its contents into qiime2 as in the first part of the tutorial ...
instead of:
tax_slv_ssu_138.txt.gz
taxmap_slv_ssu_ref_nr_138.txt.gz
tax_slv_ssu_138.tre.gz
the sequence file:
SILVA_138_SSURef_Nr99_tax_silva_trunc.fasta.gz
But if I wanted to extract the entire 16S sequence, as in the archive from the other link, how can I do that?
Can I only do this by selecting a region based on the primers used, with qiime feature-classifier extract-reads?
The SILVA database in that file contains full-length 16S rRNA (small subunit, SSU) sequences. The later steps in the tutorial are there to optimize performance for the sequenced target region.
May I ask what is your final goal? Taxonomic classification?
If yes, you can download a classifier trained on full-length SILVA_132 from older versions of QIIME2 here.
My goal is to replicate metagenomic analyses conducted on bacterial 16S with qiime2.
Using the complete SILVA database with qiime I get percentages totally different from those provided, while if I repeat the analysis with the SILVA 16S database in kraken2 I get practically identical results.
Surprised by this difference between databases, I want to create a database containing only 16S, like the kraken2 one, hoping to get similar results...
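One possible way to get a 16S-only subset before importing anything into qiime2, offered as a rough sketch rather than the tutorial's method: SILVA's SSU files mix prokaryotic 16S with eukaryotic 18S, so keeping only records whose top-level rank is Bacteria or Archaea should leave the 16S part. The code below assumes a headerless tab-separated taxonomy file (sequence ID, then lineage) and a plain FASTA with matching IDs. All function names and the demo records are hypothetical and are not part of QIIME 2 or RESCRIPt.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Assumption: "16S" is taken to mean SSU records whose top-level rank
# is Bacteria or Archaea; eukaryotic (18S) records are dropped.
PROKARYOTE_DOMAINS = ("Bacteria", "Archaea")


def read_taxonomy(lines: Iterable[str]) -> Dict[str, str]:
    """Parse headerless 'id<TAB>lineage' lines into a dict."""
    taxonomy = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        seq_id, lineage = line.split("\t", 1)
        taxonomy[seq_id] = lineage
    return taxonomy


def is_prokaryote(lineage: str) -> bool:
    """True if the first rank of the lineage is Bacteria or Archaea.

    Accepts bare names ('Bacteria;...') as well as rank-prefixed
    names such as 'D_0__Bacteria;...' or 'd__Bacteria; ...'.
    """
    top = lineage.split(";", 1)[0].strip()
    name = top.rsplit("__", 1)[-1]
    return name in PROKARYOTE_DOMAINS


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (id, sequence) pairs; the id is the first word of the header."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_16s(fasta_lines: Iterable[str],
               taxonomy: Dict[str, str]) -> Iterator[Tuple[str, str]]:
    """Keep only sequences whose lineage is bacterial or archaeal."""
    for seq_id, seq in parse_fasta(fasta_lines):
        lineage = taxonomy.get(seq_id)
        if lineage is not None and is_prokaryote(lineage):
            yield seq_id, seq


# Tiny made-up records, only to show the behaviour.
demo_taxonomy = read_taxonomy([
    "seqA\tD_0__Bacteria;D_1__Proteobacteria\n",
    "seqB\tD_0__Eukaryota;D_1__Opisthokonta\n",
    "seqC\tD_0__Archaea;D_1__Euryarchaeota\n",
])
demo_fasta = [">seqA\n", "ACGT\n", "ACGT\n",
              ">seqB\n", "TTTT\n",
              ">seqC desc\n", "GGCC\n"]
kept = list(filter_16s(demo_fasta, demo_taxonomy))
print(kept)  # [('seqA', 'ACGTACGT'), ('seqC', 'GGCC')]
```

With real files you would pass open file handles instead of the demo lists and write the kept records back out as FASTA, together with the matching taxonomy lines, before importing both into qiime2. Whether this brings the percentages closer to the kraken2 ones is something only a test run can show.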