Make OTU clusters from the SILVA database

W.H · September 30, 2024, 3:24am

Hi, QIIME users. I'm a student and am using this tool for the first time.
If what I ask sounds strange or unreasonable, please don't hesitate to point it out, since I am a complete beginner in bioinformatics.

My goal is to build a phylogenetic tree from my 18S rRNA protist results together with reference sequences from the SILVA database. As a first step, I was asked to build OTU clusters from the SILVA 138.1 database and extract a representative sequence from each cluster for use in the phylogenetic tree in QIIME 2. Is that possible?
I was able to import the SILVA database into QIIME 2 by following the tutorial "Processing, filtering, and evaluating the SILVA database (and other reference sequence data) with RESCRIPt".
After that, I think I need to run the command below.

qiime vsearch cluster-features-de-novo \
  --i-table table.qza \
  --i-sequences rep-seqs.qza \
  --p-perc-identity 0.99 \
  --o-clustered-table table-dn-99.qza \
  --o-clustered-sequences rep-seqs-dn-99.qza

However, SILVA seems to provide its database only as FASTA files, so I can't generate table.qza through the demultiplexing steps.
Could you give me any suggestions?

Thanks,

SoilRotifer · September 30, 2024, 1:31pm

Hi @W.H,

There are no FASTQ files for SILVA, because its sequences are mainly pulled from GenBank and other repositories, so they exist only in FASTA format.

I think the intent is for you to take the OTUs/ASVs you generate from your own data, which will be in FASTA format, and then append them to any representative sequences (or sequence clusters) of interest that you extract from the SILVA database.
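As a rough illustration of the "append" step, here is a minimal sketch in plain POSIX shell. The function name merge_fasta and the file names in the comments are hypothetical, and it assumes both inputs are ordinary FASTA files exported from your own pipeline and from SILVA. It concatenates the files and refuses to continue if any sequence ID occurs twice, since duplicate IDs tend to cause trouble on import.

```shell
#!/bin/sh
set -eu

# merge_fasta OUT IN1 [IN2 ...]
# Concatenate FASTA files into OUT, then print the number of sequences.
# Fails if any sequence ID (first word of a header line) is duplicated.
merge_fasta() {
    out=$1
    shift
    # awk '1' re-emits every line with a trailing newline, so a file that
    # lacks a final newline cannot glue its last line onto the next header.
    awk '1' "$@" > "$out"
    dups=$(awk '/^>/ {print substr($1, 2)}' "$out" | sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "Duplicate sequence IDs found:" >&2
        echo "$dups" >&2
        return 1
    fi
    grep -c '^>' "$out"
}

# Example call (hypothetical file names):
#   merge_fasta combined.fasta my-18S-rep-seqs.fasta silva-subset.fasta
```

The combined file could then presumably be imported with qiime tools import (type FeatureData[Sequence]) and passed to a tree-building pipeline such as qiime phylogeny align-to-tree-mafft-fasttree, though the exact steps depend on your data.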

But a far quicker approach would be to run fragment insertion, which is already part of QIIME 2. You can "insert" your 18S sequences into an existing phylogeny using the existing SEPP files here. If you'd like to make a new reference tree, you can try working through this notebook.
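For orientation, a fragment insertion call might look roughly like the sketch below. The wrapper function run_sepp is hypothetical, and the file names are assumptions: rep-seqs.qza stands for your own representative sequences, and sepp-refs-silva-128.qza for the SILVA SEPP reference artifact from the QIIME 2 data resources. The wrapper only prints the command if qiime or the input files are missing, so it can be reviewed before anything is run.

```shell
#!/bin/sh
set -eu

# run_sepp REP_SEQS SEPP_REFS
# Runs SEPP fragment insertion, or prints the command as a dry run when
# qiime or either input file is not available.
run_sepp() {
    reps=$1   # your 18S OTU/ASV representative sequences (assumed name)
    refs=$2   # SEPP reference database artifact (assumed name)
    set -- qiime fragment-insertion sepp \
        --i-representative-sequences "$reps" \
        --i-reference-database "$refs" \
        --o-tree insertion-tree.qza \
        --o-placements insertion-placements.qza
    if command -v qiime >/dev/null 2>&1 && [ -f "$reps" ] && [ -f "$refs" ]; then
        "$@"
    else
        echo "Dry run (qiime or input files not found): $*"
    fi
}

run_sepp "rep-seqs.qza" "sepp-refs-silva-128.qza"
```

The resulting insertion-tree.qza contains your sequences placed onto the reference phylogeny; whether that reference covers your protist groups well enough is something to check for your own data.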

Note that the latest version of SILVA, v138.2, is available, but to use it you'll need to install the latest version of RESCRIPt from GitHub, which you can do with the following commands:

conda activate qiime2-amplicon-2024.5
pip install git+https://github.com/bokulich-lab/RESCRIPt.git
qiime dev refresh-cache

W.H · October 2, 2024, 2:39am

Thanks @SoilRotifer.
I think building a new reference tree would also work well for my project. I'll check it out soon!

Cheers
