I know there are at least 10 topics in the forum with the same title. I am posting this one as a last resort, to find answers to a few doubts regarding SILVA. I should say up front that I am not a heavy user of 16S analysis, so I beg your pardon for my naivety.
I have still been using Greengenes whenever I need 16S results, and I failed when I tried to move to SILVA. Now I see that Greengenes is badly outdated compared to SILVA, so I think it’s time for a definitive update.
A few questions I couldn’t find an explanation for anywhere else:
Why does the SILVA database have uracils instead of thymines in its FASTA sequences? Am I downloading the wrong database file? I understand that this is rRNA, but sequences made with reverse transcriptase are DNA, so I don’t understand why SILVA does this when Greengenes did not. I looked for SILVA_138_SSURef_NR99_tax_silva.fasta, which contains a non-redundant set of sequences. Does 16S software like vsearch recognize the Us as equivalent to Ts? (Why not just change the Us in the database to Ts, like everywhere else?)
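To make the question concrete, here is a minimal sketch of the swap I have in mind, in case it turns out to be necessary. It assumes a plain FASTA file in which header lines start with `>`, and the function name `rna_fasta_to_dna` is just something I made up for illustration. I do not know whether this is the recommended way to prepare the file.

```python
def rna_fasta_to_dna(lines):
    """Yield FASTA lines with U/u replaced by T/t, in sequence lines only."""
    table = str.maketrans("Uu", "Tt")
    for line in lines:
        if line.startswith(">"):
            # Header line: leave it alone so words such as "Uncultured"
            # in the taxonomy string are not altered.
            yield line
        else:
            yield line.translate(table)


# Small in-memory example (made-up record, not real SILVA data).
example = [">X.1.8 Bacteria;Uncultured\n", "ACGUUGCA\n"]
converted = list(rna_fasta_to_dna(example))
```

On a real file, the same generator could be fed an open file handle and its output written to a new file line by line.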
Does SILVA provide a 16S-only database, or is it always mixed with 18S? I find it a bit annoying to have 18S sequences in the database if I am not amplifying 18S. Also, what happens if I am using 16S primers and somehow get contaminating 18S OTUs due to similar sequences?
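If there is no separate 16S file, I imagine the mixed file could be filtered by taxonomy, roughly as in the sketch below. It assumes each header looks like `>accession taxonomy;path;...` with the domain (Bacteria, Archaea, or Eukaryota) as the first field of the taxonomy string, which is my reading of the file and may not hold for every release. The function name `keep_prokaryotes` is hypothetical.

```python
def keep_prokaryotes(lines, domains=("Bacteria", "Archaea")):
    """Yield only FASTA records whose header taxonomy starts with a wanted domain.

    Assumes headers of the form '>ID Domain;Phylum;...' (an assumption
    about the file layout, not something confirmed by documentation here).
    """
    keep = False
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split(None, 1)  # [ID, taxonomy string]
            taxonomy = parts[1] if len(parts) > 1 else ""
            domain = taxonomy.split(";", 1)[0].strip()
            keep = domain in domains
        if keep:
            yield line


# Made-up records for illustration only.
records = [
    ">A.1.10 Bacteria;Proteobacteria;\n", "ACGU\n",
    ">B.1.10 Eukaryota;Opisthokonta;\n", "GGCC\n",
    ">C.1.10 Archaea;Euryarchaeota;\n", "UUAA\n",
]
filtered = list(keep_prokaryotes(records))
```

Whether dropping the eukaryotic records is actually a good idea is part of what I am asking, since keeping them might be what lets off-target 18S reads be recognized as such.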
In which situations would I prefer the align database over the regular FASTA database? What advantages (or downsides) do I get with the align files? I get overwhelmed just by looking at one; it’s a crazy file.
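As far as I can tell, the main visible difference is that the align file pads every sequence with gap characters. Here is a minimal sketch of how I understand the relationship between the two, assuming the gaps are written as `.` and `-` (my assumption from looking at the file) and using a hypothetical helper name `degap`:

```python
def degap(aligned_seq, gap_chars=".-"):
    """Remove alignment gap characters, leaving the unaligned sequence."""
    return aligned_seq.translate(str.maketrans("", "", gap_chars))


# Made-up aligned fragment, not real SILVA data.
unaligned = degap("..AC--GU-.")
```

If that is all there is to it, then the regular file would simply be the align file with the padding removed, and my question is when the padding itself is useful.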
Thanks for all the help and sorry for your trouble.