TEF1 metabarcoding: Should I use BLAST or VSEARCH instead of Naive Bayes?

Hi @salias,

Does your Fusarium classifier contain any out-group / decoy sequences? If not, that is likely why some of your BLAST results are not returning Fusarium while the classifier is...

If the classifier only contains, or "only knows about", Fusarium, then a query would have to be a really bad match to every reference sequence before it is left "unclassified". That is, there is a high chance that non-Fusarium sequences will erroneously be classified as Fusarium.
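To make that concrete, here is a toy sketch in Python. It is not how the real naive Bayes classifier computes confidence, just an illustration of the general idea that a score normalized over the labels the classifier knows about has nowhere else to go when every label is Fusarium. The sequences are made up (not real TEF1), and `genus_confidence` and the `2 ** shared` weighting are stand-ins I invented for this example.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def genus_confidence(query, references, k=3):
    """Toy confidence per genus, normalized over the known reference labels.

    Each reference gets a weight of 2 ** (number of shared k-mers), a crude
    stand-in for a likelihood. Weights are normalized to sum to 1 and then
    summed per genus (the first word of each label).
    """
    q = kmers(query, k)
    weights = {
        label: 2 ** len(q & kmers(ref, k))
        for label, ref in references.items()
    }
    total = sum(weights.values())
    confidence = {}
    for label, weight in weights.items():
        genus = label.split()[0]
        confidence[genus] = confidence.get(genus, 0.0) + weight / total
    return confidence


# Made-up sequences for illustration only.
FUSARIUM_ONLY = {
    "Fusarium oxysporum": "ATGGGTAAGGAAGACAAGACTCAC",
    "Fusarium graminearum": "ATGGGTAAGGAGGACAAGACTCAT",
}
DECOY = {"Trichoderma harzianum": "ATGGGCAAAGAAAAGACCCACATT"}

# An off-target read that resembles the decoy far more than either Fusarium.
QUERY = "ATGGGCAAAGAAAAGACCCACATC"

print(genus_confidence(QUERY, FUSARIUM_ONLY))
print(genus_confidence(QUERY, {**FUSARIUM_ONLY, **DECOY}))
```

With the Fusarium-only references, all of the confidence lands on Fusarium no matter how poor the match is. Once the decoy is added, most of it moves to Trichoderma.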

This is a common issue with other amplicon targets too. Classifiers built from reference databases such as UNITE and SILVA are often constructed with out-group taxa. For example, UNITE has the option to include non-fungal eukaryote taxa, and SILVA contains eukaryotic 16S & 18S sequences that serve as out-groups for bacteria and archaea. This way you can identify off-target sequences and remove them from your data, if needed. See these threads for more detail:

Also... PCR / Sequencing primers can be "leaky" and amplify off-targets.
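Once out-groups are in the classifier, removing the off-target reads is just a filter on the assigned taxonomy. A minimal sketch, assuming your assignments are available as a mapping of feature ID to a semicolon-delimited taxonomy string with UNITE-style rank prefixes (the feature IDs, the taxonomy strings, and the `split_by_target` helper are all hypothetical):

```python
def split_by_target(assignments, target="g__Fusarium"):
    """Split feature assignments into on-target and off-target groups.

    A feature is on-target if one of its taxonomy ranks equals `target`
    exactly, so a partial match such as "g__Fusariumoid" would not count.
    """
    on_target, off_target = {}, {}
    for feature_id, taxon in assignments.items():
        ranks = [rank.strip() for rank in taxon.split(";")]
        if target in ranks:
            on_target[feature_id] = taxon
        else:
            off_target[feature_id] = taxon
    return on_target, off_target


# Hypothetical assignments for illustration only.
assignments = {
    "feat1": "k__Fungi; p__Ascomycota; g__Fusarium; s__Fusarium_oxysporum",
    "feat2": "k__Fungi; p__Ascomycota; g__Trichoderma",
    "feat3": "Unassigned",
}

keep, drop = split_by_target(assignments)
print(sorted(keep))  # features assigned to Fusarium
print(sorted(drop))  # out-group and unassigned features
```

It is worth looking at what ends up in the off-target group before discarding it, since that tells you how leaky the primers are.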

-Mike
