Hi @marymcelroy, welcome to the forum!
Are you asking if you should use one of the following?
- classify-consensus-blast
- classify-consensus-vsearch
- classify-sklearn
- ...
The short answer: it depends.
Generally, classify-sklearn is the way to go for most datasets, but your mileage may vary. Often, careful curation of your reference sequences matters most.
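In case it helps, here is a rough sketch of what the classify-sklearn route usually looks like on the command line: train a naive Bayes classifier on your curated reference data, then classify your reads with it. All of the `.qza` file names are placeholders I made up, and the `run` helper and `DRY_RUN` switch are only there for illustration (they are not part of QIIME 2). By default the script just prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
set -eu

# Placeholder artifact names (assumptions): substitute your own files.
REF_SEQS="ref-seqs.qza"          # curated reference sequences
REF_TAX="ref-taxonomy.qza"       # matching reference taxonomy
CLASSIFIER="classifier.qza"      # trained classifier (output of step 1)
QUERY="rep-seqs.qza"             # your representative sequences
TAXONOMY="taxonomy.qza"          # classification result (output of step 2)

# Illustrative helper: print the command unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "[dry run] $*"
  else
    "$@"
  fi
}

# Step 1: train a naive Bayes classifier on the curated references.
run qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads "$REF_SEQS" \
  --i-reference-taxonomy "$REF_TAX" \
  --o-classifier "$CLASSIFIER"

# Step 2: classify your reads with the trained classifier.
run qiime feature-classifier classify-sklearn \
  --i-classifier "$CLASSIFIER" \
  --i-reads "$QUERY" \
  --o-classification "$TAXONOMY"
```

The quality of `ref-seqs.qza` and `ref-taxonomy.qza` drives the result here, which is why the curation step deserves the attention.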
If you are asking about what to consider when curating and training your reference data, look through the threads listed below, though it appears you're already familiar with these.
- Generic RESCRIPt tutorial.
- Building a COI database from BOLD references.
- Using RESCRIPt to compile sequence databases and taxonomy classifiers from NCBI Genbank.
- An updated version of this approach is outlined here.
I'm sure others in the forum can add their experiences and insights.