I'm looking for some insight on using a qiime2 workflow for taxonomic assignment of short-read amplicons generated with PE Illumina sequencing of COI, 16S, and 18S markers targeting marine animals and macrophytes (DNA from seawater samples). I built custom reference databases for each marker with seq records and taxonomy from NCBI/Genbank and BOLD, and I'd like to use these curated dbs for taxonomic assignment. Is there a specific qiime feature-classifier approach that would be most appropriate for these kinds of data? Any suggestions and resources welcome, thanks!
Hi @marymcelroy, welcome to the forum!
Are you asking which of the q2-feature-classifier classification methods you should use?
The short answer: it depends.
classify-sklearn is the way to go for most datasets, but your mileage may vary. Often, careful curation of your reference sequences matters most.
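For what it's worth, here is a minimal sketch of what the classify-sklearn route could look like when run once per marker. It is plain Python that only builds and prints the `qiime` commands (a dry run, nothing is executed). The marker labels and every file name (`coi-ref-seqs.qza`, `coi-ref-tax.qza`, `coi-rep-seqs.qza`, and so on) are placeholders I made up, so swap in your own artifacts. The confidence value of 0.7 is the classify-sklearn default, shown explicitly because it is a parameter you may want to tune per marker.

```python
import shlex

# Hypothetical marker labels; the file-name pattern below is also an assumption.
MARKERS = ["COI", "16S", "18S"]


def build_commands(marker, confidence=0.7):
    """Return the two qiime commands (train, then classify) for one marker."""
    prefix = marker.lower()
    classifier = f"{prefix}-classifier.qza"
    # Step 1: train a naive Bayes classifier on the curated reference database.
    train = [
        "qiime", "feature-classifier", "fit-classifier-naive-bayes",
        "--i-reference-reads", f"{prefix}-ref-seqs.qza",
        "--i-reference-taxonomy", f"{prefix}-ref-tax.qza",
        "--o-classifier", classifier,
    ]
    # Step 2: classify the representative sequences with that classifier.
    classify = [
        "qiime", "feature-classifier", "classify-sklearn",
        "--i-classifier", classifier,
        "--i-reads", f"{prefix}-rep-seqs.qza",
        "--p-confidence", str(confidence),
        "--o-classification", f"{prefix}-taxonomy.qza",
    ]
    return [train, classify]


def render(commands):
    """Turn argument lists into shell-safe command strings."""
    return [shlex.join(cmd) for cmd in commands]


# Dry run: print the commands for each marker instead of executing them.
for marker in MARKERS:
    for line in render(build_commands(marker)):
        print(line)
```

To actually run these, you could pass each argument list to `subprocess.run(cmd, check=True)` inside an activated QIIME 2 environment. Keeping one classifier per marker means each one is trained only on the reference set that matches the amplicon being classified.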
If you are asking what to consider when curating your reference data and training classifiers on it, take a look at the threads listed below, though it appears you're already familiar with these.
- Generic RESCRIPt tutorial.
- Building a COI database from BOLD references.
- Using RESCRIPt to compile sequence databases and taxonomy classifiers from NCBI Genbank.
- An updated version of this approach is outlined here.
I'm sure others in the forum can add their experiences and insights.