Welcome to the forum, @jasongallant,
Thanks for digging up this old topic... some of the info I wrote in there a year ago is impacted by recent updates.
That is correct: fit-classifier-naive-bayes does not report any progress while it runs, unfortunately.
I'd recommend reducing database size to reduce runtime, if possible:
- use extract-reads to focus on the amplicon region you are using
- remove any low-quality sequences
- dereplicate the database (ideally after extracting amplicons) to reduce database size and redundancy. You can use RESCRIPt to dereplicate the sequences together with the taxonomy.
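To make the dereplication step a bit more concrete, here is a rough plain-Python sketch of what "dereplicate the sequences together with the taxonomy" means. This is not RESCRIPt's code or API: the function names (`lca`, `dereplicate`) and the toy data are made up for illustration, and I am assuming semicolon-delimited taxonomy strings and a simple "keep only the shared ranks" rule when identical sequences carry different labels.

```python
from collections import defaultdict


def lca(taxonomies):
    """Return the rank prefix shared by all semicolon-delimited taxonomy strings."""
    split = [[rank.strip() for rank in t.split(";")] for t in taxonomies]
    shared = []
    for ranks in zip(*split):
        # Stop at the first rank where the labels disagree.
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return "; ".join(shared)


def dereplicate(seqs, taxa):
    """Collapse identical sequences (hypothetical helper, not RESCRIPt).

    seqs and taxa are dicts keyed by sequence ID. The first ID seen for
    each distinct sequence is kept as the representative.
    """
    groups = defaultdict(list)
    for seq_id, seq in seqs.items():
        groups[seq.upper()].append(seq_id)

    derep_seqs, derep_taxa = {}, {}
    for seq, ids in groups.items():
        rep = ids[0]
        derep_seqs[rep] = seq
        derep_taxa[rep] = lca([taxa[i] for i in ids])
    return derep_seqs, derep_taxa


# Toy data: "a" and "b" are the same sequence with different genus labels.
seqs = {"a": "ACGT", "b": "acgt", "c": "TTGA"}
taxa = {
    "a": "k__Animalia; p__Chordata; g__Danio",
    "b": "k__Animalia; p__Chordata; g__Carassius",
    "c": "k__Animalia; p__Arthropoda",
}

derep_seqs, derep_taxa = dereplicate(seqs, taxa)
print(derep_seqs)
print(derep_taxa)
```

In this toy example the three records collapse to two, and the duplicated sequence keeps only the ranks both copies agree on. Fewer, less redundant reference sequences means less work for the classifier to train on.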
RESCRIPt also has a "get-ncbi-data" method that you can use to download data from GenBank, automatically format it, and import it as QIIME 2 artifacts. Since BOLD deposits its public data on GenBank (all or most of it? not sure), it should be possible to use that method to grab public BOLD data.
Good luck!