Is there a pre-trained QIIME 2 classifier with full-length 16S (.qza)?

Hi @Scarleth_Bravo,

Yes, it can take a while. However, it often helps to remove redundant and low-quality sequences prior to training. Reference databases often contain many identical sequences with the same taxonomy; removing these, along with other quality-control steps, will reduce the database size and memory footprint, which will enable faster training.
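For example, something along these lines (untested, with placeholder file names — substitute your own reference sequence and taxonomy artifacts) would cover the culling and dereplication steps:

```bash
# Remove sequences with excess degenerate bases or long homopolymers
qiime rescript cull-seqs \
  --i-sequences ref-seqs.qza \
  --o-clean-sequences ref-seqs-cleaned.qza

# Collapse identical sequences that share the same taxonomy into unique records
qiime rescript dereplicate \
  --i-sequences ref-seqs-cleaned.qza \
  --i-taxa ref-taxonomy.qza \
  --p-mode 'uniq' \
  --o-dereplicated-sequences ref-seqs-derep.qza \
  --o-dereplicated-taxa ref-taxonomy-derep.qza
```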

I suggest looking through this RESCRIPt tutorial for general ideas…, skipping ahead to the dereplication and cull-sequences steps. Also, you can drastically speed things up by making an amplicon-specific classifier: extract the amplicon region, dereplicate the extracted sequences, remove low-quality sequences, then train, as sketched below.
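A rough sketch of the amplicon-specific route (again untested, with placeholder file names; the 515F/806R primers shown here are just an example — use the primers for your amplicon):

```bash
# Extract the amplicon region from the full-length reference sequences
qiime feature-classifier extract-reads \
  --i-sequences ref-seqs-derep.qza \
  --p-f-primer GTGYCAGCMGCCGCGGTAA \
  --p-r-primer GGACTACNVGGGTWTCTAAT \
  --o-reads ref-seqs-amplicon.qza

# Dereplicate again, since trimming usually creates new duplicates
qiime rescript dereplicate \
  --i-sequences ref-seqs-amplicon.qza \
  --i-taxa ref-taxonomy-derep.qza \
  --p-mode 'uniq' \
  --o-dereplicated-sequences ref-seqs-amplicon-derep.qza \
  --o-dereplicated-taxa ref-taxonomy-amplicon-derep.qza

# Train the amplicon-specific classifier on the reduced database
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads ref-seqs-amplicon-derep.qza \
  --i-reference-taxonomy ref-taxonomy-amplicon-derep.qza \
  --o-classifier amplicon-classifier.qza
```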

Keep in mind the tutorial is not necessarily structured as a standard operating procedure (SOP). Its main purpose is to provide command examples. You can carry out the commands in many different ways. Feel free to alter the order of the commands, etc…

I am also wondering whether increasing `--p-classify--chunk-size` would help with training speed. Can anyone else provide insight on this, or other ideas?
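If anyone wants to experiment, the flag is passed to the training command, e.g. (untested; the value shown is arbitrary):

```bash
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads ref-seqs-amplicon-derep.qza \
  --i-reference-taxonomy ref-taxonomy-amplicon-derep.qza \
  --p-classify--chunk-size 40000 \
  --o-classifier amplicon-classifier.qza
```

My understanding is that this controls how many sequences are processed per batch during training, so larger values may trade memory for speed, but I have not benchmarked it.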