Training Feature Classifier

Hi there. I would like to train a custom naive Bayes classifier using the Greengenes reference database. Every step went smoothly until the ‘Train Classifier’ step.
The error displayed:
Plugin error from feature-classifier:

Debug info has been saved to /tmp/qiime2-q2cli-err-hs1i5aky.log

For your information, I used the 99_otus.fasta and 99_otu_taxonomy.txt files and trimmed the reference with the 341F and 805R primers, which target the V3-V4 region of the 16S rRNA gene. I have tried several times, but the classifier fails to be created every time.

Hi @Benedict,

Not related to your error message, but in case you are interested, there is a pre-trained classifier available in the community contribution section that was made for the exact region you mentioned. This would save you the hassle of training your own.

Hey @Benedict,

Regarding your error, what does the debug log file say?

Chances are you ran into an out-of-memory error, but the log will tell us for sure.

Hi @Mehrbod_Estaki,

Thank you for your kind response. May I know whether the reference sequences in your pre-trained classifier were truncated using --p-trunc-len? I’m looking for a classifier whose reference reads have not been truncated.

Apart from that, I am also looking for a pre-trained classifier for SILVA. Do you know where I could find one?

Hi,

Yes, I read the debug info using --verbose and found that it was a memory error.

Do you have any suggestions for increasing the memory available to QIIME 2 in VirtualBox?

Good news then @Benedict! No trimming has been done on that classifier. You can check out the full details on the provenance tab in the artifact.

Have you seen the data resources section? It has a few pre-trained classifiers, including one trained on the full SILVA database, and several links to where you can find other databases. Good luck!

Hi Mehrbod, do you know the commands for getting a SILVA naive Bayes (NB) classifier that targets the V3-V4 region, using the pre-trained full-length classifier? I found no information on this anywhere in the QIIME 2 tutorials.

Thank you in advance. ^^

Hi @Benedict,

The commands would be the same as when you were training on Greengenes; you would just use different reference sequences and taxonomy as the input, the same as in the tutorial here. I should mention, though, that if you had memory issues with Greengenes you are likely to have them again, as the full SILVA database is much larger.
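For reference, the two training steps could be sketched roughly as below. This is only a sketch under assumptions: the function name and all file names are made up for illustration, the inputs are assumed to be already imported as .qza artifacts, and the primer sequences are the commonly cited 341F/805R pair, so please check them against your own protocol before using them.

```shell
# Sketch: trim a reference to the V3-V4 amplicon, then train a naive Bayes classifier.
# train_v3v4_classifier is a hypothetical helper; file names are placeholders.
# Primer sequences are an assumption (commonly cited 341F/805R); verify them first.
train_v3v4_classifier() {
    ref_seqs="$1"        # imported reference sequences, FeatureData[Sequence]
    ref_tax="$2"         # imported reference taxonomy, FeatureData[Taxonomy]
    out_classifier="$3"  # where the trained classifier artifact is written
    trimmed="${out_classifier%.qza}-ref-reads.qza"

    # Step 1: extract the region between the primers from the reference.
    qiime feature-classifier extract-reads \
        --i-sequences "$ref_seqs" \
        --p-f-primer CCTACGGGNGGCWGCAG \
        --p-r-primer GACTACHVGGGTATCTAATCC \
        --o-reads "$trimmed" &&
    # Step 2: train the classifier on the extracted reads (the memory-hungry step).
    qiime feature-classifier fit-classifier-naive-bayes \
        --i-reference-reads "$trimmed" \
        --i-reference-taxonomy "$ref_tax" \
        --o-classifier "$out_classifier"
}
```

It would then be called as, for example, train_v3v4_classifier silva-seqs.qza silva-tax.qza silva-v3v4-classifier.qza, with your own artifact names in place of these.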

You can change how much memory (and how many processors) you dedicate to your virtual machine under the Settings > System tab. Training classifiers can be a memory-intensive step and can take a long time, depending on how much memory you can dedicate. If possible, I would try to access a more powerful machine for this step; you can then bring the classifier back to your own computer for the rest of the analysis.
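To confirm how much memory the virtual machine actually sees after changing the setting, something like the following can be run inside the guest. It is a minimal sketch that assumes a Linux guest exposing /proc/meminfo; the function name is made up. The commented VBoxManage line is the command-line counterpart of the settings dialog, run on the host with the VM powered off, and the VM name and numbers in it are placeholders.

```shell
# On the HOST (VM powered off), the same change can be made from the command line.
# "QIIME2 VM", 8192 (MB) and 4 are placeholder values:
#   VBoxManage modifyvm "QIIME2 VM" --memory 8192 --cpus 4

# Inside the Linux GUEST: report total and available memory in GiB.
# /proc/meminfo lists values in kB, so divide by 1024*1024.
show_memory_gib() {
    awk '/^(MemTotal|MemAvailable):/ { printf "%s %.1f GiB\n", $1, $2 / 1048576 }' /proc/meminfo
}

show_memory_gib
```

If the reported total is still the old value, the VM was probably not fully shut down before the setting was changed.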


Hope that helps!
