Looking for classifier (Silva, V3-V4) generated with scikit-learn version 0.21.2

Hello!
I am looking for a naive Bayes classifier trained on the Silva reference database for the V3-V4 region (primers 341F/805R), generated with scikit-learn version 0.21.2.

I have not been able to train one myself because of memory errors. I also tried to follow Mehrbod_Estak's advice in "Looking for Pre-trained Silva Classifier (V3-V4)", but I could not solve the problem.

I found a poster who kindly provided this classifier (Looking for Pre-trained Silva Classifier (V3-V4) by jwchen). However, it was generated with scikit-learn version 0.20.2, so it is not compatible with the latest version, 0.21.2.

Thanks for your help and to QIIME2 developers!

Hi @Daniela_Vargas,

One potential solution is to install the QIIME 2 version that was used to generate that classifier — looks like 2019.4 uses scikit-learn=0.20.2. You can install separate environments using conda, so that you can switch to QIIME 2-2019.4 for taxonomy classification, then switch back to the latest release for downstream analyses.
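For anyone who wants to try this, here is a rough sketch of setting up the older release next to the current one. The environment name, the macOS environment file name, and the download URL are my assumptions based on the usual QIIME 2 install pattern, so please check them against the 2019.4 install docs. The `run` helper is only a hypothetical convenience that prints each command unless you set RUN=1; with RUN=1 it is meant for an interactive terminal where `conda activate` already works.

```shell
# Hypothetical helper: print each command by default; set RUN=1 to execute it.
run() { if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi; }

# Assumed names; Linux users would need the -linux-conda.yml file instead.
ENV_NAME="qiime2-2019.4"
ENV_FILE="qiime2-2019.4-py36-osx-conda.yml"

# Download the environment file (assumed URL) and create a separate environment.
run wget "https://data.qiime2.org/distro/core/${ENV_FILE}"
run conda env create -n "$ENV_NAME" --file "$ENV_FILE"

# Switch to the older release for taxonomy classification.
run conda activate "$ENV_NAME"

# Confirm which scikit-learn version this environment provides
# (it needs to match the version the classifier was trained with).
run python -c "import sklearn; print(sklearn.__version__)"
```

When classification is done, `conda activate` with the name of your newer environment switches back for downstream analyses.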

Thank you very much Nicholas! It is a simple and great idea.

I did it, although the memory problems continue. I will ask to have QIIME 2 installed on my university cluster.

Thanks again!

it sounds like this is a new problem — memory errors with classify-sklearn instead of with fit-classifier-naive-bayes — correct?

if so, check out the forum archive (click on the :mag: symbol in the top-right corner of the forum website to enter search terms). Search for MemoryError --p-reads-per-batch to find many posts with troubleshooting advice. There are some ways to mitigate memory constraints with classify-sklearn, which may help you get this command running now.
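As a starting point, a low-memory attempt might look like the sketch below. The file names are hypothetical placeholders and the batch size is only a guess to tune for your machine. The general idea is to keep a single worker process and process fewer reads per batch, which usually trades run time for lower peak memory. The `run` helper is a hypothetical convenience that just prints the command unless RUN=1 is set.

```shell
# Hypothetical helper: print the command by default; set RUN=1 to execute it.
run() { if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi; }

# Placeholder artifact names; replace with your own files.
CLASSIFIER="classifier.qza"
READS="rep-seqs.qza"
OUTPUT="taxonomy.qza"

# Illustrative value only: smaller batches tend to lower peak memory use.
READS_PER_BATCH=1000

# One worker process and small batches to keep memory use down.
run qiime feature-classifier classify-sklearn \
  --i-classifier "$CLASSIFIER" \
  --i-reads "$READS" \
  --p-n-jobs 1 \
  --p-reads-per-batch "$READS_PER_BATCH" \
  --o-classification "$OUTPUT"
```

If that still fails, lowering the batch size further is the first thing I would try.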

You should do that too! Because then you (and others) will have the capacity to run higher-memory jobs in the future.

Good luck!

Thanks a lot, Nicholas! I was finally able to run it!

it sounds like this is a new problem — memory errors with classify-sklearn instead of with fit-classifier-naive-bayes — correct?

Right! Now the problem was with 'classify-sklearn'.

Here is the command I ran:

  qiime feature-classifier classify-sklearn \
    --i-classifier silva_132_99_v3v4.qza \
    --i-reads denoise_rep_set_dada2.qza \
    --p-pre-dispatch 6 \
    --p-reads-per-batch 100 \
    --p-n-jobs 2 \
    --o-classification ref-taxonomy_silva_132_99_v3v4.qza

I have a Mac with 8 GB of 1867 MHz DDR3 memory and a 2.7 GHz Intel Core i5 with two processors, and it took 5 h to run.
I hope this helps others.
Thank you very much, Nicholas, and thanks to my friend Checo, for your recommendations.

Daniela
