feature-classifier classify-sklearn fails silently with no output

colinbrislawn · September 10, 2024, 9:30pm

Here's the code block I'm running:

! conda activate qiime2-amplicon-2024.5
# GTDB:
! qiime feature-classifier classify-sklearn \
  --i-classifier dbs/gtdb_classifier_r220.qza \
  --i-reads          results_analysis2/dada2_f240_r240/representative_sequences.qza \
  --o-classification results_analysis2/dada2_f240_r240/taxonomy_gtdb_classifier_r220.qza \
  --p-read-orientation 'same' \
  --p-n-jobs 1 \
  --verbose
! qiime tools peek results_analysis2/dada2_f240_r240/taxonomy_gtdb_classifier_r220.qza

This gives me the error:

(1/1) Invalid value for 'ARTIFACT/VISUALIZATION': File
'results_analysis2/dada2_f240_r240/taxonomy_gtdb_classifier_r220.qza' does
not exist.

That error comes from the final line (qiime tools peek): feature-classifier classify-sklearn neither saves the output file nor reports an error.

I feel like I'm missing something here. Am I missing a \ or - somewhere? Is --verbose broken for this plugin?

lizgehret · September 10, 2024, 10:04pm

Hey @colinbrislawn!

It's really weird that it's failing silently. Can you DM me your classifier file so I can try to repro on my machine?

colinbrislawn · September 10, 2024, 10:16pm

It's gtdb_classifier_r220.qza from https://resources.qiime2.org/

sha256sum dbs/gtdb_classifier_r220.qza 
07aadcf7472d9cc6f853f6b4615348619f1a3eceb56c1fb1b6d8dbb20554765f  dbs/gtdb_classifier_r220.qza
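
If it helps anyone compare copies of the classifier across machines, here is a minimal Python sketch that computes the same SHA-256 digest as sha256sum. The function names sha256_of_file and matches_digest are made up for illustration and are not part of QIIME 2.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's digest with a digest copied from another machine."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (path and digest are the ones quoted in this post):
# matches_digest(
#     "dbs/gtdb_classifier_r220.qza",
#     "07aadcf7472d9cc6f853f6b4615348619f1a3eceb56c1fb1b6d8dbb20554765f",
# )
```

A mismatch between two machines would point to a corrupted or different download. A match only says the two copies are identical.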

Should I share my dada2 representative_sequences.qza?

This is running on a Digital Ocean VM, in a Jupyter notebook, served by VS Code.
I may be using the wrong conda env or python kernel or something...
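
One way to check the env/kernel question from inside the notebook is a small Python cell like the sketch below. It relies on the fact that each `!` line in a notebook runs in its own subshell, so a `! conda activate ...` line does not carry over to later `!` lines. The helper name describe_environment is hypothetical, and CONDA_DEFAULT_ENV is only set when the kernel was started from an activated conda environment.

```python
import os
import shutil
import sys


def describe_environment(tool="qiime"):
    """Collect what the running kernel sees; nothing is changed."""
    return {
        # Interpreter that is actually running this notebook kernel.
        "kernel_python": sys.executable,
        "kernel_prefix": sys.prefix,
        # Set by conda on activation; None if the kernel was started elsewhere.
        "conda_default_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # First match for the tool on the kernel's PATH, or None if absent.
        "tool_path": shutil.which(tool),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If tool_path points outside the qiime2-amplicon-2024.5 environment, or is None, the notebook is probably not using the environment you expect.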

lizgehret · September 10, 2024, 10:25pm

Whoops sorry, yes could you share your rep_seqs with me?

I don't know... the error handling should be the same, regardless of the machine or your version of conda. If I can't repro on my machine, maybe I can try to set up a similar configuration to yours and see what happens!

lizgehret · September 11, 2024, 6:12pm

Hey @colinbrislawn,

Okay so I had to run this twice (both times I included the --verbose flag) - on my first try, I used your rep seqs and the direct link you provided to the classifier, and that failed. I received the following CLI error:

 (1/1) Invalid value for '--i-classifier': colinb-gtdb_classifier_r220.qza is
  not a QIIME archive.

I confirmed this by running qiime tools validate on that file, which gave me the same output.

I then ran it a second time (also with your rep seqs) and downloaded the classifier directly from the resources page - and that was successful in ~3 mins.

I'm wondering if the CLI errors are somehow being masked on your machine... let's have you try a couple of things:

1. Try running any command with a missing required parameter, and see if that also fails silently.
2. Try running qiime tools validate on the classifier file you've been using - I know you got it from our resources page, but I'm wondering if the file was somehow corrupted when you downloaded it.

We'll get to the bottom of this!
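
If it's easier to run those two checks from a notebook cell, here is a rough Python sketch that runs a command, prints whatever it wrote to stdout and stderr, and returns the exit status, so a masked error should still show up as a non-zero status. The helper name run_and_report is hypothetical, and the commented qiime calls are only suggestions based on the commands in this thread.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command (list of strings), echo its output, return its exit status.

    Raises FileNotFoundError if the executable is not on PATH.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.stdout:
        print(result.stdout, end="")
    if result.stderr:
        print(result.stderr, end="", file=sys.stderr)
    print(f"exit status: {result.returncode}")
    return result.returncode


# Suggested checks (not run here):
# run_and_report(["qiime", "tools", "validate", "dbs/gtdb_classifier_r220.qza"])
# run_and_report(["qiime", "tools", "peek"])  # required argument left out on purpose
```

A non-zero exit status with no error text at all would suggest the messages are being lost somewhere between the CLI and the notebook.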

colinbrislawn · September 11, 2024, 7:46pm

The issue appears to be twofold:

1. lack of memory, causing the job to be 'killed'
2. poor Jupyter Notebook support, in which tasks killed by the OS look the same as finished tasks (yes, this is a known issue)

I'll get a machine with more RAM!
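
For anyone who lands here with the same symptom, below is a minimal Python sketch of how a killed job can be told apart from a finished one. It assumes a POSIX system: subprocess reports a child killed by signal N as return code -N, and shells conventionally report it as 128 + N, so SIGKILL (the signal the Linux out-of-memory killer sends) shows up as -9 or 137. The helper names are hypothetical, and a status above 128 is only a hint, since a program can also choose such an exit code itself.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn an exit status into a short human-readable description."""
    if returncode == 0:
        return "finished successfully"
    # subprocess uses -N for signal N; shells use 128 + N.
    signum = -returncode if returncode < 0 else returncode - 128
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a valid signal number, so treat it as an ordinary failure.
        return f"failed with exit status {returncode}"
    return f"terminated by {name}"


def run_and_describe(command):
    """Run a command (list of strings) and describe how it ended."""
    completed = subprocess.run(command)
    return describe_exit(completed.returncode)


# Example usage with the command from this thread (not run here):
# print(run_and_describe(["qiime", "feature-classifier", "classify-sklearn",
#                         "--i-classifier", "dbs/gtdb_classifier_r220.qza",
#                         "--i-reads", "results_analysis2/dada2_f240_r240/representative_sequences.qza",
#                         "--o-classification", "results_analysis2/dada2_f240_r240/taxonomy_gtdb_classifier_r220.qza"]))
```

A result of "terminated by SIGKILL" with no other error message is consistent with the job running out of memory, though other causes of SIGKILL are possible.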
