Feature-classifier classify-sklearn killed

Hi,

I have been reading some posts about a memory issue with the feature-classifier classify-sklearn plugin. I checked that I have 32 GB of RAM as follows:

(base) usuari@WS0202:~/Downloads/ITS2$ free -g
               total        used        free      shared  buff/cache   available
Mem:              31           3          26           0           1          26
Swap:              1           1           0

Then I try to run the following command:

qiime feature-classifier classify-sklearn --i-classifier unite_ver9_dynamic_25.07.2023-Q2-2024.2.qza --i-reads dada2out/representative_sequences.qza --p-reads-per-batch 10 --o-classification taxonomy.qza --verbose

But then after 10 min or so, I get the message "killed". That's it. I have tried adjusting the --p-reads-per-batch argument with different values: 10, 100, 1000, 5000, 10000. However, I always get the message that the process is killed.

Any recommendations on what I should do?

Thanks!

Cheers,
Pablo

Hi @Pablo_V ,
Have you tried using the default parameter for --p-reads-per-batch?

Hi @cherman2

Yeah I also tried with the default option but it didn’t work. That’s why I changed the values. It’s surprising to me that even a low value such as 100 doesn’t work :confused:

Hi @Pablo_V ,
Yeah, that is disappointing. Unfortunately, it seems like your local computer may not have enough RAM to classify your sequences. Do you have access to an HPC where you could run this job?
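One way to check whether memory really is the cause: on Linux, a bare "Killed" is usually the kernel's out-of-memory (OOM) killer stopping the process. The snippet below is only a minimal sketch, assuming a Linux machine where /proc/meminfo is available. The helper name available_gib is made up for illustration, and reading the kernel log with dmesg may require sudo on some systems.

```shell
#!/usr/bin/env bash
# Print the memory the kernel estimates is available to new processes, in GiB.
# Assumption: Linux, where /proc/meminfo reports MemAvailable in kB.
available_gib() {
    awk '/^MemAvailable:/ { printf "%.1f\n", $2 / 1024 / 1024 }' /proc/meminfo
}

echo "Available memory: $(available_gib) GiB"

# Look for out-of-memory kills in the kernel log.
# The log may not be readable without sudo, so errors are silenced.
dmesg 2>/dev/null | grep -iE 'out of memory|killed process' \
    || echo "No OOM entries found (or the kernel log is not readable)."
```

If the log shows the classifier's Python process being killed right after a run, that points to the job running out of memory rather than to a problem with the input files.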

Hi @cherman2

Really???? Oh no! I mean, I successfully ran it before for bacterial amplicons with 5k reads per batch, but this dataset is ITS2 fungal amplicons. I find it weird that it doesn’t work. I obtained my rep-seqs via itsxpress2.

For the time being no, I don’t have access to HPC. Is there really no other option besides this?

Cheers,
Pablo

Hi @Pablo_V,
I have a workaround idea that you could try out if you would like.

You could try splitting your rep-seqs file and then classifying the pieces. So you would use filter-seqs (qiime feature-table filter-seqs; see "filter-seqs: Filter features from sequences" in the QIIME 2 2024.5.0 documentation) to split your rep-seqs into batches, say 10 smaller rep-seqs files. Then run feature-classifier classify-sklearn on each of the 10 rep-seqs files one after another (avoid running these in parallel because you will run into the same memory issue). Then run merge-taxa (qiime feature-table merge-taxa; see "merge-taxa: Combine collections of feature taxonomies" in the same documentation) to combine all the batched taxonomies.

I don't know for sure that this will work, but I think it's worth a try!
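The split, classify, and merge steps could be sketched roughly as follows. This is an untested outline, not a verified recipe. The file paths are the ones from the original command. The chunk count, the chunks working directory, and the split_ids helper are hypothetical names chosen for illustration. It also assumes that exporting the rep-seqs artifact yields dna-sequences.fasta and that a metadata file containing only a feature-id column is accepted by filter-seqs.

```shell
#!/usr/bin/env bash
# Sketch: classify rep-seqs in sequential chunks, then merge the taxonomies.
set -euo pipefail

REP_SEQS=${REP_SEQS:-dada2out/representative_sequences.qza}
CLASSIFIER=${CLASSIFIER:-unite_ver9_dynamic_25.07.2023-Q2-2024.2.qza}
N_CHUNKS=${N_CHUNKS:-10}   # hypothetical default, following the "10 files" suggestion
WORK=${WORK:-chunks}       # hypothetical working directory

# Distribute feature IDs (one per line) round-robin into N metadata files named
# <prefix>_00.tsv, <prefix>_01.tsv, ... each starting with a "feature-id" header.
split_ids() {
    local ids_file=$1 n_chunks=$2 prefix=$3
    awk -v n="$n_chunks" -v p="$prefix" '
        { f = sprintf("%s_%02d.tsv", p, (NR - 1) % n)
          if (!(f in seen)) { print "feature-id" > f; seen[f] = 1 }
          print $0 > f }' "$ids_file"
}

if command -v qiime >/dev/null 2>&1 && [ -f "$REP_SEQS" ]; then
    mkdir -p "$WORK"
    # Export the sequences to get the list of feature IDs from the FASTA headers.
    qiime tools export --input-path "$REP_SEQS" --output-path "$WORK/exported"
    grep '^>' "$WORK/exported/dna-sequences.fasta" | sed 's/^>//; s/ .*//' > "$WORK/ids.txt"
    split_ids "$WORK/ids.txt" "$N_CHUNKS" "$WORK/ids"

    merge_args=()
    for ids in "$WORK"/ids_*.tsv; do
        chunk=${ids%.tsv}
        # Keep only the features listed in this chunk's ID file.
        qiime feature-table filter-seqs --i-data "$REP_SEQS" \
            --m-metadata-file "$ids" --o-filtered-data "$chunk-seqs.qza"
        # Classify one chunk at a time (sequentially, never in parallel).
        qiime feature-classifier classify-sklearn --i-classifier "$CLASSIFIER" \
            --i-reads "$chunk-seqs.qza" --o-classification "$chunk-taxonomy.qza"
        merge_args+=(--i-data "$chunk-taxonomy.qza")
    done
    # Combine the per-chunk taxonomies into a single artifact.
    qiime feature-table merge-taxa "${merge_args[@]}" --o-merged-data taxonomy.qza
else
    echo "qiime or $REP_SEQS not found; skipping classification."
fi
```

Note that the trained classifier still has to be loaded into memory for every chunk, so this can only help if the peak comes from the number of reads being classified rather than from the classifier itself.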

Hi @cherman2

This sounds like a really good idea! I'll keep it in mind. However, somehow it worked now. I restarted my computer, closed as many unnecessary memory-draining processes as possible, and left the following command running overnight:

qiime feature-classifier classify-sklearn --i-classifier unite_ver9_dynamic_25.07.2023-Q2-2024.2.qza --i-reads dada2out/representative_sequences.qza --p-reads-per-batch 100 --o-classification taxonomy.qza --verbose

And it WORKED!

Thank you for the support.

Cheers,
Pablo
