Taxonomic classification with sklearn killed, enough memory

Hi!

I'm trying to assign taxonomy and am having trouble figuring out what's going wrong. I downloaded the SILVA classifier from QIIME 2 (https://data.qiime2.org/2023.2/common/silva-138-99-nb-classifier.qza) as a zip file, but when I try to run the command I immediately get the error message "Killed". The computer should have ~100 GB left, so I don't think it's a memory issue?

Here's my script:

(qiime2-2023.2) jessie@jessie-HP-ENVY-x360-Convertible-15-ee1xxx:~$ qiime feature-classifier classify-sklearn \
  --i-reads deblur_output/representative_sequences.qza \
  --i-classifier /home/jessie/silva-138-99-nb-classifier.qza \
  --p-n-jobs $NCORES \
  --output-dir taxa
Killed

Thanks!

Hello!

Does that refer to disk storage (SSD, hard drive) or RAM (working memory)? RAM is what matters for taxonomy assignment, and in my experience SILVA can require 32 GB of RAM or even more. How much RAM do you have available for the run?

That was referring to disk storage; it looks like I am pretty limited with RAM (output in GB):

(base) jessie@jessie-HP-ENVY-x360-Convertible-15-ee1xxx:~$ free -g
               total        used        free      shared  buff/cache   available
Mem:              11           3           0           0           6           7
Swap:              1           0           1

Is there a way to expand RAM? Or repartition?

Thanks!

It depends on your machine and how you are running QIIME 2. If you are using a virtual machine, check in the settings how much RAM is allocated to the VM.
If the machine itself does not have enough RAM, it can be upgraded physically (I ordered 32 GB of RAM to add to my laptop), or you can run the analysis on a more powerful machine, or use an HPC or remote cluster for heavy tasks.

Ok great, thank you so much! Is it normal to be so limited? This is a pretty new computer and I don't have much installed other than Miniconda, QIIME 2, and Microsoft.

RAM does not depend on the amount of software installed; it is the memory that can be allocated to computational tasks while they run. To perform heavy tasks such as taxonomy classification, you need to find a way to run them on a stronger machine (one with more RAM).

Another option is to try the Greengenes classifier, which is trained on a smaller reference database and therefore requires less RAM.
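For example, a smaller pre-trained classifier can simply be swapped in via --i-classifier. The filename below is just an illustration; check the QIIME 2 data resources page for the exact Greengenes classifier file that matches your release:

qiime feature-classifier classify-sklearn \
  --i-reads deblur_output/representative_sequences.qza \
  --i-classifier gg-13-8-99-nb-classifier.qza \
  --p-n-jobs 1 \
  --output-dir taxa_gg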


Sounds good!

This is the script I'm trying to run. I have $NCORES = 10; could that be why I'm running into issues with RAM? I could also set a lower --p-reads-per-batch.

qiime feature-classifier classify-sklearn \
  --i-reads deblur_output/representative_sequences.qza \
  --i-classifier /home/shared/taxa_classifiers/qiime2-2022.11_classifiers/silva-138-99-nb-classifier.qza \
  --p-n-jobs $NCORES \
  --output-dir taxa

Hi @jessimya,

I'd recommend only using 1-2 cores with feature-classifier classify-sklearn, for reasons outlined here. As you've noted, you can combine this with --p-reads-per-batch.
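For example, something like this (the --p-reads-per-batch value below is only a placeholder to illustrate the idea; tune it to your data and the RAM you actually have free):

qiime feature-classifier classify-sklearn \
  --i-reads deblur_output/representative_sequences.qza \
  --i-classifier /home/shared/taxa_classifiers/qiime2-2022.11_classifiers/silva-138-99-nb-classifier.qza \
  --p-n-jobs 1 \
  --p-reads-per-batch 1000 \
  --output-dir taxa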

-Mike

