SIGSEGV (-11) while running feature-classifier classify-sklearn

Anuka · January 11, 2023, 9:41am

Hi there QIIME wizards!!

I'm trying to assign taxonomy to 24 soil samples and I'm running into some trouble...

When I run the classifier, I get a SIGSEGV (-11) error every time... I tried changing the reads per batch and also the n-jobs. The last command was:

time qiime feature-classifier classify-sklearn --i-reads dada2/representative_sequences.qza --i-classifier taxonomy/NCBI_16S_classifier.qza --o-classification taxonomy/taxonomy --p-reads-per-batch 1000 --p-n-jobs 5

I changed the temp directory to my C drive, and this is the space I have available to run it:

Filesystem  Size  Used  Avail  Use%  Mounted on
rootfs      944G  386G  558G   41%   /
none        944G  386G  558G   41%   /dev
none        944G  386G  558G   41%   /run
none        944G  386G  558G   41%   /run/lock
none        944G  386G  558G   41%   /run/shm
none        944G  386G  558G   41%   /run/user
tmpfs       944G  386G  558G   41%   /sys/fs/cgroup
C:\         944G  386G  558G   41%   /mnt/c
D:\         10G   4.4G  5.7G   44%   /mnt/d

What can I do???

Thank you in advance.

Sincerely,

Ana.

SoilRotifer · January 11, 2023, 2:32pm

Hi @Anuka,

It appears your system may not have enough memory. This step often requires 24-64 GB of RAM (sometimes more, sometimes less), depending on the reference database size and the run settings. More is explained here.

-Mike

Anuka · January 11, 2023, 4:57pm

Hi SoilRotifer,

I have 32 GB of RAM, and I have already run this command before with other samples (and the same library, sometimes with a pretrained GG classifier and other times with a custom one) with no trouble...

SoilRotifer · January 11, 2023, 10:33pm

It depends on the size of the reference database. For example, SILVA requires substantially more memory than GreenGenes, and some custom databases require significantly more, in some cases close to 128 GB of RAM.

How big is your database?

Have you tried reducing the number of jobs, and/or tried a machine with more RAM, as suggested in the post I linked? The more threads you use, in this case, the more RAM you'll require. So, reducing the number of threads may help.

-Mike
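
Before re-running, it can help to see how much RAM is actually free at that moment. The lines below are a minimal sketch, not something posted in the thread; they assume a Linux-style environment where /proc/meminfo exists and reports MemTotal and MemAvailable in kB (a missing MemAvailable field falls back to 0). The variable names are only illustrative.

```shell
# Assumption: /proc/meminfo is present and reports sizes in kB.
meminfo=/proc/meminfo

# Pull the numeric kB values for total and currently available memory.
total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
avail_kb=$(awk '/^MemAvailable:/ {print $2}' "$meminfo")

# Convert kB to whole GB (1 GB = 1048576 kB); default to 0 if a field is missing.
total_gb=$(( ${total_kb:-0} / 1048576 ))
avail_gb=$(( ${avail_kb:-0} / 1048576 ))

echo "Total RAM:     ${total_gb} GB"
echo "Available RAM: ${avail_gb} GB"
```

If the available figure is well below the total, other programs are holding memory that the classifier will not be able to use.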

Anuka · January 12, 2023, 7:46am

My database is 466.747 kB... but I don't understand why I could use it last week with a subset of these samples and now, with 5 more samples, I can't.

No, I don't have a better computer right now... What do you mean by reducing the number of threads? Reducing the number of reads per batch??

-Ana-

timanix · January 12, 2023, 9:08am

Could you try setting --p-n-jobs to 1 and running the command again? The fewer jobs that run in parallel, the less RAM is needed.
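
To make the distinction concrete: --p-n-jobs sets how many jobs run in parallel, while --p-reads-per-batch sets how many sequences go into each batch. The sketch below is the command from the first post with only --p-n-jobs changed to 1. The N_JOBS and READS_PER_BATCH variables and the "qiime not found" fallback are illustrative additions, not part of the original command.

```shell
# Illustrative variables (not from the original post).
N_JOBS=1                # one job at a time keeps peak RAM use lowest
READS_PER_BATCH=1000    # unchanged from the original command

# Build the argument list; file paths are the ones given in the first post.
set -- qiime feature-classifier classify-sklearn \
  --i-reads dada2/representative_sequences.qza \
  --i-classifier taxonomy/NCBI_16S_classifier.qza \
  --o-classification taxonomy/taxonomy \
  --p-reads-per-batch "$READS_PER_BATCH" \
  --p-n-jobs "$N_JOBS"

# Run it if QIIME 2 is on the PATH; otherwise just show what would be run.
if command -v qiime >/dev/null 2>&1; then
  "$@"
else
  echo "qiime not found; would run: $*"
fi
```

With a single job the run will be slower, so expect a long wall-clock time in exchange for the lower memory footprint.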

SoilRotifer · January 12, 2023, 2:30pm

Okay, this might be a useful piece of information. What version of QIIME 2 are you running? What operating system? How did you construct your classifier?

If setting --p-n-jobs 1 does not work, can you send me a private message with links to your NCBI_16S_classifier.qza and representative_sequences.qza files via Dropbox or a similar service? I'd like to test this on my end to make sure it is not an issue with the files, your computer setup, or something else.

-Mike

Anuka · January 12, 2023, 10:11pm

It already worked!!

Took 11 hours, but it worked!! Thank you very much!!

Ana.

SoilRotifer · January 13, 2023, 4:41pm

Thank you for letting us know @Anuka. We're glad it worked!
