SIGSEGV(-11) while feature-classifier classify-sklearn

Hi there QIIME wizards!!

I'm trying to assign taxonomy to 24 soil samples and I'm running into some trouble...

When I run it, I get a SIGSEGV(-11) error every time... I tried changing the reads per batch and also the n-jobs. The last command was:

time qiime feature-classifier classify-sklearn --i-reads dada2/representative_sequences.qza --i-classifier taxonomy/NCBI_16S_classifier.qza --o-classification taxonomy/taxonomy --p-reads-per-batch 1000 --p-n-jobs 5

I changed the temp directory to my C: drive, and this is the space I have to run it:

Filesystem Size Used Avail Use% Mounted on
rootfs 944G 386G 558G 41% /
none 944G 386G 558G 41% /dev
none 944G 386G 558G 41% /run
none 944G 386G 558G 41% /run/lock
none 944G 386G 558G 41% /run/shm
none 944G 386G 558G 41% /run/user
tmpfs 944G 386G 558G 41% /sys/fs/cgroup
C:\ 944G 386G 558G 41% /mnt/c
D:\ 10G 4.4G 5.7G 44% /mnt/d

What can I do???

Thank you in advance.

Sincerely,

Ana.

Hi @Anuka,

It appears your system may not have enough memory. This step often requires 24-64 GB of RAM (sometimes more, sometimes less), depending on the reference database size and the run settings. More is explained here.

-Mike

Hi SoilRotifer,

I have 32 GB of RAM, and I've already run this command before with other samples (and the same library, sometimes a pretrained GG one and other times a personal one) with no trouble...

It depends on the size of the reference database. For example, SILVA requires substantially more memory than GreenGenes, and some custom databases require significantly more, in some cases close to 128 GB of RAM. :scream:

How big is your database?

Have you tried reducing the number of jobs, and/or tried a machine with more RAM, as suggested in the post I linked? The more threads you use, in this case, the more RAM you'll require. So, reducing the number of threads may help.
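
To compare the RAM that is actually free against the figures above before launching the classifier, a rough check like the one below may help. It is only a sketch and assumes a Linux-style shell where /proc/meminfo exists; the `mem_gb` helper name is made up for this example and is not part of QIIME 2.

```shell
# Print total and available RAM in GiB from a meminfo-style file.
# /proc/meminfo reports sizes in kB, so dividing by 1048576 gives GiB.
# mem_gb is a hypothetical helper name used only in this sketch.
mem_gb() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { printf "total=%.1fGiB available=%.1fGiB\n", total / 1048576, avail / 1048576 }' \
    "${1:-/proc/meminfo}"
}

# Only query the live system when the file is present (i.e. on Linux).
if [ -r /proc/meminfo ]; then
  mem_gb
fi
```

If the available figure is already well below the total before the run starts, closing other programs first is probably worth a try.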

-Mike

My database is 466.747 kB... but I don't understand why I could use it last week with a subset of these samples and now, with 5 more samples, I can't.

No, I don't have a better computer right now... What do you mean by reducing the number of threads? Reducing the number of reads per batch??

-Ana-

Could you try setting --p-n-jobs to 1 and running the command again? The fewer jobs running in parallel, the less RAM is needed.
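
For reference, that suggestion could look roughly like the sketch below. The flags and paths are copied from the command in the original post, and only the `--p-n-jobs` value changes. The `classify` wrapper is a made-up helper for this example: it prints the command line and runs it only when `qiime` is found on the PATH.

```shell
# Rerun the classifier from the original post with a chosen number of jobs.
# All flags and paths come from that command; only --p-n-jobs varies.
# classify is a hypothetical wrapper name, not a QIIME 2 command.
classify() {
  n_jobs="$1"
  set -- qiime feature-classifier classify-sklearn \
    --i-reads dada2/representative_sequences.qza \
    --i-classifier taxonomy/NCBI_16S_classifier.qza \
    --o-classification taxonomy/taxonomy \
    --p-reads-per-batch 1000 \
    --p-n-jobs "$n_jobs"
  echo "$*"                       # show the exact command line
  if command -v qiime >/dev/null 2>&1; then
    "$@"                          # run it only when QIIME 2 is available
  fi
}

classify 1
```

Running with one job will likely be slower than running with several, so it is a trade of run time for memory.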

Okay, this might be a useful piece of information. What version of QIIME 2 are you running? What operating system? How did you construct your classifier?
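
One way to collect the version and operating system details is sketched below, assuming a Unix-like shell with the QIIME 2 environment activated. `qiime info` reports the installed QIIME 2 version and plugins, and `uname -sr` prints the kernel name and release.

```shell
# Kernel name and release of the system the shell is running on.
uname -sr

# Installed QIIME 2 version and plugins, if qiime is available in this shell.
if command -v qiime >/dev/null 2>&1; then
  qiime info
else
  echo "qiime not found on PATH (is the QIIME 2 environment activated?)"
fi
```

Pasting the output of both commands into a reply usually answers the first two questions at once.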

If setting --p-n-jobs 1 does not work, can you send me a private message with links to your NCBI_16S_classifier.qza and representative_sequences.qza files via Dropbox or similar service? I'd like to test this out on my end, and make sure that it is not an issue with the files and/or your computer setup, or something else. :thinking:

-Mike

It already worked!!

Took 11 hours, but it worked!! Thank you very much!!

Ana.

Thank you for letting us know @Anuka. We're glad it worked! :sparkler: