Error in Qiime Feature Classifier

Dear All,

My command was killed after running for a while, and I don't understand why. Could it be a memory issue?

qiime feature-classifier classify-sklearn   --i-classifier classifier.qza   --i-reads dada2_rep_set.qza   --o-classification taxonomy.qza

@Shan, that does look like it could be memory related. Can you rerun the command with the --verbose flag and post the entire command you ran and its output here? Additionally, how much RAM do you have access to, and how large is your data?


When I run the command with --verbose, I get the same error again. My PC has 32 GB of RAM, and the data is around 21 GB.

@Shan, what value are you using for the --p-n-jobs argument? If you aren't providing it at all, it defaults to 1; if you are providing it, try omitting it. The more jobs you run, the more RAM you use. (It looks like you aren't providing it, but for some reason your command is rendering oddly for me, so I can't see it clearly.)

That being said, if your reads .qza is 21 GB, it's possible you will use more than 32 GB of RAM even with 1 job. Can you try running the command again (with no value provided to --p-n-jobs) with htop open in another terminal, and watch the RAM usage? If your RAM usage really is too high, you might try lowering the --p-reads-per-batch parameter, which defaults to 20000, and see if that helps.
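For example, a rerun with a smaller batch size might look like the sketch below (the value 5000 is just an illustration; tune it to your machine):

```shell
# Rerun classification with a smaller batch size (default is 20000);
# smaller batches lower peak RAM usage at the cost of a longer runtime.
qiime feature-classifier classify-sklearn \
  --i-classifier classifier.qza \
  --i-reads dada2_rep_set.qza \
  --p-reads-per-batch 5000 \
  --o-classification taxonomy.qza \
  --verbose
```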


Actually, I am very new to all this. When I was looking at the RAM usage, this is what I got.

Please try running the following and post the output here. This should directly report the amount of RAM the command is using:

/usr/bin/time -v qiime feature-classifier classify-sklearn --i-classifier classifier.qza --i-reads dada2_rep_set.qza --o-classification taxonomy.qza


This is what I get after running that command. I think it is a memory issue; can you please confirm?

Command terminated by signal 9
	Command being timed: "qiime feature-classifier classify-sklearn --i-classifier classifier.qza --i-reads dada2_rep_set.qza --o-classification taxonomy.qza"
	User time (seconds): 250.40
	System time (seconds): 48.38
	Percent of CPU this job got: 66%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 7:27.80
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 30402660
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 4580
	Minor (reclaiming a frame) page faults: 7570691
	Voluntary context switches: 72260
	Involuntary context switches: 8373
	Swaps: 0
	File system inputs: 19587208
	File system outputs: 19037120
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

The line "Maximum resident set size (kbytes): 30402660" means that job peaked at roughly 29 GiB (about 31 GB) of RAM. Keeping in mind that your computer is doing other things that also need RAM, you are definitely running out of memory.
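For reference, GNU time reports max RSS in kibibytes, so converting to GiB is just two divisions by 1024:

```shell
# Convert the reported max RSS (KiB) to GiB: 30402660 KiB is about 29 GiB.
awk 'BEGIN { printf "%.1f GiB\n", 30402660 / 1024 / 1024 }'
```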

If you have access to a machine with more RAM (an HPC cluster, for example), use that; otherwise you will most likely need to shrink your data somehow. You may also try passing smaller values to the --p-reads-per-batch argument (remember, if you don't pass it, the default value is 20000). This should reduce RAM usage but increase runtime, though I'm not sure it will reduce RAM usage enough. Your best bets are either shrinking your data or getting access to more RAM.


Thanks @Oddant1 for your help.

In addition to the above suggestions, you may try increasing the swap size. This worked for us when we increased it to 50 GB; however, our data was small (<10 GB).
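On Linux, a 50 GB swap file can typically be set up like this (the path and size are illustrative; this needs root and enough free disk space, and heavy swapping will make the run much slower):

```shell
# Create a 50 GB swap file, lock down its permissions, format it, and enable it.
sudo fallocate -l 50G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Verify the swap is active.
swapon --show
```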


This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.