For all commands I left the cpus/threads parameter at the default of 1. I'm not sure whether this is a RAM issue, or whether I'm using the wrong files by mistake, like backbone.v4.fna.qza and taxonomy.id.tsv.qza from Greengenes2?
I also noticed the method for generating a BLAST database artifact; would using a blastdb instead of the reference sequences solve the problem?
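For reference, this is roughly how I understand a BLAST database artifact can be built from the backbone reads; I'm assuming the makeblastdb action of the feature-classifier plugin, and the file names here are just illustrative:
time qiime feature-classifier makeblastdb \
--i-sequences 2022.10.backbone.v4.fna.qza \
--o-database gg2-v4-blast.qza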
time qiime feature-classifier classify-consensus-blast \
--i-query rep-seqs-162-212-20.qza \
--i-blastdb gg2-v4-blast.qza \
--i-reference-taxonomy ../2022.10.taxonomy.id.tsv.qza \
--p-maxaccepts 1 \
--o-classification taxo-gg2-consensus-blast-162-212-20.qza \
--o-search-results hit-gg2-consensus-blast-162-212-20.qza
Killed
real 33m59.670s
user 7m21.607s
sys 2m7.440s
After an even longer wait, it was still killed!
I did notice that it used less RAM early in the process, but it still reached the maximum later on; on my machine that is 70%+ of total RAM.
Hi @KonradV,
Thanks for all the info on this error! :qiime2:
This does look like an out-of-memory error to me!
However, this has me questioning things; 32 GB should be enough RAM, I think.
This has me even more confused!
Can you elaborate on why you are choosing these methods over classify-sklearn?
Does classify-sklearn also fail? If classify-sklearn also fails, let's try something silly!
Can you try running the taxonomy classification step of the "Moving Pictures" tutorial (QIIME 2 2023.9.2 documentation)? The tutorial data is specifically designed to be very small, so you should be able to run it without a memory error. If that command succeeds, let's try using your rep-seqs with the Moving Pictures classifier.
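For reference, the classification step in that tutorial looks roughly like this (these are the file names used in the tutorial, not yours):
qiime feature-classifier classify-sklearn \
--i-classifier gg-13-8-99-515-806-nb-classifier.qza \
--i-reads rep-seqs.qza \
--o-classification taxonomy.qza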
Since you say 32 GB should be enough, I'll hold off on picking out a new RAM stick for now.
It's because of the problems described here. I tried a different method from the feature-classifier plugin to see whether the results would be better. As for that issue, it's likely the problem is happening at the DADA2 step, but I'll continue to reply on that topic page.
Only classify-sklearn succeeds, and I'm not sure why. Is it because it requires less computation, or because it uses less memory?
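For context, the classify-sklearn run that does succeed looks roughly like this; I'm assuming the GG2 v4 classifier mentioned above, and the query and output names here are only illustrative:
time qiime feature-classifier classify-sklearn \
--i-reads rep-seqs-162-212-24-1.qza \
--i-classifier 2022.10.backbone.v4.nb.qza \
--o-classification taxo-sklearn.qza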
Do you mean I should use the data provided there to run the same commands that get killed here? I'll try that right away; that's a great suggestion, thanks! I'll reply later with the results.
time qiime feature-classifier classify-hybrid-vsearch-sklearn \
--i-query rep-seqs-dada2.qza \
--i-reference-reads 2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy 2022.10.taxonomy.id.tsv.qza \
--i-classifier 2022.10.backbone.v4.nb.qza \
--o-classification hybird-taxo.qza
Saved FeatureData[Taxonomy] to: hybird-taxo.qza
real 9m27.070s
user 9m10.923s
sys 0m14.090s
time qiime feature-classifier classify-hybrid-vsearch-sklearn \
--i-query rep-seqs-162-212-24-1.qza \
--i-reference-reads 2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy 2022.10.taxonomy.id.tsv.qza \
--i-classifier 2022.10.backbone.v4.nb.qza \
--o-classification hybird-taxo.qza
Killed
real 12m25.673s
user 8m5.619s
sys 0m58.482s
Why? The rep-seqs file used there is only 51.0 KB and mine is only 58.2 KB, which is not a huge difference! And why didn't processing smaller amounts of data at a time via the batch parameter solve the problem?
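To be concrete, this is roughly what I mean by the batch parameter; I'm assuming the --p-reads-per-batch option of classify-hybrid-vsearch-sklearn, and the batch size here is only an illustrative value:
time qiime feature-classifier classify-hybrid-vsearch-sklearn \
--i-query rep-seqs-162-212-24-1.qza \
--i-reference-reads 2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy 2022.10.taxonomy.id.tsv.qza \
--i-classifier 2022.10.backbone.v4.nb.qza \
--p-reads-per-batch 1000 \
--o-classification hybird-taxo.qza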