Multiple feature-classifier commands run and are then killed

KonradV · December 8, 2023, 9:29am

I'm trying feature-classifier methods other than classify-sklearn, but every one of them gets killed. The commands are as follows:

classify-consensus-vsearch:

time qiime feature-classifier classify-consensus-vsearch \
--i-query rep-seqs-162-212-20.qza \
--i-reference-reads ../2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy ../2022.10.taxonomy.id.tsv.qza \
--o-classification taxo-gg2-consensus-vsearch-162-212-20.qza \
--o-search-results hit-gg2-consensus-vsearch-162-212-20.qza
Killed

real    7m46.642s
user    4m29.171s
sys     0m22.738s

classify-consensus-blast:

time qiime feature-classifier classify-consensus-blast \
--i-query rep-seqs-162-212-20.qza \
--i-reference-reads ../2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy ../2022.10.taxonomy.id.tsv.qza \
--o-classification taxo-gg2-consensus-blast-162-212-20.qza \
--o-search-results hit-gg2-consensus-blast-162-212-20.qza
Killed

real    8m3.616s
user    6m11.097s
sys     0m15.623s

classify-hybrid-vsearch-sklearn:
Here I also tried to reduce the amount of data it processes at once.

time qiime feature-classifier classify-hybrid-vsearch-sklearn \
--i-query rep-seqs.qza \
--i-reference-reads ../2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy ../2022.10.taxonomy.id.tsv.qza \
--i-classifier ../2022.10.backbone.v4.nb.qza \
--p-reads-per-batch 10000 \
--o-classification taxo-gg2-hybrid.qza
Killed

For all commands I left the CPU/thread count at its default of 1. I'm not sure whether this is a RAM issue. Or is it because I'm using the wrong files, such as backbone.v4.fna.qza and taxonomy.id.tsv.qza from gg2?

KonradV · December 8, 2023, 10:17am

By the way, my computer has a single 32 GB RAM stick.

KonradV · December 8, 2023, 12:37pm

Would it be better if blast returned fewer results?

time qiime feature-classifier classify-consensus-blast \
--i-query rep-seqs-162-212-20.qza \
--i-reference-reads ../2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy ../2022.10.taxonomy.id.tsv.qza \
--p-maxaccepts 1 \
--o-classification taxo-gg2-consensus-blast-162-212-20.qza \
--o-search-results hit-gg2-consensus-blast-162-212-20.qza
Killed

real    17m33.663s
user    7m47.337s
sys     0m55.851s

Well, it ran longer and was then killed.

KonradV · December 8, 2023, 1:21pm

I noticed there is a method for generating a blastdb artifact. Would using the blastdb instead of the reference sequences solve the problem?

time qiime feature-classifier classify-consensus-blast \
--i-query rep-seqs-162-212-20.qza \
--i-blastdb gg2-v4-blast.qza \
--i-reference-taxonomy ../2022.10.taxonomy.id.tsv.qza \
--p-maxaccepts 1 \
--o-classification taxo-gg2-consensus-blast-162-212-20.qza \
--o-search-results hit-gg2-consensus-blast-162-212-20.qza
Killed

real    33m59.670s
user    7m21.607s
sys     2m7.440s

After an even longer wait it was still killed!
But I noticed that it used less RAM early in the process, although usage still climbed to the maximum later on. On my computer that is 70%+.

KonradV · December 8, 2023, 1:32pm

Even when I process only a very small number of reads at a time (1/20th of the default) through the batch parameter, it is still killed.

time qiime feature-classifier classify-hybrid-vsearch-sklearn \
--i-query rep-seqs.qza \
--i-reference-reads ../2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy ../2022.10.taxonomy.id.tsv.qza \
--i-classifier ../2022.10.backbone.v4.nb.qza \
--p-reads-per-batch 1000 \
--o-classification taxo-gg2-hybrid.qza
Killed

real    5m23.921s
user    3m54.541s
sys     0m16.011s

cherman2 · December 8, 2023, 5:52pm

Hi @KonradV,
Thanks for all the info on this error!
This does look like an out-of-memory error to me!

However, this has me questioning things. 32 GB should be enough RAM, I think.
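One way to check this, as a rough sketch rather than a definitive diagnosis: a bare "Killed" on Linux usually means the kernel's OOM killer stopped the process, and it normally leaves a message in the kernel log. The commands below assume a standard Linux environment where `/proc/meminfo` exists. `dmesg` may need `sudo` on some systems, so the sketch falls back to a note if the log cannot be read.

```shell
# Total and available memory as seen by this Linux environment (values are in kB).
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
echo "MemTotal: $((mem_total_kb / 1024 / 1024)) GiB, MemAvailable: $((mem_avail_kb / 1024 / 1024)) GiB"

# Look for OOM-killer messages in the kernel log.
# If dmesg is not readable without sudo, the fallback message is printed instead.
dmesg 2>/dev/null | grep -i -E 'out of memory|killed process' \
  || echo "No OOM messages found (or dmesg is not readable without sudo)"
```

If MemTotal is well below the 32 GB installed, the environment itself is limiting memory, which would explain kills on a machine that should have enough RAM.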

KonradV:

Even when I process only a very small number of reads at a time (1/20th of the default) through the batch parameter, it is still killed.

time qiime feature-classifier classify-hybrid-vsearch-sklearn \
--i-query rep-seqs.qza \
--i-reference-reads ../2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy ../2022.10.taxonomy.id.tsv.qza \
--i-classifier ../2022.10.backbone.v4.nb.qza \
--p-reads-per-batch 1000 \
--o-classification taxo-gg2-hybrid.qza
Killed

real    5m23.921s
user    3m54.541s
sys     0m16.011s

This has me even more confused!

Can you elaborate on why you are choosing these methods over classify-sklearn?

Does classify-sklearn also fail? If classify-sklearn also fails, let's try something silly!

Can you try running this part of the Moving Pictures tutorial: “Moving Pictures” tutorial — QIIME 2 2023.9.2 documentation. The tutorial data is specifically designed to be very small, so you should be able to run it without a memory error. If that command succeeds, let's try using your rep-seqs with the Moving Pictures classifier.

KonradV · December 9, 2023, 1:31am

Thank you for your reply!

Since you say so, I can hold off on picking out a new RAM stick.

It's because of the problems described here. I tried other methods from the feature-classifier plugin to see whether the results would be better. Of course, the problem in that issue most likely arises at the DADA2 step, but I'll continue replying in that thread.

Only classify-sklearn succeeds, but I'm not sure why. Is it because it requires less computation, or because it uses less memory?

Do you mean I should use the tutorial data to run the commands that get killed here? I'll try that right away. That's a great suggestion, thanks! I'll reply later with the results.

KonradV · December 9, 2023, 1:38am

Sorry, I forgot to add that I am doing all this under WSL2 on Windows.

KonradV · December 9, 2023, 7:28am

This one worked!

time qiime feature-classifier classify-hybrid-vsearch-sklearn \
--i-query rep-seqs-dada2.qza \
--i-reference-reads 2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy 2022.10.taxonomy.id.tsv.qza \
--i-classifier 2022.10.backbone.v4.nb.qza \
--o-classification hybird-taxo.qza
Saved FeatureData[Taxonomy] to: hybird-taxo.qza

real    9m27.070s
user    9m10.923s
sys     0m14.090s

time qiime feature-classifier classify-hybrid-vsearch-sklearn \
--i-query rep-seqs-162-212-24-1.qza \
--i-reference-reads 2022.10.backbone.v4.fna.qza \
--i-reference-taxonomy 2022.10.taxonomy.id.tsv.qza \
--i-classifier 2022.10.backbone.v4.nb.qza \
--o-classification hybird-taxo.qza
Killed

real    12m25.673s
user    8m5.619s
sys     0m58.482s

Why? The rep-seqs file used here is only 51.0 KB and mine is only 58.2 KB, which is not a huge difference! And why didn't processing smaller amounts of data per batch via the batch parameter solve the problem?

cherman2 · December 11, 2023, 4:01pm

Hi @KonradV,
Again, thank you for all the information!

This might be why! There is a chance that your WSL is not configured to use all your RAM!

I don't know a ton about WSL, but take a look at this Stack Overflow thread, and let's see how much memory is allocated to your WSL system.

Yeah! Let's avoid buying more hardware!
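As a hedged sketch of what that check could look like from inside the WSL shell: WSL2 runs Linux in a lightweight VM whose memory is capped separately from the host, and as I understand Microsoft's documentation the default cap is only a fraction of the installed RAM. The cap can be raised in a `.wslconfig` file in the Windows user profile folder (`%UserProfile%\.wslconfig`). The `24GB` and `8GB` values and the sample file name below are illustrative assumptions, not recommendations, and the sketch writes only a sample file so nothing real is changed.

```shell
# The WSL kernel's version string mentions "microsoft"; use that to confirm the environment.
if grep -qi microsoft /proc/version; then
  echo "Running under WSL"
else
  echo "Not running under WSL"
fi

# Memory and swap visible to Linux. Under WSL2 these reflect the VM cap, not the host's RAM.
grep -E '^(MemTotal|SwapTotal):' /proc/meminfo

# Illustrative .wslconfig contents (assumed values). The real file belongs at
# %UserProfile%\.wslconfig on the Windows side; here it is only written as a sample.
sample="${TMPDIR:-/tmp}/wslconfig.sample"
cat > "$sample" <<'EOF'
[wsl2]
memory=24GB
swap=8GB
EOF
cat "$sample"
```

After editing the real `.wslconfig`, running `wsl --shutdown` from PowerShell or the Windows command prompt and then reopening WSL should apply the new limits.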

system · January 11, 2024, 10:02pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.