QIIME is caching your current deployment.....Illegal instruction (core dumped)

File number 3 - final file.

core.qiime.1000.e72f570ab78e4ac1afd2f7985920cb2a.3091.1681948829000000.003.7z (6.0 MB)

Thanks @cosmic,

I was able to get the 7z working (I had to make all the extensions look like *.7z.00[123] before it would work) and then ran

7z e -ai'!core.qiime.1000.e72f570ab78e4ac1afd2f7985920cb2a.3091.1681948829000000.7z.*' -an

Looks like the illegal instruction was vmovups (Move Unaligned Packed Single-Precision Floating-Point Values) from AVX.

I used

gdb libssu.so core.qiime.1000.e72f570ab78e4ac1afd2f7985920cb2a.3091.1681948829000000

(where I had already extracted libssu from the unifrac package)
I got a bunch of warnings because my filesystem layout was different, so GDB couldn't resolve the symbols, but you can still run this in the GDB shell:

layout asm

which worked great!

Alright, so ultimately there isn't anything you can do to solve this issue (beyond using a more modern computer).
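For anyone wanting to confirm that their machine is in this situation, here is a minimal sketch of an AVX check on Linux. It assumes an x86 machine where `/proc/cpuinfo` has a `flags` line; `has_avx` is just an illustrative helper name, not part of QIIME 2 or unifrac.

```shell
# Report whether the CPU advertises AVX, based on the Linux "flags" line.
# The cpuinfo path is a parameter only so the check can be tried on a saved copy.
has_avx() {
  cpuinfo="${1:-/proc/cpuinfo}"
  # -m1: the first "flags" line is enough; cores normally report the same set.
  # -w: match "avx" as a whole word, so "avx2" or "avx512f" alone do not count.
  grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw 'avx'
}

if has_avx; then
  echo "AVX is reported by this CPU"
else
  echo "AVX is NOT reported by this CPU"
fi
```

If this prints the second message, the crash above is most likely the same missing-instruction problem rather than something specific to your install.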

Presently we would need to create alternative non-AVX builds of (potentially) a few packages.

Another approach we might take is to see if it's possible to gate the imports so that these instructions don't get called during initialization. From there, we could provide alternative implementations of some of these particular steps (which would be relatively little effort since it's basically just using scikit-bio again).

Things like core-metrics-phylogenetic probably wouldn't be updated, but there would be a straightforward workaround, as it's not a complex pipeline to do by hand.

I'm curious what others' thoughts are on this.

cc @gregcaporaso @wasade

For a non-AVX build of UniFrac, we anticipate resolution in the next week or two. I'd really like to make the code freeze window.

One complexity with alternative implementations is that it may become tricky for the user to determine what the correct reference to cite is.

Best,
Daniel

Thanks @wasade,

That's great news, and the timing should work out!

Re: alternative implementations, I think the goal would be a completely different action, so the citation would be straightforward. Nothing clever or automatic here, just an "Illegal instruction" -> "Use this action instead".

That said, I'm not especially attached to the idea; it just struck me as an easy out for this. Non-AVX/alternative architecture builds would be a better option.

If anyone is feeling brave, it may be possible to get things working by running QIIME 2 under the Intel SDE emulator. Caveat: it's quite likely this won't work, as most CPUs missing AVX are outside of Intel's support window.

It does seem to have specific reporting for scalar-simd which includes AVX.

That said, the command looks pretty simple:

path-to-kit/sde -- <Your QIIME 2 command here>

So it may be worth trying.

Hi @ebolyen. Thanks for persevering with the problem.
I think that given I'm still a relatively new user and some of the coding language sounds like Klatchian to me, I'm going to try a newer PC to work on.
I went and sweet-talked the IT guys at my institution into overwriting a Win10 PC with Ubuntu, so I'm going to give that a crack and hopefully get it to work smoothly.
Do appreciate all the help though, thanks.

Hi all,

Just came to add to the fray saying that I too am facing issues with non-AVX processors (running Ubuntu 20.04). I have been running qiime 2022.2 for a while now without problems.

I second the idea of allowing the non-AVX dependent methods to load, but then giving the core dumped error when one tries to run AVX-dependent ones. Personally, I use qiime2 for the filtering, visualization and dada2 implementation on an Ubuntu server, then take the results and analyze them elsewhere. So, updated pipelines that break at the core-metrics-phylogenetic step wouldn't affect my use case. (Probably not a good solution, but my two cents anyway.)

Has there been any movement on the non-AVX build of UniFrac?

Yes! We've got a discussion going internally and we're working out what our plan is going to be. I'm personally hoping we can have some kind of workaround available for this upcoming release, but no guarantees on that yet.

In any case, this discussion has been super useful for us, and whatever we end up going with is probably going to set the pattern for how we handle these things moving forward.

Edit from @ebolyen: Don't run these commands out of context. I'll have some instructions shortly, but ran into an issue getting the environment to solve.

Original message continues below the break.


Per Igor:

cd $CONDA_PREFIX/lib/ && rm -f libssu.so && ln -s libssu_cpu_basic.so libssu.so
cd $CONDA_PREFIX/bin/ && rm -f ssu && ln -s ssu_cpu_basic ssu

Perfect, thank you!

I imagine we should also do the same link for faithpd? It appears to have the same build variants and I think q2-div-lib also must call that via CLI at the moment as well.

I will write up some instructions for updating unifrac and see if anyone in this thread would like to test.

Right, that binary too. Ya, same thing for it too. And thanks!

Alrighty folks,

We have a workaround that ought to work, but we need help testing.

Note to future readers/googlers: These instructions were written May 2nd, 2023. If it's more than like 6 months out from that, you may wish to disregard/read very very carefully what these instructions are doing. There ought to be a better way and/or you have a different problem at that point, so these commands may be destructive (especially if you skip the conda install or start changing the version).

If you are contemporaneous with this post/have posted in this thread already, then it's not like your environment can be any more broken. So this shouldn't be particularly destructive.


:warning: IMPORTANT :warning:
I have not thoroughly tested every plugin to make sure it is working correctly, so you may run into issues. Treat this as experimental; if this works, we'll be able to adopt it into the next release.

Also, we are literally lying to the conda installer, so this environment should be considered de facto broken.


First we have to cheat the conda solver, so we are going to create a small env and modify it slightly:

conda create -n throwaway \
  -c conda-forge \
  -c bioconda \
  python=3.8 unifrac unifrac-binaries=1.3.1

Then we exclude a few dependencies which will break QIIME 2 but for which QIIME 2's solutions seem to be fine for the new unifrac.

conda list -n throwaway --explicit \
  | grep --invert-match 'scikit-bio\|hdmedians\|pandas\|ipython\|scipy\|jedi\|importlib\|matplotlib\|numpy\|decorator' \
  > packages_to_overwrite.txt
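Before installing from that file, it may be worth glancing at what is actually in it. A minimal sketch, assuming the usual `--explicit` format (comment lines starting with `#`, an `@EXPLICIT` marker, then one package URL per line); `list_explicit_packages` is a hypothetical helper name, not a conda command.

```shell
# Show just the package file names listed in a conda "--explicit" export,
# skipping the "#" comment lines and the "@EXPLICIT" marker at the top.
list_explicit_packages() {
  # First drop any "#..." suffix on a URL, then drop everything up to the last "/".
  grep -v '^[#@]' "$1" | sed -e 's|#.*||' -e 's|.*/||'
}

# Only runs if the file from the previous step exists in the current directory.
if [ -f packages_to_overwrite.txt ]; then
  list_explicit_packages packages_to_overwrite.txt
fi
```

None of the excluded names (scikit-bio, pandas, numpy, and so on) should show up in the output; if one does, the grep filter did not do what was intended.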

Now activate the environment you want to patch and run:

conda install --file packages_to_overwrite.txt

Now you can run the following commands to switch the build of unifrac-binaries being used:

ln -sf $CONDA_PREFIX/lib/libssu_cpu_basic.so $CONDA_PREFIX/lib/libssu.so
ln -sf $CONDA_PREFIX/bin/ssu_cpu_basic $CONDA_PREFIX/bin/ssu
ln -sf $CONDA_PREFIX/bin/faithpd_cpu_basic $CONDA_PREFIX/bin/faithpd

This will overwrite the current symlinks to point at a different build which was compiled without AVX capabilities.
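To double-check that the relinking took effect, something like the following sketch could be used. It only inspects the three paths named above; `check_unifrac_links` is a hypothetical helper name, and the assumption is that a patched link's target contains `_cpu_basic`.

```shell
# Print where each of the three relinked files points inside a conda prefix.
# Returns non-zero if any of them does not point at a *_cpu_basic build.
check_unifrac_links() {
  prefix="${1:-${CONDA_PREFIX:-}}"
  rc=0
  for rel in lib/libssu.so bin/ssu bin/faithpd; do
    # readlink prints nothing (and fails) when the path is not a symlink.
    target="$(readlink "$prefix/$rel" 2>/dev/null)"
    case "$target" in
      *_cpu_basic*) echo "ok: $rel -> $target" ;;
      *) echo "NOT patched: $rel -> ${target:-<not a symlink>}"; rc=1 ;;
    esac
  done
  return "$rc"
}

# Only runs when a conda environment is active.
if [ -n "${CONDA_PREFIX:-}" ]; then
  check_unifrac_links || true
fi
```

Run it with the patched environment active; three `ok:` lines mean all three links point at the non-AVX build.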

Now to make sure these changes are working as expected:

pytest --pyargs unifrac q2_diversity_lib q2_feature_table q2_types 

(Should end up being all green with maybe some yellow warnings you can ignore).

If you do see errors, please let us know.

Thanks to @sfiligoi and @wasade for getting a new version of unifrac-binaries out so quickly so that we could test this!

Hello,

the workaround solved the illegal instruction (core dumped) error on my Linux server (fresh install of AlmaLinux 9.2, with miniconda/python 3.8), but I ran into other issues with q2 composition ancombc.

Since the underlying hardware is old (HPE DL180 G6, year 2011), I am wondering whether the next release will be more tolerant of older hardware, or whether that is outside the focus of the update.

Hey @arwqiime,

Sorry for losing track of this thread!

Our goal was to have these fixes available in the latest release; however, it requires us to update a dependency chain in biom-format -> scikit-bio -> scipy.

We attempted this in 2022.11, but it went pretty poorly, so since the start of this year, we've been working behind the scenes on a new CI system to make these larger transitions possible. We were expecting that system to be online by the 2023.5 release, but we didn't quite get there. So for the moment, you are limited to some of the suggestions in this thread, none of which are super fun to get working.

Hopefully we will have our new CI online for the next release (perhaps 2023.7?) so that we can wrangle these dependencies correctly and get the latest unifrac installed which will fix this issue for everyone missing AVX.
