QIIME 2 2019.4.0 has faulty OpenBLAS results on certain CPUs

data-integrity-bug
(Evan Bolyen) #1

We’re sorry to report that on certain CPUs [1], matrix multiplication is performed incorrectly by OpenBLAS. We initially thought the issue was a mismatch between the numpy bindings and OpenBLAS; however, further research has revealed that the issue lies in OpenBLAS itself.

[1] Those with AVX 512 extensions, e.g. Skylake-X

Am I affected?

Probably not, unless you have a new (and relatively expensive) computer. In particular, only Intel CPUs with AVX 512 have this issue. If your CPU does not support AVX 512, there is nothing to worry about; OpenBLAS will do the correct thing.

If you don’t know (or don’t care) what CPU you have, we recommend just updating to the latest patch.

If you would like to learn if your CPU has AVX 512, you can run the following:

OS X

sysctl -a | grep -E 'machdep\.cpu\.(leaf7_)?features' # Look for AVX512 (AVX1.0 or AVX2 is fine)

Linux

grep flags /proc/cpuinfo | uniq # Look for flags starting with avx512, such as avx512f (avx or avx2 is fine)
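
For anyone who would rather script the Linux check, here is a minimal Python sketch. It assumes the usual /proc/cpuinfo layout, where AVX 512 support shows up on the "flags" lines as tokens beginning with avx512 (for example avx512f). The function name avx512_flags is made up for this illustration and is not part of QIIME 2.

```python
from pathlib import Path


def avx512_flags(cpuinfo_text):
    """Return the sorted AVX 512 flag tokens found in cpuinfo-style text.

    Hypothetical helper for illustration; assumes Linux "flags : ..." lines.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Compare whole tokens so plain "avx" and "avx2" are not counted.
            found.update(tok for tok in value.split() if tok.startswith("avx512"))
    return sorted(found)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():  # this file only exists on Linux
        flags = avx512_flags(cpuinfo.read_text())
        print("AVX 512 flags:", ", ".join(flags) if flags else "none found")
```

An empty result suggests the CPU does not report AVX 512; anything else means the update is worth doing right away.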

What is OpenBLAS?

OpenBLAS is a widely used part of the scientific computing stack, providing fast and efficient linear algebra routines, and can be found linked from libraries like fastspar, numpy, and scipy to full programming languages like R and Julia. Much of what we do would not be possible without high quality and free libraries like this one.

What is being done?

Later today or tomorrow morning we will have a patch ready (2019.4.1). We strongly recommend that you update your version of QIIME 2, making sure that qiime info reports q2-types 2019.4.1 and not 2019.4.0. This patch pins OpenBLAS to 0.3.3, which does not use AVX 512 instructions. We’re bumping the patch number of q2-types so that it is easy to tell from provenance which 2019.4 version was used.


Extra Details

We believe that only OpenBLAS versions 0.3.5 and 0.3.6 are impacted (specifically the DGEMM routine). To the best of our knowledge, this issue has not been fully solved yet, as we were able to reproduce it with both of those versions (even though 0.3.6 should have disabled the problematic code). OpenBLAS 0.3.3 was unaffected by the issue on the same hardware.
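
To see which OpenBLAS build an environment contains, one option is to scan the output of conda list. The sketch below assumes the usual whitespace-separated name/version/build/channel columns and that the package is called openblas or libopenblas. The helper name and the sample text are hypothetical; the set of impacted versions is taken from the paragraph above.

```python
IMPACTED_VERSIONS = {"0.3.5", "0.3.6"}  # versions named in this post
OPENBLAS_NAMES = {"openblas", "libopenblas"}  # assumed package names


def impacted_openblas(conda_list_text):
    """Return (name, version) pairs for OpenBLAS packages at an impacted version."""
    hits = []
    for line in conda_list_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header comments conda prints
        fields = line.split()
        if (
            len(fields) >= 2
            and fields[0] in OPENBLAS_NAMES
            and fields[1] in IMPACTED_VERSIONS
        ):
            hits.append((fields[0], fields[1]))
    return hits


# Made-up sample input, not real output from any environment.
sample = """\
# packages in environment at /hypothetical/envs/qiime2-2019.4:
#
# Name                    Version                   Build  Channel
numpy                     1.16.2                    py36_0  example
openblas                  0.3.5                     h0_0    example
"""
print(impacted_openblas(sample))  # [('openblas', '0.3.5')]
```

In practice you would pass in the text printed by conda list -n qiime2-2019.4 (or whatever you named your env). An empty list only means that no impacted version was found under the assumed package names.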

(Evan Bolyen) #2

We will follow up here when the new environment is ready. Sorry for the inconvenience!

(Evan Bolyen) #3

New environment files are now available. Download and reinstall normally.

(Daniel) #4

Thanks @ebolyen, I have this result from “cat /proc/cpuinfo”:

processor : 19
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
stepping : 4
microcode : 1068
cpu MHz : 2199.987
cache size : 25600 KB
physical id : 0
siblings : 20
core id : 12
cpu cores : 10
apicid : 25
initial apicid : 25
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat xsaveopt pln pts dtherm spec_ctrl ibpb_support tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips : 4399.97
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

It seems I have AVX. Am I safe?

Thanks!

#5

Hi! Thank you for the update.
I am currently rerunning my analysis in QIIME 2 2019.1 since I have AVX 512. Just to be sure: am I safe with 2019.1, or am I affected on this version as well?

(Clara) #6

Hi, may I know how to download and reinstall?
Thanks.

(Matthew Ryan Dillon) #7

Hi @danielsebas!

Yep! AVX is not the same as AVX 512 — you’re fine. If you really want to be sure, remove your 2019.4 env:

conda env remove -n qiime2-2019.4  # or whatever you named your env

Then follow the official installation guide to get set up with the latest 2019.4. Once installed, run qiime info and check that q2-types is at 2019.4.1. That is all!

Keep us posted! :t_rex:

(Matthew Ryan Dillon) #8

Hi @timanix.

2019.1 does not use the impacted versions of openblas (I don’t think it uses openblas at all, actually). To be clear though, 2019.4 has been patched to use an unimpacted version of openblas, so you could just update your 2019.4 deployment by first removing the old env:

conda env remove -n qiime2-2019.4  # or whatever you named your env

Then follow the official installation guide to get set up with the latest 2019.4. Once installed, run qiime info and check that q2-types is at 2019.4.1.

Thanks! :t_rex:

(Matthew Ryan Dillon) #9

Hi there @Clara!

First, remove your 2019.4 env (if you installed it before the fix was released):

conda env remove -n qiime2-2019.4  # or whatever you named your env

Then follow the official installation guide to get set up with the latest 2019.4. Once installed, run qiime info and check that q2-types is at 2019.4.1. That is all!

Please keep us posted with any more questions - thanks! :t_rex:

#10

After removing 2019.4 I searched my conda installation for openblas and found nothing. Thank you for the explanation, now it is clear.
This is already the third time I am rerunning my analysis, so I’ll wait in case there are more updates :sunglasses:
Thanks for the fixes!

(Eman Khalaf) #11

Hi @thermokarst
I got this from cat /proc/cpuinfo

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel(R) Core(TM) i5-4460  CPU @ 3.20GHz
stepping	: 3
microcode	: 0x25
cpu MHz		: 997.350
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts flush_l1d
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips	: 6385.43
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel(R) Core(TM) i5-4460  CPU @ 3.20GHz
stepping	: 3
microcode	: 0x25
cpu MHz		: 999.506
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 4
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts flush_l1d
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips	: 6385.43
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel(R) Core(TM) i5-4460  CPU @ 3.20GHz
stepping	: 3
microcode	: 0x25
cpu MHz		: 1009.409
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 2
cpu cores	: 4
apicid		: 4
initial apicid	: 4
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts flush_l1d
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips	: 6385.43
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel(R) Core(TM) i5-4460  CPU @ 3.20GHz
stepping	: 3
microcode	: 0x25
cpu MHz		: 981.318
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 6
initial apicid	: 6
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts flush_l1d
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips	: 6385.43
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

I don’t know what to do.
Should I repeat everything I did yesterday with qiime2-2019.4?
Thanks

(Matthew Ryan Dillon) #12

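I don’t see AVX 512 in any of your flags, so your processor is not impacted. Please see @ebolyen’s note above:

“If you don’t know (or don’t care) what CPU you have, we recommend just updating to the latest patch.”

You can do that by following my post above.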

(Clara) #14

Thanks @thermokarst, I followed what you said:

  1. Remove the 2019.4 version:
    conda env remove -n qiime2-2019.4
    (I tried “source activate qiime2-2019.4” again to make sure I had removed it, and yes, I can’t activate qiime at this point.)
  2. Reinstall using these commands:
    wget https://data.qiime2.org/distro/core/qiime2-2019.4-py36-linux-conda.yml
    conda env create -n qiime2-2019.4 --file qiime2-2019.4-py36-linux-conda.yml
    # OPTIONAL CLEANUP
    rm qiime2-2019.4-py36-linux-conda.yml
  3. Activate the new environment:
    source activate qiime2-2019.4
  4. Check with “qiime info”. This is what I got:

System versions
Python version: 3.6.7
QIIME 2 release: 2019.4
QIIME 2 version: 2019.4.0
q2cli version: 2019.4.0

Installed plugins
alignment: 2019.4.0
composition: 2019.4.0
cutadapt: 2019.4.0
dada2: 2019.4.0
deblur: 2019.4.0
demux: 2019.4.1
diversity: 2019.4.0
emperor: 2019.4.0
feature-classifier: 2019.4.0
feature-table: 2019.4.0
fragment-insertion: 2019.4.0
gneiss: 2019.4.0
longitudinal: 2019.4.0
metadata: 2019.4.0
phylogeny: 2019.4.0
quality-control: 2019.4.0
quality-filter: 2019.4.0
sample-classifier: 2019.4.0
taxa: 2019.4.0
types: 2019.4.1
vsearch: 2019.4.0

Application config directory
/home/ubuntu/Myvolume_1/miniconda3/envs/qiime2-2019.4/var/q2cli

Getting help
To get help with QIIME 2, visit https://qiime2.org

It seems that only “types” and “demux” show 2019.4.1. Does this mean everything is OK?

(Evan Bolyen) #15

Yes that is exactly right!
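
If you want to script this confirmation, here is a rough Python sketch. It assumes qiime info prints the "Installed plugins" section as name: version lines, as in the output pasted above. plugin_versions is a hypothetical helper and not a QIIME 2 API, and the excerpt is a shortened copy of that output.

```python
def plugin_versions(qiime_info_text):
    """Parse the "Installed plugins" section of `qiime info` output into a dict."""
    versions = {}
    in_plugins = False
    for line in qiime_info_text.splitlines():
        stripped = line.strip()
        if stripped == "Installed plugins":
            in_plugins = True
            continue
        if not in_plugins:
            continue
        if not stripped:
            if versions:
                break  # a blank line after the entries ends the section
            continue  # tolerate a blank line right after the heading
        name, sep, version = stripped.partition(":")
        if sep:
            versions[name.strip()] = version.strip()
    return versions


# Shortened excerpt of the output shown earlier in this thread.
excerpt = """\
System versions
Python version: 3.6.7
QIIME 2 release: 2019.4

Installed plugins
demux: 2019.4.1
diversity: 2019.4.0
types: 2019.4.1

Application config directory
"""
print(plugin_versions(excerpt).get("types") == "2019.4.1")  # True
```

Pass the function the text printed by qiime info in your own environment; a types entry of 2019.4.1 corresponds to the patched release described in the first post.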

(Matthew Ryan Dillon) split this topic #16

An off-topic reply has been split into a new topic: Trouble with filepaths and q2-dada2

Please keep replies on-topic in the future.