I have re-analyzed qiime diversity beta-group-significance performance after moving to q2-amplicon-2025.4 a while ago, using two datasets for my tests:
The first dataset was taken from the Moving Pictures tutorial (downloaded the unweighted_unifrac_distance_matrix.qza created by qiime2-2021.11 and sample-metadata.tsv files).
The second dataset was from one of my own projects (242 samples). This distance matrix was created in q2-2024.2.
A q2 diversity beta-group-significance run was started with identical input data under the following q2 releases (all conda): 2024.2, 2024.10, and 2025.4, on three different machines (the youngest machine is a virtual server maintained by our IT department). The performance (duration) was taken from the provenance tab of each visualization. In each case the data were stored on a local filesystem, not on a mounted network volume.
First, younger hardware is faster than older hardware. Great!
Second, there is a significant performance drop from 2024.2 to 2024.10 (and, on the old hardware with my own data, also to 2025.4). Although I can live with the performance on the younger hardware, are these differences between q2 releases something you observe on your machines, too?
Is there anything I should take care of when adding new releases to conda? E.g., do you recommend removing previous conda environments?
Thanks for reporting this @arwqiime! We're going to attempt to reproduce this on one of our larger datasets using Docker and will report back with our findings.
I conducted some testing of this on my machine: three trial runs of the command on the Moving Pictures data in 2024.2 and 2025.4, with times taken from provenance.
I tried running the entire core-metrics-phylogenetic command from the tutorial as well, and the pipeline runs ~3x faster in the newer environment (~7s vs ~24s).
I just ran this in the most recent dev environment (installed 31 Sept 2025) and the command completed in ~6s (according to data provenance). This is real data from this paper's 'artifact archive' on Zenodo.
I think we should test this command on different systems to see if the behavior differs on mac/linux/windows, to help narrow down what's going on here. The command that I ran follows, and the inputs are attached here. If we determine that this does differ, the next step will be running the Python profiler to try to narrow down where the slowdown is coming from.
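For reference, a minimal sketch of the profiling step I have in mind: wrap the work in cProfile and print the top functions by cumulative time. The workload below is a stand-in, not the actual QIIME call; in a real session you would invoke the plugin action (e.g. via the qiime2.plugins API) inside run_action instead.

```python
import cProfile
import io
import pstats

def run_action():
    # Stand-in workload for this sketch; replace with the real
    # QIIME 2 action call when profiling the actual slowdown.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
run_action()
profiler.disable()

# Report the 10 most expensive calls by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
print(stream.getvalue())
```

Sorting by cumulative time is usually the right first view here, since it surfaces the high-level function whose subtree accounts for the slowdown.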
Update: when I run this through time, I get the following. This is probably preferable to looking at the times in data provenance, since those won't represent the full run time as the user sees it - only the time spent inside the action.
time qiime diversity beta-group-significance ...
real 1m23.442s
user 0m33.142s
sys 0m48.938s
Another update: forgot to mention this is on my M3 MacBook Pro.
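As an aside, the provenance-vs-time gap is essentially wall-clock vs in-process measurement. A minimal Python illustration of that distinction (the workload is a stand-in, nothing QIIME-specific):

```python
import time

def workload():
    # Stand-in CPU-bound work; any real action would go here.
    return sum(i * i for i in range(1_000_000))

wall_start = time.perf_counter()   # wall clock, like `real`
cpu_start = time.process_time()    # CPU time of this process, like user+sys
workload()
wall = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start

print(f"wall-clock (real-like): {wall:.3f}s")
print(f"process CPU (user+sys-like): {cpu:.3f}s")
```

A large gap between the two (or, as above, sys dominating user) usually points at overhead outside the action's own computation, e.g. I/O or environment startup.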
Saved Visualization to: bgs-colinbrislawn-wsl-qiime2-amplicon-2025.10.qzv
real 0m30.660s
Saved Visualization to: bgs-colinbrislawn-wsl-qiime2-amplicon-2025.7.qzv
real 0m28.935s
Saved Visualization to: bgs-colinbrislawn-wsl-qiime2-amplicon-2024.10.qzv
real 0m14.780s
Saved Visualization to: bgs-colinbrislawn-wsl-qiime2-amplicon-2024.5.qzv
real 0m20.161s
Saved Visualization to: bgs-colinbrislawn-wsl-qiime2-amplicon-2023.9.qzv
real 0m13.457s
Thank you for suggesting the use of 'time' to get real numbers. I have rerun the previous data through time and also added Colin's data from GitHub. Here are the results (single runs, not repeated).
It seems to me that the issue is related to the distance matrix from my own project. It was created by qiime deicode rpca, which has recently been replaced by qiime gemelli rpca. As far as I can tell, there are not many differences between distance matrices created by deicode and gemelli. I would be happy to share my data by DM if necessary.
Hello @arwqiime, if you could DM me your data I would appreciate it. I've been running some pretty involved benchmarks with the Python profiler, and would like to see if I can determine what it is about your data in particular that seems to cause such a massive slowdown.