I have re-analyzed qiime diversity beta-group-significance performance after moving to q2-amplicon-2025.4 a while ago, using two datasets for my tests:
The first dataset was taken from the Moving Pictures tutorial (downloaded the unweighted_unifrac_distance_matrix.qza created by qiime2-2021.11 and sample-metadata.tsv files).
The second dataset was from one of my own projects (242 samples). This distance matrix was created in q2-2024.2.
A q2 diversity beta-group-significance run was started with identical input data under the following q2 releases (all conda): 2024.2, 2024.10, and 2025.4, on three different machines (the youngest machine is a virtual server maintained by our IT department). The performance (duration) was taken from the provenance tab of each visualization. In each case the data were stored on a local filesystem, not on a mounted network volume.
First, younger hardware is faster than older hardware. Great!
Second, there is a significant performance drop from 2024.2 to 2024.10 (and, on the old hardware with my own data, also to 2025.4). Although I can live with the performance on the younger hardware, are these differences between q2 releases something you observe on your machines, too?
Is there anything I should take care of when adding new releases to conda? E.g., do you recommend removing previous conda environments?
Thanks for reporting this @arwqiime! We're going to attempt to reproduce this on one of our larger datasets using Docker and will report back with our findings.
I conducted some testing of this on my machine: three trial runs of the command on the Moving Pictures data in 2024.2 and 2025.4, with times taken from provenance.
I tried running the entire core-metrics-phylogenetic command from the tutorial as well, and the pipeline runs ~3x faster in the newer environment (~7s vs ~24s).
I just ran this in the most recent dev environment (installed 31 Sept 2025) and the command completed in ~6s (according to data provenance). This is real data from this paper's 'artifact archive' on Zenodo.
I think we should test this command on different systems to see if the behavior differs on mac/linux/windows, to help narrow down what's going on here. The command that I ran follows, and the inputs are attached here. If we determine that this does differ, the next step will be running the Python profiler to try to narrow down where the slowdown is coming from.
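For reference, a minimal sketch of the profiling step I have in mind: wrap the work in cProfile and print the top functions by cumulative time. The workload below is a stand-in, not the actual QIIME call; in a real session you would invoke the plugin action (e.g. via the qiime2.plugins API) inside run_action instead.

```python
import cProfile
import io
import pstats

def run_action():
    # Stand-in workload for this sketch; replace with the real
    # QIIME 2 action call when profiling the actual slowdown.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
run_action()
profiler.disable()

# Report the 10 most expensive calls by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
print(stream.getvalue())
```

Sorting by cumulative time is usually the right first view here, since it surfaces the high-level function whose subtree accounts for the slowdown.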
Update: when I run this through time, I get the following. This is probably preferable to looking at the times in data provenance, since those won't represent the full run time as the user sees it - only the time spent inside the action.
time qiime diversity beta-group-significance ...
real 1m23.442s
user 0m33.142s
sys 0m48.938s
Another update: forgot to mention this is on my M3 MacBook Pro.
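As an aside, the provenance-vs-time gap is essentially wall-clock vs in-process measurement. A minimal Python illustration of that distinction (the workload is a stand-in, nothing QIIME-specific):

```python
import time

def workload():
    # Stand-in CPU-bound work; any real action would go here.
    return sum(i * i for i in range(1_000_000))

wall_start = time.perf_counter()   # wall clock, like `real`
cpu_start = time.process_time()    # CPU time of this process, like user+sys
workload()
wall = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start

print(f"wall-clock (real-like): {wall:.3f}s")
print(f"process CPU (user+sys-like): {cpu:.3f}s")
```

A large gap between the two (or, as above, sys dominating user) usually points at overhead outside the action's own computation, e.g. I/O or environment startup.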
Saved Visualization to: bgs-colinbrislawn-wsl-qiime2-amplicon-2025.10.qzv
real 0m30.660s
Saved Visualization to: bgs-colinbrislawn-wsl-qiime2-amplicon-2025.7.qzv
real 0m28.935s
Saved Visualization to: bgs-colinbrislawn-wsl-qiime2-amplicon-2024.10.qzv
real 0m14.780s
Saved Visualization to: bgs-colinbrislawn-wsl-qiime2-amplicon-2024.5.qzv
real 0m20.161s
Saved Visualization to: bgs-colinbrislawn-wsl-qiime2-amplicon-2023.9.qzv
real 0m13.457s
Thank you for suggesting the use of 'time' to get real numbers. I have rerun the previous data through time and also added Colin's data from GitHub. Here are the results (single runs, not repeated).
It seems to me that the issue is related to the distance matrix from my own project. It was created by qiime deicode rpca, which has recently been replaced by qiime gemelli rpca. As far as I can tell, there are not many differences between distance matrices created by deicode and gemelli. I would be happy to share my data by DM if necessary.
Hello @arwqiime, if you could DM me your data I would appreciate it. I've been running some pretty involved benchmarks with the Python profiler, and would like to see if I can determine what it is about your data in particular that seems to cause such a massive slowdown.