I am currently following the QIIME 2 MOSHPIT tutorial using version 2025.4.0, which I believe was installed using conda. I am getting the following error message after running the mosh annotate kraken2-to-mag-features command:
ValueError: Length of values (0) does not match length of index (28)
Plugin error from annotate:
Length of values (0) does not match length of index (28)
It looks like this may be an edge case where some piece of data which was not expected to be empty is ending up empty. Are you comfortable sharing your inputs to this command (the kraken2 reports and outputs) so I can try to recreate the issue?
I am referencing the cocoa fermentation tutorial (Cocoa fermentation - MOSHPIT documentation). I guess all pairs give the issue? The keys I provided as inputs in the command (e.g., kraken_reports_mags_derep_eukaryota) encompass the data for all four of the samples.
It looks like the software that finds the lowest common ancestor (LCA) among classifications for each MAG has a bug that does not allow it to properly handle the case when the LCA is the root of the taxonomy. We will work on a fix for this and let you know once it's done.
It sounds like you followed the tutorial you referenced exactly and with the tutorial-provided data, is that correct? Just checking so I can let the author(s) of the tutorial know that this step is broken.
Thanks @colinvwood! It is with the cocoa fermentation tutorial-provided data, yes. I followed the tutorial, but it turns out not exactly. I had used the eukaryote lineage BUSCO dataset, while the tutorial used the bacteria lineage for the BUSCO bin evaluation step. After switching to the bacteria lineage at that step and proceeding with those outputs, I was able to get past the mosh annotate kraken2-to-mag-features step where I initially got the error.
That makes sense. Since the dataset contained bacterial sequences, using a eukaryotic database to select MAGs left you with only low-quality (maybe spurious) eukaryotic MAGs. These were then classified with classify-kraken2 and given low-quality classifications that disagreed all the way up to the domain level. Because all four MAGs showed this disagreement, the LCA for each was the root of the taxonomy, which is the edge case that produced the error you saw.
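To illustrate the edge case, here is a minimal sketch of an LCA computed as the longest shared prefix of several lineages. This is not the actual plugin code: the function names, the ROOT label, and the example lineages are all hypothetical and only meant to show why disagreement at the domain level yields an empty result that has to be handled explicitly.

```python
from typing import List, Sequence

ROOT = "root"  # hypothetical label for the top of the taxonomy


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest prefix shared by all lineages.

    Each lineage is ordered from the highest rank (domain) downward.
    An empty list means the lineages already disagree at the first rank,
    so their only common ancestor is the root.
    """
    if not lineages:
        return []
    shared = []
    # zip(*lineages) walks the ranks in parallel and stops at the shortest lineage.
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared


def format_lca(lineages: Sequence[Sequence[str]]) -> str:
    """Join the shared ranks, falling back to ROOT when nothing is shared."""
    shared = lowest_common_ancestor(lineages)
    # Without this fallback, an empty result would be passed downstream.
    return ";".join(shared) if shared else ROOT


# Made-up lineages that agree down to the phylum level.
agreeing = [
    ["d__Bacteria", "p__Firmicutes", "c__Bacilli"],
    ["d__Bacteria", "p__Firmicutes", "c__Clostridia"],
]

# Made-up lineages that disagree at the domain level.
disagreeing = [
    ["d__Bacteria", "p__Proteobacteria"],
    ["d__Eukaryota", "p__Ascomycota"],
]

print(format_lca(agreeing))
print(format_lca(disagreeing))
```

In this sketch the fallback to ROOT is the important part: code that assumes at least one shared rank will receive an empty value whenever classifications disagree at the domain level.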
We created a fix for this issue here. The fix will be available in the next release.