QoL Improvement: q2-boots and q2-diversity core-metrics should error at the start of the run when samples are missing from the mapping file

Just encountered a frustrating situation in which I ran the q2-boots core-metrics pipeline over the weekend, only for it to error at the end of its run on Sunday at the Emperor step with the usual “KeyError: 'There are samples not included in the sample mapping file. Override this error by using the ignore_missing_samples argument. Offending samples:…”

Firstly, the q2-boots pipeline doesn’t include the --p-ignore-missing-samples option that q2-diversity has, and it should, especially since it can throw an error that suggests it (this has also been suggested in “No ignore_missing_samples parameter”, Issue #30 in qiime2/q2-boots on GitHub). Additionally, if there were a check for this at the start of the pipeline, instead of only at the Emperor step where the mapping file gets used, I would’ve been able to spot the problem immediately and fix it before the weekend. This may be more relevant for q2-boots than q2-diversity, given the increased time requirements.
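To illustrate the kind of up-front check described above, here is a minimal Python sketch. It is not the q2-boots or q2-diversity implementation: the function names and the toy sample IDs are hypothetical, and it assumes the sample IDs from the feature table and from the mapping file have already been read into plain lists.

```python
def find_missing_samples(table_sample_ids, metadata_sample_ids):
    """Return the sorted sample IDs present in the table but not in the metadata."""
    return sorted(set(table_sample_ids) - set(metadata_sample_ids))


def check_samples_before_run(table_sample_ids, metadata_sample_ids):
    """Fail fast, before any expensive step, if the metadata is incomplete.

    Hypothetical helper: a real pipeline would call something like this
    first, rather than discovering the problem at the plotting step.
    """
    missing = find_missing_samples(table_sample_ids, metadata_sample_ids)
    if missing:
        raise ValueError(
            "Samples in the feature table are missing from the sample "
            "mapping file: " + ", ".join(missing)
        )


# Toy IDs for illustration only.
table_ids = ["S1", "S2", "S3"]
metadata_ids = ["S1", "S3"]
print(find_missing_samples(table_ids, metadata_ids))  # ['S2']
```

Running a check like this before the long resampling steps would surface the offending samples within seconds instead of at the end of the run.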

Another potential fix is to at least output the results that complete prior to the Emperor step, instead of discarding everything when that step errors. Let me know if there are any questions about this.

Hi @Adam_Cantor,
Thanks for these QOL suggestions! I totally agree and have added your comment to the issue you linked.

I am sorry for the inconvenient timing of this error. I hope that pipeline resumption can help salvage your run.

Thanks

Thanks for the reply, @cherman2. I wasn’t familiar with pipeline resumption; I’ll check it out.

Adam

@Adam_Cantor - just FYI @cherman2 is working on a fix for this and it'll go into the next release. Both boots core-metrics and boots kmer-diversity will fail right away if this situation comes up, and the error message will provide some information on how to remedy the situation.
