Differences between the 2022 and 2023 releases of UNITE

Thank you for sharing these updated results. I can only investigate so much on a volunteer project, but hopefully I can look a little more and get you started.

Mock 1 and Mock 2 look similar, except...

  • Mock 1 (from 25.07.2023) has many more unassigned reads than Mock 2 (from 29.11.2022).
  • I think this is due to differences in the Unite database, not the Qiime2 pipeline, but I have not benchmarked this.

Comparing leaves 1 and 2

  • Leaves 1 has FAR more unassigned reads than Leaves 2.
  • This is the exact same pattern as in the simpler mock communities, just with a stronger effect.
  • Perhaps something is wrong with my file: unite_ver9_dynamic_all_25.07.2023-Q2-2024.2.qza

Unite v10 just came out, if you want to try that. It is on the Releases page of colinbrislawn/unite-train on GitHub.

We need to isolate which variable is responsible: the database version or the Qiime2 version.

As long as this is a UNITE issue and not a Qiime2 bug or regression, we can simply point out this problem when we tell reviewers why we are not using the newest Unite database.

This is why trying the newest Unite with the same qiime2-2024.2 release is so interesting! With the Qiime2 version held fixed, any change in unassigned reads has to come from the database.
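To put a number on "more unassigned reads" for each run, here is a minimal sketch, not a benchmark. It assumes the taxonomy was exported as a TSV with `Feature ID` and `Taxon` columns, that unclassified features are labeled `Unassigned`, and that you have per-feature read counts as a plain dict. The function name and the toy data are made up for illustration; they are not your results.

```python
import csv
import io


def unassigned_fraction(taxonomy_tsv, counts):
    """Return the fraction of reads whose taxon label is 'Unassigned'.

    taxonomy_tsv: open text file (or file-like object) with a header row
                  containing 'Feature ID' and 'Taxon' columns (assumed layout).
    counts:       dict mapping feature ID -> total read count.
    """
    reader = csv.DictReader(taxonomy_tsv, delimiter="\t")
    total = 0
    unassigned = 0
    for row in reader:
        n = counts.get(row["Feature ID"], 0)
        total += n
        if row["Taxon"].strip() == "Unassigned":
            unassigned += n
    return unassigned / total if total else 0.0


# Made-up toy data: same features and counts, classified with two databases.
toy_counts = {"asv1": 60, "asv2": 30, "asv3": 10}

toy_taxonomy_2022 = (
    "Feature ID\tTaxon\tConfidence\n"
    "asv1\tk__Fungi;p__Ascomycota\t0.99\n"
    "asv2\tk__Fungi;p__Basidiomycota\t0.95\n"
    "asv3\tUnassigned\t0.40\n"
)
toy_taxonomy_2023 = (
    "Feature ID\tTaxon\tConfidence\n"
    "asv1\tk__Fungi;p__Ascomycota\t0.99\n"
    "asv2\tUnassigned\t0.50\n"
    "asv3\tUnassigned\t0.40\n"
)

for label, text in [("2022 db", toy_taxonomy_2022), ("2023 db", toy_taxonomy_2023)]:
    frac = unassigned_fraction(io.StringIO(text), toy_counts)
    print(f"{label}: {frac:.1%} of reads unassigned")
```

Running the same function on the outputs from each database, with the Qiime2 release kept the same, gives one comparable number per run to show reviewers.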

Keep in touch,
Colin