Hello everyone, I’m getting different results when using the “all eukaryotes” and “Fungi” UNITE databases. Some of the taxa that show up when I use the “Fungi” database can’t be found when I use the “all eukaryotes” database, which I believed to contain all the “Fungi” sequences. It was suggested to me that I join both datasets. Does anyone know how to do this, or have other suggestions to address this issue?

This is because the content of a database can impact results. For example, if you have query sequences that are actually non-fungal eukaryotes, then using the fungi-only database would yield improper results. Likewise, when both groups are in the database, sequence similarity between fungi and other eukaryotes could affect the classification of query sequences that cannot be confidently assigned to one group or the other.

Yes, as far as I know the fungi accessions are all found in the “all eukaryotes” release. I just did a quick check to confirm this in the latest UNITE release.
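If you want to repeat that kind of check yourself, here is a minimal sketch in Python. It is not the exact check I ran. It assumes both releases are plain FASTA files and that the sequence ID is the header text up to the first whitespace. The function names and file names are hypothetical placeholders.

```python
def read_fasta_ids(path):
    """Return the set of sequence IDs found in a FASTA file.

    Assumption: the ID is the header text after '>' up to the first whitespace.
    """
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:  # skip empty headers
                    ids.add(fields[0])
    return ids


def missing_from(subset_path, superset_path):
    """Return IDs present in subset_path but absent from superset_path."""
    return read_fasta_ids(subset_path) - read_fasta_ids(superset_path)


# Example call (hypothetical file names, replace with your own):
# missing = missing_from("unite_fungi.fasta", "unite_all_eukaryotes.fasta")
# print(len(missing), "fungi IDs not found in the all-eukaryotes file")
```

An empty result would mean every ID in the fungi file also appears in the all-eukaryotes file.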

I would advise against joining the two databases. For one, this will result in duplicate accessions, which will cause an error when you try importing the database into QIIME 2 (since all sequence IDs must be unique). For another, duplicating these sequences would just slow runtime and increase memory demands, and any performance increases would be artificial.
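To see how many IDs would collide if the files were simply concatenated, something like the following rough sketch could help. It makes the same assumptions as before: plain FASTA input, with the ID taken as the header text up to the first whitespace. The function names and file names are hypothetical.

```python
from collections import Counter


def fasta_ids(path):
    """Yield each sequence ID (header text up to the first whitespace)."""
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:  # skip empty headers
                    yield fields[0]


def duplicate_ids(paths):
    """Return a sorted list of IDs that occur more than once across all files."""
    counts = Counter()
    for path in paths:
        counts.update(fasta_ids(path))
    return sorted(seq_id for seq_id, n in counts.items() if n > 1)


# Example call (hypothetical file names, replace with your own):
# dups = duplicate_ids(["unite_fungi.fasta", "unite_all_eukaryotes.fasta"])
# print(len(dups), "IDs would be duplicated after concatenation")
```

Any ID in the returned list would appear more than once in the joined file.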

My advice is to stick with one database or the other, depending on how likely you are to find non-fungal eukaryotes in your samples.

@Nicholas_Bokulich Thank you for your answer, I will consider it.

