Should non-fungal microorganisms be removed from mycobiome analysis?


I'm analyzing the nasopharyngeal mycobiome in patients.
When I assigned taxonomy with the UNITE+INSD all-eukaryotes database, some ASVs were assigned to non-fungal groups (Unassigned, kingdom Alveolata, Eukaryota_kgd_Incertae_sedis, Protista, Rhizaria, and Viridiplantae).

Should these non-fungal ASVs be removed from the mycobiome analysis, including relative abundance and alpha and beta diversity?



Welcome to the forums! :qiime2: This is a great first question.

Thanks for mentioning the database. What primers did you use and what microbes are they designed to target?

It depends.

You could select just the microbes targeted by your primers and covered by your database. The argument is that because these are the only microbes that are supposed to be there, these are the only microbes you can comment on.

You could also keep all your data, arguing that this reflects all microbes amplified by your primers, and so this avoids any extra bias introduced by removing taxa. This shows all observed microbes.

You can choose which argument to make, depending on what you think is best for your project.
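To make the difference between the two options concrete, here is a minimal Python sketch. The ASV IDs, counts, and taxonomy labels are made up for illustration, and the `k__`/`p__` label format is an assumption about how your classifier output looks. It computes relative abundance once with every ASV kept and once with only the ASVs whose kingdom is Fungi.

```python
# Hypothetical ASV counts for one sample and their taxonomy labels.
# The IDs, counts, and labels are illustrative, not real data.
counts = {"asv1": 60, "asv2": 20, "asv3": 15, "asv4": 5}
taxonomy = {
    "asv1": "k__Fungi;p__Ascomycota",
    "asv2": "k__Fungi;p__Basidiomycota",
    "asv3": "k__Viridiplantae;p__Anthophyta",
    "asv4": "Unassigned",
}


def is_fungal(label):
    """Return True when the first (kingdom) rank of a label is k__Fungi."""
    return label.split(";")[0].strip() == "k__Fungi"


def relative_abundance(table):
    """Convert raw counts to fractions that sum to 1."""
    total = sum(table.values())
    return {asv: n / total for asv, n in table.items()}


# Option 1: keep everything the primers amplified.
all_taxa = relative_abundance(counts)

# Option 2: keep only ASVs classified as fungi, then renormalize.
fungi_only = relative_abundance(
    {asv: n for asv, n in counts.items() if is_fungal(taxonomy[asv])}
)

print(all_taxa)
print(fungi_only)
```

Note that the fungal ASVs get larger relative abundances once the off-target reads are dropped, because the denominator shrinks. Whichever option you pick, apply it before computing alpha and beta diversity so all your results describe the same set of features.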

To add to @colinbrislawn's great comments, I'll add one minor detail.

The reason the "other eukaryotes" are included in the UNITE database is to act as a set of outgroup / decoy / off-target taxa. The purpose of these outgroup taxa is to make sure that you properly classify fungi as fungi.

That is, if the reference data set contains nothing but fungi, then most sequences, even non-fungal ones, will often be misclassified as "k__Fungi" with no classification at any lower taxonomic level.

So, having these outgroups increases the chances of properly classifying your features as fungi if they really are, and as non-fungi if they are plants, metazoa, etc. Then you can remove these taxa if they are not part of your research focus. This is no different from removing chloroplast and mitochondrial sequences from bacterial data sets. Some even remove taxa that do not have at least a phylum-level classification.
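As a rough illustration of that filtering logic, here is a small Python sketch. The helper names `rank_name` and `keep_feature` are hypothetical, and it assumes `k__`/`p__` style labels in which an unresolved phylum is either missing or written as `unidentified`; check how your own classifier output is formatted before relying on this. In QIIME 2 itself, the `qiime taxa filter-table` action with `--p-include` / `--p-exclude` does this kind of filtering on a real feature table.

```python
def rank_name(label, prefix):
    """Return the name at a rank prefix such as 'p__', or '' if absent."""
    for part in label.split(";"):
        part = part.strip()
        if part.startswith(prefix):
            return part[len(prefix):]
    return ""


def keep_feature(label, require_phylum=True):
    """Keep fungal features, optionally requiring a resolved phylum.

    Assumption: an unresolved phylum is either missing from the label
    or written as 'unidentified'.
    """
    if rank_name(label, "k__") != "Fungi":
        return False
    if require_phylum:
        phylum = rank_name(label, "p__")
        if phylum == "" or phylum.lower() == "unidentified":
            return False
    return True


# Illustrative labels only, not real classifier output.
labels = [
    "k__Fungi;p__Ascomycota;c__Saccharomycetes",
    "k__Fungi",
    "k__Fungi;p__unidentified",
    "k__Viridiplantae;p__Anthophyta",
    "Unassigned",
]

kept = [label for label in labels if keep_feature(label)]
print(kept)
```

Setting `require_phylum=False` keeps every feature classified as Fungi, including those with no classification below kingdom level.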

Also, a term like incertae sedis is not an actual taxon; it is simply Latin for 'of uncertain placement'.

I hope this helps you decide which is the best way to go. :slight_smile: