Hi @SophieD,
Do you know how the taxonomy assignment algorithm works within EzBioCloud? I've not used this tool before. I suspect it is similar to BLAST?
Many tools simply take the top BLAST hit against a given reference database. However, the top hit is not always correct: it may have been sorted to the top arbitrarily, despite hundreds or thousands of equally good hits listed below it. For example, many organisms share the exact same sequence over a given sequenced region and cannot be disambiguated. The fit-classifier-naive-bayes approach takes this into account and will return the lowest common ancestor (LCA) when multiple taxa have identical sequences.
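To make the top-hit vs. LCA difference concrete, here is a rough Python sketch. It is only an illustration of the LCA idea, not how EzBioCloud or the naive Bayes classifier actually work internally (the latter scores k-mer probabilities rather than ranking alignment hits). The function names, the `;`-delimited lineage format, and the hit scores are all made up for the example.

```python
from typing import List, Tuple


def lowest_common_ancestor(lineages: List[str], sep: str = ";") -> str:
    """Return the deepest taxonomy prefix shared by every lineage."""
    split = [[rank.strip() for rank in lin.split(sep)] for lin in lineages]
    shared = []
    # Walk rank by rank; stop at the first rank where the lineages disagree.
    for ranks in zip(*split):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return (sep + " ").join(shared) if shared else "Unassigned"


def classify(hits: List[Tuple[str, float]]) -> Tuple[str, str]:
    """Return (naive top-hit label, LCA over all hits tied for best score)."""
    best = max(score for _, score in hits)
    tied = [lineage for lineage, score in hits if score == best]
    return hits[0][0], lowest_common_ancestor(tied)


# Hypothetical hits (lineage, percent identity); values invented for illustration.
hits = [
    ("k__Bacteria; f__Enterobacteriaceae; g__Escherichia; s__coli", 100.0),
    ("k__Bacteria; f__Enterobacteriaceae; g__Shigella; s__flexneri", 100.0),
    ("k__Bacteria; f__Enterobacteriaceae; g__Escherichia; s__fergusonii", 100.0),
    ("k__Bacteria; f__Enterobacteriaceae; g__Salmonella; s__enterica", 98.5),
]

top_hit, lca = classify(hits)
print("Top hit:", top_hit)
print("LCA    :", lca)
```

With these invented hits, the top-hit label claims a species even though three different taxa tie at the best score, while the LCA label stops at the family level, which is all the sequence can really support.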
For example, see this thread:
I might also add that it is very difficult to get species-level classifications from short amplicon reads. There are even cases in which the full-length 16S rRNA gene sequence cannot disambiguate between species or genera!
-Mike