Different names for unassigned features

I want to use my data at the genus level, so I exported it to a TSV table at level 6 with the ‘collapse’ command. The problem is that the names of the features that were not classified behave strangely. Some have the ‘taxonomic letter’ at the beginning, but others do not.

For example:
“k__Bacteria;p__AD3;c__;o__;f__;g__” - with all the tax letters (k, p, c, o, f, g).
“k__Bacteria;p__Firmicutes;c__Bacilli;__;__;__” - with only some of the tax letters (k, p, c); the letters o, f, g are missing.

My main question is whether this is a real biological difference or a technical error.
Thank you,

The first taxon is a case where “there is not enough information for the Greengenes database to differentiate members of that clade, either due to ambiguity in the database or because the gene region being sequenced doesn’t provide the resolution to distinguish members of that clade” [ref]. The second case means that the classifier was not able to classify below the class level at the specified confidence level. Both are valid taxon strings, and both represent different mechanisms of “underclassification”.
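To make the distinction concrete, here is a minimal Python sketch that labels each rank in the two example strings. The function name `describe_ranks` and the labels "named", "empty", and "unclassified" are made up for illustration; this is not part of QIIME 2. It assumes a rank with a prefix but no name (e.g. `c__`) comes from the database annotation, and a bare `__` marks a rank the classifier did not assign.

```python
def describe_ranks(taxon):
    """Label each rank of a Greengenes-style taxonomy string (illustrative only)."""
    labels = []
    for rank in taxon.split(";"):
        rank = rank.strip()
        if rank == "__":
            # No rank letter at all: assumed to mean the classifier
            # stopped above this rank.
            labels.append("unclassified")
        elif rank.endswith("__"):
            # Rank letter present but no name (e.g. "c__"): assumed to
            # mean the database annotation itself is empty here.
            labels.append("empty")
        else:
            labels.append("named")
    return labels


first = describe_ranks("k__Bacteria;p__AD3;c__;o__;f__;g__")
second = describe_ranks("k__Bacteria;p__Firmicutes;c__Bacilli;__;__;__")
print(first)
print(second)
```

The first string comes out as two named ranks followed by four empty ones, and the second as three named ranks followed by three unclassified ones.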

So, if I understand you correctly -
The first example (with the tax letters) can be solved by changing the database,
and the second example (without the tax letters) can be solved by changing the classifier?
I also wonder whether these unclassified features are included at the higher levels.
I mean, if I export the TSV table at level 2 with collapse, are these bacteria included?
Thank you very much

In a sense, yes. If you don’t like the ambiguous taxonomies in the database, use a different database that does not have those ambiguous taxonomies (but it may have others!).

No. If the classifier is unable to resolve the identity, it is probably because there are other taxa with near-identical sequences. It is quite difficult to achieve species-level classification with short amplicon reads. So there is no “solving” this, just accepting that short amplicon reads are insufficient to identify all possible species.

Yes. They will be collapsed with other taxa that share the same annotation through level 2.
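As a rough illustration of that collapsing, here is a small Python sketch. It is not how QIIME 2 implements `collapse`; the helper `collapse_to_level` is hypothetical and the counts are made up. It only assumes that collapsing means truncating each taxonomy string to the first N ranks and summing the counts of features that end up with the same truncated string.

```python
from collections import defaultdict


def collapse_to_level(counts, level):
    """Sum counts of taxa that share the same annotation through `level` ranks."""
    collapsed = defaultdict(int)
    for taxon, count in counts.items():
        ranks = [rank.strip() for rank in taxon.split(";")]
        key = ";".join(ranks[:level])  # keep only the first `level` ranks
        collapsed[key] += count
    return dict(collapsed)


# Made-up counts for one sample, keyed by level-6 taxonomy strings.
genus_counts = {
    "k__Bacteria;p__AD3;c__;o__;f__;g__": 5,
    "k__Bacteria;p__Firmicutes;c__Bacilli;__;__;__": 7,
    "k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;"
    "f__Lactobacillaceae;g__Lactobacillus": 11,
}

phylum_counts = collapse_to_level(genus_counts, 2)
print(phylum_counts)
```

In this toy table the underclassified Bacilli feature is counted under `k__Bacteria;p__Firmicutes` together with the Lactobacillus feature, and the AD3 feature stays under `k__Bacteria;p__AD3`.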

@Nicholas_Bokulich and @thermokarst thank you very much! It was very helpful
