I have a question about taxonomic assignments using the feature classifier and how the depth is labeled for some taxonomic assignments.
I’ve noticed that in both my own data and the “Moving Pictures” tutorial, the taxon labels are not consistent, even when the assignment reaches the same taxonomic depth.
For instance, in the “Moving Pictures” taxonomy file, Feature ID 02ef9a59d6da8b642271166d3ffd1b52 is assigned “k__Bacteria; p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__; s__” and Feature ID 73291cac0e802b6a1fb25ae7079390ef is assigned “k__Bacteria; p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae”.
My interpretation of these assignments is that both Feature IDs have been assigned to the family Ruminococcaceae, and that neither assignment can be resolved below the family level. Why does one Feature ID get an additional “g__; s__” but the other does not?
I’m doing some downstream analysis in R using phyloseq, and I think I may run into some issues with taxonomic ranks if some taxonomic assignments have additional labelling.
Is the solution to export the taxonomy file to .tsv (which I’m already doing to create a phyloseq-importable .biom file), and then make the taxonomic labeling consistent (i.e. add “p__; c__; o__;” etc. for any missing ranks, all the way down to “s__”)?
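To make the idea concrete, here is a rough sketch in Python of the padding I have in mind. It assumes Greengenes-style rank prefixes (k__ through s__) separated by semicolons, and the function name `pad_taxon` is just something I made up for illustration; it is not part of QIIME 2 or phyloseq. Labels that don't start with “k__” (e.g. “Unassigned”) are left alone.

```python
# Assumed rank prefixes, in order, for Greengenes-style lineages.
RANK_PREFIXES = ["k__", "p__", "c__", "o__", "f__", "g__", "s__"]


def pad_taxon(taxon: str) -> str:
    """Pad a lineage string with empty rank labels down to species."""
    ranks = [r.strip() for r in taxon.split(";") if r.strip()]
    # Leave anything that is not a prefixed lineage (e.g. "Unassigned") as-is.
    if not ranks or not ranks[0].startswith("k__"):
        return taxon
    # Append the empty prefix for every rank missing from the end.
    ranks += RANK_PREFIXES[len(ranks):]
    return "; ".join(ranks)


short = ("k__Bacteria; p__Firmicutes; c__Clostridia; "
         "o__Clostridiales; f__Ruminococcaceae")
full = short + "; g__; s__"

# Both of the example labels end up as the same seven-rank string.
print(pad_taxon(short) == pad_taxon(full))  # True
```

I would apply something like this to the Taxon column of the exported .tsv before building the .biom file.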
Thanks in advance!