You’re right, those ASV names are not publication-friendly, and Emperor doesn’t have a friendly way of dealing with them yet either, but this post might offer something of a workaround: you can avoid those names while keeping the vectors and their taxonomic designations, which, as you mentioned, might be more interpretable. Not super elegant…
I think that is an interesting observation, but I would be hesitant to generalize it across all studies, because it really depends on the data. In fact, my personal opinion is to operate at the highest resolution possible (ASVs) by default, unless you have a reason to do otherwise. For instance, your biological signal may exist at the genus level due to some shared trait. But in another example, I recently did some analysis on a project comparing diseased vs. healthy mice, and when I ran differential abundance at the species level I found no difference in a species where we fully expected to find one. When I repeated the analysis at the ASV level, it turned out there were about 6-7 different ASVs belonging to that species. Most of them changed as we expected, but one changed in the other direction and essentially masked the expected difference across the other members. So at the end of the day I guess it really depends on the biological question, but I still think the default should be ASVs, collapsing only if you have a specific reason to do so. Others may have different opinions on this too, but to me it seems to carry less risk of losing potential signals.
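To make the masking effect concrete, here is a toy sketch in Python. The counts and ASV names are invented purely for illustration (they are not the data from the mouse project), and a raw difference in mean counts stands in for a real differential abundance test.

```python
# Hypothetical mean counts per group for four ASVs assigned to the same species.
# All numbers are made up to illustrate the masking effect.
healthy = {"ASV1": 10, "ASV2": 10, "ASV3": 10, "ASV4": 70}
disease = {"ASV1": 20, "ASV2": 20, "ASV3": 20, "ASV4": 40}

# ASV-level view: change in mean count for each ASV (disease minus healthy).
asv_change = {asv: disease[asv] - healthy[asv] for asv in healthy}

# Species-level view: collapse by summing all ASVs of the species first.
species_change = sum(disease.values()) - sum(healthy.values())

print("Per-ASV change:", asv_change)
print("Species-level change:", species_change)
```

In this toy case three ASVs increase and one decreases by the same total amount, so the collapsed species shows no change at all even though every individual ASV shifted.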
Btw, in phyloseq it is even more burdensome to maintain ASV names because they are the full DNA sequences. I always rename them to ASV1, ASV2… and store the full DNA sequences in the refseq slot just in case. Looks much better!
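The same renaming idea can be sketched outside of phyloseq. The snippet below is plain Python, not the phyloseq API: `rename_asvs` is a hypothetical helper and the sequences are short made-up strings. It assigns ASV1, ASV2… in input order and keeps a lookup table so the full sequence is never lost.

```python
def rename_asvs(sequences):
    """Map full DNA sequences to short IDs (ASV1, ASV2, ...) in input order.

    Returns two dicts: short ID -> sequence, and sequence -> short ID.
    Hypothetical helper for illustration; assumes the sequences are unique.
    """
    id_to_seq = {f"ASV{i}": seq for i, seq in enumerate(sequences, start=1)}
    seq_to_id = {seq: asv_id for asv_id, seq in id_to_seq.items()}
    return id_to_seq, seq_to_id


# Made-up, shortened sequences standing in for real feature IDs.
features = ["ACGTACGT", "TTGACCAA", "GGCATCGA"]
id_to_seq, seq_to_id = rename_asvs(features)

# Relabel a (hypothetical) feature table keyed by sequence.
counts_by_seq = {"ACGTACGT": 120, "TTGACCAA": 45, "GGCATCGA": 7}
counts_by_id = {seq_to_id[seq]: n for seq, n in counts_by_seq.items()}
```

Keeping `id_to_seq` around plays the same role as the refseq slot: the short names go on plots and tables, and the original sequence can still be recovered for any ASV.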
No problem! I actually really appreciate reading others’ takes on these topics, as they don’t have any right or wrong answers; it ‘just depends’.