Question about q2-ghost-tree

Hi @sixvable, apologies for the delay. It's been hard to stay on top of things during a move.

I'm honestly not familiar with ITSxpress, unfortunately. I was doing ITS analysis before that tool was developed, and I'm not currently working on fungal analysis, so I'm not fully up to date. Were you able to figure out anything more about your question #1?

For #2, if you're building your own ghost-tree, you can choose 0.97, or you can go lower (I tried 0.90 and even lower). When you cluster your sequences with vsearch, you are giving up the benefit of ASVs in exchange for slightly more flexible groupings, so that you don't have to discard sequences just because their cluster has no consensus taxonomic match to the foundation tree. To create a ghost-tree, the 'genus' in the foundation has to match the 'genus' of your extension sequence group. So a threshold like 0.90 sounds terrible at first, but it keeps many seqs from being discarded because they would otherwise be classified as "unknown". The trade-off is that it decreases the quality of your phylogenetic tree. In the tests I did, it was better to have an "acceptable" tree than to have no tree at all (which would force you into non-phylogenetic diversity analysis) or to discard a ton of your sequences because of missing nomenclature. At the time I worked on ghost-tree, so many sequences in the UNITE database were "unclassified"... I hope that gets better over time.
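If it helps to see the effect of the threshold, here is a toy sketch in Python. To be clear, this is not what vsearch or ghost-tree actually do internally: the `identity` and `greedy_cluster` functions and the three 20-base sequences are all made up for illustration, and real clustering works on alignments rather than position-by-position comparison. It only shows that the same sequences fall into fewer, broader groups at a looser threshold (0.90) than at a stricter one (0.97).

```python
def identity(a, b):
    """Fraction of matching positions between two toy sequences.

    Hypothetical helper: compares position by position, no alignment.
    """
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def greedy_cluster(seqs, threshold):
    """Put each sequence in the first cluster whose centroid it matches.

    Hypothetical helper: a sequence that matches no existing centroid
    at >= threshold becomes the centroid of a new cluster.
    """
    clusters = []  # list of (centroid, members) pairs
    for s in seqs:
        for centroid, members in clusters:
            if identity(s, centroid) >= threshold:
                members.append(s)
                break
        else:
            clusters.append((s, [s]))
    return clusters


# Made-up sequences: the second differs from the first at one position
# (95% identical); the third is very different from both.
seqs = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",
    "TTTTGGGGCCCCAAAATTTT",
]

strict = greedy_cluster(seqs, 0.97)
loose = greedy_cluster(seqs, 0.90)

print("clusters at 0.97:", len(strict))
print("clusters at 0.90:", len(loose))
```

At 0.97 the two near-identical sequences stay in separate groups, while at 0.90 they are merged. Broader groups like that are what give a cluster a better chance of carrying a usable genus-level label, at the cost of resolution in the tree.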

Let me know if this answers your questions or if you have any follow-up questions, especially given the delay in my response. :slightly_smiling_face:

-Jennifer
