My colleagues and I are trying to use ghost-tree to get some phylogenetic information from fungal ITS data, but we ran into several setbacks that we can’t solve. Since we have questions about a few different aspects and the other posts were quite inactive, we decided to create a new post.
We have questions on three topics (sorry!): one related to using pre-built ghost trees and two related to building our own tree, which is probably the best option because it lets us use up-to-date databases.
Using pre-built trees, we encountered an issue when running q2-diversity-core, as happened to other users (e.g., Ghost tree filtering? - #26 by Jennifer_Fouquier, Error when running Ghost tree): the OTUs were not fully represented in the tree, even though the clustering was carried out with the exact same database. After re-reading the posts and the ghost-tree paper, we realized we needed to filter the OTU table to keep only those IDs present in the tree. We did so, and it worked. Still, we have some questions and concerns about this table filtering step. Firstly, why is it still necessary if we clustered with the same database used for the ghost tree? Also, will it be necessary even if we build our own tree? And finally, considering that it discarded at least 25% of our OTUs (even more with the 90% or 100% ghost trees), are we not losing a significant amount of information by doing this? Maybe there’s a theoretical aspect we’re not quite grasping here, and you could help ease our concerns.
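To make clear what we mean by this filtering step, here is a minimal sketch in plain Python. It is only an illustration under our own assumptions, not the exact commands we ran: the function names, the toy tree, and the toy table are all hypothetical, the tree is assumed to be a Newick string with unquoted tip labels, and the OTU table is simplified to a dictionary of OTU ID to per-sample counts.

```python
import re

# Matches unquoted tip labels: a label that directly follows "(" or ",".
# Internal node labels follow ")" and branch lengths follow ":",
# so neither of those is matched.
_TIP_LABEL = re.compile(r"(?<=[(,])\s*([^\s(),:;']+)")


def tip_ids_from_newick(newick):
    """Return the set of unquoted tip labels found in a Newick string."""
    return set(_TIP_LABEL.findall(newick))


def filter_table_to_tree(table, tip_ids):
    """Split an OTU table (dict: OTU ID -> counts) into kept and dropped parts."""
    kept = {otu: counts for otu, counts in table.items() if otu in tip_ids}
    dropped = {otu: counts for otu, counts in table.items() if otu not in tip_ids}
    return kept, dropped


# Hypothetical toy data, not taken from our real files.
toy_tree = "((SH001:0.1,SH002:0.2)0.9:0.3,SH003:0.4);"
toy_table = {
    "SH001": [10, 0],
    "SH002": [3, 7],
    "SH003": [0, 5],
    "SH004": [2, 2],  # absent from the tree, so it ends up in `dropped`
}

kept, dropped = filter_table_to_tree(toy_table, tip_ids_from_newick(toy_tree))
fraction_dropped = len(dropped) / len(toy_table)
print(sorted(kept), sorted(dropped), fraction_dropped)
```

The sketch compares IDs as literal strings, so any difference in how an ID is written in the tree versus the table would send that OTU to `dropped`. The `fraction_dropped` value is the quantity we are worried about in our real data.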
Building our own trees, we came across two issues:
We were able to build our tree using q2-ghost-tree scaffold-hybrid-tree-foundation-alignment but, once again, we hit the IDs-not-matching problem in the diversity core step. We suspect it is caused by underscores being replaced by spaces (Ghost tree filtering? - #26 by Jennifer_Fouquier), which would be solved by adding single quotes around the IDs, but we’re not sure how to do this. Firstly, should the single quotes be added to the ghost tree node IDs or to one or more of the files used in q2-ghost-tree scaffold-hybrid-tree-foundation-alignment? And, if they should be added to the ghost tree, how would we do that?
We were not able to run q2-ghost-tree scaffold-hybrid-tree-foundation-tree because it wouldn’t take the --i-foundation-taxonomy file we were using. We tried the SilvaTaxonomy file we imported according to the tutorial (Q2-ghost-tree Plugin: Community Tutorial for Creating Hybrid-Gene Phylogenetic Trees):
qiime tools import
Yet, it didn’t work because it needed to be a FeatureData[Taxonomy] file, which means this artifact is only used to run q2-ghost-tree extract-fungi. The problem is that the tutorial example for loading the foundation taxonomy uses an example file called “minitaxonomy_foundation.txt”, so we’re not sure which Silva file we need for this step. Which file would this be? And also, is there any difference between obtaining the ghost tree from q2-ghost-tree scaffold-hybrid-tree-foundation-alignment and obtaining it from q2-ghost-tree scaffold-hybrid-tree-foundation-tree?
Wow, that was a lot of questions, but I think that’s all. We’re sorry for sending such a long post, but we’re hoping you can help us understand, at least little by little, what’s going on with these issues and how we can solve them.
Thanks in advance!