ITS phylogeny using iq-tree

Hello QIIME 2 team and community,

I’m working on fungal ITS data (from soil and tree microbiomes) and would like to include phylogenetic metrics such as weighted UniFrac in my diversity analyses. I would also need a tree for other purposes, for example bin-based calculations of community assembly mechanisms.

I initially planned to use the q2-ghost-tree plugin to graft my ITS ASVs onto a reference backbone (UNITE in my case). However, ghost-tree hasn’t been updated since 2021, and I’m wondering whether it is still recommended for current QIIME 2 releases (I’m using version 2024.10).

I’d like to ask:

  1. Are there modern alternatives to Ghost Tree for fungal ITS data that allow construction of a meaningful phylogenetic tree?

  2. Is it feasible and biologically acceptable to generate a de novo phylogeny directly from my ITS representative sequences using q2-phylogeny and IQ-TREE (e.g., via qiime phylogeny align-to-tree-mafft-iqtree)?

    • Would this be considered valid for diversity analyses (even though ITS is highly variable and difficult to align globally)?

    • Are there recommended parameters or masking strategies to improve alignment quality for ITS1 or ITS2?

Any recent examples or best practices for ITS phylogeny within QIIME 2 would be very helpful!

Thank you in advance for your insights,

Ivan

Hi @scilexenko,

ghost-tree has not been maintained for a few years now.

Alignments and phylogenies built directly on ITS can be problematic. This is done for closely related species, but across the full fungal kingdom the trees will be very noisy: ITS is non-coding and hypervariable, so it is challenging to align and cannot be used for estimating evolutionary distance. So if you actually want a phylogeny, you should probably use another marker, like the LSU rRNA gene, instead of ITS.

However, for diversity analyses you are not interested in the phylogeny per se, just in a phylogeny-aware diversity metric. One option is to use k-mer composition as a pseudo-phylogenetic metric, which correlates quite well with UniFrac distance; see more discussion here and my associated publication:

If your question is diversity, not actually retracing evolutionary distance, then kmerizer should scratch that itch. Kmer frequency profiles are also compatible inputs for different diversity metrics as well as downstream methods like ML classifiers, so can open up other analytical possibilities as well.
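To illustrate the general idea, here is a minimal Python sketch of k-mer frequency profiles used as a stand-in for a phylogeny-aware distance. This is not q2-kmerizer's implementation: the function names, the toy sequences, the choice of k = 4, and the use of Bray-Curtis are all assumptions made for illustration only.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping k-mers in a single sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def sample_kmer_profile(asv_abundances, asv_seqs, k):
    """Sum k-mer counts over a sample's ASVs, weighted by ASV abundance."""
    profile = Counter()
    for asv_id, abundance in asv_abundances.items():
        for kmer, n in kmer_counts(asv_seqs[asv_id], k).items():
            profile[kmer] += n * abundance
    return profile


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two k-mer count profiles."""
    keys = set(p) | set(q)
    num = sum(abs(p[key] - q[key]) for key in keys)
    den = sum(p.values()) + sum(q.values())
    return num / den if den else 0.0


# Hypothetical toy data: asv1 and asv2 differ by one base, asv3 is unrelated.
asv_seqs = {"asv1": "ACGTACGT", "asv2": "ACGTACGA", "asv3": "TTTTGGGG"}
profile_a = sample_kmer_profile({"asv1": 10}, asv_seqs, k=4)
profile_b = sample_kmer_profile({"asv2": 10}, asv_seqs, k=4)
profile_c = sample_kmer_profile({"asv3": 10}, asv_seqs, k=4)

d_ab = bray_curtis(profile_a, profile_b)  # samples with similar sequences
d_ac = bray_curtis(profile_a, profile_c)  # samples with unrelated sequences
print(d_ab, d_ac)
```

Samples whose ASVs share most of their k-mers end up close together even when they share no ASVs, which is the property that makes this behave somewhat like a phylogeny-aware metric without needing an alignment or a tree.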

I hope that helps!


Hi @Nicholas_Bokulich

Thank you for the swift reply! The tool you are suggesting is indeed interesting when the phylogeny is not the core of the analysis.

However, in my case it is important to have the phylogenetic tree, since the bin-based model used to calculate the assembly mechanisms of the fungal community is built from it. This is the framework of the iCAMP package ( GitHub - DaliangNing/iCAMP1: Infer Community Assembly Mechanisms by Phylogenetic bin-based null model analysis (Version 1) ).
Since the phylogenetic tree is a central element of the analysis, I wanted to be sure that the one calculated in QIIME 2 is accurate (which, it appears, it would not be).

That being said, I’m not sure that q2-kmerizer is suitable for this analysis.

I’d appreciate hearing your thoughts if you have further suggestions.

Thank you,

Ivan

Hi Ivan,

Okay so you do want a tree! No, q2-kmerizer will not be suitable for the iCAMP package in that case. But you could, e.g., map to a full-length ITS database like eukaryome (which is also an option with RESCRIPt starting in release 2025.10, by the way). Then build a tree off of the LSU sequences in eukaryome.
