Error when running midpoint-root on imported tree

I am doing my first downstream analysis on metagenomic WGS data. I classified the taxonomy with kraken2 (custom database, all refseqs) and calculated species abundances with bracken.

My idea was to use QIIME 2 to calculate UniFrac distances, since it provides nice plotting features. I obtained the phylogenetic tree in Newick format from the kraken database using this script.

I tried to import this tree to qiime2 using the following commands:

qiime tools import --input-path ncbi_taxonomy.newick --type 'Phylogeny[Unrooted]' --output-path tax_tree.qza
qiime phylogeny midpoint-root --i-tree tax_tree.qza --o-rooted-tree tax_tree_rooted.qza

…and I get the following error:

Traceback (most recent call last):
  File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 2418, in get_max_distance
    self._set_max_distance()
  File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 2359, in _set_max_distance
    raise TreeError("No support for single descedent nodes")
skbio.tree._exception.TreeError: No support for single descedent nodes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/q2cli/commands.py", line 274, in call
    results = action(**arguments)
  File "", line 2, in midpoint_root
  File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 362, in callable_executor
    output_views = self._callable(**view_args)
  File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_phylogeny/_util.py", line 13, in midpoint_root
    return tree.root_at_midpoint()
  File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 862, in root_at_midpoint
    max_dist, tips = tree.get_max_distance()
  File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 2420, in get_max_distance
    return self._get_max_distance_singledesc()
  File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 2376, in _get_max_distance_singledesc
    distmtx = self.tip_tip_distances()
  File "/opt/conda/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/tree/_tree.py", line 2501, in tip_tip_distances
    result = np.zeros((num_tips, num_tips), float) # tip by tip matrix
MemoryError

I verified that the input tree is in valid Newick format. I was running the command on a machine with more than 300 GB of RAM, so memory shouldn't be an issue. Am I getting something wrong? Does QIIME 2 require phylogenetic trees to have certain features? I would be glad if someone could give me ideas or guidelines on how to get my data into QIIME 2.
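The last frames of the traceback hint at where the memory goes: after the TreeError, scikit-bio falls back to tip_tip_distances, which allocates a dense num_tips x num_tips float matrix. Below is a rough sketch of what that costs, assuming 8 bytes per value (NumPy's default float) and ignoring every other allocation. The tip count of the tree is not stated in the post, so the counts used here are purely illustrative and the function names are made up for this example.

```python
import math


def tip_matrix_bytes(num_tips, bytes_per_value=8):
    """Bytes needed for a dense num_tips x num_tips matrix of floats."""
    return num_tips * num_tips * bytes_per_value


def max_tips_for_ram(ram_bytes, bytes_per_value=8):
    """Largest tip count whose dense matrix still fits into ram_bytes."""
    return math.isqrt(ram_bytes // bytes_per_value)


# Illustrative tip counts only; the real tree size is unknown.
for n in (10_000, 100_000, 1_000_000):
    print(n, "tips ->", tip_matrix_bytes(n) / 10**9, "GB")

# Upper bound for a machine with 300 GB of RAM (matrix alone, no overhead).
print("max tips in 300 GB:", max_tips_for_ram(300 * 10**9))
```

Because the matrix grows with the square of the tip count, a tree with a few hundred thousand tips can exceed 300 GB from this one allocation alone.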

Hi @josmos,
It sounds like the tree being generated by kraken has a messed-up topology and scikit-bio is getting upset. Is this tree built on 16S rRNA gene data in the kraken database, or just on whatever your input is? You should use a tree visualization tool to check out the topology, and possibly prune out these single-descendant nodes.
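To make "single-descendant nodes" concrete, here is a minimal sketch in plain Python, not the scikit-bio implementation. It counts internal nodes that have exactly one child and collapses them by adding their branch length to the child. The Node class, the function names, and the toy tree are all hypothetical and exist only for illustration. Note that collapsing drops the label of the removed node, which may matter for a taxonomy tree where internal labels are taxa.

```python
class Node:
    """Toy tree node (hypothetical, for illustration only)."""

    def __init__(self, name=None, length=0.0, children=None):
        self.name = name
        self.length = length
        self.children = children or []


def count_single_descendant(node):
    """Count nodes in the subtree that have exactly one child."""
    own = 1 if len(node.children) == 1 else 0
    return own + sum(count_single_descendant(c) for c in node.children)


def collapse(node):
    """Remove single-child nodes, adding their length to the child."""
    children = [collapse(c) for c in node.children]
    if len(children) == 1:
        child = children[0]
        child.length += node.length  # keep root-to-tip distances unchanged
        return child                 # the single-child node's label is lost
    node.children = children
    return node


def to_newick(node):
    """Serialize a subtree to Newick (without the trailing semicolon)."""
    label = node.name or ""
    if node.children:
        inner = ",".join(to_newick(c) for c in node.children)
        label = "(" + inner + ")" + label
    return f"{label}:{node.length:g}"


# Toy tree: X has a single child Y, so X is a single-descendant node.
tree = Node("root", 0.0, [
    Node("A", 1.0),
    Node("X", 1.0, [Node("Y", 2.0, [Node("B", 1.0), Node("C", 1.0)])]),
])

before = count_single_descendant(tree)
collapsed = collapse(tree)
after = count_single_descendant(collapsed)
print(before, after)
print(to_newick(collapsed) + ";")
```

On a real tree, scikit-bio's TreeNode.prune() is documented to remove internal nodes with only one child, so it may be the more practical route; that call is not tested here.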

A tree is required for UniFrac, but that is pretty much it: none of the plotting features require UniFrac distances. So you can just use non-phylogenetic distance metrics and generate all the same interactive plots.

I am aware that phylogenetic distance metrics are not required, but I still want to investigate them.

The error does not come up if I filter the tree with the filter_tree.py script from QIIME (although I am getting different errors now). Does QIIME 2 provide similar functionality? It seems it is only possible to filter the OTU table by tree IDs, not vice versa.

No, QIIME 2 does not have a similar function. You can filter the tree prior to importing, then use the tree to filter your feature table with qiime fragment-insertion filter-features.
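As a rough illustration of what "filter the tree prior to importing" means, the sketch below shears a toy tree down to a set of tip IDs and collapses any node left with a single child. It is topology-only (branch lengths and internal labels are ignored). The nested-tuple representation, the function names, and the IDs are assumptions made up for this example and are not part of QIIME 2 or scikit-bio.

```python
def shear(tree, keep):
    """Keep only tips named in `keep`.

    `tree` is either a tip name (str) or a tuple of subtrees.
    Returns the sheared tree, or None if no tip survives.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [s for s in (shear(c, keep) for c in tree) if s is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]  # avoid leaving a single-descendant node behind
    return tuple(kept)


def to_newick(tree):
    """Serialize the nested-tuple tree to Newick (no trailing semicolon)."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(c) for c in tree) + ")"


# Hypothetical tip IDs; in practice these would be the feature table IDs.
full_tree = (("A", "B"), ("C", ("D", "E")))
feature_ids = {"A", "D", "E"}

sheared = shear(full_tree, feature_ids)
print(to_newick(sheared) + ";")
```

For real data, scikit-bio's TreeNode.shear(names) performs this kind of filtering on a parsed Newick tree and keeps branch lengths, so it is probably the better tool; the sketch only shows the idea.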