Making a rooted SILVA tree

I am trying to run phylogenetic diversity analyses on data that were processed with the SILVA database (release 132). I downloaded what I think is the correct SILVA tree file (https://ftp.arb-silva.de/release_132/Exports/taxonomy/tax_slv_ssu_132.tre), imported it as unrooted, and then ran qiime phylogeny midpoint-root, but I still get the “tree must be rooted” error when I run the phylogenetic diversity analyses. Any idea where I may be going wrong in trying to get a rooted SILVA tree? Any suggestions would be appreciated! :slight_smile:

Hey there @c.older, it is a bit hard for us to help you without the commands you ran and the complete error messages you received. Can you help us help you by replying with that info? Thanks so much! :qiime2:

Here you go:

qiime tools import --input-path tax_slv_ssu_132.tre --output-path unrooted-SILVA-tree.qza --type 'Phylogeny[Unrooted]'
Imported tax_slv_ssu_132.tre as NewickDirectoryFormat to unrooted-SILVA-tree.qza
qiime diversity core-metrics-phylogenetic --i-table taxa_filtered/filtered_table.qza --i-phylogeny unrooted-SILVA-tree.qza --p-sampling-depth 76819 --m-metadata-file ../FIV_ging_mapping_12122018_condensed_UTF.txt --p-n-jobs 8 --output-dir core_div_08092019
Usage: qiime diversity core-metrics-phylogenetic [OPTIONS]

Applies a collection of diversity metrics (both phylogenetic and non-
phylogenetic) to a feature table.

Inputs:
--i-table ARTIFACT FeatureTable[Frequency]
The feature table containing the samples over which
diversity metrics should be computed. [required]
--i-phylogeny ARTIFACT Phylogenetic tree containing tip identifiers that
Phylogeny[Rooted] correspond to the feature identifiers in the table.
This tree can contain tip ids that are not present
in the table, but all feature ids in the table must
be present in this tree. [required]
Parameters:
--p-sampling-depth INTEGER
Range(1, None) The total frequency that each sample should be
rarefied to prior to computing diversity metrics.
[required]
--m-metadata-file METADATA...
(multiple arguments The sample metadata to use in the emperor plots.
will be merged) [required]
--p-n-jobs INTEGER [beta/beta-phylogenetic methods only, excluding
Range(0, None) weighted_unifrac] - The number of jobs to use for
the computation. This works by breaking down the
pairwise matrix into n-jobs even slices and
computing them in parallel. If -1 all CPUs are used.
If 1 is given, no parallel computing code is used at
all, which is useful for debugging. For n-jobs below
-1, (n_cpus + 1 + n-jobs) are used. Thus for n-jobs
= -2, all CPUs but one are used. (Description from
sklearn.metrics.pairwise_distances) [default: 1]
Outputs:
--o-rarefied-table ARTIFACT FeatureTable[Frequency]
The resulting rarefied feature table. [required]
--o-faith-pd-vector ARTIFACT SampleData[AlphaDiversity]
Vector of Faith PD values by sample. [required]
--o-observed-otus-vector ARTIFACT SampleData[AlphaDiversity]
Vector of Observed OTUs values by sample. [required]
--o-shannon-vector ARTIFACT SampleData[AlphaDiversity]
Vector of Shannon diversity values by sample.
[required]
--o-evenness-vector ARTIFACT SampleData[AlphaDiversity]
Vector of Pielou's evenness values by sample.
[required]
--o-unweighted-unifrac-distance-matrix ARTIFACT
DistanceMatrix Matrix of unweighted UniFrac distances between
pairs of samples. [required]
--o-weighted-unifrac-distance-matrix ARTIFACT
DistanceMatrix Matrix of weighted UniFrac distances between pairs
of samples. [required]
--o-jaccard-distance-matrix ARTIFACT
DistanceMatrix Matrix of Jaccard distances between pairs of
samples. [required]
--o-bray-curtis-distance-matrix ARTIFACT
DistanceMatrix Matrix of Bray-Curtis distances between pairs of
samples. [required]
--o-unweighted-unifrac-pcoa-results ARTIFACT
PCoAResults PCoA matrix computed from unweighted UniFrac
distances between samples. [required]
--o-weighted-unifrac-pcoa-results ARTIFACT
PCoAResults PCoA matrix computed from weighted UniFrac
distances between samples. [required]
--o-jaccard-pcoa-results ARTIFACT
PCoAResults PCoA matrix computed from Jaccard distances between
samples. [required]
--o-bray-curtis-pcoa-results ARTIFACT
PCoAResults PCoA matrix computed from Bray-Curtis distances
between samples. [required]
--o-unweighted-unifrac-emperor VISUALIZATION
Emperor plot of the PCoA matrix computed from
unweighted UniFrac. [required]
--o-weighted-unifrac-emperor VISUALIZATION
Emperor plot of the PCoA matrix computed from
weighted UniFrac. [required]
--o-jaccard-emperor VISUALIZATION
Emperor plot of the PCoA matrix computed from
Jaccard. [required]
--o-bray-curtis-emperor VISUALIZATION
Emperor plot of the PCoA matrix computed from
Bray-Curtis. [required]
Miscellaneous:
--output-dir PATH Output unspecified results to a directory
--verbose / --quiet Display verbose output to stdout and/or stderr
during execution of this action. Or silence output
if execution is successful (silence is golden).
--citations Show citations and exit.
--help Show this message and exit.

                There was a problem with the command:

(1/1) Invalid value for "--i-phylogeny": Expected an artifact of at least
type Phylogeny[Rooted]. An artifact of type Phylogeny[Unrooted] was
provided.
qiime phylogeny midpoint-root --i-tree unrooted-SILVA-tree.qza --o-rooted-tree actually_rooted_SILVA_tree
Saved Phylogeny[Rooted] to: actually_rooted_SILVA_tree.qza
qiime diversity core-metrics-phylogenetic --i-table taxa_filtered/filtered_table.qza --i-phylogeny actually_rooted_SILVA_tree.qza --p-sampling-depth 76819 --m-metadata-file ../FIV_ging_mapping_12122018_condensed_UTF.txt --p-n-jobs 8 --output-dir core_div_08092019
Plugin error from diversity:

tree must be rooted.

Debug info has been saved to /state/partition1/tmp/c.older/qiime2-q2cli-err-znrk7nk0.log

I attempted this again, partly to see whether I was just unlucky on Friday (I used the exact same code), and partly to re-create the issue and actually get access to the debug info in the temp file (it was being written to a folder where it would be immediately deleted). I’ve pasted it below in case it is helpful!

/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/sklearn/metrics/pairwise.py:1575: DataConversionWarning: Data was converted to boolean for metric jaccard
  warnings.warn(msg, DataConversionWarning)
Traceback (most recent call last):
  File "/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/q2cli/commands.py", line 327, in __call__
    results = action(**arguments)
  File "</data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/decorator.py:decorator-gen-390>", line 2, in core_metrics_phylogenetic
  File "/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
    output_types, provenance)
  File "/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/qiime2/sdk/action.py", line 477, in _callable_executor_
    outputs = self._callable(scope.ctx, **view_args)
  File "/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/q2_diversity/_core_metrics.py", line 55, in core_metrics_phylogenetic
    metric='faith_pd')
  File "</data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/decorator.py:decorator-gen-481>", line 2, in alpha_phylogenetic
  File "/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
    output_types, provenance)
  File "/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/qiime2/sdk/action.py", line 383, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/q2_diversity/_alpha/_method.py", line 54, in alpha_phylogenetic
    tree=phylogeny)
  File "/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/skbio/diversity/_driver.py", line 170, in alpha_diversity
    counts, otu_ids, tree, validate, single_sample=False)
  File "/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/skbio/diversity/alpha/_faith_pd.py", line 136, in _setup_faith_pd
    _validate_otu_ids_and_tree(counts[0], otu_ids, tree)
  File "/data/apps/miniconda/envs/qiime2-2019.7/lib/python3.6/site-packages/skbio/diversity/_util.py", line 79, in _validate_otu_ids_and_tree
    raise ValueError("``tree`` must be rooted.")
ValueError: ``tree`` must be rooted.
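
One possible reading of the traceback: the error is raised inside scikit-bio's `_validate_otu_ids_and_tree`, not by QIIME 2's type check, so a `Phylogeny[Rooted]` artifact can still fail it. The sketch below assumes that check is a heuristic on how many children the root node has (more than two being treated as unrooted); that is an assumption and is not confirmed in this thread. `count_root_children` and `looks_rooted` are hypothetical helpers in plain Python, not part of QIIME 2 or scikit-bio.

```python
def count_root_children(newick: str) -> int:
    """Count the direct children of the root node in a Newick string."""
    text = newick.strip()
    start = text.find("(")
    if start == -1:
        return 0  # a lone tip (or empty string) has no children

    depth = 0
    children = 1
    in_quote = False
    for ch in text[start:]:
        if ch == "'":
            in_quote = not in_quote  # skip punctuation inside quoted labels
        elif in_quote:
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                break  # the root's parentheses are closed
        elif ch == "," and depth == 1:
            children += 1  # a comma at depth 1 separates root children
    return children


def looks_rooted(newick: str) -> bool:
    # Assumed heuristic: a root with one or two children passes,
    # a root with three or more children is treated as unrooted.
    return 1 <= count_root_children(newick) <= 2


bifurcating = "((a:1,b:1):1,(c:1,d:1):1);"
trifurcating = "(a:1,b:1,(c:1,d:1):1);"
print(count_root_children(bifurcating), looks_rooted(bifurcating))
print(count_root_children(trifurcating), looks_rooted(trifurcating))
```

If the exported Newick file from the rooted artifact reports three or more root children, that would be consistent with the error above; it would not by itself explain why midpoint rooting left the tree that way.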

Sorry for another update, but wanted to keep trying to get around this issue!

I tried out the qiime-formatted 99% OTU SILVA 132 tree, which I imported and rooted, then attempted the core diversity analyses again. This time I didn’t get the “tree must be rooted” error, so I guess the rooting issue is gone, although I do not understand why… I do now get the “…features ids must be present as tip names…” error, which makes sense, since my tree has the SILVA OTU identifiers as tips and not my feature ids.
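
The feature-id mismatch can be checked before running the full pipeline by comparing the table's feature ids with the tree's tip names. This is a minimal sketch in plain Python, assuming both sets of ids have already been exported to lists; `missing_from_tree` and the example ids are hypothetical and only illustrate the comparison.

```python
def missing_from_tree(feature_ids, tip_names):
    """Return the feature ids that have no matching tip name, sorted."""
    return sorted(set(feature_ids) - set(tip_names))


# Hypothetical ids: hash-like feature ids from a denoised table
# versus accession-like tip names from a reference tree.
feature_ids = ["4b5eeb30", "fe30ff0f", "AB001438.1.1662"]
tip_names = ["AB001438.1.1662", "AB006051.1.1796"]

missing = missing_from_tree(feature_ids, tip_names)
print(f"{len(missing)} of {len(feature_ids)} feature ids are not tips: {missing}")
```

An empty result would mean every feature id is present as a tip, which is what the phylogenetic metrics require; extra tips in the tree are fine.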

Clearly I didn’t think this all the way through - still pretty green with qiime 2 :slight_smile:

Since I’d rather stick with SILVA 132, I guess my only option is de novo tree construction, since fragment-insertion/sepp is not currently compatible with this version of SILVA :frowning:

Is that correct or am I missing some other method I could use that would be more reliable?

:+1: align-to-tree-mafft-fasttree: Build a phylogenetic tree using fasttree and mafft alignment — QIIME 2 2019.7.0 documentation


This sounds like a good feature to add! Let's see what the devs recommend on using SEPP with Silva
@Stefan @wasade

I don't think this is actually the case: q2-fragment-insertion uses Greengenes (gg) by default, but you are able to pass in your own tree and aligned sequences.

It is important to note that SILVA uses parsimony insertion in ARB for its phylogeny, whereas SEPP was benchmarked against a phylogeny derived from a multiple sequence alignment. It is plausible that SEPP will perform differently on SILVA than on Greengenes.

A de novo phylogeny from read fragments can lead to bad trees that can impact results. Caution may be warranted if reconstructing a phylogeny from amplicon data. This issue is explored in further detail in Janssen et al.

Best,
Daniel

The SEPP program https://anaconda.org/bioconda/sepp and bioconda package https://anaconda.org/bioconda/sepp now support alternative reference packages (phylogeny, alignment, info file). We also have Silva 13.2 files compatible with SEPP: https://github.com/smirarab/sepp-refs/tree/master/silva

Unfortunately, the qiime2 plugin does not yet expose the necessary parameters and does not source the bioconda package. I am working on this, but could use some help with proper Qiime2 integration: https://github.com/qiime2/q2-fragment-insertion/pull/32

But please take Daniel’s comment seriously: It has never been benchmarked and we don’t know if we/you are able to tell that results are off.

Best,
Stefan

Hi @Stefan,
The github link you provided looks to be for Silva 12.8 files and not 13.2. As per this discussion here, the 13.2 files appear to still be ‘under construction’ (holding my breath for that btw!). Is this correct? I bring this up because the inquiry in this thread was about the 13.2 version.
walks back into the hedges :evergreen_tree:

Thank you all for responding and providing the Janssen et al pub.
Glad you all are also interested in making this a possibility in qiime2, I look forward to this addition!
For now, I’ll definitely take these words of caution seriously as I move forward with analysis.
