Changing tip labels on rooted-tree.qza from sequence identifiers to SILVA taxa annotations using R

My question relates to the rooted-tree.qza created in QIIME 2 2019.4, and how to change the tip labels from sequence IDs from DADA2 to taxa annotations, using R.

I have exported rooted-tree.qza to newick format:

qiime tools export \
    --input-path rooted-tree.qza \
    --output-path exported-tree

…and loaded it into R using the ape package:

library(ape)
phy <- read.tree(file = "tree.nwk")

Additionally, I have exported and modified the taxonomy.qza created with the SILVA classifier:

qiime tools export --input-path taxonomy.qza --output-path exp-tree
echo $'LABELS\nSEPARATOR TAB\nDATA' > taxonomy.txt
# the export writes taxonomy.tsv into exp-tree/
sed "1d" exp-tree/taxonomy.tsv | cut -f1,2 >> taxonomy.txt

I then loaded taxonomy.txt into R as a data frame (newtips), as seen below:

1  677dc4d80ebf1de5f0abff920ff79a3e  D_0__Archaea;D_1__Euryarchaeota;D_2__Methanomicrobia;D_3__Methanosarcinales;D_4__Methanosarcinaceae;D_5__Methanolobus
2  6bf1a63f1d1b1de9479fb487edb3c077  D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Enterobacteriales;D_4__Enterobacteriaceae;D_5__Escherichia-Shigella
3  1170dd578d8b706ee98d951cc6c6e6b0  D_0__Bacteria;D_1__Cyanobacteria;D_2__Oxyphotobacteria;D_3__Chloroplast
4  7059e70b32a6e91148afa4a361a2f2ef  D_0__Bacteria;D_1__Cyanobacteria;D_2__Oxyphotobacteria;D_3__Chloroplast
5  96d7fe52dfcf9a158701479bd8af75e0  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Bacteroidales;D_4__Bacteroidetes vadinHA17
6  ff232c0997800a1e4b231ac8856f682a  D_0__Bacteria;D_1__Proteobacteria;D_2__Deltaproteobacteria;D_3__Syntrophobacterales;D_4__Syntrophaceae;D_5__Syntrophus

My goal is to match the sequence IDs on my tree (object named phy) to the annotations data frame (newtips). I have attempted to replace the IDs using join(), as seen below:

phytips <- phy$tip.label
relabeled.phy <- join(newtips, phytips)

…but this has been unsuccessful. I have also annotated my tree using iTOL, but the exported Newick file is saved as .txt, which poses an additional difficulty. I am open to any suggestions.

I don’t think you can simply replace the sequence IDs with taxonomy annotations, since the mapping is not one-to-one: several ASVs may be assigned to the same taxonomy, so the tip labels would no longer be unique. But you can still combine the two.
When I needed a tree with taxonomy annotations, I exported the biom table, the taxonomy file and rep-seq.qza, modified the IDs in the same way in all three, imported them back and rebuilt the tree from those files. Be aware that taxonomy strings contain numerous symbols that may not be allowed by the plugin used for tree construction.
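
One possible way to combine them, sketched in shell under a few assumptions: the new label is the deepest assigned rank plus the first 8 characters of the feature ID (an arbitrary choice to keep labels unique), and every character other than letters, digits, `_`, `.` and `-` is replaced by `_`. The `demo-labels` directory, the two-column toy taxonomy file and the toy tree with its branch lengths are made up for illustration; only the IDs and taxa are taken from the table in the question.

```shell
#!/bin/sh
set -eu
mkdir -p demo-labels

# Toy two-column stand-in for the exported taxonomy.tsv (header, then ID<TAB>Taxon).
{
    printf 'Feature ID\tTaxon\n'
    printf '1170dd578d8b706ee98d951cc6c6e6b0\tD_0__Bacteria;D_1__Cyanobacteria;D_2__Oxyphotobacteria;D_3__Chloroplast\n'
    printf '7059e70b32a6e91148afa4a361a2f2ef\tD_0__Bacteria;D_1__Cyanobacteria;D_2__Oxyphotobacteria;D_3__Chloroplast\n'
    printf '96d7fe52dfcf9a158701479bd8af75e0\tD_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Bacteroidales;D_4__Bacteroidetes vadinHA17\n'
} > demo-labels/taxonomy.tsv

# Toy stand-in for the exported tree.nwk; branch lengths are placeholders.
printf '((1170dd578d8b706ee98d951cc6c6e6b0:0.1,7059e70b32a6e91148afa4a361a2f2ef:0.2):0.05,96d7fe52dfcf9a158701479bd8af75e0:0.3);\n' > demo-labels/tree.nwk

# Build the map: old ID <TAB> new label.
awk -F '\t' 'NR > 1 {
    n = split($2, ranks, ";")            # ranks[n] is the deepest assigned rank
    label = ranks[n]
    sub(/^D_[0-9]+__/, "", label)        # drop the SILVA rank prefix
    gsub(/[^A-Za-z0-9_.-]/, "_", label)  # replace symbols that could break Newick
    printf "%s\t%s_%s\n", $1, label, substr($1, 1, 8)
}' demo-labels/taxonomy.tsv > demo-labels/id-map.tsv

# Stop if two features ended up with the same label.
dups=$(cut -f2 demo-labels/id-map.tsv | sort | uniq -d)
if [ -n "$dups" ]; then
    echo "duplicate labels: $dups" >&2
    exit 1
fi

# Replace every old ID in the tree text with its new label.
awk -F '\t' '
    NR == FNR { new[$1] = $2; next }     # first file: load the map
    { for (id in new) gsub(id, new[id]); print }
' demo-labels/id-map.tsv demo-labels/tree.nwk > demo-labels/tree-relabeled.nwk

cat demo-labels/tree-relabeled.nwk
```

The plain text substitution is only reasonable here because the IDs are 32-character hashes that will not appear as part of another label. The same `id-map.tsv` could be reused for the table, taxonomy and sequences so that all files agree.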


This was incredibly helpful. Thank you for addressing my concerns!

I have successfully aligned the sequence IDs with the annotations within R, yet I am having difficulty importing my transformed rep-seqs.txt file back into QIIME. I have converted it to a json .biom format file, yet what “importable type” should I be using? I have tried the following:

qiime tools import --input-path rep-seq-transformed.biom --type 'FeatureData[AlignedSequence]' --input-format BIOMV210Format --output-path rep-seqs-transformed.qza

and received the following error:

An unexpected error has occurred:

No transformation from <class 'q2_types.feature_table._format.BIOMV210Format'> to <class 'qiime2.plugin.model.directory_format.AlignedDNASequencesDirectoryFormat'>

I would appreciate any suggestions you may have.

You don’t need to convert it to a biom file: rep-seq.qza holds sequences, so the file you import should be in FASTA format.

qiime tools import \
    --input-path fasta_file \
    --output-path modified-rep-seqs.qza \
    --type 'FeatureData[Sequence]'

The steps I performed:

  1. Exported table.qza as a biom file, converted the biom to .tsv, modified the IDs in the .tsv file, converted it back to biom and imported it as table.qza.
  2. Exported rep-seq.qza, modified the sequence IDs in the same way as in step 1, and imported the result back as rep-seq.qza.
  3. Exported taxonomy.qza as a .tsv table, modified the IDs in the same way as in the first two steps, and imported it back.
  4. Built a tree in qiime2.
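
Step 2 could be sketched in shell roughly as follows, assuming you already have a two-column, tab-separated map of old ID to new ID. The `demo-relabel` directory, the file name `id-map.tsv`, the new labels and the toy sequences are all made up for illustration; `dna-sequences.fasta` is the file name that exporting rep-seq.qza produces.

```shell
#!/bin/sh
set -eu
mkdir -p demo-relabel

# Hypothetical map: old ID <TAB> new ID.
{
    printf '1170dd578d8b706ee98d951cc6c6e6b0\tChloroplast_1170dd57\n'
    printf '7059e70b32a6e91148afa4a361a2f2ef\tChloroplast_7059e70b\n'
} > demo-relabel/id-map.tsv

# Toy stand-in for the exported dna-sequences.fasta (sequences are placeholders).
{
    printf '>1170dd578d8b706ee98d951cc6c6e6b0\n'
    printf 'ACGTACGT\n'
    printf '>7059e70b32a6e91148afa4a361a2f2ef\n'
    printf 'TTGACCAA\n'
} > demo-relabel/dna-sequences.fasta

# Rewrite only the header lines; sequence lines pass through unchanged.
awk -F '\t' '
    NR == FNR { new[$1] = $2; next }       # first file: load the map
    /^>/ {
        id = substr($0, 2)                 # strip the leading ">"
        sub(/[ \t].*$/, "", id)            # drop any description after the ID
        print ">" ((id in new) ? new[id] : id)   # unmapped IDs are kept as-is
        next
    }
    { print }
' demo-relabel/id-map.tsv demo-relabel/dna-sequences.fasta > demo-relabel/relabeled.fasta

cat demo-relabel/relabeled.fasta
```

The relabeled file can then be passed to the `qiime tools import` command above with `--type 'FeatureData[Sequence]'`. The IDs in the table and taxonomy files need to be changed with the same map, otherwise the artifacts will no longer match each other.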

Thank you. This has been very helpful!