KeyError when attempting phylo-RPCA with Gemelli

Thanks, @colinvwood. Hi @NathanStewart,

Thanks for using Gemelli and for posting this error!

The issue is with the way the taxonomy is formatted. I had assumed QIIME 2 required a rank-order delimiter on import, but it looks like it does not. I will open an issue on Gemelli to raise a more descriptive error when the rank delimiter is missing (the delimiter is needed to determine the LCA along the phylogeny).
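As a quick way to tell whether a taxonomy is affected, here is a minimal sketch that flags taxon strings whose levels lack a rank prefix such as g__. The function name has_rank_prefixes and the example strings are hypothetical and not part of Gemelli or QIIME 2; the prefix pattern is an assumption based on the rank letters used in this thread.

```python
import re

# Assumed prefix pattern: one rank letter followed by two underscores, e.g. "g__".
# "k" is included because some databases use k__ (kingdom) instead of d__ (domain).
RANK_PREFIX = re.compile(r"^[dkpcofgs]__")


def has_rank_prefixes(taxon: str) -> bool:
    """Return True if every ';'-separated level starts with a rank prefix."""
    levels = [level.strip() for level in taxon.split(";") if level.strip()]
    return bool(levels) and all(RANK_PREFIX.match(level) for level in levels)


# Hypothetical examples: the first lacks prefixes, the second has them
print(has_rank_prefixes("Bacteria;Firmicutes;Bacilli"))             # False
print(has_rank_prefixes("d__Bacteria; p__Firmicutes; c__Bacilli"))  # True
```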

To fix this for your taxonomy, the following Python code will do the job. I have also attached the revised taxonomy here: taxonomy_fixed.qza (8.0 KB)

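import pandas as pd
import qiime2 as q2

# Rank prefixes, in order: domain, phylum, class, order, family, genus, species
RANK_ORDER = ['d', 'p', 'c', 'o', 'f', 'g', 's']

# Load the original taxonomy and view it as a DataFrame
taxonomy = q2.Artifact.load('taxonomy.qza')
taxonomy_df = taxonomy.view(q2.Metadata).to_dataframe()

# Prefix each level with its rank (e.g. 'g__'), then append an empty species level
taxonomy_df['Taxon'] = ['; '.join([r + '__' + t_r for t_r, r in zip(t.split(';'), RANK_ORDER)]) + '; s__'
                        for t in taxonomy_df.Taxon.values]

# Re-import as FeatureData[Taxonomy] and save
taxonomy_fixed = q2.Artifact.import_data('FeatureData[Taxonomy]', taxonomy_df)
taxonomy_fixed.save('taxonomy_fixed.qza')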
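
To see what the list comprehension does to a single entry without loading an artifact, here is a minimal sketch in plain Python. The helper name add_rank_prefixes and the sample taxon are made up for illustration; the logic mirrors the snippet above.

```python
RANK_ORDER = ["d", "p", "c", "o", "f", "g", "s"]


def add_rank_prefixes(taxon: str) -> str:
    """Prefix each ';'-separated level with its rank, then append an empty species level."""
    levels = taxon.split(";")
    # zip stops at the shorter sequence, so levels beyond RANK_ORDER are dropped
    prefixed = [rank + "__" + level for level, rank in zip(levels, RANK_ORDER)]
    return "; ".join(prefixed) + "; s__"


# Hypothetical genus-level entry with no rank prefixes
print(add_rank_prefixes("Bacteria;Firmicutes;Bacilli;Lactobacillales;Lactobacillaceae;Lactobacillus"))
# d__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus; s__
```

Note that '; s__' is appended unconditionally, so an input that already has seven levels would end with a second, empty species entry.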

I was able to run the command with the new taxonomy, but let me know if you run into any other problems.

On a related note, this command works better (more accurate LCA) with GG2, so it might be worth trying it for generating your phylogeny/taxonomy.

Thanks!

Cameron
