Open FeatureData in Artifact API

Hi,

I am trying to write a small python script that process MiSeq samples through Qiime 2 with Artifact API and plugins to generate taxonomy data. The steps are as follow :
cutadapt -> import in Qiime 2 -> DADA2 -> classifier (classify_sklearn) -> output

Question 1 : Is it possible to access the FeatureData[Taxonomy] Artifact data (UUID, taxon and confidence level) at the classification step directly in python ? Something like when we generate the TSV file from the visualization but within the script.

Question 2 : I tried to export the taxonomy visualization with the following python code :
metadata.visualizers.tabulate(input=classification).visualization.save(os.path.join(directory['Qiime'] + "/taxonomy.qzv"))

Which I assumed would be the equivalent of :
qiime metadata tabulate --m-input-file taxonomy.qza --o-visualization taxonomy.qzv

However I get the following error :
TypeError: Argument to parameter 'input' is not a subtype of Metadata.

Is there any way to export the taxonomy visualization other than export the artifact and use the above command on subprocess to generate the visualization ?

Question 3 : Is there any Artifact/plugins API tutorial in the documentation ? I found a lot of informations about the CLI but not much about the API (other than the dev. docs and the Artifact API docs).

Thank you

1 Like

Hello @samuel.drouin!

Grabbing the UUID is straightforward (just a prop on the Artifact), while getting the other parts are pretty simple, too, but only if you know to use the view API method (we are lacking good docs in the dept, sorry!)

import qiime2
import pandas as pd

# here I am loading, but this could also be results passed
# down from a previous step.
taxonomy_artifact = qiime2.Artifact.load('taxonomy.qza')
# Get the UUID of the Artifact
print(taxonomy_artifact.uuid)
7224759e-03bb-4179-a382-485165ca8c86

# `view` as a pd.DataFrame to get the taxon and conf.
taxonomy_df = taxonomy_artifact.view(pd.DataFrame)
print(taxonomy_df.head())
                                                                              Taxon  \
Feature ID                                                                            
7fa8f515ee1ac5cb8b529b13e6a89790  k__Bacteria; p__Proteobacteria; c__Gammaproteo...   
40f13b077d41d6837ca6fae65763cf01  k__Bacteria; p__Actinobacteria; c__Actinobacte...   
2e55763e77234a7bfc3a311067a0c7e1  k__Bacteria; p__Firmicutes; c__Clostridia; o__...   
d5ef913e6aee3067001dc610ac8af1d1  k__Bacteria; p__Actinobacteria; c__Actinobacte...   
bc15061b61cf6b5002c58284591f97d4  k__Bacteria; p__Firmicutes; c__Clostridia; o__...   

                                          Confidence  
Feature ID                                            
7fa8f515ee1ac5cb8b529b13e6a89790  0.9998282422613309  
40f13b077d41d6837ca6fae65763cf01  0.9999994725274034  
2e55763e77234a7bfc3a311067a0c7e1  0.9177375883240979  
d5ef913e6aee3067001dc610ac8af1d1  0.9971204554883859  
bc15061b61cf6b5002c58284591f97d4  0.9999262723979453  

You will need to view as qiime2.Metadata, first:

import qiime2
from qiime2.plugins import metadata

taxonomy_artifact = qiime2.Artifact.load('taxonomy.qza')
taxonomy_md = taxonomy_artifact.view(qiime2.Metadata)
taxa_tab, = metadata.visualizers.tabulate(input=taxonomy_md)

Still a bit light in this department (would welcome some contributions!):

Hope that helps! :qiime2: :t_rex:

1 Like

Thank you for your help @thermokarst ! This is exactly what I was looking for.

I would like to contribute to the Python API documentation in some way and will probably write some documentation about what I found out while writing the script.

1 Like

Is there any way to know what classes are accepted as view_type for the artifact view function ?

I tried the same thing on the DADA2 representative_sequences : FeatureData[Sequences] and I get the following error :

rep_seq.view(pandas.DataFrame)
Traceback (most recent call last):
File "", line 1, in
File "/Users/samueldrouin/conda/envs/qiime2/lib/python3.5/site-packages/qiime2/sdk/result.py", line 254, in view
return self._view(view_type)
File "/Users/samueldrouin/conda/envs/qiime2/lib/python3.5/site-packages/qiime2/sdk/result.py", line 265, in _view
recorder=recorder)
File "/Users/samueldrouin/conda/envs/qiime2/lib/python3.5/site-packages/qiime2/core/transform.py", line 59, in make_transformation
(self._view_type, other._view_type))
Exception: No transformation from <class 'qiime2.plugin.model.directory_format.DNASequencesDirectoryFormat'> to <class 'pandas.core.frame.DataFrame'>

I am not sure what class I can use to get this class fasta file.

I figured out that I can get the fasta file from the directory file structure but it most certainly is not the good way to do it :

rep_seq._archiver.root_dir
PosixPath('/var/folders/gh/9m466jhj1zn1jtk8w114w4dr0000gn/T/qiime2-archive-p_4lnjwb/fda193f1-e60c-475b-a0a8-a1cb01f14925')

That is a great question @samuel.drouin! We really need to implement a convenience API for this, but in the meantime, you can do something like this to learn more about what you can directly transform to:

import qiime2

pm = qiime2.sdk.PluginManager()
rep_seqs = qiime2.Artifact.load('/path/to/rep-seqs.qza')
transform_to = pm.transformers[rep_seqs.format.file.format].keys()
print('\n'.join([str(t) for t in transform_to]))

which returns:

<class 'pandas.core.series.Series'>
<class 'qiime2.metadata.metadata.Metadata'>
<class 'q2_types.feature_data._transformer.DNAIterator'>

Sorry for the inconvenience, hope that helps! :t_rex: :qiime2:

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.