Python API semantic type under the hood?

Hello friends!

I am trying to do some work with the Artifact API and I’m missing a key piece of documentation. Is there a place (GitHub repo, forum post, :dove:-and-:coconut: mail system) where I can find a listing of the Python structures that interchange with a given semantic type?

Specifically, I’m trying to figure out how to make a FeatureData[Sequence] type out of a pandas Series where my index is the sequence identifier and the value is the sequence without quality information.

Thanks!
Justine

Hello @jwdebelius, sorry for the slow reply!

Fortunately there is a transformer defined in q2-types that does this:

I only share the snippet above so that you can get a sense of where those kinds of things might be defined, code-wise.
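For a rough feel of what a transformer like this has to do, here is a minimal stand-alone sketch. It is not the q2-types implementation: `series_to_fasta` is a made-up name, and it accepts anything with an `.items()` method that yields (id, sequence) pairs, which covers a `pd.Series` indexed by sequence ID as well as a plain dict.

```python
import io


def series_to_fasta(records):
    """Render id -> sequence pairs as FASTA text.

    `records` is anything with an .items() method yielding
    (id, sequence) pairs, e.g. a pandas Series indexed by sequence ID
    or a plain dict. Hypothetical helper for illustration only;
    it is not part of q2-types or QIIME 2.
    """
    buf = io.StringIO()
    for seq_id, seq in records.items():
        # One header line per record, followed by the sequence itself.
        buf.write(f">{seq_id}\n{seq}\n")
    return buf.getvalue()


# A dict stands in for the Series so the sketch has no dependencies.
fasta_text = series_to_fasta({"f1": "AAAA", "f2": "GGGG"})
print(fasta_text, end="")
```

The real transformer additionally returns a format object (a file on disk managed by the framework) rather than a string, which is why going through the registered transformer is preferable to hand-rolling this.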

Since transformers are defined in a plugin, but then registered for use within an entire deployment of QIIME 2, it is usually easiest to look up the transformer from an instance of the sdk.PluginManager:

# Rather than importing an individual plugin, we
# can get information about the _entire_ environment
# by using the PluginManager.
from qiime2 import sdk, Artifact
from qiime2.plugin import util

import pandas as pd


# This instance of the PluginManager will be
# our "Virgil" on this descent into madness.
pm = sdk.PluginManager()

# We need to build a better interface for this, but in
# the meantime this should work... I am looking up
# the view type that you want to go _from_.
for fmt in pm.transformers[pd.Series].keys():
    print(fmt.__name__)
print()

# Okay, so it looks like we have a transformer:
# from pd.Series -> DNAFASTAFormat
# Let's use it!
data = pd.Series(['AAAA', 'GGGG'], index=['f1', 'f2'])

fasta_format_record = pm.formats['DNAFASTAFormat']
fasta_file = util.transform(data, to_type=fasta_format_record.format)

with fasta_file.open() as fh:
    print(fh.read())

# Two important things to note here:
# 1) The exercise of "searching" for transformers
#    above isn't a requirement for just trying to
#    invoke a transformer.
# 2) This example just uses QIIME 2 machinery, without
#    actually creating a QIIME 2 result (Artifact).

# Alternatively, if you need an Artifact, you can just import
# the series, invoking the transformer automatically

artifact = Artifact.import_data('FeatureData[Sequence]',
                                data, view_type=pd.Series)

print(artifact)

Stdout:

FirstDifferencesFormat
AlphaDiversityFormat
TSVTaxonomyFormat
DNAFASTAFormat
BooleanSeriesFormat
PredictionsFormat

>f1
AAAA
>f2
GGGG

<artifact: FeatureData[Sequence] uuid: 96be024e-6a1d-44d6-a814-540e1294e0d9>
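If you want to double-check FASTA text like the output above without going back through QIIME 2, a small parser is enough. This is a sketch under simple assumptions (plain FASTA, ID taken as the first token of the header line); `parse_fasta` is a hypothetical helper, not something QIIME 2 provides.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict of id -> sequence (hypothetical helper)."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the ID.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may wrap, so append to the current record.
            records[current_id] += line
    return records


parsed = parse_fasta(">f1\nAAAA\n>f2\nGGGG\n")
print(parsed)
```

Passing the resulting dict to `pd.Series(parsed)` rebuilds a Series with the sequence IDs as the index. Within QIIME 2 itself, `artifact.view(pd.Series)` should do the same job, assuming the reverse transformer is registered in your deployment.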

We are currently working on making modifications to the SDK to allow for a more streamlined transformer lookup (maybe landing in 2020?). Also, our CZI-funded Library work aims to expose information (on a plugin level) about what formats, types, and transformers are registered.
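In the meantime, the lookup used in the example boils down to walking a nested mapping of source type to target type to transformer. Here is a toy model of that idea. It assumes nothing about QIIME 2 internals beyond what `pm.transformers[pd.Series].keys()` suggests, and every name in it (`list_targets`, the stand-in classes, `toy_registry`) is invented for illustration.

```python
def list_targets(transformers, from_type):
    """Return the sorted names of the types `from_type` can be transformed to."""
    return sorted(t.__name__ for t in transformers.get(from_type, {}))


# Stand-in classes, purely illustrative (not QIIME 2 types).
class SeriesLike:
    pass


class FastaFormatLike:
    pass


class TaxonomyFormatLike:
    pass


# Toy registry shaped as: source type -> {target type -> transformer}.
toy_registry = {
    SeriesLike: {
        FastaFormatLike: lambda data: "fasta",
        TaxonomyFormatLike: lambda data: "tsv",
    },
}

print(list_targets(toy_registry, SeriesLike))
```

With a real deployment you would pass `pm.transformers` and `pd.Series` instead of the toy objects, assuming the registry keeps the shape seen in the example.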

Hope that helps!

Thanks @thermokarst! I need to process a little bit more, but might be back with more questions.

Best,
Justine
