Python API: complete usage examples

I’m comfortable using QIIME 2 from the command line, and I have some ancient Python scripts that use 'subprocess' to run qiime commands. However, I want to modernize the scripts we are using and was wondering if there is more extensive documentation on the API.

I’m lost on how to connect all the pieces together, as in:

from urllib import request

from qiime2 import Artifact

url = 'https://gut-to-soil-tutorial.readthedocs.io/en/latest/data/gut-to-soil/demux.qza'
fn = 'demux.qza'
request.urlretrieve(url, fn)
demux = Artifact.load(fn)

# followed by 
import qiime2.plugins.demux.actions as demux_actions

demux_viz, = demux_actions.summarize(
    data=demux,
)

but then…

  • How do I extract the tables from a qza object like demux into something like a pandas DataFrame?
  • How do I get the visualization out of the demux_viz object (for example, to display it in an HTML page or to use the https://view.qiime2.org/?src= paradigm)?
  • etc.

if someone can point me in the right direction, I would appreciate it


Hi @fenny!

I am really excited that you would like to use the Python API directly. I do think that’s the most pleasant way to use QIIME 2 (but I am probably biased).

You’ve got the nuts and bolts of it, but what you are most likely missing is the following:

Extracting tables as a pandas DataFrame: this one won’t quite work for demux, because there isn’t necessarily a transformer from that format to a DataFrame, but in general you can do this:

import pandas as pd

my_artifact = Artifact.load('example.qza')
df = my_artifact.view(pd.DataFrame)

Sometimes a pd.Series view is defined instead. Unfortunately, for these object views the simplest way to know is to just try it and see, or to look for matches in q2-types. I would like to document those someday, when MyST does a better job of API generation/linking. There aren’t really all that many object views: pd.DataFrame/pd.Series covers a lot of cases, and otherwise there’s probably an skbio object which is a better fit, so ordination objects, distance matrices, and trees are the other typical forms you might see.
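The "just try it and see" approach can be wrapped in a small helper. This is only a sketch: supported_views is a made-up name, not part of QIIME 2, and it assumes that a missing transformer shows up as an exception raised by view().

```python
def supported_views(artifact, candidates):
    """Return the candidate view types that `artifact.view()` accepts.

    `artifact` is expected to be a loaded qiime2.Artifact and `candidates`
    an iterable of Python types such as pd.DataFrame or pd.Series.
    This helper is hypothetical; it is not part of QIIME 2.
    """
    working = []
    for view_type in candidates:
        try:
            artifact.view(view_type)
        except Exception:
            # Assumption: a missing transformer surfaces as an exception,
            # so any failure is treated as "this view is not available".
            continue
        working.append(view_type)
    return working


# Typical use (requires qiime2 and pandas to be installed):
#   import pandas as pd
#   from qiime2 import Artifact
#   my_artifact = Artifact.load('example.qza')
#   print(supported_views(my_artifact, [pd.DataFrame, pd.Series]))
```

Catching every exception is deliberately blunt, so a view that fails for some unrelated reason (a corrupt file, say) will also be reported as unavailable.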

Getting the visualization out: you’ll probably want to call my_viz.save('my_viz'). But you could also call my_viz.get_index_paths()['html'] if you wanted to serve the data or fish around locally for a file. Please know that visualizations do not have a stable structure, so be prepared for internal data to change without any special notice.
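For the https://view.qiime2.org/?src= route from the question, a minimal sketch could look like the following. It assumes the saved .qzv has been uploaded somewhere the viewer can reach over HTTP; viewer_link and the example.org address are made up for illustration.

```python
from urllib.parse import quote

VIEWER = 'https://view.qiime2.org/'


def viewer_link(qzv_url):
    """Build a view.qiime2.org link for a .qzv hosted at `qzv_url`.

    Hypothetical helper: the hosted URL is percent-encoded and passed
    through the `src` query parameter.
    """
    return VIEWER + '?src=' + quote(qzv_url, safe='')


# Typical use (requires qiime2; the hosting location is an assumption):
#   demux_viz.save('demux_summary')   # writes a .qzv file to disk
#   ...upload that file to a publicly reachable URL...
#   print(viewer_link('https://example.org/demux_summary.qzv'))
```

The resulting link can be used as the href of an anchor or the src of an iframe in your own HTML page.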

Metadata API: you’ll need this eventually, since it’s how merging, filtering, etc. happen. I find it pretty convenient to pair get_ids() with filter_ids().
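As a rough illustration of pairing get_ids() with filter_ids(), here is a sketch that trims one metadata object down to the IDs it shares with another. restrict_to_shared_ids is a made-up helper name, and the file names in the usage comment are placeholders.

```python
def restrict_to_shared_ids(metadata, other):
    """Keep only the IDs in `metadata` that also appear in `other`.

    Both arguments are expected to be qiime2.Metadata objects.
    This helper is hypothetical; it is not part of QIIME 2.
    """
    # get_ids() returns a set of IDs, so `&` gives the intersection.
    shared = metadata.get_ids() & other.get_ids()
    if not shared:
        raise ValueError('The two metadata objects share no IDs.')
    # filter_ids() returns a new object restricted to the given IDs.
    return metadata.filter_ids(shared)


# Typical use (requires qiime2; file names are placeholders):
#   from qiime2 import Metadata
#   sample_md = Metadata.load('sample-metadata.tsv')
#   other_md = Metadata.load('other-metadata.tsv')
#   trimmed = restrict_to_shared_ids(sample_md, other_md)
```

Taking the intersection first avoids asking filter_ids() for IDs that the metadata does not contain.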

I also assume you’ve seen the usage examples in our tutorials with the “Python API” tab.

Something else that’s worth getting working if you can is the Jupyter repr, which can show you visualizations inline, although it currently only works with an older Jupyter (updates are pending a few other misc things on our side).

Also, when in doubt, type . and hit Tab. For the most part, if you can import it from the base package qiime2, then it’s fair game to use and we’ll limit any breaking changes. And if you are willing to act more like an interface (which clever enough scripts tend to become after a while), then qiime2.sdk is also of interest (but it may change more rapidly than qiime2 or qiime2.plugin).
