Accessing index.html from a QIIME 2 Visualization in the Artifact API

Hi, all. I'm using songbird through the Artifact API and I have a question that I think applies broadly to Visualizations.

What I would like to do is retrieve the HTML itself from a Visualization in Python (i.e. the contents of data/index.html). As an example, songbird paired_summary returns a visualization that I can check in a notebook.

For my purposes, I am primarily interested in extracting the Q^2 score. My current solution is something like this:

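import os
import zipfile

from qiime2 import Visualization

# Save the visualization to a temporary .qzv (a zip archive) on disk
tmp_file = "tmp.qzv"
paired_summary_results.visualization.save(tmp_file)
tmp_viz = Visualization.load(tmp_file)
uuid = str(tmp_viz.uuid)
html_file = f"{uuid}/data/index.html"

# Read index.html straight out of the archive (returns bytes)
with zipfile.ZipFile(tmp_file) as myzip:
    with myzip.open(html_file) as myfile:
        text = myfile.read()

q2 = parse_html_for_q2(text)
os.remove(tmp_file)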

Ideally, I would like to do this without the use of any temporary files that I save to disk. Is this something I can easily do?


Hey @gibsramen!

Yes you can! I cannot promise this won’t break someday (as it is private), but for any of our archive objects you can do this:

my_viz._archiver.data_dir

That will give you a pathlib.Path object pointing at the visualization's data directory. If you need a boring old string, then just use str to coerce it.
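As a rough sketch of how that path can be used, the helper below reads index.html from the data directory. The name read_index_html and the fake_viz stand-in are made up for illustration (the stand-in only exists so the snippet runs without QIIME 2 installed); the only QIIME 2 detail it relies on is the private _archiver.data_dir attribute mentioned above, which may change.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def read_index_html(viz):
    """Return the text of index.html from a visualization's data directory.

    Relies on the private `_archiver.data_dir` attribute, so this may
    break in a future release.
    """
    # Path() accepts either a Path or a str, so this works in both cases.
    data_dir = Path(viz._archiver.data_dir)
    return (data_dir / "index.html").read_text(encoding="utf-8")


# Hypothetical stand-in for a real Visualization: an object exposing
# `_archiver.data_dir` that points at a directory containing index.html.
_demo_dir = tempfile.TemporaryDirectory()
(Path(_demo_dir.name) / "index.html").write_text(
    "<html>demo</html>", encoding="utf-8"
)
fake_viz = SimpleNamespace(
    _archiver=SimpleNamespace(data_dir=Path(_demo_dir.name))
)

html = read_index_html(fake_viz)
```

With a real Visualization you would pass the object itself (for example the one returned by paired_summary) instead of fake_viz.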


That said, I think this does indicate that some of the data is useful on its own, and perhaps there should be an intermediate artifact, so that more processing can be done (or perhaps alternative visualizations). You might open an issue on the songbird repository requesting such a feature.


Great, thanks so much! :grin:

For reference, this is what is now working for me:

# data_dir is already a pathlib.Path, so it can be joined directly
viz_path = viz._archiver.data_dir / "index.html"
with open(viz_path, "r") as f:
    text = f.read()
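
For anyone who wants to go from the HTML text to a number, here is a minimal sketch of a parsing step. It is an assumption, not songbird's documented output format: it supposes the page contains plain text such as "Q-squared: 0.123", and both the function name extract_q2 and the regular expression are hypothetical, so adjust the pattern to whatever your index.html actually contains.

```python
import re
from typing import Optional, Union

# Hypothetical pattern: assumes the score appears as "Q-squared: <number>".
Q2_PATTERN = re.compile(
    r"Q-squared:?\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def extract_q2(text: Union[str, bytes]) -> Optional[float]:
    """Return the first Q-squared value found in the HTML, or None."""
    # zipfile reads return bytes, while open(..., "r") returns str.
    if isinstance(text, bytes):
        text = text.decode("utf-8")
    match = Q2_PATTERN.search(text)
    return float(match.group(1)) if match else None


example_q2 = extract_q2("<p>Pseudo Q-squared: 0.25</p>")
```

Returning None when nothing matches makes it easy to notice that the page layout differs from the assumed one.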
