viewing provenance from command line to show details available from view.qiime2.org

devonorourke · September 11, 2020, 7:46pm

Apologies if this is well documented and I'm just low on and ...

If I have a sequence artifact file that was processed through a few steps in QIIME, I can upload that .qza file to view.qiime2.org and then click on the various dots that represent the data transformations. It would look something like the screenshot below.

I'm trying to get those Action Details on the right of the image, but was trying to avoid having to download the 200 MB file to my desktop, only to then upload it to view.qiime2.org. Is there any way to get these same data from the command line? If I were to run qiime tools export, I believe all I'd get in this case is a folder with a fasta sequence file and no provenance text file, right?

Many thanks!

thermokarst · September 11, 2020, 7:57pm

Ah ha, prepare to have your mind blown: q2view doesn't upload data anywhere.

All q2view is is a simple viewer that happens to be in a browser. Ain't no way Q2HQ is prepared to pay for bandwidth fees for the terabytes and terabytes of data that people look at in q2view.

From the docs (https://view.qiime2.org/about):

QIIME 2 View (or q2view for short) is an entirely client-side interface for viewing QIIME 2 artifacts and visualizations (.qza/.qzv files respectively). This means that you do not need to have a working QIIME 2 installation to inspect QIIME 2 results. It also means that the files you provide are not sent beyond your browser. In other words, this entire site functions without a server (which makes it very inexpensive to operate).

So with the above in mind, your statement here:

should be rephrased as

As for getting the same data from the command line: yes, you can, but it isn't in quite such a nice layout:

https://docs.qiime2.org/2020.8/tutorials/exporting/#exporting-versus-extracting

You can learn a bit more about the extracted contents here:

https://dev.qiime2.org/latest/storing-data/archive/
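
For anyone who wants just the Action Details text without extracting anything, here is a minimal Python sketch that reads the provenance records straight out of the .qza, which is an ordinary zip archive. It assumes the layout described in the archive docs linked above: a single top-level UUID directory containing provenance/action/action.yaml, with ancestor records under provenance/artifacts/<uuid>/action/action.yaml. The function name provenance_actions is made up for illustration and is not part of QIIME 2.

```python
import sys
import zipfile


def provenance_actions(qza_path):
    """Map each provenance action.yaml path in the archive to its text.

    Assumed layout (not verified against every QIIME 2 release):
      <uuid>/provenance/action/action.yaml
      <uuid>/provenance/artifacts/<ancestor-uuid>/action/action.yaml
    """
    actions = {}
    with zipfile.ZipFile(qza_path) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            in_provenance = len(parts) > 2 and parts[1] == "provenance"
            if in_provenance and parts[-2:] == ["action", "action.yaml"]:
                # Only this small YAML member is read; data/ is never touched.
                actions[name] = archive.read(name).decode("utf-8")
    return actions


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print every action record found in the archive given on the command line.
    for path, text in provenance_actions(sys.argv[1]).items():
        print("==", path)
        print(text)
```

Saved as a script (any file name works), this could be run with the path to the artifact, e.g. my_seqs.qza, as its only argument. The output is the raw YAML, so it is the same information q2view shows but without the graph layout.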

TLDR: no upload means no file-size penalty!

As part of our NCI award (QIIME 2 is now funded by the National Cancer Institute and lots of exciting new things are coming!), @ChrisKeefe is leading the development of a new provenance parser, so hopefully some exciting new changes will come in the next few months/quarters.


devonorourke · September 14, 2020, 1:05pm

Fantastic - thank you for the detailed reply. I completely missed extract from the qiime tools help menu.

One minor feature request: would it be possible to incorporate some type of additional flag to qiime tools extract, so that the extracted information is limited to all things except the feature data itself? In other words, what I'd like to do is simply obtain the provenance information about the file, not the file itself. So passing something like --p-provenance-only below, would result in an output directory with all things except the data subdirectory/contents:

qiime tools extract --input-path my_seqs.qza --p-provenance-only

The reason I ask is for those instances where I have some kind of giant file or database, and all I want is to know how the data was processed (not what the file contains). Perhaps this is way too niche a use case (as often seems to happen to me!?).

Thanks!

thermokarst · September 21, 2020, 3:13pm

Sounds like an interesting idea to me, thanks for sharing @devonorourke!

cc @ebolyen

ebolyen · September 21, 2020, 5:38pm

Hey @devonorourke, I think that's a good idea!

Since it sounds like you run into this often enough, you can also use normal zip utilities, most all of which have mechanisms to extract only parts of a zip file (you may need to change the extension to .zip, but that doesn't really change anything about the data).
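
As one example of the zip-utility route, here is a minimal sketch using Python's standard zipfile module to extract everything except the data subdirectory. It assumes the archive has a single top-level directory with data/ directly beneath it; extract_without_data is a hypothetical helper name, not a QIIME 2 command. Python's zipfile opens the archive by content, so no renaming to .zip is needed here.

```python
import zipfile


def extract_without_data(qza_path, output_dir):
    """Extract every member except those under <top-level-dir>/data/.

    Returns the list of member names that were written to output_dir.
    """
    extracted = []
    with zipfile.ZipFile(qza_path) as archive:
        for info in archive.infolist():
            parts = info.filename.split("/")
            # Skip the (potentially huge) payload: <uuid>/data/...
            if len(parts) > 1 and parts[1] == "data":
                continue
            archive.extract(info, output_dir)
            extracted.append(info.filename)
    return extracted
```

Calling extract_without_data("my_seqs.qza", "my_seqs_provenance") would leave a directory holding the provenance and metadata files only, which is roughly what the proposed provenance-only flag would produce.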

system · October 22, 2020, 11:38pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.