Development of q2-pathway plugin

Hi,

I have recently made a q2-pathway plugin, which is capable of:

  • Inference of functional profiles (gene family abundance) from 16S rRNA gene sequencing data
  • Comparison and summarization of multiple functional profiles including those from q2-sapienns and q2-picrust2
  • Gene set enrichment analysis based on user-specified ranking of gene family abundances
  • Visualization of biological pathway information

The GitHub repository for the package is:

The package is still under development, and I'm seeking feedback from this active community.
There is surely much room for improvement, and I would be grateful if you could try the plugin and send bug reports or feature suggestions.

Also, I would like to ask the following specific question regarding the development.

Currently, to use Tax4Fun2 inference, users must provide the path to the reference database bundled with the Tax4Fun2 library. However, the plugin development guide indicates that this should be avoided. I am considering adding an action named install, registered as a visualizer, that downloads the database and library and installs them in a predefined location. I would be grateful if you could suggest a better way to handle this situation.

Hi @nsato, Thanks for sharing q2-pathway - I look forward to trying it out once I get a grant proposal that I'm working on behind me!

Ideally you will create a new Artifact Class for the database (I'm guessing you've seen the docs on this already, but see here if not). Once you've done this, you can obtain the path when viewing an Artifact of that class as a DirectoryFormat subtype.

Here's an example that works in an environment based on the q2-dwq2 example plugin from the book.

First, let's load an Artifact of type SingleDNASequence. (Here's that artifact: query.qza; 6.3 KB.)

In [1]: import qiime2
In [2]: a = qiime2.Artifact.load('./query.qza')

Then, let's import the DirectoryFormat that we want to view it as. When defining your Action, this would be the view type that you associate with the input, for example in place of DNA here.

In [3]: from q2_dwq2 import SingleRecordDNAFASTADirectoryFormat

Internal to your Action, you would then have a DirectoryFormat view of the Artifact - the same as if we call view on it as follows:

In [4]: d = a.view(SingleRecordDNAFASTADirectoryFormat)

You can then get the path to provide to your Tax4Fun2 call as follows:

In [5]: d.path
Out[5]: InPath('/var/folders/1t/w4ys4pks4q5d5_kl7svl4t080000gn/T/qiime2/jgc/data/43edeffd-28f4-452d-a3a3-7903df851cd1/data') 
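
As a rough sketch of that last step, the directory path can be converted to a string and handed to whatever launches Tax4Fun2. The function name `build_tax4fun2_command`, the script name `run_tax4fun2.R`, and its flags are hypothetical placeholders, not part of Tax4Fun2 or QIIME 2; only the idea of passing the DirectoryFormat's path along is taken from the session above.

```python
from pathlib import Path


def build_tax4fun2_command(reference_dir, otu_fasta, output_dir):
    """Build an argument list for a hypothetical Tax4Fun2 wrapper script.

    reference_dir stands in for the `.path` of the DirectoryFormat view
    (for example, `d.path` in the session above).
    """
    reference_dir = Path(reference_dir)
    if not reference_dir.is_dir():
        raise FileNotFoundError(
            f"reference database not found: {reference_dir}"
        )
    # 'run_tax4fun2.R' and its flags are placeholders for illustration only.
    return [
        "Rscript", "run_tax4fun2.R",
        "--reference", str(reference_dir),
        "--otus", str(otu_fasta),
        "--output", str(output_dir),
    ]
```

Inside an Action, the returned list could then be passed to `subprocess.run(cmd, check=True)`, so the user never has to type the database path by hand.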

Does that get you what you need?

Once you get that working, you might next wonder if you can avoid the overhead of unzipping that database every time you use it. For this, the QIIME 2 artifact cache is key. We're working on some new docs that will cover this. In the meantime, @Oddant1, can you recommend the best resource for seeing how to use an Artifact Cache?

@nsato this tutorial provides some basic instructions on what the artifact cache is and how to use it. We should have more complete documentation soon. It can do a lot more than what is currently in that basic tutorial.

It should allow you to use your large artifacts without so much overhead from zipping and unzipping them and moving them around on your file system.
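
To illustrate where that overhead comes from: a .qza file is a zip archive, so viewing it as a directory normally means extracting it first. The sketch below is not the QIIME 2 Artifact Cache API. It is a standard-library illustration of the general idea, with a hypothetical helper `extract_once` that unzips an archive only the first time it is seen and reuses the extracted directory afterwards.

```python
import hashlib
import zipfile
from pathlib import Path


def extract_once(archive_path, cache_dir):
    """Unzip archive_path into cache_dir unless it was extracted before."""
    archive_path = Path(archive_path)
    # Key the cache entry on the archive's content, hashing in chunks so
    # that a large database is never read into memory all at once.
    digest = hashlib.sha256()
    with open(archive_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    target = Path(cache_dir) / digest.hexdigest()
    if not target.exists():
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(target)
    return target
```

The second and later calls with the same archive return immediately with the existing directory, which is the kind of saving a cache provides for a large reference database.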

Dear @gregcaporaso,

Thank you very much for your reply, and I am very happy that you are going to try the plugin when you have time.

Thank you for the detailed instructions on creating a new Artifact Class and viewing it as a DirectoryFormat inside the function. I will implement a new class based on your guidance and the documentation you referred to, and then update the plugin.

Also, thank you for letting me know about the Artifact Cache functionality. Given the large file size, I will definitely use this feature.

Dear @Oddant1,

Thank you very much for pointing me to the Artifact Cache tutorial post. I will definitely use this feature and report back if I have any questions.