Default citations for a plugin action (function, transformer, etc)

Consider this example (using an older version of Qiime2 as I was testing something):

$ qiime dada2  --citations
% use `qiime tools citations` on a QIIME 2 result for complete list

@article{key0,
 author = {Callahan, Benjamin J and McMurdie, Paul J and Rosen, Michael J and Han, Andrew W and Johnson, Amy Jo A and Holmes, Susan P},
 doi = {10.1038/nmeth.3869},
 journal = {Nature methods},
 number = {7},
 pages = {581},
 publisher = {Nature Publishing Group},
 title = {DADA2: high-resolution sample inference from Illumina amplicon data},
 volume = {13},
 year = {2016}
}

However, this surprised me:

$ qiime dada2 denoise-paired --citations
No citations found.

Looking at q2-dada2/q2_dada2/plugin_setup.py at dev · qiime2/q2-dada2 · GitHub the plugin does define citations:

plugin = qiime2.plugin.Plugin(
    name='dada2',
    ...
    citations=qiime2.plugin.Citations.load('citations.bib', package='q2_dada2')
)

However, none of its actions registered via plugin.methods.register_function explicitly do so. I would have expected the default here to be inherit the plugin level citations, rather than default to no citations.

Currently qiime2/qiime2/plugin/plugin.py at dev · qiime2/qiime2 · GitHub does explicitly set the default to no citations:

        if citations is None:
            citations = ()
        else:
            citations = tuple(citations)

I think this would change the default behaviour (argument left as citations=None) to work as I expected:

        if citations is None:
            citations = self.citations
        else:
            citations = tuple(citations)

Would this change be welcome? I'm happy to submit a pull request.

1 Like

Hi Peter!

Thank you for your patience while we work on this. Some of our principal devs are out right now...

Once we figure out how we want to implement this, would you be interested in drafting a PR?

2 Likes

Potentially - looking at the code afresh today, class PluginActions is a subclass of dict, meaning its subclasses PluginMethods, PluginVisualizers, and PluginPipelines do not have easy access to the parent Plugin instance. This might require caching the default citations alongside recording the plugin ID:

class PluginActions(dict):
    def __init__(self, plugin):
        self._plugin_id = plugin.id
        self.citations = plugin.citations  # proposal
        super().__init__()

On the other hand, my idea does look easy to apply to the Plugin class' methods register_formats, register_formats, register_transformer.

1 Like

Hi @peterjc,

This is a good discussion, and thanks for looking so closely at the code!

You are totally correct here in terms of mechanics. And the reason the Action objects (Method, Visualizer, Pipeline) don't contain the plugin reference (causing citations to need to be "cached" or injected in register_*) is that we make sure that these objects can be pickled for multiprocessing. The reference-cycle between actions and plugins makes that difficult to handle well, so we tend to just cut the reference at the action. (At some point we moved to using dill inside of the pickle protocol so this reason may not even matter anymore.)

I would anticipate sending the plugin citations forward into the action object would be the most reliable, but we would probably want a new property for it (like plugin_citation) so that an interface (or documentation) can distinguish the default citation from the action's citation.

Although perhaps that's too much effort for too little gain? Adding the plugin citation to the list would immediately just work without any interface changes at all. Open to thoughts here.


I think there's also an open question here as to whether the interfaces are doing the right thing in the first place by only looking at the action's citations. For example, should q2cli be walking up the "citeable tree"? In principle it understands that actions come from plugins and so reporting plugin citations makes a lot of sense in --citations as you note.

This might be a better approach when you think about view/transformer citations. The plugin may be exposing formats or views which it did not individually develop, and so a default citation for the plugin that occurs whenever that view is loaded would makes less sense. (You could imagine that I made a plugin which registers a pandas view, tacking my plugin's citation onto each use of that pandas view feels odd, although maybe defensible if it's an unusual enough layout.)


Another perspective is that the citations inside the provenance of a given result will contain the actually used path (of transformers, actions, and nested actions in pipelines) and so that's usually the better place to look for the end user.

I do agree that --citations really should almost always show the plugin's citation in addition though, this definitely feels like an oversight on our part.

1 Like

That alternative idea of "walking the tree" sounds simpler :+1:

@ebolyen, @peterjc - should we get an issue opened on this in the framework issue tracker. I agree that at the very least we should get the plugin citation included in the output associated with an action.

1 Like

I created an issue for this here.

1 Like