Great to see QIIME 2 shaping up! I wanted to start a discussion about database management for upcoming plugins.
Background: Reference databases (most often nucleotide) are necessary for many reference-based analyses, including open- and closed-reference OTU picking and some forms of taxonomy assignment. These databases may also include auxiliary components such as phylogenetic trees and multiple sequence alignments.
Question(s): How would distributing reference databases work within the plugin system? Would it be good to make a version-control-like “database manager” plugin that works with a bunch of mainstream tools?
NINJA-OPS, for instance, doesn’t depend on a particular database, but so far it has conventionally shipped outside of QIIME with the GreenGenes 13.8 reference database. Users seeking other databases might wander onto the NINJA-OPS web site, discover UNITE, SILVA, etc., and try to set them up with their standalone NINJA-OPS installation. Plugin distribution is a more elegant way to install packages like this: it takes the burden of installation juggling out of users’ hands, and users are not expected to know precisely where and how the plugin framework handles these things under the hood. It would make sense for there to be a database plugin (either specific to NINJA-OPS or aware of multiple plugins), along with a (centralized?) location to house these often massive databases for automatic installation.
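To make the idea a bit more concrete, here is a rough sketch of the kind of "database manager" I have in mind. It is purely illustrative: `DatabaseRecord`, `DatabaseManager`, the `fetch` callable, the cache layout, and the `db.bin` file name are all hypothetical names I made up, and none of this uses or reflects the actual QIIME 2 plugin API. The only point is that a registry of (name, version) entries with checksums would let a plugin install a database on demand and reuse it afterwards.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class DatabaseRecord:
    """One installable database release (hypothetical structure)."""
    name: str       # e.g. "greengenes"
    version: str    # e.g. "13.8"
    url: str        # where the central host would serve it from (assumed)
    sha256: str     # expected checksum of the downloaded payload


class DatabaseManager:
    """Installs registered databases into a local cache, verifying checksums."""

    def __init__(self, cache_dir: str, fetch: Callable[[str], bytes]):
        # `fetch` maps a URL to raw bytes; it is injected so that the
        # download mechanism (HTTP, local mirror, ...) stays an open question.
        self.cache_dir = Path(cache_dir)
        self.fetch = fetch
        self.registry: Dict[Tuple[str, str], DatabaseRecord] = {}

    def register(self, record: DatabaseRecord) -> None:
        self.registry[(record.name, record.version)] = record

    def install(self, name: str, version: str) -> Path:
        """Return the local path of the database, downloading it if needed."""
        record = self.registry[(name, version)]
        target = self.cache_dir / name / version / "db.bin"

        # Reuse the cached copy only if it still matches the checksum.
        if target.exists():
            cached = hashlib.sha256(target.read_bytes()).hexdigest()
            if cached == record.sha256:
                return target

        data = self.fetch(record.url)
        if hashlib.sha256(data).hexdigest() != record.sha256:
            raise ValueError(f"checksum mismatch for {name} {version}")

        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        return target
```

A plugin such as NINJA-OPS could then ask for `manager.install("greengenes", "13.8")` and get back a local path without the user ever handling the download. A real version would need to stream large files instead of holding them in memory, and would need some way to describe auxiliary files such as trees and alignments.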
What would be the best way to accomplish this under the current system? Or are there (or should there be) better alternatives than hijacking the plugin system?