Should I be using QIIME 2 while it's in alpha?

gregcaporaso · March 6, 2017, 8:31pm

This topic is outdated and has been superseded by an updated topic describing the status of QIIME 2: QIIME 2 Project Status

Original contents of this topic:

We've recently had users asking whether they should be using QIIME 2, given that we currently describe it as being in "alpha" release. Here are our thoughts on the issue, which requires a little bit of discussion about the philosophy behind QIIME 2.

QIIME 2 is composed of different parts: the framework, the plugins, and the interfaces. The framework handles things like tracking provenance, checking semantic types, and working with QIIME 2 artifacts. The plugins provide all of the microbiome analysis methods and visualizations. The interfaces provide the different ways that users interact with QIIME 2. The framework, plugins, and interfaces are all independent software packages, potentially written and maintained by different authors who are not necessarily part of the QIIME 2 development team.

The QIIME 2 framework is currently in alpha release stage. This means that some things may change, such as how parallel computing is configured, how metadata is formatted, or what information is tracked inside of an artifact. It also means that some things are not currently possible, but will be in the future (for example, it is not currently possible for a method to take an optional artifact as input). And finally, it means that you may experience issues with some aspects of the framework. We expect that framework issues will result in things like failed commands, or files that can't be read. We don't expect that framework issues will generate data that are scientifically invalid. It is extremely helpful for us to get feedback on any framework issues that you experience.

The QIIME 2 plugins, where the microbiome analysis functionality is implemented, are in different stages of maturity. Some, such as q2-dada2 (the DADA2 plugin), are thin wrappers around published third-party software and can therefore be expected to generate publication-quality results. Others, like q2-feature-classifier, use experimental methods that haven't yet been published. A good way to tell these cases apart is to check whether there are citations associated with a plugin. You can do this for individual plugins:

$ qiime dada2 --help
Usage: qiime dada2 [OPTIONS] COMMAND [ARGS]...

  Plugin website: http://benjjneb.github.io/dada2/

  Getting user support: To get help with DADA2, post to the DADA2 issue
  tracker: https://github.com/benjjneb/dada2/issues

  Citing this plugin: DADA2: High-resolution sample inference from Illumina
  amplicon data. Benjamin J Callahan, Paul J McMurdie, Michael J Rosen,
  Andrew W Han, Amy Jo A Johnson, Susan P Holmes. Nature Methods 13, 581–583
  (2016) doi:10.1038/nmeth.3869.
...

$ qiime feature-classifier --help
Usage: qiime feature-classifier [OPTIONS] COMMAND [ARGS]...

  Plugin website: https://github.com/qiime2/q2-feature-classifier

  Getting user support: No user support information available. See plugin
  website: https://github.com/qiime2/q2-feature-classifier

  Citing this plugin: No citation available.
...

You can also see the citations associated with all of your installed plugins by running qiime info --citations.

This citation check isn't perfect, because we're still improving our citation support while the framework is in alpha release. If you have questions about the reliability of a specific plugin or method, please post to the QIIME 2 Forum and we'll be happy to help out.
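If you want to apply the citation check to several plugins at once, it can be scripted. The following is only a rough sketch: has_citation and report are hypothetical helper names (not part of QIIME 2), and it assumes that a plugin without a citation prints the exact phrase "No citation available", as in the q2-feature-classifier output above. The sample text is copied from the two help outputs shown earlier.

```shell
# Sketch only: has_citation and report are hypothetical helpers, not QIIME 2 commands.
# Assumption: plugins without a citation print "Citing this plugin: No citation available".

has_citation() {
  # Read plugin help text on stdin.
  help_text=$(cat)
  # Fail if there is no citation line at all (for example, empty input).
  printf '%s\n' "$help_text" | grep -q 'Citing this plugin:' || return 1
  # Succeed only if the placeholder phrase is absent.
  ! printf '%s\n' "$help_text" | grep -q 'Citing this plugin: No citation available'
}

report() {
  # $1 = label to print, $2 = help text to inspect
  if printf '%s\n' "$2" | has_citation; then
    echo "$1: citation found"
  else
    echo "$1: no citation listed"
  fi
}

# Sample lines copied from the help outputs shown in this post.
dada2_help='  Citing this plugin: DADA2: High-resolution sample inference from Illumina'
classifier_help='  Citing this plugin: No citation available.'

report dada2 "$dada2_help"
report feature-classifier "$classifier_help"
```

With QIIME 2 installed, the same function could be fed real output, for example qiime dada2 --help | has_citation. A missing citation only means the plugin deserves a closer look, not that its results are wrong.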

Like the QIIME 2 framework, the plugins and interfaces are in alpha release. Again, we don't expect this to result in scientifically invalid results. Rather, plugin or interface issues are most likely to show up as commands that fail with an error message instead of generating a result, or as commands that change names between releases. For example, between QIIME 2 versions 2.0.6 and 2017.2, the command qiime dada2 denoise was renamed qiime dada2 denoise-single.
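If you maintain scripts that need to survive a rename like this one, one option is to check which command name your installed release lists before calling it. This is a minimal sketch, not an official recommendation: pick_denoise_command is a hypothetical helper, and it assumes that the plugin help text lists each available command name at the start of an indented line. The two sample lines are made-up stand-ins, not real QIIME 2 output.

```shell
# Sketch only: pick_denoise_command is a hypothetical helper, not a QIIME 2 command.
# Assumption: plugin help text lists each command name at the start of a line,
# optionally preceded by whitespace.

pick_denoise_command() {
  # Read plugin help text on stdin and print the matching command name.
  help_text=$(cat)
  if printf '%s\n' "$help_text" | grep -Eq '^[[:space:]]*denoise-single([[:space:]]|$)'; then
    echo denoise-single   # name used from 2017.2 onward, per the post
  elif printf '%s\n' "$help_text" | grep -Eq '^[[:space:]]*denoise([[:space:]]|$)'; then
    echo denoise          # name used in 2.0.6, per the post
  else
    return 1              # neither name is listed
  fi
}

# Made-up stand-ins for help output (illustrative, not copied from QIIME 2).
old_help='  denoise         (description elided)'
new_help='  denoise-single  (description elided)'

printf '%s\n' "$old_help" | pick_denoise_command
printf '%s\n' "$new_help" | pick_denoise_command
```

With QIIME 2 installed, the input would come from qiime dada2 --help. Recording the QIIME 2 version alongside your results is a simpler safeguard and works regardless of how the help text is laid out.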

The take-away message is that it is safe to start using QIIME 2. As with any data analysis pipeline, though, you should pay attention to what is happening at each step and understand the level of uncertainty associated with it. If you identify steps that you think carry more uncertainty (such as the q2-feature-classifier steps at the time of this writing), you should be more skeptical of those results and consider running a related tool to confirm them (in this case, perhaps the RDP classifier).