QIIME 2 2023.5 has been released!
Please see the official changelog for more details.
Original content of post
Important
The following is an early developer preview of the changes expected in 2023.5
This post is a live document which will be updated throughout our development cycle. Any links in this topic will be broken until the release is officially published. When we are ready for release, we’ll copy this changelog and create a new post in the Announcements category.
Important Developer Information
Dates (please keep an eye on this post, these might change):
- PRs must be merged by: May 22nd, 2023
- Repo Freeze and Package Building: May 23rd, 2023
- Release Day: May 24th, 2023
Developer Project Board:
QIIME 2 2023.5 Project on Github
Exciting Announcements!
Parsl
QIIME 2 pipelines can now be parallelized via parsl. Huge thanks to @Oddant1 for implementing this!
On the CLI, pass the --parallel flag to parallelize a given pipeline using a basic parsl configuration that should work on most non-HPC systems.
In the Python API, call .parallel on the pipeline to get the same result (e.g. diversity.pipelines.core_metrics.parallel(*args, **kwargs)).
Parsl allows for more detailed configuration for HPC systems. More documentation on how to do this in QIIME 2 may be found in the dev docs linked above.
Pipeline Resumption
QIIME 2 pipelines that fail part way through can now be resumed from their point of failure instead of needing to restart from the beginning. Another huge thanks to @Oddant1 for implementing this!
This behaviour is enabled by default on the CLI. QIIME 2 will create a pool in your default cache (or the cache indicated by the new --use-cache flag on pipelines) that will store all intermediate results from a pipeline that is running and will attempt to reuse the results in this pool should you rerun the pipeline after a failure. The pool will be deleted on pipeline success.
If you want to specify a pool to use (that will not be deleted automatically on pipeline success), provide the --recycle-pool flag followed by a key to be used for the pool in the cache. If you want to opt out of this behavior, pass the --no-recycle flag.
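In the Python API, you must enter a pool using a with statement for it to be used for pipeline resumption, following the usual syntax for working within a pool.
from qiime2 import Cache
from qiime2.plugins import diversity
cache = Cache('cache_path')
# create_pool is called on the Cache instance, not on the class
pool = cache.create_pool('pool', reuse=True)
with pool:
    # intermediate results land in the pool and are reused on a rerun
    diversity.pipelines.core_metrics(*args, **kwargs)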
This will do the exact same thing as:
qiime diversity core-metrics <inputs and params> --use-cache 'cache_path' --recycle-pool 'pool' <outputs>
This way, if you hit your wall time or some other transient error while executing a pipeline, you should not lose all the progress the pipeline made before it failed.
NOTE: If you change any of your inputs or parameters to a pipeline it may not be possible to reuse all of the intermediate results created by the previous run; however, QIIME 2 will still reuse any results not implicated by the changed arguments.
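To make the reuse rule in the note above more concrete, here is a minimal toy sketch in plain Python. It is not how QIIME 2 implements pools: ToyPool, run_step, and toy_pipeline are hypothetical names invented for this illustration, and the assumption is simply that each intermediate result is keyed by its step name and exact arguments.

```python
import hashlib
import json


class ToyPool:
    """In-memory stand-in for a pool of intermediate results (hypothetical)."""

    def __init__(self):
        self.results = {}
        self.computed = []  # names of the steps that actually ran

    def run_step(self, name, func, **params):
        # Key each result by the step name plus its exact arguments.
        payload = json.dumps([name, params], sort_keys=True)
        key = hashlib.sha256(payload.encode()).hexdigest()
        if key not in self.results:
            self.results[key] = func(**params)
            self.computed.append(name)
        return self.results[key]


def toy_pipeline(pool, depth, metric, fail=False):
    # Step 1 depends only on depth; step 2 depends on step 1 and on metric.
    table = pool.run_step('rarefy', lambda depth: list(range(depth)),
                          depth=depth)
    if fail:
        raise RuntimeError('simulated wall-time limit')
    return pool.run_step('distance',
                         lambda table, metric: (metric, sum(table)),
                         table=table, metric=metric)


pool = ToyPool()
try:
    toy_pipeline(pool, depth=4, metric='jaccard', fail=True)
except RuntimeError:
    pass  # the first step's result is already stored in the pool

# The rerun skips 'rarefy' and only computes 'distance'.
first = toy_pipeline(pool, depth=4, metric='jaccard')
# Changing only the metric still reuses the unaffected 'rarefy' result.
second = toy_pipeline(pool, depth=4, metric='braycurtis')
```

In this toy model, pool.computed records one 'rarefy' run and two 'distance' runs across all three calls, which mirrors the idea that only results implicated by a changed argument need to be recomputed.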
Output Collections
It is now possible (also thanks to @Oddant1) to return collections of artifacts as a single output.
On the CLI, output collections will need to be given a directory that does not exist yet (the same as --output-dir). They will create this directory, then write all artifacts to it along with a .order file that simply contains the names of all of the artifacts in the collection in order.
In the Python API, a ResultCollection object will be returned that can be accessed in much the same way as a dictionary, with the addition of a validate method that will run validate on all artifacts in the collection. .save can be called on ResultCollections to save them to disk using the exact same rules as the CLI. ResultCollection.load may be called to load a directory into a ResultCollection object in the same way as a single artifact may be loaded.
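The on-disk layout described above can be sketched with a small plain-Python stand-in. This is only an illustration, not the ResultCollection implementation: save_collection and load_collection are hypothetical helpers, plain .txt files stand in for artifacts, and the one-name-per-line format of the .order file is an assumption.

```python
import tempfile
from pathlib import Path


def save_collection(collection, directory):
    """Write each named item to its own file, plus a .order file."""
    directory = Path(directory)
    directory.mkdir()  # raises FileExistsError if the directory exists
    for name, content in collection.items():
        (directory / f'{name}.txt').write_text(content)
    # Assumed format: one item name per line, in collection order.
    (directory / '.order').write_text('\n'.join(collection) + '\n')


def load_collection(directory):
    """Rebuild the collection, using .order to restore the ordering."""
    directory = Path(directory)
    names = (directory / '.order').read_text().splitlines()
    return {name: (directory / f'{name}.txt').read_text() for name in names}


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'collection'  # must not exist yet
    original = {'second': 'b', 'first': 'a', 'third': 'c'}
    save_collection(original, target)
    loaded = load_collection(target)
    order_on_disk = (target / '.order').read_text().splitlines()
```

The point of the .order file in this sketch is that a directory listing alone would not preserve the order of the items, so the order is recorded explicitly and used when loading.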
q2-quality-control
@jordenrabasco added three new commands for using decontam in QIIME 2:
- decontam-identify - Supports identifying contaminants based on negative controls using either frequency information (quantitative measures) or feature prevalence in controls.
- decontam-score-viz - Histogram summary of contaminants with optional normalization of feature counts.
- decontam-remove - (experimental) Filter feature table by the scores. This may be replaced by feature-table's filter-features in the future.
BREAKING CHANGES
- @colinvwood added the commands qiime tools list-types and qiime tools list-formats that replace the --show-importable-types and --show-importable-formats flags to the qiime tools import command. The new commands list descriptions for each available semantic type or format where available and allow only queries of interest to be listed.
- @gregcaporaso addressed an issue in da-barplot where the visualization made assumptions about the feature id schema. da-barplot previously split feature ids on semicolons for readability in figures, assuming that the different semicolon-delimited fields were different taxonomic levels. However, there is no guarantee that semicolons in feature ids are always intended to be level delimiters, or that, if level delimiters are intended, they would always be semicolons (for example, | is a commonly used delimiter as well). Users must now provide the --p-level-delimiter ';' parameter to achieve the previous behavior.
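As a rough illustration of why the delimiter is now opt-in for da-barplot, here is a small plain-Python sketch. It is not the plugin's code: label_fields is a hypothetical helper and the feature ids are made-up examples. The only behavior taken from the changelog is that nothing is split unless a level delimiter is provided.

```python
def label_fields(feature_id, level_delimiter=None):
    """Split a feature id into fields only when a delimiter is given."""
    if level_delimiter is None:
        # No assumption about the id schema: keep the id intact.
        return [feature_id]
    return [field.strip() for field in feature_id.split(level_delimiter)]


semicolon_style = 'd__Bacteria; p__Firmicutes; c__Bacilli'
pipe_style = 'k__Bacteria|p__Firmicutes|c__Bacilli'

unsplit = label_fields(semicolon_style)
split_on_semicolon = label_fields(semicolon_style, ';')
split_on_pipe = label_fields(pipe_style, '|')
wrong_delimiter = label_fields(pipe_style, ';')
```

With the old always-split-on-semicolons behavior, an id like pipe_style would never be broken into levels, and an id that merely happens to contain a semicolon would be split incorrectly. Making the delimiter explicit avoids both cases.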
Here are the highlights of the release:
- @Oddant1 fixed a race condition that could occur when processes were cleaning up on exit.
- @cherman2 fixed a bug in da-barplot where links to subplots with metadata values that included spaces were broken.
- @lizgehret fixed a bug in ancombc that caused undesirable string splitting in the tabulate visualizer when a single reference_level column::value pair was provided. Thanks to @arwqiime for bringing this to our attention!
- @lizgehret added metadata column type enforcement in ancombc, allowing CategoricalMetadata columns containing integer values to be treated as discrete groups when included in the formula.
- @lizgehret added a unit test suite to the tabulate visualizer.
- @cherman2 added support for all FeatureTable types to transpose. Now any feature table can be transposed. This will address issues like the one @emmlemore detailed in their post on the forum! Thanks @emmlemore!
- @cherman2 added a method called feature-peds. This calculates what proportion of subjects engrafted each donor feature.
- @cherman2 refactored sample-peds to match the implementation of feature-peds.
- @cherman2 fixed a bug that allowed FeatureTable[Composition] as an input for sample-peds.
- @cherman2 added a drop_incomplete_timepoint parameter to sample-peds. This will enable dropping any time points with large numbers of samples missing!
- @cherman2 added a level delimiter parameter in plot-heatmap that allows users to split taxonomic strings.
- @lizgehret and @colinvwood fixed a bug in the feature-volatility visualizer caused by blank values in NumericMetadata columns.
- @crusher083 added support for additional base estimators in Adaboost estimators.
- @gregcaporaso added support for FeatureTable[PresenceAbsence] as input to barplot. This is useful to support some of our QIIME 2 end-to-end shotgun metagenomics workflows (which are coming soon!).
- @nicholas_bokulich updated barplot by making FeatureData[Taxonomy] an optional input. For this use case, feature labels are parsed from the feature IDs.
- @gregcaporaso added the ImmutableMetadata type, which is intended to house QIIME 2 metadata in an artifact. This enables actions to output metadata, which previously wasn't possible since QIIME 2 actions can only output artifacts and visualizations. If an ImmutableMetadata artifact is exported, it will be a plain-old (mutable) metadata file.
- @Oddant1 added support for numeric sample-ids.
Documentation Updates
- Fixed typos in the Cancer Microbiome Intervention Tutorial pointed out by Amanda Birmingham. Thanks Amanda!
- @gregcaporaso added a note about the Silva taxonomy classifiers to the Data Resources page of the docs. Specifically, the Silva classifiers and reference files provided on the QIIME 2 website include species-level taxonomy. While Silva annotations do include species, Silva does not curate the species-level taxonomy, so this information may be unreliable. In a future version of QIIME 2 we will no longer include species-level information in our Silva taxonomy classifiers. This is discussed on the QIIME 2 Forum here (see Species-labels: caveat emptor!) and on GitHub. Thanks to @wasade for bringing this to our attention!
- @crusher083 added some new section headers and examples to the Python 3 API for improved readability. Thanks @crusher083!
- @Oddant1 added documentation for Parsl, Pipeline Resumption, and Collections.