[Preview] QIIME 2 2022.11 development changelog

QIIME 2 2022.11 has been released!
Please see the official changelog for more details.

Original content of post

:exclamation: Important :exclamation:

The following is an early developer preview of the changes expected in 2022.11

This post is a live document that will be updated throughout our development cycle. Any links in this topic will be broken until the release is officially published. When we are ready for release, we’ll copy this changelog and create a new post in the Announcements category.


Important Developer Information

Dates (please keep an :eye: on this post, these :calendar: might change):

  • PRs must be submitted by: December 2nd, 2022
  • PRs must be merged by: December 9th, 2022
  • Repo Freeze and Package Building: December 19th, 2022
  • Release Day: December 19th, 2022

Developer Project Board:

QIIME 2 2022.11 Project on Github


Exciting Announcements!

q2-composition

@lizgehret added a new method in q2-composition that wraps the ANCOM-BC functionality from the 3-16 release of Frederick Huang Lin's R package. Some of @mortonjt's ideas from his q2-ancombc plugin were used, along with some additional guards, and the outputs have been packaged slightly differently. The contents of this update include:

  • New method qiime composition ancombc that produces an artifact containing the following:

    lfc (log fold change)
    w scores
    std err
    p-vals
    q-vals
    
  • Unit test suite and usage examples for ancombc method

  • Visualizer for the above output artifact. This visualizer provides tab separated views for each dataframe containing the log-fold change (lfc), standard error (se), P and Q values, and W scores.

Artifact Cache

The Artifact Cache was added in the background in the last release; it is now accessible via q2cli. The Artifact Cache is a directory structure that stores QIIME 2 Results in an unzipped form on disk in a consistent way controlled by QIIME 2. This is particularly useful for large reference databases and shotgun sequencing, where the data is very large and it is easier to use more storage space than to repeatedly spend CPU time zipping and unzipping the data to and from a .qza.

Since the release of QIIME 2 2022.8, all QIIME 2 actions have been interacting with a cache created automatically in the temp directory. It is now possible to create and use custom caches in user-defined locations on the command line interface. The following tools have been added to support this (more details on their usage in the cli help text).

  • qiime tools cache-create creates a cache at the given path (must be a new path).
  • qiime tools cache-remove removes the given key from the given cache, along with its data if this was the only key referencing that data.
  • qiime tools cache-garbage-collection runs garbage collection on the given cache.
  • qiime tools cache-store stores a given artifact to the given cache under a given key.
  • qiime tools cache-fetch fetches the artifact with the given key from the given cache into an artifact at the given output path.
  • qiime tools cache-status lists the contents of the given cache.

The cache uses user-defined keys to reference the data stored in it. Currently, these keys must be valid Python identifiers. To use a cache on the command line interface, first create a cache at a given location using the new qiime tools cache-create command. Then you may use artifacts in the cache as inputs (or save outputs to the cache) using the cache_path:key syntax.

For example, the following commands will create a cache at ~/Documents/cache and save the artifact at ~/Documents/art.qza into the cache under the key "in", creating an unzipped copy of the artifact in the cache. They then use that artifact as an input to an action, save the output of the action into the cache under the key "out", and finally save that output into an artifact at ~/Documents/out.qza. Note that you can just as easily use ~/Documents/cache:out as an input to another command. The data saved into the cache under the keys "in" and "out" will remain in the cache, accessible via those keys, until removed via the qiime tools cache-remove command.

Creating a cache:

qiime tools cache-create --cache ~/Documents/cache

Placing a qza into the cache:

qiime tools cache-store \
  --cache ~/Documents/cache \
  --artifact-path ~/Documents/art.qza \
  --key in

An example action using the cache for input and/or output:

qiime some-plugin some-action \
  --i-input ~/Documents/cache:in \
  --o-output ~/Documents/cache:out

Converting a cached artifact into a qza:

qiime tools cache-fetch \
  --cache ~/Documents/cache \
  --key out \
  --output-path ~/Documents/out.qza
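
The key rule and the cache_path:key syntax described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how q2cli parses its arguments: the helper names is_valid_cache_key and split_cache_reference are hypothetical, and splitting at the last colon is an assumption made for this sketch.

```python
from typing import Tuple


def is_valid_cache_key(key: str) -> bool:
    """Return True if `key` is a valid Python identifier.

    Hypothetical helper: the changelog only says keys must be valid
    Python identifiers; the check QIIME 2 itself performs is not shown.
    """
    return key.isidentifier()


def split_cache_reference(reference: str) -> Tuple[str, str]:
    """Split a 'cache_path:key' string into (cache_path, key).

    Assumption for this sketch: the key is whatever follows the last colon.
    """
    cache_path, separator, key = reference.rpartition(":")
    if not separator or not cache_path or not is_valid_cache_key(key):
        raise ValueError(f"not a cache_path:key reference: {reference!r}")
    return cache_path, key


cache_path, key = split_cache_reference("~/Documents/cache:out")
print(cache_path, key)
```

Note that str.isidentifier() accepts Python keywords, so a key such as "in" (used in the example above) passes this check, while a key such as "my-key" does not.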

:exclamation: BREAKING CHANGES :exclamation:

  • q2-vsearch

  • q2studio has been fully deprecated, and will no longer be included in our core or community distribution installations :no_entry_sign: Check out our tools :hammer_and_wrench: within the Galaxy Toolshed for a fully-functioning graphical user interface! :tv:


Here are the highlights of the release:

  • QIIME 2 Framework

    • @lizgehret bumped the core and community distributions up to the latest version of R, so all plugins within these distributions are now compatible with :registered: 4.2.0 :chart_with_upwards_trend:
    • @cherman2 fixed a :lady_beetle: that unintentionally removed the Jupyter Notebook :notebook_with_decorative_cover: extension from the list of installed packages for our core and community distributions :astonished: This extension will now be included with any installation of QIIME 2 for 2022.11 and beyond! :milky_way:
    • @emollier fixed a :cockroach: when running our test suite on 32-bit CPU architectures. On 64-bit CPUs, the pandas.dtype returned for integers is a plain int, which is 64 bits wide. However, on 32-bit machines, the type is returned as int64 to differentiate it from a plain int. This :bug: fix now allows either int or int64 as an accepted type within our test suite.
    • @Oddant1 fixed cross-device link errors when attempting to symlink across devices by copying instead :link:
    • @gregcaporaso added the ability to add descriptions and examples to "artifact classes" (a new term being used to describe semantic types with associated directory formats, i.e., semantic types that can be associated with QIIME 2 artifacts) :gem:
      • Descriptions and examples are added on definition of an artifact class with Plugin.register_artifact_class. Plugin.register_artifact_class is a replacement for Plugin.register_semantic_type_to_format (the latter is still available for backward compatibility, but the former should be used preferentially).
      • Descriptions and examples of artifact classes can be accessed via PluginManager.artifact_classes or Plugin.artifact_classes.
    • @gregcaporaso added helpers for use in defining usage examples to obtain an artifact or metadata from a URL :computer: :nerd_face: These can be accessed with Usage.init_artifact_from_url and Usage.init_metadata_from_url, respectively.
    • @Keegan-Evans added the ability to make files and file collections optional in directory formats :file_cabinet:
    • @ebolyen made it so metadata column type is now informed by Metadata.load's default_missing_scheme :lobster:
    • @gregcaporaso added better functionality for examining/finding formats on a PluginManager :mag::receipt:
  • q2cli

    • colinvwood added the ability to provide multiple files to be examined to qiime tools peek.
    • colinvwood added a --tsv parameter that can be added to calls to qiime tools peek that outputs the results as a TSV file.
  • q2-cutadapt

    • IGK-NZ added the ability to perform quality base trimming.
    • @Keegan-Evans added usage examples for demux-single and demux-paired.
  • q2-dada2

    • @lizgehret added an R traceback for any errors that arise within the DADA2 R package for easier debugging :beetle: :mag:
    • @Oddant1 made it so barcodes and IDs won't get mixed up while sorting :no_good_woman:t3:
    • @lizgehret added usage examples for denoise-single and denoise-paired :sound:
  • q2-deblur

    • @ChrisKeefe added usage examples for denoise-16s and visualize-stats :eye:
  • q2-demux

    • @cherman2 added usage examples for emp-single and summarize :dumpling:
  • q2-diversity

    • @lizgehret added an R traceback for any errors that arise using the adonis method from within the Vegan R package for easier debugging :mag_right: :bug:
    • @lizgehret added usage examples for alpha-group-significance and alpha-correlation :abc:
  • q2-emperor

    • @cherman2 added usage examples for plot and procrustes-plot :crown:
  • q2-feature-table

    • @gregcaporaso and @Oddant1 added usage examples for merge, merge-taxa, merge-seqs, filter-samples, filter-features, and filter-features-conditionally :globe_with_meridians:
    • @ebolyen made it so summarize can handle NaN values :1234:
    • @wasade fixed a bug where rarefying with replacement was incorrectly keeping values below the set rarefaction depth :axe::key:
    • @lizgehret added usage examples for summarize and tabulate-seqs
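
To illustrate the rarefaction fix noted above, here is a minimal, standard-library-only sketch of rarefying with replacement. It is not the q2-feature-table implementation. It assumes that "values below the set rarefaction depth" refers to samples whose total count is below the depth, and the function name rarefy_with_replacement and the dict-of-dicts table layout are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Optional

Table = Dict[str, Dict[str, int]]  # sample ID -> {feature ID -> count}


def rarefy_with_replacement(table: Table, depth: int,
                            seed: Optional[int] = None) -> Table:
    """Subsample each sample to `depth` counts, drawing with replacement.

    Samples whose total count is below `depth` are dropped rather than
    kept (the behavior this sketch assumes the fix restores).
    """
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    rng = random.Random(seed)
    rarefied: Table = {}
    for sample_id, counts in table.items():
        if sum(counts.values()) < depth:
            continue  # too shallow: exclude the sample entirely
        features = list(counts)
        weights = [counts[feature] for feature in features]
        # Draw `depth` features, each with probability proportional to its count.
        draws = rng.choices(features, weights=weights, k=depth)
        rarefied[sample_id] = dict(Counter(draws))
    return rarefied


table = {"s1": {"f1": 60, "f2": 40}, "s2": {"f1": 2, "f2": 1}}
print(rarefy_with_replacement(table, depth=10, seed=0))
```

In this example only s1 can appear in the result, because s2 has a total count of 3, which is below the depth of 10.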
  • q2-fragment-insertion

    • @Stefan helped speed up this plugin, always a welcome addition :dizzy:, by adding explicit file format selection instead of sniffing for it, in accordance with this issue in scikit-bio.
  • q2-metadata

  • q2-phylogeny

    • @lizgehret added a usage example for align-to-tree-mafft-fasttree :evergreen_tree:
  • q2-quality-filter

  • q2-stats

    • @cherman2 made the error message raised when N=0 in wilcoxon more informative and added an ignore_empty_comparator parameter that fills invalid comparisons with NaN in the stats table if used :jar:
  • q2-taxa

    • @ChrisKeefe added usage examples for collapse and barplot :bar_chart:
  • q2-types

    • @misialq added ProteinFASTAFormat validation for sequences which contain asterisks *. While usually not present in sequences fetched from NCBI, the asterisk is often added during in silico translation :page_with_curl:
    • @gregcaporaso added support for importing multiplexed fasta/qual files to support analysis of legacy 454 sequencing data :crossed_swords:
    • @gregcaporaso added descriptions and examples for common artifact classes using new functionality added in the QIIME 2 Framework :books:
    • @SoilRotifer added support for importing lower-case and mixed-case nucleotide sequences and converting them into standard upper-case IUPAC form via the new formats MixedCaseDNAFASTAFormat, MixedCaseAlignedDNAFASTAFormat, MixedCaseRNAFASTAFormat, and MixedCaseAlignedRNAFASTAFormat :arrow_double_up: This addresses: Should DNAIterator support lowercase fasta sequences? · Issue #91 · qiime2/q2-types · GitHub
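
The mixed-case conversion mentioned above amounts to upper-casing a sequence and then checking it against the IUPAC alphabet. The sketch below shows that idea for unaligned DNA only. It does not use the q2-types formats or transformers, and the name to_upper_iupac_dna is hypothetical.

```python
# IUPAC nucleotide codes for unaligned DNA (no gap characters).
IUPAC_DNA = frozenset("ACGTRYSWKMBDHVN")


def to_upper_iupac_dna(sequence: str) -> str:
    """Upper-case a DNA sequence and verify it uses only IUPAC DNA codes."""
    upper = sequence.upper()
    invalid = sorted(set(upper) - IUPAC_DNA)
    if invalid:
        raise ValueError(f"non-IUPAC characters in sequence: {invalid}")
    return upper


print(to_upper_iupac_dna("acgTTnRy"))
```

An aligned variant would also need to accept gap characters, and an RNA variant would use U in place of T; both are left out of this sketch.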
  • q2-vsearch

    • @colinbrislawn added --min_seq_length and --min_unique_size parameters to vsearch dereplicate-sequences to discard short or uncommon sequences from the dereplication output :dna: :microbe:
  • User Docs

    • @lizgehret & @thermokarst added documentation for installing QIIME 2 via conda on Apple Silicon devices :green_apple:
    • @thermokarst updated the Windows Subsystem for Linux (WSL) installation instructions to provide more context on when to use those instructions :rat:
  • Virtual Machines

    • nickodell made some improvements that should reduce the size of our docker builds by ~60% :flying_saucer: :arrow_double_down:

Hi @lizgehret

It looks like a change was made to qiime2.plugin.model.FileCollection on Sep 28th this year. See IMP: add `optional` to File and FileCollection. (#646) · qiime2/[email protected] · GitHub.

I actually need this change for my application but it is not listed in this message. I am wondering if this will be included in the 2022.11 release?

Thanks so much!


Hi @mroper,

Great question! We haven't fully updated our changelog yet, but yes - this change will be included in the 2022.11 release.

Cheers :lizard:
