Discarded results produced by q2-cutadapt's demux-paired when working with mixed-orientation reads

Summary

We recently discovered a bug in the q2-cutadapt plugin’s demux-paired command. When the command is run with the --p-mixed-orientation flag, it can, in certain situations, unnecessarily discard a significant portion of reads.

Impact

This issue impacts the following versions of q2-cutadapt:

  • 2020.6
  • 2020.8

If you think you may have been impacted, please check your provenance at https://view.qiime2.org to verify whether you ran demux-paired with the mixed-orientation flag enabled using one of the versions listed above. We are happy to help with identification; please share a demux summarize visualization with us.

Details

Mixed-orientation reads are produced by some sequencing protocols, where the R1 and R2 multiplexed files don’t necessarily correspond to forward and reverse reads (that is to say, the orientation is “mixed” in both files). In general we haven’t seen this strategy very often, but if you use it, let us know: we’d like to learn more about the use case for this functionality!

Internally, q2-cutadapt follows a mixed-orientation demux protocol based on:

https://cutadapt.readthedocs.io/en/stable/guide.html#demultiplexing-paired-end-reads-in-mixed-orientation

Essentially, we demux the mixed-orientation reads in two rounds. In the first round we treat R1 and R2 as forward and reverse reads, respectively, as in a typical paired-end demultiplexing workflow. This produces per-sample demultiplexed R1 and R2 files, as well as new multiplexed files (“unknown”) containing all of the reads that could not be demultiplexed. We then demultiplex the unknown reads in a second round, this time swapping R1 and R2 so that R2 is treated as the forward reads and R1 as the reverse reads.
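To make the two rounds concrete, here is a toy sketch in Python. It is not the q2-cutadapt implementation: the function names, the in-memory read pairs, and the exact-prefix barcode matching are all simplifying assumptions made for illustration (cutadapt itself works on FASTQ files and tolerates mismatches).

```python
from collections import defaultdict


def demux_round(pairs, barcodes):
    """One demultiplexing round (hypothetical helper).

    Looks for a barcode at the start of the read treated as "forward".
    Exact prefix matching is a simplification for this sketch.
    """
    per_sample = defaultdict(list)
    unknown = []
    for forward, reverse in pairs:
        for sample, barcode in barcodes.items():
            if forward.startswith(barcode):
                # Trim the barcode and keep the pair together.
                per_sample[sample].append((forward[len(barcode):], reverse))
                break
        else:
            # No barcode matched: the pair goes to the "unknown" pool.
            unknown.append((forward, reverse))
    return per_sample, unknown


def demux_mixed_orientation(pairs, barcodes):
    # Round 1: treat R1 as forward and R2 as reverse.
    round1, unknown = demux_round(pairs, barcodes)
    # Round 2: swap the unknown pairs so that R2 is treated as forward.
    swapped = [(r2, r1) for r1, r2 in unknown]
    round2, still_unknown = demux_round(swapped, barcodes)
    # Combine both rounds per sample by appending, never overwriting.
    combined = defaultdict(list)
    for result in (round1, round2):
        for sample, reads in result.items():
            combined[sample].extend(reads)
    return dict(combined), still_unknown


# Made-up barcodes and reads, purely for illustration.
barcodes = {"sample_a": "AAAA", "sample_b": "CCCC"}
pairs = [
    ("AAAAGATTACA", "TGTAATC"),  # barcode on R1: matched in round 1
    ("TGTAATC", "AAAAGATTACA"),  # barcode on R2: matched in round 2
    ("CCCCGGTT", "AACC"),        # barcode on R1: matched in round 1
    ("GGGGTTTT", "TTTTGGGG"),    # no barcode: stays unknown
]
per_sample, unknown = demux_mixed_orientation(pairs, barcodes)
for sample, reads in per_sample.items():
    print(sample, len(reads))
print("unknown", len(unknown))
```

In this sketch, sample_a ends up with reads from both rounds, which is the expected outcome for mixed-orientation data.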

While making unrelated changes to q2-cutadapt, we discovered that on the second demultiplexing round, rather than appending reads to the results of the first round of demultiplexing, we were overwriting those results. As a result, whenever additional reads were matched to a sample during round 2, any reads demultiplexed for that sample in round 1 were discarded.
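The difference between overwriting and appending can be illustrated with a small, self-contained Python example. This is a simplified illustration of the failure mode, not the actual q2-cutadapt code or its fix: the file names, the read identifiers, and the write_round helper are hypothetical.

```python
import os
import tempfile


def write_round(path, reads, mode):
    # Mode "w" truncates an existing file; mode "a" appends to it.
    with open(path, mode) as fh:
        for read in reads:
            fh.write(read + "\n")


def count_reads(path):
    with open(path) as fh:
        return sum(1 for _ in fh)


# Hypothetical reads assigned to one sample in each round.
round1_reads = ["read1", "read2", "read3"]
round2_reads = ["read4"]

with tempfile.TemporaryDirectory() as tmp:
    # Overwriting: round 2 replaces the round 1 results for this sample.
    overwritten = os.path.join(tmp, "overwritten_sample.txt")
    write_round(overwritten, round1_reads, "w")
    write_round(overwritten, round2_reads, "w")
    overwritten_count = count_reads(overwritten)

    # Appending: round 2 results are added to the round 1 results.
    appended = os.path.join(tmp, "appended_sample.txt")
    write_round(appended, round1_reads, "w")
    write_round(appended, round2_reads, "a")
    appended_count = count_reads(appended)

print("overwritten:", overwritten_count)
print("appended:", appended_count)
```

With overwriting, only the single round 2 read survives; with appending, all four reads are kept. A sample that gains no reads in round 2 is never rewritten, which is why only some samples lose reads.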

Resolution

We have implemented a fix for this issue which will be available next week as part of the QIIME 2 2020.11 release.

We apologize for any inconvenience this may have caused, and are happy to discuss further if you have any questions, or need help determining if you have been impacted by this bug.
