Filtering out mitochondrial and chloroplast sequences

What is the recommended way to remove mitochondrial and chloroplast sequences in qiime2?

Hi @jessicalmetcalf! We currently don't have support for that in QIIME 2. @gregcaporaso wrote up a post describing how you can use QIIME 1 & 2 to accomplish this (as a workaround). In his post he mentioned that QIIME 2 support will be added in the upcoming 2017.8 release. Things have been re-prioritized in the meantime, so this will likely make it into the next release, 2017.9 (end of September).

This is going to be pretty confusing for the novice students in my class. They are learning QIIME 2 via tutorials and then applying the tools to their own data sets. Flipping between QIIME 1 and 2 is not ideal. Is it possible instead to do this within QIIME 2 by using the taxonomy results to find the sequences to remove, and then using the "filter-features" command?

I hope you reconsider this as a high priority. Almost all amplicon data sets will include features that are not really part of the microbiome and need to be excluded (e.g. chloroplast, mitochondria, insects), so this is a pretty core function.

I completely agree it's not ideal. We are working as efficiently as we can to have QIIME 1 feature parity by the end of 2017, and each monthly release brings us closer to that goal. Unfortunately, until the end of 2017 some analyses will still need to be run in QIIME 1 or with another tool, and then have the results imported into QIIME 2.

As a heads up, there may be other missing functionality that you need for your class. You can see a mostly up-to-date list of QIIME 1 functionality that's scheduled to be ported by the end of 2017 in this spreadsheet and this spreadsheet. There's overlap between the two spreadsheets and the organization is messy; we're working on merging them and cleaning things up, but that'll give you an idea of what's available already and what's planned. Get in touch if you have any questions about the availability of specific analyses!

Filtering features based on taxonomy should work -- check out metadata-based filtering in the filtering tutorial for examples. There are at least a couple of ways to accomplish this; for example, you could supply your FeatureData[Taxonomy] artifact via --m-metadata-file and provide a SQL WHERE clause with --p-where to identify the taxa to keep or discard (e.g. using exact string matching or regular expressions). Let us know if you have specific questions about how to accomplish that with your data!
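
To make the WHERE-clause idea concrete, here is a minimal sketch in plain Python using the standard sqlite3 module. It is not QIIME 2 itself, only an illustration of how a clause of the kind you might pass via --p-where selects the feature IDs to keep. The column name "Taxon", the Greengenes-style taxon strings, the feature IDs, and the helper features_to_keep are all assumptions made for the example, so check them against your own taxonomy artifact.

```python
import sqlite3

# Hypothetical toy taxonomy: (feature ID, taxon string).
# Greengenes-style labels are assumed here for illustration only.
TAXONOMY = [
    ("feat1", "k__Bacteria; p__Firmicutes; c__Bacilli"),
    ("feat2", "k__Bacteria; p__Cyanobacteria; c__Chloroplast"),
    ("feat3", "k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; "
              "o__Rickettsiales; f__mitochondria"),
    ("feat4", "k__Bacteria; p__Bacteroidetes; c__Bacteroidia"),
]

# The kind of clause you might supply via --p-where.
# The column name "Taxon" is an assumption about the taxonomy metadata.
# SQLite's LIKE is case-insensitive for ASCII, so 'chloroplast' also
# matches 'Chloroplast'.
WHERE = "Taxon NOT LIKE '%mitochondria%' AND Taxon NOT LIKE '%chloroplast%'"


def features_to_keep(taxonomy, where):
    """Return the sorted feature IDs whose taxon string satisfies `where`."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE metadata (id TEXT PRIMARY KEY, Taxon TEXT)")
        con.executemany("INSERT INTO metadata VALUES (?, ?)", taxonomy)
        # The clause is trusted input in this sketch; do not build SQL like
        # this from untrusted strings.
        rows = con.execute(
            f"SELECT id FROM metadata WHERE {where} ORDER BY id"
        ).fetchall()
    finally:
        con.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(features_to_keep(TAXONOMY, WHERE))
```

Features whose taxon string mentions mitochondria or chloroplast are dropped and everything else is kept. Flipping the clause to `Taxon LIKE '%chloroplast%'` would instead list the features you would be discarding, which is a handy sanity check before filtering real data.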

Thank you for your input on prioritizing this functionality. This feature is a high priority for us as well; I didn't intend to imply otherwise (it's competing with other high-priority items, though).

The upcoming 2017.8 release is scheduled for next week, so this functionality won't make it into that release. It will likely make it into 2017.9 but I can't make any guarantees -- the only guarantee we can make is that the feature will exist by the end of 2017.

The latest release of QIIME 2 (2017.10) includes a new plugin, q2-quality-control, with support similar to QIIME 1's exclude_seqs_by_blast.py script for filtering out contaminants, non-target DNA, etc. The relevant QIIME 2 command is qiime quality-control exclude-seqs. Check it out in the q2-quality-control tutorial!

There are also new commands for taxonomy-based filtering in the q2-taxa plugin. Check out this new section of the filtering tutorial for details!