After the denoising step using DADA2, I see a lot of OTUs assigned to Cyanobacteria in my taxonomy table. We normally remove these OTUs from our OTU table before downstream analysis.
Can I remove these OTUs using QIIME 2, and if so, how?
Hi, as far as I know, they're planning a plugin for sequence-based filtering, which could be used to filter out the Cyanobacteria. For now, I export the feature table and taxonomy file, align them manually and delete the features assigned to Cyanobacteria. Then I import the filtered feature table back into QIIME 2 for the downstream analyses.
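For anyone following the manual route, here is a minimal sketch of the "align and delete" step in plain shell. It assumes the feature table and taxonomy have already been exported to tab-separated files. The file names, feature IDs, taxon strings and counts below are all made up for illustration, and the export and re-import steps are not shown.

```shell
# Hypothetical exported taxonomy: feature ID, taxon string, confidence.
printf 'Feature ID\tTaxon\tConfidence\n' > taxonomy.tsv
printf 'f1\tk__Bacteria; p__Cyanobacteria; c__Chloroplast\t0.99\n' >> taxonomy.tsv
printf 'f2\tk__Bacteria; p__Firmicutes\t0.95\n' >> taxonomy.tsv
printf 'f3\tk__Bacteria; p__Proteobacteria\t0.90\n' >> taxonomy.tsv

# Hypothetical exported feature table: feature ID, then one count column per sample.
printf '#OTU ID\tS1\tS2\n' > feature-table.tsv
printf 'f1\t10\t0\n' >> feature-table.tsv
printf 'f2\t5\t7\n' >> feature-table.tsv
printf 'f3\t0\t3\n' >> feature-table.tsv

# Step 1: collect the IDs whose taxon string contains "Cyanobacteria" (skip the header row).
awk -F'\t' 'NR > 1 && $2 ~ /Cyanobacteria/ { print $1 }' taxonomy.tsv > cyano-ids.txt

# Step 2: read the ID list first, then keep the table header and every row
# whose feature ID is not in that list.
awk -F'\t' 'FILENAME == ARGV[1] { drop[$1]; next }
            FNR == 1 || !($1 in drop)' cyano-ids.txt feature-table.tsv > feature-table-filtered.tsv

cat feature-table-filtered.tsv
```

Matching on the feature ID rather than on row position means the two files do not need to be sorted the same way.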
Hi @hongwei2017! We support this type of filtering in QIIME 2 with the feature-table filter-features command! If you have your feature table, and have assigned taxonomy using feature-classifier, you can filter your feature table by using your taxonomy as metadata! A hypothetical filtering command could look something like this (you will need to adjust it to match your filenames, and modify the --p-where statement to meet your needs):
$ qiime feature-table filter-features \
--i-table my-feature-table.qza \
--m-metadata-file my-taxonomy.qza \
--p-where "Taxon NOT LIKE '%Cyanobacteria%'" \
--o-filtered-table feature-table-sans-cyanobacteria.qza
The --p-where "Taxon NOT LIKE '%Cyanobacteria%'" statement instructs QIIME 2 to keep only the features whose Taxon label does not contain the string Cyanobacteria, which removes any feature that has Cyanobacteria anywhere within the label.
Give that a shot and let us know how it works for you, or if you need any additional help. Thanks!
PS - @yanxianl's solution could also work, but one downside is that you would lose your provenance, which is (in my opinion) a pretty cool feature of QIIME 2! We plan on supporting some more advanced sequence-based filtering in the future - stay tuned!
Is my-feature-table.qza the file that contains both the OTU sequence counts and the taxonomy, or just the OTU sequence counts? Should I filter before or after normalisation? I am wondering at which step to apply this filtering.
Your feature table will be a separate file from your taxonomy assignments (take a look at the Moving Pictures Tutorial for a high-level overview of a typical QIIME 2 analysis).
Whether to filter before or after normalisation is up to you: I suggest you try it both ways and compare the results!
Thanks!
I used the command you suggested, but somehow the result was a taxa plot containing only taxa from the phylum Cyanobacteria, the opposite of what we intended.
Actually, by checking the usage of the feature-table filter-features command, I found an alternative solution: passing an extra argument, "--p-exclude-ids", on the command line: