Chimera checking and bar plots

eDNA · July 1, 2017, 12:13am

Hello,

I used other software to obtain a feature table and representative sequences because I cannot use QIIME 2 to process my sequence data. I have successfully imported the feature table and representative sequences into QIIME 2 for diversity analyses and taxonomy assignment.

I am wondering whether QIIME 2 has any commands that can do chimera checking and filtering for my data (based on the feature table and representative sequences).
I used the PR2 database for taxonomy assignment, following the “Moving Pictures” tutorial, and used "qiime taxa barplot" to generate bar plots. In the "taxonomy.qzv" file, there are at most 8 levels of taxonomy for each OTU, but there are 9 levels in the bar plots (although there is no difference between the plots for level 8 and level 9). Is that normal?

Thanks,
Tom

jairideout · July 6, 2017, 11:33pm

Hi @eDNA!

QIIME 2 doesn't currently support this type of chimera checking/filtering. To my knowledge, chimera detection is only available in the DADA2 processing pipeline in QIIME 2. You may need to use another tool (e.g. QIIME 1) to perform chimera detection on your feature table and representative sequences.

We'd be interested in supporting this type of chimera detection in QIIME 2, perhaps in a new plugin. We have this on our list of future plugins but it isn't a high priority for the QIIME 2 development team. If you know of anyone that would be interested in developing a QIIME 2 plugin for chimera detection, we'd be happy to help with any questions that come up. The plugin developer docs are a good place to start.

As for the extra taxonomic level in your bar plots, that sounds pretty odd. Can you please send me your taxonomy.qzv file and the .qzv file generated by qiime taxa barplot so that I can take a look? You can attach those files to your forum post or share them via view.qiime2.org.

jairideout · July 7, 2017, 10:43pm

Don't worry about sending me those files. @thermokarst investigated this and found that the PR2 database has trailing semicolons in its taxonomic annotations:

KJ763800.1.1807_U       Eukaryota;Stramenopiles;Stramenopiles_X;Stramenopiles_XX;Stramenopiles_XXX;Stramenopiles_XXXX;Stramenopiles_XXXXX;Stramenopiles_XXXXX_sp.;
KM032338.1.1135_U       Eukaryota;Stramenopiles;Stramenopiles_X;Stramenopiles_XX;Stramenopiles_XXX;Stramenopiles_XXXX;Stramenopiles_XXXXX;Stramenopiles_XXXXX_sp.;

It turns out that qiime taxa barplot (and likely qiime taxa collapse) treats a trailing semicolon in a taxonomic annotation as an extra "empty" taxonomic level. We haven't seen trailing semicolons in the reference databases we usually use (e.g. Greengenes, SILVA, UNITE), but it seems reasonable to ignore them, such that Foo;Bar;Baz and Foo;Bar;Baz; are treated equivalently.
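
To see why a trailing semicolon can show up as one extra level, here is a small Python illustration. It is only a sketch of the general effect of splitting on ";" and is not taken from the QIIME 2 source; the function names are made up for this example.

```python
def split_levels(annotation):
    """Split a taxonomy string on ';' without any cleanup (illustration only)."""
    return annotation.split(";")


def split_levels_ignoring_trailing(annotation):
    """Same split, but drop trailing semicolons first so no empty level appears."""
    return annotation.rstrip(";").split(";")


# A trailing semicolon produces a final empty string, i.e. one extra "level".
print(len(split_levels("Foo;Bar;Baz")))                     # 3
print(len(split_levels("Foo;Bar;Baz;")))                    # 4
print(len(split_levels_ignoring_trailing("Foo;Bar;Baz;")))  # 3
```

With the 8-level PR2 annotations shown above, the same effect would give 9 levels, with the last one empty.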

I created an issue to track this bug. In the meantime you can just ignore the extra taxonomic level in the plot, or you can remove the trailing semicolons from your reference database (e.g. using sed) and rerun your analyses.
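
If you would rather not use sed, something along these lines in Python should also work. This is a minimal sketch that assumes the reference taxonomy is a plain-text file with one annotation per line (ID, whitespace, then the semicolon-separated taxonomy, as in the lines quoted above). The file names in the usage comment are placeholders, not real PR2 file names.

```python
def strip_trailing_semicolons(line):
    """Remove trailing semicolons from one taxonomy line, keeping its line ending."""
    body = line.rstrip("\r\n")
    newline = line[len(body):]
    # Drop trailing whitespace first, then any semicolons at the very end.
    return body.rstrip().rstrip(";") + newline


def clean_taxonomy_file(src_path, dst_path):
    """Write a copy of src_path to dst_path with trailing semicolons removed."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(strip_trailing_semicolons(line))


# Hypothetical usage (placeholder file names):
# clean_taxonomy_file("pr2_taxonomy.txt", "pr2_taxonomy_clean.txt")
```

You would then re-import the cleaned file and rerun the taxonomy assignment and qiime taxa barplot.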

eDNA · July 10, 2017, 9:53pm

Thank you very much for your help, Jai.

system · August 11, 2017, 3:53am

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.