What percentage of my sequences are unclassified?

JeffF · November 14, 2018, 11:57pm

How can I find out what percentage of my sequences were classified on a per-sample basis if all of my samples were imported as one QIIME artifact? If it is highly variable, I will need to control for that when comparing relative abundances. I used the open-reference clustering method. Thanks!

Nicholas_Bokulich · November 15, 2018, 2:10pm

Hi @JeffF,

I am assuming that you have a feature table and taxonomy classification results. You can use qiime taxa barplot to build a barplot, which will show the percentage of unassigned sequences in each sample.

If you want a tabular output instead, you can use qiime taxa collapse to collapse your feature table's features based on taxonomic affiliation, then qiime feature-table relative-frequency to convert the counts to relative frequencies, then qiime tools export to export to biom, and finally biom convert --to-tsv to convert to text. You can then examine that table directly.
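Once you have the text table, a short script can pull out the unassigned fraction per sample. Here is a minimal sketch in Python, not an official QIIME 2 tool. It assumes the TSV has a header row starting with "#OTU ID" (possibly preceded by a comment line) and that unclassified features have IDs beginning with "Unassigned"; check both against your own file, since the label depends on your classifier and reference database. The function name percent_unassigned and the example data are hypothetical.

```python
import csv


def percent_unassigned(tsv_text, label="Unassigned"):
    """Return {sample_id: percent of each sample in rows whose ID starts with `label`}.

    Assumption: `tsv_text` is a taxon-by-sample table with a header row
    that starts with "#OTU ID". The label "Unassigned" is also an
    assumption; change it to match your taxonomy.
    """
    lines = [ln for ln in tsv_text.splitlines() if ln.strip()]
    # Skip any leading comment line and locate the header row.
    header_idx = next(i for i, ln in enumerate(lines) if ln.startswith("#OTU ID"))
    reader = csv.reader(lines[header_idx:], delimiter="\t")
    samples = next(reader)[1:]

    unassigned = dict.fromkeys(samples, 0.0)
    totals = dict.fromkeys(samples, 0.0)
    for row in reader:
        taxon = row[0]
        for sample, value in zip(samples, row[1:]):
            value = float(value)
            totals[sample] += value
            if taxon.startswith(label):
                unassigned[sample] += value

    # Dividing by the column total means this works for raw counts
    # as well as for relative frequencies.
    return {
        s: (100.0 * unassigned[s] / totals[s] if totals[s] else 0.0)
        for s in samples
    }


# Made-up example data, for illustration only.
example = (
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\n"
    "k__Bacteria;p__Firmicutes\t0.75\t0.5\n"
    "Unassigned;__\t0.25\t0.5\n"
)
result = percent_unassigned(example)
print(result)
```

For a real file, you would read its contents and pass the text to the function. Because each value is divided by its sample's column total, the same function also works if you skip the relative-frequency step and export raw counts.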

I hope that helps!
