Unassigned taxonomy appears in the gneiss taxa_summary results

Hi, I used the following commands to test differential abundance between groups at taxonomic level 4:

qiime gneiss balance-taxonomy \
  --i-table ./comp-table.qza \
  --i-tree hierarchy.qza \
  --i-taxonomy ./taxonomy.qza \
  --p-taxa-level 4 \
  --p-balance-name 'y0' \
  --m-metadata-file ./manifest.xls \
  --m-metadata-column group_manner_1 \
  --o-visualization group_manner_1-y0_taxa_summary-L4.qzv
qiime tools export \
  --input-path group_manner_1-y0_taxa_summary-L4.qzv \
  --output-path group_manner_1-y0_taxa_summary-L4

However, I found that many sequences unassigned at taxonomic level 4 appear in the final results:

e2758f390d79b62de4c70d267735d645,Unassigned,Unassigned,Unassigned,Unassigned,Unassigned,Unassigned,Unassigned
0c4b6ae1fd659ce65a1e29a0c8dcbd19,D_0__Archaea,D_1__Nanoarchaeaeota,D_2__Woesearchaeia,D_2__Woesearchaeia,D_2__Woesearchaeia,D_2__Woesearchaeia,D_2__Woesearchaeia

It seems that QIIME 2 automatically fills each unassigned taxonomic level with the last assigned level, which confused me a lot.
I think it would be better to remove these unassigned results, since I can choose the
parameter "--p-taxa-level" to test the corresponding taxonomic level.
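
The padding seen in the two rows above can be reproduced in a few lines of Python. This is only a sketch of the behaviour the output suggests, not gneiss's actual implementation; the function name pad_taxonomy and the fixed width of 7 ranks are assumptions made for illustration.

```python
def pad_taxonomy(taxon, n_levels=7, sep=";"):
    """Split a taxonomy string into n_levels ranks, repeating the last
    assigned rank to fill any missing deeper levels."""
    # Drop empty pieces left by trailing separators or stray whitespace.
    ranks = [r.strip() for r in taxon.split(sep) if r.strip()]
    if not ranks:
        ranks = ["Unassigned"]
    # Repeat the deepest assigned rank until the row is n_levels wide.
    ranks = ranks + [ranks[-1]] * (n_levels - len(ranks))
    return ranks[:n_levels]


# Taxonomy of feature 0c4b6ae1... as it would look before padding (assumed).
row = pad_taxonomy("D_0__Archaea;D_1__Nanoarchaeaeota;D_2__Woesearchaeia")
print(",".join(row))
```

Under this reading, a level-4 label such as D_2__Woesearchaeia means the feature was classified only down to D_2, not that it is unassigned.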

Is there anything QIIME can do to remove these unassigned features?
Or should I remove the sequences that are not annotated at level 4 before running the gneiss taxa_summary analysis?

Thanks

Hi!

After taxonomy classification, one can filter out of the table any features that are unassigned or assigned only to Bacteria;__. One may also check some of the unassigned features by BLASTing them on the NCBI website.
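
On the artifacts themselves this kind of filtering is usually done with qiime taxa filter-table and its --p-include / --p-exclude options. The logic can be sketched in plain Python as below; keep_feature is a hypothetical helper, the third feature ID is made up for illustration, and the rule "drop anything classified no deeper than the domain" is one possible reading of the advice above.

```python
def keep_feature(taxon, sep=";"):
    """Return False for features that are unassigned or classified no
    deeper than the domain (e.g. 'D_0__Bacteria' or 'D_0__Bacteria;__')."""
    ranks = [r.strip() for r in taxon.split(sep)]
    # Ignore empty ranks and bare placeholders such as '__' or 'D_1__'.
    ranks = [r for r in ranks if r and not r.endswith("__")]
    if not ranks or ranks[0] == "Unassigned":
        return False
    return len(ranks) > 1


taxonomy = {
    "e2758f390d79b62de4c70d267735d645": "Unassigned",
    "0c4b6ae1fd659ce65a1e29a0c8dcbd19":
        "D_0__Archaea;D_1__Nanoarchaeaeota;D_2__Woesearchaeia",
    "hypothetical_feature_1": "D_0__Bacteria;__",  # made-up ID for illustration
}

# Keep only the features that pass the rule above.
kept = {fid: tax for fid, tax in taxonomy.items() if keep_feature(tax)}
print(sorted(kept))
```

The IDs in kept could then be used to subset the feature table before running gneiss.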


Thanks for your reply!!!
If I want to run the gneiss taxa_summary analysis at D_4, for example on a taxonomy like this:
D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Lactobacillales;D_4__Lactobacillaceae;
Should I remove sequences like the following, which are annotated only down to level D_1?
D_0__Archaea;D_1__Nanoarchaeaeota

Will removing these sequences, which are annotated only at the shallower levels, affect the final significance?

Thanks!

Hi!
In my opinion, there is no need to remove the sequences annotated only down to level D_1, since one can lose some insight into the results by doing so.
But I would be glad to hear other opinions as well.
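
Before deciding either way, it may help to count how many features would actually be dropped. The sketch below does that for the taxonomy strings quoted in this thread; assigned_depth is a hypothetical helper, and a real check would read the strings from the exported taxonomy file instead of a hard-coded list.

```python
from collections import Counter


def assigned_depth(taxon, sep=";"):
    """Number of ranks actually assigned (0 for 'Unassigned')."""
    ranks = [r.strip() for r in taxon.split(sep)]
    # Skip empty pieces and bare placeholders such as '__'.
    ranks = [r for r in ranks if r and not r.endswith("__")]
    if ranks and ranks[0] == "Unassigned":
        return 0
    return len(ranks)


# Taxonomy strings quoted in this thread.
taxa = [
    "D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Lactobacillales;D_4__Lactobacillaceae;",
    "D_0__Archaea;D_1__Nanoarchaeaeota",
    "Unassigned",
]

depths = Counter(assigned_depth(t) for t in taxa)
# SILVA-style labels start at D_0, so reaching D_4 means five assigned ranks.
n_short = sum(n for depth, n in depths.items() if depth < 5)
print(f"{n_short} of {len(taxa)} features stop before D_4")
```

If only a small share of features (or reads) stops short of D_4, removing them is unlikely to change much; if the share is large, keeping them as suggested above preserves that information.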
