Hello,
This is my first post here, so apologies if the format or information is lacking.
I am trying to filter out features from my table.qza file whose taxonomy could not be resolved down to a particular taxonomic level because no close match was found in the Greengenes database. Below are examples of what I want filtered:
k__Bacteria;;;;;__
k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;__
k__Bacteria;p__Firmicutes;;;;
k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;;
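To make the criterion concrete, here is a rough Python sketch of how I would flag these lineages. This is plain string handling, not QIIME 2 or phyloseq code. The function names (`named_ranks`, `is_incomplete`) are made up for illustration, and I am assuming seven ranks (kingdom through species) and that a bare prefix such as `g__` or `__` means "no name assigned". The last entry in `examples` is a fully resolved lineage I added only for contrast.

```python
def named_ranks(lineage):
    """Return the ranks in a ';'-delimited lineage that actually carry a name."""
    parts = [p.strip() for p in lineage.split(";")]
    # Drop empty fields and bare prefixes such as "__" or "g__" (assumption:
    # a field ending in "__" has a rank prefix but no name after it).
    return [p for p in parts if p and not p.endswith("__")]


def is_incomplete(lineage, expected_ranks=7):
    """True if fewer than `expected_ranks` ranks are named in the lineage."""
    return len(named_ranks(lineage)) < expected_ranks


examples = [
    "k__Bacteria;;;;;__",
    "k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;__",
    "k__Bacteria;p__Firmicutes;;;;",
    "k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;;",
    # Illustrative fully resolved lineage, added for contrast (not from my data).
    "k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Lactobacillus;s__reuteri",
]

# Lineages I would want removed from the table.
flagged = [lineage for lineage in examples if is_incomplete(lineage)]

for lineage in flagged:
    print(len(named_ranks(lineage)), "named ranks:", lineage)
```

Lowering `expected_ranks` (for example to 6) would keep features resolved to genus and flag only those that stop above that level.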
In previous posts, I saw that these ";;" notations are only added when bar plots are generated or some other operation is performed. The problem for me arises when converting a table.qza file to a biom file for import into phyloseq: the notations are then present in the biom file, and phyloseq seems to interpret the ";;" portions as unique taxonomic identifiers and assign them their own taxonomic levels. When I look at my imported taxonomy in R, I see 13 taxonomic ranks listed:
tax_table() Taxonomy Table: [ 949 taxa by 13 taxonomic ranks ]
rank_names (class)
[1] "Kingdom" "Phylum" "Class" "Order" "Family" "Genus" "Species" "Rank7" "Rank6" "Rank5" "Rank2"
[12] "Rank3" "Rank4"
What would be the easiest way to remove these assignments so I can move forward? I realize this ends up being a phyloseq issue, but I am hoping to resolve it through filtering steps in QIIME 2 before moving over there.
I have been following the tutorial posted here for adding taxonomy to my biom file, and I have looked through the filtering tutorials here to try to remove these, but to no avail.
Thanks for any help!
TNT