In Q2 I generated a DADA2 feature table showing the following overview summary numbers:
Total frequency: 19,781 (I assume this is the total number of sequences that passed the filter in DADA2)
The feature detail then goes on to break down the frequency of sequences within each feature. I pushed these through the feature-classifier and generated a taxonomy artifact / visualization.
Because the number of features was relatively low, I manually matched the feature IDs from the feature detail (DADA2 table output) to the taxonomy .qzv (exported into Excel) so that I could pair the observed frequencies with the taxa.
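For reference, the manual matching I did amounts to a join on Feature ID. Below is a minimal pandas sketch of that idea, not something I actually ran: the feature IDs, frequencies, taxon strings, and the function name pair_frequencies_with_taxa are all made-up placeholders, and it assumes both exports have already been loaded as tables with a shared "Feature ID" column.

```python
import pandas as pd

# Hypothetical stand-in for the per-feature frequencies from the DADA2 table.
frequencies = pd.DataFrame(
    {
        "Feature ID": ["feat_a", "feat_b", "feat_c"],
        "Frequency": [120, 45, 8],
    }
)

# Hypothetical stand-in for the taxonomy exported from the .qzv.
# Row order deliberately differs from the frequency table.
taxonomy = pd.DataFrame(
    {
        "Feature ID": ["feat_b", "feat_a", "feat_c"],
        "Taxon": [
            "k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; "
            "f__Streptococcaceae; g__Streptococcus",
            "k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; "
            "f__Streptococcaceae; g__Streptococcus",
            "k__Bacteria",
        ],
    }
)


def pair_frequencies_with_taxa(freq, tax):
    """Attach a taxon to every feature by joining on the shared Feature ID."""
    # validate="one_to_one" raises if a Feature ID is duplicated in either table.
    return freq.merge(tax, on="Feature ID", how="left", validate="one_to_one")


paired = pair_frequencies_with_taxa(frequencies, taxonomy)
print(paired)
```

A left join keeps every feature from the frequency table, so a feature with no taxonomy row would show up with a missing taxon instead of silently disappearing.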
Since both files share the same Feature ID designations, it seemed reasonable to expect a way to combine them so I wouldn't have to do it manually - especially for more complex samples. I found a thread that discussed how to do this using the qiime taxa collapse command, which I ran with --p-level 6 (in order to collapse to genus level). I then extracted the collapsed table and looked at the .biom file it contained, which displayed the following:
My question is: why were some genus-level classifications not listed in the collapsed table's .biom file? And why is the feature classified only as k__Bacteria (8) included? The taxa that are listed show the correct frequencies. The bulk of the sequences fell into 6 features, all of which were classified down to g__Streptococcus, yet they were not represented in the collapsed table.
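To make clear what I expected the collapse to produce, here is a rough pandas sketch of my understanding: truncate each taxonomy string to its first 6 ranks and sum the frequencies of features that end up with the same string. This is only my mental model, not the plugin's actual implementation, and the feature IDs, frequencies, and the function name collapse_to_level are invented for illustration.

```python
import pandas as pd

# Hypothetical features already paired with their taxonomy strings.
STREP = (
    "k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; "
    "f__Streptococcaceae; g__Streptococcus"
)
features = pd.DataFrame(
    {
        "Feature ID": ["feat_a", "feat_b", "feat_c"],
        "Frequency": [120, 45, 8],
        "Taxon": [STREP, STREP, "k__Bacteria"],
    }
)


def collapse_to_level(table, level):
    """Sum frequencies over taxonomy strings truncated to `level` ranks."""

    def truncate(taxon):
        ranks = [rank.strip() for rank in taxon.split(";")]
        # Features with fewer ranks than `level` simply keep what they have
        # in this sketch; the real command may label them differently.
        return ";".join(ranks[:level])

    keyed = table.assign(Collapsed=table["Taxon"].map(truncate))
    return keyed.groupby("Collapsed")["Frequency"].sum()


collapsed = collapse_to_level(features, level=6)
print(collapsed)
```

Under this model I would expect one row per genus-level string, with the Streptococcus features summed together, which is why their absence from the collapsed table surprised me.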
Any help would be appreciated!