After I exported the feature table and the taxonomy, I converted them into an Excel-readable format and noticed that they have different numbers of rows. The feature table was generated via DADA2 and has 16610 rows, while the taxonomy file has 16691 rows. That is an 81-row gap, and it really confused me. Now I am concerned that something went wrong during my analysis, which scares me…
Great question! I think some more detective work is needed.
When you compare the names of the features in the two tables, are most of the names the same? What are the names of the specific 81 features missing from the taxonomy table?
I wonder if this could be an issue with features that were not assigned a taxonomy. Can you find logs from your taxonomy assignment step that might show which features could not be classified? If we can match up these names against the 81 missing features, we could confirm this is what happened!
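If it helps with the comparison, here is a minimal sketch in Python. It assumes both exports are tab-separated text files with the feature ID in the first column, and that header or comment rows start with `#` or are literally `Feature ID`. The function names and the file names in the usage note are hypothetical, so adjust them to your own files.

```python
import csv


def read_ids(lines, header_names=("Feature ID", "id")):
    """Collect the first column of a tab-separated export as a set of IDs.

    Assumption: comment/header rows start with '#' (for example '#OTU ID')
    or have a first cell listed in header_names.
    """
    ids = set()
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:
            continue
        first = row[0].strip()
        if not first or first.startswith("#") or first in header_names:
            continue
        ids.add(first)
    return ids


def compare_ids(table_ids, taxonomy_ids):
    """Report which IDs are shared and which appear in only one file."""
    return {
        "shared": table_ids & taxonomy_ids,
        "only_in_table": table_ids - taxonomy_ids,
        "only_in_taxonomy": taxonomy_ids - table_ids,
    }


def compare_files(table_path, taxonomy_path):
    """Convenience wrapper: read both files from disk and compare their IDs."""
    with open(table_path, newline="") as table_file:
        table_ids = read_ids(table_file)
    with open(taxonomy_path, newline="") as taxonomy_file:
        taxonomy_ids = read_ids(taxonomy_file)
    return compare_ids(table_ids, taxonomy_ids)
```

For example, `compare_files("feature-table.tsv", "taxonomy.tsv")` (hypothetical paths) returns three sets. Printing the length of each one shows whether the extra rows sit in the table or in the taxonomy, and the IDs themselves can then be matched against the classifier logs.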
Hey @colinbrislawn, I think that @ziyan has more features in their FeatureData[Taxonomy] output than in their FeatureTable[Frequency], the opposite of what you are suggesting above. @ziyan, can you confirm that? Also, how many features are present in the FeatureData[Sequence] output used to generate the FeatureData[Taxonomy]? Would you be able to share some of these files with us?
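To count the features in the sequences, one option is to export the FeatureData[Sequence] artifact and count the FASTA headers. This is only a sketch, assuming the export is a plain FASTA file where each record starts with `>` followed by the feature ID; the function name and the file name in the usage note are hypothetical.

```python
def fasta_ids(lines):
    """Return the set of record IDs found on FASTA header lines.

    Assumption: the ID is the first whitespace-delimited token after '>'.
    """
    ids = set()
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.add(parts[0])
    return ids
```

Opening the exported file (for example `open("dna-sequences.fasta")`, path assumed) and passing it to `fasta_ids` gives a set whose length is the number of sequences. That number can be compared directly with the row counts of the other two files.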
Yes, @thermokarst, that is correct. I have 16092 rows in the taxonomy file (sorry for the wrong numbers previously) and 16011 rows in the exported feature table. Please see the pictures for a clearer version.