I obtained the blank evaluate-taxonomy.qzv file attached. I tried including a sample ID to extract frequency data and got the following error: 'Cannot retrieve an element from an empty/null table'. I have viewed the input files in R using qiimeR. The taxonomy files and the relative frequency feature table are not empty, and they contain overlapping features. I also tried running the same command with the collapsed relative frequency table, and at various depths, but obtained similar output. I have applied evaluate-seqs and evaluate-composition to some of the same input files and obtained the expected outputs.
Current process: cutadapt -> demux -> DADA2 denoise -> feature-classifier sklearn -> decontam in R -> filter rep-seqs, DADA2 output, and taxonomy by the IDs of the contaminants found by decontam -> quality-control plugin.
I would appreciate any guidance on how to make this feature work!
The output is not actually blank; you are scoring 0 accuracy for all metrics and levels.
This is because the feature IDs do not align between your observed and expected taxonomies.
This method is most useful when you have a list of features with known taxonomies (e.g., simulated data). Your data are from a mock community, so you know which taxonomies to expect, but only on a community-wide basis, not for individual sequences.
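If it helps, here is a minimal sketch of how you could check that alignment yourself. It uses plain Python dictionaries as stand-ins for the two taxonomies; the feature IDs, taxonomy strings, and the `shared_feature_ids` helper are all made up for illustration and are not part of the plugin.

```python
# Hypothetical toy data: feature ID -> taxonomy string.
# Expected taxonomies are keyed by reference sequence names,
# observed taxonomies by denoiser-style feature IDs.
expected = {
    "seq1": "k__Bacteria; p__Firmicutes",
    "seq2": "k__Bacteria; p__Proteobacteria",
}
observed = {
    "a1b2c3": "k__Bacteria; p__Firmicutes",
    "d4e5f6": "k__Bacteria; p__Proteobacteria",
}


def shared_feature_ids(expected, observed):
    """Return the sorted feature IDs present in both taxonomies."""
    return sorted(set(expected) & set(observed))


shared = shared_feature_ids(expected, observed)
print(f"{len(shared)} of {len(observed)} observed feature IDs "
      "have an expected taxonomy")
```

Here the taxonomy strings agree but the IDs never do, so there is nothing to compare feature by feature.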
Thank you for such a fast reply. I have used evaluate-composition for the same data and it’s incredibly useful, especially for method validation work (so thank you to the creators).
For evaluate-taxonomy, the idea is that the feature IDs match? I think I still need some time/coffee or an example to get my head around the feature's purpose. Perhaps an addition to the tutorial at a future point.
Yes! It is measuring accuracy (as precision, recall, and F-measure) by averaging across many features. The idea is you have a set of input sequences (e.g., simulated sequences or sequences taken from known species). You classify them. Then you compare the expected vs. observed taxonomy for each of these. The expected and observed taxonomies must map to the same feature IDs to perform this comparison. The feature table is just provided if you want to weight these scores using some abundance information.
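As a rough illustration of that comparison, here is a simplified sketch. It is not the plugin's actual code: the `score_level` function, the toy taxonomies, and the way I count underclassified features as false negatives and mismatches as false positives are all assumptions made for the example.

```python
def split_taxonomy(taxon, level):
    """Split a ';'-delimited taxonomy string and keep the first `level` ranks."""
    ranks = [r.strip() for r in taxon.split(";") if r.strip()]
    return ranks[:level]


def score_level(expected, observed, level, weights=None):
    """Return (precision, recall, F-measure) at one taxonomic level.

    Only feature IDs present in both dicts are compared. `weights` is an
    optional dict of feature ID -> abundance (assumed input for illustration).
    """
    tp = fp = fn = 0.0
    for fid in set(expected) & set(observed):
        w = 1.0 if weights is None else weights.get(fid, 0.0)
        exp = split_taxonomy(expected[fid], level)
        obs = split_taxonomy(observed[fid], level)
        if obs == exp:
            tp += w      # correct at this level
        elif len(obs) < len(exp):
            fn += w      # underclassified: stopped above this level
        else:
            fp += w      # classified to this level, but wrongly
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical toy data sharing the same feature IDs.
expected = {
    "f1": "k__Bacteria; p__Firmicutes",
    "f2": "k__Bacteria; p__Proteobacteria",
    "f3": "k__Bacteria; p__Actinobacteria",
}
observed = {
    "f1": "k__Bacteria; p__Firmicutes",
    "f2": "k__Bacteria; p__Firmicutes",
    "f3": "k__Bacteria",
}

print(score_level(expected, observed, level=2))
# With no shared feature IDs, every score is zero.
print(score_level(expected, {"other_id": "k__Bacteria"}, level=2))
```

The second call shows why mismatched feature IDs produce zeros everywhere: no feature is ever compared.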
Yes, sorry, that would be useful (this method was added after that tutorial was written, and it looks like the tutorial was not updated). I will put it on the to-do list.