Difference between observed features and OTUs from the taxa bar plot csv file

Hello,

I have a question about the difference between the 'observed features' generated by alpha diversity (core-metrics-phylogenetic) and the OTUs I get after exporting the csv file from the taxa bar plot analysis.

I have found on the forum that both are referred to as OTUs, yet I can see that for a single sample I get different values.
Could you please explain it to me?

Kind regards,
Joanna

Hello,
Could you explain in more detail what exactly you are comparing?
I am asking because the "observed features" metric tells you how many features are observed in a given sample, whereas the csv files from the taxa bar plot contain the counts of each feature per sample. Did you count all non-zero values?
Also, core-metrics rarefies the data to the sampling depth you provide, so the number of observed features in the rarefied data (core-metrics) can differ from what you see in the taxa bar plot table (no rarefaction).

UPD. In the taxa bar plot, features are collapsed to taxa, whereas in core-metrics the features are ASVs/OTUs unless you collapsed them before the calculation.
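
To make "count all non-zero values" concrete, here is a minimal Python sketch. It is only an illustration: the CSV text is a made-up toy table, and the layout (one row per sample, an `index` column with the sample IDs, one column per taxon, plus metadata columns such as the hypothetical `body-site`) is an assumption that you should check against your own exported file.

```python
import csv
import io

# Toy stand-in for an exported taxa bar plot CSV (hypothetical values).
# Assumed layout: one row per sample, one column per taxon, plus metadata columns.
TOY_CSV = """index,k__Bacteria;p__Firmicutes,k__Bacteria;p__Bacteroidetes,k__Bacteria;p__Proteobacteria,body-site
sample-1,120,0,35,gut
sample-2,0,0,410,tongue
sample-3,15,22,8,gut
"""

# Columns that are not taxa (assumption: adjust to your own metadata columns).
NON_TAXA_COLUMNS = {"index", "body-site"}


def taxa_per_sample(csv_text, non_taxa_columns):
    """Return {sample_id: number of taxa with a non-zero count}."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        result[row["index"]] = sum(
            1
            for column, value in row.items()
            if column not in non_taxa_columns and float(value) > 0
        )
    return result


counts = taxa_per_sample(TOY_CSV, NON_TAXA_COLUMNS)
print(counts)
```

Counting the listed taxa without excluding zeros or metadata columns would give the same number for every sample, so it is worth checking which of the two you did.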

Best,
Timur

Hello,

Thank you for the answer. It helped, but only partially. What I wanted to compare is the value from the observed features table with the number of taxa in the taxa table. Here I counted the species that were listed in the table.

Sample ID   observed_features vector   lvl7-taxa csv*
1           1126                       384
2           1567                       493
3           922                        325
4           643                        261
5           999                        353

*In total there were 911 positions.
The alpha rarefaction was based on the 'Moving Pictures' tutorial, yet I don't understand the connection, as there is no link in the script between the rarefied files and building the observed features vector or the taxa bar plot. Could you please explain?

Why is there such a difference? Which one is the right number to report as the number of microorganisms in the sample?

Kind regards,
Joanna

Hello,
I think you should not compare these two columns like that.
I will try to explain.

  1. Each feature in the feature table (ASV/OTU) is assigned to exactly one taxonomy, but numerous features may be assigned to the same taxon (all thumbs are fingers, but not all fingers are thumbs). This is the main reason for the differences.
  2. The table in the taxa bar plot is not rarefied (unless you used a rarefied table as input). In core-metrics the data is rarefied, so the absolute counts are not exactly the same as they would be in the taxa bar plot.
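
Both points can be seen on a toy example. The sketch below is plain Python, not QIIME 2 code: the feature IDs, counts, taxonomy, and the helper names `observed`, `collapse`, and `rarefy` are all made up for illustration, and `rarefy` is a simple random subsampling without replacement that only stands in for what core-metrics does.

```python
import random

# Hypothetical feature table for one sample: feature ID -> read count.
feature_counts = {"asv1": 50, "asv2": 30, "asv3": 1, "asv4": 0, "asv5": 19}

# Hypothetical taxonomy: several features share one taxon.
taxonomy = {
    "asv1": "g__Bacteroides",
    "asv2": "g__Bacteroides",
    "asv3": "g__Prevotella",
    "asv4": "g__Prevotella",
    "asv5": "g__Blautia",
}


def observed(counts):
    """Number of entries with a non-zero count."""
    return sum(1 for value in counts.values() if value > 0)


def collapse(counts, taxonomy):
    """Sum feature counts within each taxon (point 1)."""
    collapsed = {}
    for feature, value in counts.items():
        taxon = taxonomy[feature]
        collapsed[taxon] = collapsed.get(taxon, 0) + value
    return collapsed


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement (point 2)."""
    reads = [feature for feature, value in counts.items() for _ in range(value)]
    subsample = random.Random(seed).sample(reads, depth)
    rarefied = {feature: 0 for feature in counts}
    for feature in subsample:
        rarefied[feature] += 1
    return rarefied


n_features = observed(feature_counts)
n_taxa = observed(collapse(feature_counts, taxonomy))
rarefied = rarefy(feature_counts, depth=20)
n_features_rarefied = observed(rarefied)

print("observed features:", n_features)
print("observed taxa after collapsing:", n_taxa)
print("observed features after rarefying to 20 reads:", n_features_rarefied)
```

Here four features collapse into three taxa, and rarefying can only keep or lower the number of observed features, because rare features such as `asv3` may drop out of the subsample.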

You can report either of these numbers, just make sure that you name it correctly: ASVs or taxa.

Hello,

If I understand it correctly, it goes like this:

The taxa bar plot csv file contains the list of taxa and how many times each taxon appears in a single sample.

The observed features vector from core-metrics contains the number of OTUs (= features) that each sample contains.

This looks reasonable to me.
And if it is true, many features may be assigned to one taxon, which is what matters for my question.

Kind regards,
Joanna

That's correct!

Best,
Timur