Hi,
I’ve just run an L6 ANCOM on my data and got the following result:
Species A
Percentile   Group 1   Group 2   Group 3
0            49        1         303
25           128       1         1200
50           265       1         1615
75           534       63        2470
100          1598      3015      3460
I know that the first numbers are the quartiles. Which one should I use to compare abundance between the 3 groups? The median?
And are the abundance values ratios? So is species A 1615 times more abundant in group 3 than in group 2 (if we look at median values)? Or is it more abundant in group 2 because, when we consider 100% of samples, it has the highest number?
Thanks!
mortonjt
(Jamie Morton)
October 25, 2019, 2:15am
Bethanie:
I know that the first numbers are the quartiles. Which one should I use to compare abundance between the 3 groups? The median?
And are the abundance values ratios? So is species A 1615 times more abundant in group 3 than in group 2 (if we look at median values)? Or is it more abundant in group 2 because, when we consider 100% of samples, it has the highest number?
Thanks!
Yes, these are quantiles of abundances, but I would be cautious about using them to make statements about increases or decreases in microbe abundances.
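As a rough illustration of what those numbers are, here is a minimal sketch of how such a percentile summary could be computed with NumPy. The per-sample counts are hypothetical, chosen only so that the summary matches the Group 2 values in the question, and the pseudocount of 1 is an assumption about how zeros were handled before ANCOM was run.

```python
import numpy as np

# Hypothetical per-sample counts of one taxon within a single group.
# These are NOT the poster's data; they are chosen so the summary
# reproduces the Group 2 values quoted in the question.
counts = np.array([0, 0, 0, 62, 3014])

# Assumption: a pseudocount of 1 was added so that zeros become 1.
with_pseudocount = counts + 1

levels = [0, 25, 50, 75, 100]
percentiles = np.percentile(with_pseudocount, levels)

# Each value is an abundance at that percentile of the group's samples,
# not a ratio between groups.
for level, value in zip(levels, percentiles):
    print(f"{level:>3} {value:g}")
```

In this sketch the median is 1 only because most samples have a zero count, so dividing one group's median by another's mostly reflects how many samples lack the taxon rather than a fold change.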
See the following papers for some of the caveats behind this:
https://www.nature.com/articles/s41467-019-10656-5
Advances in sequencing technologies have enabled novel insights into microbial niche differentiation, from analyzing environmental samples to understanding human diseases and informing dietary studies. However, identifying the microbial taxa that...
GB Gloor, JM Macklaim, V Pawlowsky-Glahn and JJ Egozcue, Frontiers in Microbiology, 2017
Datasets collected by high-throughput sequencing (HTS) of 16S rRNA gene amplimers, metagenomes or metatranscriptomes are commonplace and being used to study human disease states, ecological differences between sites, and the built environment. There is increasing awareness that microbiome datasets generated by HTS are compositional because they have an arbitrary total imposed by the instrument. However, many investigators are either unaware of this or assume specific properties of the compositional data. The purpose of this review is to alert investigators to the dangers inherent in ignoring the compositional nature of the data, and point out that HTS datasets derived from microbiome studies can and should be treated as compositions at all stages of analysis. We briefly introduce compositional data, illustrate the pathologies that occur when compositional data are analyzed inappropriately, and finally give guidance and point to resources and examples for the analysis of microbiome datasets using compositional data analysis.
S Hawinkel, F Mattiello, L Bijnens and O Thas, Briefings in Bioinformatics, 2019 18 01
High-throughput sequencing technologies allow easy characterization of the human microbiome, but the statistical methods to analyze microbiome data are still in their infancy. Differential abundance methods aim at detecting associations between the abundances of bacterial species and subject grouping factors. The results of such methods are important to identify the microbiome as a prognostic or diagnostic biomarker or to demonstrate efficacy of prodrug or antibiotic drugs. Because of a lack of benchmarking studies in the microbiome field, no consensus exists on the performance of the statistical methods. We have compared a large number of popular methods through extensive parametric and nonparametric simulation as well as real data shuffling algorithms. The results are consistent over the different approaches and all point to an alarming excess of false discoveries. This raises great doubts about the reliability of discoveries in past studies and imperils reproducibility of microbiome experiments. To further improve method benchmarking, we introduce a new simulation tool that allows to generate correlated count data following any univariate count distribution; the correlation structure may be inferred from real data. Most simulation studies discard the correlation between species, but our results indicate that this correlation can negatively affect the performance of statistical methods.
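To make the compositional caveat concrete, here is a small sketch with made-up numbers (three hypothetical taxa, two hypothetical samples). It assumes sequencing only recovers proportions, and it uses taxon B as an arbitrarily chosen reference for a log-ratio; none of these values come from the thread.

```python
import numpy as np

# Hypothetical absolute abundances for taxa A, B, C in two samples.
# Taxon A is unchanged; only taxon C grows tenfold in sample 2.
absolute = np.array([
    [100.0, 100.0, 100.0],   # sample 1: A, B, C
    [100.0, 100.0, 1000.0],  # sample 2: A, B, C
])

# Sequencing imposes an arbitrary total, so only proportions are observed.
proportions = absolute / absolute.sum(axis=1, keepdims=True)

# Taxon A appears to drop fourfold although its absolute abundance is constant.
apparent_fold_change_a = proportions[1, 0] / proportions[0, 0]

# A log-ratio against a reference taxon (B, an arbitrary choice here)
# cancels the unknown total and is identical in both samples.
log_ratio_a_vs_b = np.log(proportions[:, 0] / proportions[:, 1])

print("apparent fold change of A:", apparent_fold_change_a)
print("log(A/B) per sample:", log_ratio_a_vs_b)
```

The point of the sketch is that a change in observed abundance of one taxon can be driven entirely by another taxon, which is why comparing raw percentiles between groups does not by itself show which group has more of species A.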