Hasti_A
(Hasti A.)
March 17, 2024, 6:51pm
1
Hello!
I have a general question regarding the output data from Kruskal-Wallis pairwise comparison tests on microbiome data. Is this stats test, and the associated p-values generated, 1-tailed or 2-tailed? I have one reviewer who is curious.
Thanks, in advance!
Hello Hasti_A,
Could you tell us the command you used? I can look up how the test is run based on that.
Thanks!
Hasti_A
(Hasti A.)
March 17, 2024, 10:11pm
3
Hi Colin,
Sure! I'm focusing on just Shannon group significance, but this is the overall command I used:
for metric in observed_features shannon pielou_e faith_pd
do
  qiime diversity alpha-group-significance \
    --i-alpha-diversity ${metric}_vector.qza \
    --m-metadata-file ../metadata_for_comb_abdomens.tsv \
    --o-visualization ${metric}_group_significance.qzv
done
Thanks!
Here's the QIIME 2 plugin code that builds that visualization:
for name, group in data.groupby(metadata_column.name):
    names.append('%s (n=%d)' % (name, len(group)))
    groups.append(list(group[metric_name]))
escaped_column = quote(column)
escaped_column = escaped_column.replace('/', '%2F')
filename = 'column-%s.jsonp' % escaped_column
filenames.append(filename)
# perform Kruskal-Wallis across all groups
kw_H_all, kw_p_all = scipy.stats.mstats.kruskalwallis(*groups)
# perform pairwise Kruskal-Wallis across all pairs of groups and
# correct for multiple comparisons
kw_H_pairwise = []
for i in range(len(names)):
    for j in range(i):
        try:
            H, p = scipy.stats.mstats.kruskalwallis(groups[i],
                                                    groups[j])
            kw_H_pairwise.append([names[j], names[i], H, p])
And here are the docs for the kruskalwallis function:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kruskal.html#scipy.stats.kruskal
The p-value for the test using the assumption that H has a chi square distribution. The p-value returned is the survival function of the chi square distribution evaluated at H.
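As discussed in this GitHub issue, it's one-sided, in the sense that the p-value is the upper tail of the chi-square distribution only. The test itself has no direction: H gets larger whichever group ranks higher, so for a pair of groups it answers the same question as a two-sided rank-sum test.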
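To make the "one-sided" point concrete for a reviewer, here is a minimal sketch with made-up diversity values (the numbers and group names are assumptions, not data from this thread). It uses scipy.stats.kruskal, the function the docs link above describes, rather than the mstats variant in the plugin. It checks that the p-value is the chi-square survival function evaluated at H, that swapping the groups changes nothing, and that for two groups it matches a two-sided Mann-Whitney U test using the normal approximation without continuity correction.

```python
import numpy as np
from scipy import stats

# Made-up Shannon values for two hypothetical groups (illustration only)
group_a = [3.1, 3.4, 2.9, 3.8, 3.3, 3.6]
group_b = [2.5, 2.8, 3.0, 2.2, 2.7, 2.6]

# Kruskal-Wallis on the pair of groups
H, p = stats.kruskal(group_a, group_b)

# The p-value is the upper tail of a chi-square distribution
# with (number of groups - 1) degrees of freedom, evaluated at H
df = 2 - 1
p_from_chi2 = stats.chi2.sf(H, df)

# H does not depend on which group ranks higher:
# swapping the groups gives the same statistic and p-value
H_swapped, p_swapped = stats.kruskal(group_b, group_a)

# For two groups, this agrees with a two-sided Mann-Whitney U test
# (normal approximation, no continuity correction)
U, p_mwu = stats.mannwhitneyu(
    group_a, group_b,
    alternative='two-sided',
    method='asymptotic',
    use_continuity=False,
)

print(f"H = {H:.4f}, p = {p:.6f}")
print(f"chi2.sf(H, 1) = {p_from_chi2:.6f}")
print(f"two-sided Mann-Whitney p = {p_mwu:.6f}")
```

So the chi-square tail is one-sided, but the comparison between groups is non-directional, which is usually what a reviewer means by "two-tailed".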
Bonus!
kw_H_pairwise['q-value'] = multipletests(
    kw_H_pairwise['p-value'], method='fdr_bh')[1]
This means when you have multiple tests, the false discovery rate is controlled with the Benjamini & Hochberg (1995) method!
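For anyone curious what the 'fdr_bh' adjustment does to the p-values, here is a rough NumPy sketch of the Benjamini-Hochberg step-up procedure. The helper name bh_qvalues and the example p-values are made up for illustration. The plugin itself calls multipletests (presumably the statsmodels function of that name), not this code.

```python
import numpy as np

def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (hypothetical helper, illustration only)."""
    p = np.asarray(pvalues, dtype=float)
    m = len(p)
    order = np.argsort(p)                      # indices that sort p ascending
    # scale each sorted p-value by m / rank
    scaled = p[order] * m / np.arange(1, m + 1)
    # enforce monotonicity: running minimum from the largest rank downwards
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(scaled, 0.0, 1.0)       # restore the original order
    return q

# Made-up pairwise p-values, e.g. from four group-vs-group comparisons
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = bh_qvalues(pvals)
print(qvals)
```

Each q-value is at least as large as its p-value, and the ordering of the comparisons is preserved.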
Hopefully this helps answer the ref's questions and gets the paper published!
Hasti_A
(Hasti A.)
March 17, 2024, 10:37pm
5
Fantastic! This is SO helpful, thank you so much. Appreciate your help.