Can anyone help me interpret Kruskal-Wallis (pairwise) P and H values?

Aqleem12 · June 11, 2018, 4:36pm

Dear All,

Thanks for your support. I am omitting some of the information from my samples. When I ran the Kruskal-Wallis (pairwise) tests in the alpha diversity box plots section, I was surprised by the P and H values. These values are higher than the recommended P and H values. How should I interpret my results? Can anyone help me? I am attaching a file with some of the information omitted. Waiting for your assistance.

Mehrbod_Estaki · June 11, 2018, 7:37pm

Hi @Aqleem12,
Can you explain what you mean by:

The values you see there are the output of the Kruskal-Wallis test. H is the test statistic that is used to calculate the p-values, and there is no 'recommended value' for it.

The q-value is the adjusted p-value. The p- and q-values can be anywhere from 0 to 1, and if you accept the conventional 0.05 cutoff, the results from your study would indicate that there are no significant differences between any of your groups.
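
To make the relationship between p- and q-values concrete, here is a minimal Python sketch of a Benjamini-Hochberg adjustment. It assumes the q-values in the visualization are Benjamini-Hochberg FDR-adjusted p-values, which is not stated in this thread. The function name and the p-values below are hypothetical, made up purely for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values).

    Hypothetical helper for illustration only; not taken from QIIME 2.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Made-up p-values from four pairwise comparisons.
example_p = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(example_p))
```

Under this adjustment a q-value is never smaller than its p-value and never larger than 1, so a comparison with a large p-value also has a large q-value.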

Aqleem12 · June 12, 2018, 6:30pm

Dear Sir,

The values are higher than the cut-off value. I used one replicate per sample. Maybe this is the reason.

Mehrbod_Estaki · June 12, 2018, 9:17pm

Hi @Aqleem12,

I still don't understand what you mean by "higher than cut-off value". As I mentioned above, these values are simply the outcome of the Kruskal-Wallis test; they are not values or parameters you set manually. The p- and q-values are within the 0-1 range, so nothing fishy is going on with your data. The test is simply telling you that there are no significant differences between your groups.
You'll have to explain your question more clearly.

1 replicate per sample is what most people use. What matters more is how many samples you have in each category/group that is being compared.
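
As a rough illustration of why H and p are outputs rather than settings, the sketch below computes both from the ranks of two small groups. The evenness values are invented and the function name is hypothetical. It assumes there are no tied values and uses the usual chi-square approximation with 1 degree of freedom, which may differ in detail from what QIIME 2 does internally.

```python
import math


def kruskal_wallis_two_groups(group_a, group_b):
    """Return (H, p) for two groups, assuming no tied values.

    Hypothetical helper for illustration only.
    """
    pooled = sorted(group_a + group_b)
    # Rank 1 goes to the smallest pooled value (valid only without ties).
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    n_total = len(pooled)

    weighted_rank_sums = 0.0
    for group in (group_a, group_b):
        rank_sum = sum(rank[value] for value in group)
        weighted_rank_sums += rank_sum ** 2 / len(group)

    h = 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)
    h = max(h, 0.0)  # guard against tiny negative rounding error
    # Chi-square survival function for 1 degree of freedom.
    p = math.erfc(math.sqrt(h / 2.0))
    return h, p


# Invented evenness values for two groups of four samples.
cultivated = [0.62, 0.71, 0.55, 0.80]
wild = [0.66, 0.59, 0.77, 0.85]
h, p = kruskal_wallis_two_groups(cultivated, wild)
print(f"H = {h:.3f}, p = {p:.3f}")  # H = 0.333, p = 0.564
```

Both numbers fall out of how the pooled values rank against each other; there is nothing to tune, and a small H simply means the rank sums of the groups are similar.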

Aqleem12 · June 12, 2018, 11:16pm

Dear Sir,

I applied the evenness group significance test (Kruskal-Wallis, pairwise) from the alpha diversity box plots section. In my case there were two groups, i.e. Group 1 and Group 2. In other words, there were two main subjects: Subject I was a cultivated apple and Subject II was a wild apple. There were four different samples per subject: one from the root, one from the shoot, and two from other parts of the tree, i.e.
Subject I
D1 = root
D2 = shoot
D3 = flower
D4 = fruit
Subject II
R1 = root
R2 = shoot
R3 = flower
R4 = fruits

So there is one sample from each part of the tree. We can say one replicate from each part of the tree.

The command that I used for the above analysis is:
qiime diversity alpha-group-significance \
  --i-alpha-diversity core-metrics-results/evenness_vector.qza \
  --m-metadata-file sample-metadata.tsv \
  --o-visualization core-metrics-results/evenness-group-significance.qzv

For better understanding, a screenshot of the metadata file is attached.

Yours Sincerely
Aqleem Abbas

Mehrbod_Estaki · June 13, 2018, 12:02am

Hi @Aqleem12,

I'm afraid you've failed to explain your question regarding the cut-off values again.

But here's a thought about your experimental design in general.
I don't know what the main question of your experiment is but given your metadata file there are really only 2 factors you can look at:

Factor 1: Bodysite - Based on your original attached photo this is what you have selected from your drop-down menu. Here you are comparing 4 groups (root, shoot, flower, and fruits) that have 2 replicates each, one from Subject 1 and one from Subject 2. To be honest I don't even know how or why a Kruskal-Wallis test would be performed on an n=2. I would not rely on any stat test that only has 2 replicates. Even if you get p-values they are rather meaningless; in my opinion you can get just as much information by visually looking at the boxplots. Regardless, your p-values for this test are well above 0.05, which would indicate there is no difference between the groups anyway.

Factor 2: Subject - If you just want to compare your wild vs cultivated tree then you might select this category. When you select Subject from the drop-down menu you are combining all 4 sites of Subject 1 and comparing them to all 4 sites of Subject 2. So 2 groups x n=4. This is still a very low sample size and, in my opinion, not all that reliable from a stats perspective either. There is also the major confounding factor that your replicates are not true replicates, since they are from different sites which are very likely different in community composition.

Unless you have some more samples to include in your analyses I'm not really sure there is any way to run meaningful stats on the data you currently have. This is more of an observational/exploratory pilot data set.
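
To put rough numbers on the sample-size concern, this sketch computes the smallest p-value a pairwise Kruskal-Wallis comparison can produce when two groups of n samples each are perfectly separated, i.e. every value in one group is lower than every value in the other. It assumes no ties and the chi-square approximation with 1 degree of freedom; the function name is hypothetical, and QIIME 2 may differ in detail.

```python
import math


def best_case_pairwise_p(n):
    """Return (H, p) for two perfectly separated groups of size n.

    Hypothetical helper: one group holds ranks 1..n, the other n+1..2n.
    """
    total = 2 * n
    low_rank_sum = sum(range(1, n + 1))
    high_rank_sum = sum(range(n + 1, total + 1))
    h = (
        12.0 / (total * (total + 1))
        * (low_rank_sum ** 2 / n + high_rank_sum ** 2 / n)
        - 3 * (total + 1)
    )
    # Chi-square survival function for 1 degree of freedom.
    p = math.erfc(math.sqrt(max(h, 0.0) / 2.0))
    return h, p


for n in (2, 3, 4):
    h, p = best_case_pairwise_p(n)
    print(f"n={n}: H={h:.3f}, best-case p={p:.4f}")
# n=2: H=2.400, best-case p=0.1213
# n=3: H=3.857, best-case p=0.0495
# n=4: H=5.333, best-case p=0.0209
```

Under these assumptions, a pairwise comparison with 2 samples per group cannot drop below p of about 0.12 no matter how different the groups are, and with 4 per group the data must be perfectly separated to reach about 0.02 before any multiple-testing correction.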

Aqleem12 · June 14, 2018, 11:46am

Sir,

Thanks. Your two factor options are valid. I want to analyze the microbiome samples from two trees, one wild and one cultivated, at four body sites and four time points, as Caporaso et al. (2011) did for two humans at four sites and 5 points. My hypotheses are: there is a difference between the two trees, and there is a difference between the 4 groups (root, shoot, flower and fruits). Did I perhaps make a mistake in the metadata file? Please look at it. There should be statistical significance, as the two trees are growing in different environments, with different nutrients and different cultural practices. Waiting for your response!

Thanks

Yours Sincerely
Aqleem Abbas

Nicholas_Bokulich · June 14, 2018, 6:35pm

Evidently there is not a statistically significant difference. That is what @Mehrbod_Estaki indicated here:

Your P values are all very high, indicating no significant difference between groups. Your low sample size is going to greatly decrease statistical power, but with such high P values I surmise that even an increased sample size would not show a significant difference.

I hope that helps clarify this matter.