# Kruskal-Wallis all-group vs. pairwise

I need an explanation. I calculated the Shannon, Simpson, ACE and Simpson evenness alpha diversity metrics. The Kruskal-Wallis (all groups) result for all of them was significant (p < 0.05), but the Kruskal-Wallis (pairwise) results for all of them were non-significant. I am confused about how to interpret my result: should I consider it significant or not?
I read another post, so I know that Kruskal-Wallis (all groups) runs the test across all groups at once and Kruskal-Wallis (pairwise) runs it between each pair of groups.

Hi @arar,

This is puzzling. What do the trends look like when you look at the distribution of your data?

My suspicion is that it has to do with statistical power and a balance between group size and the number of groups. Your H-values/p-values suggest to me that you may have a lot of groups with a few samples in each of them. If this is the case, there may be a reason the total ranks would be different but the individual ranks don’t quite shake out that way. Is there some way you can further group or nest your data to combine it to give you more samples in fewer groups? What you lose in resolution you may gain in power and be able to confirm the group result.
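To make the power argument concrete, here is a minimal, dependency-free sketch of how an all-group Kruskal-Wallis test can come out significant while every pairwise test does not. The data are made up (9 groups of 2 values with no overlap between groups), the helper names `chi2_sf` and `kruskal_wallis` are my own, and I am assuming the usual chi-square approximation with no tied values. In practice a library routine such as `scipy.stats.kruskal` would do this job.

```python
import math
from itertools import combinations


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    y = x / 2.0
    if df % 2 == 0:
        # Closed form for even df: exp(-y) * sum_{k < df/2} y^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: start from df = 1 and add one term per extra two degrees of freedom.
    q = math.erfc(math.sqrt(y))
    for k in range(1, (df - 1) // 2 + 1):
        q += math.exp(-y) * y ** (k - 0.5) / math.gamma(k + 0.5)
    return q


def kruskal_wallis(groups):
    """Return (H, p) using the chi-square approximation; assumes no ties."""
    pooled = sorted(v for g in groups for v in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n = len(pooled)
    rank_term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    return h, chi2_sf(h, len(groups) - 1)


# Hypothetical alpha-diversity values: 9 groups x 2 samples, no overlap at all.
groups = [[2 * i + 1.0, 2 * i + 2.0] for i in range(9)]

h_all, p_all = kruskal_wallis(groups)
pairwise_p = [kruskal_wallis([a, b])[1] for a, b in combinations(groups, 2)]
min_pairwise_p = min(pairwise_p)

print(f"all groups: H = {h_all:.2f}, p = {p_all:.3f}")
print(f"smallest pairwise p = {min_pairwise_p:.3f}")
```

With this toy data the all-group p-value is about 0.03, while no pairwise p-value gets below about 0.12, and any multiple-testing correction can only push the pairwise values higher.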

What do the trends look like when you look at the distribution of your data?
I can't understand this question, but if you mean the distribution of beta diversity, this is the graph:

Each sample represents a different sea location.

Yes, right, I have 9 groups, each containing 2 samples. I can group the samples based on a metadata column to get 3 groups, each containing 8 samples.
These are for Simpson evenness:

These are for Simpson richness:

But I still need to interpret the results without grouping (9 groups each contain 8 samples). Should I consider it significant or not?

Note: when I ran ANCOM, only 9 taxa out of 6000 were found to be differentially abundant between groups (9 groups, each containing 2 samples), but 100 taxa out of 6000 were found to be differentially abundant between groups (3 groups, each containing 8 samples).

Hi @arar,

Where did the extra 6 samples come from? 9 x 2 = 18; 3 x 8 = 24.
But, also, is this group of 3 a cruder grouping of your 9 sites? So, like, maybe I have sites in the Baltic at locations A-I, but I know that A, B, C are in the north off the Swedish coast, D, E, F are in the middle by the Finnish coast, and G, H, and I are by Denmark, so I’d group them into a superset of “Swedish”, “Finnish” and “Danish” or something equivalent.
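As a rough illustration of that kind of regrouping, here is a small sketch that derives a coarser grouping column from the site labels. The site letters, region names and sample IDs are the hypothetical ones from my Baltic example, not your data.

```python
from collections import Counter

# Hypothetical mapping from sampling site to a broader region.
site_to_region = {
    "A": "Swedish", "B": "Swedish", "C": "Swedish",
    "D": "Finnish", "E": "Finnish", "F": "Finnish",
    "G": "Danish", "H": "Danish", "I": "Danish",
}

# Hypothetical sample IDs: two replicates per site, e.g. "A-1" and "A-2".
sample_to_site = {
    f"{site}-{rep}": site for site in site_to_region for rep in (1, 2)
}

# Derive the coarser grouping for every sample.
sample_to_region = {
    sample_id: site_to_region[site] for sample_id, site in sample_to_site.items()
}

samples_per_site = Counter(sample_to_site.values())
samples_per_region = Counter(sample_to_region.values())
print(samples_per_region)
```

The per-site groups have 2 samples each, while the per-region groups have 6 each, which is the trade of resolution for power I was describing. The derived values would go into a new metadata column used for the group comparison.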

I was hoping you could share the boxplots of the alpha diversity to help understand what that data looks like.

It’s a third option: too underpowered to answer this question. A Kruskal-Wallis test typically wants at least 5 samples per group to be functional. You could try a different model, but this one isn’t appropriate.
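One way to see why small groups are a problem: for a two-group comparison, the most extreme result possible is complete separation of the ranks, and that puts a floor under the p-value. The sketch below computes that floor for a few group sizes. The function name `pairwise_p_floor` is my own, and it assumes equal group sizes, no ties, and the chi-square approximation with 1 degree of freedom.

```python
import math


def pairwise_p_floor(n):
    """Smallest p-value a two-group Kruskal-Wallis test can give with n per group.

    Assumes no ties and the chi-square approximation with 1 degree of freedom.
    """
    total = 2 * n
    low_rank_sum = sum(range(1, n + 1))           # ranks 1..n
    high_rank_sum = sum(range(n + 1, total + 1))  # ranks n+1..2n
    h_max = (12.0 / (total * (total + 1))
             * (low_rank_sum ** 2 + high_rank_sum ** 2) / n
             - 3 * (total + 1))
    return math.erfc(math.sqrt(h_max / 2.0))  # chi-square upper tail, df = 1


floors = {n: pairwise_p_floor(n) for n in range(2, 7)}
for n, p in floors.items():
    print(f"n = {n} per group: smallest possible p = {p:.4f}")
```

Under these assumptions the floor is about 0.12 with 2 samples per group, just under 0.05 with 3, and only drops below 0.01 at 5 per group, all before any correction for multiple comparisons.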

This would support a community-wide difference. (Do you also see this in beta diversity? Did you filter your samples to exclude low abundance ASVs that are present in only one sample?) But, again, probably not enough samples for comparison.

Sorry.
My samples were collected from 3 marine water bodies (the 3 groups). We collected 2 samples from 11 locations along the 3 water bodies; these water bodies are connected to each other.

These plots are for evenness:

These plots are for richness:

When I ran beta group significance, I didn't get significance between any pair of groups in the case of the 11 locations, but I got a significant result in the case of the 3 groups.

Hi @arar,

It looks like you have very clear trends in terms of your body of water, but that location is a lost cause. (If you squint, there are a couple, but it’s really just not appropriate for what you’re doing.) I would stick with your larger grouping of three bodies of water or treat location (if it is a distance) as a continuous covariate and maybe look at a regression.
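A bare-bones sketch of the "location as a continuous covariate" idea, in case it helps. Everything here is made up for illustration: `distance_km` is a hypothetical position for each of 11 locations, the Shannon values are synthetic, and `fit_line` is my own helper doing plain least squares. A real analysis would use a statistics package that also reports standard errors and p-values.

```python
# Made-up example: 11 locations, 2 replicates each.
# distance_km is a hypothetical position of each location along a transect.
distance_km = [10.0 * loc for loc in range(11) for _ in range(2)]

# Synthetic Shannon values: a linear trend plus a small replicate offset.
shannon = [
    3.0 + 0.02 * d + (0.1 if i % 2 == 0 else -0.1)
    for i, d in enumerate(distance_km)
]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; also returns R^2."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


slope, intercept, r_squared = fit_line(distance_km, shannon)
print(f"slope = {slope:.4f} per km, intercept = {intercept:.3f}, R^2 = {r_squared:.3f}")
```

The point of the design is that all 22 samples contribute to estimating one slope, instead of being split into 11 groups of 2 that each have almost no power on their own.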

Please, what do you mean by

Do you mean I need to find the differentially abundant taxa with q2-gneiss?

If I am not able to increase the number of replicates, does that mean I can’t say whether there is a significant difference or not?
Before the analysis we thought that there would not be a big difference between locations.

You see very clear differences in alpha diversity based on your sampling site (and probably in your PCoA). So, I think that’s clear and easy to interpret.

No, I mean that you can do a regression for alpha diversity with `q2-longitudinal` or your favorite regression software. (There’s an example of this in the Parkinson’s Mice tutorial. In particular, you may be able to leverage that and an adonis for your data.)

No, you’re not. But, I also can’t easily say off two timepoints whether or not there’s a significant difference between my microbiome and yours. We’d need a bigger sample size to make that comparison.

As I understand it, q2-longitudinal is for data that change over time, but that isn’t the case for my data: all samples were collected at the same time but from different locations.

So I can rely on the box plot to say there is a significant difference between the 11 locations?

Please look through the diversity section of the tutorial to see an example of an alpha diversity regression on categorical data.

No. You can say there’s a difference between the 3 big groups. You can say nothing about your 11 groups.

OK, thank you for your effort. I will look at the tutorial.

When I reviewed the PCoA graph again, I found that PCoA1 accounts for only 18.12% of the variability and PCoA2 for 11.93%.
Is that enough to consider that there is dissimilarity between samples?

Another question: is ANCOM a statistical test that also needs more than 2 replicates?

If you can explain 20+% of your variation with a PCoA, it’s not a bad day. Remember, the PCoA is a data compression technique that takes n dimensional data and turns it into something our two and three dimensionally challenged brains can handle. It’s linked to, but not entirely linked to, your permanova results which tell you about the statistical significance.
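For reference, the percentage printed on each PCoA axis is, as I understand the usual convention, that axis's eigenvalue as a share of the total, so the first two axes from your plot add up as shown below. The symbol $\lambda_k$ (the eigenvalue of axis $k$) is my own notation, and how negative eigenvalues are handled in the denominator depends on the software.

$$
\text{proportion explained}_k = \frac{\lambda_k}{\sum_j \lambda_j},
\qquad
18.12\% + 11.93\% = 30.05\%
$$

So the two-dimensional plot shows roughly 30% of the total variation, with the rest spread over the remaining axes.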

