I am struggling with the significance tests for my study. I sampled 24 orchards (5 samples per orchard), of which 12 are untreated and 12 are treated. I want to know whether the treatment has an impact on alpha diversity. I therefore tested the significance of my categorical variables with the Kruskal-Wallis test, but I don't understand what to use for the numeric variables. So:
1. How do I test the significance (or rather the correlation) of alpha diversity and, e.g., nitrogen content in the soil?
First I thought I could use qiime diversity alpha-correlation (with --p-method pearson if the diversity index is normally distributed and spearman if it is not). By the way, do both variables have to be normally distributed to use Pearson?
But then I ran Pearson/Spearman in R, and it gives me only the correlation coefficient. So is the "Test statistic value" in qiime the same as the Spearman correlation coefficient? And what does the p-value in qiime mean then? I also tried the Wilcoxon rank-sum test in R, but the resulting p-value is completely different from the one for the Spearman test in qiime, and the Wilcoxon test actually compares the alpha diversity index between categories. Sorry, I did not find anything about alpha diversity and numeric variables in this forum that could help me.
Is there maybe something like adonis for alpha diversity outside of qiime2 (as far as I read in this forum there is nothing in qiime)?
2. Is it right that I can't use alpha diversity longitudinal because it is only for time series or paired samples?
I guess I can't use ANOVA because nothing is normally distributed...
Sorry, I am mixing up all of these tests; I hope someone can help me...
Based on your description, your experiment is a nested design, i.e., samples are nested within orchards and orchards are nested within experimental treatments. Therefore, statistical models that assume samples are independent, such as ANOVA and the Kruskal-Wallis test, shouldn't be used for modeling your data. You need to model your data using linear mixed effects models, treating experimental treatment and nitrogen content as fixed effects, and orchard as a random effect. You can do significance testing for both categorical and numerical variables in linear mixed effects models.
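As a rough illustration of why the orchard, not the individual sample, is the unit of replication for the treatment effect, here is a minimal Python sketch. It does not fit a mixed model (that needs a statistics package). Instead it computes the classical nested-ANOVA F statistic for a balanced design, which tests treatment against the variation between orchards rather than between samples. The data are simulated, nitrogen content is left out, and all names and effect sizes (`simulate`, `nested_f`, 0.3, 0.5) are made up for the example.

```python
import random
from statistics import mean

def simulate(n_orchards=24, n_samples=5, seed=1):
    """Simulated data: {orchard_id: (treatment, [alpha diversity values])}."""
    rng = random.Random(seed)
    data = {}
    for i in range(n_orchards):
        treatment = "treated" if i < n_orchards // 2 else "control"
        orchard_shift = rng.gauss(0, 0.5)  # shared by all samples of one orchard
        effect = 0.3 if treatment == "treated" else 0.0
        values = [4.0 + effect + orchard_shift + rng.gauss(0, 0.3)
                  for _ in range(n_samples)]
        data[f"orchard_{i:02d}"] = (treatment, values)
    return data

def nested_f(data):
    """F statistic for treatment, orchards nested in treatment (balanced design only)."""
    n_samples = len(next(iter(data.values()))[1])
    # Collect the orchard means separately for each treatment.
    orchard_means = {}
    for treatment, values in data.values():
        orchard_means.setdefault(treatment, []).append(mean(values))
    grand = mean(m for ms in orchard_means.values() for m in ms)
    # Treatment sum of squares: treatment means around the grand mean.
    ss_treat = sum(n_samples * len(ms) * (mean(ms) - grand) ** 2
                   for ms in orchard_means.values())
    # Error term for treatment: orchard means around their treatment mean.
    ss_orch = sum(n_samples * (m - mean(ms)) ** 2
                  for ms in orchard_means.values() for m in ms)
    df_treat = len(orchard_means) - 1
    df_orch = sum(len(ms) - 1 for ms in orchard_means.values())
    return (ss_treat / df_treat) / (ss_orch / df_orch), df_treat, df_orch

f_stat, df1, df2 = nested_f(simulate())
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

Note the denominator degrees of freedom: 22 (24 orchards minus 2 treatments), not the 118 that a test treating all 120 samples as independent would use.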
Resources on linear mixed effects models:
No. It's the model residuals that should be normally distributed, not the variables themselves. See the paper by Ernst and Albers (2017) on misconceptions about the assumptions behind the standard linear regression model.
Yes, qiime2-longitudinal was specially designed for dealing with longitudinal data.
The variables do not need to be normally distributed. It's the residuals that should be normally distributed, and even that is actually not an important assumption. See the paper by Ernst and Albers (2017).
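To make the distinction concrete, here is a small Python sketch with simulated data: the predictor is strongly right-skewed, but what one would inspect is the residuals of the fitted line, not the raw variables. The variable names (`nitrogen`, `shannon`) and all numbers are invented for illustration only.

```python
import random
from statistics import mean

def skewness(values):
    """Skewness as the third standardized moment (0 for symmetric data)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def ols_residuals(x, y):
    """Residuals of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

rng = random.Random(42)
nitrogen = [rng.expovariate(1.0) for _ in range(200)]             # right-skewed predictor
shannon = [3.0 + 0.4 * x + rng.gauss(0, 0.2) for x in nitrogen]   # symmetric noise around a line
residuals = ols_residuals(nitrogen, shannon)

print(f"skewness of nitrogen:  {skewness(nitrogen):.2f}")
print(f"skewness of residuals: {skewness(residuals):.2f}")
```

In practice a histogram or Q-Q plot of the residuals is the usual check; the skewness number here is only a compact stand-in.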
The spread of residuals is not that different among the experimental groups. If you're concerned about heteroskedasticity, you can run a heteroskedasticity-robust F test or use robust standard errors (e.g., HC4) in R.
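For anyone curious what a robust standard error does under the hood, here is a plain-Python sketch of the HC4 estimator for the slope of a simple regression (one predictor plus intercept). Each squared residual is weighted by 1 / (1 - h_i)^d_i, where h_i is the leverage and d_i = min(4, h_i / mean leverage). In R you would normally not code this by hand; the `sandwich` package offers `vcovHC(model, type = "HC4")`. The function name `hc4_slope_se` and the orchard-level values below are hypothetical.

```python
from statistics import mean

def hc4_slope_se(x, y):
    """Slope and its HC4 robust standard error for y = intercept + slope * x."""
    n, p = len(x), 2  # p = number of coefficients (intercept + slope)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    variance = 0.0
    for xi, yi in zip(x, y):
        resid = yi - (intercept + slope * xi)
        leverage = 1 / n + (xi - x_bar) ** 2 / sxx
        delta = min(4.0, n * leverage / p)  # leverage relative to its mean, p / n
        weight = 1 / (1 - leverage) ** delta
        # Sandwich formula restricted to the slope coefficient.
        variance += ((xi - x_bar) / sxx) ** 2 * weight * resid ** 2
    return slope, variance ** 0.5

# Hypothetical orchard-level values: 0 = control, 1 = treated.
treatment = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
diversity = [3.9, 4.1, 4.0, 4.2, 3.8, 4.0, 4.1, 4.9, 3.6, 5.2, 4.4, 4.8]
slope, se = hc4_slope_se(treatment, diversity)
print(f"difference in means = {slope:.3f}, HC4 standard error = {se:.3f}")
```

With a 0/1 predictor and equal group sizes, every observation has the same leverage, so the result coincides with the unequal-variance (Welch) standard error of the difference in means. This sketch uses one value per orchard and does not by itself account for samples clustered within orchards.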
Based on your description, you'd want to use your treatments (B vs. C) as the independent variable and alpha diversity as the dependent variable. You shouldn't use the Kruskal-Wallis test because your samples are not independent of each other, i.e., your samples are clustered within orchards.
I will try these: "heteroskedasticity-robust F test or use robust standard errors (e.g., HC4) in R".
If I only use the mean of each orchard for statistical testing, could I then use Kruskal-Wallis? Or would you not recommend working with the means? (I have quite high variance in alpha diversity within orchards, and for some orchards the nutrient contents also have considerable variance.)