I am running adonis2 in R on my beta diversity matrices and am having trouble deciding whether or not to use the strata argument.
We have human saliva 16S sequences from about N=150 people. Because some people have siblings in the study, the data is clustered at the family level. However, this clustering is very sparse, with N=110 families and an average of 1.35 people per family cluster.
I want to make sure to take this clustering into account so as not to violate assumptions. However, using the strata argument (strata=family_id) seems inappropriate to me, because for many people, this would restrict permutation to a cluster of size N=1.
I was hoping to get advice from someone more familiar with the adonis2/permanova method. Should I set strata at the family level? Is there another way to go about this clustering issue?
I think I might start by checking the assumption that, in your data, siblings have more similar oral microbiomes than unrelated individuals. I tend to test something like that by asking whether sibling pairs are more similar to each other (within-pair similarity) than each sibling is to an unrelated individual, and then testing the difference with a permutation t-test. (This is not implemented in qiime2, and I'm not sure if it's implemented in R; it should probably be on my list of things to add.)
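As a rough illustration of that check, here is a minimal sketch in plain Python. It is not the exact permutation t-test described above: it swaps in a simpler label-permutation test on the difference between the mean between-family distance and the mean within-family distance. The function names, the toy distance matrix, and the family labels are all made up for the example.

```python
import random
from itertools import combinations


def mean_distance_gap(dist, family):
    """Mean between-family distance minus mean within-family distance.

    Assumes at least one within-family pair and one between-family pair.
    """
    within, between = [], []
    for i, j in combinations(range(len(family)), 2):
        (within if family[i] == family[j] else between).append(dist[i][j])
    return sum(between) / len(between) - sum(within) / len(within)


def sibling_permutation_test(dist, family, n_perm=999, seed=0):
    """One-sided test: are family members closer than unrelated people?

    Family labels are shuffled across samples, which keeps the family
    sizes fixed while breaking any link between family and distance.
    """
    rng = random.Random(seed)
    observed = mean_distance_gap(dist, family)
    labels = list(family)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if mean_distance_gap(dist, labels) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: two sibling pairs (A, B) and two singletons (C, D).
family = ["A", "A", "B", "B", "C", "D"]
n = len(family)
dist = [[0.0 if i == j else (0.1 if family[i] == family[j] else 0.8)
         for j in range(n)] for i in range(n)]

observed, p_value = sibling_permutation_test(dist, family)
print(observed, p_value)
```

With real data, `dist` would be your beta diversity matrix and `family` the family ID of each sample in the same order. A small p-value would suggest that siblings really are more similar than unrelated pairs in your cohort.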
If there is no effect of siblings, then I think you can say that and run your model with some confidence that you've at least tested for relatedness/similarity.
If there is an effect, then you might want to take one of multiple approaches. One option would be to take a representative member of each family (n=110) and test on that, thereby removing some of the interdependence. You could also do that as a sensitivity analysis: including and excluding siblings.
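A minimal sketch of the one-representative-per-family idea, assuming you have a list of sample IDs and a matching list of family IDs (the names and toy values below are hypothetical). It picks one member per family at random; repeating it with different seeds gives several subsets to rerun the model on as a sensitivity analysis.

```python
import random
from collections import defaultdict


def one_per_family(sample_ids, family_ids, seed=0):
    """Return one randomly chosen sample ID from each family."""
    rng = random.Random(seed)
    members = defaultdict(list)
    for sample, fam in zip(sample_ids, family_ids):
        members[fam].append(sample)
    # Singletons are always kept; larger families contribute one member.
    return sorted(rng.choice(group) for group in members.values())


# Hypothetical toy cohort: one family of three, one pair, two singletons.
sample_ids = ["s1", "s2", "s3", "s4", "s5", "s6", "s7"]
family_ids = ["F1", "F1", "F1", "F2", "F2", "F3", "F4"]

for seed in range(3):
    print(seed, one_per_family(sample_ids, family_ids, seed=seed))
```

If the results are stable across the different random subsets and agree with the full-data model, that is some reassurance that the sibling clustering is not driving the result.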
You might also treat sibling pairs differently, maybe looking specifically at pairs that are concordant or discordant for the outcome you're interested in, and maybe do that as an additional analysis when you have families with multiple members.
From a stats perspective, I think I'm also struggling with the single-member strata; R will let you do it, but it might also be worth finding a statistician, buying them a cup of coffee (virtually or in person), and picking their brain about the minimum cluster size needed for stratified permutations.
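To make the singleton concern concrete, here is a small sketch of what restricting permutations to strata means. This is an illustration of the idea only, not vegan's own code, and the function names and toy labels are hypothetical. Values are shuffled only among samples that share a stratum, so anyone who is alone in their family can never move.

```python
import random
from collections import Counter, defaultdict


def permute_within_strata(values, strata, rng):
    """Shuffle values only among samples sharing a stratum label."""
    index_by_stratum = defaultdict(list)
    for idx, label in enumerate(strata):
        index_by_stratum[label].append(idx)
    permuted = list(values)
    for indices in index_by_stratum.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        for src, dst in zip(indices, shuffled):
            permuted[dst] = values[src]
    return permuted


def fraction_free_to_move(strata):
    """Share of samples whose stratum has more than one member."""
    counts = Counter(strata)
    return sum(1 for label in strata if counts[label] > 1) / len(strata)


# Hypothetical toy cohort: one sibling pair and three singletons.
strata = ["F1", "F1", "F2", "F3", "F4"]
values = [0, 1, 2, 3, 4]

rng = random.Random(0)
print(permute_within_strata(values, strata, rng))
print(fraction_free_to_move(strata))
```

In a cohort that is mostly singletons, only the few samples in multi-member families ever change position, so the set of distinct permutations is small and the permutation test has little to work with.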
I'm not an expert on the ADONIS stat test (just a fan!), but I wonder if you could partition by family and then by person. This is another way to control for family without stratifying.
adonis2(~family+person, ...)
This partitioning works great for fully blocked study designs, like when every person gets every treatment. Given that people are nested within families, I'm not sure how well this will work. You may run into the same issue with small group sizes.
Can you share with us the full nesting / blocking design of your cohort?
Hi Colin,
that's an interesting idea. Our nesting structure is somewhat messy; most people are singletons in their "family" group, but there are a handful of family groups of 2-4 people. I'm also unsure about how well this partitioning method would work given the number of singletons.
I think I will take Justine's advice and consult further with the biostatistics department at my institution and will post here if I figure out a course of action.