Hello everyone!
I have been trying to use the pairwiseAdonis tool in R to compare different groups in my QIIME2 output. I converted my filtered SV table file, tree file, taxonomy file, and metadata file from QIIME2 into a phyloseq and used the DADA2 pipeline commands (taken from here) to ordinate my samples on an NMDS plot. Each group of three samples enclosed in an ellipse indicates a set of biological replicates for a condition. The code for creating the phyloseq is as follows:
pswd <- qza_to_phyloseq(features = "Analysis2/16S-table-noplant-rarefied-10000_filtered.qza",
tree = "Analysis2/rooted-16S-tree-filteredSVs.qza",
taxonomy = "Analysis2/16S-rep-seqs-taxonomy.qza",
metadata = "Analysis2/metadata2.txt")
pswd.prop <- transform_sample_counts(pswd, function(otu) otu/sum(otu))
ord.nmds.bray_wd <- ordinate(pswd.prop, method="NMDS", distance="bray")
plot_ordination(ps.prop, ord.nmds.bray, color="Substrate", shape="Time", title="Bray NMDS") + geom_point(size = 4)
The resulting NMDS plot appears to show that my Day 90 groups differ more from each other than the Day 0 groups do:
I want to perform a PERMANOVA test to determine which Day 90 groups are significantly different from each other. However, pairwise PERMANOVA tests do not seem to work on my groups, and I think it is because each group has only three samples. When I run a pairwise PERMANOVA test to compare each group, it generates p values that are all >0.05 and do not make sense. For example (highlighted in blue), the p value for group "4 Day 0" vs. group "6 Day 90" is 0.6, and the p value for group "4 Day 0" vs. group "6 Day 0" is also 0.6. This is an odd result, given that the "4 Day 0" and "6 Day 0" groups are much closer to each other on the NMDS plot than the "4 Day 0" and "6 Day 90" groups:
The code for this pairwise PERMANOVA test is as follows:
library(vegan)
meta_distance <- as(sample_data(pswd), "data.frame")
devtools::install_github("pmartinezarbizu/pairwiseAdonis/pairwiseAdonis")
library(pairwiseAdonis)
res <- pairwiseAdonis::pairwise.adonis(distance(pswd.prop, method="bray"), meta_distance$Group)
When I run this code, it gives the following error:
'nperm' >= set of all permutations: complete enumeration.
Set of permutations < 'minperm'. Generating entire set.
I believe that for groups of only three samples, the number of total permutations that can be performed is less than adonis's default. Could this result in the nonsensical p values?
To try to work around this issue, I created a subset of my data that included only the "2 Day 90", "4 Day 90", "5 Day 90", and "6 Day 90" groups. Since I wanted to see if the "6 Day 90" group was significantly different than the other groups, I created a column in my data matrix named "Category". I assigned "2" in this column to the "6 Day 90" group and "1" to all of the other groups. I then created a distance matrix from this edited data matrix and performed a PERMANOVA test in which the distance matrix is "explained by" the "Category" factor. The code I used is as follows:
# Subsetting: 6 Day 90 vs. 2, 4, 5 Day 90
690_vs_2_4_5_D90 <- meta_distance %>%
filter(Group == "2 Day 90" |Group == "4 Day 90" | Group == "5 Day 90" | Group == "6 Day 90") # subsets my data
690_vs_2_4_5_D90$Category <- c(1,1,1,1,1,1,1,1,1,2,2,2) # creates Category column
690_vs_2_4_5_D90_dist <- 690_vs_2_4_5_D90 %>% # creates distance matrix from this data
select(all_of(.[["sample.id"]])) %>%
as.dist
adonis2(690_vs_2_4_5_D90_dist ~ Category, data = 690_vs_2_4_5_D90, permutations = 1000). # performs PERMANOVA analysis
This yields the following result (Pr(>F) is the p value for the analysis):
This, I would think, indicates that samples in Category 2 are significantly different than samples in Category 1. For comparison, I then performed this same subsetted analysis on four of the the Day 0 groups, since they do not appear to be significantly different from each other on the NMDS plot:
# Subsetting: four Day 0 samples
D0 <- meta_distance %>%
filter(Group == "2 Day 0" | Group == "3 Day 0" | Group == "4 Day 0" | Group == "5 Day 0")
D0$Category <- c(1,1,1,1,1,1,1,1,1,2,2,2)
D0_dist <- D0 %>%
select(all_of(.[["sample.id"]])) %>%
as.dist
adonis2(D0_dist ~ Category, data = D0, permutations = 1000)
This yields the following result:
This would indicate that that samples in Category 2 are not significantly different than samples in Category 1, which aligns with what I see on the NMDS plot.
I would like to know: Is this subsetted analysis a valid way of comparing my groups to circumnavigate the issue of having a small sample size in my groups? Is there an issue with comparing a group of 3 samples to a combined group of 9 samples in a permutational analysis of variance? I appreciate any help and insight on this matter.