Best practices for detecting and removing outliers in beta diversity (after rarefaction)

Salma_Sarker · August 3, 2025, 1:00am

Hi everyone,
I’ve rarefied my fungal community dataset to an even sequencing depth and I’m preparing for beta diversity analysis (PERMANOVA, PCoA). Before running it, I used betadisper in R to calculate each sample’s distance to its group centroid (by Treatment) and flagged as potential outliers those samples with a distance > mean + 2×SD. Some of these samples look visually isolated on PCoA plots or have unusual taxonomic profiles.
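To make the flagging rule concrete, here is a minimal Python sketch of the same idea. It is not the betadisper code itself. It assumes that the distance to a group centroid can be recovered from the pairwise distances alone (which holds exactly for Euclidean-embeddable distances), that the mean and SD are computed within each group, and that the SD is the sample SD. The function names and the toy data are made up for illustration.

```python
import math
import statistics
from collections import defaultdict


def distances_to_centroid(dist, groups):
    """Distance from each sample to its group centroid, using pairwise distances only.

    dist   : dict of dicts, dist[a][b] = distance between samples a and b
    groups : dict mapping sample ID -> group label (e.g. Treatment)
    """
    members = defaultdict(list)
    for sample, group in groups.items():
        members[group].append(sample)

    result = {}
    for ids in members.values():
        n = len(ids)
        # Half the mean squared within-group distance (over all ordered pairs).
        within = sum(dist[a][b] ** 2 for a in ids for b in ids) / (2 * n * n)
        for a in ids:
            sq = sum(dist[a][b] ** 2 for b in ids) / n - within
            # Non-Euclidean distances can make this slightly negative;
            # taking the absolute value is an assumption of this sketch.
            result[a] = math.sqrt(abs(sq))
    return result


def flag_outliers(centroid_dist, groups, n_sd=2.0):
    """Return sample IDs whose distance exceeds mean + n_sd * SD of their own group."""
    by_group = defaultdict(list)
    for sample, d in centroid_dist.items():
        by_group[groups[sample]].append(d)

    flagged = []
    for sample, d in centroid_dist.items():
        values = by_group[groups[sample]]
        if len(values) < 2:
            continue  # SD is undefined for a single sample
        cutoff = statistics.mean(values) + n_sd * statistics.stdev(values)
        if d > cutoff:
            flagged.append(sample)
    return sorted(flagged)


# Toy data (hypothetical): ten samples on a line, the last one far from the rest.
positions = [0, 1, 2, 3, 4, 5, 6, 7, 8, 40]
ids = [f"s{i}" for i in range(len(positions))]
toy_dist = {a: {b: abs(x - y) for b, y in zip(ids, positions)}
            for a, x in zip(ids, positions)}
toy_groups = {s: "control" for s in ids}

toy_flagged = flag_outliers(distances_to_centroid(toy_dist, toy_groups), toy_groups)
print(toy_flagged)  # ['s9']
```

Writing the rule out this way also keeps the threshold (here `n_sd=2.0`) explicit, so it can be reported alongside the results.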

Is this a reasonable approach for detecting outliers in beta diversity analysis? Is it recommended to remove such outliers even from the rarefied dataset? Are there best-practice guidelines in QIIME2 for handling outliers reproducibly?

Also, could someone clarify this part of the comment?

Continuing the discussion from Outliers in beta diversity analyses:

Thanks in advance for your advice!

-salma

colinbrislawn · August 4, 2025, 4:31pm

Hello Salma,

I’m not a statistician, but usually I keep all the data and use statistical tests that are less sensitive to outliers.

I’m interested in how others approach this!

cherman2 · August 4, 2025, 4:56pm

Hi @Salma_Sarker,

I am also not a statistician, but I tend to agree with @colinbrislawn and leave outliers in. My rationale is that unless I can show that the data are wrong in some way (for example, I mislabeled a sample), it’s a real signal, and I am not sure how I would justify removing it to reviewers.

I would also say that if your outliers come from rarefying to a single sampling depth only once, you might try q2-boots and see whether that resolves them. Instead of subsampling once, q2-boots subsamples to an even sampling depth x times and then averages the results, giving more robust diversity metrics that are less susceptible to outliers caused by a single random subsampling.
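As a rough illustration of that idea (not q2-boots code and not its API), the sketch below rarefies two samples repeatedly and averages the resulting Bray-Curtis distance. Subsampling without replacement, Bray-Curtis as the metric, a plain arithmetic mean, and all function names and toy counts are assumptions made for this example.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from a {taxon: count} dict."""
    reads = [taxon for taxon, c in counts.items() for _ in range(c)]
    if len(reads) < depth:
        raise ValueError("sample has fewer reads than the requested depth")
    return Counter(rng.sample(reads, depth))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two {taxon: count} mappings."""
    taxa = set(a) | set(b)
    num = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    den = sum(a.values()) + sum(b.values())
    return num / den if den else 0.0


def mean_rarefied_bray_curtis(x, y, depth, n_iter=100, seed=0):
    """Average Bray-Curtis over `n_iter` independent rarefactions of both samples."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    total = 0.0
    for _ in range(n_iter):
        total += bray_curtis(rarefy(x, depth, rng), rarefy(y, depth, rng))
    return total / n_iter


# Toy counts (hypothetical): compare one rarefaction with the average of many.
sample_1 = {"Aspergillus": 400, "Penicillium": 90, "Fusarium": 10}
sample_2 = {"Aspergillus": 150, "Penicillium": 300, "Fusarium": 50}

rng = random.Random(42)
single = bray_curtis(rarefy(sample_1, 100, rng), rarefy(sample_2, 100, rng))
averaged = mean_rarefied_bray_curtis(sample_1, sample_2, depth=100, n_iter=200)
print(f"single rarefaction: {single:.3f}, mean of 200: {averaged:.3f}")
```

A single rarefaction can land on an unusual draw, especially for low-abundance taxa, whereas the average over many draws is less affected by any one subsample.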

I hope this helps!
