Mirror-image reversal of ordination data and PCoA graphs from core metrics

I have analyzed several combinations of groups of microbiome data (each group a different treatment), producing PCoAs for each. So far, the groups within each PCoA orient in the same shape, to the same axes, etc., no matter which other groups are present.

However, I've encountered a strange output with the last two PCoAs. I ran one analysis with 4 groups, and they oriented as expected. When I removed one of the groups however, the PCoA produced was a mirror image from its predecessor. I didn't know what to make of this, so I deleted all the data and started with a new file manifest, did new cutadapt and DADA2, etc. The PCoA from this seemed to be back to normal, with the groups orienting as expected without being mirrored.

I am stuck on the last analysis though. It contains all 5 experimental groups, and its PCoA is also coming out as a mirror image of the previous ones. I extracted the ordination data and saw the values in the first column were reversed (negatives became positives and vice versa) compared to the 4-group analysis. I don't know if this is relevant but it seemed odd to me. Here's a pic of the ordination data for the same samples, with the 4-group analysis on the left and the 5-group analysis on the right


I've reloaded everything again, but the results are the same. I have no idea what is causing this since there are no error messages and I seemed to fix it before. Any advice on this would be very helpful!


Hi @MicrobeManager,

This is a great question!

There are three things to keep in mind with PCoA.

  1. PCoA sits on top of your distance matrix which was probably calculated on rarefied data. Rarefaction is a semi stochastic process and its possible to get different ordinations based on your rarefaction.
  2. The ordination coordinates are dependent on the samples. If you introduce a new sample, you will shift your PCoA and that might drive clustering. (Body site is a classic example: if I want to look for clustering between with samples from :woman_scientist: and :woman_technologist: to see which kind of scientist has the weirdest microbes, the signal from combing the :poop: and :tongue: will overwhelm clustering from profession.)
  3. Within a PCoA, there is no true directionality. It's a map without landmarks. So, you can do things like move the center of the ordination and flip the sign along the axis without it affecting the relationship between points.

So here, you've changed your sample set. You (may?) have re-rarified (did you filter yoru distance matrix, then run PCoA, or just run core metrics?. Changing your samples and maybe your underlying distances and changed the ordination slightly, and that led to a change in the sign.

Now, you've probably changed the distance matrix, which changed the ordination, which can lead to flipping in the ordination.

I want to commend your due-diligence here, but it's not an error, just a fun particularity of ordination.

To make your ordination more stable, my preference is to filter my distance matrix and re-calculate my PCoA from a filtered distance matrix rather than re-run core-metrics every time I need to change my PCoA. I work with big datasets and the rarefaction and distance matrix calculations can take hours. You can use q2-diversity filter-distance-matrix to filter the distance matrix and then q2-diversity pcoa to calculate the ordination. q2-emperor plot will make the nice visualization. (Check out the q2-diversity and q2-emperor docs for more functions, options, and cool things you can do).

And then, remember you can always flip the visualization in the emperor plot. Go to the "axes" tab and there's a checkbox that lets you flip things. There are lots of reasons you might want to change the direction of an axis and it's pretty easy. If you're working programatically, the axes can be multiplied by -1.

I hope this helps!



Hi Justine,

Thanks for the very detailed reply, this makes more sense now. I did re-rarify each group of samples using only core-metrics. In fact, I created new manifests and did everything from the ground up for each combination of treatments I wanted to analyze. I am still new to QIIME2 so I wasn't sure how to break a larger dataset into smaller groups and pick and choose which to analyze for the core-metrics step. I probably did it the long way but I don't know all the ins and outs yet.

I don't think I fully understand this though. The file manifest was the exact same as the one that produced the mirrored samples, and the results from DADA2 and everything else were the same. How would this lead to the ordination being flipped back when I reloaded it?

This is good to know. I used code from this post that extracts the PC1 and PC2 values from the .qza and am graphing 2D PCoAs in R for publication. So I would just multiply the PC1 by -1 to flip them back?

Thanks again for your help!

1 Like

Hi @MicrobeManager,

I'm gooing to warn that you may want a :tea:, 'cause this will probably be another long answer.

So, for each data set, to get to the PCoA plot, you've got like 7 steps. Some of the are defined processes - there's no randomness to them and so if you put the same thing in, you'll get the same thing out every time. Some are sample independent filtering samples won't make a difference on the results. Some are feature independent - filtering features won't change the results.

step defined process sample independent feature independent computational cost
Import into QIIME 2 via a manifest :heavy_check_mark: :heavy_check_mark: -- :heavy_dollar_sign:
Primer trimming :heavy_check_mark: :heavy_check_mark: -- :heavy_dollar_sign:
Denoising with dada2 :x: -- :x: :heavy_dollar_sign::heavy_dollar_sign::heavy_dollar_sign:
Computing a tree(?) :heavy_check_mark: :x: :x: :heavy_dollar_sign::heavy_dollar_sign::heavy_dollar_sign:
Rarifying your feature table :x: :heavy_check_mark: -- :heavy_dollar_sign::heavy_dollar_sign:
Computing distances :heavy_check_mark: :heavy_check_mark: :x: :heavy_dollar_sign::heavy_dollar_sign::heavy_dollar_sign:
Computing the PCoA dimensionality reduction :heavy_check_mark: :x: :x: :heavy_dollar_sign:

You'll notice that there are a couple of undefined processes in the table. Dada2 and rarefaction both have an element of randomness in their algorithms that can change the result.

With DADA2, the model gets trained on a subset of sequences that get selected at random from the data, and this informs the error model. We hope :crossed_fingers: that the data is stable enough to tolerate this random element and that it doesn't make a major change to our results (it usually doesn't). The chimera filtering in DADA2 is also dataset dependent, and this can be influenced by the samples/sequences you put in. So, if you're changing your sequences, you're changing your results slightly. ...You also might be burning your computer, which isn't so bad with smaller studies but gets to be a pain when you get several hundred or several thousand. (It's easy to filter the tables, though! Check out the table filtering section in the data filtering tutorial and the q2-feature-table plugin for more filtering functions.)

You're also introducing some randomness into your process when you rarefy. Here, you essentially draw sequences at random out of your total population until you hit a certain depth. You select at random, and so you can potentially get a different distribution with multiple rarefaction interactions. (I cant seem to find the plot I'm looking for, but there was an example in the old emperor docs.) I think you might be able to solve some of the stochasticity by averaging over multiple rarefaction rounds, or you can just keep your one random step by filtering.
In general, rarefaction is relatively quick, but it scales with your sample size and depth. Im currently working with about 500 metagenomic samples, and it takes 20 minutes to rarefy my full table to 1M sequences. While it gives me the excuse of "cant work, data processing", its not terribly efficient if I want to make quick changes.

Next, you've got your distance calculation. This isn't random in that if you put in the same table and use the same metric, you'll get the same distances out. Your distances are always calculated pairwise) and the distance that gets calculated depends only on that pair fo samples. (It's slightly more complicated, but basically, the distance between LA and San Diego is the same whether we measure the distance from San Diego to San Francisco or not).
The challenge with pairwise distances is that even though they're identical, they take time to compute. There's been a lot of work int he past 5 or so years trying to speed them up for bigger datasets, but ultimately, these take time. Plus, you can always filter your distance matrix (see the tutorial here).

Now we calculate your PCoA based on the samples in your distance matrix and twist/stretch/flip the data to make the ordination work.
...But this ordination is a product of the samples in my distance matrix, which is the product of a stochastic rarefaction, which is the product of a stochastic process in DADA2.
Small changes in the upstream process can lead to sign flipping or slight changes in position, which could lead to filliping.

Yep! I'd do it on import and then forget about it. If you're ever doing a biplot, just make sure you flip your features the same way!



Thank you for that extensive answer! I think I have a better understanding now on how that introduction of randomness from DADA2 and rarefaction could affect the sign flipping in the PCoAs later on.

In terms of reloading a new manifest for subgroup of data I want to compare, I realize this isn't practical. Luckily my overall data set is small, but I can see it would be a problem for higher numbers of samples. In that case, I would probably do a single import, trim, and denoise with the entire group, but at what point should I start breaking the samples up into smaller groups for multiple comparisons? I considered doing this with the core-metrics step, but will the output be affected or produce errors if I use a rooted tree that was aligned to the entire dataset? Or does each subset of data need to be aligned to its own tree and then run? Maybe those are dumb questions, but that's where I got stuck before. Since the data set was so small it didn't take much time to reload everything, but I want to be able to do it the "right" way in the future.

Speaking of core-metrics, that includes the rarefaction and distance computing steps listed in the table you provided, correct? Do people use methods/commands other than core-metrics to do these steps? Again, maybe a dumb question, but I have mostly gone off the Atacama Soil and Moving Pictures tutorials for my workflow, and while that has worked well for what I need, there are probably many other QIIME2 commands I'm not aware of.

Thanks again for your assistance, this is really helping me get a better understanding of how to build and use these workflows!

1 Like

Hi @MicrobeManager,

I'm glad we're working through it together!

I think the current best recommendation for DADA2 is to process them samples on a per-sequencing plate basis because then you learn error for that place in particular. If you're running Deblur, it doesn't care about the place because it makes different assumptions. Then, you would want to combine the denoised plates. I'd probably continue and do taxonomic assignment, tree building, and maybe some basic qc filtering on the dataset (throw out anything without a phylum assignment, for example.) Some of this is practical: if you have access to a cluster or server, this is good stuff to start there and just let run without intervention.

Once you have the tables, you need to decide if you'll split before diversity analyses. Usually, I only make this split if it's data Im extremely unlikely to analyze together. For example, if I have data from two projects that go sequenced on the same run, or my adult human project contains data from two body sites. If that's not he case, I'll calculate diversity on everything. Again, the diversity calculations are computationally expensive, so I want to do them once, preferably on someone else's computer.

Once i have distance matrices, I can start subsetting them for diversity analyses. A good (although not perfect) rule of thumb in QIIME is that if it makes a visualization you should probably filter down to specific samples beforehand. (PCoA is an exception, you should do that before you generate the ordination).

So... when you filter depends on the research question!

It's not necessarily that these are the "right" way to do it... its just a way that minimizes stochasticity and lets me sleep without worrying about my computer running too hot. :woman_shrugging:. I have pretty specific goals in my workflows to offload the long running or memory intensive steps (usually before distance calculation) from my laptop and only run them once. So, that's my perspective here?

They brilliant @thermokarst reminded me that this is int he docs:


We’ll first apply the core-metrics-phylogenetic method, which rarefies a FeatureTable[Frequency] to a user-specified depth, computes several alpha and beta diversity metrics, and generates principle coordinates analysis (PCoA) plots using Emperor for each of the beta diversity metrics.

Some people do! It depends on what they want or need. I personallly prefer simpson to Pielou, so I tend to just rarefy my table and calculate alpha diversity directly on a rarefied table. Other people perfer metrics like jensen-shannon to UniFrac. Some people just like having all the parameters exposed all the time. I think philosophically, the tutorials are somewhere between a quick start and a full analysis. Like, they're a good answer to how to analyze microbiome data in QIIME 2 (and Id argue in general). But, they don't expose all the praameters or options. For that, I highly recommend reading the help documentation (--help on the command line or a convenient listing of the "core" plugins).

We're all happy to help! And I think most people have learned by doing: some of this is kind of take chances, make mistakes and get messy! :bus:. Those of us who have strong opinions often have them because we did the thing in the past, and have since learned better.



Thanks for another detailed answer, and for explaining how best to split samples. It sounds like then, based on my workflow, I would use align-to-tree-mafft-fasttree on all the samples, then split them after this for core-metrics since I want to primarily generate PCoAs and look at the Faith PD vectors. So for this, I would use feature-table filter-data to create subgroups, then run the core-metrics on these individually. My goal is to analyze how the community changes based on different treatments, but we wanted to subset the data for PCoA analysis to see how different treatments relate in different combinations (i.e. analyzing treatment A vs B, and then A vs B vs C, etc).

Thanks also for explaining how randomness fits into these analyses. Even though I uploaded separate groups of samples, it seems that the end values from diversity and phylogenetics analyses are almost identical, with negligible variation. Still, I think running things as a single group would be better for future analyses, especially with many samples!


Hi @MicrobeManager,

That sounds like a good balance to me between computationally expensive steps and processing your data with familiar commands.

We're always happy to help, feel free to come back with more questions!


1 Like

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.