Reasons for an extreme overgrowth of Akkermansia in a sample?

amirza · May 5, 2020, 7:03pm

Surprisingly, the relative abundance of the phylum Verrucomicrobia is 86% in sample X. A single ASV accounts for most of the reads classified as this phylum in that sample. This ASV is classified as Akkermansia (confidence level 92%) and its relative abundance is 84% in sample X. While this ASV is present in 94 samples (out of about 300), the relative abundance of Akkermansia does not exceed 28% in any of the other samples. Interestingly, there are 55 ASVs classified as Akkermansia (albeit with confidence levels as low as 72%). This particular ASV is the most prevalent and most abundant of the ASVs classified as Akkermansia in nearly all samples in which it is present. Sample X contains 50 of the 55 ASVs classified as Akkermansia, whereas other samples have at most 2.
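For anyone wanting to reproduce this kind of summary, here is a minimal sketch of how the per-sample relative abundance of a genus and the number of its ASVs present could be tallied. The sample names, ASV IDs, counts, and the helper `genus_summary` are all hypothetical toy values, not the real feature table or a QIIME 2 API.

```python
# Toy feature table: counts[sample][asv_id] = read count (hypothetical values).
counts = {
    "sample_X": {"asv_akk_1": 840, "asv_akk_2": 20, "asv_bact_1": 140},
    "sample_Y": {"asv_akk_1": 50, "asv_bact_1": 950},
}

# Hypothetical genus assignment for each ASV.
genus = {
    "asv_akk_1": "Akkermansia",
    "asv_akk_2": "Akkermansia",
    "asv_bact_1": "Bacteroides",
}


def genus_summary(counts, genus, target):
    """Per sample: (relative abundance of `target`, number of its ASVs present)."""
    summary = {}
    for sample, asv_counts in counts.items():
        total = sum(asv_counts.values())
        # Keep only ASVs assigned to the target genus with a non-zero count.
        hits = {
            asv: n
            for asv, n in asv_counts.items()
            if genus.get(asv) == target and n > 0
        }
        rel = sum(hits.values()) / total if total else 0.0
        summary[sample] = (rel, len(hits))
    return summary


summary = genus_summary(counts, genus, "Akkermansia")
for sample, (rel, n_asvs) in summary.items():
    print(f"{sample}: {rel:.1%} Akkermansia across {n_asvs} ASV(s)")
```

Looking at both numbers side by side makes it easier to see whether a sample is unusual in abundance, in the number of distinct ASVs, or in both.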

Based on the unweighted UniFrac Emperor 3D plot, sample X looks like an outlier, but not based on the weighted UniFrac Emperor plot. I haven't found anything abnormal about the sample itself other than this issue. The sample appears to have a normal taxa profile by presence, although by relative abundance all other taxa make up a smaller percentage than in the other samples (see the figure below; sample X is the last column, with samples ordered by ascending Verrucomicrobiae abundance). This sample has 322 representative ASVs and 226,603 total sequences, both above the median by about 1 and 2 standard deviations, respectively. I ran the Deblur alternative pipeline and everything seems to have run well; there wasn't any major loss of reads in any of the pipeline steps. Quality of the sample looks good (checked using FastQC). Mock communities look good. Reads are from the V4 region of 16S rRNA and were classified using my trained SILVA taxonomy. Its Bristol Stool Scale type is 3, which is considered normal by the scale. The person was not treated for anything recently but was on medication. However, other participants were on the same medication and none of them had this extreme overgrowth.

Any ideas why there is an extreme overgrowth of Akkermansia strains in this one stool sample? Have you seen something like this before? Could this be a technical issue? Your thoughts are much appreciated.

Best regards

jwdebelius · May 5, 2020, 7:22pm

Hi @amirza,

I would suspect that it's a biological condition rather than a technical one. In my (ample, anecdotal) experience with fecal samples, Akkermansia is often associated with energy balance, and specifically with fasting or starvation. (For example, you may find it in pre-treatment anorexics.) So, I would check your metadata and see where this individual falls. Have they been fasting? Have they all been fasting? That's a lot more Akkermansia than what I'd expect in an adult microbiome; I'm used to seeing it at about 1%.

As far as I know from working with the American Gut and other big fecal datasets, there's really just one OTU (97%) which represents the Akkermansia in Western human fecal samples. I'm a bit surprised you're seeing so many ASVs... it may be inflation, it may be something about clustering that I haven't explored before, it may be inflation from DADA2.

Best,
Justine

amirza · May 5, 2020, 8:08pm

Thanks for the quick response

The other reason for someone to have a high Akkermansia abundance could be constipation, but the person seems to have "normal" stool.

Hmm, the person's energy consumption is only 1 standard deviation above the cohort average, and they have a normal weight for their age and sex. It seems the person was not fasting. This person is a young adult.

> As far as I know from working with the American Gut and other big fecal datasets, there’s really just one OTU (97%) which represents the Akkermansia in western human fecal samples. Im a bit suprised. you’re seeing so many ASVs… it may be inflation, it may be something about clustering that i haven’t explored before, it may be inflation from DADA2.

Could you elaborate on the potential inflation or clustering issue? How can I check my data to see whether I have the technical issues you mentioned?

Ali

jwdebelius · May 5, 2020, 10:58pm

Hi @amirza,

You have quite a few samples with more Akkermansia than I might expect for adult fecal samples in a healthy population. (I'm used to about 1-5%.) It doesn't necessarily mean something is wrong, just that the behavior you're seeing is odd.

Again, this is an anecdotal observation. Have you done things like trimming your primers? Do you see a similar number in Deblur? I might check with a multiple sequence alignment.

Best,
Justine

amirza · May 9, 2020, 6:53am

Yes, I did trim primers. I ran Deblur but I haven't run DADA2. But would it matter? DADA2 and Deblur don't cluster ASVs, so I would predict that the same ASVs would come up unless they were filtered for poor quality or deemed an artifact. How would you suggest I check the multiple sequence alignment? Simply check the alignment of the ASVs classified as Akkermansia and compare it with other ASVs?

Best,
Ali

jwdebelius · May 10, 2020, 4:45pm

Hi @amirza,

Deblur makes it simpler because it generally produces fewer ASVs and they're all the same length. With DADA2, you sometimes get multiple ASVs with slightly different lengths which differ by a single nucleotide and are mapped to the same genus.

I would recommend using a tool designed specifically for multiple sequence alignment to compare your ASVs. I've used one through ENA, but there are several options.
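As a rough illustration of the kind of near-duplicate ASVs described above, here is a minimal sketch that flags pairs of sequences where one is just a shorter or longer version of the other, or where two equal-length sequences differ at exactly one position. The sequences and the helpers `relation` and `near_duplicates` are hypothetical and are not part of DADA2, Deblur, or QIIME 2; a real check would be run on the representative sequences for the genus.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def relation(a, b):
    """Describe how two ASVs relate, or return None if they are not near-duplicates."""
    short, long_ = sorted((a, b), key=len)
    if len(short) != len(long_):
        # One sequence is contained at the start or end of the other.
        if long_.startswith(short) or long_.endswith(short):
            return "length variant"
        return None
    if hamming(a, b) == 1:
        return "single mismatch"
    return None


def near_duplicates(asvs):
    """Return (id_1, id_2, relation) for every near-duplicate pair of ASVs."""
    pairs = []
    for id_1, id_2 in combinations(sorted(asvs), 2):
        rel = relation(asvs[id_1], asvs[id_2])
        if rel is not None:
            pairs.append((id_1, id_2, rel))
    return pairs


# Toy sequences (hypothetical, far shorter than real V4 reads).
asvs = {
    "asv_a": "ACGTACGTAC",
    "asv_b": "ACGTACGTACG",  # asv_a plus one extra base
    "asv_c": "ACGTACGAAC",   # one substitution relative to asv_a
    "asv_d": "TTTTGGGGCC",   # unrelated
}

pairs = near_duplicates(asvs)
for id_1, id_2, rel in pairs:
    print(id_1, id_2, rel)
```

If many of the ASVs assigned to one genus turn up as pairs like these, that points toward inflation rather than real strain diversity. The all-pairs loop is fine for a few dozen ASVs but would be slow for a whole feature table.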

Best,
Justine

amirza · May 16, 2020, 2:33am

I ran MSA on 20 out of the 55 ASVs classified as Akkermansia. Below are the results. Please let me know if anything stands out!

IDs are named as follows:
For example 1-Akk-HIGH-COVERAGE_0.9167077
1: Index number
Akk-HIGH-COVERAGE: Akk stands for Akkermansia. I labeled this ASV as "HIGH-COVERAGE" because it is by far the most prevalent ASV classified as Akkermansia.
0.9167077: QIIME 2 taxonomy classification confidence score. If the confidence score is 100%, the ID is labeled 1-<#>, where 1 means 100% confidence and <#> is an index number that distinguishes it from the other ASVs with a 100% confidence score.

The biggest differences compared to the HIGH-COVERAGE ASV are between bp 200-240. These have low confidence scores. I don't see any problems with these sequences. I would like to mention that I used the alternative read-joining method, in which the forward and reverse reads are merged before running Deblur. I like this alternative method because it can increase read quality when there is a large overlap between the forward and reverse reads (we sequenced the V4 region of 16S rRNA). Looking forward to your comments!
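In case it helps others read an alignment like this, here is a minimal sketch of how the mismatch positions against a reference ASV could be listed and checked against a window such as bp 200-240. It assumes the sequences are already aligned and of equal length (alignment columns, with gaps written as `-`). The sequences and the helpers `mismatch_positions` and `fraction_in_window` are hypothetical toy examples, not output from any MSA tool.

```python
def mismatch_positions(reference, query):
    """1-based alignment columns where `query` differs from `reference`."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, (r, q) in enumerate(zip(reference, query)) if r != q]


def fraction_in_window(positions, start, end):
    """Fraction of mismatch positions that fall within [start, end], inclusive."""
    if not positions:
        return 0.0
    inside = [p for p in positions if start <= p <= end]
    return len(inside) / len(positions)


# Toy aligned sequences (hypothetical, far shorter than real V4 reads).
reference = "ACGTACGTACGT"
query = "ACGTTCGTACCT"

positions = mismatch_positions(reference, query)
print("mismatch columns:", positions)
print("fraction within columns 4-6:", fraction_in_window(positions, 4, 6))
```

Running this for each ASV against the HIGH-COVERAGE sequence would show whether the differences really do concentrate in one region or are spread along the read.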

jwdebelius · May 16, 2020, 5:05pm

Hi @amirza,

This is the point at which you need to dig into the issue yourself and determine what's happening with your data. I can't give you answers without seeing all the data and having far more details than are in the scope of the forum.

I would first worry about the fact that you have such high Akkermansia overall, and then consider the distribution.

Best,
Justine