Songbird on paired samples

Hi all, especially @mortonjt ,

I’m trying to figure out Songbird on paired samples. Ive been through the tutorial and looked at the differential code from the original paper. Im slightly confused how to use this for paired samples, since as far as I can tell by the formula, the code is looking at a bulk comparison between before and after tooth brushing without conditioning on the individual. (The code is C(brushing_event) and I might have expected something closer to the (C(brushing_event) | host_subject_id) in an LME… except that I m not totally sure that works in this model.) I might also be missing something with pre-processing where the pairwise comparison is already set up.

Alternatively, if there is another kosher way to do paired sample tests for relative abundance deltas, that would be super useful. I found one paper, but it was a UniFrac-like approach and I’d like differential abundance for individual features or clades.



Hey @jwdebelius, very good question. One important thing to know about Songbird is that it doesn’t actually perform null hypothesis testing. The main point was to show that ranks are a useful concept to embrace. The reason why we showed that paired testing works is because the mean calculation in paired t-test is identical to a standard t-test - the only difference here is the hypothesis test, which we tackled using paired_ttest method (which was adapted from scipy) after identifying taxa and performing the appropriate log-ratio transformation.

For your case, there are 3 possibilities:

  1. You could do this approach of first identifying microbes, followed by either a paired t-test or LME.
  2. You could try to adapt this code to design your own statistical test. Note that this can be tricky (I don’t have the answer to this at the moment).
  3. There are other tools that better handle longitudinal data such as MIMIX and stray, but you may run into similar problems as discussed in (2).

Thank you! I will check those out and maybe be back with questions.


@jwdebelius just curious, what did you end up trying for this? Anything work well? I had the exact same question and was glad to see you had already asked it :laughing:

I’ve been messing around with edgeR and other RNASeq-related programs which handle paired samples well, but I’m not sure I fully trust those for microbiome data and would like to use ranks/ratios. Also can’t really use those for non-count based metabolome data.

Hi @vrbana,

I hope you’re well! @mortonjt helped me out with some code and ideas! He shared a baysian approach. Maybe ask him directly?


Hi @vrbana, I gave @jwdebelius some Stan scripts to build a linear mixed effects negative binomial model :

And see my blog post here to see how to build your own model :

So this will allow you to build something like DESeq2, but will give you a little more flexibility in statistical testing (DESeq2 only tests against the mean, whereas this will allow you to do ratio tests).

1 Like

Thanks Jamie! What model do you typically use for continuous data (i.e. metabolome or metagenome rel abund)? It’s my understanding that the negative binomial is only appropriate for count data.

It’s a good question, one that no one has a great grasp on.

Metabolomics data is generated from discrete abundances – molecular abundances are inherently discrete quantities. It’s the fragmentation that converts these discrete quantities into continuous qualities. Because the underlying quantities of interest is inherently discrete, I typically stick with discrete models. Is it a hack? Most definitely. But there are very few tools that can deal with discrete latent variables in this fashion ( I don’t believe it is currently tractable in this case). So your options are to either model everything in a discrete world (with some hacky approximation) or model everything continuously.

But you can certainly experiment with different distributions in Stan – Stan gives you quite a bit of freedom in this regard. You could even swap out the Negative
Binomial with a Gaussian distribution or even better a Truncated Gaussian distribution to model the abundances. As long as the confounding sequencing depth is accounted for (and your predicted abundances are strictly positive), these modeling choices should be ok.

1 Like

Interesting, even though the abundance is MS1 peak area? Seems like people treat it both ways in other papers; I wish we had discussed/argued about this in a joint lab meeting with the metabolomics folks! But thanks, I will try out some different distributions

1 Like

If you have access to replicates / dilution experiments - that can also help in deciding what an appropriate error distribution would be.

But I think you get an idea about the conceptual challenges - definitely curious to hear what you find out!

1 Like

Sorry I can’t for the life of me figure this out and can’t find any good examples, how do you incorporate subject into your DE abundance tool example? For example, in an R-style formula I would just add (1 | subject) and for statsmodels it is just groups=dataframe[‘subject’].

In my example I have subject ids directly coded :

This would require a small update to the input to the stan model highlighted in the above tutorial via

dat = {
    'N' : X.shape[0],
    'D' : table.shape[0],
    'p' : X.shape[1],
    'depth' : np.log(table.sum(axis='sample').values),
    'x' : X.values,
    'y' : table.values.astype(np.int64)
    'subj_ids' : <something goes here>

where subj_ids is some numeric conversion of your actual host subject ids.

There are also quite a bit of documentation on stan that I recommend checking out :

Also stan has its own channel ; for instance this page highlights an example fitting mixed effects models :

1 Like