Songbird on paired samples

Hi all, especially @mortonjt ,

I’m trying to figure out how to use Songbird on paired samples. I’ve been through the tutorial and looked at the differential code from the original paper. I’m slightly confused about how to apply it to paired samples since, as far as I can tell from the formula, the code is making a bulk comparison between before and after tooth brushing without conditioning on the individual. (The formula is C(brushing_event), and I might have expected something closer to the (C(brushing_event) | host_subject_id) of an LME… except that I’m not totally sure that works in this model.) I might also be missing a pre-processing step where the pairwise comparison is already set up.

Alternatively, if there is another kosher way to do paired sample tests for relative abundance deltas, that would be super useful. I found one paper, but it was a UniFrac-like approach and I’d like differential abundance for individual features or clades.



Hey @jwdebelius, very good question. One important thing to know about Songbird is that it doesn’t actually perform null hypothesis testing. The main point was to show that ranks are a useful concept to embrace. The reason we could show that paired testing works is that the mean calculation in a paired t-test is identical to the one in a standard t-test; the only difference is the hypothesis test itself, which we tackled using the paired_ttest method (adapted from scipy) after identifying taxa and performing the appropriate log-ratio transformation.
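To make the point about the mean concrete, here is a minimal sketch on simulated data. It assumes `scipy.stats.ttest_ind` and `scipy.stats.ttest_rel` as stand-ins for the unpaired and paired tests; it is not the paired_ttest code from the paper, and all values and variable names are made up for illustration.

```python
import numpy as np
from scipy import stats

# Simulated (hypothetical) log-ratios for 20 subjects measured twice.
# Each subject has a large personal baseline plus a small shared shift.
rng = np.random.default_rng(0)
n_subjects = 20
subject_baseline = rng.normal(0.0, 2.0, n_subjects)
before = subject_baseline + rng.normal(0.0, 0.3, n_subjects)
after = subject_baseline + 0.5 + rng.normal(0.0, 0.3, n_subjects)

# The effect estimate is the same whether or not samples are paired.
mean_of_differences = np.mean(after - before)
difference_of_means = np.mean(after) - np.mean(before)

# Only the hypothesis test changes: pairing removes between-subject variance.
unpaired = stats.ttest_ind(after, before)
paired = stats.ttest_rel(after, before)

print(f"mean of differences: {mean_of_differences:.3f}")
print(f"difference of means: {difference_of_means:.3f}")
print(f"unpaired t = {unpaired.statistic:.2f}, p = {unpaired.pvalue:.3g}")
print(f"paired   t = {paired.statistic:.2f}, p = {paired.pvalue:.3g}")
```

The two estimates agree up to floating-point error, while the paired test typically gives a larger test statistic when the between-subject variation is large relative to the within-subject change.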

For your case, there are 3 possibilities:

  1. You could follow the same approach: first identify the microbes of interest, then run either a paired t-test or an LME on their log-ratio.
  2. You could try to adapt this code to design your own statistical test. Note that this can be tricky (I don’t have the answer to this at the moment).
  3. There are other tools that handle longitudinal data better, such as MIMIX and stray, but you may run into problems similar to those discussed in (2).
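Option (1) could look roughly like the sketch below. Everything in it is an assumption for illustration: the count tables are simulated, the numerator and denominator taxa are hypothetical index lists standing in for taxa picked from the top and bottom of the Songbird ranks, the pseudocount is one possible way to handle zeros, and `scipy.stats.ttest_rel` is used in place of the paired_ttest code from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_subjects, n_taxa = 15, 30

# Hypothetical taxa chosen from the two ends of the differential ranking.
numerator_idx = [0, 1, 2]
denominator_idx = [27, 28, 29]

# Simulated counts: row i is the same subject in both tables.
rates_before = np.full(n_taxa, 50.0)
rates_after = rates_before.copy()
rates_after[numerator_idx] *= 3.0  # numerator taxa increase after the event
counts_before = rng.poisson(rates_before, size=(n_subjects, n_taxa))
counts_after = rng.poisson(rates_after, size=(n_subjects, n_taxa))


def log_ratio(counts, num_idx, den_idx, pseudocount=1.0):
    """Per-sample log of summed numerator counts over summed denominator counts.

    The pseudocount is an assumption to avoid log(0); other zero-handling
    strategies are possible.
    """
    num = counts[:, num_idx].sum(axis=1) + pseudocount
    den = counts[:, den_idx].sum(axis=1) + pseudocount
    return np.log(num / den)


lr_before = log_ratio(counts_before, numerator_idx, denominator_idx)
lr_after = log_ratio(counts_after, numerator_idx, denominator_idx)

# Paired test on the within-subject change in the log-ratio.
result = stats.ttest_rel(lr_after, lr_before)
print(f"mean change in log-ratio: {np.mean(lr_after - lr_before):.3f}")
print(f"paired t = {result.statistic:.2f}, p = {result.pvalue:.3g}")
```

With real data, the rows of the two tables would need to be aligned by host_subject_id before the test. An LME with a per-subject random intercept on the same log-ratio would be the alternative mentioned in (1).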

Thank you! I will check those out and maybe be back with questions.