Songbird model almost outperforms null


I have a Songbird model that looks like it almost outperforms the null model as shown here:

These are my parameters:

songbird multinomial
--input-biom adults-table.biom
--metadata-file metadata_final_testing64.tsv
--formula "C(Score_3, Treatment('low'))+bmi"
--epochs 10000
--batch-size 225
--differential-prior 0.1
--learning-rate 0.001
--training-column Testing
--summary-interval 1
--random-seed 3
--summary-dir results_ts3b_test64_e10k.bs225.dp0.1.lr1-3

My cohort size is 432, and I used a 60:40 train-test split.
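For reference, a 60:40 split like the one described above could be generated as in the following minimal sketch. It assumes one sample per cohort member, uses made-up sample IDs, and assumes the training column holds the labels `Train` and `Test`, which is the convention Songbird documents for `--training-column`. The helper name `assign_split` is hypothetical.

```python
import random

def assign_split(sample_ids, test_fraction=0.4, seed=3):
    """Map each sample ID to 'Train' or 'Test', with about test_fraction in 'Test'."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    test_ids = set(shuffled[:n_test])
    return {sid: ("Test" if sid in test_ids else "Train") for sid in sample_ids}

# Hypothetical sample IDs standing in for the 432 samples in the cohort.
sample_ids = [f"S{i:03d}" for i in range(432)]
split = assign_split(sample_ids)

n_train = sum(label == "Train" for label in split.values())
n_test = len(split) - n_train
print(n_train, n_test)  # 259 173
```

The resulting labels would then be written to the `Testing` column of the metadata file. Note that with 259 training samples, a batch size of 225 covers most of the training set in each step.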

This is the best I've been able to get after playing around with the parameters for a couple of days now. As seen in the graphs, the null briefly overlaps the model but then dips lower, and the Q2 value is negative. There is a significant difference in beta diversity between the groups of interest (weighted UniFrac, p = 0.04), so from my understanding, I should be able to get some meaningful results from Songbird.

Does this look ok to move on with downstream analysis? Do you have any suggestions on how to improve the model?


A negative Q2 is definitely not ideal: it suggests that your model doesn't generalize.

It is possible for PERMANOVA to be significant while Songbird yields a negative Q2 score, since they are very different metrics. PERMANOVA determines whether there is a difference between groups, but not whether it generalizes beyond your cohort, whereas Songbird tries to answer that question with cross-validation.
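To make the sign of Q2 concrete, here is a minimal sketch of how it relates the model to the null. It assumes the definition given in the Songbird documentation, Q2 = 1 - m1/m2, where m1 is the model's average absolute error on the held-out samples and m2 is the same quantity for the null (baseline) model. The function name `pseudo_q2` and the error values are hypothetical, not taken from your run.

```python
def pseudo_q2(model_cv_error: float, null_cv_error: float) -> float:
    """Q2 = 1 - m1/m2, with m1 the model's and m2 the null's held-out error."""
    if null_cv_error <= 0:
        raise ValueError("null_cv_error must be positive")
    return 1.0 - model_cv_error / null_cv_error

# Hypothetical cross-validation errors, for illustration only.
better_than_null = pseudo_q2(75.0, 100.0)   # model error below the null's
worse_than_null = pseudo_q2(125.0, 100.0)   # model error above the null's

print(better_than_null > 0)  # True: the formula adds predictive value
print(worse_than_null < 0)   # True: the null predicts held-out samples better
```

Under this definition, a negative Q2 is what you get whenever the null's cross-validation curve ends up below the model's, which matches the plot you describe.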

Jamie, thank you for your response. So if I understand what you're saying, it might not be possible to obtain a positive Q2 value for my specific cohort, despite the significant beta diversity difference. Is that correct?

Yes, that's correct.

I see. I appreciate the quick responses. I have one last question, and I apologize if this isn't the best platform for it. You mentioned that the negative Q2 implies that the results aren't generalizable. However, if the model above hypothetically yields the lowest cross-validation and loss scores for this cohort, would the associated Songbird rankings be informative, to any extent, about the differential abundance of features specific to this cohort?