qPCR hacks in QIIME 2?

ChrisKeefe · March 26, 2021, 9:18pm

My colleagues and I have 16S data and matched qPCR fold-change (RQ) data, and I'm wondering whether there are any quick and dirty approaches to combining them in QIIME 2. (Opportunistic wins, not time-intensive work.)

It would be super cool to ask questions like "what kinds of relationships exist between gene expression and diversity?", and it would be trivial to include RQ/fold change in our study metadata, but I'm new to qPCR, and I want to make sure "quick and dirty" doesn't mean "wrong".

Here are some random questions and ideas - I'd love any feedback you all have:

As I understand it, RQ is a normalized value (number of times more some gene was expressed than some other "calibrator" gene, where all genes compare to the same calibrator). This says nothing about relative functional effect of one gene or another, though. Raw comparisons of RQ from one gene to another can only tell us about level of gene expression, right? (If I have RQ=100 mice and RQ=4 elephants in my house, RQ itself can't tell me anything about how clean my kitchen will be.) So it's dangerous to ask questions about differences across genes without incorporating additional information?
If we added an RQ column for each gene quantified to our study metadata, we could use longitudinal volatility to look at changes in expression over time, but we don't really need q2-longitudinal to plot that.
Plotting alpha- or beta-group-significance would be gross. If we're grouping by some other meaningful category (e.g. study site), those plots don't offer us enough variables to also incorporate fold-change. If OTOH we made fold-change a categorical variable (e.g. high fold-change, medium fold-change, low fold-change), we could group on that, and compare the relative expression level of one gene to diversity. This feels like a gross hack, and I'm not sure how we'd choose those thresholds. Is anyone aware of precedent for something like this?
What else am I missing? Emperor plots could fix some of our too-many-variables problems, but I don't think we'd find signal in the noise with our data set. Any other, better ideas? Literature worth looking at?
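
To make the "categorical fold-change" idea above concrete, here is a minimal sketch that bins per-sample RQ for one gene into low/medium/high by tertiles. The sample IDs, RQ values, and the `bin_rq` helper are all hypothetical, and tertiles are just one arbitrary, data-driven way to pick thresholds, not an established precedent.

```python
import bisect
import statistics

# Hypothetical per-sample RQ values for a single gene (illustrative numbers only).
rq_by_sample = {
    "S1": 0.5, "S2": 0.9, "S3": 1.0,
    "S4": 2.0, "S5": 4.0, "S6": 16.0,
}

def bin_rq(rq_by_sample, labels=("low", "medium", "high")):
    """Label each sample by the quantile bin its RQ value falls into."""
    # n = len(labels) yields len(labels) - 1 cut points (tertiles for 3 labels).
    cuts = statistics.quantiles(rq_by_sample.values(), n=len(labels))
    return {
        sample: labels[bisect.bisect_right(cuts, value)]
        for sample, value in rq_by_sample.items()
    }

rq_bins = bin_rq(rq_by_sample)
for sample, label in rq_bins.items():
    print(sample, label)
```

The resulting labels could be added as a categorical column in the sample metadata and used for grouping. Because the cut points come from the data itself, they will shift with every data set, which is part of why this still feels like a hack.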

Thanks for making it this far!
Chris

colinbrislawn · March 28, 2021, 6:59pm

Hello Chris,

If you are able, would you be willing to tell us more about your study? Like, what's your blocked study design, how many samples are in each block, what's your model animal, is there an intervention, etc.?

For me, this means unsupervised feature selection in one omics type, then testing to see if those features explain variance in a different omics type.

So you could get the 10 genes whose RQ changes the most after intervention, then see if those can explain differences in 16S beta diversity using vegan::capscale() or diversity adonis.
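
The feature-selection step could be sketched roughly like this, assuming RQ values are available per gene before and after the intervention. The gene names, numbers, and the `top_changing_genes` helper are hypothetical, and ranking by the absolute shift in mean log2(RQ) is just one reasonable choice:

```python
import math
from statistics import mean

# Hypothetical RQ values per gene, split by time point (illustrative numbers only).
rq = {
    "geneA": {"pre": [1.0, 1.0, 1.0], "post": [4.0, 4.0, 4.0]},
    "geneB": {"pre": [2.0, 2.0, 2.0], "post": [2.0, 2.0, 2.0]},
    "geneC": {"pre": [8.0, 8.0, 8.0], "post": [1.0, 1.0, 1.0]},
    "geneD": {"pre": [1.0, 1.0, 1.0], "post": [2.0, 2.0, 2.0]},
}

def top_changing_genes(rq, k):
    """Return the k genes with the largest absolute shift in mean log2(RQ)."""
    def shift(groups):
        pre = mean(math.log2(v) for v in groups["pre"])
        post = mean(math.log2(v) for v in groups["post"])
        return abs(post - pre)
    # Sort gene names from largest to smallest shift and keep the first k.
    return sorted(rq, key=lambda gene: shift(rq[gene]), reverse=True)[:k]

selected = top_changing_genes(rq, k=2)
print(selected)
```

The RQ columns for the selected genes would then be the explanatory terms in the formula given to capscale() or adonis. Working on the log2 scale treats up- and down-regulation symmetrically, so an 8-fold drop ranks above a 4-fold rise.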

In this paper, I used capscale() to see if beta diversity can be explained by a combination of

time
sunlight
three measurements of productivity (NPP, NHPa, NHPg)
and common taxa at the family level

to create this figure:

Full code here

For the record, predicting beta diversity of samples using the taxa in those samples is filthy.

ChrisKeefe · March 29, 2021, 10:47pm

Thanks for the inspiration, @colinbrislawn!
It's been so long since I looked at the design of this longitudinal study, I can't remember the blocking off the top of my head.

I'll probably give something like this a shot, but the substudy I'm thinking about is running on pretty low N per sample group, so I may not have the power to get real value from it. I'll keep you posted if we can make it work!