My colleagues and I have 16S data and matched qPCR fold-change (RQ) data, and I'm wondering whether there are any quick and dirty approaches to combining them in QIIME 2. (Opportunistic wins, not time-intensive work.)
It would be super cool to ask questions like "what kinds of relationships exist between gene expression and diversity?", and it would be trivial to include RQ/fold change in our study metadata. But I'm new to qPCR, and I want to make sure "quick and dirty" doesn't mean "wrong".
Here are some random questions and ideas - I'd love any feedback you all have:
- As I understand it, RQ is a normalized value (number of times more some gene was expressed than some other "calibrator" gene, where all genes compare to the same calibrator). This says nothing about relative functional effect of one gene or another, though. Raw comparisons of RQ from one gene to another can only tell us about level of gene expression, right? (If I have RQ=100 mice and RQ=4 elephants in my house, RQ itself can't tell me anything about how clean my kitchen will be.) So it's dangerous to ask questions about differences across genes without incorporating additional information?
- If we added an RQ column for each gene quantified to our study metadata, we could use longitudinal volatility to look at changes in expression over time, but we don't really need q2-longitudinal to plot that.
- Plotting alpha- or beta-group-significance would be gross. If we're grouping by some other meaningful category (e.g. study site), those plots don't offer us enough variables to also incorporate fold-change. If OTOH we made fold-change a categorical variable (e.g. high fold-change, medium fold-change, low fold-change), we could group on that, and compare the relative expression level of one gene to diversity. This feels like a gross hack, and I'm not sure how we'd choose those thresholds. Anyone aware of precedent for something like this?
- What else am I missing? Emperor plots could fix some of our too-many-variables problems, but I don't think we'd find signal in the noise with our data set. Any other, better ideas? Literature worth looking at?
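For the high/medium/low idea, here's a rough sketch of one possible way to pick thresholds: split a gene's RQ values into terciles. This is only an illustration of the hack, not an established convention. The RQ values are made up, and `bin_by_tercile` is a hypothetical helper, not part of QIIME 2.

```python
from statistics import quantiles


def bin_by_tercile(values):
    """Label each value 'low', 'medium', or 'high' using tercile cut points."""
    # quantiles(..., n=3) returns the two cut points that split the data into thirds
    low_cut, high_cut = quantiles(values, n=3)
    labels = []
    for v in values:
        if v <= low_cut:
            labels.append("low")
        elif v <= high_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels


# Made-up RQ values for one hypothetical gene, one value per sample
rq = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
rq_bin = bin_by_tercile(rq)

for value, label in zip(rq, rq_bin):
    print(f"RQ={value}\t{label}")
```

The labels could then go into a categorical metadata column to group on. Data-driven cut points like these only say where a sample sits relative to the other samples in the study, so whether they mean anything biologically is still the open question.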
Thanks for making it this far!
Chris