In our 16S sequencing data, we found a few very interesting species as biomarkers, so our laboratory invested good money in metagenomics to study them further. However, in the metagenomic data, those species are not significant. We are very confused and don’t know how to integrate the two kinds of data in the analysis. Can you recommend a strategy for integrating the two datasets, or a good example paper to study?
Welcome to the world of microbiome techniques. 16S and shotgun sequencing not lining up is a common problem. However, a few things to consider:
- “Species” resolution in 16S should be treated very carefully. (Unless you used a specifically curated database, species-level resolution is generally a lie in the human microbiome, except for a few genera.) So, that may be an issue.
- Naming conventions are rarely consistent between databases. Upper levels (phylum, class, sometimes order) often agree across databases, but lower levels are inconsistent. Yes, it’s a problem. No, there’s not a great solution other than annotating consistently and paying attention to the database and database version. And if you’re assembling a custom database, this may again be an issue.
- Abundance-based profiles don’t always line up between 16S and metagenomics, so if you’re focusing on abundance-based metrics, you may be SOL. Consider what assumptions your techniques make about your data and what biases are behind them. Maybe consider copy number correction on your 16S data if you’re concerned this is an issue.
- How do you know that the organisms of interest matter, and what does metagenomics buy you over what you’re getting from marker gene sequencing?
- Aren’t biomarkers a cliché in 2019/2020? Like, haven’t He et al. demonstrated nicely that “biomarkers” often don’t replicate (unless you’re really lucky)?
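To make the copy number and taxonomy-level points above a bit more concrete, here is a minimal Python sketch of what that kind of correction could look like. Everything in it is an assumption for illustration: the function names are made up, the read counts and copy numbers are toy values (not taken from any database), and it assumes taxa are labeled as plain "Genus species" strings. In practice you would pull copy number estimates from a curated resource and use your own taxonomy strings.

```python
from collections import defaultdict


def copy_number_correct(counts, copy_numbers, default_copies=1.0):
    """Divide each taxon's 16S read count by its estimated 16S gene copy
    number, then renormalize so the profile sums to 1."""
    corrected = {
        taxon: count / copy_numbers.get(taxon, default_copies)
        for taxon, count in counts.items()
    }
    total = sum(corrected.values())
    if total == 0:
        return {taxon: 0.0 for taxon in corrected}
    return {taxon: value / total for taxon, value in corrected.items()}


def collapse_to_genus(abundances):
    """Sum abundances by genus, taken as the first word of the label
    (assumes 'Genus species' naming; adapt to your taxonomy format)."""
    genus_totals = defaultdict(float)
    for taxon, value in abundances.items():
        genus_totals[taxon.split()[0]] += value
    return dict(genus_totals)


# Toy 16S read counts and copy numbers, purely illustrative.
counts_16s = {
    "Bacteroides fragilis": 600,
    "Bacteroides ovatus": 200,
    "Escherichia coli": 700,
}
toy_copy_numbers = {
    "Bacteroides fragilis": 6,
    "Bacteroides ovatus": 5,
    "Escherichia coli": 7,
}

profile = copy_number_correct(counts_16s, toy_copy_numbers)
genus_profile = collapse_to_genus(profile)

for genus, fraction in sorted(genus_profile.items()):
    print(f"{genus}: {fraction:.3f}")
```

Collapsing both the 16S and the shotgun profiles to genus before comparing them sidesteps some of the species-level naming problems, at the cost of resolution. Whether that trade-off is acceptable depends on the question you are asking.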