16S rRNA sequencing analysis theoretical question

jwdebelius · December 3, 2020, 4:47pm

You've got a couple of issues here.

You can't; you won't. Even if you used a single region, the number of 16S genes is not related to the number of cells as a general rule. I'd recommend looking at "Quantitative microbiome profiling links gut community variation to microbial load", and specifically Figure S5. There are caveats. If you don't have a lot of bacterial cells, you will have trouble getting DNA to amplify. There are low-biomass protocols that can help with this issue, but in my experience it's closer to binary. The quality of your extraction affects the final read count, but less so than you'd think.

You can check the number of 16S genes in your original sample using qPCR (which does not directly correlate with the number of cells, because cells have multiple 16S genes and copy number variation is always a fun discussion; see posts below). Or you can do flow cytometry like the paper above talked about, if you can get the protocol to work and can find a flow cytometer that will let you run bacterial cells.
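To make the copy number point concrete, here is a minimal sketch of the arithmetic. The function name and the copy numbers (1 and 10 per genome) are hypothetical values picked for illustration, not measurements for any real taxon.

```python
def estimate_cells(gene_copies, copies_per_genome):
    """Rough cell estimate from a 16S qPCR count.

    copies_per_genome is an assumed 16S copy number; in a real community
    it differs between taxa and is often unknown.
    """
    if copies_per_genome <= 0:
        raise ValueError("copies_per_genome must be positive")
    return gene_copies / copies_per_genome


# The same qPCR signal under two hypothetical copy numbers
qpcr_copies = 1_000_000
low_copy_cells = estimate_cells(qpcr_copies, 1)    # one 16S gene per cell
high_copy_cells = estimate_cells(qpcr_copies, 10)  # ten 16S genes per cell

print(low_copy_cells, high_copy_cells)
```

The same gene count gives cell estimates that differ tenfold depending on the copy number you assume, which is why a qPCR count alone does not give you cells.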

This is kind of a separate challenge right now. Finding appropriate ways to combine multiple reads is really difficult as of 3 December 2020. I'll link the Ion Torrent thread below where they work through a potential pipeline.

This is one of those generalizability/specificity problems. It's constrained by read length (longer reads -> more specificity), the region/primers you chose, and what you believe you need. There are cases where having that resolution is critical to the biology and cases where we don't know. Picking your hypervariable region depends on a lot of things, including:

Standard for your environment (for example, the vaginal people have primers they really like)
How much you want to compare across environments (the EMP primers are designed to work in lots of places but may give you lower resolution)
Specific primers can focus on specific clades of interest. Do you need that specificity?
What does your database cover?
How long are your reads, and will you be able to scaffold them? (V1-V3 tends to be longer than a 2x300 Illumina run; V4 with 2x150 tends not to join)
Potential off target effects.
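For the read length question, a quick back-of-the-envelope check is whether the forward and reverse reads can overlap at all once you've trimmed them. This is only a sketch: the function names, the amplicon and read lengths, and the 12 nt minimum overlap are all assumptions for illustration, so plug in the numbers for your own primers and trimming.

```python
def expected_overlap(amplicon_len, fwd_len, rev_len):
    """Bases shared by the two reads; negative means a gap between them."""
    return fwd_len + rev_len - amplicon_len


def can_join(amplicon_len, fwd_len, rev_len, min_overlap=12):
    """True if the trimmed reads overlap by at least min_overlap bases.

    min_overlap=12 is an assumed threshold; check what your joining
    tool actually requires.
    """
    return expected_overlap(amplicon_len, fwd_len, rev_len) >= min_overlap


# Hypothetical amplicon and trimmed read lengths
long_amplicon = can_join(490, 270, 200)   # reads leave a 20 nt gap
tight_amplicon = can_join(290, 150, 150)  # only 10 nt of overlap
short_amplicon = can_join(250, 150, 150)  # 50 nt of overlap

print(long_amplicon, tight_amplicon, short_amplicon)
```

Remember that quality trimming shortens the usable read, so the lengths that matter are the ones after truncation, not the nominal run length.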

Best,
Justine

https://forum.qiime2.org/t/possible-analysis-pipeline-for-ion-torrent-16s-metagenomics-kit-data-in-qiime2/13476/83