Use of Unique Molecular Identifiers for 16S sequencing

vheidrich · November 9, 2020, 8:30pm

Hi there,

Someone recently asked me whether it would be worthwhile to use Unique Molecular Identifiers (UMIs, as used in RNA-seq) for 16S rRNA sequencing, in order to get a closer estimate of the number of copies of each 16S sequence in a given sample. The idea is that attaching a unique sequence tag to each 16S molecule before PCR allows PCR duplicates to be identified, and consequently leads to a better estimate of the true taxon abundances.
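
To make the idea concrete, here is a minimal sketch of UMI-based deduplication. It assumes each read has already been reduced to a (UMI, taxon) pair; the function name, the toy reads and the taxon labels are made up for illustration, and sequencing errors in the UMI are ignored.

```python
from collections import defaultdict

def count_molecules(reads):
    """Return {taxon: (read_count, unique_umi_count)} for (umi, taxon) pairs."""
    umis_per_taxon = defaultdict(set)
    reads_per_taxon = defaultdict(int)
    for umi, taxon in reads:
        umis_per_taxon[taxon].add(umi)   # reads sharing a UMI collapse into one molecule
        reads_per_taxon[taxon] += 1      # raw read count, PCR duplicates included
    return {t: (reads_per_taxon[t], len(umis_per_taxon[t])) for t in reads_per_taxon}

# Hypothetical toy data: taxon_A has one molecule that was amplified three times.
reads = [
    ("AAGT", "taxon_A"), ("AAGT", "taxon_A"), ("AAGT", "taxon_A"),
    ("CCTG", "taxon_A"),
    ("GGCA", "taxon_B"), ("TTAC", "taxon_B"),
]
counts = count_molecules(reads)
```

In this toy case the raw read counts would suggest taxon_A is twice as abundant as taxon_B, while the unique-UMI counts say both started from two molecules.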

I know this has rarely (if ever) been used in the literature, so I assume it is not a great idea, but I still can't say why. Why are we not using UMIs for microbiome amplicon sequencing?

My first guess is that including UMIs would shorten the part of the read covering the region of interest, decreasing taxonomic resolution. Are there any additional points I am missing?

Thank you

llenzi · November 10, 2020, 10:11am

Hi @vheidrich,

I am throwing in a few points on this from the perspective of a 10X user, so it may vary with other single-cell platforms.
I agree with you that you would have decreased taxonomic resolution; in my case I got reads of 150 bp. So you would need to think carefully about which 16S region would be best for your case. I wonder whether genome fragment sequencing would make more sense than 16S, to recover resolution?

My main concern would be the number of bacterial cells in each sample. Ideally, the number of cells should be much lower than the number of UMIs, so you don't end up with the same UMI in more than one cell. For bacteria, I think this may not hold for many sample types (or at least it needs to be checked).
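
A rough way to check this is to estimate how many tags would be shared by chance. The sketch below assumes every tagged unit (cell or molecule) draws one UMI uniformly at random from all 4^k sequences of length k. That is a simplification, the function names are mine, and the figure of 1,000,000 tagged units is an arbitrary placeholder rather than a measured value. It is also a worst case: a shared UMI only causes undercounting when it lands on otherwise identical sequences.

```python
import math

def expected_distinct_umis(n_molecules, umi_length):
    """Expected number of distinct UMIs when n_molecules each draw one UMI
    uniformly at random from the 4**umi_length possible sequences."""
    n_umis = 4 ** umi_length
    # Equivalent to n_umis * (1 - (1 - 1/n_umis)**n_molecules), written with
    # log1p/expm1 so it stays accurate when n_umis is very large.
    return -n_umis * math.expm1(n_molecules * math.log1p(-1.0 / n_umis))

def collision_fraction(n_molecules, umi_length):
    """Expected fraction of molecules lost from the count because they
    share a UMI with another molecule."""
    return 1.0 - expected_distinct_umis(n_molecules, umi_length) / n_molecules

# Placeholder input: 1,000,000 tagged units, a few candidate UMI lengths.
for k in (8, 10, 12, 16):
    print(k, round(collision_fraction(1_000_000, k), 4))
```

The fraction drops quickly as the UMI gets longer, but each extra UMI base is a base of read length no longer spent on the 16S region.
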
Overall, I suspect that adding the single-cell library prep would make the cost much higher than that of a normal 16S amplicon prep. It may not be very cost-effective considering these limitations.
Hope this helps.