Relationship between per-sample sequence counts and number of bacteria in original sample?

After demultiplexing in the Atacama Soil Microbiome tutorial we get a visualization that includes an “overview” tab with a table of “per-sample sequence counts.” We understand that this table shows the sequence count for each individual sample, but we are curious whether each sequence count relates to the number of bacteria found in that sample. We realize you can’t determine the exact bacterial load of a sample (because of the PCR amplification step), but does the sequence count at least let you compare relative bacterial burden between samples? For example, if one sample has 14,000 sequences and another has 2,000, can you say that the first had more bacteria than the second (assuming PCR amplified whatever was there originally with equal efficiency)?

Hi @Kara,
If only that were the case!
Unfortunately, read counts cannot be used to infer bacterial load. There are many technical reasons for this, but perhaps the most important is that, before loading your samples for sequencing, you normalize them all to the same concentration, so in theory you should expect the same number of reads per sample. In reality this never happens, due to numerous factors that are beyond the scope of this reply, but ultimately the answer is no.
There are other ways to measure bacterial load, for example flow cytometry, but sequenced read counts alone cannot do it here.
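
To illustrate what read counts can and cannot tell you, here is a minimal Python sketch. The taxon names and counts are made up for illustration (they are not from the Atacama data), and `relative_abundance` is a hypothetical helper, not a QIIME 2 function. It shows two samples with very different sequencing depths that carry exactly the same compositional information:

```python
# Hypothetical per-taxon read counts for two samples (made-up numbers).
sample_a = {"Taxon1": 7000, "Taxon2": 4200, "Taxon3": 2800}  # 14,000 reads in total
sample_b = {"Taxon1": 1000, "Taxon2": 600, "Taxon3": 400}    # 2,000 reads in total


def relative_abundance(counts):
    """Convert raw read counts to fractions of the sample's total reads."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


rel_a = relative_abundance(sample_a)
rel_b = relative_abundance(sample_b)

for taxon in sample_a:
    print(f"{taxon}: sample A {rel_a[taxon]:.2f}, sample B {rel_b[taxon]:.2f}")
```

The sevenfold difference in total reads reflects how much library each sample contributed to the run, not how many cells were in the original soil, so only the within-sample proportions are interpretable.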

Just to add to @Mehrbod_Estaki's excellent advice: if you are still at the start of your experiment, it is possible to estimate cell counts alongside your sequencing data by using spike-in controls or other methods.
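
As a rough sketch of the spike-in idea, assuming a known number of spike-in cells is added to every sample before DNA extraction: the reads recovered from the spike-in give a per-sample "cells per read" factor that can be used to scale the other taxa. Everything below is hypothetical (the function name, the taxon labels, the numbers), and it assumes the spike-in and the native taxa are extracted and amplified with equal efficiency and ignores 16S copy-number differences, which real protocols have to deal with.

```python
def estimate_absolute_counts(read_counts, spike_taxon, spike_cells_added):
    """Scale read counts to estimated cell counts using a spike-in control.

    read_counts: dict mapping taxon name to read count for one sample
    spike_taxon: key in read_counts that holds the spike-in reads
    spike_cells_added: known number of spike-in cells added to the sample
    """
    spike_reads = read_counts[spike_taxon]
    if spike_reads == 0:
        raise ValueError("no spike-in reads recovered; cannot scale this sample")
    cells_per_read = spike_cells_added / spike_reads
    # Report only the native taxa, not the spike-in itself.
    return {
        taxon: n * cells_per_read
        for taxon, n in read_counts.items()
        if taxon != spike_taxon
    }


# Made-up example: sample A has 14,000 reads in total, sample B only 2,000.
sample_a = {"SpikeIn": 10000, "Taxon1": 3000, "Taxon2": 1000}
sample_b = {"SpikeIn": 100, "Taxon1": 1500, "Taxon2": 400}

est_a = estimate_absolute_counts(sample_a, "SpikeIn", spike_cells_added=10000)
est_b = estimate_absolute_counts(sample_b, "SpikeIn", spike_cells_added=10000)

print("Sample A estimated cells:", sum(est_a.values()))
print("Sample B estimated cells:", sum(est_b.values()))
```

In this toy case the sample with fewer total reads ends up with the larger estimated load, because its spike-in made up a much smaller share of the reads. That is the kind of information raw per-sample sequence counts cannot provide on their own.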

Otherwise, as @Mehrbod_Estaki mentioned, you simply cannot use read counts to compare bacterial load between samples.

Thank you so much! This helps a lot.
