How does contaminating human DNA affect your sample dilution prior to sequencing?

Labm · March 11, 2021, 12:37am

I am working with low microbial biomass samples, which are known to have high levels of contaminating human DNA extracted alongside the microbial DNA.

If you determine the DNA concentration after extraction, there is a lot of variability between samples. I need to dilute all the samples to ~5 ng/ul prior to the PCR/sequencing steps. However, considering that the DNA concentration includes both microbial and human DNA, will this mess up downstream alpha and beta diversity analyses (as you are not diluting the microbial DNA consistently between samples)? Or will it not be a problem, as long as you validate any relative taxonomic data with qPCR? I am performing an amplification check now to see if DNA concentration aligns with band brightness after dilution.

EDIT: Performed the PCR with the V3-V4 primers on diluted DNA. The samples with very high concentrations post extraction, which were then diluted the most, did not have a visible band. This indicates to me that the higher concentration comes not from more microbial DNA but from human DNA. I could send a higher concentration away, but this leads me back to my original question: does diluting your DNA differently post extraction, prior to the sequencing steps (indexing etc.) and Illumina 16S, affect your downstream results?

Mehrbod_Estaki · March 11, 2021, 2:42am

Hi @Labm ,
This is a rather complex issue, but not one that hasn't been discussed in the literature before. You are essentially working with a version of low-biomass samples. Low-biomass samples can be ones that have low total target DNA, or ones where the target DNA sits alongside a higher abundance of non-target DNA (your case). Both scenarios can be problematic in a variety of ways, such as low sequencing depth or sequencing bias. The best solution, as you are exploring, is to resolve this during library preparation. There are a lot of strategies for dealing with this, such as using kits specific for low-biomass samples, using less destructive lysing methods, depleting non-target DNA (e.g. through gel extraction or enzymatic removal), etc. Ultimately, you will never be able to get the perfect 5 ng/ul of your target DNA for sequencing, and that's ok, you don't really need to. During the bioinformatics processing, you can always filter out non-target sequences, and as long as the remaining sequences have an adequate depth, this shouldn't affect your diversity measures. Some assortment of readings on the topic, here, here, and here. <- loads more out there, these just happen to be more recent.
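To make the "filter, then check depth" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the sample names, the read counts, the `target`/`non_target` split (which in practice would come from your taxonomic classification step), and the `MIN_DEPTH` cutoff, which you would pick from your own rarefaction curves rather than from this example.

```python
# Hypothetical per-sample read counts after taxonomic classification.
# "non_target" stands for reads assigned to host (human) or other
# unwanted sources; "target" is what remains for diversity analysis.
samples = {
    "S1": {"target": 18500, "non_target": 1500},
    "S2": {"target": 4200, "non_target": 15800},
    "S3": {"target": 650, "non_target": 19350},
}

MIN_DEPTH = 1000  # assumed cutoff, not a recommendation


def summarize(samples, min_depth):
    """Map each sample to (target reads, target fraction, passes cutoff)."""
    summary = {}
    for name, counts in samples.items():
        total = counts["target"] + counts["non_target"]
        kept = counts["target"]  # reads left once non-target reads are dropped
        fraction = kept / total if total else 0.0
        summary[name] = (kept, fraction, kept >= min_depth)
    return summary


summary = summarize(samples, MIN_DEPTH)
for name, (kept, fraction, ok) in summary.items():
    status = "keep" if ok else "below cutoff"
    print(f"{name}: {kept} target reads ({fraction:.1%} of total) -> {status}")
```

In this made-up data all three samples were sequenced to the same total depth, but the fraction of target reads differs a lot. What matters for alpha and beta diversity is whether the target reads left after filtering clear your depth cutoff, not how much total DNA went into the PCR.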