Contamination Introduced during Sequencing?

Hi all,

I’m currently analyzing 3 independent 16S experiments and am noticing what I believe to be a lot of cross-contamination. Each experiment had a different host (although all were marine), each was conducted in a different location (California, New York, and Massachusetts), one used a different DNA extraction kit and different PCR reagents than the others, and the experiments were processed at different times. The only things shared between all experiments were the magnetic beads used to clean up the PCR reaction and the thermocycler. But I’m finding sOTUs in common between (just about) every single sample (about 160). I find it hard to believe that the magnetic beads could introduce contamination, because by the time I do the PCR cleanup the amplicons are already barcoded. Could cross-contamination like this happen during the sequencing process?


Hello Jordan,

Sample-to-sample contamination!

Did you sequence any positive controls we could use to verify that these microbes are sneaking into samples they were not in originally?

What’s the magnitude of the crossover? 160 OTUs out of 200 or out of 2000 total OTUs? Are these OTUs equally abundant in all samples, or more common in some than others?

The reason I ask about magnitude is that the Illumina platform has a 0.1% - 2% barcode mislabeling rate according to Illumina itself (lol), and while there are methods to remove it, I find it rarely interferes with my scientific question.

Establishing the effect size of this contamination will let us see if this is Illumina background, or a systemic problem in the wetlab prep.
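One rough way to put numbers on that magnitude is sketched below. It assumes the feature table has been exported to a plain mapping of OTU to per-sample counts; the function name `shared_otu_summary` and the toy counts are hypothetical and only there for illustration, not taken from the data in this thread.

```python
def shared_otu_summary(table, min_prevalence=1.0):
    """Summarize OTUs found in at least `min_prevalence` of the samples.

    `table` maps an OTU name to a list of read counts, one per sample.
    """
    n_samples = len(next(iter(table.values())))
    # An OTU counts as "shared" if it has at least one read in enough samples.
    shared = [
        otu for otu, counts in table.items()
        if sum(c > 0 for c in counts) / n_samples >= min_prevalence
    ]
    total_reads = sum(sum(counts) for counts in table.values())
    shared_reads = sum(sum(table[otu]) for otu in shared)
    return {
        "shared_otus": shared,
        "fraction_of_otus": len(shared) / len(table),
        "fraction_of_reads": shared_reads / total_reads,
    }


# Hypothetical toy counts for three samples (not real data).
toy = {
    "otuA": [500, 480, 510],  # abundant everywhere
    "otuB": [2, 1, 3],        # present everywhere, but only a few reads
    "otuC": [0, 300, 0],
    "otuD": [250, 0, 0],
}

summary = shared_otu_summary(toy)
print(summary)
```

A shared OTU that looks like `otuB` (a handful of reads in every sample) is what low-level barcode mislabeling would tend to produce, while one that looks like `otuA` points more toward a real shared taxon or a wet-lab problem.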



Thanks so much for replying! Just a clarifying question: is the mislabeling rate for a single OTU? For example, if one OTU has 100 reads across all libraries, then two of those reads might find themselves in libraries they aren’t supposed to be in?

  • Jordan

Hello Jordan,

The mislabeling rate can be thought of as the probability that any given read is going to get the wrong barcode assigned.

So imagine you had a positive control sample with only one OTU inside of it, like this:

otu    positiveControl   s1   s2   s3
mock               100    0    0    0
otu1                 0   20   20   20
otu2                 0   50   50    0
otu3                 0   30   30   80

Note that 1) the mock microbe only appears in that one sample and 2) no other reads appear in that sample.

With a 2% error rate, you get this:

otu    positiveControl   s1   s2   s3
mock                98    0    1    1
otu1                 1   19   20   20
otu2                 1   49   49    1
otu3                 3   30   29   78

So reads are sneaking into the control sample, and the control mock microbe is leaking into the real samples s1, s2, and s3.
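If you want to play with this yourself, here is a minimal simulation sketch of the toy example above. It assumes that each read independently gets a wrong barcode with probability `rate`, and that a mislabeled read lands in any other sample with equal chance; real mislabeling need not be uniform like this. The function name `simulate_mislabeling` is made up for illustration, and the exact counts it produces depend on the random seed.

```python
import random

samples = ["positiveControl", "s1", "s2", "s3"]

# Counts from the clean table above: OTU -> reads per sample.
clean = {
    "mock": [100, 0, 0, 0],
    "otu1": [0, 20, 20, 20],
    "otu2": [0, 50, 50, 0],
    "otu3": [0, 30, 30, 80],
}


def simulate_mislabeling(table, rate, seed=0):
    """Move each read to a different, uniformly chosen sample with probability `rate`."""
    rng = random.Random(seed)
    n_samples = len(next(iter(table.values())))
    noisy = {otu: [0] * n_samples for otu in table}
    for otu, counts in table.items():
        for src, count in enumerate(counts):
            for _ in range(count):
                dst = src
                if rng.random() < rate:
                    # Assumption: the wrong barcode is any other sample, equally likely.
                    dst = rng.choice([i for i in range(n_samples) if i != src])
                noisy[otu][dst] += 1
    return noisy


noisy = simulate_mislabeling(clean, rate=0.02)
print("otu", *samples)
for otu, counts in noisy.items():
    print(otu, *counts)
```

Each OTU keeps the same total number of reads across all samples; only the sample labels on a few reads change. That is why the mock microbe can show up in s1, s2, or s3 even though it was never physically in those tubes.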

This is separate from user error, like mislabeling a tube or accidentally double-dipping a pipette tip.

How many positive controls did you run?


Unfortunately, we only ran negative controls. That’s something I will do differently in the future, along with adding mitigation strategies to avoid crosstalk. But this has been extremely insightful and will certainly help me choose which sOTUs in my negative controls to filter out. Thank you so much!



Be careful! See this thread about why filtering out everything in the NTC (no-template control) can be dangerous, or check out decontam directly.

Let us know if you have any other questions!

