Decontamination Method

smd · May 27, 2021, 3:35pm

Hello community,
recently my research group, like others working with microbiome data, has been struggling with decontamination of our 16S samples.
We were using DECONTAM and found that it is very aggressive on our samples (saliva, with high heterogeneity between subjects). I then took the difference between the read counts of each ASV in the controls and its read counts in the samples, which is less strict.
Today a colleague of mine found microDecon, which I had not heard of before. Has anyone used it? Is it a good approach?
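
For readers unsure what the read-difference approach looks like, here is a minimal sketch. It is one possible reading, not necessarily what was actually run: it assumes the highest read count of each ASV across the negative controls is subtracted from every sample, with negative results floored at zero. All sample names, ASV names, and counts are hypothetical.

```python
# Hypothetical ASV read counts; every name and number here is made up for illustration.
sample_counts = {
    "saliva_01": {"ASV1": 1200, "ASV2": 35, "ASV3": 0},
    "saliva_02": {"ASV1": 800, "ASV2": 10, "ASV3": 50},
}
control_counts = {
    "blank_01": {"ASV1": 5, "ASV2": 30, "ASV3": 0},
    "blank_02": {"ASV1": 0, "ASV2": 20, "ASV3": 0},
}


def subtract_control_reads(samples, controls):
    """Subtract each ASV's maximum control read count from every sample, flooring at 0.

    Using the maximum (rather than the mean or sum) is an assumption of this sketch.
    """
    asvs = {asv for counts in controls.values() for asv in counts}
    max_control = {
        asv: max(counts.get(asv, 0) for counts in controls.values()) for asv in asvs
    }
    return {
        name: {asv: max(n - max_control.get(asv, 0), 0) for asv, n in counts.items()}
        for name, counts in samples.items()
    }


cleaned = subtract_control_reads(sample_counts, control_counts)
for name, counts in cleaned.items():
    print(name, counts)
```

Unlike a statistical classifier, this never removes an ASV outright; it only lowers counts, which is why it tends to be less strict.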

thank you in advance,
best regards,
Sara

HugoEira · May 28, 2021, 10:22am

Hello,

I am also very interested to know your opinions.
I have also been using DECONTAM, but I am not sure whether there is a better way of doing this.

Cheers,
Hugo

yanxianl · May 29, 2021, 5:06pm

Given that falsely removing true biological features may have a huge impact on the analysis, I generally prefer to do it manually following the principles outlined in the DECONTAM paper.

Here's what I do for my projects:

1. Filter features present in my negative controls, which gives candidate features for screening contaminants.
2. Make prevalence and frequency plots. The former plots show the relative abundance of each feature in the biological and control samples, including negative and positive controls. The latter plots show the correlation between sample DNA concentration and feature relative abundance: a negative correlation suggests contamination.
3. Screen contaminating features manually one by one based on the plots generated in step 2.
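
The first two steps can be roughed out in a few lines. This is only a sketch in Python with made-up data, not the R code from the repo mentioned below: the table layout, sample names, ASV names, and both helper functions are hypothetical. It lists candidate features and the numbers a prevalence plot would show; the actual call on each feature stays manual.

```python
# Hypothetical relative abundances (fractions per sample); all names are made up.
rel_abund = {
    "gut_01": {"ASV_a": 0.60, "ASV_b": 0.40, "ASV_pseudo": 0.00},
    "gut_02": {"ASV_a": 0.55, "ASV_b": 0.44, "ASV_pseudo": 0.01},
    "blank_01": {"ASV_a": 0.00, "ASV_b": 0.10, "ASV_pseudo": 0.90},
    "blank_02": {"ASV_a": 0.00, "ASV_b": 0.00, "ASV_pseudo": 1.00},
}
negative_controls = {"blank_01", "blank_02"}


def candidate_features(table, controls):
    """Step 1: features with non-zero abundance in at least one negative control."""
    return sorted({asv for s in controls for asv, v in table[s].items() if v > 0})


def prevalence_summary(table, controls, features):
    """Numeric side of step 2: prevalence and mean abundance per feature and group."""
    groups = (("control", set(controls)), ("sample", set(table) - set(controls)))
    summary = {}
    for asv in features:
        summary[asv] = {}
        for group, names in groups:
            values = [table[s].get(asv, 0.0) for s in sorted(names)]
            summary[asv][group] = {
                "prevalence": sum(v > 0 for v in values) / len(values),
                "mean_abundance": sum(values) / len(values),
            }
    return summary


candidates = candidate_features(rel_abund, negative_controls)
summary = prevalence_summary(rel_abund, negative_controls, candidates)
for asv, groups in summary.items():
    print(asv, groups)
```

A feature that dominates the blanks but is rare in the biological samples is a strong candidate; one that is abundant everywhere needs a closer look before removal.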

This process is slow but reassuring. Usually, it takes me 1-2 days to finish the work and I don't need to come back to it again.

Below is an example showing the prevalence and frequency plots of a contaminating ASV classified as Pseudomonas. It's a contaminating feature commonly found in our lab.

Prevalence plot. Note that the relative abundance of this contaminating ASV is much higher in the negative control samples and increases in diluted mock samples (far right; original concentration, 1:16, 1:32).

Frequency plot. I used raw Cq values as proxies for sample DNA concentrations. Because a higher Cq means less starting DNA, the negative correlation with DNA concentration shows up here as a positive correlation with Cq.

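
To make the sign flip concrete, here is a small sketch with made-up numbers. The Cq values and abundances are hypothetical, and treating each Cq unit as roughly a two-fold drop in template assumes close to 100% PCR efficiency. Pearson correlation is used only for simplicity; a rank-based measure may suit real data better.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical data: one contaminating ASV across four samples.
cq = [18.0, 22.0, 26.0, 30.0]  # higher Cq = less template DNA
contaminant_abundance = [0.001, 0.01, 0.05, 0.30]

# Assumption: log2 of relative DNA concentration is about -Cq (up to a constant offset).
log2_relative_conc = [-value for value in cq]

r_cq = pearson(cq, contaminant_abundance)
r_conc = pearson(log2_relative_conc, contaminant_abundance)
print(f"abundance vs Cq: {r_cq:+.2f}")
print(f"abundance vs log2 concentration: {r_conc:+.2f}")
```

The two coefficients have the same magnitude and opposite signs, so a positive trend against Cq carries the same evidence as a negative trend against concentration.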

Besides negative controls, mock samples are also quite useful when it comes to contaminant screening, as you know what should and should not be there. If you have serially diluted mock samples, it's even easier.

If you're interested, you can look at the code (code/03_filtering.Rmd) on my GitHub repo that generates the plots.