Processing samples with Dada2 in different batches changes results

Hello,
I've been looking into how the final count tables differ depending on which samples are processed with DADA2 at the same time. From my analysis, it looks like the final ASV and genus count tables change depending on the collection of samples processed together.

All the samples I will be working with are from the same batch (all were processed and sequenced together), since DADA2 should be run independently on each batch.

The workflow consisted of the following steps:

  1. Randomly generate processing groups consisting of 2-25 samples.
  2. Process each group separately: DADA2 -> closed-reference OTU clustering against SILVA 132 -> taxonomy classification with a Naive Bayes classifier.
  3. TSS normalize each sample.
  4. For each sample, calculate the Bray-Curtis distance between all pairwise combinations of the processing groups it was in.
  5. Plot the Bray-Curtis distances between each processing group the sample was in.
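Steps 3-4 above can be sketched in a few lines. This is a minimal illustration only, assuming plain total-sum scaling and the standard Bray-Curtis formula; the group names and count vectors are made up stand-ins for one sample's per-group ASV counts:

```python
# Sketch of the comparison: TSS-normalize one sample's counts from each
# processing group, then take Bray-Curtis distances between all group pairs.
import numpy as np
from itertools import combinations

def tss_normalize(counts):
    """Total-sum scaling: divide each count by the sample's total."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return np.abs(u - v).sum() / (u + v).sum()

# One sample's ASV counts from three hypothetical processing groups,
# aligned on a common set of four features.
runs = {
    "group_A": [120, 30, 0, 5],
    "group_B": [118, 28, 2, 6],
    "group_C": [90, 55, 0, 4],
}

for g1, g2 in combinations(runs, 2):
    d = bray_curtis(tss_normalize(runs[g1]), tss_normalize(runs[g2]))
    print(f"{g1} vs {g2}: {d:.3f}")
```

In practice the vectors would come from each processing run's ASV table, aligned on the union of observed features (absent features filled with zeros) before computing distances.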

I used this workflow to independently look at two different batches of samples (samples from different batches were not mixed). For both batches, each sample was processed around 10 times, each time with a different collection of other samples.

The following boxplots show the Bray-Curtis distances for a single sample processed several times. I looked at the Bray-Curtis distance for both the ASV and genus tables.

For batch 1, it is clear that the final counts for some samples differed considerably depending on the collection of samples they were processed with. For batch 2, the effect was less drastic, with only 2-3 samples displaying differences between processing groups.

Furthermore, the differences are much more prominent at the ASV level than at the genus level. This makes sense to me: at higher taxonomic ranks, many of those finer details collapse into the same groups.

My current reasoning for this effect is that DADA2 learns the error rates from only the first samples it reads, until it reaches 1e8 bases. In each processing group, different samples are likely being used to learn the error rates, thereby changing the denoised ASVs generated by DADA2.
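To illustrate this hypothesis: if error training consumes samples in the order given until a base budget is reached (1e8 bases, mirroring the DADA2 default), then two groupings of the same samples can train on different subsets. This is a hypothetical sketch; the sample names, read counts, and fixed read length are all made up:

```python
# Which samples end up in the error-model training set depends on order:
# samples are consumed until the cumulative base count reaches the budget.
def training_samples(samples, read_length=250, base_budget=int(1e8)):
    """Return the prefix of samples whose cumulative bases reach the budget."""
    chosen, total = [], 0
    for name, n_reads in samples:
        chosen.append(name)
        total += n_reads * read_length
        if total >= base_budget:
            break
    return chosen

# The same three samples, grouped/ordered two different ways.
group1 = [("S1", 300_000), ("S2", 250_000), ("S3", 200_000)]
group2 = [("S3", 200_000), ("S1", 300_000), ("S2", 250_000)]

print(training_samples(group1))  # different prefixes -> different error models
print(training_samples(group2))
```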

Any insights or suggestions as to these results is greatly appreciated!


Hi @ryan4martin, welcome to the forum!

I have a couple of follow up questions if you wouldn't mind:

  1. Are your DADA2 parameters identical across all these runs? Is this the q2-dada2 implementation or the standalone R package? Which version number are you using?

  2. When you say ASVs here, are you referring to the direct output of DADA2, or to the closed-reference OTUs produced after it? Your workflow describes a clustering step after DADA2, but it is unclear whether that is just so you can collapse for the genus plots, or whether you used OTUs for the "ASV" plots as well. Also note that you can collapse ASVs to the genus level without ever doing OTU clustering.

Some initial thoughts:

  • You are correct that DADA2 (by default) will use the first 1 million reads it sees to train its error model, so reshuffling the sample order or processing different sample batches will yield different results. However, those differences are expected to be very small as long as the samples are all from the same run (which you mentioned they are) and there are sufficient reads for the error model to learn properly.
  • Some other DADA2-technical issues you may want to double-check are:
    • If using native DADA2, check the error models to see if they look reasonable
    • DADA2 expects that you have removed all non-biological sequences from your reads first; this includes your primers (especially if they are degenerate). This is an important one that can add this kind of variation.
    • Is DADA2 behaving as expected? For example, on average how many of your reads are making it through the whole process? If a large portion of them are being discarded, this would indicate some other issue upstream.
    • How do your results look if you don't apply the TSS normalization? I understand you need to apply some form of normalization for real biological analysis, but in this scenario, since you are only comparing the same sample to itself, I think you can skip that step, in case TSS is driving some of these changes.
    • What happens if you remove your rare/sparse features? Rare features are hard for denoisers to call, and so are often more influenced by this kind of batch-to-batch effect.
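The rare-feature filter suggested above might be sketched like this: a hypothetical prevalence filter on a made-up samples-by-features count table, with an arbitrary 10% prevalence threshold:

```python
# Drop features (columns) present in at most min_fraction of samples (rows).
import numpy as np

def prevalence_filter(table, min_fraction=0.10):
    """Keep features with nonzero counts in more than min_fraction of samples."""
    table = np.asarray(table)
    prevalence = (table > 0).mean(axis=0)  # fraction of samples per feature
    return table[:, prevalence > min_fraction]

# Hypothetical 10 samples x 3 features: feature 0 is in every sample,
# feature 1 in half, feature 2 in only one sample (prevalence 0.1).
counts = np.array([
    [10, 5, 0],
    [ 8, 0, 0],
    [12, 3, 0],
    [ 9, 0, 0],
    [11, 4, 0],
    [ 7, 0, 0],
    [10, 2, 0],
    [ 9, 0, 0],
    [13, 6, 0],
    [ 8, 0, 7],
])
print(prevalence_filter(counts).shape)  # → (10, 2): the rare feature is dropped
```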

Not sure if any of these will completely resolve what you see here by themselves, but they may be a starting point.


@Mehrbod_Estaki thanks for bringing up some interesting points! Here is what I have so far in response to your questions and comments:

"1. Are your DADA2 parameters identical across all these runs? Is this the q2-dada2 implementation or the standalone R package? Which version number are you using?"

  • I am using DADA2 v1.18.0 in R with the same parameters for each run.

"2. When you say ASVs here, are you referring to the direct output of DADA2, or to the closed-reference OTUs produced after it? Your workflow describes a clustering step after DADA2, but it is unclear whether that is just so you can collapse for the genus plots, or whether you used OTUs for the "ASV" plots as well. Also note that you can collapse ASVs to the genus level without ever doing OTU clustering."

  • The ASVs in the plots refer to the direct output of DADA2. I initially did the analysis at the genus level, and when I suspected it might be a DADA2 problem I wanted to look at the direct output of DADA2.

"DADA2 expects that you have removed all non-biological sequences from your reads first; this includes your primers (especially if they are degenerate). This is an important one that can add this kind of variation."

  • The primer sequences are trimmed with DADA2's filterAndTrim function.

I am working on the other points you brought up and will post the results when I have them.


Hello,

I've finally had some time to check out the rest of the points you mentioned, and here is where I am at so far.

  • How do your results look if you don't apply the TSS normalization? I understand you need to apply some form of normalization for real biological analysis, but in this scenario, since you are only comparing the same sample to itself, I think you can skip that step, in case TSS is driving some of these changes.
  • What happens if you remove your rare/sparse features? Rare features are hard for denoisers to call, and so are often more influenced by this kind of batch-to-batch effect.

I checked the Bray-Curtis distance between the same sample processed in different groups, as before, with these variations. In the graph, "Counts" refers to non-normalized values and "TSS" to normalized values; "all features" uses all ASVs, while "features in >10% samples" is the condition I used to remove rare features from the table before calculating the Bray-Curtis distance. The TSS normalization appears to increase the distance between samples, but removing rare features has minimal effect.


  • Is DADA2 behaving as expected? For example, on average how many of your reads are making it through the whole process? If a large portion of them are being discarded, this would indicate some other issue upstream.

I tracked the number of reads remaining after each step of DADA2: after filtering, denoised forward reads (denoisedF), denoised reverse reads (denoisedR), merged reads (merged), and after chimera removal. It looks like I am losing quite a few reads at the chimera removal stage for both batches, with ~50% of merged reads lost in batch 1 and ~30% in batch 2.

batch_1_dada2
batch_2_dada2

That is quite a few reads lost to the chimera removal step. I have been looking into the minFoldParentOverAbundance parameter and think the next step is to try a few different values and see whether I retain more reads. Any other suggestions or insights are greatly appreciated!
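The per-stage read tracking above boils down to simple ratios of the tracked counts. Here is a sketch with made-up numbers (the ~50% chimera-stage loss mirrors batch 1; all other counts are invented for illustration):

```python
# Read counts surviving each DADA2 stage for one hypothetical sample.
track = {
    "input": 100_000,
    "filtered": 90_000,
    "denoisedF": 88_000,
    "denoisedR": 87_500,
    "merged": 80_000,
    "nonchim": 40_000,  # ~50% of merged reads flagged as chimeric
}

# Fraction of merged reads lost at the chimera-removal stage.
merged_lost = 1 - track["nonchim"] / track["merged"]
print(f"fraction of merged reads lost at chimera removal: {merged_lost:.0%}")

# Retention at each stage relative to the raw input.
for stage, n in track.items():
    print(f"{stage:>10}: {n / track['input']:.1%} of input")
```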


Hi @ryan4martin ,
Thanks for the updated info and sorry about the slow response, I'm right in the middle of a big move so I'll follow up as soon as I can!

Hi @ryan4martin,
Thanks for the updates. 50% chimeras does seem a little excessive, but it may also be OK, depending on your samples. That is, it's not totally unheard of...
I'm not entirely sure what else to propose at the moment, so let's see what the other experts think.
Is there anything biologically interesting about the handful of samples that seem to have the higher variance in the plot?