Merge samples from different Illumina Runs - same trimming options?

fstudart · January 14, 2020, 9:22pm

Hi Everyone,

Just looking for some advice: if I want to combine samples (already denoised with DADA2) from different sequencing runs, is it recommended to have used the same trimming options during the DADA2 denoising step before merging them into a single feature table?

I should also mention that both runs targeted the V4 region (250-bp reads), and the trimming options I used in the past for each run are slightly different from each other.

Thanks very much,
FS

llenzi · January 15, 2020, 10:11am

Hi @fstudart,

Strictly speaking, it is good policy to denoise different runs with the same trimming parameters before merging them.
In your case, R1 and R2 should overlap almost fully for the V4 region, so using different trimming settings should not have a very big impact.
That is assuming the quality of the two runs is equally good (or bad …)
Still, there is only one way to know for sure …

Luca

fstudart · January 16, 2020, 7:13pm

Hi Luca!

Thanks for your input. As all my samples were sequenced the same way (paired-end sequencing, 250-bp reads, V4), can I put all the samples I am interested in now (from the two different runs) into the same folder and start analyzing them from scratch as if they came from the same run? By doing so, the trimming parameters would be the same. Would this be acceptable? Also, is there a way to evaluate possible batch effects in QIIME 2 or with another tool?

Thanks very much,
FS

Mehrbod_Estaki · January 17, 2020, 8:31am

Hi @fstudart,
If these samples went through the same PCR run and the same sequencing run, then you can denoise them together; otherwise it is recommended that you run each run through DADA2 separately with the same trimming parameters and merge afterward. The trimming parameters need to be exactly the same, because even a single-nucleotide difference between two features will lead to them being identified as different ASVs. If your trimming parameters are not the same, you will see very clearly on an ordination plot that your samples cluster more strongly by sequencing run. The same would go for the truncating parameters (cutting from the 3’ end) if you were using single-end reads, but since these are paired-end reads that merge at the 3’ end, truncation is not an issue.
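To make the point about a single nucleotide concrete, here is a minimal Python sketch, not the DADA2 algorithm itself. It assumes a toy model in which 5’ trimming of the forward and reverse reads removes bases from the two ends of the merged sequence, and it assumes feature IDs are MD5 hashes of the sequence. The amplicon string and the function names are made up for illustration.

```python
import hashlib


def merged_sequence(amplicon: str, trim_left_f: int, trim_left_r: int) -> str:
    """Toy model of a merged read pair after 5' trimming of each read.

    The forward read starts at the left end of the amplicon and the reverse
    read at the right end, so 5' trimming shortens the merged sequence at
    both ends.
    """
    return amplicon[trim_left_f:len(amplicon) - trim_left_r]


def feature_id(sequence: str) -> str:
    """Hash a sequence into a feature ID (assumption: MD5 of the sequence)."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


# Hypothetical amplicon present in both runs (not real V4 data).
AMPLICON = "TACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCG"

run1_same = feature_id(merged_sequence(AMPLICON, 0, 0))
run2_same = feature_id(merged_sequence(AMPLICON, 0, 0))
run2_shifted = feature_id(merged_sequence(AMPLICON, 1, 0))

# Identical trimming: both runs report the same feature.
print(run1_same == run2_same)
# One extra base trimmed in run 2: the same organism becomes a second feature.
print(run1_same == run2_shifted)
```

In this model, a merged table built from the two differently trimmed runs would hold the same organism under two unrelated IDs, which is the kind of artifact that shows up as clustering by run.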
Identifying batch effects is a bit tricky, but the easiest way I go about it is, as I mentioned, to look at your ordination plots and see whether there is any artificial clustering by sequencing run (you should add a run column to your metadata file for this). As far as I’m aware there is one QIIME 2 plugin that deals with normalization of batch effects, q2-perc-norm, which might be worth reading up on.
Good luck!
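As a rough numeric companion to the visual check on the ordination plot, the following Python sketch compares mean Bray-Curtis distances within and between sequencing runs. It is only an illustration under stated assumptions: the feature table is a plain dict of per-sample counts, the `run_of` mapping stands in for the run column in the metadata file, and all sample names, feature names, and counts are invented.

```python
from itertools import combinations


def bray_curtis(a: dict, b: dict) -> float:
    """Bray-Curtis dissimilarity between two samples given as feature -> count."""
    features = set(a) | set(b)
    num = sum(abs(a.get(f, 0) - b.get(f, 0)) for f in features)
    den = sum(a.get(f, 0) + b.get(f, 0) for f in features)
    return num / den if den else 0.0


def within_between(table: dict, run_of: dict) -> tuple:
    """Mean pairwise distance within the same run and between different runs."""
    within, between = [], []
    for s1, s2 in combinations(sorted(table), 2):
        d = bray_curtis(table[s1], table[s2])
        (within if run_of[s1] == run_of[s2] else between).append(d)

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return mean(within), mean(between)


# Hypothetical merged table: runs A and B share no feature IDs, as could
# happen when the two runs were trimmed differently.
table = {
    "A1": {"asv1": 10, "asv2": 5},
    "A2": {"asv1": 8, "asv2": 6},
    "B1": {"asv3": 10, "asv4": 5},
    "B2": {"asv3": 9, "asv4": 6},
}
run_of = {"A1": "runA", "A2": "runA", "B1": "runB", "B2": "runB"}

mean_within, mean_between = within_between(table, run_of)
print(mean_within, mean_between)
```

A between-run mean far above the within-run mean is a hint to look more closely at run effects. It does not separate a technical batch effect from a real biological difference between the sample sets.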

llenzi · January 17, 2020, 10:09am

Hi @fstudart,

When I wrote that the trimming options should not matter much in your situation, I had in mind the 3’ truncation length: in your case the total length of the merged sequence is fixed by the full overlap between R1 and R2. But I totally agree with @Mehrbod_Estaki: be careful with the 5’ trimming length, as it could invalidate my earlier assumption …
As for processing the samples together, again, I agree with @Mehrbod_Estaki. If the samples were processed with the same protocols/methods but in different batches, it is probably safer to denoise them separately (you may still have batch effects …).
Good luck

fstudart · January 20, 2020, 5:56pm

Hi Luca,

Thanks very much!

FS
