How to analyse two datasets with different read lengths

steffi · March 2, 2020, 11:52am

Dear All,
I have two datasets with different read lengths: some are 251 nt and some are 276 nt. How should I proceed with the DADA2 step, given that I have to specify the trimming length? Kindly help me with this.

timanix · March 2, 2020, 2:22pm

Hi!
If it is the same region, and the shorter reads, once truncated according to their quality scores as advised in the tutorial, still leave enough overlapping nucleotides, then the same truncation will work for the longer reads as well, so I suppose you can process them together with the same parameters.
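
As a rough check of the overlap condition above, the arithmetic can be sketched in Python. This is only an illustration: the amplicon length and truncation values are made-up placeholders, not numbers from this thread, and the function names are hypothetical. The 12-nt threshold is the default `minOverlap` of DADA2's `mergePairs`.

```python
# Rough overlap check for paired-end reads after truncation.
# Every number passed to these helpers below is a hypothetical placeholder.

MIN_OVERLAP = 12  # default minOverlap in DADA2's mergePairs


def expected_overlap(trunc_len_f: int, trunc_len_r: int, amplicon_len: int) -> int:
    """Approximate number of bases shared by the truncated forward and reverse reads.

    amplicon_len is the length of the sequenced fragment, measured from the
    first base of the forward read to the first base of the reverse read.
    Trimming from the 5' ends is ignored here because it does not change
    how far the 3' ends reach into each other.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(trunc_len_f: int, trunc_len_r: int, amplicon_len: int,
              min_overlap: int = MIN_OVERLAP) -> bool:
    """True if the truncated reads should still overlap by at least min_overlap."""
    return expected_overlap(trunc_len_f, trunc_len_r, amplicon_len) >= min_overlap


if __name__ == "__main__":
    # Hypothetical 450-nt amplicon with two candidate truncation settings.
    for f, r in [(240, 220), (250, 220)]:
        print(f, r, expected_overlap(f, r, 450), can_merge(f, r, 450))
```

Because the truncation lengths are what determine the overlap, reads of 251 nt and 276 nt truncated to the same positions give the same result here. Real amplicons vary in length, so it is safer to leave some margin above the minimum.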

Mehrbod_Estaki · March 2, 2020, 11:22pm

Hi @steffi,
Is the end goal to combine these 2 datasets together? Are these 2 datasets from the same run? I’m guessing not if they have different lengths. If they are from different runs, you should denoise them separately with DADA2 and then merge. If you do want to merge them, as @timanix mentioned, they need to cover the exact same region, which means not only that the same primers must be used but also that the same trimming parameters are applied to both.
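
A minimal sketch of that workflow, assuming paired-end data and QIIME 2's `dada2 denoise-paired`, `feature-table merge` and `feature-table merge-seqs` commands. The Python below only builds the command lines so the shared parameters are visible in one place; it does not run anything. The run names, file names and truncation values are hypothetical placeholders.

```python
import shlex

# Hypothetical placeholders: pick real values from your own quality plots.
# The same dictionary is used for every run, so the trimming stays identical.
TRIM_PARAMS = {"trim_left_f": 0, "trim_left_r": 0, "trunc_len_f": 240, "trunc_len_r": 200}


def denoise_cmd(run: str, params: dict) -> list:
    """Build the denoise command for one sequencing run (file names are assumed)."""
    return [
        "qiime", "dada2", "denoise-paired",
        "--i-demultiplexed-seqs", f"{run}-demux.qza",
        "--p-trim-left-f", str(params["trim_left_f"]),
        "--p-trim-left-r", str(params["trim_left_r"]),
        "--p-trunc-len-f", str(params["trunc_len_f"]),
        "--p-trunc-len-r", str(params["trunc_len_r"]),
        "--o-table", f"{run}-table.qza",
        "--o-representative-sequences", f"{run}-rep-seqs.qza",
        "--o-denoising-stats", f"{run}-stats.qza",
    ]


def merge_cmds(runs: list) -> list:
    """Build the commands that merge the per-run tables and representative sequences."""
    table = ["qiime", "feature-table", "merge"]
    seqs = ["qiime", "feature-table", "merge-seqs"]
    for run in runs:
        table += ["--i-tables", f"{run}-table.qza"]
        seqs += ["--i-data", f"{run}-rep-seqs.qza"]
    table += ["--o-merged-table", "merged-table.qza"]
    seqs += ["--o-merged-data", "merged-rep-seqs.qza"]
    return [table, seqs]


def build_workflow(runs: list, params: dict) -> list:
    """One denoise command per run, followed by the two merge commands."""
    return [denoise_cmd(run, params) for run in runs] + merge_cmds(runs)


if __name__ == "__main__":
    for cmd in build_workflow(["run1", "run2"], TRIM_PARAMS):
        print(shlex.join(cmd))
```

Each run gets its own denoising step and therefore its own error model, while the shared parameter dictionary keeps the resulting sequences comparable across runs before the tables are merged.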

timanix · March 3, 2020, 5:29am

Hi @Mehrbod_Estaki, could you please explain why it is better to denoise samples from different runs separately? Does it affect the process and, if so, how severely?

Mehrbod_Estaki · March 3, 2020, 4:02pm

Hi @timanix,
This is the recommendation from the DADA2 developer. The reason behind it is that the error profile created during the training step is experiment/run specific, meaning the error profile you create for one run will not be ideal to use with another. This is true even if the samples are from the same run but went through different PCRs, as PCR is another major step that introduces bias in the errors.

timanix · March 3, 2020, 4:24pm

Thank you for the clarification!
