Should I separate samples in the same run based on quality before denoising?

Hello everyone,
I have two sets of data from two sequencing runs, and I demultiplexed the samples from each run separately (run A and run B). After viewing the demux summaries for both runs, the quality for run A looks pretty good but that of run B looks bad. I then looked at each FASTQ file in run B using FastQC and realised that not all the samples had bad quality; in fact, many had good quality. This can be seen in the pictures attached. My questions are:

  1. Should I separate out the files with bad quality in run B and run separate denoising steps for the good-quality and bad-quality files in this run?
  2. If I run a single denoising step for a run that has a mix of good- and bad-quality sequences, how do I ensure that the chosen trimming parameters do not introduce bias?

Thank you, and I look forward to your answers.

Hi @madbullahi,

I would not recommend this; the denoising algorithm will separate out good and bad reads/samples for you. If you choose DADA2, the algorithm relies on the mix of good and bad reads to learn the error rates, which it then applies when denoising.

If you choose Deblur, the reads are pre-filtered using a set of known parameters, so the good samples won't affect the bad ones. You could separate the data, but it won't make a difference.

Look at your data, pick trimming parameters that work best for the mix of reads in your worse sequencing run, and then work from there. Those parameters should be applied to both runs and reported clearly.
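As a rough illustration of "let the worse run set the parameters", here is a minimal Python sketch. It assumes you have already pulled per-position median quality scores for each run out of your quality summaries; the numbers, the function names, and the Q25 cutoff are all made up for the example and are not a QIIME 2 or DADA2 rule.

```python
from typing import Dict, List


def last_good_position(median_quality: List[float], min_q: float = 25.0) -> int:
    """Count the leading positions whose median quality stays at or above min_q."""
    for i, q in enumerate(median_quality):
        if q < min_q:
            return i  # first position that falls below the cutoff
    return len(median_quality)


def shared_truncation_length(runs: Dict[str, List[float]], min_q: float = 25.0) -> int:
    """Pick one truncation length for all runs: the shortest one any run supports."""
    return min(last_good_position(q, min_q) for q in runs.values())


# Hypothetical per-position median quality scores (illustrative only).
runs = {
    "run_A": [36, 36, 35, 35, 34, 33, 32, 30, 28, 26],
    "run_B": [35, 34, 33, 31, 29, 27, 24, 22, 20, 18],
}

trunc_len = shared_truncation_length(runs)
print(trunc_len)  # 6: run B drops below Q25 at the 7th position
```

The value this returns is only a starting point for the truncation length you would then use for both runs. With paired-end reads you would also want to check that the forward and reverse reads still overlap enough to merge after truncation.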



Thank you, Justine. This explanation makes sense. Also, I realized that run B, the one with poor base quality, still has adapters that have not been removed.
