How does DADA2 handle two runs with the same sample?

Dear all,

I recently got some sequence data from a MiSeq PE300 run.
The quality is rather poor :sob:, so I had to set both --p-max-ee-f and --p-max-ee-r to 10 to get more sequences through the q2-dada2 filtering step. Of course, I followed this with a BLAST step to further remove non-target amplicons.

Still, some samples did not reach a satisfactory sequencing depth, so I may have to resequence some of them.
I will use the same DNA, run the PCR again with the same primers, prepare the library with the same protocol, and resequence the amplicons in the same MiSeq PE300 mode.

So how should I handle the second run's data with q2-dada2? I can imagine several strategies :face_with_monocle:

  1. Denoise the two runs separately, then merge them.

  2. Discard the unsatisfactory samples from the first run, then follow strategy 1.

  3. Combine each sample's data from both runs before analysis and run the q2-dada2 denoising step only once.

I understand that DADA2 requires the data from each run to be denoised separately, with the feature tables and representative sequences merged afterwards, because batch effects between runs may affect the learned error-rate model.

But what about this special case? :worried: I hope somebody can help me with this problem.
Thank you!!!

Sixvable

Hi @sixvable,

Yikes! That is a really permissive setting and may lead to errors creeping into your data. If the data are that bad you may want to consider resequencing the entire dataset… it is worth discussing low-quality runs with your sequencing core/service provider.
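
For context, the max-EE filter works on a read's expected number of errors, computed from its quality scores as EE = sum(10^(-Q/10)). As far as I know, q2-dada2 defaults to 2 for both `--p-max-ee-f` and `--p-max-ee-r`. Here is a minimal Python sketch of that calculation. The two quality profiles are made up for illustration and are not taken from your data:

```python
def expected_errors(quality_scores):
    """Expected number of errors in a read: the sum of per-base error
    probabilities, where a Phred score Q means p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in quality_scores)


# Hypothetical 300 nt reads (illustration only).
good_read = [35] * 250 + [25] * 50   # high quality with a mild tail drop
poor_read = [30] * 150 + [15] * 150  # quality collapses halfway through

ee_good = expected_errors(good_read)  # about 0.24
ee_poor = expected_errors(poor_read)  # about 4.9

for name, ee in [("good", ee_good), ("poor", ee_poor)]:
    print(f"{name}: EE={ee:.2f}  passes max-EE 2: {ee <= 2}  passes max-EE 10: {ee <= 10}")
```

In this toy case the second read would be discarded at a threshold of 2 but kept at 10, even though it is expected to carry several wrong bases.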

Yes, your merging options 1 or 2 are the way to proceed with merging separate runs.
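
To illustrate what merging does when the same sample appears in both runs, here is a rough Python sketch that sums the counts per sample and per ASV. The sample IDs, ASV names, and counts are invented. In QIIME 2 itself, I believe the equivalent is `qiime feature-table merge` with `--p-overlap-method sum` (the default refuses overlapping sample IDs), plus `qiime feature-table merge-seqs` for the representative sequences; please check `--help` for your version. This assumes both runs were denoised with the same trimming and truncation settings, so that identical ASVs get identical IDs.

```python
from collections import Counter


def merge_tables(*tables):
    """Merge per-run feature tables (sample -> {feature: count}),
    summing counts when a sample occurs in more than one run."""
    merged = {}
    for table in tables:
        for sample, counts in table.items():
            # Counter.update adds counts rather than overwriting them.
            merged.setdefault(sample, Counter()).update(counts)
    return {sample: dict(counts) for sample, counts in merged.items()}


# Hypothetical denoised tables: S2 was too shallow in run 1 and resequenced.
run1 = {"S1": {"ASV_a": 120, "ASV_b": 30}, "S2": {"ASV_a": 5}}
run2 = {"S2": {"ASV_a": 400, "ASV_c": 60}}

merged = merge_tables(run1, run2)
print(merged["S2"])
```

For option 2 you would simply drop the shallow samples from the first table before merging, so no sample IDs overlap.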

If you do resequence only select samples, beware that you might run into issues with batch effects. Keep track of which samples were sequenced in each run (add this to your sample metadata!) and make sure this is totally random, i.e., that run does not covary with any sample metadata variables. Otherwise what looks like an effect from, say, “Treatment” could really be an effect of sequencing run!
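
One quick way to check this is to compare, for each level of a metadata variable, the fraction of samples that came from the resequencing run. Below is a small Python sketch. The column names (`Treatment`, `run`), the run labels, and the rows are hypothetical placeholders for your own metadata:

```python
from collections import defaultdict


def resequenced_fraction(metadata, variable, run_column="run", first_run="run1"):
    """For each level of `variable`, return the fraction of samples whose
    run label differs from `first_run` (i.e. that were resequenced)."""
    totals = defaultdict(int)
    resequenced = defaultdict(int)
    for row in metadata:
        group = row[variable]
        totals[group] += 1
        if row[run_column] != first_run:
            resequenced[group] += 1
    return {group: resequenced[group] / totals[group] for group in totals}


# Hypothetical sample metadata (illustration only).
metadata = [
    {"sample-id": "S1", "Treatment": "control", "run": "run1"},
    {"sample-id": "S2", "Treatment": "control", "run": "run2"},
    {"sample-id": "S3", "Treatment": "drug", "run": "run2"},
    {"sample-id": "S4", "Treatment": "drug", "run": "run2"},
]

fractions = resequenced_fraction(metadata, "Treatment")
print(fractions)
```

If the fractions differ a lot between groups, as in this toy example, run and treatment are confounded and any apparent treatment effect should be interpreted with caution.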

Good luck!

Thanks, @Nicholas_Bokulich!

There are 144 samples in total and only 20 of them are unsatisfactory, so resequencing all of them would be a bit wasteful :scream:
PE300 is always a bad choice, but I have to use it because of the long amplicon length. I suspect my service provider pooled too many samples on one chip to save cost :sweat: I feel helpless!

I will try it and see what happens!
Thanks for your valuable advice!
