Samples processed in different sequencing runs with vastly different sampling depths

We processed 33 samples in two different sequencing runs. Fourteen of the samples have about 2,500 to 4,000 reads each, while the other 19 range from 40,000 to 130,000 reads.

I want to be able to analyze this data in QIIME 2, but I am worried about how to choose an appropriate depth for DADA2.

My questions:
1. Should I even process the 33 samples together?
2. If yes, is a sampling depth of 2,500 enough for all 33 samples?
3. Is there an alternative way to normalize the difference between the two runs?


  1. You can, if the experimental design requires them to be processed together (but combine them only after DADA2!).
  2. That’s a little low, but you can still use this depth. Check the alpha rarefaction curves to decide whether to proceed with it or to sacrifice some samples in order to increase the depth.
  3. You can check this additional plugin, which you would need to install.
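
To make the trade-off in point 2 concrete, here is a minimal Python sketch of what rarefying to a fixed depth does. It is not QIIME 2's implementation. The `rarefy` helper, the sample names, and the counts are all hypothetical. The idea is that any sample with fewer reads than the chosen depth is dropped, and every remaining sample is randomly subsampled down to exactly that depth.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample one sample's feature counts to `depth` reads without replacement.

    Returns None when the sample has fewer than `depth` reads,
    which means the sample would be dropped at this depth.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand the counts into one entry per read, then draw `depth` of them.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(reads, depth)
    rarefied = {}
    for feature in picked:
        rarefied[feature] = rarefied.get(feature, 0) + 1
    return rarefied


# Hypothetical feature table (sample -> {feature: read count}), for illustration only.
table = {
    "low_depth_sample": {"ASV1": 1500, "ASV2": 900, "ASV3": 100},
    "high_depth_sample": {"ASV1": 30000, "ASV2": 9000, "ASV3": 900, "ASV4": 100},
}

for depth in (2500, 10000):
    kept = {}
    for sample, counts in table.items():
        result = rarefy(counts, depth)
        if result is not None:
            kept[sample] = result
    print(f"depth={depth}: kept {sorted(kept)}")
```

A higher depth keeps more reads per sample but loses the shallow samples entirely, which is why the rarefaction curves are worth checking before you commit to a value.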

But there are also some considerations. Did you process all samples together with DADA2? It is better to run DADA2 separately on each sequencing run, with the same parameters, and then merge the resulting feature tables and representative sequences.
How many reads do you have in the demultiplexed raw data files? Are the numbers you provided from before or after DADA2? If after DADA2, at which step (check the stats) are you losing most of the reads?
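
If it helps with the last question, below is a rough Python sketch of how you could read one row of the DADA2 denoising stats to see where reads are lost. The numbers are made up for illustration. The column names are assumed to match the paired-end stats output (`input`, `filtered`, `denoised`, `merged`, `non-chimeric`), so adjust them to what your stats file actually contains. The `step_losses` helper is hypothetical.

```python
# Hypothetical stats row for one sample; values are invented for illustration.
# Column names are an assumption based on paired-end DADA2 stats in QIIME 2.
stats = {
    "input": 5200,
    "filtered": 4100,
    "denoised": 3900,
    "merged": 2700,
    "non-chimeric": 2600,
}


def step_losses(row):
    """Return (step, reads lost at that step, loss as % of input) for each step."""
    steps = list(row.items())
    total_input = row["input"]
    losses = []
    for (_, before), (name, after) in zip(steps, steps[1:]):
        lost = before - after
        pct = 100.0 * lost / total_input if total_input else 0.0
        losses.append((name, lost, pct))
    return losses


losses = step_losses(stats)
for name, lost, pct in losses:
    print(f"{name}: lost {lost} reads ({pct:.1f}% of input)")

# The step with the largest loss is the one to look at first.
worst = max(losses, key=lambda item: item[1])
print("largest loss at step:", worst[0])
```

In this invented example the largest drop is at the merging step, which would usually suggest checking whether the truncation lengths leave enough overlap between forward and reverse reads. A large drop at filtering would instead point at the quality and truncation settings.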

