Anyone use NextSeq or NovaSeq for 16S sequencing?

Hi all,

I know that the NextSeq is only for short amplicons (~150 bp), but I am still curious whether anyone has used it for 16S sequencing. A NextSeq gives back around 120 million reads, compared to a MiSeq's much lower output, for a similar price.

However, I know the NovaSeq can handle larger amplicons, and it gives back more than a billion reads. But I have not seen anything about people using it for 16S sequencing. Does anyone know why? It seems way more cost-effective.



Hello Sam,

Great questions!

I don’t run a sequencing core, but I’ve worked closely with folks who own both the Illumina MiSeq and NextSeq 550.

The Illumina MiSeq and NextSeq 550 are really different machines.

I think the MiSeq is commonly used for 16S analysis because of its longer reads (300 bp, paired-end), which hit the sweet spot for an amplicon study. With ~300 barcodes and 15 million reads per run, you would hopefully get ~50k reads per sample, which is a reasonable number for an amplicon study.

While the NextSeq 550 offers many more reads, the EMP primers still only offer ~300 barcodes. (Other barcoding schemes let you process more!) Plus, it might be preferable to sequence a longer region for better taxonomic resolution than to simply increase read depth. 300 bp > 6 million reads per sample :man_shrugging:
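The ~50k reads-per-sample figure above is just run output divided by barcode count. A quick back-of-the-envelope sketch (the per-run read counts are the rough numbers mentioned in this thread, not spec-sheet values):

```python
# Rough reads-per-sample math for the platforms discussed in this thread.
# Per-run read counts are approximate figures from the posts above;
# treat them as assumptions, not instrument specs.
PLATFORM_READS = {
    "MiSeq": 15_000_000,
    "NextSeq 550": 120_000_000,
    "NovaSeq": 1_300_000_000,
}

EMP_BARCODES = 300  # ~300 barcodes available with the EMP primer scheme

for platform, reads in PLATFORM_READS.items():
    per_sample = reads // EMP_BARCODES
    print(f"{platform}: ~{per_sample:,} reads per sample at {EMP_BARCODES} samples/run")
```

With only ~300 barcodes, the extra NextSeq/NovaSeq output goes into depth per sample rather than more samples, which is the trade-off described above.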

Oh, and buying one MiSeq is less expensive than buying one NextSeq 550 :money_with_wings:

But the NextSeq 550 is super reliable, and the MiSeq is ~11 years old.

So basically

  1. cheap machine, long reads, low depth, delicate
  2. expensive machine, shorter reads, greater depth, reliable

Are you planning to buy one of these machines, or learn more about the platforms before buying a run from a company or sequencing core?


P.S. If someone on here does run a sequencing core, please correct my mistakes! I would love to learn more about the NovaSeq, as I’ve not worked with one before!

Hi Colin,

Apologies for not answering this: my notifications were going to a different email. Thank you for the detailed response. I agree with the distinctions you drew between the MiSeq and NextSeq 550.

I thought it would be timely to leave a reply now, however, as I am seeing that cores are offering NovaSeq 2x250 paired-end sequencing at more reasonable prices. At 1.3-1.6B reads per run, I think this could be huge for microbiome science. I am just not sure how well this has been validated for 16S sequencing.

If anyone on here has more info on that, it would be much appreciated!



Our group has been using the NovaSeq. So far it offers tremendous sequencing depth while allowing very high multiplexing. We are still sorting out the bioinformatics end of things, though. The volume of data obtained complicates and slows down analyses quite a bit. Also, it seems that we obtain a very large number of OTUs/ASVs compared to earlier techniques. I don’t have a lot more to add at the moment, but would enjoy hearing from others who may have bioinformatic tips and tricks.


This is pretty interesting. Do you have any samples that have been run on both a MiSeq and a NovaSeq for direct comparison? You could then compare the taxonomic composition of OTUs from each sequencer to see whether the NovaSeq is spitting out OTUs that make taxonomic sense rather than junk.
Also, were you using the 16S 515F-806R region?
Would love to hear more as you dig through the data.
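One simple way to run the platform comparison suggested above is to check how much the taxon sets called from the same sample overlap between runs. A minimal sketch, with made-up placeholder taxa rather than real results:

```python
# Cross-platform sanity check: given the taxa called from the same sample
# on each sequencer, compute their Jaccard overlap. Taxon names below are
# invented placeholders, not real data.
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two taxon sets (1.0 = identical)."""
    return len(a & b) / len(a | b)

miseq_taxa = {"Bacteroides", "Prevotella", "Faecalibacterium"}
novaseq_taxa = {"Bacteroides", "Prevotella", "Faecalibacterium", "Akkermansia"}

overlap = jaccard(miseq_taxa, novaseq_taxa)
# Taxa seen only on the NovaSeq are the ones to inspect: extra depth, or junk?
novaseq_only = novaseq_taxa - miseq_taxa
print(f"Jaccard overlap: {overlap:.2f}; NovaSeq-only taxa: {sorted(novaseq_only)}")
```

A low overlap, or NovaSeq-only taxa that make no ecological sense for the sample, would point toward spurious OTUs rather than genuinely deeper sampling.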

Hi @Sam_Degregori

I have been using the NovaSeq for amplicon sequencing now instead of the MiSeq.
Even though I did not sequence the same samples on both platforms, I can tell the NovaSeq is way better than the MiSeq.
Base quality is much better and sequencing depth is doubled. There is no reason to use the MiSeq if you have the chance to use the NovaSeq.

I too was concerned about junk sequences, but we have not done a good comparison with the MiSeq (or another platform) yet. Note that Singer et al. 2019 did such a comparison and suggest that the NovaSeq is just plain better:

We have used 515F-806R for 16S and ITS1F-ITS2 for ITS1.

In some of our preliminary work using vsearch to generate OTU tables, we have produced truly monstrous tables (TB-scale) that are impractical. We are considering HDF5 and other tools to try to reduce the memory footprint. If anyone has suggestions I would love to hear them. Similarly, I am curious about compute time for DADA2 operations on data this large. I haven’t had a chance to do any benchmarking myself yet.
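For what it's worth, most of a huge OTU table is zeros, so a sparse representation can shrink it dramatically. A toy pure-Python sketch of the idea (the counts are invented; a real pipeline would use an HDF5-backed format like BIOM rather than this):

```python
# Why sparse storage helps for OTU tables: store only the nonzero
# (otu, sample, count) triples instead of every cell of the dense matrix.
# Counts below are made up purely for illustration.
dense = [
    [0, 0, 5, 0],   # OTU_1 counts across 4 samples
    [0, 0, 0, 0],   # OTU_2: absent everywhere
    [12, 0, 0, 3],  # OTU_3
]

# Coordinate-list (COO) form: one (row, col, count) triple per nonzero entry
sparse = [
    (i, j, count)
    for i, row in enumerate(dense)
    for j, count in enumerate(row)
    if count != 0
]

dense_cells = len(dense) * len(dense[0])  # 12 stored values in the dense form
sparse_cells = len(sparse)                # only 3 nonzero triples to keep
print(f"dense cells: {dense_cells}, nonzero triples: {sparse_cells}")
```

On a real TB-scale table the nonzero fraction is tiny, so the same trick (which is what HDF5-backed sparse formats implement properly) cuts both disk and memory footprint by orders of magnitude.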