Amplicon sequencing on MiSeq machines delivers 300 bp paired-end reads, which are certainly 'better' than the 150 bp paired-end reads from other machines. But 'how much' better are 300 bp reads? Is anybody aware of a comparative analysis of MiSeq (300 bp) and HiSeq or other (150 bp) data? From a phylogenetic perspective, longer reads (at the far end: PacBio Sequel reads) are much better for reconstructing the tree, but is this equally important from a functional perspective (e.g. the function of bacteria in a soil community, or the presence of specific enzymes)? I would expect that many functions/enzymes are present across a broader phylogenetic group, so a high-resolution phylogenetic tree is not that important for investigating the function of microbial communities.
And it will depend on the V regions used for the analysis...
Such discussions came up when we were talking about the new Illumina iSeq-100, which can only deliver 150 bp reads.
I would very much appreciate comments from the community.
Best regards.
This is a fascinating question because it speaks to the diminishing returns at the center of many technical advancements.
First, what region are you sequencing? If you are sequencing 16S V4 (250 bp long), 300 bp paired-end reads will give you full coverage in both directions... but so will 250 bp reads. 150 bp reads will give you 50 bp of overlap, which is plenty for joining, but you don't get full coverage from both reads, so the two reads can only correct each other within that overlap.
If you are amplifying the 18S region using the primers recommended by the EMP (Euk_1391f), the amplicons are only ~200 bp long. Now the 150 bp reads will give you 100 bp of overlap, more than half the full read!
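To make the arithmetic above explicit, here is a rough Python sketch. It assumes the approximate amplicon lengths quoted in this post (250 bp for 16S V4, ~200 bp for the 18S amplicon) and ignores primers, natural length variation, and quality trimming, all of which shrink the usable overlap in practice. The function name `paired_end_overlap` is just made up for illustration.

```python
def paired_end_overlap(amplicon_len: int, read_len: int) -> int:
    """Number of amplicon bases covered by both the forward and reverse read.

    A negative result means the two reads do not meet and leave a gap.
    Simplifying assumption: every read is full length and untrimmed.
    """
    # A read cannot cover more of the amplicon than the amplicon is long.
    covered_by_each = min(read_len, amplicon_len)
    return 2 * covered_by_each - amplicon_len


# Approximate amplicon lengths taken from the discussion above.
regions = {"16S V4": 250, "18S (Euk_1391f)": 200}

for name, amplicon_len in regions.items():
    for read_len in (150, 250, 300):
        overlap = paired_end_overlap(amplicon_len, read_len)
        print(f"{name}: 2 x {read_len} bp reads -> {overlap} bp overlap "
              f"({overlap / amplicon_len:.0%} of the amplicon)")
```

With these numbers, 2 x 150 bp reads overlap by 50 bp on V4 and by 100 bp on the 18S amplicon, while 250 bp and 300 bp reads both cover either amplicon completely in both directions.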
when we were talking about the new Illumina iSeq-100
Starting to do your own sequencing is extraordinarily time consuming, and there are tons of indirect costs. Don't do this!