How to merge paired-end reads when the overlap is short

Dear experts,
I have paired-end 16S rRNA sequencing data from the V4 region, sequenced at 2 x 150 read length.

I used DADA2 to merge them, but very few sequences were left for downstream analysis. While troubleshooting, I found that with my 815F&806R primers at 2x150 read length, the overlap region is only 4 to 5 nucleotides. Could this be the reason so few sequences are left after merging?

How should we merge R01 and R02 in my situation? I read some discussions online saying that the forward and reverse fastq files contain reads in matched order. Does anyone know what "reads in matched order" means? If the reads are matched during sequencing, can I just merge my R01 and R02 with mergePairs(..., justConcatenate=TRUE) in R? And can we do the same in Qiime?


Hi @Dawud922,
There was a very recent discussion regarding the justConcatenate option you mentioned, which I advise against. Assuming you meant the 515f/806r primer set, you are right that the overlap with a 2x150 run is not sufficient for proper merging, which would explain why dada2 is failing to merge these.
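To make the overlap arithmetic concrete, here is a minimal sketch. It assumes both reads span the amplicon from opposite ends, so the expected overlap is the combined read length minus the amplicon length. The function name and the 296 nt amplicon length are illustrative assumptions only (use the length that matches your own primers, and account for whether the primers are still in the reads); the 12 nt threshold is the default minOverlap of DADA2's mergePairs.

```python
def expected_overlap(read_len_fwd, read_len_rev, amplicon_len):
    """Expected overlap in nucleotides between forward and reverse reads.

    A negative value means the reads do not meet and leave a gap.
    """
    return read_len_fwd + read_len_rev - amplicon_len


# Default value of minOverlap in dada2::mergePairs
DADA2_DEFAULT_MIN_OVERLAP = 12

# Hypothetical amplicon length, chosen only to illustrate a 2x150 run
# that ends up with the 4 to 5 nt overlap described in the question.
overlap = expected_overlap(150, 150, 296)
print(overlap)                               # 4
print(overlap >= DADA2_DEFAULT_MIN_OVERLAP)  # False
```

Any quality truncation applied before merging shortens the reads and reduces the overlap further, so the real figure is at best what this calculation gives.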

You shouldn’t… I would just discard your reverse reads and use the forward reads only moving forward.

Thank you Mehrbod for your suggestion. I’ll go forward with just my R01 reads.
Just out of curiosity, are the R01 and R02 reads matched in order in the fastq files? It’s still not clear in my mind how the sequencing machine reads out the data and writes the fastq files while sequencing.

Hi @Dawud922,
Yes, the reads in your forward and reverse files (and the barcode file, if you have one) are matched, as in they are in the same order, meaning the first record (the first four lines) in each of those files corresponds to the same sample/read, the second record to the next one, and so on.
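If you want to confirm this on your own files, a minimal sketch of such a check follows. It assumes standard unwrapped FASTQ (exactly four lines per record) and compares only the read ID, ignoring the description after the first space and an old-style /1 or /2 suffix. The function names are hypothetical, and the two tiny in-memory records are made-up examples, not real data.

```python
import io
from itertools import zip_longest


def read_ids(handle):
    """Yield the read ID from each 4-line FASTQ record in an open text handle."""
    for i, line in enumerate(handle):
        # The header is the first line of every 4-line record.
        if i % 4 == 0 and line.strip():
            # Drop the leading '@' and anything after the first whitespace.
            name = line[1:].split()[0]
            # Drop an old-style mate suffix if present.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def reads_are_paired(fwd_handle, rev_handle):
    """Return True if both files list the same read IDs in the same order."""
    pairs = zip_longest(read_ids(fwd_handle), read_ids(rev_handle))
    # zip_longest pads the shorter file with None, so unequal lengths fail too.
    return all(fwd_id == rev_id for fwd_id, rev_id in pairs)


# Made-up two-record example standing in for R01 and R02.
r1 = io.StringIO("@read1 1:N:0:1\nACGT\n+\nIIII\n@read2 1:N:0:1\nTTGA\n+\nIIII\n")
r2 = io.StringIO("@read1 2:N:0:1\nCCGT\n+\nIIII\n@read2 2:N:0:1\nGGCA\n+\nIIII\n")
print(reads_are_paired(r1, r2))  # True
```

For real gzipped files, the handles can be opened with gzip.open(path, "rt") from the Python standard library and passed to the same function.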
Hope that clarifies this for you.