Joining 2x150 bp V4 paired-end reads from iSeq 100

johanak · May 13, 2019, 1:08pm

Hi!

I ran into a problem merging 2x150 bp V4-region paired-end reads produced by an Illumina iSeq 100. The primers used are 515F–806R. Because the sequenced reads include the primer sequences, the overlap between the reads is too small, so a lot of reads are thrown away during merging. I have tried merging with DADA2 as integrated into QIIME 2 (dada2 denoise-paired, trimming only the primer sequences and truncating nothing from the 3' ends) and with vsearch (vsearch join-pairs with the parameter "--p-minovlen 5"). For example, with vsearch only 27033 of 503298 reads (about 5.4%) remained after joining.
The solution I have found is to analyse the data as single-end data, but in that case I lose half of the information I have about the sequences.
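The small overlap described above can be estimated with simple arithmetic: the two reads together must cover the whole amplicon, and whatever is left over is the overlap. The sketch below is only an illustration. The V4 insert length (about 253 nt) and the primer lengths (19 and 20 nt) are assumed typical values, not numbers taken from this post, and real amplicon lengths vary between taxa.

```python
# Approximate lengths; these are assumptions, not values from the post.
V4_INSERT_LEN = 253    # assumed typical V4 length between the primers
PRIMER_515F_LEN = 19   # assumed length of the 515F primer
PRIMER_806R_LEN = 20   # assumed length of the 806R primer


def expected_overlap(fwd_len, rev_len, amplicon_len):
    """Return the expected overlap in nt; a negative value means a gap."""
    return fwd_len + rev_len - amplicon_len


# Reads that still contain the primers have to span the whole amplicon.
amplicon_len = V4_INSERT_LEN + PRIMER_515F_LEN + PRIMER_806R_LEN
overlap = expected_overlap(150, 150, amplicon_len)
print(f"amplicon: {amplicon_len} nt, expected overlap: {overlap} nt")
```

Under these assumed lengths the expected overlap is only a handful of nucleotides, and any truncation at the 3' ends would shrink it further.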

Is there any way to still analyse the data as paired-end and use the information from both reads?

I am using QIIME 2 version 2019.1.0 command line interface installed with conda.

Thank you for your help!
Johana

Mehrbod_Estaki · May 13, 2019, 8:41pm

Hi @johanak,
The QIIME 2 version of DADA2 requires a minimum overlap of 20 nt to merge your reads properly. The native R version of DADA2 has a setting that allows non-overlapping reads, justConcatenate=TRUE; however, this is not recommended. There are other tools out there that allow for non-overlapping sequences, but my personal opinion (and others may disagree) is that this is not a better option than simply using the forward reads. At the very least, I haven't come across any convincing benchmarking of this, though I would love to see some if anyone has references in mind.
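To make the concatenation option concrete: as far as I understand the DADA2 documentation, justConcatenate=TRUE joins the forward read to the reverse-complemented reverse read with a spacer of 10 Ns instead of merging them on their overlap. The Python sketch below only mimics that idea on toy sequences. The function names are hypothetical and this is not DADA2 code.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def concatenate_pair(fwd, rev, spacer_len=10):
    """Join a read pair without merging: forward read, N spacer, then the
    reverse-complemented reverse read (hypothetical helper)."""
    return fwd + "N" * spacer_len + reverse_complement(rev)


# Toy reads, far shorter than real 150 nt reads.
joined = concatenate_pair("ACGTAC", "AAGGC")
print(joined)
```

The resulting sequence contains an artificial run of Ns in the middle, which is one reason downstream tools may treat concatenated reads differently from truly merged ones.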
You shouldn't think of using only your forward reads as losing 50% of your data: you still have the same number of reads, they are just a bit shorter. And with 16S data, the loss of information between, for example, 100 nt and 200 nt reads is not linear. See Fig. 1 of the original RDP classifier paper.
There are many studies out there using single reads instead of paired reads, so you shouldn't have any problem with reviewers either.
tl;dr: use your forward reads only and don't worry.
