Hi
I am working with multiple lanes of paired-end reads sequenced on an Illumina Genome Analyzer II system. My forward reads appear to be of much lower quality than my reverse reads (see below). I am trying to denoise my reads with DADA2, but when I denoise using both the forward and reverse reads, I lose a lot of reads at the "merge" step.
Visualization: paired-end-demux.qzv (285.0 KB)
Paired end Dada2 stats: denoise_1.tsv (1.1 KB)
To me, this suggests that the forward and reverse reads are not overlapping. I thought that perhaps I was truncating too much, but the same thing happens with both of the following commands:
Truncation based on demultiplexed visualization:
qiime dada2 denoise-paired --i-demultiplexed-seqs paired-end-demux.qza --o-representative-sequences rep-seqs-dada2.qza --p-trunc-len-f 45 --p-trunc-len-r 60 --o-table table-dada2.qza --o-denoising-stats stats-dada2.qza --p-n-threads 40
dada2 stats: denoising-stats.qzv (1.2 MB)
No truncation:
qiime dada2 denoise-paired --i-demultiplexed-seqs paired-end-demux.qza --o-representative-sequences rep-seqs-dada2.qza --p-trunc-len-f 0 --p-trunc-len-r 0 --o-table table-dada2.qza --o-denoising-stats stats-dada2.qza --p-n-threads 40
dada2 stats: denoising-stats.qzv (1.2 MB)
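As a rough sanity check on the overlap hypothesis, the expected overlap after truncation can be estimated as forward truncation length plus reverse truncation length minus amplicon length. The sketch below is only illustrative: the amplicon length of 180 nt is a hypothetical placeholder and is not taken from this dataset, and the 12 nt threshold is the default minOverlap of DADA2's mergePairs function, which the QIIME 2 plugin is assumed to use here.

```shell
#!/usr/bin/env bash
# Expected overlap (nt) between truncated forward and reverse reads:
#   overlap = trunc_len_f + trunc_len_r - amplicon_len
# A negative result means the two reads cannot reach each other.
expected_overlap() {
  local trunc_f=$1 trunc_r=$2 amplicon_len=$3
  echo $(( trunc_f + trunc_r - amplicon_len ))
}

# HYPOTHETICAL amplicon length, for illustration only; substitute the
# real length of the sequenced fragment for the targeted region.
amplicon_len=180
# Default minOverlap of DADA2's mergePairs (assumed to apply here).
min_overlap=12

# Truncation lengths from the first denoise-paired command above.
overlap=$(expected_overlap 45 60 "$amplicon_len")
if (( overlap >= min_overlap )); then
  echo "overlap=${overlap} nt: reads should be able to merge"
else
  echo "overlap=${overlap} nt: too short to merge"
fi
```

Under these assumed numbers the two reads together would be shorter than the amplicon, so no truncation setting could make them merge. With no truncation, the same arithmetic applies using the raw read lengths in place of the truncation lengths.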
I have read about using only the forward reads for an analysis, but is it valid to use only the reverse reads?
I have imported the reverse reads from one lane and denoised them as single-end reads, and this appears to be much more successful.
Code: qiime dada2 denoise-single --i-demultiplexed-seqs single-end-demux.qza --o-representative-sequences rep-seqs-dada2.qza --p-trunc-len 0 --o-table table-dada2.qza --o-denoising-stats stats-dada2.qza --p-n-threads 40
Dada2 stats: denoising-stats_SE3.qzv (1.2 MB)
For clarity, each lane contains 30 samples with amplicons from the 16S V3 region and the ITS2 region, and I am using QIIME 2 (version 2019.1) on a server.
I am also considering PANDAseq to merge my forward and reverse reads, although I would rather stay within the QIIME 2 pipeline. Any thoughts?
Thanks in advance,
Laura