Poor Merging Results from Paired End data

Dear developers and community,

Thank you for your help, and for supporting QIIME 2.

Problem Description

Problem: I am getting few to no merged sequences during the denoising step with the following call to qiime dada2 denoise-paired (see the code snippet below).

Methods: I have 18S sequences. The library length distribution per Bioanalyser has its mode at about 481 bp. I sequenced 2x250 bp with a MiSeq v2 kit. A FastQC check of all fastq files showed nominal sequencing performance. I receive demultiplexed fastq files from the sequencing facility. Library design follows http://www.earthmicrobiome.org/protocols-and-standards/18s/. I am guessing the overlap is too small, and that I would have had better performance with the 2x300 kit.

Results: Viewing the denoising stats with qiime tools view shows that, for a typical sample, most reads cannot be merged (see below). The best result I have ever obtained is 10% of input reads merged, and adjusting the trimming parameters does not help.

  • input: 175777
  • filtered: 128898
  • denoised: 128898
  • merged: 24
  • non-chimeric: 24

Could you help me?

  • How can I improve merging?
    • Can I adjust the overlap setting (to less than 20 bp)?
    • Should I filter less stringently during the earlier import and cutadapt steps?
  • How can I analyse my results if merging continues to be unsuccessful?
    • Perhaps by analysing only the forward reads?
    • Which tool / plugin would I use?

Any comment would be appreciated. Thank you!

Code Snippet

# define input locations
# ---------------------------------
# define output locations
# ---------------------------------
# trimming parameters 18S - aiming for Phred 20 
# ---------------------------------------------
# run script
# ----------
for ((i=1;i<=1;i++)); do
   qiime dada2 denoise-paired \
      --i-demultiplexed-seqs "$trpth"/"${inpth[$i]}" \
      --p-trunc-len-f "${lenf[$i]}" \
      --p-trunc-len-r "${lenr[$i]}" \
      --p-n-threads "$thrds" \
      --o-representative-sequences "$trpth"/"${otpth_seq[$i]}" \
      --o-denoising-stats "$trpth"/"${otpth_stat[$i]}" \
      --o-table "$trpth"/"${otpth_tab[$i]}"
done

QIIME 2 version used

qiime info
System versions
Python version: 3.5.5
QIIME 2 release: 2018.11
QIIME 2 version: 2018.11.0
q2cli version: 2018.11.0

Installed plugins
alignment: 2018.11.0
composition: 2018.11.0
cutadapt: 2018.11.0
dada2: 2018.11.0
deblur: 2018.11.0
demux: 2018.11.0
diversity: 2018.11.0
emperor: 2018.11.0
feature-classifier: 2018.11.0
feature-table: 2018.11.0
fragment-insertion: 2018.11.0
gneiss: 2018.11.0
longitudinal: 2018.11.0
metadata: 2018.11.0
phylogeny: 2018.11.0
quality-control: 2018.11.0
quality-filter: 2018.11.0
sample-classifier: 2018.11.0
taxa: 2018.11.0
types: 2018.11.0
vsearch: 2018.11.0

Application config directory
/Users/paul/Library/Application Support/q2cli

Getting help
To get help with QIIME 2, visit https://qiime2.org

The huge drop-off at the “merged” step of your dada2 stats indicates that your pairs are not being joined. You need at least 20 nt of overlap in order to join with q2-dada2 (you can specify less using DADA2 directly, I think).
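
To check the overlap before rerunning, here is a minimal shell sketch of the arithmetic. The function name expected_overlap is hypothetical, and using the 481 bp Bioanalyser mode as the length the reads must span is an assumption (the sequenced insert may differ from the library length once adapters are accounted for). The 20 nt threshold is the q2-dada2 requirement mentioned above.

```shell
# Hypothetical helper: nucleotides shared by the two reads after truncation.
# $1 = forward truncation length, $2 = reverse truncation length,
# $3 = length of the fragment the reads must span (an assumption, see lead-in)
expected_overlap() {
    echo $(( $1 + $2 - $3 ))
}

# Untruncated 2x250 reads against the 481 bp mode reported in the question.
overlap=$(expected_overlap 250 250 481)
echo "expected overlap: ${overlap} nt"

# Compare against the minimum overlap needed for joining.
min_overlap=20
if [ "$overlap" -lt "$min_overlap" ]; then
    echo "below the ${min_overlap} nt minimum: pairs are unlikely to merge"
fi
```

Under these assumptions even untruncated reads overlap by only 19 nt, and any truncation set with --p-trunc-len-f or --p-trunc-len-r reduces that further.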

Yes, analysing only the forward reads is something we often recommend in these cases.

qiime dada2 denoise-single

You can pass your paired-end demux reads directly into this action; it will use only the forward reads.
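
For illustration, a forward-only run might look like the sketch below. The wrapper name denoise_forward_only, the file names, and the truncation length of 220 are hypothetical placeholders, not values from this thread. With DRY_RUN set, the command is only printed so it can be checked before running.

```shell
# Hypothetical wrapper around qiime dada2 denoise-single.
# $1 = demultiplexed (paired-end) artifact, $2 = forward truncation length,
# $3 = prefix for the output artifacts
denoise_forward_only() {
    # With DRY_RUN set, "echo" is prepended and the command is only printed.
    ${DRY_RUN:+echo} qiime dada2 denoise-single \
        --i-demultiplexed-seqs "$1" \
        --p-trunc-len "$2" \
        --o-table "$3-table.qza" \
        --o-representative-sequences "$3-rep-seqs.qza" \
        --o-denoising-stats "$3-stats.qza"
}

# Placeholder arguments; call without DRY_RUN to actually run the action.
DRY_RUN=1 denoise_forward_only paired-end-demux.qza 220 fwd
```

Because there is no merging step in single-end mode, the truncation length can be chosen on forward-read quality alone.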

Thank you for your response @thermokarst.
