Low percentage of non-chimeric input reads

I denoised my paired-end sequences with DADA2 using the following command:

    qiime dada2 denoise-paired \
      --i-demultiplexed-seqs paired-end-demux.qza \
      --p-trunc-len-f 276 \
      --p-trunc-len-r 224 \
      --o-table table.qza \
      --o-representative-sequences rep-seqs.qza \
      --o-denoising-stats denoising-stats.qza

The percentage of non-chimeric input is below 30%. I chose these truncation values because the quality score started dropping to 30 at 276 bp and 224 bp for the forward and reverse reads, respectively.
Attached are the QZV files for the demux summary and the denoising stats.
Can I proceed with the downstream analysis? How can I increase the non-chimeric percentage?
Can anyone help, please?
demux.qzv (319.9 KB)
denoising-stats.qzv (1.2 MB)

Hi @Namraj_Jaishi,

It appears that you are losing 50% of your data to failed merges, and more still because many of the remaining reads are flagged as chimeras.

What amplicon marker gene and region are you targeting? What length are you expecting after merging?

You'll likely have to adjust the truncation parameters to obtain more merges, and also look into the --p-min-fold-parent-over-abundance option to mitigate false-positive chimera detection. You can also, optionally, pool the data. These are outlined here.
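As a rough sketch of what that could look like on the command line: the function below adds the chimera option and pseudo-pooling to the original denoise-paired call. The value 4 is only an illustrative starting point, not a recommendation from this thread, the output file names are made up, and --p-pooling-method may not be available in older QIIME 2 releases. The helper name and the RUN variable are hypothetical conveniences so the command can be reviewed before it is run.

```shell
# By default the command is only printed (RUN falls back to echo).
# Set RUN=env to execute it inside an active QIIME 2 environment.
denoise_paired_relaxed() {
  # $1 = forward truncation length, $2 = reverse truncation length
  # $3 = value for --p-min-fold-parent-over-abundance (illustrative only)
  ${RUN:-echo} qiime dada2 denoise-paired \
    --i-demultiplexed-seqs paired-end-demux.qza \
    --p-trunc-len-f "$1" \
    --p-trunc-len-r "$2" \
    --p-min-fold-parent-over-abundance "$3" \
    --p-pooling-method pseudo \
    --o-table table-relaxed.qza \
    --o-representative-sequences rep-seqs-relaxed.qza \
    --o-denoising-stats denoising-stats-relaxed.qza
}

denoise_paired_relaxed 276 224 4
```

Comparing the resulting denoising stats against the original run should show whether the extra loss is really coming from the chimera step.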

Hello,
According to the sequencing center, amplification targeted the 16S rRNA V4 hypervariable region and the 18S rRNA region. The 16S primers were 354f (5'-GTGYCAGCMGCCGCGGTAA-3') and 806R (GGACTACNVGGGTWTCTAAT); the 18S primers were F (TTGTACACACCGCCC) and R (CCTTCYGCAGGTTCACCTAC). The actual primers had partial TruSeq adapter sequences fused onto the 5' end. A second round of PCR was performed on the product of the first to add the remaining TruSeq sequences needed for sequencing. Sequencing was MiSeq paired-end 300.
I tried adjusting the truncation lengths but could not improve the non-chimeric percentage.
Can you please help?

Hi @Namraj_Jaishi,

Again, most of your loss is due to failed merges, not chimeras. Did increasing the truncation length help increase the successful merges? Can you share that QZV file?

Sadly, it could just be that the data is not good enough to merge. In cases like this, it is valid to simply use the forward reads for analysis (i.e. ignore the reverse reads).

Here are the QZV files after importing and denoising.
demux.qzv (319.9 KB)
denoising-stats.qzv (1.2 MB)
filtered table.qzv (876.4 KB)

What lengths have you tried already?

Why don't you try:

--p-trunc-len-f 289
--p-trunc-len-r 242

I will try these lengths and keep you posted. Thank you so much.
Previously, I tried F 276 and R 224.

I used 284 and 243 as the forward and reverse truncation lengths. The percentages of input merged and input non-chimeric are lower than in the previous run.
Here are the denoising stats.
denoising-stats284243.qzv (1.2 MB)

I was kind of suspecting that. Keep in mind the length of your amplicon. Also, read through the DADA2 options. I think what might be happening here is that there are too many mismatches in the region of overlap, causing the merges to fail.

Remember, DADA2 by default requires 12 bp of overlap for the merge. So you need to make sure you know the length of your amplicon and make sure you have at least 12 bp matching between the two reads. I think the amplicon should average ~420-430 bp after merging, correct?
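To make the overlap arithmetic concrete, here is a minimal sketch that estimates the overlap left after truncation as forward length + reverse length - amplicon length. The 425 bp amplicon length is an assumption (the midpoint of the ~420-430 bp estimate above), and the function name is made up for illustration; swap in the real amplicon length once it is known.

```shell
# Expected overlap (bp) = forward truncation + reverse truncation - amplicon length
expected_overlap() {
  # $1 = forward truncation, $2 = reverse truncation, $3 = amplicon length
  echo $(( $1 + $2 - $3 ))
}

amplicon_len=425   # ASSUMED: midpoint of the ~420-430 bp estimate
min_overlap=12     # minimum overlap needed for a merge, as stated above

# Truncation pairs mentioned in this thread
for pair in "276 224" "284 243" "250 200"; do
  set -- $pair
  overlap=$(expected_overlap "$1" "$2" "$amplicon_len")
  if [ "$overlap" -ge "$min_overlap" ]; then
    status="meets"
  else
    status="is below"
  fi
  echo "f=$1 r=$2: expected overlap ${overlap} bp ($status the ${min_overlap} bp minimum)"
done
```

Under that assumed length the three pairs give 75, 102 and 25 bp of overlap, so on paper all of them clear the minimum. This is only a length check: it says nothing about mismatches inside the overlap, and the numbers shift if the true amplicon is longer or shorter than assumed.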

Sadly, there is nothing you can do at this point other than play around with the truncation settings and see which works best. Perhaps try:

    --p-trunc-len-f 250
    --p-trunc-len-r 200

If you try a few other length settings and they do not work, I'd either simply use the forward reads only (i.e. do not merge), or try switching to the Deblur pipeline instead, to see if you obtain different results.
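For the forward-only route, a sketch along these lines should work, assuming the same paired-end-demux.qza artifact is passed to denoise-single so that only the forward reads are denoised. The truncation length of 255 is just an example value, and the helper name, the RUN variable, and the output file names are hypothetical.

```shell
# By default the command is only printed (RUN falls back to echo).
# Set RUN=env to execute it inside an active QIIME 2 environment.
denoise_forward_only() {
  # $1 = truncation length for the forward reads (example value below)
  ${RUN:-echo} qiime dada2 denoise-single \
    --i-demultiplexed-seqs paired-end-demux.qza \
    --p-trunc-len "$1" \
    --o-table "table-fwd-$1.qza" \
    --o-representative-sequences "rep-seqs-fwd-$1.qza" \
    --o-denoising-stats "denoising-stats-fwd-$1.qza"
}

denoise_forward_only 255
```

Putting the truncation length in the output names keeps the stats from different attempts side by side for comparison.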

-Mike

Hello Mike,
When I used only the forward reads with a truncation length of 255, I got the stats attached below. The percentage of non-chimeric input is more than 50%.
denoising-stats.qzv (1.2 MB)
Is it OK to proceed with the downstream analysis?

Best,
Namraj

Hi @Namraj_Jaishi,

Using the forward reads looks substantially better! I personally feel that having non-chimeric reads above 70% is good. I've had to resort to using only the forward reads for a few data sets too. :slight_smile:
