I cannot get merged reads from the DADA2 step

11134 · December 15, 2020, 7:42am

Dear QIIME 2 team,

I am trying to analyze iSeq 100 data for the 16S V4 region (533F-806R).

Details of my data:

Forward and reverse read lengths are both 150 bp,
∴ (150+150) - (806 - 533 + 1) = 26 bp overlap.
The iSeq 100 output contains only the primer (overhang not included) + V4 region (= 150 bp).
Forward read primer: 19-mer
Reverse read primer: 20-mer

So the command for my DADA2 step is:

$ qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demuxed_paired_end.qza \
  --p-trim-left-f 19 \
  --p-trim-left-r 20 \
  --p-trunc-len-f 150 \
  --p-trunc-len-r 150 \
  --o-representative-sequences rep_seq.qza \
  --o-table table.qza \
  --o-denoising-stats denoising_stats.qza \
  --p-n-threads 32

But when I view the visualization (denoising_stats.qzv), I get the result shown in the following image: a low 'percentage of input merged' and a low 'merged' count.

Please help me.
I don't know what is wrong.

In this case, what else can I do?

ChrisKeefe · December 15, 2020, 3:45pm

Welcome to the forum, @11134!
You're looking in the right places for your answers. Hopefully we can find some answers that work for you.

DADA2 requires at least 12 base pairs (bp) of overlap in order to correctly merge two reads. The most common reason reads fail to merge is that they are not long enough to overlap.

In your case, you have a target amplicon that is ~273 bp long. (A small percentage of ASVs will be a few bp longer or shorter). Specifically, there are ~273 bp of biological data in each sequence.

If I'm reading your post correctly, after trimming each forward read is 150-19 = 131 bp long, and each reverse read is 150-20 = 130 bp long. 131+130 = 261 bp of biological data. If this is the case (and each read includes the primer), then you are correct to trim the primers with --p-trim-left, but your reads are not long enough to join. You need 273+12 = 285 bp of biological data to join forward and reverse reads, and you only have 261 bp. In this case, you probably want to proceed with your analysis on only the forward or reverse reads.
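To make the arithmetic above easy to re-check with other read lengths, here is a minimal bash sketch. It assumes the same simple model used in this reply: the bases kept after trimming must add up to at least the amplicon length plus the 12 bp overlap (273 + 12 = 285 bp). The function name `can_merge` and the variables are made up for illustration; they are not part of QIIME 2 or DADA2.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: these names are hypothetical helpers,
# not QIIME 2 or DADA2 commands.

MIN_OVERLAP=12     # overlap DADA2 needs to merge a read pair (from the reply)
AMPLICON_LEN=273   # approximate target amplicon length (from the reply)

# Usage: can_merge FWD_LEN REV_LEN TRIM_LEFT_F TRIM_LEFT_R
# Prints the bases kept after trimming, the bases required, and a verdict.
can_merge() {
  local fwd=$1 rev=$2 trim_f=$3 trim_r=$4
  local have=$(( (fwd - trim_f) + (rev - trim_r) ))
  local need=$(( AMPLICON_LEN + MIN_OVERLAP ))
  if (( have >= need )); then
    echo "have=${have} need=${need} -> long enough to merge"
  else
    echo "have=${have} need=${need} -> too short to merge"
  fi
}

# Reads include the primers, and 19 + 20 bp are trimmed off.
can_merge 150 150 19 20

# Primers were already removed upstream, so nothing is trimmed.
can_merge 150 150 0 0
```

The first call reports 261 bp against a requirement of 285 bp, and the second reports 300 bp against 285 bp. This only checks raw length; it says nothing about whether read quality is good enough for the overlap to actually match.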

If somehow the 150bp of each read does not include the primers (i.e. your reads are 150 bp long and primers have already been removed), then you shouldn't use --p-trim-left to "remove the primers" again. This would probably get you over the raw read-length requirements to start joining reads, but your level of success will still depend on the quality of your data. If your data quality drops significantly at the 3' ends, you may be better off using only forward or reverse data anyway.
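If you do fall back to forward reads only, a run might look roughly like the sketch below. It assumes the forward reads still start with the 19 bp primer, and the truncation length of 150 is a placeholder to adjust after looking at your quality plots. The output file names are hypothetical. `denoise-single` accepts a paired-end demultiplexed artifact and uses only its forward reads.

```shell
# Sketch only: parameter values and output names are placeholders.
qiime dada2 denoise-single \
  --i-demultiplexed-seqs demuxed_paired_end.qza \
  --p-trim-left 19 \
  --p-trunc-len 150 \
  --o-representative-sequences rep_seq_single.qza \
  --o-table table_single.qza \
  --o-denoising-stats denoising_stats_single.qza \
  --p-n-threads 32
```

If the primers were already removed, drop `--p-trim-left` (or set it to 0) so that biological sequence is not trimmed away.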

There are a lot of good discussions about DADA2 trim/truncation parameters on this forum, and I'd encourage you to spend some time searching and reading those if you're in this latter situation. The Parkinson's mouse tutorial might also be useful to you.

Best of luck,
Chris

11134 · December 16, 2020, 12:45am

Hi Chris! Thank you for your answer!
I understand your comment,
but I had already tried your method.

The result was the same as in my latest attempt.

I will start by reading the references you mentioned.
Thank you.
