low proportion of non-chimeric reads

Dear all

I’m puzzled by the low proportion of non-chimeric reads I obtain with the command:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 280 \
  --p-trunc-len-r 220 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table table-dada2.qza \
  --output-dir denoising \
  --p-n-threads 48

denoising_stats.qzv:

sample-id input filtered denoised merged non-chimeric
#q2:types numeric numeric numeric numeric numeric
1 418763 311787 303312 258323 32586
10 293923 220106 213139 181133 27297
11 306225 233420 223616 185010 29583
12 294613 213471 207708 178502 27723
13 296516 229310 221983 190209 29777
14 339337 263878 255954 219032 25806
15 283101 218357 209309 176372 27013
2 445359 335203 330463 309202 42247
3 512766 383752 379470 354878 51668
4 385325 281007 276781 256989 33575
5 452725 346524 337246 286520 35729
6 466185 357871 347194 293090 31842
7 446120 340125 331603 286642 34744
8 333772 251997 248192 230108 31446
9 257512 194855 187246 154544 25264

Why am I losing more than 80% of the reads at the chimera-filtering step?
Best, Isabelle

There could be a high proportion of chimeras, but not that high!
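
To put a number on "that high", here is a small shell sketch that takes three rows copied from the table above and prints, per sample, the non-chimeric reads as a percentage of merged reads and of input reads. The `retention` helper name is made up for illustration; only `awk` and `printf` are assumed to be available.

```shell
#!/bin/sh
# Three rows copied from the denoising stats above.
# Columns: sample-id input filtered denoised merged non-chimeric
stats='1 418763 311787 303312 258323 32586
2 445359 335203 330463 309202 42247
9 257512 194855 187246 154544 25264'

# Hypothetical helper: for each row, print the sample id, then
# non-chimeric as % of merged, then non-chimeric as % of input.
retention() {
  printf '%s\n' "$1" |
    LC_ALL=C awk '{ printf "%s\t%.1f\t%.1f\n", $1, 100 * $6 / $5, 100 * $6 / $2 }'
}

retention "$stats"
```

For sample 1 this gives roughly 12.6% of merged reads (7.8% of input reads) surviving chimera removal, so nearly all of the loss happens at that last step rather than at filtering or merging.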

Are primers and adapters removed?

You may also want to adjust the min-fold-parent... option described here:

Raising this level has been recommended to reduce false-positive chimera identification:

Good luck!

Dear Nicholas,
Many thanks for your reply!
Yes, I trimmed my reads and removed primers and adapters (with cutadapt and trimmomatic).
To get more non-chimeric reads, I had to use the following parameter:

 --p-min-fold-parent-over-abundance 4.0

but I’m not sure 4.0 was the best option.
I guess it was; here are my new results:

sample-id input filtered denoised merged non-chimeric
#q2:types numeric numeric numeric numeric numeric
1 418684 314414 307128 259587 167077
10 293868 221805 215559 180913 131975
11 306157 235094 226474 184649 138192
12 294557 214975 209721 178503 126875
13 296474 231263 224550 190165 145495
14 339261 266235 259151 219141 148347
15 283049 220300 212308 176494 133787
16 264783 202050 196330 167821 124140
17 249265 187381 182673 155767 102374
18 331300 245115 240467 211866 155136
19 471221 351648 345383 305818 216401
2 445266 338176 333812 311272 202944
20 373866 263616 258448 224434 155590
21 489716 374338 368836 327276 218676
22 452017 342898 334917 288588 218418
23 407695 313660 306590 266242 174507
24 331745 246345 241464 214223 151435
25 668362 503044 493842 439425 276127
26 565252 425086 417796 373417 250400
27 434525 326152 320563 290268 200269
28 441632 323068 317846 285004 202334
29 354844 270807 265906 237277 172074
3 512637 386582 382096 357060 229442
30 286781 222626 218440 194471 141057
31 343048 267123 262595 235371 151831
32 295625 222200 217078 188748 130742
4 385248 283269 279509 258170 165668
5 452643 349067 340552 287042 192772
6 466097 360919 351535 293474 183407
7 446053 342994 335302 287738 187609
8 333688 254207 250731 232115 151518
9 257457 196589 189943 154806 115599

Any comments are welcome, if you have some :)
Best, Isa

I am not sure either! I have not seen a benchmark of these settings (please share if you find one in the literature), but that value appears consistent with the recommendation on the DADA2 issue tracker, and sounds reasonable based on the parameter description. You could try a few different settings to see how each one affects taxonomic composition relative to your expectations.
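
One way to try a few settings, sketched as a dry run: the loop below only prints one `denoise-paired` command per candidate value, reusing the inputs and truncation lengths from the original post. The candidate values, the `build_cmd` helper, and the `denoising-fold-*` output directory names are assumptions for illustration, not recommendations.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a denoise-paired command for one
# value of --p-min-fold-parent-over-abundance. Each run gets its own
# output directory so results from different settings are kept apart.
build_cmd() {
  fold="$1"
  printf 'qiime dada2 denoise-paired --i-demultiplexed-seqs demux-paired-end.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 280 --p-trunc-len-r 220 --p-min-fold-parent-over-abundance %s --p-n-threads 48 --output-dir denoising-fold-%s\n' "$fold" "$fold"
}

# Candidate values are arbitrary examples; adjust to taste.
for fold in 1.0 2.0 4.0 8.0; do
  build_cmd "$fold"
done
```

Piping the output to `sh` would actually run the commands, one after another. Comparing the resulting denoising stats and the downstream taxonomy across directories then shows how sensitive the results are to this parameter.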
