Dada2: good quality data, cut or not?

Pam_mp · October 6, 2023, 7:20pm

I am analyzing MiSeq sequencing data, and the forward and reverse reads are of good quality. I ran some tests with different truncation values, and I'm unsure whether I should set --p-trunc-len-f and --p-trunc-len-r to 150 or whether I can simply set them to 0. In the latter case, could a value of 0 lead to erroneous taxonomy results, given that no sequence shorter than the given value will be discarded?

Testing the command below, I obtained between 55% and 84% non-chimeric reads (which doesn't seem bad to me), but when I set --p-trunc-len-f and --p-trunc-len-r to 0 the results are even better: 75% to 90%. Which option would return more reliable results?

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trunc-len-f 150 \
  --p-trunc-len-r 150 \
  --p-trim-left-f 19 \
  --p-trim-left-r 20 \
  --p-min-overlap 8 \
  --p-n-threads 50 \
  --o-table table-dada2.qza \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-denoising-stats denoising-stats-dada2.qza \
  --verbose

Thank you very much for the help!

Nicholas_Bokulich · October 7, 2023, 10:24am

Hi @Pam_mp ,

No, I would not worry about this. Setting the value to 0 just disables truncation, which means that all reads will be 150 nt (your read length).

No, that is not a bad yield at all. However, most of the reads seem to be lost at the merging step, and that has me worried. So I would not truncate these reads, as truncation is causing some reads to fail to merge.
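To see why truncation can cost merges, here is a rough back-of-the-envelope sketch in shell. It assumes the usual paired-end geometry, where the overlap is roughly the forward truncation length plus the reverse truncation length minus the amplicon length (measured from the start of the forward read to the start of the reverse read). The function name `expected_overlap` and the amplicon length of 295 nt are hypothetical, chosen only for illustration; substitute the length of your own target region.

```shell
# Rough expected overlap (nt) between truncated forward and reverse reads.
# Arguments: forward truncation length, reverse truncation length, amplicon length.
expected_overlap() {
  echo $(( $1 + $2 - $3 ))
}

min_overlap=8      # --p-min-overlap from the command in the question
amplicon_len=295   # HYPOTHETICAL amplicon length; replace with your own

overlap=$(expected_overlap 150 150 "$amplicon_len")
if [ "$overlap" -lt "$min_overlap" ]; then
  echo "overlap ${overlap} nt is below min-overlap ${min_overlap}: reads would likely fail to merge"
else
  echo "overlap ${overlap} nt meets min-overlap ${min_overlap}"
fi
```

Real amplicons vary in length, so in practice some reads sit above the threshold and some below it, which would be consistent with a partial loss at the merging step rather than a total one.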

This seems clear: don't truncate! You get better yields in the end, probably because the reads are merging. Any low-quality sequences will be dropped in the initial filter (or corrected by dada2), so I would not worry about quality.
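As a sketch of what that rerun could look like, here is the command from the question with both truncation lengths set to 0 and everything else unchanged. The function name `denoise_no_trunc` and the `-notrunc` output file names are hypothetical, used only so the earlier results are not overwritten; wrapping the call in a function just means nothing runs until you call it.

```shell
# Sketch: same denoising call as in the question, with truncation disabled.
# Output names ending in -notrunc are hypothetical; pick your own.
denoise_no_trunc() {
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs demux-paired-end.qza \
    --p-trunc-len-f 0 \
    --p-trunc-len-r 0 \
    --p-trim-left-f 19 \
    --p-trim-left-r 20 \
    --p-min-overlap 8 \
    --p-n-threads 50 \
    --o-table table-dada2-notrunc.qza \
    --o-representative-sequences rep-seqs-dada2-notrunc.qza \
    --o-denoising-stats denoising-stats-dada2-notrunc.qza \
    --verbose

  # Turn the per-sample denoising stats into a viewable table so the
  # filtered, merged and non-chimeric counts can be compared between runs.
  qiime metadata tabulate \
    --m-input-file denoising-stats-dada2-notrunc.qza \
    --o-visualization denoising-stats-dada2-notrunc.qzv
}
```

Calling `denoise_no_trunc` in an activated QIIME 2 environment runs both steps; comparing the merged column of the two stats tables should show whether merging is where the truncated run loses reads.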

Good luck!

Pam_mp · October 9, 2023, 6:27am

Hi @Nicholas_Bokulich

Thank you very much for clarifying my doubts.

I wish you a great week!