DADA2 trimming tests - Approach

asbarros · September 4, 2024, 10:21pm

Hey everyone,

I was reading some of the posts on NovaSeq and came across this one: Interactive quality plots of NextSeq data

This got me wondering about "testing the best criteria for trimming in DADA2". For paired-end data with 250 bp forward and reverse reads, what would be the best strategy? Testing at intervals of 10 bp in both forward and reverse?

I am all for it, but I was hoping for an objective approach, considering that the quality scores are less informative with this technology.

Cheers

buzic · September 5, 2024, 11:18am

Hi @asbarros,

I can see you’ve read and commented here. So I think you are asking how to choose truncation parameters when the quality plots are not very informative because the sequencer bins its quality scores. I would say that every dataset is different and there isn't a one-size-fits-all answer: you need to consider the size of your target amplicon, the overlap, and the read length distribution in your data when truncating. Remember that during truncation, reads shorter than the truncation length you set are discarded.
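To make the amplicon size and overlap point concrete, here is a rough sketch of the arithmetic. It assumes a single typical amplicon length measured after primer removal (real amplicons vary in length) and a minimum overlap of 12 bp, which I believe is the DADA2 default but is worth checking for your version. The function names and the 400 bp amplicon are hypothetical, purely for illustration.

```shell
# Overlap left after truncation, assuming one typical amplicon length:
#   overlap = trunc_f + trunc_r - amplicon_length
expected_overlap() {
    echo $(( $1 + $2 - $3 ))
}

# Succeeds when the expected overlap reaches the minimum needed for merging.
# The default of 12 bp is an assumption; pass a 4th argument to override it.
enough_overlap() {
    local min=${4:-12}
    [ "$(expected_overlap "$1" "$2" "$3")" -ge "$min" ]
}

# Hypothetical usage with a 400 bp amplicon:
#   expected_overlap 250 250 400
#   enough_overlap 200 200 400 || echo "truncation too aggressive to merge"
```

Any pair of truncation lengths that fails this check is probably not worth running, which narrows the list of values to trial.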

When I am denoising a dataset I am looking for the best fit for that data, so yes, I always do as you have suggested and run DADA2 multiple times, viewing each output. I would do this regardless of whether the scores were binned, as the quality plots just give me a starting and ending point for the list of values to trial. A simple way is a loop, something like this:

# Truncation lengths to trial (applied to both forward and reverse reads)
nums=(250 245 240 235)

for num in "${nums[@]}"
do
    # Denoise at this truncation length; output names are tagged with the value
    qiime dada2 denoise-paired \
        --i-demultiplexed-seqs paired_end_read_filtered_cutadapt.qza \
        --p-trunc-len-f "$num" \
        --p-trunc-len-r "$num" \
        --o-table "paired_end_read_filtered_cutadapt_${num}_table.qza" \
        --o-representative-sequences "paired_end_read_filtered_cutadapt_${num}_rep-seqs.qza" \
        --o-denoising-stats "paired_end_read_filtered_cutadapt_${num}_denoising-stats.qza"

    # Turn the denoising stats into a visualization for side-by-side viewing
    qiime metadata tabulate \
        --m-input-file "paired_end_read_filtered_cutadapt_${num}_denoising-stats.qza" \
        --o-visualization "paired_end_read_filtered_cutadapt_${num}_denoising-stats.qzv"
done
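For a slightly more objective comparison between runs than eyeballing each .qzv, one option is to reduce each run's stats to a single number. The sketch below assumes the stats have been exported to a TSV and that it contains a column named "percentage of input non-chimeric" plus a "#q2:types" row under the header; both are assumptions about the export format, so check your own file. The function name mean_retained is hypothetical.

```shell
# Mean of the "percentage of input non-chimeric" column in an exported stats TSV.
# The column name is an assumption about the export format; adjust if yours differs.
mean_retained() {
    awk -F'\t' -v col="percentage of input non-chimeric" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
        /^#/    { next }                 # skip the "#q2:types" directive row
        c       { sum += $c; n++ }       # accumulate one value per sample
        END     { if (n) printf "%.2f\n", sum / n; else exit 1 }
    ' "$1"
}

# Hypothetical usage, after exporting each run's stats with something like
#   qiime tools export --input-path <stats>.qza --output-path stats_250
# (the exported file name stats.tsv is assumed here):
#   mean_retained stats_250/stats.tsv
```

A higher mean is not automatically better if it comes from truncating so hard that reads barely overlap, so it is still worth looking at the per-sample numbers for the best few runs.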

Hope that helps.

asbarros · September 5, 2024, 11:32am

Hi @buzic

That's exactly what I was asking, amazing, thank you!

In this loop, forward and reverse are always trimmed to the same length, but I could "fix" forward and then vary reverse, correct?

Thanks once again

buzic · September 5, 2024, 12:09pm

@asbarros Yes, you can change the bash loop by replacing the $num on either the --p-trunc-len-f or the --p-trunc-len-r setting with whatever fixed value you like.
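As a rough sketch of that change, the function below holds the forward truncation fixed and sweeps only the reverse one. The function name sweep_reverse, the DRY_RUN switch and the _f/_r output naming are all hypothetical additions for illustration, and the lengths in the usage comment are placeholders rather than recommendations.

```shell
# Denoise once per reverse truncation length, with the forward length fixed.
# Usage: sweep_reverse <input.qza> <forward_len> <reverse_len>...
# Set DRY_RUN=1 to print the commands instead of running them.
sweep_reverse() {
    local input=$1 fwd=$2 rev prefix
    shift 2
    for rev in "$@"; do
        # Tag outputs with both lengths so runs do not overwrite each other
        prefix="${input%.qza}_f${fwd}_r${rev}"
        ${DRY_RUN:+echo} qiime dada2 denoise-paired \
            --i-demultiplexed-seqs "$input" \
            --p-trunc-len-f "$fwd" \
            --p-trunc-len-r "$rev" \
            --o-table "${prefix}_table.qza" \
            --o-representative-sequences "${prefix}_rep-seqs.qza" \
            --o-denoising-stats "${prefix}_denoising-stats.qza"
    done
}

# Hypothetical usage: forward fixed at 250, reverse trialled at three lengths
#   DRY_RUN=1 sweep_reverse paired_end_read_filtered_cutadapt.qza 250 240 230 220
```

Running it with DRY_RUN=1 first is a cheap way to check the file names before committing to several long denoising runs.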

No problem, best of luck!
