This got me wondering about "testing the best criteria for trimming in dada2". For paired-end data with 250 bp forward and reverse reads, what would be the best strategy? Trying truncation lengths in steps of 10 bp for both forward and reverse?
I am all for it, but I was hoping for a more objective approach, considering that the quality scores are less informative with this technology.
I can see you’ve read and commented here. So I think what you are asking is how to choose truncation parameters when the quality plots are not very informative because the sequencing produced binned quality scores. I would say that every dataset is different and there isn't a one-size-fits-all answer. When truncating, you need to consider the size of your target amplicon, the overlap between forward and reverse reads, and the read length distribution in your data. Remember that any read shorter than the truncation length you set is discarded.
When I am denoising a dataset I am looking for the best fit for that data, so yes, I always do as you have suggested here and run DADA2 multiple times, viewing each output. I would do this regardless of whether the scores were binned or not, as the quality plots just give me a starting and ending point for my list of numbers to trial. A simple way would be a loop, something like this:
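To make the amplicon size and overlap point concrete, here is a rough sketch of the arithmetic. It assumes the expected overlap is simply forward truncation plus reverse truncation minus amplicon length. The amplicon length of 420 is a made-up placeholder (use your own, with primers removed), and the minimum overlap of 12 nt is my assumption based on DADA2's default merge setting, so check it against your version.

```shell
#!/usr/bin/env bash
# Expected overlap between truncated forward and reverse reads:
#   overlap = trunc_f + trunc_r - amplicon_len
overlap() {
    local trunc_f=$1 trunc_r=$2 amplicon_len=$3
    echo $(( trunc_f + trunc_r - amplicon_len ))
}

amplicon_len=420   # hypothetical amplicon length, primers removed; replace with your own
min_overlap=12     # assumed minimum overlap needed for merging; verify for your DADA2 version

# Report which symmetric truncation lengths still leave enough overlap.
for t in 250 240 230 220 210; do
    ov=$(overlap "$t" "$t" "$amplicon_len")
    if (( ov >= min_overlap )); then
        status="ok"
    else
        status="too short to merge"
    fi
    echo "trunc-len $t/$t -> overlap $ov nt ($status)"
done
```

Real amplicons vary in length, so it is worth leaving some margin above the minimum rather than truncating right down to it.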
# Truncation lengths to trial (applied to both forward and reverse reads)
nums=(250 245 240 235)
for num in "${nums[@]}"
do
    # Denoise with the current truncation length; outputs are named after it
    qiime dada2 denoise-paired \
        --i-demultiplexed-seqs paired_end_read_filtered_cutadapt.qza \
        --p-trunc-len-f "$num" \
        --p-trunc-len-r "$num" \
        --o-table "paired_end_read_filtered_cutadapt_${num}_table.qza" \
        --o-representative-sequences "paired_end_read_filtered_cutadapt_${num}_rep-seqs.qza" \
        --o-denoising-stats "paired_end_read_filtered_cutadapt_${num}_denoising-stats.qza"
    # Turn the denoising stats into a viewable table for comparison between runs
    qiime metadata tabulate \
        --m-input-file "paired_end_read_filtered_cutadapt_${num}_denoising-stats.qza" \
        --o-visualization "paired_end_read_filtered_cutadapt_${num}_denoising-stats.qzv"
done
@asbarros Yes, you can just change the bash loop: replace the $num on the --p-trunc-len-f or --p-trunc-len-r setting with whatever value you like.
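If you want to vary the forward and reverse truncation independently rather than fixing one of them, a nested loop is one option. The sketch below is a dry run that only prints the commands so you can check them first. The truncation values and the _f/_r naming of the output files are just examples I made up, and the input name is the one from the loop in this thread.

```shell
#!/usr/bin/env bash
# Dry run: print one denoise-paired command per forward/reverse combination.
fwd_lens=(250 240 230)   # hypothetical forward truncation lengths
rev_lens=(240 220 200)   # hypothetical reverse truncation lengths
input=paired_end_read_filtered_cutadapt

print_commands() {
    local f r prefix
    for f in "${fwd_lens[@]}"; do
        for r in "${rev_lens[@]}"; do
            # Put both lengths in the file names so no run overwrites another
            prefix="${input}_f${f}_r${r}"
            echo qiime dada2 denoise-paired \
                --i-demultiplexed-seqs "${input}.qza" \
                --p-trunc-len-f "$f" \
                --p-trunc-len-r "$r" \
                --o-table "${prefix}_table.qza" \
                --o-representative-sequences "${prefix}_rep-seqs.qza" \
                --o-denoising-stats "${prefix}_denoising-stats.qza"
        done
    done
}

print_commands
```

Once the printed commands look right, removing the echo makes the loop actually run them. Keep in mind that the number of runs is the product of the two lists, so it grows quickly.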