I’m working on a dataset of 2 x 300 bp paired-end reads obtained on an Illumina MiSeq, and I’m now looking at the quality plots generated after demultiplexing.
I was trying to figure out the best parameters for denoising (in particular, trimming length).
I calculated a 52 nt overlap (length of the amplified region - median forward length - median reverse length).
(I amplified the V3-V4 region.)
So: 550 - 301 - 301 = -52, i.e. the two reads together are 52 nt longer than the amplicon, so they overlap by 52 nt.
Given this result, how many bp have to be trimmed (from the 3’ end) before denoising with dada2? (parameters --p-trim-left-f and --p-trim-left-r)
I assumed 26 for both (?).
You don't need to trim anything if your reads are high-quality enough. Trimming and truncation decisions should be based on read quality. Just make sure you don't remove so much that your reads fail to overlap; beyond that, keeping more read length is usually better because it gives a longer overlap.
You can also use the --p-trim-* parameters to trim primers from the 5' ends, but that will not affect overlap.
The --p-trim-* parameters trim the 5' ends of the reads. Is that what you meant?
It sounds like you are trying to trim all the overlapping nucleotides so that the reads join end to end? That will not work with q2-dada2: it joins reads based on their overlap (minimum 12 nt overlap), not by end-to-end joining.
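To make the overlap reasoning concrete, here is a minimal Python sketch of the arithmetic from this thread. It is only an illustration, not part of q2-dada2: the function names are hypothetical, the 550 nt amplicon length and 301 nt read lengths are the values quoted above, and the candidate truncation lengths are made-up examples of what you might pass to --p-trunc-len-f and --p-trunc-len-r. It also assumes every amplicon has the same length, which real V3-V4 amplicons do not.

```python
# Minimum overlap quoted in this thread for read joining in q2-dada2.
MIN_OVERLAP = 12


def expected_overlap(amplicon_len, len_f, len_r):
    """Bases shared by the forward and reverse reads after 3' truncation.

    A negative value means the reads do not reach each other at all.
    """
    return len_f + len_r - amplicon_len


def can_merge(amplicon_len, len_f, len_r, min_overlap=MIN_OVERLAP):
    """True if the expected overlap meets the minimum required for joining."""
    return expected_overlap(amplicon_len, len_f, len_r) >= min_overlap


if __name__ == "__main__":
    amplicon_len = 550  # V3-V4 amplicon length assumed in the question
    # Hypothetical (forward, reverse) read lengths after truncation.
    for len_f, len_r in [(301, 301), (290, 280), (270, 250)]:
        overlap = expected_overlap(amplicon_len, len_f, len_r)
        verdict = "ok" if can_merge(amplicon_len, len_f, len_r) else "too short"
        print(f"forward={len_f} reverse={len_r} overlap={overlap} nt ({verdict})")
```

With these numbers only about 40 nt in total can be removed from the 3' ends before the overlap drops below 12 nt, which is why truncating aggressively on a V3-V4 amplicon is risky. Trimming at the 5' end with the --p-trim-* parameters does not enter this calculation, since the overlap sits at the 3' ends.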
Hello @Nicholas_Bokulich, thanks for your answer!
I tried different criteria and my results were almost unaffected… so my reads are probably quite high-quality and apparently do not require a strict trimming step.
The quality looked very good according to the box-plots I looked at.
I’ll stick to default parameters for now!