Is it incorrect to leave forward and reverse reads the same length?

I’m having some trouble determining whether it’s okay to leave forward and reverse reads the same length.

My raw reads are ~250nt. Normally I end up trimming them to be the same length because the quality scores are fairly similar in both forward and reverse. But is this going to negatively impact my results?

I just need some help wrapping my head around this!

Hi @Ellenphant,
Since paired-end reads are ultimately merged at their 3’ ends, truncating them on that end does not matter, as long as you don’t truncate so much that the reads can no longer overlap enough to merge.
In fact, truncating at the 3’ end (often the poor-quality tails) is recommended when you’re using DADA2 for denoising, since removing those low-quality tails allows more reads to pass the initial filtering step.

ex1: no truncating (2x20 bp reads, 10 bp overlap = 30 bp merged)
F:             5'--------------------3'
R:                       3'--------------------5'
merged:        5'------------------------------3'

ex2: truncating 3 bp (F) + 2 bp (R) (17 bp + 18 bp reads, 5 bp overlap = 30 bp merged)
F:             5'-----------------...3'
R:                       3'..------------------5'
merged:        5'------------------------------3'

As you can see from the examples above, both approaches give you the same 30 bp merged read; truncating only shortens the overlap (from 10 bp to 5 bp in ex2).
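
If it helps, the arithmetic behind the two diagrams can be written out as a small Python sketch. This is only an illustration of the toy numbers above; the function name `merged_length` is made up for this post and is not part of DADA2 or QIIME 2.

```python
def merged_length(fwd_len: int, rev_len: int, overlap: int) -> int:
    """Length of a merged pair: both reads minus the bases they share."""
    if overlap < 0 or overlap > min(fwd_len, rev_len):
        raise ValueError("overlap must be between 0 and the shorter read length")
    return fwd_len + rev_len - overlap


# ex1: no truncation, 2x20 bp reads sharing a 10 bp overlap
ex1 = merged_length(20, 20, 10)

# ex2: 3 bp cut from the forward read and 2 bp from the reverse read;
# all 5 removed bases come out of the overlap, so it shrinks from 10 to 5
ex2 = merged_length(20 - 3, 20 - 2, 10 - 5)

print(ex1, ex2)
```

Both cases come out to the same merged length because every base removed from a 3’ tail is also removed from the overlap.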

So for analyses that I have completed with both forward and reverse being kept the same length, it just means that I might have lost more sequences in the filtering? But that the taxonomic identification won’t have been affected?

Hi @Ellenphant,
Correct, because the merged length of the sequences that passed would have been the same.

Phew. I thought that all of what I have done so far would be 100% invalid. Probably bad that I am losing sequences though.

Is there a general rule for determining the trim and trunc lengths for forward and reverse reads if quality scores never drop off?

Hi @Ellenphant,
That’s a good question. There is no general rule for determining trim/truncate lengths, though there are some general ‘guidelines’. There are lots of previous topics on the forum about this, and I recommend reading through some of them to get a better idea of how to approach it. If your quality scores never drop off (which is rather unusual for an Illumina run), then this is less of an issue for you. In general, though, it is an optimization problem: you want to truncate as much of the 3’ tails of both reads as possible without compromising merging. Since the forward reads are often in better shape, we tend to truncate less from the forward reads and more from the reverse. There is also a pre-print on FIGARO, a tool that tries to solve this optimization problem. I’ve never used it personally, but it might be of interest.
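
As a rough way to sanity-check a pair of truncation lengths before running the denoising step, something like the Python sketch below can help. It is only a back-of-the-envelope check under stated assumptions: the 400 nt amplicon length is a made-up example value (use your own), the 12 nt minimum overlap is assumed here to match the default used when DADA2 merges pairs, and the function names are hypothetical, not part of any tool.

```python
def overlap_after_truncation(amplicon_len: int, trunc_f: int, trunc_r: int) -> int:
    """Bases shared by the two reads once they are truncated to trunc_f / trunc_r."""
    return trunc_f + trunc_r - amplicon_len


def can_merge(amplicon_len: int, trunc_f: int, trunc_r: int, min_overlap: int = 12) -> bool:
    """True if the truncated reads still overlap by at least min_overlap bases."""
    return overlap_after_truncation(amplicon_len, trunc_f, trunc_r) >= min_overlap


AMPLICON_LEN = 400  # hypothetical amplicon length, not taken from this thread

# Candidate (forward, reverse) truncation lengths for 2x250 nt reads
candidates = [(250, 250), (240, 200), (220, 180)]
results = {
    (f, r): can_merge(AMPLICON_LEN, f, r)
    for f, r in candidates
}

for (f, r), ok in results.items():
    overlap = overlap_after_truncation(AMPLICON_LEN, f, r)
    print(f"trunc F={f}, R={r}: overlap={overlap} nt, mergeable={ok}")
```

Real amplicons vary in length, so it is safer to leave some margin above the minimum overlap rather than truncating right up to it.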
