Why do I have more sequences (and therefore more taxonomic groups recovered) when I use only my forward sequences?

The thread I linked to you earlier should help with that. Another couple of good threads that'll help are:

Determining the truncation values can be tricky. Given these and the earlier thread, you should be able to estimate a reasonable set of truncation values that satisfies the minimum overlap requirement. For DADA2, the minimum overlap is ~12 nucleotides.

That is, the point of truncating is to remove the low-quality bases and reduce the mismatched bases in the overlap that we discussed earlier. So, as long as the truncated reads still leave enough for the minimum overlap, you should be good to go. If you are unable to retain an acceptable number of merged reads per sample, then you can consider processing only the forward reads.
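
If it helps, here is a rough sketch of the overlap arithmetic for a candidate pair of truncation values. It assumes the forward and reverse reads together span the whole amplicon, so the expected overlap is simply the two truncated lengths minus the amplicon length. The function names and the example numbers (a 253 nt and a 460 nt amplicon) are hypothetical and not taken from your data; only the ~12 nt minimum comes from the discussion above. Use your longest expected amplicon, measured the same way as your reads (with or without primers).

```python
def expected_overlap(amplicon_len, trunc_f, trunc_r):
    """Bases shared by the forward and reverse reads after truncation.

    Assumes both reads start at opposite ends of the same amplicon.
    A negative value means the truncated reads no longer meet at all.
    """
    return trunc_f + trunc_r - amplicon_len


def enough_overlap(amplicon_len, trunc_f, trunc_r, min_overlap=12):
    """True if the truncated reads keep at least min_overlap bases in common.

    min_overlap=12 mirrors the ~12 nt figure mentioned above.
    """
    return expected_overlap(amplicon_len, trunc_f, trunc_r) >= min_overlap


# Hypothetical example: 253 nt amplicon, truncating at 150 (forward) and 140 (reverse)
print(expected_overlap(253, 150, 140))  # 37
print(enough_overlap(253, 150, 140))    # True

# Hypothetical example: 460 nt amplicon, truncating at 250 and 200 leaves a gap
print(expected_overlap(460, 250, 200))  # -10
print(enough_overlap(460, 250, 200))    # False
```

If the result sits close to the minimum, natural length variation in your amplicon can push some reads below it, so it is worth leaving a few extra bases of margin before trading away more of the low-quality tail.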
