Hi @elsamdea
These are good questions! You are correct: technically you can go either way. That is, you can use cutadapt, or just trim the primers off directly within deblur or DADA2.
I prefer to use cutadapt to remove primers for the following reasons, which I think I've echoed elsewhere in the forum a few times:
- There may be spurious off-target sequences within your data. Trimming a fixed number of bases will retain these reads.
- PCR / sequencing errors can add or remove bases at the beginning or end of a read, which can inflate the differences between sequences and create more sequence variants in the output.
- Quality at the beginning of the read is somewhat indicative of the quality later in the read. That is, if you can't find the primer, then what else is wrong with the sequence?
In other words, using cutadapt to search through your reads and find and remove the primers is an additional form of quality control. If the primers cannot be found, chances are the reads are of low quality anyway, and you might as well discard them (e.g. with `--discard-untrimmed`).
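To make the difference concrete, here is a toy Python sketch comparing fixed-length trimming with a primer search that discards reads where no primer is found. This is only an illustration, not how cutadapt works internally (cutadapt uses error-tolerant alignment). The primer, the reads, the function names, and the `max_mismatches` / `max_offset` parameters are all made up for the example.

```python
from typing import Optional


def trim_fixed_length(read: str, primer_len: int) -> str:
    """Chop off the first primer_len bases without checking what they are."""
    return read[primer_len:]


def trim_by_primer_search(
    read: str, primer: str, max_mismatches: int = 1, max_offset: int = 3
) -> Optional[str]:
    """Toy primer search (hypothetical helper, not cutadapt's algorithm).

    Looks for the primer near the start of the read, allowing a few
    mismatches and a small positional shift. Returns the read with the
    primer (and anything before it) removed, or None if no primer is
    found, which mimics discarding untrimmed reads.
    """
    for offset in range(max_offset + 1):
        window = read[offset:offset + len(primer)]
        if len(window) < len(primer):
            break
        mismatches = sum(a != b for a, b in zip(window, primer))
        if mismatches <= max_mismatches:
            return read[offset + len(primer):]
    return None


PRIMER = "ACGTTGCA"  # made-up primer, for illustration only

reads = {
    "on_target": "ACGTTGCAGGGTTTCCC",
    "extra_base_at_start": "TACGTTGCAGGGTTTCCC",
    "off_target": "TTTTTTTTTTGGGCCCAAA",
}

for name, read in reads.items():
    fixed = trim_fixed_length(read, len(PRIMER))
    searched = trim_by_primer_search(read, PRIMER)
    print(f"{name}: fixed={fixed} searched={searched}")
```

With fixed-length trimming, the read with one extra leading base keeps a stray primer base (`AGGGTTTCCC` instead of `GGGTTTCCC`), so it looks like a different sequence, and the off-target read is kept. With the primer search, the shifted read is trimmed to the same sequence as the on-target one, and the off-target read comes back as `None`, i.e. it would be discarded.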
For example here are a couple forum threads you can read through:
But there is no one "right way" to do things here. I just prefer to be as exacting as possible and retain the best quality data I can.