Cutadapt for Single-end Sequences

SoilRotifer · February 25, 2021, 10:52pm

I had just noticed that you had set:

I would not recommend doing this, as you will still "see" your primers in the output. The problem is that you are not allowing for any errors within the span of the primer sequence. Unless you allow for some errors (mismatches or indels; the default is 10%), any primer that is not an exact match will not be trimmed from the sequence. For example, the second sequence below has a 1 bp mismatch to the primer (P), and sequences 5 and 6 are each missing a base:

P: GTGCCAGCMGCCGCGGTAA
2: GTGCCGGCAGCCGCGGTAA
5: GTGCCAGC-GCCGCGGTAA
6: GTG-CAGCAGCCGCGGTAA

Note: an A in the read aligned to the M in the primer is not a mismatch when --p-match-adapter-wildcards is set, as M represents A or C.
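To make the error-rate point concrete, here is a rough Python sketch (not cutadapt's actual code). It computes a plain edit distance between the primer and the primer-length start of each example read, treating IUPAC codes in the primer as wildcards, and compares it to an error budget. Two assumptions on my part: the budget is taken as the primer length times the error rate, rounded down, and the reads are cut down to just the primer region, whereas cutadapt aligns against the full read.

```python
# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def edit_distance(primer, read):
    """Levenshtein distance where a primer wildcard matches any base it encodes."""
    prev = list(range(len(read) + 1))
    for i, p in enumerate(primer, start=1):
        curr = [i]
        for j, r in enumerate(read, start=1):
            cost = 0 if r in IUPAC[p] else 1
            curr.append(min(prev[j] + 1,          # base missing from the read
                            curr[j - 1] + 1,      # extra base in the read
                            prev[j - 1] + cost))  # match or mismatch
        prev = curr
    return prev[-1]


def would_trim(primer, read, error_rate):
    """True if the read's primer region fits within the error budget (assumed rounding)."""
    max_errors = int(error_rate * len(primer))
    return edit_distance(primer, read) <= max_errors


PRIMER = "GTGCCAGCMGCCGCGGTAA"
READS = {
    "2": "GTGCCGGCAGCCGCGGTAA",  # 1 bp mismatch
    "5": "GTGCCAGCGCCGCGGTAA",   # missing a base
    "6": "GTGCAGCAGCCGCGGTAA",   # missing a base
}

for name, read in READS.items():
    print(name,
          "errors:", edit_distance(PRIMER, read),
          "trimmed at 0.0:", would_trim(PRIMER, read, 0.0),
          "trimmed at 0.1:", would_trim(PRIMER, read, 0.1))
```

Each of the three reads is one edit away from the primer, so under these assumptions none of them fits a budget of zero errors, while a 19 bp primer at 10% leaves room for one error and all three fit.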

Also, as I mentioned, there are other tricks in the thread I linked. For example, you can set --p-discard-untrimmed, as outlined in that same thread.

So leave --p-error-rate at its default and set --p-discard-untrimmed.
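As a rough illustration of that recommendation, the sketch below assembles a `qiime cutadapt trim-single` call as an argument list and prints it without running anything. The file names `demux.qza` and `trimmed.qza` and the helper name are hypothetical, and passing the primer via `--p-front` is my assumption that it sits at the 5' end of your reads; adjust to whatever your actual command uses.

```python
import shlex


def build_trim_single_command(primer, demux_path, trimmed_path):
    """Build the argument list; --p-error-rate is omitted on purpose so the default applies."""
    return [
        "qiime", "cutadapt", "trim-single",
        "--i-demultiplexed-sequences", demux_path,  # hypothetical input artifact
        "--p-front", primer,                        # assumes a 5' primer
        "--p-match-adapter-wildcards",              # let M match A or C
        "--p-discard-untrimmed",                    # drop reads where no primer was found
        "--o-trimmed-sequences", trimmed_path,      # hypothetical output artifact
    ]


cmd = build_trim_single_command("GTGCCAGCMGCCGCGGTAA", "demux.qza", "trimmed.qza")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell once the paths match your own files.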

-Mike