DADA2 paired-end (EMP): All sequences of same length

benjjneb · June 6, 2019, 8:15pm

Ah I see what happened here.

The way you trimmed your reads initially resulted in the reverse read always completely overlapping the forward read. That is, the full amplicon length varies (tightly) between about 251 and 255 nts. But since you cut off the first 5 nts of the reverse read, it always started at a position within the forward read, and because it was truncated at 203 nts, it never extended past the other end of the forward read. So every merged read is just the length of the forward read. When you truncated sooner in the second set of parameters, the forward read no longer extended past the start of the reverse read, so the merged read ended where the reverse read started, and you got back some length variation.
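
To make the arithmetic concrete, here is a rough sketch in Python (not DADA2 code) of that coordinate reasoning. Only the 251-255 nt amplicon range, the 5 nt reverse trim and the 203 nt truncation come from the post. The forward lengths of 250 and 200, and the kept reverse length of 198 (203 minus the 5 trimmed nts), are assumed values for illustration, and `merged_length` is a hypothetical helper. It also assumes overhangs are kept when merging and that neither read is longer than the amplicon.

```python
def merged_length(amplicon_len, fwd_len, rev_trim, rev_len):
    """Merged read length in a simplified model (positions count from the amplicon start).

    Assumes overhangs are kept when merging and reads are no longer than the amplicon.
    """
    fwd_end = fwd_len                   # forward read covers [0, fwd_end)
    rev_end = amplicon_len - rev_trim   # trimmed reverse read begins here
    rev_start = rev_end - rev_len       # and reaches back to here
    if rev_start >= fwd_end:
        raise ValueError("reads do not overlap")
    # The merged read spans from the leftmost covered position to the rightmost one.
    return max(fwd_end, rev_end) - min(0, rev_start)


amplicons = range(251, 256)

# Assumed first setting: forward reads of 250 nt reach past the reverse read start.
first = [merged_length(a, 250, 5, 198) for a in amplicons]

# Assumed second setting: forward reads truncated to 200 nt stop short of it.
second = [merged_length(a, 200, 5, 198) for a in amplicons]

print(first)
print(second)
```

Under these assumed numbers the first setting gives the same merged length for every amplicon, because the forward read alone sets both ends, while the second setting tracks the amplicon length (minus the 5 trimmed nts) because the reverse read now sets the far end.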

Nothing is really wrong in either case, but I would recommend not trimming off the initial 5 nts, mostly because trimming them will make it harder to merge later on with other datasets that start and end at the standard primer set locations.