Using dada2 denoise-paired, I am confused about a few things:

  1. I know I need to worry about trimming the primers/adapters, but I am unsure whether mine have already been removed. I used the 16S primers "341F+linkr" and "806R+linkr". How can I tell if they were sequenced?

  2. If I do need to trim them off, do I add the number of primer/adapter nucleotides to the number of low-quality bases that I want to remove? If so, how can I tell how long my primers/adapters are?

  3. I also know that I need to think about having enough overlap for my amplicons. I've heard the minimum is about 20 bases, is that correct?

  4. I have an idea of what my trimming/truncating parameters would be, but suggestions would help. They are as follows:

trim-left-f 37
trim-left-r 11
trunc-len-f 114
trunc-len-r 124

Hi @Alan_Chan,

You can look through the first few raw reads using something like the “head” command and check whether they still start with your primer/adapter sequences. You can also ask your sequencing facility, as they should have this information as well.
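As a rough alternative to eyeballing the output of “head”, here is a minimal Python sketch that counts how many reads begin with a given primer, allowing for degenerate IUPAC bases. The primer sequence below is a commonly cited 341F sequence and is only an assumption here; the "+linkr" variants may differ, so substitute the exact sequences from your own protocol. The file name in the final comment is hypothetical.

```python
import gzip
import re
from itertools import islice

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_pattern(primer):
    """Compile a primer that may contain degenerate bases into a regex."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def fastq_sequences(path, limit=1000):
    """Yield up to `limit` sequence lines from a FASTQ file (plain or .gz)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Each FASTQ record is 4 lines; the sequence is the 2nd line.
        for line in islice(islice(handle, 1, None, 4), limit):
            yield line.strip()


def fraction_starting_with(seqs, primer, max_offset=0):
    """Fraction of reads where the primer matches within `max_offset`
    bases of the 5' end (a linker or spacer can shift its position)."""
    pattern = primer_pattern(primer)
    seqs = list(seqs)
    if not seqs:
        return 0.0
    hits = sum(
        any(pattern.match(seq, start) for start in range(max_offset + 1))
        for seq in seqs
    )
    return hits / len(seqs)


# Assumption: commonly cited 341F sequence; confirm against your protocol.
PRIMER_341F = "CCTACGGGNGGCWGCAG"

# Two made-up reads for illustration: the first starts with the primer.
demo_reads = ["CCTACGGGAGGCAGCAGTGG", "TACGTAGGGTGCGAGCGTTA"]
print(fraction_starting_with(demo_reads, PRIMER_341F))  # 0.5

# With real data (hypothetical file name):
# print(fraction_starting_with(fastq_sequences("sample_R1.fastq.gz"), PRIMER_341F))
```

A fraction close to 1 suggests the primer is still on the reads; a fraction close to 0 suggests it was removed or that the sequence you searched for is not the one that was used.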

This depends on which primers/adapters you used when you were preparing your libraries. You can look through your protocol and/or ask whoever did the library prep; they should be able to point you to the right resources to get a proper count. If the primers/adapters are still intact in your reads, you can add up their lengths and use the trim option to remove them. If I had to guess, I would bet they have already been removed, but you should double-check.

Almost. For dada2, the recommended overlap is 20 bases plus the natural length variation in your amplicon, so that you are not excluding targets that are naturally a bit longer. If my math is correct, with your primer set and a 2x300 bp run you should have ~140 bp of overlap. Basically, you want to avoid truncating more than ~100 bp in total across your two reads.
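The overlap arithmetic can be sketched in a few lines of Python. This is only an illustration: the 460 bp amplicon length is an assumption back-calculated from the ~140 bp overlap quoted above for a 2x300 run (600 - 140), and the 20 bp allowance for length variation is likewise an assumed value chosen to reproduce the ~100 bp truncation budget.

```python
def expected_overlap(trunc_len_f, trunc_len_r, amplicon_len):
    """Bases shared by the forward and reverse reads after truncation.

    A negative value means the reads no longer meet in the middle.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


AMPLICON_LEN = 460      # assumption: implied by ~140 bp overlap on 2x300
MIN_OVERLAP = 20        # overlap recommended in the reply above
LENGTH_VARIATION = 20   # assumption: allowance for longer amplicons

# Untruncated 2x300 reads.
full_overlap = expected_overlap(300, 300, AMPLICON_LEN)
print(full_overlap)  # 140

# How many bases can be truncated in total across both reads.
truncation_budget = full_overlap - MIN_OVERLAP - LENGTH_VARIATION
print(truncation_budget)  # 100

# The truncation lengths proposed in the question (114 and 124).
print(expected_overlap(114, 124, AMPLICON_LEN))  # -222
```

Note that only the truncation lengths enter this calculation; trimming from the 5' end of the reads does not eat into the overlap region in the middle of the amplicon.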

Your parameters are rather strict and, as they stand, will certainly not leave enough overlap for merging. Assuming the primers/adapters are already removed, I would choose something like: trim 17 bp from both forward and reverse reads, truncate forward reads at 285, and truncate reverse reads at 220. This truncates around 95 bp in total, so it should give you good overlap for merging. If you find that you’re not retaining enough reads afterwards, you can also try running just the forward reads, since these seem to be in much better shape than the reverse reads.
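To show how these values map onto the denoise-paired options, here is a small Python sketch that assembles the command line as a string. The artifact file names (demux.qza, table.qza, rep-seqs.qza, denoising-stats.qza) are hypothetical placeholders, and the helper function itself is just for illustration; adjust both to your own analysis.

```python
import shlex


def denoise_paired_command(trim_left_f, trim_left_r, trunc_len_f, trunc_len_r,
                           demux="demux.qza"):
    """Build a `qiime dada2 denoise-paired` command line as a string.

    Input and output file names are hypothetical placeholders.
    """
    args = [
        "qiime", "dada2", "denoise-paired",
        "--i-demultiplexed-seqs", demux,
        "--p-trim-left-f", str(trim_left_f),
        "--p-trim-left-r", str(trim_left_r),
        "--p-trunc-len-f", str(trunc_len_f),
        "--p-trunc-len-r", str(trunc_len_r),
        "--o-table", "table.qza",
        "--o-representative-sequences", "rep-seqs.qza",
        "--o-denoising-stats", "denoising-stats.qza",
    ]
    # shlex.join quotes arguments safely for pasting into a shell.
    return shlex.join(args)


# Values suggested in the reply: trim 17 from both reads, truncate at 285/220.
command = denoise_paired_command(17, 17, 285, 220)
print(command)
```

After running the printed command, the denoising stats artifact is the place to check how many reads survived filtering and merging.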

Hope this helps!


