I have a question related to dada2. I have seen similar questions in older forum posts, but I didn't find a solution to my problem, and I would really appreciate your help. I'm having problems with filtering and merging: very few reads make it through either step.
I am analyzing 16S data from 96 samples.
The region is V4.
Amplicon size is 291 bp.
Primers are 515F and 806R.
My reads are 151 bp.
Overlap = 11 bp.
My overlap is 11 bp, which is quite short, and I think that is why I am having problems with merging. I know that dada2's default minimum overlap is 12 bp. Is there a way to change it? How can I get my reads merged?
Thank you for sharing all this context. I think I now have a complete picture of what happened here.
The unsolved problem is read joining, so let's start with the overlap calculation:
I agree with your math:
151*2 - (806 - 515) = reads - region
302 - 291 = reads - region
11 = overlap
This is very close, and the exact positions of the primers will make or break joining. For example, the primers used during sequencing (which may differ from those used during amplification) could change whether the reads overlap.
515 |--------------------------------| 806
  f |-->                          <--| r     primers
    |---------------><---------------|       sequencing from start of primers
       |-----------<-->-----------|          sequencing from end of primers
Because cutadapt trim-paired removed primers from your reads, we know the primers were in the reads. Without primers, the reads are 132 bp (forward) and 131 bp (reverse), which is not enough to overlap this region.
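To make the arithmetic concrete, here is a minimal Python sketch of the two cases above. The function names and constants are hypothetical, written only for illustration. The primer lengths (19 and 20 bp) are simply what 151 - 132 and 151 - 131 imply, and the target length is held at 291 bp in both cases, following the reasoning in this thread. The 12 bp threshold is the dada2 default minimum overlap mentioned earlier.

```python
def expected_overlap(fwd_len: int, rev_len: int, target_len: int) -> int:
    """Bases shared by the two reads; a negative value means a gap between them."""
    return fwd_len + rev_len - target_len


def can_merge(overlap: int, min_overlap: int = 12) -> bool:
    """True if the overlap reaches the minimum required for joining."""
    return overlap >= min_overlap


READ_LEN = 151        # raw read length from the thread
TARGET_LEN = 291      # region length quoted in the thread (assumed fixed here)
FWD_PRIMER_LEN = 19   # assumption: implied by 151 - 132
REV_PRIMER_LEN = 20   # assumption: implied by 151 - 131

# Case 1: reads as sequenced, primers still attached
untrimmed = expected_overlap(READ_LEN, READ_LEN, TARGET_LEN)

# Case 2: reads after the primers have been trimmed off
trimmed = expected_overlap(
    READ_LEN - FWD_PRIMER_LEN,
    READ_LEN - REV_PRIMER_LEN,
    TARGET_LEN,
)

print(untrimmed, can_merge(untrimmed))  # 11 False
print(trimmed, can_merge(trimmed))      # -28 False
```

If your sequencing core measures the region differently, swap their number in for TARGET_LEN and check whether the overlap clears the threshold.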
You can't join these reads.
(If the sequencing core says they should join, ask how they do it. Maybe their math is different.)
The good news is that the forward read quality looks great, and you should be able to analyze your samples using just the forward read. Let us know if you have any questions about doing this!
Thank you! But I have another question. I've read on "Initial QIIME Processing : earthmicrobiome" that I don't need to trim the primers, just the barcodes. Since my sequences came to me already demultiplexed, in theory I can go straight to dada2, because there are no barcodes attached to them, right?
If no trimming is needed, will I have 150 bp reads? And if I set --p-min-overlap to 6 bp, would I be able to merge them?
Another question: if I use just the forward reads, am I going to lose a lot of information?
Thank you so much for your help, patience, and guidance. I'm new at this and I have a lot of questions, hehe.
Great question! The EMP uses a special sequencing method, so there are no primers in the reads. (This is the method I mentioned above.)
515 |--------------------------------| 806
  f |-->                          <--| r     primers
    |---------------><---------------|       Normal Illumina sequencing
       |-----------<-->-----------|          EMP method (no primers!)
Your reads do have primers in them, as you found by running cutadapt.
As for merging with a lower --p-min-overlap: try it and see!
If you use only the forward reads, the taxonomic resolution may be reduced because the ASVs will be shorter. But you get to keep most of your reads and avoid any bias due to joining, so that's very good!