I have paired-end reads (2x300) from the V4 16S region (515F 5′-GTGCCAGCMGCCGCGGTAA and 806R 5′-GGACTACVSGGGTATCTAAT). I have 5 samples, each with 2 read files in fastq format (R1 and R2). I noticed that the reads have primers and barcodes inside them, as shown in the attached figure. My question is whether it is mandatory to remove only the barcodes, or both primers and barcodes. What is the effect of not removing primer or barcode sequences from the reads before starting the analysis in QIIME 2? I would really appreciate any help.
Barcodes: yes, you must remove them. Primers: not mandatory, but removing them is recommended.
Barcodes will cause serious issues because they are non-biological DNA! They will mess up anything involving alignment, including identifying sequence variants. You must trim them.
Primers are conserved across all sequences, so they are non-informative. Removing them will potentially improve sequence variant calling and taxonomy classification. Degenerate primers can also cause issues with variant calling, I believe. So trim if you can. dada2 has `--p-trim-left` parameters to make this easy (`--p-trim-left-f` and `--p-trim-left-r` for paired-end reads), which cut a fixed number of bases from the 5′ end of each read. Otherwise use q2-cutadapt to trim primers.
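To make the idea of primer trimming concrete, here is a rough Python sketch of what it means to find a degenerate primer (such as 515F, which contains the IUPAC code `M`) near the start of a read and cut everything up to and including it. This is only an illustration, not how dada2 or q2-cutadapt work internally: the function names, the `max_offset` argument, and the example read and 8-base barcode are all made up for the example, and it only accepts exact matches (no mismatches or indels).

```python
import re

# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_regex(primer):
    """Compile a degenerate primer into a regex, one character class per base."""
    return re.compile("".join(f"[{IUPAC[base]}]" for base in primer.upper()))


def trim_primer(read, primer, max_offset=0):
    """Remove the primer and anything before it (e.g. a barcode).

    The primer must start within the first `max_offset + 1` positions of
    the read (hypothetical parameter). If it is not found there, the read
    is returned unchanged.
    """
    pattern = primer_regex(primer)
    for start in range(max_offset + 1):
        match = pattern.match(read, start)  # anchored at position `start`
        if match:
            return read[match.end():]
    return read


# Forward primer from the question (515F); M matches A or C.
FWD_PRIMER = "GTGCCAGCMGCCGCGGTAA"

# Made-up read: 8-base barcode + primer (M resolved to A) + biological sequence.
example_read = "ACGTACGT" + "GTGCCAGCAGCCGCGGTAA" + "TACGGAGGGT"
trimmed = trim_primer(example_read, FWD_PRIMER, max_offset=12)
print(trimmed)
```

If the primer always starts at the very first base of the read, cutting a fixed number of bases is enough (19 for 515F and 20 for 806R, going by the sequences in the question). If a barcode of varying length sits in front of it, a search-based approach like the one above, or q2-cutadapt, is the safer route.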
I hope that helps!