what percentage of reads should contain my primer sequence/region?

ranxx005 · May 29, 2019, 12:42am

I am new here and need some help. The amplicon targets the 515F to 806R region, sequenced on a HiSeq with 2x250 reads. When I look at the read sequences, only 50% of R1 reads contain the forward primer, and only 50% of R2 reads contain the reverse primer. Is 50% normal? Should I expect 100% (or at least close to 100%) of reads to contain the forward or reverse primer? Thank you very much!

Nicholas_Bokulich · May 29, 2019, 12:17pm

Welcome @ranxx005!

That is indeed a bit unusual, but I think I know the explanation. The (near-) perfect 50/50 split likely indicates that your sequences are not all in the same orientation. Confirm by looking for the forward primer in R2 and the reverse primer in R1.
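
One rough way to run that check is sketched below. It assumes the primers sit at the very start of each read (no spacers or barcodes in front of them) and requires an exact match to the degenerate primer, which is stricter than what cutadapt does with an error rate. The helper names and the toy reads are made up for illustration; only the two primer sequences come from this thread.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_regex(primer):
    """Compile a degenerate primer into a regex (hypothetical helper)."""
    return re.compile("".join(f"[{IUPAC[base]}]" for base in primer.upper()))

def fraction_starting_with(reads, primer):
    """Fraction of reads whose 5' end matches the primer exactly."""
    pattern = primer_regex(primer)
    reads = list(reads)
    if not reads:
        return 0.0
    # re.match only matches at the start of the string.
    hits = sum(1 for seq in reads if pattern.match(seq.upper()))
    return hits / len(reads)

FWD = "GTGYCAGCMGCCGCGGTAA"   # 515F, as used in this thread
REV = "GGACTACNVGGGTWTCTAAT"  # 806R, as used in this thread

# Toy R1 reads (made up): two start with 515F, two start with 806R.
r1_reads = [
    "GTGCCAGCAGCCGCGGTAATACGGAGG",
    "GTGTCAGCCGCCGCGGTAATACGTAGG",
    "GGACTACAGGGGTATCTAATCCTGTTT",
    "GGACTACTCGGGTTTCTAATCCCATTC",
]

print(fraction_starting_with(r1_reads, FWD))
print(fraction_starting_with(r1_reads, REV))
```

If roughly half of R1 starts with the forward primer and the other half starts with the reverse primer, mixed orientation is the likely explanation. On real data the sequences would come from the FASTQ files (every fourth line, starting from the second) rather than from a hard-coded list.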

Mixed orientations will cause some issues in analysis, e.g., reverse-complement sequences will be identified as distinct ASVs during denoising, even though they come from the same sequence.
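
To see why, here is a small sketch with a made-up sequence. Methods that compare exact sequences treat a read and its reverse complement as two different strings, so one organism can show up as two features. The `reverse_complement` helper is written just for this illustration and only handles A, C, G, T and N.

```python
# Complement table for unambiguous bases plus N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

amplicon = "GTGCCAGCAGCCGCGGTAATACGGAGG"  # made-up sequence
flipped = reverse_complement(amplicon)

# Same molecule read from the other strand, but a different string.
print(amplicon == flipped)                      # False
# Flipping it back recovers the original.
print(amplicon == reverse_complement(flipped))  # True
```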

QIIME 2 does not currently have a way to re-orient mixed-orientation reads. I recommend finding another bioinformatics tool that can perform this re-orientation, and then importing the re-oriented reads into QIIME 2.
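
As a starting point only, the idea behind re-orienting paired-end reads can be sketched like this. For a misoriented pair, R1 starts with the reverse primer and R2 starts with the forward primer, so swapping the two mates puts the pair in the same orientation as the rest. This is not an existing QIIME 2 feature or any particular tool's method; `orient_pair` and the toy records are hypothetical, matching is exact and anchored at the read start, and whether swapped pairs behave well in downstream denoising is something to check on your own data.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_regex(primer):
    """Compile a degenerate primer into a regex (hypothetical helper)."""
    return re.compile("".join(f"[{IUPAC[base]}]" for base in primer.upper()))

FWD = primer_regex("GTGYCAGCMGCCGCGGTAA")   # 515F, as used in this thread
REV = primer_regex("GGACTACNVGGGTWTCTAAT")  # 806R, as used in this thread

def orient_pair(r1, r2):
    """Put the forward-primer read first.

    r1 and r2 are (sequence, quality) tuples for one read pair.
    Returns the pair unchanged, the pair swapped, or None when the
    primers are not found at the read starts in either arrangement.
    """
    if FWD.match(r1[0]) and REV.match(r2[0]):
        return r1, r2
    if REV.match(r1[0]) and FWD.match(r2[0]):
        return r2, r1  # swap mates so the qualities travel with their sequences
    return None

# Toy records (made up); quality strings are placeholders.
fwd_seq = "GTGCCAGCAGCCGCGGTAATACGGAGG"
rev_seq = "GGACTACAGGGGTATCTAATCCTGTTT"
fwd_read = (fwd_seq, "I" * len(fwd_seq))
rev_read = (rev_seq, "I" * len(rev_seq))

pairs = [
    (fwd_read, rev_read),                  # already in the expected orientation
    (rev_read, fwd_read),                  # misoriented: gets swapped
    (("ACGT", "IIII"), ("ACGT", "IIII")),  # no primers: returns None
]
oriented = [orient_pair(r1, r2) for r1, r2 in pairs]
print([pair is not None for pair in oriented])
```

Pairs that come back as `None` would still need a decision (discard, or retry with mismatches allowed), and the re-oriented R1 and R2 records would have to be written back out as FASTQ before import.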

Good luck! (and please let us know how you solve this — every now and then someone posts to the forum asking about mixed-orientation reads)

ranxx005 · May 29, 2019, 8:46pm

Is it OK to discard these "misoriented reads"? I have more than enough reads left after discarding them.

ranxx005 · May 29, 2019, 8:46pm

When I use cutadapt to trim the primer sequences from my reads, should I discard the reads that were not trimmed? If I choose to discard them, I lose 35% of my reads, but I still have enough reads for each sample. As I posted earlier, it was suggested that some reads might be misoriented. Is it OK if I just discard them? Will this introduce bias into the data analysis?

The command I used:
qiime cutadapt trim-paired --i-demultiplexed-sequences demux.qza --p-front-f GTGYCAGCMGCCGCGGTAA --p-front-r GGACTACNVGGGTWTCTAAT --p-error-rate 0.2 --p-discard-untrimmed True --o-trimmed-sequences demux-primer-trimmed-discard-no-trimmed.qza

Nicholas_Bokulich · May 29, 2019, 8:49pm

Sure. The orientation should be random, so I see no reason why discarding would be a problem (from a data integrity standpoint).

In your case, yes, discard. Cutadapt is a great way to figure out which reads are misoriented, since the primers are in your reads. Do not opt to keep untrimmed reads: you would then be preserving the mixed-orientation reads, which will inflate diversity estimates downstream.

ranxx005 · May 30, 2019, 2:03pm

After Deblur quality control, I do not have enough reads for one sample. I decided to go back and check the misoriented reads. I did find the reverse primer in my read 1 sequences, and plenty of them. I think I will need to re-orient those reads. I do not know how to do it yet, but I will post an update if I figure it out.

Nicholas_Bokulich · May 30, 2019, 2:22pm

An off-topic reply has been split into a new topic: what is the minimum acceptable number of reads post-denoising?

Please keep replies on-topic in the future.
