Clarification on primers, are a portion of them left in this run?

ben · September 26, 2019, 6:40pm

Hi all,

I have a couple of technical questions, including one that has come up with my sequencing core. Our samples are sequenced on an Illumina MiSeq, and the reads are sent to us straight off the sequencer, without any trimming.

I've been trying to wrap my brain around this: if the sequences come with the primers in-sequence, we have to run cutadapt to remove them. However, if the primers are not in the sequence, are they included in the header?

This may be a fundamental question, but I've been trying to make heads or tails of this situation. If you look at some of the representative sequences: https://view.qiime2.org/visualization/?type=html&src=https%3A%2F%2Fdl.dropbox.com%2Fs%2F6vnvuhobbcan521%2Frep-seqs.qzv%3Fdl%3D1

I can see that some of the sequences start with "AGCGTTA". Where is this coming from? I can't find it among the primers. I also tried reverse complementing it, but the reverse complement is not present in the primers either.
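For anyone repeating this check, here is a minimal Python sketch of looking for a motif in a primer and in the primer's reverse complement, allowing for degenerate IUPAC bases. The thread does not say which primers were used, so the 515F/806R sequences below are an assumption made purely for illustration; swap in your own primers.

```python
# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement table that also handles degenerate codes (e.g. R <-> Y, K <-> M).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a sequence that may contain IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def motif_in_primer(motif, primer):
    """True if the plain A/C/G/T motif can occur somewhere inside the primer."""
    motif, primer = motif.upper(), primer.upper()
    for start in range(len(primer) - len(motif) + 1):
        window = primer[start:start + len(motif)]
        if all(base in IUPAC[code] for base, code in zip(motif, window)):
            return True
    return False


# ASSUMPTION: these primers are not named in the thread; replace with yours.
PRIMERS = {"515F": "GTGYCAGCMGCCGCGGTAA", "806R": "GGACTACNVGGGTWTCTAAT"}
MOTIF = "AGCGTTA"  # the read prefix mentioned in the post

for name, primer in PRIMERS.items():
    for label, seq in (("as written", primer),
                       ("reverse complement", reverse_complement(primer))):
        print(name, label, motif_in_primer(MOTIF, seq))
```

If every line prints False for your own primers, the motif is not part of either primer in any orientation, which is consistent with it being amplified target sequence rather than primer.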

This may be more Illumina related, but I've been trying to wrap my head around what happens to these primers when I go through the QIIME 2 steps. Is there a resource where I can look up what's happening with them?

Ben

colinbrislawn · September 26, 2019, 7:59pm

Hello Ben,

I'm not 100% sure I understand your question, but I think I can get the conversation started.

I haven't seen primers stored in the header. Sometimes the barcode is in the FASTQ header, but the adapters are either in the read or not present at all.

The "AGCGTTA" you see is the part of the read after the adapter, which would imply the adapters are not in the read.

Correct, if and only if the primers and adapters are getting sequenced at all!

You know the adapters and primers must be getting amplified to create useful Illumina libraries, but with a little bit of cleverness during sequencing, they might never end up within your 250 bp reads!

The MiSeq uses several different primer sets during its sequencing by synthesis process. After clusters have been amplified on the flow cell using the Illumina(tm) primer, you can perform the sequencing by synthesis reading process using the exact same primer you used for initial PCR. This will begin 'reading' the DNA after the adapter region, leaving you with no primers or adapters to remove.

Clever, right?
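To make the idea concrete, here is a toy Python model of where a read begins depending on which sequencing primer is used. Every sequence in it is a made-up placeholder (not a real adapter, barcode, or primer), and the model ignores strands, annealing chemistry, and errors; it only illustrates that the read starts immediately after the site the sequencing primer binds.

```python
def simulate_read(template, sequencing_primer, read_length=250):
    """Return the bases 'read' right after the sequencing primer's binding site.

    Toy model: the binding site is found by exact string match on one strand.
    """
    site = template.find(sequencing_primer)
    if site == -1:
        raise ValueError("sequencing primer does not anneal to this template")
    start = site + len(sequencing_primer)
    return template[start:start + read_length]


# HYPOTHETICAL placeholder sequences, chosen only to be easy to tell apart.
ADAPTER = "AAAACCCC"
BARCODE = "GATTACA"
PCR_PRIMER = "GTGTGTCC"
INSERT = "TACGGAGG"  # stands in for the amplified 16S region

template = ADAPTER + BARCODE + PCR_PRIMER + INSERT

# Sequencing primer that binds in the adapter: barcode and PCR primer get read.
print(simulate_read(template, ADAPTER))

# Sequencing primer identical to the PCR primer: the read starts in the insert.
print(simulate_read(template, PCR_PRIMER))
```

In the second case there is nothing left to trim, which is the situation described above.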

Colin

ben · September 26, 2019, 8:07pm

Yes, actually sorry, I think I'm conflating two issues.

I think your explanation at the end actually makes sense. I have been wondering where these adapters/primer sections of my reads were. Essentially, I've been looking for the adapter portion to cut out the primers. The genome core told me that they come out of the sequencer untrimmed, but I think that they did not mention that the adapters were not in the sequence due to the way that Illumina does its sequencing with the primers.

Could you clarify this? How do you know that this portion is past the adapter? Is it because, when you BLAST the sequences, you see a match to 16S?

colinbrislawn · September 27, 2019, 6:33pm

I don't know for sure. It's just my guess.

So each DNA strand might look like this:

<Illumina adapter><barcode -- 16S v4 F primer><16S v4 region><16S v4 R primer><Illumina adapter>

So this portion doesn't look like the Illumina adapter, or the barcode, or the 16S v4 F primer, so I'm guessing it's the start of the 16S v4 region.

You could BLAST it and see where your read starts to align to the reference database.
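A quicker complementary check, before BLASTing anything, is to count how many reads actually begin with the forward primer. The Python sketch below does that with a regular expression built from the primer's IUPAC codes. The primer (515F) and the two example reads are assumptions for illustration only; the thread does not name the primers, and `read_sequences` assumes a standard four-lines-per-record FASTQ file.

```python
import gzip
import re
from itertools import islice

# IUPAC codes translated to regex character classes.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_to_regex(primer):
    """Compile a (possibly degenerate) primer into a regex pattern."""
    return re.compile("".join(IUPAC_REGEX[base] for base in primer.upper()))


def read_sequences(fastq_path, limit=10000):
    """Yield up to `limit` sequence lines from a FASTQ file (plain or .gz)."""
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(islice(handle, 4 * limit)):
            if i % 4 == 1:  # second line of each 4-line record is the sequence
                yield line.strip()


def fraction_starting_with(seqs, primer):
    """Fraction of sequences whose first bases match the primer exactly."""
    pattern = primer_to_regex(primer)
    seqs = list(seqs)
    if not seqs:
        return 0.0
    hits = sum(1 for seq in seqs if pattern.match(seq.upper()))
    return hits / len(seqs)


# ASSUMPTION: 515F is used here only as an example forward primer.
FORWARD_PRIMER = "GTGYCAGCMGCCGCGGTAA"
# Made-up toy reads: the first starts with the primer, the second does not.
toy_reads = ["GTGCCAGCAGCCGCGGTAATACGGAGG", "TACGGAGGGTGCAAGCGTTAATCGG"]
print(fraction_starting_with(toy_reads, FORWARD_PRIMER))
```

On real data you would call something like `fraction_starting_with(read_sequences("forward.fastq.gz"), FORWARD_PRIMER)`, where the file name is a placeholder. A fraction near zero suggests the primers were never sequenced, while a fraction near one suggests they are in the reads and should be trimmed with cutadapt. The match is exact, so sequencing errors in the primer region will lower the fraction somewhat.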

Colin

ben · September 27, 2019, 6:46pm

To be fair, I blasted several, all of them were 99-100% matches for bacterial 16S without adapters/primers found. Thank you for the insight. Ben
