I have a couple of technical questions, one of which has come up with my sequencing core. Our samples are sequenced on an Illumina MiSeq, and the reads are sent to us straight off the sequencer without any trimming.
I’ve been trying to wrap my brain around this: if the primers are in the reads, we have to run cutadapt and trim at the adapter sequence. However, if the primers are not in the reads, are they included in the header instead?
I can see that some of the sequences start with “AGCGTTA”. Where is this coming from? I can’t find it among the primers. I also tried reverse complementing it, but the reverse complement is not present either.
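For reference, the kind of check I mean can be sketched roughly as below. This is only a minimal sketch: the primer names and sequences in the usage comment are hypothetical placeholders, not my real primers, and the helpers `reverse_complement`, `motif_in_primer` and `find_motif` are names made up for illustration. It looks for the motif in each primer and in the primer's reverse complement, treating degenerate IUPAC bases in the primer as wildcards.

```python
# IUPAC nucleotide codes mapped to the concrete bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement table that also handles degenerate codes (e.g. R <-> Y).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def motif_in_primer(motif, primer):
    """True if the concrete motif matches some window of the primer."""
    motif, primer = motif.upper(), primer.upper()
    for start in range(len(primer) - len(motif) + 1):
        window = primer[start:start + len(motif)]
        # A motif base matches if the primer's code at that position allows it.
        if all(m in IUPAC[p] for m, p in zip(motif, window)):
            return True
    return False


def find_motif(motif, primers):
    """List (primer name, orientation) pairs in which the motif occurs."""
    hits = []
    for name, primer in primers.items():
        if motif_in_primer(motif, primer):
            hits.append((name, "forward"))
        if motif_in_primer(motif, reverse_complement(primer)):
            hits.append((name, "reverse_complement"))
    return hits


# Usage with hypothetical placeholder primers (substitute the real ones):
# primers = {"fwd_placeholder": "ACGTNACGT", "rev_placeholder": "TTGGCCAAR"}
# print(find_motif("AGCGTTA", primers))
```

An empty list from `find_motif` means the motif is in neither orientation of any primer, which is what I am seeing.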
This may be more Illumina-related, but I’ve been trying to wrap my head around what happens to these primers when I go through the QIIME2 steps. Is there a resource where I can look up what’s happening with them?
I'm not 100% sure I understand your question, but I think I can get the conversation started.
I haven't seen this. Sometimes the barcode is in the fastq header, but the adapters are either in the read or not present at all.
This is the part of the read after the adapter, which would imply the adapters are not in the read.
Correct, if and only if the primers and adapters are getting sequenced at all!
You know the adapters and primers must be getting amplified to create useful Illumina libraries, but with a little bit of cleverness during sequencing, they might never end up within your 250 bp reads!
The MiSeq uses several different primer sets during its sequencing-by-synthesis process. After clusters have been amplified on the flow cell using the Illumina(tm) primer, you can perform the sequencing-by-synthesis read using the exact same primer you used for the initial PCR. The read then begins after the adapter and primer region, leaving you with no primers or adapters to remove.
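One way to see which situation applies to a given run is to measure how many reads begin with the amplification primer. The following is only a rough sketch under some assumptions: the file name and primer in the usage comment are hypothetical placeholders, the helper names are made up for illustration, and it assumes the primer, if present, sits at the very start of the read (libraries built with variable-length spacers would need a more flexible search, such as the one cutadapt performs).

```python
import gzip
import itertools

# IUPAC nucleotide codes mapped to the concrete bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def read_sequences(path, limit=10000):
    """Yield up to `limit` sequences from a FASTQ file (gzipped or plain)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # In a four-line FASTQ record the sequence is the second line.
        seq_lines = itertools.islice(handle, 1, None, 4)
        for line in itertools.islice(seq_lines, limit):
            yield line.strip().upper()


def starts_with_primer(read, primer, max_mismatches=1):
    """True if the read begins with the primer, allowing a few mismatches."""
    primer = primer.upper()
    if len(read) < len(primer):
        return False
    # A read base mismatches when the primer's IUPAC code does not allow it.
    mismatches = sum(base not in IUPAC[p] for base, p in zip(read, primer))
    return mismatches <= max_mismatches


def primer_fraction(reads, primer, max_mismatches=1):
    """Fraction of reads that begin with the primer (0.0 if no reads)."""
    total = hits = 0
    for read in reads:
        total += 1
        hits += starts_with_primer(read, primer, max_mismatches)
    return hits / total if total else 0.0


# Usage with a hypothetical file name and placeholder primer:
# reads = read_sequences("sample_R1.fastq.gz")
# print(primer_fraction(reads, "ACGTNACGT"))
```

A fraction near 1 would suggest the primers are in the reads and still need trimming; a fraction near 0 would be consistent with reads that start after the primer.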
Yes, actually sorry, I think I'm conflating two issues.
I think your explanation at the end actually makes sense. I have been wondering where the adapter/primer sections of my reads were. Essentially, I've been looking for the adapter portion so that I could cut out the primers. The genome core told me that the reads come out of the sequencer untrimmed, but I think they did not mention that the adapters are not in the reads because of the way Illumina sequencing uses the primers.
Could you clarify this? How do you know that this portion is past the adapter? Is it because when you BLAST the sequences you see a match to 16S?