Removing primers and adapters

Hello,
I am still confused after going through various discussions on the QIIME 2 forum. I am using Illumina MiSeq 2*300 bp reads, and the hypervariable region is V3-V4. I got the following information from the sequencing center, and they mentioned that they did not remove any primers or adapters:
"Nextera-compatible primer design:

Forward primer:

TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-[locus-specific]

Reverse primer:

GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-[locus-specific]

The indexing primers are as follows (we have a large set of these that allows multiplexing of >400 samples). This step adds both the index and the flowcell adapters. [i5] and [i7] refer to the index sequence codes used by Illumina. The p5 and p7 flow cell adapters are in bold.

Forward indexing primer: AATGATACGGCGACCACCGAGATCTACAC[i5]TCGTCGGCAGCGTC

Reverse indexing primer: CAAGCAGAAGACGGCATACGAGAT[i7]GTCTCGTGGGCTCGG

Nextera adapter sequences (for post-run trimming):

Read 1: CTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG

Read 2:

CTGTCTCTTATACACATCTGACGCTGCCGACGANNNNNNNNGTGTAGATCTCGGTGGTCGCCGTATCATT"

After importing, I used the following command to remove the primers:

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences paired-end-demux.qza \
  --p-adapter-f "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG" \
  --p-adapter-r "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG" \
  --o-trimmed-sequences trimmed-demux.qza

  1. Am I using the correct command for removing the primers?
  2. Do I need to remove the adapters first and then the primers, or both at the same time?

From the discussion board, I noticed that my primers are different, even though I have the same V3-V4 region and 2*300 bp reads.

After getting the demux.qzv file, I used the following command for trimming and denoising:

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs paired-end-demux.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 298 \
  --p-trunc-len-r 298 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza

Only around 7 to 15% of the sequences merged, which is really low.
I have attached the demux.qzv file for reference.

I would be grateful if I could get some suggestions here.
Thank you all!

Hi @umanand,

Only provide the PCR primers, not the entire adapter / primer construct. See:

-Mike


Hi Mike,
I am sorry, I do not follow. Do you mean I should use the forward and reverse indexing primers?

Thank you!
Urmila

No, the actual PCR primers you used for amplification of the V3V4 region. I suspect they might be one of these two primer sets:

| Region | Primer pair name | Reference | Fwd primer (5' - 3') | Rev primer (5' - 3') |
|---|---|---|---|---|
| V3V4 | 341F/357wF - 805R | Herlemann et al. 2011 | CCTACGGGNGGCWGCAG | GACTACHVGGGTATCTAATCC |
| V3V4 | 341F/357wF - 806R | Lemons et al. 2017 | CCTACGGGNGGCWGCAG | GGACTACHVGGGTWTCTAAT |

This assumes the sequencing protocol sequences through the primer, i.e. the primer is present at the 5' end of your reads.

-Mike

Thank you Mike! I will try using this primer.

Sincerely,
Urmila


I forgot to mention, use the --p-front-f and --p-front-r flags... not the --p-adapter-f and --p-adapter-r flags. See the help text for more information.

Also, I recommend contacting your sequencing facility to inquire as to which PCR primers were used to amplify your V3V4 region, and use those.

Hi Mike,
When I asked about primers, they sent me this information

"Nextera-compatible primer design:
Forward primer:
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-[locus-specific]
Reverse primer:
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-[locus-specific]
The indexing primers are as follows (we have a large set of these that allows multiplexing of >400 samples). This step adds both the index and the flowcell adapters. [i5] and [i7] refer to the index sequence codes used by Illumina. The p5 and p7 flow cell adapters are in bold.
Forward indexing primer: AATGATACGGCGACCACCGAGATCTACAC[i5]TCGTCGGCAGCGTC
Reverse indexing primer: CAAGCAGAAGACGGCATACGAGAT[i7]GTCTCGTGGGCTCGG

Nextera adapter sequences (for post-run trimming):

Read 1: CTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG

Read 2:
CTGTCTCTTATACACATCTGACGCTGCCGACGANNNNNNNNGTGTAGATCTCGGTGGTCGCCGTATCATT"

Thank you!
Urmila

Hi @umanand,

I think there is some confusion here. There are two types of primers when generating amplicon sequencing data: the primers used to amplify your target, i.e. the V3V4 primers, and the primers used to sequence your amplicons (the Illumina primers).

Did your lab perform the PCR, or did the sequencing facility? If the sequencing facility did the entire sample preparation, i.e. PCR and sequencing, then they should be able to tell you which PCR primers they used to amplify the V3V4 region.

Note, you need to be very specific when you ask them about the primers. That is, ask them which PCR primers they used to amplify the V3V4 region. Usually the customer, i.e. your lab, requests which region to amplify. Many sequencing facilities provide a list of the primers that they use (unless you provided them), or, if they use proprietary primers, they will tell you how long the primers are so that you can trim them off.

-Mike

Hi Mike,
Thank you for the suggestions. The sequencing facility did the PCR and everything else.
One more question: do I need to remove the adapters along with the primers?

Thank you!

Hi Mike,
The sequencing facility provided this information when I asked which PCR primers were used to amplify the V3V4 region.

| Primer name | Marker gene | Target region | Sequence |
|---|---|---|---|
| V3_357F_Nextera | 16S rRNA | V3-V4, V3-V5 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGAGGCAGCAG |
| V4_806R_Nextera | 16S rRNA | V3-V4, V4 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGGACTACHVGGGTWTCTAAT |

Do I need to use the sequences here as the primers?

Generally, for 16S rRNA gene sequences, you do not need to, as the adapters usually sit before the primer at the 5' end. The only time you may need to worry about this is for very short marker genes where you "read through" the 3' end in a single read. In that case, you can use the approach outlined here, where you'd simply add the reverse complement of the opposite primer and/or adapter. Again, most people do not need to do this for 16S rRNA gene sequences.
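For anyone who does hit the read-through case, here is a minimal sketch of computing the reverse complement of a degenerate primer. It is only an illustration: the helper name `reverse_complement` is made up for this post, and the two primers are the Herlemann et al. pair from the table earlier in the thread, not necessarily the ones in your data.

```python
# Complements for the standard IUPAC nucleotide codes, including degenerate ones.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an IUPAC nucleotide sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# Example primers taken from the table above (assumed, adjust to your own).
forward_primer = "CCTACGGGNGGCWGCAG"
reverse_primer = "GACTACHVGGGTATCTAATCC"

# In a read-through, the forward read can end in the reverse complement of the
# reverse primer, and the reverse read in that of the forward primer.
print("3' end of forward reads:", reverse_complement(reverse_primer))
print("3' end of reverse reads:", reverse_complement(forward_primer))
```

The printed sequences are what you would look for at the 3' end of each read if the amplicon were shorter than the read length.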

No. This is the full Illumina construct.

If you look closely at this construct, you'll see that the primers I assumed they were using are contained within it. I've split them off into their own column:

| Primer name | Marker gene | Target region | Illumina Sequence | PCR Primer |
|---|---|---|---|---|
| V3_357F_Nextera | 16S rRNA | V3-V4, V3-V5 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | CCTACGGGAGGCAGCAG |
| V4_806R_Nextera | 16S rRNA | V3-V4, V4 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | GGACTACHVGGGTWTCTAAT |

So, you want to use the sequences under the PCR Primer column for the cutadapt flags --p-front-f and --p-front-r respectively.
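If you want to double-check the split yourself, a small sketch like the following strips the Nextera overhang from each full construct and leaves the locus-specific PCR primer. The sequences are copied from the facility's information quoted in this thread; the function name `locus_specific_part` is just a hypothetical helper for illustration.

```python
# Nextera overhangs as quoted from the sequencing facility in this thread.
FWD_OVERHANG = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REV_OVERHANG = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

# Full constructs (overhang + locus-specific primer) and the overhang each should carry.
constructs = {
    "V3_357F_Nextera": (
        "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGAGGCAGCAG",
        FWD_OVERHANG,
    ),
    "V4_806R_Nextera": (
        "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGGACTACHVGGGTWTCTAAT",
        REV_OVERHANG,
    ),
}


def locus_specific_part(construct: str, overhang: str) -> str:
    """Strip the Illumina overhang from the 5' end and return the PCR primer."""
    if not construct.startswith(overhang):
        raise ValueError("construct does not start with the expected overhang")
    return construct[len(overhang):]


pcr_primers = {
    name: locus_specific_part(construct, overhang)
    for name, (construct, overhang) in constructs.items()
}

for name, primer in pcr_primers.items():
    print(name, primer)
```

The two sequences it prints are the ones to pass to --p-front-f and --p-front-r.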

Hi Mike,
Thank you so much for the clarification. I really appreciate your help.
Sincerely,
Urmila


Hi Mike,
Do I need to specify each primer twice, as in the link you provided before?

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences paired-end-demux.qza \
  --p-cores 4 \
  --p-front-f CCTACGGGNGGCWGCAG GACTACHVGGGTATCTAAKCC \
  --p-front-r GACTACHVGGGTATCTAAKCC CCTACGGGNGGCWGCAG

Thank you!

No. Like I said, you do not need to do this for your marker gene.

I was only mentioning this strategy for your general knowledge, if you ever make use of short marker genes. For example, if you were to sequence less than 100 bp.

Hi Mike,
I found multiple commands for removing primers, and I am not sure which one to follow (I will substitute my own primers).

  1. qiime cutadapt trim-paired \
    --i-demultiplexed-sequences paired-end-demux.qza \
    --p-cores 4 \
    --p-front-f CCTACGGGNGGCWGCAG \
    --p-front-r GACTACHVGGGTATCTAAKCC \
    --p-overlap 3 \
    --p-error-rate 0.1 \
    --p-match-read-wildcards \
    --p-match-adapter-wildcards \
    --p-discard-untrimmed \
    --o-trimmed-sequences primer-trimmed-demux-2.qza \
    --verbose > cutadapt-log-2.txt

  2. qiime cutadapt trim-paired \
    --i-demultiplexed-sequences paired-end-demux.qza \
    --p-front-f CCTACGGGNGGCWGCAG \
    --p-front-r GACTACHVGGGTATCTAAKCC \
    --o-trimmed-sequences primer-trimmed-demux.qza

However, I used this command to remove the primers:

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences paired-end-demux.qza \
  --p-front-f CCTACGGGAGGCAGCAG \
  --p-front-r GGACTACHVGGGTWTCTAAT \
  --p-match-adapter-wildcards \
  --p-match-read-wildcards \
  --p-discard-untrimmed \
  --o-trimmed-sequences trimmed-demux.qza

Could you please tell me if I am using the correct code? I would appreciate your help.

Thank you!
Urmila

However, I saw this code also

Hi @umanand,

Assuming you are using a recent version of QIIME 2 (2024.2, or any of the 2023 versions), I'd keep it simple and use the default settings with the addition of --p-discard-untrimmed. This flag removes any read pairs in which the primers cannot be detected. This is a nice form of quality control, and it ensures all your output is trimmed, minimizing spurious read lengths in the output.

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences paired-end-demux.qza \
  --p-cores 4 \
  --p-front-f CCTACGGGNGGCWGCAG \
  --p-front-r GACTACHVGGGTATCTAAKCC \
  --p-discard-untrimmed \
  --o-trimmed-sequences primer-trimmed-demux.qza \
  --verbose

Remember, if you add --help at the end of any command, the help text will appear. For example, if you run qiime cutadapt trim-paired --help you can see the default settings and the other parameter options.
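To make the effect of --p-discard-untrimmed more concrete, here is a rough sketch of the idea in plain Python. It is deliberately simplified and is not how cutadapt works internally: it only accepts a primer that sits exactly at the start of the read, with no mismatches or partial matches, whereas cutadapt tolerates errors according to its error-rate setting. The function names and the two read pairs are made up for illustration.

```python
# Bases allowed by each IUPAC code in the primer.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def starts_with_primer(read: str, primer: str) -> bool:
    """True if the read begins with a sequence compatible with the degenerate primer."""
    if len(read) < len(primer):
        return False
    return all(base in IUPAC[code] for base, code in zip(read, primer))


def trim_pairs(pairs, fwd_primer, rev_primer):
    """Trim primers from read pairs; drop any pair where either primer is missing."""
    kept = []
    for r1, r2 in pairs:
        if starts_with_primer(r1, fwd_primer) and starts_with_primer(r2, rev_primer):
            kept.append((r1[len(fwd_primer):], r2[len(rev_primer):]))
    return kept


FWD = "CCTACGGGNGGCWGCAG"
REV = "GACTACHVGGGTATCTAAKCC"

# Synthetic read pairs (invented): the first carries both primers, the second does not.
pairs = [
    ("CCTACGGGAGGCAGCAG" + "TGGGGAATATT", "GACTACCAGGGTATCTAATCC" + "TGTTTGCTCC"),
    ("TTTTTTTTTTTTTTTTTTTTTTTTTTTT", "GACTACCAGGGTATCTAATCC" + "TGTTTGCTCC"),
]

kept = trim_pairs(pairs, FWD, REV)
print(f"kept {len(kept)} of {len(pairs)} pairs")
```

Only the pair in which both primers are found survives, and both of its reads start right after the primer, which is why every read in the output has had its primer removed.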

Hi Mike,
Thank you so much! I am grateful for your help!
