Forward and Reverse primer removal

Brigitta1 · July 22, 2022, 9:03am

Hello,
I used QIIME 2 to analyze my DNA sequences. I noticed that, unlike in QIIME 1, no step required me to enter the forward and reverse primer sequences, except when I downloaded the Greengenes database for the F515 and R806 primers. Does QIIME 2 automatically remove forward and reverse primers from my DNA sequences during the analysis, or were the primers already removed from my DNA sequences?

Thank you in advance,
Brigitta

llenzi · July 22, 2022, 10:10am

Hi @Brigitta1,

No, QIIME 2 does not automatically remove the primers; it is agnostic of the primers used in your experiment!
For this, you can use the q2-cutadapt plugin:

https://docs.qiime2.org/2022.2/plugins/available/cutadapt/trim-single/

I usually run this as the first step, to avoid possible problems at the denoising stage. However, any point before the taxonomy assignment step should do!

Hope it helps
Luca

colinbrislawn · July 22, 2022, 1:18pm

One other quick detail:

Some 16S sequencing protocols, including the EMP protocol, do not sequence the primers, so they are not there to be removed. If you run q2-cutadapt and nothing changes, check the wet-lab methods to see if your reads should include primers at all.

Brigitta1 · July 25, 2022, 3:56am

Thank you @llenzi and @colinbrislawn for your responses.

I went through the q2-cutadapt plugin. However, I didn't quite understand how to use it. Before I was aware that my sequences could still have adapters and primers attached, I analyzed them and performed the taxonomic classification step. Could my results be wrong?

I used the following command to denoise my data:
qiime dada2 denoise-pyro \
--i-demultiplexed-seqs single-end-demux.qza \
--p-trim-left 15 \
--p-trunc-len 170 \
--o-representative-sequences rep-seqs-dada2.qza \
--o-table table-dada2.qza \
--o-denoising-stats stats-dada2.qza
I used trim-left 15 because I read somewhere on the forum that it is necessary for Ion Torrent data. Do you think that setting would have removed the adapters, if there were any?

Best Regards,
Brigitta

llenzi · July 25, 2022, 8:33am

Hi @Brigitta1,

As a general point, it is advisable to remove PCR primers before any taxonomic assignment, because any degenerate bases in the primer (e.g. 'Y', 'N', etc.) could have an impact on the assignment.

I am not familiar with library preparation for IonTorrent sequencing, so I am not sure in your case. If your sequences start with the PCR primer, which may not be the case if any bases from the IonTorrent sequencing adapter precede it, your trimming parameters may already get rid of it. How long is your forward PCR primer?
On the other hand, is the reverse PCR primer contained in your sequences? How long is your expected amplicon? Do the IonTorrent reads cover the whole expected length, or do they end earlier? Reading your dada2 settings, I expect the reverse PCR primer is removed, because I guess 170 bp is shorter than your expected amplicon length!
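
To make the arithmetic behind these questions concrete, here is a minimal Python sketch (plain Python, not part of QIIME 2 or DADA2). It assumes that trunc-len cuts each read at that position and trim-left then drops that many bases from the 5' end. The values 15 and 170 come from the denoise-pyro command above; the 19-base primer and the 23-base adapter in front of it are hypothetical numbers chosen only for illustration.

```python
def kept_length(trim_left, trunc_len):
    """Read length left after truncating at trunc_len and trimming trim_left bases."""
    return trunc_len - trim_left


def leftover_primer_bases(primer_len, trim_left, primer_start=0):
    """Number of primer bases that survive removal of the first trim_left bases.

    primer_start is the 0-based position where the primer begins in the read
    (greater than 0 when adapter bases precede it).
    """
    primer_end = primer_start + primer_len
    return max(0, primer_end - max(trim_left, primer_start))


# Values taken from the denoise-pyro command in this thread.
TRIM_LEFT, TRUNC_LEN = 15, 170
# Hypothetical values, for illustration only.
PRIMER_LEN, ADAPTER_LEN = 19, 23

print("bases kept per read:", kept_length(TRIM_LEFT, TRUNC_LEN))
print("primer bases left (primer at read start):",
      leftover_primer_bases(PRIMER_LEN, TRIM_LEFT))
print("primer bases left (adapter precedes primer):",
      leftover_primer_bases(PRIMER_LEN, TRIM_LEFT, primer_start=ADAPTER_LEN))
```

Under these assumed numbers, trimming 15 bases would not fully remove a primer in either layout, which is why the actual primer and adapter lengths matter.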

Coming back to q2-cutadapt, did you try to use it? If so, with what command? Did you get any errors?

Hope it helps
Luca

Brigitta1 · July 25, 2022, 9:37am

Hi @llenzi ,
The primers used in the amplification step are universal primers (F515 and R806). However, I am not sure of the adapters that were used. Do you think I have to trim my reads?

I haven't used any q2-cutadapt command on my sequences yet. If that step is really necessary, I was hoping to use the following command.
$ qiime cutadapt trim-single \
--i-demultiplexed-sequences demultiplexed-seqs.qza \
--p-front GCTACGGGGGG \
--p-error-rate 0 \
--o-trimmed-sequences trimmed-seqs.qza \
--verbose
Could I have your opinion on it?

Thank you in advance,
Brigitta

llenzi · July 25, 2022, 10:03am

Hi @Brigitta1,
you can try the cutadapt command, so at least you will know whether there are any primer sequences in your data.
The command looks OK at first glance; my only question is about the primer sequence. It looks a bit too short to me, and for a universal primer it has fewer degenerate bases than I would expect.
I did a quick search and found the following as F515:
CACGGTCGKCGGCGCCATT

Could you confirm the sequence?
Cheers
Luca

Brigitta1 · July 25, 2022, 10:45am

Hi @llenzi
You are right about the sequence of the primer. The sequence I used in the command was only an example. Not the real sequence. I'll try the cutadapt command.

But I have a small doubt. I know for a fact that my sequences were demultiplexed, because they were already sorted into their respective samples. I used the primers F515 and R806 to amplify the 16S V4 region of bacteria. What effect will the primers have on the taxonomy classification step?

Thank you,
Brigitta

llenzi · July 25, 2022, 4:07pm

Hi @Brigitta1,

I guess the effect of primers left in the sequences may vary depending on which tool you use for the taxonomic assignment and which settings you use. However, any artificial sequence may create spurious similarity, which should be considered a false-positive match.

As for me, I like to remove any non-biological sequence as the very first step, so I know it has no effect downstream. For example, the presence of primers at the denoising step may result in two alternative ASVs for a given amplicon, which would not have been produced if the primers had been removed (remember that sequences differing by a single base are called as separate ASVs by dada2).
Although these ASVs may still be assigned to the same species, you will end up overestimating the alpha diversity for that sample.
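
As a toy illustration of that point, the Python sketch below counts distinct read strings before and after removing a primer. It only compares exact strings and does not reproduce dada2's error model. The primer is assumed to be 19 bases long with a single degenerate position (M, meaning A or C), and the "biological" sequence is invented for the example.

```python
def unique_sequences(reads, trim_left=0):
    """Count distinct sequences after dropping the first trim_left bases of each read."""
    return len({read[trim_left:] for read in reads})


# Assumed primer with one degenerate base (M = A or C), expanded into its two variants.
PRIMER_VARIANTS = ["GTGCCAGC" + base + "GCCGCGGTAA" for base in "AC"]

# Invented stand-in for the amplified region; not real data.
BIOLOGICAL = "TACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAG"

# Two reads from the same template that differ only inside the primer.
reads = [variant + BIOLOGICAL for variant in PRIMER_VARIANTS]

print("distinct sequences with primer:", unique_sequences(reads))
print("distinct sequences without primer:",
      unique_sequences(reads, trim_left=len(PRIMER_VARIANTS[0])))
```

With the primer attached the two reads count as two different sequences; once the primer is removed they collapse into one.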
A thread you may want to look at is the following:

(which I totally forgot I was involved in ...).
Hope it helps
Luca

Brigitta1 · July 25, 2022, 4:22pm

Thank you so much @llenzi

I just got my primer details. The same forward primer (515F primer and adapter) was used for all the 26 samples and it is 42 bases long. The reverse primer (806R primer, adapter, and barcode) differs from one sample to another and they are all 63 bases long. How am I supposed to trim this off?
The following command will not work, as it trims only the bases at the front.
$ qiime cutadapt trim-single \
--i-demultiplexed-sequences demultiplexed-seqs.qza \
--p-front ***** \
--p-error-rate 0 \
--o-trimmed-sequences trimmed-seqs.qza \
--verbose
How can I make a mapping file that tells QIIME to trim the specified bases from the end of each read? Would this command be alright?
qiime cutadapt trim-single \
--i-demultiplexed-sequences single-end-demux1.qza \
--p-adapter CCATCTCATCCCTGCGTGTCTCCGACTCAG \
--p-front CCTCTCTATGGGCAGTCGGTGATGTGCCAGCMGCCGCGGTAA \
--p-error-rate 0.1 \
--o-trimmed-sequences trimmed-seqs.qza \
--verbose

I also noticed that the complements of my sequences in the FASTA file do not match the primer sequences. Could this mean that my primers have already been removed? How can I check this?

Please be kind enough to help me figure this out @llenzi @colinbrislawn
Thank you in advance

colinbrislawn · July 26, 2022, 3:12pm

Try passing just the primer and adapter without the barcode. After dropping the barcode, I think the primer+adapter should be the same for all samples, so it will work with cutadapt.
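
One detail for trimming at the end of a single-end read: a primer that the read runs into at its 3' end appears as the reverse complement of the oligo as ordered, so that is the string a 3' option such as --p-adapter would need. Here is a minimal Python sketch of that conversion, including IUPAC degenerate codes. It assumes the last 19 bases of the --p-front value posted above are the primer part; which end of your reads that primer actually sits on depends on the library design, which this thread does not settle.

```python
# IUPAC nucleotide codes and their complements.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq):
    """Reverse-complement a DNA sequence that may contain IUPAC degenerate codes."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# The --p-front value from the command posted above in this thread.
FRONT_CONSTRUCT = "CCTCTCTATGGGCAGTCGGTGATGTGCCAGCMGCCGCGGTAA"
# Assumption: the last 19 bases are the primer, the rest is adapter.
PRIMER = FRONT_CONSTRUCT[-19:]

print("primer as ordered:      ", PRIMER)
print("as seen at a 3' end:    ", reverse_complement(PRIMER))
print("whole construct, 3' end:", reverse_complement(FRONT_CONSTRUCT))
```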

Interesting! Can you show us an example?
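
One way to produce such an example, and to check whether primers are still present, is to count how many reads contain the primer while treating degenerate bases as wildcards. Below is a minimal Python sketch, not a QIIME 2 feature. The primer is assumed to be the last 19 bases of the --p-front value posted above, and the three reads are invented for illustration; in practice you would load sequences from your exported FASTA or FASTQ file.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a primer with degenerate bases into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def primer_report(reads, primer):
    """Count reads that start with the primer and reads that contain it anywhere."""
    pattern = primer_regex(primer)
    return {
        "reads": len(reads),
        "at_start": sum(1 for read in reads if pattern.match(read)),
        "anywhere": sum(1 for read in reads if pattern.search(read)),
    }


# Assumed primer (last 19 bases of the --p-front value in this thread).
PRIMER = "GTGCCAGCMGCCGCGGTAA"

# Invented reads, for illustration only.
example_reads = [
    "GTGCCAGC" + "A" + "GCCGCGGTAA" + "TACGGAGG",           # primer at the start
    "ACGT" + "GTGCCAGC" + "C" + "GCCGCGGTAA" + "TACGGAGG",  # primer after 4 extra bases
    "TACGGAGGGTGCAAGCGTTAATCGG",                            # no primer
]

print(primer_report(example_reads, PRIMER))
```

Running the same check with the reverse complement of the primer covers the other orientation. If almost no reads match either way, that suggests the primers were removed upstream or never sequenced.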

system · August 26, 2022, 9:13pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.