Hi
I imported my paired-end demultiplexed data. Now I want to denoise it, but I don't know how to remove the primers.
I checked the forward FASTQ files: nearly all reads start with CCTACGGG (a few start with N).
In the reverse files, on the other hand, I couldn't find any repeated sequence at the end of the reads. My first question is: how can I be sure I have identified the primer correctly?
Second, what is the best way to remove the primers: DADA2 or the cutadapt plugin? And should I remove the primers from the forward and reverse reads separately, given that the demux file contains paired-end data?
I have attached the interactive quality plot from paired-end-demux.qzv in case it is useful.
Thank you
Hi @mohsen_ej,
You'll want to use the trim-paired action of the q2-cutadapt plugin to remove the primers from both your forward and reverse reads. Each primer sits at the 5' end of its respective read, which is why you don't see any repeated pattern at the 3' end of your reads.
By default, primers that are found (i.e. still intact) are removed, and reads in which no primer is found are left untouched. The ambiguous N nucleotides will be taken care of during denoising: reads containing an N are dropped.
Thank you.
Does it find the primers automatically, or do I need to supply them?
I ran this command:
qiime cutadapt trim-paired \
--i-demultiplexed-sequences paired-end-demux.qza \
--o-trimmed-sequences paired-end-demux-trimmed.qza
I'm not sure I did it correctly, because after converting the result to a .qzv file I don't see much change in the interactive plot.
How can I tell whether I did it properly?
Also, is it possible for the forward reads to reach 297 nt while the reverse reads do not? You can see this in the picture.
thank you
Hi, @mohsen_ej!
Cutadapt won't know about your primers unless you specify them using the appropriate parameters. You will find the answer to your question by reading the help text for the cutadapt trim-paired command, which @Mehrbod_Estaki linked to above. You can also view this information by typing qiime cutadapt trim-paired --help in your terminal.
After removing your primers, you can then use qiime demux summarize to visualize the results.
Let us know how that goes!
Thank you very much for your response.
As I am new to QIIME, could you please give me an example?
I have read the cutadapt help, but I am not sure I understood it correctly.
The primers that were used are:
16S Amplicon PCR Forward Primer = 5'
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG
16S Amplicon PCR Reverse Primer = 5'
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC
They also include the Illumina overhang adapters:
Forward overhang: 5' TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-[locus-specific sequence]
Reverse overhang: 5' GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-[locus-specific sequence]
I'm not sure which parameters I should use in the cutadapt command:
--p-adapter, front, anywhere, or ...? What should the command look like?
Really sorry if I'm asking a simple question.
Hi @mohsen_ej, if you search the forum, you'll come across many examples of how to use cutadapt. There are quite a variety of ways to leverage this tool.
You'll likely just want to specify the locus-specific primer sequences, not the entire construct. That is, your primers are likely these (everything after the ...GAGACAG):
--p-front-f CCTACGGGNGGCWGCAG \
--p-front-r GACTACHVGGGTATCTAATCC \
Here are a couple to get you started:
-Mike
Thank you for your response.
That was great.
So you mean I don't need to include the overhang at all? Just the specific primer, i.e. everything after GAGACAG, for both the forward and reverse reads?
I'm asking because I want to be sure I understood correctly.
Thank you
Yes, see the example command options I provided.
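To make the "everything after ...GAGACAG" rule concrete, here is a minimal Python sketch (not part of QIIME 2 or cutadapt) that strips the Illumina overhang from the full primer constructs quoted earlier in this thread. The helper name locus_specific is made up for illustration.

```python
# Illumina overhang adapters and full primer constructs, as quoted in this thread.
FORWARD_OVERHANG = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REVERSE_OVERHANG = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"
FORWARD_CONSTRUCT = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG"
REVERSE_CONSTRUCT = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC"


def locus_specific(construct: str, overhang: str) -> str:
    """Return the part of a primer construct that follows the overhang.

    Hypothetical helper: it only removes a known prefix and raises an
    error if the construct does not start with that prefix.
    """
    if not construct.startswith(overhang):
        raise ValueError("construct does not begin with the given overhang")
    return construct[len(overhang):]


forward_primer = locus_specific(FORWARD_CONSTRUCT, FORWARD_OVERHANG)
reverse_primer = locus_specific(REVERSE_CONSTRUCT, REVERSE_OVERHANG)

# These are the sequences to pass to --p-front-f and --p-front-r.
print("--p-front-f", forward_primer)
print("--p-front-r", reverse_primer)
```

The forward result begins with CCTACGGG, the same bases reported at the start of the forward reads, which is a quick sanity check that the locus-specific part is what actually shows up in the data.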
Thank you, I read them.
I ran this command:
qiime cutadapt trim-paired \
--i-demultiplexed-sequences paired-end-demux.qza \
--p-front-f CCTACGGGNGGCWGCAG \
--p-front-r GACTACHVGGGTATCTAATCC \
--o-trimmed-sequences paired-end-demux-trimmed.qza
How can I be sure that I did it correctly?
Thank you very much
Hi @mohsen_ej,
Thank you for attaching the QZVs. You'll want to add the following flags, which were also mentioned in the posts I linked above:
--p-match-adapter-wildcards --p-match-read-wildcards --p-discard-untrimmed
You'll likely not need --p-match-read-wildcards, but it does not hurt to throw it in.
These flags allow cutadapt to match the IUPAC codes in your primers (e.g. W, V, N, ...) against the reads, and to discard any sequences in which it could not find both primers. The latter ensures you only keep sequences that were trimmed.
So your full command should be:
qiime cutadapt trim-paired \
--i-demultiplexed-sequences paired-end-demux.qza \
--p-front-f CCTACGGGNGGCWGCAG \
--p-front-r GACTACHVGGGTATCTAATCC \
--p-match-adapter-wildcards \
--p-match-read-wildcards \
--p-discard-untrimmed \
--o-trimmed-sequences paired-end-demux-trimmed.qza
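For intuition about what wildcard matching means, below is a simplified Python sketch of an IUPAC-aware prefix check. It is only an illustration: cutadapt itself also tolerates a small number of mismatches and searches more flexibly, and the function names and example reads here are hypothetical.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def starts_with_primer(read: str, primer: str, read_wildcards: bool = True) -> bool:
    """True if the read begins with the primer, honouring IUPAC codes in the primer.

    With read_wildcards=True an N in the read matches any primer position,
    loosely mimicking what --p-match-read-wildcards is described as doing.
    No mismatches are allowed, which is stricter than cutadapt.
    """
    if len(read) < len(primer):
        return False
    for base, code in zip(read, primer):
        if read_wildcards and base == "N":
            continue
        if base not in IUPAC[code]:
            return False
    return True


def fraction_with_primer(reads, primer, read_wildcards=True):
    """Fraction of reads that begin with the primer (0.0 for an empty list)."""
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(starts_with_primer(r, primer, read_wildcards) for r in reads)
    return hits / len(reads)


# Hypothetical reads, invented for this example.
example_reads = [
    "CCTACGGGAGGCAGCAGTGGGG",  # N -> A and W -> A: matches
    "NCTACGGGTGGCTGCAGTGAGG",  # leading N in the read: matches only with read wildcards
    "CCTACGGGAGGCCGCAGTGGGG",  # C at the W position: does not match
    "TGGGGAATATTGCACAATGGGC",  # primer already removed: does not match
]
print(fraction_with_primer(example_reads, "CCTACGGGNGGCWGCAG"))
```

Running a check like this on a handful of reads before and after trimming is one way to confirm that the primers were present and have been removed.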
-Cheers!
-Mike
Thank you @SoilRotifer for your help, and sorry if I'm taking up your time with simple questions.
As you can see, the reverse reads still have low quality scores. Do you think I can use the reverse reads, or is it better to ignore them and continue with the forward reads only? If I can use both, could I use:
--p-trim-left-f \
--p-trim-left-r \
--p-trunc-len-f 283 \
--p-trunc-len-r 256 \
Does that make sense?
One more thing: I didn't understand how you identified the specific primer (everything after GAGACAG). How did you know?
Really sorry for all the questions.
Please search through the forum first. Many users have had similar issues with determining trimming and truncation settings. You may have to iterate through several settings.
That just came from my experience working with a lot of data sets. I've just become familiar with a variety of primer constructs and protocols. Ideally, you should always ask your sequencing facility which amplicon / gene region, i.e. which PCR / sequencing primers, were used for your project. They should also provide you with a citation for these.
-Mike
Thank you for the information.
I asked because I found that there is no exact solution for this issue, but I will read more.
Many thanks for your guidance.
Based on your amplicon size (V3-V4, V3, or V4 region), you can determine --p-trunc-len-f and --p-trunc-len-r. It is also quite common for reverse reads to have poorer quality, so just use both reads, as below:
qiime dada2 denoise-paired \
--i-demultiplexed-seqs demux.qza \
--p-trim-left-f 10 \
--p-trim-left-r 10 \
--p-trunc-len-f 280 \
--p-trunc-len-r 200 \
--o-table table.qza \
--o-representative-sequences rep-seqs.qza \
--o-denoising-stats denoising-stats.qza
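A quick way to sanity-check a pair of truncation lengths is to compute how many bases the truncated forward and reverse reads would still share. The sketch below is plain Python arithmetic, not a QIIME 2 API. The 460 nt amplicon length is an assumed placeholder (replace it with the length for your own region and primers), and the 12 nt minimum is, to my knowledge, the overlap DADA2 requires by default when merging pairs.

```python
# Assumption: DADA2 needs at least this many overlapping bases to merge a pair.
MIN_OVERLAP = 12

# Assumption: placeholder amplicon length; use the value for your own protocol.
ASSUMED_AMPLICON_LEN = 460


def overlap_after_truncation(amplicon_len: int, trunc_len_f: int, trunc_len_r: int) -> int:
    """Bases shared by the forward and reverse reads after truncation.

    A negative value means the two reads no longer meet in the middle.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(amplicon_len: int, trunc_len_f: int, trunc_len_r: int,
              min_overlap: int = MIN_OVERLAP) -> bool:
    """True if the truncated reads still overlap by at least min_overlap bases."""
    return overlap_after_truncation(amplicon_len, trunc_len_f, trunc_len_r) >= min_overlap


# Truncation pairs mentioned in this thread, plus one deliberately too short.
for trunc_f, trunc_r in [(280, 200), (283, 256), (230, 220)]:
    overlap = overlap_after_truncation(ASSUMED_AMPLICON_LEN, trunc_f, trunc_r)
    ok = can_merge(ASSUMED_AMPLICON_LEN, trunc_f, trunc_r)
    print(f"trunc-len-f={trunc_f} trunc-len-r={trunc_r} overlap={overlap} mergeable={ok}")
```

Truncating harder removes more low-quality bases but shrinks the overlap, so the numbers usually need a few iterations against your own quality plots.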
Best
Gnanendra