I'm processing a batch of paired-end sequences. The primers are 338F and 806R. The primers are still present in my raw data, so I wanted to use the cutadapt plugin to remove them. I ran the following command:
qiime cutadapt trim-paired --i-demultiplexed-sequences demux.qza --p-front-f ACTCCTACGGGAGGCAGCAG --p-front-r GGACTACHVGGGTWTCTAAT --o-trimmed-sequences trimmed-demux-paired-1.qza --verbose
trimmed-demux-paired-1.qzv (314.8 KB): this is the file after trimming.
demux.qzv (309.4 KB): this is the file before trimming.
I found that the front part of the reads really was cut off, and the sequences are shorter. So I went on to run DADA2:
time qiime dada2 denoise-paired --i-demultiplexed-seqs demux.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 270 --p-trunc-len-r 210 --o-table table-front.qza --o-representative-sequences dada2-rep-seqs-front.qza --o-denoising-stats stats-front.qza --p-n-threads 0
stats-front.qzv (1.2 MB) table-front.qzv (518.4 KB) dada2-rep-seqs-front.qzv (717.7 KB)
But I found that there were still primers in the merged sequences, which gave me very abnormal results.
I want to know what causes this and how I can solve it.
I hope you can give me some suggestions. Thank you very much. Looking forward to your reply!
Apologies in advance if my suggestion is wrong, but based on the file names in your commands, you are running DADA2 on the original demultiplexed file (demux.qza), the same one you used as input for cutadapt. Could you try repeating the same DADA2 command, but this time with the trimmed reads you obtained from cutadapt?
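In case it helps, here is a rough sketch of what that rerun might look like. The only change from the original command is the input artifact. The truncation lengths are copied unchanged and may need adjusting, since the reads are shorter once the primers are gone. The guard around the call is my own addition so the snippet does nothing harmful if QIIME 2 or the file is not available.

```shell
# Input: the cutadapt output from the first command, not the raw demux.qza.
TRIMMED=trimmed-demux-paired-1.qza

# Truncation lengths copied from the original command. After primer removal
# the reads are shorter, so these may need to be lowered (assumption: check
# the quality plot of the trimmed .qzv before settling on values).
TRUNC_F=270
TRUNC_R=210

if command -v qiime >/dev/null 2>&1 && [ -f "$TRIMMED" ]; then
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs "$TRIMMED" \
    --p-trim-left-f 0 \
    --p-trim-left-r 0 \
    --p-trunc-len-f "$TRUNC_F" \
    --p-trunc-len-r "$TRUNC_R" \
    --o-table table-front.qza \
    --o-representative-sequences dada2-rep-seqs-front.qza \
    --o-denoising-stats stats-front.qza \
    --p-n-threads 0
else
  echo "qiime or $TRIMMED not found; nothing was run"
fi
```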
Thank you very much. I’ve been looking for a long time to find out what went wrong. It was a stupid mistake.
Hi @LiyingXie, I would like to add that since your primers contain IUPAC ambiguity bases (e.g. H, V, W), I would highly recommend that you add the following flags:
--p-match-adapter-wildcards
--p-discard-untrimmed
If the former flag is not enabled, only exact A, T, G, C matches are allowed: any ambiguous base counts as a mismatch, so cutadapt may fail to “find” the primer sequence and therefore not remove it.
The latter flag discards any read in which the primer cannot be found. I like to use this as a form of quality control, since otherwise such reads remain in your data and can affect your downstream analysis.
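To make the ambiguity-code point concrete, here is a toy illustration in plain shell. It is not how cutadapt aligns primers internally (cutadapt also tolerates a limited number of errors); it only shows that H, V and W never equal a concrete base unless they are expanded into the bases they stand for. The read is made up for the example.

```shell
# Reverse primer from the original command; H, V and W are IUPAC ambiguity codes.
PRIMER=GGACTACHVGGGTWTCTAAT

# Made-up read start (hypothetical): the primer with H->A, V->G, W->A,
# followed by a few arbitrary bases.
READ=GGACTACAGGGGTATCTAATCCTGTT

# Expand the three ambiguity codes into regex character classes:
# H = A, C or T; V = A, C or G; W = A or T.
PATTERN=$(printf '%s' "$PRIMER" | sed -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/W/[AT]/g')

# Literal comparison: H, V and W are treated as ordinary letters.
if printf '%s\n' "$READ" | grep -q "^$PRIMER"; then
  LITERAL=found
else
  LITERAL=missed
fi

# Wildcard comparison: the expanded pattern accepts the concrete bases.
if printf '%s\n' "$READ" | grep -q "^$PATTERN"; then
  WILDCARD=found
else
  WILDCARD=missed
fi

echo "literal: $LITERAL, wildcard: $WILDCARD"
```

The literal comparison misses the primer while the expanded pattern finds it, which is the difference the wildcard flag makes.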
You likely do not need --p-match-read-wildcards, but it wouldn’t hurt to use that too.
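Putting it together, the trimming step might look something like the sketch below. It is the original cutadapt command with two additions: wildcard matching for the primers and discarding of untrimmed reads. I am assuming the wildcard flag meant here is --p-match-adapter-wildcards, the counterpart of --p-match-read-wildcards; please check `qiime cutadapt trim-paired --help` for your QIIME 2 version. The guard is only there so the snippet is safe to paste.

```shell
# Primers and file names are taken from the original command.
FRONT_F=ACTCCTACGGGAGGCAGCAG
FRONT_R=GGACTACHVGGGTWTCTAAT
DEMUX=demux.qza

if command -v qiime >/dev/null 2>&1 && [ -f "$DEMUX" ]; then
  # --p-match-adapter-wildcards: treat H, V, W in the primers as wildcards
  #   (assumed to be the flag referred to in the reply above).
  # --p-discard-untrimmed: drop reads in which no primer was found.
  qiime cutadapt trim-paired \
    --i-demultiplexed-sequences "$DEMUX" \
    --p-front-f "$FRONT_F" \
    --p-front-r "$FRONT_R" \
    --p-match-adapter-wildcards \
    --p-discard-untrimmed \
    --o-trimmed-sequences trimmed-demux-paired-1.qza \
    --verbose
else
  echo "qiime or $DEMUX not found; nothing was run"
fi
```

The resulting trimmed-demux-paired-1.qza, not demux.qza, is then the input for DADA2.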