Hello everyone,
I am using QIIME 2 to analyze my 16S rRNA sequencing data, an amplicon of the V4 region generated with the primer pair "GTGCCAGCMGCCGCGGTAA" and "GGACTACHVGGGTWTCTAAT". I have tried removing the primers with the cutadapt plugin using the different commands below, but some primers still remain in the middle of sequences.
The commands I have tested are as follows:
qiime cutadapt trim-paired --p-cores 20 --i-demultiplexed-sequences paired-end-demux.qza --p-front-f GTGCCAGCMGCCGCGGTAA --p-adapter-f TTACCGCGGCKGCTGGCAC --p-front-r GGACTACHVGGGTWTCTAAT --p-adapter-r ATTAGAWACCCBDGTAGTCC --o-trimmed-sequences paired-end-demux_de_primer.qza
qiime cutadapt trim-paired --p-cores 30 --i-demultiplexed-sequences paired-end-demux.qza --p-front-f GTGCCAGCMGCCGCGGTAA --p-adapter-f TTACCGCGGCKGCTGGCAC --p-front-r GGACTACHVGGGTWTCTAAT --p-adapter-r ATTAGAWACCCBDGTAGTCC --p-anywhere-f GTGCCAGCMGCCGCGGTAA --p-anywhere-r GGACTACHVGGGTWTCTAAT --o-trimmed-sequences paired-end-demux_de_primer.qza
qiime cutadapt trim-paired --p-cores 80 --i-demultiplexed-sequences paired-end-demux.qza --p-front-f GTGCCAGCMGCCGCGGTAA --p-front-r GGACTACHVGGGTWTCTAAT --p-match-read-wildcards --p-match-adapter-wildcards --o-trimmed-sequences paired-end-demux_de_primer.qza
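For reference, the values passed to --p-adapter-f and --p-adapter-r in the first two commands are the reverse complements of the forward and reverse primers, respectively. A minimal sketch of how they can be derived in the shell, assuming `rev` and `tr` are available; the `revcomp` helper name is only illustrative:

```shell
# Reverse-complement a primer, including IUPAC degenerate bases.
# Reverse the string, then swap each base for its complement
# (A<->T, C<->G, R<->Y, K<->M, B<->V, D<->H; S, W and N map to themselves).
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTRYKMBDHVSWN' 'TGCAYRMKVHDBSWN'
}

revcomp GTGCCAGCMGCCGCGGTAA    # forward primer -> TTACCGCGGCKGCTGGCAC
revcomp GGACTACHVGGGTWTCTAAT   # reverse primer -> ATTAGAWACCCBDGTAGTCC
```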
The trimmed output is then denoised with DADA2:
qiime dada2 denoise-paired --p-n-threads 20 --i-demultiplexed-seqs paired-end-demux_de_primer.qza --p-trunc-len-f 0 --p-trunc-len-r 0 --o-table dada2_table.qza --o-representative-sequences dada2_rep_set.qza --o-denoising-stats dada2_stats.qza
I then tabulate the denoising statistics and the representative sequences:
qiime metadata tabulate --m-input-file dada2_stats.qza --o-visualization dada2_stats.qzv
qiime feature-table tabulate-seqs --i-data dada2_rep_set.qza --o-visualization rep-seqs.qzv
The merged sequences should be ~250 bp long, but some are longer than 300 bp. I found that the primer "GTGCCAGCMGCCGCGGTAA" still remains in some sequences, as in the example below (the primer is marked with asterisks).
>b4559da283e0265b85157cd7532d5f37
TTAGAAACCCTTGTAGTCCATTGGCGTACG***GTGCCAGCCGCCGCGGTAA***TACGTAGAAGACTAGTGTTAATCATCTTTATTAGGTTTAAAGGGTACCTAGACGGTAAATTAAACTCTAAATGAGTACTTGTTTACTAGAGTTTTATGTAAGGAGGAAGAATTTCTGGAGTAGTGATTTAATATGAATAATCTCAGAGAGACTGGTAACGGCGAAGGCATCCTTCTATGTAAAAACTGACGTTGAGGGACGAAGGC
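To locate such leftover primers, the degenerate bases can be expanded into a regular expression and searched for. A minimal sketch, assuming GNU grep and sed; `primer_to_regex` is a hypothetical helper name, and `merged` holds only the start of the sequence above (without the asterisks):

```shell
# Expand IUPAC degenerate bases in a primer into regex character classes.
# The replacements contain only A/C/G/T, so applying them one after
# another cannot re-expand an already substituted base.
primer_to_regex() {
  printf '%s\n' "$1" | sed \
    -e 's/R/[AG]/g' -e 's/Y/[CT]/g' -e 's/S/[CG]/g' -e 's/W/[AT]/g' \
    -e 's/K/[GT]/g' -e 's/M/[AC]/g' -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
    -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/N/[ACGT]/g'
}

# Start of the example representative sequence shown above.
merged='TTAGAAACCCTTGTAGTCCATTGGCGTACGGTGCCAGCCGCCGCGGTAATACGTAGAAGAC'

pattern=$(primer_to_regex GTGCCAGCMGCCGCGGTAA)

# -o prints only the matching part; -b prefixes it with its byte offset.
printf '%s\n' "$merged" | grep -ob "$pattern"
```

The same pattern could be run against a FASTA file of representative sequences to see how many of them still carry the primer internally.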
Can anyone provide suggestions on how to address this issue? Thanks.