Hello,
I have a very similar issue. I have been reading several posts on the QIIME 2 forum about removing primers and using DADA2. I’m currently processing paired-end, demultiplexed MiSeq reads amplified with primers 341F and 785R (V3-V4 16S region) with DADA2. I am working with a single sample first so that I can settle on the best parameters; after that I will run all of my samples (almost 100).
I started by using DADA2 to remove my primers as well (forward primer length 17, reverse primer length 21).
With the primers I am using, the total number of bases I truncate (forward plus reverse) should not exceed 116 bp, as suggested in another post: 785 - 341 = 444 bp amplicon; 2 x 300 bp reads = 600 bp; 600 - 444 = 156 bp of overlap; 156 bp - 40 bp (20 bp minimum overlap required + 20 bp natural variation) = 116 bp.
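The budget arithmetic above can be written out as a small shell sketch. It only reproduces the numbers quoted in this post (primer positions 341 and 785, 300 bp reads, 40 bp held back); `truncation_budget` is a hypothetical helper name, not a QIIME 2 command, and the 444 bp amplicon length is the rough estimate used here rather than a measured value.

```shell
#!/usr/bin/env bash
# Sketch of the truncation-budget arithmetic from the post.
# truncation_budget is a hypothetical helper, not part of QIIME 2.

truncation_budget() {
  local fwd_pos=$1    # forward primer position (341)
  local rev_pos=$2    # reverse primer position (785)
  local read_len=$3   # sequenced read length per direction (300)
  local reserve=$4    # bases held back for overlap and length variation (40)

  local amplicon=$(( rev_pos - fwd_pos ))     # 785 - 341 = 444
  local sequenced=$(( 2 * read_len ))         # 2 x 300 = 600
  local overlap=$(( sequenced - amplicon ))   # 600 - 444 = 156
  echo $(( overlap - reserve ))               # 156 - 40 = 116
}

echo "Truncation budget: $(truncation_budget 341 785 300 40) bp"
```

Changing the reserve (for example, a different minimum overlap) changes the budget directly, so it is worth keeping that value explicit.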
So, I decided to do the following:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trim-left-f 17 \
  --p-trim-left-r 21 \
  --p-trunc-len-f 283 \
  --p-trunc-len-r 210 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza
However, I was getting a very low percentage of input non-chimeric reads (~20%).
So, after reading more posts, I decided to use cutadapt to remove my primers instead:
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences demux-paired-end.qza \
  --p-front-f CCTACGGGNGGCWGCAG \
  --p-front-r GACTACHVGGGTATCTAATCC \
  --o-trimmed-sequences trim-paired-demux.qza \
  --verbose
And I got these results:
After this, I ran DADA2 again. I tried 7 different combinations of parameters, and the best percentage came from the following:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs trim-paired-demux.qza \
  --p-trim-left-f 60 \
  --p-trim-left-r 35 \
  --p-trunc-len-f 275 \
  --p-trunc-len-r 200 \
  --p-max-ee-f 5 \
  --p-max-ee-r 5 \
  --o-table table6-maxee.qza \
  --o-representative-sequences rep-seqs6-maxee.qza \
  --o-denoising-stats denoising-stats6-maxee.qza \
  --verbose
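To see what these settings leave of each read, here is a minimal sketch of the length arithmetic. It assumes that a read surviving DADA2 filtering keeps (trunc-len minus trim-left) bases, which is my understanding of how the two options combine; `retained_length` is a hypothetical helper name, not a QIIME 2 command.

```shell
#!/usr/bin/env bash
# Sketch: bases kept per read for the parameters in the command above.
# Assumption: retained length = trunc-len - trim-left.
# retained_length is a hypothetical helper, not part of QIIME 2.

retained_length() {
  local trunc_len=$1   # value passed to --p-trunc-len-f / --p-trunc-len-r
  local trim_left=$2   # value passed to --p-trim-left-f / --p-trim-left-r
  echo $(( trunc_len - trim_left ))
}

fwd=$(retained_length 275 60)   # 275 - 60 = 215
rev=$(retained_length 200 35)   # 200 - 35 = 165
echo "forward keeps ${fwd} bp, reverse keeps ${rev} bp, combined $(( fwd + rev )) bp"
```

The combined figure is the one to compare against the expected amplicon length plus the overlap needed for merging.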
However, I still have some questions:
- If I removed my primers using cutadapt, why do I still see some empty space to trim on the sequences? Is that normal? Do you think the parameters I used for DADA2 after cutadapt (--p-trim-left-f 60 --p-trim-left-r 35) are OK?
- Why do I still get such a low percentage (around 30%)? Is there a specific percentage of input non-chimeric reads that is considered good or bad?
- With my last parameters, I truncated at 275 (forward) and 200 (reverse), which means I exceeded 116 bp (I cut 125 bp). Does that mean I truncated too much?
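For reference, the 125 bp figure in the last question comes from the arithmetic sketched below. It assumes 300 bp reads in both directions, as in the budget calculation earlier in this post, and does not account for bases already removed by cutadapt or by the trim-left options; `bases_truncated` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: total bases removed by truncation versus the 116 bp budget.
# Assumption: both reads start at 300 bp, as in the post's own arithmetic.
# bases_truncated is a hypothetical helper, not part of QIIME 2.

bases_truncated() {
  local read_len=$1   # original read length (300)
  local trunc_f=$2    # --p-trunc-len-f value (275)
  local trunc_r=$3    # --p-trunc-len-r value (200)
  echo $(( (read_len - trunc_f) + (read_len - trunc_r) ))
}

budget=116
cut=$(bases_truncated 300 275 200)   # (300 - 275) + (300 - 200) = 125

if [ "$cut" -gt "$budget" ]; then
  echo "Truncated ${cut} bp: $(( cut - budget )) bp over the ${budget} bp budget"
else
  echo "Truncated ${cut} bp: within the ${budget} bp budget"
fi
```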
Thanks so much,
Any suggestions will be appreciated.