I used cutadapt to search a set of single-end demultiplexed sequences for my adapters, supplying the forward primer sequence (--p-front) and the reverse complement of the reverse primer sequence (--p-adapter). I also added a few other parameters to allow IUPAC wildcard matching (--p-match-read-wildcards), and kept the default options for chimera checking. Finally, I discarded all untrimmed sequences (i.e. reads in which the specified adapter was not found).
Based on the sequence quality scores above, I used dada2 denoise-single to filter and dereplicate my sequences and to remove chimeras. I first set --p-trunc-len to 235, which made me lose a very large number of sequences in the first (filtering) step, so I decided to truncate the reads further by lowering that value to 225, and I ended up losing significantly fewer sequences in the filtering step. However, I still end up with less than half of my sequences after chimera checking and removal. So, finally, I raised the maximum expected errors (--p-max-ee) from 2 to 5, but this improved my results only slightly.
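The runs described above could be repeated with something like the following sketch. The function name `denoise_with` and the output file names are placeholders invented for illustration; the input artifact is assumed to be the cutadapt output from the command further down, and only the two settings discussed here are varied.

```shell
# Hypothetical helper: run dada2 denoise-single for one combination of
# truncation length and maximum expected errors, then tabulate the stats.
# Usage: denoise_with <trunc_len> <max_ee>
denoise_with() {
    trunc_len="$1"
    max_ee="$2"
    prefix="dada2-len${trunc_len}-ee${max_ee}"   # placeholder output prefix

    qiime dada2 denoise-single \
        --i-demultiplexed-seqs single-end-demux-cutadapt-trimmed.qza \
        --p-trunc-len "$trunc_len" \
        --p-max-ee "$max_ee" \
        --o-table "${prefix}-table.qza" \
        --o-representative-sequences "${prefix}-rep-seqs.qza" \
        --o-denoising-stats "${prefix}-stats.qza" || return 1

    # The stats table reports how many reads remain after filtering,
    # denoising, and chimera removal, which shows where reads are lost.
    qiime metadata tabulate \
        --m-input-file "${prefix}-stats.qza" \
        --o-visualization "${prefix}-stats.qzv"
}
```

Calling `denoise_with 235 2`, `denoise_with 225 2`, and `denoise_with 225 5` would then give three stats tables that can be compared side by side.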
- One explanation suggested on the forums (loss of reads after DADA2 as chimeras) for this issue is that DADA2 struggles with non-biological sequences (e.g. adapters/barcodes). In my amplicons, the barcode sequence precedes the linker primer sequence. Will cutadapt remove the barcode sequence with the parameters I specified (see below)? If not, how do I remove it after the cutadapt step?
qiime cutadapt trim-single --i-demultiplexed-sequences single-end-demux.qza --p-front [NGS-adapter+LinkerPrimerSequence] --p-adapter [ReversePrimerSequence+NGS-adapter] --p-match-read-wildcards --p-match-adapter-wildcards --p-discard-untrimmed --o-trimmed-sequences single-end-demux-cutadapt-trimmed.qza
- Is --p-trim-left necessary in my case, based on the graph results?
- If I were to experiment with --p-min-fold-parent-over-abundance, would using a value of 2 or 3 be plausible?
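For the barcode question in the first bullet, one way to check what is left at the start of the reads is to list the most common leading bases in the trimmed output. The sketch below is only an illustration: `top_prefixes` is a made-up helper name, the export directory is a placeholder, and the layout of the exported files is an assumption. The printed prefixes could then be compared against the known barcode and primer sequences.

```shell
# Hypothetical helper: read FASTQ text on stdin and print the most common
# leading k-mers (default length 20) with their counts, most frequent first.
# Usage: top_prefixes [prefix_len] [how_many] < reads.fastq
top_prefixes() {
    prefix_len="${1:-20}"
    how_many="${2:-10}"
    # In a 4-line FASTQ record the sequence is line 2, so keep NR % 4 == 2.
    awk -v n="$prefix_len" 'NR % 4 == 2 { print substr($0, 1, n) }' |
        sort | uniq -c | sort -rn | head -n "$how_many"
}

# Example (paths are placeholders; the export layout may differ):
#   qiime tools export \
#       --input-path single-end-demux-cutadapt-trimmed.qza \
#       --output-path trimmed-export
#   gzip -dc trimmed-export/*.fastq.gz | top_prefixes 20 10
```

Running the same check on the untrimmed artifact as well would show whether the leading bases actually changed after the cutadapt step.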
I appreciate your time and support.