Replicates created after denoising

I am working with short amplicons, and consequently I ended up with the 5' primer, the adapter, and even random nucleotides (because the reads extend past the end of the adapter) in my reads. My approach was to first run denoising with DADA2 - with reasonable trimming based on quality scores - and then cut off the non-biological sequences with cutadapt. In the rep-seqs file I now have multiple ASVs with the exact same sequence. They were originally considered unique because denoising was done while the non-biological sequences were still attached. Any recommendations on how I could collapse these identical reads? I suspect I could have taken care of all of this by demultiplexing with cutadapt, but I couldn't quite figure out the commands. Is there an easy fix here? Thanks for any ideas!
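For what it's worth, the closest thing I found to a direct "collapse" is 100% identity clustering with the vsearch plugin - though as noted below this doesn't undo any effect the attached primers had on denoising itself. A sketch (filenames are placeholders from my run):

```shell
# Collapse rep seqs that are now exact duplicates after primer removal
# by clustering the trimmed sequences at 100% identity.
qiime vsearch cluster-features-de-novo \
  --i-sequences rep-seqs-trimmed.qza \
  --i-table table.qza \
  --p-perc-identity 1.00 \
  --o-clustered-table table-collapsed.qza \
  --o-clustered-sequences rep-seqs-collapsed.qza
```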

I think a better approach would be:

  1. Demux multiplexed fastq files with cutadapt
  2. Remove primers/adapters with cutadapt as well
  3. Denoise and merge with DADA2.

Primers left in the sequences can also affect the denoising step, so simply removing duplicates from the rep seqs and feature table will not fix it without introducing biases.

There are great tutorials in the docs to help you with the corresponding commands. If you still run into problems even with the tutorials, just create a new post to troubleshoot it.
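The three steps above can be sketched roughly like this (barcode column name, primer sequences, truncation lengths, and filenames are placeholders - substitute your own):

```shell
# 1. Demultiplex with the cutadapt plugin
#    (for barcode-in-sequence multiplexed paired-end reads)
qiime cutadapt demux-paired \
  --i-seqs multiplexed-seqs.qza \
  --m-forward-barcodes-file metadata.tsv \
  --m-forward-barcodes-column barcode-sequence \
  --o-per-sample-sequences demux.qza \
  --o-untrimmed-sequences untrimmed.qza

# 2. Remove primers/adapters with cutadapt as well
#    (515F/806R shown purely as example primers)
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences demux.qza \
  --p-front-f GTGYCAGCMGCCGCGGTAA \
  --p-front-r GGACTACNVGGGTWTCTAAT \
  --p-discard-untrimmed \
  --o-trimmed-sequences trimmed.qza

# 3. Denoise and merge with DADA2
#    (choose truncation lengths from your quality plots)
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs trimmed.qza \
  --p-trunc-len-f 200 \
  --p-trunc-len-r 180 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats stats.qza
```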


OK, I will give it a try! Is the idea to do steps 1 & 2 separately or all in the same command?

They are two different steps - the actions for demultiplexing and primer removal are different, although both of them use cutadapt.


It looks like the barcodes need to be in the sequence. Is there a way to do it if the barcodes are in a separate file?

Asking this question another way: Is there any reason I shouldn't use "qiime tools import" for the demux and then use cutadapt trim on the demux.qza file?


Sorry, I am a little bit lost.

So your files are already demultiplexed?
You need to demultiplex with cutadapt before primer removal and denoising if you have only 2 fastq files (forward and reverse) for all samples together.
If you have paired-read fastq files for each sample separately, that means the data is already demultiplexed, and after import you can remove primers as the first step and then proceed to DADA2.
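For the already-demultiplexed case, the import step looks something like this (the manifest path is a placeholder; the format name depends on your quality encoding), after which primers can be removed with the cutadapt plugin and the result passed to DADA2:

```shell
# Import per-sample paired-end fastq files listed in a manifest file
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path manifest.tsv \
  --input-format PairedEndFastqManifestPhred33V2 \
  --output-path demux.qza
```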


Correct, they are already demultiplexed, so we are on the same page. I gave it a try - using cutadapt from within QIIME 2 - and the data looks much better now. Thanks! One issue is that cutadapt is not removing my forward primer; I will create a separate topic about that. Thanks again for your help!


This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.