How to remove primer with different length

Mehrbod_Estaki · May 13, 2019, 1:35am

Hi @Lei,
These heterogeneity spacers are a great idea that, in my opinion, is unfortunately underused in the field. As such, there isn't a way to demultiplex these variable-length barcodes in QIIME 2 yet. It has certainly been brought up before and is somewhere on the developers' radar, but I wouldn't hold my breath, as I don't think it's too high on the priority list.
That being said, you can demultiplex these outside of QIIME 2 with other tools like mothur and bcl2fastq, or with @nounou's custom code, which may be of help. You can even hack something together in QIIME 2: demultiplex your samples based on the barcodes, separate the samples into groups based on the number of extra Ns they have, then run cutadapt 3 times separately to remove the 3 variations of the primers+spacers.
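The grouping step could be sketched roughly like this. It is only an illustration in plain Python, not QIIME 2 or cutadapt functionality: the function names `spacer_length` and `group_by_spacer`, the example primer and reads, and the assumption that the spacer is 0 to 3 bases long are all hypothetical. It also assumes an exact primer match, which a real primer with degenerate bases or sequencing errors would not satisfy.

```python
from collections import defaultdict

def spacer_length(read, primer, max_spacer=3):
    """Return the number of bases before the primer, or None if not found.

    Only exact primer matches at offsets 0..max_spacer are considered.
    """
    for offset in range(max_spacer + 1):
        if read.startswith(primer, offset):
            return offset
    return None

def group_by_spacer(reads, primer, max_spacer=3):
    """Bin reads by spacer length; reads without the primer go under None."""
    groups = defaultdict(list)
    for read in reads:
        groups[spacer_length(read, primer, max_spacer)].append(read)
    return dict(groups)

# Hypothetical primer and reads, for illustration only.
primer = "GGTTCCAA"
reads = ["GGTTCCAAACGT", "AGGTTCCAAACGT", "CTGGTTCCAAACGT", "TTTTTTTT"]
groups = group_by_spacer(reads, primer)
```

Each group could then be trimmed in its own pass with the matching primer+spacer length, and anything under `None` is worth inspecting before you discard it.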
There are likely more options out there too, but those should get you started. Ultimately, it is important that you remove these before running DADA2, as otherwise you are going to call a lot of incorrect ASVs, since:

   GGTTCCAA
  NGGTTCCAA
 NNGGTTCCAA
NNNGGTTCCAA

will be called 4 different features, even though they should all be the same.
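To make that concrete, here is a tiny Python sketch using the four toy sequences above. The helper `strip_leading_spacer` is hypothetical, and the literal `N` is just the placeholder from the example. In real reads the spacer bases are actual nucleotides, so you would locate the primer rather than look for `N`.

```python
def strip_leading_spacer(seq, spacer_char="N", max_spacer=3):
    """Remove up to max_spacer leading spacer characters from seq."""
    n = 0
    while n < max_spacer and n < len(seq) and seq[n] == spacer_char:
        n += 1
    return seq[n:]

variants = ["GGTTCCAA", "NGGTTCCAA", "NNGGTTCCAA", "NNNGGTTCCAA"]

# Exact-sequence comparison sees four distinct strings before trimming...
before = set(variants)
# ...and a single one once the leading spacer is gone.
after = {strip_leading_spacer(v) for v in variants}

print(len(before), len(after))  # prints "4 1"
```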
Good luck!