How to demultiplex fastq file that still includes Barcodes and LinkerPrimer?


I have a single fastq file. This file includes the sequences of 19 samples from a single Illumina MiSeq run. Each sample has a different barcode (8 bp). After the barcode an additional LinkerPrimer is attached (17 bp), followed by the actual sequence. All sequences are orientated in the same forward direction. The quality score is written in phred33 format.

I wanted to use a “Fastq manifest” to import the fastq; however, I think this requires demultiplexed data with a separate fastq file for each sample?

I also thought about using the “EMP protocol for multiplexed single-end fastq”; however, for this protocol I would have to somehow separate the barcodes from the sequences into an additional file (and remove the LinkerPrimer).

What would be the best way to import these data in qiime2?

Thanks in advance,


Hi @Martin,
Thanks for posting! To be clear, the presence of the linkerprimer is not a problem (when denoising with dada2 you can just set the --p-trim-left parameter to equal the length of the linkerprimer to remove it); however, the presence of the barcode is.

In a future release of QIIME 2 we plan to add support for extracting barcodes from reads in the way that you describe; for now, you will need to use a QIIME 1 command or other external scripts to extract the barcodes into a new fastq file before importing to QIIME 2.

The good news is that once that barcode extraction occurs you will have separate forward read and barcode read fastq files, which can then be imported and processed using the EMP protocols!
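For anyone who wants to do the extraction with an external script, here is a minimal Python sketch of the idea. It is not the QIIME 1 command, just an illustration under the assumptions stated in the question: every read starts with an 8 bp barcode, all reads are in the forward orientation, and the input is an uncompressed four-line-per-record fastq. The function names and file paths are hypothetical. The output names `sequences.fastq.gz` and `barcodes.fastq.gz` are what I believe the EMP single-end import expects, so please check the importing documentation for your release.

```python
import gzip
from itertools import islice

BARCODE_LEN = 8  # barcode length given in the question (assumption: fixed for all reads)


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a 4-line-per-record fastq handle."""
    while True:
        record = [line.rstrip("\r\n") for line in islice(handle, 4)]
        if len(record) < 4:  # end of file (or a truncated final record)
            return
        header, seq, _plus, qual = record
        yield header, seq, qual


def split_barcode(header, seq, qual, barcode_len=BARCODE_LEN):
    """Split one read into a barcode record and a read record without the barcode."""
    barcode = (header, seq[:barcode_len], qual[:barcode_len])
    read = (header, seq[barcode_len:], qual[barcode_len:])
    return barcode, read


def format_record(header, seq, qual):
    """Format one record as four fastq lines."""
    return f"{header}\n{seq}\n+\n{qual}\n"


def extract_barcodes(in_path, reads_out, barcodes_out, barcode_len=BARCODE_LEN):
    """Write gzipped read and barcode fastq files; return the number of records written."""
    written = 0
    with open(in_path) as src, \
            gzip.open(reads_out, "wt") as reads, \
            gzip.open(barcodes_out, "wt") as barcodes:
        for header, seq, qual in read_fastq(src):
            if len(seq) <= barcode_len:
                continue  # too short to hold a barcode plus any sequence; skipped in both files
            bc, rd = split_barcode(header, seq, qual, barcode_len)
            barcodes.write(format_record(*bc))
            reads.write(format_record(*rd))
            written += 1
    return written
```

Calling something like `extract_barcodes("run.fastq", "emp/sequences.fastq.gz", "emp/barcodes.fastq.gz")` (paths are placeholders, and the `emp` directory must already exist) keeps the two files in the same record order. The LinkerPrimer is left in place, so it becomes the first 17 bases of each read and can be removed with --p-trim-left as described above.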

I hope that helps! Let us know if you still have trouble after extracting barcodes.

Excellent! This worked perfectly. Thanks for the quick reply.



QIIME 2 2017.12 has a new cutadapt plugin which provides demux-single and demux-paired for demultiplexing reads where the barcodes are included within your reads!
