Questions on demultiplexing using

Hello, I'm new to bioinformatics and this is the first time I'm using QIIME 2. I'm working on demultiplexing a paired-end raw data set. Here is an example of a header line from my barcodes.fastq.gz:

@M01522:221:000000000-BRWK5:1:1101:20002:1874 1:N:0:CCTTGA

As I understand the FASTQ format, CCTTGA (6 nucleotides) is where the barcode should be. But in my sample-metadata.tsv file, the barcode is CAGTTCAT (8 nucleotides). Also, when I exported the demultiplexed-seqs.qza file after demux to see what my sequences look like, I found that the file names had this form (the barcode went directly into the name):


And inside the file, it looked like this:

@M01522:221:000000000-BRWK5:1:1101:15205:1891 1:N:0:CCTTGA

The Per-sample sequence counts visualization looks very odd: some samples have a very large number of reads (S17 has 600,000+), while others have very few (S6 has 5,000+).

I don't understand exactly what this CCTTGA is, and I don't know whether something is wrong with my barcodes or sample-metadata files. Please enlighten me! Thank you!
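
For reference, in Illumina (CASAVA 1.8 and later) FASTQ headers the part after the space has the form read:is_filtered:control_number:index, so the last colon-separated field is the index sequence the sequencer recorded for that read. Below is a minimal Python sketch that pulls out that field from the header quoted above and compares it with the metadata barcode. The function name illumina_index is made up for illustration, and the sketch assumes the header follows that layout.

```python
def illumina_index(header: str) -> str:
    """Return the index field of a CASAVA 1.8+ style FASTQ header.

    Hypothetical helper for illustration; assumes the header has a
    space-separated comment of the form read:is_filtered:control:index.
    """
    parts = header.strip().split(" ")
    if len(parts) < 2:
        raise ValueError("header has no comment field after the read name")
    # The index is the last colon-separated field of the comment.
    return parts[1].split(":")[-1]


# Header line quoted from barcodes.fastq.gz in the post above.
header = "@M01522:221:000000000-BRWK5:1:1101:20002:1874 1:N:0:CCTTGA"
index = illumina_index(header)

# Barcode value quoted from sample-metadata.tsv in the post above.
metadata_barcode = "CAGTTCAT"

print(index, len(index))
print(metadata_barcode, len(metadata_barcode))
print("same sequence:", index == metadata_barcode)
```

The two values differ in both length and sequence, so the header index and the BarcodeSequence column are not describing the same barcode.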

Hello @Minh_Tr_n,

Did you demultiplex using qiime2? If so, can you post the command you used?

Thanks for the quick response!
Here are the commands and files that I used for my demux:


mkdir muxed-pe-barcode-in-seq

qiime tools import \
  --type MultiplexedPairedEndBarcodeInSequence \
  --input-path muxed-pe-barcode-in-seq \
  --output-path multiplexed-seqs.qza

qiime cutadapt demux-paired \
  --i-seqs multiplexed-seqs.qza \
  --m-forward-barcodes-file sample-metadata.tsv \
  --m-forward-barcodes-column BarcodeSequence \
  --p-error-rate 0.125 \
  --o-per-sample-sequences demultiplexed-seqs.qza \
  --o-untrimmed-sequences untrimmed.qza

qiime demux summarize \
  --i-data demultiplexed-seqs.qza \
  --o-visualization demultiplexed-seqs.qzv


Reads and Barcodes:

Visualization after demux

demultiplexed-seqs.qzv (321.3 KB)

Hello @Minh_Tr_n,

From the command line help text for cutadapt demux-paired:

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes
are expected to be located within the sequence data (versus the header, or a
separate barcode file).

I believe that if you have a separate barcode file, demux emp-paired is the recommended command.
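
One rough way to see which situation applies is to check, read by read, whether a known barcode sits at the start of the sequence or only appears as the index in the header. The Python sketch below does that on a few in-memory records. The function name barcode_location, the read sequences, and the third header are hypothetical and only for illustration; the check is an exact-prefix match, which is stricter than cutadapt's error-tolerant matching.

```python
from collections import Counter


def barcode_location(records, barcodes):
    """Tally, per read, where a known barcode is found.

    records: iterable of (header, sequence) pairs.
    barcodes: set of barcode strings from the metadata file.
    Hypothetical helper for illustration, using exact matching only.
    """
    tally = Counter()
    for header, sequence in records:
        # Last colon-separated field of the header comment is the index.
        index = header.strip().split(" ")[-1].split(":")[-1]
        if any(sequence.startswith(barcode) for barcode in barcodes):
            tally["in_sequence"] += 1
        elif index in barcodes:
            tally["in_header"] += 1
        else:
            tally["not_found"] += 1
    return tally


barcodes = {"CAGTTCAT"}  # barcode quoted from sample-metadata.tsv

# The first two headers are quoted from the thread; all sequences and the
# third header are made up for this example.
records = [
    ("@M01522:221:000000000-BRWK5:1:1101:20002:1874 1:N:0:CCTTGA",
     "CAGTTCATGTGCCAGCAGCCGCGGTAA"),
    ("@M01522:221:000000000-BRWK5:1:1101:15205:1891 1:N:0:CCTTGA",
     "TACGGAGGGTGCAAGCGTTAATCGG"),
    ("@hypothetical_read 1:N:0:CAGTTCAT",
     "TACGGAGGGTGCAAGCGTTAATCGG"),
]

tally = barcode_location(records, barcodes)
print(dict(tally))
```

If most real reads fall under not_found or in_header, the barcodes are probably not inside the reads, and an in-sequence demultiplexer would have little to match against.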

Sorry for not mentioning it: my files are not in EMP format. The primers I used are these two, and they are just common non-EMP primers.


That's why I used cutadapt demux-paired rather than demux emp-paired.

Hello @Minh_Tr_n,

Although you may not have used the primers published in the protocol, I believe that your data is in a format compatible with the demux emp-paired action. Give it a try and see if it works. You may have to turn off Golay error correction. You can follow along with the first step from this tutorial to do so.

