Errors: Demultiplexing with Mapping File

Hello there! I am demultiplexing different projects and have used the command below:

qiime demux emp-paired \
  --m-barcodes-file 54386_mapping_file.txt \
  --m-barcodes-column BarcodeSequence \
  --p-rev-comp-mapping-barcodes \
  --i-seqs emp-paired-end-sequences.qza \
  --o-per-sample-sequences demux.qza \
  --o-error-correction-details demux-details.qza

And received the error:
Plugin error from demux:

Barcode header lines do not contain description fields but sequence header lines do.

Debug info has been saved to /tmp/qiime2-q2cli-err-jmp0ctkt.log

My mapping file seems to be in the correct format:
54386_mapping_file.txt (140.4 KB)

I have also used a similar command for a separate mapping file and sequences, and received a separate error:
Plugin error from demux:

Unable to allocate array with shape (91548818, 6) and data type object

Debug info has been saved to /tmp/qiime2-q2cli-err-72w7xt_t.log

I have checked in the forum and apparently this is an issue with memory space, but can confirm that memory is not an issue with my computer. If it helps, here is my mapping file with the 2nd error:
54434_mapping_file.txt (21.5 KB)

What exactly do these error messages mean, and how can I resolve them, whether or not the cause lies in my mapping files? Thank you and I appreciate any help!

Regarding the first error: the barcodes.fastq.gz file is improperly formatted. It needs to contain 4-line FASTQ records:

@id 
AAAACCCCGGGG
+
[email protected](53DA

The error message indicates that the first line (the header) is either missing, or present but empty.
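
If it helps, the record structure described above can be checked with a short script. This is only a sketch: the function name `check_fastq_records` and the `limit` parameter are made up for illustration, and it assumes an uncompressed text handle with no blank lines between record lines.

```python
import io

def check_fastq_records(handle, limit=1000):
    """Return (record_index, problem) tuples for the first `limit` records."""
    problems = []
    for n in range(limit):
        lines = [handle.readline() for _ in range(4)]
        if lines[0] == "":
            break  # reached end of file on a record boundary
        header, seq, plus, qual = (line.rstrip("\n") for line in lines)
        # Line 1 must start with '@' and carry a non-empty identifier.
        if not header.startswith("@") or header[1:].strip() == "":
            problems.append((n, "missing or empty header"))
        # Line 3 is the '+' separator.
        if not plus.startswith("+"):
            problems.append((n, "missing '+' separator"))
        # Lines 2 and 4 must have one quality character per base.
        if len(seq) != len(qual):
            problems.append((n, "sequence/quality length mismatch"))
    return problems

# Tiny in-memory demo; the second record has an empty header line.
demo = io.StringIO(
    "@id1\nAAAACCCCGGGG\n+\n############\n"
    "@\nTTTTCCCCGGGG\n+\n############\n"
)
print(check_fastq_records(demo))
```

For a real file, the same function could be given a handle from `gzip.open("barcodes.fastq.gz", "rt")`.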

Regarding the second error: this is an “out of memory” error. It means the computer does not have enough available memory to run this command.
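
To get a rough feel for the size of that allocation, here is a back-of-the-envelope calculation based on the shape in the error message. It assumes a 64-bit Python, where an array of data type object stores one 8-byte pointer per element. The strings those pointers refer to need additional memory on top of this figure.

```python
rows, cols = 91_548_818, 6   # shape reported in the error message
pointer_bytes = 8            # assumption: 64-bit Python, one pointer per element

array_bytes = rows * cols * pointer_bytes
array_gib = array_bytes / 2**30

# Size of the pointer table alone, before counting the objects it points to.
print(f"{array_bytes:,} bytes = {array_gib:.2f} GiB")
```

So this single array needs roughly 4 GiB of free memory in one contiguous request, on top of whatever the rest of the command is already using.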


Hi @thermokarst. Thank you for your response. I have checked the barcodes fastq and, if I understood correctly, it seems to have the same format that you described. Here are the first few lines:

@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870
CGTCGTATGAAT
+
@@@@@@@@@@@@
@DGZN8DQ1:549:H7C23BCXX:2:1101:1126:1870
TTTGCATCAGGG
+
@@@@@@@@@@@@
@DGZN8DQ1:549:H7C23BCXX:2:1101:1189:1870
CCGTCTATGTTT
+
@@@@@@@@@@@@

Is it still wrong, and something which I should check with the people who did the sequencing? Thank you again!

Okay. I have just compared this barcode fastq to others that worked. Having ‘@’ throughout the quality line is most likely the root of the problem. If that is the case, is there a way to demultiplex while disregarding the quality scores on the 4th line? Many thanks!

I have just used a separate package called ‘seqtk’ to replace every “@” in the quality lines with “#”, giving the following lines in my barcodes.fastq.gz:

@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870
CGTCGTATGAAT
+
############
@DGZN8DQ1:549:H7C23BCXX:2:1101:1126:1870
TTTGCATCAGGG
+
############

However, I am still getting the same error about description fields in the sequence but not barcode header lines. How do I rectify this to do the demux successfully?

Thanks and apologies for the additional comments.

Hello. When using the command ‘demux emp-paired’, I received the error about sequence header lines containing ‘description fields’ but not barcode header lines.

The first few lines of my forward sequence file are:
@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870 1:N:0:CGTCGTATGAAT
GANGGAGGATGCAAGTGTTATCCGGAATCACTGGGCGTAAAGCGTCTGTAGGTGGTTTGTTAAGTCAACTGTTAAATCTTGAGGCTCAACTTCAAATCAGCAGTCGAAACTATCAGACNNNNNNNNNNNNNNNNNNNNNNNNNNNTNNNG
+
DD#<<EHHIIIIEHIHIIIIIIIIIIIIIIIIIIIIGHIIIHHHGHIIIIHIGHIIIIIIIIIIIIIIIIHHHIIIIIIIIIIIIIIIIIIIIIIHIHEHHGHHIIIIIIIIHIIIHH################################

And for my barcode:

@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870
CGTCGTATGAAT
+
############

If I am not mistaken, the main reason for this error is probably that the 1st line of each barcode record is missing the “1:N:0:CGTCGTATGAAT” part. If this is correct, how do I give the barcode headers description fields that correspond to the sequence headers? Cheers!

Hi @YinXun - please do not repost the same question across multiple topics - this is called cross-posting, and is a violation of our Code of Conduct. I have merged the new post back into the existing topic, for continuity.

Someone will help you soon. Thanks!

Hi @YinXun - the message I sent above still applies: your barcodes aren’t formatted correctly. Specifically, the ID headers don’t match the forward read headers. It sounds like there is a description field in your reads, but not in your barcodes:

@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870 1:N:0:CGTCGTATGAAT this_is_a_description

vs

@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870 1:N:0:CGTCGTATGAAT
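
One possible way to make the two files consistent is to copy each read's description field onto the matching barcode header. The sketch below is an illustration only, not something taken from QIIME 2 itself: `split_header` and `copy_descriptions` are hypothetical helper names, it assumes both files list the same records in the same order with no blank lines between record lines, and whether `demux emp-paired` then accepts the result has not been confirmed here.

```python
import io

def split_header(line):
    """Split '@id description' into (id, description); either may be ''."""
    parts = line.rstrip("\n")[1:].split(None, 1)
    read_id = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return read_id, description

def copy_descriptions(reads, barcodes, out):
    """Write barcodes to `out`, borrowing descriptions from matching reads."""
    for i, (read_line, barcode_line) in enumerate(zip(reads, barcodes)):
        if i % 4 == 0:  # header line of each 4-line record
            read_id, read_desc = split_header(read_line)
            barcode_id, barcode_desc = split_header(barcode_line)
            if read_id != barcode_id:
                raise ValueError(
                    f"record {i // 4}: {read_id!r} != {barcode_id!r}")
            if read_desc and not barcode_desc:
                barcode_line = f"@{barcode_id} {read_desc}\n"
        out.write(barcode_line)

# In-memory demo using the header posted earlier in this topic.
reads = io.StringIO(
    "@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870 1:N:0:CGTCGTATGAAT\n"
    "GANGG\n+\nDD#<<\n"
)
barcodes = io.StringIO(
    "@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870\n"
    "CGTCGTATGAAT\n+\n############\n"
)
fixed = io.StringIO()
copy_descriptions(reads, barcodes, fixed)
print(fixed.getvalue(), end="")
```

The opposite approach, stripping the description from the read headers, would also make the headers agree. Either way, the files would need to be re-imported into a new .qza before demultiplexing again.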