qiime demux emp-paired error with 4 input files (R1, R2, I1, I2): Barcode header lines do not contain description fields but sequence header lines do.

I am running QIIME 2 version 2019.1.0.

I have been working with Illumina R1, R2, I1 and I2 files as my input data. I have reviewed this post
https://forum.qiime2.org/t/dmx-with-4-initial-read-files-i1-i2-r1-r2/4190/3 and have performed the following:

Merged the I1 and I2 index files into one by simply concatenating the sequence and quality lines. For example, using this I1 read and I2 read:

    @M70271:95:000000000-C7WK3:1:1101:11188:1742 1:N:0:1
    TCGACGTC
    +
    -,,,8+8,

    @M70271:95:000000000-C7WK3:1:1101:11188:1742 2:N:0:1
    TCTTCTTT
    +
    -,,,,6,6

I create this merged read:

    @M70271:95:000000000-C7WK3:1:1101:11188:1742
    TCGACGTCTCTTCTTT
    +
    -,,,8+8,-,,,,6,6
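
The concatenation described above can be sketched in Python as follows. This is only an illustration of the step as described, not the script actually used in this thread; the function name `merge_index_records` and the tuple representation of a record are assumptions made for the example. Like the merged read shown above, it keeps only the read ID and drops the text after the space.

```python
def merge_index_records(i1, i2):
    """Merge one I1 record and one I2 record into a single barcode record.

    Each record is a (header, sequence, quality) tuple of strings.
    """
    i1_header, i1_seq, i1_qual = i1
    i2_header, i2_seq, i2_qual = i2

    # Both index reads must come from the same cluster, so the read IDs
    # (the part of the header before the first space) have to agree.
    i1_id = i1_header.split(" ", 1)[0]
    i2_id = i2_header.split(" ", 1)[0]
    if i1_id != i2_id:
        raise ValueError(f"Read IDs differ: {i1_id} vs {i2_id}")

    # Keep only the read ID, then concatenate sequences and qualities.
    return (i1_id, i1_seq + i2_seq, i1_qual + i2_qual)


i1 = ("@M70271:95:000000000-C7WK3:1:1101:11188:1742 1:N:0:1", "TCGACGTC", "-,,,8+8,")
i2 = ("@M70271:95:000000000-C7WK3:1:1101:11188:1742 2:N:0:1", "TCTTCTTT", "-,,,,6,6")
merged = merge_index_records(i1, i2)
print("\n".join([merged[0], merged[1], "+", merged[2]]))
```

The ID check is a design choice for the sketch: if I1 and I2 ever fall out of order, failing early is safer than silently pairing barcodes from different clusters.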

I removed all primers from my R1 and R2 reads using cutadapt.

I created a metadata file and verified that it is properly formatted using Keemei. My metadata file looks like this:

    #SampleID       BarcodeSequence
    83WellA1-16S    TAAGGCGACTCTCTAT
    83WellA2-16S    CGTACTAGCTCTCTAT
    83WellA3-16S    AGGCAGAACTCTCTAT

I then imported my forward.fastq.gz, reverse.fastq.gz and merged barcodes.fastq.gz using this command:

    qiime tools import \
      --type EMPPairedEndSequences \
      --input-path /test/trimmed_files \
      --output-path /test/trimmed_files/emp-pair-seqs.qza

Next I ran qiime demux emp-paired with this command:

    qiime demux emp-paired \
      --i-seqs /test/trimmed_files/emp-pair-seqs.qza \
      --m-barcodes-file /test/trimmed_files/Qiiime_multiplexed_barcode_mapping_file.txt \
      --m-barcodes-column BarcodeSequence \
      --o-per-sample-sequences demux.qza \
      --verbose

And that produces this error:

    Traceback (most recent call last):
      File "/hpc/apps/qiime2-2019.1/install/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
        results = action(**arguments)
      File "</hpc/apps/qiime2-2019.1/install/lib/python3.6/site-packages/decorator.py:decorator-gen-422>", line 2, in emp_paired
      File "/hpc/apps/qiime2-2019.1/install/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
        output_types, provenance)
      File "/hpc/apps/qiime2-2019.1/install/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
        output_views = self._callable(**view_args)
      File "/hpc/apps/qiime2-2019.1/install/lib/python3.6/site-packages/q2_demux/_demux.py", line 331, in emp_paired
        for barcode_record, forward_record, reverse_record in seqs:
      File "/hpc/apps/qiime2-2019.1/install/lib/python3.6/site-packages/q2_demux/_demux.py", line 194, in __iter__
        'Barcode header lines do not contain description fields '
    ValueError: Barcode header lines do not contain description fields but sequence header lines do.

    Plugin error from demux:

      Barcode header lines do not contain description fields but sequence header lines do.

    See above for debug info.

At this point, I do not see what the issue is. What could be the problem here?

The barcode header lines do not match the sequence header lines. What do the corresponding sequence header lines look like?

My sequence headers look like this:

    @M70271:95:000000000-C7WK3:1:1101:11188:1742 1:N:0:1

    @M70271:95:000000000-C7WK3:1:1101:11188:1742 2:N:0:1

The only difference between the sequence headers and the merged barcode header is that the characters after the space are missing. I left them off when merging the barcode sequences because it doesn't seem to make sense to keep them. If I did keep them, which one would I keep? The R1 header ending or the R2 header ending? Or should I make a new header ending?
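
To make the mismatch concrete, here is a small Python sketch of the kind of check the error message implies: each header is split at the first space into a read ID and an optional description, and the barcode and sequence headers are compared on whether they have one. This is a guess based on the error text, not q2-demux's actual code; `split_header` and `describe_mismatch` are hypothetical names.

```python
def split_header(header):
    """Split a FASTQ header line into (read_id, description or None)."""
    tokens = header[1:].split(" ", 1)  # drop the leading '@'
    read_id = tokens[0]
    description = tokens[1] if len(tokens) == 2 else None
    return read_id, description


def describe_mismatch(barcode_header, sequence_header):
    """Return a message if only one header has a description, else None."""
    _, barcode_desc = split_header(barcode_header)
    _, sequence_desc = split_header(sequence_header)
    if barcode_desc is None and sequence_desc is not None:
        return "barcode header has no description but sequence header does"
    if barcode_desc is not None and sequence_desc is None:
        return "sequence header has no description but barcode header does"
    return None


barcode = "@M70271:95:000000000-C7WK3:1:1101:11188:1742"
forward = "@M70271:95:000000000-C7WK3:1:1101:11188:1742 1:N:0:1"
print(describe_mismatch(barcode, forward))
```

Running a check like this on the first record of each file before importing would flag the missing description without waiting for `qiime demux emp-paired` to fail.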

I think that may be the mistake. I am not 100% sure, but I think the EMP format probably expects the barcode sequences to match the header lines of the forward reads. Do you want to give that a try and let me know what happens?

Thank you for taking the time to look at this. I got it working but had already started on my solution before your most recent reply.
I ended up adding this ending to all of my concatenated barcode fastq records: 3:N:0:1. This is very similar to what you suggested (add 1:N:0:1 to the barcodes header line so it matches the R1 reads).
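
Appending a description to every merged barcode header could look roughly like this Python sketch. It is an illustration only, not the script used in this thread; `add_description` is a hypothetical helper, and it assumes strict four-line FASTQ records whose headers do not yet contain a description.

```python
import io


def add_description(lines, description="3:N:0:1"):
    """Yield FASTQ lines with `description` appended to each header line.

    Assumes strict four-line records, so headers are lines 0, 4, 8, ...
    Headers that already contain a space are left unchanged.
    """
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if index % 4 == 0 and " " not in line:
            line = f"{line} {description}"
        yield line + "\n"


merged_fastq = io.StringIO(
    "@M70271:95:000000000-C7WK3:1:1101:11188:1742\n"
    "TCGACGTCTCTTCTTT\n"
    "+\n"
    "-,,,8+8,-,,,,6,6\n"
)
fixed = list(add_description(merged_fastq))
print("".join(fixed), end="")
```

Header lines are located by position rather than by a leading `@`, because a quality line can also start with `@`. For gzipped input, a file object from `gzip.open(path, "rt")` could be passed in place of the `io.StringIO` used here.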

I saw a reference to barcode R3 files in this post https://forum.qiime2.org/t/clarify-on-demux-emp-paired/5565 which mentioned:

> (barcodes should be in R3). It will use the order of R3 to work out which sequences from R1 and R2 belongs to which samples.
