Loss of data in demux emp-single

I am losing the vast majority of my reads (over 90%) when demultiplexing. I am using the following code:

qiime demux emp-single \
--m-barcodes-file mapping.tsv \
--m-barcodes-column barcode \
--i-seqs imported_data_forward.qza \
--p-rev-comp-mapping-barcodes \
--o-per-sample-sequences demux_forward.qza \
--o-error-correction-details demux_error.qza

I have seen other posts where the direction of the barcode can fix the issue but that is not the case here as far as I can tell.

I am running QIIME 2 2022.2 using conda on an M1 Mac. Any help on this would be super appreciated! Thank you.

Hi @gaberunte,

Welcome to the :qiime2: forum!

It's hard to say for sure what is going on without looking at your barcodes file and associated sequences, but losing almost all of your reads when demultiplexing suggests that the barcodes are not matching. I would manually inspect a few barcodes from your barcodes file to make sure that they appear in your sequence file, that you're passing the correct column to --m-barcodes-column (i.e. that barcode is the correct column), and that the barcodes should in fact be reverse complemented prior to demultiplexing.
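If it helps, here is a minimal Python sketch of that manual check. It counts how many barcode reads exactly match the barcodes in the mapping file, both as written and reverse complemented. The function names are made up for illustration, and it assumes a tab-separated mapping file with a barcode column and a standard four-line FASTQ file of barcode reads. It does exact matching only, with no Golay error correction, so it will undercount compared to what demux emp-single can recover.

```python
import csv
import io
from collections import Counter

# Hypothetical helper names; none of this is part of QIIME 2 itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
_DNA = set("ACGTN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fastq_sequences(handle):
    """Yield the sequence line (2nd of every 4 lines) from an open FASTQ handle."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def mapping_barcodes(handle, column="barcode"):
    """Collect barcodes from one column of a tab-separated mapping file.

    Values that are not plain DNA (for example a '#q2:types' row) are skipped.
    """
    barcodes = set()
    for row in csv.DictReader(handle, delimiter="\t"):
        value = (row.get(column) or "").strip()
        if value and set(value.upper()) <= _DNA:
            barcodes.add(value)
    return barcodes


def orientation_report(barcode_reads, sample_barcodes):
    """Count exact matches of barcode reads in both barcode orientations."""
    as_given = set(sample_barcodes)
    flipped = {reverse_complement(b) for b in as_given}
    counts = Counter()
    for read in barcode_reads:
        counts["total"] += 1
        if read in as_given:
            counts["as_given"] += 1
        if read in flipped:
            counts["reverse_complemented"] += 1
    return counts


# Toy data, invented for illustration only (not taken from this thread).
toy_mapping = io.StringIO("sample-id\tbarcode\ns1\tAACG\ns2\tGGTA\n")
toy_fastq = io.StringIO("@r1\nCGTT\n+\nIIII\n@r2\nAACG\n+\nIIII\n")
toy_report = orientation_report(fastq_sequences(toy_fastq),
                                mapping_barcodes(toy_mapping))
print(dict(toy_report))
```

On real data you would pass open file handles instead of the toy strings, e.g. gzip.open(path, "rt") for a gzipped barcode FASTQ file. If one orientation matches far more reads than the other, that is a hint about which rev-comp flags to try.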

If all of that seems reasonable - if you're comfortable with sharing your barcodes file and sequence data (private message is fine if you'd prefer not sharing publicly on the forum), I can take a closer look and see if anything stands out on my end!

Cheers :lizard:


Thank you for the reply!

I attempted to manually inspect a few barcodes and got a number of hits on each, though I am not certain that means everything is fit to go. When I run the demux, it is not that I get no hits, but that I get only a couple hundred reads per sample rather than tens of thousands. I have attempted to run it in both rev-comp directions: --p-rev-comp-mapping-barcodes returns the limited read count, while --p-no-rev-comp-mapping-barcodes does not return anything. Thank you for any help with this.

The files are too large to upload here, but the sequences and QIIME 2 objects I have created are in this Google Drive folder!

Hi @gaberunte,

Apologies for the delay in response on this!

Thanks for providing all of your data - I took a look at your demux details, and did see that high read loss. I tried re-running demux emp-single with both --p-rev-comp-mapping-barcodes and --p-rev-comp-barcodes and the sequence retention was much higher (around 75%).

I also checked whether Golay error correction was causing sequences to be dropped by re-running emp-single with just --p-no-golay-error-correction, but that only retained around 60% of the sequences.

It seems like the quality of your barcodes is why we're only seeing around 75% sequence retention when demultiplexing. That being said, you should still be able to work with this, as opposed to losing almost all of your reads.
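For anyone who wants to compare runs the same way, here is a small sketch of how a retention figure like this can be computed. It assumes you already have the number of reads in the multiplexed input and the per-sample read counts after demultiplexing (for example from the table in a qiime demux summarize visualization). The function name and the numbers are invented for illustration.

```python
def sequence_retention(total_input_reads, per_sample_counts):
    """Return the fraction of input reads assigned to any sample."""
    if total_input_reads <= 0:
        raise ValueError("total_input_reads must be positive")
    return sum(per_sample_counts.values()) / total_input_reads


# Made-up counts, not the values from this dataset.
example_counts = {"sample-1": 90, "sample-2": 60}
example_retention = sequence_retention(200, example_counts)
print(f"{example_retention:.0%} of reads retained")
```

Computing this for each combination of flags makes it easy to see which settings keep the most reads.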

So, just to recap - you'll see much higher sequence retention when running the following command:

qiime demux emp-single \
--m-barcodes-file mapping.tsv \
--m-barcodes-column barcode \
--i-seqs imported_data_forward.qza \
--p-rev-comp-barcodes \
--p-rev-comp-mapping-barcodes \
--o-per-sample-sequences demux_forward.qza \
--o-error-correction-details demux_error.qza

Hope this helps!

Cheers :lizard:


Hi @gaberunte,

Another thing that you may want to do is reach out to your sequencing center to confirm the correct orientation of your barcode sequences, and whether or not they are Golay barcodes. It seems as though they are Golay and should be reverse complemented (based on the results from my previous message), but it would be best to confirm with them so that you know for sure!

Cheers :lizard:
