Losing reads when demultiplexing with cutadapt

Hi all!

I am using Cutadapt 3.2 with QIIME 2 version 2021.2. I am demultiplexing some 16S paired-end data using the following command:

qiime cutadapt demux-paired \
  --i-seqs pool1-multiplexed-seqs.qza \
  --m-forward-barcodes-file Pool1.tsv \
  --m-forward-barcodes-column seq_barcode_F \
  --m-reverse-barcodes-file Pool1.tsv \
  --m-reverse-barcodes-column seq_barcode_R \
  --p-error-rate 0 \
  --o-per-sample-sequences 3_pool1-demultiplexed-seqs.qza \
  --o-untrimmed-sequences 3_pool1-untrimmed.qza \
  --verbose

After doing this, I wanted to do the quality filtering with Trimmomatic instead of cutadapt, so I exported the data and found that many samples had 0 reads (145/170) :frowning_face:

Maybe I am doing something wrong, but I have no clue what. I would really appreciate any help!

Note: after running cutadapt demux-paired, I get this error:

WARNING: Adapter 'ACACACAC' (regular 5') was specified multiple times! Please make sure that this is what you want.

Note 2: Because my sequences are split across several pools, I had to demultiplex each pool separately. Here I attach only the first one, which contains only 34 samples.

Pool1.tsv (1.8 KB)

2_pool1-demultiplexed-seqs.qzv (316.3 KB)

Hello Nickole,

Thank you for your detailed post. I'm not 100% sure what's causing this issue, but hopefully we can find some clues and find your missing reads. :female_detective: :mag_right:

I noticed that all the samples that had reads have agactatg in the seq_barcode_R column. This makes me think that something is off with the barcodes. Maybe they are reversed, reverse-complemented, or just copied from the wrong file?
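For readers following along, here is a minimal sketch of how the alternative orientations of a barcode could be listed so each one can be tried in the metadata. The function name `orientations` is hypothetical, and the barcode is the one mentioned above; nothing here is part of QIIME 2 or cutadapt.

```python
# Translation table for DNA complements; handles upper- and lower-case bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def orientations(barcode):
    """Return the four orientations of a barcode, keyed by name.

    `orientations` is a hypothetical helper written for this thread.
    """
    complement = barcode.translate(COMPLEMENT)
    return {
        "as_given": barcode,
        "reverse": barcode[::-1],
        "complement": complement,
        "reverse_complement": complement[::-1],
    }


# Print every orientation of the barcode shared by the samples that had reads.
for name, seq in orientations("agactatg").items():
    print(f"{name:<20}{seq}")
```

Each printed sequence is a candidate to search for in the raw reads before editing the barcode columns.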

Let me know if you find out more about the barcodes, or any other clues!

Other notes:

Because it's a warning (and not an error), you can continue with your analysis as long as that's what you expected!
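One way to check whether the repeated barcode is expected is to count how often each barcode appears in the metadata and confirm that every forward/reverse pair is still unique. The sketch below assumes a tab-separated file with the `seq_barcode_F` and `seq_barcode_R` columns used in the command above; the `sample-id` header, the sample names, and the rows are made up for illustration and are not the contents of Pool1.tsv. The helper `barcode_usage` is hypothetical.

```python
import csv
import io
from collections import Counter

# Illustrative rows only; these are NOT the real contents of Pool1.tsv.
EXAMPLE_TSV = (
    "sample-id\tseq_barcode_F\tseq_barcode_R\n"
    "S1\tACACACAC\tACGACGAG\n"
    "S2\tACACACAC\tAGACTATG\n"
    "S3\tACGACGAG\tAGACTATG\n"
)


def barcode_usage(handle, fwd_col="seq_barcode_F", rev_col="seq_barcode_R"):
    """Count barcode reuse and list forward/reverse pairs shared by several samples.

    Hypothetical helper; it does not handle QIIME 2 comment or type rows.
    """
    rows = list(csv.DictReader(handle, delimiter="\t"))
    fwd = Counter(row[fwd_col].upper() for row in rows)
    rev = Counter(row[rev_col].upper() for row in rows)
    pairs = Counter((row[fwd_col].upper(), row[rev_col].upper()) for row in rows)
    duplicated_pairs = [pair for pair, count in pairs.items() if count > 1]
    return fwd, rev, duplicated_pairs


fwd, rev, duplicated_pairs = barcode_usage(io.StringIO(EXAMPLE_TSV))
print("forward barcodes used more than once:", {b: n for b, n in fwd.items() if n > 1})
print("reverse barcodes used more than once:", {b: n for b, n in rev.items() if n > 1})
print("pairs assigned to more than one sample:", duplicated_pairs)
```

In a combinatorial dual-barcode design, single barcodes are reused across samples by construction, so the thing to look for is an empty list of duplicated pairs. To run it on a real file, pass `open("Pool1.tsv", newline="")` instead of the `io.StringIO` object.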

This is the way :+1:

Hi Colin! Thanks for your response

I also think there is something wrong with the barcodes, but I'm not sure what it can be. I tried replacing the barcodes in the metadata with their reverse and their complement, but it still doesn't work :frowning_face:

I also checked for the reverse barcodes and they are actually present in the data, so I'm not sure how to proceed :pleading_face:

Here I placed the data from pool1, maybe we can find something with this. Thank you again!

We could follow up on this clue. Where did you see them? (R1 or R2, near the start or end, etc.)

It's possible something is wrong with cutadapt, or we are running into one of its limitations.

The reverse barcodes were in the reverse.fastq file.

I found something curious: in the reads with the only reverse barcode that worked, the barcode is in the middle of the sequence, but in the others (the ones that didn't work) it is at the start. Here is an example:

Barcode that worked:

@M03485:62:000000000-K26TC:1:1101:16108:8835 2:N:0:1

Barcode that (I think) didn't work:

1>AAA1>1A1AFGGCCG0EF0E?00GHGHC/EF1DFDFGGFFHGCBCE?/[email protected]/EFFB222FGB>FG12?E/EEFGHGHGD1GHHF1B/1<@@CG-<---;;:000::..://../0000000;A.9----./0;9.;-9-//99/[email protected]>--;//////;;/----9///-;;9;9--;--///---;--9--//9---------/---99---://///9/-;--/-
@M03485:62:000000000-K26TC:1:2114:27265:13626 2:N:0:1

I hope it's something we can solve!

I wonder if the reads or the barcodes have been flipped...

For the barcode that did work (ACACATGT, near the end), I see multiple hits to the reverse-complement (ACATGTGT) in the seq_barcode_R. This makes sense, as it's near the end of your read and in the seq_barcode_R column.

For the barcode that didn't work (ACGACGAG, at the start), I see an exact match to that in the seq_barcode_R column. (I don't see any matches to the reverse complement, the reverse, or the complement.)
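This kind of check can be scripted. The sketch below looks for exact matches of a barcode in a read, in all four orientations, and reports where each match starts. The function `locate_barcode` is hypothetical and the two reads are made up for illustration; they are not taken from the FASTQ files in this thread.

```python
# Translation table for DNA complements (inputs are upper-cased first).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def locate_barcode(read, barcode):
    """Return (orientation, 0-based offset) for every exact hit of the barcode.

    Hypothetical helper: exact matching only, no mismatches or indels.
    """
    read = read.upper()
    barcode = barcode.upper()
    complement = barcode.translate(COMPLEMENT)
    candidates = {
        "as_given": barcode,
        "reverse": barcode[::-1],
        "complement": complement,
        "reverse_complement": complement[::-1],
    }
    hits = []
    for name, seq in candidates.items():
        start = read.find(seq)
        while start != -1:
            hits.append((name, start))
            start = read.find(seq, start + 1)
    return hits


# Made-up reads for illustration only.
read_start = "ACGACGAG" + "TTTTGGGGCCCCTTTT"  # barcode as given, at the start
read_end = "TTTTGGGGCCCCTTTT" + "ACATGTGT"    # reverse complement, at the end

print(locate_barcode(read_start, "ACGACGAG"))
print(locate_barcode(read_end, "ACACATGT"))
```

Running this over a few hundred reads from each of R1 and R2 would show which orientation and which position dominate for each barcode, which is the information needed to decide which column a barcode belongs in.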

Maybe this should be in the seq_barcode_F column? :thinking:

Something weird happened :face_with_monocle: I flipped the barcodes when demultiplexing again, as you suggested, and I got 8 samples with reads instead of 5 as before, but I am still missing a lot :pleading_face: Here I attach the .qzv

This makes me think that something is actually wrong with the barcodes... but they just sent me this table in Excel to compare:

Primers_barcodes.csv (1.7 KB)

flipped_pool1-demultiplexed-seqs.qzv (313.9 KB)

Yeah, something is still wrong up here. After flipping the barcodes, the 8 samples that demultiplexed all have acacacac in the seq_barcode_F column. (or perhaps in the seq_barcode_R column if that's how you flipped it)

This gives me less confidence that either way is correct :scream_cat:

Was the sequencing center able to successfully demultiplex your samples? It's possible we are still missing something, but if they are using a custom protocol, they may have a way to solve this problem. If so, they could send you pre-split results with each sample in a pair of fastq files, and we could take it from there!

It was a new protocol at the sequencing center :frowning:

In the end I couldn't demultiplex it with QIIME 2, so I used standalone cutadapt instead (not the one installed with QIIME 2).

Thanks for your help!
