Demultiplexing results in way fewer samples than I sequenced

I’m using QIIME 2 version 2018.11.0 in an Oracle VM VirtualBox virtual machine.

I sequenced 89 samples using the EMP protocol on a MiSeq. I performed the extra steps required to convert the output to FASTQ as described here and here.

This resulted in 3 files: read 1, read 2 and index files. The read 1 and read 2 files are about 1.2 GB in size, and the index file is 188 MB. I then ran the following in QIIME 2:

qiime tools import \
  --type EMPPairedEndSequences \
  --input-path emp-paired-end-sequences-20191223 \
  --output-path emp-paired-end-sequences.qza

This generated the artifact emp-paired-end-sequences.qza.

qiime demux emp-paired \
  --i-seqs emp-paired-end-sequences.qza \
  --m-barcodes-file soil_sample_metadata-20191223.tsv \
  --m-barcodes-column barcode-sequence \
  --o-per-sample-sequences demux.qza

This generated the artifact demux.qza.

After this, I did

qiime tools export --input-path demux.qza --output-path demux.fastq

But this generated a demux.fastq folder with only 8 samples instead of 89! And most of these 8 files are only 360 bytes in size. I don’t know if I’m doing anything wrong here. I called Illumina, but they weren’t able to help.

I would really appreciate any help!

Hi @kmz,
Could you please run qiime demux summarize on the demux.qza file? This will be useful for inspecting the results (and sharing here so we can inspect).
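
For reference, a minimal sketch of that step. The helper name summarize_demux is made up for illustration; the qiime demux summarize call inside it uses the plugin's --i-data and --o-visualization options, and the output name is simply the input name with a .qzv extension.

```shell
# Hypothetical helper (not part of QIIME 2): summarize a demultiplexed
# artifact into a visualization next to it, e.g. demux.qza -> demux.qzv.
summarize_demux() {
  qiime demux summarize \
    --i-data "$1" \
    --o-visualization "${1%.qza}.qzv"
}
```

Running summarize_demux demux.qza should write demux.qzv, which can then be opened with qiime tools view demux.qzv or uploaded to view.qiime2.org.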

You should check out the barcode sequence reverse complementing options in demux emp-paired; I suspect your barcodes need to be reverse complemented in your sequence file, so you are losing sequences (and samples) during demux because they do not match the orientation in which the barcodes appear in your sample metadata file.

If your demux.qzv summary shows very few sequences in the 8 remaining samples, that would all but confirm that this is the issue.
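
The orientation can also be checked directly against the index reads. The sketch below is an illustration under stated assumptions, not part of QIIME 2: revcomp and count_orientations are hypothetical helper names, the index file is assumed to be a gzipped FASTQ with four lines per record, and only exact matches are counted (no error correction).

```shell
# Reverse-complement the DNA sequence given as the first argument.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Count index reads that exactly match a barcode as written and as its
# reverse complement. $1 = barcode from the metadata, $2 = gzipped index FASTQ.
count_orientations() {
  fwd=$1
  rc=$(revcomp "$1")
  # In a four-line FASTQ record, the sequence is the second line.
  gzip -dc "$2" | awk -v f="$fwd" -v r="$rc" '
    NR % 4 == 2 { if ($0 == f) nf++; if ($0 == r) nr++ }
    END { printf "as-is=%d revcomp=%d\n", nf, nr }'
}
```

To use it, call count_orientations with one barcode from the metadata file and the path to the index file (barcodes.fastq.gz in the import directory, if it follows the EMP import layout). If the revcomp count is far larger than the as-is count, the barcodes are most likely in the opposite orientation.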

Let us know if that solves this!

Hi @Nicholas_Bokulich

Thanks for your reply!

I did run summarize on the .qza file. The summary shows very few sequences as well.

But I did this analysis about a year ago on a different sequencing run with the EMP protocol, and did not have to provide the reverse complement of the barcodes. The analysis worked fine without that. Has something changed in Illumina or the QIIME analysis since then?

Some changes were made to the demux emp methods within the past year — not sure if those changes would have impacted the reverse complementing, but it is worth giving it a try. Based on your report it sounds like that is probably the issue.

I see. Is there a quick way to do it via qiime commands? Or do I have to make a new .tsv file with the reverse complements of the 89 barcodes?

Yes: please see the help documentation. It is possible to reverse-complement with a single option.
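
A sketch of what that could look like, reusing the file names from the original post. The wrapper name demux_revcomp is hypothetical. The thread does not say which option was used: --p-rev-comp-mapping-barcodes reverse-complements the barcodes from the metadata file, while --p-rev-comp-barcodes reverse-complements the barcode reads instead. Newer QIIME 2 releases may also require an error-correction-details output for this command, so check qiime demux emp-paired --help for your version.

```shell
# Hypothetical wrapper: repeat the demux step from the original post, but
# reverse-complement the metadata barcodes before matching.
# Assumption: swap in --p-rev-comp-barcodes if the barcode reads, rather
# than the metadata barcodes, are the ones that need flipping.
demux_revcomp() {
  qiime demux emp-paired \
    --i-seqs emp-paired-end-sequences.qza \
    --m-barcodes-file soil_sample_metadata-20191223.tsv \
    --m-barcodes-column barcode-sequence \
    --p-rev-comp-mapping-barcodes \
    --o-per-sample-sequences demux.qza
}
```

After running demux_revcomp, summarizing demux.qza again should show whether all 89 samples are now present.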

That worked! Thank you very much!
