demux with very few reads

Hello,

I have a dataset I ran with qiime2 in April 2018. When I ran demux, it went fine and I got 78,000 mean reads/sample. I am going back over that dataset and ran demux today and got an average of 2,000 reads/sample for the same dataset. I am trying to figure out where I went wrong.

I have the command I ran in April 2018 (qiime2-2018.2):

qiime demux emp-paired \
  --m-barcodes-file /Volumes/Franklin/HB_MTC/MiSeq01/MiSeq01_manifest.tsv \
  --m-barcodes-column BarcodeSequence \
  --i-seqs /Volumes/Franklin/HB_MTC/MiSeq01/qiime_import/MiSeq01.qza \
  --output-dir /Volumes/Franklin/HB_MTC/MiSeq01/demux

And here is the one I ran today (current version, 2019.4):

qiime demux emp-paired \
  --m-barcodes-file MiSeq01.tsv \
  --m-barcodes-column BarcodeSequence \
  --p-rev-comp-mapping-barcodes \
  --i-seqs 01.qza \
  --o-per-sample-sequences demux.qza \
  --o-error-correction-details demux-details.qza

I tried removing --p-rev-comp-mapping-barcodes, but that causes a plugin error (No sequences were mapped to samples. Check that your barcodes are in the correct orientation). My tsv was checked with Keemei and deemed good. I have a 2.48 GB .qza input file. I also tried disabling the built-in Golay error correction to see if that was the difference, and I got fewer reads.
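
Since the error message points at barcode orientation, one quick sanity check outside QIIME 2 is to reverse-complement a barcode from the mapping file by hand and see which orientation actually shows up in the index reads. A minimal sketch, assuming a POSIX shell with awk and tr; `revcomp` is a hypothetical helper (not a QIIME 2 command) and the barcode below is only an illustrative value, not one from this dataset:

```shell
# Hypothetical helper: print the reverse complement of a DNA barcode.
# Not part of QIIME 2; it only uses awk (to reverse) and tr (to complement).
revcomp() {
  printf '%s\n' "$1" \
    | awk '{ out = ""; for (i = length($0); i > 0; i--) out = out substr($0, i, 1); print out }' \
    | tr 'ACGTacgt' 'TGCAtgca'
}

# Illustrative barcode only.
revcomp AGCCTTCGTCGC   # prints GCGACGAAGGCT
```

Searching the barcode reads for both the original and the reverse-complemented string would show which orientation the run actually contains.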

Any thoughts/suggestions?

Hi @hbussan,

That is very strange. I would probably start by verifying we are using the same source data. You should be able to use the file named MiSeq01.qza in place of 01.qza (if that still exists). I would expect to see the same results (if not more as a result of Golay correction).

Otherwise, we may be able to recover some file checksums from the provenance of any downstream visualization or artifact from your 2018 analysis and compare that to 01.qza.

So I ran the 2019 demux on the MiSeq01.qza from 2018 and got 2,000 reads/sample again. Has this been a similar problem for others?

I am using Earth Microbiome V4 primers, and I made the import file using forward, reverse, and index reads, so I used emp-paired. The import artifact I made is exactly the same size as the previous one. I have plenty of processors and memory, and I have my activity monitor up to make sure there’s no pressure. Is there a timeout feature in demux that will stop it after a certain amount of time?

@hbussan, I’ve been demuxing paired-end MiSeq V4 data on 2018.8, 2019.1, and 2019.4 recently with no issues, but I’m just one data point. :slight_smile:

Have you had any luck verifying your source data, as @ebolyen suggested? You can use qiime tools peek or look at the peek or provenance tabs of your artifacts in qiime2view to confirm that the UUIDs of your data files are the same. This may not be the issue, but it’s a good place to begin troubleshooting.
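
The UUID comparison suggested above can be sketched on the command line as follows. This assumes that `qiime tools peek` prints a line of the form `UUID: <value>`; `peek_uuid` is a hypothetical helper that only filters that output, and the file names are the ones mentioned in this thread:

```shell
# Hypothetical helper: read `qiime tools peek` output on stdin and print the UUID.
# Assumes the output contains a line whose first field is "UUID:".
peek_uuid() {
  awk '$1 == "UUID:" { print $2 }'
}

# Usage (needs an activated QIIME 2 environment):
#   qiime tools peek MiSeq01.qza | peek_uuid
#   qiime tools peek 01.qza | peek_uuid
```

If both commands print the same UUID, the two files are the same artifact; different UUIDs only mean they were created separately, not that the underlying reads differ.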

demux doesn’t have a built-in timeout. How long is the command running for you? (You can also check this in the provenance tab at qiime2view).

@ChrisKeefe Thanks for the reply! I tried running demux on the 2018 import artifact and got the same result as when I ran the 2019 import with 2019.4. I am currently going back to different versions and might just try a re-install. I’ll check out the tabs you mentioned!

@ebolyen I confirmed the import artifacts are the same. I also reinstalled. I re-ran both using 2018.2 and got 78,000 reads/sample again on average. Any further suggestions?

I have another update - I re-ran the 2018.2 data. When I ran it like this:

qiime demux emp-paired --m-barcodes-file MiSeq01_2019/MiSeq01_manifest.tsv --m-barcodes-column BarcodeSequence --i-seqs MiSeq01_2019/MiSeq01_import.qza --output-dir MiSeq01_2019/demux

I get a 572 MB .qza file with 78,000 reads/sample.
When I run this (specifying --o-per-sample-sequences rather than --output-dir):

qiime demux emp-paired --m-barcodes-file MiSeq01_2019/MiSeq01_manifest.tsv --m-barcodes-column BarcodeSequence --i-seqs MiSeq01_2019/MiSeq01_import.qza --o-per-sample-sequences MiSeq01_2019/demux

I get a 53 KB file with only three samples demultiplexed, with 1-50 reads each.

It's also weird to me that when I run with reverse-complemented mapping barcodes on 2018.2 I get no mapping, but on 2019.4 I must run with reverse-complemented mapping barcodes, otherwise I get zero reads.

:woman_shrugging:

I seem to be having the same problem. I ran demux emp-paired under 2019.4 against a reads.qza I created with 2018.11 and got a ‘No sequences were mapped to samples’ error. I regenerated the reads.qza in 2019.4 (which worked fine) and the demux failed again. I then switched back to 2018.11, and demux worked fine against both the original reads.qza and the one I created under 2019.4. I am using the same metadata file in all cases, and the only difference in the demux command is including --o-error-correction-details under 2019.4.