demux with very few reads

hbussan · June 10, 2019, 5:31pm

Hello,

I have a dataset I ran with qiime2 in April 2018. When I ran demux, it went fine and I got 78,000 mean reads/sample. I am going back over that dataset and ran demux today and got an average of 2,000 reads/sample for the same dataset. I am trying to figure out where I went wrong.

I have the command I ran in April 2018 (qiime2-2018.2):

qiime demux emp-paired \
  --m-barcodes-file /Volumes/Franklin/HB_MTC/MiSeq01/MiSeq01_manifest.tsv \
  --m-barcodes-column BarcodeSequence \
  --i-seqs /Volumes/Franklin/HB_MTC/MiSeq01/qiime_import/MiSeq01.qza \
  --output-dir /Volumes/Franklin/HB_MTC/MiSeq01/demux

And here is the one I ran today (current version, 2019.4):

qiime demux emp-paired \
  --m-barcodes-file MiSeq01.tsv \
  --m-barcodes-column BarcodeSequence \
  --p-rev-comp-mapping-barcodes \
  --i-seqs 01.qza \
  --o-per-sample-sequences demux.qza \
  --o-error-correction-details demux-details.qza

I tried removing --p-rev-comp-mapping-barcodes, but that causes a plugin error (No sequences were mapped to samples. Check that your barcodes are in the correct orientation). My TSV was validated with Keemei and passed. I have a 2.48 GB .qza input file. I also tried disabling the default Golay error correction to see if that was the difference, and I got even fewer reads.

Any thoughts/suggestions?

ebolyen · June 10, 2019, 8:32pm

Hi @hbussan,

That is very strange. I would probably start by verifying that we are using the same source data. You should be able to use the file named MiSeq01.qza in place of 01.qza (if that still exists). I would expect to see the same results, if not more reads as a result of Golay correction.

Otherwise, we may be able to recover some file checksums from the provenance of any downstream visualization or artifact from your 2018 analysis and compare that to 01.qza.
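
As a rough illustration of the checksum comparison, here is a minimal Python sketch that computes a file's MD5 digest in chunks, so a multi-gigabyte file does not need to fit in memory. It assumes the provenance stores MD5 digests, and which files those digests cover is something to confirm in the provenance itself. The function name `md5_of_file` is hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with the same digest are, for practical purposes, byte-identical; differing digests only tell you that something changed, not what.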

hbussan · June 11, 2019, 4:35pm

So I ran the 2019 demux on the MiSeq01.qza from 2018 and got 2,000 reads/sample again. Has this been a similar problem for others?

I am using Earth Microbiome V4 primers and I made the import file using forward, reverse, and index so I used emp-paired. The import artifact I made is the exact same size as the previous. I have plenty of processors and memory, and I have my activity monitor up to make sure there's no pressure. Is there a time out feature on demux that will stop it after a certain amount of time?

ChrisKeefe · June 11, 2019, 5:27pm

@hbussan, I've been demuxing paired-end MiSeq V4 data on 2018.8, 2019.1, and 2019.4 recently with no issues, but I'm just one data point.

Have you had any luck verifying your source data, as @ebolyen suggested? You can use qiime tools peek or look at the peek or provenance tabs of your artifacts in qiime2view to confirm that the UUIDs of your data files are the same. This may not be the issue, but it's a good place to begin troubleshooting.
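
For a scripted version of the UUID check, here is a small Python sketch. It relies on a .qza/.qzv being a zip archive whose contents sit under a single top-level directory named after the artifact's UUID. The function name `artifact_uuid` is hypothetical and not part of QIIME 2.

```python
import zipfile


def artifact_uuid(qza_path):
    """Return the single top-level directory name (the UUID) of a .qza/.qzv archive."""
    with zipfile.ZipFile(qza_path) as archive:
        # Every entry looks like "<uuid>/..."; collect the first path component.
        roots = {name.split("/", 1)[0] for name in archive.namelist()}
    if len(roots) != 1:
        raise ValueError(f"expected one top-level directory, found {sorted(roots)}")
    return roots.pop()
```

Calling it on both import artifacts and comparing the results shows whether they are the same artifact or merely files of the same size.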

demux doesn't have a built-in timeout. How long is the command running for you? (You can also check this in the provenance tab at qiime2view).

hbussan · June 11, 2019, 7:02pm

@ChrisKeefe Thanks for the reply! I tried running demux on the 2018 import artifact and got the same result as when I ran the 2019 import with 2019.4. I am currently going back to different versions and might just try a re-install. I'll check out the tabs you mentioned!

hbussan · June 11, 2019, 9:03pm

@ebolyen I confirmed the import artifacts are the same. I also reinstalled. I re-ran both using 2018.2 and got 78,000 reads/sample again on average. Any further suggestions?

hbussan · June 12, 2019, 6:32pm

I have another update - I re-ran the 2018.2 data. When I ran it like this:

qiime demux emp-paired --m-barcodes-file MiSeq01_2019/MiSeq01_manifest.tsv --m-barcodes-column BarcodeSequence --i-seqs MiSeq01_2019/MiSeq01_import.qza --output-dir MiSeq01_2019/demux

I get a 572 MB .qza file with 78,000 reads/sample.
When I run this (specifying --o-per-sample-sequences rather than --output-dir):

qiime demux emp-paired --m-barcodes-file MiSeq01_2019/MiSeq01_manifest.tsv --m-barcodes-column BarcodeSequence --i-seqs MiSeq01_2019/MiSeq01_import.qza --o-per-sample-sequences MiSeq01_2019/demux

I get a 53 KB file with only three samples demultiplexed, with 1-50 reads each.

It's also weird to me that when I run with reverse-complemented barcodes on 2018.2 I get no mapping, but on 2019.4 I must run with reverse-complemented barcodes, because otherwise I get zero reads.

Chris_Hemmerich · June 18, 2019, 7:02pm

I seem to be having the same problem. I ran demux emp-paired under 2019.4 against a reads.qza I created with 2018.11 and got a 'No sequences were mapped to samples' error. I regenerated the reads.qza in 2019.4 (which worked fine) and the demux failed again. I then switched back to 2018.11 and demux worked fine against both the original reads.qza and the one I created under 2019.4. I am using the same metadata file in all cases, and the only difference in the demux command is including --o-error-correction-details under 2019.4.

WrigleyS · June 25, 2019, 12:33am

I have been having a similar problem as well since upgrading to 2019.4. We pooled samples for a few studies with colleagues. I ran our data first, successfully demuxing on the 2019.1 version of Qiime2.

However, once I was done with our data, I upgraded to the 2019.4 version of Qiime2 and have not been able to demux since. I spent time troubleshooting via the forums, such as this post here that suggests the error could have been files being swapped such as barcodes and forward reads. However, I checked this and that is not the case.

Additionally, as a final troubleshooting step, I went back to my data that was previously successfully demuxed on 2019.1, using the same metadata file that was originally used as well as the same emp-paired-end-sequences file that successfully demuxed, and got the same error code below via demux on 2019.4.

I am trying to reinstall 2019.1 now to see if that solves the issue...

Mehrbod_Estaki · June 25, 2019, 6:53am

Hi @Chris_Hemmerich and @WrigleyS,
Is it possible that you are running demux with the new default of Golay barcode error correction when your barcodes are in fact not Golay barcodes? In other words, try demuxing with the --p-no-golay-error-correction parameter, which matches the behavior of version 2019.1, before Golay error correction was added in 2019.4.
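
As a quick sanity check on the barcode column, here is a hedged Python sketch that flags barcodes that are not 12 characters of A/C/G/T, the length used for EMP Golay barcodes. Passing this check is necessary but not sufficient: a 12-nt barcode is not automatically a valid Golay codeword. The function name `suspicious_barcodes` is hypothetical, and the default column name is taken from the commands earlier in this thread.

```python
import csv
import io


def suspicious_barcodes(tsv_text, column="BarcodeSequence", expected_length=12):
    """Return (sample_id, barcode) pairs whose barcode is not `expected_length` A/C/G/T letters."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    id_column = reader.fieldnames[0]  # first column holds the sample IDs
    flagged = []
    for row in reader:
        sample_id = row[id_column]
        if sample_id.startswith("#"):  # skip comment/directive rows such as #q2:types
            continue
        barcode = (row[column] or "").strip().upper()
        if len(barcode) != expected_length or set(barcode) - set("ACGT"):
            flagged.append((sample_id, barcode))
    return flagged
```

An empty result only means the barcodes look plausible; it says nothing about their orientation relative to the index reads.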

WrigleyS · June 25, 2019, 11:26pm

@Mehrbod_Estaki

The parameter you suggested worked! We are using Earth Microbiome Project primers with the Golay barcodes, so I'm not totally sure why this was occurring. However, for now I am just happy it worked!

Thank you for the guidance!

-Scott

Mehrbod_Estaki · June 26, 2019, 1:40am

Glad you got it working! I'm not exactly sure why it wasn't working, but stay tuned, as there may be some internal issues with the error-correction option that are currently under investigation.

Chris_Hemmerich · June 26, 2019, 4:36pm

Thanks, that fixed things for me as well!

ebolyen · July 1, 2019, 6:03pm

@hbussan does disabling Golay correction result in more reasonable read counts?

It's starting to sound like a lot of people have protocols that are almost, but not quite, entirely unlike EMP.

ChristianEdwardson · August 5, 2019, 2:51am

Jumping in kind of late here, but saw a potentially similar issue and resolution on another forum post and thought it might be helpful here.

You may want to try adding '--p-rev-comp-barcodes' to your command.
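
For anyone unsure what the reverse-complement options in this thread (--p-rev-comp-barcodes and --p-rev-comp-mapping-barcodes) change about a barcode, here is a minimal Python sketch of the operation itself. It is only an illustration, not QIIME 2's implementation, and the function name `reverse_complement` is hypothetical.

```python
# Translation table mapping each base to its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(barcode):
    """Complement every base, then reverse the string."""
    return barcode.translate(_COMPLEMENT)[::-1]


print(reverse_complement("AGCCTTCGTCGC"))  # prints GCGACGAAGGCT
```

If the barcodes in your metadata match the index reads only after this transformation, that points to an orientation mismatch rather than a problem with the reads.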
