Hey guys, I’m just wondering if anyone else is experiencing a huge speed difference between 2018.11.0 and 2019.1. I’m running nearly the same code on the same data set, with the only big difference being the update from 2018.11.0 to 2019.1. I’m also working through VirtualBox, so there was the accompanying update from VirtualBox v5.2.12 to v5.2.26 to handle the QIIME update. I allocated 5.5 GB of base memory for both.
All the plugins corresponded to 2018.11 or to 2019.1 as well.
Ran in a few hours on 2018.11:

qiime deblur denoise-16S \
  --p-trim-length 292
Is going on 4 days on 2019.1 (still running):

qiime deblur denoise-16S \
  --p-trim-length 252
Data set info:
Sequencing of 16S rRNA from 48 samples produced 23,674,488 reads with a mean of 493,218.5 reads per sample (median 255,112.5; minimum 10,639; maximum 11,395,868). Following joining of paired reads with vsearch, we obtained 20,242,213 joined reads with a mean of 421,712.8 reads per sample (median 217,426.5; minimum 8,252; maximum 9,835,668).
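As a quick sanity check, the per-sample means reported above follow directly from the totals (a throwaway awk sketch; all numbers come from this post):

```shell
# Per-sample mean read counts from the totals above (48 samples).
awk 'BEGIN {
  printf "mean raw reads/sample:    %.1f\n", 23674488 / 48
  printf "mean joined reads/sample: %.1f\n", 20242213 / 48
}'
```

This reproduces the 493,218.5 and 421,712.8 means quoted above.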
Any thoughts on why the significant time difference?
Are you able to monitor htop and see what processes seem to be running? Behind the scenes, Deblur relies extensively on sortmerna, and Deblur itself is a Python program.
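If htop feels overwhelming, a plain ps snapshot can answer the same question. The process names below (deblur, sortmerna) are what I’d expect to see based on the above, but they may differ on your system:

```shell
# Top CPU consumers right now -- deblur's sortmerna workers should appear
# here if they are actually doing work rather than swapping.
ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -n 10

# Narrow it down to just the deblur/sortmerna processes, if any exist:
ps -eo pid,comm | grep -E 'deblur|sortmerna' || echo "no matching processes"
```

A process stuck near 0% CPU with high memory use would point toward swapping rather than computation.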
I’m not aware of any change to the codebase from 2018.11 to 2019.1 that would have any meaningful performance impact. It also isn’t exactly clear to me why reducing the trim length would have such a large bearing on runtime, so I am really curious what may be going on here.
Does Deblur (or at least the plugin) drop reads shorter than the trim length? That might explain the difference if the first run was operating on significantly less data (I’m assuming quality-filter was used beforehand to trim at some q-score threshold).
So it has finally finished running. No errors were thrown, and the Deblur workflow appears to have worked in near-identical fashion to the first run (taking into consideration the altered trim length). That being said, my largest sample by read count performed much better in terms of filtering and chimera identification through Deblur than in the first run, which was my initial motivation for rerunning (with 9.7 million raw reads I had a hard time believing it didn't find a single chimera).
Maybe the initial run was the erroneous one, and this new run produced the correct output and timeframe considering the data?
Just to answer the previous questions:

Dropped reads with lower trim length: the trim length appears to have made little or no change in the output. Mean, max, and min raw reads were identical in the deblur stats output.

Quality filtering was identical between the runs, using

qiime quality-filter q-score-joined \

(the same demux-joined-filtered.qza serving as the input for deblur)
I reran the workflow (and killed it early so I didn't have to wait again) so I could view htop while it was running. I'm not all that great at interpreting everything, but attached below is what I'm seeing while it runs. I've lowered the allocated memory to 4.0GB during this run, as multitasking with a browser on the host computer is mind-numbingly slow when 5.5GB is dedicated to the VirtualBox.
Yes, good call! Sorry for not catching that.
Unfortunately, the top / htop output would be more useful if it’s a long way into the processing.
One possible concern is that operating at 5.5GB may simply be too little memory, which could result in heavy swapping. Note that 20M reads at 250nt is approximately 4.8GB of memory. I don’t recall offhand whether this mode of execution attempts to read all the data into main memory, but that would be problematic here.
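That 4.8GB figure is just one byte per nucleotide across the joined reads, ignoring headers, quality scores, and per-object overhead (which would push the real number higher); the read count and trim length come from earlier in this thread:

```shell
# Lower-bound RAM estimate if all joined reads sit in memory at once:
# 20,242,213 joined reads x 252 nt trim length, one byte per base.
awk 'BEGIN { printf "%.1f GiB\n", 20242213 * 252 / 2^30 }'
```

This prints 4.8 GiB, consistent with the estimate above, and uncomfortably close to a 5.5GB VM allocation once the OS and Python overhead are added.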
I’ll give it a try with this lower memory allocation and see if that improves performance. It’s always a possibility that the host computer is the issue and it has nothing to do with the version update!
This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.