Specifying how many threads to run on with qiime2 feature-classifier extract-reads

Hello!

I’ve been trying to run feature-classifier extract-reads so that I can train a classifier on some COI reference sequences. The problem is that the job ran for over a week and still had not finished. I noticed that with some other QIIME 2 commands people add an extra parameter (--p-n-threads) to specify how many threads should be used. However, when I try that with extract-reads in my current version (qiime2-2019.1), I get an error stating: no such option: --p-n-threads.

Is this option no longer available in my version? How can I work around this?

Hello Andrea,

Great question!

When I look at the documentation for feature-classifier extract-reads, I can see that this particular action does not have a --p-n-threads parameter (you didn’t miss it!), so I guess it doesn’t support multithreading. :woman_shrugging:

Maybe you could parallelize this yourself by dividing your FeatureData[Sequence] into several parts, then running extract-reads on all of those parts at the same time?

Colin

And then make sure to merge again!

That said, I suspect the reason this doesn’t have an n-threads option is that it’s mostly IO-bound, and adding CPUs doesn’t make it read from the hard drive any faster. So I don’t know if splitting and merging is worth the effort here.
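
For anyone who does want to try the split-and-merge route, here is a minimal sketch of the splitting step only. It assumes the reference sequences have already been exported from the artifact to a plain FASTA file; the function names (`read_fasta`, `split_fasta`) and the `part_N.fasta` file names are made up for illustration and are not part of QIIME 2. Each part would then need to be imported again as FeatureData[Sequence], run through extract-reads as its own process, and the resulting artifacts merged afterwards.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, seq = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield header, "".join(seq)
                header, seq = line, []
            elif line:
                seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def split_fasta(path, n_parts, out_dir):
    """Distribute records round-robin into n_parts FASTA files.

    Returns the list of output paths (hypothetical names part_0.fasta, ...).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = [out_dir / f"part_{i}.fasta" for i in range(n_parts)]
    handles = [open(p, "w") for p in paths]
    try:
        for i, (header, seq) in enumerate(read_fasta(path)):
            handles[i % n_parts].write(f"{header}\n{seq}\n")
    finally:
        for handle in handles:
            handle.close()
    return paths
```

Round-robin assignment keeps the parts roughly equal in record count, which matters because the slowest part determines the total running time. Choosing one part per available CPU core is a reasonable starting point.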

I tried to confirm whether or not this operation is IO-bound, to the best of my ability.

Running
> sudo iotop
shows no significant disk activity, while
> top
shows that nearly 100% of the CPU is being used. I’m taking this as an indication that it is CPU-bound. Is there somewhere I can confirm this in the documentation before I try splitting and then later merging my files?
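
One rough way to cross-check the `top` reading is to compare the CPU time a command consumes against the wall-clock time it takes. The sketch below is only an illustration: `cpu_fraction` is a made-up helper name, it relies on the child-process fields of `os.times()` (which are only populated on Unix-like systems), and it is best tried on a small subset of the data rather than the full job.

```python
import os
import subprocess
import time


def cpu_fraction(cmd):
    """Run cmd and return child CPU seconds divided by wall-clock seconds.

    A value near 1 suggests a single-threaded, CPU-bound process; a value
    well below 1 suggests the process spent most of its time waiting
    (on disk, network, or something else).
    """
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # waits for the child to finish
    wall = time.perf_counter() - start
    after = os.times()
    # User plus system CPU time accumulated by waited-for child processes.
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    return cpu / wall
```

The command is passed as a list of arguments, as with `subprocess.run`. A multithreaded command can return a value above 1, so the number is a hint about where time goes rather than a precise measurement.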

I think this is solid evidence that this step is CPU-bound. Good detective work! :+1:

Estimating bottlenecks is hard, which is why we don’t usually mention how much RAM, CPU, or IO is needed for a specific step. So your first-hand observation is better than the best documentation :+1:

I created an issue for this. No real timeline whatsoever, but it looks like something we should be able to do.
