I am trying to assign taxonomy to a dataset generated with QIIME 1.9.1, following the above-mentioned protocol (using VSEARCH consensus taxonomy). It has been running for almost 24 hours and has not yet generated any output. I am using the VBox version of QIIME 2. Could anyone please suggest how much more time it might require, or whether there is a problem with the run?
That does seem like it is taking a while. How large is your dataset? (QIIME 1 has a tendency to create far more OTUs than QIIME 2, which has a better denoising process.)
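One quick way to answer the dataset-size question is to count the records in the representative-sequence FASTA file. The snippet below is a minimal sketch and not part of QIIME. The function name `count_fasta_records` and the file name `rep_set.fna` in the usage comment are assumptions for illustration.

```python
def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines (lines starting with '>')."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Example usage (hypothetical file name; substitute your own path):
#   print(count_fasta_records("rep_set.fna"))
```

If the count is in the tens of thousands or more, a long run time is less surprising.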
Since you are running this in VBox, I would also double-check that you have enough memory and CPU allocated to the virtual machine. If your VBox instance is starved for compute power, then anything will run slowly.
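To see what the virtual machine actually has available, you can check from inside the guest. This is a rough sketch that assumes a Linux guest exposing `/proc/meminfo`. The helper names `parse_mem_total_gib` and `describe_vm_resources` are made up for this example.

```python
import os


def parse_mem_total_gib(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text in GiB, or None if it is absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB (KiB)
            return kib / (1024 ** 2)
    return None


def describe_vm_resources(meminfo_path="/proc/meminfo"):
    """Summarize the CPUs and total memory visible inside the guest (assumes Linux)."""
    with open(meminfo_path) as handle:
        mem_gib = parse_mem_total_gib(handle.read())
    return {"cpus": os.cpu_count(), "mem_total_gib": mem_gib}


# Example usage, run inside the virtual machine:
#   print(describe_vm_resources())
```

If the reported values are much lower than what the host machine has, the allocation can be raised in the VirtualBox settings for the machine while it is powered off.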
Thanks @ebolyen. Unfortunately, @Aishiki’s results fit with our latest test results.
We are currently recommending that users avoid using classify-consensus-vsearch for more than tens of sequences.
Fortunately, classify-consensus-blast gives very similar accuracy to classify-consensus-vsearch, but ran 50 times faster in our tests. If run time is still an issue, classify-sklearn was 500 times faster in our tests. There is a tutorial on how to use classify-sklearn here.
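To get a feel for what those ratios mean in practice, here is a back-of-the-envelope sketch. It assumes the 50x and 500x speedups from our tests carry over unchanged to another dataset and machine, which is not guaranteed. The names `SPEEDUP_VS_VSEARCH` and `estimate_runtimes_hours` are invented for this example.

```python
# Speedups relative to classify-consensus-vsearch, as quoted in this post.
SPEEDUP_VS_VSEARCH = {
    "classify-consensus-blast": 50,
    "classify-sklearn": 500,
}


def estimate_runtimes_hours(vsearch_hours):
    """Scale an observed vsearch run time by each quoted speedup (a rough extrapolation)."""
    return {
        method: vsearch_hours / speedup
        for method, speedup in SPEEDUP_VS_VSEARCH.items()
    }


# Example usage with a 24-hour vsearch run:
#   for method, hours in estimate_runtimes_hours(24).items():
#       print(f"{method}: about {hours * 60:.0f} minutes")
```

Under that assumption, a 24-hour vsearch run would correspond to roughly half an hour with classify-consensus-blast and a few minutes with classify-sklearn.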
We haven’t had a chance to look at why classify-consensus-vsearch is so slow. @Nicholas_Bokulich may have more to say on this in future.
Thanks to both @BenKaehler and @ebolyen for your suggestions. Fortunately, the run finally finished after ~30 hours, although I have not yet been able to check the data. I will keep you updated once I have checked it. I shall also try running the analysis again with classify-consensus-blast.
The performance issue with classify-consensus-vsearch was investigated, and it turns out that vsearch is generally much slower with full-length sequences (see this issue for more details), so the performance is what would be expected. Thanks!
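To check whether this applies to your own data, it can help to look at the sequence lengths in a FASTA file. The sketch below is not part of QIIME or vsearch, and the helper names are assumptions for illustration. What counts as "full-length" is not defined in this thread, so interpret the numbers against your own marker gene.

```python
from statistics import median


def fasta_sequence_lengths(path):
    """Return the length of every sequence in a FASTA file, handling wrapped lines."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def summarize_lengths(lengths):
    """Return count, minimum, median, and maximum of a non-empty list of lengths."""
    if not lengths:
        raise ValueError("no sequences found")
    return {
        "n": len(lengths),
        "min": min(lengths),
        "median": median(lengths),
        "max": max(lengths),
    }


# Example usage (hypothetical file name):
#   print(summarize_lengths(fasta_sequence_lengths("rep_set.fna")))
```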