Installing QIIME2 *without* Conda

(Maxime Boissonneault) #1

Hi,
How can one install QIIME 2 without Anaconda, using plain Python and pip?

Thanks

(Colin J Brislawn) #2

Hello Maxime! Welcome to the QIIME 2 forums!

You should be able to install all these requirements using any method, including pip. All the dependencies are listed here:
https://raw.githubusercontent.com/qiime2/environment-files/master/2019.4/release/qiime2-2019.4-py36-linux-conda.yml
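To see what that file would actually require, here is a minimal sketch that splits a conda environment file into its conda-level specs and its pip-level specs. It assumes the usual exported layout (a top-level `dependencies:` list with exact `name=version=build` pins and an optional nested `pip:` list). The function name `parse_conda_dependencies` and the sample text are hypothetical and are not copied from the real QIIME 2 file.

```python
def parse_conda_dependencies(yaml_text):
    """Return (conda_specs, pip_specs) from the text of a conda environment file.

    conda_specs is a list of (name, version) tuples; pip_specs is a list of
    raw requirement strings. This is a line-based sketch, not a YAML parser.
    """
    conda_specs, pip_specs = [], []
    in_deps = False      # are we inside the top-level "dependencies:" list?
    in_pip = False       # are we inside the nested "- pip:" list?
    pip_indent = 0       # indentation of the "- pip:" entry itself

    for raw in yaml_text.splitlines():
        line = raw.rstrip()
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # A line that starts in column 0 and is not a list item is a top-level key.
        if not line.startswith((" ", "-")):
            in_deps = stripped == "dependencies:"
            in_pip = False
            continue
        if not in_deps or not stripped.startswith("- "):
            continue

        indent = len(line) - len(line.lstrip())
        item = stripped[2:].strip()

        # Items indented deeper than "- pip:" belong to pip.
        if in_pip and indent > pip_indent:
            pip_specs.append(item)
            continue
        in_pip = False

        if item == "pip:":
            in_pip = True
            pip_indent = indent
            continue

        # Assumes exact pins of the form name=version=build.
        parts = item.split("=")
        name = parts[0]
        version = parts[1] if len(parts) > 1 else None
        conda_specs.append((name, version))

    return conda_specs, pip_specs


# Hypothetical sample, for illustration only.
SAMPLE = """\
channels:
- conda-forge
- defaults
dependencies:
- python=3.6.7=example_0
- q2cli=2019.4.0=py36_0
- pip:
  - example-pip-only-package==1.0
"""

conda_specs, pip_specs = parse_conda_dependencies(SAMPLE)
for name, version in conda_specs:
    print("conda:", name, version)
for spec in pip_specs:
    print("pip:  ", spec)
```

Each conda-level entry still has to be checked by hand: the file does not say whether a given package is a Python package or a compiled tool.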

I guess the other question I want to ask is, why not use conda? It’s faster than pip and lighter than Docker!

Colin

(Maxime Boissonneault) #3

Hi Colin,
Those are conda dependencies, some of which are likely not Python packages. Why don’t you host a package on PyPI and allow pip install?

pip install qiime won’t even find version 2. Is there a different package name to use?

Or is the source code on GitHub with a setup.py?

There are many reasons not to use Conda. In short, Conda works fine on a personal computer, but it does not work well on an HPC cluster, which is where our users want to run. In fact, Conda causes all sorts of issues on an HPC cluster, so much so that we explicitly recommend that our users not use it.

(Matthew Ryan Dillon) #4

Hi @Maxime_Boissonneault!

We don’t distribute python packages for the QIIME 2 ecosystem, because many components of QIIME 2 are non-python, and thus can’t be distributed via pip. We use conda because it is agnostic to the software in question.

As far as @colinbrislawn’s recommendation above, I would personally not recommend going that route, for the reasons I just mentioned — you will need to install all the other dependencies listed in that environment file, many of which are not python packages at all.

Hmm, that has not been our experience. Perhaps you can provide a concrete example? It sounds like the deployment of conda on that HPC cluster may have some atypical issues.

(Maxime Boissonneault) #5

Hi @thermokarst,
Mmm, where to start…

  1. conda distributes binaries that are not necessarily optimized for our CPU architecture, hence wasting CPU cycles
  2. conda distributes binaries that are not necessarily compatible with our software infrastructure, i.e. they fail to find some libraries because those are not installed in standard locations
  3. conda installs everything in the user’s home directory, putting a lot of stress on our filesystem (and slowing down the whole cluster)
  4. conda is designed around a per-user installation, meaning that each of our ~1000 Python users would need their own installation if we were to recommend conda
  5. conda modifies the user’s .bashrc

There are plenty of Python packages that use non-Python components and are still distributed as Python packages: h5py, mpi4py, tensorflow, pytorch, basemap, Cartopy, PyQt, pyproj, netCDF4, to name just a few. We provide Python wheels specially tailored to use the underlying libraries that we have built and optimized for our clusters.
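As a rough illustration of how a wheel signals that it carries compiled components, here is a small sketch that reads the tags out of a wheel filename, following the naming scheme in PEP 427 (`name-version[-build]-python-abi-platform.whl`). The helper names `parse_wheel_filename` and `is_pure_python` are hypothetical, and the filenames are only examples.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, py_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the python tag.
        name, version, build, py_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename layout: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": py_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_pure_python(info):
    """A wheel tagged abi 'none' and platform 'any' is not tied to a platform or ABI."""
    return info["abi"] == "none" and info["platform"] == "any"


# Example filenames, for illustration only.
for fname in (
    "h5py-2.9.0-cp36-cp36m-manylinux1_x86_64.whl",
    "six-1.12.0-py2.py3-none-any.whl",
):
    info = parse_wheel_filename(fname)
    kind = "pure Python" if is_pure_python(info) else "platform-specific"
    print(info["name"], info["version"], "->", kind, "(" + info["platform"] + ")")
```

A platform-specific tag shows that pip can deliver compiled code, which is what makes site-built wheels linked against a cluster's own libraries possible.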