New to Qiime and have questions on installing and setting up Qiime on Ubuntu 16.04 in vSphere v6.7

qiime2_sys_admin · October 17, 2018, 3:15am

I originally posted this on the Qiime 1 forum and was told to post my questions here. Sorry for any confusion.

Hello, I am a Linux system administrator at a research facility, where a few doctors, scientists, and researchers have asked me to set up Qiime on Ubuntu 16.04 in VMware vSphere. They want two Ubuntu VMs, one with Qiime v1 and the other with Qiime v2. Most of them want to do their work on the Qiime v1 VM, since they are very familiar with that version, and use the other VM to start learning Qiime v2.

I don’t know much about the software. I’ve been reading this Google Group along with the website, and I have a few questions in the hope of getting the setup right.

When I build, I like to optimize the VM to the best of my ability. It looks like I can allocate one data disk in vSphere to host the Ubuntu VM. Will this be OK? The people I’m supporting are asking for 1 TB VMs, as they are processing very large samples (I think this is the correct term).
I will use paravirtual SCSI adapters for connections, unless others have other recommendations.
I want to set up multiple partitions in the Ubuntu VM, such as /var, /var/tmp, /tmp, /opt, /home, and /usr/local. Will this cause an issue, or are there recommended partitions for working with Qiime?
It looks like Qiime doesn’t use a daemon, and it reminds me of Python’s pip: from a Bash shell prompt, the end user types “qiime”, the prompt changes to show “qiime”, and they can start working with the software. From shadowing these users, it looks like they do most of their work with the software from their home directories. Should I allocate more space to /home because of this?

Hope these questions are clear and thanks in advance.

Chris

ebolyen · October 17, 2018, 11:07pm

Hi Chris!

That sounds like a good approach!

That should be more than enough. Most amplicon datasets I've seen range from 3 GB to 20 GB. But as we're moving into a multi-omics world, that size will only go up.

(Regarding terminology: the samples themselves may be very large, but more commonly there are many samples, which makes the dataset very large.)

I can't say whether that is a good or bad idea w.r.t. vSphere, but guessing by the name it seems fine to me. Linux certainly should have no issue recognizing it.

None of that will be an issue, but it sounds a little complex for the inside of a VM. I suppose with that many partitions you will have to use LVM, so resizing the partitions won't be as nightmarish as it could otherwise be.

Something to keep in mind is that QIIME 2 does almost everything in the /tmp directory while it is processing/recording data, so make sure to allocate enough disk space there (for a single tenant, 75 GB should be fine; if multiple tenants share the same VM, you might want to scale that up). You can also configure QIIME 2 to use a different directory by setting the TMPDIR environment variable, but since this is a custom VM, it's probably easier to just set up /tmp with lots of room.
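
For a quick sanity check, something like the following sketch could report which temporary directory is in effect and how much room it has. It assumes QIIME 2 picks its temporary directory the same way Python's tempfile module does (which honors TMPDIR); the function name tmp_space_report and the 75 GiB default are only illustrative, not anything QIIME 2 ships.

```python
import shutil
import tempfile

GIB = 1024 ** 3  # bytes per GiB


def tmp_space_report(min_free_gib=75):
    """Return (tmp_dir, free_gib, ok) for the directory tempfile will use.

    Hypothetical helper: the 75 GiB default mirrors the single-tenant
    figure mentioned above and should be adjusted for shared VMs.
    """
    # gettempdir() checks TMPDIR (then TEMP, TMP) before falling back to /tmp.
    tmp_dir = tempfile.gettempdir()
    free_gib = shutil.disk_usage(tmp_dir).free / GIB
    return tmp_dir, free_gib, free_gib >= min_free_gib


if __name__ == "__main__":
    tmp_dir, free_gib, ok = tmp_space_report()
    status = "OK" if ok else "LOW"
    print(f"{tmp_dir}: {free_gib:.1f} GiB free [{status}]")
```

Running it once normally and once with TMPDIR pointed at another mount (for example a larger scratch volume) should show whether the override is being picked up.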

As an aside, if you ever intend to map a physical device inside the VM, /tmp is probably the one worth mapping, as that would make it much faster.

That is correct. QIIME 2 has a single command-line interface that accepts subcommands, instead of lots of little commands (like QIIME 1). It does not have a daemon; it only runs when invoked (and it blocks the shell while it runs).
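
To illustrate the "no daemon, blocks while it runs" point: from a wrapper script's perspective, qiime is just an ordinary child process that you launch and wait on. This is a minimal sketch, assuming qiime is on the PATH of an activated environment; run_qiime is a hypothetical helper, and `qiime info` is used only because it is a harmless subcommand.

```python
import shutil
import subprocess


def run_qiime(*args):
    """Run `qiime <args>` and wait for it to finish.

    Returns the exit code, or None if no `qiime` executable is on PATH.
    Hypothetical helper for illustration only.
    """
    exe = shutil.which("qiime")
    if exe is None:
        return None
    # subprocess.run blocks until the command exits, just like the shell does.
    completed = subprocess.run([exe, *args])
    return completed.returncode


if __name__ == "__main__":
    code = run_qiime("info")
    if code is None:
        print("qiime not found on PATH")
    else:
        print(f"qiime exited with {code}")
```

Nothing needs to be started or stopped as a service; once the call returns, there is no QIIME process left running.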

Allocate most of your space to /home (while still paying attention to /tmp), and I think your users will be much happier.
