What AWS EC2 instance size is best for teaching QIIME 2?


(Patrick O) #1

Hello, I am a grad student TA planning to teach QIIME 2 to a couple of small groups in the fall. I thought a good way to do this would be to set up an AWS EC2 instance that the students could log into and run the tutorials on (their personal PCs often run Windows, so teaching the Linux command line usually requires a remote machine anyway).

From what I can tell, the EC2 free tier is simply not enough to run QIIME 2 (please correct me if I’m wrong here!). I’d like to ask the department to pay for this resource, but I’d like to keep it small enough to just run the basics. What EC2 instance type do you think I’ll need to run this? How much storage space do you put on your instances that run QIIME 2?

Sorry if this is a really newbie question. I just really want to make this work!

- Patrick

(Matthew Ryan Dillon) #3

I can’t really give you specific guidance, since this depends on so many factors (how many students? how many accounts per machine? what dataset will you run? what workflows will be run?), but I can give you some general guidelines based on what the instructors use when teaching QIIME 2 workshops.

We typically deploy one m4.2xlarge for every 7ish students in attendance (we run small ad hoc clusters when we teach open-enrollment workshops). By “ad hoc,” I mean that these clusters are short-lived: we spin them up the day before a workshop and tear them down at its conclusion.

Now, with that said, the m4.2xlarge is probably overkill; we just don’t want to be in a situation where 75 workshop attendees are left in the lurch because we opted for a smaller EC2 instance. The m4.2xlarge has 8 vCPUs and 32 GB RAM, and we generally shoot for one CPU per tenant on the machine. Keep in mind that this isn’t practical when using a “real” dataset: we teach the QIIME 2 workshops with a “lite” dataset that has been trimmed and crafted to go easy on resources (in the past we have typically used the Moving Pictures tutorial dataset).
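
To make the sizing arithmetic concrete, here is a rough sketch in Python. It only encodes the figures mentioned above (8 vCPUs and 32 GB RAM per m4.2xlarge, roughly 7 students per machine); the function names are made up for illustration, and the packing ratio is a rule of thumb for a “lite” tutorial dataset, not a guarantee for real data.

```python
import math

# Figures quoted in this post for an m4.2xlarge.
VCPUS_PER_INSTANCE = 8
RAM_GB_PER_INSTANCE = 32
# "7ish" students per machine: one vCPU per student, with one left over.
STUDENTS_PER_INSTANCE = 7


def instances_needed(num_students, students_per_instance=STUDENTS_PER_INSTANCE):
    """Return how many instances cover num_students at the given packing."""
    if num_students <= 0:
        return 0
    return math.ceil(num_students / students_per_instance)


def ram_per_student_gb(students_per_instance=STUDENTS_PER_INSTANCE):
    """Return the RAM share per student if memory is split evenly."""
    return RAM_GB_PER_INSTANCE / students_per_instance


if __name__ == "__main__":
    # Hypothetical class sizes; 75 is the workshop size mentioned above.
    for n in (5, 12, 75):
        print(f"{n} students -> {instances_needed(n)} instance(s)")
    print(f"about {ram_per_student_gb():.1f} GB RAM per student")
```

For a couple of small groups this works out to one or two instances, and a smaller instance type may well be enough if the students stick to the tutorial dataset.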

Hope that helps! :qiime2:
