I have 16S rRNA sequence data from around 450 stool samples, which comes to 900 files (forward and reverse reads) totalling 53 GB. I tried to run all the files on my iMac (i5, 4 cores, 16 GB RAM), but the run takes days and keeps going without finishing. I suspect it has hung or the command has timed out.
I think I will go with Amazon AWS or Google Cloud, but I don't know which one is better. Could you please advise what resources would be enough to run all the files in an acceptable time, such as a few hours or a day?
Please note that I used QIIME 1 for this failed run, and I'm still learning QIIME 2.
Amazon AWS and Google Cloud are both great options. We currently provide an official Amazon AWS EC2 Image, so if you are planning on instantiating some cloud resources specifically for this analysis, the official EC2 image will be your quickest route to having QIIME 2 installed.
As far as trying to learn QIIME 2, we suggest you start with our "Getting Started" guide, which presents a roadmap for getting geared up with QIIME 2!
I tried to start an instance with this image, but the most I can get is 8 cores with around 64 GiB RAM. Other instance types such as r4.4xlarge, r4.8xlarge and r4.16xlarge are disabled. This is the message that shows up:
Instance type is disabled.
To enable this instance type, return to the previous step and select an AMI that supports HVM virtualization.
I don't know if this is enough. I would prefer a larger instance, with at least 16 cores and 120 GiB RAM.
Hi @Faisal, because each dataset is highly unique, it is hard to say what will work "best" for you and your data. I think the 120 GiB RAM machine you mentioned above should be a good start for you to get moving on your analysis. If that proves to be too small, you can provision a larger EC2 instance and install QIIME 2 natively. I have opened an issue on our VM bug-tracker to set up QIIME 2 images on EBS-backed EC2 types, which would stop many of those instance types from showing up as disabled in the menu on EC2. Thanks!
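For anyone sizing an instance for a similar run, here is a minimal sketch of picking the smallest instance type that meets a core and memory floor. The `R4_INSTANCES` table and the `smallest_fit` helper are hypothetical names made up for illustration, and the vCPU/memory figures are the commonly published r4 specs, so please check them against the current EC2 documentation before relying on them.

```python
# Hypothetical catalog: instance name -> (vCPUs, RAM in GiB).
# Figures are assumed from the published r4 family specs; verify before use.
R4_INSTANCES = {
    "r4.2xlarge": (8, 61),
    "r4.4xlarge": (16, 122),
    "r4.8xlarge": (32, 244),
    "r4.16xlarge": (64, 488),
}


def smallest_fit(min_vcpus, min_ram_gib, catalog=R4_INSTANCES):
    """Return the smallest instance meeting both floors, or None if none does."""
    candidates = [
        (vcpus, ram, name)
        for name, (vcpus, ram) in catalog.items()
        if vcpus >= min_vcpus and ram >= min_ram_gib
    ]
    if not candidates:
        return None
    # Tuples sort by vCPUs first, then RAM, so min() gives the smallest match.
    return min(candidates)[2]


if __name__ == "__main__":
    # The floor asked for in this thread: at least 16 cores and 120 GiB RAM.
    print(smallest_fit(16, 120))
```

With the figures above, the 16 core / 120 GiB floor maps to r4.4xlarge. Whether that is actually enough for 53 GB of reads still depends on the dataset and the pipeline steps being run.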