It's QUITE large (11 GiB), so I had a lot of memory issues; I finally launched a c4.8xlarge instance. The problem is that even with the --verbose flag I don't know where the problem is. The job just stops running, and since it always runs for a looong time the ssh connection often closes, so I'm not always able to see the verbose printout. I have tried to access the logs but I can't find them!
Looks like you are encountering memory issues due to the large dataset (218 samples, with a lot of sequences in each sample!).
You could try decreasing the number of threads (2-4) and, if possible, increasing the allocated memory (if you use an HPC), or using a more powerful machine.
I don't think you'll be able to recover log files when the ssh connection is interrupted, and it's likely that the interruption of the ssh connection is causing the failure. As a next step, I recommend that you use tmux to allow your job to continue even if the ssh connection is interrupted. This will either allow the job to finish successfully, or allow you to re-connect to the server and see the full error message, which will include a path to the log file, if the job doesn't finish.
This post provides a good discussion of how to use tmux for this. It's likely that tmux is already installed on the system you're using, but if not you can install it or use screen.
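To make that concrete, here is a minimal sketch of the tmux workflow (the session name `qiime2` and the command shown inside it are just placeholders for your own long-running job):

```shell
# Start a named tmux session on the server
tmux new -s qiime2

# Inside the session, launch your long-running command, e.g.:
#   qiime ... --verbose 2>&1 | tee run.log
# Detach with Ctrl-b then d; the job keeps running on the server.

# Later, after a fresh ssh login, re-attach to the same session:
tmux attach -t qiime2
```

Piping the output through `tee` (as sketched in the comment) also leaves a copy of the verbose output in a file, so you can inspect it even after the session ends.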
Do you want to give this a try and let us know how it goes?
(Forum moderators: please correct me if I'm wrong and there is a reliable way to access the error log here.)
Hi, I was able to run it in the background and save the --verbose output to a file.
However, when moving on to feature-classifier I have a memory issue again. I know there's a long discussion about it, but I have tried everything and I can't find a way to run it with SILVA.
Thank you for replying. I will try that.
I was actually wondering whether I could just split my rep-seqs (which is actually a FASTA .fna file) into batches, predict the classification for each batch, and then merge the results. Do you think that would work the same as running the whole file? If I'm not training, just classifying, I don't see how splitting the file into batches would be different from running it all together.
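For what it's worth, the splitting step itself could be sketched like this (a minimal sketch: the filenames, the chunk size of 5000, and the classifier artifact name are all hypothetical, and the commented qiime invocations are only an outline — check the options against your QIIME 2 version):

```shell
# Split rep-seqs.fna into chunks of 5000 sequences each:
# chunk_0.fna, chunk_1.fna, ...
awk 'BEGIN{n=-1} /^>/{c++; if(c%5000==1){n++; f=sprintf("chunk_%d.fna", n)}} {print > f}' rep-seqs.fna

# Sketch of the per-chunk classify-and-merge loop (adjust to your setup):
# for f in chunk_*.fna; do
#   qiime tools import --type 'FeatureData[Sequence]' \
#     --input-path "$f" --output-path "${f%.fna}.qza"
#   qiime feature-classifier classify-sklearn \
#     --i-classifier silva-classifier.qza \
#     --i-reads "${f%.fna}.qza" \
#     --o-classification "${f%.fna}-taxonomy.qza"
# done
# qiime feature-table merge-taxa --i-data chunk_*-taxonomy.qza \
#   --o-merged-data taxonomy.qza
```

The awk one-liner starts a new output file every time the running count of `>` header lines crosses a multiple of the chunk size, so each chunk holds complete records regardless of how the sequence lines are wrapped.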