Hi, I don't know if I should make a new post, but I've been continuing through the Moving Pictures tutorial and ran into a similar type of problem. This time I am running QIIME2 via my university's computer cluster. I've tried this command a few times, increasing the amount of computing power each time. The first time I tried to run this command, my computer showed the spinning blue circle for several hours before it crashed. This time, I requested and was allocated 120 GB on 8 cores for 12 hours. I get the same errors each time:
The only potential conflict I can think of is that I made the input files using QIIME 2 version 2019.4 on my Ubuntu app (as in the first post), and I am running this command on QIIME 2 version 2019.7 on my university's cluster. Could that be a problem?
Hello again Rachel,
Maybe! Some QIIME 2 plugins, like those that rely on scikit-learn, need matching versions to run.
But in this case, a quick search makes this look like a 'divide by zero' error.
Given that ANCOM is all about ratios, dividing by zero is possible, but I would expect that to be caught during testing. I wonder what else could cause this error.
Could you post the full error message?
That is the full error. That error showed up within a minute after I entered the command. The command was still going after the 12 hours of time I requested, but it didn’t get any farther.
Thanks for that important detail. The command is still running and you need to wait to see the output… what you received was just a warning, not an error. The good news is that you will get an output. The bad news is that you need to wait a bit longer (ANCOM can be slow; filter out low-abundance features before running it unless you really care about them). The news that is somewhere between good and bad is that you saw this warning and need to decide what to make of it.
What to make of it: don’t panic! I manually traced where this warning is coming from: it is the F-oneway test that ANCOM performs under the hood to determine whether the log ratio of two features is greater or lesser in one group than another. The F-oneway test divides the mean squared error (MSE) between treatments by the MSE within treatments. For that log ratio, one or both of these is 0! I.e., no variance, almost certainly all zeroes. So the result will be non-significant… so not a data integrity issue, just a friendly warning.
Another good reason to filter out low-abundance features before running this test!
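To make the division concrete, here is a minimal sketch of a one-way F statistic in plain Python. This is not the code that ANCOM or its statistics library actually runs; `f_oneway_sketch` is a hypothetical helper written only to show where the zero comes from when every value in every group is identical.

```python
import math
from statistics import mean

def f_oneway_sketch(*groups):
    """Hand-rolled one-way F statistic: MS_between / MS_within.

    Illustrative only. A real implementation would perform the division
    and emit a warning; here the undefined cases are handled explicitly.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    if ms_within == 0:
        # No variance within groups: the ratio is x/0 (or 0/0).
        return math.inf if ms_between > 0 else math.nan
    return ms_between / ms_within

# Two groups whose log ratios are all identical: 0 / 0, so no usable F value.
print(f_oneway_sketch([0.0, 0.0, 0.0], [0.0, 0.0, 0.0]))
# Two groups with real spread give an ordinary F value.
print(f_oneway_sketch([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

The first call is the situation described above: nothing varies, so there is nothing to test, and the warning is harmless for that particular log ratio.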
Do you have an idea of how many hours I should give it with 120 GB of computing power? Should I just set it as a task for 3 days? Unfortunately I am interested in low abundance features.
hmm… I am not sure there is an easy way to estimate runtimes and memory demands, since it will depend on a variety of factors.
I would just recommend letting it cook for a few days and pray to the QIIME 2 gods. I have never seen ANCOM take more than a day or so on large-ish datasets but maybe you have an enormous number of features?
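As a rough illustration of why feature count matters so much: assuming ANCOM tests the log ratio of every pair of features (which is my understanding of the method, but treat it as an assumption here), the number of tests grows roughly with the square of the feature count. The helper name below is made up for this sketch.

```python
def n_pairwise_log_ratios(n_features):
    """Number of unordered feature pairs, i.e. n choose 2."""
    return n_features * (n_features - 1) // 2

for n in (100, 1_000, 10_000):
    print(f"{n:>6} features -> {n_pairwise_log_ratios(n):>10} pairwise log ratios")
```

So cutting the feature count in half should reduce the number of pairwise tests to about a quarter, which is why filtering tends to pay off quickly.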
You could also sort out other ways to narrow down your feature count. If you are interested in rare features, you could filter out the abundant ones! Or, even if you want everything, at least filter out features not found in at least N samples (`qiime feature-table filter-features` with `--p-min-samples` will do that for you). Or filter out low-variance features (there is no function to do that in QIIME 2).
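Since there is no built-in low-variance filter, here is a minimal sketch of the idea in plain Python, combined with a prevalence (minimum samples) filter. Everything here is hypothetical: the `filter_features` helper, the toy table, and the choice to compute variance on raw counts are all assumptions for illustration, and you would need to export your feature table and adapt this to your own data.

```python
from statistics import pvariance

def filter_features(table, min_samples=1, min_variance=0.0):
    """Keep features seen in >= min_samples samples with variance > min_variance.

    `table` maps a feature ID to its list of per-sample counts (toy format,
    not a QIIME 2 artifact).
    """
    kept = {}
    for feature_id, counts in table.items():
        prevalence = sum(1 for c in counts if c > 0)
        if prevalence >= min_samples and pvariance(counts) > min_variance:
            kept[feature_id] = counts
    return kept

# Toy table: three features across four samples.
table = {
    "featA": [0, 0, 0, 7],   # present in only one sample
    "featB": [5, 5, 5, 5],   # present everywhere, but zero variance
    "featC": [1, 9, 0, 4],   # present in three samples, varies
}

filtered = filter_features(table, min_samples=2, min_variance=0.0)
print(sorted(filtered))
```

In this toy case only `featC` survives: `featA` fails the prevalence cutoff and `featB` has no variance at all.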
You could also try another method: q2-sample-classifier will find features (rare or not) that differentiate sample types. You could perform “feature selection” with q2-sample-classifier’s classify-samples or regress-samples commands (depending on whether you are differentiating samples based on a categorical or numeric metadata column), then filter out features that do not exceed some importance threshold. But to heed this advice is to open up a really big (if extraordinarily rewarding) can of worms! Performing feature selection may break some of the statistical assumptions of ANCOM so proceed at your own risk (or ask the ANCOM developers).
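The "importance threshold" step can be sketched like this. The feature IDs, scores, and cutoff below are invented for illustration; in practice you would take the importance scores produced by q2-sample-classifier and pick a threshold that makes sense for your data.

```python
def select_features(importances, threshold):
    """Return the IDs of features whose importance exceeds the threshold."""
    return {fid for fid, score in importances.items() if score > threshold}

# Hypothetical importance scores, not real output.
importances = {"featA": 0.40, "featB": 0.02, "featC": 0.15}

keep = select_features(importances, threshold=0.10)
print(sorted(keep))
```

The resulting ID list is what you would then use to filter the feature table before running ANCOM, with the statistical caveats mentioned above.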