adonis error: 'non-zero exit status 1' because of 'Error in G * t(hat) : non-conformable arrays'

Hi everyone,

I am using QIIME 2 2019.7, natively installed on an iMac (macOS 10.12.6).

I have a dataset of samples from participants of varying ages from early childhood to late adolescence. I suspect that age may be a confounding factor in this dataset and so I am trying to run qiime diversity adonis with formulas that include age and the other metadata factors that came up significant in my initial beta diversity analyses.

The first commands I tried to do this look like (have edited a little bit for privacy):

for i in Factor1 Factor2 Factor3
do
  qiime diversity adonis \
    --i-distance-matrix "${i}_diversity_30000/unweighted_unifrac_distance_matrix.qza" \
    --m-metadata-file ../ProjectName_2015_map_MHD_20190903.txt \
    --p-formula "Age*${i}" \
    --o-visualization "${i}_diversity_30000/uwuf_Age_${i}_results.qzv"
done

In each case, the specific distance matrix being called had been constructed from a feature table previously filtered using qiime feature-table filter-samples to remove samples that had 'unknown' values for the metadata column being tested (there were no unknown values in Age). I did this individually for each metadata variable/column that had missing or unknown values to try to retain as many samples as possible for each individual analysis. That is, I removed samples that had 'unknown' values in Factor1 from the table, then generated a distance matrix from that table and used the distance matrix to test the significance of Factor1, then repeated the whole process for Factor2 and Factor3 and so on. I didn't remove all samples with unknown values in any column.

I got output that I think makes sense when the additional Factor being tested was categorical (e.g. a disease state), but got no output for either of the two continuous variables whose interaction with Age I was trying to test.

I re-ran the command for one of the continuous variables individually (i.e. not as part of a loop) and got this error message:

Plugin error from diversity:
Command '['run_adonis.R', '/var/folders/mb/vtxl0zps5435mr78csnrjpjr0000gr/T/tmpyequa9yw/dm.tsv', '/var/folders/mb/vtxl0zps5435mr78csnrjpjr0000gr/T/tmpyequa9yw/md.tsv', 'Age*Factor3', '999', '1', '/var/folders/mb/vtxl0zps5435mr78csnrjpjr0000gr/T/qiime2-temp-bjuwz0ri/adonis.tsv']' returned non-zero exit status 1.

Re-running with the --verbose flag (sorry for the wall of text):

Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: run_adonis.R /var/folders/mb/vtxl0zps5435mr78csnrjpjr0000gr/T/tmp4det8n0c/dm.tsv /var/folders/mb/vtxl0zps5435mr78csnrjpjr0000gr/T/tmp4det8n0c/md.tsv Age*Factor3 999 1 /var/folders/mb/vtxl0zps5435mr78csnrjpjr0000gr/T/qiime2-temp-fxbasdd3/adonis.tsv

R version 3.5.1 (2018-07-02)
Loading required package: permute
Loading required package: lattice
This is vegan 2.5-5
Error in G * t(hat) : non-conformable arrays
Calls: adonis -> sapply -> lapply -> FUN
Execution halted
Traceback (most recent call last):
  File "/Users/matildahd/miniconda3/envs/qiime2-2019.7/lib/python3.6/site-packages/q2cli/commands.py", line 327, in __call__
    results = action(**arguments)
  File "</Users/matildahd/miniconda3/envs/qiime2-2019.7/lib/python3.6/site-packages/decorator.py:decorator-gen-432>", line 2, in adonis
  File "/Users/matildahd/miniconda3/envs/qiime2-2019.7/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
    output_types, provenance)
  File "/Users/matildahd/miniconda3/envs/qiime2-2019.7/lib/python3.6/site-packages/qiime2/sdk/action.py", line 445, in callable_executor
    ret_val = self._callable(output_dir=temp_dir, **view_args)
  File "/Users/matildahd/miniconda3/envs/qiime2-2019.7/lib/python3.6/site-packages/q2_diversity/_beta/_visualizer.py", line 367, in adonis
    _run_command(cmd)
  File "/Users/matildahd/miniconda3/envs/qiime2-2019.7/lib/python3.6/site-packages/q2_diversity/_beta/_visualizer.py", line 393, in _run_command
    subprocess.run(cmd, check=True)
  File "/Users/matildahd/miniconda3/envs/qiime2-2019.7/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run_adonis.R', '/var/folders/mb/vtxl0zps5435mr78csnrjpjr0000gr/T/tmp4det8n0c/dm.tsv', '/var/folders/mb/vtxl0zps5435mr78csnrjpjr0000gr/T/tmp4det8n0c/md.tsv', 'Age*Factor3', '999', '1', '/var/folders/mb/vtxl0zps5435mr78csnrjpjr0000gr/T/qiime2-temp-fxbasdd3/adonis.tsv']' returned non-zero exit status 1.

Based on other threads here, I'm guessing the line about non-conformable arrays points to the source of the problem, but I don't know where to go from here. I don't think there should be NA or missing values (for the specific variables being tested) because I filtered them out of the feature table before generating the distance matrices.

Any ideas on how to make this work, and also why it might have worked for categorical variables but not continuous?

Thank you very much!

Hello @Matilda_H-D,

Welcome back to the forums. Let's dive in.

Great use of the vegan adonis test! :+1:
Great use of for loops! :trophy:

It sounds like this is working as intended for categorical variables, but not for continuous ones... which is weird because adonis(~continuous) should work fine.

I'm not 100% sure what's going on here, but I think it has to do with the continuous variable, and the * in your formula.


As you know, you can set up R adonis() formulas using:
~var1 + var2 to partition by var1 followed by var2 or
~var1*var2 to partition by var1 followed by var2 followed by var1:var2

By using the * crossing function in your formula, you have added this interaction term as the final thing in your adonis formula.

And, as you know, adonis returns results of the last term you pass.

So running adonis(~Age*Factor3) will not return you results for Age or Factor3, but the interaction of Age:Factor3, which might not be what you want... :scream_cat:

I don't have a perfect answer, but the first thing I would try would be to run adonis without the interaction term, using
--p-formula "Age+${i}"

While this will omit any interaction terms, this will give you a much cleaner test, and hopefully work equally well for both continuous and categorical variables.

This is a really useful way to use the adonis test, so I'm curious to see what you find!

Colin

@Nicholas_Bokulich suggested this:

Making sure that R knows your continuous columns are continuous by removing all text strings is another good place to start. :+1:
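One rough way to spot stray text in a column that is supposed to be continuous is to scan the mapping file before running adonis. The sketch below is only an illustration: the file contents, the sample IDs, and the choice of Factor3 as the column are made up, and it assumes a plain tab-separated file with a single header row.

```shell
#!/bin/sh
# Sketch: list rows whose value in one metadata column is not a plain number.
# The file contents and column name are made up for illustration.

# Hypothetical tab-separated mapping file with two problem rows.
tmp=$(mktemp)
printf 'sample-id\tAge\tFactor3\n' > "$tmp"
printf 'S1\t4.5\t12\n' >> "$tmp"
printf 'S2\t16\tunknown\n' >> "$tmp"
printf 'NegCtrl1\t\t\n' >> "$tmp"
printf 'S3\t9\t7.25\n' >> "$tmp"

column="Factor3"

# Locate the column by its header name, then report every row whose value is
# empty or is anything other than an optionally signed decimal number.
non_numeric=$(awk -F '\t' -v col="$column" '
  NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
  c && $c !~ /^-?[0-9]+(\.[0-9]+)?$/ { print $1 "\t[" $c "]" }
' "$tmp")

printf '%s\n' "$non_numeric"
rm -f "$tmp"
```

Any row it prints (here the 'unknown' value and the empty control row) is a candidate for whatever is stopping R from treating the column as numeric.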

Hi @colinbrislawn (and @Nicholas_Bokulich),

Thank you -- I'm getting back into analysis on a new project after finally pretty much finishing a paper I started back in QIIME 1 days!

I followed Nicholas' advice and removed all the negative control samples from my mapping file (which obviously don't have an age value, and had been removed from the feature table before generating distance matrices), and the adonis test seems to be working now! Thank you.
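For anyone hitting the same thing, a quick check for samples that are in the mapping file but not in the distance matrix might look like the sketch below. It is a rough illustration under stated assumptions: both files are tiny made-up stand-ins, and the distance matrix is assumed to have been exported to a tab-separated file whose header row starts with an empty cell and whose later rows each start with a sample ID.

```shell
#!/bin/sh
# Sketch: list sample IDs present in the mapping file but absent from the
# distance matrix. Both files are small, made-up stand-ins.

work=$(mktemp -d)

# Hypothetical mapping file: three samples plus one negative control.
printf 'sample-id\tAge\nS1\t4.5\nS2\t16\nS3\t9\nNegCtrl1\t\n' > "$work/map.txt"

# Hypothetical exported distance matrix (assumed layout, see above).
printf '\tS1\tS2\tS3\nS1\t0\t0.4\t0.6\nS2\t0.4\t0\t0.5\nS3\t0.6\t0.5\t0\n' > "$work/dm.tsv"

# Take the first column of each file, drop the header row, and sort for comm.
tail -n +2 "$work/map.txt" | cut -f 1 | sort > "$work/map_ids.txt"
tail -n +2 "$work/dm.tsv"  | cut -f 1 | sort > "$work/dm_ids.txt"

# comm -23 keeps only the lines that are unique to the first file.
only_in_metadata=$(comm -23 "$work/map_ids.txt" "$work/dm_ids.txt")

printf '%s\n' "$only_in_metadata"
rm -rf "$work"
```

In this toy case only the negative control is reported, which matches the kind of row that was removed from the mapping file here.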

I assume I can still trust the original results I got for testing Age + a categorical variable, even though the metadata file would have contained NA values for controls in those columns? Given that there were no controls in the distance matrix?

However, Colin's initial response raised some flags for me so I want to check that I understand what the adonis test is actually doing!

My understanding was that by specifying adonis(~Age*Factor3), I would get something like:
(1) variance explained by Age alone
(2) variance explained by Factor3 alone minus any variance that had already been accounted for by Age
(3) the variance explained by the interaction between Age and Factor3

However, Colin's response makes me think that maybe I am only getting (3) back??? Can somebody please confirm?

Thanks again,
Matilda

PS For your interest, assuming that I have run these correctly, most of the factors I was testing do not significantly interact with Age for this dataset, with the R^2 value for the interaction being lower than that for either of the individual factors (and large p-value). Which I take to mean the factors have somewhat independent relationships with unweighted UniFrac distance. However, when I reverse the order of the factors in the formula (i.e. "Factor3*Age"), Age sometimes explains somewhat less and Factor3 somewhat more when Factor3 is tested first, which I guess means that for some of the variance you can't differentiate which of the factors is actually explaining it.
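Since the terms are added sequentially, comparing both orders for every factor is easy to script. The sketch below only prints the commands rather than running them, so it works without QIIME 2 installed. The flags are the ones used earlier in this thread; the metadata path, the build_cmd helper, and the output file naming are placeholders made up for illustration.

```shell
#!/bin/sh
# Sketch: print (not run) one adonis command per term order for each factor.
# The metadata path and the output naming scheme are placeholders.

metadata="mapping_file.txt"

build_cmd() {
  # $1 = first term, $2 = second term, $3 = factor used in the directory name
  printf 'qiime diversity adonis --i-distance-matrix %s --m-metadata-file %s --p-formula "%s" --o-visualization %s\n' \
    "${3}_diversity_30000/unweighted_unifrac_distance_matrix.qza" \
    "$metadata" \
    "${1}*${2}" \
    "${3}_diversity_30000/uwuf_${1}_x_${2}_results.qzv"
}

# Two commands per factor: Age first, then the factor first.
cmds=$(for f in Factor1 Factor2 Factor3
do
  build_cmd Age "$f" "$f"
  build_cmd "$f" Age "$f"
done)

printf '%s\n' "$cmds"
```

Once the printed commands look right, they can be pasted into the terminal, and the R2 values for each term can then be compared between the two orders.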

I usually run the vegan::adonis() test directly in R, so I'm not sure how results are presented in the Qiime 2 Artifact. Here's what typical output looks like for me:

restroomadonis = adonis(d ~ SURFACE + GENDER + BUILDING + FLOOR, df)
restroomadonis
## 
## Call:
## adonis(formula = d ~ SURFACE + GENDER + BUILDING + FLOOR, data = df) 
## 
## Terms added sequentially (first to last)
## 
##           Df SumsOfSqs MeanSqs F.Model    R2 Pr(>F)    
## SURFACE   10      8.19   0.819    4.12 0.470  0.001 ***
## GENDER     1      0.50   0.504    2.53 0.029  0.007 ** 
## BUILDING   1      0.31   0.310    1.56 0.018  0.081 .  
## FLOOR      2      0.47   0.236    1.19 0.027  0.212    
## Residuals 40      7.96   0.199         0.456           
## Total     54     17.44                 1.000           
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(From this phyloseq tutorial.)

Hopefully you are seeing all of your terms, including the interaction term, so you can compare them. And let us know if the suggestion about continuous variables helps.

Colin

I think I said this poorly, sorry to cause any confusion!

You are 100% correct. That formula will partition out variance based on those three sources in that exact order.

I was worried that Qiime 2 wasn't reporting on all three (but it sounds like it is so that's good!)
and I was worried that you wanted to partition out Age:Factor3 before Factor3.

If you can see all three p-values and R2 values, and the order looks OK, I think you are good to go!

Colin

Hi all,

I was getting the same error in R; the solution was that I had to transpose the otu_table.

library(vegan)     # provides adonis()
library(phyloseq)  # provides otu_table()
adonis(dist(t(otu_table(phy)), method='euclidean') ~ season * depth_m,
       data=dat)

Without transposing, I'd get the error:

Error in G * t(hat) : non-conformable arrays

Best,

Eduardo