Gneiss Singular matrix error

Hi,

I get the following error (below) when I run the following command,

Command:

qiime gneiss lme-regression \
--p-formula bacteria_reads+cdiff_reads+CDItestResult+CaseControlAnnot+SampleType+ABX_admin_on_sample_collection+ABX_admin_24hrprior_sample_collection \
--i-table gneiss/balances.qza \
--i-tree gneiss/hierarchy.qza 
--m-metadata-file mapping_final.tsv \
--p-groups eRAP_ID \
--o-visualization gneiss/regression_summary.qzv --verbose

Error:

Traceback (most recent call last):
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "<decorator-gen-215>", line 2, in lme_regression
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/qiime2/sdk/action.py", line 428, in _callable_executor_
    ret_val = self._callable(output_dir=temp_dir, **view_args)
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/q2_gneiss/regression/_regression.py", line 72, in lme_regression
    res.fit()
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/gneiss-0.4.2-py3.5.egg/gneiss/regression/_mixedlm.py", line 172, in fit
    self.results = [s.fit(**kwargs) for s in self.submodels]
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/gneiss-0.4.2-py3.5.egg/gneiss/regression/_mixedlm.py", line 172, in <listcomp>
    self.results = [s.fit(**kwargs) for s in self.submodels]
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/statsmodels/regression/mixed_linear_model.py", line 1988, in fit
    **kwargs)
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/statsmodels/base/model.py", line 451, in fit
    full_output=full_output)
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/statsmodels/base/optimizer.py", line 184, in _fit
    hess=hessian)
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/statsmodels/base/optimizer.py", line 286, in _fit_bfgs
    disp=disp, retall=retall, callback=callback)
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/scipy/optimize/optimize.py", line 859, in fmin_bfgs
    res = _minimize_bfgs(f, x0, args, fprime, callback=callback, **opts)
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/scipy/optimize/optimize.py", line 913, in _minimize_bfgs
    gfk = myfprime(x0)
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/scipy/optimize/optimize.py", line 292, in function_wrapper
    return function(*(wrapper_args + args))
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/statsmodels/base/model.py", line 430, in <lambda>
    score = lambda params, *args: -self.score(params, *args) / nobs
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/statsmodels/regression/mixed_linear_model.py", line 1466, in score
    params.fe_params = self.get_fe_params(params.cov_re, params.vcomp)
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/statsmodels/regression/mixed_linear_model.py", line 1178, in get_fe_params
    fe_params = np.linalg.solve(xtxy[:, 0:-1], xtxy[:, -1])
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/numpy/linalg/linalg.py", line 384, in solve
    r = gufunc(a, b, signature=signature, extobj=extobj)
  File "/hpc/packages/minerva-common/qiime2/2018.4/lib/python3.5/site-packages/numpy/linalg/linalg.py", line 90, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.linalg.LinAlgError: Singular matrix

Plugin error from gneiss:

  Singular matrix

See above for debug info.

Any help on this would be appreciated.

Thanks,
Ajay.

Hi @ajaybabu27, I haven't seen this error message before. But after digging around for related bug reports on statsmodels, it likely has to do something with ill-defined input.

https://groups.google.com/forum/#!topic/pystatsmodels/cWdOIQmcs4w

There are really only two input types that could be causing issues here, the input metadata or the biom table. Could you do the following?

  1. Can you try rerunning the model on few inputs, and report which ones crash?
  2. Have you confirmed that your biom table doesn't have serious problems with zeros? See here for discussion on this: Detected zero variance balances while performing "gneiss ols-regression" - #10
  3. If either of the two solutions aren't working, could you post all of your qzas + metadata to reproduce this error?
1 Like

Thanks for the quick reply ! To add information to my initial query. I actually filtered my BIOM table using the following command

qiime feature-table filter-features \
  --i-table table.qza \
  --p-min-frequency 10 \
  --o-filtered-table feature-frequency-filtered-table.qza

My initial data matrix had 2126 features across 190 samples and had a sparsity level of 0.97 (i.e. 97% of my matrix are zero values). After applying the above filter command it reduced to 1614 features across 190 samples with 0.96 sparsity.

I am currently debugging the annotation of my co-variates in my metadata file and playing around with running the model with fewer co-variates. I will keep you posted and do let me know if you find any issues with the sparsity level in my data matrix.

Thanks,
Ajay.

Something already sounds funky about this - 96% sparsity is incredibly high for this sort of dataset. What sort of samples do you have? And what do your sample sequencing depths look like?

Apologies for the late reply. I took a re-look at all my samples and I filtered all the QC libraries & bad libraries. Just looking at my good quality libraries (samples=173 and features=1584), I still the find the matrix to have 96% sparsity. The mean read depth per sample is about 40,000 reads. Here is the quantile distribution to look at the spread:

     0%     25%     50%     75%     100%   (samples)
   4417    28236   39082   48921   231597  (read depth)

This is a 16S V4 gut microbiome (human) dataset.

1 Like

Hi @ajaybabu27, this is a step in the right direction, but we need much more information in order to diagnose this.

Could you send over all of the files used to run the command? I have a feeling that there is some ill-defined inputs (i.e. multiple strains that only appear in 1 sample).

1 Like

Thanks @mortonjt. I have sent you my files as a private message. After modifying my mapping file to fix missing or confounding data entries, I still get the Singular matrix error in the end after some initial warning messages that repeats multiple times. I have summarized all the unique standard error outputs:

/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/statsmodels/base/model.py:508: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)

/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/statsmodels/regression/mixed_linear_model.py:2026: ConvergenceWarning: Gradient optimization failed.
  warnings.warn(msg, ConvergenceWarning)

/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/statsmodels/regression/mixed_linear_model.py:2045: ConvergenceWarning: The MLE may be on the boundary of the parameter space.
  warnings.warn(msg, ConvergenceWarning)

/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/statsmodels/regression/mixed_linear_model.py:2066: ConvergenceWarning: The Hessian matrix at the estimated parameter values is not positive definite.
  warnings.warn(msg, ConvergenceWarning)

Traceback (most recent call last):                                                                                                                                                                                                     
  File "/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/q2cli/commands.py", line 274, in __call__                                                                                          
    results = action(**arguments)                                                                                                                                                                                                      
  File "<decorator-gen-235>", line 2, in lme_regression                                                                                                                                                                                
  File "/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/sdk/action.py", line 232, in bound_callable                                                                                 
    output_types, provenance)                                                                                                                                                                                                          
  File "/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/sdk/action.py", line 429, in _callable_executor_                                                                            
    ret_val = self._callable(output_dir=temp_dir, **view_args)                                                                                                                                                                         
  File "/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/q2_gneiss/regression/_regression.py", line 73, in lme_regression                                                                   
    lme_summary(output_dir, res, tree)                                                                                                                                                                                                 
  File "/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/gneiss-0.4.4-py3.5.egg/gneiss/plot/_regression_plot.py", line 354, in lme_summary                                                  
    for r in model.results})                                                                                                                                                                                                           
  File "/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/gneiss-0.4.4-py3.5.egg/gneiss/plot/_regression_plot.py", line 354, in <dictcomp>                                                   
    for r in model.results})                                                                                                                                                                                                           
  File "/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/statsmodels/regression/mixed_linear_model.py", line 1351, in loglike                                                               
    fe_params = self.get_fe_params(cov_re, vcomp)                                                                                                                                                                                      
  File "/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/statsmodels/regression/mixed_linear_model.py", line 1175, in get_fe_params                                                         
    cov_re_inv = np.linalg.inv(cov_re)                                                                                                                                                                                                 
  File "/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/numpy/linalg/linalg.py", line 526, in inv                                                                                          
    ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)                                                                                                                                                                    
  File "/hpc/packages/minerva-common/anaconda3/5.0.1/envs/qiime2-2018.6/lib/python3.5/site-packages/numpy/linalg/linalg.py", line 90, in _raise_linalgerror_singular                                                                   
    raise LinAlgError("Singular matrix")                                                                                                                                                                                               
numpy.linalg.linalg.LinAlgError: Singular matrix

Plugin error from gneiss:                                                                                                                                                                                                              
                                                                                                                                                                                                                                       
  Singular matrix                                                                                                                                                                                                                      
                                                                                                                                                                                                                                       
See above for debug info.

For the sake of completeness for the community, here are the list of commands I used:

qiime feature-table filter-features --i-table 5_gneiss/table.qza --p-min-frequency 10 --o-filtered-table 5_gneiss/feature-frequency-filtered-table.qza
qiime gneiss add-pseudocount --i-table 5_gneiss/feature-frequency-filtered-table.qza --p-pseudocount 1 --o-composition-table 5_gneiss/composition.qza
qiime gneiss correlation-clustering --i-table 5_gneiss/composition.qza --o-clustering 5_gneiss/hierarchy.qza
qiime gneiss ilr-transform --i-table 5_gneiss/composition.qza  --i-tree 5_gneiss/hierarchy.qza --o-balances 5_gneiss/balances.qza
qiime gneiss lme-regression --p-formula eRAP_ID+testResult+Annot+OneDay --i-table 5_gneiss/balances.qza --i-tree 5_gneiss/hierarchy.qza --m-metadata-file 5_gneiss/mapping_final_combined.tsv --p-groups eRAP_ID --o-visualization 5_gneiss/regression_summary.qzv --verbose
1 Like

Hi @ajaybabu27, I took a look at your dataset and it looks like 1/3 of your features are singletons -- this is expected to break, because none of these algorithms will have the resolution to make any meaningful inference about these organisms.

Below is a screenshot of the counts -- you have 888 singleton organisms, 302 doubleton organisms, ..
image

When I filtered out organisms that appeared in less than 10 samples, the lme-regression was able to run without errors.

Below was the filtering command that I ran

qiime feature-table filter-features --i-table table.qza --p-min-samples 10 --o-filtered-table feature-frequency-filtered-table.qza

Hope this helps!

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.