Q2-gneiss error: array must not contain infs or NaNs

(Manuel Ponce Alonso) #1

Hi, I’m now trying the above instructions.
I’m running a gradient-clustering analysis with gneiss, and I got the following error after running this command:

qiime gneiss gradient-clustering --i-table /home/qiime2/C-table.qza --m-gradient-file sample-metadata1.tsv --m-gradient-column Accumulatedscore --o-clustering gradient-hierarchy.qza --verbose

Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "<decorator-gen-288>", line 2, in gradient_clustering
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/qiime2/sdk/action.py", line 362, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/q2_gneiss/cluster/_cluster.py", line 94, in gradient_clustering
    t = gradient_linkage(table, c, method='average')
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/gneiss-0.4.4-py3.5.egg/gneiss/cluster/_pba.py", line 203, in gradient_linkage
    t = rank_linkage(mean_X)
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/gneiss-0.4.4-py3.5.egg/gneiss/cluster/_pba.py", line 125, in rank_linkage
    dm = DistanceMatrix.from_iterable(r, euclidean)
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/skbio/stats/distance/_base.py", line 778, in from_iterable
    key, keys)
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/skbio/stats/distance/_base.py", line 159, in from_iterable
    dm[i, j] = metric(a, b)
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/scipy/spatial/distance.py", line 433, in euclidean
    dist = norm(u - v)
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/scipy/linalg/misc.py", line 129, in norm
    a = np.asarray_chkfinite(a)
  File "/home/qiime2/miniconda/envs/qiime2-2018.8/lib/python3.5/site-packages/numpy/lib/function_base.py", line 1215, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

I think there is something wrong with my feature table, because the command works fine with the tutorial files.
I hope we can find the source of the error.
Thanks

C-table.qza (166.7 KB)
sample-metadata1.tsv (2.5 KB)

(Jamie Morton) #3

I’d check out the search bar feature in the qiime2 forums – it is actually really nifty.
The problem you see here is probably similar to the below posts.

https://forum.qiime2.org/t/integer-division-or-modulo-by-zero-error/3620/17
https://forum.qiime2.org/t/yet-more-trouble-with-gneiss/2692/20
https://forum.qiime2.org/t/gneiss-nan-plugin-error/2880/8
https://forum.qiime2.org/t/gneiss-balances-zero-values-for-model-mse/2176/15
https://forum.qiime2.org/t/gneiss-zero-balance-error/1857/13

(Manuel Ponce Alonso) #5

Hi Jamie, thanks for the answer, and sorry for not using the search bar; I had tried searching another way.
I believe the gneiss plugin adds a pseudocount by default (I’m using version 2018.8). I previously ran correlation-clustering with the same table.qza without errors. I tried “add-pseudocount”, but it no longer belongs to the gneiss plugin.
Also, as in one of the posts you linked, I tried filtering out very low-abundance features with --p-min-frequency 7 --p-min-samples 2, but the same error occurred.
I don’t understand what I’m doing wrong…

(Jamie Morton) #6

You probably weren’t applying the filtering criteria correctly: these two microbes both have zero counts.

00c3067c46e95ab3728cbac7dcfd787f 0.0
1528863ddc260fe908f2badec807b109 0.0

So a divide-by-zero error was thrown.

gradient-clustering doesn’t require that the inputs be nonzero (so no pseudocounts are actually necessary), but it does require that every OTU is observed in at least 1 sample.
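
To illustrate that requirement, here is a minimal sketch in plain Python (not gneiss code) that flags features whose total count across all samples is zero. The table layout (feature ID mapped to a list of per-sample counts) and the function name `unobserved_features` are assumptions made for illustration; a real table.qza would need to be exported and loaded first. The two long IDs are the ones listed above, while the per-sample counts and the third feature are made up.

```python
def unobserved_features(table):
    """Return the IDs of features that have no counts in any sample.

    `table` maps feature ID -> list of per-sample counts
    (a hypothetical layout, chosen only for this example).
    """
    return [fid for fid, counts in table.items() if sum(counts) == 0]


# Toy data: the counts below are invented for illustration.
table = {
    "00c3067c46e95ab3728cbac7dcfd787f": [0, 0, 0],
    "1528863ddc260fe908f2badec807b109": [0, 0, 0],
    "example-feature": [5, 0, 2],
}

# Features listed here would need to be filtered out before clustering.
print(unobserved_features(table))
```

Any feature reported by a check like this is never observed in the samples being analysed and should be dropped before running gradient-clustering.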

(Manuel Ponce Alonso) #7

Hello Jamie,

I applied stricter filtering parameters, but the error remains.
I attached a qzv file in which I couldn’t find any feature with zero counts.
Also, I tried to find the features you refer to in the original table, but I couldn’t. (It seems that you are able to see them, but I can’t with qiime feature-table summarize.)

filtered-table2.qza (172.1 KB)
visualization.qzv (396.6 KB)

(Jamie Morton) #9

hmm - I’m able to get this to run with

qiime gneiss gradient-clustering \
      --i-table filtered-table2.qza \
      --m-gradient-file sample-metadata1.tsv \
      --m-gradient-column Accumulatedscore \
      --o-clustering gradient-hierarchy.qza --verbose

Are you sure you have the most up-to-date version of qiime installed?

(Manuel Ponce Alonso) #11

Yes, I’m about to install the 2019.1 version and re-run… I finally found the source of the error: table.qza and the metadata file don’t share all the samples for the Accumulatedscore variable. I filtered out the samples with no Accumulatedscore values, and it works!
It worked for you because, by mistake, I sent you a modified metadata file that included Accumulatedscore values for all the samples in the table.
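
For anyone hitting the same error, a check along these lines might catch the mismatch before running gneiss. It is only a sketch in plain Python, under some assumptions: the sample IDs from the table are already available as a list, the metadata is tab-separated with sample IDs in the first column, and the function name `samples_missing_gradient` and the toy data are invented for illustration.

```python
import csv
import io
import math


def samples_missing_gradient(table_sample_ids, metadata_tsv, column):
    """Return table samples that have no usable numeric value in `column`.

    A sample is reported if it is absent from the metadata, or if its
    value is empty, non-numeric, or NaN.
    """
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    id_column = reader.fieldnames[0]  # assumption: IDs are in the first column
    gradient = {row[id_column]: (row.get(column) or "").strip() for row in reader}

    missing = []
    for sample_id in table_sample_ids:
        raw = gradient.get(sample_id, "")
        try:
            value = float(raw)
        except ValueError:
            # Empty or non-numeric entry (or sample not in the metadata).
            missing.append(sample_id)
            continue
        if math.isnan(value):
            missing.append(sample_id)
    return missing


# Toy metadata: S2 has an empty value, S4 is not in the metadata at all.
metadata_tsv = "sample-id\tAccumulatedscore\nS1\t3\nS2\t\nS3\t7\n"
table_sample_ids = ["S1", "S2", "S3", "S4"]

print(samples_missing_gradient(table_sample_ids, metadata_tsv, "Accumulatedscore"))
```

Samples reported by such a check are the ones to filter out of the table (or to complete in the metadata) before clustering.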

Thanks a lot for your patience, Jamie.