Plugin error from diversity: array must not contain infs or NaNs

Hi there,
I've been trying to run diversity analyses on a dataset but keep getting the following error:

Plugin error from diversity:

array must not contain infs or NaNs

The dataset is a filtered table, and alpha diversity runs smoothly. The commands are below:

#filter-table by samples of interest      
qiime feature-table filter-samples \
--i-table table-many-samples.qza \
--m-metadata-file smp_ID \
--p-min-features 2 \
--o-filtered-table table.ftd.qza

(I then filtered out Unassigned reads...)

qiime taxa filter-table --i-table table.ftd.qza --i-taxonomy taxonomy.qza --p-exclude Unassigned --o-filtered-table table.ftd.noUnassigned.qza

... and got a file containing the rep-seqs...

qiime feature-table filter-seqs --i-data rep-seqs.full.qza --i-table table.ftd.noUnassigned.qza --o-filtered-data rep-seqs.ftd.noUnassigned.qza

... and aligned the rep-seqs...

qiime phylogeny align-to-tree-mafft-fasttree --p-n-threads 15 --i-sequences rep-seqs.ftd.noUnassigned.qza --o-alignment aligned.qza --o-masked-alignment masked.qza --o-tree unrooted.qza --o-rooted-tree rooted.qza

... and ran core-metrics-phylogenetic:
qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza --i-table table.ftd.noUnassigned.qza --p-sampling-depth 20000 --m-metadata-file metadata3 --output-dir core-metrics-output --p-n-jobs 7

Then the error

/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/sklearn/utils/validation.py:475: DataConversionWarning: Data with input dtype float64 was converted to bool by check_pairwise_arrays.
warnings.warn(msg, DataConversionWarning)
/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/stats/ordination/_utils.py:186: RuntimeWarning: overflow encountered in multiply
return distance_matrix * distance_matrix / -2
/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/stats/ordination/_utils.py:198: RuntimeWarning: invalid value encountered in subtract
return E_matrix - row_means - col_means + matrix_mean
/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/stats/ordination/_utils.py:198: RuntimeWarning: invalid value encountered in add
return E_matrix - row_means - col_means + matrix_mean
Traceback (most recent call last):
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/q2cli/commands.py", line 274, in call
results = action(**arguments)
File "", line 2, in core_metrics_phylogenetic
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 455, in callable_executor
outputs = self._callable(scope.ctx, **view_args)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_diversity/_core_metrics.py", line 65, in core_metrics_phylogenetic
pcoas += pcoa(distance_matrix=dm)
File "", line 2, in pcoa
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 362, in callable_executor
output_views = self._callable(**view_args)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_diversity/_ordination.py", line 18, in pcoa
inplace=False)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/stats/ordination/_principal_coordinate_analysis.py", line 126, in pcoa
eigvals, eigvecs = eigh(matrix_data)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/scipy/linalg/decomp.py", line 327, in eigh
a1 = _asarray_validated(a, check_finite=check_finite)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/scipy/_lib/_util.py", line 238, in _asarray_validated
a = toarray(a)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/numpy/lib/function_base.py", line 1215, in asarray_chkfinite
"array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
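
A note for readers on the warnings above: an overflow in `distance_matrix * distance_matrix / -2`, followed by "invalid value" in the subtraction, is the pattern you would expect if the distance matrix handed to PCoA contained an absurdly large value. The sketch below is a plain-Python illustration of that centering step, not skbio's actual code. The names `gower_center` and `all_finite` and the example matrices are hypothetical. It shows how a single huge entry becomes inf when squared, and then NaN once the row and column means are subtracted.

```python
import math


def gower_center(dm):
    """Double-center a square distance matrix.

    E = -d^2 / 2, then F = E - row means - column means + grand mean.
    This mirrors the two expressions quoted in the warnings above.
    """
    n = len(dm)
    e = [[d * d / -2 for d in row] for row in dm]
    row_means = [sum(row) / n for row in e]
    col_means = [sum(e[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [
        [e[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def all_finite(matrix):
    """True if no entry is inf or NaN."""
    return all(math.isfinite(v) for row in matrix for v in row)


# A well-behaved distance matrix (made-up values).
good = [[0.0, 0.3, 0.5],
        [0.3, 0.0, 0.4],
        [0.5, 0.4, 0.0]]

# Same shape, but one pair carries a garbage value that is still finite.
bad = [[0.0, 1e200, 0.5],
       [1e200, 0.0, 0.4],
       [0.5, 0.4, 0.0]]

print(all_finite(gower_center(good)))  # True
print(all_finite(bad))                 # True: the input itself passes a finiteness check
print(all_finite(gower_center(bad)))   # False: squaring overflows to inf, inf - inf gives NaN
```

If something like this is happening, the problem lies in the distance matrix rather than in the PCoA step, so inspecting the UniFrac output is a reasonable next move.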

A similar error was reported in the thread "Qiime diversity beta-phylogenetic Data must be symmetric and cannot contain NaNs error",
but I've checked and there are no samples with 0 observations in the table.

qiime feature-table summarize --i-table table.ftd.noUnassigned.qza --o-visualization table.ftd.noUnassigned.qzv

table.filtered.qzv (430.6 KB) table.ftd.noUnassigned.qzv (430.1 KB) rep-seqs.ftd.noUnassigned.qza (163.4 KB) rooted.qza (132.4 KB) metadata3.tsv (82 Bytes)

The error seems to be related to the table being filtered, but I don't know how.
Could you help me with that?

Cheers

Update:
I am not sure what's happening, but I was able to get UniFrac and PCoA plots by running standalone commands:
qiime feature-table rarefy --i-table table.ftd.noUnassigned.qza --p-sampling-depth 20000 --o-rarefied-table table.ftd.noUnassigned.rarefied.qza

qiime diversity beta-phylogenetic --i-table table.ftd.noUnassigned.rarefied.qza --p-metric weighted_unifrac --i-phylogeny rooted.qza --o-distance-matrix w_unifrac_matrix.qza

qiime diversity pcoa --i-distance-matrix w_unifrac_matrix.qza --o-pcoa w_unifrac_pcoa.qzv

qiime emperor plot --i-pcoa w_unifrac_pcoa.qzv.qza --m-metadata-file metadata3 --o-visualization w_unifrac_emperor

w_unifrac_emperor.qzv (822.7 KB)

Hi @lca123,

That all looks perfectly fine to me! And the fact that you were able to get specific ordinations out is encouraging. I can’t say I’ve seen this before. @yoshiki, have you ever seen an instance where skbio’s PCoA produced inf/NaN?

I’ll see if I have some time tomorrow to try and reproduce this.

I’ve just re-run all those commands and, as a second test, the command below alone:

qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza --i-table table.ftd.noUnassigned.qza --p-sampling-depth 20000 --m-metadata-file metadata3 --output-dir core-metrics-output --p-n-jobs 7

and now the error I mentioned earlier, from the thread "Qiime diversity beta-phylogenetic Data must be symmetric and cannot contain NaNs error", came back:

Plugin error from diversity:

  Data must be symmetric and cannot contain NaNs.

Is it possible that the counts in the table are somehow being interpreted as floats even though they are integers? See the DataConversionWarning at the top of the output below:

/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/sklearn/utils/validation.py:475: DataConversionWarning: Data with input dtype float64 was converted to bool by check_pairwise_arrays.
warnings.warn(msg, DataConversionWarning)
Traceback (most recent call last):
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/q2cli/commands.py", line 274, in call
results = action(**arguments)
File "", line 2, in core_metrics_phylogenetic
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 455, in callable_executor
outputs = self._callable(scope.ctx, **view_args)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_diversity/_core_metrics.py", line 59, in core_metrics_phylogenetic
metric='unweighted_unifrac', n_jobs=n_jobs)
File "", line 2, in beta_phylogenetic
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 362, in callable_executor
output_views = self._callable(**view_args)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_diversity/_beta/_method.py", line 99, in beta_phylogenetic
variance_adjusted=variance_adjusted, bypass_tips=bypass_tips)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/unifrac/_methods.py", line 103, in unweighted
variance_adjusted, 1.0, bypass_tips, threads)
File "unifrac/_api.pyx", line 90, in unifrac._api.ssu
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/stats/distance/_base.py", line 106, in init
self._validate(data, ids)
File "/home/leonardo.alves/miniconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/skbio/stats/distance/_base.py", line 873, in _validate
"Data must be symmetric and cannot contain NaNs.")
skbio.stats.distance._base.DistanceMatrixError: Data must be symmetric and cannot contain NaNs.
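
The message above says what is wrong but not where. If you can get the raw distance values out (for example as a square list of lists), a small check along these lines can point at the offending sample pairs. This is only a sketch in plain Python: `find_matrix_problems`, the tolerance, and the example matrices are hypothetical, and it is not how skbio performs its own validation.

```python
import math


def find_matrix_problems(dm, tol=1e-12):
    """Return (i, j, reason) for each pair that is non-finite or breaks symmetry."""
    problems = []
    n = len(dm)
    for i in range(n):
        for j in range(i, n):
            a, b = dm[i][j], dm[j][i]
            if not (math.isfinite(a) and math.isfinite(b)):
                problems.append((i, j, "non-finite"))
            elif abs(a - b) > tol:
                problems.append((i, j, "asymmetric"))
    return problems


# Made-up matrices for illustration only.
ok = [[0.0, 0.2],
      [0.2, 0.0]]

broken = [[0.0, float("nan"), 0.1],
          [0.3, 0.0, 0.4],
          [0.1, 0.5, 0.0]]

print(find_matrix_problems(ok))      # []
print(find_matrix_problems(broken))  # [(0, 1, 'non-finite'), (1, 2, 'asymmetric')]
```

Knowing which pairs fail helps separate a data problem (the same samples fail every time) from a computation problem (different pairs fail from run to run).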

My best guess is that the number of jobs is causing the UniFrac computation to fail. Can you try re-running the qiime diversity core-metrics-phylogenetic command without the --p-n-jobs 7 flag?

I think you’re right! But something is still weird:
I re-ran with --p-n-jobs 7 and got the error:

qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza --i-table table.ftd.noUnassigned.qza --p-sampling-depth 20000 --m-metadata-file metadata3 --output-dir core-metrics-output-v2 --p-n-jobs 7
Plugin error from diversity:

  array must not contain infs or NaNs

Then I re-ran without --p-n-jobs 7 and it worked; I got all the files. But then I re-ran with --p-n-jobs 7 and it also worked! I tried with and without --p-n-jobs set to 7, 5, and 2, and every command now works.
Well, I now have the files from core-metrics-phylogenetic. Maybe this is something specific to my machine, but I can work from here.
Thank you.

qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza  --i-table table.ftd.noUnassigned.qza --p-sampling-depth 20000 --m-metadata-file metadata3 --output-dir core-metrics-output-v2
Saved FeatureTable[Frequency] to: core-metrics-output-v2/rarefied_table.qza
Saved SampleData[AlphaDiversity] % Properties(['phylogenetic']) to: core-metrics-output-v2/faith_pd_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-output-v2/observed_otus_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-output-v2/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-output-v2/evenness_vector.qza (other files...)


qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza  --i-table table.ftd.noUnassigned.qza --p-sampling-depth 20000 --m-metadata-file metadata3 --output-dir core-metrics-output-v2 --p-n-jobs 7
Plugin error from diversity:

  array must not contain infs or NaNs


qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza  --i-table table.ftd.noUnassigned.qza --p-sampling-depth 20000 --m-metadata-file metadata3 --output-dir core-metrics-output-v2
Saved FeatureTable[Frequency] to: core-metrics-output-v2/rarefied_table.qza
Saved SampleData[AlphaDiversity] % Properties(['phylogenetic']) to: core-metrics-output-v2/faith_pd_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-output-v2/observed_otus_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-output-v2/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-output-v2/evenness_vector.qza

qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza  --i-table table.ftd.noUnassigned.qza --p-sampling-depth 20000 --m-metadata-file metadata3 --output-dir core-metrics-output-v2 --p-n-jobs 7
Saved FeatureTable[Frequency] to: core-metrics-output-v2/rarefied_table.qza
Saved SampleData[AlphaDiversity] % Properties(['phylogenetic']) to: core-metrics-output-v2/faith_pd_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-output-v2/observed_otus_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-output-v2/shannon_vector.qza

I believe this has to do with the number of samples (small number) and the number of jobs. The fast implementation of UniFrac is known to have this issue. Nothing bad about your data, mainly the parallelization not working out well when there’s not “enough” to parallelize. I’m puzzled as to why this would sometimes work though.
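
Since the failure seems tied to the parallel run, one pragmatic workaround is to try the parallel command first and fall back to a single job if it fails. The sketch below assumes the flags exactly as they appear in the commands in this thread (QIIME 2 2018.11). `core_metrics_cmd` and `run_with_serial_fallback` are hypothetical helper names, not part of QIIME 2.

```python
import subprocess


def core_metrics_cmd(table, phylogeny, depth, metadata, output_dir, n_jobs=None):
    """Build the core-metrics-phylogenetic command; omit --p-n-jobs when n_jobs is None."""
    cmd = [
        "qiime", "diversity", "core-metrics-phylogenetic",
        "--i-phylogeny", phylogeny,
        "--i-table", table,
        "--p-sampling-depth", str(depth),
        "--m-metadata-file", metadata,
        "--output-dir", output_dir,
    ]
    if n_jobs is not None:
        cmd += ["--p-n-jobs", str(n_jobs)]
    return cmd


def run_with_serial_fallback(n_jobs, runner=subprocess.run, **kwargs):
    """Try the parallel run; if it exits non-zero, retry once without --p-n-jobs."""
    result = runner(core_metrics_cmd(n_jobs=n_jobs, **kwargs))
    if result.returncode != 0:
        result = runner(core_metrics_cmd(n_jobs=None, **kwargs))
    return result


# Example call (commented out so that nothing runs on import):
# run_with_serial_fallback(
#     7,
#     table="table.ftd.noUnassigned.qza",
#     phylogeny="rooted-tree.qza",
#     depth=20000,
#     metadata="metadata3",
#     output_dir="core-metrics-output",
# )
```

The `runner` argument exists only so the logic can be exercised without QIIME 2 installed. In normal use the default `subprocess.run` is what executes the command.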

Hello
I got this error today
Plugin error from emperor:

('array must not contain infs or NaNs', 'occurred at index 71d5910c4e04598e7a39ca180ef23d98')

I am trying to study beta diversity without assigning a sampling depth (I previously assigned a sampling depth and everything worked fine). So I did:
qiime diversity beta \
--i-table $FILTER/table-dada2_1.qza \
--p-metric braycurtis \
--o-distance-matrix $DIVERSITY/Alpha_norare/bray_curtis_distance_matrix.qza

Then
qiime diversity pcoa \
--i-distance-matrix $DIVERSITY/Alpha_norare/bray_curtis_distance_matrix.qza \
--o-pcoa $DIVERSITY/Alpha_norare/bray_curtis_distance_pcoa.qza

Then
qiime diversity pcoa-biplot \
--i-pcoa $DIVERSITY/Alpha_norare/bray_curtis_distance_pcoa.qza \
--i-features $FILTER/rel-freq-table-dada2_1.qza \
--o-biplot $DIVERSITY/Alpha_norare/bray_curtis_distance_pcoa_biplot.qza

Then
qiime emperor biplot \
--i-biplot $DIVERSITY/Alpha_norare/bray_curtis_distance_pcoa_biplot.qza \
--m-sample-metadata-file $WORKING/sample_metadata1.tsv \
--o-visualization $DIVERSITY/Alpha_norare/bray_curtis_distance_pcoa_biplot.qzv

This last step is giving me the above-mentioned plugin error. I did not set a number of jobs in any of these commands.

Thanks.

Seta

Hi!
I had the same issue.
On this step:

provide option:

--p-number-of-dimensions

and try different numbers of dimensions

In general, based on this response from @thermokarst, I would recommend verifying that all samples and features have at least one count whenever you encounter problems like these.
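
To make that check concrete, here is a minimal plain-Python sketch. It assumes the counts have already been exported into a list of rows, one row per sample and one column per feature. That layout, the helper names `empty_rows_and_columns` and `bray_curtis`, and the numbers are all hypothetical. It also shows why an empty sample is a problem for a metric like Bray-Curtis: comparing two samples with no counts is a 0/0 division, which this sketch returns as NaN explicitly.

```python
import math


def empty_rows_and_columns(counts):
    """Return (indices of samples with zero total, indices of features with zero total)."""
    empty_samples = [i for i, row in enumerate(counts) if sum(row) == 0]
    n_features = len(counts[0]) if counts else 0
    empty_features = [
        j for j in range(n_features) if sum(row[j] for row in counts) == 0
    ]
    return empty_samples, empty_features


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity; undefined (NaN) when both samples are empty."""
    total = sum(u) + sum(v)
    if total == 0:
        return math.nan  # 0/0: no counts in either sample
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Made-up table: sample 1 and feature 1 have no counts at all.
counts = [
    [5, 0, 3],
    [0, 0, 0],
    [2, 0, 7],
]

print(empty_rows_and_columns(counts))      # ([1], [1])
print(bray_curtis(counts[0], counts[2]))   # 7/17, a finite value
print(bray_curtis(counts[1], counts[1]))   # nan
```

If either list comes back non-empty, filtering those samples or features out of the table before computing distances is the first thing to try.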