Qiime diversity beta-phylogenetic Data must be symmetric and cannot contain NaNs error

diversity

(Scott Daniel) #1

Hi,

I’m running QIIME 2 2018.11 on a 16S dataset of 288 samples, sequenced on a MiSeq. Here is my pipeline: https://github.com/scottdaniel/TagSequencingPipeline/blob/master/qiime2/qiime2_pipeline.bash

The code runs up to the point of qiime diversity beta-phylogenetic (line 243) where it stops and spits out an error:
Plugin error from diversity:

  Data must be symmetric and cannot contain NaNs.

Debug info has been saved to /tmp/qiime2-q2cli-err-8uujpowz.log

I attached the error log even though it wasn’t that informative. error-log-for-beta-phylogenetic.txt (1.5 KB)

One thing I tried was running “qiime tools export” followed by “biom convert --to-tsv” on table.qza (FeatureTable[Frequency]) and checking the resulting table.tsv for NaNs. There were none.
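
For reference, that check can be sketched roughly as follows. This is a minimal sketch, not the exact commands from the pipeline: the function name and the output directory are hypothetical, and it assumes the 2018.11-style `--input-path`/`--output-path` options and the default `feature-table.biom` file name that the export writes.

```shell
# Hypothetical helper: export a FeatureTable[Frequency] artifact and
# convert the resulting BIOM file to a tab-separated table.
# $1 = path to the .qza artifact, $2 = output directory (both placeholders).
export_table_to_tsv() {
  qza=$1
  outdir=$2
  # Unpack the artifact; this writes feature-table.biom into $outdir.
  qiime tools export --input-path "$qza" --output-path "$outdir"
  # Convert the BIOM file to TSV so it can be inspected with grep/awk.
  biom convert -i "$outdir/feature-table.biom" -o "$outdir/table.tsv" --to-tsv
}
```

Calling `export_table_to_tsv table.qza exported-table` and then running `grep -ci nan exported-table/table.tsv` would count any lines containing NaN values.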

The big question is whether this is a bug or something wrong with my data? Should I remove the controls from the table prior to submitting it to beta-diversity calculation? Thanks a million.


(Matthew Ryan Dillon) #2

Hey there @scottdaniel!

Sorry to hear things aren’t working for you :(. Can you help us help you by preparing a small minimum working example? And please pare it down to the minimum commands required to reproduce the error. Thanks! :qiime2: :t_rex:


(Scott Daniel) #3

OK, hopefully this will illustrate the problem. I subset my original dataset with this command:

qiime feature-table filter-samples \
        --i-table ../../denoising-results/table.qza \
        --m-metadata-file ./for_forum/subset_metadata2.txt \
        --o-filtered-table ./for_forum/subset_table2.qza

Then, I checked that the subsetting worked by running:

qiime feature-table summarize \
        --i-table ./for_forum/subset_table2.qza \
        --o-visualization ./for_forum/subset_table2.qzv \
        --m-sample-metadata-file ./for_forum/subset_metadata2.txt

Next, I re-ran the “diversity beta-phylogenetic” command:

qiime diversity beta-phylogenetic \
        --i-phylogeny ../../denoising-results/rooted-tree.qza \
        --i-table ./for_forum/subset_table2.qza \
        --p-metric weighted_unifrac \
        --o-distance-matrix ./for_forum/weighted_unifrac2.qza

Got the same error. I attached the data files.

subset_table2.qza (46.5 KB)
subset_metadata2.txt (5.5 KB)
rooted-tree.qza (117.3 KB)
subset_table2.qzv (359.7 KB)


(Matthew Ryan Dillon) #5

The problem is caused by sample DNAfreewater6: it has no observations at all (check out your table summary viz). You can filter it out by adding --p-min-frequency 1 to your filter-samples command above, which will drop any samples that have no observations.
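
A minimal sketch of that filter, in case it helps. The function name and its two path arguments are hypothetical placeholders; the only option that matters here is `--p-min-frequency 1`, and you can keep your `--m-metadata-file` argument alongside it if you are subsetting at the same time.

```shell
# Hypothetical wrapper: drop samples whose total frequency is zero.
# $1 = input FeatureTable[Frequency] artifact, $2 = filtered output artifact.
drop_empty_samples() {
  qiime feature-table filter-samples \
    --i-table "$1" \
    --p-min-frequency 1 \
    --o-filtered-table "$2"
}
```

For example, `drop_empty_samples ./for_forum/subset_table2.qza ./for_forum/subset_table2_nonempty.qza` (the output name is made up), then pass the filtered table to beta-phylogenetic.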

Keep us posted! :qiime2: :t_rex:


(Scott Daniel) #7

It worked! Thanks so much for the help.

Could there be a line somewhere in the code that would tell the user that this was the sample causing the problem? If not, I’ll just have to check my samples from now on and make sure there are none with 0 features. Thanks again.
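
For anyone who wants to do that check by hand, here is a rough sketch. It assumes the table has already been exported to TSV in the layout that `biom convert --to-tsv` writes (a comment line, a "#OTU ID" header row, then one row per feature). The function name and the tiny demo table are made up for illustration and are not my real data.

```shell
# Hypothetical helper: print the names of sample columns whose counts sum to zero.
# $1 = path to a TSV written by `biom convert --to-tsv`.
find_empty_samples() {
  awk -F'\t' '
    # Header row: remember each sample name by column index.
    /^#OTU ID/ { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
    # Skip any other comment lines.
    /^#/       { next }
    # Data rows: accumulate per-sample totals.
    { for (i = 2; i <= NF; i++) total[i] += $i }
    END { for (i = 2; i <= n; i++) if (total[i] == 0) print name[i] }
  ' "$1"
}

# Made-up three-sample table in which the middle sample has no observations.
demo=$(mktemp)
printf '# Constructed from biom file\n#OTU ID\tsampleA\tDNAfreewater6\tsampleB\nf1\t10.0\t0.0\t3.0\nf2\t0.0\t0.0\t7.0\n' > "$demo"
empty_samples=$(find_empty_samples "$demo")
echo "$empty_samples"   # prints: DNAfreewater6
rm -f "$demo"
```

Any sample name this prints would need to be filtered out before running beta-phylogenetic.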


(Matthew Ryan Dillon) #9

I suppose so, but technically the error originates outside of QIIME 2 (it comes from scikit-bio). This usually isn’t a problem for most folks, since they typically provide a rarefied feature table, which will not include any zero-sequence samples.

With that said, we would certainly appreciate a contribution to add this error check, if you’re interested!

