Why does biom convert save a JSON file when --to-hdf5 is passed as an option?

taxonomy
biom
tsv

(Thomas A. Christensen II) #1

I am trying to convert a TSV feature table from metaxa2 into BIOM v2.1.0.

feature-table.tsv (406.1 KB)

This is the command I’m using:

biom convert -i feature-table.tsv \
 -o feature-table.hdf5.biom.txt \
 --table-type="OTU table" \
 --to-hdf5 \
 --process-obs-metadata taxonomy

biom validate-table -i feature-table.hdf5.biom.txt
biom summarize-table -i feature-table.hdf5.biom.txt

And I get this output:

feature-table.hdf5.biom.txt (612.1 KB)

Command Line Output
The input file is a valid BIOM-formatted file.
Num samples: 31
Num observations: 2388
Total count: 545924
Table density (fraction of non-zero values): 0.223

Counts/sample summary:
 Min: 11123.0
 Max: 33374.0
 Median: 15722.000
 Mean: 17610.452
 Std. dev.: 5656.495
 Sample Metadata Categories: None provided
 Observation Metadata Categories: taxonomy

Counts/sample detail:
250-S18: 11123.0
247-S15: 11427.0
252-S20: 11490.0
253-S21: 12177.0
251-S19: 12205.0
249-S17: 12340.0
254-S22: 12659.0
255-S23: 12660.0
260-S28: 12712.0
256-S24: 13110.0
248-S16: 13201.0
258-S26: 13265.0
257-S25: 13336.0
259-S27: 14192.0
261-S29: 15519.0
11-S3: 15722.0
12-S4: 15936.0
19-S1: 19485.0
8-S17: 19633.0
16-S8: 19943.0
15-S7: 20340.0
18-S1: 20555.0
9-S15: 21068.0
14-S6: 22174.0
10-S2: 22852.0
20-S1: 22909.0
6-S14: 22935.0
13-S5: 23184.0
21-S1: 25468.0
7-S20: 28930.0
17-S9: 33374.0

(Note that I usually use the recommended .biom suffix, but couldn’t here due to upload rules.)

However, when I examine the file with a text editor, it is clearly a JSON file in BIOM v1.0.0:

{"id": "None","format": "Biological Observation Matrix 1.0.0", ...
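The same check can be done without a text editor by looking at the first bytes of the file: HDF5 files normally start with a fixed 8-byte signature, while a BIOM v1.0.0 file is a JSON object and so starts with "{". Below is a minimal sketch, assuming a POSIX-like shell with head, od, tr, and cut available. The function name biom_container is made up for illustration and is not part of the biom tools.

```shell
# Print "hdf5", "json", or "unknown" for the file given as $1.
# biom_container is a hypothetical helper name, not a biom command.
biom_container() {
  # HDF5 files normally begin with the bytes 89 48 44 46 0d 0a 1a 0a.
  sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
  if [ "$sig" = "894844460d0a1a0a" ]; then
    echo "hdf5"
    return
  fi
  # BIOM v1.0.0 is a JSON object, so its first non-blank character is "{".
  first=$(head -c 256 "$1" | tr -d ' \t\r\n' | cut -c 1)
  if [ "$first" = "{" ]; then
    echo "json"
  else
    echo "unknown"
  fi
}
```

Running biom_container feature-table.hdf5.biom.txt on the file above would be expected to report json, given the header shown.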

I specifically want this in HDF5 (BIOM v2.1.0) because the JSON version causes problems further downstream when I try to import it into QIIME.

I have the biom-format, python-numpy, and python-h5py packages installed on my machine via apt.

I saw in this GitHub issue that sample metadata can cause problems when converting to HDF5, but this is critical observational data. Why is this conversion not successful, and is there a way to fix it?


(Evan Bolyen) #2

Thanks for pointing that out, I’ve updated the forum rules for extensions :)


(Justine) #3

Hi @millironx,

Not an exact answer, but have you tried running the commands inside your QIIME 2 environment? Since QIIME 2 is built on top of biom, it might be an easier approach. Your commands should run the same way.

Best,
Justine


(Thomas A. Christensen II) #4

Hi @jwdebelius,

Thanks for that advice. After some testing, it appears that the conflict between the globally-installed packages and the QIIME 2 environment packages was causing the issue. Removing biom-format-tools and python-h5py, then running from the QIIME 2 environment fixed the issue.
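For anyone checking for a similar conflict, here is a rough sketch of how to see which biom executable and which Python modules are actually being picked up. It assumes a POSIX-like shell and that the interpreter is called python; resolve_tool is a made-up helper name. Comparing the output inside and outside the QIIME 2 environment should show whether the system-wide copies are shadowing the environment's copies.

```shell
# Show which executable a name resolves to on the current PATH.
# resolve_tool is a hypothetical helper name used only for this sketch.
resolve_tool() {
  command -v "$1" || echo "$1: not found on PATH"
}

resolve_tool biom
resolve_tool python

# Show where that Python imports biom and h5py from, if it can import them.
if command -v python >/dev/null 2>&1; then
  python -c 'import biom, h5py; print(biom.__file__); print(h5py.__file__)' \
    || echo "python could not import biom and/or h5py"
fi
```

Paths under a system directory rather than under the QIIME 2 environment would point to the globally installed packages being used.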

Thanks again,
Thomas