Feature-table filter-samples removes taxonomy from biom table

Hi @antgonza!

Looks like you found a bug! Our general strategy with BIOM tables is to normalize them on read/write so that BIOM metadata fields such as taxonomy are stripped out (more on that below). This normalization appears to be skipped when importing BIOMV210Format files, although it is applied when importing BIOMV100Format files.

As for "stripping out BIOM metadata", the idea is that we can represent these data using other QIIME 2 semantic types. For example, the taxonomy metadata can be represented as FeatureData[Taxonomy]!
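
To make the idea concrete, here is a rough sketch (not QIIME 2's actual normalization code) of what separating a "fat" table into counts plus a standalone taxonomy mapping could look like for a JSON-style BIOM document. The helper name `split_fat_biom_json`, the toy table, and the choice to join ranks with "; " are all assumptions made for illustration.

```python
def split_fat_biom_json(doc):
    """Hypothetical helper: split a parsed BIOM 1.0 (JSON) document into
    a metadata-free table and a {feature id: taxonomy string} mapping."""
    taxonomy = {}
    for row in doc["rows"]:
        metadata = row.get("metadata") or {}  # metadata may be null in BIOM JSON
        if "taxonomy" in metadata:
            tax = metadata["taxonomy"]
            # Assumption: ranks stored as a list are joined with "; "
            taxonomy[row["id"]] = "; ".join(tax) if isinstance(tax, list) else str(tax)

    stripped = dict(doc)  # shallow copy; the count data itself is left untouched
    stripped["rows"] = [{"id": row["id"], "metadata": None} for row in doc["rows"]]
    return stripped, taxonomy


# Toy two-feature, one-sample table (made up for this example)
fat = {
    "rows": [
        {"id": "GG_OTU_1", "metadata": {"taxonomy": ["k__Bacteria", "p__Firmicutes"]}},
        {"id": "GG_OTU_2", "metadata": None},
    ],
    "columns": [{"id": "Sample1", "metadata": None}],
    "matrix_type": "sparse",
    "shape": [2, 1],
    "data": [[0, 0, 5.0], [1, 0, 3.0]],
}

table_only, taxonomy = split_fat_biom_json(fat)
print(taxonomy)  # {'GG_OTU_1': 'k__Bacteria; p__Firmicutes'}
```

The counts and the taxonomy end up as two separate objects, which mirrors how a FeatureTable[Frequency] artifact and a FeatureData[Taxonomy] artifact are kept apart.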

We support importing these types of "fat" BIOM tables in QIIME 2 by running two (or more) separate import commands:

$ qiime tools import \
  --input-path hdf5.biom \
  --output-path feature-table.qza \
  --type "FeatureTable[Frequency]"
$ qiime tools import \
  --input-path hdf5.biom \
  --output-path taxonomy.qza \
  --source-format BIOMV210Format \
  --type "FeatureData[Taxonomy]"

That error occurs because the default QIIME 2 source format for FeatureTable[Frequency] is BIOMV210Format, which isn't compatible with your JSON-style BIOM table. You can import JSON-variant BIOM files by specifying the source format as BIOMV100Format:

$ qiime tools import \
  --input-path biom.biom \
  --output-path feature-table.qza \
  --source-format BIOMV100Format \
  --type "FeatureTable[Frequency]"
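
If you aren't sure which variant a given .biom file is, a quick check of its first bytes can tell you before you import. The sketch below is only an illustration: `guess_biom_source_format` is a hypothetical helper, not part of QIIME 2 or the biom-format package. It assumes an HDF5 file with no user block (so the standard 8-byte HDF5 signature sits at the very start of the file) and treats anything that starts with `{` as JSON.

```python
import os
import tempfile

# Standard 8-byte signature at the start of an HDF5 file (no user block)
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def guess_biom_source_format(path):
    """Hypothetical helper: guess the --source-format value for a BIOM file."""
    with open(path, "rb") as fh:
        head = fh.read(64)
    if head.startswith(HDF5_SIGNATURE):
        return "BIOMV210Format"  # HDF5-based BIOM
    if head.lstrip().startswith(b"{"):
        return "BIOMV100Format"  # JSON-based BIOM
    raise ValueError(f"{path} looks like neither HDF5 nor JSON")


if __name__ == "__main__":
    # Demo with a throwaway JSON-style file
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "biom.biom")
        with open(demo, "wb") as fh:
            fh.write(b'{"format": "Biological Observation Matrix 1.0.0"}')
        print(guess_biom_source_format(demo))  # BIOMV100Format
```

This only sniffs the container type. It doesn't validate that the file is a well-formed BIOM table, so the import can still fail for other reasons.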

This issue just came up last week in an internal discussion; the open issue can be found here.

Thanks!
