--p-where doesn't work

Hello, I’m baffled by a problem when filtering a feature table by sample metadata. I’ve tried this on two separate datasets, each with its own metadata, and it failed both times. Here is my command for one of them:

qiime feature-table filter-samples --i-table table_stool-only-noNC_freqFilt100_minFreq100.qza --m-metadata-file metadata.tsv --o-filtered-table test --p-where "'Beating-time-min'='10'"

The command completed successfully, as shown by Saved FeatureTable[Frequency] to: test.qza, but when I export test.qza and try to convert it to tsv, feature-table.biom is apparently empty. The command is: biom convert -i feature-table.biom -o feature-table.tsv --to-tsv. Resulting error:

(...)
biom.exception.TableException: Cannot delimit self if I don't have data...

I was careful to put quotes around ‘Beating-time-min’ and ‘10’. I validated my metadata.tsv using Keemei, and even followed the latest metadata guidelines in the QIIME2 2018.2 release. What could be going wrong here?

If I need to share my table and metadata in order to get help, please let me know and I’ll pm you.
Thank you very much.

Hi @jjmmii! Can you send your feature table and metadata my way? Thanks!

Thanks for sending your data in a DM. A few comments:

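  1. Quoting is hard, and we would like to make this easier in the future, but for now we are stuck reading the sqlite.org docs to understand what to do. In SQLite, single quotes mark a string literal and double quotes mark an identifier such as a column name, so 'Beating-time-min'='10' compares two different strings, is never true, and filters out every sample. If you use double quotes around your column name, this works as expected: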
qiime feature-table filter-samples \
  --i-table table_stool-only-noNC_freqFilt100_minFreq100.qza \
  --m-metadata-file metadata.tsv \
  --o-filtered-table test \
  --p-where "\"Beating-time-min\"='10'" 
  2. The table you provided is already filtered down to a subset of the Beating-time-min=10 samples. Interrogating your provenance, it looks like this is a side effect of all the frequency filtering you applied (it looks like you did that 4 times: two filter-samples and two filter-features). The table you sent has 88 samples, and all of them are Beating-time-min=10. I mention this so that you know why the table appears unchanged when you run the command. After filtering, I see 88 samples and 131 features.
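
For anyone who wants to see the quoting difference outside of QIIME 2, here is a minimal sketch using Python's built-in sqlite3 module. It is not how QIIME 2 runs the filter internally; it only assumes that --p-where follows SQLite's quoting rules, as described above. The table name, sample IDs, and values are made up for illustration; only the column name Beating-time-min comes from this thread.

```python
import sqlite3

# Hypothetical miniature of a sample metadata table. Only the column name
# "Beating-time-min" comes from the thread; IDs and values are invented.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE metadata ("sample-id" TEXT, "Beating-time-min" TEXT)')
conn.executemany(
    "INSERT INTO metadata VALUES (?, ?)",
    [("S1", "10"), ("S2", "10"), ("S3", "5")],
)


def matching_ids(where):
    """Return the sample IDs selected by the given WHERE clause."""
    query = f'SELECT "sample-id" FROM metadata WHERE {where} ORDER BY "sample-id"'
    return [row[0] for row in conn.execute(query)]


# Single quotes: two string literals are compared, which is never true.
single_quoted = matching_ids("'Beating-time-min'='10'")

# Double quotes: the column is compared against the string '10'.
double_quoted = matching_ids("\"Beating-time-min\"='10'")

print(single_quoted)
print(double_quoted)
```

The single-quoted clause selects no rows at all, which matches the empty feature-table.biom in the original post, while the double-quoted clause selects the samples whose value is 10.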

Thanks @thermokarst! It works. You’re awesome.

Actually I have three other --p-where criteria (was just showing one for the test) so after using all four I got the right subset of samples. Thank you!
