Failure when attempting to filter samples

Hi,

I was wondering if you could help me track down an error that I know is on my end. I am attempting to perform “metadata based filtering” on a dataset to remove the control sample. In this case I want to remove all data corresponding to NTC under the column heading “Group” (I will pasted command an error below) in my mapping file. I know it is an issue on my end because I downloaded the table and map from your tutorial and had no issue running the same type of test using your first example, "–p-where “Subject=‘subject-1’”, with the exception that I am removing the data as opposed to retaining (but also had an issue when I tested latter so that the test was identical). I will include the top of the Map file below in addition to the command and the error. Any thoughts? FYI, I’ve successfully had qiime parse this exact Map file correctly when I ran beta and alpha diversity tests using the same column. In fact, the results from those tests are the reason I am filtering the NTC as it was muddying the waters.

Top 5 lines of Map file:
#SampleID Group Side Interproximal BuccalvLingual Description
NTC-Concat NTC NTC NTC NTC NTC-Concat
X1017-Concat I Left Mesial Lingual X1017-Concat
X1117-Concat I Right NA Buccal X1117-Concat
X1217-Concat I Left Mesial Buccal X1217-Concat

Command (using 2017.8 from beginning)
qiime feature-table filter-samples --i-table table-dada2.qza --m-metadata-file Map-Final-Concat-08Aug2016-NoRepeatedMeasures.tsv --p-where “Group=‘NTC’” --o-filtered-table no-NTC-table-dada2.qza --p-exclude-ids --verbose

Error:
/home/user/bin/anaconda3/envs/qiime2-2017.8/lib/python3.5/site-packages/pandas/core/generic.py:1362: UserWarning: The spaces in these column names will not be changed. In pandas versions < 0.14, spaces were converted to underscores.
chunksize=chunksize, dtype=dtype)
Traceback (most recent call last):
File “/home/user/bin/anaconda3/envs/qiime2-2017.8/lib/python3.5/site-packages/qiime2/metadata.py”, line 267, in ids
c.execute(query)
sqlite3.OperationalError: near “Group”: syntax error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/home/user/bin/anaconda3/envs/qiime2-2017.8/lib/python3.5/site-packages/q2cli/commands.py”, line 222, in call
results = action(**arguments)
File “”, line 2, in filter_samples
File “/home/user/bin/anaconda3/envs/qiime2-2017.8/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 201, in callable_wrapper
output_types, provenance)
File “/home/user/bin/anaconda3/envs/qiime2-2017.8/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 334, in callable_executor
output_views = callable(**view_args)
File “/home/user/bin/anaconda3/envs/qiime2-2017.8/lib/python3.5/site-packages/q2_feature_table/_filter.py”, line 72, in filter_samples
where=where, axis=‘sample’, exclude_ids=exclude_ids)
File “/home/user/bin/anaconda3/envs/qiime2-2017.8/lib/python3.5/site-packages/q2_feature_table/_filter.py”, line 45, in _filter
ids_to_keep = metadata.ids(where=where)
File “/home/user/bin/anaconda3/envs/qiime2-2017.8/lib/python3.5/site-packages/qiime2/metadata.py”, line 271, in ids
% query)
ValueError: Selection of IDs failed with query:
SELECT “#SampleID” FROM metadata WHERE Group=‘NTC’ GROUP BY “#SampleID” ORDER BY “#SampleID”;

Plugin error from feature-table:

Selection of IDs failed with query: SELECT “#SampleID” FROM metadata
WHERE Group=‘NTC’ GROUP BY “#SampleID” ORDER BY “#SampleID”;

See above for debug info.

Hi @John! We are using sqlite for the where SQL-style filtering — Group is a reserved keyword in sqlite, so the error message you are seeing is a complaint from sqlite, which is confused about why it is seeing the word group there!

You have two options moving forward:

  1. Rename your metadata column to something else (e.g. SubjectGroup)
    or
  2. Use some aggressive quotes in your where clause to tell sqlite that you really do have a category named group: --p-where "\""Group"\"='NTC'"'

Good luck! :t_rex:

3 Likes

Hah. Glad I asked. Easy fix that I probably would not have figured out. It ran without fail this time.

Thanks!
John

1 Like

Given the nature of microbiome studies, I suspect we will see this issue pop up again in the future! Thanks!

Edit for clarity: What I mean is, the metadata column ‘Group’ seems pretty commonplace in this field!

1 Like

I had a related problem where the column name is "body-site". Apparently SQLite requires aggressive quotes around column names with hyphens in order to work properly. Thank you so much for this answer!

1 Like

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.

In QIIME 2 2018.2 this error now indicates that some of the columns may be reserved names (providing a URL to check) and suggests quoting them.

1 Like