Hello, Folks,
I am working with some metadata provided by an external source (i.e., please don't blame me for its funkiness) and I ran into a little difficulty using it in a Metadata object. The metadata file has multiple data columns that differ only by case, and although I don't get any errors or warnings when loading the file into a Metadata object, I hit a "duplicate column name" error when I later try to select ids based on one of those columns. I've recreated a minimal example with a foo column and a FOO column (see below).
Based on this, I am concluding that while some sample id column names can be case-sensitive in a Metadata file (e.g., #SampleID, sample_name), the data column names are handled as case-insensitive. (I looked at the "Metadata in QIIME 2" page of the QIIME 2 2023.9.2 documentation but didn't see info about this specifically.) Could you let me know if I am correct about this?
Thank you!
Example:
minimal_metadata.tsv:
sample-id foo FOO
a b c
d e f
Attempted code (in conda qiime2-dev environment):
import qiime2
md = qiime2.Metadata.load("minimal_metadata.tsv")
md.get_ids("foo='b'")
Traceback (most recent call last):
File "/Applications/miniconda3/envs/qiime2-dev/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-6-d295b00d9a66>", line 1, in <module>
md.get_ids("foo='b'")
File "/Users/abirmingham/Work/Repositories/fork_qiime2/qiime2/metadata/metadata.py", line 683, in get_ids
self._dataframe.to_sql('metadata', conn, index=True,
File "/Applications/miniconda3/envs/qiime2-dev/lib/python3.8/site-packages/pandas/core/generic.py", line 2987, in to_sql
return sql.to_sql(
File "/Applications/miniconda3/envs/qiime2-dev/lib/python3.8/site-packages/pandas/io/sql.py", line 695, in to_sql
return pandas_sql.to_sql(
File "/Applications/miniconda3/envs/qiime2-dev/lib/python3.8/site-packages/pandas/io/sql.py", line 2187, in to_sql
table.create()
File "/Applications/miniconda3/envs/qiime2-dev/lib/python3.8/site-packages/pandas/io/sql.py", line 838, in create
self._execute_create()
File "/Applications/miniconda3/envs/qiime2-dev/lib/python3.8/site-packages/pandas/io/sql.py", line 1871, in _execute_create
conn.execute(stmt)
sqlite3.OperationalError: duplicate column name: FOO
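The last frame of the traceback is in sqlite3 rather than in QIIME 2 itself, so my guess is that the case-insensitivity comes from SQLite's handling of column names and not from the Metadata loader. Here is a minimal sketch that uses only the standard library (no QIIME 2 or pandas involved) to check that assumption. The table name is borrowed from the traceback, and the TEXT column types are my own choice for illustration:

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
try:
    # Two columns whose names differ only by case, as in minimal_metadata.tsv.
    conn.execute('CREATE TABLE metadata ("foo" TEXT, "FOO" TEXT)')
except sqlite3.OperationalError as err:
    # SQLite rejects the table because it compares column names
    # without regard to (ASCII) case.
    error_message = str(err)
else:
    error_message = None
finally:
    conn.close()

print(error_message)
```

If this prints a duplicate column name message, then the failure can be reproduced without QIIME 2 at all, which would be consistent with the error only showing up at get_ids time and not at load time.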
import sys
pyver = sys.version_info
print('Python version: %d.%d.%d' % (pyver.major, pyver.minor, pyver.micro))
Python version: 3.8.18
print('QIIME 2 release: %s' % qiime2.__release__)
QIIME 2 release: 2023.11
print('QIIME 2 version: %s' % qiime2.__version__)
QIIME 2 version: 2023.11.0.dev0+12.gc4ec793.dirty
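In case it is useful to anyone else with similarly funky metadata, here is a small sketch of how I could check a header for columns that differ only by case before trying to filter. The function name find_case_collisions is hypothetical (it is not part of QIIME 2), and lowercasing with str.lower is only an approximation of whatever comparison the database layer actually performs:

```python
from collections import defaultdict


def find_case_collisions(column_names):
    """Return groups of column names that are identical when lowercased.

    Hypothetical helper, not a QIIME 2 API. str.lower() is an
    approximation of the case-insensitive comparison done downstream.
    """
    groups = defaultdict(list)
    for name in column_names:
        groups[name.lower()].append(name)
    # Keep only the groups that contain more than one original name.
    return [names for names in groups.values() if len(names) > 1]


# Header taken from minimal_metadata.tsv above.
header = ["sample-id", "foo", "FOO"]
collisions = find_case_collisions(header)
print(collisions)  # [['foo', 'FOO']]
```

With a real file, the header list could come from the first line of the TSV, and any colliding columns could be renamed before the file is loaded.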