QIIME2 API - Bug in "#SampleID" column?

Hi all,

A potential bug that might be worth looking at. I've recently been processing some data using Python scripts that call the QIIME2 API to run an analysis pipeline. It's been working fine, but recently I had a study that kept failing with errors:

  • TypeError: not all arguments converted during string formatting or
  • TypeError: Detected non-string metadata ID of type <class 'int'>: 1

After a lot of testing I THINK the issue is that the pipeline fails if all the entries in column 1 ("#SampleID") are numeric. Even a single non-numeric entry seems to allow the pipeline to run.

Thought it would be worth pointing out in case it's a true bug to be fixed or just a quirk to be aware of. Happy to hear if others have experienced this.

Barry

Hi @bmurph79, thanks for the report, but this is not a bug; it is deliberate. We do not support non-string values for the identifier column. I think we can probably make this clearer in the Metadata Tutorial, sorry about that.

As for a single string entry letting it run: yeah, because a single string is the "lowest common denominator", all the other values in the column are coerced to strings as well.

A simple solution (if using the Python 3 API) is to set the Index type at construction:

 md = Metadata(pd.DataFrame({}, index=pd.Index([1], name='id', dtype='str')))
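If the metadata is being loaded from a TSV with pandas, the same idea can be applied at read time. This is only a minimal sketch: the two-sample file contents below are made up for illustration, and it uses pandas alone rather than any QIIME 2 call.

```python
import io

import pandas as pd

# Hypothetical metadata file whose sample IDs are all numeric.
tsv = "#SampleID\ttreatment\n1\tcontrol\n2\tdrug\n"

# Default parsing: pandas infers an integer dtype for the ID column.
inferred = pd.read_csv(io.StringIO(tsv), sep="\t")
print(inferred["#SampleID"].dtype)

# Forcing the ID column to str keeps every ID as a string.
as_text = pd.read_csv(io.StringIO(tsv), sep="\t", dtype={"#SampleID": str})
ids = as_text["#SampleID"].tolist()
print(ids)
```

The frame read this way could presumably then be indexed by its ID column and handed to Metadata as in the line above, though that last step is not exercised in this sketch.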

I hope that helps!

:qiime2:

Thanks Matt. Suspected it might be deliberate but thought it worth mentioning. Cheers for the workaround, will pass it on to the relevant people, or just add "S" before every sample number.
Cheers.
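
For anyone taking the prefix route, a minimal sketch in plain Python. The helper name prefix_sample_ids is made up for illustration and is not part of QIIME2:

```python
def prefix_sample_ids(ids, prefix="S"):
    """Return the IDs as strings, each prefixed so none is purely numeric."""
    return [f"{prefix}{sample_id}" for sample_id in ids]


renamed = prefix_sample_ids([1, 2, 10])
print(renamed)
```

The same renaming would need to be applied wherever else those sample IDs appear, otherwise the metadata and the data will no longer match up.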
