Linear mixed effect model - Sample ID issue

Hello!

I ran into an issue with the sample ID column when I tried to build a linear mixed effects model.

The command I used is:

qiime longitudinal linear-mixed-effects \
  --m-metadata-file sample-metadata.tsv \
  --m-metadata-file 2018-lme-filtered-core-metrics-results/shannon_vector.qza \
  --p-metric shannon_entropy \
  --p-random-effects Week,pH,Conductivity..uS.cm,Water.Temperature..Celcius,Average.Water.Depth..cm,Ammonia.NH3..mg.L,Nitrate.NO3..mg.L \
  --p-group-columns TreatP \
  --p-state-column Week \
  --p-individual-id-column SampleID \
  --o-visualization 2018-lme-filtered-core-metrics-results/2018-lme-shannon.qzv

The error message I received is

Plugin error from longitudinal:
SampleID is not a column in your metadata
Debug info has been saved to /tmp/qiime2-q2cli-err-o2mm8p7a.log

I checked both my metadata file and shannon_vector.qza: the sample ID column is named “SampleID” in the metadata and “Sample ID” in shannon_vector.qza. I'm not sure whether that causes the issue. I tried both names, but neither worked.

[Screenshots: shannon_vector and metadata]

Can someone help? Thank you!

Rui

Hi! I don’t think you should use the SampleID column at all.
For example, suppose you sampled four people at two time points. In that case, the individual IDs would be Person_1, Person_2, and so on, not the sample IDs, because each person is sampled at least twice. You should choose the column in your metadata file that identifies the subject, individual, or other unit that was sampled repeatedly in your analysis.
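
To illustrate the difference, here is a minimal sketch in plain Python. The table, the column names `Subject` and `Week`, and the helper `samples_per_individual` are all made up for this example and are not part of q2-longitudinal. It only counts how many samples share each value of a candidate ID column.

```python
import csv
import io
from collections import Counter

# Hypothetical metadata: four people, each sampled at two time points.
METADATA_TSV = "\n".join([
    "SampleID\tSubject\tWeek",
    "S1\tPerson_1\t1",
    "S2\tPerson_1\t2",
    "S3\tPerson_2\t1",
    "S4\tPerson_2\t2",
    "S5\tPerson_3\t1",
    "S6\tPerson_3\t2",
    "S7\tPerson_4\t1",
    "S8\tPerson_4\t2",
])


def samples_per_individual(tsv_text, id_column):
    """Count how many rows (samples) share each value of id_column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if id_column not in reader.fieldnames:
        raise KeyError(f"{id_column!r} is not a column in the table")
    return Counter(row[id_column] for row in reader)


# A usable individual ID groups several samples together ...
by_subject = samples_per_individual(METADATA_TSV, "Subject")
# ... whereas a sample ID is unique per row, so nothing is repeated.
by_sample = samples_per_individual(METADATA_TSV, "SampleID")

print(dict(by_subject))
print(dict(by_sample))
```

If every count is 1, as with `SampleID` here, the column does not describe repeated measurements and is probably not the one you want as the individual ID.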


Thank you @timanix for the quick response! That just made so much sense, I should have realized that.

However, after changing it to the correct column (“Site” in this case), I ran into a new issue. Before rerunning, I renamed the random-effect columns in my metadata file, since some of them showed up as “undefined” for some reason.

The code I used was:

qiime longitudinal linear-mixed-effects \
  --m-metadata-file sample-metadata-modified.tsv \
  --m-metadata-file 2018-lme-filtered-core-metrics-results/shannon_vector.qza \
  --p-metric shannon_entropy \
  --p-random-effects Week,pH,conductivity,WaterTemp,WaterDepth,ammonia,nitrate \
  --p-group-columns TreatP \
  --p-state-column Week \
  --p-individual-id-column Site \
  --o-visualization 2018-lme-filtered-core-metrics-results/2018-lme-shannon.qzv

Then I got this error message:

Plugin error from longitudinal:
operands could not be broadcast together with shapes (65,1) (56,1)
Debug info has been saved to /tmp/qiime2-q2cli-err-qfl3qr2c.log

Here is what’s in the log:
    individual_id_column, random_effects=random_effects, formula=formula)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2_longitudinal/_utilities.py", line 344, in _linear_effects
    re_formula=random_effects)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/regression/mixed_linear_model.py", line 1025, in from_formula
    formula, data, *args, **kwargs)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/base/model.py", line 194, in from_formula
    mod = cls(endog, exog, *args, **kwargs)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/regression/mixed_linear_model.py", line 718, in __init__
    **kwargs)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/base/model.py", line 236, in __init__
    super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/base/model.py", line 77, in __init__
    **kwargs)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/base/model.py", line 100, in _handle_data
    data = handle_data(endog, exog, missing, hasconst, **kwargs)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/base/data.py", line 672, in handle_data
    **kwargs)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/base/data.py", line 72, in __init__
    **kwargs)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/base/data.py", line 267, in handle_missing
    nan_mask = _nan_rows(
    (nan_mask[:, None],) + combined_2d)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/base/data.py", line 49, in _nan_rows
    return reduce(_nan_row_maybe_two_inputs, arrs).squeeze()
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/statsmodels/base/data.py", line 48, in _nan_row_maybe_two_inputs
    (x_is_boolean_array | _asarray_2d_null_rows(y)))
ValueError: operands could not be broadcast together with shapes (65,1) (56,1)

I'm not sure what’s going on here…

Hi @timanix, thanks for your reply. I played around with the data, and it seems that --p-random-effects cannot contain variables with missing values. After removing the variables with missing values, the model ran fine. I will post another question asking how to account for variables with missing values in an LME. Thanks again!
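
For anyone hitting the same error, a quick way to spot the offending columns before running the model is to count missing cells per column. The sketch below is plain Python on a made-up table. The column values, the set of markers treated as missing, and the helper names `columns_with_missing` and `complete_rows` are assumptions for illustration, not part of QIIME 2 or statsmodels.

```python
import csv
import io

# Hypothetical metadata with gaps in two of the measured variables.
METADATA_TSV = "\n".join([
    "SampleID\tSite\tWeek\tpH\tammonia",
    "S1\tSiteA\t1\t7.1\t0.20",
    "S2\tSiteA\t2\t\t0.25",
    "S3\tSiteB\t1\t6.9\tNA",
    "S4\tSiteB\t2\t7.0\t",
    "S5\tSiteC\t1\t7.2\t0.30",
])

# Assumption: these cell values mean "missing" in this table.
MISSING_MARKERS = {"", "NA", "NaN", "nan"}


def is_missing(value):
    """True if a cell is absent or holds one of the assumed missing markers."""
    return value is None or value.strip() in MISSING_MARKERS


def columns_with_missing(tsv_text, columns):
    """Return {column: number of missing cells} for columns that have any."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = {column: 0 for column in columns}
    for row in reader:
        for column in columns:
            if is_missing(row.get(column)):
                counts[column] += 1
    return {column: n for column, n in counts.items() if n > 0}


def complete_rows(tsv_text, columns):
    """Return only the rows that have a value in every listed column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader
            if not any(is_missing(row.get(column)) for column in columns)]


random_effects = ["Week", "pH", "ammonia"]
gaps = columns_with_missing(METADATA_TSV, random_effects)
kept = complete_rows(METADATA_TSV, random_effects)

print(gaps)
print([row["SampleID"] for row in kept])
```

The two shapes in my error, (65,1) and (56,1), would be consistent with some rows being dropped for missing values in one part of the model but not another, though that is my reading of the traceback rather than something confirmed here.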

