Here is an example to illustrate fit_constrained. It uses the Gaussian family, since I couldn't quickly find a Poisson example with categorical variables:
import pandas
import statsmodels.api as sm
from statsmodels.formula.api import glm

url = 'http://www.ats.ucla.edu/stat/data/hsb2.csv'
hsb2 = pandas.read_csv(url)

# "- 1" drops the intercept, so every level of race gets its own coefficient
mod = glm("write ~ C(race) - 1", data=hsb2, family=sm.families.Gaussian())
res = mod.fit()
print(res.summary())
Now add the constraint that all coefficients sum to zero:
res_c = mod.fit_constrained('C(race)[1] + C(race)[2] + C(race)[3] + C(race)[4] = 0')
print(res_c.summary())
                 Generalized Linear Model Regression Results
==============================================================================
Dep. Variable:                  write   No. Observations:                  200
Model:                            GLM   Df Residuals:                      197
Model Family:                Gaussian   Df Model:                            2
Link Function:               identity   Scale:                   1232.08314649
Method:                          IRLS   Log-Likelihood:                -993.41
Date:                Wed, 25 Mar 2015   Deviance:                   2.4149e+05
Time:                        16:42:37   Pearson chi2:                 2.41e+05
No. Iterations:                     1
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
C(race)[1]     1.0002    221.565      0.005      0.996      -433.260   435.260
C(race)[2]   -41.1814    267.253     -0.154      0.878      -564.988   482.626
C(race)[3]    -6.3498    235.771     -0.027      0.979      -468.453   455.754
C(race)[4]    46.5311    100.184      0.464      0.642      -149.827   242.889
==============================================================================
Model has been estimated subject to linear equality constraints.
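To see where constrained estimates of this kind come from, here is a minimal sketch that does not use statsmodels at all. It assumes the Gaussian, identity-link case with one dummy per level and no intercept, where the unconstrained estimates are simply the group means. The toy data and the function name sum_to_zero_fit are made up for illustration and are not taken from hsb2.

```python
# Toy data (hypothetical, not hsb2): responses grouped by category level.
groups = {
    1: [46.0, 50.0, 44.0],
    2: [60.0, 56.0],
    3: [48.0, 52.0, 47.0, 49.0],
    4: [54.0, 55.0, 53.0, 56.0, 52.0, 54.0],
}


def sum_to_zero_fit(groups):
    """Least-squares level effects b_k subject to sum(b_k) == 0.

    With one dummy per level and no intercept, the unconstrained estimate
    for level k is the group mean m_k. Minimising sum_k n_k * (m_k - b_k)**2
    under the constraint gives b_k = m_k - mu / n_k, where
    mu = sum(m_k) / sum(1 / n_k) is the Lagrange multiplier.
    """
    sizes = {k: len(v) for k, v in groups.items()}
    means = {k: sum(v) / len(v) for k, v in groups.items()}
    mu = sum(means.values()) / sum(1.0 / n for n in sizes.values())
    return {k: means[k] - mu / sizes[k] for k in groups}


coefs = sum_to_zero_fit(groups)
for level, b in sorted(coefs.items()):
    print(f"level {level}: {b:.4f}")
```

Small groups get shifted the most (the correction is mu / n_k), which would explain why the largest group keeps a coefficient close to its mean while the others move a long way.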
Multiple constraints are comma-separated, and each one defaults to "equals zero":
res_c2 = mod.fit_constrained('C(race)[1] + C(race)[2], C(race)[3] + C(race)[4]')
print(res_c2.summary())
The last call prints:
                 Generalized Linear Model Regression Results
==============================================================================
Dep. Variable:                  write   No. Observations:                  200
Model:                            GLM   Df Residuals:                      198
Model Family:                Gaussian   Df Model:                            1
Link Function:               identity   Scale:                   1438.99574167
Method:                          IRLS   Log-Likelihood:                -1008.9
Date:                Wed, 25 Mar 2015   Deviance:                   2.8204e+05
Time:                        16:42:37   Pearson chi2:                 2.82e+05
No. Iterations:                     1
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
C(race)[1]    13.6286    242.003      0.056      0.955      -460.689   487.946
C(race)[2]   -13.6286    242.003     -0.056      0.955      -487.946   460.689
C(race)[3]   -41.6606    111.458     -0.374      0.709      -260.115   176.794
C(race)[4]    41.6606    111.458      0.374      0.709      -176.794   260.115
==============================================================================
Model has been estimated subject to linear equality constraints.
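The mirrored coefficients within each pair follow directly from the two constraints. As a rough sketch under the same assumptions as the model above (Gaussian family, identity link, one dummy per level, no intercept), each pair can be solved by hand. The toy data and the helper name paired_fit are hypothetical and only meant to show the arithmetic, not how statsmodels computes it.

```python
# Toy data (hypothetical, not hsb2): responses grouped by category level.
groups = {
    1: [46.0, 50.0, 44.0],
    2: [60.0, 56.0],
    3: [48.0, 52.0, 47.0, 49.0],
    4: [54.0, 55.0, 53.0, 56.0, 52.0, 54.0],
}


def paired_fit(groups, pairs):
    """Least-squares effects with b_a + b_b == 0 for each (a, b) in pairs.

    Writing b_a = t and b_b = -t, minimising
    n_a * (m_a - t)**2 + n_b * (m_b + t)**2 gives
    t = (n_a * m_a - n_b * m_b) / (n_a + n_b),
    and n_k * m_k is just the sum of the responses in group k.
    """
    coefs = {}
    for a, b in pairs:
        total_a, total_b = sum(groups[a]), sum(groups[b])
        t = (total_a - total_b) / (len(groups[a]) + len(groups[b]))
        coefs[a], coefs[b] = t, -t
    return coefs


# One (a, b) tuple per comma-separated constraint "level a + level b = 0".
coefs = paired_fit(groups, pairs=[(1, 2), (3, 4)])
for level, b in sorted(coefs.items()):
    print(f"level {level}: {b:.4f}")
```

Because the two constraints involve disjoint sets of coefficients, each pair can be solved separately here; overlapping constraints would need a joint solve.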
I'm not sure how to write a patsy formula so that no level is dropped when there are several categorical explanatory variables.