Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.4k views
in Technique[技术] by (71.8m points)

pandas - How to plot statsmodels linear regression (OLS) cleanly

Problem Statement:

I have some nice data in a pandas dataframe. I'd like to run simple linear regression on it:

enter image description here

Using statsmodels, I perform my regression. Now, how do I get my plot? I've tried statsmodels' plot_fit method, but the plot is a little funky:

enter image description here

I was hoping to get a horizontal line which represents the actual result of the regression.

Statsmodels has a variety of methods for plotting regression (a few more details about them here) but none of them seem to be the super simple "just plot the regression line on top of your data" -- plot_fit seems to be the closest thing.

Questions:

  • The first picture above is from pandas' plot function, which returns a matplotlib.axes._subplots.AxesSubplot. Can I overlay a regression line easily onto that plot?
  • Is there a function in statsmodels I've overlooked?
  • Is there a better way to put together this figure?

Two related questions:

Neither seems to have a good answer.

Sample data

As requested by @IgorRaush

        motifScore  expression
6870    1.401123    0.55
10456   1.188554    -1.58
12455   1.476361    -1.75
18052   1.805736    0.13
19725   1.110953    2.30
30401   1.744645    -0.49
30716   1.098253    -1.59
30771   1.098253    -2.04

abline_plot

I had tried this, but it doesn't seem to work... not sure why:

enter image description here

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

As I mentioned in the comments, seaborn is a great choice for statistical data visualization.

import seaborn as sns

sns.regplot(x='motifScore', y='expression', data=motif)

sns.regplot


Alternatively, you can use statsmodels.regression.linear_model.OLS and manually plot a regression line.

import statsmodels.api as sm

# regress "expression" onto "motifScore" (plus an intercept)
model = sm.OLS(motif.expression, sm.add_constant(motif.motifScore))
p = model.fit().params

# generate x-values for your regression line (two is sufficient)
x = np.arange(1, 3)

# scatter-plot data
ax = motif.plot(x='motifScore', y='expression', kind='scatter')

# plot regression line on the same axes, set x-axis limits
ax.plot(x, p.const + p.motifScore * x)
ax.set_xlim([1, 2])

manual


Yet another solution is statsmodels.graphics.regressionplots.abline_plot which takes away some of the boilerplate from the above approach.

import statsmodels.api as sm
from statsmodels.graphics.regressionplots import abline_plot

# regress "expression" onto "motifScore" (plus an intercept)
model = sm.OLS(motif.expression, sm.add_constant(motif.motifScore))

# scatter-plot data
ax = motif.plot(x='motifScore', y='expression', kind='scatter')

# plot regression line
abline_plot(model_results=model.fit(), ax=ax)

abline_plot


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...