Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
203 views
in Technique by (71.8m points)

Line of best fit not accurate in Python

I am trying to create a line of best fit for a small scatter plot. Right now I am using:

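import numpy as np
import matplotlib.pyplot as plt

m, b = np.polyfit(xArray, yArray, 1)  # degree-1 (straight line) least-squares fit
xValues = np.linspace(-8, 2, 50)

plt.scatter(xList, yList)
plt.plot(xValues, m*xValues + b)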

This keeps giving me a reasonable line of best fit, but what I am looking for is a more vertical line. What could I use as a substitute for polyfit when the data has a much steeper downward trend, like this one?

[Figure: Calculated best fit line]

Question from: https://stackoverflow.com/questions/66052778/line-of-best-fit-not-accurate-in-python


1 Reply

0 votes
by (71.8m points)

Using orthogonal distance regression (also called orthogonal or total least squares) instead of polyfit gives the desired answer. SciPy has a built-in module for this, scipy.odr.

from scipy import odr

def linear_func(p, x):
    # p holds the parameters [slope, intercept]
    m, c = p
    return m*x + c

# Create a Model.
linear = odr.Model(linear_func)

# Create a Data or RealData instance.
mydata = odr.Data(xArray, yArray)

# Or, when the standard deviations of x and y (sx, sy) are known:
# mydata = odr.RealData(xArray, yArray, sx=sx, sy=sy)

# Instantiate ODR with your data, model and initial parameter estimate.
myodr = odr.ODR(mydata, linear, beta0=[1., 2.])

# Run the fit.
myoutput = myodr.run()

# Examine output. The fitted [m, c] are in myoutput.beta.
myoutput.pprint()
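
To draw the fitted line, the estimated slope and intercept can be read from the output's beta attribute. The following is a minimal sketch on made-up data: the arrays x and y are hypothetical stand-ins for xArray and yArray, and seeding beta0 with the polyfit result is a choice made here, not something the answer prescribes.

```python
import numpy as np
from scipy import odr

# Hypothetical noisy data with a steep downward trend (true slope -5),
# standing in for xArray / yArray from the question.
rng = np.random.default_rng(0)
x_true = rng.uniform(-4.0, -2.0, 60)
x = x_true + rng.normal(0.0, 0.3, 60)                # noise in x
y = -5.0 * x_true - 10.0 + rng.normal(0.0, 0.3, 60)  # noise in y

def linear_func(p, x):
    m, c = p
    return m * x + c

# Ordinary least squares (vertical residuals), also used as the starting guess
m_ols, b_ols = np.polyfit(x, y, 1)

# Orthogonal distance regression, seeded with the polyfit estimate
fit = odr.ODR(odr.Data(x, y), odr.Model(linear_func), beta0=[m_ols, b_ols]).run()
m_odr, b_odr = fit.beta  # estimated slope and intercept

print(f"polyfit slope: {m_ols:.2f}, ODR slope: {m_odr:.2f}")

# The ODR line can then be drawn the same way as in the question:
# plt.plot(xValues, m_odr * xValues + b_odr)
```

Starting from the polyfit estimate keeps the initial guess on the correct side of the slope's sign, which may help the optimizer when the true line is close to vertical.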

[Figure: Example]

Instead of minimizing the vertical distance from each point to the line, this method minimizes the perpendicular (orthogonal) distance, as shown in the figure above. This gives a better fit when your data has a steep upward or downward trend.
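
For a straight line with equal, unweighted errors in x and y, minimizing the perpendicular distances has a closed form: the line passes through the centroid of the data along its direction of largest variance. The sketch below illustrates this with NumPy only, under that assumption. The function names tls_line and perpendicular_sse and the sample points are hypothetical and not part of the original answer.

```python
import numpy as np

def tls_line(x, y):
    """Slope and intercept minimizing the sum of squared perpendicular distances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    centered = np.column_stack([x - x.mean(), y - y.mean()])
    # The first right-singular vector is the direction of largest variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    dx, dy = vt[0]
    m = dy / dx  # breaks down (dx == 0) for an exactly vertical line
    c = y.mean() - m * x.mean()
    return m, c

def perpendicular_sse(m, c, x, y):
    """Sum of squared perpendicular distances from the points to y = m*x + c."""
    return np.sum((y - (m * x + c)) ** 2) / (1.0 + m ** 2)

# Made-up points with a steep downward trend
x = np.array([-3.4, -3.1, -3.3, -2.9, -3.0, -2.7, -2.8, -2.5])
y = np.array([6.0, 5.1, 3.9, 3.2, 1.8, 1.1, -0.2, -1.0])

m_tls, c_tls = tls_line(x, y)
m_ols, c_ols = np.polyfit(x, y, 1)

print("slopes (polyfit, perpendicular):", m_ols, m_tls)
print("perpendicular error (polyfit, perpendicular):",
      perpendicular_sse(m_ols, c_ols, x, y),
      perpendicular_sse(m_tls, c_tls, x, y))
```

On scattered data like this the perpendicular fit comes out steeper than the polyfit line, which matches the "more vertical" line asked for in the question.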

There are more detailed explanations on how to use this method here.

Reference: scipy.odr documentation

