Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
558 views
in Technique[技术] by (71.8m points)

python - Bug (?) on selecting knots on scipy.insterpolate's splrep function

I have this question regarding scipy's splrep function, which I think is a bug, so I'll post every piece of code so you can reproduce it on your computers. Suppose I want to find the b-spline representation of some data, say, the one obtained by the following code, which creates a dataset as a mixture of ten gaussians with some addeded noise:

import numpy as np
# First we define the number of datapoints:
ndata = 100
x = np.arange(0,1,1./np.double(ndata))
means = np.random.uniform(0,1,10)
y = 0.0
for i in range(len(means)):
    y = y+np.exp(-(x-means[i])**2./0.01)
# We add some noise to obtain the data:
data = y + np.random.normal(0,0.05,len(y))

Which should see like this: Noisy data and the true curve (mixture of gaussians) Now, let's use splrep and splev functions to obtain the b-spline representation of this curve:

from scipy.interpolate import splrep,splev
# First define the number of knots. Let's put, say, 10 knots:
nknots = 10
# Now we crate the array of knots:
knots = np.arange(x[1],x[len(x)-1],(x[len(x)-1]-x[1])/np.double(nknots))
tck = splrep(x,data,t=knots)
fit = splev(x,tck)

If you plot everything up to here, everything seems ok: Fit with b-splines However, there are problems with certain combinations of the number of datapoints and number of knots. For example, if you try the above code with ndata = 1931 and nknots = 796, I get the following error:

File "/usr/lib/python2.7/dist-packages/scipy/interpolate/fitpack.py", line 465, in
splrep raise _iermess[ier][1](_iermess[ier][0])
ValueError:     Error on input data

This is giving me problems because the above code is not automatizable. I'm playing with datasets which have ~19000 datapoints, where a while cycle with try and except can be very computationally demanding. So my questions are:

  1. Can you reproduce this problem? And if you can...
  2. Do you know what's going on?
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I created a way around the problem. It probably has to do with a low number of datapoints to fit between two knots, so what I did was to replace the line where I create the number of knots with:

idx_knots = (np.arange(1,len(x)-1,(len(x)-2)/np.double(nknots))).astype('int')
knots = x[idx_knots]

In that way, I ensure that there are enough datapoints between the knots because I play with the indices of the x vector.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...