I don't have an elegant answer, but I do have an answer.
(SEE THE EDIT BELOW FOR A MORE ELEGANT ANSWER)
If Cx is small enough that there are no data points to fit A and Cy to, or if Cx is big enough that there are no data points to fit B and Cy to, the QR decomposition matrix will be singular, because there will be many different values of Cx, A and Cy (or Cx, B and Cy, respectively) that fit the data equally well.
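As a minimal illustration of the identifiability problem (a sketch; Cx_try is a throwaway name of mine, and it assumes the x vector from the data is in the workspace):
Cx_try <- min(x) - 1   # a candidate break point to the left of every observation
table(x < Cx_try)      # all FALSE: no data lie on the left-hand segment, so the
                       # left-hand parameters cannot be identified from the residuals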
I tested this by preventing Cx from being fitted. If I fix Cx at (say) Cx = mean(x), nls() solves the problem without difficulty:
nls(y ~ ifelse(x < mean(x),ya+A*x,yb+B*x),
data = data.frame(x,y),
start = c(A=-1000,B=-1000,ya=3,yb=0))
... gives:
Nonlinear regression model
model: y ~ ifelse(x < mean(x), ya + A * x, yb + B * x)
data: data.frame(x, y)
A B ya yb
-1325.537 -1335.918 2.628 2.652
residual sum-of-squares: 0.06614
Number of iterations to convergence: 1
Achieved convergence tolerance: 2.294e-08
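For instance (a sketch using my own object names fit0 and o, and assuming x and y are still in the workspace), this fixed-Cx fit can be stored and overlaid on the data:
fit0 <- nls(y ~ ifelse(x < mean(x), ya + A*x, yb + B*x),
            data = data.frame(x, y),
            start = c(A = -1000, B = -1000, ya = 3, yb = 0))
plot(x, y)
o <- order(x)
lines(x[o], predict(fit0)[o], col = "red")   # piecewise fit with Cx fixed at mean(x)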
That led me to think that if I transformed Cx so that it could never go outside the range [min(x), max(x)], that might solve the problem. In fact, I'd want at least three data points available to fit each of the "A" line and the "B" line, so Cx has to be between the third-lowest and the third-highest values of x. Using the atan() function with the appropriate arithmetic let me map the range [-Inf, +Inf] onto [0, 1], so I got the code:
trans <- function(x) 0.5+atan(x)/pi
xs <- sort(x)
xlo <- xs[3]
xhi <- xs[length(xs)-2]
nls(y ~ ifelse(x < xlo+(xhi-xlo)*trans(f),ya+A*x,yb+B*x),
data = data.frame(x,y),
start = c(A=-1000,B=-1000,ya=3,yb=0,f=0))
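As a quick sanity check on the transform (the test values below are arbitrary):
trans(c(-1e6, -1, 0, 1, 1e6))
# approximately 0, 0.25, 0.5, 0.75 and 1 -- so trans(f) stays inside (0, 1),
# and the starting value f = 0 places the candidate Cx midway between xlo and xhi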
Unfortunately, however, I still get the "singular gradient matrix at initial parameters" error from this code, so the problem is still over-parameterised. As @Henrik has suggested, the difference between the bilinear and single linear fit is not great for these data.
I can nevertheless get an answer for the bilinear fit. Since nls() solves the problem when Cx is fixed, I can now find the value of Cx that minimises the residual standard error by doing a one-dimensional minimisation using optimize(). Not a particularly elegant solution, but better than nothing:
xs <- sort(x)
xlo <- xs[3]
xhi <- xs[length(xs)-2]
# For a given f in [0,1], fit the two lines with the break point fixed at
# xlo + (xhi - xlo)*f; nls() converges easily because Cx is no longer free.
nn <- function(f) nls(y ~ ifelse(x < xlo + (xhi - xlo)*f, ya + A*x, yb + B*x),
                      data = data.frame(x, y),
                      start = c(A = -1000, B = -1000, ya = 3, yb = 0))
# One-dimensional search over f for the smallest residual sum of squares
ssr <- function(f) sum(residuals(nn(f))^2)
f <- optimize(ssr, interval = c(0, 1))
print(f$minimum)
print(nn(f$minimum))
summary(nn(f$minimum))
... gives output of:
[1] 0.8541683
Nonlinear regression model
model: y ~ ifelse(x < xlo + (xhi - xlo) * f, ya + A * x, yb + B * x)
data: data.frame(x, y)
A B ya yb
-1317.215 -872.002 2.620 1.407
residual sum-of-squares: 0.0414
Number of iterations to convergence: 1
Achieved convergence tolerance: 2.913e-08
Formula: y ~ ifelse(x < xlo + (xhi - xlo) * f, ya + A * x, yb + B * x)
Parameters:
Estimate Std. Error t value Pr(>|t|)
A -1.317e+03 1.792e+01 -73.493 < 2e-16 ***
B -8.720e+02 1.207e+02 -7.222 1.14e-07 ***
ya 2.620e+00 2.791e-02 93.854 < 2e-16 ***
yb 1.407e+00 3.200e-01 4.399 0.000164 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.0399 on 26 degrees of freedom
Number of iterations to convergence: 1
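From that output, the break point on the x scale can be recovered and the two fitted segments drawn explicitly (a sketch; best, Cx and cf are names I've introduced, and it assumes x, y, xlo, xhi, nn and f from above are still in scope):
best <- nn(f$minimum)
Cx   <- xlo + (xhi - xlo) * f$minimum    # break point on the x scale
cf   <- coef(best)
plot(x, y)
segments(min(x), cf["ya"] + cf["A"] * min(x), Cx, cf["ya"] + cf["A"] * Cx)   # left line
segments(Cx, cf["yb"] + cf["B"] * Cx, max(x), cf["yb"] + cf["B"] * max(x))   # right line
abline(v = Cx, lty = 2)                  # fitted changepoint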
There isn't a huge difference between the values of A and B, or between ya and yb, at the optimum value of f, but there is some difference.
(EDIT -- ELEGANT ANSWER)
Having separated the problem into two steps, it isn't necessary to use nls() any more. lm() works fine, as follows:
findBreak <- function(x, y)   # name is arbitrary; fits and plots a two-segment regression
{
  # For a candidate break point Cx, regress y on two "hinge" terms:
  # lhs(x) is non-zero only to the left of Cx and rhs(x) only to the right,
  # so the fitted segments meet at Cx and the lm() intercept is the fitted
  # y-value at the break point.
  f <- function(Cx)
  {
    lhs <- function(x) ifelse(x < Cx, Cx - x, 0)
    rhs <- function(x) ifelse(x < Cx, 0, x - Cx)
    fit <- lm(y ~ lhs(x) + rhs(x))
    c(summary(fit)$r.squared,
      summary(fit)$coef[1], summary(fit)$coef[2],
      summary(fit)$coef[3])
  }
  # Maximise R^2 over Cx by minimising its negative in one dimension
  r2 <- function(Cx) -(f(Cx)[1])
  res <- optimize(r2, interval = c(min(x), max(x)))
  res <- c(res$minimum, f(res$minimum))
  best_Cx <- res[1]
  coef1 <- res[3]   # intercept: fitted y at the break point
  coef2 <- res[4]   # slope on the lhs term
  coef3 <- res[5]   # slope on the rhs term
  plot(x, y)
  abline(coef1 + best_Cx * coef2, -coef2)   # lhs
  abline(coef1 - best_Cx * coef3, coef3)    # rhs
}
... which gives a plot of the data with the two fitted line segments meeting at the optimal break point.
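With the function given a name (findBreak above is just my label), it can be called directly on the data:
findBreak(x, y)   # plots x and y with the two fitted segments
Because lhs() and rhs() are both zero at the break point, the two fitted lines are forced to meet there, so this fit is continuous at Cx, unlike the earlier ifelse() model with its separate intercepts ya and yb.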