Usually, I try to avoid constraints as much as possible. So the first step should be: check whether the unconstrained solution happens to satisfy the constraints anyway. I'll do the programming in Python, but it is quite generic, so it should be easy to translate to Matlab.
import numpy as np
xl = np.arange(9) + 1    # x values of the data points (not used in the fit itself)
# target data of the two measurements
y1 = np.array([
0.4304, 0.2249, 0.1283, 0.0794, 0.0484, 0.0326, 0.0203, 0.0125, 0.0072
])
y2 = np.array([
0.2179, 0.1699, 0.1410, 0.1101, 0.0871, 0.0679, 0.0515, 0.0385, 0.0296
])
# the three base data sets belonging to the first measurement
bd1_y1 = np.array([
0.5587, 0.2244, 0.1023, 0.0520, 0.0276, 0.0155, 0.0089, 0.0053, 0.0033
])
bd1_y2 = np.array([
0.4580, 0.2788, 0.1198, 0.0642, 0.0342, 0.0197, 0.0115, 0.0069, 0.0043
])
bd1_y3 = np.array([
0.3584, 0.3102, 0.1540, 0.0755, 0.0440, 0.0248, 0.0148, 0.0091, 0.0056
])
# the three base data sets belonging to the second measurement
bd2_y1 = np.array([
0.3266, 0.1778, 0.1255, 0.0975, 0.0777, 0.0612, 0.0478, 0.0367, 0.0281
])
bd2_y2 = np.array([
0.2985, 0.2086, 0.1268, 0.0939, 0.0722, 0.0580, 0.0470, 0.0383, 0.0313
])
bd2_y3 = np.array([
0.2451, 0.2221, 0.1434, 0.0999, 0.0775, 0.0609, 0.0494, 0.0406, 0.0335
])
"""
if y = a x1 + b x2 + (1- a - b ) x3 then
y - x3 = a ( x1 - x3 ) + b ( x2 -x3 )
i.e. simple transformation to get a homogeneous linear system
"""
# transform to the homogeneous form (subtract the third base data set)
y1k = y1 - bd1_y3
y11k = bd1_y1 - bd1_y3
y12k = bd1_y2 - bd1_y3
y2k = y2 - bd2_y3
y21k = bd2_y1 - bd2_y3
y22k = bd2_y2 - bd2_y3
"""
So we just want to solve a simple linear system
y = a x1 + b x2
with y = (y1,...., yn ) and x1 = ( x11, x12, ..., x1n) etc
handling the two measurements with shared parameters is done by just
adding the according vectors so yp and yq go into
y_tot = ( yp1, ..., ypn, yq1, ...yqn ) and same for the x
The x for the matrix A such we can write the thing as y = A.(a,b)
This is standard linear optimization!
this is done in short in the following (first joining the data sets)
"""
y = np.append( y1k, y2k )
z1 = np.append( y11k, y21k )
z2 = np.append( y12k, y22k )
AT = np.array( [ z1, z2 ] )
A = np.transpose( AT )
U = np.dot( AT, A )                 # normal matrix A^T.A
UI = np.linalg.inv( U )
K = np.dot( UI, AT )                # K = ( A^T.A )^-1 . A^T, the pseudo-inverse
v = np.dot( K, y )                  # least-squares solution ( a, b )
a = v[0]
b = v[1]
print( a, b )
"""
in detail this should be done more carefully to avoid singular matrices etc,
e.g. via single value decomposition etc
in a next step it is usually assumed that all measurements share the
same base error s which we calculate
"""
diff = y - a * z1 - b * z2
s2 = np.sum( diff**2 ) / ( len( diff ) - 1 )    # estimated variance of the residuals
"""
The covariance matrix now is calculated via error propagation,
namely like s_a = s * sqrt( sum_i ( da / dyi )^2 ), which gives the classical
s^2 * K KT
"""
cov = s2 * np.dot( K, np.transpose( K ) )
print( cov )
"""
the error of a is (and be accordingly)
"""
sa = np.sqrt( cov[0,0] )
sb = np.sqrt( cov[1,1] )
print( sa, sb )
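As mentioned in the code comment, explicitly inverting A^T.A is not the most robust route. A minimal sketch of the SVD-based alternative, using numpy's built-in least-squares solver on the same arrays y, z1, z2 defined above (the names v_svd, K_svd, cov_svd are only illustrative):

# sketch: let numpy solve the least-squares problem via SVD;
# np.linalg.lstsq and np.linalg.pinv both handle (near-)singular A gracefully
A = np.transpose( np.array( [ z1, z2 ] ) )
v_svd = np.linalg.lstsq( A, y, rcond=None )[0]
print( v_svd )                                   # should reproduce ( a, b ) from above

K_svd = np.linalg.pinv( A )                      # pseudo-inverse via SVD
cov_svd = s2 * np.dot( K_svd, np.transpose( K_svd ) )
print( cov_svd )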
This code already joins the two data sets, which could easily be extended to n sets. The final solution looks like this:
[Figure: upper graphs show the original data, the target in blue and the three base data sets in yellow, green and red, respectively; lower graphs show the blue target and the yellow best fit without constraints.]
With the solution

>> 1.9712440231598143 -3.163257723318949

this is not within the boundaries: a > 1 and b < 0. So we know that the solution lies on the boundary, in the (a,b)-space either on the right or on the bottom. Least-squares minimization of a truly linear problem gives a true parabolic shape, so there is only one solution, no local minima.
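A quick check of this with the fitted values from above, assuming the constraints are the box 0 <= a <= 1 and 0 <= b <= 1 (the exact feasible region is an assumption here):

# sketch: check whether the unconstrained optimum is feasible
inside = ( 0 <= a <= 1 ) and ( 0 <= b <= 1 )
print( inside )    # False here, since a > 1 and b < 0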
Now, instead of trying to solve the constrained problem directly, we can force the solution onto the boundary.
"""
We just set on parameter to the according boundary value
and just solve as before. Here a matrix formalism is a small
overkill, but for consistency a good way of writing it.
"""
"""
case 1: b = 0
"""
AT = np.array( [ z1 ] )
A = np.transpose( AT )
U = np.dot( AT, A )
UI = np.linalg.inv( U )
K = np.dot( UI, AT )
v = np.dot( K, y )
aopt1 = v[0]
diff = y - aopt1 * z1 - 0 * z2
s2 = np.sum( diff**2 ) / ( len( diff ) - 1 )
cov = s2 * np.dot( K, np.transpose( K ) )
print( aopt1 )
print( cov, np.sqrt( cov[0,0] ) )
"""
case 2: a=1
"""
ymod = y - z1
AT = np.array( [ z2 ] )
A = np.transpose( AT )
U = np.dot( AT, A )
UI = np.linalg.inv( U )
K = np.dot( UI, AT )
v = np.dot( K, ymod )
bopt1 = v[0]
diff = ymod - bopt1 * z2
s2 = np.sum( diff**2 ) / ( len( diff ) - 1 )
cov = s2 * np.dot( K, np.transpose( K ) )
print( bopt1 )
print( cov, np.sqrt( cov[0,0] ) )
Which gives for a:

>> 0.3815249997379142
>> [[0.00807797]] 0.08987750010601071

and for b:

>> -1.2997646861507886
>> [[0.01736744]] 0.13178558294218914

So on the a = 1 line the optimum value for b is again not within the constraints, while with b = 0 there is a minimum for a.
Graphically this looks as follows:

[Figure: chi-square surface as a density plot. The boundary is shown in red, the true minimum is visible via the elliptic contour lines, and the local minima on a = 1 and b = 0 are shown as blue dots.]
Here one can see the parabolic minimum of the chi-square surface. In case this has to be applied to many different data sets, it should be easy to automate the decision making (see the sketch below); the calculation is fast, as everything is linear and can be done in one shot. If the true solution lies outside the boundary and one border has a local minimum within the constraints, there will be a second one on another border, but it lies on the other side and is therefore not relevant. Finally, there is the case that on no border the minimum fulfils the constraints; then the solution is the corresponding corner.
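A minimal sketch of how this decision making could be automated, assuming the constraints form the box 0 <= a <= 1, 0 <= b <= 1 and reusing the joined arrays y, z1, z2 from above (the helper names fit_one and constrained_fit are purely illustrative):

def fit_one( z, yy ):
    # one-parameter least squares yy ~ c * z
    return np.dot( z, yy ) / np.dot( z, z )

def constrained_fit( y, z1, z2 ):
    # sketch: try the unconstrained fit first, then the borders, then the corners,
    # assuming the feasible region is the box 0 <= a <= 1, 0 <= b <= 1
    A = np.transpose( np.array( [ z1, z2 ] ) )
    a, b = np.linalg.lstsq( A, y, rcond=None )[0]
    if 0 <= a <= 1 and 0 <= b <= 1:
        return a, b
    candidates = []
    for af in ( 0.0, 1.0 ):                      # borders a = 0 and a = 1
        bf = fit_one( z2, y - af * z1 )
        if 0 <= bf <= 1:
            candidates.append( ( af, bf ) )
    for bf in ( 0.0, 1.0 ):                      # borders b = 0 and b = 1
        af = fit_one( z1, y - bf * z2 )
        if 0 <= af <= 1:
            candidates.append( ( af, bf ) )
    if not candidates:                           # no feasible border minimum -> corners
        candidates = [ ( af, bf ) for af in ( 0.0, 1.0 ) for bf in ( 0.0, 1.0 ) ]
    chi2 = lambda p: np.sum( ( y - p[0] * z1 - p[1] * z2 )**2 )
    return min( candidates, key=chi2 )

print( constrained_fit( y, z1, z2 ) )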
To some extent this simple case allows one to perform manually the steps that an algorithm for inequality constraints (which is equipped for much more complicated situations) would otherwise perform.
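For comparison, a general-purpose routine for bound-constrained linear least squares does the same in one call; a minimal sketch using scipy, again assuming the box constraints 0 <= a, b <= 1:

from scipy.optimize import lsq_linear

A = np.transpose( np.array( [ z1, z2 ] ) )
res = lsq_linear( A, y, bounds=( [ 0, 0 ], [ 1, 1 ] ) )
print( res.x )    # constrained optimum ( a, b )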
Finally, the best fit within the constraints looks like this:

[Figure: as above, but with constraints.]