
machine learning - Gradient function not able to find optimal theta but normal equation does

I tried implementing my own linear regression model in Octave with some sample data, but the theta it finds does not seem to be correct and does not match the theta provided by the normal equation, which does give the correct values. However, running my model (with different alpha and iterations) on the data from Andrew Ng's machine learning course produces the proper theta for the hypothesis. I have tweaked alpha and the number of iterations so that the cost function decreases. Here is the plot of the cost function against iterations: as you can see, the cost decreases and plateaus, but not to a low enough value. Can somebody help me understand why this is happening and what I can do to fix it?

Here is the data (The first column is the x values, and the second column is the y values):

20,48
40,55.1
60,56.3
80,61.2
100,68

Here is the graph of the data together with the lines fitted by gradient descent (GD) and by the normal equation (NE).

Code for the main script:

clear; close all; clc;

%loading the data
data = load("data1.txt");

X = data(:,1);
y = data(:,2);

%Plotting the data
figure
plot(X,y, 'xr', 'markersize', 7);
xlabel("Mass in kg");
ylabel("Length in cm");

X = [ones(length(y),1), X];   % add a column of ones for the intercept term
theta = ones(2, 1);           % initial parameters


alpha = 0.000001; num_iter = 4000;
%Running gradientDescent
[opt_theta, J_history] = gradientDescent(X, y, theta, alpha, num_iter);

%Running Normal equation
opt_theta_norm = pinv(X' * X) * X' * y;

%Plotting the hypothesis for GD and NE
hold on
plot(X(:,2), X * opt_theta);
plot(X(:,2), X * opt_theta_norm, 'g-.', "markersize",10);
legend("Data", "GD", "NE");
hold off

%Plotting values of previous J with each iteration
figure
plot(1:numel(J_history), J_history);
xlabel("iterations"); ylabel("J");

The gradientDescent function:

function [theta, J_history] = gradientDescent (X, y, theta, alpha, num_iter)

m = length(y);                  % number of training examples
J_history = zeros(num_iter, 1); % cost recorded after each iteration
for iter = 1:num_iter
  % Vectorised batch update over all training examples
  theta = theta - (alpha / m) * (X' * (X * theta - y));
  J_history(iter) = computeCost(X, y, theta);
endfor
endfunction
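
In vector form, the update applied on each pass of the loop is the standard batch gradient-descent step

$$\theta := \theta - \frac{\alpha}{m} X^{\top} (X\theta - y),$$

which is exactly the line inside the for loop above.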

Function for computing cost:

function J = computeCost (X, y, theta)

m = length(y);                  % number of training examples
errors = X * theta - y;         % residuals of the current hypothesis
J = sum(errors .^ 2) / (2 * m); % mean squared error cost

endfunction
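
This is the usual mean-squared-error cost from the course,

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2, \qquad h_\theta(x) = \theta_0 + \theta_1 x,$$

evaluated here in vectorised form as sum((X*theta - y).^2) / (2*m).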

question from:https://stackoverflow.com/questions/65839043/gradient-function-not-able-to-find-optimal-theta-but-normal-equation-does


1 Reply


Try alpha = 0.0001 and num_iter = 400000. This will solve your problem!

Now, the problem with your code is that the learning rate is far too small, which slows down convergence. You are also not giving it enough time to converge: capping training at only 4000 iterations is far too few for such a small learning rate.

To summarise, the problem is: too small a learning rate + too few iterations.
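
As a quick sketch (reusing the variable names and the gradientDescent / computeCost functions from the question's script, nothing else changes):

%Suggested settings: a larger learning rate and many more iterations
alpha = 0.0001; num_iter = 400000;
[opt_theta, J_history] = gradientDescent(X, y, theta, alpha, num_iter);

%Sanity check: opt_theta should now be close to the normal-equation result
opt_theta_norm = pinv(X' * X) * X' * y;
disp([opt_theta, opt_theta_norm]);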

