Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


matlab - The pooled covariance matrix of TRAINING must be positive definite

I know this question has already been asked a couple of times, but I couldn't find a solution to my problem.

I don't have more variables than observations, and I don't have NaN values in my matrix. Here's my function:

function [ind, idx_ran] = fselect(features_f, class_f, dir)

idx = linspace(1,size(features_f, 2), size(features_f, 2));

idx_ran = idx(:,randperm(size(features_f, 2)));

features_t_ran = features_f(:,idx_ran); % randomize columns

len = length(class_f);

r = randi(len, [1, round(len*0.15)]);

x = features_t_ran;
y = class_f;

xtrain = x;
ytrain = y;

xtrain(r,:) = [];
ytrain(r,:) = [];

xtest = x(r,:);
ytest = y(r,:);

f = @(xtrain, ytrain, xtest, ytest)(sum(~strcmp(ytest, classify(xtest, xtrain, ytrain))));
fs = sequentialfs(f, x, y, 'direction', dir);

ind = find(fs < 1);

end

And here are my test and training data:

>> whos xtest
  Name         Size             Bytes  Class     Attributes

  xtest      524x42            176064  double              

>> whos xtrain
  Name           Size              Bytes  Class     Attributes

  xtrain      3008x42            1010688  double              

>> whos ytest
  Name         Size            Bytes  Class    Attributes

  ytest      524x1             32488  cell               

>> whos ytrain
  Name           Size             Bytes  Class    Attributes

  ytrain      3008x1             186496  cell               

>> 

And here's the error:

Error using crossval>evalFun (line 465)
The function
'@(xtrain,ytrain,xtest,ytest)(sum(~strcmp(ytest,classify(xtest,xtrain,ytrain))))' generated
the following error:
The pooled covariance matrix of TRAINING must be positive definite.

Error in crossval>getFuncVal (line 482)
funResult = evalFun(funorStr,arg(:));

Error in crossval (line 324)
    funResult = getFuncVal(1, nData, cvp, data, funorStr, []);

Error in sequentialfs>callfun (line 485)
    funResult = crossval(fun,x,other_data{:},...

Error in sequentialfs (line 353)
                crit(k) = callfun(fun,x,other_data,cv,mcreps,ParOptions);

Error in fselect (line 26)
fs = sequentialfs(f, x, y, 'direction', dir);

Error in workflow_forward (line 31)
    [ind, idx_ran] = fselect(features_f, class_f, 'forward');

This was working yesterday. :/


1 Reply


If you inspect the function classify, you will find that the error is raised when it checks the conditioning of the matrix R obtained from a QR decomposition of your training matrix (after the group means have been subtracted). In other words, it is unhappy with the training data you are providing: the matrix is ill-conditioned, so any solution would be unstable. The function performs the equivalent of a matrix inversion, which for an ill-conditioned training matrix amounts to dividing by a very small number. Note from the stack trace that classify is called by sequentialfs through crossval, so the matrix in question is a cross-validation training fold restricted to the feature subset currently being evaluated, not your full xtrain.
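To see what such a check looks like, here is a minimal sketch. It is my own reconstruction of the idea, not code copied from classify: the tolerance formula and every variable name are assumptions, and the data is a toy set in which the third feature is constant within each class.

```matlab
% Toy data (assumed, not the asker's): 20 observations, 3 features, 2 classes.
rng(0);
xtrain = [randn(20, 2), [zeros(10, 1); ones(10, 1)]];  % 3rd feature constant per class
ytrain = [repmat({'a'}, 10, 1); repmat({'b'}, 10, 1)];

[~, ~, g] = unique(ytrain);        % integer group index for every row
ngroups = max(g);
[n, d] = size(xtrain);

% Subtract each group's mean from the rows belonging to that group.
xc = xtrain;
for k = 1:ngroups
    rows = (g == k);
    xc(rows, :) = bsxfun(@minus, xtrain(rows, :), mean(xtrain(rows, :), 1));
end

% Economy-size QR of the centered data; R'*R is then the pooled covariance.
[~, R] = qr(xc, 0);
R = R / sqrt(n - ngroups);

% Assumed tolerance: singular values at rounding level count as zero.
s = svd(R);
tol = max(n, d) * eps(max(s));
isPosDef = all(s > tol);

fprintf('smallest singular value: %g, tolerance: %g, ok: %d\n', min(s), tol, isPosDef);
```

Running the same steps on the columns that sequentialfs is trying at the point of failure should show which singular value collapses. A feature that is constant within every class is enough to trigger it, even when it is the only feature in the subset.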

It seems that by shrinking the size of your training set the stability was reduced. My suggestion is to use a larger training set if possible.

Edit

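You may be wondering how it is possible to have more observations than variables and still have an ill-conditioned problem. The answer is that the variables (the columns of the training matrix) can be linear combinations of each other, or nearly so, or a variable can be constant within every class. In either case the centered data has a rank lower than the number of variables, no matter how many observations you have.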
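A small sketch of this, using made-up data: the matrix below has 100 observations and only 3 variables, yet it is rank deficient because its third column is the sum of the first two.

```matlab
% Toy example (assumed data): many more observations than variables.
rng(1);
x = randn(100, 2);
x(:, 3) = x(:, 1) + x(:, 2);   % third variable depends linearly on the others

r = rank(x);                   % numerical rank
c = cond(x);                   % 2-norm condition number

fprintf('size: %d x %d, rank: %d\n', size(x, 1), size(x, 2), r);
fprintf('condition number: %g\n', c);
```

Checking rank and cond on your own training matrix, or on the feature subset that fails, is a quick way to confirm whether this is what is happening.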

