Here the situation is different and somewhat misleading, especially if you compare the Keras predict_proba method with the sklearn methods of the same name. In Keras (not the sklearn wrappers), predict_proba is exactly the same as predict. You can check this in the source:
    def predict_proba(self, x, batch_size=32, verbose=1):
        """Generates class probability predictions for the input samples
        batch by batch.

        # Arguments
            x: input data, as a Numpy array or list of Numpy arrays
                (if the model has multiple inputs).
            batch_size: integer.
            verbose: verbosity mode, 0 or 1.

        # Returns
            A Numpy array of probability predictions.
        """
        preds = self.predict(x, batch_size, verbose)
        if preds.min() < 0. or preds.max() > 1.:
            warnings.warn('Network returning invalid probability values. '
                          'The last layer might not normalize predictions '
                          'into probabilities '
                          '(like softmax or sigmoid would).')
        return preds
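To try this behaviour without installing Keras, here is a minimal sketch. FakeModel is a hypothetical stand-in class (not part of Keras) whose predict returns a fixed array; its predict_proba copies the logic quoted above, with a shortened warning message. It only illustrates that the two methods return the same values and that the only extra step is the range check.

```python
import warnings

import numpy as np


class FakeModel:
    """Hypothetical stand-in for a Keras model; not part of Keras."""

    def __init__(self, outputs):
        self._outputs = np.asarray(outputs, dtype=float)

    def predict(self, x, batch_size=32, verbose=1):
        # A real model would run a forward pass; this returns fixed values.
        return self._outputs

    def predict_proba(self, x, batch_size=32, verbose=1):
        # Same logic as the quoted source: call predict, warn if any value
        # falls outside [0, 1], and return the predictions unchanged.
        preds = self.predict(x, batch_size, verbose)
        if preds.min() < 0. or preds.max() > 1.:
            warnings.warn('Network returning invalid probability values.')
        return preds


x = np.zeros((2, 3))  # dummy input, ignored by the stand-in

# Outputs already in [0, 1], as a sigmoid or softmax last layer would give.
good = FakeModel([[0.2], [0.9]])
same = np.array_equal(good.predict(x), good.predict_proba(x))

# An unnormalized (e.g. linear) last layer can leave the [0, 1] range.
bad = FakeModel([[-1.5], [2.0]])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    bad_preds = bad.predict_proba(x)
warned = len(caught) == 1
```

In both cases the returned array is whatever predict produced; the warning does not alter or normalize the values.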
So, in a binary classification case, the output you get depends on the design of your network:
- If the final output of your network is a single sigmoid unit, then the output of predict_proba is simply the probability assigned to class 1.
- If the final output of your network is a two-dimensional output to which you apply a softmax function, then the output of predict_proba is a pair [a, b], where a = P(class(x) = 0) and b = P(class(x) = 1).
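The two output layouts can be sketched with plain NumPy. The raw scores below are made-up values, and the sigmoid and softmax helpers are written here for illustration rather than taken from Keras. The sketch also shows one way to turn each layout into class labels, under the assumption of a 0.5 threshold for the sigmoid case.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Hypothetical raw scores (pre-activation) for three samples.
scores = np.array([[-2.0], [0.0], [3.0]])

# Design 1: a single sigmoid unit. One column: P(class 1).
p_sigmoid = sigmoid(scores)

# Design 2: two units followed by softmax. Two columns: [P(class 0), P(class 1)].
# Using scores [0, z] makes the second column equal to sigmoid(z).
two_unit_scores = np.hstack([np.zeros_like(scores), scores])
p_softmax = softmax(two_unit_scores)

# Converting each layout to class labels.
labels_from_sigmoid = (p_sigmoid[:, 0] > 0.5).astype(int)
labels_from_softmax = p_softmax.argmax(axis=1)
```

With these scores, both designs assign the same probability to class 1 and therefore the same labels; only the shape of the returned array differs, (3, 1) versus (3, 2).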
This second method is rarely used, and there are some theoretical advantages to using the first one, but I wanted to mention it just in case.