Jamie has a fleshed out example, but here's an example using make_scorer straight from scikit-learn documentation:
import numpy as np
def my_custom_loss_func(ground_truth, predictions):
diff = np.abs(ground_truth - predictions).max()
return np.log(1 + diff)
# loss_func will negate the return value of my_custom_loss_func,
# which will be np.log(2), 0.693, given the values for ground_truth
# and predictions defined below.
loss = make_scorer(my_custom_loss_func, greater_is_better=False)
score = make_scorer(my_custom_loss_func, greater_is_better=True)
ground_truth = [[1, 1]]
predictions = [0, 1]
from sklearn.dummy import DummyClassifier
clf = DummyClassifier(strategy='most_frequent', random_state=0)
clf = clf.fit(ground_truth, predictions)
loss(clf,ground_truth, predictions)
score(clf,ground_truth, predictions)
When defining a custom scorer via sklearn.metrics.make_scorer
, the convention is that custom functions ending in _score
return a value to maximize. And for scorers ending in _loss
or _error
, a value is returned to be minimized. You can use this functionality by setting the greater_is_better
parameter inside make_scorer
. That is, this parameter would be True
for scorers where higher values are better, and False
for scorers where lower values are better. GridSearchCV
can then optimize in the appropriate direction.
You can then convert your function as a scorer as follows:
from sklearn.metrics.scorer import make_scorer
def custom_loss_func(X_train_scaled, Y_train_scaled):
error, M = 0, 0
for i in range(0, len(Y_train_scaled)):
z = (Y_train_scaled[i] - M)
if X_train_scaled[i] > M and Y_train_scaled[i] > M and (X_train_scaled[i] - Y_train_scaled[i]) > 0:
error_i = (abs(Y_train_scaled[i] - X_train_scaled[i]))**(2*np.exp(z))
if X_train_scaled[i] > M and Y_train_scaled[i] > M and (X_train_scaled[i] - Y_train_scaled[i]) < 0:
error_i = -(abs((Y_train_scaled[i] - X_train_scaled[i]))**(2*np.exp(z)))
if X_train_scaled[i] > M and Y_train_scaled[i] < M:
error_i = -(abs(Y_train_scaled[i] - X_train_scaled[i]))**(2*np.exp(-z))
error += error_i
return error
custom_scorer = make_scorer(custom_loss_func, greater_is_better=True)
And then pass custom_scorer
into GridSearchCV
as you would any other scoring function: clf = GridSearchCV(scoring=custom_scorer)
.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…