Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How to compute auc score manually without using sklearn?

I want to compute the AUC score without using sklearn.

I have a CSV file with two columns: actual and predicted (probability). I want to compute the AUC score using the numpy.trapz() function.

Here is my code:

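import numpy as np
from tqdm import tqdm

def AUC_SCORE(x):
  tpr_list=[]  # true positive rate at each threshold
  fpr_list=[]  # false positive rate at each threshold
  x=x.sort_values(by=["proba"],ascending=False)
  for t in tqdm(x["proba"].unique()):
    x['y_pred'] =np.where( x['proba']>=t,1,0)
    # parenthesize the whole condition so .sum() counts the matching rows
    tp=((x["y"]==1)&(x["y_pred"]==1)).sum()
    fp=((x["y"]==0)&(x["y_pred"]==1)).sum()
    tn=((x["y"]==0)&(x["y_pred"]==0)).sum()
    fn=((x["y"]==1)&(x["y_pred"]==0)).sum()
    tpr= tp/(tp+fn)  # TPR = TP / (TP + FN)
    fpr= fp/(tn+fp)  # FPR = FP / (FP + TN)
    tpr_list.append(tpr)
    fpr_list.append(fpr)
  return np.trapz(tpr_list,fpr_list)
e=AUC_SCORE(a)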

I have around 10,100 points, and it takes more than 1 hour on Google Colab. I didn't get a result, and I get errors when I modify the code. Is there a better way to compute the AUC score without using sklearn?



1 Reply


The problem with your implementation seems to be here:

x=x.sort_values(by=["proba"],ascending=False)
for t in tqdm(x["proba"].unique()):

You loop over every unique probability value, but these lie in the range 0-1 (probably) and are most likely almost all distinct. With about 10,100 rows, the loop therefore runs roughly 10,100 times and rescans the whole DataFrame on each pass, which leads to a very long run. You need to translate the probability into a label. If you are using binary labels (as your attempt suggests), you can use the following list comprehension:

df["prediction"] = [0 if x<0.5 else 1 for x in df["proba"]]

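This way you translate the probability into a label; you can then sort by prediction and loop over the unique values of the predictions. Keep in mind that a single 0.5 cutoff yields only one operating point on the ROC curve, so the resulting area is a coarse approximation of the AUC computed from the raw probabilities. If you use multilabel predictions, you can extend the condition above to suit your needs.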
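
If the full ROC curve is wanted rather than a single cutoff, one option is to drop the per-threshold loop altogether: sort once by probability, then take cumulative sums of positives and negatives to get the true and false positive counts at every threshold in one pass. The following is a minimal sketch of that idea, not the method from the question or the answer above. The function name `auc_score` is hypothetical, and the sketch assumes binary labels coded as 0/1 with at least one example of each class.

```python
import numpy as np

def auc_score(y_true, proba):
    """Area under the ROC curve using one sort and cumulative sums.

    Hypothetical helper: assumes labels are 0/1 and both classes occur.
    """
    y_true = np.asarray(y_true)
    proba = np.asarray(proba, dtype=float)

    # Sort rows by predicted probability, highest first.
    order = np.argsort(-proba, kind="mergesort")
    y_sorted = y_true[order]
    p_sorted = proba[order]

    # Running counts of true/false positives as the threshold drops past each row.
    tp = np.cumsum(y_sorted == 1)
    fp = np.cumsum(y_sorted == 0)

    # Keep only the last row of each run of tied probabilities,
    # so that tied scores contribute a single ROC point.
    last_of_run = np.append(np.diff(p_sorted) != 0, True)
    tp = tp[last_of_run]
    fp = fp[last_of_run]

    # Convert counts to rates and prepend the (0, 0) starting point.
    tpr = np.concatenate(([0.0], tp / tp[-1]))
    fpr = np.concatenate(([0.0], fp / fp[-1]))

    # Trapezoidal rule written out explicitly.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

With the DataFrame from the question, this would presumably be called as `auc_score(a["y"], a["proba"])`. The cost is dominated by a single sort, so about 10,100 rows should take a fraction of a second rather than an hour.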

