I need a hand understanding the intuition behind using Cohen's kappa to measure the reliability of two classifications of the same dataset.
- Can the formula (Po - Pe) / (1 - Pe) be interpreted in terms of probabilities? I have seen this approach somewhere, rather than the observed accuracy and expected accuracy interpretation, so I wanted to frame my question from this standpoint.
"Po" would be the actual probability that the two analysers have given an element of the dataset the same label. This probability is estimated as "number of cases of agreement / total number of elements".
"Pe" would be the probability that the analysers have assigned the same category, but in the case of statistical independence. Suppose the categories are "a" and "b".
Pe would be equal to P(a) + P(b): it is the sum of the expected probability of extracting an a-element, plus the expected probability of extracting a b-element, both in the case of statistical independence. (I do not think I have really understood why this should be correct.)
Secondly, considering P(a), we would have P(a) = PA1(a) * PA2(a).
The probability of finding an a-element is the product of the probability that Analyser 1 (A1) has labeled an element as "a" and the same probability for Analyser 2 (A2).
The probability that A1 has assigned the label "a" to an element can be estimated as the number of times A1 has labeled an element as "a" over the total number of elements.
Same with the label "b".
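To make the steps above concrete, here is a minimal Python sketch of how I understand the computation. The function name `cohen_kappa` and the two toy label lists are my own assumptions for illustration, not taken from any library or real data.

```python
from collections import Counter

def cohen_kappa(labels_1, labels_2):
    """Return (Po, Pe, kappa) for two equally long label sequences.

    Hypothetical helper written for this question. It does not handle
    the degenerate case Pe == 1 (division by zero).
    """
    n = len(labels_1)
    # Po: fraction of elements on which the two analysers agree.
    po = sum(x == y for x, y in zip(labels_1, labels_2)) / n
    # Marginal relative frequencies of each label, per analyser.
    freq_1 = Counter(labels_1)
    freq_2 = Counter(labels_2)
    categories = set(freq_1) | set(freq_2)
    # Pe: chance agreement under independence, summed over categories,
    # i.e. PA1(a) * PA2(a) + PA1(b) * PA2(b) for two categories.
    pe = sum((freq_1[c] / n) * (freq_2[c] / n) for c in categories)
    kappa = (po - pe) / (1 - pe)
    return po, pe, kappa

# Toy data (assumed): ten elements labeled by A1 and A2.
a1 = ["a", "a", "a", "b", "b", "a", "b", "a", "b", "b"]
a2 = ["a", "a", "b", "b", "b", "a", "a", "a", "b", "b"]
po, pe, kappa = cohen_kappa(a1, a2)
print(po, pe, kappa)
```

With these toy lists the analysers agree on 8 of 10 elements and each uses "a" half of the time, so Po is 0.8, Pe is 0.5, and kappa comes out at roughly 0.6.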
My interpretation of the whole process would be:
We want to know the probability of finding an element they have both labeled the same. This should be P(a U b) = P(a) + P(b).
The probability that both A1 and A2 have assigned an element the category "a" will be the joint probability P(A1-a, A2-a) = PA1(a) * PA2(a), and these last two are estimated as relative frequencies.
Is this thought process correct, or have I misunderstood some points?
- I still do not understand why we should use this measure as an estimator of the degree of agreement between two datasets. What does it give us beyond the number of instances they have both categorized in the same way?
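To frame this second question, here is a small Python sketch of a case where, if I have understood the formula, the raw agreement and kappa diverge. The helper `agreement_and_kappa` and the data are assumptions I made up for illustration: both analysers label 90 of 100 elements as "a", but they never pick the same elements for "b".

```python
from collections import Counter

def agreement_and_kappa(labels_1, labels_2):
    """Hypothetical helper: return (Po, Pe, kappa) for two label lists."""
    n = len(labels_1)
    po = sum(x == y for x, y in zip(labels_1, labels_2)) / n
    freq_1, freq_2 = Counter(labels_1), Counter(labels_2)
    # Chance agreement under independence, summed over all categories.
    pe = sum((freq_1[c] / n) * (freq_2[c] / n)
             for c in set(freq_1) | set(freq_2))
    return po, pe, (po - pe) / (1 - pe)

# Assumed data: A1 puts "b" on elements 0-9, A2 puts "b" on elements 10-19,
# and both label everything else "a".
skewed_1 = ["b"] * 10 + ["a"] * 90
skewed_2 = ["a"] * 10 + ["b"] * 10 + ["a"] * 80
po_skewed, pe_skewed, kappa_skewed = agreement_and_kappa(skewed_1, skewed_2)
print(po_skewed, pe_skewed, kappa_skewed)
```

Here the analysers agree on 80% of the elements, yet the agreement expected by chance is already about 82% because "a" is so frequent, so kappa is slightly negative (about -0.11). Is this the kind of situation the measure is meant to expose?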
Thanks in advance for the help.
question from:
https://stackoverflow.com/questions/65648935/explanation-of-the-intuition-behind-the-cohens-kappa-usage