I'm looking for an implementation of the K-Nearest Neighbor algorithm in Java for unstructured data. I found many implementations for numeric data, but how can I implement it and calculate the Euclidean distance for text (Strings)?
Here is one example for double values:
public static double EuclideanDistance(double[] X, double[] Y)
{
    // Both vectors must have the same number of elements.
    if (X.length != Y.length)
    {
        throw new IllegalArgumentException("the number of elements" +
            " in X must match the number of elements in Y");
    }
    double sum = 0.0;
    for (int i = 0; i < X.length; i++)
    {
        // Add the squared difference for each dimension.
        double diff = X[i] - Y[i];
        sum += diff * diff;
    }
    return Math.sqrt(sum);
}
How can I implement it for Strings (unstructured data)? For example:
Class 1:
"It was amazing. I loved it"
"It is perfect movie"
Class 2:
"Boring. Boring. Boring."
"I do not like it"
How can we implement KNN on this type of data and calculate the Euclidean distance?
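To make the question concrete, here is a minimal sketch of one possible approach, assuming a bag-of-words representation: each text becomes a vector of word counts over the vocabulary of the training texts, and those vectors are compared with the usual Euclidean distance. The class name `TextKnn` and its methods are hypothetical names chosen for illustration, not part of any library. Raw word counts are an assumption as well; TF-IDF weights or cosine distance are common alternatives for text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper class; all names here are illustrative.
class TextKnn {

    // Lowercase the text and split it on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Bag of words: entry i counts how often vocabulary word i occurs in the text.
    static double[] vectorize(String text, List<String> vocabulary) {
        double[] vector = new double[vocabulary.size()];
        for (String token : tokenize(text)) {
            int index = vocabulary.indexOf(token);
            if (index >= 0) {
                vector[index]++; // words outside the vocabulary are ignored
            }
        }
        return vector;
    }

    static double euclideanDistance(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Classify the query by majority vote among its k nearest training texts.
    static String classify(String query, List<String> texts, List<String> labels, int k) {
        if (texts.size() != labels.size() || k < 1 || k > texts.size()) {
            throw new IllegalArgumentException("need one label per text and 1 <= k <= number of texts");
        }
        // The vocabulary is every distinct word in the training texts, sorted.
        TreeSet<String> words = new TreeSet<>();
        for (String text : texts) {
            words.addAll(tokenize(text));
        }
        List<String> vocabulary = new ArrayList<>(words);

        double[] queryVector = vectorize(query, vocabulary);
        double[] distances = new double[texts.size()];
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < texts.size(); i++) {
            distances[i] = euclideanDistance(queryVector, vectorize(texts.get(i), vocabulary));
            order.add(i);
        }
        // Sort training indices from nearest to farthest (the sort is stable for ties).
        order.sort(Comparator.comparingDouble(i -> distances[i]));

        // Count votes; on a tied vote the label of the nearer neighbor wins.
        Map<String, Integer> votes = new LinkedHashMap<>();
        for (int n = 0; n < k; n++) {
            votes.merge(labels.get(order.get(n)), 1, Integer::sum);
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : votes.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> texts = Arrays.asList(
            "It was amazing. I loved it", "It is perfect movie",
            "Boring. Boring. Boring.", "I do not like it");
        List<String> labels = Arrays.asList("Class 1", "Class 1", "Class 2", "Class 2");
        System.out.println(classify("I loved it", texts, labels, 3));
    }
}
```

With only four short training texts the vectors are very sparse, so the result depends heavily on the choice of k and on which words happen to overlap; a larger training set would be needed before judging this approach.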