Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
716 views
in Technique[技术] by (71.8m points)

scala - Difference between org.apache.spark.ml.classification and org.apache.spark.mllib.classification

I'm writing a spark application and would like to use algorithms in MLlib. In the API doc I found two different classes for the same algorithm.

For example, there is one LogisticRegression in org.apache.spark.ml.classification also a LogisticRegressionwithSGD in org.apache.spark.mllib.classification.

The only difference I can find is that the one in org.apache.spark.ml is inherited from Estimator and was able to be used in cross validation. I was quite confused that they are placed in different packages. Is there anyone know the reason for it? Thanks!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

It's JIRA ticket

And From Design Doc:

MLlib now covers a basic selection of machine learning algorithms, e.g., logistic regression, decision trees, alternating least squares, and k-means. The current set of APIs contains several design flaws that prevent us moving forward to address practical machine learning pipelines, make MLlib itself a scalable project.

The new set of APIs will live under org.apache.spark.ml, and o.a.s.mllib will be deprecated once we migrate all features to o.a.s.ml.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...