I have already answered a similar question before. The error message says it all: with Spark < 2.x, you'll need a HiveContext in your application jar; there is no other way around it.
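For example, in a pre-2.0 standalone application you would build the HiveContext yourself. Here is a minimal Scala sketch (the app name is a placeholder, and it assumes the spark-hive dependency is available to your application jar):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object MyApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("my-app")   // placeholder app name
    val sc = new SparkContext(conf)
    // In Spark < 2.0, HiveContext is the entry point for both SQL queries and Hive commands
    val sqlContext = new HiveContext(sc)

    sqlContext.sql("SHOW TABLES").show()
  }
}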
You can read further about the difference between SQLContext and HiveContext here.
SparkSQL has a SQLContext and a HiveContext. HiveContext is a superset of SQLContext, and the Spark community suggests using the HiveContext. You can see that when you run spark-shell, which is your interactive driver application: it automatically creates a SparkContext defined as sc and a HiveContext defined as sqlContext. The HiveContext allows you to execute SQL queries as well as Hive commands.
You can check that inside your spark-shell:
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/
Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_74)
scala> sqlContext.isInstanceOf[org.apache.spark.sql.hive.HiveContext]
res0: Boolean = true
scala> sqlContext.isInstanceOf[org.apache.spark.sql.SQLContext]
res1: Boolean = true
scala> sqlContext.getClass.getName
res2: String = org.apache.spark.sql.hive.HiveContext
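Since sqlContext is a HiveContext there, you can run HiveQL statements directly from the same shell. A small sketch (the table name is made up):

scala> sqlContext.sql("CREATE TABLE IF NOT EXISTS demo (key INT, value STRING)")

scala> sqlContext.sql("SHOW TABLES").show()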
By inheritance, a HiveContext is actually an SQLContext, but it's not true the other way around. You can check the source code if you are more interested in knowing how HiveContext inherits from SQLContext.
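You can also verify the reverse direction in the same shell: a plain SQLContext built on top of sc is not a HiveContext (the variable name below is mine):

scala> val plainSqlContext = new org.apache.spark.sql.SQLContext(sc)

scala> plainSqlContext.isInstanceOf[org.apache.spark.sql.hive.HiveContext]
res3: Boolean = false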
Since Spark 2.0, you just need to create a SparkSession (as the single entry point), which removes the HiveContext/SQLContext confusion.
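A minimal Spark 2.x sketch (the app name is a placeholder; enableHiveSupport() is only needed if you still want the Hive features the old HiveContext provided):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("my-app")          // placeholder app name
  .enableHiveSupport()        // brings in the old HiveContext capabilities
  .getOrCreate()

spark.sql("SHOW TABLES").show()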