In Spark, there are three primary ways to specify the options for the `SparkConf` used to create the `SparkContext`:

- As properties in `conf/spark-defaults.conf`, e.g. the line `spark.driver.memory 4g`
- As arguments to `spark-shell` or `spark-submit`, e.g. `spark-shell --driver-memory 4g ...`
- In your source code, by configuring a `SparkConf` instance before using it to create the `SparkContext`, e.g. `sparkConf.set("spark.driver.memory", "4g")`
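For comparison, option 3 in a standalone application might look like the following sketch (the application name and master URL are illustrative, not from the original):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch of option 3: configure SparkConf in code, then create the context.
val sparkConf = new SparkConf()
  .setAppName("my-app")              // illustrative name
  .setMaster("local[*]")             // adjust for your cluster
  .set("spark.driver.memory", "4g")  // note: in client mode the driver JVM is
                                     // already running by this point, so this
                                     // particular setting is usually better
                                     // passed via spark-submit or
                                     // spark-defaults.conf
val sc = new SparkContext(sparkConf)
```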
However, when using `spark-shell`, the `SparkContext` is already created for you by the time you get a shell prompt, in the variable named `sc`. When using `spark-shell`, how do you use option #3 in the list above to set configuration options, if the `SparkContext` is already created before you have a chance to execute any Scala statements?
In particular, I am trying to use Kryo serialization with GraphX. The prescribed way to use Kryo with GraphX is to execute the following Scala statement when customizing the `SparkConf` instance:

    GraphXUtils.registerKryoClasses(sparkConf)
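In a standalone application, that registration fits into the normal option-3 flow roughly like this (a sketch; the app name is illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.GraphXUtils

val sparkConf = new SparkConf()
  .setAppName("graphx-kryo")  // illustrative name
  // Switch the serializer to Kryo; GraphXUtils then registers GraphX's
  // internal classes with it for compact serialization.
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
GraphXUtils.registerKryoClasses(sparkConf)
val sc = new SparkContext(sparkConf)
```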
How do I accomplish this when running `spark-shell`?
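One workaround I am aware of (an untested sketch, assuming the REPL lets a new `val` shadow `sc` and that `sc.getConf` returns a copy of the shell's settings) is to stop the pre-built context and create a replacement:

```scala
// Inside spark-shell: replace the pre-built SparkContext.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.GraphXUtils

val conf = sc.getConf  // returns a copy of the shell's existing settings
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
GraphXUtils.registerKryoClasses(conf)
sc.stop()                            // discard the shell's pre-built context
val sc2 = new SparkContext(conf)     // use sc2 instead of sc from here on
```

This feels heavy-handed, though, so I am hoping there is a cleaner, supported way.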