
amazon ec2 - Submitting jobs to Spark EC2 cluster remotely

I've set up an EC2 cluster with Spark. Everything works; all master/slave nodes are up and running.

I'm trying to submit a sample job (SparkPi). When I ssh into the cluster and submit it from there, everything works fine. However, when the driver is created on a remote host (my laptop), it doesn't work. I've tried both modes for --deploy-mode:

--deploy-mode=client:

From my laptop:

./bin/spark-submit --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 --class SparkPi ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar

This results in the following warnings/errors, repeating indefinitely:

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/02/22 18:30:45 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 0
15/02/22 18:30:45 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 1

...and failed drivers appear in the Spark Web UI under "Completed Drivers" with "State=ERROR".

I've tried passing limits for cores and memory to the submit script, but that didn't help...
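For reference, this is the kind of invocation I tried (the core and memory values here are just illustrative):

./bin/spark-submit --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 --total-executor-cores 2 --executor-memory 512m --class SparkPi ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar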

--deploy-mode=cluster:

From my laptop:

./bin/spark-submit --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 --deploy-mode cluster --class SparkPi ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar

The result is:

.... Driver successfully submitted as driver-20150223023734-0007
... waiting before polling master for driver state
... polling master for driver state
State of driver-20150223023734-0007 is ERROR
Exception from cluster was: java.io.FileNotFoundException: File file:/home/oleg/spark/spark12/ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar does not exist.
java.io.FileNotFoundException: File file:/home/oleg/spark/spark12/ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:329)
    at org.apache.spark.deploy.worker.DriverRunner.org$apache$spark$deploy$worker$DriverRunner$$downloadUserJar(DriverRunner.scala:150)
    at org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:75)

So I'd appreciate any pointers on what is going wrong, and some guidance on how to deploy jobs from a remote client. Thanks.

UPDATE: Regarding the second issue (cluster mode): the jar must be visible to every cluster node, so it has to live in a location they can all access. This solves the FileNotFoundException, but then leads to the same issue as in client mode.
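For example, after uploading the jar to a location every node can reach (the HDFS path and port below are placeholders; an s3n:// URL should work too, provided the cluster has the AWS credentials for it):

./bin/spark-submit --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 --deploy-mode cluster --class SparkPi hdfs://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:9000/jars/ec2test_2.10-0.0.1.jar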


1 Reply


I advise against submitting Spark jobs remotely using the port-opening strategy: it can create security problems, and in my experience it's more trouble than it's worth, especially because of the time spent troubleshooting the communication layer.

Alternatives:

1) Livy - now an Apache project! http://livy.io or http://livy.incubator.apache.org/ (see the sketch after this list)

2) Spark Job Server - https://github.com/spark-jobserver/spark-jobserver
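As a rough sketch of the Livy route (the host name and jar location below are placeholders; Livy listens on port 8998 by default), a batch job can be submitted with a single REST call:

curl -X POST -H 'Content-Type: application/json' -d '{"file": "hdfs:///jars/ec2test_2.10-0.0.1.jar", "className": "SparkPi"}' http://your-livy-host:8998/batches

Livy then runs spark-submit on the cluster side, so the jar never has to be reachable from your laptop and no Spark ports need to be opened to the outside world.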

