I am trying to use Spark with Python. I installed the Spark 1.0.2 for Hadoop 2 binary distribution from the downloads page. I can run through the quickstart examples in Python interactive mode, but now I'd like to write a standalone Python script that uses Spark. The quick start documentation says to just `import pyspark`, but this doesn't work because it's not on my PYTHONPATH.
I can run `bin/pyspark` and see that the module is installed beneath `SPARK_DIR/python/pyspark`. I can manually add this to my PYTHONPATH environment variable, but I'd like to know the preferred automated method.
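For reference, this is roughly what my manual workaround looks like today. The paths and names here are illustrative, not anything documented by Spark:

```python
# Rough sketch of the manual workaround: put the Spark Python directory on
# sys.path before importing pyspark. SPARK_HOME and the fallback path are
# assumptions -- adjust to wherever the binary distribution was unpacked.
import os
import sys

spark_home = os.environ.get("SPARK_HOME", "/opt/spark-1.0.2-bin-hadoop2")
sys.path.insert(0, os.path.join(spark_home, "python"))
# Depending on the Spark version, the bundled py4j zip under
# SPARK_HOME/python/lib may also need to be added to sys.path.

from pyspark import SparkContext

sc = SparkContext("local", "StandaloneExample")
print(sc.parallelize(range(100)).sum())  # sanity check: prints 4950
sc.stop()
```

This works, but hard-coding the install path into every script is exactly what I'm hoping to avoid.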
What is the best way to add `pyspark` support for standalone scripts? I don't see a `setup.py` anywhere under the Spark install directory. How would I create a pip package for a Python script that depends on Spark?
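To make the question concrete, this is the sort of `setup.py` I expected to find or be able to write myself. It is entirely hypothetical (the package and module names are made up), and the sticking point is the empty dependency list:

```python
# Hypothetical setup.py for a script that depends on Spark. Nothing like
# this ships with the binary distribution.
from setuptools import setup

setup(
    name="my-spark-job",          # hypothetical package name
    version="0.1.0",
    py_modules=["my_spark_job"],  # hypothetical module containing the script
    # pyspark is not published on PyPI for this release, so it can't simply
    # be listed in install_requires -- which is essentially my question.
    install_requires=[],
)
```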