My current Java/Spark unit-test approach (detailed here) works by instantiating a SparkContext with master "local" and running the tests under JUnit.
The code has to be organized so that I/O happens in one function, which then calls another function with multiple RDDs.
This works great. I have a highly tested data transformation written in Java + Spark.
Can I do the same with Python?
How would I run Spark unit tests with Python?
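For concreteness, this is the kind of structure I'd hope for in Python: a minimal sketch, assuming PySpark's SparkContext can be pointed at "local" the same way. The transformation name `split_and_count` is just a made-up example, not my actual code:

    import unittest

    from pyspark import SparkConf, SparkContext


    def split_and_count(lines_rdd):
        # Pure transformation: takes an RDD, returns an RDD.
        # No I/O here, so a test can feed it an in-memory RDD.
        return (lines_rdd
                .flatMap(lambda line: line.split())
                .map(lambda word: (word, 1))
                .reduceByKey(lambda a, b: a + b))


    class SplitAndCountTest(unittest.TestCase):
        @classmethod
        def setUpClass(cls):
            # Local-mode context shared across the test class,
            # mirroring the "local" SparkContext used with JUnit.
            conf = SparkConf().setMaster("local[2]").setAppName("unit-tests")
            cls.sc = SparkContext(conf=conf)

        @classmethod
        def tearDownClass(cls):
            cls.sc.stop()

        def test_counts_words(self):
            lines = self.sc.parallelize(["a b", "b c c"])
            result = dict(split_and_count(lines).collect())
            self.assertEqual(result, {"a": 1, "b": 2, "c": 2})


    if __name__ == "__main__":
        unittest.main()

Is this a reasonable pattern, or is there a more idiomatic way to set up and tear down the context for PySpark tests?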