I guess you are trying to use a pandas DataFrame
instead of a Spark DataFrame.
A pandas DataFrame has no registerTempTable method,
so you can try creating a Spark DataFrame from the pandas one first.
I've tested it under Cloudera (with the Anaconda parcel installed, which includes the pandas module).
Make sure that you have set PYSPARK_PYTHON
to your Anaconda Python installation (or another one that contains the pandas module) on all your Spark workers (usually in spark-conf/spark-env.sh).
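A quick way to check which interpreter is actually in use, and whether it can see pandas, is sketched below. This is only an illustration: the helper name python_env_report is made up here, and it assumes Python 3.4+ (for importlib.util.find_spec).

```python
import importlib.util
import os
import sys


def python_env_report():
    """Summarize which Python is running and whether it can import pandas.

    Hypothetical helper for illustration; not part of Spark or pandas.
    """
    return {
        # Value of PYSPARK_PYTHON as seen by this process (None if unset).
        "PYSPARK_PYTHON": os.environ.get("PYSPARK_PYTHON"),
        # Path of the interpreter that is executing this code.
        "executable": sys.executable,
        # True if the pandas module can be located by this interpreter.
        "has_pandas": importlib.util.find_spec("pandas") is not None,
    }


print(python_env_report())

# Assumption: inside a pyspark shell, where `sc` already exists, the same
# helper could be run in a worker task to inspect the worker-side Python:
#   sc.parallelize([0], 1).map(lambda _: python_env_report()).collect()
```

If has_pandas comes back False on a worker, that worker is most likely not running the Python installation you expected.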
Here is the result of my test:
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame(np.random.randint(0,100,size=(10, 3)), columns=list('ABC'))
>>> sdf = sqlContext.createDataFrame(df)
>>> sdf.show()
+---+---+---+
|  A|  B|  C|
+---+---+---+
| 98| 33| 75|
| 91| 57| 80|
| 20| 87| 85|
| 20| 61| 37|
| 96| 64| 60|
| 79| 45| 82|
| 82| 16| 22|
| 77| 34| 65|
| 74| 18| 17|
| 71| 57| 60|
+---+---+---+
>>> sdf.printSchema()
root
 |-- A: long (nullable = true)
 |-- B: long (nullable = true)
 |-- C: long (nullable = true)
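Once you have a Spark DataFrame, registerTempTable is available on it. A minimal sketch of the whole step, assuming a sqlContext like the one in the session above; the helper name register_pandas_as_table and the table name my_table are made up for illustration:

```python
def register_pandas_as_table(sql_context, pandas_df, table_name):
    """Convert a pandas DataFrame to a Spark DataFrame and register it.

    Hypothetical helper: `sql_context` is assumed to be a Spark SQLContext
    such as the `sqlContext` provided by the pyspark shell.
    """
    # Build a Spark DataFrame from the pandas one.
    spark_df = sql_context.createDataFrame(pandas_df)
    # Register it under a name that SQL queries can refer to.
    spark_df.registerTempTable(table_name)
    return spark_df


# Usage in a pyspark shell (assumes `sqlContext` and the pandas `df` above):
#   sdf = register_pandas_as_table(sqlContext, df, "my_table")
#   sqlContext.sql("SELECT A, B FROM my_table WHERE C > 50").show()
```

On Spark 2.0 and later, registerTempTable is deprecated in favor of createOrReplaceTempView, so you may want to use that instead there.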