We are reading data from a MongoDB collection. A collection column holds values of two different types (e.g. (bson.Int64, int) and (int, float)).
I am trying to get the datatype of a column using PySpark. My problem is that some columns have mixed datatypes.
Assume quantity and weight are the columns:
quantity          weight
----------------  --------
12300             656
123566000000      789.6767
1238              56.22
345               23
345566677777789   21
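Since the types were never declared, the first thing worth checking is what Spark actually inferred for each column. PySpark exposes this directly on the DataFrame; a minimal sketch, assuming dataframe is the DataFrame loaded from the collection:

    # Print the full inferred schema as a tree
    dataframe.printSchema()

    # List of (column_name, type_string) pairs,
    # e.g. [('quantity', 'bigint'), ('weight', 'double')]
    print(dataframe.dtypes)

    # Datatype of a single column
    print(dataframe.schema["quantity"].dataType)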
Actually, we didn't define a datatype for any column of the MongoDB collection.
When I query the count from the PySpark dataframe

dataframe.count()

I get an exception like this:

"Cannot cast STRING into a DoubleType (value: BsonString{value='200.0'})"
question from:
https://stackoverflow.com/questions/45033315/get-datatype-of-column-using-pyspark