I'm getting invalid timestamps when reading Elasticsearch records with Spark and the elasticsearch-hadoop library. I'm using the following Spark code to read the records:
```scala
val sc = spark.sqlContext

val elasticFields = Seq(
  "start_time",
  "action",
  "category",
  "attack_category"
)

sc.sql(
  "CREATE TEMPORARY TABLE myIndex " +
  "USING org.elasticsearch.spark.sql " +
  "OPTIONS (resource 'aggattack-2021.01')")

val all = sc.sql(
  s"""
     |SELECT ${elasticFields.mkString(",")}
     |FROM myIndex
     |""".stripMargin)

all.show(2)
```
This produces the following result:
```
+-----------------------+------+---------+---------------+
|start_time             |action|category |attack_category|
+-----------------------+------+---------+---------------+
|1970-01-19 16:04:27.228|drop  |udp-flood|DoS            |
|1970-01-19 16:04:24.027|drop  |others   |DoS            |
+-----------------------+------+---------+---------------+
```
But I'm expecting timestamps from the current year, e.g. 2021-01-19 16:04:27.228. In Elasticsearch, the start_time field holds a unix timestamp with millisecond precision: "start_time": 1611314773.641.
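As a sanity check that bypasses the connector's date handling, elasticsearch-hadoop has an option, es.mapping.date.rich, which when set to false returns date fields as the raw primitives stored in the index instead of converting them. A minimal sketch of such a raw read, assuming the same index and the spark session from the snippet above:

```scala
// Sketch: read the same index with rich date conversion disabled,
// so start_time comes back as the primitive stored in Elasticsearch
// rather than a converted timestamp.
val rawDf = spark.read
  .format("org.elasticsearch.spark.sql")
  .option("es.mapping.date.rich", "false")
  .load("aggattack-2021.01")

rawDf.select("start_time").show(2, truncate = false)
```

If the values printed here look like 1.611314773641E9, the field contains epoch seconds while something in the pipeline is treating it as epoch millis, which would explain the 1970 dates.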