Invalid timestamp when reading Elasticsearch records with Spark

Question

Asked by Jacfal on 23 January 2021 at 11:22

I'm getting an invalid timestamp when reading Elasticsearch records using Spark with the elasticsearch-hadoop library. I'm using the following Spark code to read the records:

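  // Assumes an existing SparkSession named `spark`
  val sc = spark.sqlContext

  // Fields to select from the Elasticsearch index
  val elasticFields = Seq(
    "start_time",
    "action",
    "category",
    "attack_category"
  )

  // Register the index as a temporary table backed by elasticsearch-hadoop
  sc.sql(
    "CREATE TEMPORARY TABLE myIndex " +
      "USING org.elasticsearch.spark.sql " +
      "OPTIONS (resource 'aggattack-2021.01')")

  // Query only the selected fields and print the first two rows
  val all = sc.sql(
    s"""
      |SELECT ${elasticFields.mkString(",")}
      |FROM myIndex
      |""".stripMargin)
  all.show(2)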

Which leads to the following result:

+-----------------------+------+---------+---------------+
|start_time             |action|category |attack_category|
+-----------------------+------+---------+---------------+
|1970-01-19 16:04:27.228|drop  |udp-flood|DoS            |
|1970-01-19 16:04:24.027|drop  |others   |DoS            |
+-----------------------+------+---------+---------------+

But I'm expecting a timestamp in the current year, e.g. 2021-01-19 16:04:27.228. In Elasticsearch, the start_time field holds a Unix timestamp with millisecond precision, e.g. "start_time": 1611314773.641

There is 1 answer

**Jacfal** · Accepted Answer · 2021-01-25T19:34:56+00:00

The problem was with the data in Elasticsearch. The start_time field was mapped as epoch_seconds and contained epoch-second values with three decimal places (e.g. 1611583978.684). Everything works fine after we converted the epoch time to milliseconds without any decimal places.
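
The effect can be reproduced with plain JVM date arithmetic. The following is a minimal sketch, not the connector's actual code path: it assumes the whole-number part of the seconds value ended up being read as milliseconds, which is consistent with the 1970 dates shown in the question. The object `EpochFix` and the helper `secondsToMillis` are hypothetical names introduced only for illustration.

```scala
import java.time.Instant

object EpochFix {
  // Hypothetical helper: turn epoch seconds with a fractional part
  // (e.g. "1611583978.684") into whole epoch milliseconds.
  // BigDecimal avoids floating-point rounding on the decimal digits.
  def secondsToMillis(epochSeconds: String): Long =
    (BigDecimal(epochSeconds) * 1000)
      .setScale(0, BigDecimal.RoundingMode.HALF_UP)
      .toLongExact

  def main(args: Array[String]): Unit = {
    // Assumption: the seconds value is treated as if it were milliseconds.
    val misread = Instant.ofEpochMilli(1611583978L)
    println(misread) // 1970-01-19T15:39:43.978Z

    // Same value converted to whole milliseconds first.
    val fixed = Instant.ofEpochMilli(secondsToMillis("1611583978.684"))
    println(fixed) // 2021-01-25T14:12:58.684Z
  }
}
```

The first result lands in January 1970, like the rows in the question; the second is the expected 2021 timestamp, which corresponds to storing the value as whole milliseconds as described in the answer.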
