How to convert unix timestamp to date in Spark

I have a DataFrame with a column of Unix timestamps in milliseconds (e.g. 1435655706000), and I want to convert it to a date with the format 'yyyy-MM-dd'. I've tried nscala-time, but it doesn't work.

val time_col = sqlc.sql("select ts from mr").map(_(0).toString.toDateTime)
time_col.collect().foreach(println)

I got this error: java.lang.IllegalArgumentException: Invalid format: "1435655706000" is malformed at "6000"

There are 7 answers

0
Abhinav Kaushal Keshari On

What you can do is:

input.withColumn("time", concat(
  from_unixtime(input.col("COL_WITH_UNIX_TIME") / 1000, "yyyy-MM-dd'T'HH:mm:ss"),
  typedLit("."),
  substring(input.col("COL_WITH_UNIX_TIME"), 11, 3),
  typedLit("Z")))

Here time is the name of the new column and COL_WITH_UNIX_TIME is the name of the column you want to convert. This keeps the milliseconds, so the result is more precise and has the form "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'". Note that substring(..., 11, 3) assumes a 13-digit millisecond value.
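A possible alternative, assuming Spark 3.1 or later (where the SQL function timestamp_millis is available): convert the millisecond value to a timestamp first and let date_format render the fractional seconds. This is only a sketch; the column name ts, the helper name withIsoTime, and the UTC session time zone are assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{date_format, expr}

object MillisToIsoString {
  // Adds a string column "time" with millisecond precision.
  // Assumption: the input has an integral column named "ts" holding epoch milliseconds.
  def withIsoTime(df: DataFrame): DataFrame =
    df.withColumn(
      "time",
      date_format(expr("timestamp_millis(ts)"), "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("millis-to-iso")
      // Assumption: render in UTC so that the literal 'Z' suffix is truthful.
      .config("spark.sql.session.timeZone", "UTC")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input: a single epoch-millisecond value from the question.
    val input = Seq(1435655706000L).toDF("ts")
    withIsoTime(input).show(truncate = false)

    spark.stop()
  }
}
```

Compared with the concat/substring approach, this does not depend on the value having exactly 13 digits.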

0
Alex Stanovsky On

You can use the following syntax in Java

input.select("timestamp")
            .withColumn("date", date_format(col("timestamp").$div(1000).cast(DataTypes.TimestampType), "yyyyMMdd").cast(DataTypes.IntegerType));
0
youngchampion On

I solved this issue with the joda-time library, by mapping over the DataFrame and converting the DateTime into a String:

import org.joda.time._
val time_col = sqlContext.sql("select ts from mr")
                         .map(line => new DateTime(line(0)).toString("yyyy-MM-dd"))
1
Orar On

You needn't convert to a String before applying toDateTime with nscala_time:

import com.github.nscala_time.time.Imports._

scala> 1435655706000L.toDateTime
res4: org.joda.time.DateTime = 2015-06-30T09:15:06.000Z
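For comparison, the same Long-to-date conversion can be sketched with java.time from the JDK instead of nscala_time, which avoids the extra dependency. The object and method names below are hypothetical, and UTC is an assumed default time zone.

```scala
import java.time.{Instant, ZoneId, ZoneOffset}
import java.time.format.DateTimeFormatter

object EpochMillis {
  private val pattern = DateTimeFormatter.ofPattern("yyyy-MM-dd")

  // Formats epoch milliseconds as yyyy-MM-dd in the given zone (UTC by default).
  def toDateString(millis: Long, zone: ZoneId = ZoneOffset.UTC): String =
    pattern.withZone(zone).format(Instant.ofEpochMilli(millis))

  def main(args: Array[String]): Unit =
    println(toDateString(1435655706000L)) // prints 2015-06-30
}
```

The key point is the same as above: the value is passed as a Long, not parsed from a String.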

4
Yuan Zhao On

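Since Spark 1.5, there is a built-in function for this. Note that from_unixtime expects seconds, so a millisecond value has to be divided by 1000 first, and that the year pattern is lowercase yyyy (uppercase YYYY means week-based year).

val df = sqlContext.sql("select from_unixtime(cast(ts / 1000 as bigint), 'yyyy-MM-dd') as `ts` from mr")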

Please check Spark 1.5.2 API Doc for more info.

1
Marsellus Wallace On

Here it is using Scala DataFrame functions: from_unixtime and to_date

// NOTE: divide by 1000 required if milliseconds
// e.g. 1446846655609 -> 2015-11-06 21:50:55 -> 2015-11-06 
mr.select(to_date(from_unixtime($"ts" / 1000))) 
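
As a fuller sketch of the same idea, the snippet below builds a small DataFrame and applies to_date and from_unixtime to it. The helper name withDate, the local SparkSession, and the UTC session time zone are assumptions for illustration; the resulting date depends on the session time zone.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_unixtime, to_date}

object MillisToDateColumn {
  // Adds a DateType column "date" derived from epoch milliseconds in "ts".
  // The explicit cast to long drops the fractional seconds before conversion.
  def withDate(df: DataFrame): DataFrame =
    df.withColumn("date", to_date(from_unixtime((col("ts") / 1000).cast("long"))))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("millis-to-date")
      // Assumption: interpret timestamps in UTC.
      .config("spark.sql.session.timeZone", "UTC")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for the mr table from the question.
    val mr = Seq(1446846655609L).toDF("ts")
    withDate(mr).show()

    spark.stop()
  }
}
```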
0
Hammad Haleem On

You need to import the following libraries:

import org.joda.time.{DateTime, DateTimeZone}
import org.joda.time.format.DateTimeFormat

val stri = new DateTime(timeInMillisec).toString("yyyy/MM/dd")

Or, adjusted to your case:

 // getLong assumes ts is a bigint column; toInt would overflow on millisecond values
 val time_col = sqlContext.sql("select ts from mr")
                     .map(line => new DateTime(line.getLong(0)).toString("yyyy/MM/dd"))

There could be another way:

  import com.github.nscala_time.time.Imports._
  
  // Start from the epoch (new DateTime(0L)), not from the current time
  val date = (new DateTime(0L) + ((threshold.toDouble)/1000).toInt.seconds )
             .toString("yyyy/MM/dd")

Hope this helps :)