I have a field in a data frame currently formatted as a string (mm/dd/yyyy), and I want to create a new column in that data frame with the day-of-week name (e.g. Thursday) for that field. I've imported
import com.github.nscala_time.time.Imports._
but am not sure where to go from here.
Create a formatter, parse the date, then get the day of the week.
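The steps above can be sketched roughly as follows. This is a minimal sketch, assuming the nscala-time import from the question is on the classpath; the object name `DayOfWeekSketch` and the helper `dayOfWeekName` are hypothetical names chosen for illustration. Note that in Joda-Time patterns the month is `MM` (lowercase `mm` means minutes), so the question's mm/dd/yyyy format is written as `MM/dd/yyyy` here.

```scala
import java.util.Locale
import com.github.nscala_time.time.Imports._

object DayOfWeekSketch {
  // Step 1: create a formatter. "MM" is month-of-year; "mm" would be minute-of-hour.
  val fmt: DateTimeFormatter = DateTimeFormat.forPattern("MM/dd/yyyy")

  // Hypothetical helper combining steps 2 and 3:
  // parse the string into a DateTime, then read the day-of-week field as text.
  def dayOfWeekName(s: String): String =
    fmt.parseDateTime(s).dayOfWeek().getAsText(Locale.ENGLISH)
}
```

Passing an explicit `Locale` keeps the day name in English regardless of the JVM's default locale. A call such as `DayOfWeekSketch.dayOfWeekName("06/04/2015")` should give `Thursday`.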
Wrap it using `org.apache.spark.sql.functions.udf` and you have a complete solution.
Still, there is no need for that, since `HiveContext` already provides all the required UDFs.
EDIT:
Since Spark 1.5 you can use the `from_unixtime` and `unix_timestamp` functions directly:
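One way this might look is sketched below. It assumes a newer Spark (2.0 or later) where `SparkSession` is the entry point; on Spark 1.5 itself you would build the data frame from a `SQLContext` instead. The object name `DayOfWeekColumn`, the helper `addDayOfWeek`, and the column names `date` and `day_of_week` are hypothetical, chosen only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_unixtime, unix_timestamp}

object DayOfWeekColumn {
  // Hypothetical helper: parse the MM/dd/yyyy string in `dateCol` to epoch seconds,
  // then format those seconds with the "EEEE" pattern (full day-of-week name).
  def addDayOfWeek(df: DataFrame, dateCol: String): DataFrame =
    df.withColumn(
      "day_of_week",
      from_unixtime(unix_timestamp(col(dateCol), "MM/dd/yyyy"), "EEEE"))

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("day-of-week")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("06/04/2015", "06/07/2015").toDF("date")
    addDayOfWeek(df, "date").show()

    spark.stop()
  }
}
```

Both functions work in the session time zone, so the round trip through epoch seconds should land on the same calendar day that was parsed. Keeping everything in built-in column functions also avoids the serialization and optimizer costs that come with a UDF.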