Problem converting time series df from chr to date using as.POSIXct

Question

Problem converting time series df from chr to date using as.POSIXct

30 views Asked by LZ24AP At 28 March 2024 at 13:56

I'm importing a large high resolution time series hydrometric dataset from government website as a csv file. The column 'x.Timestamp' is importing as chr with some unusual characters but when I try to convert it from chr to date, NA is returned. I would be really grateful for any help.

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
OW_HR_Discharge_DataV2<- data.frame(
  stringsAsFactors = FALSE,
  X.Timestamp = c("1 1955-11-08T00:00:00.000Z",
                  "2 1955-11-08T00:15:00.000Z","3 1955-11-08T00:30:00.000Z",
                  "4 1955-11-08T00:45:00.000Z",
                  "5 1955-11-08T01:00:00.000Z","6 1955-11-08T01:15:00.000Z"),
  Value = c(10.227, 10.226, 10.228, 10.227, 10.227, 10.227),
  Quality.Code = c(31L, 31L, 31L, 31L, 31L, 31L)
)
str(OW_HR_Discharge_DataV2)
#> 'data.frame':    6 obs. of  3 variables:
#>  $ X.Timestamp : chr  "1 1955-11-08T00:00:00.000Z" "2 1955-11-08T00:15:00.000Z" "3 1955-11-08T00:30:00.000Z" "4 1955-11-08T00:45:00.000Z" ...
#>  $ Value       : num  10.2 10.2 10.2 10.2 10.2 ...
#>  $ Quality.Code: int  31 31 31 31 31 31
OW_HR_Discharge_DataV2$X.Timestamp<-as.POSIXct(OW_HR_Discharge_DataV2$X.Timestamp, format = "%Y-%m-%d %H:%M:%S", tz="GMT")
str(OW_HR_Discharge_DataV2)
#> 'data.frame':    6 obs. of  3 variables:
#>  $ X.Timestamp : POSIXct, format: NA NA ...
#>  $ Value       : num  10.2 10.2 10.2 10.2 10.2

...
#>  $ Quality.Code: int  31 31 31 31 31 31

Original Q&A

There are 2 answers

Sophie On 28 March 2024 at 14:34

Your X.Timestamp does not contain the format passed in parameter to as.POSIXct. in "1 1955-11-08T00:00:00.000Z", there is a superfluous "1" at the beginning. This code works:

OW_HR_Discharge_DataV2$X.Timestamp2 <- as.POSIXct(sapply(strsplit(OW_HR_Discharge_DataV2$X.Timestamp, " "),
                                                         function (x) x[[2]]),
                                                  tz = "GMT")
OW_HR_Discharge_DataV2

**Adriano Mello** · Accepted Answer · 2024-03-28T21:08:00+00:00

As @Sophie said with other words, your csv's row-numbers appears to be at the beginning of OW_HR_Discharge_DataV2. Remove it with stringr::str_remove() and parse the timestamp with lubridate::ymd_hms():

library(tidyverse)

# ------------------
OW_HR_Discharge_DataV2 <- OW_HR_Discharge_DataV2 %>% 
  as_tibble() %>% 
  mutate(
    new_timestamp = X.Timestamp %>% 
      str_remove("^\\d+\\s") %>% 
      ymd_hms(tz = "GMT"))

# Output ----------
> OW_HR_Discharge_DataV2

# A tibble: 6 × 4
  X.Timestamp                Value Quality.Code new_timestamp      
  <chr>                      <dbl>        <int> <dttm>             
1 1 1955-11-08T00:00:00.000Z  10.2           31 1955-11-08 00:00:00
2 2 1955-11-08T00:15:00.000Z  10.2           31 1955-11-08 00:15:00
3 3 1955-11-08T00:30:00.000Z  10.2           31 1955-11-08 00:30:00
4 4 1955-11-08T00:45:00.000Z  10.2           31 1955-11-08 00:45:00
5 5 1955-11-08T01:00:00.000Z  10.2           31 1955-11-08 01:00:00
6 6 1955-11-08T01:15:00.000Z  10.2           31 1955-11-08 01:15:00

Note1: You don't really need to make your data frame as_tibble() but it makes sense in a tidyverse solution.

Note2: when working with dates, make sure your timezone is in base::OlsonNames() ("GMT" is) and avoid problems.

Hope it helps.

TechQA.

Problem converting time series df from chr to date using as.POSIXct

There are 2 answers

Related Questions in R

Related Questions in DATETIME

Related Questions in TYPE-CONVERSION

Related Questions in LUBRIDATE

Popular Questions

Trending Questions