Reading mixed-format dates

381 views Asked by At

Where a column of dates contains two distinct formats, is there a way to parse both such that I get 2 rows of 2012-04-01 rather than the result below?

# A tibble: 2 x 1
        date
      <date>
1 2012-04-01
2         NA

**Example Code**
library(tidyverse)
data <- read_csv("date\n01/04/12\n41000", col_types = cols_only("date" = col_date("%d/%m/%y")))
head(data)
2

There are 2 answers

0
Sam Firke On BEST ANSWER

janitor::convert_to_date handles a mix of Excel numeric values and date-formatted dates.

library(tidyverse)
library(janitor)
data <- read_csv("date,text\n01/04/12,a\n41000,b\n02/02/13,c")

data %>%
      mutate(clean_date = convert_to_date(date,
                                          character_fun = lubridate::dmy))
#> # A tibble: 3 x 3
#>   date     text  clean_date
#>   <chr>    <chr> <date>    
#> 1 01/04/12 a     2012-04-01
#> 2 41000    b     2012-04-01
#> 3 02/02/13 c     2013-02-02

Self-promotion disclaimer: I authored this package. But this function is built for precisely this use case so feels relevant to post here.

3
shaojl7 On

Instead of forcing a date data type while reading the data, you can try something like below:

data <- read_csv("date\n01/04/12\n41000")
data %>% mutate(date =ifelse(is.numeric(date), as.Date(date, origin = "1990-01-01"), date))