The normal_address() function from the campfin package is not working as I'd expect it to.
I'm trying to use a piece of code like this:
df <- df %>% mutate(clean_add = normal_address(RESERVATION_ADDRESS, abbs=usps_street))
I'm expecting every word contained in usps_street$full to get replaced with its abbreviation. It does this most of the time, but not every time.
Is this just a bug with normal_address() or am I missing something?
It is causing addresses to not match when I attempt fuzzy matching in a later step (even though when I look at them they're clearly the same).
Below are some addresses I haven't been able to get normalized correctly:
structure(list(RESERVATION_ADDRESS = c("4620 ASH GROVE DRIVE #3B",
"4001 DE MORADA DRIVE UNIT 118", "734 THOMPSON DRIVE, UNIT A",
"5917 YORK BRIDGE CIRCLE, AUSTIN, TX", "4140 SUNLAND CIRCLE NW",
"3951 BELLAIRE DRIVE SOUTH"), RESERVATION_CITY = c("SPRINGFIELD",
"ODESSA", "LAKE DALLAS", "AUSTIN", "ALBUQUERQUE", "FORT WORTH"
), RESERVATION_STATE = c("IL", "TX", "TX", "TX", "NM", "TX"),
RESERVATION_ZIPCODE = c(62711, 79765, 75065, 78749, 87107,
76109)), row.names = c(NA, 6L), class = "data.frame")
I'm trying to avoid having to use something like `gsub("CIRCLE", "CIR", clean_add)`, because there could be more words I'm missing other than "CIRCLE" or "DRIVE".
Is there a better function out there to do this? Or am I just missing something?
Current:
Probably desired output:
Meaning, you need to specify `abb_end = FALSE`, and `normal_address()` works as expected. If so, then change the call to `normal_address(RESERVATION_ADDRESS, abbs = usps_street, abb_end = FALSE)`.
Data:
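As a self-contained sketch, the six sample addresses from the question can serve as the data. The only difference from the original call is the added `abb_end = FALSE` argument. This assumes that campfin and dplyr are installed and that `usps_street` is the data set shipped with campfin. No output is shown, because the exact result may depend on the package version.

```r
library(dplyr)
library(campfin)

# Sample addresses copied from the question
df <- data.frame(
  RESERVATION_ADDRESS = c(
    "4620 ASH GROVE DRIVE #3B",
    "4001 DE MORADA DRIVE UNIT 118",
    "734 THOMPSON DRIVE, UNIT A",
    "5917 YORK BRIDGE CIRCLE, AUSTIN, TX",
    "4140 SUNLAND CIRCLE NW",
    "3951 BELLAIRE DRIVE SOUTH"
  ),
  stringsAsFactors = FALSE
)

# abb_end = FALSE asks normal_address() to abbreviate matching words
# anywhere in the string, not only the last word
df <- df %>%
  mutate(
    clean_add = normal_address(
      RESERVATION_ADDRESS,
      abbs = usps_street,
      abb_end = FALSE
    )
  )

print(df)
```

With `abb_end` left unset, words such as "DRIVE" or "CIRCLE" that are followed by a unit or direction are not the last word of the string, which would explain why they stay unabbreviated in the original call.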