I need to know how to map unstructured data to structured data.
I have a variable containing customers' addresses, which include their cities. A city name such as DELHI can appear as "DELHI", "DEHLI", "DILLI", or "DELI". I need to detect the city name in these addresses and map it to the correct name, "DELHI".
I am trying to implement a solution in SAS or R.
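This might not be the easiest way, but if the city name is inside the address string, one way of doing this in SAS is to use the TRANWRD function, which replaces every occurrence of a string inside your address variable. The syntax is TRANWRD(source, target, replacement). For example, using your city DELHI, you would replace each misspelling, such as " DEHLI ", with " DELHI ".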
I put a space before and after both the original and the new strings so that the function won't replace a match that sits inside a longer word (e.g. without the spaces, DELICIOUS Road would be changed to DELHICIOUS Road).
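The replacement described above could be sketched roughly as follows. This is only an illustration under some assumptions: the data set names addresses and addresses_clean, the variable names address and address_clean, and the length of 200 are all hypothetical, and the list of misspellings is just the one from the question. The extra blank added at each end is my own addition so that a city at the very start or end of the address can still match.

```sas
data addresses_clean;
    set addresses;              /* hypothetical input data set with a character variable ADDRESS */
    length address_clean $200;  /* assumed length; make it long enough for your addresses */

    /* Upper-case the address and pad it with one blank on each side,
       so a city name at the start or end is still surrounded by blanks */
    address_clean = " " || upcase(strip(address)) || " ";

    /* Replace each known misspelling; the blanks keep words such as
       DELICIOUS from being changed */
    address_clean = tranwrd(address_clean, " DEHLI ", " DELHI ");
    address_clean = tranwrd(address_clean, " DILLI ", " DELHI ");
    address_clean = tranwrd(address_clean, " DELI ",  " DELHI ");

    /* Remove the padding blanks again */
    address_clean = strip(address_clean);
run;
```

Note that this only catches misspellings you list explicitly, and a city name followed directly by punctuation (for example "DEHLI,") would not match unless you remove or handle the punctuation first.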