Name-matching against table with alternative spellings

385 views Asked by At

I have a table with alternative spellings of country names:

Use name, Alt1, Alt2, Alt3, Alt4
[...]
Bahamas, "Bahamas, The"
Bolivia
Bosnia and Herzegovina, Bosnia & Herzegovina
[...]

(Some countries have 0 alternative spellings, other up to 4.)

Given a country string, what is performance-wise the best solution to returning the element in the first column? (In most of the cases, independent of the number of alternative spellings, the string matches the first column and doesn't have to be name-matched. In the other cases the probability is evenly distributed across column 2-X.

(Preferably in JavaScript or PHP, thanks :) )

1

There are 1 answers

2
Czechnology On BEST ANSWER

In my opinion, I think you should put this in two separate tables in database:

countries: id, ..., ...

countries_names: country_id, default (bool or enum('yes', 'no')), name

Put an index on the name column so that you can search fast for the correct name and country_id:

SELECT name FROM countries_names
WHERE default = 'yes'
AND country_id = (SELECT country_id FROM countries_names
                  WHERE name = 'search_string'
                  LIMIT 1)
LIMIT 1

(optionally, you can add a LEFT JOIN if you need more info from the main table)

Other option would be to create a table only for alternative names:

countries: id, name, ..., ...

countries_alternative_names: country_id, name

But you'd have to search in two tables when looking for a first match.


EDIT: Static JavaScript solution:

function getCountryName(var name) {
  switch (name) {
    case "Bahamas" :
    case "The Bahamas" :
      return "Bahamas";
    case "Bolivia" :
      return "Bolivia";
    case "Bosnia and Herzegovina" :
    case "Bosnia & Herzegovina" :
      return "Bosnia and Herzegovina"

    // ...

    default :
      return null;
  }
}