Lua Patterns and Unicode

What would be the best way to match a word such as Hi, or a name like dön that contains a special character, using a pattern? The special characters would be optional, so the pattern should presumably use a '?', but I don't know which character class to use to match them.

Basically, I want to accept words that may contain Unicode letters, and nothing else. So dön would be fine, but digits and other special characters such as brackets would not.

Answer by GravityScore:

According to the Lua guide on Unicode, "Lua's pattern matching facilities work byte by byte. In general, this will not work for Unicode pattern matching, although some things will work as you want". This means the best option is probably to iterate over each character and work out whether it is a valid letter. To loop over each Unicode character in a UTF-8 encoded string:

for character in string.gmatch(myString, "([%z\1-\127\194-\244][\128-\191]*)") do
    -- Do something with the character
end
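
To see what that loop yields, here is a minimal sketch that collects the characters into a table. The helper name utf8Chars is hypothetical, and I've left NUL out of the character class so the same pattern behaves the same way across Lua versions; it assumes the input is valid UTF-8 with no embedded NUL bytes. The test string is written with byte escapes so it doesn't depend on the encoding of the source file.

```lua
-- Hypothetical helper: split a UTF-8 string into its characters.
-- Assumes valid UTF-8 input with no embedded NUL bytes.
local function utf8Chars(s)
    local chars = {}
    -- A lead byte (ASCII, or 194-244 for multi-byte sequences)
    -- followed by any continuation bytes (128-191).
    for character in string.gmatch(s, "[\1-\127\194-\244][\128-\191]*") do
        chars[#chars + 1] = character
    end
    return chars
end

local chars = utf8Chars("d\195\182n")  -- "dön" spelled out as bytes
print(#chars)      -- 3 characters
print(#chars[2])   -- 2, because "ö" is two bytes long
```

So each loop iteration sees one whole character, even when that character takes up more than one byte.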

Note that this method will not work if myString isn't valid UTF-8. To check whether each character is one you want, it's probably simplest to keep a list of the characters you don't want in your strings and reject any string that contains one of them:

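-- Characters to reject. The backslash has to be written as "\\" in a Lua string.
local notAllowed = ":()[]{}+_-=\\|`~,.<>/?!@#$%^&*"
local isValid = true

for character in string.gmatch(myString, "([%z\1-\127\194-\244][\128-\191]*)") do
    -- Plain find (4th argument true): treat the character literally, not as a
    -- pattern, so "(", "." or "%" don't get interpreted as magic characters.
    if notAllowed:find(character, 1, true) then
        isValid = false
        break
    end
end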
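
Since the question asks for letters and nothing else, a whitelist may be a closer fit than a blacklist. The following is only a sketch of that alternative, not something taken from the Lua guide: the function name isWord is hypothetical, and it assumes the string is valid UTF-8. It treats every byte above 127 as part of a letter, so it will also let through non-ASCII symbols and punctuation (for example "€"), which it has no way to tell apart from "ö".

```lua
-- Hypothetical helper: true if `word` is non-empty and made up only of
-- ASCII letters and bytes >= 128 (i.e. parts of multi-byte UTF-8 characters).
-- Assumption: the input is valid UTF-8; this does not validate the encoding.
local function isWord(word)
    return word:find("^[A-Za-z\128-\255]+$") ~= nil
end

print(isWord("Hi"))           -- true
print(isWord("d\195\182n"))   -- true  ("dön" spelled out as bytes)
print(isWord("d(n"))          -- false (bracket)
print(isWord("abc1"))         -- false (digit)
```

This rejects digits, brackets, spaces and ASCII punctuation without having to list them all.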

Hope this helped.