I can't find a way to match all extended alphabet characters without listing them explicitly. For example, consider matching the tag språk.
tag = "språk"
tag:match([[%w+]])
This doesn't work because å is not contained within %w. It can be matched with tag:match([[[%wå]+]]), but then you have to explicitly add every special character.
One can also extend the range. This works: tag:match([[[a-å]+]]), but I'm not 100% clear on why, or at least not on what that range actually covers in the character table.
So what is the correct way to match a range that includes all ASCII letters plus all Latin Extended letters?
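Lua patterns operate on bytes rather than characters, so it can help to look at the bytes a string actually contains. The following is a small sketch, assuming the source file is saved as UTF-8; hex_bytes is a hypothetical helper name, not a standard function.

```lua
-- Assumption: this file, and therefore its string literals, is UTF-8 encoded.
-- hex_bytes is a hypothetical helper that lists the bytes of a string in hex.
local function hex_bytes(s)
  local out = {}
  for i = 1, #s do
    out[#out + 1] = string.format("%02X", s:byte(i))
  end
  return table.concat(out, " ")
end

print(hex_bytes("å"))      -- C3 A5
print(hex_bytes("språk"))  -- 73 70 72 C3 A5 6B
```

Under that assumption, [a-å] is not a range of characters. The class holds the byte range from a (0x61) up to 0xC3, plus the single byte 0xA5, which is why it happens to accept both bytes of å.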
The best solution I've come up with so far is:
tag = "språk"
tag:match([[[a-zA-ZÀ-ÿ]+]])
But I'm still unsure whether that is completely correct, and it would be ideal if there were a shortcut class for this that I'm simply overlooking.
I suggest building a set of characters from the Latin-1 Supplement block. By analogy, you can build sets for the other blocks you need (Latin Extended-A, B, C, D, E).
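One possible way to build such a set is sketched below. It assumes Lua 5.3 or later (for the standard utf8 library) and valid UTF-8 input. The names make_set, latin1_letters and match_word are hypothetical helpers chosen for illustration.

```lua
-- Requires Lua 5.3+ (utf8 library). Assumes the input is valid UTF-8.

-- Build a lookup set of the characters with code points first..last,
-- leaving out any code points listed in the optional `skip` table.
local function make_set(first, last, skip)
  local set = {}
  for cp = first, last do
    if not (skip and skip[cp]) then
      set[utf8.char(cp)] = true
    end
  end
  return set
end

-- Letters of the Latin-1 Supplement block (U+00C0..U+00FF), without the
-- two non-letters in that span: × (U+00D7) and ÷ (U+00F7).
local latin1_letters = make_set(0x00C0, 0x00FF,
                                { [0x00D7] = true, [0x00F7] = true })

-- Return the leading run of characters that are ASCII alphanumerics (%w)
-- or members of `set`; return nil if the string does not start with one.
local function match_word(s, set)
  local out = {}
  for ch in s:gmatch(utf8.charpattern) do
    if ch:match("^%w$") or set[ch] then
      out[#out + 1] = ch
    else
      break
    end
  end
  if #out == 0 then return nil end
  return table.concat(out)
end

print(match_word("språk", latin1_letters))        -- språk
print(match_word("språk=norsk", latin1_letters))  -- språk
```

Unlike tag:match([[%w+]]), this helper only looks at the start of the string. A set for another block can be made the same way, for example make_set(0x0100, 0x017F) for Latin Extended-A.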