I am encountering a font ligature issue in a sentence.
The sentence in question is:
Verizon is sunseng BlueJeans as the plaorm struggled to gain tracon against rival services in the video conferencing market
I have a list of ligatures, and some examples are provided here:
    const ligatureMap = {
        "": "ti",
        "": "tf",
        ſt: "ft",
        "pla": "platf",
        "AT&amp;T": "AT&T",
    };
To address this issue, I am attempting to replace the ligatures using the following code:
    return text.replace(/[\uE000-\uF8FF]/g, (match) => {
        return ligatureMap[match] || match;
    });
But it does not convert the ligature in "plaorm" to "tf", or "&amp;" to "&". How can I solve this?
There are at least two problems:
1. Not all of the keys in your `ligatureMap` object are only one character long, but your regular expression only searches for single-character matches (specifically, for single code unit¹ matches).
2. Your regular expression's character class (`[\uE000-\uF8FF]`) doesn't cover all of the single-character keys either (for instance, ſt is char code `\uFB05`).

Separately, there doesn't appear to be an entry in the example `ligatureMap` for the ligature character in "sunseng" in your example. I've assumed it should be `"tt"`.

To make sure all of your `ligatureMap` entries are searched for, including the multi-character ones, you can convert your keys into a regular expression alternation (basically, "this key or that key or this other key"): escape each key with an `escapeRegex` function, join the results with `|`, and use the resulting regular expression in the `replace` call.

The `escapeRegex` function should be whatever your preferred solution for escaping regular expressions is (perhaps one from this question's answers); the one from this answer is just one example.

¹ For more about "characters" vs. code points vs. code units, see my blog post What is a string?
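As a rough sketch of the alternation approach, something along these lines should work. The private-use code points (`\uE000`, `\uE001`, `\uE002`) are placeholders, since the real ones from your font aren't visible in the post. The `"AT&amp;T"` key assumes the text contains a literal HTML entity. The `escapeRegex` shown is one commonly used pattern, not necessarily the one you'll prefer, and `replaceLigatures` is just an illustrative name. Sorting the keys longest-first is an extra precaution so that a longer key is tried before any shorter key that is a prefix of it.

```javascript
// Hypothetical map: \uE000, \uE001 and \uE002 are placeholder code points,
// not the real private-use characters from the question's font.
const ligatureMap = {
    "\uE000": "ti",
    "\uE001": "tf",
    "\uE002": "tt",       // assumed entry for the ligature in "sunseng"
    "\uFB05": "ft",       // ſt, which lies outside the \uE000-\uF8FF range
    "AT&amp;T": "AT&T",   // multi-character key (assumes a literal entity in the text)
};

// One common way to escape regex special characters; swap in your own.
function escapeRegex(string) {
    return string.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build "key1|key2|key3", trying longer keys before shorter ones.
const ligatureRegex = new RegExp(
    Object.keys(ligatureMap)
        .sort((a, b) => b.length - a.length)
        .map(escapeRegex)
        .join("|"),
    "g"
);

function replaceLigatures(text) {
    // Every match is, by construction, a key of ligatureMap.
    return text.replace(ligatureRegex, (match) => ligatureMap[match]);
}

const sample = "sunse\uE002ing pla\uE001orm trac\uE000on AT&amp;T";
console.log(replaceLigatures(sample)); // sunsetting platform traction AT&T
```

Because the regular expression is built from the map's keys, adding a new ligature only requires adding an entry to `ligatureMap`; the character range never has to be kept in sync by hand.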