How to iterate over only the characters in a string I can actually see?


Normally I would just use something like str[i].

But what if str = "☀️"?

str[i] fails. for (x of str) console.log(x) also fails. It prints out a total of 4 characters, even though there are clearly only 2 emoji in the string.

What's the best way to iterate over every character I can see in a string (and newlines, I guess), and nothing else?

The ideal solution would return an array of 2 characters: the 2 emoji, and nothing else. The claimed duplicate, and a bunch of other solutions I've found, don't meet this criterion.

2

There are 2 answers

2
Amadan On

You need to make your own methods for astral characters.

"foo𝌆bar".match(/[\uD800-\uDBFF][\uDC00-\uDFFF]|./g);
// => ["f", "o", "o", "𝌆", "b", "a", "r"]
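One caveat worth checking against the question's string: the regex keeps a surrogate pair together, but an emoji written as a BMP character plus a variation selector (U+2600 followed by U+FE0F) contains no surrogate pair at all, so it still comes back as two pieces. The sketch below illustrates this; the helper name `splitBySurrogatePairs` and the sample strings are made up for the example. Note also that `.` does not match newlines.

```javascript
// Hypothetical helper wrapping the regex above: a surrogate pair counts as one
// unit, otherwise any single UTF-16 code unit (except line terminators).
const splitBySurrogatePairs = (s) =>
  s.match(/[\uD800-\uDBFF][\uDC00-\uDFFF]|./g) ?? [];

// U+1F600 lies outside the BMP, so it is stored as a surrogate pair.
const astral = "a\u{1F600}b";
// U+2600 followed by the variation selector U+FE0F: two BMP code units.
const sun = "\u2600\uFE0F";

console.log(splitBySurrogatePairs(astral).length); // 3
console.log(splitBySurrogatePairs(sun).length);    // 2, not 1
```

So this approach covers astral characters, but not characters built from several code points.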
0
Harrison On

Segmenter will do what you need:

The Intl.Segmenter object enables locale-sensitive text segmentation, enabling you to get meaningful items (graphemes, words or sentences) from a string.

In your case, the code would look like this:

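// 'grapheme' splits on user-perceived characters; 'word' would group whole words together
const segmenterEmoji = new Intl.Segmenter('en', { granularity: 'grapheme' });
const string2 = '☀️'

const iterator1 = segmenterEmoji.segment(string2)[Symbol.iterator]();

console.log(iterator1.next().value.segment);
// Expected output: '☀️'

console.log(iterator1.next().done);
// Expected output: true (the string above holds a single grapheme, so the iterator is exhausted)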

Note: The language/locale doesn't really matter in your case because emojis are a little different to "normal text"

In the example from MDN:

const segmenterFr = new Intl.Segmenter('fr', { granularity: 'word' });
const string1 = 'Que ma joie demeure';

const iterator1 = segmenterFr.segment(string1)[Symbol.iterator]();

console.log(iterator1.next().value.segment);
// Expected output: 'Que'

console.log(iterator1.next().value.segment);
// Expected output: ' '
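To get the array the question asks for, the segments can be collected in one step. This is a minimal sketch, assuming `'grapheme'` granularity (which splits on user-perceived characters rather than words); the helper name `visibleCharacters` and the test strings are hypothetical.

```javascript
// Hypothetical helper: split a string into grapheme clusters
// (user-perceived characters) using Intl.Segmenter.
function visibleCharacters(str, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  // Each item yielded by segment() has a `segment` property with the text.
  return Array.from(segmenter.segment(str), (part) => part.segment);
}

// Two emoji, each made of U+2600 plus the variation selector U+FE0F.
const twoSuns = "\u2600\uFE0F\u2600\uFE0F";

console.log(twoSuns.length);                    // 4 UTF-16 code units
console.log(visibleCharacters(twoSuns).length); // 2 graphemes
console.log(visibleCharacters("a\nb").length);  // 3, the newline is kept as its own item
```

Intl.Segmenter is not available in older runtimes, so it may be worth checking `typeof Intl.Segmenter` before relying on it.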