In JavaScript, why are char codes reversible but not code points, for emojis?


Here, I split a string apart and put it back together using two approaches: char codes and code points. I expected the two approaches to be equivalent, because I assumed that charCodeAt/fromCharCode and codePointAt/fromCodePoint would each be reversible. The char code version works, but the code point version does not.

const str = `🍅🍆🍇🍈🍉🍊🍋`;
const codepoints = str.split(``).map((_, i) => str.codePointAt(i));
const charCodes = str.split(``).map((_, i) => str.charCodeAt(i));

console.log({
    str,
    'str.length': str.length,
    '[...str]': [...str],
    'str.split()': str.split(``),
    codepoints,
    fromCodePoint: String.fromCodePoint(...codepoints),
    charCodes,
    fromCharCode: String.fromCharCode(...charCodes)
});

However, both Node.js and Chrome, running on macOS, produce the following output. fromCharCode returned the original string, as I expected, but fromCodePoint did not. Why is that? And how can I regenerate the original string using fromCodePoint?

{
  str: '🍅🍆🍇🍈🍉🍊🍋',
  'str.length': 14,
  '[...str]': [
    '🍅', '🍆',
    '🍇', '🍈',
    '🍉', '🍊',
    '🍋'
  ],
  'str.split()': [
    '\ud83c', '\udf45',
    '\ud83c', '\udf46',
    '\ud83c', '\udf47',
    '\ud83c', '\udf48',
    '\ud83c', '\udf49',
    '\ud83c', '\udf4a',
    '\ud83c', '\udf4b'
  ],
  codepoints: [
    127813,  57157, 127814,
     57158, 127815,  57159,
    127816,  57160, 127817,
     57161, 127818,  57162,
    127819,  57163
  ],
  fromCodePoint: '🍅\udf45🍆\udf46🍇\udf47🍈\udf48🍉\udf49🍊\udf4a🍋\udf4b',
  charCodes: [
    55356, 57157, 55356,
    57158, 55356, 57159,
    55356, 57160, 55356,
    57161, 55356, 57162,
    55356, 57163
  ],
  fromCharCode: '🍅🍆🍇🍈🍉🍊🍋'
}

There is 1 answer

Mr. Polywhirl On

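If you want to split the string and get the same result as spreading it into an array, you will need to group the UTF-16 code units into pairs. Each emoji here lies outside the Basic Multilingual Plane, so split('') returns two surrogates per emoji: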

spread = [...str]

split = [...chunks(str.split(''), 2)].map(s => s.join(''))

You can create a generator function that partitions an array into chunks of size n.

function* chunks(arr, n) {
  for (let i = 0; i < arr.length; i += n) {
    yield arr.slice(i, i + n);
  }
}
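
Fixed chunks of two only work when every character in the string is a surrogate pair, as in the emoji-only string here. Below is a minimal sketch of a splitter that checks for surrogates instead, so it should also handle strings that mix emoji with ordinary characters. The name splitByCodePoint is a hypothetical helper, not part of the original answer.

```javascript
// Hypothetical helper: group UTF-16 code units into whole code points.
function splitByCodePoint(str) {
  const out = [];
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    const next = str.charCodeAt(i + 1); // NaN when i + 1 is past the end
    const isHigh = unit >= 0xd800 && unit <= 0xdbff; // leading surrogate
    const isLow = next >= 0xdc00 && next <= 0xdfff;  // trailing surrogate
    if (isHigh && isLow) {
      out.push(str.slice(i, i + 2)); // keep the pair together
      i++;                           // skip the trailing surrogate
    } else {
      out.push(str[i]);              // BMP character or lone surrogate
    }
  }
  return out;
}

console.log(splitByCodePoint('a\u{1F345}b')); // three entries: 'a', the tomato emoji, 'b'
```

For well-formed strings this should give the same result as [...str], which already iterates by code point.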

const
  str = '🍅🍆🍇🍈🍉🍊🍋',
  spread = [...str],
  split = [...chunks(str.split(''), 2)].map(s => s.join('')),
  // Map over whole code points (spread), not individual UTF-16 code units
  codepoints = spread.map((c) => c.codePointAt(0)),
  charCodes = str.split('').map((c) => c.charCodeAt(0));

console.table({
  str,
  'length': str.length,
  'spread': [...str],
  'split': split,
  codepoints,
  fromCodePoint: String.fromCodePoint(...codepoints),
  charCodes,
  fromCharCode: String.fromCharCode(...charCodes)
});
kbd {
  border: 4px outset grey;
  padding: 2px 4px;
  font-family: monospace;
}
<!-- Note: I used console.table         -->
<!--       So use the browser dev tools -->
<p>Press <kbd>F12</kbd> to open the dev tools.</p>
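
As for why the code in the question does not round-trip: codePointAt(i) takes an index in UTF-16 code units. At the index of a leading surrogate it returns the full code point, but at the index of a trailing surrogate it returns that lone surrogate's value. Calling it for every index therefore yields each emoji's code point followed by a stray trailing surrogate, and fromCodePoint faithfully reproduces both. The sketch below uses a shortened three-emoji string as an assumption, for brevity, and shows one way to get a reversible list by iterating the string by code point.

```javascript
const str = '\u{1F345}\u{1F346}\u{1F347}'; // 🍅🍆🍇

// The question's approach: one codePointAt call per UTF-16 code unit.
// Odd indices land on trailing surrogates and return them as-is.
const perUnit = str.split('').map((_, i) => str.codePointAt(i));
console.log(perUnit); // [127813, 57157, 127814, 57158, 127815, 57159]

// Array.from iterates the string by code point, so each emoji is visited once.
const perCodePoint = Array.from(str, (ch) => ch.codePointAt(0));
console.log(perCodePoint); // [127813, 127814, 127815]

// Only the per-code-point list reproduces the original string.
const roundTrip = String.fromCodePoint(...perCodePoint);
console.log(roundTrip === str);                          // true
console.log(String.fromCodePoint(...perUnit) === str);   // false
```

fromCharCode round-trips in the question because charCodeAt and fromCharCode both work on single code units, so the per-unit indexing matches on both sides.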