parseInt returns NaN for normal-looking numeric strings. Why?

So I have the JSON below, as an example:

[{"x":"‭12‬","y":"‭−‭67‬‬"},{"x":"‭12‬","y":"‭−‭68‬‬"},{"x":"‭13‬","y":"‭−‭70‬‬"}]

After I parse it with JSON.parse, I end up with an array of objects. So far so good. But when I try to convert the "x" and "y" coordinates into integers, I get NaN no matter what. I have tried parseInt(obj.x), +obj.x, Number(obj.x), and so on. Why can't I convert these properties to integers?

There are 2 answers

user3840170 On

Your numeric strings contain invisible Unicode formatting characters (directional overrides). Furthermore, the minus signs are true typographical minus signs (U+2212) rather than typewriter hyphen-minus characters (U+002D). parseInt, Number and unary + skip only whitespace and recognize only the hyphen-minus as a negative sign, so they return NaN for strings like these.

offset       USV  gc  sc    bc     age  name
[...]
     6    U+0022  Po  Zyyy  ON     1.1  QUOTATION MARK
     7    U+202D  Cf  Zyyy  LRO    1.1  LEFT-TO-RIGHT OVERRIDE
    10    U+0031  Nd  Zyyy  EN     1.1  DIGIT ONE
    11    U+0032  Nd  Zyyy  EN     1.1  DIGIT TWO
    12    U+202C  Cf  Zyyy  PDF    1.1  POP DIRECTIONAL FORMATTING
    15    U+0022  Po  Zyyy  ON     1.1  QUOTATION MARK
[...]
    21    U+0022  Po  Zyyy  ON     1.1  QUOTATION MARK
    22    U+202D  Cf  Zyyy  LRO    1.1  LEFT-TO-RIGHT OVERRIDE
    25    U+2212  Sm  Zyyy  ES     1.1  MINUS SIGN
    28    U+202D  Cf  Zyyy  LRO    1.1  LEFT-TO-RIGHT OVERRIDE
    31    U+0036  Nd  Zyyy  EN     1.1  DIGIT SIX
    32    U+0037  Nd  Zyyy  EN     1.1  DIGIT SEVEN
    33    U+202C  Cf  Zyyy  PDF    1.1  POP DIRECTIONAL FORMATTING
    36    U+202C  Cf  Zyyy  PDF    1.1  POP DIRECTIONAL FORMATTING
    39    U+0022  Po  Zyyy  ON     1.1  QUOTATION MARK
[...]
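
To check a suspicious string yourself, you can list its code points. This is a minimal sketch; listCodePoints is a hypothetical helper name, not something from the question or from a library.

```javascript
// Hypothetical helper (an assumption for illustration): list each code point
// of a string in U+XXXX notation so invisible characters become visible.
const listCodePoints = (s) =>
  Array.from(s, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));

console.log(listCodePoints("\u{202d}12\u{202c}"));
// [ 'U+202D', 'U+0031', 'U+0032', 'U+202C' ]

// The leading format character and the typographic minus both defeat parsing:
console.log(parseInt("\u{202d}12\u{202c}", 10)); // NaN
console.log(Number("\u{2212}67"));               // NaN
```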

Ideally, you would fix the JSON so that it contains numbers directly (instead of numbers wrapped in strings). But if that is outside your control, you need to remove the formatting characters and replace the minus sign with the hyphen-minus before using parseInt.

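const parseScrapedInt = (s) => {
  // Strip all format characters (general category Cf), e.g. U+202D and U+202C.
  s = s.replaceAll(/\p{Cf}/ug, '');
  // Replace the typographic minus sign with the ASCII hyphen-minus.
  s = s.replaceAll(/\u{2212}/ug, '\u{002d}');
  return parseInt(s, 10);
};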

console.log(parseScrapedInt("\u{202d}12\u{202c}"));
console.log(parseScrapedInt("\u{202d}−\u{202d}67\u{202c}\u{202c}"));
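
If you would rather convert everything while parsing, the same cleanup could be applied through a JSON.parse reviver. This is only a sketch under assumptions: cleanInt is a hypothetical name that repeats the logic above so the block runs on its own, the sample payload is a one-element stand-in for the real data, and it assumes only the "x" and "y" properties need converting.

```javascript
// Hypothetical helper: same cleanup as parseScrapedInt, repeated here so the
// sketch is self-contained.
const cleanInt = (s) =>
  parseInt(s.replace(/\p{Cf}/gu, "").replace(/\u{2212}/gu, "-"), 10);

// Sample payload (an assumption) written with JSON \u escapes so the hidden
// characters are visible; the backslashes are doubled because this is a
// JavaScript string literal containing JSON text.
const json =
  '[{"x":"\\u202d12\\u202c","y":"\\u202d\\u2212\\u202d67\\u202c\\u202c"}]';

// The reviver is called for every key/value pair; only string values stored
// under "x" or "y" are converted, everything else passes through unchanged.
const points = JSON.parse(json, (key, value) =>
  (key === "x" || key === "y") && typeof value === "string"
    ? cleanInt(value)
    : value);

console.log(points); // [ { x: 12, y: -67 } ]
```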

mplungjan On

If you really cannot change the data, then try this:

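const makeNum = str => {
  // A typographic minus sign (U+2212) marks a negative value.
  // (Checked directly instead of via the deprecated escape() function.)
  const sign = str.includes("\u2212") ? -1 : 1;
  // Drop every non-ASCII BMP character (formatting characters and the minus
  // sign), then let the multiplication coerce the remaining digits to a number.
  return str.replace(/[\u{0080}-\u{FFFF}]/gu, "") * sign;
};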

const numCoordinates = coordinates
  .map(({ x, y }) => ({ x: makeNum(x), y: makeNum(y) }));

console.log(numCoordinates);

Data received:

<pre>
[{ x: "%u202D12%u202C", y: "%u202D%u2212%u202D67%u202C%u202C" }, 
 { x: "%u202D12%u202C", y: "%u202D%u2212%u202D68%u202C%u202C" }, 
 { x: "%u202D13%u202C", y: "%u202D%u2212%u202D70%u202C%u202C" }] 
</pre>
<script>
  const coordinates = [{
    "x": "‭12‬",
    "y": "‭−‭67‬‬"
  }, {
    "x": "‭12‬",
    "y": "‭−‭68‬‬"
  }, {
    "x": "‭13‬",
    "y": "‭−‭70‬‬"
  }];
</script>