Converting large integer (uint64+) into hex


While importing data into BigQuery, my hex strings got converted into floats. I understand I need to fix the import, but I'd like to make a best-effort recovery of some of the data.

I'm trying to convert them back into hex, but even toy examples produce unexpected behavior.

For example, given the following hex value:

hh = 0x6de517a18f003625e7fba9b9dc29b310f2e3026bbeb1997b3ada9de1e3cec8d6
# int: 49706871569187420659586066466638340615522392400360198520171375183123350210774
# float: 4.9706871569187424e+76

I'm not sure why the last few digits go from 420 to 424 in the float.

Turning this value into a float and then back into hex heavily truncates the value:

ff = 4.9706871569187424e+76 # same as calling float.fromhex('0x6de517a18f003625e7fba9b9dc29b310f2e3026bbeb1997b3ada9de1e3cec8d6')
int(ff) # 49706871569187423635521182730432496296162592228596139982404260202468916330496
# not sure why I'm getting so many significant figures
hex(int(ff))
# '0x6de517a18f003800000000000000000000000000000000000000000000000000'

To me this is unexpected, since the last non-zero hex digits change (0036 -> 0038). I assume it has something to do with how the mantissa is represented, but I was hoping someone here would have a quick answer rather than my having to dive deep into Python's float implementation.

There is 1 answer

yingw On

Thanks to @mark-tolonen for the pointer to float64's 53 bits of precision and rounding. For my use case, a best-effort mapping to recover from the auto-conversion, the following code suffices:

bb = bin(int(ff))
hex(int(bb[2:53],2)) # 51 bits, see below
# '0x6de517a18f003'
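
The hard-coded 51 in the slice works for this value because its leading hex digit, 6 (0110), contributes only 3 bits to the bin() output. As a minimal sketch of a more general version, assuming the originals were non-negative, fixed-width 64-digit hex strings (that width is an assumption, and the name hex_prefix_from_float and its total_hex_digits parameter are hypothetical), the number of fully covered leading hex digits can be derived from the bit length:

```python
def hex_prefix_from_float(value: float, total_hex_digits: int = 64) -> str:
    """Return the leading hex digits fully covered by float64's 53-bit significand.

    Hypothetical helper: assumes the original was a non-negative integer
    written as a zero-padded hex string of total_hex_digits digits.
    """
    n = int(value)  # exact for floats of this magnitude, which have no fractional part
    total_bits = total_hex_digits * 4
    if n < 0 or n.bit_length() > total_bits:
        raise ValueError("value does not fit the assumed width")
    # Zero bits in front of the first 1 bit once the value is padded to full width.
    leading_zeros = total_bits - n.bit_length()
    # The 53 significant bits start at the first 1 bit; count whole hex digits.
    complete_digits = (leading_zeros + 53) // 4
    return format(n, f"0{total_hex_digits}x")[:complete_digits]


hh = 0x6de517a18f003625e7fba9b9dc29b310f2e3026bbeb1997b3ada9de1e3cec8d6
print(hex_prefix_from_float(float(hh)))  # 6de517a18f003
```

Because the float was rounded rather than truncated, a carry can occasionally ripple into these digits, so the result is still best effort.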

A bit more explanation:

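Each hex digit represents 4 bits (2^4 = 16), so, counting bit positions from the most significant end of the zero-padded 256-bit value:

  • 0..3 - first hex digit
  • 4..7 - next hex digit
  • ...
  • 48..51 - last complete hex digit
  • 52..55 - this one is incomplete, since we only get 53 bits of precision

bin() drops the leading zero bit of the first hex digit (6 = 0110), so those 13 complete hex digits take up 51 characters rather than 52. Since the string is also prefixed with '0b', we take 2:(2+51), which is how we get to bb[2:53].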
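
To see where the change from ...0036 to ...0038 comes from, the rounding can be reproduced with plain integer arithmetic. This is only a sketch of round-to-nearest (ties to even) on the question's value; the variable names are illustrative. It also shows why int(ff) prints so many digits: it is the exact value of the float, a 53-bit integer times a power of two, while the float's repr shows only the shortest decimal string that round-trips (hence ...424 rather than ...420).

```python
hh = 0x6de517a18f003625e7fba9b9dc29b310f2e3026bbeb1997b3ada9de1e3cec8d6
ff = float(hh)

# The integer needs 255 bits, but a float64 significand holds only 53.
dropped = hh.bit_length() - 53         # number of low bits that cannot be kept
kept = hh >> dropped                   # top 53 bits, truncated
remainder = hh & ((1 << dropped) - 1)  # the bits that were cut off
half = 1 << (dropped - 1)

# Round to nearest, ties to even.
if remainder > half or (remainder == half and kept & 1):
    kept += 1

rounded = kept << dropped              # 53-bit integer times 2**dropped
assert rounded == int(ff)              # int(ff) is the float's exact value
print(hex(rounded))                    # 0x6de517a18f0038 followed by 50 zeros
```

The cut falls inside the 14th hex digit (6 = 0110): its top two bits, 01, are kept and rounded up to 10, which is why that digit reads 8 afterwards.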