How do you turn an Array of codepoints (Int32) into a String?


In Crystal, a String can be turned into an Array(Int32) of codepoints:

"abc".codepoints # [97,98,99] 

Is there a way to turn the Array back into a String?


There are 2 answers

dgo.a On

Here's one way:

arr = "abc".codepoints

# The line below allocates memory and returns a "safe" pointer (i.e. a slice) to it.
# The memory is allocated on the heap, with a size in bytes of:
#    arr.size * sizeof(UInt8)
#    sizeof(UInt8) == 1 (one byte, i.e. 8 bits)
# A slice of UInt8 values (i.e. `Slice(UInt8)`) is aliased
#    in Crystal as `Bytes`.
bytes = Slice.new(arr.size, 0_u8)
# You can also use the alias: Bytes.new(arr.size, 0_u8)

arr.each_with_index do |v, i|
  bytes[i] = v.to_u8
end
puts String.new(bytes).inspect # => "abc"

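However, this only works for ASCII. It fails for multi-byte codepoints such as those in "a€æ∡": v.to_u8 raises an OverflowError for any codepoint above 255, and codepoints from 128 to 255 would be written as single bytes that are not valid UTF-8.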
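To see why one byte per codepoint is not enough, it can help to compare a character's codepoint with its UTF-8 bytes. This is a minimal sketch using only String#codepoints, String#bytes and String#bytesize; the variable name s is just for illustration:

```crystal
s = "€"

s.codepoints # => [8364]           one codepoint (U+20AC)
s.bytes      # => [226, 130, 172]  three UTF-8 bytes
s.bytesize   # => 3

# 8364 does not fit in a UInt8, so the byte-copy loop cannot represent it.
```

A codepoint is a number identifying the character, while the String stores its UTF-8 encoding, so the two only coincide for ASCII.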

dgo.a On
  str     = "aа€æ∡"
  arr     = str.codepoints              # Array(Int32)
  new_str = arr.map { |x| x.chr }.join

  puts str
  puts new_str
  puts(str == new_str)

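The .chr instance method on an Int returns the Char whose Unicode codepoint is that integer. You then .join the individual chars into a new String. Note that .chr raises an ArgumentError if the integer is not a valid codepoint (for example, a surrogate value or anything above 0x10FFFF).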
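If you would rather not build the intermediate Array(Char), one possible variation is to append each Char to a String::Builder via String.build. This is only a sketch of an alternative, not necessarily faster in practice; the names codepoints and rebuilt are illustrative:

```crystal
codepoints = "aа€æ∡".codepoints # Array(Int32)

# String.build yields a String::Builder (an IO).
# Appending a Char with << writes that Char's UTF-8 bytes.
rebuilt = String.build do |io|
  codepoints.each { |cp| io << cp.chr }
end

puts rebuilt == "aа€æ∡" # => true
```

Either way, the conversion goes through Char, so multi-byte characters are encoded correctly.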