I'm working on a project that involves a lot of byte-level manipulation. I would like the library to accept binaries as input, but it's often convenient to work with functions on Enum. I've ended up with a lot of functions that have two guarded definitions, one for when is_binary and one for when is_list:
@doc """
Convert a binary or list of bytes to a hexadecimal string representation.

## Examples

    iex> Bytes.to_hex <<1,2,3,253,254,255>>
    "010203fdfeff"

    iex> Bytes.to_hex [1,2,3,253,254,255]
    '010203fdfeff'
"""
def to_hex(bytes) when is_binary(bytes) do
  :binary.bin_to_list(bytes) |> to_hex |> :binary.list_to_bin
end

def to_hex(bytes) when is_list(bytes) do
  hexes = Enum.map bytes, &byte_to_hex/1
  Enum.join(hexes) |> String.downcase |> :binary.bin_to_list
end

defp byte_to_hex(byte) do
  Integer.to_char_list(byte, 16) |> :string.right(2, ?0)
end
In addition to the duplicated definitions, there's the :binary.bin_to_list at the end of the list version followed by the :binary.list_to_bin at the end of the binary version, resulting in superfluous work.
I'd really like to be able to use the same abstractions for binaries as well. I initially tried doing this by implementing the Enumerable protocol for BitString:
defimpl Enumerable, for: BitString do
  def count(coll) do
    {:ok, byte_size(coll)}
  end

  def member?(_coll, _val) do
    {:error, __MODULE__}
  end

  def reduce(_, {:halt, acc}, _fun) do
    {:halted, acc}
  end

  def reduce(bin, {:suspend, acc}, fun) do
    {:suspended, acc, &reduce(bin, &1, fun)}
  end

  def reduce(<<>>, {:cont, acc}, _fun) do
    {:done, acc}
  end

  def reduce(<< h :: binary-size(1), t :: binary >>, {:cont, acc}, fun) do
    reduce(t, fun.(h, acc), fun)
  end
end
This works great for the count, member?, and reduce cases:
test "counts bytes" do
  assert Enum.count(<<0, 1, 2>>) == 3
end

test "reports members" do
  assert Enum.member?(<<0, 1, 2>>, <<3>>) == false
  assert Enum.member?(<<0, 1, 2>>, <<2>>) == true
end

test "reduces binaries" do
  reducer = fn(b, acc) -> b <> acc end
  assert Enum.reduce("abc", "", reducer) == "cba"
end
However, some Enum functions call out to functions on :lists; for example, Enum.map calls :lists.reverse, so this gives different output:
test "maps binaries" do
  mapper = fn(<<b>>) -> <<b + 1>> end
  assert Enum.map(<<0, 1, 2>>, mapper) == <<1, 2, 3>>
end
1) test maps binaries
Assertion with == failed
code: Enum.map(<<0, 1, 2>>, mapper) == <<1, 2, 3>>
lhs: [<<1>>, <<2>>, <<3>>]
rhs: <<1, 2, 3>>
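As far as I can tell, the mismatch comes from Enum.map always returning a list, whatever the input type, so the mapped chunks have to be collected back into a binary explicitly. A minimal sketch of that idea, assuming only the standard library; the module name BinMap is hypothetical and not part of my library:

```elixir
defmodule BinMap do
  # Hypothetical helper: applies `fun` to each byte, passed as a
  # one-byte binary, and collects the returned binaries into one binary.
  def map(bin, fun) when is_binary(bin) do
    for <<byte::binary-size(1) <- bin>>, into: <<>>, do: fun.(byte)
  end
end

mapper = fn <<b>> -> <<b + 1>> end
BinMap.map(<<0, 1, 2>>, mapper)
# => <<1, 2, 3>>
```

This sidesteps the Enumerable implementation entirely, so it only covers map-like cases; other Enum functions would still hand back lists.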
What is the most idiomatic way to deal with this kind of situation? Should I simply stick with the guards and implement the list <-> binary conversions? Should I create a specific module for the functions, like List is to Enum?
Edit
I thought I could take care of the extra list <-> binary conversions by making the binary the base case:
def to_hex(bytes) when is_binary(bytes) do
  bytes
  |> :binary.bin_to_list
  |> Enum.map(&byte_to_hex/1)
  |> Enum.join
end

def to_hex(bytes) when is_list(bytes) do
  bytes |> :binary.list_to_bin |> to_hex |> :binary.bin_to_list
end

defp byte_to_hex(byte) do
  Integer.to_string(byte, 16) |> String.rjust(2, ?0) |> String.downcase
end
But of course, there's still the same number of list_to_bin and bin_to_list calls in this version, just moved around. Is there just no way around this?
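One direction that might avoid the round trips is to give each clause its own traversal instead of funnelling one representation through the other. The following is only a sketch under assumptions: the module name HexSketch is hypothetical, and it uses String.pad_leading/3 and String.to_charlist/1, which I believe are the newer names for String.rjust and the char list conversion in later Elixir releases.

```elixir
defmodule HexSketch do
  # Hypothetical module, shown only to illustrate the idea.

  # Binary input: a bitstring comprehension walks the bytes directly
  # and builds the result binary, with no intermediate byte list.
  def to_hex(bytes) when is_binary(bytes) do
    for <<byte <- bytes>>, into: "", do: byte_to_hex(byte)
  end

  # List input: each byte is rendered and flattened into a char list,
  # without converting the whole input to a binary first.
  def to_hex(bytes) when is_list(bytes) do
    Enum.flat_map(bytes, fn byte ->
      byte |> byte_to_hex() |> String.to_charlist()
    end)
  end

  # Renders one byte (0..255) as two lowercase hex digits.
  defp byte_to_hex(byte) do
    byte
    |> Integer.to_string(16)
    |> String.pad_leading(2, "0")
    |> String.downcase()
  end
end

HexSketch.to_hex(<<1, 2, 3, 253, 254, 255>>)
# => "010203fdfeff"
```

The two clauses still share byte_to_hex, so the per-byte logic lives in one place even though the traversal is duplicated.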
It looks weird that you get binaries while enumerating a binary. To me, they should be integers, just like a char list. Then functions like Enum.map would work as expected.
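To illustrate what yielding integers could look like, here is a small sketch that folds over a binary one integer byte at a time. ByteEnum and its functions are hypothetical names for illustration, not an existing module, and this deliberately avoids touching the Enumerable protocol:

```elixir
defmodule ByteEnum do
  # Hypothetical module: folds over a binary, passing each byte
  # to `fun` as an integer in 0..255, like elements of a char list.
  def reduce(<<>>, acc, _fun), do: acc

  def reduce(<<byte, rest::binary>>, acc, fun) do
    reduce(rest, fun.(byte, acc), fun)
  end

  # Maps over the integer bytes and returns a list, as Enum.map would.
  def map(bin, fun) when is_binary(bin) do
    bin
    |> reduce([], fn byte, acc -> [fun.(byte) | acc] end)
    |> Enum.reverse()
  end
end

ByteEnum.map(<<0, 1, 2>>, &(&1 + 1))
# => [1, 2, 3]
```

The result is a list of integers, so turning it back into a binary would be a single :erlang.list_to_binary/1 call, provided every value still fits in a byte.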