I'd like to write some IDs for use in URLs in Crockford's base32. I'm using the base32 npm module.
So, for example, if the user types in http://domain/page/4A2A I'd like it to map to the same underlying ID as http://domain/page/4a2a
This is because I want human-friendly URLs, where the user doesn't have to worry about the difference between upper- and lower-case letters, or between "l" and "1" - they just get the page they expect.
But I'm struggling to implement this, basically because I'm too dim to understand how encoding works. First I tried:
var encoded1 = base32.encode('4a2a');
var encoded2 = base32.encode('4A2A');
console.log(encoded1, encoded2);
But they map to different underlying IDs:
6hgk4r8 6h0k4g8
OK, so maybe I need to use decode?
var encoded1 = base32.decode('4a2a');
var encoded2 = base32.decode('4A2A');
console.log(encoded1, encoded2);
No, that just gives me empty strings:
" "
What am I doing wrong, and how can I get 4A2A and 4A2A to map to the same thing?
For an incoming request, you'll want to decode the URL fragment. When you create URLs, you will take your identifier and encode it. So, given a URL
http://domain/page/
dnwnyub46m50
, you will take that fragment and decode it. Example:The library you linked to is case-insensitive, so you get the same result this way:
When dealing with any encoding scheme (Base-16/32/64), you have two basic operations:
encode
, which works on a raw stream of bits/bytes, anddecode
which takes an encoded set of bytes and returns the original bit/byte stream. The Wikipedia page on Base32 encoding is a great resource.When you decode a string, you get raw bytes: it may be that those bytes are not compatible with
ASCII
,UTF-8
, or some other encoding which you are trying to work with. This is why your decoded examples look like spaces: the tools you are using do not recognize the resulting bytes as valid characters.How you go about encoding identifiers depends on how your identifiers are generated. You didn't say how you were generating the underlying identifiers, so I can't make any assumptions about how you should handle the raw bytes that come out of the decoder, nor about the content of the raw bytes being passed into the encoder.
It's also important to mention that the library you linked to is not compatible with Crockford's Base32 encoding. The library excludes
I, L, O, S
, while Crockford's encoding excludesI, L, O, U
. This would be a problem if you were trying to interoperate with another system that used a different library. If no one besides you will ever need to decode your URL fragments, then interoperability doesn't matter.