Secure integer hashing for order number


Let's say I have a table Orders with an auto-incremented id, e.g. 1, 2, 3, 4..., and orders are currently queried as http://www.example.com/order?id={1,2,3..}

Now I want to hash the primary key [1, 2, 3, ..] into another number, called the Order Number, so our customers can reference it in their requests, e.g.

1 -> 100192938303
2 -> 293029200002

I want the following:

  • Nobody should be able to guess how many orders I create each day by looking at the auto-increment ID
  • No extra DB lookup is needed; the mapping is computed purely in PHP (with a predefined salt)
  • No collision

Is it possible?

There are 4 answers

1
Mikk On

I think you can take an easier approach: do not use an auto-incrementing id, use random integers as ids. Example:

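// Assumes $db is a PDO connection with PDO::ATTR_ERRMODE set to PDO::ERRMODE_EXCEPTION.
$stmt = $db->prepare("INSERT INTO Orders (id, foo, bar) VALUES (:id, 'foo', 'bar')");
while (true) {
    // random_int() is cryptographically secure (PHP 7+); the 12-digit range is an
    // assumption matching the question's examples and needs 64-bit PHP
    $id = random_int(100000000000, 999999999999);
    try {
        $stmt->execute(array(':id' => $id));
        // OK
        break;
    } catch (PDOException $ex) {
        // SQLSTATE class 23 is an integrity constraint violation, e.g. a duplicate key
        if (strpos((string) $ex->getCode(), '23') === 0) {
            // generate a new id and try again
            continue;
        }
        // Some other problem
        throw $ex;
    }
}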

This way you:

  • avoid collisions
  • do not need a hashing function and {hash -> id} mapping
  • have ids that do not reveal the number of orders
6
DLastCodeBender On

You can encode the order id with base64_encode() before you submit it in your GET request, then call base64_decode() when you read the variables sent by the form.

You can even add a salt, e.g. base64_encode($id . "salt").
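A minimal sketch of that round trip, assuming the salt is simply appended to the id as in the snippet above. The function names are hypothetical. Note that Base64 is an encoding rather than encryption, so anyone can decode the token; this hides the id only from casual inspection.

```php
<?php
// Encode the id with the salt appended, as the answer suggests.
function encode_order_id(int $id, string $salt): string
{
    return base64_encode($id . $salt);
}

// Decode the token, strip the salt, and return the id (null if malformed).
function decode_order_token(string $token, string $salt): ?int
{
    $decoded = base64_decode($token, true);   // strict mode rejects invalid characters
    if ($decoded === false || substr($decoded, -strlen($salt)) !== $salt) {
        return null;
    }
    $digits = substr($decoded, 0, -strlen($salt));
    return ctype_digit($digits) ? (int) $digits : null;
}

$token = encode_order_id(42, 'salt');          // "NDJzYWx0"
$id = decode_order_token($token, 'salt');      // 42
```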

0
drf On

You proposed using a salted hash. Since a hash is a one-way function, and you will need to convert from the hash to the original value, you will need one of the following to translate the hash into the original order value:

  • Loop through plausible order values, taking the salted hash of each, until you either identify the matching hash or exhaust the pool of allowable order IDs.
  • Cache the hashes of the plausible order values once (e.g., at app startup), and store them in a hashtable. This approach is much faster once the cache is created, but requires an additional lookup.
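Both options can be sketched roughly as follows. This is an illustration only: it assumes HMAC-SHA256 as the salted hash, and the salt value and function names are hypothetical.

```php
<?php
const ORDER_SALT = 'change-me';   // hypothetical secret salt

// Salted (keyed) hash of an order id, returned as a hex string.
function order_hash(int $orderId): string
{
    return hash_hmac('sha256', (string) $orderId, ORDER_SALT);
}

// Option 1: loop through plausible ids until the hash matches. O(n) per lookup.
function find_order_id(string $hash, int $maxOrderId): ?int
{
    for ($id = 1; $id <= $maxOrderId; $id++) {
        if (hash_equals(order_hash($id), $hash)) {
            return $id;
        }
    }
    return null;   // pool of allowable ids exhausted
}

// Option 2: build a {hash => id} table once, then look up in O(1).
function build_hash_cache(int $maxOrderId): array
{
    $cache = [];
    for ($id = 1; $id <= $maxOrderId; $id++) {
        $cache[order_hash($id)] = $id;
    }
    return $cache;
}

$hash = order_hash(42);
$viaLoop = find_order_id($hash, 1000);           // 42
$viaCache = build_hash_cache(1000)[$hash] ?? null; // 42
```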

You also noted that the original order identifier is confidential since an attacker who can obtain multiple Order IDs can infer order volume. The confidentiality of the Order Identifier is a separate concern from the confidentiality of the order itself, which the question does not address and may be handled through a separate access control mechanism.

I think the preferred approach in your example would be to use encryption rather than a hash. Encrypting the Order ID will meet the confidentiality and round-trip requirements, without the overhead of a cache of hashes or database lookup. The approach might look something like this:

  1. Encrypt the Order ID with your key.
  2. Base64-encode the ciphertext and return it to the client as a token.
  3. Upon receiving the encrypted token from the client, decode the Base64 string.
  4. Decrypt the decoded bytes with your key to recover the original Order ID.

For example, for Order 42 and DES key E0EC4E44EF2C6CEE and zero IV, you would send dmTt0kbIlcA= to the client as the order ID (if you encode 42 as a little-endian 32-bit integer). (A zero IV is appropriate here since having a unique ciphertext is not a concern in your scenario.)
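The four steps might look like this in PHP. This is a sketch under stated assumptions, not a reproduction of the DES example: it swaps in AES-128-CBC (DES's 56-bit key is considered weak today), requires the openssl extension, and uses a hypothetical key and function names. It keeps the zero IV and the little-endian 32-bit encoding described above, so the same order always yields the same token.

```php
<?php
// Steps 1-2: encrypt the order id, then Base64-encode the ciphertext.
function order_id_to_token(int $orderId, string $key): string
{
    $plain = pack('V', $orderId);               // little-endian 32-bit integer
    $iv = str_repeat("\0", 16);                 // zero IV: deterministic tokens
    $cipher = openssl_encrypt($plain, 'aes-128-cbc', $key, OPENSSL_RAW_DATA, $iv);
    return base64_encode($cipher);
}

// Steps 3-4: Base64-decode the token, then decrypt. Returns null if invalid.
function token_to_order_id(string $token, string $key): ?int
{
    $cipher = base64_decode($token, true);
    if ($cipher === false) {
        return null;
    }
    $iv = str_repeat("\0", 16);
    $plain = openssl_decrypt($cipher, 'aes-128-cbc', $key, OPENSSL_RAW_DATA, $iv);
    if ($plain === false || strlen($plain) !== 4) {
        return null;
    }
    return unpack('V', $plain)[1];
}

// Hypothetical 16-byte key; a real key belongs in secure configuration.
$key = hex2bin('000102030405060708090a0b0c0d0e0f');
$token = order_id_to_token(42, $key);
$orderId = token_to_order_id($token, $key);     // 42
```

The token is text rather than a number; if a purely numeric Order Number is required, the ciphertext bytes would need a different encoding.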

0
aaronsuperglue On

Here are two ideas:

  1. Use a reversible hash. Whether this works depends on what you consider to be security, since it's essentially just obfuscation. But if you customize it (perhaps altering the order of some of the steps in the algorithm), and you prevent the source from being leaked, it will stop all but the most determined attackers. (Depending on your security goals, you'd probably want to combine it with a few other techniques to mitigate the risk of leaks, e.g., employees leaving the company. Consider keeping part of the algorithm secret, as if it were a cryptographic key, and applying additional, variable pretransformations to the input.)

Off the top of my head, a simple reversible hash might just be "lateral addition" of bits. For something more sophisticated, the popular "MurmurHash" family of algorithms is claimed to be reversible.

I am not aware of any cryptographically-strong reversible hashes. However, other answers on the topic of symmetric encryption are similar to this idea.
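As a rough illustration of idea 1, here is a different and simpler reversible mapping than MurmurHash: multiply by a secret odd constant modulo 2^31, then XOR with a secret mask. Every name and constant is hypothetical, 64-bit PHP is assumed, and like any such scheme it is obfuscation rather than cryptography.

```php
<?php
const MODULUS = 2147483648;     // 2^31, so products stay below 2^62 on 64-bit PHP
const MULTIPLIER = 1580030173;  // hypothetical secret; must be odd to be invertible
const XOR_MASK = 1163945558;    // hypothetical secret, below 2^31

// Modular inverse of $a modulo $m via the extended Euclidean algorithm.
function mod_inverse(int $a, int $m): int
{
    [$oldR, $r] = [$a, $m];
    [$oldS, $s] = [1, 0];
    while ($r !== 0) {
        $q = intdiv($oldR, $r);
        [$oldR, $r] = [$r, $oldR - $q * $r];
        [$oldS, $s] = [$s, $oldS - $q * $s];
    }
    return (($oldS % $m) + $m) % $m;   // normalise into [0, m)
}

// Map an internal id in [0, 2^31) to an external number in the same range.
function obfuscate(int $id): int
{
    return (($id * MULTIPLIER) % MODULUS) ^ XOR_MASK;
}

// Undo obfuscate(): remove the mask, then multiply by the inverse constant.
function deobfuscate(int $code): int
{
    return (($code ^ XOR_MASK) * mod_inverse(MULTIPLIER, MODULUS)) % MODULUS;
}

$external = obfuscate(42);
$internal = deobfuscate($external);   // 42
```

Because multiplication by an odd constant is a bijection modulo a power of two, distinct ids never collide and no lookup table is needed.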

  2. Use a stream cipher, AKA cryptographic RNG. This is appropriate if the total number of orders is going to be fairly small. What you need is a unique sequence of numbers that maps one-to-one with the counting number sequence. So generate a sequence of unique random numbers, using RC4 or HMAC of your choice, eliminating duplicates as you go. (Maybe a creative way to make this go fast is a bloom filter.)

For mapping from internal IDs to external IDs, you just generate the sequence. For the reverse, you keep going until you find the ID, or hit the maximum order ID. This algorithm is O(n), which is obviously not ideal, but if you're willing to compromise a little, add more complexity, or be clever, you might find a way to mitigate this. For instance, you might be able to keep a cache of the IDs in RAM.
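Idea 2 could be sketched like this, assuming HMAC-SHA256 over a counter as the generator and 40-bit external IDs. The key and function names are hypothetical, and a plain array stands in for the duplicate filter.

```php
<?php
const SEQUENCE_KEY = 'hypothetical-secret-key';
const ID_BITS = 40;

// Deterministic sequence of unique pseudo-random numbers; element i belongs
// to internal id i + 1. Duplicates are skipped as they appear.
function external_id_sequence(int $count): array
{
    $mask = (1 << ID_BITS) - 1;
    $seen = [];
    $sequence = [];
    for ($counter = 0; count($sequence) < $count; $counter++) {
        $digest = hash_hmac('sha256', (string) $counter, SEQUENCE_KEY, true);
        $value = unpack('J', substr($digest, 0, 8))[1] & $mask;  // keep low 40 bits
        if (isset($seen[$value])) {
            continue;
        }
        $seen[$value] = true;
        $sequence[] = $value;
    }
    return $sequence;
}

// Internal -> external: generate the sequence up to the id.
function to_external(int $internalId): int
{
    return external_id_sequence($internalId)[$internalId - 1];
}

// External -> internal: scan the sequence up to the maximum order id. O(n).
function to_internal(int $externalId, int $maxInternalId): ?int
{
    $pos = array_search($externalId, external_id_sequence($maxInternalId), true);
    return $pos === false ? null : $pos + 1;
}

$external = to_external(42);
$internal = to_internal($external, 1000);   // 42
```

Keeping the generated sequence in memory between requests would avoid regenerating it on every call.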

Edited:

I myself feel skeptical about #2 due to the linear complexity, so I ran some numbers. Using Crypto++'s benchmark numbers from a Core2 processor, if you budget 10 ms for the number transformation, and you use 40-bit IDs (which hypothetically gets you to about one trillion orders, since 2^40 is roughly 1.1 × 10^12), you get to a maximum order ID of about 2,500,000. And I think you could double that by using a smaller key.

So this method could go either way. For small-scale stuff, it's fine. (The assumptions above are conservative.) But for large-scale stuff, this could be an annoyance. It's enough to get you through a product launch; you'd want to revisit it at about the time you started talking about how to build your software as a distributed system, which would also help solve this problem. But at that point, you're probably better off questioning initial assumptions, and just storing this thing in a database somewhere.