I would like to use the pseudo_encrypt function mentioned a few times on StackOverflow to make my IDs look more random: https://wiki.postgresql.org/wiki/Pseudo_encrypt
How can I customize this to output unique "random" numbers for just me. I read somewhere that you can just change the 1366.0 constant, but I don't want to take any risks with my IDs as any potential ID duplicates would cause major issues.
I really have no idea what each constant actually does, so I don't want to mess around with it unless I get some direction. Does anyone know which constants I can safely change?
Here it is:
CREATE OR REPLACE FUNCTION "pseudo_encrypt"("VALUE" int) RETURNS int IMMUTABLE STRICT AS $function_pseudo_encrypt$
DECLARE
l1 int;
l2 int;
r1 int;
r2 int;
i int:=0;
BEGIN
l1:= ("VALUE" >> 16) & 65535;
r1:= "VALUE" & 65535;
WHILE i < 3 LOOP
l2 := r1;
r2 := l1 # ((((1366.0 * r1 + 150889) % 714025) / 714025.0) * 32767)::int;
r1 := l2;
l1 := r2;
i := i + 1;
END LOOP;
RETURN ((l1::int << 16) + r1);
END;
$function_pseudo_encrypt$ LANGUAGE plpgsql;
for bigint's
CREATE OR REPLACE FUNCTION "pseudo_encrypt"("VALUE" bigint) RETURNS bigint IMMUTABLE STRICT AS $function_pseudo_encrypt$
DECLARE
l1 bigint;
l2 bigint;
r1 bigint;
r2 bigint;
i int:=0;
BEGIN
l1:= ("VALUE" >> 32) & 4294967295::bigint;
r1:= "VALUE" & 4294967295;
WHILE i < 3 LOOP
l2 := r1;
r2 := l1 # ((((1366.0 * r1 + 150889) % 714025) / 714025.0) * 32767*32767)::bigint;
r1 := l2;
l1 := r2;
i := i + 1;
END LOOP;
RETURN ((l1::bigint << 32) + r1);
END;
$function_pseudo_encrypt$ LANGUAGE plpgsql;
Alternative solution: use different ciphers
Other cipher functions are now available on postgres wiki. They're going to be significantly slower, but aside from that, they're better candidates for generating customized random-looking series of unique numbers.
For 32 bit outputs, Skip32 in plpgsql will encrypt its input with a 10 bytes wide key, so you just have to choose your own secret key to have your own specific permutation (the particular order in which the 2^32 unique values will come out).
For 64 bit outputs, XTEA in plpgsql will do similarly, but using a 16 bytes wide key.
Otherwise, to just customize
pseudo_encrypt
, see below:Explanations about
pseudo_encrypt
's implementation:This function has 3 properties
The first and second property come from the Feistel Network, and as already explained in @CodesInChaos's answer, they don't depend on the choice of these constants:
1366
and also150889
and714025
.Make sure when changing
f(r1)
that it stays a function in the mathematical sense, that isx=y
impliesf(x)=f(y)
, or in other words the same input must always produce the same output. Breaking this would break the unicity.The purpose of these constants and this formula for
f(r1)
is to produce a reasonably good pseudo-random effect. Using postgres built-inrandom()
or similar method is not possible because it's not a mathematical function as described above.Why these arbitrary constants? In this part of the function:
The formula and the values
1366
,150889
and714025
come from Numerical recipes in C (1992, by William H.Press, 2nd ed.), chapter 7: random numbers, specifically p.284 and 285. The book is not directly indexable on the web but readable through an interface here: http://apps.nrbook.com/c/index.html .It's also cited as a reference in various source code implementing PRNGs.Among the algorithms discussed in this chapter, the one used above is very simple and relatively effective. The formula to get a new random number from a previous one (
jran
) is:where
jran
is the current random integer.This generator will necessarily loop over itself after a certain number of values (the "period"), so the constants
ia
,ic
andim
have to be chosen carefully for that period to be as large as possible. The book provides a table p.285 where constants are suggested for various lengths of the period.ia=1366
,ic=150889
andim=714025
is one of the entries for a period of 229 bits, which is way more than needed.Finally the multiplication by
32767
or 215-1 is not part of the PRNG but meant to produce a positive half-integer from the 0..1 pseudo-random float value. Don't change that part, unless to widen the blocksize of the algorithm.