I am scraping websites for information, which involves extracting the SHA-1 hashes from magnet links.
I get all the magnet links with a simple preg_match_all,
but some of the results are odd. I understand that a magnet hash in its hexadecimal form is 40 characters long, but I am also getting strings that are 32 characters long and contain non-hexadecimal characters.
Here are two examples from my results. First, a normal 40-character hexadecimal hash from a magnet link:
array
0 => string 'F5AD2D170C033736FD987106F04C3ABD6DF41D14' (length=40)
And second, one of the odd results that I do not understand, where the hash is a 32-character non-hexadecimal value:
array
0 => string 'VPR33QQM3L6BFU5FGOZXMBNORAFFSZWW' (length=32)
Has the hash been packed in some way? I know it was not done with pack('H*', $hash),
as that returns the binary form of the hash. The magnet links do work, as I have tested them.
You can also see these hashes in use at this website
by hovering over the magnet links and looking at the magnet hash.
Thanks
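Hashes in magnet links can be encoded using Base32 instead of hexadecimal. A SHA-1 hash is 20 bytes long, which is 40 hexadecimal characters or 32 Base32 characters. In your example,
VPR33QQM3L6BFU5FGOZXMBNORAFFSZWW
turns into
ABE3BDC20CDAFC12D3A533B37605AE880A5966D6
which is a valid SHA-1 hash.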
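If it helps, here is a minimal sketch of normalising both forms to hexadecimal. It is written in Python only because Base32 decoding is available there in the standard library; the same steps (check the length and alphabet, Base32-decode, hex-encode) should carry over to PHP. The function name normalize_info_hash is made up for illustration, and the sketch assumes the hash has already been pulled out of the magnet link.

```python
import base64
import re


def normalize_info_hash(value: str) -> str:
    """Return the 40-character uppercase hex form of a magnet-link hash.

    Hypothetical helper: accepts either the hex form or the Base32 form.
    """
    value = value.strip().upper()

    # Already 40 hexadecimal characters: nothing to decode.
    if re.fullmatch(r"[0-9A-F]{40}", value):
        return value

    # 32 Base32 characters (A-Z, 2-7) carry 160 bits, i.e. the 20 bytes
    # of a SHA-1 digest. Decode to raw bytes, then hex-encode.
    if re.fullmatch(r"[A-Z2-7]{32}", value):
        return base64.b32decode(value).hex().upper()

    raise ValueError(f"not a recognised magnet hash: {value!r}")


# The two values from the question.
print(normalize_info_hash("VPR33QQM3L6BFU5FGOZXMBNORAFFSZWW"))
print(normalize_info_hash("F5AD2D170C033736FD987106F04C3ABD6DF41D14"))
```

Checking the length first matters because the Base32 alphabet overlaps with hexadecimal (the letters A to F and the digits 2 to 7 are valid in both), so the alphabet alone cannot tell the two forms apart.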