In order to understand ECIES completely and use my favorite library I implemented some parts of ECIES myself. Doing this and comparing the results led to one point which is not really clear for me: what exacly is the input of KDF?
The result of ECDH is an vector, but what do you use for the KDF? Is it just the X value, or is it X + Y (perhaps with an prepended 04)? You can find both concept in the wild, and for sake of interoberability, it would be really interesting which way is the correct way (if there is a correct way at all - I know that ECIEs is more a concept and has several degrees of freedom).
Explanation (correct me if I'm wrong at a specific point, please). If I talk about byte length, this will refer to ECIES with 256 Bit EC Keys.
So, first, the big picture: here's the ECIES process, and I'm talking about the step 2 -> 3:
The recipient's public key is an vector V, the sender's emphemal private key is a scalar u, and key agreement function KA is ECDH which is basicly a multiplication of V * u. As a result, you get a shared key which is also a vector - let's call it "shared key".
Then you take the sender's public key, concat it with the shared key, and use this as an input for the key derival function KDF.
But: If you want to use this vector for the key derival function KDF, you have two ways of doing this:
- you can use just shared key's X. Then you have a bytestring of 32 bytes.
- you can use shared key's X and Y and prepend it 0x04 as you with public keys. Then you have a bytestring of 01 + 32 + 32 bytes [3) just to be complete: you can also use X + Y as a compressed point)
The length of the bytestring does not really matter, because after KDF (which usually involves hashing) you always have a fixed value, e.g. 32 bytes (if you use sha256).
But of course the result of KDF is quite different if you choose one or the other method. So the question is: what's the correct way?
- eciespy uses method 2 https://github.com/ecies/py/blob/master/ecies/utils.py#L143
- python cryptography gives just X back at their ECDH: https://cryptography.io/en/latest/hazmat/primitives/asymmetric/ec/#cryptography.hazmat.primitives.asymmetric.ec.ECDH . They have no ECIES support.
- if I understand CryptoC++s documentation correctly, they also just give X back: https://cryptopp.com/wiki/Elliptic_Curve_Diffie-Hellman
- same with Java BountyCastle, if I read this correctly - result is an integer: https://github.com/bcgit/bc-java/blob/master/core/src/main/java/org/bouncycastle/crypto/agreement/DHBasicAgreement.java#L79
- but you can also find online calculators with both, X and Y: http://www-cs-students.stanford.edu/~tjw/jsbn/ecdh.html
So, I tried to get more information in documentation:
- there's the ISO propsal for ECIES. They don't describe it in detail (or I was not able to find it), but I would interpret it as the way with the full vector, X and Y: https://www.shoup.net/papers/iso-2_1.pdf
- there is this paper which is widely linked in the internet which refers to just using X at page 27: http://www.secg.org/sec1-v2.pdf
So, result is: I'm confused. Can anybody point me in the right direction, or is this just a degree of freedom you have (and reason for lot's of fun when it comes to compatibility)?
To answer my quesion myself: yes, this is a degree of freedom. The X coordinate way is called compact representation, and it's defined in RFC 6090. So both are valid.
They are also equally secure, because you can calculate Y out of X as described in appendix C at RFC 6090.
The default way is using compact representation. Both ways are not compatibile to each other, so if you stumble across compatibility issues between libaries this might be an interesting point to find out.