I know how data is (in theory) stored in a DHT. However, I am unsure how one would go about updating a piece of data associated with a key. Is this possible? Also, how are conflicts handled in a DHT?
There are 2 answers
It is possible. I've researched Pastry's DHT. It is possible to alter data stored under a given key, but Pastry's developers advise against it because it can have nasty side effects, mainly with replicas of the altered data that are stored on other nodes (see the FAQ on FreePastry's home page).
I'm not sure how it would affect other DHTs such as Chord or Tapestry, however.
With regard to conflicts, again I only have experience with Pastry. If you try to store data under a key that's already in use, an exception will be thrown.
A DHT simply defines put(key,value) and get(key) operations, and the core of the various DHT algorithms revolves around how to locate the nodes responsible for a specific key.
What those nodes do on an incoming put request for a value already stored largely depends on the purpose and implementation of the DHT network, not on the algorithm itself.
E.g. a node might opt to timestamp all incoming values and return lists with multiple separate timestamped entries. Or it might return lists that also include the source address for each value. Or it might just overwrite the stored value.
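To make those options concrete, here is a minimal Python sketch of two node-side storage policies. The class and method names are made up for illustration and do not correspond to any particular DHT implementation; routing, replication, and expiry are left out entirely.

```python
import time
from collections import defaultdict


class OverwriteNode:
    """Hypothetical node that keeps one value per key; a later put replaces it."""

    def __init__(self):
        self._store = {}

    def put(self, key, value, source=None):
        # The source address is ignored; the newest value simply wins.
        self._store[key] = value

    def get(self, key):
        return [self._store[key]] if key in self._store else []


class MultiValueNode:
    """Hypothetical node that keeps every value, tagged with time and source."""

    def __init__(self, clock=time.time):
        self._store = defaultdict(list)
        self._clock = clock  # injectable so the behaviour can be tested

    def put(self, key, value, source=None):
        # Nothing is replaced; each put appends a (timestamp, source, value) entry.
        self._store[key].append((self._clock(), source, value))

    def get(self, key):
        # .get() avoids creating an empty list for keys that were never stored.
        return list(self._store.get(key, []))
```

With `OverwriteNode`, a second put under the same key silently updates the value. With `MultiValueNode`, the caller receives every entry and has to decide which one to trust.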
If there is some relation between the key and a signature within the value, the source ID, or something similar, you can put enough intelligence into the nodes to verify the data cryptographically, and thus allow them to keep a single canonical value for each key by replacing the old data.
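One way such a scheme could look, as a rough sketch only: the key is assumed to be the SHA-256 hash of the owner's public key, and the signature check is passed in as a callable, because the Python standard library has no public-key signature primitive. `key_for`, `CanonicalValueNode`, and `verify_signature` are hypothetical names, not part of any real DHT.

```python
import hashlib


def key_for(public_key: bytes) -> bytes:
    """Assumed convention: the DHT key is the hash of the owner's public key."""
    return hashlib.sha256(public_key).digest()


class CanonicalValueNode:
    """Hypothetical node that stores one verified value per key."""

    def __init__(self, verify_signature):
        # verify_signature(public_key, value, signature) -> bool is supplied by
        # the caller, e.g. backed by a real signature library.
        self._verify = verify_signature
        self._store = {}

    def put(self, key, public_key, value, signature):
        if key != key_for(public_key):
            return False  # this public key does not own the key
        if not self._verify(public_key, value, signature):
            return False  # value was not signed by the owner
        self._store[key] = value  # verified update replaces the old data
        return True

    def get(self, key):
        return self._store.get(key)
```

A real design would also need replay protection (for example a signed sequence number) so that an old, validly signed value cannot overwrite a newer one; that is omitted here.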
In the case of BitTorrent's DHT you wouldn't want that. Many different BitTorrent peers announce their presence under a single key from different source addresses. Therefore the nodes actually store unique <key,IP,port> tuples, where <IP,port> can be considered the value. This means a node returns lists of IPs and ports on each lookup. And since a DHT has multiple nodes responsible for one key, you will actually have K (bucket size) nodes responding with varying lists.
TL;DR: It's implementation-dependent.
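For illustration, the BitTorrent-style behaviour described above might be sketched in Python as follows. `PeerStoreNode` and `lookup` are invented names, and the sketch ignores the actual wire protocol, tokens, and expiry of announcements.

```python
class PeerStoreNode:
    """Hypothetical node storing unique (ip, port) pairs per key."""

    def __init__(self):
        self._peers = {}  # key -> set of (ip, port)

    def announce(self, key, ip, port):
        # A set makes each <key, IP, port> tuple unique; re-announcing is a no-op.
        self._peers.setdefault(key, set()).add((ip, port))

    def get_peers(self, key):
        return sorted(self._peers.get(key, set()))


def lookup(nodes, key):
    """Merge the possibly different lists returned by the responsible nodes."""
    merged = set()
    for node in nodes:
        merged.update(node.get_peers(key))
    return sorted(merged)
```

Here there is no "conflict" to resolve at all: values from different sources accumulate, and the client combines whatever the responding nodes return.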