A Distributed Hash Table (DHT) stores key-value pairs across a set of nodes. The design discussed in this post is simple, but it is enough to serve the basic purpose of a distributed hash table: given a static network of nodes with perfect uptime, you can start from any node with any key and find the node responsible for that key.

(Figure: the DHT ring overlay)

In a real DHT implementation, each node would run on a different machine, and all calls between them would need to go over some socket protocol. For now, however, we will only simulate the DHT to understand how it works. A complete, real DHT implementation will be discussed in the next post.

Each node is itself a standard hash table. To store or retrieve a value, all we need to do is find the appropriate node in the network and then store or read the value in that node's hash table.
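A minimal sketch of how such a node could be simulated in Python (the class and attribute names here are illustrative, not taken from the downloadable code):

    class Node:
        def __init__(self, node_id):
            self.node_id = node_id   # m-bit identifier of this node
            self.successor = None    # next node on the ring (set when the overlay is built)
            self.data = {}           # the node's own standard hash table of key/value pairs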

Network Overlay

Each node points to another node called its successor, and the last node points back to the first, forming the network overlay. For now we will not discuss how to maintain this overlay; instead we will concentrate on store and lookup, assuming the overlay already exists. Nevertheless, the overlay is very simple to simulate: use a circular linked list, where each node points to a successor node, forming the circular list.
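One possible way to simulate that circular list, assuming the Node class sketched above: sort the nodes by ID and link each one to the next, wrapping the last node back around to the first.

    def build_ring(nodes):
        # Sort by ID and link each node to the next; the last node wraps around
        # to the first, giving a circular linked list of successors.
        nodes = sorted(nodes, key=lambda n: n.node_id)
        for i, node in enumerate(nodes):
            node.successor = nodes[(i + 1) % len(nodes)]
        return nodes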

Choosing Node ID

By hashing the node's IP address with a consistent hash function such as SHA-1, an m-bit identifier is generated and set as the Node ID. Similarly, the value is hashed to generate an m-bit identifier that is used as the key. Keys must lie in the same identifier space as Node IDs so that the two can be compared.
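A hedged sketch of the identifier generation: SHA-1 gives a 160-bit digest, and keeping only the lowest m bits puts node IDs and keys in the same identifier space. The constant M and the function name make_id are assumptions made for this post, not part of any standard API.

    import hashlib

    M = 16  # identifier size in bits; a real deployment would typically keep all 160 bits of SHA-1

    def make_id(text):
        # Hash the input (a node's IP address, or a value) and truncate to m bits,
        # so node IDs and keys can be compared in the same space.
        digest = hashlib.sha1(text.encode()).hexdigest()
        return int(digest, 16) % (2 ** M)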

Store

The start node is the node that performs the store request. It first checks whether the key falls between its own Node ID and its successor's Node ID. If it does, the node only has to decide whether the key is closer to its own Node ID or to the successor's Node ID: if its own ID is closer, it stores the value in its hash table; if the successor's Node ID is closer, it tells the successor node to store the value. If the key does not fall in that range, the node asks its successor to continue the search. This goes on until the node that should store the value is found. The work is split between the closest_node and find_node functions.

Given a node, its successor, and the key, the closest_node function returns whichever of the two nodes is closer to the key.

closest_node
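A sketch of what closest_node might look like, assuming closeness is measured as distance around the 2^m identifier ring (the downloadable code may define the metric differently):

    RING = 2 ** M  # size of the identifier space

    def ring_distance(a, b):
        # Distance between two identifiers measured around the ring.
        return min((a - b) % RING, (b - a) % RING)

    def closest_node(node, key):
        # Of a node and its successor, return whichever ID is closer to the key.
        if ring_distance(node.node_id, key) <= ring_distance(node.successor.node_id, key):
            return node
        return node.successor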

The find_node function checks whether the key is between the start node and its successor; if not, it moves on to the next node and checks again. Once the right interval is found, it calls closest_node to determine which of the two nodes is closer to the key and returns that node.

find_node
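A sketch of find_node under the same assumptions. The in_range helper is introduced here just to handle intervals that wrap past zero on the ring:

    def in_range(key, lo, hi):
        # True if key lies in the ring interval (lo, hi], handling wrap-around past 0.
        if lo < hi:
            return lo < key <= hi
        return key > lo or key <= hi

    def find_node(start, key):
        # Walk the ring until the key falls between a node and its successor,
        # then let closest_node decide which of the two is responsible for it.
        node = start
        while not in_range(key, node.node_id, node.successor.node_id):
            node = node.successor
        return closest_node(node, key)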

The store function stores a value at the appropriate node. It uses find_node to locate the node that should hold the value and then stores it there.

Store
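A corresponding sketch of store: hashing the value to obtain the key follows the scheme described above, and returning the key is just a convenience for later lookups.

    def store(start, value):
        # Hash the value to get its key, find the responsible node, and put
        # the pair into that node's local hash table.
        key = make_id(value)
        node = find_node(start, key)
        node.data[key] = value
        return key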

Lookup

The lookup function uses find_node to find which node holds the value and then retrieves the value from that node.

Lookup
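A matching sketch of lookup, followed by a tiny end-to-end run on a simulated three-node ring (the IP addresses are made up for the example):

    def lookup(start, key):
        # Find the node responsible for the key and read the value from its hash table.
        node = find_node(start, key)
        return node.data.get(key)

    # Putting the pieces together:
    nodes = build_ring([Node(make_id(ip)) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")])
    key = store(nodes[0], "hello world")
    print(lookup(nodes[1], key))   # prints: hello world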

Analysis

  • Storage or Lookup complexity – A store or lookup takes O(n) messages in this design. The current DHT can be modified into a Chord implementation: instead of keeping only one successor, each node keeps a routing table of nodes at exponentially increasing distances around the ring (a finger table), which brings store and lookup down to O(log n) messages.
  • Range Queries – This simple DHT cannot perform range queries efficiently. A Prefix Hash Tree (PHT) or a Range Search Tree (RST) can be layered on top of the DHT to support range queries.
  • Value Storage – Each node maintains a standard hash table to store values. In the current implementation, if two values arrive with the same key, the previous value is overwritten and lost. This is not a common situation, since the key is itself produced by hashing and different values will rarely hash to the same key, but it needs to be considered whenever the key is not guaranteed to be unique per value. To store different values that happen to share a key, we need a chained hash table instead of a standard hash table that can hold only one value per key. In a chained hash table, each key points to a list of values, as sketched below.
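A chained hash table can be simulated in Python with a dict of lists, for example collections.defaultdict:

    from collections import defaultdict

    chained_table = defaultdict(list)          # each key maps to a list of values
    chained_table[42].append("first value")
    chained_table[42].append("second value")   # a colliding key no longer overwrites
    print(chained_table[42])                   # ['first value', 'second value']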

Conclusion

I tried to touch the core basics of a DHT and show how to implement one. DHTs are a very useful way to distribute load among servers, and they have many other application domains. The implementation given in this post is not very useful on its own, as it is just a basic simulation of how things work under the hood. In the coming post I will write a real DHT implementation in Python that runs on different nodes, and I will try to hook it up with a sample application. Watch out for that.

The full Python implementation of the simulated DHT presented in this post can be downloaded from here.
