Distributed Hash Tables: Design and Implementation, Slides of Computer Networks

These slides give an overview of distributed hash tables (DHTs), a distributed data structure used for storing and managing large amounts of data across a network. They cover the basics of hash tables, the design goals and key decisions behind DHTs, hash functions, consistent hashing, and the process of storing and retrieving data in a DHT. They also discuss the challenges and solutions for handling joins and leaves of nodes, and the concept of links in the overlay topology.

Typology: Slides

2012/2013

Uploaded on 04/25/2013

avanti

Lecture 12

Distributed Hash Tables

Hash Table

  • Name-value pairs (or key-value pairs)
    • E.g., “Mehmet Hadi Gunes” and [email protected]
    • E.g., “http://cse.unr.edu/” and the Web page
    • E.g., “HitSong.mp3” and “12.78.183.2”
  • Hash table
    • Data structure that associates keys with values

lookup(key) → value

Distributed Hash Table

• Two key design decisions

  • How do we map names onto nodes?
  • How do we route a request to that node?

Hash Functions

• Hashing

  • Transform the key into a number
  • And use the number to index an array

• Example hash function

  • Hash(x) = x mod 101, mapping to 0, 1, …, 100

• Challenges

  • What if there are more than 101 nodes? Fewer?
  • Which nodes correspond to each hash value?
  • What if nodes come and go over time?
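The last challenge can be made concrete with a small experiment. The sketch below is illustrative only: the helper names `bucket` and `fraction_moved` are hypothetical, and the keys are plain integers. It counts how many keys land in a different bucket when the table grows from 101 to 102 buckets under simple modular hashing.

```python
def bucket(key: int, num_buckets: int) -> int:
    """Map an integer key to a bucket index with simple modular hashing."""
    return key % num_buckets


def fraction_moved(keys, old_buckets: int, new_buckets: int) -> float:
    """Fraction of keys whose bucket changes when the bucket count changes."""
    moved = sum(
        1 for k in keys if bucket(k, old_buckets) != bucket(k, new_buckets)
    )
    return moved / len(keys)


if __name__ == "__main__":
    keys = range(10_000)
    # Adding a single bucket (101 -> 102) reassigns almost every key,
    # which is the problem consistent hashing is meant to avoid.
    print(f"{fraction_moved(keys, 101, 102):.2%} of keys moved")
```

Only keys smaller than 101 keep their bucket in this run, so nearly all stored data would have to move after a single node is added.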

Consistent Hashing

[Figure: hash circle with labeled points 0, 4, 8, 12, and 14, and a bucket marked on the circle]

  • Construction
    • Assign each of C hash buckets to random points on a mod 2^n circle, where n is the hash key size
    • Map object to random position on circle
    • Hash of object = closest clockwise bucket
  • Desired features
    • Balanced: No bucket responsible for large number of objects
    • Smoothness: Addition of bucket does not cause movement among existing buckets
    • Spread and load: Small set of buckets that lie near object
  • Similar to that later used in P2P Distributed Hash Tables (DHTs)
    • In DHTs, each node only has partial view of neighbors
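The construction above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the class name `ConsistentHashRing` and the bucket and object labels are hypothetical, and SHA-1 is used only as a stand-in for "random" placement on the circle.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Toy ring: an object belongs to the closest clockwise bucket."""

    def __init__(self, bits: int = 64):
        self.size = 2 ** bits   # points live on a mod 2^n circle
        self.points = []        # sorted bucket positions
        self.names = {}         # position -> bucket name

    def _position(self, label: str) -> int:
        # Assumption of this sketch: a cryptographic hash approximates
        # a uniformly random position on the circle.
        digest = hashlib.sha1(label.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % self.size

    def add_bucket(self, name: str) -> None:
        pos = self._position(name)
        if pos in self.names:
            raise ValueError(f"position collision for bucket {name!r}")
        bisect.insort(self.points, pos)
        self.names[pos] = name

    def lookup(self, obj: str) -> str:
        if not self.points:
            raise LookupError("ring has no buckets")
        pos = self._position(obj)
        # First bucket at or after the object's position ...
        i = bisect.bisect_left(self.points, pos)
        # ... wrapping around to the first bucket past the end of the circle.
        return self.names[self.points[i % len(self.points)]]


if __name__ == "__main__":
    ring = ConsistentHashRing()
    for name in ("bucket-a", "bucket-b", "bucket-c"):
        ring.add_bucket(name)
    objects = [f"object-{i}" for i in range(200)]
    before = {o: ring.lookup(o) for o in objects}
    ring.add_bucket("bucket-d")
    moved = [o for o in objects if ring.lookup(o) != before[o]]
    # Smoothness: every object that moved went to the new bucket.
    print(all(ring.lookup(o) == "bucket-d" for o in moved))
```

Adding a bucket only takes objects from the bucket that was previously closest clockwise, so no object moves between two existing buckets.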

Consistent Hashing

• Large, sparse identifier space (e.g., 128 bits)

  • Hash a set of keys x uniformly to large id space
  • Hash nodes to the id space as well

Hash(name) → object_id
Hash(IP_address) → node_id

Id space represented as a ring
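As a rough illustration of hashing both names and addresses into one 128-bit ring, the sketch below truncates a SHA-256 digest to 128 bits. The choice of hash function, the helper names `to_id` and `clockwise_distance`, and the sample IP addresses are assumptions made for this example, not part of the slides.

```python
import hashlib

ID_BITS = 128
RING_SIZE = 1 << ID_BITS


def to_id(text: str) -> int:
    """Hash a name or an IP address string to a 128-bit identifier."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Keep the first 16 bytes (128 bits) of the digest.
    return int.from_bytes(digest[: ID_BITS // 8], "big")


def clockwise_distance(a: int, b: int) -> int:
    """Distance travelled going clockwise around the ring from a to b."""
    return (b - a) % RING_SIZE


if __name__ == "__main__":
    object_id = to_id("HitSong.mp3")
    # Hypothetical node addresses from a documentation address range.
    node_ids = {ip: to_id(ip) for ip in ("192.0.2.1", "192.0.2.2", "192.0.2.3")}
    # The responsible node is the one reached first going clockwise.
    owner = min(node_ids, key=lambda ip: clockwise_distance(object_id, node_ids[ip]))
    print(f"object id {object_id:032x} is stored at node {owner}")
```

Because object ids and node ids share one space, "which node stores this object" reduces to a distance comparison on the ring.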

Joins and Leaves of Nodes

• Maintain a circularly linked list around the ring

  • Every node has a predecessor and successor

[Figure: a node on the ring linked to its predecessor (pred) and successor (succ)]

Joins and Leaves of Nodes

• When an existing node leaves

  • Node copies its <key, value> pairs to its predecessor
  • Predecessor points to node’s successor in the ring

• When a node joins

  • Node does a lookup on its own id
  • And learns the node responsible for that id
  • This node becomes the new node’s successor
  • And the node can learn that node’s predecessor
    • which will become the new node’s predecessor
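The join and leave steps can be sketched with a circular doubly linked list. Everything here is hypothetical naming (`Node`, `join`, `leave`, `put`, `get`), node ids are assumed to be unique, and the sketch assumes the clockwise-successor ownership rule from the consistent hashing construction, so a departing node hands its pairs to its successor. Under a rule where the predecessor is responsible, the handoff would go to the predecessor instead.

```python
class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.pred = self   # a lone node is its own predecessor and successor
        self.succ = self
        self.store = {}    # key id -> value


def in_interval(x: int, lo: int, hi: int) -> bool:
    """True if x lies in the clockwise interval (lo, hi] on the ring."""
    if lo < hi:
        return lo < x <= hi
    # The interval wraps past zero, or covers the whole ring when lo == hi.
    return x > lo or x <= hi


def find_responsible(start: Node, ident: int) -> Node:
    """Walk successors until reaching the node that owns ident."""
    node = start
    while not in_interval(ident, node.pred.id, node.id):
        node = node.succ
    return node


def join(new: Node, bootstrap: Node) -> None:
    succ = find_responsible(bootstrap, new.id)  # lookup on the new node's own id
    pred = succ.pred                            # becomes the new node's predecessor
    new.succ, new.pred = succ, pred
    pred.succ = new
    succ.pred = new
    # Take over the keys that now fall in (pred.id, new.id].
    for k in [k for k in succ.store if in_interval(k, pred.id, new.id)]:
        new.store[k] = succ.store.pop(k)


def leave(node: Node) -> None:
    node.succ.store.update(node.store)  # hand pairs to the next owner
    node.pred.succ = node.succ          # splice the node out of the ring
    node.succ.pred = node.pred


def put(start: Node, key: int, value) -> None:
    find_responsible(start, key).store[key] = value


def get(start: Node, key: int):
    return find_responsible(start, key).store.get(key)
```

For example, after `a, b = Node(10), Node(40)` and `join(b, a)`, a call to `put(a, 25, "x")` stores the pair at `b`, and `get(a, 25)` still finds it after `leave(b)` because the pair has moved to `a`.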

How to Find the Nearest Node?

• Need to find the closest node

  • To determine who should store (key, value) pair
  • To direct a future lookup(key) query to the node

• Strawman solution: walk through linked list

  • Circular linked list of nodes in the ring
  • O(n) lookup time when n nodes in the ring

• Alternative solution:

  • Jump further around ring
  • “Finger” table of additional overlay links
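The strawman can be sketched as follows, to show where the O(n) cost comes from. The ring is modeled as a sorted list of node ids with implicit successor links, and the names `owns` and `walk_lookup` are hypothetical. The responsible node is assumed to be the first one at or clockwise after the key.

```python
def owns(node_id: int, pred_id: int, key: int) -> bool:
    """True if key falls in the clockwise interval (pred_id, node_id]."""
    if pred_id < node_id:
        return pred_id < key <= node_id
    # The interval wraps past zero, or the ring has a single node.
    return key > pred_id or key <= node_id


def walk_lookup(ring, start: int, key: int):
    """Follow successor links from index start; return (owner id, hops)."""
    hops = 0
    i = start
    # ring[i - 1] is the predecessor; index -1 wraps to the last node.
    while not owns(ring[i], ring[i - 1], key):
        i = (i + 1) % len(ring)
        hops += 1
    return ring[i], hops


if __name__ == "__main__":
    ring = list(range(0, 1000, 10))   # 100 nodes with ids 0, 10, ..., 990
    owner, hops = walk_lookup(ring, 0, 985)
    print(owner, hops)                # worst case: one hop per node passed
```

With 100 nodes, a query that starts at the node just past the owner has to visit every other node first, which motivates the extra overlay links.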

Links in the Overlay Topology

  • Trade-off between # of hops vs. # of neighbors
    • E.g., log(n) for both, where n is the number of nodes
    • E.g., overlay links that reach 1/2, 1/4, 1/8, … of the way around the ring
    • Each hop traverses at least half of the remaining distance
