Distributed Hash Tables: An Overview of Chord and Related Systems

This is an introduction to Distributed Hash Tables (DHTs), focusing on the Chord system. It covers the overall concept of DHTs, their benefits and drawbacks, and the Chord ring structure. It also discusses routing algorithms, adding nodes, node failure, and security issues.

Introduction to Distributed Hash Tables

Eric Rescorla

Network Resonance

[email protected]

IAB Plenary, IETF 65

Overall Concept

Distributed Hash Table (DHT)

Distribute data over a large P2P network

Quickly find any given item

Can also distribute responsibility for data storage

What’s stored is key/value pairs

The key determines which node(s) store the value

Each node is responsible for some section of the space

Basic operations

Store(key, val)

val = Retrieve(key)
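The two basic operations can be sketched as a tiny in-memory model. This is an illustration only: the class name `ToyDHT`, the 16-bit identifier space, the use of SHA-1, and the equal-sized sections per node are all assumptions made for the example, not details from the slides.

```python
import hashlib

ID_BITS = 16  # assumed size of the identifier space: IDs run from 0 to 2**16 - 1


def key_id(key: str) -> int:
    """Hash a key into the identifier space (SHA-1 reduced modulo 2**ID_BITS)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


class ToyDHT:
    """All 'nodes' live in one process; each owns an equal section of the space."""

    def __init__(self, node_count: int) -> None:
        self.node_count = node_count
        self.nodes = [dict() for _ in range(node_count)]  # one key/value store per node

    def node_for(self, key: str) -> int:
        # The hash of the key, not the caller, decides which node holds the value.
        return key_id(key) * self.node_count // (2 ** ID_BITS)

    def store(self, key: str, val: str) -> None:
        self.nodes[self.node_for(key)][key] = val

    def retrieve(self, key: str):
        return self.nodes[self.node_for(key)].get(key)


dht = ToyDHT(node_count=4)
dht.store("alice", "192.0.2.10")
print(dht.retrieve("alice"))    # 192.0.2.10
print(dht.retrieve("missing"))  # None
```

In a real DHT the nodes are separate machines and `node_for` is replaced by routing through the overlay.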

The Chord Ring

[Figure: the identifier ring, running up to 2^n − 1, with nodes A, B, C, and D placed on it. Arcs mark B's responsibility, C's responsibility, and D's responsibility.]
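A minimal sketch of how a key maps to the node responsible for it, assuming Chord's convention that a key belongs to the first node at or clockwise after the key's position. The 6-bit ring and the node positions below are made up for illustration.

```python
from bisect import bisect_left

RING_BITS = 6                 # assumed toy ring: IDs 0 .. 2**6 - 1
RING_SIZE = 2 ** RING_BITS

# Hypothetical node positions on the ring.
NODES = {"A": 5, "B": 20, "C": 37, "D": 52}


def responsible_node(key_id: int) -> str:
    """Return the first node at or clockwise after key_id."""
    names = sorted(NODES, key=NODES.get)
    positions = [NODES[name] for name in names]
    index = bisect_left(positions, key_id % RING_SIZE)
    return names[index % len(names)]  # past the last node, wrap around to the first


print(responsible_node(6))   # B (IDs 6..20 are B's section)
print(responsible_node(20))  # B
print(responsible_node(21))  # C
print(responsible_node(60))  # A (wraps past 2**6 - 1 back to the start)
```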

Routing

Naive routing algorithm

∗ Each node knows its neighbors

Send message to nearest neighbor

Hop-by-hop from there

Obviously this is O(n)

So no good

Better algorithm: “finger table”

∗ Memorize locations of other nodes in the ring

a, a + 2, a + 4, a + 8, a + 16, ..., a + 2^n

∗ Send message to closest node to destination

Hop-by-hop again

This is O(log n)
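The finger-table idea above can be sketched as a small simulation. It assumes the usual Chord convention, in which finger i of node a is the node responsible for a + 2^i (mod 2^m) for i = 0 .. m − 1. A sorted list of node IDs stands in for the network, and all function names are hypothetical.

```python
import random
from bisect import bisect_left

RING_BITS = 16
RING_SIZE = 2 ** RING_BITS


def successor(nodes, ident):
    """First node at or clockwise after ident; nodes is a sorted list of IDs."""
    return nodes[bisect_left(nodes, ident % RING_SIZE) % len(nodes)]


def between(x, a, b, inclusive_end=False):
    """True if x lies clockwise in (a, b), or in (a, b] when inclusive_end is set."""
    if inclusive_end and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b   # the interval wraps past zero


def fingers(nodes, a):
    """Finger table of node a: the nodes responsible for a+1, a+2, a+4, ..."""
    return [successor(nodes, a + 2 ** i) for i in range(RING_BITS)]


def lookup(nodes, start, key_id):
    """Route from node start toward key_id; return (responsible node, hops)."""
    node, hops = start, 0
    while not between(key_id, node, successor(nodes, node + 1), inclusive_end=True):
        # Forward to the farthest finger that does not overshoot the key.
        node = next(f for f in reversed(fingers(nodes, node)) if between(f, node, key_id))
        hops += 1
    return successor(nodes, node + 1), hops


random.seed(65)
nodes = sorted(random.sample(range(RING_SIZE), 1000))
results = [lookup(nodes, random.choice(nodes), random.randrange(RING_SIZE))
           for _ in range(200)]
print("nodes:", len(nodes), "max hops over 200 lookups:", max(h for _, h in results))
```

Each forwarding step at least halves the remaining distance to the key, so a lookup never needs more than `RING_BITS` hops in this model, compared with up to n − 1 hops when each node knows only its neighbors.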

Adding a node

[Figure: the identifier ring, running up to 2^n − 1, with nodes A, B, C, D and a newly added node X. Arcs mark B's, C's, D's, and X's responsibility.]
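A rough sketch of what a join involves, assuming the rule that a key belongs to the first node at or after it on the ring: the new node takes over part of one existing node's section, and the keys in that part move to it. The node positions, key IDs, and the `join` helper are hypothetical.

```python
from bisect import bisect_left, insort


def owner(nodes, key_id):
    """First node at or clockwise after key_id; nodes is a sorted list of IDs."""
    return nodes[bisect_left(nodes, key_id) % len(nodes)]


def join(nodes, stored, new_node):
    """Add new_node to the ring and hand over the keys it now owns.

    nodes  -- sorted list of node IDs (modified in place)
    stored -- dict mapping node ID -> {key_id: value} (modified in place)
    Returns the sorted key IDs that moved. Assumes new_node is not yet in nodes.
    """
    old_owner = owner(nodes, new_node)   # node that covered this position so far
    insort(nodes, new_node)
    stored[new_node] = {}
    moved = [k for k in stored[old_owner] if owner(nodes, k) == new_node]
    for k in moved:
        stored[new_node][k] = stored[old_owner].pop(k)
    return sorted(moved)


nodes = [5, 20, 37, 52]                                   # A, B, C, D
stored = {5: {}, 20: {}, 37: {}, 52: {40: "p", 44: "q", 50: "r"}}
print(join(nodes, stored, 45))                            # [40, 44]
print(stored[52])                                         # {50: 'r'}
```

Only one existing node gives up data; the other nodes are unaffected, which is why load spreads to new nodes without global reshuffling.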

Node Failure

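[Figure: three panels, each showing the ring from 0 to 2^n − 1 with nodes A, B, C, and D and a stored data item. "Before": node X is also on the ring, and D's and X's responsibility are marked. "X Fails": X is gone and D's responsibility is marked. "After Stabilization": D's responsibility is marked.]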

Data must be replicated to survive node failure.
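One common way to replicate, sketched here under assumptions, is to write each value to the responsible node and to the next node(s) clockwise, so that whichever node inherits the section after a failure already holds a copy. The replication factor, node positions, and helper name are illustrative and not taken from the slides.

```python
from bisect import bisect_left

REPLICAS = 2  # assumed replication factor


def replica_nodes(nodes, key_id, count=REPLICAS):
    """The responsible node plus the next count-1 nodes clockwise."""
    first = bisect_left(nodes, key_id) % len(nodes)
    return [nodes[(first + i) % len(nodes)] for i in range(min(count, len(nodes)))]


nodes = [5, 20, 37, 45, 52]          # 45 plays the role of X, 52 the role of D
stored = {n: {} for n in nodes}
for n in replica_nodes(nodes, 40):   # key 40 goes to X and to its successor D
    stored[n][40] = "data"

# X fails: its local store is lost and the ring closes around it.
del stored[45]
nodes.remove(45)

new_owner = replica_nodes(nodes, 40, count=1)[0]
print(new_owner, stored[new_owner].get(40))  # 52 data
```

Without the second copy, the key would now map to node 52, which would have nothing stored for it.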

What DHTs are good at

Distributed storage of things with known names

Highly scalable

Automatically distributes load to new nodes

Robust against node failure

...except for bootstrap nodes

Data automatically migrated away from failed nodes

Self-organizing

No need for a central server

What DHTs are bad at

Searching

Consequence of hash algorithm

“abc” and “abcd” are at totally different nodes

Warning: DHT people call lookup “search”

Security problems

Hard to verify data integrity

Secure routing is an open problem
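The searching limitation follows directly from hashing, as a short sketch shows. It assumes SHA-1 as the hash and a 16-bit ring, both chosen only for illustration: keys that share a prefix get unrelated ring positions, so there is no region of the ring to scan for "everything starting with abc".

```python
import hashlib


def ring_id(key: str, bits: int = 16) -> int:
    """Map a key to a ring position using the top bits of its SHA-1 digest."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


for key in ("abc", "abcd"):
    print(key, ring_id(key))
```

An exact-match lookup for "abc" works because the requester can recompute the same hash; a prefix or substring query has no such shortcut.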

DDNS [CMM02] and CoDoNS [RS04]

Obvious approach: Each DNS name becomes a DHT entry

e.g., www.example.com:A

(Just a conceptual example)

DDNS

Based on Chord

Inferior performance to DNS (log N lookup cost)

CoDoNS

Based on Beehive

O(1) performance due to aggressive replication

Probably unrealistic memory requirements on each node

Both use DNSSEC for security

Performance Under Attack

[Figure: percent failed queries under attack, for DNS (attack on root nodes) and for Chord (attack on a continuous subspace).]

Data/Figure from Pappas et al. [PMTZ06]

Example Application: Peer-to-Peer VoIP

Skype Envy

Reduce network operational costs

Avoid having (paying) a service provider

VoIP when there’s no Internet connectivity

Scalability

Anonymous Calling

What’s the problem?

SIP is already mostly P2P

SIP UAs can already connect directly to each other

But in practice they go through a centralized server

Modulo firewall and NAT traversal issues

The problem is locating the right peer to connect to

∗ Currently this is done with DNS

Works fine with stable centralized servers

But how do you look up the location of unstable peers?

∗ What about dynamic DNS?

Concerns about performance

What if you’re disconnected from the Internet?

Overview of Security Issues

Data correctness

Correctness of routing

Fairness and detecting defection

DoS

Data Correctness

Storing nodes have no relationship to data owner

What stops me from overwriting data?

Nothing!

And how do I know it’s right when I get it?

General approach: make sure data is verifiable

Self-certifying (e.g., k = SHA(data))

Externally signed
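The self-certifying approach can be sketched in a few lines. SHA-256 is used here as a stand-in because the slide does not say which SHA variant is meant; the function names and the sample values are hypothetical.

```python
import hashlib


def self_certifying_key(data: bytes) -> str:
    """Derive the key from the data itself: k = SHA(data)."""
    return hashlib.sha256(data).hexdigest()


def verify(key: str, data: bytes) -> bool:
    """Accept retrieved data only if it hashes back to the key it was requested under."""
    return self_certifying_key(data) == key


table = {}  # stands in for the untrusted storing nodes
original = b"sip:alice@192.0.2.10"
k = self_certifying_key(original)
table[k] = original

print(verify(k, table[k]))   # True

table[k] = b"sip:mallory@198.51.100.66"   # a storing node overwrites the value
print(verify(k, table[k]))   # False
```

The check needs no trust in the storing node, but it only works when the key can be derived from the data. Values that change under a fixed name need the externally signed approach instead.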
