Hash Table Implementations: Chaining and Open Addressing, Study notes of Data Structures and Algorithms

These study notes give an overview of hash table implementations using chaining and open addressing. They cover hash functions, hash table size, collision handling, and retrieval for both techniques, with example tables and their probe sequences.


Uploaded on 02/13/2009

Copyright © 1996 Hanan Samet
These notes may not be reproduced by any means (mechanical or electronic or any other) without the express written permission of Hanan Samet
HASHING METHODS
Hanan Samet
Computer Science Department and
Center for Automation Research and
Institute for Advanced Computer Studies
University of Maryland
College Park, Maryland 20742

HASHING OVERVIEW

  • Task: compare the value of a key with a set of key values in a table
  • Conventional solutions:
    1. use a comparison on key values (tree-based)
    2. branching process governed by the digits comprising the key value (trie-based)
  • Alternative solution is to find a 1-1 mapping (i.e., function) from the set of possible key values to a memory address and use table lookup methods to retrieve the record: an O(1) process
  • Problem: the set of possible key values is much larger than the number of available memory addresses
    1. developing the 1-1 function h is time-consuming as it requires puzzle-solving abilities
      • result is called a perfect hashing function
    2. once h is found, addition of a single key value may render the function meaningless
      • need to develop it anew
    3. can replace h by a program, which may itself be time-consuming to compute
  • Result: usually abandon goal of finding 1-1 mapping and use a special method to resolve any ambiguity (i.e., when more than one key value is mapped to the same address, termed a collision)
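As a small illustration of a collision, the sketch below assumes the hash function h(k) = k mod m with m = 7, which is what the example tables later in these notes appear to use. The names `M`, `h`, `addresses`, and `colliding` are chosen here for illustration only.

```python
M = 7  # table size assumed from the later examples (locations 0..6)


def h(k: int) -> int:
    """Map a key value to a table address; this is not a 1-1 mapping."""
    return k % M


# Key values taken from the example tables in these notes.
keys = {"JIM": 49, "JOHN": 22, "RAY": 30, "SUZY": 3, "JANE": 14}
addresses = {name: h(k) for name, k in keys.items()}

# Every name whose key hashes to the same address as JIM collides with JIM.
colliding = [name for name, a in addresses.items() if a == addresses["JIM"]]
```

Under these assumptions JIM(49) and JANE(14) share address 0, which is the ambiguity that the collision-resolution methods in the following slides address.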

Copyright © 1998 by Hanan Samet

SEPARATE CHAINING

  • Hash table of size m
  • One chain (linked list) for each of the m hash values, containing all elements that hash to that location (known as a collision list)
  • Hash chains are known as buckets
  • Hash table locations are known as bucket addresses
  • For n key values, average chain size is n/m
  • Retrieval
    1. use sequential search through chain
    2. speed up unsuccessful search by sorting chain by key value
    3. speed up successful search by self-organizing methods
      • move key value to start of chain each time it is accessed
  • Ex:

    h(k)  NAME  k=KEY  NEXT
    0     JIM   49     Λ
    1     JOHN  22     Λ
    2     RAY   30     Λ
    3     SUZY  3      Λ
    4
    5
    6


  1. add JANE(14) → 0 (new chain element JANE 14 Λ, linked from location 0)
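The retrieval rules above can be sketched as follows. This is a minimal illustration, assuming h(k) = k mod m with m = 7 as in the example and using a Python list per bucket in place of a linked list; the class name `ChainedTable` and its methods are hypothetical.

```python
from typing import Optional


class ChainedTable:
    """Separate chaining sketch: one Python list (chain) per bucket address."""

    def __init__(self, m: int = 7) -> None:
        self.m = m
        self.buckets = [[] for _ in range(m)]  # m initially empty chains

    def insert(self, key: int, name: str) -> None:
        chain = self.buckets[key % self.m]
        if all(k != key for k, _ in chain):
            chain.append((key, name))  # new element goes at the end of its chain

    def search(self, key: int) -> Optional[str]:
        chain = self.buckets[key % self.m]
        for pos, (k, name) in enumerate(chain):  # sequential search through chain
            if k == key:
                # Self-organizing step: move the accessed element to the front.
                chain.insert(0, chain.pop(pos))
                return name
        return None


t = ChainedTable()
for name, key in [("JIM", 49), ("JOHN", 22), ("RAY", 30), ("SUZY", 3), ("JANE", 14)]:
    t.insert(key, name)
```

With the five example keys, the chain at bucket address 0 holds JIM followed by JANE, and a successful search for JANE moves her to the front of that chain.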


IN-PLACE CHAINING

  • When m is large, many of the chains are empty
  • Use empty locations in table for the chain
  • Must be able to distinguish between free and occupied locations
  • Insertion algorithm:
    1. if key value not present, then allocate a free location
    2. link location to chain which was unsuccessfully searched
  • Ex:

    h(k)  NAME  k=KEY  NEXT
    0     JIM   49     Λ
    1     JOHN  22     Λ
    2     RAY   30     Λ
    3     SUZY  3      Λ
    4
    5
    6


  1. add JANE(14) → 0, which collides with JIM(49) → 0; JANE is stored at free location 6 (JANE 14 Λ) and NEXT of location 0 is set to 6


  2. add LUCY(41) → 6, which collides with JANE(14) → 0, which is stored at 6; LUCY is stored at location 5 (LUCY 41 Λ) and NEXT of location 6 is set to 5
    • results in coalescing of the chains of JANE and LUCY, making unsuccessful search longer as several chains must be searched


  • Can avoid coalescing by moving JANE just before adding LUCY: JANE(14) moves to location 5 with NEXT of location 0 set to 5, and LUCY(41) is stored at location 6 with NEXT Λ


IN-PLACE CHAINING INSERTION ALGORITHM

location procedure CHAINING_WITH_COALESCING_INSERTION(k);
/* Return the location of key k, inserting it if it is not present */
begin
  value key k;
  integer i;
  global integer r; /* r is the most recently allocated location */
  global hashtable table;
  i←h(k);
  if OCCUPIED(table[i]) then
    begin
      /* search the chain that starts at h(k) */
      while NOT(NULL(NEXT(table[i]))) do
        begin
          if k=KEY(table[i]) then return(i)
          else i←NEXT(table[i]);
        end;
      if k=KEY(table[i]) then return(i);
      /* k is not present: allocate a free location and link it to the chain */
      while OCCUPIED(table[r]) do r←r-1;
      if r≤0 then return('OVERFLOW')
      else
        begin
          NEXT(table[i])←r;
          i←r;
        end;
    end;
  MARK(table[i],'OCCUPIED');
  KEY(table[i])←k;
  NEXT(table[i])←NIL;
  return(i);
end;
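The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not a transcription: it assumes h(k) = k mod m with locations 0 to m-1, uses `None` for both a free location and NIL, and checks the bounds of `r` before reading the table, so it treats location 0 as allocatable, unlike the `r≤0` test in the pseudocode. The class name `CoalescedTable` is hypothetical.

```python
class CoalescedTable:
    """In-place chaining with coalescing, sketched with parallel lists."""

    def __init__(self, m: int = 7) -> None:
        self.m = m
        self.key = [None] * m   # None marks a free location
        self.next = [None] * m  # None plays the role of NIL
        self.r = m - 1          # free-location scan pointer, moves downward

    def insert(self, k: int) -> int:
        """Return the location of k, inserting it if it is not present."""
        i = k % self.m
        if self.key[i] is not None:
            # Search the chain that starts at h(k).
            while True:
                if self.key[i] == k:
                    return i
                if self.next[i] is None:
                    break
                i = self.next[i]
            # k is not present: scan downward from r for a free location.
            while self.r >= 0 and self.key[self.r] is not None:
                self.r -= 1
            if self.r < 0:
                raise OverflowError("hash table is full")
            self.next[i] = self.r  # link the new location to the searched chain
            i = self.r
        self.key[i] = k
        self.next[i] = None
        return i


t = CoalescedTable()
locations = [t.insert(k) for k in (49, 22, 30, 3, 14, 41)]
```

With the example keys, JANE(14) lands in location 6 and LUCY(41) in location 5, linked as 0 → 6 → 5, which reproduces the coalesced chains described in the example.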


LAMPSON'S IN-PLACE CHAINING

  • Avoid extra space for NEXT field by not storing entire key value with record
  • k = m · q(k) + h(k), q(k) = ⌊k/m⌋, h(k) = k mod m
  • Store q(k) in table instead of k
  • Can compute k given m, q(k), and h(k)
  • Ex: 0 ≤ k < 2^32, with the key split into fields q(k) (bits 0 to 21) and h(k) (bits 22 to 31)
  • Since only compare q(k), all elements in same collision list must have the same value of h(k) and thus no coalescing is allowed
  • Data structure:
    1. circular collision lists
    2. flag FIRST denoting if first element on collision list
    3. pointer NEXT to next element in circular list with same h(k) value
  • Ex:

    h(k)  NAME  k=KEY  FIRST  q(k)  NEXT
    0     JIM   49     T      7     0
    1     JOHN  22     T      3     1
    2     RAY   30     T      4     2
    3     SUZY  3      T      0     3
    4
    5
    6
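The decomposition k = m · q(k) + h(k) can be checked with a short sketch. It assumes m = 7 as in the example table; the function names `split` and `rebuild` are hypothetical and chosen only for this illustration.

```python
from typing import Tuple

M = 7  # table size assumed from the example (locations 0..6)


def split(k: int, m: int = M) -> Tuple[int, int]:
    """Return (q(k), h(k)), where q(k) = floor(k/m) and h(k) = k mod m."""
    return divmod(k, m)


def rebuild(q: int, hk: int, m: int = M) -> int:
    """Recover the full key from the stored quotient and its bucket address."""
    return m * q + hk


# Key values from the examples in these notes.
parts = {k: split(k) for k in (49, 22, 30, 3, 14, 41)}
```

Because the bucket address supplies h(k), only q(k) has to be stored, which is why every element of a collision list must share the same h(k).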


  1. add JANE(14) → 0: JANE is stored at location 6 with FIRST = F, q(k) = 2, NEXT = 0, and NEXT of location 0 is set to 6


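  2. add LUCY(41) → 6 but 6 contains JANE
    • if at least one element of the hash chain starting at 6 exists, then it must be stored there
    • must move JANE as it does not belong in 6
    • result: JANE moves to location 5 (FIRST = F, q(k) = 2, NEXT = 0), NEXT of location 0 is set to 5, and LUCY is stored at location 6 (FIRST = T, q(k) = 5, NEXT = 6)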


OPEN ADDRESSING

  • Like chaining but NEXT link field is open or unspecified
  • Probe sequence: set of locations comprising collision list of a key
  • Goal: cycle through all locations with little or no duplication
  • Linear probing: h(k), h(k)+1, h(k)+2, …, m-1, 0, 1, …, h(k)-1
  • Insertion Algorithm:
    1. calculate hash address i
    2. if TABLE(i) is empty then insert and exit; else i ← (i+1) mod m and repeat step 2 until exhausting TABLE
  • Ex:

    h(k)  NAME  k=KEY
    0     JIM   49
    1     JOHN  22
    2     RAY   30
    3     SUZY  3
    4
    5
    6
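The insertion algorithm above can be sketched as follows, assuming h(k) = k mod m and using `None` to mark an empty location. The function name `linear_probe_insert` is hypothetical. Like the two steps above, the sketch does not check whether the key is already present.

```python
from typing import List, Optional


def linear_probe_insert(table: List[Optional[int]], k: int) -> int:
    """Insert k with linear probing and return the location used."""
    m = len(table)
    i = k % m              # step 1: calculate the hash address
    for _ in range(m):     # step 2: at most m probes before TABLE is exhausted
        if table[i] is None:
            table[i] = k
            return i
        i = (i + 1) % m    # next location in the cyclic probe sequence
    raise OverflowError("table is full")


table: List[Optional[int]] = [None] * 7
for key in (49, 22, 30, 3):
    linear_probe_insert(table, key)
jane_location = linear_probe_insert(table, 14)
lucy_location = linear_probe_insert(table, 41)
```

With the example keys, 14 hashes to 0 and probes locations 0, 1, 2, 3 before settling in 4, while 41 goes directly to 6.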


  1. adding JANE(14) → 0 yields a collision; cyclic probe sequence causes its insertion in 4 (location 4: JANE 14)


  2. adding LUCY(41) → 6 (location 6: LUCY 41)


  3. delete RAY(30) → 2, leaving location 2 unoccupied


  • Problem: a lookup of JANE now fails. JANE(14) collides at location 0, and the probe sequence 0, 1, 2 stops at location 2, which is now unoccupied, before reaching JANE at location 4
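The failed lookup can be reproduced with a short sketch. It assumes h(k) = k mod m with m = 7, marks an empty location with `None`, and deletes RAY by simply emptying the location; the helper names `insert` and `search` are hypothetical.

```python
from typing import List, Optional


def insert(table: List[Optional[int]], k: int) -> int:
    """Linear-probing insertion; returns the location used."""
    m = len(table)
    i = k % m
    for _ in range(m):
        if table[i] is None:
            table[i] = k
            return i
        i = (i + 1) % m
    raise OverflowError("table is full")


def search(table: List[Optional[int]], k: int) -> Optional[int]:
    """Follow the probe sequence of k; an empty location ends the search."""
    m = len(table)
    i = k % m
    for _ in range(m):
        if table[i] is None:
            return None  # unoccupied location: k is assumed to be absent
        if table[i] == k:
            return i
        i = (i + 1) % m
    return None


table: List[Optional[int]] = [None] * 7
for key in (49, 22, 30, 3, 14, 41):
    insert(table, key)

found_before = search(table, 14)  # JANE is reachable via 0, 1, 2, 3, 4
table[2] = None                   # naive deletion of RAY(30)
found_after = search(table, 14)   # probe stops at the now-empty location 2
```

Before the deletion the search reaches JANE at location 4; afterwards it stops at location 2 and reports her as absent, even though she is still stored in the table.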
