Linked Lists and Hash Tables: Variants and Collision Strategies, Slides of Algorithms and Programming

These slides cover various types of linked lists, including singly linked lists, circular linked lists, and doubly linked lists, as well as hash tables and their collision strategies, such as linear probing and chaining. They include examples and explanations of how these data structures work and their respective advantages and disadvantages.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

netii

More Linked Lists

In some applications, it is convenient to keep access to both the first node and the last node in the list.

This would work nicely for a linked-list implementation of a queue (but we'll see an alternative later).

[Figure: list object L with data members first, last, and mySize = 5]
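As a rough illustration of why keeping both pointers helps, here is a minimal sketch of a linked queue. The names Node, LinkedQueue, enqueue, and dequeue are assumptions made for this example; only first, last, and mySize come from the slide.

```cpp
#include <cstddef>

// Hypothetical node type for a singly linked list of ints.
struct Node {
    int data;
    Node* next;
};

// Hypothetical queue that keeps pointers to both ends of the list.
class LinkedQueue {
public:
    LinkedQueue() : first(nullptr), last(nullptr), mySize(0) {}
    ~LinkedQueue() { while (!empty()) dequeue(); }
    LinkedQueue(const LinkedQueue&) = delete;
    LinkedQueue& operator=(const LinkedQueue&) = delete;

    bool empty() const { return first == nullptr; }
    std::size_t size() const { return mySize; }

    // Add at the back in O(1): no traversal is needed because of last.
    void enqueue(int value) {
        Node* node = new Node{value, nullptr};
        if (empty())
            first = node;
        else
            last->next = node;
        last = node;
        ++mySize;
    }

    // Remove from the front in O(1). Precondition: the queue is not empty.
    int dequeue() {
        Node* node = first;
        int value = node->data;
        first = first->next;
        if (first == nullptr)   // the queue just became empty
            last = nullptr;
        delete node;
        --mySize;
        return value;
    }

private:
    Node* first;
    Node* last;
    std::size_t mySize;
};
```

Without the last pointer, enqueue would have to walk the whole list to find the back.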

Linked List Variants (§9.1)

Trailer Nodes

Sometimes a (dummy) trailer node is also used so that every node has a successor. (If the data portion of an element is large, two or more lists can share the same trailer node.)

Not very common.

[Figure: first points to a linked list holding 9, 17, 22, 26, 34; nodes marked "?" are dummy nodes]

Circular Linked List

In other applications a circular linked list is used; instead of the last node containing a null pointer, it contains a pointer to the first node in the list.

For such lists, one can use a single pointer to the last node in the list, because then one has direct access to it and "almost-direct" access to the first node.

[Figure: last points to the final node of the circular list 9, 17, 22, 26, 34]

Each node in a circular linked list has a predecessor (and a successor), provided that the list is nonempty.

Insertion and deletion do not require special consideration of the first node. This is a good implementation for a linked queue or for any problem in …
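The single-pointer idea can be sketched as a queue built on a circular list. This is only an illustration under assumptions: the names Node, CircularQueue, enqueue, and dequeue are hypothetical and do not come from the slides.

```cpp
// Hypothetical node type for a circular singly linked list of ints.
struct Node {
    int data;
    Node* next;
};

// Hypothetical queue that stores only a pointer to the last node.
// The first node is always reachable as last->next.
class CircularQueue {
public:
    CircularQueue() : last(nullptr) {}
    ~CircularQueue() { while (!empty()) dequeue(); }
    CircularQueue(const CircularQueue&) = delete;
    CircularQueue& operator=(const CircularQueue&) = delete;

    bool empty() const { return last == nullptr; }

    // Add at the back: the new node becomes the last node.
    void enqueue(int value) {
        Node* node = new Node{value, nullptr};
        if (empty()) {
            node->next = node;         // a one-node ring points to itself
        } else {
            node->next = last->next;   // new node points to the first node
            last->next = node;
        }
        last = node;
    }

    // Remove from the front. Precondition: the queue is not empty.
    int dequeue() {
        Node* front = last->next;
        int value = front->data;
        if (front == last)             // removing the only node
            last = nullptr;
        else
            last->next = front->next;  // bypass the old first node
        delete front;
        return value;
    }

private:
    Node* last;
};
```

Both operations take constant time, since the back is last and the front is one step away at last->next.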

Circularly Linked Lists

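Traversal must be modified: looking for the end of the list as signalled by a null pointer would cause an infinite loop, because no node in a circular list holds a null pointer. Instead, stop when the traversal returns to the node where it started.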
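A minimal sketch of such a traversal, assuming the list is reached through a pointer to its last node (null for an empty list). The names Node and toVector are made up for this example.

```cpp
#include <vector>

// Hypothetical node type for a circular singly linked list of ints.
struct Node {
    int data;
    Node* next;
};

// Collect the values of a circular list in order, first node to last node.
// The loop ends when it comes back to the first node, not on a null pointer.
std::vector<int> toVector(const Node* last) {
    std::vector<int> values;
    if (last == nullptr)             // empty list: nothing to visit
        return values;
    const Node* ptr = last->next;    // the first node
    do {
        values.push_back(ptr->data);
        ptr = ptr->next;
    } while (ptr != last->next);     // stop once we are back at the start
    return values;
}
```

A do-while loop fits here because a nonempty circular list always has at least one node to visit before the stopping test makes sense.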

Like other operations, deletion must also be slightly modified.

Deleting the last remaining node is signalled when the node being deleted points to itself.

    if (first == 0)                    // list is empty
    {
        // Signal that the list is empty
    }
    else
    {
        ptr = predptr->next;           // hold node for deletion
        if (ptr == predptr)            // one-node list
            first = 0;
        else                           // list with 2 or more nodes
        {
            predptr->next = ptr->next; // bypass the node
            if (ptr == first)          // keep first from dangling
                first = ptr->next;
        }
        delete ptr;
    }

Doubly Linked Lists (§9.4)

All of these lists, however, are uni-directional; we can only move from one node to its successor.

In many applications, bidirectional movement is necessary. In this case, each node has two pointers: one to its successor (null if there is none) and one to its predecessor (null if there is none). Such a list is commonly called a doubly-linked (or symmetrically-linked) list.

e.g., §9.4 BigInt

[Figure: list object L with first, last, and mySize = 5; each node has prev and next pointers]
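To make the two-pointer node concrete, here is a small sketch of a doubly-linked node with insertion and traversal in both directions. The names DNode, insertAfter, forward, backward, and destroy are assumptions for illustration; only prev, next, and the data portion come from the slide.

```cpp
#include <vector>

// Hypothetical doubly-linked node: one pointer in each direction.
struct DNode {
    int data;
    DNode* prev;   // predecessor, null if there is none
    DNode* next;   // successor, null if there is none
};

// Insert a new node holding value right after pred (pred must not be null).
DNode* insertAfter(DNode* pred, int value) {
    DNode* node = new DNode{value, pred, pred->next};
    if (pred->next != nullptr)
        pred->next->prev = node;   // old successor now points back to node
    pred->next = node;
    return node;
}

// Walk from the first node to the end using next pointers.
std::vector<int> forward(const DNode* first) {
    std::vector<int> values;
    for (const DNode* p = first; p != nullptr; p = p->next)
        values.push_back(p->data);
    return values;
}

// Walk from the last node to the front using prev pointers.
std::vector<int> backward(const DNode* last) {
    std::vector<int> values;
    for (const DNode* p = last; p != nullptr; p = p->prev)
        values.push_back(p->data);
    return values;
}

// Free every node starting at first.
void destroy(DNode* first) {
    while (first != nullptr) {
        DNode* next = first->next;
        delete first;
        first = next;
    }
}
```

Every insertion has to update up to four pointers instead of two, which is part of the overhead of the doubly-linked form.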

Doubly Linked Rings (§9.5)

And of course, we could modify this doubly-linked list so that both lists are circular, forming a doubly-linked ring.

[Figure: list object L with first, last, and mySize = 5; ring holding 9, 17, 22, 26, 34]

Add a head node and we have the implementation used in STL's list class.

Other variations: §9. Multiply-ordered lists; lists of lists (LISP)

STL list Class Template

[Figure: list object L with first, last, and mySize = 5; each node has prev, data, and next fields; values shown: 9, 17, 22, 26]

list is a sequential container optimized for insertion and erasure at arbitrary points in the sequence.

Implementation: circular doubly-linked list with head node.

Caveat: Ease of use comes at the cost of the significant overhead of doubly-linked lists.

Moral: Be aware of what you are using and its costs/benefits.
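A short sketch of insertion and erasure at arbitrary points using std::list. The function name buildExample and the sample values are chosen for illustration only.

```cpp
#include <iterator>
#include <list>

// Build a small list and edit it in the middle and at the front.
std::list<int> buildExample() {
    std::list<int> values{9, 17, 26, 34};

    auto it = std::next(values.begin(), 2);   // iterator to 26
    values.insert(it, 22);    // insert before 26: 9 17 22 26 34
    values.erase(it);         // it still refers to 26: 9 17 22 34
    values.push_front(5);     // 5 9 17 22 34
    return values;
}
```

Inserting into a std::list does not invalidate existing iterators, which is why it can still be passed to erase after the insertion. Reaching the position in the first place still takes a linear walk, since std::list has no random access.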

Hash Tables

Suppose up to 25 integers in the range 0 through 999 are to be stored in a hash table.

This hash table can be implemented as an integer array table in which each array element is initialized with some dummy value, such as -1.

If we use each integer i in the set as an index, that is, if we store i in table[i], then to determine whether a particular integer number has been stored, we need only check if table[number] is equal to number.

The hash function then is h(i) = i.

The hash function determines the location of an item i in the hash table.
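This scheme might be sketched as follows. The names DirectTable, CAPACITY, EMPTY, insert, and contains are assumptions for this example; the array table, the dummy value -1, and h(i) = i come from the slide.

```cpp
#include <array>

const int CAPACITY = 1000;   // one slot for each possible value 0..999
const int EMPTY = -1;        // dummy value marking an unused slot

// Hypothetical table in which every value is stored at its own index.
class DirectTable {
public:
    DirectTable() { table.fill(EMPTY); }

    // Precondition for both operations: 0 <= value <= 999.
    void insert(int i) { table[h(i)] = i; }

    bool contains(int number) const { return table[h(number)] == number; }

private:
    static int h(int i) { return i; }   // the hash function h(i) = i
    std::array<int, CAPACITY> table;
};
```

Both operations examine exactly one location, so they take constant time regardless of how many values are stored.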

Hash Tables

The hash function in the previous example works perfectly because the time required to search the table for a given value is constant; only one location needs to be examined.

This hash function then is very time efficient, but it is surely not space-efficient.

Only 25 of the 1000 available locations are used to store items, leaving 975 unused locations; only 2.5 percent of the available space is used, and so 97.5 percent is wasted!

Because it is possible to store 25 values in 25 locations, we might try improving space utilization by using an array table with capacity 25.

The modified hash function h(i) = i modulo 25 addresses the space problem:

    // C++ syntax
    int h(int i)
    { return i % 25; }

Hash Tables

But what about placing 77 when 52 is already stored at location 2 (52 % 25 = 2)? h(77) = 77 % 25 = 2

Collision!!

Other values may collide at a given position: for example, all integers of the form 25k + 2 hash to location 2.

Some strategy is needed to resolve such collisions:

  1. Need to be able to place an element when its mapped location is already full
  2. Need to be able to retrieve an element when it's not placed directly according to the hash function

Collision Strategy (Storage)

Linear probing: a linear search of the table, starting from the location of the collision, until an empty slot is found in which the item can be stored. When 77 collides with 52 at location 2, put 77 in position 3.

INDEX  VALUE
0      500
1      -
2      52
3      77
4      129
5      102
…
23     273
24     49

To insert 102, we follow the probe sequence consisting of locations 2, 3, 4, and 5 to find the first available location and thus store 102 in table[5].
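The insertion step can be sketched as below. This is an illustration under assumptions: the function name insertLinear, the EMPTY marker, and the capacity parameter of h are made up here, the table is a std::vector<int>, and stored values are assumed to be non-negative.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY = -1;   // dummy value marking an unused slot

// The slides' hash function, with the capacity passed in (25 in the slides).
std::size_t h(int i, std::size_t capacity) {
    return static_cast<std::size_t>(i) % capacity;
}

// Store value in the first free slot at or after its hash location,
// wrapping around at the end of the table.
// Returns the index used, or -1 if every slot is occupied.
int insertLinear(std::vector<int>& table, int value) {
    const std::size_t capacity = table.size();
    if (capacity == 0)
        return -1;
    const std::size_t start = h(value, capacity);
    for (std::size_t step = 0; step < capacity; ++step) {
        const std::size_t loc = (start + step) % capacity;
        if (table[loc] == EMPTY) {
            table[loc] = value;
            return static_cast<int>(loc);
        }
    }
    return -1;   // probed every slot without finding a free one
}
```

With a 25-slot table, inserting 52, 77, 129, and 102 in that order places them at locations 2, 3, 4, and 5, matching the table in the slide.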

Cost of Linear Probing (Searching)

To determine if a specified value is in this hash table, apply the hash function to compute the location for this value.

1. If the location is empty, the value is not in the table.
2. If the location contains the specified value, the search is successful.
3. If the location contains a different value, a collision must be ruled out: begin a “circular” linear search at this location and continue until either the item is found, or an empty slot or the starting location is reached (item not in table).

Items 1 & 2 take O(1) time.

But item 3 (worst case) takes O(n)!

Use a hash table capacity that’s 1.5 to 2 times the number of items to be stored (cf. the Birthday Problem on p. 484).
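The three search cases might look like this in code. The sketch assumes a std::vector<int> table with -1 marking empty slots, non-negative values, and no deletions (removing items would need extra bookkeeping that the slides do not cover). The name containsLinear is hypothetical.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY = -1;   // dummy value marking an unused slot

// Circular linear search starting at the value's hash location.
bool containsLinear(const std::vector<int>& table, int value) {
    const std::size_t capacity = table.size();
    if (capacity == 0)
        return false;
    const std::size_t start = static_cast<std::size_t>(value) % capacity;
    for (std::size_t step = 0; step < capacity; ++step) {
        const int stored = table[(start + step) % capacity];
        if (stored == value)
            return true;     // found the item
        if (stored == EMPTY)
            return false;    // an empty slot ends the probe sequence
    }
    return false;            // back at the starting location: not in table
}
```

The loop runs at most once per slot, which is the O(n) worst case; with a sparsely filled table an empty slot usually ends the search after a few probes.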

Another Collision Strategy

Chaining: use a hash table that is an array (or vector) of linked lists to store the items.

For example, to store names "alphabetically", use an array table of 26 linked lists, initially empty, and the simple hash function h(name) = name[0] - 'A'; that is, h(name) is 0 if name[0] is 'A', 1 if name[0] is 'B', ..., 25 if name[0] is 'Z'.

Searching such a hash table is straightforward: apply the hash function to the item sought and then use one of the search algorithms for linked lists.

When a collision occurs, we simply insert the new item into the appropriate linked list.
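A minimal sketch of the name table described above, using std::list for the chains. The class name NameTable and its member functions are assumptions for illustration, and names are assumed to be nonempty and to begin with an uppercase letter 'A' through 'Z'.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical chained hash table: 26 linked lists, one per first letter.
class NameTable {
public:
    // On a collision the name simply joins the list for its letter.
    void insert(const std::string& name) {
        if (!contains(name))
            table[h(name)].push_front(name);
    }

    // Hash to a list, then do an ordinary linear search of that list.
    bool contains(const std::string& name) const {
        const std::list<std::string>& chain = table[h(name)];
        return std::find(chain.begin(), chain.end(), name) != chain.end();
    }

private:
    // h(name) = name[0] - 'A'; precondition: name[0] is in 'A'..'Z'.
    static std::size_t h(const std::string& name) {
        return static_cast<std::size_t>(name[0] - 'A');
    }

    std::array<std::list<std::string>, 26> table;
};
```

Search time depends on the length of the chain, so a hash function that spreads names more evenly than the first letter does would keep the lists shorter.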