Hash Tables with Buckets: Implementation in C | Exams Data Structures and Algorithms

Worksheet 38: Hash Tables with Buckets Name:

An Active Learning Approach to Data Structures using C

Worksheet 38: Hash Tables using Buckets

In Preparation: Read Chapter 12 to learn more about hash tables. If you have not done

so already, complete Worksheet 37 on open address hashing.

In the previous lesson you learned about the concept of hashing, and how it was used in

an open address hash table. In this lesson you will explore a different approach to dealing

with collisions, the idea of hash tables using buckets.

A hash table that uses buckets is really a combination of an array and a linked list. Each

element in the array (the hash table) is a header for a linked list. All elements that hash

into the same location will be stored in the list.

Each operation on the hash table divides into two steps. First, the element is hashed and

the remainder taken after dividing by the table size. This yields a table index. Next,

linked list indicated by the table index is examined. The algorithms for the latter are very

similar to those used in the linked list. For example, to add a new element is simply the

following:

void HashTableAdd (struct hashTab le * ht, EleType newValue) {

// compute hash value to find the correct bucket

long int hash = HASH(newValue);

int hashIndex = (int) (labs(hash) % ht.tableleng th);

struct link * newLink = (struct link *) malloc(sizeo f(struct link));

assert(newLink);

newLink->value = newValue; newLink->next = ht->table[hashIndex];

ht->table[hashIndex] = newLink; // add to bucket

dataCount++; // Note: later might want to add resizing the table (below)

}

The contains test is performed as a loop, but only on the linked list stored at the table

index. The removal operation is the most complicated, since like the linked list it must

modify the previous element. The easiest way to do this is to maintain a pointer to both

the current element and to the previous element, as you did in Lesson 32. When the

current element is found, the next pointer for the previous is modified.

As with open address hash tables, the load factor (l) is defined as the number of elements

divided by the table size. In this structure the load factor can be larger than one, and

represents the average number of elements stored in each list, assuming that the hash

function distributes elements uniformly over all positions. Since the running time of the

contains test and removal is proportional to the length of the list, they are O(l). Therefore

the execution time for hash tables is fast only if the load factor remains small. A typical

technique is to resize the table (doubling the size, as with the vector and the open address

hash table) if the load factor becomes larger than 10.

Complete the implementation of the HashTable class based on these ideas.

Partial preview of the text

Download Hash Tables with Buckets: Implementation in C and more Exams Data Structures and Algorithms in PDF only on Docsity!

Worksheet 38: Hash Tables using Buckets

In Preparation : Read Chapter 12 to learn more about hash tables. If you have not done so already, complete Worksheet 37 on open address hashing. In the previous lesson you learned about the concept of hashing, and how it was used in an open address hash table. In this lesson you will explore a different approach to dealing with collisions, the idea of hash tables using buckets. A hash table that uses buckets is really a combination of an array and a linked list. Each element in the array (the hash table) is a header for a linked list. All elements that hash into the same location will be stored in the list. Each operation on the hash table divides into two steps. First, the element is hashed and the remainder taken after dividing by the table size. This yields a table index. Next, linked list indicated by the table index is examined. The algorithms for the latter are very similar to those used in the linked list. For example, to add a new element is simply the following: void HashTableAdd (struct hashTable * ht, EleType newValue) { // compute hash value to find the correct bucket long int hash = HASH(newValue); int hashIndex = (int) (labs(hash) % ht.tablelength); struct link * newLink = (struct link *) malloc(sizeof(struct link)); assert(newLink); newLink->value = newValue; newLink->next = ht->table[hashIndex]; ht->table[hashIndex] = newLink; // add to bucket dataCount++; // Note: later might want to add resizing the table (below) } The contains test is performed as a loop, but only on the linked list stored at the table index. The removal operation is the most complicated, since like the linked list it must modify the previous element. The easiest way to do this is to maintain a pointer to both the current element and to the previous element, as you did in Lesson 32. When the current element is found, the next pointer for the previous is modified. As with open address hash tables, the load factor (l) is defined as the number of elements divided by the table size. In this structure the load factor can be larger than one, and represents the average number of elements stored in each list, assuming that the hash function distributes elements uniformly over all positions. Since the running time of the contains test and removal is proportional to the length of the list, they are O(l). Therefore the execution time for hash tables is fast only if the load factor remains small. A typical technique is to resize the table (doubling the size, as with the vector and the open address hash table) if the load factor becomes larger than 10. Complete the implementation of the HashTable class based on these ideas.

struct hlink { EleType value; struct hlink *next; }; struct hashTable { struct hlink ** table; int tablesize; int count; }; void initHashTable (struct hashTable * ht, int size) { } int hashTableSize (struct hashTable * ht) { return ht->count; } void hashTableAdd (struct hashTable &ht, EleType newValue) { // compute hash value to find the correct bucket long hash = HASH(newValue); int hashIndex = (int) (labs(hash) % ht.tablelength); struct link * newLink = (struct link ) malloc(sizeof(struct link)); assert(newLink); newLink->value = newValue; newLink->next = ht->table[hashIndex]; ht->table[hashIndex] = newLink; / add to bucket */ ht->count++; if ((ht->count / (double) ht->tablesize) > 3.0) _resizeHashTable(ht); } int hashTableContains (struct hashTable * ht, EleType testElement) { }

Hash Tables with Buckets: Implementation in C, Exams of Data Structures and Algorithms

Related documents

Partial preview of the text

Download Hash Tables with Buckets: Implementation in C and more Exams Data Structures and Algorithms in PDF only on Docsity!

Worksheet 38: Hash Tables using Buckets