Hash Tables with Buckets: Implementation in C, Exams of Data Structures and Algorithms

An implementation of hash tables using buckets in C. The author explains the concept of hash tables with buckets, which is a combination of an array and a linked list. Each operation on the hash table, such as adding or removing an element, is divided into two steps: first, computing the hash value to find the correct bucket, and second, examining the linked list stored at the table index. The document also discusses the load factor, the contains and remove operations, and resizing the table when necessary.

Uploaded on 08/30/2009

koofers-user-r2s
Worksheet 38: Hash Tables with Buckets Name:
An Active Learning Approach to Data Structures using C
Worksheet 38: Hash Tables using Buckets
In Preparation: Read Chapter 12 to learn more about hash tables. If you have not done
so already, complete Worksheet 37 on open address hashing.
In the previous lesson you learned about the concept of hashing, and how it was used in
an open address hash table. In this lesson you will explore a different approach to dealing
with collisions, the idea of hash tables using buckets.
A hash table that uses buckets is really a combination of an array and a linked list. Each
element in the array (the hash table) is a header for a linked list. All elements that hash
into the same location will be stored in the list.
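The array-of-list-headers layout described above can be sketched as follows. The structure names follow the skeleton given at the end of this worksheet; the choice of int for EleType and the body of initHashTable are assumptions made for illustration, and this is only one possible way to set up the table.

```c
#include <assert.h>
#include <stdlib.h>

typedef int EleType;               /* assumption: elements are plain ints */

struct hlink {                     /* one node in a bucket's linked list */
    EleType value;
    struct hlink *next;
};

struct hashTable {
    struct hlink **table;          /* array of list headers, one per bucket */
    int tablesize;                 /* number of buckets */
    int count;                     /* number of elements stored */
};

/* Allocate `size` buckets and mark every one as an empty list. */
void initHashTable(struct hashTable *ht, int size) {
    assert(size > 0);
    ht->table = malloc((size_t) size * sizeof(struct hlink *));
    assert(ht->table != NULL);
    for (int i = 0; i < size; i++)
        ht->table[i] = NULL;       /* NULL header means the bucket is empty */
    ht->tablesize = size;
    ht->count = 0;
}
```

Every bucket starts out as an empty list, so a freshly initialized table holds no links at all; links are allocated one at a time as elements are added.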
Each operation on the hash table divides into two steps. First, the element is hashed and
the remainder is taken after dividing by the table size. This yields a table index. Next, the
linked list indicated by the table index is examined. The algorithms for the latter are very
similar to those used in the linked list. For example, adding a new element is simply the
following:
void hashTableAdd (struct hashTable * ht, EleType newValue) {
   // compute hash value to find the correct bucket
   long int hash = HASH(newValue);
   int hashIndex = (int) (labs(hash) % ht->tablesize);
   // create a new link and place it at the front of the bucket's list
   struct hlink * newLink = (struct hlink *) malloc(sizeof(struct hlink));
   assert(newLink);
   newLink->value = newValue;
   newLink->next = ht->table[hashIndex];
   ht->table[hashIndex] = newLink; // add to bucket
   ht->count++; // Note: later might want to add resizing the table (below)
}
The contains test is performed as a loop, but only on the linked list stored at the table
index. The removal operation is the most complicated since, as with the linked list, it must
modify the link that precedes the one being removed. The easiest way to do this is to
maintain a pointer to both the current element and the previous element, as you did in
Lesson 32. When the current element is found, the next pointer for the previous is modified.
If the element found is the first one in its list, there is no previous link, and the table
entry itself is changed instead.
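A minimal sketch of the contains test and the removal operation is shown here, using the previous/current pointer technique just described. The int EleType, the identity HASH macro, the EQ comparison macro, the helper bucketIndex, and the name hashTableRemove are all assumptions introduced for illustration; your own solution may differ.

```c
#include <assert.h>
#include <stdlib.h>

typedef int EleType;                     /* assumption: elements are ints */
#define HASH(v) ((long) (v))             /* assumption: identity hash */
#define EQ(a, b) ((a) == (b))            /* assumption: equality test */

struct hlink { EleType value; struct hlink *next; };
struct hashTable { struct hlink **table; int tablesize; int count; };

/* Hypothetical helper: map a value to its bucket index. */
static int bucketIndex(struct hashTable *ht, EleType v) {
    return (int) (labs(HASH(v)) % ht->tablesize);
}

/* Search only the one list the element could be in. */
int hashTableContains(struct hashTable *ht, EleType testElement) {
    struct hlink *cur = ht->table[bucketIndex(ht, testElement)];
    for (; cur != NULL; cur = cur->next)
        if (EQ(cur->value, testElement))
            return 1;
    return 0;
}

/* Remove the first matching element, if any, from its bucket. */
void hashTableRemove(struct hashTable *ht, EleType testElement) {
    int idx = bucketIndex(ht, testElement);
    struct hlink *prev = NULL;           /* link before cur, NULL at the head */
    struct hlink *cur = ht->table[idx];
    while (cur != NULL) {
        if (EQ(cur->value, testElement)) {
            if (prev == NULL)
                ht->table[idx] = cur->next;   /* removing the first link */
            else
                prev->next = cur->next;       /* bypass the removed link */
            free(cur);
            ht->count--;
            return;
        }
        prev = cur;
        cur = cur->next;
    }
}
```

Removing a value that is not in the table leaves the table unchanged in this sketch; whether that case should instead be reported as an error is a design decision left open here.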
As with open address hash tables, the load factor (λ) is defined as the number of elements
divided by the table size. In this structure the load factor can be larger than one, and
represents the average number of elements stored in each list, assuming that the hash
function distributes elements uniformly over all positions. Since the running time of the
contains test and removal is proportional to the length of the list, they are O(λ). Therefore
the execution time for hash tables is fast only if the load factor remains small. A typical
technique is to resize the table (doubling the size, as with the vector and the open address
hash table) if the load factor becomes larger than 10.
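The load factor calculation and the doubling step might be sketched as follows. The name _resizeHashTable is taken from the skeleton at the end of this worksheet, but its body here is an assumption, as are hashTableLoadFactor, the int EleType, and the identity HASH macro. This version relinks the existing nodes into the new table rather than allocating new ones, which is one of several reasonable choices.

```c
#include <assert.h>
#include <stdlib.h>

typedef int EleType;                     /* assumption: elements are ints */
#define HASH(v) ((long) (v))             /* assumption: identity hash */

struct hlink { EleType value; struct hlink *next; };
struct hashTable { struct hlink **table; int tablesize; int count; };

/* Hypothetical helper: number of elements divided by number of buckets. */
double hashTableLoadFactor(struct hashTable *ht) {
    return ht->count / (double) ht->tablesize;
}

/* Double the number of buckets and move every link to its new bucket. */
void _resizeHashTable(struct hashTable *ht) {
    int newSize = 2 * ht->tablesize;
    struct hlink **newTable = malloc((size_t) newSize * sizeof(struct hlink *));
    assert(newTable != NULL);
    for (int i = 0; i < newSize; i++)
        newTable[i] = NULL;

    for (int i = 0; i < ht->tablesize; i++) {
        struct hlink *cur = ht->table[i];
        while (cur != NULL) {
            struct hlink *next = cur->next;   /* save before relinking */
            /* the index must be recomputed because the table size changed */
            int idx = (int) (labs(HASH(cur->value)) % newSize);
            cur->next = newTable[idx];
            newTable[idx] = cur;
            cur = next;
        }
    }
    free(ht->table);                          /* old bucket array only */
    ht->table = newTable;
    ht->tablesize = newSize;
}
```

As a worked example, 6 elements in 2 buckets give a load factor of 6 / 2 = 3.0; after doubling to 4 buckets the same 6 elements give 6 / 4 = 1.5, so the average list is half as long.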
Complete the implementation of the HashTable class based on these ideas.

struct hlink {
   EleType value;
   struct hlink *next;
};

struct hashTable {
   struct hlink ** table;
   int tablesize;
   int count;
};

void initHashTable (struct hashTable * ht, int size) {

}

int hashTableSize (struct hashTable * ht) { return ht->count; }

void hashTableAdd (struct hashTable * ht, EleType newValue) {
   // compute hash value to find the correct bucket
   long hash = HASH(newValue);
   int hashIndex = (int) (labs(hash) % ht->tablesize);
   struct hlink * newLink = (struct hlink *) malloc(sizeof(struct hlink));
   assert(newLink);
   newLink->value = newValue;
   newLink->next = ht->table[hashIndex];
   ht->table[hashIndex] = newLink; /* add to bucket */
   ht->count++;
   if ((ht->count / (double) ht->tablesize) > 3.0) _resizeHashTable(ht);
}

int hashTableContains (struct hashTable * ht, EleType testElement) {

}