Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Hashing - Programming Languages and Techniques II - Lecture Slides, Slides of Programming Languages

Aligarh Muslim University Programming Languages

In all programming language only syntax is different not the logic. This course discuss core concepts for many different programming language and techniques. Key points for this lecture are: Hashing, Magic Function, Array, Finding the Hash Function, Imperfect Hash Function, Collisions, Handling Collisions, Insertion, Clustering, Efficiency

Typology: Slides

2012/2013

Uploaded on 09/29/2013

dhanvant 🇮🇳

4.9

(9)

89 documents

1 / 23

This page cannot be seen from the preview

Don't miss anything!

Hashing

docsity.com

Discover Slides of Programming Languages Aligarh Muslim University

Partial preview of the text

Download Hashing - Programming Languages and Techniques II - Lecture Slides and more Slides Programming Languages in PDF only on Docsity!

Hashing

docsity.com

Searching

 Consider the problem of searching an array for a given

value

 If the array is not sorted, the search requires O(n) time

 If the value isn’t there, we need to search all n elements  If the value is there, we search n/2 elements on average

 If the array is sorted, we can do a binary search

 A binary search requires O(log n) time  About equally fast whether the element is found or not

 It doesn’t seem like we could do much better

 How about an O(1), that is, constant time search?  We can do it if the array is organized in a particular way

docsity.com

Example (ideal) hash function

 Suppose our hash function

gave us the following values:

hashCode("apple") = 5 hashCode("watermelon") = 3 hashCode("grapes") = 8 hashCode("cantaloupe") = 7 hashCode("kiwi") = 0 hashCode("strawberry") = 9 hashCode("mango") = 6 hashCode("banana") = 2

kiwi

banana

watermelon

apple

mango

cantaloupe

grapes

strawberry

docsity.com

Why hash tables?

 We don’t (usually) use

hash tables just to see if

something is there or not—

instead, we put key / value

pairs into a map

 We use a key to find a place

in the table

 The value holds the

information we are actually

interested in

robin sparrow hawk seagull

bluejay owl

robin info sparrow info hawk info seagull info

bluejay info owl info

key value

docsity.com

Example imperfect hash function

 Suppose our hash function gave

us the following values:

 hash("apple") = 5 hash("watermelon") = 3 hash("grapes") = 8 hash("cantaloupe") = 7 hash("kiwi") = 0 hash("strawberry") = 9 hash("mango") = 6 hash("banana") = 2 hash("honeydew") = 6

kiwi

banana

watermelon

apple

mango

cantaloupe

grapes

strawberry

• Now what?

docsity.com

Collisions

 When two values hash to the same array location,

this is called a collision

 Collisions are normally treated as “first come, first

served”—the first value that hashes to the location

gets it

 We have to find something to do with the second and

subsequent values that hash to this same location

docsity.com

Insertion, I

 Suppose you want to add

seagull to this hash table

 Also suppose:

 hashCode(seagull) = 143

 table[143] is not empty

 table[143] != seagull

 table[144] is not empty

 table[144] != seagull

 table[145] is empty

 Therefore, put seagull at

location 145

robin

sparrow

hawk

bluejay

owl

seagull

docsity.com

Searching, I

 Suppose you want to look up

seagull in this hash table

 Also suppose:

 hashCode(seagull) = 143

 table[143] is not empty

 table[143] != seagull

 table[144] is not empty

 table[144] != seagull

 table[145] is not empty

 table[145] == seagull!

 We found seagull at location

robin

sparrow

hawk

bluejay

owl

seagull

docsity.com

Insertion, II

 Suppose you want to add

hawk to this hash table

 Also suppose

 hashCode(hawk) = 143

 table[143] is not empty

 table[143] != hawk

 table[144] is not empty

 table[144] == hawk

 hawk is already in the table,

so do nothing

robin

sparrow

hawk

seagull

bluejay

owl

docsity.com

Insertion, III

 Suppose:

 You want to add cardinal to

this hash table

 hashCode(cardinal) = 147

 The last location is 148

 147 and 148 are occupied

 Solution:

 Treat the table as circular; after

148 comes 0

 Hence, cardinal goes in

location 0 (or 1, or 2, or ...)

robin

sparrow

hawk

seagull

bluejay

owl

docsity.com

Efficiency

 Hash tables are actually surprisingly efficient

 Until the table is about 70% full, the number of

probes (places looked at in the table) is typically

only 2 or 3

 Sophisticated mathematical analysis is required to

prove that the expected cost of inserting into a hash

table, or looking something up in the hash table, is

O(1)

 Even if the table is nearly full (leading to long

searches), efficiency is usually still quite high

docsity.com

Solution #2: Rehashing

 In the event of a collision, another approach is to rehash: compute

another hash function

 Since we may need to rehash many times, we need an easily computable sequence of functions

 Simple example: in the case of hashing Strings, we might take the

previous hash code and add the length of the String to it

 Probably better if the length of the string was not a component in computing the original hash function

 Possibly better yet: add the length of the String plus the number

of probes made so far

 Problem: are we sure we will look at every location in the array?

 Rehashing is a fairly uncommon approach, and we won’t pursue

it any further here

docsity.com

The hashCode function

 public int hashCode() is defined in Object

 Like equals, the default implementation of

hashCode just uses the address of the object—

probably not what you want for your own objects

 You can override hashCode for your own objects

 As you might expect, String overrides hashCode

with a version appropriate for strings

 Note that the supplied hashCode method does not

know the size of your array —you have to adjust

the returned int value yourself

docsity.com

Writing your own hashCode method

 A hashCode method must:

 Return a value that is (or can be converted to) a legal

array index

 Always return the same value for the same input

 It can’t use random numbers, or the time of day

 Return the same value for equal inputs

 Must be consistent with your equals method

 It does not need to return different values for

different inputs

 A good hashCode method should:

 Be efficient to compute

 Give a uniform distribution of array indices

 Not assign similar numbers to similar input values

docsity.com

Hashing - Programming Languages and Techniques II - Lecture Slides, Slides of Programming Languages

Related documents

Partial preview of the text

Download Hashing - Programming Languages and Techniques II - Lecture Slides and more Slides Programming Languages in PDF only on Docsity!

Hashing

Searching

 Consider the problem of searching an array for a given

value

 If the array is not sorted, the search requires O(n) time

 If the array is sorted, we can do a binary search

 It doesn’t seem like we could do much better

Example (ideal) hash function

 Suppose our hash function

gave us the following values:

kiwi

banana

watermelon

apple

mango

cantaloupe

grapes

strawberry

Why hash tables?

 We don’t (usually) use

hash tables just to see if

something is there or not—

instead, we put key / value

pairs into a map

 We use a key to find a place

in the table

 The value holds the

information we are actually

interested in

key value

Example imperfect hash function

 Suppose our hash function gave

us the following values:

kiwi

banana

watermelon

apple

mango

cantaloupe

grapes

strawberry

• Now what?

Collisions

 When two values hash to the same array location,

this is called a collision

 Collisions are normally treated as “first come, first

served”—the first value that hashes to the location

gets it

 We have to find something to do with the second and

subsequent values that hash to this same location

Insertion, I

 Suppose you want to add

seagull to this hash table

 Also suppose:

 table[143] is not empty

 table[144] is not empty

 table[145] is empty

 Therefore, put seagull at

location 145

robin

sparrow

hawk

bluejay

owl

seagull

Searching, I

 Suppose you want to look up

seagull in this hash table

 Also suppose:

 table[143] is not empty

 table[144] is not empty

 table[145] is not empty

 table[145] == seagull!

 We found seagull at location

robin

sparrow