true

Take BTech Tuition from the Best Tutors

Affordable fees
1-1 or Group class
Flexible Timings
Verified Tutors

Search in

Hashing Techniques

D Subba Rao

08/08/2017 1 0

I. Hashing:

1. Hash Table Representation:

Hash table is a data structure used for storing and retrieving data very quickly. Insertion of data in the hash table is based on the key value. Hence every entry in the hash table is associated with some key.
Using the hash key the required piece of data can be searched in the hash table by few or more key comparisons. The searching time depends on the size of the hash table.
The effective representation of dictionary can be done using hash table. We can place the dictionary entries in the hash table using hash function.

2. Hash Function:

Hash function is a function which is used to put the data in the hash table. Hence one can use the same hash function to retrieve the data from the hash table. Thus hash function is used to implement the hash table.
An integer value is returned by the hash function, called hash key.

Example:

Consider that we want place some employee records in the hash table The record of employee is placed with the help of key: employee ID. The employee ID is a 7 digit number for placing the record in the hash table. To place the record 7 digit number is converted into 3 digits by taking only last three digits of the key.

If the key is 496700 it can be stored at 0^th position. The second key 8421002, the record of those key is placed at 2^nd position in the array.

Hence the hash function will be:

H (key) = key%1000

Where key%1000 is a hash function and key obtained by hash function is called hash key.

Bucket and Home bucket: The hash function H(key) is used to map several dictionary entries in the hash table. Each position of the hash table is called bucket.

The function H(key) is home bucket for the dictionary with pair whose value is key.

3. Types of Hashing Functions:

There are various types of hash functions that are used to place the record in the hash table:

i. Division or Modulus Method:

The hash function depends upon the remainder of division.

0
1
2	2
3
4	4
5
6
7	7
8
9	9

Typically the divisor is table length.

Example: If the record 54, 72, 89, 37 is placed in the hash table and if the table size is 10 then

h(key) = record % table size

4 = 54%10

2 = 72%10

9 = 89%10

7 = 37%10

ii. Mid Square:

In the mid square method, the key is squared and the middle or mid part of the result is used as the index.

If the key is a string, it has to be preprocessed to produce a number.

The formula for computing the hash key is:

h(key) = (key)² =è and Result = middle digits of squared number or key

Example: Consider that if we want to place a record 3111 then:

(3111)² = 9678321

For the hash table of size 1000

h(3111) = 783 (the middle 3 digits)

iii. Multiplicative hash function:

The given record is multiplied by some constant value. The formula for computing the hash key is-

h(key) = floor(p *(fractional part of key*A))

Where, p is an integer constant, and A is constant number.

Donald Knuth suggested to use constant A = 0.61803398987

Example: If key 107 and p = 50 then,

h(key) = floor(50*(107*0.61803398987))

= floor(3306.4818458045)

= 3306

At 3306 location in the hash table the record 107 will be placed.

iv. Digit Folding:

The key is divided into separate parts and using some simple operation these parts are combined to produce the hash key.

The formula for computing the hash key is:

h(key) = h(k1) +h(K2) +----

Where, h(k1), h(k2),… are separate parts used for computing key.

Example:

Consider a record 12365412 then it is divided into separate parts as 123, 654, 12 and these are added together

H(key) = 123+654+12

= 789

The record will be placed at location 789.

v. Digit Analysis:

The digit analysis is used in a situation when all the identifiers are known in advance. We first transform the identifiers into numbers using some radix, r. Then examine the digits of each identifier. Some digits having most skewed distributions are deleted. This deleting of digits is continued until the number of remaining digits is small enough to give an address in the range of the hash table. Then these digits are used to calculate the hash address.

4. Collisions In Hash Table:

The hash function is a function that returns the key value using which the record can be placed in the hash table. Thus this function helps us in placing the record in the hash table at appropriate position and due to this we can retrieve the record directly from that location. This function need to be designed very carefully and it should not return the same hash key address for two different records. This is an undesirable situation in hashing.

Definition: The situation in which the hash function returns the same hash key (home bucket) for more than one record is called collision and two same hash keys returned for different records is called synonym.

Similarly when there is no room for a new pair in the hash table then such a situation is called overflow. Sometimes when we handle collision it may lead to overflow conditions. Collision and overflow show the poor hash functions.

Index	Hash key
0
1	131
2
3	43
4	44
5
6	36
7	57
8	78
9	19

For example,

Consider a hash function.

h(key) = recordkey%10 having the hash table size of 10

The record keys to be placed are:

131, 44, 43, 78, 19, 36, 57 and 77

Now if we try to place 77 in the hash table then we get the hash key to be 7 and at index 7 already the record key 57 is placed. This situation is called collision. From the index 7 if we look for next vacant position at subsequent indices 8,9 then we find that there is no room to place 77 in the hash table. This situation is called overflow.

5. Collision Resolution Techniques:

If collision occurs then it should be handled by applying some techniques. Such a technique is called collision handling technique.

There are two methods for detecting collisions and overflows in the hash table:

Chaining
Open addressing (linear probing)

The two more difficult collision handling techniques are:

Quadratic probing
Double hashing

II. CHAINING:

In collision handling method chaining is a concept which introduces an additional field with data i.e. chain. A separate chain table is maintained for colliding data. When collision occurs then a linked list(chain) is maintained at the home bucket.

Example:

Consider the keys to be placed in their home buckets are:

131, 3, 4, 21, 61, 7, 97, 8, 9

Then we will apply a hash function as H(key) = key % D

Where, D is the size of table. The hash table will be:

Here D = 10

A chain is maintained for colliding elements. For instance 131 have a home bucket (key) 1. Similarly key 21 and 61 demand for home bucket 1. Hence a chain is maintained at index 1.

III. OPEN ADDRESSING :

i. LINEAR PROBING:

This is the easiest method of handling collision. When collision occurs i.e. when two records demand for the same home bucket in the hash table then collision can be solved by placing the second record linearly down whenever the empty bucket is found.
When use linear probing (open addressing), the hash table is represented as a one-dimensional array with indices that range from 0 to the desired table size-1.
Before inserting any elements into this table, we must initialize the table to represent the situation where all slots are empty.
This allows us to detect overflows and collisions when we inset elements into the table. Then using some suitable hash function the element can be inserted into the hash table.

For example:

Consider that following keys are to be inserted in the hash table:

131, 4, 8, 7, 21, 5, 9, 29

Initially, we will put the following keys in the hash table.

We will use Division hash function. That means the keys are placed using the formula:

H(key) = key % table-size

H(key) = key % 10

For instance the element 131 can be placed at

H(key) = 131 % 10

= 1

Index 1 will be the home bucket for 131. Continuing in this fashion we will place 4, 8, 7.

Index

Hash key

131

1 Like 0 Dislike

Follow 1

Other Lessons for You

Object Oriented Programming Paradigm

Object Oriented Programming Paradigm: OOPS is a better way of solving problems in computers compared to the procedural language programming such as in C. oops is designed around the data being operated...

D Subba Rao

1 0

Deadlocks In Distributed Systems

Deadlocks in Distributed Systems: Deadlock can occur whenever two or more processes are competing for limited resources and the processes are allowed to acquire and hold a resource (obtain a lock) thus...

D Subba Rao

2 0

Session Tracking In Java Servlets

Session Tracking: HTTP is a stateless protocol. Each request is independent of the previous one. However, in some applications, it is necessary to save state information so that information can be collected...

D Subba Rao

2 0

Infix Expression To Post-fix Expression Conversion Procedure

Algorithm 1. Scan the infix expression from left to right. 2. If the scanned character is an operand, output it. 3. Else, a. If the precedence of the scanned operator is greater than the precedence...

D Subba Rao

1 0

Pattern Matching Algorithms

Pattern Matching Algorithms: There are various algorithms used to implement the pattern matching problem. Some of them are: Brute Force. Boyer-Moore. Knuth-Morris-Pratt (KMP). Brute Force: ...

D Subba Rao

1 0

Find BTech Tuition near you

Looking for BTech Tuition ?

Learn from Best Tutors on UrbanPro.

Are you a Tutor or Training Institute?

Join UrbanPro Today to find students near you

BTech Tuition Questions

After learning Python, what I have to study to become a data scientist? Do I have to study deep learning,...

5 Answers

What parents see in a good teacher,when they hire home tutor for their child's benefit?

42 Answers

Looking for BTech Tuition Classes?

The best tutors for BTech Tuition Classes are on UrbanPro

Select the best Tutor
Book & Attend a Free Demo
Pay and start Learning

Take BTech Tuition with the Best Tutors

The best Tutors for BTech Tuition Classes are on UrbanPro

This website uses cookies

We use cookies to improve user experience. Choose what cookies you allow us to use. You can read more about our Cookie Policy in our Privacy Policy

Accept All

Decline All

UrbanPro.com is India's largest network of most trusted tutors and institutes. Over 55 lakh students rely on UrbanPro.com, to fulfill their learning requirements across 1,000+ categories. Using UrbanPro.com, parents, and students can compare multiple Tutors and Institutes and choose the one that best suits their requirements. More than 7.5 lakh verified Tutors and Institutes are helping millions of students every day and growing their tutoring business on UrbanPro.com. Whether you are looking for a tutor to learn mathematics, a German language trainer to brush up your German language skills or an institute to upgrade your IT skills, we have got the best selection of Tutors and Training Institutes for you. Read more