Data Structures: Introduction, Basic Terminology, and Algorithms, Lecture notes of Data Structures and Algorithms

Unit one from data structures

Typology: Lecture notes

2018/2019

Uploaded on 04/01/2019

pratik-savla
15CS201J - DATA STRUCTURES, UNIT I
Ms. Selva Mary. G

UNIT I

INTRODUCTION TO DATA STRUCTURES

Introduction - Basic terminology - Data structures - Data structure operations - ADT - Algorithms: Complexity, Time-Space trade-off - Mathematical notations and functions - Asymptotic notations - Linear and Binary search - Bubble sort - Insertion sort

INTRODUCTION

A data structure is a way of collecting and organising data so that operations on that data can be performed efficiently.

Data structures are about arranging data elements according to some relationship, for better organization and storage. For example, suppose we have a player's name "Virat" and age 26. Here "Virat" is of string data type and 26 is of integer data type.

We can organize this data as a record, such as a Player record. We can then collect and store many players' records in a file or database as a data structure. For example: "Dhoni" 30, "Gambhir" 31, "Sehwag" 33.

In simple terms, data structures are structures programmed to store data in an organized way, so that various operations can be performed on it easily.
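The Player record described above can be sketched in a few lines of Python. This is only an illustration: the class name `Player`, the list `players` and the two helper functions are assumptions made for this sketch, and a plain list stands in for the file or database.

```python
from dataclasses import dataclass


@dataclass
class Player:
    """One record: a name (string data type) and an age (integer data type)."""
    name: str
    age: int


# A small collection of records; a list stands in for the file or database.
players = [
    Player("Virat", 26),
    Player("Dhoni", 30),
    Player("Gambhir", 31),
    Player("Sehwag", 33),
]


def oldest(records):
    """Return the record with the largest age."""
    return max(records, key=lambda p: p.age)


def names_sorted_by_age(records):
    """Return the player names ordered from youngest to oldest."""
    return [p.name for p in sorted(records, key=lambda p: p.age)]
```

Once the data is organized as records, operations such as finding the oldest player or listing the names by age become short, direct functions.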

Basic Terminology of Data Organization:

Data : The term 'data' simply refers to a value or a set of values. These values may represent anything about something, such as the roll no. of a student, marks, the name of an employee, the address of a person, etc.

Data item : A data item refers to a single unit of value. For example, the roll no. of a student, marks, the name of an employee and the address of a person are data items. Data items that can be divided into sub-items are called group items (e.g. address, date, name), whereas those that cannot be divided into sub-items are called elementary items (e.g. roll no., marks, city, pin code).

Entity : An entity is something that has certain attributes or properties. Entities with similar attributes (e.g. all employees of an organization) form an entity set.

Information : Processed data, that is, data with given attributes.

Field is a single elementary unit of information representing an attribute of an entity

Record is the collection of field values of a given entity

File is the collection of records of the entities in a given entity set

Name   Age   Sex   Roll Number   Branch
A      17    M     109cs0132     CSE
B      18    M     109ee1234     EE

Basic types of Data Structures

Anything that can store data can be called a data structure; hence Integer, Float, Boolean, Char, etc. are all data structures. They are known as Primitive Data Structures.

We also have more complex data structures, which are used to store large and connected data. Some examples of these abstract data structures are:

  • Array
  • Linked List
  • Stack
  • Queue
  • Tree
  • Graph

All these data structures allow us to perform different operations on data. We select a data structure based on which type of operation is required.

(4) Deleting: Removing a record from the structure.
(5) Sorting: Arranging the records in some logical order (ascending or descending).
(6) Merging: Combining the records of two different sorted files into a single sorted file.
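The merging operation can be sketched as follows. This is a minimal illustration, assuming the two sorted files are already held in memory as sorted Python lists; the function name `merge_sorted` is chosen for this sketch only.

```python
def merge_sorted(a, b):
    """Combine two lists that are already sorted into one sorted list."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller of the two front elements.
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # One list is exhausted; copy whatever remains of the other.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged
```

Each element is looked at once, so the work grows in proportion to the combined length of the two inputs.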

Abstract Data Types (ADT)

An abstract data type (ADT) refers to a set of data values and associated operations that are specified accurately, independent of any particular implementation. With an ADT, we know what a specific data type can do, but how it actually does it is hidden. Simply hiding the implementation

Data Structure - Arrays

An array is a container that holds a fixed number of items, all of the same type. Many data structures use arrays in their implementation. The following terms are important for understanding arrays.

  • Element − each item stored in an array is called an element.
  • Index − each location of an element in an array has a numerical index, which is used to identify the element.

Array Representation

Arrays can be declared in various ways in different languages. For illustration, let's take a C array declaration.

As per the illustration, the following points are important.

  • Index starts with 0.
  • Array length is 10, which means it can store 10 elements.
  • Each element can be accessed via its index. For example, the element at index 6 is 27.

Basic Operations

The following basic operations are supported by an array.

  • Traverse − print all the array elements one by one.
  • Insertion − add an element at a given index.
  • Deletion − delete an element at a given index.
  • Search − search for an element using a given index or by value.
  • Update − update an element at a given index.
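The five operations listed above might look like this in Python. This is a sketch under stated assumptions: a Python list is used in place of a fixed-size array (unlike a C array, it can grow), the function names are chosen for illustration, and the sample values are assumed, picked only so that index 6 holds 27 as in the text.

```python
# Assumed sample data: 10 elements, with 27 at index 6.
arr = [35, 33, 42, 10, 14, 19, 27, 44, 26, 31]


def traverse(items):
    """Print all the elements one by one, in index order."""
    for index, value in enumerate(items):
        print(f"items[{index}] = {value}")


def insert_at(items, index, value):
    """Add an element at the given index; later elements shift right."""
    items.insert(index, value)


def delete_at(items, index):
    """Remove and return the element at the given index; later elements shift left."""
    return items.pop(index)


def search(items, value):
    """Return the index of the first element equal to value, or -1 if absent."""
    for index, item in enumerate(items):
        if item == value:
            return index
    return -1


def update(items, index, value):
    """Overwrite the element at the given index."""
    items[index] = value
```

Access and update by index take constant time, while insertion and deletion in the middle have to shift the elements that follow.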

Data Structure – Linked Lists

A linked list is a very common linear data structure consisting of a sequence of nodes. Each node is divided into two parts: its own data and the address of the next node, so that the nodes form a chain. Linked lists are also used to build trees and graphs.

Doubly Linked List : In a doubly linked list, each node contains two links: the first link points to the previous node and the second link points to the next node in the sequence.

Circular Linked List : In a circular linked list, the last node of the list contains the address of the first node, forming a circular chain.
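A node-and-chain structure of this kind can be sketched as below. The class and function names (`Node`, `SinglyLinkedList`, `make_circular`) are assumptions for illustration; a Python object reference plays the role of the "address of the next node".

```python
class Node:
    """One node: its own data plus a reference to the next node."""

    def __init__(self, data):
        self.data = data
        self.next = None


class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, data):
        """Insert a new node before the current head."""
        node = Node(data)
        node.next = self.head
        self.head = node

    def append(self, data):
        """Walk to the last node and link a new node after it."""
        node = Node(data)
        if self.head is None:
            self.head = node
            return
        current = self.head
        while current.next is not None:
            current = current.next
        current.next = node

    def to_list(self):
        """Follow the chain from the head and collect the data values."""
        items = []
        current = self.head
        while current is not None:
            items.append(current.data)
            current = current.next
        return items


def make_circular(linked_list):
    """Turn the chain into a circular list by linking the last node to the first."""
    if linked_list.head is None:
        return
    current = linked_list.head
    while current.next is not None:
        current = current.next
    current.next = linked_list.head
```

A doubly linked list would add a second reference, to the previous node, in each `Node`. Note that `to_list` must not be called after `make_circular`, because the chain then has no end.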

Data Structure – Stack

Stacks

A stack is an abstract data type with a bounded (predefined) capacity. It is a simple data structure that allows adding and removing elements in a particular order. Every time an element is added, it goes on the top of the stack; the only element that can be removed is the element at the top of the stack, just like a pile of objects.

Stack Data Structure

Basic features of Stack

  1. A stack is an ordered list of elements of similar data type.
  2. A stack is a LIFO (Last In, First Out) structure.
  3. The push() function is used to insert new elements into the stack and pop() is used to delete an element from the stack. Both insertion and deletion are allowed at only one end of the stack, called the top.
  4. A stack is said to be in the Overflow state when it is completely full and in the Underflow state when it is completely empty.
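The features above can be sketched as a small bounded stack. This is an illustrative sketch, not the implementation from the notes: the class name `BoundedStack` is hypothetical, and the choice to signal overflow and underflow with Python's built-in `OverflowError` and `IndexError` is an assumption of this example.

```python
class BoundedStack:
    """A LIFO stack with a predefined capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []  # the end of the list is the top of the stack

    def is_empty(self):
        return len(self.items) == 0

    def is_full(self):
        return len(self.items) == self.capacity

    def push(self, item):
        """Insert an element at the top; a full stack is in the overflow state."""
        if self.is_full():
            raise OverflowError("stack overflow: the stack is full")
        self.items.append(item)

    def pop(self):
        """Remove and return the top element; an empty stack is in the underflow state."""
        if self.is_empty():
            raise IndexError("stack underflow: the stack is empty")
        return self.items.pop()
```

Both `push` and `pop` work only at the top, so the last element pushed is always the first one popped.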

Applications of Stack

  • The simplest application of a stack is to reverse a word. You push the word onto the stack letter by letter and then pop the letters from the stack; they come out in reverse order.
  • Other uses include parsing and expression conversion (infix to postfix, postfix to prefix, etc.), among many more.
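The word-reversal application can be sketched in a few lines, using a Python list as the stack (`append` as push, `pop` as pop). The function name `reverse_word` is chosen for this illustration.

```python
def reverse_word(word):
    """Reverse a word by pushing its letters onto a stack and popping them off."""
    stack = []
    for letter in word:
        stack.append(letter)  # push each letter in order
    reversed_letters = []
    while stack:
        reversed_letters.append(stack.pop())  # pop returns the most recent letter
    return "".join(reversed_letters)
```

Because the stack is LIFO, the last letter pushed is the first one popped, which is exactly what reverses the word.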

Implementation of Stack

Queue Data Structures

A queue is also an abstract data type and a linear data structure. A new element is inserted at one end, called the REAR (also called the tail), and an existing element is deleted from the other end, called the FRONT (also called the head). This makes the queue a FIFO data structure, which means that the element inserted first will also be removed first.

The process of adding an element to the queue is called Enqueue and the process of removing an element from the queue is called Dequeue.

Basic features of Queue

  1. Like a stack, a queue is an ordered list of elements of similar data types.
  2. A queue is a FIFO (First In, First Out) structure.
  3. Once a new element is inserted into the queue, all the elements inserted before it must be removed before the new element can be removed.
  4. The peek() function is often used to return the value of the first element without dequeuing it.

Applications of Queue

A queue, as the name suggests, is used whenever a group of objects must be kept in an order in which the first one to come in is also the first one to get out, while the others wait for their turn, as in the following scenarios:

  1. Serving requests on a single shared resource, like a printer, CPU task scheduling etc.
  2. In real life, Call Center phone systems will use Queues, to hold people calling them in an order, until a service representative is free.
  3. Handling of interrupts in real-time systems. The interrupts are handled in the same order as they arrive, First come first served.

Analysis of Queue

  • Enqueue : O(1)
  • Dequeue : O(1)
  • Size : O(1)
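A queue with these costs can be sketched on top of Python's `collections.deque`, whose `append`, `popleft` and `len` all run in constant time. The wrapper class `Queue` and its method names are assumptions for illustration.

```python
from collections import deque


class Queue:
    """A FIFO queue: elements enter at the rear and leave from the front."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        """Insert an element at the rear. O(1)."""
        self._items.append(item)

    def dequeue(self):
        """Remove and return the element at the front. O(1)."""
        if not self._items:
            raise IndexError("dequeue from an empty queue")
        return self._items.popleft()

    def peek(self):
        """Return the front element without removing it."""
        if not self._items:
            raise IndexError("peek at an empty queue")
        return self._items[0]

    def size(self):
        """Return the number of stored elements. O(1)."""
        return len(self._items)
```

A plain Python list would also work, but removing from its front shifts every remaining element, so dequeue would no longer be O(1).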

Data Structure - Tree

A tree represents nodes connected by edges. We will discuss the binary tree, and the binary search tree specifically. A binary tree is a special data structure used for data storage purposes. It has the special condition that each node can have at most two children. A binary search tree combines the benefits of an ordered array and a linked list: search is as quick as in a sorted array, and insertion and deletion are as fast as in a linked list, provided the tree stays reasonably balanced.
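A binary search tree can be sketched as follows. This is a minimal, unbalanced version for illustration only; the names `TreeNode`, `bst_insert`, `bst_search` and `in_order` are assumptions, and duplicate keys are simply ignored.

```python
class TreeNode:
    """A node with a key and at most two children."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key into the subtree rooted at root and return the subtree's root."""
    if root is None:
        return TreeNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root  # an equal key is already present, so nothing changes


def bst_search(root, key):
    """Return True if key is in the tree, discarding one subtree at every step."""
    current = root
    while current is not None:
        if key == current.key:
            return True
        current = current.left if key < current.key else current.right
    return False


def in_order(root):
    """Return the keys in ascending order (left subtree, node, right subtree)."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)
```

Smaller keys go to the left and larger keys to the right, so each comparison during a search rules out a whole subtree.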

Data Structure - Graph

A graph is a pictorial representation of a set of objects where some pairs of objects are connected by links. The interconnected objects are represented by points termed as vertices, and the links that connect the vertices are called edges. Formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of edges, connecting the pairs of vertices. Take a look at the following graph −

In the above graph,
V = {a, b, c, d, e}
E = {ab, ac, bd, cd, de}

Graph Data Structure

Mathematical graphs can be represented in a data structure. We can represent a graph using an array of vertices and a two-dimensional array of edges. Before we proceed further, let's familiarize ourselves with some important terms −

  • Vertex − Each node of the graph is represented as a vertex. In the example, the labeled circles represent vertices, so A to G are vertices. We can represent them using an array, in which A is identified by index 0, B by index 1, and so on.
  • Edge − An edge represents a path or a line between two vertices. In the example, the lines from A to B, B to C, and so on represent edges. We can use a two-dimensional array to represent the edges: AB is represented as 1 at row 0, column 1, BC as 1 at row 1, column 2, and so on, with all other combinations kept as 0.
  • Adjacency − Two nodes or vertices are adjacent if they are connected to each other through an edge. In the example, B is adjacent to A, C is adjacent to B, and so on.
  • Path − A path represents a sequence of edges between two vertices. In the example, ABCD represents a path from A to D.

Basic Operations

The basic primary operations of a graph are the following.

  • Add Vertex − add a vertex to a graph.
  • Add Edge − add an edge between two vertices of a graph.
  • Display Vertex − display a vertex of a graph.
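The array-of-vertices and two-dimensional-array-of-edges representation, together with the three basic operations, can be sketched as below. The class `Graph`, its method names and the helper `build_example` are assumptions for illustration; the sketch treats edges as undirected and uses the sets V = {a, b, c, d, e} and E = {ab, ac, bd, cd, de} given earlier.

```python
class Graph:
    def __init__(self):
        self.vertices = []   # array of vertex labels; the index identifies the vertex
        self.adjacency = []  # 2-D array: adjacency[i][j] == 1 means an edge i-j

    def add_vertex(self, label):
        """Add a vertex and return its index."""
        for row in self.adjacency:
            row.append(0)  # new column for the new vertex
        self.vertices.append(label)
        self.adjacency.append([0] * len(self.vertices))  # new row
        return len(self.vertices) - 1

    def add_edge(self, u, v):
        """Add an undirected edge between the vertices labelled u and v."""
        i = self.vertices.index(u)
        j = self.vertices.index(v)
        self.adjacency[i][j] = 1
        self.adjacency[j][i] = 1

    def display_vertex(self, index):
        """Return the label stored at the given index."""
        return self.vertices[index]

    def neighbours(self, label):
        """Return the labels of all vertices adjacent to the given vertex."""
        i = self.vertices.index(label)
        return [self.vertices[j] for j, flag in enumerate(self.adjacency[i]) if flag]


def build_example():
    """Build the graph with V = {a, b, c, d, e} and E = {ab, ac, bd, cd, de}."""
    graph = Graph()
    for label in "abcde":
        graph.add_vertex(label)
    for u, v in ["ab", "ac", "bd", "cd", "de"]:
        graph.add_edge(u, v)
    return graph
```

The matrix needs one cell per pair of vertices regardless of how many edges exist, which is simple but wasteful for graphs with few edges.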

Algorithms Basics

Algorithm is a step by step procedure, which defines a set of instructions to be executed in certain order to get the desired output. Algorithms are generally created independent

Algorithm Analysis

An algorithm is said to be efficient and fast, if it takes less time to execute and consumes less memory space. The performance of an algorithm is measured on the basis of following properties:

  1. Time Complexity
  2. Space Complexity

Suppose X is an algorithm and n is the size of input data, the time and space used by the Algorithm X are the two main factors which decide the efficiency of X.

  • Time Factor − The time is measured by counting the number of key operations, such as comparisons in a sorting algorithm.
  • Space Factor − The space is measured by counting the maximum memory space required by the algorithm.

The complexity of an algorithm f(n) gives the running time and / or storage space required by the algorithm in terms of n as the size of input data.
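Counting key operations can be illustrated with the two search algorithms named in the unit outline. The sketch below is an assumption-laden illustration: the function names are hypothetical, and one comparison is counted per element examined (each probe of binary search is counted as a single comparison).

```python
def linear_search_count(items, target):
    """Return (index, comparisons) for a left-to-right scan; index is -1 if absent."""
    comparisons = 0
    for index, item in enumerate(items):
        comparisons += 1
        if item == target:
            return index, comparisons
    return -1, comparisons


def binary_search_count(sorted_items, target):
    """Return (index, comparisons) for binary search on a sorted list."""
    comparisons = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1  # one probe of the middle element
        if sorted_items[mid] == target:
            return mid, comparisons
        if sorted_items[mid] < target:
            low = mid + 1   # discard the left half
        else:
            high = mid - 1  # discard the right half
    return -1, comparisons
```

For a sorted list of n = 1024 values, searching for the last one costs 1024 comparisons with the linear scan, while binary search needs no more than 11 probes, since the remaining range is halved each time.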

Space Complexity

Space complexity of an algorithm represents the amount of memory space required by the algorithm over its life cycle, that is, during the course of its execution. Space complexity must be taken seriously for multi-user systems and in situations where limited memory is available.

Space required by an algorithm is equal to the sum of the following two components −

  • A fixed part, which is the space required to store certain data and variables that are independent of the size of the problem, for example simple variables and constants used, program size, etc.
  • A variable part, which is the space required by variables whose size depends on the size of the problem, for example dynamic memory allocation, recursion stack space, etc.

An algorithm generally requires space for following components:

  • Instruction Space: It is the space required to store the executable version of the program. This space is fixed for a given program, but varies from program to program depending on the number of lines of code.
  • Data Space: It is the space required to store all the constants and variables value.
  • Environment Space: It is the space required to store the environment information needed to resume the suspended function.

The space complexity S(P) of any algorithm P is S(P) = C + S_P(I), where C is the fixed part and S_P(I) is the variable part of the algorithm, which depends on the instance characteristic I.
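As a hypothetical illustration of the fixed and variable parts (this is not the example from the original notes), compare two ways of summing n values. The function names and the `depth` parameter are assumptions introduced only to make the recursion stack visible.

```python
def sum_iterative(values):
    """Fixed part only: one accumulator and one loop variable, whatever n is."""
    total = 0
    for value in values:
        total += value
    return total


def sum_recursive(values, n, depth=1):
    """Return (sum of the first n values, deepest recursion level reached).

    Each call waits for the call below it, so the recursion stack holds
    one frame per element: the variable part grows with n.
    """
    if n == 0:
        return 0, depth
    partial, deepest = sum_recursive(values, n - 1, depth + 1)
    return partial + values[n - 1], deepest
```

Both functions compute the same sum, but the iterative version uses a constant amount of extra space, while the recursive version reaches a stack depth of n + 1 frames.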

Asymptotic Notations

The main idea of asymptotic analysis is to have a measure of efficiency of algorithms that doesn’t depend on machine specific constants, and doesn’t require algorithms to be implemented and time taken by programs to be compared. Asymptotic notations are mathematical tools to represent time complexity of algorithms for asymptotic analysis. The following 3 asymptotic notations are mostly used to represent time complexity of algorithms.

1) Θ Notation:

The theta notation bounds a function from above and below, so it defines exact asymptotic behavior. A simple way to get the Theta notation of an expression is to drop the low-order terms and ignore the leading constants. For example, consider the following expression: 3n^3 + 6n^2 + 6000 = Θ(n^3)

Dropping lower-order terms is always fine because there will always be an n0 beyond which a Θ(n^3) function has higher values than a Θ(n^2) function, irrespective of the constants involved. For a given function g(n), we denote by Θ(g(n)) the following set of functions. Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 <= c1·g(n) <= f(n) <= c2·g(n) for all n >= n0}.
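To make the example concrete, recall that f(n) belongs to Θ(g(n)) when positive constants c1, c2 and n0 exist with c1·g(n) <= f(n) <= c2·g(n) for all n >= n0. One possible choice of constants for the expression above (many others would work equally well) follows from n^2 <= n^3 and 1 <= n^3 for n >= 1:

```latex
\[
3n^3 \;\le\; 3n^3 + 6n^2 + 6000 \;\le\; 3n^3 + 6n^3 + 6000\,n^3 \;=\; 6009\,n^3
\qquad \text{for all } n \ge 1 .
\]
```

So the definition is satisfied with c1 = 3, c2 = 6009 and n0 = 1. The constants are far from tight, but the definition only asks that some such constants exist.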

3) Ω Notation: Just as Big O notation provides an asymptotic upper bound on a function, Ω notation provides an asymptotic lower bound. Ω notation can be useful when we have a lower bound on the time complexity of an algorithm. Since the best-case performance of an algorithm is generally not useful, the Omega notation is the least used of the three notations. For a given function g(n), we denote by Ω(g(n)) the following set of functions. Ω(g(n)) = {f(n): there exist positive constants c and n0 such that 0 <= c·g(n) <= f(n) for all n >= n0}.

IT-1004 Design and Analysis of Algorithms


Mathematical Analysis of Recursive Algorithms

Many algorithms are recursive in nature. When we analyze them, we get a recurrence relation for time complexity. We get running time on an input of size n as a function of n and the running time on inputs of smaller sizes. For example in Merge Sort, to sort a given array, we divide it in two halves and recursively repeat the process for the two halves. Finally we merge the results. Time complexity of Merge Sort can be written as T(n) = 2T(n/2) + cn. There are many other algorithms like Binary Search, Tower of Hanoi, etc. There are mainly three ways for solving recurrences.

1) Substitution Method : We make a guess for the solution and then we use mathematical induction to prove the guess is correct or incorrect. For example consider the recurrence T(n) = 2T(n/2) + n We guess the solution as T(n) = O(nLogn). Now we use induction to prove our guess. We need to prove that T(n) <= cnLogn. We can assume that it is true for values smaller than n.

T(n) = 2T(n/2) + n <= 2(c(n/2)Log(n/2)) + n = cnLog(n/2) + n = cnLogn - cnLog2 + n = cnLogn - cn + n <= cnLogn (for c >= 1, with Log taken to base 2)
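The guess T(n) = O(nLogn) can also be checked numerically before proving it. The sketch below evaluates the recurrence directly, under two assumptions not stated in the notes: the base case is T(1) = 0 and n is a power of two. The function names `T` and `compare` are chosen for illustration.

```python
def T(n):
    """Evaluate T(n) = 2*T(n/2) + n with the assumed base case T(1) = 0.

    n is assumed to be a power of two so that n/2 is always an integer.
    """
    if n == 1:
        return 0
    return 2 * T(n // 2) + n


def compare(max_power):
    """Return rows (n, T(n), n*log2(n)) for n = 2, 4, ..., 2**max_power."""
    return [(2 ** k, T(2 ** k), (2 ** k) * k) for k in range(1, max_power + 1)]
```

Under these assumptions T(n) equals n·Log n exactly for every power of two, which agrees with the bound proved by substitution.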

2) Recurrence Tree Method: In this method, we draw a recurrence tree and calculate the time taken by every level of the tree. Finally, we sum the work done at all levels. To draw the recurrence tree, we start from the given recurrence and keep drawing until we find a pattern among the levels. The pattern is typically an arithmetic or geometric series. For example, consider the recurrence relation T(n) = T(n/4) + T(n/2) + cn^2

            cn^2
           /    \
      T(n/4)    T(n/2)

If we further break down the expressions T(n/4) and T(n/2), we get the following recursion tree.

                 cn^2
               /      \
       c(n^2)/16      c(n^2)/4
        /     \         /     \
   T(n/16)  T(n/8)  T(n/8)  T(n/4)

Breaking down further gives us the following.

                     cn^2
                  /        \
         c(n^2)/16          c(n^2)/4
          /      \           /      \
  c(n^2)/256  c(n^2)/64  c(n^2)/64  c(n^2)/16
     /  \       /  \       /  \       /  \