The Tree Data Model, Lecture notes of Data Structures and Algorithms

Massachusetts Institute of Technology (MIT), Data Structures and Algorithms

An introduction to the tree data model, a fundamental concept in computer science. These notes cover the basic terminology related to trees, the data structures used to represent trees in programs, recursive algorithms that operate on the nodes of a tree, and a method for making inductive proofs about trees. They also discuss binary trees, binary search trees, and priority queues, with examples and diagrams to illustrate the concepts.

Typology: Lecture notes

2022/2023

Uploaded on 05/11/2023

ekaram 🇺🇸

CHAPTER 5

The Tree Data Model

There are many situations in which information has a hierarchical or nested structure like that found in family trees or organization charts. The abstraction that models hierarchical structure is called a tree, and this data model is among the most fundamental in computer science. It is the model that underlies several programming languages, including Lisp.

Trees of various types appear in many of the chapters of this book. For instance, in Section 1.3 we saw how directories and files in some computer systems are organized into a tree structure. In Section 2.8 we used trees to show how lists are split recursively and then recombined in the merge sort algorithm. In Section 3.7 we used trees to illustrate how simple statements in a program can be combined to form progressively more complex statements.

5.1 What This Chapter Is About

The following themes form the major topics of this chapter:

The terms and concepts related to trees (Section 5.2).

The basic data structures used to represent trees in programs (Section 5.3).

Recursive algorithms that operate on the nodes of a tree (Section 5.4).

A method for making inductive proofs about trees, called structural induction, where we proceed from small trees to progressively larger ones (Section 5.5).

The binary tree, which is a variant of a tree in which nodes have two “slots” for children (Section 5.6).

The binary search tree, a data structure for maintaining a set of elements from which insertions and deletions are made (Sections 5.7 and 5.8).


The priority queue, which is a set to which elements can be added, but from which only the maximum element can be deleted at any one time. An efficient data structure, called a partially ordered tree, is introduced for implementing priority queues, and an O(n log n) algorithm, called heapsort, for sorting n elements is derived using a balanced partially ordered tree data structure, called a heap (Sections 5.9 and 5.10).

5.2 Basic Terminology

Trees are sets of points, called nodes, and lines, called edges. An edge connects two distinct nodes. To be a tree, a collection of nodes and edges must satisfy certain properties; Fig. 5.1 is an example of a tree.

1. In a tree, one node is distinguished and called the root. The root of a tree is generally drawn at the top. In Fig. 5.1, the root is n1.

2. Every node c other than the root is connected by an edge to some one other node p called the parent of c. We also call c a child of p. We draw the parent of a node above that node. For example, in Fig. 5.1, n1 is the parent of n2, n3, and n4, while n2 is the parent of n5 and n6. Said another way, n2, n3, and n4 are children of n1, while n5 and n6 are children of n2.

3. A tree is connected in the sense that if we start at any node n other than the root, move to the parent of n, to the parent of the parent of n, and so on, we eventually reach the root of the tree. For instance, starting at n7, we move to its parent, n4, and from there to n4's parent, which is the root, n1.

            n1
          /  |  \
        n2   n3   n4
       /  \        |
     n5    n6      n7

Fig. 5.1. Tree with seven nodes.
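The three conditions can be checked mechanically once a tree is written down. The following is a minimal sketch, assuming a hypothetical encoding in which nodes n1 through n7 of Fig. 5.1 are the indices 1 through 7 and an array parent records each node's parent, with 0 standing for "no parent". The names parent, walk_to_root, and looks_like_tree are illustrative and do not come from the text.

```c
#include <assert.h>

/* Assumed encoding of Fig. 5.1: parent[i] is the index of the parent
   of node ni, and 0 marks the root. Index 0 itself is unused. */
enum { NODES = 7 };
static const int parent[NODES + 1] = {0, 0, 1, 1, 1, 2, 2, 4};

/* Follow parent links upward from n. Return the node where the walk
   stops (the root), or 0 if no root is reached within NODES steps,
   which would indicate a cycle rather than a tree. */
int walk_to_root(int n)
{
    int steps;
    assert(n >= 1 && n <= NODES);
    for (steps = 0; steps < NODES; steps++) {
        if (parent[n] == 0)
            return n;
        n = parent[n];
    }
    return 0;
}

/* Check conditions 1 and 3: exactly one node has no parent, and every
   node reaches that node by following parent links. Condition 2 holds
   by construction, since the array stores one parent per node. */
int looks_like_tree(void)
{
    int n, roots = 0, root = 0;
    for (n = 1; n <= NODES; n++) {
        if (parent[n] == 0) {
            roots++;
            root = n;
        }
    }
    if (roots != 1)
        return 0;
    for (n = 1; n <= NODES; n++)
        if (walk_to_root(n) != root)
            return 0;
    return 1;
}
```

Calling walk_to_root(7) retraces the walk described in condition 3: from n7 to n4 and then to the root n1.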

An Equivalent Recursive Definition of Trees

It is also possible to define trees recursively with an inductive definition that constructs larger trees out of smaller ones.

BASIS. A single node n is a tree. We say that n is the root of this one-node tree.

INDUCTION. Let r be a new node and let T1, T2, ..., Tk be one or more trees with roots c1, c2, ..., ck, respectively. We require that no node appear more than once in the Ti's; and of course r, being a “new” node, cannot appear in any of these trees. We form a new tree T from r and T1, T2, ..., Tk as follows:

a) Make r the root of tree T.


If m1, m2, ..., mk is a path in a tree, node m1 is called an ancestor of mk and node mk a descendant of m1. If the path is of length 1 or more, then m1 is called a proper ancestor of mk and mk a proper descendant of m1. Again, remember that the case of a path of length 0 is possible, in which case the path lets us conclude that m1 is an ancestor of itself and a descendant of itself, although not a proper ancestor or descendant. The root is an ancestor of every node in a tree and every node is a descendant of the root.

Example 5.3. In Fig. 5.1, all seven nodes are descendants of n1, and n1 is an ancestor of all nodes. Also, all nodes but n1 itself are proper descendants of n1, and n1 is a proper ancestor of all nodes in the tree but itself. The ancestors of n5 are n5, n2, and n1. The descendants of n4 are n4 and n7.
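The ancestor relation of Example 5.3 can be tested by walking upward from the would-be descendant. This sketch assumes the same hypothetical parent-array encoding of Fig. 5.1 (indices 1 through 7, with 0 meaning "no parent"); the function names are illustrative.

```c
#include <assert.h>

/* Assumed encoding of Fig. 5.1: parent[i] is the parent of node ni,
   and 0 marks the root. */
enum { NODES = 7 };
static const int parent[NODES + 1] = {0, 0, 1, 1, 1, 2, 2, 4};

/* Return 1 if node a is an ancestor of node d. A path of length 0
   counts, so every node is an ancestor of itself. */
int is_ancestor(int a, int d)
{
    assert(a >= 1 && a <= NODES && d >= 1 && d <= NODES);
    while (d != 0) {
        if (d == a)
            return 1;
        d = parent[d];   /* step from d to its parent */
    }
    return 0;
}

/* A proper ancestor is an ancestor other than the node itself. */
int is_proper_ancestor(int a, int d)
{
    return a != d && is_ancestor(a, d);
}
```

Under this encoding, is_ancestor(5, 5) holds while is_proper_ancestor(5, 5) does not, matching the distinction drawn for paths of length 0.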

Nodes that have the same parent are sometimes called siblings. For example, in Fig. 5.1, nodes n2, n3, and n4 are siblings, and n5 and n6 are siblings.

Subtrees

In a tree T, a node n, together with all of its proper descendants, if any, is called a subtree of T. Node n is the root of this subtree. Notice that a subtree satisfies the three conditions for being a tree: it has a root, all other nodes in the subtree have a unique parent in the subtree, and by following parents from any node in the subtree, we eventually reach the root of the subtree.

Example 5.4. Referring again to Fig. 5.1, node n3 by itself is a subtree, since n3 has no descendants other than itself. As another example, nodes n2, n5, and n6 form a subtree, with root n2, since these nodes are exactly the descendants of n2. However, the two nodes n2 and n6 by themselves do not form a subtree without node n5. Finally, the entire tree of Fig. 5.1 is a subtree of itself, with root n1.

Leaves and Interior Nodes

A leaf is a node of a tree that has no children. An interior node is a node that has one or more children. Thus, every node of a tree is either a leaf or an interior node, but not both. The root of a tree is normally an interior node, but if the tree consists of only one node, then that node is both the root and a leaf.

Example 5.5. In Fig. 5.1, the leaves are n5, n6, n3, and n7. The nodes n1, n2, and n4 are interior.

Height and Depth

In a tree, the height of a node n is the length of a longest path from n to a leaf. The height of the tree is the height of the root. The depth, or level, of a node n is the length of the path from the root to n.


Example 5.6. In Fig. 5.1, node n1 has height 2, n2 has height 1, and leaf n3 has height 0. In fact, any leaf has height 0. The tree in Fig. 5.1 has height 2. The depth of n1 is 0, the depth of n2 is 1, and the depth of n5 is 2.
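Height and depth both reduce to counting edges, one downward and one upward. A minimal sketch, again assuming the hypothetical parent-array encoding of Fig. 5.1 (indices 1 through 7, with 0 meaning "no parent"). The height function scans the whole array for children at each call, which is simple rather than fast.

```c
#include <assert.h>

/* Assumed encoding of Fig. 5.1: parent[i] is the parent of node ni,
   and 0 marks the root. */
enum { NODES = 7 };
static const int parent[NODES + 1] = {0, 0, 1, 1, 1, 2, 2, 4};

/* Depth: the length of the path from the root down to n. */
int depth(int n)
{
    int d = 0;
    assert(n >= 1 && n <= NODES);
    while (parent[n] != 0) {
        n = parent[n];
        d++;
    }
    return d;
}

/* Height: the length of a longest path from n to a leaf. A leaf has
   no children, so the loop finds none and the result is 0. */
int height(int n)
{
    int c, h = 0;
    assert(n >= 1 && n <= NODES);
    for (c = 1; c <= NODES; c++) {
        if (parent[c] == n) {          /* c is a child of n */
            int hc = 1 + height(c);
            if (hc > h)
                h = hc;
        }
    }
    return h;
}
```

The values of Example 5.6 follow directly: height(1) is 2, height(3) is 0, and depth(5) is 2.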

Ordered Trees

Optionally, we can assign a left-to-right order to the children of any node. For example, the order of the children of n1 in Fig. 5.1 is n2 leftmost, then n3, then n4. This left-to-right ordering can be extended to order all the nodes in a tree. If m and n are siblings and m is to the left of n, then all of m's descendants are to the left of all of n's descendants.

Example 5.7. In Fig. 5.1, the nodes of the subtree rooted at n2 (that is, n2, n5, and n6) are all to the left of the nodes of the subtrees rooted at n3 and n4. Thus, n2, n5, and n6 are all to the left of n3, n4, and n7.

In a tree, take any two nodes x and y neither of which is an ancestor of the other. As a consequence of the definition of “to the left,” one of x and y will be to the left of the other. To tell which, follow the paths from x and y toward the root. At some point, perhaps at the root, perhaps lower, the paths will meet at some node z as suggested by Fig. 5.2. The paths from x and y reach z from two different nodes m and n, respectively; it is possible that m = x and/or n = y, but it must be that m ≠ n, or else the paths would have converged somewhere below z.

        z
       / \
      m   n
      :   :
      x   y

Fig. 5.2. Node x is to the left of node y.

Suppose m is to the left of n. Then since x is in the subtree rooted at m and y is in the subtree rooted at n, it follows that x is to the left of y. Similarly, if m were to the right of n, then x would be to the right of y.

Example 5.8. Since no leaf can be an ancestor of another leaf, it follows that all leaves can be ordered “from the left.” For instance, the order of the leaves in Fig. 5.1 is n5, n6, n3, n7.
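The leaf order of Example 5.8 falls out of visiting each node's children from left to right and recording the nodes that have no children. The sketch below assumes the hypothetical parent-array encoding of Fig. 5.1 and, as a further assumption, that siblings with smaller indices lie to the left of siblings with larger indices, which happens to hold for this figure.

```c
#include <assert.h>

/* Assumed encoding of Fig. 5.1: parent[i] is the parent of node ni,
   0 marks the root, and siblings are ordered left to right by index. */
enum { NODES = 7 };
static const int parent[NODES + 1] = {0, 0, 1, 1, 1, 2, 2, 4};

/* Append the leaves of the subtree rooted at n to out[], from the
   left, starting at position count. Return the new count. */
int collect_leaves(int n, int out[], int count)
{
    int c, has_child = 0;
    assert(n >= 1 && n <= NODES);
    for (c = 1; c <= NODES; c++) {
        if (parent[c] == n) {          /* visit children left to right */
            has_child = 1;
            count = collect_leaves(c, out, count);
        }
    }
    if (!has_child)
        out[count++] = n;              /* n is a leaf */
    return count;
}
```

Starting from the root with an empty array, collect_leaves(1, out, 0) fills out with 5, 6, 3, 7, the order stated in the example.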


INDUCTION. If E1 and E2 are expressions represented by trees T1 and T2, respectively, then the expression (E1 + E2) is represented by the tree of Fig. 5.3(a), whose root is labeled +. This root has two children, which are the roots of T1 and T2, respectively, in that order. Similarly, the expressions (E1 − E2), (E1 × E2), and (E1/E2) have expression trees with roots labeled −, ×, and /, respectively, and subtrees T1 and T2. Finally, we may apply the unary minus operator to one expression, E1. We introduce a root labeled −, and its one child is the root of T1; the tree for (−E1) is shown in Fig. 5.3(b).

Example 5.9. In Example 2.17 we discussed the recursive construction of a sequence of six expressions from the basis and inductive rules. These expressions, listed in Fig. 2.16, were

i) x
ii) 10
iii) (x + 10)
iv) (−(x + 10))
v) y
vi) (y × (−(x + 10)))

Expressions (i), (ii), and (v) are single operands, and so the basis rule tells us that the trees of Fig. 5.4(a), (b), and (e), respectively, represent these expressions. Note that each of these trees consists of a single node to which we have given a name (n1, n2, and n5, respectively) and a label, which is the operand in the circle.

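(a) For x: a single node n1, labeled x.
(b) For 10: a single node n2, labeled 10.
(c) For (x + 10): a root labeled +, with children n1 and n2.
(d) For (−(x + 10)): a root n4 labeled −, whose one child is the root of tree (c).
(e) For y: a single node n5, labeled y.
(f) For (y × (−(x + 10))): a root n6 labeled ×, with children n5 and n4.

Fig. 5.4. Construction of expression trees.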

Expression (iii) is formed by applying the operator + to the operands x and 10, and so we see in Fig. 5.4(c) the tree for this expression, with root labeled +, and the roots of the trees in Fig. 5.4(a) and (b) as its children. Expression (iv) is formed by applying unary − to expression (iii), so that the tree for (−(x + 10)), shown in Fig. 5.4(d), has root labeled − above the tree for (x + 10). Finally, the tree for the expression (y × (−(x + 10))), shown in Fig. 5.4(f), has a root labeled ×, whose children are the roots of the trees of Fig. 5.4(e) and (d), in that order.
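The bottom-up construction of Example 5.9 can be mirrored in a few lines of C. This is a sketch under stated assumptions: the node type ENODE, with a string label and at most two child pointers, and the helpers make, unparse, and build_example are all hypothetical, and the operators − and × are written with the ASCII characters - and *.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type for this sketch: a label and up to two
   children. A leaf has no children; unary minus uses only `left`. */
typedef struct ENODE *pENODE;
struct ENODE {
    const char *label;
    pENODE left, right;
};

/* Allocate a node with the given label and children. */
pENODE make(const char *label, pENODE left, pENODE right)
{
    pENODE n = malloc(sizeof(struct ENODE));
    assert(n != NULL);
    n->label = label;
    n->left = left;
    n->right = right;
    return n;
}

/* Append the fully parenthesized expression for tree t to buf, which
   must be a NUL-terminated string with enough room. */
void unparse(pENODE t, char *buf)
{
    if (t->left == NULL) {                 /* basis: a single operand */
        strcat(buf, t->label);
    } else if (t->right == NULL) {         /* unary minus: (-E1) */
        strcat(buf, "(");
        strcat(buf, t->label);
        unparse(t->left, buf);
        strcat(buf, ")");
    } else {                               /* binary: (E1 op E2) */
        strcat(buf, "(");
        unparse(t->left, buf);
        strcat(buf, " ");
        strcat(buf, t->label);
        strcat(buf, " ");
        unparse(t->right, buf);
        strcat(buf, ")");
    }
}

/* Build the six trees of Example 5.9 in order; return the last one. */
pENODE build_example(void)
{
    pENODE x   = make("x", NULL, NULL);    /* (i)   x                 */
    pENODE ten = make("10", NULL, NULL);   /* (ii)  10                */
    pENODE sum = make("+", x, ten);        /* (iii) (x + 10)          */
    pENODE neg = make("-", sum, NULL);     /* (iv)  (-(x + 10))       */
    pENODE y   = make("y", NULL, NULL);    /* (v)   y                 */
    return make("*", y, neg);              /* (vi)  (y * (-(x + 10))) */
}
```

Each call to make applies either the basis rule (no children) or an inductive rule (one or two existing trees become subtrees), so smaller trees are shared by the larger ones exactly as in Fig. 5.4.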

Fig. 5.5. Tree for Exercise 5.2.1.

EXERCISES

5.2.1: In Fig. 5.5 we see a tree. Tell what is described by each of the following phrases:

a) The root of the tree
b) The leaves of the tree
c) The interior nodes of the tree
d) The siblings of node 6
e) The subtree with root 5
f) The ancestors of node 10
g) The descendants of node 10
h) The nodes to the left of node 10
i) The nodes to the right of node 10
j) The longest path in the tree
k) The height of node 3
l) The depth of node 13
m) The height of the tree

5.2.2: Can a leaf in a tree ever have any (a) descendants? (b) proper descendants?

5.2.3: Prove that in a tree no leaf can be an ancestor of another leaf.


One distinction in representations concerns where the structures for the nodes “live” in the memory of the computer. In C, we can create the space for structures for nodes by using the function malloc from the standard library stdlib.h, in which case nodes “float” in memory and are accessible only through pointers. Alternatively, we can create an array of structures and use elements of the array to represent nodes. Again nodes can be linked according to their position in the tree, but it is also possible to visit nodes by walking down the array. We can thus access nodes without following a path through the tree. The disadvantage of an array-based representation is that we cannot create more nodes than there are elements in the array. In what follows, we shall assume that nodes are created by malloc, although in situations where there is a limit on how large trees can grow, an array of structures of the same type is a viable, and possibly preferred, alternative.

Array-of-Pointers Representation of Trees

One of the simplest ways of representing a tree is to use for each node a structure consisting of a field or fields for the label of the node, followed by an array of pointers to the children of that node. Such a structure is suggested by Fig. 5.6. The constant bf is the size of the array of pointers. It represents the maximum number of children a node can have, a quantity known as the branching factor. The ith component of the array at a node contains a pointer to the ith child of that node. A missing child can be represented by a NULL pointer.

| info | p0 | p1 | ··· | p(bf−1) |

Fig. 5.6. Node represented by an array of pointers.

In C this data structure can be represented by the type declaration

    typedef struct NODE *pNODE;
    struct NODE {
        int info;
        pNODE children[BF];
    };

Here, the field info represents the information that constitutes the label of a node and BF is the constant defined to be the branching factor. We shall see many variants of this declaration throughout this chapter.

In this and most other data structures for trees, we represent a tree by a pointer to the root node. Thus, pNODE also serves as the type of a tree. We could, in fact, use the type TREE in place of pNODE, and we shall adopt that convention when we talk about binary trees starting in Section 5.6. However, for the moment, we shall use the name pNODE for the type “pointer to node,” since in some data structures, pointers to nodes are used for other purposes besides representing trees.

The array-of-pointers representation allows us to access the ith child of any node in O(1) time. This representation, however, is very wasteful of space when only a few nodes in the tree have many children. In this case, most of the pointers in the arrays will be NULL.
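To see the declaration in use, the sketch below builds the tree of Fig. 5.1 with nodes created by malloc. It assumes BF is 3, which is enough for that figure, and labels node ni with the integer i; the helpers make_node, build_fig_5_1, and count_children are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

#define BF 3   /* branching factor assumed for this sketch */

typedef struct NODE *pNODE;
struct NODE {
    int info;
    pNODE children[BF];
};

/* Allocate a node with the given label and no children. */
pNODE make_node(int info)
{
    int i;
    pNODE n = malloc(sizeof(struct NODE));
    assert(n != NULL);
    n->info = info;
    for (i = 0; i < BF; i++)
        n->children[i] = NULL;     /* every child is missing at first */
    return n;
}

/* Build the tree of Fig. 5.1, labeling node ni with the integer i.
   Return a pointer to the root, which represents the whole tree. */
pNODE build_fig_5_1(void)
{
    pNODE n[8];
    int i;
    for (i = 1; i <= 7; i++)
        n[i] = make_node(i);
    n[1]->children[0] = n[2];
    n[1]->children[1] = n[3];
    n[1]->children[2] = n[4];
    n[2]->children[0] = n[5];
    n[2]->children[1] = n[6];
    n[4]->children[0] = n[7];
    return n[1];
}

/* Count the non-NULL child pointers of node n. */
int count_children(pNODE n)
{
    int i, count = 0;
    for (i = 0; i < BF; i++)
        if (n->children[i] != NULL)
            count++;
    return count;
}
```

With t = build_fig_5_1(), the expression t->children[2] reaches n4 in one step, without visiting n2 or n3, which is the O(1) access the representation offers. Of the 21 pointer slots in the seven nodes, only 6 are non-NULL.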


Try to Remember Trie

The term “trie” comes from the middle of the word “retrieval.” It was originally intended to be pronounced “tree.” Fortunately, common parlance has switched to the distinguishing pronunciation “try.”

Example 5.10. A tree can be used to represent a collection of words in a way that makes it quite efficient to check whether a given sequence of characters is a valid word. In this type of tree, called a trie, each node except the root has an associated letter. The string of characters represented by a node n is the sequence of letters along the path from the root to n. Given a set of words, the trie consists of nodes for exactly those strings of characters that are prefixes of some word in the set. The label of a node consists of the letter represented by the node and also a Boolean telling whether or not the string from the root to that node forms a complete word; we shall use for the Boolean the integer 1 if so and 0 if not.[1]

For instance, suppose our “dictionary” consists of the four words he, hers, his, she. A trie for these words is shown in Fig. 5.7. To determine whether the word he is in the set, we start at the root n1, move to the child n2 labeled h, and then from that node move to its child n4 labeled e. Since these nodes all exist in the tree, and n4 has 1 as part of its label, we conclude that he is in the set.

n1: 0 (root)
    n2: h 0
        n4: e 1
            n7: r 0
                n10: s 1
        n5: i 0
            n8: s 1
    n3: s 0
        n6: h 0
            n9: e 1

Fig. 5.7. Trie for words he, hers, his, and she. (Each child is indented under its parent.)

As another example, suppose we want to determine whether him is in the set. We follow the path from the root to n2 to n5, which represents the prefix hi; but at n5 we find no child corresponding to the letter m. We conclude that him is not in the set. Finally, if we search for the word her, we find our way from the root to node n7. That node exists but does not have a 1. We therefore conclude that her is not in the set, although it is a proper prefix of a word, hers, in the set.

Nodes in a trie have a branching factor equal to the number of different characters in the alphabet from which the words are formed. For example, if we do not

[1] In the previous section we acted as if the label was a single value. However, values can be of any type, and labels can be structures consisting of two or more fields. In this case, the label has one field that is a letter and a second that is an integer that is either 0 or 1.
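The searches for he, him, and her described in Example 5.10 can be sketched as follows. The node type here is an assumption: it combines the two label fields (named letter and isWord, as in the exercises) with an array of 26 child pointers, one per lowercase letter. The functions make_node, insert, and lookup are hypothetical helpers, and words are assumed to contain only the letters a through z.

```c
#include <assert.h>
#include <stdlib.h>

#define BF 26   /* one pointer per lowercase letter; an assumption */

typedef struct NODE *pNODE;
struct NODE {
    char letter;
    int isWord;
    pNODE children[BF];
};

/* Allocate a trie node for the given letter, with no children. */
pNODE make_node(char letter)
{
    int i;
    pNODE n = malloc(sizeof(struct NODE));
    assert(n != NULL);
    n->letter = letter;
    n->isWord = 0;
    for (i = 0; i < BF; i++)
        n->children[i] = NULL;
    return n;
}

/* Add word (lowercase a-z only) to the trie rooted at root, creating
   a node for each prefix that does not yet have one. */
void insert(pNODE root, const char *word)
{
    pNODE n = root;
    for (; *word != '\0'; word++) {
        int i = *word - 'a';
        assert(i >= 0 && i < BF);
        if (n->children[i] == NULL)
            n->children[i] = make_node(*word);
        n = n->children[i];
    }
    n->isWord = 1;   /* the path from the root spells a complete word */
}

/* Return 1 if word is in the set, 0 otherwise. */
int lookup(pNODE root, const char *word)
{
    pNODE n = root;
    for (; *word != '\0'; word++) {
        int i = *word - 'a';
        if (i < 0 || i >= BF || n->children[i] == NULL)
            return 0;            /* no child for this letter */
        n = n->children[i];
    }
    return n->isWord;            /* the node exists; is it a word? */
}
```

After inserting he, hers, his, and she into a root created with make_node('\0'), lookup fails on him because a child is missing, and fails on her because the node reached has isWord equal to 0. Each step costs O(1) regardless of how many children a node has.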


Leftmost-Child–Right-Sibling Representation of Trees

Using arrays of pointers for nodes is not necessarily space-efficient, because in typical cases, the great majority of pointers will be NULL. That is certainly the case in Fig. 5.9, where no node has more than two non-NULL pointers. In fact, if we think about it, we see that any trie based on a 26-letter alphabet will have 26 times as many spaces for pointers as there are nodes. Since no node can have two parents and the root has no parent at all, it follows that among N nodes there are only N − 1 non-NULL pointers; that is, fewer than one out of 26 pointers is useful.

One way to overcome the space inefficiency of the array-of-pointers representation of a tree is to use linked lists to represent the children of nodes. The space occupied by a linked list for a node is proportional to the number of children of that node. There is, however, a time penalty with this representation; accessing the ith child takes O(i) time, because we must traverse a list of length i − 1 to get to the ith node. In comparison, we can get to the ith child in O(1) time, independent of i, using an array of pointers to the children.

In the representation of trees called leftmost-child–right-sibling, we put into each node a pointer only to its leftmost child; a node does not have pointers to any of its other children. To find the second and subsequent children of a node n, we create a linked list of those children, with each child c pointing to the child of n immediately to the right of c. That node is called the right sibling of c.

Example 5.11. In Fig. 5.1, n3 is the right sibling of n2, n4 is the right sibling of n3, and n4 has no right sibling. We would find the children of n1 by following its leftmost-child pointer to n2, then the right-sibling pointer to n3, and then the right-sibling pointer of n3 to n4. There, we would find a NULL right-sibling pointer and know that n1 has no more children.

Figure 5.10 contains a sketch of the leftmost-child–right-sibling representation for the tree in Fig. 5.1. The downward arrows are the leftmost-child links; the sideways arrows are the right-sibling links.

n1
|
v
n2 ---> n3 ---> n4
|               |
v               v
n5 ---> n6      n7

Fig. 5.10. Leftmost-child–right-sibling representation for the tree in Fig. 5.1.

In a leftmost-child–right-sibling representation of a tree, nodes can be defined as follows:

    typedef struct NODE *pNODE;
    struct NODE {
        int info;
        pNODE leftmostChild, rightSibling;
    };

The field info holds the label associated with the node and it can have any type. The fields leftmostChild and rightSibling point to the leftmost child and right sibling of the node in question. Note that while leftmostChild gives information about the node itself, the field rightSibling at a node is really part of the linked list of children of that node’s parent.
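The walk through the children of n1 described in Example 5.11 can be sketched with this declaration. The helpers make_node, add_child, ith_child, and build_fig_5_1 are hypothetical, and node ni is labeled with the integer i as an assumption.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct NODE *pNODE;
struct NODE {
    int info;
    pNODE leftmostChild, rightSibling;
};

/* Allocate a node with no children and no right sibling. */
pNODE make_node(int info)
{
    pNODE n = malloc(sizeof(struct NODE));
    assert(n != NULL);
    n->info = info;
    n->leftmostChild = NULL;
    n->rightSibling = NULL;
    return n;
}

/* Make c the new rightmost child of p. */
void add_child(pNODE p, pNODE c)
{
    if (p->leftmostChild == NULL) {
        p->leftmostChild = c;
    } else {
        pNODE s = p->leftmostChild;
        while (s->rightSibling != NULL)   /* find the last sibling */
            s = s->rightSibling;
        s->rightSibling = c;
    }
}

/* Return the ith child of p (i = 1 is the leftmost), or NULL if p has
   fewer than i children. Follows i - 1 sibling links: O(i) time. */
pNODE ith_child(pNODE p, int i)
{
    pNODE c = p->leftmostChild;
    assert(i >= 1);
    while (c != NULL && i > 1) {
        c = c->rightSibling;
        i--;
    }
    return c;
}

/* Build the tree of Fig. 5.1, labeling node ni with the integer i. */
pNODE build_fig_5_1(void)
{
    pNODE n[8];
    int i;
    for (i = 1; i <= 7; i++)
        n[i] = make_node(i);
    add_child(n[1], n[2]);
    add_child(n[1], n[3]);
    add_child(n[1], n[4]);
    add_child(n[2], n[5]);
    add_child(n[2], n[6]);
    add_child(n[4], n[7]);
    return n[1];
}
```

Here ith_child(t, 3) on the root follows the leftmost-child pointer to n2 and then two right-sibling pointers to reach n4, and ith_child(t, 4) returns NULL because n4 has no right sibling.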

Example 5.12. Let us represent the trie of Fig. 5.7 in the leftmost-child–right-sibling form. First, the type of nodes is

    typedef struct NODE *pNODE;
    struct NODE {
        char letter;
        int isWord;
        pNODE leftmostChild, rightSibling;
    };

The first two fields represent information, according to the scheme described in Example 5.10. The trie of Fig. 5.7 is represented by the data structure shown in Fig. 5.11. Notice that each leaf has a NULL leftmost-child pointer, and each rightmost child has a NULL right-sibling pointer.

0
|
v
h 0 -----------> s 0
|                |
v                v
e 1 ---> i 0     h 0
|        |       |
v        v       v
r 0      s 1     e 1
|
v
s 1

Fig. 5.11. Leftmost-child–right-sibling representation for the trie of Fig. 5.7.
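Searching a trie stored this way means scanning a list of siblings at each level instead of indexing an array. A minimal sketch using the node type of Example 5.12; the helpers make_node, find_child, insert, and lookup are hypothetical. New children are appended at the right end of the sibling list, so inserting he, hers, his, she in that order reproduces the layout of Fig. 5.11.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct NODE *pNODE;
struct NODE {
    char letter;
    int isWord;
    pNODE leftmostChild, rightSibling;
};

/* Allocate a trie node for the given letter, with no links. */
pNODE make_node(char letter)
{
    pNODE n = malloc(sizeof(struct NODE));
    assert(n != NULL);
    n->letter = letter;
    n->isWord = 0;
    n->leftmostChild = NULL;
    n->rightSibling = NULL;
    return n;
}

/* Return the child of n labeled with letter, or NULL if none exists. */
pNODE find_child(pNODE n, char letter)
{
    pNODE c;
    for (c = n->leftmostChild; c != NULL; c = c->rightSibling)
        if (c->letter == letter)
            return c;
    return NULL;
}

/* Add word to the trie rooted at root. */
void insert(pNODE root, const char *word)
{
    pNODE n = root;
    for (; *word != '\0'; word++) {
        pNODE c = find_child(n, *word);
        if (c == NULL) {
            /* Walk to the NULL link that ends n's list of children
               and hang the new node there. */
            pNODE *link = &n->leftmostChild;
            while (*link != NULL)
                link = &(*link)->rightSibling;
            c = make_node(*word);
            *link = c;
        }
        n = c;
    }
    n->isWord = 1;
}

/* Return 1 if word is in the set, 0 otherwise. */
int lookup(pNODE root, const char *word)
{
    pNODE n = root;
    for (; n != NULL && *word != '\0'; word++)
        n = find_child(n, *word);
    return n != NULL && n->isWord;
}
```

Each node now carries two pointers instead of 26, at the price of a scan through up to 26 siblings per letter in find_child.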


Comparison of Tree Representations

We summarize the relative merits of the array-of-pointers (trie) and the leftmost-child–right-sibling representations for trees:

The array-of-pointers representation offers faster access to children, requiring O(1) time to reach any child, no matter how many children there are.

The leftmost-child–right-sibling representation uses less space. For instance, in our running example of the trie of Fig. 5.7, each node contains 26 pointers in the array representation and two pointers in the leftmost-child–right-sibling representation.

The leftmost-child–right-sibling representation does not require that there be a limit on the branching factor of nodes. We can represent trees with any branching factor, without changing the data structure. However, if we use the array-of-pointers representation, once we choose the size of the array, we cannot represent a tree with a larger branching factor.
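The space side of this comparison is simple arithmetic. The sketch below counts pointer slots rather than bytes, so it assumes nothing about the size of a pointer; the function names are hypothetical.

```c
#include <assert.h>

/* Pointer slots occupied by n nodes in the array-of-pointers
   representation with branching factor bf. */
long array_slots(long n, long bf)
{
    assert(n >= 1 && bf >= 1);
    return n * bf;
}

/* Pointer slots occupied by n nodes when each node holds only a
   leftmost-child pointer and a right-sibling pointer. */
long lcrs_slots(long n)
{
    assert(n >= 1);
    return 2 * n;
}

/* In either representation every node except the root is the target
   of exactly one pointer, so n nodes use n - 1 non-NULL pointers. */
long non_null_pointers(long n)
{
    assert(n >= 1);
    return n - 1;
}
```

For the 10-node trie of Fig. 5.7, array_slots(10, 26) is 260 and lcrs_slots(10) is 20, while non_null_pointers(10) is 9 in both cases. The array form therefore holds 251 NULL pointers and the leftmost-child–right-sibling form holds 11.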

5.3.2: Represent the tree of Fig. 5.

a) As a trie with branching factor 3
b) By leftmost-child and right-sibling pointers

How many bytes of memory are required by each representation?

5.3.3: Consider the following set of singular personal pronouns in English: I, my, mine, me, you, your, yours, he, his, him, she, her, hers. Augment the trie of Fig. 5.7 to include all thirteen of these words.

5.3.4: Suppose that a complete dictionary of English contains 2,000,000 words and that the number of prefixes of words — that is, strings of letters that can be extended at the end by zero or more additional letters to form a word — is 10,000,000.

a) How many nodes would a trie for this dictionary have?

b) Suppose that we use the structure in Example 5.10 to represent nodes. Let pointers require four bytes, and suppose that the information fields letter and isWord each take one byte. How many bytes would the trie require?

c) Of the space calculated in part (b), how much is taken up by NULL pointers?

5.3.5: Suppose we represent the dictionary described in Exercise 5.3.4 by using the structure of Example 5.12 (a leftmost-child–right-sibling representation). Under the same assumptions about space required by pointers and information fields as in Exercise 5.3.4(b), how much space does the tree for the dictionary require? What portion of that space is NULL pointers?

5.3.6: In a tree, a node c is the lowest common ancestor of nodes x and y if c is an ancestor of both x and y, and no proper descendant of c is an ancestor of x and y. Write a program that will find the lowest common ancestor of any pair of nodes in a given tree. What is a good data structure for trees in such a program?


5.4 Recursions on Trees

The usefulness of trees is highlighted by the number of recursive operations on trees that can be written naturally and cleanly. Figure 5.13 suggests the general form of a recursive function F(n) that takes a node n of a tree as argument. F first performs some steps (perhaps none), which we represent by action A0. Then F calls itself on the first child, c1, of n. During this recursive call, F will “explore” the subtree rooted at c1, doing whatever it is F does to a tree. When that call returns to the call at node n, some other action, say A1, is performed. Then F is called on the second child of n, resulting in exploration of the second subtree, and so on, with actions at n alternating with calls to F on the children of n.

(a) General form of a tree: a node n with children c1, c2, ..., ck.

    F(n)
    {
        action A0;
        F(c1);
        action A1;
        F(c2);
        action A2;
        ···
        F(ck);
        action Ak;
    }

(b) General form of recursive function F(n) on a tree.

Fig. 5.13. A recursive function on a tree.

Example 5.13. A simple recursion on a tree produces what is known as the preorder listing of the node labels of the tree. Here, action A0 prints the label of the node, and the other actions do nothing other than some “bookkeeping” operations that enable us to visit each child of a given node. The effect is to print the labels as we would first meet them if we started at the root and circumnavigated the tree, visiting all the nodes in a counterclockwise tour. Note that we print the label of a node only the first time we visit that node. The circumnavigation is suggested by the arrow in Fig. 5.14, and the order in which the nodes are visited is +a + ∗ − b − c − ∗d ∗ +. The preorder listing is the sequence of node labels +a ∗ −bcd.

Let us suppose that we use a leftmost-child–right-sibling representation of nodes in an expression tree, with labels consisting of a single character. The label of an interior node is the arithmetic operator at that node, and the label of a leaf is


for i ≥ 1, consists of line (5), which moves c through the children of n, and the test at line (3) to see whether we have exhausted the children. These actions are for bookkeeping only; in comparison, line (1) in action A 0 does the significant step, printing the label. The sequence of events for calling preorder on the root of the tree in Fig. 5. is summarized in Fig. 5.16. The character at the left of each line is the label of the node n at which the call of preorder(n) is currently being executed. Because no two nodes have the same label, it is convenient here to use the label of a node as its name. Notice that the characters printed are +a ∗ −bcd, in that order, which is the same as the order of circumnavigation.

    call preorder(+)
    (+) print +
    (+) call preorder(a)
    (a) print a
    (+) call preorder(∗)
    (∗) print ∗
    (∗) call preorder(−)
    (−) print −
    (−) call preorder(b)
    (b) print b
    (−) call preorder(c)
    (c) print c
    (∗) call preorder(d)
    (d) print d

Fig. 5.16. Action of recursive function preorder on tree of Fig. 5.14.
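The trace can be reproduced with a short program. This is a sketch rather than the function the trace was generated from: it assumes a leftmost-child–right-sibling node whose label field is named nodeLabel, as in Fig. 5.17, and it appends labels to a string instead of printing them so that the result can be inspected. The tree built by build_fig_5_14 is inferred from the visiting order given in Example 5.13, with root +, children a and ∗, the children − and d under ∗, and the children b and c under −. The ASCII characters * and - stand in for ∗ and −.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct NODE *pNODE;
struct NODE {
    char nodeLabel;
    pNODE leftmostChild, rightSibling;
};

/* Allocate a node and set its label and both links. */
pNODE make_node(char label, pNODE leftmostChild, pNODE rightSibling)
{
    pNODE n = malloc(sizeof(struct NODE));
    assert(n != NULL);
    n->nodeLabel = label;
    n->leftmostChild = leftmostChild;
    n->rightSibling = rightSibling;
    return n;
}

/* Append the preorder listing of the subtree rooted at n to out, a
   NUL-terminated string with room for one character per node. */
void preorder(pNODE n, char *out)
{
    pNODE c;
    size_t len = strlen(out);
    out[len] = n->nodeLabel;      /* action A0: list the label first */
    out[len + 1] = '\0';
    c = n->leftmostChild;
    while (c != NULL) {           /* bookkeeping: step through children */
        preorder(c, out);
        c = c->rightSibling;
    }
}

/* Build the tree assumed for Fig. 5.14, i.e. a + (b - c) * d. */
pNODE build_fig_5_14(void)
{
    pNODE d = make_node('d', NULL, NULL);
    pNODE c = make_node('c', NULL, NULL);
    pNODE b = make_node('b', NULL, c);
    pNODE minus = make_node('-', b, d);
    pNODE times = make_node('*', minus, NULL);
    pNODE a = make_node('a', NULL, times);
    return make_node('+', a, NULL);
}
```

Starting with an empty string, preorder(build_fig_5_14(), out) leaves "+a*-bcd" in out, the same sequence of labels as the print steps in the trace.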

Example 5.14. Another common way to order the nodes of the tree, called postorder, corresponds to circumnavigating the tree as in Fig. 5.14 but listing a node the last time it is visited, rather than the first. For instance, in Fig. 5.14, the postorder listing is abc − d ∗ +.

To produce a postorder listing of the nodes, the last action does the printing, and so a node's label is printed after the postorder listing function is called on all of its children, in order from the left. The other actions initialize the loop through the children or move to the next child. Note that if a node is a leaf, all we do is list the label; there are no recursive calls.

If we use the representation of Example 5.13 for nodes, we can create postorder listings by the recursive function postorder of Fig. 5.17. The action of this function when called on the root of the tree in Fig. 5.14 is shown in Fig. 5.18. The same convention regarding node names is used here as in Fig. 5.16.
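A self-contained variant of the function of Fig. 5.17 makes the listing easy to check. This sketch assumes the node type with a nodeLabel field and appends labels to a string rather than calling printf. The name postorder_to and the helpers make_node and build_fig_5_14 are hypothetical, and the tree shape is inferred from the listings in Examples 5.13 and 5.14. The ASCII characters * and - stand in for ∗ and −.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct NODE *pNODE;
struct NODE {
    char nodeLabel;
    pNODE leftmostChild, rightSibling;
};

/* Allocate a node and set its label and both links. */
pNODE make_node(char label, pNODE leftmostChild, pNODE rightSibling)
{
    pNODE n = malloc(sizeof(struct NODE));
    assert(n != NULL);
    n->nodeLabel = label;
    n->leftmostChild = leftmostChild;
    n->rightSibling = rightSibling;
    return n;
}

/* Append the postorder listing of the subtree rooted at n to out, a
   NUL-terminated string with room for one character per node. */
void postorder_to(pNODE n, char *out)
{
    pNODE c;
    size_t len;
    c = n->leftmostChild;
    while (c != NULL) {           /* list every subtree, from the left */
        postorder_to(c, out);
        c = c->rightSibling;
    }
    len = strlen(out);
    out[len] = n->nodeLabel;      /* last action: list n itself */
    out[len + 1] = '\0';
}

/* Build the tree assumed for Fig. 5.14, i.e. a + (b - c) * d. */
pNODE build_fig_5_14(void)
{
    pNODE d = make_node('d', NULL, NULL);
    pNODE c = make_node('c', NULL, NULL);
    pNODE b = make_node('b', NULL, c);
    pNODE minus = make_node('-', b, d);
    pNODE times = make_node('*', minus, NULL);
    pNODE a = make_node('a', NULL, times);
    return make_node('+', a, NULL);
}
```

Starting with an empty string, postorder_to(build_fig_5_14(), out) leaves "abc-d*+" in out. Moving the line that records the label from the end of the function to the beginning would turn it into a preorder listing.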

    void postorder(pNODE n)
    {
        pNODE c; /* a child of node n */

        c = n->leftmostChild;
        while (c != NULL) {
            postorder(c);          /* list each subtree first */
            c = c->rightSibling;
        }
        printf("%c\n", n->nodeLabel);   /* then list n itself */
    }

Fig. 5.17. Recursive postorder function.

    call postorder(+)
    (+) call postorder(a)
    (a) print a
    (+) call postorder(∗)
    (∗) call postorder(−)
    (−) call postorder(b)
    (b) print b
    (−) call postorder(c)
    (c) print c
    (−) print −
    (∗) call postorder(d)
    (d) print d
    (∗) print ∗
    (+) print +

Fig. 5.18. Action of recursive function postorder on tree of Fig. 5.14.

Example 5.15. Our next example requires us to perform significant actions among all of the recursive calls on subtrees. Suppose we are given an expression tree with integers as operands, and with binary operators, and we wish to produce the numerical value of the expression represented by the tree. We can do so by executing the following recursive algorithm on the expression tree.

BASIS. For a leaf we produce the integer value of the node as the value of the tree.

INDUCTION. Suppose we wish to compute the value of the expression formed by the subtree rooted at some node n. We evaluate the subexpressions for the two subtrees rooted at the children of n; these are the values of the operands for the operator at n. We then apply the operator labeling n to the values of these two subtrees, and we have the value of the entire subtree rooted at n.

We define a pointer to a node and a node as follows:
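One declaration consistent with the basis and induction rules, together with the evaluation itself, is sketched here. The field names op and value and the helpers leaf and binary are assumptions made for illustration, and a leaf is recognized by its NULL leftmost-child pointer. Division is C integer division, and the caller is assumed to avoid a zero divisor.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed node type: an operator at interior nodes, an integer at leaves. */
typedef struct NODE *pNODE;
struct NODE {
    char op;      /* '+', '-', '*', or '/' at an interior node */
    int value;    /* the operand; meaningful only at a leaf */
    pNODE leftmostChild, rightSibling;
};

/* Allocate a leaf holding an integer operand. */
pNODE leaf(int value)
{
    pNODE n = malloc(sizeof(struct NODE));
    assert(n != NULL);
    n->op = '\0';
    n->value = value;
    n->leftmostChild = NULL;
    n->rightSibling = NULL;
    return n;
}

/* Allocate an interior node whose two children are left and right. */
pNODE binary(char op, pNODE left, pNODE right)
{
    pNODE n = leaf(0);
    n->op = op;
    n->leftmostChild = left;
    left->rightSibling = right;
    return n;
}

/* Return the value of the expression represented by the tree at n. */
int eval(pNODE n)
{
    int val1, val2;
    if (n->leftmostChild == NULL)       /* basis: n is a leaf */
        return n->value;
    /* induction: evaluate both subtrees, then apply the operator */
    val1 = eval(n->leftmostChild);
    val2 = eval(n->leftmostChild->rightSibling);
    switch (n->op) {
    case '+': return val1 + val2;
    case '-': return val1 - val2;
    case '*': return val1 * val2;
    case '/': return val1 / val2;
    }
    assert(0 && "unknown operator");
    return 0;
}
```

For instance, the tree built by binary('+', leaf(5), binary('*', binary('-', leaf(10), leaf(3)), leaf(2))) represents 5 + (10 − 3) × 2, and eval returns 19 for it. The significant work happens after both recursive calls return, which is what distinguishes this recursion from the preorder and postorder listings.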
