Data Structures and Algorithms: Union-Find and Percolation
Kevin Wayne
Overview
What is COS 226?
! Intermediate-level survey course.
! Programming and problem solving with applications.
! Algorithm: method for solving a problem.
! Data structure: method to store information.
Topic        Data Structures and Algorithms
sorting      quicksort, mergesort, heapsort, radix sorts
searching    hash table, BST, red-black tree, B-tree
graphs       DFS, Prim, Kruskal, Dijkstra, Ford-Fulkerson
strings      KMP, Rabin-Karp, TST, Huffman, LZW
geometry     Graham scan, k-d tree, Voronoi diagram
data types   stack, queue, list, union-find, priority queue
A misperception: algiros [painful] + arithmos [number].
Impact of Great Algorithms
Internet. Web search, packet routing, distributed file sharing.
Biology. Human genome project, protein folding.
Computers. Circuit layout, file system, compilers.
Computer graphics. Hollywood movies, video games.
Security. Cell phones, e-commerce, voting machines.
Multimedia. CD player, DVD, MP3, JPG, DivX, HDTV.
Transportation. Airline crew scheduling, map routing.
Physics. N-body simulation, particle collision simulation.
"For me, great algorithms are the poetry of computation. Just like verse, they can be terse, allusive, dense, and even mysterious. But once unlocked, they cast a brilliant new light on some aspect of computing."
- Francis Sullivan
Why Study Algorithms?
Using a computer?
! Want it to go faster? Process more data?
! Want it to do something that would otherwise be impossible?
Algorithms as a field of study.
! Philosophical implications.
! Burgeoning application areas.
! Old enough that basics are known.
! New enough that new discoveries arise.
20th century science (formula based): $E = mc^2$, $F = ma$, $F = \frac{G m_1 m_2}{r^2}$, $\left[ -\frac{\hbar^2}{2m} \nabla^2 + V(r) \right] \Psi(r) = E\,\Psi(r)$
21st century science (algorithm based): bioinformatics, neurosciences, computational physics, …
The Usual Suspects Questionnaire
Please fill out questionnaire so that we can adapt course as needed.
! Who are you?
! Why are you taking COS 226?
! Which precept(s) can you attend?
! What do you hope to get out of it?
! What is your programming experience?
Union-Find Abstraction
What are critical operations we need to support?
! Objects.
! Disjoint sets of objects.
! Find: are two objects in the same set?
! Union: replace sets containing two items by their union.
Goal. Design efficient data structure for union and find.
! Number of operations M can be huge.
! Number of objects N can be huge.
Objects
Applications involve manipulating objects of all types.
! Variable name aliases.
! Pixels in a digital photo.
! Computers in a network.
! Web pages on the Internet.
! Transistors in a computer chip.
! Metallic sites in a composite system.
When programming, convenient to name them 0 to N-1.
! Details not relevant to union-find.
! Integers allow quick access to object-related info (as array indices).
Quick-Find [eager approach]
Data structure.
! Integer array id[] of size N.
! Interpretation: p and q are connected if they have the same id.
Find. Check if p and q have the same id.
Union. To merge components containing p and q,
change all entries with id[p] to id[q].
i      0 1 2 3 4 5 6 7 8 9
id[i]  0 1 9 9 9 6 6 7 8 9
5 and 6 are connected; 2, 3, 4, and 9 are connected.
Union of 3 and 6: id[3] = 9 and id[6] = 6, so 3 and 6 are not connected.
i      0 1 2 3 4 5 6 7 8 9
id[i]  0 1 6 6 6 6 6 7 8 6
2, 3, 4, 5, 6, and 9 are connected; many values can change.
Quick-Find: Example
Quick-Find: Java Implementation
    public class QuickFind {
        private int[] id;

        public QuickFind(int N) {
            id = new int[N];
            for (int i = 0; i < N; i++)
                id[i] = i;               // set id of each object to itself
        }

        // 1 operation
        public boolean find(int p, int q) {
            return id[p] == id[q];
        }

        // N operations
        public void unite(int p, int q) {
            int pid = id[p];
            for (int i = 0; i < id.length; i++)
                if (id[i] == pid) id[i] = id[q];
        }
    }
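The union of 3 and 6 from the quick-find example can be replayed in code. The following is a minimal sketch, not the course's own client: the class name `QuickFindDemo`, the accessor `ids()`, and the sequence of earlier unions used to reach the starting array are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; mirrors the quick-find idea with a copy-out accessor.
class QuickFindDemo {
    private final int[] id;

    QuickFindDemo(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;   // each object starts in its own set
    }

    boolean find(int p, int q) {
        return id[p] == id[q];                   // 1 array comparison
    }

    void unite(int p, int q) {
        int pid = id[p];
        int qid = id[q];
        for (int i = 0; i < id.length; i++)      // scans all N entries
            if (id[i] == pid) id[i] = qid;
    }

    int[] ids() {
        return id.clone();                       // copy, so callers cannot modify state
    }

    public static void main(String[] args) {
        QuickFindDemo uf = new QuickFindDemo(10);
        // Assumed earlier unions that produce the starting array of the example.
        uf.unite(2, 9); uf.unite(3, 9); uf.unite(4, 9); uf.unite(5, 6);
        System.out.println(Arrays.toString(uf.ids())); // [0, 1, 9, 9, 9, 6, 6, 7, 8, 9]
        System.out.println(uf.find(3, 6));             // false
        uf.unite(3, 6);
        System.out.println(Arrays.toString(uf.ids())); // [0, 1, 6, 6, 6, 6, 6, 7, 8, 6]
        System.out.println(uf.find(3, 6));             // true
    }
}
```

Four entries change in the single call `unite(3, 6)`, which is the "many values can change" cost of the eager approach.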
Problem Size and Computation Time
Rough standard for 2000.
! 10^9 operations per second.
! 10^9 words of main memory.
! Touch all words in approximately 1 second. [unchanged since 1950!]
Ex. Huge problem for quick find.
! 10^10 edges connecting 10^9 nodes.
! Quick-find might take 10^20 operations. [~10 ops per query]
! 3,000 years of computer time!
Paradoxically, quadratic algorithms get worse with newer equipment.
! New computer may be 10x as fast.
! But, has 10x as much memory so problem may be 10x bigger.
! With quadratic algorithm, takes 10x as long!
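The figures above can be checked with rough arithmetic. This is a sketch under an assumption about how the 10^20 was obtained (about 10^10 unions, each scanning 10^9 entries at roughly 10 operations apiece); the year length of 3.15 × 10^7 seconds is also a rounded value.

```latex
\begin{align*}
\text{quick-find cost} &\approx 10^{10} \times 10^{9} \times 10 = 10^{20}\ \text{operations} \\
\text{time} &\approx \frac{10^{20}\ \text{ops}}{10^{9}\ \text{ops/s}} = 10^{11}\ \text{s}
  \approx \frac{10^{11}\ \text{s}}{3.15 \times 10^{7}\ \text{s/year}} \approx 3{,}200\ \text{years} \\
\text{quadratic scaling:}\quad \frac{(10N)^2}{10 \cdot \text{speed}} &= 10 \cdot \frac{N^2}{\text{speed}}
\end{align*}
```

The last line is the paradox: a machine that is 10x faster with a 10x bigger problem spends 10x as long on a quadratic algorithm.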
Quick-Union [lazy approach]
Data structure.
! Integer array id[] of size N.
! Interpretation: id[i] is parent of i.
! Root of i is id[id[id[...id[i]...]]].
Find. Check if p and q have the same root.
Union. Set the id of q's root to the id of p's root.
To find a root, keep going until the id doesn't change.
i      0 1 2 3 4 5 6 7 8 9
id[i]  0 1 9 4 9 6 6 7 8 9
3's root is 9; 5's root is 6; 3 and 5 are not connected.
After union of p = 3 and q = 5:
i      0 1 2 3 4 5 6 7 8 9
id[i]  0 1 9 4 9 6 9 7 8 9
Only one value changes.
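The lazy approach might be written along the following lines. This is a sketch, not the slide's own listing: the class name `QuickUnion`, the accessor `parents()`, and the order of the setup unions in `main` are assumptions; the `root`, `find`, and `unite` logic follows the description above.

```java
import java.util.Arrays;

// Hypothetical quick-union sketch: id[i] is the parent of i, a root is its own parent.
class QuickUnion {
    private final int[] id;

    QuickUnion(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;
    }

    // Follow parent links until reaching a node that points to itself.
    int root(int i) {
        while (i != id[i]) i = id[i];
        return i;
    }

    boolean find(int p, int q) {
        return root(p) == root(q);
    }

    // Set the id of q's root to p's root: exactly one array entry changes.
    void unite(int p, int q) {
        int i = root(p);
        int j = root(q);
        id[j] = i;
    }

    int[] parents() {
        return id.clone();
    }

    public static void main(String[] args) {
        QuickUnion uf = new QuickUnion(10);
        // Assumed setup unions that reproduce the starting array of the example.
        uf.unite(4, 3); uf.unite(9, 4); uf.unite(9, 2); uf.unite(6, 5);
        System.out.println(Arrays.toString(uf.parents())); // [0, 1, 9, 4, 9, 6, 6, 7, 8, 9]
        System.out.println(uf.root(3) + " " + uf.root(5)); // 9 6
        uf.unite(3, 5);
        System.out.println(Arrays.toString(uf.parents())); // [0, 1, 9, 4, 9, 6, 9, 7, 8, 9]
    }
}
```

Union is cheap once the roots are known, but `root` can walk a long chain, so find is no longer a single comparison.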
Quick-Union: Example
Weighted Quick-Union: Java Implementation
Java implementation.
! Almost identical to quick-union.
! Maintain extra array sz[], where sz[i] counts the number of elements
in the tree rooted at i.
Find. Identical to quick-union.
Union. Same as quick-union, but merge smaller tree into larger tree,
and update the sz[] array.
    // i and j are the roots of p and q
    if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
    else               { id[j] = i; sz[i] += sz[j]; }
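Put together with the quick-union root search, the weighting rule might look like the sketch below. The class name `WeightedQuickUnion`, the helper `size()`, and the early return when both roots coincide are assumptions added for illustration; the early return is there so that sizes are not double-counted when p and q are already connected.

```java
// Hypothetical weighted quick-union sketch: always link the smaller tree below the larger.
class WeightedQuickUnion {
    private final int[] id;   // id[i] is the parent of i
    private final int[] sz;   // sz[i] is the number of elements in the tree rooted at i

    WeightedQuickUnion(int n) {
        id = new int[n];
        sz = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;
            sz[i] = 1;
        }
    }

    int root(int i) {
        while (i != id[i]) i = id[i];
        return i;
    }

    boolean find(int p, int q) {
        return root(p) == root(q);
    }

    void unite(int p, int q) {
        int i = root(p);
        int j = root(q);
        if (i == j) return;                       // assumption: skip if already connected
        if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
        else               { id[j] = i; sz[i] += sz[j]; }
    }

    // Size of the set containing p (only root entries of sz[] are meaningful).
    int size(int p) {
        return sz[root(p)];
    }
}
```

For example, after `unite(0, 1)`, `unite(2, 3)`, `unite(0, 2)`, and `unite(4, 0)`, the single node 4 is linked below the root of the four-element tree rather than the other way around.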
Weighted Quick-Union: Analysis
Analysis.
! Find: takes time proportional to depth of p and q.
! Union: takes constant time, given roots.
! Fact: depth is at most lg N. [needs proof]
Stop at guaranteed acceptable performance? No, can improve further.
Data Structure   Union    Find
Quick-find       N        1
Quick-union      1 †      N
Weighted QU      lg N     lg N
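The lg N depth bound is marked as needing proof. One standard argument, sketched here rather than taken from the slides, tracks how the tree containing a node grows each time the node's depth increases; the names T_1, T_2, x, and k are introduced only for this sketch.

```latex
\begin{align*}
&\text{Claim: in weighted quick-union, every node } x \text{ has depth at most } \lg N. \\
&\text{The depth of } x \text{ grows by 1 only when its tree } T_1 \text{ is linked below the root of a tree } T_2 \text{ with } |T_2| \ge |T_1|. \\
&\text{The merged tree has size } |T_1| + |T_2| \ge 2\,|T_1|, \text{ so the tree containing } x \text{ at least doubles.} \\
&\text{Starting from size 1, after } k \text{ increases: } 2^k \le \text{size} \le N \;\Longrightarrow\; k \le \lg N.
\end{align*}
```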
Path Compression
Path compression. Just after computing the root of i,
set the id of each examined node to root(i).
[Figure: trees on nodes 0 to 11 illustrating root(9)]
Weighted Quick-Union with Path Compression
Path compression.
! Standard implementation: add second loop to root() to set
the id of each examined node to the root.
! Simpler one-pass variant: make every other node in path
point to its grandparent.
In practice. No reason not to! Keeps tree almost completely flat.
    public int root(int i) {
        while (i != id[i]) {
            id[i] = id[id[i]];   // only one extra line of code!
            i = id[i];
        }
        return i;
    }
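The standard two-loop implementation mentioned above could be sketched as follows. The class name `TwoPassCompression`, its constructor taking a ready-made parent array, and the accessor `parent()` are assumptions chosen so the effect is easy to observe; only `root` is the technique itself.

```java
// Hypothetical sketch of two-pass path compression on a parent-link array.
class TwoPassCompression {
    private final int[] id;   // id[i] is the parent of i; a root is its own parent

    // Assumption for illustration: start from a given parent array.
    TwoPassCompression(int[] parents) {
        id = parents.clone();
    }

    int root(int i) {
        int r = i;
        while (r != id[r]) r = id[r];   // first loop: find the root
        while (i != r) {                // second loop: point each examined node at the root
            int next = id[i];
            id[i] = r;
            i = next;
        }
        return r;
    }

    int parent(int i) {
        return id[i];
    }

    public static void main(String[] args) {
        // A chain 4 -> 3 -> 2 -> 1 -> 0, where 0 is the root.
        TwoPassCompression uf = new TwoPassCompression(new int[]{0, 0, 1, 2, 3});
        System.out.println(uf.root(4));    // 0
        System.out.println(uf.parent(4));  // 0, was 3 before compression
        System.out.println(uf.parent(2));  // 0, was 1 before compression
    }
}
```

Compared with the one-pass grandparent variant, this flattens the whole examined path in a single call at the cost of walking it twice.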
Weighted Quick-Union with Path Compression
Theorem. Starting from an empty data structure,
any sequence of M union and find operations
on N elements takes O(N + M lg* N) time.
! Proof is very difficult.
! But the algorithm is still simple!
Remark. lg* N is a constant in this universe.
N         lg* N
2         1
4         2
16        3
65536     4
2^65536   5
Linear algorithm?
! Cost within constant factor of reading in the data.
! Theory: WQUPC is not quite linear.
! Practice: WQUPC is linear.
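To see why lg* N behaves like a constant, it can be computed directly: count how many times lg must be applied before the value drops to 1 or below. The sketch below is an illustration with assumed names (`IteratedLog`, `ceilLg`, `lgStar`); it rounds lg up to an integer at each step to stay in exact arithmetic, which should give the same count for integer inputs because the thresholds 2, 4, 16, 65536, ... are themselves integers.

```java
import java.math.BigInteger;

// Hypothetical helper for the iterated logarithm lg* n.
class IteratedLog {
    // Ceiling of lg n for n >= 1, computed exactly with bit operations.
    static int ceilLg(long n) {
        return 64 - Long.numberOfLeadingZeros(n - 1);
    }

    // Number of times lg is applied before the value is at most 1.
    static int lgStar(long n) {
        int count = 0;
        while (n > 1) {
            n = ceilLg(n);
            count++;
        }
        return count;
    }

    // Same idea for values too large for a long, such as 2^65536.
    static int lgStar(BigInteger n) {
        if (n.compareTo(BigInteger.ONE) <= 0) return 0;
        int bits = n.subtract(BigInteger.ONE).bitLength();   // ceiling of lg n
        return 1 + lgStar((long) bits);
    }

    public static void main(String[] args) {
        System.out.println(lgStar(2L));      // 1
        System.out.println(lgStar(4L));      // 2
        System.out.println(lgStar(16L));     // 3
        System.out.println(lgStar(65536L));  // 4
        System.out.println(lgStar(BigInteger.valueOf(2).pow(65536)));  // 5
    }
}
```

Any N that fits in a 64-bit long gives a value of at most 5, which is the sense in which WQUPC is linear in practice.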
Context
Ex. Huge practical problem.
! 10^10 edges connecting 10^9 nodes.
! WQUPC reduces time from 3,000 years to 1 minute.
! Supercomputer won't help much.
! Good algorithm makes solution possible.
Bottom line. WQUPC on Java cell phone beats QF on supercomputer!
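For reference, weighting and one-pass path compression can be combined in a few lines. This is a sketch under assumptions, not the course's reference code: the class name `WeightedQuickUnionPC`, the component counter `count()`, and the early return for already-connected pairs are additions for illustration.

```java
// Hypothetical WQUPC sketch: weighted quick-union with one-pass path compression.
class WeightedQuickUnionPC {
    private final int[] id;   // parent links
    private final int[] sz;   // tree sizes, meaningful at roots only
    private int count;        // number of disjoint sets (an added convenience)

    WeightedQuickUnionPC(int n) {
        id = new int[n];
        sz = new int[n];
        count = n;
        for (int i = 0; i < n; i++) {
            id[i] = i;
            sz[i] = 1;
        }
    }

    int root(int i) {
        while (i != id[i]) {
            id[i] = id[id[i]];   // point node at its grandparent
            i = id[i];
        }
        return i;
    }

    boolean find(int p, int q) {
        return root(p) == root(q);
    }

    void unite(int p, int q) {
        int i = root(p);
        int j = root(q);
        if (i == j) return;      // assumption: nothing to do if already connected
        if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
        else               { id[j] = i; sz[i] += sz[j]; }
        count--;
    }

    int count() {
        return count;
    }
}
```

As a rough check on the 1 minute figure: at about 5 (M + N) operations with M = 10^10 and N = 10^9, the total is about 5.5 × 10^10 operations, or roughly 55 seconds at 10^9 operations per second.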
M union-find ops on a set of N elements:
Algorithm          Time
Quick-find         M N
Quick-union        M N
Weighted QU        N + M log N
Path compression   N + M log N
Weighted + path    5 (M + N)
Robert Sedgewick and Kevin Wayne • Copyright © 2005 • http://www.Princeton.EDU/~cos226
Applications
Summary
Lessons.
! Start with simple, brute force approach.
- don't use for large problems
- can't use for huge problems
! Strive for worst-case performance guarantees.
- might be nontrivial to analyze
! Identify fundamental abstractions: union-find.
! Apply to many domains.