University of Applied Sciences
Data Structures and Algorithm Design (CSCI 340)
Friedhelm Seutter
Institut für Angewandte Informatik

Contents
1. Analyzing Algorithms and Problems
2. Data Abstraction
3. Recursion and Induction
4. Sorting
6. Dynamic Sets and Searching
7. Graphs and Graph Traversals
8. Optimization and Greedy Algorithms
10. Dynamic Programming
13. NP-Complete Problems

4. Sorting
Insertion Sort
Quicksort
Mergesort
Heapsort

Sorting Problem
Let $A = (a_1, a_2, \dots, a_n)$ be an array of (nonnegative) integers, called keys.
The problem is to find a permutation $\pi$ such that the integers are sorted in nondecreasing order.
Solution: $A = (a_{\pi(1)}, a_{\pi(2)}, \dots, a_{\pi(n)})$ with $a_{\pi(1)} \le a_{\pi(2)} \le \dots \le a_{\pi(n)}$

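To make the role of the permutation $\pi$ concrete, here is a minimal Python sketch. The function name `sorting_permutation` is hypothetical, and it uses 0-based indices, whereas the slides count from 1.

```python
def sorting_permutation(keys):
    """Return 0-based indices pi with keys[pi[0]] <= keys[pi[1]] <= ..."""
    # sorted() is stable, so equal keys keep their original relative order.
    return sorted(range(len(keys)), key=lambda i: keys[i])


keys = [7, 2, 11, 4]
pi = sorting_permutation(keys)
sorted_keys = [keys[i] for i in pi]
print(pi)           # [1, 3, 0, 2]
print(sorted_keys)  # [2, 4, 7, 11]
```

The permutation only describes where each key ends up; the sorting algorithms in this chapter differ in how many key comparisons they need to find it.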
Analysis of Complexity
General strategy: Sorting by comparison of keys
Time: Number of key comparisons (basic operations)
Space: Amount of extra space (in addition to the input)

Insertion Sort
Assume that some elements at the left end of the array are already sorted.
Take the first of the unexamined elements and insert it at its correct position among the sorted elements.
To create a vacant position, the greater elements must each be shifted one position to the right.

Insertion Sort: Example
Before:      sorted: 2 4 12 14 15      not examined: 7 19 11
Take out 7:  sorted: 2 4 12 14 15 _    not examined: 19 11
Insert 7:    sorted: 2 4 7 12 14 15    not examined: 19 11

Insertion Sort [figure: pseudocode]

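The insertion step described above can be sketched in Python as follows. This is an illustrative version, not the pseudocode from the slide, so its line numbers do not correspond to the "line 4" mentioned in the complexity analysis.

```python
def insertion_sort(a):
    """Sort the list a in place in nondecreasing order and return it."""
    for j in range(1, len(a)):
        key = a[j]              # first element of the unexamined part
        i = j - 1
        # Shift the greater elements of the sorted part one position right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key          # place the key into the vacant position
    return a


print(insertion_sort([2, 4, 12, 14, 15, 7, 19, 11]))
# [2, 4, 7, 11, 12, 14, 15, 19]
```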
Worst-case Complexity
Basic operation: Key comparison in line 4
$W(n) = \sum_{j=2}^{n} (j-1) = \sum_{j=1}^{n-1} j = \tfrac{1}{2}\, n(n-1) \in \Theta(n^2)$

Average-case Complexity
Basic operation: Key comparison in line 4
Assumptions: All permutations are equally likely as input and the keys are distinct.
$A(n) \approx \tfrac{1}{4}\, n^2 \in \Theta(n^2)$

Best-case Complexity
Basic operation: Key comparison in line 4
$B(n) = n - 1 \in \Theta(n)$

Space Complexity
Insertion Sort sorts in-place.
The additional amount of space is independent of the number of elements to sort.

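The worst-case and best-case counts can be checked empirically. The sketch below instruments an insertion sort with a comparison counter. The helper name `insertion_sort_comparisons` is hypothetical, and the code works on a copy of the input.

```python
def insertion_sort_comparisons(keys):
    """Sort a copy of keys by insertion and return the number of key comparisons."""
    a = list(keys)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1        # one key comparison: a[i] versus key
            if a[i] <= key:
                break
            a[i + 1] = a[i]         # shift the greater element to the right
            i -= 1
        a[i + 1] = key
    return comparisons


n = 8
print(insertion_sort_comparisons(range(n, 0, -1)))  # 28, equals n*(n-1)//2
print(insertion_sort_comparisons(range(1, n + 1)))  # 7, equals n-1
```

A reverse-sorted input triggers the worst case, an already sorted input the best case.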
Divide and Conquer [figure]

Quicksort
Divide: Choose one element to be the pivot. Partition the array into two subarrays relative to the pivot: elements less than or equal to the pivot go to the left, greater elements go to the right, and the pivot is placed in between.
Conquer: An array of length 1 is sorted.
Combine: Append the two sorted subarrays with the pivot in between.

Quicksort [figure: pseudocode]

Quicksort-Partition [figure: pseudocode]

Quicksort-Partition
Array layout during partitioning (index markers f, i, j, q, l):
Initially:    f | unexamined | l: pivot
In progress:  f | ≤ pivot | i | > pivot | j | unexamined | l: pivot
Finally:      f | ≤ pivot | q: pivot | > pivot | l

Quicksort-Partition [figure]

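The partition layout above can be sketched in Python. This version assumes the pivot is the last element of the range, as the diagram suggests. The exact meaning of the markers i and j in the slide's pseudocode is not recoverable, so the invariants in the comments are those of this sketch only. The parameter names `first` and `last` stand in for f and l.

```python
def partition(a, first, last):
    """Partition a[first..last] around the pivot a[last]; return the pivot's index."""
    pivot = a[last]
    i = first - 1                    # a[first..i] holds keys <= pivot
    for j in range(first, last):     # a[j..last-1] is still unexamined
        if a[j] <= pivot:            # key comparison
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[last] = a[last], a[i + 1]   # pivot goes between the two parts
    return i + 1


def quicksort(a, first=0, last=None):
    """Sort a[first..last] in place and return a."""
    if last is None:
        last = len(a) - 1
    if first < last:
        q = partition(a, first, last)
        quicksort(a, first, q - 1)   # keys <= pivot
        quicksort(a, q + 1, last)    # keys > pivot
    return a


print(quicksort([2, 8, 7, 1, 3, 5, 6, 4]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

No explicit combine step appears in the code: because partitioning is done in place, the two sorted subarrays already sit on either side of the pivot.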
Divide [figure]

Combine [figure]

Complexity of Quicksort
Basic operation: Key comparison in line 4 of Partition
$W(n) \in \Theta(n^2)$
$A(n) \in \Theta(n \log n)$
$B(n) \in \Theta(n \log n)$

Space Complexity
Amount of space needed: $\Theta(n)$
The keys are exchanged in place, but in the worst case the recursion is n calls deep, and every call needs space for its local variables.
A careful implementation can reduce the space complexity to $\Theta(\log n)$.

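One common way to obtain the logarithmic stack depth is to recurse only into the smaller part and handle the larger part in a loop. The slides do not say which technique they have in mind, so the following is a sketch of that standard idea, with the hypothetical name `quicksort_small_first` and the last element as pivot.

```python
def partition(a, first, last):
    """Partition a[first..last] around the pivot a[last]; return the pivot's index."""
    pivot = a[last]
    i = first - 1
    for j in range(first, last):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[last] = a[last], a[i + 1]
    return i + 1


def quicksort_small_first(a, first=0, last=None):
    """Quicksort that recurses only into the smaller part of each partition."""
    if last is None:
        last = len(a) - 1
    while first < last:
        q = partition(a, first, last)
        if q - first < last - q:
            quicksort_small_first(a, first, q - 1)   # smaller left part
            first = q + 1                            # continue with the right part
        else:
            quicksort_small_first(a, q + 1, last)    # smaller right part
            last = q - 1                             # continue with the left part
    return a


# An already sorted input is a worst case for this pivot choice; the time is
# still quadratic, but the recursion stays shallow.
data = list(range(2000))
print(quicksort_small_first(data) == list(range(2000)))  # True
```

Each recursive call handles at most half of the current range, so the recursion depth is bounded by about lg n. The number of key comparisons in the worst case is unchanged.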
Mergesort
Divide: Divide the array into two halves, recursively.
Conquer: An array of length 1 is sorted.
Combine: Two sorted subarrays are merged into one sorted array.

Mergesort [figure: pseudocode]

Mergesort-Merge [figure: pseudocode]

Divide [figure]

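A minimal Python sketch of Mergesort with a separate merge step is given below. It is not the slide's pseudocode. For brevity it returns a new list instead of copying into one shared extra array.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # key comparison
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])              # at most one of these is non-empty
    merged.extend(right[j:])
    return merged


def mergesort(a):
    """Return a sorted copy of a."""
    if len(a) <= 1:                      # conquer: length 0 or 1 is sorted
        return list(a)
    mid = len(a) // 2                    # divide into two halves
    return merge(mergesort(a[:mid]), mergesort(a[mid:]))   # combine


print(mergesort([4, 1, 12, 2, 16, 11, 13, 14, 8, 7]))
# [1, 2, 4, 7, 8, 11, 12, 13, 14, 16]
```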
Combine [figure]

Complexity of Mergesort
Basic operation: Key comparison in line 6 of Merge
$W(n) \in \Theta(n \log n)$
$A(n) \in \Theta(n \log n)$
$B(n) \in \Theta(n \log n)$

Space Complexity
Amount of space needed: $\Theta(n)$
There is no exchange of keys, but all n keys are copied and merged into an extra array.
A careful implementation can reduce the extra space to n/2, but this is still in $\Theta(n)$.

Lower Bounds for Sorting by Comparison of Keys
What is the minimum number of key comparisons needed by sorting algorithms that are based on comparisons of keys?
Let an array of n distinct keys be given.
The result of sorting the keys is a permutation of the keys.
Thus there are n! possible solutions.

Decision Tree for Sorting
All possible sorting solutions can be represented in a decision tree.
The inner nodes represent a comparison of two keys.
The possible outcomes are true or false. If false, the keys have to be exchanged.
Inner nodes have two successors.
The leaves are the possible sorting solutions.

Decision Tree for Sorting [figure]

Decision Tree for Sorting
Sorting by comparison corresponds to a path in the decision tree from the root to a leaf.
The length of the longest path corresponds to the number of comparisons in the worst case.
Therefore a lower bound on the height of a decision tree is a worst-case lower bound on the number of key comparisons.

Lower Bounds for Sorting [figure]

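The slide content with the bound itself did not survive extraction. The standard argument is that a binary tree of height h has at most $2^h$ leaves, and the decision tree needs at least n! leaves, so $h \ge \lceil \lg n! \rceil$. The sketch below computes this value exactly with integer arithmetic. The function name `comparison_lower_bound` is hypothetical.

```python
import math


def comparison_lower_bound(n):
    """Return ceil(lg(n!)), the smallest h with 2**h >= n!."""
    # For an integer m >= 1, (m - 1).bit_length() equals ceil(log2(m)).
    return (math.factorial(n) - 1).bit_length()


for n in (3, 4, 5, 10):
    print(n, comparison_lower_bound(n))
# 3 3
# 4 5
# 5 7
# 10 22
```

Since $\lg n!$ grows like $n \lg n$, no sorting algorithm based on key comparisons can have a worst case better than $\Theta(n \log n)$.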
Lower Bounds for Sorting (continued) [figure]

Heapsort
The algorithm uses a data structure called a heap, which is a binary tree with two special properties.
Heap structure: Complete binary tree with some of the rightmost leaves removed.
Partial order tree property: The key at any node is greater (less) than or equal to the keys at each of its children.

Heap [two figures]

Heap Implementation
As a linked structure with each node containing pointers (references) to the roots of its subtrees.
As an array:
The root is in A[1].
Let i be the index of a node other than the root; then the index of its parent is $\lfloor i/2 \rfloor$.
Let i be the index of a node other than a leaf; then 2i is the index of its left child and 2i+1 is the index of its right child.

Heap: Array-Implementation
Index:  1   2   3   4   5   6   7   8   9   10  11  12  13  14
Key:    16  14  13  8   7   11  12  2   4   1
heapsize[A] marks the last heap element (index 10); length[A] marks the end of the array (index 14).

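The index arithmetic of the array implementation can be sketched as follows. To keep the slide's 1-based numbering, this example leaves position 0 of the Python list unused. The helper `is_max_heap` is a hypothetical name introduced only for checking the example.

```python
def parent(i):
    return i // 2        # floor(i / 2)


def left(i):
    return 2 * i


def right(i):
    return 2 * i + 1


def is_max_heap(a, heapsize):
    """Check the partial order tree property for a[1..heapsize]."""
    return all(a[parent(i)] >= a[i] for i in range(2, heapsize + 1))


# Position 0 is padding so that the root sits at index 1 as on the slide.
A = [None, 16, 14, 13, 8, 7, 11, 12, 2, 4, 1]
heapsize = 10

print(A[parent(5)], A[left(5)])   # 14 1
print(is_max_heap(A, heapsize))   # True
```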
Heapsort Strategy
The root contains the largest key in the heap.
Build a sorted sequence in reverse order by repeatedly removing the root element from the heap.
After each removal the heap properties have to be reestablished by bringing the next largest key to the root.

Fixing a Heap
A node violates the partial order tree property, i.e. its key is less than at least one of the keys of its children.
This node must be exchanged with the child that has the larger key; the step is repeated recursively in that child's subtree.

Fixing a Heap [four figures]

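The fixing step can be sketched in Python as below. This is an assumption-laden illustration rather than the slide's FixHeap pseudocode: it is written iteratively instead of recursively, it is for a max-heap, and it uses 0-based indices, so the children of node i are at 2i+1 and 2i+2.

```python
def fix_heap(a, i, heapsize):
    """Sift a[i] down until the subtree rooted at i is a max-heap (0-based)."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < heapsize and a[left] > a[largest]:     # key comparison
            largest = left
        if right < heapsize and a[right] > a[largest]:   # key comparison
            largest = right
        if largest == i:          # partial order tree property holds at i
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest               # continue in the affected subtree


# Only the root violates the heap property here.
a = [4, 16, 13, 14, 7, 11, 12, 2, 8, 1]
fix_heap(a, 0, len(a))
print(a)  # [16, 14, 13, 8, 7, 11, 12, 2, 4, 1]
```

At most two key comparisons are made per level, which matches a worst case proportional to the height of the heap.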
Complexity of FixHeap
Basic operation: Key comparisons in lines 3 and 6
$W(n) = 2h = 2 \lfloor \lg n \rfloor \in \Theta(\log n)$
(h: height of the heap, n: number of nodes)

Constructing a Heap
An unordered array of keys is given.
The corresponding binary tree has heap structure, but the partial order tree property is violated.
The leaves $A[\lfloor n/2 \rfloor + 1], \dots, A[n]$ are heaps.
The subtrees with roots from $A[\lfloor n/2 \rfloor]$ down to $A[1]$ must have their partial order tree property established.

Constructing a Heap
Input array: A = (4, 1, 12, 2, 16, 11, 13, 14, 8, 7) [figure]
After heap construction: A = (16, 14, 13, 8, 7, 11, 12, 2, 4, 1) [figure]

Constructing a Heap [figure: pseudocode]

Complexity of ConstructHeap
Basic operation: Call of FixHeap in line 3
$W(n) \le n \lg n$, so $W(n) \in O(n \log n)$
But this upper bound is poor!

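Bottom-up heap construction can be sketched as follows, assuming a max-heap and 0-based indices, so the last inner node is at index n//2 - 1 rather than at $\lfloor n/2 \rfloor$. The function names mirror FixHeap and ConstructHeap from the slides, but the bodies are illustrative, not the original pseudocode.

```python
def fix_heap(a, i, heapsize):
    """Sift a[i] down until the subtree rooted at i is a max-heap (0-based)."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < heapsize and a[left] > a[largest]:
            largest = left
        if right < heapsize and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def construct_heap(a):
    """Turn the list a into a max-heap in place and return it."""
    # The leaves are already heaps; fix the inner nodes from the last one up to the root.
    for i in range(len(a) // 2 - 1, -1, -1):
        fix_heap(a, i, len(a))
    return a


print(construct_heap([4, 1, 12, 2, 16, 11, 13, 14, 8, 7]))
# [16, 14, 13, 8, 7, 11, 12, 2, 4, 1]
```

The printed array agrees with the heap shown in the slide example.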
Heights of subtrees for FixHeap [figure]

Complexity of ConstructHeap
Basic operation: Call of FixHeap in line 3
$W(n) \le \sum_{k=0}^{h} 2^k (h-k) \le 2n - \lg n + 2 \in \Theta(n)$

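For readers who want to see why the sum is linear, here is one possible derivation of its closed form. It substitutes $j = h - k$ and assumes $h = \lfloor \lg n \rfloor$, so that $2^h \le n$. The resulting bound is slightly tighter than the one on the slide and consistent with it.

```latex
\begin{align*}
\sum_{k=0}^{h} 2^k (h-k)
  &= \sum_{j=0}^{h} j \, 2^{h-j}
   = 2^h \sum_{j=0}^{h} \frac{j}{2^j} \\
  &= 2^h \left( 2 - \frac{h+2}{2^h} \right)
   = 2^{h+1} - h - 2 \\
  &\le 2n - \lfloor \lg n \rfloor - 2 \;<\; 2n .
\end{align*}
```

As a quick check, h = 2 gives $1 \cdot 2 + 2 \cdot 1 + 4 \cdot 0 = 4$ on the left and $2^3 - 2 - 2 = 4$ on the right.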
Heapsort [four figures]

Heapsort
Given: A = (4, 1, 12, 2, 16, 11, 13, 14, 8, 7)
ConstructHeap: A = (16, 14, 13, 8, 7, 11, 12, 2, 4, 1)
HeapSort: A = (1, 2, 4, 7, 8, 11, 12, 13, 14, 16)

Complexity of Heapsort
Add up the complexities of ConstructHeap and of FixHeap in the loop:
$W(n) = \Theta(n) + (n-1)\,\Theta(\lg n)$, hence $W(n) \in \Theta(n \lg n)$

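Putting the pieces together, a complete Heapsort might look like the sketch below. It assumes a max-heap, 0-based indices, and an iterative FixHeap. The function names follow the slides, but the code is illustrative, not the original pseudocode.

```python
def fix_heap(a, i, heapsize):
    """Sift a[i] down until the subtree rooted at i is a max-heap (0-based)."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < heapsize and a[left] > a[largest]:
            largest = left
        if right < heapsize and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def construct_heap(a):
    """Build a max-heap bottom-up, starting at the last inner node."""
    for i in range(len(a) // 2 - 1, -1, -1):
        fix_heap(a, i, len(a))


def heapsort(a):
    """Sort the list a in place in nondecreasing order and return it."""
    construct_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # move the current maximum behind the heap
        fix_heap(a, 0, end)           # restore the heap on a[0..end-1]
    return a


print(heapsort([4, 1, 12, 2, 16, 11, 13, 14, 8, 7]))
# [1, 2, 4, 7, 8, 11, 12, 13, 14, 16]
```

The loop runs n-1 times and each pass calls FixHeap once on a shrinking heap, which is where the $(n-1)\,\Theta(\lg n)$ term comes from.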
Space Complexity
Heapsort sorts in-place.
The space needed for recursion is limited to a depth of about lg n.
But these procedures can be recoded as iterative procedures.

Comparison of Sorting Algorithms
Algorithm        Worst case    Average                 Extra space
Insertion Sort   $n^2/2$       $\Theta(n^2)$           $\Theta(1)$
Quicksort        $n^2/2$       $\Theta(n \log n)$      $\Theta(\log n)$
Mergesort        $n \lg n$     $\Theta(n \log n)$      $\Theta(n)$
Heapsort         $2n \lg n$    $\Theta(n \log n)$      $\Theta(1)$