Introduction:
Sorting routines are among the most widely used and studied algorithms. Every student should know how to implement several different kinds of sorts, and should have idea about how they perform, both theoretically and practically. This programming project is designed to give the student practice in implementing and observing the behavior of four sorts: Insertion Sort, Merge Sort, Heap Sort, and Quick Sort.
Resources
The algorithms for Insertion Sort and Merge sort are given; the algorithm for Heap Sort is given; and the algorithm for Quick Sort is given.
Programs must be written in standard C, C++
You will find 12 data files for this part of the project. These files all contain shuffled lists of integers, with one integer list per line. The files are:
Filename | # items | lowest | highest | Description |
Num8.txt | 23 | 1 | 8 | no omissions, no duplicates |
Num16.txt | 24 | 1 | 16 | no omissions, no duplicates |
Num32.txt | 25 | 1 | 32 | no omissions, no duplicates |
Num64.txt | 26 | 1 | 64 | no omissions, no duplicates |
Num128.txt | 27 | 1 | 128 | omissions/duplicates possible |
Num256.txt | 28 | 1 | 256 | omissions/duplicates possible |
Num512.txt | 29 | 1 | 512 | omissions/duplicates possible |
Num1024.txt | 210 | 1 | 1024 | omissions/duplicates possible |
Num2048.txt | 211 | 1 | 2048 | omissions/duplicates possible |
Num4096.txt | 212 | 1 | 4096 | omissions/duplicates possible |
Num8192.txt | 213 | 1 | 8192 | omissions/duplicates possible |
Num16284.txt | 214 | 1 | 16284 | omissions/duplicates possible |
Description:
You will write C, C++ code that implements the algorithms for sorting routines mentioned above. As part of your code, you will include counters that iterate whenever a specific line of the algorithm is executed.
Some lines in an algorithm may have a higher cost than other lines. For example, the function call in line 5 in the Merge Sort algorithm is executed only 7 times for an array of 8 elements, but the body of the Merge function which is being called has many lines, some of which are executed more than once. So the cost of line 5 in the Merge sort algorithm is higher than the other 4 lines, We can use the cost of the highest-cost line as an indicator of the cost of the algorithm as a whole.
Insertion Sort:
Here is the pseudocode for Insertion Sort, modified to include a counter:
count <- 0
Insertion_Sort(A)
1 for j <- 2 to length(A) do
2 key <- A[j]
3 // Insert A[j] into the sorted sequence A[1..j - 1]
4 i <- j - 1
5 while i > 0 and A[i] > key do
5.5 count <- count + 1
6 A[i + 1] <- A[i]
7 i <- i - 1
8 A[i + 1] <- key
Your code for Insertion Sort should have a line in it that is equivalent to line 5.5 in the Insertion_Sort pseudocode above. The global variable count will keep a running total of the number of times this line is executed. When you exit from the call to the Insertion Sort function, you should print out the values of n (the length of the array) and count as an indicator of the cost of the function.
Merge Sort:
Here is the pseudocode for Merge Sort, modified to include a counter:
count <- 0
Merge_Sort(A, p, r)
1 if p < r
2 then q <- (p + r) / 2
3 Merge-Sort (A, p, q)
4 Merge-Sort (A, q+1, r)
5 Merge (A, p, q, r)
And here is the modified algorithm for the Merge function used by Merge Sort:
Merge (A, p, q, r)
1 n1 <- (q - p) + 1
2 n2 <- (r - q)
3 create arrays L[1..n1+1] and R[1..n2+2]
4 for i <- 1 to n1 do
5 L[i] <- A[(p + i) -1]
6 for j <- 1 to n2 do
7 R[j] <- A[q + j]
8 L[n1 + 1] <- infinity
9 R[n2 + 1] <- infinity
10 i <- 1
11 j <- 1
12 for k <- p to r do
12.5 count <- count + 1
13 if L[I] <= R[j]
14 then A[k] <- L[i]
15 i <- i + 1
16 else A[k] <- R[j]
17 j <- j + 1
Your code for Merge Sort should have a line of code in it that is equivalent to line 12.5 in the Merge pseudocode above. The global variable count will keep a running total of the number of times this line is executed. When you exit from the call to the Insertion Sort function, you should print out the values of n (the length of the array) and count as an indicator of the cost of the function.
Heap Sort:
Here is the pseudocode for the Heap Sort, modified to include a counter:
count <- 0
HeapSort(A)
1 Build_Max_Heap(A)
2 for i <- length[A] downto 2 do
3 exchange A[1] <-> A[i]
4 heap-size[A] <- heap-size[A] - 1
5 Max_Heapify(A, 1)
And here is the algorithm for Max_Heapify function used by Heap Sort:
Max_Heapify(A, i)
1 l <- LEFT(i)
2 r <- RIGHT(i)
3 if l <= heap-size[A] and A[l] > A[i]
4 then largest <- l
5 else largest <- i
6 if r <= heap-size[A] and A[r] > A[largest]
7 then largest <- r
8 if largest != i
9 then exchange A[i] <-> A[largest]
9.5 count <- count + 1
10 Max_heapify(A, largest)
And here is the algorithm for the Build_Max_Heap function used by Heap Sort:
Build_Max_Heap(A)
1 heap-size[A] <- length[A]
2 for i <- floor(length[A]/2) downto 1 do
3 Max_Heapify(A, i)
Your program for Heap Sort should have a line of code in it that is equivalent to line 10 in the Max_Heapify pseudocode above. In your program, a global counter should keep track of the number of times this line is executed.
Quick Sort:
Here is the pseudocode for Quick Sort, modified to include a counter
QuickSort(A)
1 Count <- 0
2 QUICKSORT(A, 1, length[A])
Here is the pseudocode for the QUICKSORT function used by Quick Sort:
QUICKSORT(A, p, r)
1 if p < r
2 then q <- PARTITION(A, p, r)
3 QUICKSORT(A, p, q-1)
4 QUICKSORT(A, q+1, r)
Here is the pseudocode for the PARTITION function used by QUICKSORT():
PARTITION(A, p, r)
1 x <- A[r]
2 i <- p - 1
3 for j <- p to r-1
3.5 count <- count + 1
4 do if A[j] <= x
5 then i <- i + 1
6 exchange A[i] <-> A[j]
7 exchange A[i+1] <-> A[r]
8 return i+1
Program Execution:
Your program should read in data from Num8.txt, run Insertion Sort on it, and store its results; read in data from Num16.txt, run Insertion Sort on it, and store its results, etc., up through file Num16284.
Next it should repeat the process, using Merge Sort, Heap Sort, and Quick Sort as the sorting routines.
When your program terminates, you should have 48 sets of results. Each set of results should contain:
Analyze and discuss the results of your experiment. You may want to plot your results using Excel or a graphing program to help you visualize the results better. At the very least answer the following questions: