CSSE 230
SortingRaces Programming Assignment
Overview
For the first part of this assignment, you will need your Binary Heap from a previous assignment. See here: heaps.html
In this assignment, you will compare the speeds of various O(n log n) sorting algorithms.
-
You will study the runtimes of five sorting
algorithms: three built into Java, and two that you write (or have
recently written!).
Built-in:
- mergesort. Java's mergesort is supposed to be optimized
when array is partially sorted.
- quicksort. Java's quicksort is an optimized
"dual-pivot" version that is supposedly faster than the single pivot
version we learned.
- bstSort. Here you just insert the items into a binary
search tree and then iterate through the tree, copying the data back
into the array. (Can you explain why this is O(n log n)? It would be
great if we could use your solution to EditorTrees, but EditorTrees
don't sort their data like AVL trees do. Please use java's TreeSet
instead. Since TreeSets don't hold duplicates, you'll need to account for those somehow (see FAQ below).
Ones you provide:
- heapsort. done in part 1.
Drag and drop BinaryHeap and BinaryHeapTest into your
project. If Eclipse gives you an error in BinaryHeapTest, hover your
mouse over the first error at the top of that file, and it will give
suggestions as to how to fix it (you might need to click the error to
see the suggestions). Choose the last option, "Fix Project Setup" and
then add JUnit 4 to the build path in the next window. Then run the BinaryHeapMoreTests.java
tests to see how your heap does with a more complete set of tests. Debug it if needed.
- quicksort. Implement a
simple version like we did in class so we can see the effects of Java's
optimizations. For simplicity, you can hardcode it for ints. Try to
recall it from the concepts you remember from class, but feel free to
refer to our Weiss textbook as needed. You may have to modify the
given code to handle cases where the pivot repeats. We suggest making a
class called Quicksort with a static sort(array) method.
-
Test your code using at least four kinds of arrays, so you
can investigate sort performance under various conditions.
- Random order: We have given you tests for this.
- Random order, but guaranteed to have no duplicates. To
create such an array, you can create an array such that a[i] = i, and
shuffle it. The Collections class has a static method called
shuffle
that will efficiently and throughly shuffle the elements.
- Almost sorted. Start with a sorted array and perform
0.01*n swaps of pairs of elements in random
positions (to "slightly" shuffle it). This will introduce some
inversions.
- Almost sorted, but backwards. Just reverse the array
from the previous step.
Your array size should be large enough so that all of your
sorts take approximately 25 milliseconds or more. If you go much smaller than that,
it will be hard to tell them apart. You can go much larger for most of
the sorts.
Frequently Asked Questions
- Java's TreeSet doesn't allow for duplicates? So
what if my array has them? Answer:
Find any reasonable way to deal with this. For example, you could use a TreeMap instead of a TreeSet, if you are clever.
Or you can tell when it tries to insert a duplicate and can't.
Store those values and add them to the array in the right place once you are copying the data back into the array.
Compared to the sort runtime, this shouldn't take long. There are typically very few duplicates to account for
(for example, I once had 2 duplicates when testing with n=50000 elements).
You should find a way to add them all in one O(n) loop through the array.
- The quicksort code you gave us in class doesn't handle the
i and j walking out of bounds. What should I do? Answer:
Modify the
code to handle that case.
- The quicksort code you gave us in class is giving me a stack overflow error in my recursion. Answer: That code also doesn't consider the case where
the pivot element is duplicated - you'll need to handle that too! (When
you are sorting arrays with unique elements, this won't be an issue.)
- I want to display array ar1 while debugging, but
there is no
ar1.toString()
. Do I need to write my own
method to print the array?
Answer: No. Instead, call Arrays.toString(ar1)
Deliverables
Commit and push your Git repo, including:
- All changed .java files
- All newly created .java files
- Your completed Analysis.docx file
Grading outline
- (30 pts) Binary Heaps and heapsort, as discussed in that
specification. Copy BinaryHeap class into this project.
- (20 pts) Code for your other sorts, and your timing code.
Each sort you provide should be a
method of the appropriate class (so BinaryHeap has a
sort()
method). For Quicksort, sort()
can be static since you don't need an instance of the
class.
- (20 pts) A writeup giving a 2D table of sort times (sort
vs. data), committed to the repo. Discuss the performance of each algorithm. Did all of them
run fine? Were there any memory issues? Which was fastest for each type
of array? Was anything surprising or did you expect this? From what you
know, try to explain what you see. Easiest for me and the grader:
I put a template document called Analysis.docx in your Eclipse project.
Just edit that in Word to add your analysis, and then commit it using Eclipse.