Programming Assignment:
Warm Up and Stretching

Objectives

To get you programming in Java again if it has been a while. To have you think about using basic data structures. To have you think about how linked lists are implemented. To perform simple algorithmic tasks on arrays. To recall how to write simple recursive code. To remember fundamentals of object-oriented programming.

There are also some basic infrastructure issues to review, such as checking projects out and committing projects to a git repository, and running JUnit tests. We have included notes to help you get up to speed on these things. Be sure to ask for help if you’re having trouble with those details. The main challenge on this assignment is supposed to be solving the programming problems, not wrestling with the tools. Please don’t waste your time on something that we’re more than happy to show you how to do.

Many of the problems have appeared on timed CSSE 220 exams in the past, so we know that they can be done quickly if you are up-to-speed on Java programming.

Check-out Instructions

Please follow the instructions in this video for information on how to accept the github for this assignment. The invitation link is available on Moodle, under Week 1 Material, Programming Assignment.

Be sure to commit your work often as you make progress on these programs. This will help you recover from mistakes, power failures, and freak accidents.   (Later, on team projects, frequent commits will facilitate teamwork on different machines, and will enable instructors and graders to observe the progress of your team.) What will be graded is what is committed to your repository, so do not forget to commit and push your final changes.

 

Seven Small Programming Tasks for this Assignment

Anagrams

Two strings are anagrams if the characters in one string are an exact rearrangement (permutation) of the characters in the other string. In this problem, we extend the definition to allow for different cases of letters, so that "Data Structure" and "Saturated Curt" should be considered to be anagrams. Spaces are included in the "permuted characters."  In the anagram.Anagram class, we provided the stub for a method, isAnagram(), which takes two Strings and determines whether they are anagrams. You should complete this method. When it is working, it should pass all of the unit tests in anagram.AnagramTests.  

To run a particular set of unit tests: right click the test file in Eclipse’s Package Explorer, then choose Run As → JUnit Test. You can right click a package to run all the tests in a package or right click a project to run all of its tests. Once you’ve run a set of tests, you can run the same tests again by using the green “play” button in the JUnit view's tool bar. Talk to an assistant or your instructor if you need help getting up-to-speed on writing/using JUnit tests.

Commit and push your solution.

Maps

Maps are very useful structures for storing key-value pairs, something that you’ll do throusands of times in your career. If you are new to maps, you should first learn the basic operations (put, get, containsKey). Then, complete the first (small) task: read through the code in the MapImplementationReview class and make a small change to it. This is to remind you of an important difference between the two most commonly used implementations of maps. Following this, you should complete a task using maps: implement the nGramCounter method in the NGramCounting class, which counts the number of instances of each n-length substring in an input string of text. (This method would be useful, for example, in running frequency analysis on a text. Check out the test file for an example of this. Such analysis can be used, e.g., to efficiently break the classical substitution cipher. )

Euclid's GCD Algorithm

You will implement many recursive algorithms this term, since one of the main data structures, binary trees, is a recursive structure. In this part, you'll implement Euclid's algorithm, which is a recursive way to calculate the greatest common divisor (GCD) between two numbers, much faster than looping over all possibilities. Specifically, given two integers, a and b, their GCD is equal to the GCD of b and the remainder when a is divided by b. This causes the numbers to get small quickly.  Eventually, when b is 0, the GCD is simply a. For example, gcd(42, 18) = gcd(18, 6) = gcd(6, 0) = 6. The unit tests have more examples, including one with a very special sequence of numbers that causes the worst-case runtime possible for this algorithm. Do you recognize that sequence? The wikipedia page has code if you get stuck or want to check it out once you finish.        

Efficient Search

This package contains code to search an array for a given item. It looks at the items, one at a time, which was easy to code but is slow, taking O(n) time in the worst-case. So if there are 1,000,000 items in the array, it will have to make that many comparisons in the worst case. Since the array is sorted, the much-more efficient BINARY search, which runs in O(log n) worst case time, will work. If there are 1,000,000 items in the array, it will only have to make ~20 comparisons.

Replace it with binary search. You can remind yourself of the algorithm from the CSSE220 materials or the wikipedia page.

We don't care if you do this one recursively or using a loop.

ComparingShapes

As you know, the Arrays.sort() method that is part of the Java API can only sort an array of objects if they are Comparable, that is, they implement the Comparable interface. Run ShapeTest.java to sort an array of Circle objects and notice that it fails. Make circles Comparable to one another. (The best syntax is to write, class Circle implements Comparable<Circle>.) Then sort an array of triangles both by perimeter and by area. Wait, you say, if I make them Comparable using perimeter, I can't also make them Comparable using area! True. You will need to write and use Comparators to do this. Make changes both to ShapeTest.java and Triangle.java to do this, following the TODOs and noting that we did perimeter for you. (The TODOs give a hint if you don't know how to find the area of an arbitrary triangle from its side lengths.) We'll be using Comparable and Comparators often this term, so this quick review exercise is worth your time. Then, you'll repeat the exercise using a TreeSet, so you won't even have to call sort()!

SortedLinkedList

The list package contains the  DoublyLinkedList class. This class implements a doubly linked list of Comparable objects. Study this code. It's more interesting than code you wrote in CSSE220 in that it uses inheritance and polymorphism. First, note how it uses polymorphism for the nodes of the list.  In particular, a DoublyLinkedList has a special head node and tail node that do not contain actual list data but make it easier to do the list's operations.

Second, in the list package is a skeletal definition of a new class called SortedLinkedList, which is-a DoublyLinkedList (double because each node points both to the next and previous nodes), but as the name implies, the elements of such a list are kept in increasing order. Comments in the SortedLinkedList code briefly explain what each method is supposed to do and/or return. Looking at the unit tests in SortedLinkedListTests may also help clarify what the methods are supposed to do.

You are to finish the implementation of SortedLinkedList by completing each TODO item. When it is working, it should pass all of the unit tests in SortedLinkedListTests.

You can view a listing of TODO items in Eclipse by turning on the Task view. Choose Window → Show View → Tasks.

PriorityQueue

Priority queues were not discussed directly in CSSE220, but the idea behind them is simple. The files for this task are in the priorityQueue package. A PriorityQueue is a data structure that gives special access to the smallest element (which has the highest priority; like if you are position 1 (first) in line, you have the highest priority). If you want to read a quick explanation, see section 8.4 in our text, but you can ignore the discussion of heaps, which we'll study later in the term. One possible interface to a PriorityQueue is via the three methods insert, findMin, and deleteMin (in the text, they are called push, peek, and pop, respectively). The method insert(pri, element) inserts element into this priority queue with priority pri. (Alternatively, we could have insert(element) take just a single argument, the element itself, where the type of this element implements Comparable; then, relative priority of different elements is determined by using the natural ordering, as determined by the compareTo method. This is the convention you will use in this problem.) Calling findMin() returns the element with minimum priority, and deleteMin() deletes that lowest-priority element. If two elements have the same priority, the ordering of the elements themselves is used to "break the tie" when determining which is the minimum element of the priority queue.

One common theme in this course is to use one data structure as the basis for implementing another one. Here, you will implement a PriorityQueue by using an ArrayList. The PQ class (partially) does this; you are to complete the task. (Later in the course, we’ll learn about a more efficient approach, the Binary Heap). Your job is to fill in the details of the constructor and the three methods of the ArrayListMinPQ class. Your code must pass the unit tests in the same package.  


Another theme is to write code that is efficient. To earn the last 2 points on this one, set up and use your ArrayList such that findMin() is O(1) and deleteMin() is amortized O(1). There is no runtime requirement for insert(). You’ll have to figure out how to do this, but I’ll give you some hints: (1) The text shows the priority queue's elements to be in order, so keep your array in order when you insert so you always know where the min element is! You can do this with a loop or you may find a static method from the Collections class to be helpful. Remember, I don't care how slow insert() is. Double-hint: if you remove the first element, it's slow to shift the other data over to fill the gap: you learned this in the first reading assignment. But if you order it so that the smallest one is at the end, no other data needs to be shifted when you delete it - that is much faster. Make sure that you don't break the unit tests to do this; if you don't have 10/10 on the unit tests, you usually cannot earn any efficiency points. Remember, correctness is king!

Turn-in Instructions

Turn in your work by committing it to your Git repository and pushing it to the server. We recommend committing often, whenever you have made some progress toward a solution to one of the problems. Your last commit/push date will determine the status of your submission as on-time, early, or late. See the Late Days policy in the course syllabus for more details.

Grading

For many assignments, a small portion of your grade will be based on comments and style.  Follow these guidelines:

  1. Good variable and method names: Eclipse has name completion (ALT /), so the “typing cost” of using long names is small
  2. Defensive programming: Use local variables (instead of fields) anywhere you can’t explicitly justify creating instance fields.
  3. No super-long lines of code
  4. No super-long methods: use top down design and procedural abstraction.
  5. Consistent indentation (ctrl-shift f)
  6. Blank lines between methods, space after punctuation
  7. Don't compare boolean operations to true. Thus, use if (empty(tree)) instead of if (empty(tree) == true).
  8. Also, when returning boolean expressions, write
     return x > 0;

    rather than

    if (x > 0) {
       return true
    } else {
       return false;
    }

For the general guidelines that the graders will use, see this page.