Markov Chain Programming Assignment—Index
Note: None of the documents can be fully understood until you have read the others. They are all inter-related. That means that you should probably read all of them twice before beginning your design and code.
I suggest that you quickly read this document, then the other documents in the order listed. Then re-read all of them more carefully.
This is a pair programming assignment. You must do your programming as a pair. You may talk with other people about it, of course, and get help as needed.
Your code must not contain any GUI components (such as Frame, JFrame, or JApplet).
Milestones
Milestone 1: After reading the input, your program should produce a correct Markov chain of words (as described in the Markov document). All of the provided JUnit tests for milestone 1 should pass.
Milestone 2: Process the words created by the Markov chain algorithm so that the output contains justified lines of the length specified (as described in the Justification document). All of the provided JUnit test cases for both milestones should pass.
Efficiency
Minor details of efficiency are not a concern here, but the running time should certainly grow linearly with the number of input words. If your program takes 20 minutes to run on one of the large files in the texts folder, chances are good that you have an N² algorithm. The first thing to look at is how you read the input.
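To illustrate the point about input, here is a minimal sketch of reading words in time proportional to the size of the input. The class name WordReaderSketch and the method readWords are made up for this illustration and are not part of the provided template; the actual input requirements are given in overview.html. The pattern to avoid is building one big String with repeated += inside a loop, since each concatenation copies everything read so far and the total work becomes quadratic.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper for illustration only; not part of the assignment template.
class WordReaderSketch {

    /** Reads whitespace-separated words, touching each input character a constant number of times. */
    static List<String> readWords(Reader source) {
        List<String> words = new ArrayList<>();          // appending is amortized constant time
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Split this one line only; never re-scan text that was read earlier.
                StringTokenizer tokens = new StringTokenizer(line);
                while (tokens.hasMoreTokens()) {
                    words.add(tokens.nextToken());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> words = readWords(new StringReader("the quick brown\nfox  jumps"));
        System.out.println(words);   // prints [the, quick, brown, fox, jumps]
    }
}
```

To read from a file instead, you could pass a java.io.FileReader to the same method. Whatever you do with the words afterwards should also avoid re-scanning the whole list for every word.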
Documents and folders
There are six main documents associated with this assignment:
- overview.html describes the goals of the assignment, the four required command-line arguments, the data structures to be used, and the submission instructions.
- markov.html describes the Markov Chain algorithm that you are to use to generate semi-random text.
- justification.html describes the algorithm for producing text that is left and right justified.
- examples.html gives examples of using several data structures and methods useful for this assignment.
- The turn-in instructions.
- The grading rubric. [Word document]
There is also a folder, multiset. We are providing a MultiSet class for you. The multiset folder contains:
- Javadoc for the MultiSet class, and
- TestMultiSet.java, which should help you to understand how the MultiSet class works.
The bytecode, but not source code, for MultiSet is included in the Eclipse template you will use.
Suggested process for getting started on this assignment
- Read all the documents once briefly, then again more thoroughly. You may want to print them and highlight the key items that you need to remember.
- Check out the Markov project from your pair SVN repository.
- Make sure the project runs in your Eclipse, both as a Java application and as JUnit tests.
- For your own development work, you will probably want to run the application and view its output. You can tell Eclipse to use command-line arguments by choosing Run → Run Configurations... and selecting the Arguments tab. (You could also add test cases directly to the main method; just be sure to comment them out when you are done debugging.)
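If you prefer the second approach from the last step, one possible pattern is to fall back to a set of debugging arguments only when none are supplied, so that the program still behaves normally when run from Run Configurations or the command line. This is only a sketch: the class ArgsSketch, the method chooseArgs, and the placeholder values are invented for illustration. The real arguments and their order are described in overview.html.

```java
import java.util.Arrays;

// Hypothetical example for illustration only; not part of the assignment template.
class ArgsSketch {

    /** Returns the real command-line arguments, or the fallback if none were given. */
    static String[] chooseArgs(String[] args, String[] fallback) {
        return args.length > 0 ? args : fallback;
    }

    public static void main(String[] args) {
        // Placeholder values for debugging; replace them with the four
        // arguments that overview.html requires, and remove or comment
        // out this fallback when you are done debugging.
        String[] debugArgs = {"arg1", "arg2", "arg3", "arg4"};

        String[] effective = chooseArgs(args, debugArgs);
        System.out.println(Arrays.toString(effective));
    }
}
```

With this arrangement, whatever you type in the Arguments tab takes precedence over the hard-coded values.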
Why can't I just do it my way?
Why do you have to use the data structures that I specify, instead of simply writing the program "from scratch"?
During your career some projects will leave more of the design decisions to you, but others will not. It is important to get practice in implementing other people's specifications and designs, and using classes that are provided for you.