Markov Chain Programming Assignment—Index
Note: None of the documents can be fully understood until you have read the others. They are all inter-related. That means that you should probably read all of them twice before beginning your design and code.
I suggest that you quickly read this document, then the other documents in the order listed. Then re-read all of them more carefully.
You may do this project alone or with a partner using pair-programming. You may talk with other people about it, of course, and get help as needed.
Your code must not contain any GUI components (such as Frame, JFrame, or JApplet). Far cry from recent assignments, I know!
Milestones
Milestone 1:
After reading the input, your program should produce a correct Markov chain of words (as described in the Markov document). All of the provided JUnit tests for milestone 1 should pass.
Milestone 2:
Process the words created by the Markov chain algorithm so that the output contains justified lines of the length specified (as described in the Justification document). All of the provided JUnit test cases for both milestones should pass.
Efficiency
Minor details of efficiency are not a concern here, but the running time should certainly be an asymptotically linear function of the number of input words. If your program takes 20 minutes to run on one of the large files in the texts folder, chances are good that you have an N² algorithm. The first thing you might look at is how you read the input.
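One common way for the input step to become quadratic in Java is to build up a single String with += inside the read loop, since each concatenation copies everything read so far. The sketch below is only an illustration, not the required design: it reads words into an ArrayList in a single pass. The class name WordReaderSketch and the method readWords are hypothetical, and splitting on whitespace is an assumption; the overview and Markov documents define what actually counts as a word and which data structures you must use.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, not part of the provided project template.
class WordReaderSketch {

    // Reads whitespace-separated words in one pass over the input.
    // Each word is appended to the end of an ArrayList (amortized constant
    // time), so the total work grows linearly with the size of the input.
    static List<String> readWords(Reader source) throws IOException {
        List<String> words = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Assumption: words are separated by whitespace only.
                StringTokenizer tokens = new StringTokenizer(line);
                while (tokens.hasMoreTokens()) {
                    words.add(tokens.nextToken());
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for a file here; a FileReader on one of
        // the input texts could be passed in the same way.
        List<String> words = readWords(new StringReader("the quick brown\nfox jumps"));
        System.out.println(words.size() + " words: " + words);
        // prints "5 words: [the, quick, brown, fox, jumps]"
    }
}
```

If a run on a large file is still slow after the input is read this way, the next places to look are any loops that search or copy the whole word list once per word.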
Documents and folders
There are five main documents associated with this assignment:
- overview.html describes the goals of the assignment, the four required command-line arguments, the data structures to be used, and the submission instructions.
- markov.html describes the Markov Chain algorithm that you are to use to generate semi-random text.
- justification.html describes the algorithm for producing text that is left and right justified.
- examples.html gives examples of using several data structures and methods useful for this assignment.
- The turn-in instructions.
- The grading rubric. [Word document]
There is also a folder, multiset. We are providing a MultiSet class for you. The multiset folder contains:
- javadoc for the MultiSet class, and
- TestMultiSet.java, which should help you to understand how the MultiSet class works.
The bytecode, but not source code, for MultiSet is included in the Eclipse template you will use.
Suggested process for getting started on this assignment
- Read all the documents once briefly, then again more thoroughly. You may want to print them and highlight the key items that you need to remember.
- Check out the Markov project from your SVN repository, listed here.
- Make sure the project runs in your Eclipse, both as a Java application and as JUnit tests.
- For your own development work, you will probably want to run the application and view its output. You can tell Eclipse to use command-line arguments by choosing Run → Run... and selecting the Arguments tab. (You could also add test cases directly to the main method; just be sure to comment them out when you are done debugging.)
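To check that the arguments entered in the Arguments tab actually reach your program, a throwaway class like the one below can help. It is only a sketch: the class name ArgsEchoSketch and the method describe are hypothetical, and it deliberately does not interpret the four required arguments, which are defined in overview.html.

```java
// Hypothetical scratch class for checking an Eclipse run configuration.
class ArgsEchoSketch {

    // Builds one line per argument, in the order the arguments were received.
    static String describe(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < args.length; i++) {
            sb.append("args[").append(i).append("] = ").append(args[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // With nothing entered in the Arguments tab, this prints nothing.
        System.out.print(describe(args));
    }
}
```

Once the echoed values match what you typed, the same run configuration can be pointed at the real main class.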
Why can't I just do it my way?
Why do you have to use the data structures that I specify, instead of simply writing the program "from scratch"?
Some of the later assignments will leave more of the design decisions to you, but others will not. It is important to get practice in implementing other people's specifications and designs, and using classes that are provided for you.