CSSE 230 – Data Structures and Algorithm Analysis

CSSE 230 – Spreadsheet Project

Overview

For this project you will create a basic spreadsheet program. Your spreadsheet will have a simple graphical user interface. Your program will support numeric values (up to the precision of Java’s double type), boolean values, and strings.

You’ll also provide common spreadsheet functionality including formulae, fill down and right, copying a range of cells, moving a range of cells, loading and saving files, and setting the number of decimal places used to display numeric values.

Administrative Background

Caveats

This is the first time I’ve given this assignment. I don’t have a strong sense of how difficult it will be or how detailed the specification will need to be. My intent is that the project be an interesting way to bring together lots of the ideas we’ve worked with during the term. I don’t intend to crush your souls. Please give me feedback on how the project is going and ask questions if you need clarification. I will likely need to update the specification as we go and will make the changes obvious when I do that.

Team Assignment

This is a team assignment. My intention is not that you “divide and conquer” as much as that you have people to talk with as you write and test this program. Most teams will have four members. A reasonable approach is to work in pairs. You might begin with two people working on the GUI and two on the spreadsheet data structures. Then you could mix the pairs so that each pair has a GUI person and a data structure person.

Everything that you submit for this project should be understood by all team members. It is your responsibility to (a) not submit anything without first discussing it with your teammates, and (b) not let something your teammates write go “over your head” without making a strong effort to understand it (including having a teammate explain it to you, of course).

Team Grading Note

Each team member will write performance evaluations for all members of the team, including themselves. In general each team member will receive about the same grade, but I will consider the performance evaluations when assigning individual grades.

Subversion Repository

Your SVN repository for this assignment is:

    http://svn.cs.rose-hulman.edu/repos/csse230-200920-teamX

where X is your team number given in class. You should check out the


			Spreadsheet

project from this repository, and all subsequent work should be placed in this project folder and committed back to your repository.

Features

The subsequent subsections list the features that your program is requird to have and several potential additional features. This is the first time I’ve assigned this project, so I’m not sure what to expect. At this point, my thought is that an “A” project would include all of the required features and several of the additional features. A team that delivers all of the milestones, well done and on time, and produces a project that correctly implements all the required features will earn at least 75% on this assignment.

To ensure that every team member has a chance to learn about the full sweep of the project, four of the required features will have “end-to-end” owners. These features are labeled E2E below. You will identify an owner for each of these features on the first day of the project. The owner of an end-to-end feature must implement the code for the feature in both the user interface and the model. You may pair program the end-to-end features, but the feature owner must be the driver. (If your team only has three members, you only have to have owners for three of the end-to-end features.)

Required Features

A successful program must include all of the following features:

Select a cell, area, row or column.
Enter a number, boolean value, text, or formula into a selected cell. (Text entry can be in a separate text box.)
Correctly evaluate all formulae as described below.
Support “infinite” spreadsheets. (I don’t mean sheets with billions of formulae, but you should be able to have formulae separated by vast tracks of empty cells. Your GUI can be written to only handle fixed size spreadsheets, but the text user interface should support infinite spreadsheets. We'll take infinite to mean Integer.MAX_VALUE.)
E2E Copy a range of cells, appropriately translating cell references.
Move a range of cells, appropriately translating cell references.
E2E Fill the selected area (right or down) by correctly translating the formulae in the first column or row.
E2E Set the number of decimal places to use when displaying a number in a given cell.
E2E Save and load files.

Additional Features

Once your team completes the required features, you should try to complete as many of these additional features as time permits.

Insert a cell, area, row or column before the current selection.
Delete the current selection.
Change the width of the selected column and remember the change after saving and reloading.
Copy and paste.
Additional functions besides SUM and IF.
Goal solver (sets a selected cell to a given value by automatically modifying the value of some other cell)
Charts
Other features that you find interesting…

Details on Features

Your implementation of these features must satisfy the additional details listed below.

Addressing Cells and Areas

Your program will use the same approach the Excel uses for addressing cells and rectangular areas: columns are labeled by letters or strings of letters (ordered lexicographically), rows are labeled by positive integers, e.g., A12 or BH137. A rectangular area is denoted by listing the upper-left and lower-right cells, separated by a column, e.g., A12:C15.

In a formula, cell addresses may be absolute (meaning they don’t change when the formula is copied) or relative, e.g. A1, A$1, $A1, or $A$1. The syntax and semantics are the same as Excel’s.

Data Types

A cell will hold a number, a string, or a boolean value. Unlike Excel, strings must be enclosed in double quotes. To avoid more complications, all numbers can be represented internally as doubles. The provided parser package handles the details of processing entered values and formulae.

Formula Format

All formulae begin with = and spaces are ignored, except within quotes. Arithmetic operators are +, -, *, /, and ^ (power), with the usual precedence and associativity. Parentheses may be used for grouping. The minus sign may be unary (e.g., -2) or binary (e.g., 3 - 2). Comparison operators are <, <=, ==, >, >=, and !=. Boolean operators, like AND, OR, and NOT are not included, but you may add functions to support these.

Functions in formulae are called as in Java. That is, FUNC_NAME(comma-separated list of argument). The provided code recognizes two functions: IF and SUM.

IF takes three arguments. The first must evaluate to a boolean value. If the value is TRUE, then the second argument is evaluated to find the result of the entire IF expression. Otherwise the third argument is evaluated.

SUM takes an arbitrary number of arguments. It sums the values of all the arguments. If an argument refers to an area of cells, then it sums the values in all of the cells in the area.

See NodeFactory.validFunctionNames() in the provided code for information on adding your own functions.

Formulae must not have circular references. If the user attempts to add one, your program should indicate an error. (Try entering circular references in Excel and see what happens.)

Data Structures

The simplest way to represent the grid internally would be with a two dimensional array. However, I’m requiring support for “infinite” spreadsheets, which have a lot of empty cells, so a 2d array isn’t viable. You’ll need an alternative representation.

You’ll use an expression tree of “formula nodes” to represent cell contents. The provided parser package works with NodeFactory, which you will complete, to build these trees.

When the content of a cell is changed, all cells containing formulae that depend on it must be re-evaluated. We don’t want to re-evaluate every cell in the spreadsheet. Thus a “depends on me” graph structure could be useful. For each cell, you can keep track of the cells whose formulae directly refer to it, and in these cells keep track of the ones that refer to them, etc. An “I depend on” graph structure might also be useful. When the spreadsheet is first loaded, the program should always calculate a cell’s “I depend on” cells before calculating that cell. (Hint: you might read up on topological sort.)

Architectural Overview

The basic architecture of the provided code is shown in the UML diagram below.

UML Diagram for basic system architecture. (Click for larger view.)

The architecture uses several object-oriented design patterns.

The separation of the underlying data structures in the model package from the user interfaces in the textui and gui packages is an example of the Model-View-Controller pattern. The two user interface packages are views of the data. The model package is an abstract representation of the data. The SpreadsheetModel class acts as controller through which the views interact to change the model. User commands in the view are sent to the controller. This separates the views from the details of the model so that they can be developed more independently.
The FormulaParser is a façade. It hides the complexity of formula processing behind a simple interface consisting of just one method in a single class.
The NodeFactory class, which you will implement, is a factory class for creating instances of FormulaNodes.
The Location class also has a factory method to get new Location objects. Location objects are immutable, so rather than create dozens of such objects all referencing cell A1 (for example), the factory method returns the same Location object whenever we ask for location A1. These unique Location objects provide the basis for implementing cell references in your program as flyweight objects.

You may replace any provided code with your own, though I discourage wholesale changes. In particular, the functionality provided by the parser package would, on its own, be two weeks of work in CSSE 404–Compiler Construction.

Deliverables

Milestone dates are listed in the table below. This rest of this section describes the artifacts due at each milestone. Follow the instructions carefully. Your team may want to print a copy of this document, highlight the key things to do, and check them off as you do them.

Project Schedule and Milestones

All due times are start of the class session unless otherwise noted.

End of session 25	Cycle 1 User Stories
Session 27	UML Class Diagram, Cycle 1 Status Report, Cycle 2 User Stories
Session 28	UML Class Diagram, Cycle 2 Status Report, Cycle 3 User Stories
Session 30	UML Class Diagram, Cycle 3 Status Report, Final working code, Project Demonstration (in class)
11:59 p.m., Sunday, Feb. 22	Performance Evaluations

UML Class Diagram

You must submit a UML class diagram for each development cycle. Your diagram does not have to include anything from the parser or textui packages, but must show the model and gui packages and their classes and interfaces. Be sure to show the relationships and dependencies between your classes.

These UML diagrams are not just an academic exercise (except in that the whole project is one). Rather, your problem and team are both big enough that you’ll need some way to see the whole design in one picture and to discuss the design apart from the code.

You may use whatever software you like for creating your diagrams. One simpler alternative that is freely available is Violet, available here. (Download and double-click the jar file. Then click Class Diagram to get started.) Whatever you use to create them, your diagrams must be turned in use a standard image format such as pdf, png, or jpeg.

To be turned in:

To commit your UML Class Diagram, you’ll add it to your Eclipse project. For example, in Violet, choose File → Export to → Image File to save the file as CycleN.png to your Eclipse workspace → Spreadsheet → Planning folder. Within Eclipse, right-click the Spreadsheet project folder in the Package Explorer view, and choose Refresh. Your png file should appear under Planning. Now right-click the project folder again, and choose Team → Commit. Be sure the png file is checked so it is added to the repository. Enter an appropriate message and complete the commit.

You can use a similar technique as above to add any other documents to your repository.

User Stories

Short development cycles are a key feature of Extreme Programming. This is especially useful for projects where the development team—that’s you—is unfamiliar with the problem to be solved. At the start of each development cycle, the team negotiates with the customer on the work to be accomplished during that cycle.

The customer suggests user stories to be completed. These are short, often one sentence, descriptions of what the user would be able to do with the software. For example,

User launches program and sees an empty table and a text box where he/she could enter a formula.

This story doesn’t say anything about the text box actually working, so that is not part of this user story. A user story for the text box actually working might be something like,

User clicks a cell in the table, types a formula in the text box, and clicks the Enter button. The value of the formula appears in the previously clicked cell. Formulae are limited to basic arithmetic with constants.

User stories are powerful in two important ways. First, they make it very clear how to test whether the story has been completed. Second, they keep the development team focused on the importance of meeting the needs of the user.

After the customer has suggested some user stories for the cycle, the development team will decide whether it is reasonable to complete them all in the time available. If not, the customer will withdraw some of the proposed stories until an achievable set of stories is agreed upon.

For this assignment you’ll play both the customer role and that of the development team. As a customer, the number of features completed will help determine your grade (and how much fun you have with the project). As a development team, you’ll have to do the work to implement the user stories.

To be turned in:

At the start of each development cycle, you should commit a text document to your project repository listing the user stories that you plan to complete during that cycle. You can use File → New → Untitled Text File to add a new text file to an Eclipse project. Save your file to your Planning folder with a name like “Cycle N User Stories.txt”, where N is the cycle number. Commit the file to your repository.

Status Reports

At the end of each development cycle, you should commit a text document to your project repository indicating either that you completed all of the user stories for the cycle or else listing any user stories the team planned to complete but was not able to. Briefly state any complications that prevented you from completing the stories, for example, “We underestimated how hard it would be to implement the SUM function.”

Save your file to your Planning folder with a name like “Cycle N Status Report.txt”, where N is the number of the cycle just completed. Commit the file to your repository.

Final Working Code

We’ll grade the version of your code committed to your repository at the final-working-code deadline. Your code should be well-commented and should use appropriate class, method, field, and variable names. No Eclipse warnings should remain for your final code (according to the required CSSE 230 preferences file).

You must also provide JUnit tests for the parts of your program, particularly your model, that can be tested apart from the GUI. Note that you can test much of the larger scale functionality using the textui. I strongly encourage you to do that. With teams the size of yours, it is very easy to accidently break someone's code without noticing. Unit tests prevent that.

Some comments on Subversion and team projects:

Commit your code often. And don’t forget to update your code before working and before committing. The chances of SVN conflicts grow exponentially with the number of team members, but they decrease with the number of lines of code in the project. The net result is that you’ll have more trouble at the beginning of a project. For this reason it makes a lot of sense to carefully work on completely different classes in the beginning.

Demonstration

Each team will give a short (8-10 minute) demonstration of their program during the last class session. Each team member should participate in the presentation. In your presentation be sure to demonstrate all of the required features from this specification, plus any additional features that you chose to implement. You do not need to prepare any slides, just demonstrate your software.

To be turned in:

At the end of your presentation you must hand in a short outline or script of your presentation. This is to ensure that you've prepared the presentation before class.

Performance Evaluations

You will complete performance evaluations as for the Sliding Blocks project. Remember to keep an individual log or journal of your team members’ performance so that you can give specific examples in the performance evaluations.

This specification was written by Curt Clifton drawing on some ideas and text from a previous assignment by Claude Anderson.