Runtime Exploration
Work on this exercise by yourself,
but ask questions of your instructor, student assistants and classmates as desired.
Goals
The goals of this exercise are to:
- Learn about informal use of asymptotic analysis (also known as Big-Oh).
- Review the use of some control structures.
- Observe how choice of control structures affects efficiency in practice.
The Exercise
Instructions Part 1 - Background
For many applications, efficiency is critical. Consider search.
If Google's search took 10 seconds to find the first page of hits, they'd go out of business.
A first step in designing efficient algorithms is to be able to analyze their runtimes. Runtimes are affected
primarily by two things:
- The speed of the processor, and how much of the processor and memory other applications are using.
- The control structures used by the code itself: code that loops over the data more times will take longer to run.
Since this is a software development class, we will focus on the software aspects.
Even focusing on the software, some problems would take more time to solve
than you have been alive, even on the fastest machines,
while others take almost no time at all, even on slow machines.
This is due both to the inherent complexity of the problem and to how well the
algorithm used to solve the problem was designed.
How do we measure efficiency?
Say I have two algorithms and I want to measure the number of assignment statements
made when processing an array of a given size n. One piece of code uses
f(n) = 2n^2 + 13n + 5 statements and another uses g(n) = n^3 - n^2 + 4 statements.
What part of each polynomial contributes most to the runtime?
Informally, I say that f(n) is O(n^2), pronounced "big-Oh of
n-squared". Why? What about g(n)?
As a rule of thumb, we take the highest-order term and drop the coefficient.
If the function is constant, like h(n) = 25, then I say it is
O(1).
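To make the rule of thumb concrete (the numbers here are chosen purely for illustration):
at n = 1000, f(n) = 2(1000)^2 + 13(1000) + 5 = 2,000,000 + 13,000 + 5, so the 2n^2 term
accounts for well over 99% of the count. Doubling n roughly quadruples f(n) and roughly
multiplies g(n) by 8, which is exactly the growth predicted by O(n^2) and O(n^3).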
You will learn formal definitions of big-Oh
(and others) in CSSE230. Here, we will run some experiments to connect
actual code with big-Oh and runtimes.
Instructions Part 2 - Get the starting code
- Check out Runtime from your repository
- Open Runtime.java. Notice some
code has already been written for you. Please discuss with your instructor as
needed.
Instructions Part 3 - Write the code and execute the project
To do this, we will use a simple function f(), as shown in the code, and
experiment with putting various control structures around the call to f(). For
each case, hypothesize what the big-Oh runtime will be,
then confirm or reject that hypothesis with data: vary n (you'll probably have to
make it quite large to get runtimes over 1000 ms), look at the runtime,
and determine the relationship between the
runtime and n. Actually running code for various sizes of input and timing each run
to determine the behavior is called profiling the code. (Rough sketches of the
four cases appear after the list below.)
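If it helps, here is a minimal sketch of the kind of timing harness you might put
around each case. Only f() comes from the starting code; the variable names and the
surrounding structure are assumptions, so adapt them to Runtime.java as needed.

    // Minimal profiling sketch (assumed names; adapt to the provided Runtime.java).
    long startTime = System.currentTimeMillis(); // clock reading before the work

    for (int i = 0; i < n; i++) {
        f(); // the call whose repetitions we are measuring
    }

    long elapsedMillis = System.currentTimeMillis() - startTime;
    System.out.println("n = " + n + ", elapsed = " + elapsedMillis + " ms");

Record the elapsed time for several values of n so you can plot runtime against n later.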
- A single for loop (running from 1 to n (or more naturally, 0 to (n-1))), calling f() in its body.
- Sequential for loops, each running from 0 to (n-1) and calling f() in its body.
- Nested for loops with both inner and outer loops running from 0 to (n-1), in which f() is called within the inner loop.
- Nested for loops in which f() is called once from within the inner loop and
once outside the inner loop, but still inside the outer loop.
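In case it is useful to see them side by side, here is one possible shape for each of
the four cases in Java. These are sketches only: the loop bounds and variable names are
assumptions, and f() is the method from the starting code.

    // Case 1: a single loop -- f() is called n times.
    for (int i = 0; i < n; i++) {
        f();
    }

    // Case 2: two sequential loops -- f() is called 2n times in total.
    for (int i = 0; i < n; i++) {
        f();
    }
    for (int j = 0; j < n; j++) {
        f();
    }

    // Case 3: nested loops -- f() is called n * n times.
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            f();
        }
    }

    // Case 4: nested loops with one extra call per outer iteration -- n * n + n calls.
    for (int i = 0; i < n; i++) {
        f(); // outside the inner loop, but still inside the outer loop
        for (int j = 0; j < n; j++) {
            f();
        }
    }

Wrap each of these in the timing harness above and record the runtimes before you graph them.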
To help you see the relationships between these cases, graph the runtimes of
all four of them (vs. n).
Here are some options:
- Excel doesn't easily graph plots whose x-coordinates aren't uniformly spaced,
but you may find a way to do it (an XY scatter chart is one possibility).
- You could use Maple: instead of supplying a list of functions to graph
together, like we did in class, you can give it a list of lists of points. For
example, plot([[[0,0], [1,1]], [[0,0], [1, -1]]]) gives two separate line
segments emanating from the origin, one into quadrant I and one into quadrant
IV.
- I provided a Matlab script for you in this folder to use for your plots if you'd like.
Can you apply this in a more general setting?
Consider (you don't need to write any code): what are the big-Oh runtimes when processing an array of size n...
- ...if you just have to look at each entry individually?
- ...if you have to look at each pair of entries individually?
To turn in:
- Please submit your 4 graphs. You should choose an appropriate zoom factor for each
so that the shape of the graph is obvious. (You are allowed to plot multiple
curves on a single graph IF one zoom factor shows the shape of both.)
- Please write up your results in
a paragraph that summarizes your findings, being
sure to use big-Oh notation correctly. (For example, the first answer is: A single loop leads to code that is O(n), where n is
the number of times the loop repeats.) Also try to explain why each big-Oh result is what it is.