More on Big O Notation
Background
When we study algorithms, we want to have some sort of way of analyzing their effectiveness. ALL work in algorithm development is basically a big competition to see who can do something the fastest. And since algorithms are by definition implementation independent, we need to have some way of guessing how fast the algorithm is going to be without actually programming it and running it.
OK. So, we're all big time researchers who have just developed what we think is the best algorithm out there. How do we convince others of that? Well, we're interested in showing that our algorithm will be the best, no matter how big the input data set size is. It's not very interesting to consider small data set sizes since most things are pretty fast for small data sets no matter how bad the algorithm is. So, we want to look at the asymptotic complexity of our algorithm. The term asymptotic basically means at or near the limit, which in our case, is infinity (or at least the computer's idea of infinity). In other words, really, really big. This is also referred to as time complexity, and we often want the worst case, so we look at worst case time complexity.
Asymptotic (Big O) Notation
So, we will use what has been developed as the standard for comparing algorithms. The goal of this notation is to identify the dominant factor in determining the execution time of our algorithm. In almost all cases, this will be some function of the size of our input data set, almost always denoted by n. So, the goal of this whole exercise is to be able to look at our code and see how the execution time, or at least our prediction of it, varies with the size of the data set.
Usually, we count steps of an algorithm. While this is a loose definition, it usually refers to operations performed, such as multiplication or addition. It’s not even necessary to always accurately count the steps, as long as we know whether the number of steps is a constant, independent of n, or is a function of n. For example, the following block of code is a constant number of steps and will not change, no matter how big our input size is:
x[0] = initvalue;
x[n-1] = initvalue;
In contrast, the following block of code does vary with the size of n:
for (i = 1; i < n; i++)
x[i] = x[i-1]*factor;
}
Okay, so we understand how to count steps. How does this relate to Big O notation? Big O notation refers to the notation used for analyzing an upper bound of an algorithm. The above examples were very clearly counted; however, there are many algorithms where it is impossible to tell how many steps will be taken. Therefore, we talk about determining the best case, average case or worst case behavior. We can use Big O notation for all of these, but in most cases, we are talking about the worst case.
So, what is Big O notation mean? In technical terms, it’s:
f(n) = O (g(n)) iff there exist positive constants c and n0 such that f(n) £ cg(n), for all n ³ n0
In other words, for all sufficiently large n, g(n) is an upper bound for f.
If you don’t understand the technical definition, don’t panic. Just try reading it a few times—it may eventually make sense. If it still doesn’t, here’s the scoop: we’re trying to predict the "time" of our algorithm using a function of n, our input size. In other words, we’re trying to determine f(n). Big O tries to determine an upper bound for f using another function g(n). The criteria for this function is that it is greater than the actual "time" of the algorithm, represented by f(n), with some relaxations: there can be a constant positive multiple for g(n) and, the upper bound, g(n) need only hold true for sufficiently large n > n0.
Still not make sense? Well, it is a rather involved way of defining an upper bound of a function, but it is a mathematical term and mathematical terms have to be precisely defined. What you need to understand is that Big O is used for stating the upper bound of the running time of an algorithm. And that that upper bound ignores constants since for presumably large enough n, those get lost in the noise. Also, it's recognized that it's nearly impossible to precisely determine constants for algorithms since they'll be affected by the implementation of the algorithm.
So, how can we used Big O notation? Well, we can use it to bound the worst case behavior of an algorithm. Technically, we could choose any large function and use that as a bound, but that wouldn’t be very useful. For example, we could bound the following loop at O(n1000):
for (i=0; i<n; i++){
x[i] = 0;
}
But, what would the point of that be? It’s much more accurate to state that the algorithm is O(n), not any larger bound. So, if we have some super spectacular algorithms that we’ve just developed, we’d want to spend quite a bit of time determining the worst case behavior of the algorithm and determining the smallest upper bound possible.
In some cases, the code time doesn’t have to vary with the size of n. In that case, we say that the code takes constant time or O(1). Of course, this is the smallest upper bound possible, so it would be nice for our algorithm to have constant running time!
Some other random notes:
When we see that something is O(n), it is the same as saying that it is order n.
There are a limited number of functions of n that you’ll probably see in this class or in practice. They are:
O(1): constant. Doesn’t change with the problem size.
O(logn): logarithmic. Very slow growing with the problem size. When a problem size doubles, the function increases by a constant. The problem size must be increased by n2 in order for the time to double.
O(n): linear. Increases at the same rate as the increase in the problem size. When n doubles, so does the problem size.
O(nlogn): More than linear, but not by much.
O(n2): Quadratic. When a problem size doubles, the time quadruples.
O(n3): Cubic. When a problem size doubles, the time increases eightfold.
O(2n): Exponential.
O(n!): Factorial. HUGE!
Let’s understand how the different functions grow as a size of n:
|
Time |
n=1 |
n=2 |
n=4 |
n=8 |
n=16 |
n=32 |
|
1 |
1 |
1 |
1 |
1 |
1 |
1 |
|
log n |
0 |
1 |
2 |
3 |
4 |
5 |
|
n |
1 |
2 |
4 |
8 |
16 |
32 |
|
nlogn |
0 |
2 |
8 |
24 |
64 |
160 |
|
n2 |
1 |
4 |
16 |
64 |
256 |
1024 |
|
n3 |
1 |
8 |
64 |
512 |
4096 |
32768 |
|
2n |
2 |
4 |
16 |
256 |
65536 |
4294967296 |
|
n! |
1 |
2 |
24 |
40326 |
20922789888000 |
2613x1033 |
Other Bounds
Similar to Big O, we have two other bounds: W and q . They are defined in the same way as Big O, so if you understand that, you’ll understand the following bounds. The first, W , is for defining the lower bound of the running time. It is defined as :
f(n) = W (g(n)) iff there exist positive constants c and n0 such that f(n) ³ cg(n), for all n ³ n0. In other words, for all sufficiently large n, g(n) is a lower bound for f.
Determining the lower bound for an algorithm is typically done to show how bad an algorithm is. If you can state that the algorithm takes at least some large function, then it doesn’t matter what the upper bound is.
Finally, there is q : f(n) = q (g(n)) iff there exist positive constants c and n0 such that f(n) = cg(n), for all n ³ n0 . In other words, for all sufficiently large n, g(n) is a tight bound for f. This is the best type of bound that we can give an algorithm since it is a tight one. That is, g(n) is both a lower and upper bound for f. We can determine a tight bound when we can accurately "count" the steps of an algorithm, like the above for loops. Those are both actually q (n), not just O(n).
Determining Complexity
So, how do we determine the asymptotic complexity of an algorithm? It’s a bit like solving a mystery—we look for some clues and do some deduction to arrive at the answer. Some obvious things—
Examples
Okay. Let’s try some examples.
Suppose algorithm A has time O(n) and B has running time O(nlogn). Will B always be faster than A? What about the case where c = 100 for algorithm A, but 5 for algorithm B. At what point does B start getting faster than A?
True or False
n2 = O(n3)
n3 =O(n2)
2 n2 + 1 = O(n2)
(n+1)2 = O(n2)
n2 - 3n -18 = W (n)