Data Structures

In computer science and programming, we’re always looking for “good” solutions. In many cases, the good solution is the one that performs the fastest. Sometimes we're interested in measuring this in terms of the number of computations, and other times we are just interested in the algorithm that takes the least amount of time as measured by a stopwatch.

Imagine you are creating a new social network. There are two algorithms you can use to find all the cat pictures in your social network. Algorithm Q takes 0.01 seconds to return a result on your test data; Algorithm L takes 0.03 seconds to return the same result on the same test data.

Even though Algorithm L takes longer on your test data, it might be a better idea to implement Algorithm L than the apparently faster Algorithm Q! Let's explore one reason why this might be the case.

The Speed of Algorithms

One of the important aspects of most problems in computer science is that problem inputs usually change, and they usually get bigger. Your social network is likely to gain more users, more connections, and more cat pictures as time goes on.

The number of seconds your algorithms take to run, $t$, could depend heavily on the total number of people in your social network, $n$. In particular, the amounts of time the two algorithms use might look like this:

If Algorithm Q took 0.01 seconds on your test data and Algorithm L took 0.03 seconds on the same test data, how many people were in your test data?

If your social network grows to have about 11,000 users in the first month, then Algorithm Q, which was faster on the test data, will take 1.21 seconds. Algorithm L, which was three times slower on the test data, is now 30 times faster, taking only 0.04 seconds.

Algorithm Q gets worse and worse as $n$ gets bigger and bigger. With a million users, the formerly speedy Algorithm Q takes more than 2 hours, while Algorithm L takes 0.06 seconds. All else being equal, Algorithm L is the obvious choice for a social network that might grow.

What should we learn from this? That the test data wasn't enough for you to evaluate the two algorithms! You needed to know how the algorithms changed when $n$ (the total number of users) got bigger. Big O notation is the most popular tool for thinking and talking about this problem in computer science.
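The exact running-time formulas aren't shown here, but a simple model consistent with every number quoted in this lesson is that Algorithm Q's time grows like $n^2$ and Algorithm L's like $\log n$. The Python sketch below uses hypothetical formulas chosen to match those figures; the constants are assumptions, not something the lesson specifies.

```python
import math

# Hypothetical running-time models -- the exact formulas are an assumption,
# chosen so they reproduce the numbers quoted in the lesson.
def time_Q(n):
    """Algorithm Q: quadratic growth."""
    return n**2 / 1e8            # 0.01 s at n = 1,000; 1.21 s at n = 11,000

def time_L(n):
    """Algorithm L: logarithmic growth."""
    return 0.01 * math.log10(n)  # 0.03 s at n = 1,000; 0.06 s at n = 1,000,000

for n in (1_000, 11_000, 1_000_000):
    print(f"n = {n:>9,}: Q takes {time_Q(n):>8.2f} s, L takes {time_L(n):.2f} s")
```

Under this model, a million users cost Algorithm Q 10,000 seconds (nearly three hours) while Algorithm L still finishes in a fraction of a second.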

Say that $\text{time}_Q(n)$ is a function that represents the amount of time that Algorithm Q takes to deal with $n$ users. This function is described by a quadratic formula, so the function is roughly proportional to $n^2$.

Similarly, say that $\text{time}_L(n)$ is the amount of time Algorithm L takes to run when given $n$ users to inspect. The function can be written with a logarithm; it's roughly proportional to $g(n) = \log n$.

Big O gives computer scientists a rigorous mathematical definition for describing how fast algorithms run.

Using big O notation, we would say that $\text{time}_Q \in O\big(n^2\big)$ and that $\text{time}_L \in O(\log n)$. Without going into the notation, these two statements communicate to a computer scientist that the two functions are very different. They mean that Algorithm Q takes a quadratic amount of time to run, so it would be common to call it a quadratic-time algorithm. Algorithm L, similarly, would be called a logarithmic-time algorithm.

A computer scientist will generally pick Algorithm L: logarithmic-time algorithms are generally very fast, even on very large inputs.

One reason big O can be confusing is that big O notation is used in two different ways, and both are important!

• Big O notation is a formal, mathematical way of describing how functions grow.
• Big O notation is an informal, conversational language that computer scientists use ubiquitously to describe how fast algorithms run.

Let's talk about the informal, conversational idea first.

Tiffany needs to figure out how to find all the cat DNA in her massive DNA database. Jill tells Tiffany that she knows of three algorithms that will do this and that the algorithms are in $O(\log n)$, $O(n)$, and $O(2^n)$.

All else being equal, which one will Tiffany presumably pick for her project?

Here's the formal, mathematical definition of big O:

$O\big(g(n)\big)$ is a set of functions. A function $f(n)$ is in the set $O\big(g(n)\big)$ if the ratio between $f$ and $g,$ $\frac{f(n)}{g(n)},$ eventually goes and stays at or below some constant $C$ as $n$ gets very large.

For example, $3n + 1$ is in $O(n)$ because the ratio between $f(n) = 3n + 1$ and $g(n) = n$ stays at or below $C = 4$ as long as $n \geq 1$.

The ratio $\frac{f(n)}{g(n)} = \frac{3n + 1}{n}$ approaches 3 as $n$ gets bigger and bigger. But be careful, and pay attention to the definition! The ratio $\frac{3n+1}{n}$ is never at or below 3 for positive $n$, so we can't use $C = 3$ to show that $3n + 1 \in O(n)$.

The constant $C$ in the definition of big O is not describing the number that the ratio approaches! It's describing an upper bound on the ratio. The ratio $\frac{f(n)}{g(n)} = \frac{3n + 1}{n}$ stays at or below $C = 4$ as long as $n \geq 1$, but it also goes and stays below $C = 6$, and below $C = 16$. Therefore, $C = 4$, $C = 6$, and $C = 16$ are all valid upper bounds, even though $C = 3$ is not.
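This is easy to check numerically. A small Python sketch of the ratio:

```python
def ratio(n):
    """The ratio f(n) / g(n) for f(n) = 3n + 1 and g(n) = n."""
    return (3 * n + 1) / n

# The ratio starts at exactly 4 and decreases toward 3, so C = 4 is a
# valid upper bound for all n >= 1 ...
assert ratio(1) == 4.0
assert all(ratio(n) <= 4 for n in range(1, 10_000))

# ... but the ratio never reaches 3, so C = 3 is not a valid bound.
assert all(ratio(n) > 3 for n in range(1, 10_000))
```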

Similarly, $f(n) = 16n^2$ is in $O\big(n^2\big)$ because the ratio between $f(n)$ and $g(n) = n^2$ is always at or below which of the following constants $C?$ (Select all that apply.)

The mathematical definition of big O means a function like $f(n) = 3n^2 + n + 50$ is in $O\big(n^2\big),$ because the ratio between the function $f(n)$ and the function $g(n) = n^2$ eventually goes below the constant $C = 4$ and then stays there:

As a more complicated example, $f(n) = \big(\sin(10 \cdot n) + 1\big) \cdot n + 4$ is in $O(n)$, because the ratio between the function $f(n)$ and the function $g(n) = n$ eventually goes below the constant $C = 3$ and then stays there:

The specific constant doesn't matter for big O. We could have picked the constant $C = 10$ for both examples above, and that would have also been correct.
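Both ratios can be checked numerically. This Python sketch evaluates each ratio at many values of $n$ and confirms the bounds claimed above:

```python
import math

def ratio_poly(n):
    """(3n^2 + n + 50) / n^2 = 3 + 1/n + 50/n^2, approaching 3 from above."""
    return (3 * n**2 + n + 50) / n**2

def ratio_sin(n):
    """((sin(10n) + 1)n + 4) / n = sin(10n) + 1 + 4/n, which oscillates."""
    return ((math.sin(10 * n) + 1) * n + 4) / n

# The polynomial ratio dips to or below C = 4 by n = 8 and stays there.
assert all(ratio_poly(n) <= 4 for n in range(8, 10_000))

# The oscillating ratio is at most 2 + 4/n, so it stays below C = 3 once n > 4.
assert all(ratio_sin(n) < 3 for n in range(5, 10_000))
```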

Here's the definition of big O again:

$O\big(g(n)\big)$ is a set of functions. A function $f(n)$ is in the set $O\big(g(n)\big)$ if the ratio between $f$ and $g,$ $\frac{f(n)}{g(n)},$ eventually goes and stays at or below some constant $C$ as $n$ gets very large.

Which of these statements are true according to the formal, mathematical definition of big O?

There are four functions shown on the graph below:

According to the formal, mathematical definition of big O, which of these algorithms appear to have a running time that is in $O(n)?$

Big O is a pessimistic way of estimating how long an algorithm will take to run. The mathematical definition permits the actual running time to grow much more slowly than the big O bound implies. Big O notation just puts a kind of upper limit on how fast the running time can increase.

Remember Tiffany and Jill? Tiffany needed to figure out how to find all the cat DNA in her database, and Jill told Tiffany she knew of three algorithms that will do this: one in $O(\log n)$, one in $O(n)$, and one in $O(2^n)$.

Relying only on the formal, mathematical notion of big O, it's possible that these three algorithms might all take a logarithmic amount of time to run, because $\log n \in O(\log n)$, $\log n \in O(n)$, and $\log n \in O(2^n)$.
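These three memberships can be verified numerically. In the Python sketch below, the constant $C = 1$ happens to work for all three ratios:

```python
import math

# log n is a member of all three sets: its ratio to log n, to n, and to
# 2^n each stays at or below the constant C = 1 for the values tried here.
for n in (10, 100, 1_000):
    L = math.log(n)
    assert L / math.log(n) <= 1   # ratio to log n is exactly 1
    assert L / n <= 1             # log n never outgrows n
    assert L / 2**n <= 1          # ... and is dwarfed by 2^n
```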

It would be mathematically justified, but very strange and unhelpful, if Jill said that three different logarithmic-time algorithms were in $O(\log n)$, $O(n)$, and $O(2^n)$, respectively.

When a computer scientist uses big O in an informal, conversational way to communicate how fast an algorithm is (or how fast she thinks an algorithm might be), she will give the smallest big O time she knows how to give.

You have an algorithm that operates on cat pictures with $n$ cats in them. You determine that this algorithm will always run in fewer than $\frac{n^{1.72}}{93.2}$ seconds.

Your boss wants to know the running time of the algorithm. Ideally, you should say it is in $O\big(n^{1.72}\big)$, but your boss just wants to know whether it's in $O(n)$, $O\big(n^2\big)$, or $O\big(n^3\big)$, or none of these. What's the best answer you could give in this situation?
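Applying the definition directly shows which of the boss's options fit. This Python sketch uses the bound from the text and compares it against $g(n) = n$ and $g(n) = n^2$:

```python
def f(n):
    """The running-time bound from the text, in seconds."""
    return n**1.72 / 93.2

# Ratio to g(n) = n^2 shrinks toward 0 as n grows, so some constant C
# certainly bounds it: f is in O(n^2) (and hence in O(n^3) as well).
assert f(10) / 10**2 > f(10_000) / 10_000**2 > f(1_000_000) / 1_000_000**2

# Ratio to g(n) = n is n^0.72 / 93.2, which grows without bound, so no
# constant C works: f is NOT in O(n).
assert f(1_000_000) / 1_000_000 > 100
```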

We're often interested in the big O classification of some polynomial function. These are easy! If you have a polynomial, you can find the big O by looking at the biggest exponent. (This is called the degree of the polynomial.)

Here's an example: $\frac{n^5}{1000} + 1000 n^3 + 300 n^2 + 20 n + 5.$ This polynomial has degree 5 because of the $n^5$, so the whole polynomial will be in $O\big(n^5\big)$.

Informally, this is because larger exponents grow much faster than smaller exponents. They grow so much faster that we can effectively ignore the smaller exponents as $n$ gets really big.
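A quick numeric check makes this concrete: as $n$ grows, the ratio between the whole polynomial and $n^5$ settles toward the leading coefficient, so the lower-degree terms really do become negligible.

```python
def f(n):
    """n^5/1000 + 1000n^3 + 300n^2 + 20n + 5."""
    return n**5 / 1000 + 1000 * n**3 + 300 * n**2 + 20 * n + 5

# The ratio f(n) / n^5 equals 1/1000 + 1000/n^2 + 300/n^3 + ..., which
# settles toward the leading coefficient 1/1000 as n grows, so any
# constant C above 1/1000 eventually works: f is in O(n^5).
for n in (1_000, 10_000, 1_000_000):
    assert f(n) / n**5 <= 1

assert abs(f(1_000_000) / 1_000_000**5 - 1 / 1000) < 1e-6
```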

The insertion sort algorithm that we explored previously can sort an array of length $n$ with no more than $\frac{n^2 - n}{2}$ comparisons. That means that the running time of insertion sort will never be more than $kn^2 - kn$, where $k$ is some constant.
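As a sketch (this counting helper is illustrative, not the earlier lesson's code), here is one way to count insertion sort's comparisons; on a reversed array, the count hits the $\frac{n^2 - n}{2}$ bound exactly.

```python
def insertion_sort_comparisons(arr):
    """Sort a copy of arr with insertion sort, counting element comparisons."""
    a = list(arr)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # one comparison of key against a[j]
            if a[j] > key:
                a[j + 1] = a[j]       # shift the larger element right
                j -= 1
            else:
                break
        a[j + 1] = key
    return a, comparisons

n = 8
worst = list(range(n, 0, -1))         # reversed input: the worst case
result, count = insertion_sort_comparisons(worst)
assert result == sorted(worst)
assert count == (n * n - n) // 2      # worst case hits the bound exactly
```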

What is the correct big O classification for insertion sort?

Tiffany has a program that rearranges an array of length $n$, but she doesn't know how it works. She notices that when she doubles the length of the array, it consistently quadruples the amount of time it takes for the algorithm to run.

What will she guess about the running time of this algorithm?
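One way to reason about what Tiffany observes: if the running time were quadratic (a purely hypothetical model below, with a made-up constant), then doubling $n$ would multiply the time by $\frac{(2n)^2}{n^2} = 4$, exactly the quadrupling she sees.

```python
def mystery_running_time(n):
    # Hypothetical model: suppose the program's running time is quadratic.
    return 3e-9 * n**2

# Doubling the input size multiplies a quadratic running time by
# (2n)^2 / n^2 = 4 -- the quadrupling Tiffany observes.
for n in (1_000, 10_000, 100_000):
    growth = mystery_running_time(2 * n) / mystery_running_time(n)
    assert abs(growth - 4.0) < 1e-9
```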

Now you know a bit more about big O, an important and multipurpose tool in computer science. When you encounter big O notation in the world, it is important to remember that there are two ways that people use the notation, and they're not quite the same!

• Big O is a mathematical notation for talking about sets of functions by giving an upper limit to how fast those functions can grow.
• Big O is an informal way that computer scientists talk about how much an algorithm is expected to slow down as the algorithm is given bigger and bigger inputs.
