In computer science and programming, we’re always looking for “good” solutions. In many cases, the good solution is the one that performs the fastest. Sometimes we're interested in measuring this in terms of the number of computations, and other times we are just interested in the algorithm that takes the least amount of time as measured by a stopwatch.

Imagine you are creating a new social network. There are two algorithms you can use to find all the cat pictures in your social network. Algorithm Q takes 0.01 seconds to return a result on your test data; Algorithm L takes 0.03 seconds to return the same result on the same test data.

Even though Algorithm L takes longer on your test data, it might be a better idea to implement Algorithm L than the apparently faster Algorithm Q! Let's explore one reason why this might be the case.

One of the important aspects of most problems in computer science is that *problem inputs usually change, and they usually get bigger*. Your social network is likely to gain more users, more connections, and more cat pictures as time goes on.

The number of seconds your algorithms take to run, \(t\), could depend heavily on the total number of people in your social network, \(n\). In particular, the amounts of time the two algorithms use might look like this:

If Algorithm Q took 0.01 seconds on your test data and Algorithm L took 0.03 seconds on the same test data, how many people were in your test data?

If your social network grows to have about 11,000 users in the first month, then Algorithm Q, which was *faster* on the test data, will take 1.21 seconds. Algorithm L, which was *three times slower* on the test data, is now *30 times* faster, taking only 0.04 seconds.

Algorithm Q gets worse and worse as \(n\) gets bigger and bigger. With a million users, the formerly speedy Algorithm Q takes more than 2 hours, while Algorithm L takes 0.06 seconds. All else being equal, Algorithm L is the obvious choice for a social network that might grow.

What should we learn from this? That the test data wasn't enough for you to evaluate the two algorithms! You needed to know how the algorithms changed when \(n\) (the total number of users) got bigger. *Big O notation* is the most popular tool for thinking and talking about this problem in computer science.

Say that \(\text{time}_Q(n)\) is a function that represents the amount of time that Algorithm Q takes to deal with \(n\) users. This function is described by a quadratic formula, so the function is roughly proportional to the function \(f(n) = n^2\).

Similarly, say that \(\text{time}_L(n)\) is the amount of time Algorithm L takes to run when given \(n\) users to inspect. The function can be written with a logarithm; it's roughly proportional to \(g(n) = \log n\).

Big O gives computer scientists a rigorous mathematical definition for describing how fast algorithms run.

Using big O notation, we would say that \(\text{time}_Q \in O\big(n^2\big)\) and that \(\text{time}_L \in O(\log n)\). These two different big Os communicate to a computer scientist that the two functions are very different. Algorithm Q takes a quadratic amount of time to run, so it would be common to call it a *quadratic-time algorithm*. Algorithm L, similarly, would be called a *logarithmic-time algorithm*.

A computer scientist will generally pick Algorithm L: logarithmic-time algorithms are generally very fast, even on very large inputs.

One reason big O can be confusing is that big O notation is used in two different ways, and both are important!

- Big O notation is a
**formal, mathematical**way of describing how functions grow. - Big O notation is an
**informal, conversational**language that computer scientists use ubiquitously to describe how fast algorithms run.

Let's talk about the **informal, conversational** idea first.

Tiffany needs to figure out how to find all the cat DNA in her massive DNA database. Jill tells Tiffany that she knows of three algorithms that will do this and that the algorithms are in \(O(\log n\)), \(O(n)\), and \(O(2^n)\).

All else being equal, which one will Tiffany presumably pick for her project?

Here's the **formal, mathematical** definition of Big-O:

\(O\big(g(n)\big)\) is a set of functions. Any other function \(f(n)\) is in the set \(O\big(g(n)\big)\) if the ratio between \(f\) and \(g,\) \(\frac{f(n)}{g(n)},\) eventually goes and stays at or below some constant \(C\) as \(n\) gets very large.

For example, \(f(n) = 3n + 1\) is in \(O(n)\) because the ratio between \(f(n)\) and \(g(n) = n\) stays at or below \(C = 4\) as long as \(n \geq 1\).

The ratio \(\frac{f(n)}{g(n)} = \frac{3n + 1}{n}\) approaches 3 as \(n\) gets bigger and bigger. **But be careful, and pay attention to the definition!** The ratio \(\frac{3n+1}{n}\) is never at or below 3 for positive \(n\), so we can't use \(C = 3\) to show that \(3n + 1 \in O(n)\).

The constant \(C\) in the definition of Big-O is not describing the number that the ratio approaches! It's describing an *upper bound* on the ratio. The ratio \(\frac{f(n)}{g(n)} = \frac{3n + 1}{n}\) stays at or below \(C = 4\) as long as \(n \geq 1\), but it *also* goes and stays below \(C = 6\), and below \(C = 16\). Therefore, \(C = 4\), \(C = 6\), and \(C = 16\) are all valid upper bounds, even though \(C = 3\) is not.

Similarly, \(f(n) = 16n^2\) is in \(O\big(n^2\big)\) because the ratio between \(f(n)\) and \(g(n) = n^2\) is *always at or below* which of the following constants \(C?\)

The mathematical definition of big O means a function like \(f(n) = 3n^2 + n + 50\) is in \(O\big(n^2\big),\) because the ratio between the function \(f(n)\) and the function \(g(n) = n^2\) eventually goes below the constant \(C = 4\) and then stays there:

As a more complicated example, \(f(n) = \big(\sin(10 \cdot n) + 1\big) \cdot n + 4\) is in \(O(n)\), because the ratio between the function \(f(n)\) and the function \(g(n) = n\) eventually goes below the constant \(C = 3\) and then stays there:

The specific constant doesn't matter for big O. We could have picked the constant \(C = 10\) for both examples above, and that would have also been correct.

Here's the definition of Big-O again:

\(O\big(g(n)\big)\) is a set of functions. Any other function \(f(n)\) is in the set \(O\big(g(n)\big)\) if the ratio between \(f\) and \(g,\) \(\frac{f(n)}{g(n)},\) eventually goes and stays at or below some constant \(C\) as \(n\) gets very large.

Which of these statements are true according to the formal, mathematical definition of big O?

Here's the definition of big O again:

\(O\big(g(n)\big)\) is a set of functions. Any other function \(f(n)\) is in the set \(O\big(g(n)\big)\) if the ratio between \(f\) and \(g,\) \(\frac{f(n)}{g(n)},\) eventually goes and stays at or below some constant \(C\) as \(n\) gets very large.

Which of these statements are true according to the formal, mathematical definition of big O?

The previous question pointed out a difference between the formal and informal uses of big O. In informal usage, it's common to expect that if an algorithm is in \(O(2^n)\), its running time will grow exponentially. However, a much slower-growing function like \(f(n) = \log n\) is also in \(O(2^n)\).

You have four algorithms that rate cat pictures. Here are four functions that describe how the four algorithms' running times change as the algorithms are given more cat pictures.

According to the formal, mathematical definition of big O, which of these algorithms appear to have a running time that is in \(O(n)?\)

Big O is a *pessimistic* way of estimating how long a function will take to run. The mathematical definition permits the actual running time of a function to grow much more slowly than big O implies. Big O notation just puts a kind of upper limit on how fast the running time can increase.

Remember Tiffany and Jill? Tiffany needed to figure out how to find all the cat DNA in her database, and Jill told Tiffany she knew of three algorithms that will do this: one in \(O(\log n\)), one in \(O(n)\), and one in \(O(2^n)\).

Relying only on the **formal, mathematical** notion of big O, it's possible that these three algorithms might *all* take a logarithmic amount of time to run, because \(\log n \in O(\log n)\), \(\log n \in O(n)\), **and** \(\log n \in O(2^n)\).

It would be mathematically justified, but *very strange and unhelpful*, if Jill said that three different logarithmic-time-requiring algorithms were in \(O(\log n\)), \(O(n)\), and \(O(2^n)\), respectively.

When a computer scientist uses big O in an

informal, conversationalway to communicate how fast an algorithm is (or how fast she thinks an algorithm might be), she will give thesmallestbig O time she knows how to give.

Tiffany has an algorithm that operates on cat pictures with \(n\) cats in them. She determines that this algorithm will always run in fewer than \(\frac{n^{1.72}}{93.2}\) seconds.

The best thing would be for her to say that the algorithm is in \(O\big(n^{1.72}\big)\), but her boss just wants to know whether it's in \(O(n)\), \(O\big(n^2\big)\), or \(O\big(n^3\big)\), or none of these. What's the *best* answer she could give in this situation?

We're often interested in the big O classification of some polynomial function. These are easy! If you have a polynomial, you can find the big O by looking at the biggest exponent. (This is called the *degree* of the polynomial.)

Here's an example: \[\frac{n^5}{1000} + 1000 n^3 + 300 n^2 + 20 n + 5.\] This polynomial has degree 5 because of the \(n^5\), so the whole polynomial will be in \(O\big(n^5\big)\).

Informally, this is because larger exponents grow much faster than smaller exponents. They grow so much faster that we can effectively ignore the smaller exponents as \(n\) gets really big.

The insertion sort algorithm that we explored previously can sort an array of length \(n\) with no more than \(\frac{n^2 - n}{2}\) comparisons. That means that the running time of insertion sort will never be more than \(kn^2 - kn\), where \(k\) is some constant.

What is the correct big O classification for insertion sort?

Tiffany has a program that rearranges an array of length \(n\), but she doesn't know how it works. She notices that when she *doubles* the length of the array, it consistently *quadruples* the amount of time it takes for the algorithm to run.

What will she guess about the running time of this algorithm?

Now you know a bit more about big O, an important and multipurpose tool in computer science. When you encounter big O notation in the world, it is important to remember that there are two ways that people use the notation, and they're not quite the same!

- Big O is a mathematical notation for talking about sets of functions by giving an upper limit to how fast those functions can grow.
- Big O is an informal way that computer scientists talk about how much an algorithm is expected to slow down as the algorithm is given bigger and bigger inputs.

×

Problem Loading...

Note Loading...

Set Loading...