
Big O Notation

In computer science and programming, we're always looking for "good" solutions. In many cases, the good solution is the one that performs the fastest, which generally means performing the fewest computations.

To meaningfully compare algorithmic performance, we can use big O notation, sometimes referred to as "order of growth." In short, it compares the asymptotic behavior of algorithms: how does their performance scale as a function of the input size?

Informally, \(f(x) = O(g(x))\) if \(f\) is "less than or about the same size as" \(g\) as \(x\) gets very large.

For example, consider two algorithms whose step counts are \( f(x) = 4x \) and \(g(x) = 10x .\) These are "about the same size" as \(x\) gets very large, since both grow linearly and the constant factors \(4\) and \(10\) become irrelevant. Thus, \(4x = O(10x),\) and \(10x = O(4x).\)

On the other hand, compare \(4x\) and \(x^2:\) \(4x\) grows more slowly than \(x^2\) as \(x\) gets very large. Thus, \(4x = O(x^2),\) but \(x^2\) is not \(O(4x).\)

Formally speaking, if \(f\) and \(g\) are functions, then \(f = O(g)\) if there exists some constant \(C\) such that \[\left|f(x)\right| \le C \cdot \left|g(x)\right|\] for sufficiently large \(x.\)

Another way you could look at this is to divide \(f\) by \(g\) and see if the ratio stays bounded as \(x\) gets large. In the examples just given, \(10x/4x = 2.5\) for every \(x,\) so \(10x = O(4x);\) but \(x^2 /4x = x/4, \) which goes to infinity as \(x\) gets large, so \(x^2\) is not \(O(4x).\)
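To get a feel for this ratio test, here is a minimal Python sketch (our own illustration, not part of the course) that tabulates \(f(x)/g(x)\) at growing values of \(x:\)

```python
# Illustrative sketch: to check whether f = O(g), watch the ratio f(x)/g(x)
# as x grows. A ratio that stays bounded suggests f = O(g); a ratio that
# grows without bound suggests it is not.

def ratio_table(f, g, xs):
    for x in xs:
        print(f"x = {x:>8}: f(x)/g(x) = {f(x) / g(x):.4f}")

xs = [10, 100, 1000, 10**6]

# 10x vs. 4x: the ratio is the constant 2.5, so 10x = O(4x).
ratio_table(lambda x: 10 * x, lambda x: 4 * x, xs)

# x^2 vs. 4x: the ratio x/4 grows without bound, so x^2 is not O(4x).
ratio_table(lambda x: x ** 2, lambda x: 4 * x, xs)
```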

When the bound holds in both directions \((\)that is, when there are positive constants \(c\) and \(C\) such that \(c \cdot \left|g(x)\right| \le \left|f(x)\right| \le C \cdot \left|g(x)\right|\) for sufficiently large \(x),\) it is said that \(f = \Theta(g).\)

Since the theta notation is a special case of big O notation, the term "big O notation" can refer to either, even though only one of them uses an O.

The reason for this formulation is that we aren't too concerned about constant factors in algorithmic performance, but rather the dependency on the input size, especially when data sets can be extraordinarily large.

When \(f=O(g),\) we're saying that as input sizes become very large, \(f\) performs at most a constant factor more work than \(g,\) so the two behave "similarly enough."

For example, if one algorithm takes \(100\log_2(n)\) computations and another takes \(n,\) the constant \(100\) becomes entirely irrelevant for large \(n.\) Even for \(n=10000,\) the \(n\) algorithm uses about 7.5 times as many computations. We want to build algorithms that scale!
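Here is a quick numerical check of that claim in Python (a sketch of our own, using the numbers above):

```python
import math

# Compare 100*log2(n) computations against n computations at n = 10000.
n = 10_000
log_cost = 100 * math.log2(n)  # about 1329 computations
linear_cost = n                # 10000 computations

print(f"100 log2(n): {log_cost:.0f}")
print(f"n:           {linear_cost}")
print(f"ratio:       {linear_cost / log_cost:.1f}")  # about 7.5
```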

In addition, when we use big O notation, we often "drop" lower-order terms and keep only the most significant one. Remember that we're looking at situations where input sizes become very large, which is why the lower-order terms can safely be dropped.

For example, consider a two-step algorithm whose steps take \(O(n\log_2(n))\) and \(O(n^{2})\) time, respectively. The overall runtime for this algorithm in big O notation is \(O(n^{2})\). Although technically we can write the total runtime as \(O(n\log_2(n) + n^{2}),\) the \(n^{2}\) term grows much faster than the \(n\log_2(n)\) term as \(n\) grows very large, so the sum is simply \(O(n^{2}).\)
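As a concrete illustration, the hypothetical routine below (its name and task are our own invention) performs an \(O(n\log_2(n))\) sorting step followed by a naive \(O(n^2)\) all-pairs step, so the routine as a whole is \(O(n^2):\)

```python
from itertools import combinations

def count_close_pairs(values, eps):
    """Count pairs of values within eps of each other.

    Step 1 sorts the input: O(n log n) comparisons.
    Step 2 scans all pairs: O(n^2) pairs, which dominates for large n,
    so the routine as a whole is O(n^2).
    """
    ordered = sorted(values)               # step 1: O(n log n)
    count = 0
    for a, b in combinations(ordered, 2):  # step 2: O(n^2)
        if b - a <= eps:
            count += 1
    return count

print(count_close_pairs([1.0, 1.05, 3.2, 3.21, 9.0], 0.1))  # 2
```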

Which of the following is the worst case runtime (number of comparisons) for binary search in a list of \(n\) elements?

Note: Binary search requires the list already be sorted, so you can assume that to be the case.
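If you'd like to experiment before answering, here is a Python sketch of binary search that also counts its comparisons (the counting convention, one three-way comparison per loop iteration, is our own simplification):

```python
def binary_search(items, target):
    """Search a sorted list; return (index or -1, number of comparisons).

    Each iteration halves the remaining search range, which is the key
    to binary search's worst-case behavior.
    """
    lo, hi = 0, len(items) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1  # count one three-way comparison per iteration
        if items[mid] == target:
            return mid, comparisons
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, comparisons

print(binary_search(list(range(1024)), -1))  # (-1, 10)
```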

As one additional practice problem to ensure you've got the technical big O definition down, which of the following is/are true? \[ \begin{aligned} \text{A:} &\quad 2^{2n+1} = O\big(2^{n}\big) \\ \text{B:} &\quad 2^{n+1} = O\big(2^{n}\big) \end{aligned} \]

At this point, you're encouraged to go back to the previous few quizzes and think about how we can frame the conclusions we came to in terms of big O notation.

For example, what is the worst case number of comparisons in linear search? What about the average number of comparisons?
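For reference while you think it through, here is a minimal Python sketch of linear search with a comparison counter (our own illustration):

```python
def linear_search(items, target):
    """Scan left to right; return (index or -1, number of comparisons).

    If the target is absent (or sits at the end), every one of the
    n elements is compared; if it is equally likely to be anywhere,
    about half the list is scanned on average.
    """
    for i, value in enumerate(items):
        if value == target:
            return i, i + 1
    return -1, len(items)

print(linear_search([4, 8, 15, 16, 23, 42], 23))  # (4, 5)
print(linear_search([4, 8, 15, 16, 23, 42], 99))  # (-1, 6)
```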

What is the number of comparisons on average for insertion sort?

Hint: A reasonable (though over-simplified) assumption is that, on average, the \(n^\text{th}\) successive value is compared to half of the values before it.
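To test the hint's assumption empirically, here is a Python sketch of insertion sort with a comparison counter, plus a small random experiment (the \(n^2/4\) estimate follows the hint above; the experiment itself is our own):

```python
import random

def insertion_sort_comparisons(items):
    """Insertion-sort a copy of items; return (sorted list, comparisons).

    Under the hint's simplification, the k-th value is compared to about
    k/2 earlier values; summing k/2 for k up to n gives roughly n^2/4,
    which is O(n^2).
    """
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] > key:
                a[j + 1] = a[j]  # shift the larger value right
                j -= 1
            else:
                break
        a[j + 1] = key
    return a, comparisons

# Rough empirical check on random inputs (our own experiment):
n = 200
avg = sum(insertion_sort_comparisons(random.sample(range(10 * n), n))[1]
          for _ in range(100)) / 100
print(f"average comparisons: {avg:.0f}, n^2/4 = {n * n // 4}")
```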

Comparing the efficiencies of different solutions to the same problem is a key part of computer science, and big O notation gives us a standardized way of doing so. In our course on Computer Science Algorithms, big O notation is explored in significant detail on a wide range of algorithms.

Warning: It's important not to conflate the best big O complexity with the "right" answer. Right answers only exist in the context of the problem you need to solve, and there are often tradeoffs between simplicity and efficiency. You do not always need the most efficient data structure or algorithm in practice, especially when you are dealing with small data sets. For example, insertion sort is a perfectly fine way to sort a poker hand!


In the next quizzes, we’re going to explore a tool called recursion, and then dive into (and evaluate with big O notation!) some more efficient sorting algorithms.
