
# Computer Science Fundamentals

The fundamental toolkit for the aspiring computer scientist or programmer.

# Big O Notation

In computer science and programming, we’re always looking for “good” solutions. In many cases, the good solution is the one that performs the fastest, which generally means using the fewest computations.

To meaningfully compare algorithmic performance, we can use big O notation, sometimes referred to as "order of growth." In short, it compares the asymptotic behavior of algorithms; that is, how does their performance scale as a function of the input size?

Formally speaking, if $$f$$ and $$g$$ are functions, then $$f = O(g)$$ if there exists some constant $$C$$ such that $\left|f(x)\right| \le C \cdot \left|g(x)\right|$ for sufficiently large $$x.$$

Colloquially, this is saying that for large inputs $$x,$$ $$f$$ is less than or equal to $$g$$ up to some constant scale factor. For example, if $$f(x) = 3x$$ and $$g(x) = x,$$ then $$f=O(g)$$ since we could take any constant $$C \ge 3$$ and the inequality would hold.

A related notion is $$f = \Theta(g),$$ which holds when the bound goes both ways: there exist positive constants $$C_1$$ and $$C_2$$ such that $$C_1 \cdot \left|g(x)\right| \le \left|f(x)\right| \le C_2 \cdot \left|g(x)\right|$$ for sufficiently large $$x.$$

Since $$f = \Theta(g)$$ implies $$f = O(g),$$ the term "big O notation" is often used to refer to either, even though only one of them uses an O.

The reason for this formulation is that we aren’t too concerned about constant factors in algorithmic performance, but rather the dependency on the input size, especially when data sets can be extraordinarily large.

When $$f=\Theta(g),$$ we're saying that when input sizes scale to become very large, the relative performance of these algorithms is about constant, so they behave "similarly enough."

For example, if one algorithm takes $$100\log_2(n)$$ computations and another takes $$n,$$ the constant $$100$$ becomes entirely irrelevant for large $$n.$$ Even for $$n=10000,$$ the $$n$$ algorithm uses about $$7.5$$ times as many computations, and the gap only widens from there. We want to build algorithms that scale!
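To make this concrete, here is a minimal Python sketch. The cost functions below are the hypothetical ones from above, treating each "computation" as one unit of work:

```python
import math

# Hypothetical cost models: one algorithm does 100 * log2(n)
# computations, the other does n.
def log_cost(n):
    return 100 * math.log2(n)

def linear_cost(n):
    return n

# The constant 100 matters for small n (the linear algorithm is
# actually cheaper at n = 100), but fades as n grows.
for n in [100, 10_000, 1_000_000]:
    print(n, round(log_cost(n)), linear_cost(n))
```

Note the crossover: for small inputs the "worse" big O class can still win, which is exactly why big O talks about sufficiently large $$n.$$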

In addition, when we use big O notation, we often "drop" lower-order terms and keep only the most significant one. Since we're looking at situations where input sizes become very large, the lower-order terms can be safely ignored.

For example, consider a two-step algorithm whose steps take $$O(n\log_2(n))$$ and $$O(n^{2})$$ time. Although technically we can write the total runtime as $$O(n\log_2(n) + n^{2}),$$ the $$n^{2}$$ term grows much faster than the $$n\log_2(n)$$ term as $$n$$ grows very large, so the overall runtime is simply $$O(n^{2}).$$
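As an illustration, here is a hypothetical two-phase function (the name `closest_pair_sum` and its task are invented for this example): a sorting phase that costs $$O(n\log_2(n)),$$ followed by an all-pairs phase that costs $$O(n^{2})$$ and dominates the total:

```python
def closest_pair_sum(values, target):
    """Return the pair sum closest to target.

    Phase 1 is O(n log n) and phase 2 is O(n^2), so the total is
    O(n log n + n^2) = O(n^2): the quadratic phase dominates.
    """
    # Phase 1: sort the input -- O(n log n). (The sort isn't strictly
    # needed by phase 2; it just stands in for an O(n log n) step.)
    ordered = sorted(values)

    # Phase 2: check every pair -- O(n^2).
    best = None
    for i in range(len(ordered)):
        for j in range(i + 1, len(ordered)):
            s = ordered[i] + ordered[j]
            if best is None or abs(s - target) < abs(best - target):
                best = s
    return best
```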

Which of the following is the worst case runtime (number of comparisons) for binary search in a list of $$n$$ elements?

Note: Binary search requires the list already be sorted, so you can assume that to be the case.
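If you'd like to experiment before answering, here is a minimal comparison-counting sketch of binary search. As a simplifying assumption, each loop iteration is counted as one three-way comparison:

```python
def binary_search(items, target):
    """Search a sorted list; return (index or None, comparison count).

    Each iteration halves the remaining search range, so the count
    grows like the base-2 logarithm of the list's length.
    """
    lo, hi = 0, len(items) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1  # one (three-way) comparison per iteration
        if items[mid] == target:
            return mid, comparisons
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, comparisons
```

Try it on lists of length $$n = 2^k$$ with a target that isn't present, and watch how the count changes as $$n$$ doubles.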

As one additional practice problem to ensure you’ve got the technical big O definition down, which of the following is/are true? $\begin{aligned} &\text{A:} \quad 2^{2n+1} = O(2^{n}) \\ &\text{B:} \quad 2^{n+1} = O(2^{n}) \end{aligned}$

At this point, you're encouraged to go back to the previous few quizzes and think about how we can frame the conclusions we came to in terms of big O notation.

For example, what is the worst case number of comparisons in linear search? What about the average number of comparisons?
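One way to check your answers empirically is with a comparison-counting linear search; a minimal sketch:

```python
def linear_search(items, target):
    """Scan left to right; return the number of comparisons made."""
    comparisons = 0
    for value in items:
        comparisons += 1
        if value == target:
            return comparisons
    return comparisons  # target absent: every element was compared
```

The worst case (target absent, or in the last position) makes $$n$$ comparisons; averaging over every possible position of the target gives about $$n/2.$$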

What is the number of comparisons on average for insertion sort?

Hint: A reasonable (though over-simplified) assumption is that, on average, the $$n^\text{th}$$ successive value is compared to half of the values before it.
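Here is a comparison-counting insertion sort you can use to test the hint's estimate. This is a minimal sketch; the counting convention (one count per inner-loop test) is an assumption:

```python
def insertion_sort(values):
    """Return (sorted copy, number of comparisons made).

    Under the hint's assumption -- each new element is compared to
    about half of the elements before it -- the average total is
    roughly 1/2 + 2/2 + ... + (n-1)/2, which is O(n^2).
    """
    values = list(values)  # sort a copy, leave the input unchanged
    comparisons = 0
    for i in range(1, len(values)):
        key = values[i]
        j = i - 1
        # Walk left, shifting larger elements right until key fits.
        while j >= 0:
            comparisons += 1
            if values[j] > key:
                values[j + 1] = values[j]
                j -= 1
            else:
                break
        values[j + 1] = key
    return values, comparisons
```

Averaging the count over many shuffled lists of length $$n$$ should land near $$n^2/4,$$ in line with the hint.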

Comparing the efficiencies of different solutions to the same problem is a key part of computer science, and big O notation gives us a standardized way of doing so. In our course on Computer Science Algorithms, big O notation is explored in significant detail on a wide range of algorithms.

Warning: It’s important not to conflate the best big O complexity with the “right” answer. Right answers only exist in the context of the problem you need to solve, and there are often tradeoffs between simplicity and efficiency. You do not always need the most efficient data structure or algorithm in practice, especially when you are dealing with small data sets. For example, insertion sort is a perfectly fine way to sort a poker hand!

In the next quizzes, we’re going to explore a tool called recursion, and then dive into (and evaluate with big O notation!) some more efficient sorting algorithms.
