# Big O Notation

**Big O notation** is a system used in computer science to analyze how long a program takes to run. Although processing power improves daily, computers are only as good as the efficiency of their algorithms.

Often, an algorithm's runtime depends on the size of the input, the design of the algorithm, and hardware parameters such as CPU speed, RAM, etc. Many of these factors are controlled for, however, so the goal of big O notation is to answer the question, "When an algorithm is fed an input of size \(n\), how long will it take to run, all else being equal?" The runtime is expressed as a function of the input size \(n\), and the value of the function gives a rough measure of the number of operations the algorithm requires. The input size could be the number of elements in an array, or the number of bits in the numbers being multiplied.

In particular, algorithm analysis is focused on how the amount of run time (or other resources like memory or network bandwidth) scales with the size of the input data. An algorithm which has O(n\(^2\)) behavior might run somewhat faster than a much more sophisticated algorithm which has run time O(n log(n)) when both algorithms are presented with an input of 1000 items of data. When, however, they are given 10\(^6\) input items, the former takes 10\(^6\) times longer whilst the latter only takes a few thousand times longer. The difference on the small input might not be noticeable whilst the difference on the larger input might be the difference in running times between a few seconds and an hour.
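The arithmetic in the comparison above can be checked in a few lines. This is a minimal sketch (Python); the helper name is illustrative, and operation counts stand in for actual runtimes:

```python
import math

def growth_ratio(n_small, n_large):
    """Return how many times more operations each algorithm needs
    when the input grows from n_small to n_large items."""
    quadratic = (n_large ** 2) / (n_small ** 2)
    linearithmic = (n_large * math.log2(n_large)) / (n_small * math.log2(n_small))
    return quadratic, linearithmic

quad, lin = growth_ratio(1_000, 1_000_000)
print(f"O(n^2) work grows by a factor of {quad:,.0f}")     # 1,000,000
print(f"O(n log n) work grows by a factor of {lin:,.0f}")  # about 2,000
```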


## Intuitive Meanings

We use big O notation, and its relatives \(\Theta\) and \(\Omega\), to state bounding functions on our runtime. Informally,

- If our algorithm's runtime \(f(n)\) never exceeds a constant multiple of \(g(n)\) for large enough inputs, we say \(f = O(g)\).
- If our algorithm's runtime \(f(n)\) is never below a constant multiple of \(g(n)\) for large enough inputs, we say \(f = \Omega(g)\).
- If our algorithm's runtime \(f(n)\) is bounded both above and below by constant multiples of \(g(n)\), we say \(f = \Theta(g)\).

This is similar to the usage of O notation in calculus, where we use O notation to indicate that terms above a certain power are irrelevant to the essential behavior of a function, as in the linear approximation \(\sin x = x + O(x^3)\) when \(x\) is near zero.

Thus,

- The \(\Theta\) notation shows an asymptotically tight bound.
- The \(O\) notation shows an asymptotic upper bound.
- The \(\Omega\) notation shows an asymptotic lower bound.

It is important to note that we use this notation to describe a set of functions, rather than a function itself. So, it would be more appropriate to say that \( a n^2 + b n + c \in \Theta(n^2) \). However, we say \( a n^2 + b n + c = \Theta(n^2) \) for convenience.

Here are some possible big O categories from fastest to slowest, along with some programs that might fall under each category:

\(O(1)\): This big O category consists of algorithms whose runtime is bounded by some constant \(x\): the algorithm never takes more than \(x\) time to terminate, for all possible inputs. Algorithms that run for the same amount of time on every input fall under this category, since their runtime is bounded above by that constant runtime. But note that algorithms like "swap two numbers if they are unequal," whose runtime depends on whether the two numbers are equal or not, also fall under the \(O(1)\) category.
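A minimal sketch of such a constant-time operation (Python; the helper name is illustrative):

```python
def swap_if_unequal(pair):
    """Swap the two numbers if they differ.  The amount of work varies
    slightly with the input (equal vs. unequal), but it is always
    bounded by a constant, so the operation is O(1)."""
    a, b = pair
    if a != b:
        return (b, a)
    return (a, b)

print(swap_if_unequal((3, 5)))  # (5, 3)
print(swap_if_unequal((4, 4)))  # (4, 4)
```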

\(O(\log n)\): This is a pretty nice big O category. It grows much slower than \(O(n)\). For example, binary search on a sorted list takes \(O(\log n)\) time. Note that the base of the logarithm is not included, since all logarithms differ only by a factor of a constant.
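A sketch of binary search on a sorted list (Python; the function name is illustrative, and the standard library's `bisect` module would normally be used instead):

```python
def binary_search(sorted_list, target):
    """Return the index of target in sorted_list, or -1 if absent.
    Each comparison halves the remaining search range, so at most
    about log2(n) iterations are needed: O(log n)."""
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid
        elif sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
print(binary_search([1, 3, 5, 7, 9, 11], 4))  # -1
```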

\(O(n)\): This big O category usually results when a program iterates over data one or more times. For example, a program to find the maximum value in a list usually falls under this category, as it loops over the list once keeping track of the maximum so far.
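A sketch of the single-pass maximum (Python; assumes a non-empty list, and the function name is illustrative):

```python
def find_max(values):
    """Scan the non-empty list once, keeping the maximum so far.
    One comparison per remaining element, so the runtime grows
    linearly with the input: O(n)."""
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best

print(find_max([3, 1, 4, 1, 5, 9, 2, 6]))  # 9
```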

\(O(n\log n)\): This is included because often-used sorting algorithms, such as mergesort and (on average) quicksort, fall under this category.

\(O(n^2)\): This category typically arises when a program has a loop within a loop, such as when comparing every pair of elements in a list.
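One illustrative instance of a loop within a loop (Python; the duplicate-checking example and its name are assumptions, not from the text):

```python
def has_duplicate(values):
    """Compare every pair of elements with a loop inside a loop:
    roughly n*(n-1)/2 comparisons in the worst case, which is O(n^2)."""
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            if values[i] == values[j]:
                return True
    return False

print(has_duplicate([1, 2, 3, 2]))  # True
print(has_duplicate([1, 2, 3, 4]))  # False
```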

\(O(a^n)\): This is a very slow big O category. Usually, a program in this category examines all combinations of \(n\) values, each of which can take any of \(a\) values.
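As an illustration (Python, using `itertools.product`; the helper name is hypothetical), listing every length-\(n\) string over an alphabet of size \(a\) takes \(O(a^n)\) work simply because there are that many outputs:

```python
from itertools import product

def all_strings(alphabet, n):
    """Generate every length-n string over the given alphabet.
    There are a^n of them, so merely listing them is O(a^n)."""
    return [''.join(s) for s in product(alphabet, repeat=n)]

strings = all_strings('ab', 3)
print(len(strings))  # 2^3 = 8
print(strings[:4])   # ['aaa', 'aab', 'aba', 'abb']
```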

\(O(n!)\): This is one of the worst big O categories. Usually, a program in this category works with all permutations of a list.
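A sketch of visiting every permutation (Python, using `itertools.permutations`; the helper name is illustrative):

```python
from itertools import permutations

def count_permutations(n):
    """Visit every ordering of 1..n; there are n! of them, so any
    algorithm that examines each one does at least O(n!) work."""
    return sum(1 for _ in permutations(range(1, n + 1)))

print(count_permutations(5))  # 120, i.e. 5!
```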

You lost your wedding ring on the beach and have no memory of when it came off. Thus, you decide to do a brute-force grid search with your metal detector: divide the beach into strips, and walk down every strip, scanning the whole beach, until you find it. For simplicity, assume the beach is a square of side length \(l\). Find the \(O\) performance of your ring-finding algorithm.

Because you're walking the whole beach, scanning the area along each strip, the total area you must cover is \(l \times l\). If we divide your path into steps of size \(\Delta l\), then each time you advance \(\Delta l\) along a strip, you sweep a small fixed area, which takes a fixed time \(t_{\Delta l}\). Covering the whole beach takes a number of steps proportional to its area, so the total time is \(\sim \frac{l^2}{\Delta l^2} t_{\Delta l} \propto l^2\), and thus your grid search is \(O(l^2)\). \(_\square\)

Write a program that prints all permutations of the first \(n\) positive integers that are in strictly increasing order.

The program is very easy to write, since there is only one such permutation: \(1, 2, \ldots, n\) itself. However, consider another approach: going through every permutation of the first \(n\) positive integers and checking whether its elements are in increasing order. This runs in \(O(n!)\) time, since there are \(n!\) permutations to examine (the cost of checking each one is dominated by the factorial growth). This is infeasible, as \(n!\) grows much too fast: even \(15!\) is over a trillion. \(_\square\)
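Both approaches can be sketched as follows (Python; function names are illustrative). The direct program outputs \(1, 2, \ldots, n\) in \(O(n)\) time, while the naive one filters all \(n!\) permutations:

```python
from itertools import permutations

def increasing_permutations(n):
    """Direct approach: the only strictly increasing permutation of
    1..n is (1, 2, ..., n), produced in O(n) time."""
    return list(range(1, n + 1))

def increasing_permutations_naive(n):
    """Naive approach: filter all n! permutations -- O(n!) work,
    infeasible beyond small n."""
    return [p for p in permutations(range(1, n + 1))
            if all(a < b for a, b in zip(p, p[1:]))]

print(increasing_permutations(5))        # [1, 2, 3, 4, 5]
print(increasing_permutations_naive(5))  # [(1, 2, 3, 4, 5)]
```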

## Definitions

The figure below makes the O notation usage clear.

We can codify this intuition in a formal way, as reflected in the figure.

- \(f(n) = \Theta(g(n))\) if there exist positive constants \(c_1, c_2, n_0\) such that \( c_1g(n) \leq f(n) \leq c_2g(n)\) for all \(n \geq n_0. \)
- \(f(n) = O(g(n))\) if there exist positive constants \(c, n_0\) such that \(f(n) \leq cg(n)\) for all \(n \geq n_0.\)
- \(f(n) = \Omega(g(n))\) if there exist positive constants \(c, n_0\) such that \(cg(n) \leq f(n)\) for all \(n \geq n_0. \)
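These definitions can be exercised on a standard worked example (not from the text above): to show \(\frac{1}{2}n^2 - 3n = \Theta(n^2)\), we must find positive constants \(c_1, c_2, n_0\) with

\[ c_1 n^2 \leq \tfrac{1}{2}n^2 - 3n \leq c_2 n^2 \quad \text{for all } n \geq n_0. \]

Dividing through by \(n^2\) gives \(c_1 \leq \frac{1}{2} - \frac{3}{n} \leq c_2\), which is satisfied by \(c_2 = \frac{1}{2}\), \(c_1 = \frac{1}{14}\), and \(n_0 = 7\).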

One shortcoming of the \(O\) and \(\Omega\) notations is that they do not distinguish bounds that are asymptotically tight from those that are not. For example, it is true that \(a n^2 = O (n^2) \), but it is also true that \( 7 = O (n^2) \), even though the latter bound is far from tight. So the following "little" notations are used to express bounds that are *not* asymptotically tight:

\[ o(g(n)) = \left \{ f(n) :\text{For any positive constant } c > 0, \text{ there exists a constant } n_0 > 0 \\ \text{ such that } 0 \leq f(n) < cg(n) \text{ for all } n \geq n_0 \right \}. \]

\[ \omega(g(n)) = \left \{ f(n) :\text{For any positive constant } c > 0, \text{ there exists a constant } n_0 > 0 \\ \text{ such that } 0 \leq cg(n)< f(n) \text{ for all } n \geq n_0 \right \}. \]

The difference between the definitions of the little and the big notations is a bit subtle at first. The key point is that for the \(O\) and \(\Omega\) notations, the inequalities only need to hold for *some* constant \(c\); here, they must hold for *every* constant \(c > 0\).

- The \(o\) notation stands for upper bounds that are *not* asymptotically tight.
- The \(\omega\) notation stands for lower bounds that are *not* asymptotically tight.

A program with runtime \(f(n) = e^n + n!\) has dominant term \(n!\), which quickly outpaces any program with \(e^n\) runtime; on the other hand, a program with runtime \(e^{\frac{n^2}{10}}\) becomes slower than \(f\) for \(n \geq n_0 \approx 22\).

Therefore, we can say

- \(f(n) = \omega(e^n)\)
- \(f(n) = \Theta(n!)\)
- \(f(n) = o\big(e^{\frac{n^2}{10}}\big)\).

## Properties

For any two functions \(f(n)\) and \(g(n)\), we have

\[f(n) = \Theta (g(n)) ~\text{ if and only if } ~f(n) = O(g(n)) \text{ and } f(n) = \Omega(g(n)). \ _\square\]

## Transitivity

\[ f(n) = \Theta (g(n)), g(n) = \Theta (h(n)) \implies f(n) = \Theta (h(n)) \]

\[ f(n) = O (g(n)), g(n) = O (h(n)) \implies f(n) = O (h(n)) \]

\[ f(n) = \Omega (g(n)), g(n) = \Omega (h(n)) \implies f(n) = \Omega (h(n)) \]

\[ f(n) = o (g(n)), g(n) = o (h(n)) \implies f(n) = o (h(n)) \]

\[ f(n) = \omega (g(n)), g(n) = \omega (h(n)) \implies f(n) = \omega (h(n))\]

## Reflexivity

\[ f(n) = \Theta (f(n)) \]

\[ f(n) = O (f(n)) \]

\[ f(n) = \Omega (f(n)) \]

Please note that

- \(f(n) \neq o(f(n))\)
- \(f(n) \neq \omega(f(n))\).

## Symmetry

\[ f(n) = \Theta (g(n))\Leftrightarrow g(n) = \Theta (f(n)) \]


## Transpose Symmetry

\[ f(n) = O (g(n))\Leftrightarrow g(n) = \Omega (f(n)) \]

\[ f(n) = o (g(n))\Leftrightarrow g(n) = \omega (f(n)) \]