# Big-O Misconceptions

In computer science and sometimes mathematics, big-O notation is used to talk about how quickly a function grows while disregarding multiplicative and additive constants. When classifying algorithms, big-O notation is useful because it lets us abstract away the differences between real computers as just multiplicative and additive constants.

Big-O is not a difficult concept at all, but it seems to be common even for people who should know better to misunderstand some aspects of it. The following is a list of misconceptions that I have seen in the wild.

But first a definition: We write

$f(n) = O(g(n))$

when

## Misconception 1: “The Equals Sign Means Equality”

The equals sign in

$f(n) = O(g(n))$

is a widespread travestry. If you take it at face value, you can
deduce that since

The expression

The way to interpret the right-hand side is as a *set* of functions:

$ O(f) = \{ g \mid g(n) \le M f(n) \text{ for some \(M > 0\) for large \(n\)}\}. $

With this definition, the world makes sense again: If

## Misconception 2: “Informally, Big-O Means ‘Approximately Equal’"

If an algorithm takes

So informally, big-O means *approximately less than or equal*,
not *approximately equal*.

If someone says “Topological Sort, like other sorting algorithms, is
*technically* correct, but severely
misleading, because Toplogical Sort is also

If someone says “In the worst case, any comparison based sorting
algorithm must make *not* a
correct statement. Translated into English it becomes:

“In the worst case, any comparison based sorting algorithm must make fewer than or equal to

$M n \log (n)$ comparisons”

which is not true: You can easily come up with a comparison based sorting algorithm that makes more comparisons in the worst case.

To be precise about these things we have other types of notation at our disposal. Informally:

$O()$ :Less than or equal, disregarding constants $\Omega()$ :Greater than or equal, disregarding constants $o()$ :Stricly less than, disregarding constants $\Theta()$ :Equal to, disregarding constants

and some more.
The correct statement about lower bounds is this: “In the worst case,
any comparison based sorting algorithm must make

“In the worst case, any comparison based sorting algorithm must make at least

$M n \log (n)$ comparisons”

which is true. And a correct, non-misleading statement about
Topological Sort is that it is

## Misconception 3: “Big-O is a Statement About Time”

Big-O is used for making statements about functions. The functions can measure time or space or cache misses or rabbits on an island or anything or nothing. Big-O notation doesn’t care.

In fact, when used for algorithms, big-O is almost never about time. It is about primitive operations.

When someone says that the time complexity of MergeSort is *time*
complexity of any particular MergeSort might be because that would
depend how much time it takes to make a comparison. In other words,
the *comparisons* as the primitive operation.

The important point here is that when big-O is applied to algorithms,
there is always an underlying model of computation. The claim that the
*time* complexity of MergeSort is

Which is fine as far as it goes. It lets us compare MergeSort to other comparison based sorts, such as QuickSort or ShellSort or BubbleSort, and in many real situations, comparing two sort keys really does take constant time.

However, it doesn’t allow us to compare MergeSort to RadixSort because
RadixSort is not comparison based. It simply doesn’t ever make a
comparison between two keys, so its time complexity in the comparison
model is 0. The statement that RadixSort is

To compare RadixSort to MergeSort, we must first define a shared model
of computation. If we are sorting strings that are

In this model, MergeSort makes

## Misconception 4: “Big-O Is About Worst Case”

Big-O is often used to make statements about functions that measure the worst case behavior of an algorithm, but big-O notation doesn’t imply anything of the sort.

If someone is talking about the randomized QuickSort and says that it
is *expected running
time* is