Big-O Misconceptions
In computer science and sometimes mathematics, big-O notation is used to talk about how quickly a function grows while disregarding multiplicative and additive constants. When classifying algorithms, big-O notation is useful because it lets us abstract away the differences between real computers as just multiplicative and additive constants.
Big-O is not a difficult concept at all, but it seems common, even among people who should know better, to misunderstand some aspects of it. The following is a list of misconceptions that I have seen in the wild.
But first a definition: We write
$f(n) = O(g(n))$
when $f(n) \le M g(n)$ for some constant $M > 0$ and all sufficiently large $n$.
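To make the definition concrete, here is a small Python sketch (my own illustration; the helper name `witnesses_big_o` is made up). It checks the inequality $f(n) \le M g(n)$ over a finite range, which can lend evidence for a particular witness $M$, though of course a finite check can never prove an asymptotic claim:

```python
# Illustration only: a finite check can support, but never prove, the
# asymptotic claim f(n) = O(g(n)). The helper name is made up.

def witnesses_big_o(f, g, M, n0, n_max=10_000):
    """Check f(n) <= M * g(n) for every n in [n0, n_max)."""
    return all(f(n) <= M * g(n) for n in range(n0, n_max))

f = lambda n: 5 * n + 100      # a hypothetical cost function
g = lambda n: n

print(witnesses_big_o(f, g, M=6, n0=100))   # True: 5n + 100 <= 6n once n >= 100
print(witnesses_big_o(f, g, M=5, n0=100))   # False: the additive 100 needs slack
```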
Misconception 1: “The Equals Sign Means Equality”
The equals sign in
$f(n) = O(g(n))$
is a widespread travesty. If you take it at face value, you can deduce that since $5n = O(n)$ and $3n = O(n)$, we have $5n = 3n$ and therefore $5 = 3$.

The expression $f(n) = O(g(n))$ simply isn't an equation, because $O(g(n))$ doesn't denote a single function.
The way to interpret the right-hand side is as a set of functions:
$ O(f) = \{ g \mid g(n) \le M f(n) \text{ for some } M > 0 \text{ and all sufficiently large } n \}. $
With this definition, the world makes sense again: the equals sign in $f(n) = O(g(n))$ is really set membership, $f \in O(g)$, and no absurd conclusions follow from $5n \in O(n)$ and $3n \in O(n)$, because two members of the same set need not be equal.
Misconception 2: “Informally, Big-O Means ‘Approximately Equal’”
If an algorithm takes $5 n^2$ steps, then the function describing its running time is $O(n^2)$, but it is also $O(n^3)$ and $O(2^n)$, because big-O only promises an upper bound.
So informally, big-O means approximately less than or equal, not approximately equal.
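Here is the point as a worked bound (my own addition), using the definition from above:

```latex
% The same function admits many upper bounds, with the same witness M = 5:
\[
  5n^2 \le 5 n^2
  \quad\text{and}\quad
  5n^2 \le 5 n^3
  \quad\text{for all } n \ge 1,
\]
\[
  \text{so}\quad 5n^2 \in O(n^2)
  \quad\text{and}\quad
  5n^2 \in O(n^3).
\]
```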
If someone says “Topological Sort, like other sorting algorithms, is $O(n \log n)$”, then that is technically true, because Topological Sort runs in linear time and anything that is $O(n)$ is automatically $O(n \log n)$. But it is thoroughly misleading, because it suggests the bound is tight.
If someone says “In the worst case, any comparison based sorting algorithm must make $O(n \log n)$ comparisons”, then that translates to:

“In the worst case, any comparison based sorting algorithm must make fewer than or equal to $M n \log (n)$ comparisons, for some $M > 0$”
which is not true: You can easily come up with a comparison based sorting algorithm that makes more comparisons in the worst case.
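For instance (a sketch of my own, not any reference implementation), insertion sort is comparison based, and on reverse-sorted input it makes about $n^2/2$ comparisons, which eventually exceeds $M n \log (n)$ for any fixed $M$:

```python
import math

def insertion_sort_comparisons(a):
    """Insertion sort on a copy of `a`; returns the number of key comparisons."""
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # one comparison between two keys
            if a[j] <= key:
                break
            a[j + 1] = a[j]           # shift larger elements right
            j -= 1
        a[j + 1] = key
    return comparisons

n = 1000
worst = list(range(n, 0, -1))              # reverse-sorted input: the worst case
print(insertion_sort_comparisons(worst))   # n(n-1)/2 = 499500
print(10 * n * math.log2(n))               # M n log n even with M = 10: ~99658
```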
To be precise about these things we have other types of notation at our disposal. Informally:
$O()$: less than or equal, disregarding constants
$\Omega()$: greater than or equal, disregarding constants
$o()$: strictly less than, disregarding constants
$\Theta()$: equal to, disregarding constants
and some more.
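For completeness, here are the formal counterparts of the informal table, phrased to match the set definition of $O(f)$ given earlier:

```latex
% Standard definitions, in the same style as the earlier definition of O(f).
\begin{align*}
  g \in O(f)      &\iff g(n) \le M f(n) \text{ for some } M > 0 \text{ and all large } n \\
  g \in \Omega(f) &\iff g(n) \ge M f(n) \text{ for some } M > 0 \text{ and all large } n \\
  g \in o(f)      &\iff g(n) \le M f(n) \text{ for every } M > 0 \text{ and all large } n \\
  g \in \Theta(f) &\iff g \in O(f) \text{ and } g \in \Omega(f)
\end{align*}
```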
The correct statement about lower bounds is this: “In the worst case, any comparison based sorting algorithm must make $\Omega(n \log n)$ comparisons”, which translates to:

“In the worst case, any comparison based sorting algorithm must make at least $M n \log (n)$ comparisons, for some $M > 0$”
which is true. And a correct, non-misleading statement about Topological Sort is that it is $\Theta(n)$: it is not only $O(n)$ but also $\Omega(n)$.
Misconception 3: “Big-O is a Statement About Time”
Big-O is used for making statements about functions. The functions can measure time or space or cache misses or rabbits on an island or anything or nothing. Big-O notation doesn’t care.
In fact, when used for algorithms, big-O is almost never about time. It is about primitive operations.
When someone says that the time complexity of MergeSort is $O(n \log n)$, they usually mean that the number of comparisons MergeSort makes is $O(n \log n)$. That by itself doesn't pin down the running time, because it says nothing about how long a comparison takes.
The important point here is that when big-O is applied to algorithms,
there is always an underlying model of computation. The claim that the
time complexity of MergeSort is $O(n \log n)$ implicitly references a model of computation where a comparison takes constant time and everything else is free.
Which is fine as far as it goes. It lets us compare MergeSort to other comparison based sorts, such as QuickSort or ShellSort or BubbleSort, and in many real situations, comparing two sort keys really does take constant time.
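To make the model explicit, here is a Python sketch (mine, purely illustrative) of MergeSort instrumented to count the primitive operation, i.e. key comparisons:

```python
import math
import random

def merge_sort(a, counter):
    """Return a sorted copy of `a`; counter[0] accumulates key comparisons."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid], counter)
    right = merge_sort(a[mid:], counter)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        counter[0] += 1               # the primitive operation being counted
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

n = 10_000
counter = [0]
merge_sort([random.random() for _ in range(n)], counter)
print(counter[0])                 # roughly n*log2(n) - n: ~123,000 comparisons
print(round(n * math.log2(n)))    # ~132,877
```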
However, it doesn’t allow us to compare MergeSort to RadixSort because
RadixSort is not comparison based. It simply doesn’t ever make a
comparison between two keys, so its time complexity in the comparison
model is 0. The statement that RadixSort is $O(n)$ belongs to a different model, one where looking at an individual digit or character is the primitive operation.
To compare RadixSort to MergeSort, we must first define a shared model
of computation. If we are sorting strings that are $k$ bytes long, a natural shared model is one where “read one byte” is the primitive operation, taking constant time, with everything else free.
In this model, MergeSort makes $O(n \log n)$ string comparisons, each of which reads up to $k$ bytes, for a total of $O(k \cdot n \log n)$ byte reads; RadixSort makes $k$ passes over the $n$ strings, reading one byte per string per pass, for a total of $O(k \cdot n)$ byte reads.
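As a sanity check on the RadixSort side (again a sketch of my own, with a made-up function name), here is an LSD RadixSort over fixed-length byte strings that counts every byte read; it performs exactly $k \cdot n$ of them:

```python
import os

def radix_sort_bytes(strings, k):
    """LSD radix sort for byte strings of equal length k.
    Returns (sorted list, number of byte reads)."""
    reads = 0
    for pos in range(k - 1, -1, -1):      # least significant byte first
        buckets = [[] for _ in range(256)]
        for s in strings:
            reads += 1                    # the primitive operation: read one byte
            buckets[s[pos]].append(s)
        strings = [s for b in buckets for s in b]   # stable concatenation
    return strings, reads

n, k = 1000, 8
data = [os.urandom(k) for _ in range(n)]
out, reads = radix_sort_bytes(data, k)
assert out == sorted(data)
print(reads)        # k * n = 8000 byte reads, i.e. O(k n) in this model
```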
Misconception 4: “Big-O Is About Worst Case”
Big-O is often used to make statements about functions that measure the worst case behavior of an algorithm, but big-O notation doesn’t imply anything of the sort.
If someone is talking about the randomized QuickSort and says that it is $O(n \log n)$, they presumably mean that its expected number of comparisons is $O(n \log n)$, even though its worst case is $\Theta(n^2)$. That is a statement about a function measuring expected behavior, and it is a perfectly legitimate use of big-O: the notation applies to whatever function you choose to measure, and nothing about it says “worst case”.
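To illustrate which function is being measured (a sketch of my own, with simplified accounting that charges one comparison per element per partition), here is a randomized QuickSort comparison counter; its average over a few runs lands near the expected $2 n \ln n$, far below the $\sim n^2/2$ worst-case scale:

```python
import math
import random

def quicksort_comparisons(a):
    """Randomized quicksort, returning only the number of comparisons,
    counted as one comparison against the pivot per element."""
    if len(a) <= 1:
        return 0
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    greater = [x for x in a if x > pivot]
    return len(a) + quicksort_comparisons(less) + quicksort_comparisons(greater)

n = 10_000
trials = [quicksort_comparisons([random.random() for _ in range(n)])
          for _ in range(5)]
print(sum(trials) // len(trials))       # near 2 n ln n: ~184,000
print(round(2 * n * math.log(n)))       # the expected-case scale
print(n * (n - 1) // 2)                 # the worst-case scale: ~50,000,000
```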