A simple and modest proposal for improving asymptotic notation

(Summary: I propose that we should use inequalities with asymptotic notation like big-O, big-\Omega, little-\omega, little-o, and \mathrm{poly} when they represent only an upper bound or only a lower bound. E.g., f(n) \leq O(n^3) or f(n) > \omega(\log n) or f(n) < 2^{-o(n)}.)

Recall the big-O asymptotic notation. E.g., for two functions f, g : \mathbb{Z}_{> 0} \to \mathbb{R}_{> 0}, we write  f(n) = O(g(n)) to mean that there is some universal constant C > 0 such that f(n) \leq C g(n) for all n. (Of course, there are many many variants of this. I tried here to choose a setting—functions whose domain is the positive integers and range is the positive reals—in which this notoriously ambiguous notation is least ambiguous. But, everything that I will say about big-O notation below has exceptions. Indeed, any attempt to formally define how this notation is actually used in any reasonable amount of generality will quickly fail in the face of our collective notational creativity.) For example,  10n^2 = O(n^3) and also  10n^2+245 = O(n^2), but  10n^2 \neq O(n) and  10n^2 \neq O(n^2/\log n). As a more complicated and mildly interesting example, we have  10n^2 = O(n^2 (2+\sin(n))) but  10n^2 \neq O(n^2 (1+\sin(n))). (Proving the latter fact requires a tiny bit of Diophantine approximation.)
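As a quick numerical sanity check of that last example (illustrative only—a finite computation is of course not a proof), one can plot the ratio f(n)/g(n) for both candidate g's. The ratio 10n^2/(n^2(2+\sin(n))) is bounded by 10 everywhere, while 10n^2/(n^2(1+\sin(n))) spikes whenever n lands near a point where \sin(n) \approx -1:

```python
import math

# Ratio f(n)/g(n) for f(n) = 10n^2 and two candidate functions g.
# If the ratio stays bounded, then f(n) = O(g(n)); if it spikes
# without bound, no constant C can work.
N = 10_000
ratios_bounded = [10 * n**2 / (n**2 * (2 + math.sin(n))) for n in range(1, N + 1)]
ratios_unbounded = [10 * n**2 / (n**2 * (1 + math.sin(n))) for n in range(1, N + 1)]

# 2 + sin(n) >= 1 for all n, so the first ratio never exceeds 10.
print(max(ratios_bounded))
# 1 + sin(n) gets arbitrarily close to 0 over the integers (e.g., n = 11
# sits very close to 3.5*pi), so the second ratio blows up.
print(max(ratios_unbounded))
```

The Diophantine approximation enters in showing that integers really do land arbitrarily close to the points 3\pi/2 + 2\pi k, which is what makes the second ratio unbounded rather than merely large.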

To greatly understate things, this notation is useful. But, a common complaint about it is that it is an abuse of the equals sign. (Deligne points out that the equals sign is actually abused quite frequently.) Indeed, it is a little disconcerting that we write an “equality” of the form  f(n) = O(g(n)) when the left-hand side and the right-hand side don’t even seem to have the same type. (What even is the type of the right-hand side O(g(n))? The left-hand side is clearly a function of a variable n. The right-hand side clearly is not a function of n, since it makes no sense to ask what its value is at, say, n = 1.) And, it is quite worrisome that this use of the equals sign is not at all symmetric, since  f(n) = O(g(n)) certainly does not imply that  g(n) = O(f(n)) (and we would never write O(g(n)) = f(n)).

These problems are of course confusing for people when they first encounter the notation. But, they also lead to many little annoying issues in actual papers, where even brilliant computer scientists and mathematicians will frequently write silly things like “we prove a lower bound of  f(n) = O(n^2) on the running time of any algorithm for this problem.” This is presumably meant to suggest that the lower bound is good in that it is reasonably large (presumably that it grows quadratically with n) but when interpreted literally, this statement suggests that the authors have proven some unspecified lower bound f(n) satisfying  f(n) = O(n^2). For example, a lower bound of f(n) = 1 on the running time of an algorithm is almost never impressive but certainly satisfies this. (In such cases, the authors usually mean to say “we prove a lower bound of  f(n) = \Omega(n^2).”)

Of course, this is really just pedantry, since in almost all such cases it is perfectly clear what the authors meant. But, (1) pedantry is good; (2) saying “what I wrote is technically wrong but the reader should be able to figure it out” is rarely good; and (3) in even slightly more complicated or less clear cases, this can and does lead to confusion or even bugs.

A commonly proposed resolution to this is to interpret O(g(n)) as the set of all functions such that f(n)/g(n) is bounded. Then, rather than writing  f(n) = O(g(n)), one should write f(n) \in O(g(n)). A lot of people like to point out that this is technically more accurate (see Wikipedia), and one occasionally sees such notation used in the wild. However, it is used quite rarely, in spite of the fact that people have been pointing out for decades that this is the “technically correct notation.” I won’t pretend to truly know how other people’s minds work, but I suspect that the reason that this notation has not caught on is simply because it does not match how we think about big-O. I don’t think of  f(n) = O(g(n)) as meaning “f is in the set of functions such that…”, but instead, I think of it as representing an upper bound on f. (Of course, more-or-less any mathematical notation can be defined in terms of sets, and in many formal axiomatizations of mathematics, everything simply is a set. E.g., an overly formal interpretation of the notation x \leq y is really saying that the ordered pair (x,y) is contained in the set of ordered pairs defining the relation \leq. But, we do not parse it that way in most situations. So, the fact that it is technically consistent to view the statement  f(n) = O(g(n)) as placing f in a set is not very interesting. The question is really whether it’s actually convenient to view the notation in this way.)

So, I propose that we use inequalities with big-O notation in this context. For example, I propose that we write f(n) \leq O(g(n)). To me, this simply represents exactly what is meant by the notation. Indeed, it is difficult for me to read  f(n) \leq O(g(n)) and interpret it in any way other than the correct way. Specifically, it represents an asymptotic upper bound on f in terms of g. (Here and throughout, I am using blue to represent the notation that I am proposing and red to represent the more standard notation.)

In fact, I find that this notation is so intuitive that in my papers and courses, readers immediately know what it means without me even having to mention that I have switched from the more conventional  f(n) = O(g(n)) to the variant  f(n) \leq O(g(n)). (This is why I view this as a quite “modest” proposal.) One can therefore simply start using it in one’s work without any fanfare (or any apparent downside whatsoever)—and I suggest that you do so.

More generally, when asymptotic notation is used to represent an upper bound, we should write it as an upper bound. Anyway, that’s the high-level idea. I’ll now go into more detail, explaining how I think we should write more complicated expressions with asymptotic notation and how to use the same notation in the context of big-\Omega, little-o, little-\omega, big-\Theta, and \mathrm{poly}. To be clear, this idea is not novel, and one can already find it in many papers by many excellent authors. It seems that most authors mix inequalities and equalities rather freely in this context. (I asked ChatGPT to find early examples of this notation, and it found a 1953 paper by Tsuji, where he uses this notation in chains of inequalities. More generally, this notation is more often used in the literature in chains of inequalities and in other spots where inequalities might seem more natural than equalities.) I am merely advocating that we as a community try to use it consistently—in papers and in teaching.

Fancier use cases and choosing the direction of the inequality

Of course, big-O notation gets used in fancier ways than simply writing things like  f(n) = O(g(n)). For example, one might wish to say that an event happens with probability that is at least n^{-C} for some constant C, which many will write quite succinctly as  f(n) = n^{-O(1)}. Now the big-O has made its way all the way up to the exponent! And, because of the minus sign, this is actually a lower bound on f(n) using big-O notation, whereas simpler uses of big-O always yield upper bounds.

In this case, I naturally propose writing  f(n) \geq n^{-O(1)}. This again makes it very clear what’s going on—we are giving a lower bound on f(n).

One might have a more complicated expression such as  f(n) = 1+\frac{1}{1-e^{-O(n)}} or something, where it is quite easy to get confused about whether an upper bound or lower bound on f is intended. (Yes, I know that this expression simplifies.) In these cases, using an inequality sign provides some real help to the reader: the expression  f(n) \geq 1+\frac{1}{1-e^{-O(n)}} is (at least for me) much easier to parse correctly, so one does the reader a real service by writing it that way.

There are also examples where the traditional notation is actually formally ambiguous. This often occurs when big-O is used around a lower-order term, such as in the expression  f(n) = 12n^2 + O(n). Here, it is not clear whether one simply means that f(n) \leq 12n^2 + C n for some constant C >0, or if one means that |f(n)-12n^2| \leq C n for some constant C > 0. (A formal reading of the definition of big-O would suggest the former, but I think authors typically mean the latter. Admittedly, some of the ambiguity here comes from the question of how big-O interacts with functions taking negative values, which I am deliberately ignoring.) Here, the notation that I am proposing can help a bit, in that one can write  f(n) \leq 12n^2 + O(n) in the first case while still writing  f(n) = 12 n^2 + O(n) in the second case. (Of course, unless this notation comes into very widespread use, one should not assume that the reader recognizes the distinction. One should often clarify in more than one way what one means in cases where the risk of ambiguity is great. When it doesn’t matter much—e.g., in an aside, or in a case where the weaker fact  f(n) \leq 12 n^2 + O(n) is all that’s needed but the stronger fact that  f(n) = 12 n^2 + O(n) is true—one can simply rely on this notation to suggest the correct interpretation without burdening the reader with much discussion.) In particular, notice that if one truly means that |f(n) - 12n^2| \leq Cn, then the use of the equals sign in the expression  f(n) = 12n^2 + O(n) is actually justified, since we are in fact placing f(n) in an equivalence class in this case (whereas the notation  f(n) = O(n^2) does not place f in any reasonable equivalence class). We will see more examples of this below.
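To make the two readings concrete, here is a small numerical illustration (the particular functions f_1 and f_2 are my own toy examples, not anything from the literature). The function f_1(n) = n satisfies the one-sided reading of  f(n) = 12n^2 + O(n) vacuously, but badly fails the two-sided reading; f_2(n) = 12n^2 - 5n satisfies both:

```python
# Two readings of "f(n) = 12n^2 + O(n)":
#   (1) one-sided:  f(n) <= 12n^2 + C*n for some constant C > 0
#   (2) two-sided:  |f(n) - 12n^2| <= C*n for some constant C > 0
# f1 satisfies (1) but not (2); f2 satisfies both. (Toy examples.)
f1 = lambda n: n
f2 = lambda n: 12 * n**2 - 5 * n

N = 10_000
ns = range(1, N + 1)
# Reading (1): (f(n) - 12n^2)/n must be bounded above.
print(max((f1(n) - 12 * n**2) / n for n in ns))  # bounded above (in fact negative)
print(max((f2(n) - 12 * n**2) / n for n in ns))  # bounded above
# Reading (2): |f(n) - 12n^2|/n must be bounded.
print(max(abs(f1(n) - 12 * n**2) / n for n in ns))  # grows like 12n: unbounded
print(max(abs(f2(n) - 12 * n**2) / n for n in ns))  # stays at 5
```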

All of this means that authors who use this convention must be sure to carefully think about what specifically they mean when they write asymptotic notation. That is a good thing.

Cousins of big-O

Big-O also has its close cousins big-\Omega, little-o and little-\omega, and it is relatively straightforward to see how these notations should be modified.

  • Big-\Omega the old way: One typically writes  f(n) = \Omega(g(n)) if and only if there is some universal constant C > 0 such that f(n) \geq C g(n). Equivalently,  f(n) = \Omega(g(n)) if and only if  g(n) = O(f(n)).
    My recommendation: As you might guess, I recommend instead writing  f(n) \geq \Omega(g(n)), to make clear that the big-\Omega notation is giving a lower bound.
  • Little-o the old way: One typically writes  f(n) = o(g(n)) if and only if \lim_{n \to \infty} f(n)/g(n) = 0.
    My recommendation: I recommend instead writing  f(n) < o(g(n)). This makes clear both that the little-o notation is giving an upper bound, and in my opinion the strict inequality does a good job of representing the distinction between big-O and little-o. In particular,  f(n) < o(g(n)) means that f is “asymptotically strictly smaller” than g, where the fact that this inequality is strict is justified formally by the fact that  f(n) \not < o(f(n)).
  • Little-\omega the old way: One typically writes  f(n) = \omega(g(n)) if and only if \lim_{n \to \infty} f(n)/g(n) = \infty. Equivalently,  f(n) = \omega(g(n)) if and only if  g(n) = o(f(n)).
    My recommendation: As you might guess, I recommend instead writing   f(n) > \omega(g(n)).
  • Big-\Theta the old way: One typically writes  f(n) = \Theta(g(n)) if and only if  f(n) = O(g(n)) AND  g(n) = O(f(n)).
    My recommendation: Keep doing it the old way! When we write  f(n) = \Theta(g(n)), we are in fact proving both an upper and lower bound on f and we are relating f and g according to an equivalence relation. So, in this case, the traditional notation is already great!
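The limit definitions above are easy to see in action numerically. As a quick illustration (my own example pair, and again just a sanity check rather than a proof), take f(n) = n \log n and g(n) = n^2: the ratio f/g shrinks toward 0, witnessing  f(n) < o(g(n)), while the reciprocal g/f grows without bound, witnessing  g(n) > \omega(f(n)):

```python
import math

# For f(n) = n*log(n) and g(n) = n^2: f(n)/g(n) -> 0, so f(n) < o(g(n))
# (and hence also f(n) <= O(g(n))), while g(n)/f(n) -> infinity,
# so g(n) > omega(f(n)).
checkpoints = [10, 1_000, 100_000]
fg = [n * math.log(n) / n**2 for n in checkpoints]    # shrinks toward 0
gf = [n**2 / (n * math.log(n)) for n in checkpoints]  # grows without bound
print(fg)
print(gf)
```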

As before, in slightly more complicated expressions the inequality sign can flip. E.g.,  f(n) > 2^{-o(n)} is a perfectly good lower bound in terms of little-o, just like  f(n) \geq 2^{-O(n)} is a lower bound in terms of big-O and  f(n) < 2^{-\omega(n)} is an upper bound in terms of little-\omega. And, sometimes the specific choice of which (in)equality symbol to use can convey additional meaning that is not clear without it, like in the expression  f(n) < n + o(n), where the inequality is meant to make clear that an upper bound on f(n) is all that is intended. In contrast,  f(n) = n + o(n) suggests that we also mean that  f(n) > n - o(n).

I’ll end by noting that big-O notation has some rather distant cousins for which this notation is particularly important. For example, there is the \mathrm{poly} notation, where one writes  f(n) = \mathrm{poly}(g(n)) to mean… something. In fact, out of context, it is often entirely unclear what this notation means. This might mean that f(n) \leq C g(n)^C for some constant C > 0, or it might mean f(n) \geq C g(n)^C for some constant C > 0, or it might mean that both things are true. (Note that including the constant C in two places is a convenient trick to avoid needing two constants.) Of course, often the context makes this clear. But, one can be unambiguous by instead writing  f(n) \leq \mathrm{poly}(g(n)),  f(n) \geq \mathrm{poly}(g(n)) and  f(n) = \mathrm{poly}(g(n)) respectively. Other examples of distant cousins include \mathrm{polylog}, \mathrm{negl}, \mathrm{subexp}, etc. For example, one traditionally writes f(n) = \mathrm{negl}(n) to mean that  f(n) = n^{-\omega(1)}, i.e., that f is a negligible function. Hopefully it is clear to the reader what I propose: we should write  f(n) < \mathrm{negl}(n).
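Unpacking the negligible-function definition: f(n) < \mathrm{negl}(n) means that for every constant c > 0, eventually f(n) < n^{-c}. A small sanity check (illustrative, with f(n) = 2^{-n} as the standard example of a negligible function; the helper `crossover` is my own):

```python
# f(n) = 2^{-n} is negligible: f(n) < negl(n), i.e., f(n) = n^{-omega(1)}.
# For every fixed c > 0, 2^{-n} eventually drops below n^{-c}; the larger
# c is, the later the crossover.
def crossover(c, limit=10_000):
    """Smallest n >= 2 with 2^{-n} < n^{-c}, searched up to `limit`.

    (n = 1 is skipped since n^{-c} = 1 there for every c.)
    """
    for n in range(2, limit + 1):
        if 2.0 ** (-n) < n ** (-c):
            return n
    return None

for c in [1, 5, 20]:
    print(c, crossover(c))  # crossover point moves right as c grows
```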
