I totally understand what big $O$ notation means. My issue is with what we mean when we say $T(n)=O(f(n))$, where $T(n)$ is the running time of an algorithm on an input of size $n$.

I understand the semantics of it. But $T(n)$ and $O(f(n))$ are two different things.

$T(n)$ is an exact number, but $O(f(n))$ is not a function that spits out a number, so technically we can't say $T(n)$ equals $O(f(n))$. If someone asks you what the value of $O(f(n))$ is, what would your answer be? There is no answer.

$\begingroup$Simply, the = symbol in $T(n)=O(f(n))$ does not mean "equals". The question rests on this assumption. Did you find a source that says so?$\endgroup$
– ShreevatsaR, Dec 11 '18 at 20:33

$\begingroup$This is just one of the many ways mathematical notation is abused :($\endgroup$
– technical_difficulty, Dec 11 '18 at 22:37

$\begingroup$If 64 is a number how can I be 64?$\endgroup$
– TaW, Dec 12 '18 at 12:53

$\begingroup$Wikipedia is always a good place to start looking for an answer - it has a section discussing this exact point.$\endgroup$
– Dukeling, Dec 12 '18 at 14:45

$\begingroup$@mathreadler There is no algorithm. Thinking that big-O is talking about an algorithm is like thinking that decimal is talking about somebody's height. big-O is a notation for talking about the growth rate of mathematical functions; decimal is a notation for numbers. The mathematical function could be, but doesn't have to be, the running time of some algorithm; the number could be, but doesn't have to be, somebody's height.$\endgroup$
– David Richerby, Dec 14 '18 at 1:47

10 Answers

Strictly speaking, $O(f(n))$ is a set of functions. So the value of $O(f(n))$ is simply the set of all functions that grow asymptotically not faster than $f(n)$. The notation $T(n) = O(f(n))$ is just a conventional way to write that $T(n) \in O(f(n))$.
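To make this set view concrete, here is a rough numeric sketch in Python (the name `is_in_big_o`, the sampled range, and the candidate constants are my own illustrative choices; an actual proof must cover *all* sufficiently large $n$, not a finite sample):

```python
# Illustrative sketch only: treat membership in O(f) as "t(n) <= M*f(n)
# for all sampled n >= n0, for some witness constants M and n0".
def is_in_big_o(t, f, constants=(1, 2, 5, 10, 100), cutoffs=(1, 10, 100)):
    """Heuristically test whether t is in O(f) on a finite sample."""
    sample = range(1, 10_000)
    return any(
        all(t(n) <= M * f(n) for n in sample if n >= n0)
        for M in constants
        for n0 in cutoffs
    )

# T(n) = n^2 + n is in O(n^2); by convention we write T(n) = O(n^2).
print(is_in_big_o(lambda n: n * n + n, lambda n: n * n))  # True
# n^3 is not in O(n^2).
print(is_in_big_o(lambda n: n ** 3, lambda n: n * n))     # False
```

The `=` in $T(n)=O(f(n))$ then reads as the membership test above, not as a symmetric equality.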

Note that this also clarifies some caveats of the $O$ notation.
For example, we write that $(1/2) n^2 + n = O(n^2)$, but we never write that $O(n^2)=(1/2)n^2 + n$. To quote Donald Knuth (The Art of Computer Programming, 1.2.11.1):

The most important consideration is the idea of one-way equalities. [...] If $\alpha(n)$ and $\beta(n)$ are formulas that involve the $O$-notation, then the notation $\alpha(n)=\beta(n)$ means that the set of functions denoted by $\alpha(n)$ is contained in the set denoted by $\beta(n)$.

$\begingroup$I don't understand the second paragraph. I agree that when we write $f = \mathcal{O}(f(n))$, we mean $f \in \mathcal{O}(f(n))$. But shouldn't $\mathcal{O}(f(n)) = \mathcal{O}(g(n))$ be interpreted as set equality, since $\mathcal{O}(f(n)) \in \mathcal{O}(g(n))$ does not make sense (it is a type error!). If so, why do you say $\mathcal{O}(n^2) \in \mathcal{O}(n^3)$ but $\mathcal{O}(n^3) \notin \mathcal{O}(n^2)$ in the second paragraph?$\endgroup$
– Alex Vong, Dec 12 '18 at 23:07

$\begingroup$because when both are sets we interpret it as $O(n^2) \subset O(n^3)$$\endgroup$
– RiaD, Dec 12 '18 at 23:18

$\begingroup$I'd use words in places where it's actually needed. In computations you usually need only the one-sided version$\endgroup$
– RiaD, Dec 12 '18 at 23:27

$\begingroup$I have never, ever seen O(n^2)=O(n^3) in any text or other serious source. Can you cite one?$\endgroup$
– Yakk, Dec 13 '18 at 15:29

$\begingroup$If we want to be more strict, $f(n)$ and $n^3$ aren't functions, $f$ and $n \mapsto n^3$ are. But "for some reason" I haven't seen anyone write $n \mapsto n^2 \in \mathcal{O}(n \mapsto n^3)$ in a text that is meant to be read by people.$\endgroup$
– JiK, Dec 14 '18 at 14:41

$O$ is a function
$$\begin{align}
O : (\mathbb{N}\to \mathbb{R}) &\to \mathbf{P}(\mathbb{N}\to \mathbb{R})
\\ f &\mapsto O(f)
\end{align}$$
i.e. it accepts a function $f$ and yields the set of all functions that are asymptotically bounded by (a constant multiple of) $f$. And strictly speaking the correct notation is thus
$$
(n \mapsto T(n)) \in O(n\mapsto f(n))
$$
or short
$$
T \in O(f)
$$
but it's customary in maths, science and CS to just use a variable somewhere in the expression to denote that you're considering functions of the argument $n$ on both sides. So $T(n) \in O(f(n))$ is quite fine as well. $T(n) = O(f(n))$ is, strictly speaking, wrong, as you suspected. It is very commonly used, though, so definitely keep in mind what people mean when they write it.
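This higher-order view can be sketched in Python, representing the (infinite) set $O(f)$ by its membership predicate; the class name `BigO` and the sampled constants are hypothetical stand-ins for "some $M$ and $n_0$":

```python
class BigO:
    """Sketch: O maps a function f to a 'set' of functions, here
    represented by a membership test over a finite sample."""

    def __init__(self, f, M=100, n0=10, horizon=5000):
        self.f, self.M, self.n0, self.horizon = f, M, n0, horizon

    def __contains__(self, g):
        # g in O(f)  <=>  g(n) <= M * f(n) for all (sampled) n >= n0
        return all(g(n) <= self.M * self.f(n)
                   for n in range(self.n0, self.horizon))

T = lambda n: 3 * n + 7
print(T in BigO(lambda n: n))                  # True:  T is in O(n)
print((lambda n: n * n) in BigO(lambda n: n))  # False: n^2 is not in O(n)
```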

$\begingroup$$T(n)=O(f(n))$ is a completely standard use of notation so claiming that it's wrong is unhelpful. (As, IMO, is claiming that $O$ is a function; that's technically true, but it's not really a helpful way to think about it.)$\endgroup$
– David Richerby, Dec 10 '18 at 12:25

$\begingroup$@DavidRicherby some things are completely standard but shouldn't be. $T(n) = O(f(n))$ is one example. Sure it's still good to know what people mean by this (as the OP does already), but how is it not helpful to confirm that this notation is technically speaking bogus? Why would you use it? Even if the $=$ version isn't ambiguous, neither is the $\in$ one, and the more people switch to that notation the better. It's always better to stick to what actually makes sense mathematically, unless it's much more awkward to write. $\in$ is perfectly readable and easy to write.$\endgroup$
– leftaroundabout, Dec 10 '18 at 12:36

$\begingroup$@leftaroundabout You've put your finger on it when you say “unless it's much more awkward to write” — Working with $\in$ is indeed much more awkward to write, except in the special case where there are no $O()$ terms on the LHS and exactly one on the RHS. (See e.g. working with asymptotics like this answer of mine and compare that to the answer there that abandons all the benefits of O() notation and has to rely on an unjustified assumption.) The purpose of notation is to aid thought, and there is much more to gain here by changing the meaning of “=”$\endgroup$
– ShreevatsaR, Dec 12 '18 at 14:24

$\begingroup$@ShreevatsaR not sure where you're going here. I haven't read the linked post very thoroughly, but TBH your post there seems the most convoluted and needs a good bunch of (somewhat obscure, “can't find this one in a book right now”) rules, whereas the other answers readily give the solution from first principles. Anyways what stops you from just replacing the “abused” $=$ signs with $\in$ and $\subset$ as appropriate?$\endgroup$
– leftaroundabout, Dec 12 '18 at 15:38

$\begingroup$@ShreevatsaR I agree that building up a set of rules / theorems is the point of theory. But what's most important is to clearly formalise for each rule when, exactly, it applies. IMO type theory is the best framework for that, but naïve sets are close enough in practice. Naïve “equations” of algebraic expressions involving $O$/$o$/$\Theta$ symbols are not, though. — “They make you think of the formalism of set theory,”–mission accomplished!–”which does not match human thought...” – doesn't it? In practice, talking about is-a–relationships is exactly what you're doing with type theory.$\endgroup$
– leftaroundabout, Dec 13 '18 at 8:44

Formally speaking, $O(f(n))$ is the set of functions $g$ such that $g(n)\leq k\,f(n)$ for some constant $k$ and all large enough $n$. Thus, the most pedantically accurate way of writing it would be $T(n)\in O(f(n))$. However, using $=$ instead of $\in$ is completely standard, and $T(n)=O(f(n))$ just means $T(n)\in O(f(n))$. This is essentially never ambiguous because we almost never manipulate the set $O(f(n))$.

In a sense, using equality makes $O(f(n))$ mean "some function $g$ such that $g(n)\leq k\,f(n)$ for some constant $k$ and all large enough $n$", and this means that you can write things like $f(n) = 3n + O(\log n)$. Note that this is much more precise than, e.g., $f(n)=\Theta(n)$ or $f(n)=O(n+\log n)$.
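As a numeric sketch of this reading (the function `f` and the witness constants below are made up for illustration): $f(n) = 3n + O(\log n)$ says that the residual $f(n) - 3n$ is bounded by $k \log n$ for some constant $k$ and all large enough $n$.

```python
import math

def f(n):
    # hypothetical running time with a logarithmic lower-order term
    return 3 * n + 5 * math.log(n) + 2

# witness constants (illustrative, not unique): k = 10, n0 = 10
k, n0 = 10, 10
residual_bounded = all(abs(f(n) - 3 * n) <= k * math.log(n)
                       for n in range(n0, 100_000))
print(residual_bounded)  # True
```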

$\begingroup$You could also write $f(n) - 3n \in O(\log n)$. Though I admit that it can be handy to conclude a multiple-step computation with $f(n) = \ldots = 3n + O(\log n)$.$\endgroup$
– leftaroundabout, Dec 10 '18 at 12:42

$\begingroup$The rearrangement only works in standalone statements. It's much more common in the middle of calculations, where that kind of thing doesn't work, and where multiple functions get absorbed together into the Landau notation. (Stuff like $f(x) = e^{-x}(e^{2x}+O(x)) = e^x+o(1)$).$\endgroup$
– David Richerby, Dec 10 '18 at 12:48

$\begingroup$I find computations like that jarring; those equals signs aren't bidirectional anymore. I'm not sure there's more of a problem with writing $f(x) \in e^x(e^{2x}+O(x)) \subset e^x + o(1)$. I suppose that's also abuse of notation; basically you're overloading the $=$ operator whereas I prefer to lift $+$ and $\cdot$ to also operate on sets.$\endgroup$
– leftaroundabout, Dec 10 '18 at 12:57

$\begingroup$A possibility would be to use slightly different notations for the set of asymptotic functions, $O(h)$, and for an undetermined element of this set, say $\mathord{\in}O(h)$. So, if $f-g \in O(h)$, one can write $f = g+\mathord{\in}O(h)$ instead of the ambiguous $f = g+O(h)$. You can then write without problem $\mathord{\in}O(h) = f-g$. Other possible notations for an unspecified element of $O(h)$ might be $\dot O(h)$, $\hat O(h)$...$\endgroup$
– Michel Fioc, Dec 11 '18 at 9:16

Prologue: The big $O$ notation is a classic example of the power and ambiguity of some notations as part of the language loved by the human mind. No matter how much confusion it has caused, it remains the notation of choice for conveying ideas that we can easily identify and agree on efficiently.

I totally understand what big $O$ notation means. My issue is with what we mean when we say $T(n)=O(f(n))$, where $T(n)$ is the running time of an algorithm on an input of size $n$.

Sorry, but you do not have an issue if you understand the meaning of big $O$ notation.

I understand the semantics of it. But $T(n)$ and $O(f(n))$ are two different things. $T(n)$ is an exact number, but $O(f(n))$ is not a function that spits out a number, so technically we can't say $T(n)$ equals $O(f(n))$. If someone asks you what the value of $O(f(n))$ is, what would your answer be? There is no answer.

What is important is the semantics. What matters is how people can easily agree on (one of) its precise interpretations, which describe the asymptotic behavior of the time or space complexity we are interested in. The default precise interpretation/definition of $T(n)=O(f(n))$ is, as adapted from Wikipedia:

$T$ is a real or complex valued function and $f$ is a real valued function, both defined on some unbounded subset of the positive real numbers, such that $f(n)$ is strictly positive for all large enough values of $n$. Then, for all sufficiently large values of $n$, the absolute value of $T(n)$ is at most a positive constant multiple of $f(n)$. That is, there exist a positive real number $M$ and a real number $n_0$ such that

$$|T(n)|\leq M\,f(n) \quad \text{for all } n\geq n_{0}.$$

Please note that this interpretation is considered the definition. All other interpretations and understandings, which may help you greatly in various ways, are secondary and corollary. Everyone (well, at least every answerer here) agrees to this interpretation/definition/semantics. As long as you can apply this interpretation, you are probably good most of the time. Relax and be comfortable. You do not want to think too much about it, just as you do not think too much about some of the irregularities of English or French or most natural languages. Just use the notation according to that definition.
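For instance, a concrete witness for the definition above (the numbers here are mine, chosen for illustration): $T(n) = \frac{1}{2}n^2 + n$ satisfies $|T(n)| \le M f(n)$ with $f(n) = n^2$, $M = 1$, and $n_0 = 2$, because $n \le \frac{1}{2}n^2$ once $n \ge 2$. A quick numeric spot-check:

```python
# Spot-check the witness constants M = 1, n0 = 2 on a finite range.
M, n0 = 1, 2
T = lambda n: 0.5 * n * n + n
f = lambda n: n * n
print(all(abs(T(n)) <= M * f(n) for n in range(n0, 10_000)))  # True
```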

$T(n)$ is an exact number, but $O(f(n))$ is not a function that spits out a number, so technically we can't say $T(n)$ equals $O(f(n))$. If someone asks you what the value of $O(f(n))$ is, what would your answer be? There is no answer.

Indeed, there could be no answer, since the question is ill-posed. $T(n)$ does not mean an exact number. It stands for a function whose name is $T$ and whose formal parameter is $n$ (which is, in a sense, bound to the $n$ in $f(n)$). It is just as correct, and even more so, to write $T=O(f)$. If $T$ is the function that maps $n$ to $n^2$ and $f$ is the function that maps $n$ to $n^3$, it is also conventional to write $T(n)=O(n^3)$ or $n^2=O(n^3)$. Please also note that the definition does not say whether $O$ is a function or not. It does not say that the left-hand side is supposed to be equal to the right-hand side at all! You are right to suspect that the equal sign here does not mean equality in its ordinary sense, where you can swap the two sides and the sign is backed by an equivalence relation. (Another even more famous example of abuse of the equal sign is its use for assignment in most programming languages, instead of the more cumbersome := used in some languages.)

If we were only concerned with that one equality (I am starting to abuse language as well: it is not an equality, and yet it could be construed as some kind of equality since there is an equal sign in the notation), $T(n)=O(f(n))$, this answer would be done.

However, the question actually goes on. What does, for example, $f(n)=3n+O(\log n)$ mean? This equation is not covered by the definition above. We need to introduce another convention, the placeholder convention, here stated as in Wikipedia.

In more complicated usage, $O(\cdots)$ can appear in different places in an equation, even several times on each side. For example, $n^{O(1)} = O(e^n)$ is true for $n\to \infty$.

The meaning of such statements is as follows: for any functions which satisfy each $O(\cdots)$ on the left side, there are some functions satisfying each $O(\cdots)$ on the right side, such that substituting all these functions into the equation makes the two sides equal. For example, the equation above means: "For any function $f(n) = O(1)$, there is some function $g(n) = O(e^n)$ such that $n^{f(n)} = g(n)$."

You may want to check here for another example of placeholder convention in action.
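A numeric illustration of the example $n^{O(1)} = O(e^n)$ discussed above (the exponent $c = 3$ is my arbitrary stand-in for some $f(n) = O(1)$):

```python
import math

c = 3  # a bounded exponent, standing in for any f(n) = O(1)
# n^c is dominated by e^n for all sufficiently large n (here from n = 10 on).
print(all(n ** c <= math.e ** n for n in range(10, 200)))  # True
```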

You might have noticed by now that I have not used the set-theoretic explanation of the big $O$-notation, such as "$O(f(n))$ is a set of functions". All I have done is to show that, even without that set-theoretic explanation, we can still understand the big $O$-notation fully and perfectly. If you find the set-theoretic explanation useful, please go ahead and use it anyway.

You can check the section on "asymptotic notation" in CLRS for a more detailed analysis and usage patterns for the family of notations for asymptotic behavior, such as big $\Theta$, big $\Omega$, small $o$, small $\omega$, multivariable usage, and more. The Wikipedia entry is also a pretty good reference.

Lastly, there is some inherent ambiguity/controversy in big $O$ notation with multiple variables. You might want to think twice before using it.

The Big Oh notation [including $O$, $\Omega$ and $\Theta$] provides for a rough notion of equality when comparing functions. It is somewhat jarring to see an expression like $n^2 = O(n^3)$, but its meaning can always be resolved by going back to the definitions in terms of upper and lower bounds. It is perhaps most instructive to read the "=" here as meaning "one of the functions that are". Clearly, $n^2$ is one of the functions that are $O(n^3)$.

Strictly speaking (as noted in David Richerby's comment), $\Theta$ gives you a rough notion of equality, $O$ a rough notion of less-than-or-equal-to, and $\Omega$ a rough notion of greater-than-or-equal-to.

Nonetheless, I agree with Vincenzo's answer: you can simply interpret $O(f(n))$ as a set of functions and the = symbol as a set membership symbol $\in$.

Many other posters have explained that the Big-O can be thought of as denoting a set of functions, and that the notation $n^2 = O(n^3)$ indicates that $n^2$ (as a function of $n$) is in the set denoted by $O(n^3)$ (again considering $n$ as the parameter). In English text, you may prefer to write "$n^2$ is in $O(n^3)$" to avoid confusion.

Although the notation can be confusing, it may help to think of the $O$ and the $=$ as parts of the same notation, that is, to treat $= O$ as if it were one symbol. It is little different from what we do when we write >= in a programming language: two symbols, adjacent, become one in our eyes.

Another tricky aspect of the notation is that the variable acting as the parameter is not explicitly identified, or bound, the way it would be in a function declaration or a lambda notation. This can be especially confusing when there are two variables involved, as in $O(mn)$ or even more so in an expression like $O(n^c)$ since it may be implied that $c$ is a constant. Then again, some algorithms have complexity in a Big-O set that technically varies according to two variables, while in practice one of them is fixed. Still more, there may be multiple reasonable ways of measuring the complexity of one algorithm. For example, if the input is a number, your algorithm might be $O(n)$ in the value $n$ of the number but $O(2^b)$ in the bit size of the number. (Although in complexity theory per se, the bit size is usually the right parameter.)
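The value-versus-bit-size point can be sketched as follows (the divisor-counting routine is my own example, not from the answer above): trying every candidate takes $n$ iterations, which is $O(n)$ in the value $n$ but $O(2^b)$ in the bit size $b$ of the input.

```python
def divisor_count(n):
    """Return (number of divisors of n, loop iterations performed)."""
    steps, count = 0, 0
    for d in range(1, n + 1):  # n iterations: linear in the *value* n
        steps += 1
        if n % d == 0:
            count += 1
    return count, steps

count, steps = divisor_count(12)
b = (12).bit_length()
print(count, steps, b)  # 6 divisors, 12 steps, 4 bits: steps = n, roughly 2^b
```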

All this is to say that the Big-O is an informal notation, hounded by imprecision, and you often have to use other context to understand what an author is saying.

As always, it's best to avoid confusion in your own writing, so I suggest avoiding $= O$ and using instead $\in O$ or the English "…is in…".

Just to underline the point which has been made several times, allow me to quote from N. G. de Bruijn, Asymptotic Methods in Analysis:

The common interpretation of all these formulas can be expressed as follows. Any expression involving the $O$-symbol is to be considered as a class of functions. If the range $0 < x < \infty$ is considered, then $O(1) + O(x^2)$ denotes the class of all functions of the form $f(x) + g(x)$, with $f(x) = O(1)\,\,(0 < x < \infty)$, $g(x) = O(x^2)\,\,(0 < x < \infty)$. And $x^{-1}O(1) = O(1) + O(x^{-2})$ means that the class $x^{-1}O(1)$ is contained in the class $O(1) + O(x^{-2})$. Sometimes, the left-hand-side of the relation is not a class, but a single function [...]. Then the relation means that the function on the left is a member of the class on the right.

It is obvious that the sign $=$ is really the wrong sign for such relations, because it suggests symmetry, and there is no such symmetry. For example, $O(x) = O(x^2)\,\,(x \rightarrow \infty)$ is correct, but $O(x^2) = O(x)\,\,(x \rightarrow \infty)$ is false. Once this warning has been given, there is, however, not much harm in using the sign $=$, and we shall maintain it, for no other reason than that it is customary.

Donald Knuth also pointed out that mathematicians often use the $=$ sign as they use the word "is" in English. "Aristotle is a man, but a man isn’t necessarily Aristotle."

Having said that, Bender and Orszag's notation (from Advanced mathematical methods for scientists and engineers) is a lot less confusing and is worth considering. With respect to some limit, we say:

$$f(x) \sim g(x)\,\,(x \rightarrow x_0)$$

(pronounced "$f$ is asymptotic to $g$") means

$$\lim_{x \rightarrow x_0} \frac{f(x)}{g(x)} = 1$$

and:

$$f(x) \ll g(x)\,\,(x \rightarrow x_0)$$

(pronounced "$f$ is negligible compared to $g$") means

$$\lim_{x \rightarrow x_0} \frac{f(x)}{g(x)} = 0$$
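Both definitions can be spot-checked numerically (my example functions, not the book's), by sampling the ratio $f(x)/g(x)$ along a growing tail:

```python
import math

def ratio_tail(f, g, xs=(1e3, 1e4, 1e5, 1e6)):
    """Sample f(x)/g(x) at large x to estimate the limit."""
    return [f(x) / g(x) for x in xs]

# f ~ g (x -> oo): the ratio tends to 1, e.g. x^2 + x ~ x^2
print(ratio_tail(lambda x: x * x + x, lambda x: x * x))
# f << g (x -> oo): the ratio tends to 0, e.g. log x << x
print(ratio_tail(lambda x: math.log(x), lambda x: x))
```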

But I suppose the benefit of big-oh notation is that the constant factor is arbitrary. (And for little-oh notation, the constant factor is whatever you want.)

I went over this on Stack Overflow; while perhaps the most correct answer to the OP's question has already been stated above (equivalence classes, restated below as #1), here is a complete answer:

sets: "$f= O(\cdot)$" means $f \in O(\cdot)$, i.e. membership in a set, e.g. $\{\frac{1}{2} x^2, (5 x^2-x+5), 2.5x,...\}$, "the set of functions asymptotically bounded by $x^2$". This is the standard mathematical treatment of asymptotic notation that I am aware of. This set has a partial order corresponding to subset-of, e.g. $O(x^2) < O(x^3) = O(x^3 + x)$ (some sets are incomparable, so the order is only partial; see the polynomial hierarchy for an interesting example).

note. "$f= \Theta(\cdot)$" means $f \in \Theta(\cdot)$. However, note that unlike the above, this is an equivalence relation (obviously the naive relation $X$ defined by $f\,X\,g$ iff $f \in O(g)$ is not an equivalence relation, since $f \in O(g)$ does not imply $g \in O(f)$; the trivial equivalence relation "both an element of $O(g)$" is perhaps amusing to conceptualize but mathematically uninteresting, whereas with $\Theta$ the multiple equivalence classes partition the space of functions).

wildcard expression: You can backwards-engineer the definition of $f \in \Theta(g)$: after some radius of uncaring near the origin (i.e. there exists an $x_0$ such that for all $x>x_0$...), there exists a band, bounded by constant multiples of $g$, that contains $f$ (i.e. $LOW \cdot g(x) \le f(x) \le HIGH \cdot g(x)$). So we can merely replace any expression $\Theta(g)$ with the expression $k_1 g(x)+err(x)$, that is, the bound itself plus an error term we don't care about (an error term bounded by $0 \le err(x) \le k_2 g(x)$ for $x>x_0$ and potentially unbounded for $x\le x_0$). For example, if we said $f = 2^{\Theta(x^2)}$, we could equivalently say $f(x)=2^{k_1x^2+err(x)}$, where the error term satisfies $0 \le err(x) \le k_2x^2$. We would never actually write this down, because it's a bit silly, but I believe it is legitimate to think of it that way, and it preserves the notion of equality. (I am eliding proper treatment of negative signs here, which may be important.)

$\begingroup$Part 1 now seems to work. But the handling of the error term in 2 is still wrong. You can't write $x^2$ as $k_1x^3+\mathrm{err}$ for any positive term $\mathrm{err}$. And, if you're going to write, say, $x^2 = 2^x+\mathrm{err}$ then I don't think it's really true to say that you "don't care about" the error term, since the error term would be $2^x-x^2$, which is much, much bigger than $x^2$.$\endgroup$
– David Richerby, Dec 13 '18 at 14:46

A more exact answer would be that when we say a function $f$ is "big O of a function $g$" (e.g., $x^2 + x$ is $O(x^2)$), we are saying that $f(x) \le C\,g(x)$ for some constants $C$ and $k$ whenever $x > k$. This means that $g$ is an upper bound for the behavior of $f$.