Complexity Analysis for Interviews, Part 2

This is the second part of a series. Go back to part 1 if you haven't read it already.

In a fast-paced interview, running time analysis is often a matter of recognizing the common running times of basic operations and combining them, since you won't have time to derive the running time of each subcomponent of your algorithm. For example, say you had an algorithm that first sorted an array and then made a constant number of passes through the sorted array. You should point out to your interviewer that the sorting took $O(n \log n)$, assuming mergesort or heapsort, and that the constant number of passes took $O(n)$, which is dwarfed by $O(n \log n)$ for large $n$. Therefore the whole algorithm is $O(n \log n)$. This is the kind of thing you have to learn to recognize. As we go over common algorithms and algorithms for specific real interview questions, I will point out the running times of the various parts.

Remember that $O(1)$ constant < $O(\log n)$ logarithmic < $O(n)$ linear < $O(n \log n)$ quasilinear < $O(n^2)$ quadratic < $O(n^3)$ cubic < ... < $O(2^n)$ exponential < ... < $O(n!)$ factorial. So if your algorithm consists of operations in multiple Big-Theta complexity classes that run one after another, the largest one dwarfs all the smaller ones for large $n$, and the running time of the whole algorithm is whichever is the largest. If an algorithm has an $O(2^n)$ part, an $O(n^2)$ part, and an $O(n \log n)$ part, then the whole thing is $O(2^n)$. If you find your algorithm doing an $O(1)$ operation $n$ times, then it will be $O(n)$ altogether.

For 80% of interviews, I think it is sufficient to recognize the running times of the common routines you use in your algorithm (e.g., comparison-based sorting, iterating through an array, looking up a key in a hash table, traversing a tree) and to be able to put them together via the above rules. What follows is a treatment of some of the deeper math you can learn to deal with all kinds of running time analysis. It is very much optional. It may even go over the head of your interviewer if they have forgotten their college Algorithms class, but I keep it here for personal edification.
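To make the sort-then-scan example concrete, here is a minimal Python sketch. The function name `smallest_gap` and the problem it solves are illustrative assumptions, not something from a particular interview question. It sorts once and then makes a single pass, so the sort dominates.

```python
def smallest_gap(values):
    """Return the smallest difference between any two elements of values."""
    if len(values) < 2:
        raise ValueError("need at least two values")

    # O(n log n): comparison-based sort.
    ordered = sorted(values)

    # O(n): one pass over adjacent pairs of the sorted list.
    best = ordered[1] - ordered[0]
    for prev, cur in zip(ordered, ordered[1:]):
        best = min(best, cur - prev)

    # Total: O(n log n) + O(n) = O(n log n), since the sort dominates.
    return best
```

Calling `smallest_gap([10, 3, 7, 1])` sorts the input to `[1, 3, 7, 10]` and then scans the gaps 2, 4, and 3.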

What interviews call Big-O is actually Big-Theta notation, the notation for the asymptotically tight bound on the actual growth function of an algorithm. The precise distinction will become important later. Let $f(n)$ be the actual growth function of an algorithm. Formally,

$$\Theta(g(n)) = \{ f(n) : \exists\, c_0 > 0,\ c_1 > 0,\ n_0 > 0 \ \text{s.t.}\ c_0\, g(n) \le f(n) \le c_1\, g(n) \ \ \forall n \ge n_0 \}$$

In practice, Big-$\Theta$ (theta) means the complexity of an algorithm ignoring any constant coefficient and lower-order terms.
The real Big-O notation is only the asymptotic upper bound part of that. There is also the complementary Big-Omega notation, which refers to the asymptotic lower bound. Big-$\Theta$ notation combines the two. Note that since Big-O is an upper bound, really any upper bound, the function $f(n) = n$ is contained in $O(n^2)$, and both it and $f_1(n) = n^2$ are contained in $O(2^n)$. Again, these are the precise definitions of asymptotic notation, but unfortunately interviews abuse Big-O to mean an asymptotically tight bound, i.e. Big-$\Theta$. From this point on, we will also abuse the $O(g(n))$ notation to denote $\Theta(g(n))$, except when we need to make a meaningful distinction.
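As a small worked example of the definition, take the hypothetical growth function $f(n) = 3n^2 + 5n$ (chosen only for illustration) and $g(n) = n^2$. One valid choice of constants, among many, shows that $f(n)$ is in $\Theta(n^2)$:

```latex
% Claim: f(n) = 3n^2 + 5n is in \Theta(n^2).
% Choose c_0 = 3, c_1 = 4, n_0 = 5 (one valid choice among many).
\begin{align*}
c_0\, g(n) = 3n^2 &\le 3n^2 + 5n && \text{for all } n \ge 1, \\
3n^2 + 5n &\le 3n^2 + n \cdot n = 4n^2 = c_1\, g(n) && \text{for all } n \ge 5.
\end{align*}
```

The second line uses $5n \le n \cdot n$, which holds exactly when $n \ge 5$. That is why $n_0 = 5$ works.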

Deriving Recurrences

Nontrivial complexity comes from using loops and recursion. In both cases, you need to derive a recurrence equation to characterize the complexity.
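As a rough illustration of reading a recurrence off code, the sketch below shows one recursive routine and one nested loop. Both functions and their names are assumptions chosen for the example. The comments mark where each term of the recurrence comes from.

```python
def binary_search(items, target, lo=0, hi=None):
    """Return an index of target in the sorted list items, or -1 if absent."""
    if hi is None:
        hi = len(items)
    if lo >= hi:                      # base case: empty range, O(1)
        return -1
    mid = (lo + hi) // 2              # O(1) work per call
    if items[mid] == target:
        return mid
    if items[mid] < target:
        # exactly one recursive call on about half of the range
        return binary_search(items, target, mid + 1, hi)
    return binary_search(items, target, lo, mid)
    # Recurrence: T(n) = T(n/2) + O(1)


def count_equal_pairs(items):
    """Count pairs (j, i) with j < i and items[j] == items[i]."""
    count = 0
    for i in range(len(items)):       # iteration i does O(i) work
        for j in range(i):
            if items[i] == items[j]:
                count += 1
    return count
    # Recurrence: T(n) = T(n-1) + O(n)
```

The first recurrence solves to $O(\log n)$ and the second to $O(n^2)$.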

Solving Recurrences

For mergesort, the recurrence is the following:

$$T(n) = \begin{cases} O(1) & \text{if } n = 1 \\ 2T(n/2) + O(n) & \text{if } n > 1 \end{cases}$$

As with all recursive algorithms, mergesort consists of a base case and an inductive case. The base case handles the leaves of the recursion tree, the simplest steps that don't need any further recursive call to the function. In mergesort, that would be the case when the input is one element. A subarray of one element is of course already sorted, so we have a fairly trivial base case. The running time of the base case is $O(1)$. The inductive case is where we make the two recursive calls to mergesort, each at a cost of $T(n/2)$, and do an $O(n)$ amount of work merging the two sorted subarrays together, so altogether $T(n) = 2T(n/2) + O(n)$.
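For reference, a minimal mergesort sketch in Python with each term of the recurrence marked in the comments. This is one straightforward way to write it (it returns a new list instead of sorting in place), not the only one.

```python
def merge_sort(items):
    """Return a new sorted list containing the elements of items."""
    if len(items) <= 1:               # base case: T(1) = O(1)
        return list(items)

    mid = len(items) // 2
    left = merge_sort(items[:mid])    # first recursive call: T(n/2)
    right = merge_sort(items[mid:])   # second recursive call: T(n/2)

    # Merge step: O(n) work to combine the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])           # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged                     # total: T(n) = 2T(n/2) + O(n)
```

The list slices also cost $O(n)$ per call, which is absorbed into the $O(n)$ term of the recurrence.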

Guess and check (Substitution)
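
As a sketch of how guess and check works on the mergesort recurrence, guess that $T(n) \le c\, n \log n$ and verify the guess by induction. The constant $d$ bounding the merge cost and the base-2 logarithm are assumptions made for this derivation.

```latex
% Assumption: the O(n) merge cost is at most d n for a constant d > 0,
% and logarithms are base 2.
% Guess: T(n) \le c\, n \log n for some constant c > 0.
\begin{align*}
T(n) &\le 2\,T(n/2) + d\,n \\
     &\le 2 \cdot c\,\frac{n}{2} \log\frac{n}{2} + d\,n && \text{(inductive hypothesis)} \\
     &= c\,n \log n - c\,n + d\,n \\
     &\le c\,n \log n && \text{whenever } c \ge d.
\end{align*}
```

Since $\log 1 = 0$, the guess cannot hold at $n = 1$. The induction is anchored at small values such as $n = 2$ and $n = 3$ instead, by picking $c$ large enough to cover them.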

Recursion Tree Method

We went over this in the past session. The idea is to construct a two-column table. One column is the iteration number or recursive call depth, and the other is the amount of work done at that iteration or depth. The work column can be a single column or a tree, depending on the number of recursive calls you make.
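A minimal sketch of such a table for the mergesort recurrence, taking the non-recursive work of a subproblem of size $m$ to be exactly $m$ units and assuming $n$ is a power of two. The function name and row layout are illustrative choices.

```python
def recursion_tree_rows(n):
    """Rows of (depth, subproblems, subproblem size, total work at that depth)
    for T(n) = 2T(n/2) + n, where n is a power of two."""
    if n < 1 or n & (n - 1) != 0:
        raise ValueError("n must be a positive power of two")
    rows = []
    depth, count, size = 0, 1, n
    while size >= 1:
        # count subproblems of this size, each doing `size` units of work
        rows.append((depth, count, size, count * size))
        depth += 1
        count *= 2      # each call spawns two recursive calls
        size //= 2      # each on half of the input
    return rows


for row in recursion_tree_rows(8):
    print(row)
```

Every depth does the same $n$ units of work, and there are $\log_2 n + 1$ depths, which sums to the $O(n \log n)$ bound.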

Master's Theorem

This is where we need to distinguish between Big-O and Big-Theta. For recurrences of the form $T(n) = aT(n/b) + f(n)$ with constants $a \ge 1$ and $b > 1$, the following holds:

1. If $f(n) = O(n^{\log_b a - \epsilon})$ for some constant $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$.
2. If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \log n)$.
3. If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some constant $\epsilon > 0$, and if $a f(n/b) \le c f(n)$ for some constant $c < 1$ and all sufficiently large $n$, then $T(n) = \Theta(f(n))$.
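
For the common special case where $f(n) = \Theta(n^k)$ is a plain polynomial, the three cases reduce to comparing $k$ with $\log_b a$. The sketch below handles only that special case. The function name and return format are assumptions for illustration. For polynomial $f$, the regularity condition in the third case holds automatically, because $a f(n/b) = (a / b^k)\, n^k$ and $a / b^k < 1$ whenever $k > \log_b a$.

```python
import math


def master_theorem(a, b, k):
    """Classify T(n) = a*T(n/b) + Theta(n^k).

    Returns (case number, asymptotic bound as a string).
    Only covers polynomial f(n) = Theta(n^k).
    """
    if a < 1 or b <= 1 or k < 0:
        raise ValueError("expected a >= 1, b > 1, k >= 0")
    critical = math.log(a) / math.log(b)      # log_b(a)
    if math.isclose(k, critical):             # tolerate floating-point error
        return 2, f"Theta(n^{k} log n)"
    if k < critical:
        return 1, f"Theta(n^{critical:.3f})"
    return 3, f"Theta(n^{k})"


print(master_theorem(2, 2, 1))   # mergesort: T(n) = 2T(n/2) + Theta(n)
print(master_theorem(1, 2, 0))   # binary search: T(n) = T(n/2) + Theta(1)
```

Mergesort and binary search both land in the second case, giving $\Theta(n \log n)$ and $\Theta(\log n)$ respectively.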

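The derivation of the theorem is beyond the scope of this post, but the classic Introduction to Algorithms by Cormen et al. has a good treatment. The takeaway is that with this theorem, you can read off the solution to any recurrence of the above form that falls into one of the three cases. Some recurrences of this form fall into the gaps between the cases, and for those you need one of the other methods.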
