Thursday, November 18, 2010

A Google Interviewing Story

A few years ago I was entering the Silicon Valley job market and at that time looking for senior engineering positions. A good rule of thumb about interviewing if you haven't done it in a while is to at least somewhat accept that you'll probably make a few mistakes on your first few tries. Simply, don't go for your dream job first. There are a million nuances to interviewing that you've forgotten, and a few up-front, not-so-important interviews first will educate (or re-educate) you about them.

One of the first places I interviewed was a company called gofish.com. As far as I know - gofish is an utterly different company now than when I interviewed there. I'm almost sure that everyone I met there no longer works there, so the actual company isn't terribly relevant to the story. But the interviewer is. My technical interview there was with a guy named Guy.

Guy wore leather pants. It's a well-known fact that interviewers in leather pants are "extra" scary. And Guy was by no means a letdown. He was also a technical crack-shot. And he was a technical crack-shot in leather pants - seriously, I didn't have a chance.

One question he asked me I'll never forget. In truth, it's a pretty innocuous question - but it's also pretty standard fare for Silicon Valley interviewing questions at that time.

Here it is:

Say you have one string of alphabetic characters, and say you have another, guaranteed smaller string of alphabetic characters. Algorithmically speaking, what's the fastest way to find out if all the characters in the smaller string are in the larger string?

For example, if the two strings were:

String 1: ABCDEFGHLMNOPQRS
String 2: DCGSRQPOM

You'd get true as every character in string2 is in string1. If the two strings were:

String 1: ABCDEFGHLMNOPQRS
String 2: DCGSRQPOZ

you'd get false as Z isn't in the first string.

When he asked the question I literally jumped to my feet. Finally, a question I could answer with some confidence. (Note my answer to him was solely considering the worst cases as there are plenty enough nuances there for an interview question).

The naive way to do this operation would be to iterate over the 2nd string once for each character in the 1st string. That'd be O(n*m) in algorithm parlance where n is the length of string1 and m is the length of string2. Given the strings in our above example, that's 16*8 = 128 operations in the worst case.
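As a minimal sketch of the naive approach (the function name is made up for illustration, and the loops are nested with the smaller string on the outside, which is the same O(n*m) work):

```python
def contains_all_naive(large: str, small: str) -> bool:
    """Return True if every character of `small` appears somewhere in `large`."""
    for wanted in small:
        found = False
        # Rescan the large string from the start for every wanted character.
        for candidate in large:
            if candidate == wanted:
                found = True
                break
        if not found:
            return False
    return True
```

Called on the two example pairs above, this returns True for DCGSRQPOM and False for DCGSRQPOZ.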

A slightly better way would be to sort each string and then do a stepwise iteration of both sorted strings simultaneously. Sorting both strings would be (in the general case) O(m log m) + O(n log n) and the linear scan after that is O(m+n). Again for our strings above, that would be 16*4 + 8*3 = 88 plus a linear scan of both strings at a cost of 16 + 8 = 24. That's 88 + 24 = 112 total operations. Slightly better. (As the strings grow, this method would start to look better and better)
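A rough sketch of the sort-then-scan idea, under the assumption that repeated characters in the smaller string may all match the same character in the larger one (set semantics). The function name is hypothetical:

```python
def contains_all_sorted(large: str, small: str) -> bool:
    """Sort both strings, then walk them in step with two cursors."""
    a = sorted(large)   # O(n log n)
    b = sorted(small)   # O(m log m)
    i = 0
    for wanted in b:
        # Skip over characters in the large string that sort before `wanted`.
        while i < len(a) and a[i] < wanted:
            i += 1
        if i == len(a) or a[i] != wanted:
            return False
        # `i` is deliberately not advanced past the match, so a repeated
        # character in `small` can match the same character in `large`.
    return True
```

Because the cursor `i` never moves backwards, the scan after sorting touches each character at most once.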

Finally, I told him the best method would simply be O(n+m). That is, iterate through the first string and put each character in a hashtable (cost of O(n) or 16). Then iterate the 2nd string and query the hashtable for each character you find. If it's not found, you don't have a match. That would cost 8 operations - so both operations together is a total of 24 operations. Not bad and way better than the other solutions.
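In Python the hashtable version is only a couple of lines. This is a sketch that uses the built-in `set` as the hashtable; the function name is an assumption for illustration:

```python
def contains_all_hashed(large: str, small: str) -> bool:
    """One pass to load the hashtable, one pass to query it."""
    seen = set(large)                       # O(n): hash every character once
    return all(ch in seen for ch in small)  # O(m): one lookup per character
```

The O(n+m) figure relies on each lookup being constant time on average, which is the usual assumption for a hash set.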

Guy wasn't impressed. He showed it by rustling his leather pants a bit. "Can you do better?" he asked.

What the heck? What did this guy want? I looked at the whiteboard and turned back to him. "No, O(n+m) is the best you have - I mean, you can't do this without looking at each character at least once - and this solution is looking at each character precisely once". The more I thought about it, the more I knew I was right.

He stepped up to the whiteboard, "What if - given that we have a limited range of possible characters - I assigned each character of the alphabet to a prime number starting with 2 and going up from there. So A would be 2, and B would be 3, and C would be 5, etc. And then I went through the first string and 'multiplied' each character's prime number together. You'd end up with some big number right? And then - what if I iterated through the 2nd string and 'divided' by every character in there. If any division gave a remainder - you knew you didn't have a match. If there were no remainders through the whole process, you knew you had a subset. Would that work?"

Every once in a while - someone thinks so fantastically far out of your box you really need a minute to catch up. And now that he was standing, his leather pants weren't helping with this.
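Here is one way Guy's whiteboard idea might look in code. It is a sketch, not his implementation: it assumes uppercase A-Z only, the names `PRIME_FOR` and `contains_all_primes` are invented, and it reads "divided" literally, so each successful division removes one copy of the letter from the product.

```python
import string

# The first 26 primes, one per uppercase letter: A=2, B=3, C=5, ... Z=101.
FIRST_26_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
                   43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
PRIME_FOR = dict(zip(string.ascii_uppercase, FIRST_26_PRIMES))

def contains_all_primes(large: str, small: str) -> bool:
    """Multiply primes for `large`, then divide out the primes for `small`."""
    product = 1
    for ch in large:
        product *= PRIME_FOR[ch]   # Python ints grow as needed, so no overflow
    for ch in small:
        quotient, remainder = divmod(product, PRIME_FOR[ch])
        if remainder != 0:
            return False           # this letter is not (or no longer) available
        product = quotient         # consume one copy of the letter
    return True
```

Because each division consumes a factor, this variant also respects repeated letters: "AA" is not contained in "AB". Checking only the remainder without dividing would give plain set membership instead.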

Now mind you - Guy's solution (and of course, needless to say I doubt Guy was the first to ever think of this) was algorithmically speaking no better than mine. Even practically, you'd still probably use mine as it was more general and didn't make you deal with messy big integers. But on the "clever scale", Guy's was way, way, (way) more fun.

I didn't get the job. Or I think they offered me some trial position or something that I refused, but it didn't matter. I was on to bigger and better things.

Next, I interviewed at become.com. After a phone interview with the CTO he sent me a "programming assignment". It was a bit over the top - but in retrospect, worth the 3 days it took me to complete. I got an interview and a job offer - but the biggest value was what the programming assignment forced me to go out and learn. I had to build a web-crawler, a spellchecker/fixer, and a few other things. Good stuff. In the end however, I turned down the offer.

Finally, I had an interview at Google. I've written before that the Google interviewing process does tend to live up to the hype. It's long, it's rigorous and in all honesty, pretty darn fair. They do as best they can to learn about you and your abilities in an interview setting. By no means is that an exact science, but I'm convinced they give it a good try.

My 4th technical interview at Google was with a woman engineer who honestly seemed a bit bored of interviewing. I had done well in all my previous interviews there and was feeling pretty good about my chances. I was confident that if I did nothing ridiculously silly - I'd get the job.

She asked me a few softball questions about sorting or design, I'm not sure. But towards the end of our 45 minutes she told me "I have one more question. Let's say you have a string of alphabetic characters of some length. And you have another, shorter string of characters. How would you go about finding if all the characters in the smaller string are in the larger string?"

Whoa. Deja-Guy.

Now, I could have probably stopped the interview right there. I could have said "Ahee! I just got this question a few weeks ago!" which was true. But when I was asked it a few weeks previous - I did get it right. It truly was a question I knew the answer to. Almost as if Guy had been one of my study partners for this very interview. And heck, people study interview questions on the internet all the time - by me nonchalantly answering the question I wouldn't be "lying" in any way. I did know the answer on my own!

Now you might think, that in the instant after her asking, and before the moment of time that I began speaking that the entire last paragraph sequenced through my thought process rationalizing that I was, indeed, morally in the right to calmly answer the question and take credit for the answer. But sadly, that wasn't the case. Metaphorically, it was more like she asked the question and my brain immediately raised its hand and started shouting "Me! ooh! ooh! ooh me! I know! ask me!" My brain kept trying to wrestle mouth-control away from me (which happens plenty) but only by stalwart resolve was I able to retain composure.

So I answered. Calmly. With almost unearthly grace and poise. And with a purposeful demeanor - with, I think, a confidence that only someone with complete and encyclopedic knowledge of this timeless and subtle problem would hold.

I breezed over the naive solution as if it were unworthy. I mentioned the sorting solution as if it were wearing a red-shirt on an early episode of Star Trek. And finally, nonchalantly, almost as if I had invented all things good and algorithmically efficient, mentioned the O(n+m) linear solution.

Now mind you - despite my apparent poise - the entire time I was fighting my brain who, internally, was screaming at me -- "TELL HER THE PRIME NUMBER SOLUTION YOU DIMWIT !"

I ignored his pitiful pleas.

As I finished the linear solution explanation, her head dutifully sank with complete non-surprise and she started writing in her notes. She had probably asked that question a hundred times before and I'd guess most people got it right. She probably wrote "yep. boring interview. got boring string question right. no surprise. boring guy but probable hire"

I waited a moment. I let the suspense build as long as possible. I am truly convinced that even a moment longer would have resulted in my brain throwing itself fully into an embolism resulting in me blurting out unintelligible mis-facts about prime numbers.

I broke the calm. "You know, there is another, somewhat cleverer solution."

She lethargically looked up with only a glimmer of hope.

"Given that our range of characters is limited, we could assign each character to a prime number starting at 2. After that we could 'multiply' each character of the large string and then 'divide' by each character of the small string. If the division operation left no remainder, we'd know we have a subset."

I'm guessing that at this point, she looked pretty much as I did when Guy had said the same thing to me. General loss of composure, one pupil was dilated, slight spitting while talking.

After a moment, she blurted "But.. wait that wouldn'... yes it would! But how.. what if.. wow. wow. that works! Neat!"

I sniffed triumphantly. I wrote down "She gave me a 'Neat!'" in my interviewing notes. I'm pretty sure I was getting the job before that question, but it was pretty clear that I was in for sure now. What's more, I'm pretty confident that I (or more precisely, Guy) had just made her day.

I spent 3 years working at Google and had a great time. I quit in 2008 to CTO a new startup and have since started another of my own. About a year ago I randomly met Guy at a start-up party. He had no idea who I was, but when I recounted this story he nearly peed his leather pants laughing.

Again, if there is a moral here - it's to never chase your dream job before you chase a few you're willing to fail at. Apart from the interviewing experience you'll gain, you never know who might just get you ready for that big interview. In fact, that rule just might work for a lot of things in life.

And seriously, if you get the chance and you're looking to hire a crackshot engineer - you could do far worse than hiring Guy. That dude knows things.

(a bit of nitpicky technical detail for the fusty: characters may repeat, so strings can be very long and counts must be kept. The naive solution can do that by removing a character from the large string when it finds it, but it remains O(n*m). The hashtable solution can keep a count as the value of the key->value. Guy's solution still works just fine)
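For the counting variant of the hashtable solution, a small sketch using `collections.Counter` as the key->count table (the function name is an assumption for illustration):

```python
from collections import Counter

def contains_all_counted(large: str, small: str) -> bool:
    """True if `large` has at least as many of each character as `small` needs."""
    available = Counter(large)   # character -> number of occurrences
    needed = Counter(small)
    # A Counter returns 0 for characters it has never seen.
    return all(available[ch] >= count for ch, count in needed.items())
```

With counts kept, "AA" is no longer considered contained in "AB", while the original examples give the same answers as before.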

Edit: 11/30/10 - Guy from the story has found this post and gave some clarification in the comments. Worth the read.

50 comments:

Anonymous said...

Guy's solution is strictly worse than having two arrays of size 26 with counts of how many 'A's, 'B's, ..., 'Z's are there in each string. Those can clearly be filled in linear time, and a single pass through them in constant time (assuming a constant alphabet size) allows us to determine if one string is a subset of the other.

The reason Guy's solution is strictly worse is because you are going to need big integers to represent the constructed product of primes once the strings get large enough (you already need it for a string that has 10 'Z' characters, for example), and big integer multiplication and division can't really be assumed to take constant time: their complexity is proportional to the number of digits of the number; all you can do is pick a large enough base to make the number of digits a bit smaller.

Division is relatively expensive, and you'd be doing O(m) of them. I think it would be better to assign each of the 26 characters to the lower 26 bits of an integer, then OR the bitmasks together as you traverse the first string. As you process each character of the second string, AND its bitmask with the result from the first string to see if it's there. ANDs and ORs are much quicker, and this method will take constant space. (Multiplying the primes in the first string risks integer overflow for sufficiently lengthy strings.)
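The OR/AND approach this comment describes could be sketched as follows, assuming uppercase A-Z input only (the helper names are invented for illustration):

```python
def bit_for(ch: str) -> int:
    """Map 'A'..'Z' to a single bit in the low 26 bits of an integer."""
    return 1 << (ord(ch) - ord("A"))

def contains_all_bitmask(large: str, small: str) -> bool:
    mask = 0
    for ch in large:
        mask |= bit_for(ch)          # OR in the bit for each character seen
    for ch in small:
        if mask & bit_for(ch) == 0:  # AND tests whether that bit was set
            return False
    return True
```

Note that a single bit per letter records presence only, so this treats the strings as sets and cannot tell "AA" from "A".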

Interesting solution using algebraic properties of prime numbers. Don't you think, however, that it's a little impractical?

1) you have to precompute the prime numbers for all characters or add the n-th prime computation complexity to your problem's complexity

2) assuming only uppercase letters you'd need at least 26 prime numbers which means value 101 for Z. If my quick estimate is correct, you could overflow the 64bit integer with 10 character string. Sure, you could use something as Java's BigInteger but then the math is more complex and adds to the overall complexity of the solution

3) code would be hard to understand by anyone other than the author (yes, I've heard of the invention of commenting code, but this still is a valid point).

Don't get me wrong, I liked the idea behind Guy's solution, but even now (knowing his solution) I wouldn't use it in my code, would you?

Previous commenters seem to be missing the point- any good dev can come up with the right answer. A great dev can think of alternate solutions outside the box. It isn't about finding the right or best answer. It's about creativity.

Instead of using multiplication, you could use a bit-shift and OR. Then instead of division, you do an AND.

Assuming the number has enough bits for the limited character set, it will work (and it's faster than division). Also, you can fit 32 chars optimally into a 32-bit integer, while 32 primes will require a larger integer.

You should note that Guy's solution becomes very slow and problematic if you use Unicode, where the number of characters is so large that division becomes non-trivial, and generating the prime number array is a task in itself.

If it really is a string of alphabetic characters, then assign each letter of the alphabet a bit and build masks from the long string and the short string.

Then just bitwise compare the masks to see if any characters overlap.

It's still O(m+n), but each operation will be much faster than loading or reading from a hash.

It's also extremely space efficient. Each string could be stored as a single 32-bit integer, so the algorithm only requires 8 additional bytes of storage. (4 if you don't need to keep the encoding of the short string.)

Once the initial encoding of a string is done, any comparisons with previously encoded strings will be done in constant time.
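A sketch of the encode-once, compare-many idea from this comment, again assuming uppercase A-Z and set semantics. The names `encode` and `is_subset` are made up for illustration:

```python
def encode(text: str) -> int:
    """Fold a string of 'A'..'Z' into a 26-bit presence mask. O(len(text))."""
    mask = 0
    for ch in text:
        mask |= 1 << (ord(ch) - ord("A"))
    return mask

def is_subset(small_mask: int, large_mask: int) -> bool:
    """Constant time: no bit may be set in `small_mask` but clear in `large_mask`."""
    return (small_mask & ~large_mask) == 0
```

For example, `encode("ABCDEFGHLMNOPQRS")` can be computed once and then compared against the mask of any number of candidate strings with a single AND each.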

In addition to the problems mentioned earlier, how are you going to store the mapping between characters and prime numbers? Assuming that there can be an arbitrary number of characters, you need to store which character 2 maps to, which one 3 maps to, which one 5 maps to, etc.

Then, when you do the division, you may need to look up each character's mapping once. If you don't store the mappings in a hashtable, the best you're going to get is logarithmic time for the lookups, which is no better than the sorting solution.

If you do store them in a hashtable, then the solution is basically exactly the same as the pure hashtable solution, except that it will be slower.

In short, the prime solution is worse than the other solutions in terms of run time and it involves the same data structures being used in pretty much the same ways, so I would say it is neither clever nor fast.

Also, your runtime analysis assumes that hashtable lookups are constant time. In practice, they will tend to be constant-ish, but technically, they can take linear time in the worst case. This means that the best solution in terms of worst case performance is the sorting one.

I actually used a very similar trick when I was 14 or 15 to store names => phone number relationships in TI-BASIC. The TI-82 lacked proper string handling so I decided that A would be 2, B would be 3, etc... Typing SARAH on input would yield the number S*A*R*A*H. I stored two separate arrays (one with the product of the primes and another with the phone number) as a crude hashmap.

When I thought of this I felt a bit like Guy. It worked since most names weren't an anagram of another name.

The search was linear but hey, I didn't have that many friends to register anyway :)

The product of the primes is effectively an array with a zero or 1 at each location. Multiplying by a prime is just a way to set the corresponding array location to a 1. Testing the product for divisibility by a prime is just a way to read the contents of the corresponding array location. The hash table in this case reduces to a simple array, so the two solutions are identical up to implementation details.

Thanks for this post. I had to learn this the hard way. My first interview in over 8 years was at what I consider my dream job. I felt like somebody smacked me with a wet trout 3/4 of the way through the second round of interviews. The questions headed in a direction that I never imagined. Needless to say, I did not get the position. But I did learn that I need to prepare myself a bit with some interviews I don't care as much about before moving on to the dream interviews.

I'm surprised no one has mentioned the obvious algorithm for solving this problem.

We use Counting-Sort(A,B,k), i.e. A is the input to be sorted, B is the sorted output, and k is the number of different elements that are allowed to occur in A. Counting-Sort is Θ(n+k). In our case k is the constant 26, so Counting-Sort runs in Θ(n) time.

Gödel numbers. See also incompleteness theorem. One of the cooler things that are part of any basic computer science curriculum.

http://en.wikipedia.org/wiki/Gödel_numbering

Now, we know that there are 25 prime numbers less than 100, so number 26 must be... 101. This means any time you're multiplying by Z, you're adding two decimal digits to the length of your decimal number.

And 2^10 is approximately 10^3. So if you ran into Z three times in the string you'd have added 20 bits to the length of the integer ((10^2)^3 = 10^6 = (10^3)^2 ~ (2^10)^2 = 2^20)...

...once you play around with Gödel numbers you realise that they get super ridiculously large very quickly.

Simply multiplying all of the alphabet once would give 2.32862e+38 (according to my spreadsheet), a number 16 Bytes long (128 bits). We could say that on average each new letter adds almost 5 bits to the length of the number. On average, a string of 7 characters will make your naive implementation of the algorithm cack itself (!!!), even if you were cunning and used a long instead of an int, it would still cack itself at 14 characters (on average).
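The size estimates in this comment are easy to check with Python's arbitrary-precision integers. A small sketch (the function names are hypothetical, and `math.prod` needs Python 3.8 or later):

```python
import math

# The first 26 primes, one per letter A..Z.
FIRST_26_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
                   43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]

def alphabet_product() -> int:
    """Product of the primes for A..Z, i.e. the encoding of the whole alphabet."""
    return math.prod(FIRST_26_PRIMES)

def average_bits_per_letter() -> float:
    """Bits in the full-alphabet product, spread evenly over 26 letters."""
    return alphabet_product().bit_length() / 26

def zs_needed_to_overflow(bits: int) -> int:
    """How many 'Z' characters (prime 101) it takes to exceed an unsigned `bits`-bit integer."""
    count, product = 0, 1
    while product < 2 ** bits:
        product *= 101
        count += 1
    return count
```

`alphabet_product().bit_length()` comes out at 128, matching the 16-byte figure above, and `zs_needed_to_overflow(64)` gives 10, which agrees with the earlier remark about a string of ten 'Z' characters.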

Anyway, I like the hashing solution better, and the idea about using two 32 bit ints with each bit as a flag for a character, and then just OR them (no division, no hash collisions).

In any case, it is instructive to see what happens when you expand the range slightly, let's say you treat upper case and lower case separately. The Gödel numbering solution gets enormously worse, for the hashing algorithm it isn't a problem, and depending on how you did your bit flagging ops the int flag just expands to a long flag.

Start looking at a wider range, like ASCII, or the values on your keyboard, or latin + greek, or, heaven forbid, some large subset of Unicode, and suddenly the hashmap is really the only practical way to do this....

...except that for relatively short strings, you might actually fall back on the 'naive' (sorting) solution as being simpler, easier to implement and much easier to maintain and/or localise. Depending on the overhead imposed by the hashmap, it might even be faster.

"what's the fastest way to find out if all the characters in the smaller string are in the larger string?"

Any solution for this that always scans the entire larger string is incorrect.

Consider the case where the smaller string is the single letter "A" and the larger string is a long string that starts with "A." Your algorithm can return true after reading two characters.

Guy's solution is just silly. Its only use seems to be in interviews, where you can ask the potential hire to explain why it's inappropriate. What if the small string is "A" and the large string is "A" followed by a trillion "Z"'s? A trillion multiplications and divisions, and several trillion bits of storage, right?

Try this: If these are really alphabetic uppercase characters, and the smaller string is a set (repeats ignored), it fits in a 26-bit bitmask, which fits in a single CPU register. Load it up with 1's for every character in the smaller string, and scan the larger string masking off ~(1 << (ord(x)-65)) for every letter found. Return true as soon as the register is zero, false if it's nonzero after the larger string is scanned.
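The early-exit scan described here might be sketched like this, assuming uppercase A-Z and ignoring repeats in the smaller string as the comment does (the function name is invented; 65 is `ord("A")`):

```python
def contains_all_early_exit(large: str, small: str) -> bool:
    """Clear one bit per letter found; stop as soon as nothing is left to find."""
    pending = 0
    for ch in small:
        pending |= 1 << (ord(ch) - 65)      # set a bit for every letter we need
    for ch in large:
        pending &= ~(1 << (ord(ch) - 65))   # mask off the letter just seen
        if pending == 0:
            return True                     # everything found, stop scanning
    return pending == 0
```

With the smaller string "A" and a larger string that starts with "A", the loop returns after looking at a single character of the larger string, however long it is.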

Jamie McCarthy, you are correct, of course, about not needing to scan the whole first string all the time. However, there will always be a worst case in which you do have to touch every character in the first string. In academic computer science algorithm study, it is often desired to calculate the worst case performance, even if this is not a situation likely to be encountered in the real world. I think that is the desired assumption for this problem.

I keep seeing comments on how the prime number solution is not practical. I found it the most practical and efficient solution to a particular problem that I had about 15 years ago. I would be interested if someone can supply a more efficient solution. In short, the problem is to write a python script that given a file of common English words (eg. /usr/dict/words) will find all words that can be made from a 9-letter word (eg. 'chocolate'). I have written a short blog about my results.
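The commenter's script is not shown, so the following is only a guess at the shape of such a solution: give every word a prime-product signature, then a word can be made from the 9-letter word exactly when its signature divides the longer word's signature. All names here are hypothetical and lowercase a-z input is assumed.

```python
import string

FIRST_26_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
                   43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
PRIME_FOR = dict(zip(string.ascii_lowercase, FIRST_26_PRIMES))

def signature(word: str) -> int:
    """Product of the primes of the word's letters; anagrams share a signature."""
    product = 1
    for ch in word:
        product *= PRIME_FOR[ch]
    return product

def words_from(rack: str, dictionary: list[str]) -> list[str]:
    """Words whose letters (with repeats) can all be drawn from `rack`."""
    rack_sig = signature(rack)
    return [w for w in dictionary if rack_sig % signature(w) == 0]
```

For instance, `words_from("chocolate", ["cat", "cool", "dog", "ooo"])` keeps "cat" and "cool" but drops "dog" (no d or g) and "ooo" (only two o's available). In practice the dictionary signatures could be computed once and reused for every rack.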

Paul,

I'm the Guy in the leather pants. :)

Thanks - that's the nicest compliment I've been paid in a long time. I'm also impressed that you gave me due credit for my prime solution. Respect! Your account is pretty much on the money.

Let me clarify a few things for your readers:

- I would never expect anyone to come up with this solution in an interview situation (and no-one ever has) - the purpose was to see whether the candidate could understand the solution and tell me something about its pros and cons.

Until this blog post, they were extremely unlikely to have encountered this solution prior.

- My version of the question starts out by restricting the input to seven letters (for Scrabble) and it was assumed you'd discard any word longer than the rack or shorter than the current best solution so a 32-bit word suffices with some finesses hinted below.

What I love about this interview question is how such a seemingly innocuous problem can have such a variety of obvious and not so obvious solutions. (Candidates who think a bitmask alone would suffice, don't understand the distinction between a bag and a set.)

I've also noticed that after I showed the candidate my solution, they were invariably a lot more interested in working with me - sometimes the interview process is as much about impressing the candidate as it is about the candidate impressing the interviewer.

- Can you suggest a way to combine the prime representation with a bitmask to improve performance?

- What changes would you need to make if we allow the rack of letters to be of arbitrary size?

- If you had to perform this test for many different racks of letters, how would you change the representation of the dictionary?

- How practical is it to cache the answers for racks of letters that you've seen already?

- Could you precompute this cache?

Thanks again for the mention. And I'm glad my solution helped you make a big impression on Google. I never regretted hiring any of the candidates who could grasp this solution. I'm just sad we couldn't make you a compelling offer. GoFish went down the toilet so it definitely worked out best for Paul. LOL

Cheers,
Guy Argo

p.s. I retired the leather pants after the motorcycle accident, alas. But they did save my skin...

I really really enjoyed this post. Very well written and the insights were amazing. Guy's solution really blew me away at first - very unique. I featured this interview question in my side project recently and while looking up different solutions did not come across Guy's solution once. It definitely is out of the box and interesting.

A little on my side project: I recently launched a service that emails you a new technical interview question every other day. I think you and your readers might be interested in it. I thought I'd share it with you and would love to get your expert feedback! Here's the site with more info: InterTechTion - Technical Interview Questions

Once again thanks for the great article and wish you success on your latest startup!

While some of you focus on the right or wrong of any of the answers posted in this article, my biggest take-away is that if you want to do one big thing and want to do it as right as possible, try to do some small things that you can afford to lose beforehand. As Paul said, this is true for interviews but may be true for many other things.

And Guy's comment made things very clear. His purpose in providing his solution is 1st, to see whether the interviewees can understand it and tell pros and cons and 2nd, to impress the interviewees. To all of us the 1st purpose is well served and to many of us the 2nd purpose is achieved to some degree. That makes his tactic logically sound. Full stop.

I hope Guy's question was an attempt to see whether you'd speak up about a really stupid idea. His approach fails both from a computational complexity point of view and from a speed point of view. If you assume the kind of arithmetic he assumes, you might just use bitsets, and you can even express those using integer arithmetic.

And as the developers spend an unknown amount of time finding the best solution... The competition has beat you in the 'time to market' game and snapped up your potential clients. Perfection and fine tuning of algorithms have their time and place. In the end business is about profits and making money.

You can beat O(n+m) on average for matches, only failure-to-match is provably O(n+m).

You walk the *second* string first, mapping its characters into a hashmap or bitset.

Then you walk the first string, removing each character from the set once it's seen. When the set is empty you stop and return subset-match. If you reach the end without emptying the set, you return mis-match.