Tuesday, March 28, 2017

There is no doubt that data structures and algorithms are an integral part of any programming job interview, whether for Java, C++, or any other programming language. In fact, they are a favorite topic: top-notch companies including Google, Microsoft, and Amazon, and investment banks like Goldman Sachs, Citigroup, Barclays, Morgan Stanley, and J.P. Morgan, focus extensively on data structures and algorithms while hiring both senior and mid-level developers for Java, C++, and C# positions. When I shared some traditional, popular, and frequently asked questions on data structures and algorithms in my earlier article, I received a lot of feedback asking for practical, scenario-based questions, e.g. which data structure will you use in a particular scenario, and why.
This sounds really interesting to me, because it not only tests your knowledge of existing data structures, e.g. array, linked list, tree, graph, stack, queue, ring buffer, and Map, but also tests how well you can apply them in a given situation and, in some cases, whether you can come up with a new, innovative data structure.

Of course, classical questions on data structures, e.g. finding the middle element of a linked list in one pass or the internal working of HashMap, still matter, but these new questions are more fun and will give you an edge while preparing for Java, C++, or C# developer roles.

2 Scenario Based Data Structure and Algorithm Interview Questions

Ok, without any more introduction, let's see some good, practical questions on data structures and algorithms:

1) Which data structure will you use to implement a Limit Order Book, and why?
This question is asked mostly in investment banking Java interviews. Sometimes it's also asked as an object-oriented design question, which involves detailed design and coding. Let me give a brief intro to the limit order book, especially for those programmers who are not from financial services firms or are not familiar with stock trading.

In stock trading, exchanges like NYSE (New York Stock Exchange) maintain an order book for every security or stock traded on the exchange, e.g. GOOG, which is the symbol of Google's stock. There are mainly two kinds of orders customers can send: a buy order and a sell order. With a limit price, a buy order with a limit of $50 can be executed if it finds a sell order at $50 or at a lower price, say $49, but it cannot be executed against a sell order priced at $51. Similarly, a sell order can execute at a price that is equal to or higher than its limit price. In general, a LIMIT order executes if it finds the specified price or a better price (lower in the case of a buy, and higher in the case of a sell).
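The limit-price rule above can be written down as a single comparison. This is only a minimal sketch: the class name `LimitPriceRule`, the `Side` enum, and the choice to hold prices as whole cents in a `long` are my assumptions for illustration, not part of any exchange API.

```java
/** Hypothetical helper that encodes the limit-price rule described above. */
final class LimitPriceRule {

    enum Side { BUY, SELL }

    /**
     * Returns true if an order on the given side, with the given limit price,
     * may execute against an opposite order quoted at oppositePrice.
     * Prices are in cents (an assumption, to avoid floating-point comparisons).
     */
    static boolean canExecute(Side side, long limitPrice, long oppositePrice) {
        if (side == Side.BUY) {
            // A buy executes at the limit price or lower.
            return oppositePrice <= limitPrice;
        }
        // A sell executes at the limit price or higher.
        return oppositePrice >= limitPrice;
    }
}
```

For example, `canExecute(Side.BUY, 5000, 4900)` is true (a $50 buy against a $49 sell), while `canExecute(Side.BUY, 5000, 5100)` is false.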

Orders are executed on a first-come, first-served basis, so the exchange also maintains time priority. Now, let's see the flow: an order comes to the exchange, and the exchange looks in the order book for that symbol. If it finds a match, it executes the order; otherwise, it adds the order at the end of the queue for that price. The queue represents time priority: the head of the queue is the order with the highest time priority, i.e. the order that arrived before all the orders behind it.

Our goal is to perform these operations as quickly as possible. If you look closely, there are three of them: finding an opposite order whose price matches (equal to, or lower/higher than, the specified price), removing an order from the book when it is matched or canceled, and adding an order to the book at the appropriate place when it is not matched.

All of these operations involve searching by price, which hints at a binary search tree. If we store orders in a balanced binary search tree keyed by price, we can find a matching order in O(log N) time, not as fast as O(1) but still decent. Note that an unbalanced tree can degrade to O(N) in the worst case, so a self-balancing variant such as a Red-Black tree is the safer choice.

Adding and removing orders cost about the same, because they also involve a traversal. To handle different symbols, an OrderBook must be associated with a single symbol, since an order for one symbol cannot match an order for another. This is just the basic idea, without going into exact requirements, i.e. the methods an order book must support.

Btw, if you are not familiar with the tree data structure and its variants, e.g. binary search tree, balanced trees like AVL and Red-Black trees, and tries, then I suggest you first read a good book on data structures and algorithms, e.g. Introduction to Algorithms by Thomas H. Cormen.

I have used a Queue to maintain time priority for orders at the same price. Of course, this is very high level, but it does give you some idea of how to approach a problem and choose a particular data structure. Remember, the choice of data structure is mainly driven by the operations performed on it, e.g. we used a Queue instead of a Stack here because the order that comes first should execute first (FIFO) if the price matches.

I have used a tree because we need to find either the specified price or a better price, which involves searching across different prices. Remember, we have not yet considered concurrency, thread-safety, and all those things that are quite important while designing structures for high-volume, low-latency applications, but that's a topic for a separate discussion.
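Here is a rough sketch of the tree-plus-queue idea in Java. It uses `java.util.TreeMap`, which is a Red-Black tree based map, with one FIFO queue per price level. Everything else is an assumption made for illustration: the names `SimpleOrderBook` and `Order`, prices held as `long`, one incoming order matching exactly one resting order (no quantities or partial fills), no cancel, and no thread-safety.

```java
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;

/** Hypothetical, single-threaded order book for one symbol. */
final class SimpleOrderBook {

    enum Side { BUY, SELL }

    static final class Order {
        final long id;
        final Side side;
        final long limitPrice;

        Order(long id, Side side, long limitPrice) {
            this.id = id;
            this.side = side;
            this.limitPrice = limitPrice;
        }
    }

    private final String symbol;
    // Best bid is the highest price, so bids are sorted in descending order.
    private final TreeMap<Long, Queue<Order>> bids = new TreeMap<>(Comparator.reverseOrder());
    // Best ask is the lowest price, so asks use natural (ascending) order.
    private final TreeMap<Long, Queue<Order>> asks = new TreeMap<>();

    SimpleOrderBook(String symbol) {
        this.symbol = symbol;
    }

    String symbol() {
        return symbol;
    }

    /** Returns the resting order that was matched, or null if the incoming order was added to the book. */
    Order submit(Order incoming) {
        TreeMap<Long, Queue<Order>> opposite = incoming.side == Side.BUY ? asks : bids;
        Map.Entry<Long, Queue<Order>> best = opposite.firstEntry(); // best opposite price level
        if (best != null && crosses(incoming, best.getKey())) {
            Order matched = best.getValue().poll(); // head of queue = highest time priority
            if (best.getValue().isEmpty()) {
                opposite.remove(best.getKey()); // drop the empty price level
            }
            return matched;
        }
        // No match: append at the tail of the queue for this price on the order's own side.
        TreeMap<Long, Queue<Order>> own = incoming.side == Side.BUY ? bids : asks;
        own.computeIfAbsent(incoming.limitPrice, p -> new ArrayDeque<>()).add(incoming);
        return null;
    }

    private static boolean crosses(Order incoming, long restingPrice) {
        return incoming.side == Side.BUY
                ? restingPrice <= incoming.limitPrice
                : restingPrice >= incoming.limitPrice;
    }
}
```

Finding the best price level, adding a level, and removing a level are all O(log N) in the number of price levels, while appending to or polling from a level's queue is O(1). Canceling an order from the middle of a queue would still need a scan of that level in this sketch.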

Here our goal is to focus on choosing the right data structure. Also, if you think a binary search tree is not the right data structure for representing a limit order book, you can definitely share your thoughts and solution with us. We will see if it solves the problem with better performance in terms of time and development effort.

On the other hand, if you are not familiar with essential data structures like the array, linked list, binary search tree, balanced trees like AVL and Red-Black trees, and hash tables, I suggest you read a good book on data structures. For Java developers, I recommend "Data Structure and Algorithms Made Easy" by Narasimha Karumanchi.

2) Which data structure will you choose to store Market data, and why?
This is another data structure question that is asked in various Wall Street firms, e.g. Barclays, Citibank, Goldman Sachs, Deutsche Bank, Wells Fargo, etc. Since many of you might not be familiar with market data, let me briefly explain it and the key points that will help us approach the problem. In simple words, market data is the live prices of different stocks at a given moment. Since prices move very quickly on an exchange, given the volume of high-frequency and electronic trading, market data quickly becomes stale.

Suppose your system stores market data to analyze it and find under-valued or over-valued stocks in order to make a buying or selling decision, and you need to store only recent market data for that purpose. This gives us an idea: as soon as we receive an update for a symbol, we should discard the old market data for it, even if it has not been processed yet.

Since market data is processed by another thread and is subsequently removed from the store, we also need a data structure that provides consume-style functionality. Initially, I thought of using an unbounded Queue for storing market data, which is simultaneously processed by another thread in a typical producer-consumer design, but this design has a problem when prices update at high speed.

For example, if you receive an update before the earlier market data has been processed, you cannot update the queued entry, which means your application will process old, stale data even though better data is available. You can tackle this problem by introducing a check-and-update step, i.e. if market data for that symbol already exists in the queue, remove it and then add the new data.

You can do that by using the contains() method, but you have to be careful while overriding equals() and hashCode(). If you wrap the symbol and price in an object, say Stock, and equality includes the price, then two Stock objects for the same symbol with different prices will be treated as different, and the check will never find the stale entry. So a Queue seems to be a workable solution for this problem; let me know if you come across any other solution.
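The check-and-update idea could look roughly like this. It is a minimal sketch under my own assumptions: the `Stock` fields, the `LatestStockQueue` helper, and its `offerLatest` method are hypothetical names, equality is deliberately based on the symbol only, and there is no synchronization, so it is not safe for a real producer-consumer setup as written.

```java
import java.util.Queue;

/** Hypothetical market data holder; two Stock objects are equal when their symbols are equal. */
final class Stock {
    final String symbol;
    final double price;

    Stock(String symbol, double price) {
        this.symbol = symbol;
        this.price = price;
    }

    @Override
    public boolean equals(Object o) {
        // Price is intentionally ignored so a new quote "equals" the stale one.
        return o instanceof Stock && symbol.equals(((Stock) o).symbol);
    }

    @Override
    public int hashCode() {
        return symbol.hashCode(); // consistent with equals()
    }
}

/** Hypothetical helper implementing the check-and-update step. */
final class LatestStockQueue {

    /** Removes any queued quote for the same symbol, then appends the new one at the tail. */
    static void offerLatest(Queue<Stock> queue, Stock update) {
        queue.remove(update); // relies on Stock.equals(); linear scan for typical queues
        queue.add(update);
    }
}
```

Two things are worth noticing. First, `Queue.remove(Object)` already covers the contains() check, but it scans the queue, so the cost grows with the number of pending quotes. Second, the refreshed symbol moves to the tail of the queue, so a symbol that updates very often can keep getting pushed back.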

I also thought about a Map. With LinkedHashMap you keep the order of insertion and, being a Map, you can update market data in constant time. However, the consumer thread, which needs to iterate over and process the market data, will have a hard time traversing it, as the Map will be constantly updated.
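One possible way to soften the traversal problem, and this is my assumption rather than a settled answer, is to never iterate over the whole map: the consumer removes only the oldest entry, and both sides go through the same lock. The class name `LatestPriceStore` and its methods are hypothetical, and coarse `synchronized` methods are used only to keep the sketch short.

```java
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical store that keeps only the latest price per symbol, in first-insertion order. */
final class LatestPriceStore {

    // Insertion-ordered: re-putting an existing key updates the value but keeps its position.
    private final Map<String, Double> latest = new LinkedHashMap<>();

    /** Producer side: overwrite the price for a symbol. */
    synchronized void update(String symbol, double price) {
        latest.put(symbol, price);
    }

    /** Consumer side: remove and return the oldest pending entry, or null if there is none. */
    synchronized Map.Entry<String, Double> pollOldest() {
        Iterator<Map.Entry<String, Double>> it = latest.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        // Copy the entry before removing it, since the live entry belongs to the map.
        Map.Entry<String, Double> oldest = new AbstractMap.SimpleImmutableEntry<>(it.next());
        it.remove();
        return oldest;
    }
}
```

With this shape, an update for a symbol that is still waiting replaces its price in place, so the consumer always sees the latest value and the symbol does not lose its place in line. Whether a single lock is acceptable for a low-latency system is a separate question that you would need to discuss with the interviewer.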

Well, some of these questions can be very open-ended until the interviewer gives more details, but at the same time interviewers expect the candidate to ask the right questions. These are mostly what, why, and when queries; remember that they are very important and are appreciated by interviewers.

That's all on this list of practical data structure and algorithm questions for Java, C++, and C# developers. These are just one way of approaching the problems; you may have another approach or another data structure in mind, and as long as you can justify your choice in terms of its benefits, you are perfectly fine to use it. In fact, improving your own solution almost always creates a better impression during interviews.

At the same time, don't forget to prepare basic data structure and algorithm questions on linked lists, arrays, queues, and stacks. Stay tuned, I will share a few more such questions as and when I come across them; meanwhile, you are always welcome to share something similar. Till then, you can also use the problems given in the Algorithm Design Manual by Steven S. Skiena to improve your data structure and problem-solving skills.

Thanks for reading this article so far. If you like this article, then please share it with your friends and colleagues. If you have any questions or doubts, then please let us know and I'll try to find an answer for you.