Algorithms Interview Questions

What are Divide and Conquer algorithms? Describe how they work. Can you give any common examples of the types of problems where this approach might be used?

Divide and Conquer algorithms are a paradigm for solving problems that involve several basic steps. First, we divide the problem into smaller pieces and work to solve each of them independently. Once we’ve solved all of the pieces, we take all of the resulting smaller solutions and combine them into a single integrated comprehensive solution.

This process can be performed recursively; that is, each “sub problem” can itself be subdivided into even smaller parts if necessary. This recursive division of the problem is performed until each individual problem is small enough to become relatively trivial to solve.

Some common examples of problems that lend themselves well to this approach are binary search, sorting algorithms (e.g., Merge Sort, Quicksort), optimization of computationally complex mathematical operations (Exponentiation, FFT, Strassen’s algorithm), and others.

Why do you think quick sort is better than merge sort?

Quicksort has O(n2) worst-case runtime and O(n log n) average case runtime while the worst and best case for merge sort is O(n log n)
However, it’s superior to merge sort in many scenarios because many factors influence an algorithm’s runtime, and, when taking them all together, quicksort wins out.

In particular, the often-quoted runtime of sorting algorithms refers to the number of comparisons or the number of swaps necessary to perform to sort the data. This is indeed a good measure of performance, especially since it’s independent of the underlying hardware design. However, other things – such as locality of reference (i.e. do we read lots of elements which are probably in cache?) – also play an important role on current hardware. Quicksort in particular requires little additional space and exhibits good cache locality, and this makes it faster than merge sort in many cases.

In addition, it’s very easy to avoid quicksort’s worst-case run time of O(n2) almost entirely by using an appropriate choice of the pivot – such as picking it at random (this is an excellent strategy).

How do Insertion sort, Heapsort, Quicksort, and Merge sort work?

Insertion sort takes elements of the array sequentially, and maintains a sorted subarray to the left of the current point. It does this by taking an element, finding its correct position in the sorted array, and shifting all following elements by 1, leaving a space for the element to be inserted.

Heapsort starts by building a max heap. A binary max heap is a nearly complete binary tree in which each parent node is larger or equal to its children. The heap is stored in the same memory in which the original array elements are. Once the heap is formed, it completely replaces the array. After that, we take and remove the first element, restore the heap property, thus reducing the heap size by 1, after which we place the max element at the end of that memory. This is repeated until we empty out the heap, resulting in the smallest element being in the first place, and the following elements being sequentially larger.

Quicksort is performed by taking the first (leftmost) element of the array as a pivot point. We then compare it to each following element. When we find one that is smaller, we move it to the left. The moving is performed quickly by swapping that element with the first element after the pivot point, and then swapping the pivot point with the element after it. After going through the whole array, we take all points on the left of the pivot and call quicksort on that subarray, and we do the same to all points on the right of the pivot. The recursion is performed until we reach subarrays of 0-1 elements in length.

Merge sort recursively halves the given array. Once the subarrays reach trivial length, merging begins. Merging takes the smallest element between two adjacent subarrays and repeats that step until all elements are taken, resulting in a sorted subarray. The process is repeated on pairs of adjacent subarrays until we arrive at the starting array, but sorted.

What are the key advantages of Insertion Sort, Quicksort, Heapsort and Mergesort? Discuss best, average, and worst case time and memory complexity.

Insertion sort has an average and worst runtime of O(n^2), and a best runtime of O(n). It doesn’t need any extra buffer, so space complexity is O(1). It is efficient at sorting extremely short arrays due to a very low constant factor in its complexity. It is also extremely good at sorting arrays that are already “almost” sorted. A common use is for re-sorting arrays that have had some small updates to their elements.

The other three algorithms have a best and average runtime complexity of O(n log n). Heapsort and Mergesort maintain that complexity even in worst case scenarios, while Quicksort has worst case performance of O(n^2).

Quicksort is sensitive to the data provided. Without usage of random pivots, it uses O(n^2) time for sorting a full sorted array. But by swapping random unsorted elements with the first element, and sorting afterwards, the algorithm becomes less sensitive to data would otherwise cause worst-case behavior (e.g. already sorted arrays). Even though it doesn’t have lower complexity than Heapsort or Merge sort, it has a very low constant factor to its execution speed, which generally gives it a speed advantage when working with lots of random data.

Heapsort has reliable time complexity and doesn’t require any extra buffer space. As a result, it is useful in software that requires reliable speed over optimal average runtime, and/or has limited memory to operate with the data. Thus, systems with real time requirements and memory constraints benefit the most from this algorithm.

Merge sort has a much smaller constant factor than Heapsort, but requires O(n) buffer space to store intermediate data, which is very expensive. Its main selling point is that it is stable, as compared to Heapsort which isn’t. In addition, its implementation is very parallelizable.

What are Red-Black Trees and B-Trees? What is the best use case for each of them?

Both Red-Black Trees and B-Trees are balanced search trees that can be used on items that can have a comparison operator defined on them. They allow operations like minimum, maximum, predecessor, successor, insert, and delete, in O(log N) time (with N being the number of elements). Thus, they can be used for implementing a map, priority queue, or index of a database, to name a few examples.

Red-Black Trees are an improvement upon Binary Search Trees. Binary Search Trees use binary trees to perform the named operations, but the depth of the tree is not controlled – so operations can end up taking a lot more time than expected. Red-Black Trees solve this issue by marking all nodes in the tree as red or black, and setting rules of how certain positions between nodes should be processed. Without going into too much detail, this method guarantees that the longest branch isn’t more than twice as long as the shortest branch, so each branch is shorter than 2*log_base2(N).

This is the ideal structure for implementing ordered maps and priority queues. B-Trees branch into K-2K branches (for a given number K) rather than into 2, as is typical for binary trees. Other than that, they behave pretty similarly to a binary search tree. This has the advantage of reducing access operations, which is particularly useful when data is stored on secondary storage or in a remote location. This way, we can request data in larger chunks, and by the time we finish processing a previous request, our new request is ready to be handled. This structure is often used in implementing databases, since they have a lot of secondary storage access.

What is a Hash Table, and what is the average case and worst case time for each of its operations? How can we use this structure to find all anagrams in a dictionary?

A Hash Table is a data structure for mapping values to keys of arbitrary type. The Hash Table consists of a sequentially enumerated array of buckets of a certain length. We assign each possible key to a bucket by using a hash function – a function that returns an integer number (the index of a bucket) for any given key. Multiple keys can be assigned to the same bucket, so all the (key, value) pairs are stored in lists within their respective buckets.

Choosing the right hashing function can have a great impact on performance. A hash function that is good for a dataset that we want to store will result in hashes of different keys being a rare occurrence. Even though accessing, inserting and deleting have a worst case time complexity of O(N) (where N is the number of elements in the Hash Table), in practice we have an average time complexity of O(1).

To find all anagrams in a dictionary, we just have to group all words that have the same set of letters in them. So, if we map words to strings representing their sorted letters, we could group words into lists by using their sorted letters as a key.

The hash table stores lists mapped to strings. For each word, we add it to the list at the appropriate key, or make a new list and add it to it then. On average, for a dictionary of N words of length less or equal to L, this algorithm works with an average time complexity of O(N L log L).

What is A*, what are its implementation details, and what are its advantages and drawbacks in regard to traversing graphs towards a target?

A* is an algorithm for pathfinding that doesn’t attempt to find optimal solutions, but only tries to find solutions quickly and without wandering too much into unimportant parts of the graph.

It does this by employing a heuristic that approximates the distance of a node from the goal node. This is most trivially explained on a graph that represents a path mesh in space. If our goal is to find a path from point A to point B, we could set the heuristic to be the Euclidean distance from the queried point to point B, scaled by a chosen factor.

This heuristic is employed by adding it to our distance from the start point. Beyond that, the rest of the implementation is identical to Dijkstra.

The algorithm’s main advantage is the quick average runtime. The main disadvantage is the fact that it doesn’t find optimal solutions, but any solution that fits the heuristic.

A significantly large static set of string keys has been given, together with data for each of those keys. We need to create a data structure that allows us to update and access that data quickly, with constant time even in worst cases. How can we solve this problem?

Let’s mark the number of keys as N. The problem presented is a problem of perfect hashing. We do a similar approach as a normal HashTable, but instead of storing collisions in a list, we store them in a secondary HashTable. We choose primary hashing functions until all buckets have a relatively small number of elements in them. After that, in buckets with K keys we place hash tables with K^2 buckets. Even though this seems to result in high memory complexity, expected memory complexity is O(N). By choosing this size of HashTables we have a 50% probability to have collisions. This makes it easy to choose hashing functions that will not result in collisions.

Can Binary Search be used for linked lists?

Since random access is not allowed in linked list, we cannot reach the middle element in O(1) time. Therefore Binary Search is not possible for linked lists.

How to find if two given rectangles overlap?

Two rectangles do not overlap if one of the following conditions is true.
1) One rectangle is above top edge of other rectangle.
2) One rectangle is on left side of left edge of other rectangle.

How to find angle between hour and minute hands at a given time?

The idea is to take a reference point as 12. Find the angle moved by hour and minute hands, subtract the two angles to find the angle between them.

When does the worst case of QuickSort occur?

In quickSort, we select a pivot element, then partition the given array around the pivot element by placing pivot element at its correct position in sorted array. The worst case of quickSort occurs when one part after partition contains all elements and other part is empty. For example, if the input array is sorted and if last or first element is chosen as a pivot, then the worst occurs.

Given a big string of characters, how to efficiently find the first unique character in it?

The efficient solution is to use character as an index in a count array. Traverse the given string and store index of first occurrence of every character, also store count of occurrences. Then traverse the count array and find the smallest index with count as 1.

How to count inversions in a sorted array?

Two elements arr[i] and arr[j] in an array arr[] form an inversion if a[i] > a[j] and i < j. How to count all inversions in an unsorted array.

Given a big array, how to efficiently find k’th largest element in it?

There can be many solutions for this. The best solution is to use min heap. We Build a Min Heap MH of the first k elements. For each element, after the kth element (arr[k] to arr[n-1]), compare it with root of MH, if the element is greater than the root then make it root and call heapify for MH, Else ignore it. Finally, MH has k largest elements and root of the MH is the kth largest element.

You are given an array of sorted words in an arbitrary language, you need to find order (or precedence of characters) in the language. For example if the given arrays is {“baa”, “abcd”, “abca”, “cab”, “cad”}, then order of characters is ‘b’, ‘d’, ‘a’, ‘c’. Note that words are sorted and in the given language “baa” comes before “abcd”, therefore ‘b’ is before ‘a’ in output. Similarly we can find other orders.
This can be solved using two steps: First create a graph by processing given set of words, then do topological sorting of the created graph.

How to find middle element of linked list in one pass?

In order to find length of linked list we need to first traverse through linked list till we find last node, which is pointing to null, and then in second pass we can find middle element by traversing only half of length. In order to find middle element of linked list in one pass, you need to maintain two-pointer, one increment at each node while other increments after two nodes at a time, by having this arrangement, when first pointer reaches end, second pointer will point to middle element of linked list.

How to find if linked list has a loop ?

We can use two pointer approach to solve this problem. If we maintain two pointers, and we increment one pointer after processing two nodes and other after processing every node, we are likely to find a situation where both the pointers will be pointing to same node. This will only happen if linked list has loop.

How to find 3rd element from end in a linked list in one pass?

If we apply same trick of maintaining two pointers and increment other pointer, when first has moved up to 3rd element, than when first pointer reaches to the end of linked list, second pointer will be pointing to the 3rd element from last in a linked list.

In an integer array, there is 1 to 100 number, out of one is duplicate, how to find?

In this case you can simply add all numbers stored in array, and total sum should be equal to n(n+1)/2. Now just subtract actual sum to expected sum, and that is your duplicate number. Of course there is a brute force way of checking each number against all other numbers, but that will result in performance of O(n^2) which is not good. By the way this trick will not work if array have multiple duplicates or its not numbers forming arithmetic progression. Here

Step 1: Start
Step 2: Take two variable I and j.
Step 3: j is positioned on last character (Technically we can do this by length(string)-1)
Step 4: I is positioned on first Character (we can do this by string[0])
Step 5: String[i] is interchanged String[j] Step 6: Increment I by 1
Step 7: Increment J by 1
Step 8: If ‘I’ > ‘j’ then go to step 3
Step 9: Stop

How do you find duplicates in an array if there is more than one duplicate?

One way of solving this problem is using a Hashtable or HashMap data structure. You can traverse through array, and store each number as key and number of occurrence as value. At the end of traversal you can find all duplicate numbers, for which occurrence is more than one. In Java if a number already exists in HashMap then calling get(index) will return number otherwise it return null. this property can be used to insert or update numbers in HashMap.

What is difference between Singly Linked List and Doubly Linked List data structure?

Main difference between singly linked list and doubly linked list is ability to traverse. In a single linked list, node only points towards next node, and there is no pointer to previous node, which means you can not traverse back on a singly linked list. On the other hand doubly linked list maintains two pointers, towards next and previous node, which allows you to navigate in both direction in any linked list.

What is binary search tree?

Binary Search Tree has some special properties e.g. left nodes contains items whose value is less than root , right sub tree contains keys with higher node value than root, and there should not be any duplicates in the tree. Apart from definition, interview can ask you to implement binary search tree in Java and questions on tree traversal e.g. IN order, preorder, and post order traversals are quite popular data structure question.

Explain what is Skip list?

Skip list the method for data structuring, where it allows the algorithm to search, delete and insert elements in a symbol table or dictionary. In a skip list, each element is represented by a node. The search function returns the content of the value related to key. The insert operation associates a specified key with a new value, while the delete function deletes the specified key.

What are the three laws of recursion algorithm?

All recursive algorithm must follow three laws:
It should have a base case
A recursive algorithm must call itself
A recursive algorithm must change its state and move towards the base case.

What Is The Difference Between A Backtracking Algorithm And A Brute-force One?

Due to the fact that a backtracking algorithm takes all the possible outcomes for a decision, it is similar from this point of view with the brute force algorithm. The difference consists in the fact that sometimes a backtracking algorithm can detect that an exhaustive search is unnecessary and, therefore, it can perform much better.