You have a large text file containing words. Given any two words, find the shortest distance (in terms of number of words) between them in the file. Can you make the searching operation in O(1) time? What about the space complexity for your solution?

My initial thoughts:
We can precompute and store a mapping from each pair of words to their shortest distance. That takes O(n^2) space, where n is the number of distinct words. But the query is constant time, since it’s just a single look-up.
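A minimal sketch of that pair table, assuming the text has already been split into a list of words and positions are counted from 1. The names `build_pair_table` and `pair_distance` are made up for illustration. The idea is that the closest pair of two words always consists of an occurrence of one word and the most recent earlier occurrence of the other, so one pass that remembers each word's last position is enough.

```python
def build_pair_table(words):
    """Map each unordered pair of distinct words to their shortest distance."""
    last_seen = {}   # word -> most recent position seen so far
    table = {}       # (smaller_word, larger_word) -> shortest distance
    for pos, word in enumerate(words, start=1):
        for other, other_pos in last_seen.items():
            if other == word:
                continue
            # Store the pair in sorted order so (a, b) and (b, a) share a key.
            key = (word, other) if word < other else (other, word)
            dist = pos - other_pos
            if key not in table or dist < table[key]:
                table[key] = dist
        last_seen[word] = pos
    return table


def pair_distance(table, a, b):
    """Constant-time query; returns None if the pair never co-occurs."""
    key = (a, b) if a < b else (b, a)
    return table.get(key)
```

Building the table this way costs roughly (text length) x (number of distinct words) time, which is the price paid for the constant-time query.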

Pre-processing: Store the list of locations of each word in a hashtable. One scan of the text: O(N) time for a text of N words. Approximately O(N) space, since every word position is stored exactly once.
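The pre-processing step could look roughly like this. It is only a sketch: it assumes words are separated by whitespace, counts positions from 1, and does no case folding or punctuation stripping. The function name `build_index` is made up for illustration.

```python
from collections import defaultdict


def build_index(text):
    """Map each word to the list of its 1-based positions in the text."""
    index = defaultdict(list)
    for position, word in enumerate(text.split(), start=1):
        # Positions are appended in increasing order, so each list is
        # already sorted and ready for binary search at query time.
        index[word].append(position)
    return index
```

For a really large file, the same loop can be fed word by word from a stream instead of calling `split()` on the whole text.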

Query: Modified binary search. For example, the query is (hello, world). Looking them up in the hashtable gives “hello” -> [1,2] and “world” -> [5,8,9]. We search for 1 in [5,8,9] to find the nearest location, which is 5, so the distance is 4. We then search for 2 in [5,8,9] to find the nearest, which is 5 again, yielding distance 3, less than 4. So the shortest distance between “hello” and “world” is 3. If the two words occur a and b times, a query costs O(a log b). In practice, the number of locations for a word is relatively small compared to the size of the text, hence the cost of a query is nearly constant.
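Here is one way the query could be written, using Python's standard `bisect` module for the binary search. It is a sketch under the assumption that the index maps each word to a sorted list of positions, as in the example above. The name `shortest_distance` is made up for illustration.

```python
from bisect import bisect_left


def shortest_distance(index, first, second):
    """Shortest distance between two words, or None if either is absent."""
    a = index.get(first)
    b = index.get(second)
    if not a or not b:
        return None
    # Iterate over the shorter list and binary-search the longer one.
    if len(a) > len(b):
        a, b = b, a
    best = None
    for pos in a:
        i = bisect_left(b, pos)
        # The nearest location is either just before or at the insertion point.
        for j in (i - 1, i):
            if 0 <= j < len(b):
                dist = abs(b[j] - pos)
                if best is None or dist < best:
                    best = dist
    return best
```

With `index = {"hello": [1, 2], "world": [5, 8, 9]}`, calling `shortest_distance(index, "hello", "world")` gives 3, matching the walk-through above.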