Category: algorithms

You have a list of integers, and for each index you want to find the product of every integer except the integer at that index.

Restrictions: divisions are not allowed.

First thought that comes to my mind is to use 2 for-loops and multiply all the indices except the current one of the first loop. Easy, finished in less than a minute. The issue is the solution has an n^2 complexity.

How to optimize?

As usual the problem is the data is not organized in a nice way for this problem. Time to analyze!

By looking at the expanded output, I can see there is a pattern going on. Exactly what it is took me a few minutes to figure it out, but right out of the box I can definitely see that:

The lowest index of the input is added on the expanded output with an index of 1. In other words the number 1, with an index 0, of the input array, was added on the index 1 of the output array; which makes sense given that index 0 of the input cannot be considered in the product of the output in index 0.

Note: this sort of reminds me to the fibonacci problem were you have to reuse operations to avoid recalculating the same thing over and over. Its hard to think how to reuse operations though. In this case I have to break it into sub-problems to identify the patterns.

A few minutes pass of me staring at the whiteboard and I realize that the current index has all the numbers to the left and to the right of it.

index 2 of ouput expanded: [—,—,1*2*4,—]

The 1 and 2 are numbers to the left of the current number and 4 is to the right. Sounds silly, but that helps to think about the solution. In other words, the current number is the product of all the numbers to the left and all the numbers to the right!

What if I have 2 extra arrays:

before[] where its current index is the product of all the indices before it.

after[] where its current index is the product of all the indices before it.

I can then loop through the before[] and after[] array and simply multiply the current index and store it in a result[] array.

I am not sure, but that sounds like it reduces the complexity (hopefully to 2 individual loops).

It sounds like a good moment to start drawing some code (I say drawing because I am getting used to whiteboards so I write it there first).

I made a mistake here! I did not realize right away the ordering of the multiplication and assigning the arrays was important. First I need to assign the value to the product array, otherwise I am skipping ahead.

(I briefly made another mistake but corrected it in my mind. Initializing the product variables to 0 because there are no values before the first or last index. What is interesting is the representation of this emptiness works well with the number 1 and not 0 – also 0 would just 0 out the entire multiplication)

“Design a simplified version of Twitter where users can post tweets, follow/unfollow another user and is able to see the 10 most recent tweets in the user’s news feed. Your design should support the following methods:

postTweet(userId, tweetId): Compose a new tweet.

getNewsFeed(userId): Retrieve the 10 most recent tweet ids in the user’s news feed. Each item in the news feed must be posted by users who the user followed or by the user herself. Tweets must be ordered from most recent to least recent.

follow(followerId, followeeId): Follower follows a followee.

unfollow(followerId, followeeId): Follower unfollows a followee.”

Understanding the data

Each userId is related to a tweetId, in the form of a “twitter post”.

Each userId is related to multiple userIds, in the form of a “follower” or “followee”.

I need to store and organize the data in such a way that makes it easy to relate a userId to a tweetId; to perform find operations.

The data also needs to be organized in such a way that I can find a userId and based on this userId find all the followees, to then extract all the “tweetIds” from the followees.

We could create all these relationships in a Twitter class; using lists, maps and sets. The problem is the code then becomes convoluted and there is no clear separation of logic. The previous leads me to believe I need to use classes.

The Twitter class ended up being the most complicated one because it is the one that orchestrates the other 2 classes and gives them the functionality.

I chose to use List for non randomized access to data, HashMap for 1:1 relations and Set whenever I need a unique-random-access data.

The pros:

Organized data and easy to identify each user and tweetId.

Users and tweets are also encapsulated.

The cons:

Relating data outside its data structure requires some processing – retrieving the news feed of a user requires retrieving tweets of several other users. Perhaps the relationship can be improved?

Thoughts:

As part of my learning experience I thought about using the Java Streaming API somewhere in this implementation, but I did not find a good use for it. The stream appears to work better when all the data is in a single point. I thought about using the stream in the news feed method but I have not really found a good use for it when the data is scattered through different users – again, maybe the design needs to be improved.

The design does not consider any scenarios of an actual implementation: networking, databases, distributed systems. These 3 classes get much more complicated if tried to implement in a real scenario.

I also added some tests. To be honest the tests need more work as I am only testing for the ideal case.

Update:

A friend of mine suggested to think of how it would work in a database and that the list of tweets would not be held by the user table or class. By moving the tweet list to the Twitter class I was able to remove a lot of the aggregated complexity. In general, the design became much simpler and… I was able to use the Java streaming API!

Currently the design does not need date comparison since its single threaded, in a distributed system the software would more than likely need date-based sorting.

Given an array of integers and an integer target, find any 2 numbers in the list that add up to the target.

My first naive approach was to brute force it: for every number, iterate it against every other number and check to see if the addition matches the target. That is great, we will find the solution but its a very slow algorithm. It consumes n^2 processing cycles in the worse case scenario.

Trying to do better – initial thoughts:

If the target is greater than zero, we do not need to consider any number that is larger than the target.

If the target is small than zero, we do not need to consider any number that is smaller than the target.

Is the data organized in the data structure that we need? Maybe the algorithm is slow because it is constrained by the list/array of integers.

Is there any data structure that gives me O(1) access to the data? A map or a set could do the job but then how would the algorithm look like? Plus I would need to process the data to store it in the new data structure, that would take n cycles.

This did not come right away to my thoughts but after reading more about it I did understand it: the difference of any index in the array and the target can be found in the newly built set fairly quickly. If we do not find the data, then there is no answer. I also ignore any number larger than target.

Not terribly bad – 2 for loops over n, so n+n(2n). Definitely better than n^2. The solution also traded processing for memory because now an extra Set is required.

Update and reminder to myself: at first I did not take into consideration negative numbers. My initial logic was discarding all negative numbers because of the logic: “we do not need to consider any number that is smaller than the target”. The previous logical statement works only when target is greater than zero. Funny enough by trying to optimize, I ended up making it more complicated and breaking the whole thing with any target less than zero.

Update #2: The optimization was actually breaking a case. What happens when the target is 10 and the array has numbers like 12 and -2?