Sunday, July 3, 2016

Problem

Given a string, find the length of the longest substring without repeating characters.

Examples:

Given "abcabcbb", the answer is "abc", which the length is 3.

Given "bbbbb", the answer is "b", with the length of 1.

Given "pwwkew", the answer is "wke", with the length of 3. Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

Solution

The brute force approach entails looking at all O(n^2) substrings, and checking in O(n) if there are any repeated characters, resulting in O(n^3) time. The more efficient solution actually solves this in linear time. Through some simple bookeeping, we keep track of the starting position and length of the substring. While iterating through the string we must maintain the indices of repeated characters that were previously seen. A HashMap proves convenient, allowing for constant access to the index of each character. A simple int array could also be used at the cost of more space (for all possible characters).

At each character, we store the character's current index if it is either in the current max substring, or if it was previously, but is not anymore (Case 1). Otherwise, we have a character that is already in the current substring, in which case we offset the start of the substring to after the index of the first occurrence of the repeated character (and then save the new index). Below is the implementation of this algorithm. It is quite clean, and an important one to know as variations of it are very common as interview questions.