String Subsets

November 23, 2010

This exercise is part of our ongoing series of interview questions:

Given two strings, determine if all the characters in the second string appear in the first string; thus, DA is a subset of ABCD. Counts matter, so DAD is not a subset of ABCD, since there are two Ds in the second string but only one D in the first string. You may assume that the second string is no longer than the first string.

Your task is to write a function to determine if one string is a subset of another string. You should work as you would in a programming interview; if you find one solution, search for a better solution. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
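One possible linear-time approach, offered as a minimal sketch rather than as the suggested solution, is to count the characters of each string and compare the counts. The function name `is_subset_counts` is hypothetical, and the sketch assumes that hashing a character takes constant time.

```python
from collections import Counter

def is_subset_counts(word, subset):
    """Return True if each character of subset occurs in word at least as often."""
    available = Counter(word)   # character -> number of occurrences in word
    needed = Counter(subset)    # character -> number of occurrences in subset
    # A Counter returns 0 for a missing key, so absent characters fail the test.
    return all(available[ch] >= count for ch, count in needed.items())

print(is_subset_counts("ABCD", "DA"))   # DA is a subset of ABCD
print(is_subset_counts("ABCD", "DAD"))  # only one D in ABCD
```

Each string is scanned once to build its counts, so the work grows linearly with the combined length of the two strings.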

I was delighted to see that I came up with the same two O(n²) and O(n log n) solutions that you proposed… too bad I didn’t see the O(n) solution (that one was a “D’oh” moment, really) before reading the comments. Guess I won’t quit my day job yet :-)
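For readers wondering what an O(n log n) solution might look like, here is a minimal sketch that assumes the approach is to sort both strings and walk them in step; it may not match the suggested solution exactly, and the name `is_subset_sorted` is hypothetical.

```python
def is_subset_sorted(word, subset):
    """Return True if subset's characters, with multiplicity, all appear in word."""
    w = sorted(word)
    i = 0  # position of the next unused character in the sorted word
    for ch in sorted(subset):
        # Skip characters of word that sort before the one we need.
        while i < len(w) and w[i] < ch:
            i += 1
        if i == len(w) or w[i] != ch:
            return False  # ran out of word, or ch is missing
        i += 1  # consume the matched character so it cannot be reused
    return True

print(is_subset_sorted("ABCD", "DA"))
print(is_subset_sorted("ABCD", "DAD"))
```

The two sorts dominate the cost; the walk afterwards touches each character at most once.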

A much shorter solution involves removing the subset's letters from the word one at a time.

def isSubset(word, subset):
    """
    Determines whether all of the characters in subset appear in word,
    counting repeated characters.
    The second string cannot be longer than the first string.
    >>> isSubset("ABCD", "DA")
    True
    >>> isSubset("ABCD", "DAD")
    False
    >>> isSubset("AABC", "AA")
    True
    """
    assert len(subset) <= len(word), "The second string cannot be longer than the first string."
    # Go along the subset and try to remove one matching letter from the word
    for letter in subset:
        if letter in word:
            # Remove a single occurrence only; removing every occurrence
            # would wrongly reject repeated letters such as "AA" in "AABC".
            word = word.replace(letter, '', 1)
        else:
            return False  # couldn't find the letter, so it's not there
    return True  # all letters exist in the word