First of all: I'm not sure if this question is allowed here. If not, I
apologize.
I'm trying to solve the following problem: for each unique word in a text,
find the number of times it occurs in that text.
I've come up with the following steps to solve this:
* remove everything except letters and whitespace, and make the text lowercase
* find all unique words in the text
* for each unique word, count the number of occurrences.
This has resulted in the following code:
import Data.Char (toLower)

removePunctuation :: [Char] -> [Char]
removePunctuation str = filter (\c -> elem c (['a'..'z'] ++ ['A'..'Z'] ++
    ['\t', ' ', '\n'])) str
process :: [Char] -> [String]
process str = words (map toLower (removePunctuation str))
unique :: (Eq a) => [a] -> [a]
unique [] = []
unique (x:xs) = [x] ++ unique (filter (\s -> x /= s) xs)
occurenceCount :: (Eq a) => a -> [a] -> Int
occurenceCount _ [] = 0
occurenceCount x (y:ys)
| x == y = 1 + occurenceCount x ys
| otherwise = occurenceCount x ys
occurenceCount' :: [String] -> [String] -> [(String, Int)]
occurenceCount' [] _ = [("", 0)]
occurenceCount' (u:us) xs = [(u, occurenceCount u xs)] ++ occurenceCount' us xs
Please remember I've only been playing with Haskell for three afternoons now
and I'm happy that the above code is working correctly.
However, I've got three questions:
1) occurenceCount' [] _ = [("", 0)] is plain ugly and also adds a useless
tuple to the end result. Is there a better way to solve this?
2) I'm forcing elements into a singleton list on two occasions, both in my
unique function and in my occurenceCount' function. Once again this seems
ugly and I'm wondering if there is a better solution.
3) The whole process as I'm doing it now feels pretty imperative (I've been
working for years as a Java / PHP programmer). I've got this feeling that
the occurenceCount' function could be implemented using a mapping function.
What ways are there to make this more "functional"?
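
To make the three questions concrete, here is a rough sketch of the direction
I suspect they point in. I have not confirmed that this is the idiomatic way,
so please treat it as a guess. The names normalise, uniqueWords, countEach and
wordFrequencies are made up for this sketch; toLower comes from Data.Char, and
group and sort come from Data.List.

```haskell
import Data.Char (toLower)
import Data.List (group, sort)

-- Same filtering as removePunctuation and process: keep ASCII letters and
-- whitespace, lowercase everything, then split on whitespace.
normalise :: String -> [String]
normalise = words . map toLower . filter (`elem` keep)
  where keep = ['a'..'z'] ++ ['A'..'Z'] ++ "\t \n"

-- Question 2: (:) prepends an element directly, so no singleton list is needed.
uniqueWords :: Eq a => [a] -> [a]
uniqueWords []     = []
uniqueWords (x:xs) = x : uniqueWords (filter (/= x) xs)

-- Questions 1 and 3: map builds one tuple per unique word, and mapping over
-- an empty list gives [], so the dummy ("", 0) case is not needed.
countEach :: Eq a => [a] -> [(a, Int)]
countEach xs = map (\u -> (u, length (filter (== u) xs))) (uniqueWords xs)

-- Alternative: sorting puts equal words next to each other, group collects
-- each run, and the length of a run is the count. The result comes out in
-- alphabetical order rather than in order of first appearance.
wordFrequencies :: String -> [(String, Int)]
wordFrequencies str = [(w, length ws) | ws@(w:_) <- group (sort (normalise str))]
```

In GHCi, countEach (normalise "The cat. The dog!") gives
[("the",2),("cat",1),("dog",1)], while wordFrequencies "The cat. The dog!"
gives [("cat",1),("dog",1),("the",2)]. Both variants still compare words
pairwise or sort them, so I make no claim about which is faster on large texts.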
--
View this message in context: http://old.nabble.com/-Newbie--What-to-improve-in-my-code-tp29156025p29156025.html
Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.