1 Answer

Really, the only negative comment I have is that you collect words into a Vec and then immediately iterate over it. Instead, you can just use the iterator directly.
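As a minimal sketch of that point, assuming the original code split the input on whitespace and counted into a HashMap (the function name `count_words` and the lowercasing step are illustrative assumptions, not the reviewed code), the iterator can feed the loop directly with no intermediate Vec:

```rust
use std::collections::HashMap;

/// Counts words by consuming the `split_whitespace` iterator directly.
/// `count_words` is a hypothetical name, not taken from the reviewed code.
fn count_words(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    // No `.collect::<Vec<_>>()` here: the `for` loop drives the iterator itself.
    for word in text.split_whitespace() {
        *counts.entry(word.to_lowercase()).or_insert(0) += 1;
    }
    counts
}
```

Calling `count_words("the cat The dog")` would yield three entries, with "the" counted twice.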

Regarding avoiding let mut, there's nothing specific to your code I can think of, but there are a few general methods to avoid it.

The first is to do the mutation inside a block and return the finished value from it:

let v = {
    let mut v = vec![2, 3, 1];
    v.sort();
    v
};

The second is to rebind as immutable.

let mut v = vec![2, 3, 1];
v.sort();
let v = v;

The third is to use a crate like tap, whose Tap trait is implemented for every type. In tap 1.x, its tap_mut method hands the closure a mutable reference and then returns the value.

let v = vec![2, 3, 1].tap_mut(|v| v.sort());

Next, I have some suggestions for alternatives and personal preferences.

Rather than splitting on whitespace and then sanitizing each word, you could use the regex crate to select all contiguous sequences of letters. This has slightly different semantics, since it would separate e.g. "can't" into two words, but you could also allow ' and - within a word. Scanning with Regex::find_iter lets you specify exactly what constitutes a word.
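To show what "specifying what constitutes a word" means without pulling in a dependency, here is a std-only approximation. It is not the regex approach itself: it assumes a word is a run of ASCII letters, apostrophes, and hyphens (roughly the pattern `[A-Za-z'-]+`), and `words` is a hypothetical helper name.

```rust
/// Yields maximal runs of ASCII letters, apostrophes, and hyphens.
/// `words` is a hypothetical helper, a std-only stand-in for scanning
/// with a pattern such as `[A-Za-z'-]+`.
fn words<'a>(text: &'a str) -> impl Iterator<Item = &'a str> {
    text.split(|c: char| !(c.is_ascii_alphabetic() || c == '\'' || c == '-'))
        // Adjacent separators produce empty pieces; drop them.
        .filter(|piece| !piece.is_empty())
}
```

With this definition "can't" and "well-known" each stay a single word, while digits and punctuation act as separators.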

Instead of or_insert(0), you can use or_default(). Since the default for numbers is 0, this does exactly the same thing, but is a bit cleaner.
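A small sketch of the equivalence, using a hypothetical `tally` helper rather than the reviewed code:

```rust
use std::collections::HashMap;

/// Tallies each item; `tally` is a hypothetical helper for illustration.
fn tally<'a>(words: &[&'a str]) -> HashMap<&'a str, u32> {
    let mut counts: HashMap<&'a str, u32> = HashMap::new();
    for &word in words {
        // `u32::default()` is 0, so this behaves like `.or_insert(0)`.
        *counts.entry(word).or_default() += 1;
    }
    counts
}
```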

When collecting, I prefer to annotate the type of the variable (let x: T) rather than using turbofish notation (::<T>); I believe it looks cleaner.

The way you're sorting is perfectly fine, but I prefer using sort_by_key when possible (and reasonable). In this case, you can just wrap the key in std::cmp::Reverse, a helpful struct that reverses the ordering of its contents.
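Here is a sketch of both preferences together, assuming the counts live in a HashMap<String, usize>. The name `by_frequency` is hypothetical, and the alphabetical tie-break is my addition so the order is deterministic:

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

/// Returns (word, count) pairs, most frequent first.
/// `by_frequency` is a hypothetical name; ties are broken alphabetically
/// here only to make the order deterministic.
fn by_frequency(counts: &HashMap<String, usize>) -> Vec<(&str, usize)> {
    // Annotating the binding instead of writing `.collect::<Vec<_>>()`.
    let mut pairs: Vec<(&str, usize)> = counts
        .iter()
        .map(|(word, &n)| (word.as_str(), n))
        .collect();
    // `Reverse(n)` flips the comparison, giving descending counts.
    pairs.sort_by_key(|&(word, n)| (Reverse(n), word));
    pairs
}
```

The closure pattern `|&(word, n)|` copies the fields out of the borrowed tuple, so the returned key does not borrow from the closure's argument.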

In general, I don't see any unneeded clones/allocations. Since the hashmap is so temporary, I would typically suggest making its key a &str, but you can't do this since you have to make your keys lowercase. However, there's a clever way to get around this. You can wrap your &strs in a helper struct that does case-insensitive hash and eq. Luckily, there's a crate for this called unicase, with structs for unicode-case-insensitive and ascii-case-insensitive String/&str. By doing this, you eliminate the allocation of a String for each word. Note that each word will keep the case of its first occurrence, so you may want to lowercase the words before printing them.
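To illustrate the helper-struct idea without the dependency, here is a std-only sketch that handles ASCII case only. `AsciiCaseless` and `count_caseless` are hypothetical names, and this is not how unicase is implemented; in real code the crate is probably the better choice.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper: compares and hashes a borrowed word while
/// ignoring ASCII case. Not the `unicase` implementation.
#[derive(Debug, Clone, Copy)]
struct AsciiCaseless<'a>(&'a str);

impl PartialEq for AsciiCaseless<'_> {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(other.0)
    }
}

impl Eq for AsciiCaseless<'_> {}

impl Hash for AsciiCaseless<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the lowercased bytes so keys that compare equal hash equally.
        for byte in self.0.bytes() {
            state.write_u8(byte.to_ascii_lowercase());
        }
    }
}

/// Counts words without allocating a `String` per word.
fn count_caseless(text: &str) -> HashMap<AsciiCaseless<'_>, usize> {
    let mut counts: HashMap<AsciiCaseless<'_>, usize> = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(AsciiCaseless(word)).or_default() += 1;
    }
    counts
}
```

Because `entry` keeps the key that was inserted first, the stored slice has the casing of the word's first occurrence, which is the behaviour noted above.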

Thanks for the detailed suggestions! I had tried using sort_by_key(|(_, freq)| freq) at first, but the compile error was a bit inscrutable. Can you explain why this doesn't work with |(_, freq)| but does with |&(_, freq)|, while my original code |(_, a), (_, b)| does work?
– jtbandes, Mar 2 at 3:06

Here's a good answer for that.
– JayDepp, Mar 2 at 6:11