Suspend/resume vs. NetworkManager

The other day while chilling beside the pool on my private island (A), I decided to head into Port Nelson (B) to check up on my various offshore accounts. Financial crisis and all you see; that Stanford thing last week really had me worried. A laptop hibernation and a short helicopter ride later, I’m in the branch office and need to look up a few things pertaining to my net worth. But upon resume, NetworkManager started reconnecting to my villa’s access point, which was all the way back on my island. WTH!!!??!?!

This problem has been around for a long time. Pretty much since the beginning of time. I looked at it last year and concluded that it wasn’t NetworkManager. This time it really annoyed me, so I made a bet with my porter that I’d figure it out by the time I left to hit up this party in Bailey Town. He’s cool like that. I got to keep my money. It still wasn’t NetworkManager.

See, drivers timestamp wifi networks they know about. That way you can figure out if the network was last seen a second ago, 7 seconds ago, or so long ago that it’s dead to me. But they all use a kernel counter called ‘jiffies’ to do that. And ‘jiffies’ doesn’t increment across suspend/resume. See where I’m going with this?

So the next scan after resume, all the old networks are mixed in with the new networks, and you simply can’t tell which ones are old and which ones are new. They all look like they were scanned within the past 10 seconds. The last AP you were connected to looks like a great candidate to try, no matter where it is.
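
To make that concrete, here is a toy model of the problem in Python. It is only a sketch, not driver code: `FrozenClock`, `ScanEntry`, `looks_fresh` and the 10-second `MAX_AGE` window are all made up for illustration, and the counter ticks in whole seconds instead of real jiffies.

```python
from dataclasses import dataclass

MAX_AGE = 10  # hypothetical freshness window, in seconds


class FrozenClock:
    """Toy stand-in for a counter that stops while the machine is suspended."""

    def __init__(self):
        self.ticks = 0

    def run(self, seconds):
        # Machine is awake: the counter advances.
        self.ticks += seconds

    def suspend(self, seconds):
        # Real time passes, but the counter does not move.
        pass


@dataclass
class ScanEntry:
    ssid: str
    last_seen: int  # counter value when the network was last seen


def looks_fresh(entry, clock):
    # The only information available is the counter difference.
    return clock.ticks - entry.last_seen <= MAX_AGE


clock = FrozenClock()
clock.run(100)
villa = ScanEntry("villa", clock.ticks)          # seen just before suspend
clock.suspend(3600)                              # an hour away from the island
clock.run(2)
branch = ScanEntry("branch-office", clock.ticks) # seen in the post-resume scan

fresh = [e.ssid for e in (villa, branch) if looks_fresh(e, clock)]
print(fresh)  # ['villa', 'branch-office']
```

The villa's AP is an hour stale, but by the counter's reckoning it was seen two ticks ago, so it passes the freshness check right alongside the network that is actually in range.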

The solution is to age the scan results with the amount of time spent in suspend. This keeps both normal laptops (where you’ll usually be suspended for a while) and OLPC-style laptops (where suspend can happen for sub-second durations) happy. The patches are queued for 2.6.30, and I’ve backported them to 2.6.27, 2.6.28, and 2.6.29. They are also a prerequisite for making NetworkManager just try harder to associate when the connection fails, which I know annoys a lot of people, including myself.
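
The idea of aging can be sketched in a few lines of Python. This is not what the actual patches look like; it is a minimal illustration under assumed names (`ScanEntry`, `age_scan_results`, `fresh_entries`, a 10-second `MAX_AGE`), with timestamps in seconds rather than jiffies.

```python
from dataclasses import dataclass

MAX_AGE = 10  # hypothetical freshness window, in seconds


@dataclass
class ScanEntry:
    ssid: str
    last_seen: float  # counter value when the network was last seen


def age_scan_results(entries, suspended_for):
    """On resume, push every cached entry into the past by the suspend time."""
    for entry in entries:
        entry.last_seen -= suspended_for


def fresh_entries(entries, now):
    return [e for e in entries if now - e.last_seen <= MAX_AGE]


# Long suspend: the counter sat at 100 for an hour of real time.
cache = [ScanEntry("villa", 100)]
age_scan_results(cache, suspended_for=3600)
now = 102  # two seconds after resume
cache.append(ScanEntry("branch-office", now))
long_suspend = [e.ssid for e in fresh_entries(cache, now)]

# Sub-second suspend: the cached entry should survive.
cache = [ScanEntry("villa", 100)]
age_scan_results(cache, suspended_for=0.5)
short_suspend = [e.ssid for e in fresh_entries(cache, now=101)]

print(long_suspend)   # ['branch-office']
print(short_suspend)  # ['villa']
```

Subtracting the suspend duration, rather than flushing the whole list on resume, is what keeps both cases happy: an hour-long suspend expires everything old, while a half-second nap leaves perfectly good results alone.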

Problem solved, party attended.

The big lesson? When something is wrong with the drivers, fix the drivers. Don’t hack around it like a helpless tool. And if you can’t fix the driver, well… then why did you mindlessly stuff $50 bills into Broadcom’s thong in the first place?