optimization

Kia recommends that I get the oil in my 2009 Rio changed every 7,500 miles. But, anecdotally, it seemed that I always got better gas mileage right after an oil change than I did right before I was due for another one. So, I got to wondering - if an oil change costs $20, but saves me a few MPGs, is it cheaper overall to change my oil sooner than 7,500 miles? If so, where's the optimal point?

So, I got all ready to do some fancy math, and began tracking my gas mileage between two oil changes... And this is what I found:

This is why anecdotal evidence is unreliable.

At least based on this one oil change cycle, my gas mileage theory was 100% imagination. I should stick to oil changes at 7,500 mile intervals.

Of course, this is really a sample of only a single oil change cycle. It started in summer and went into winter, so winter fuel blends came out (which generally have slightly higher MPGs), and I presumably used the A/C less (though my defrost requires my A/C to be on, and I use the defrost a lot). There may be other complicating factors as well. So, I plan to track this all over again and re-visit after my next oil change cycle... because gut feelings die hard.

Still migrating old posts due to travel. Next post will be fresh content!

In a previous post, we learned that if you want to maximize your score on any individual turn of a game of "Pass the Pigs," you should always roll when there's less than 22.5 points in your hand, and hold when there's more than 22.5 points in your hand. (If you've never heard of "Pass the Pigs," the rules are explained in the prior post.)

I wouldn't mind being passed this pig for a few minutes... that's some serious cuteness right there.

However, we also concluded that that's not an effective strategy for winning the game as a whole. If you have a score of 0 and your opponent has a score of 99, for example, it would be really silly to stop rolling at 23 points just because the "22.5 rule" says to. So what's a person to do? How do you play effectively? Today, we'll generate a strategy that can help you make an optimal move in any situation. (Hint: you'll need to do a lot of math.)

Since I'm out of town for a bit, I'm migrating over a few relevant posts from an old blog of mine that I'm planning to shut down. Enjoy!

Pass the Pigs is a simple yet addictive dice game that uses cute little plastic pigs as dice. If you've never played, the rules are very straightforward. On each turn, a player rolls two pigs. The pigs will land in different positions, which will determine how many points the player has in their hand for that turn. The player may then decide to "pass the pigs" to the next player. If they do this, all of the points in their hand will be added to their official score. They may also decide to roll the pigs again to try to add more points to their hand before passing the pigs. But they must be careful! If the pigs both land on their sides with one showing a dot and the other showing a blank side, they "pig out" and lose all of the points they've accumulated in their hand! It's risky business. The first player to accumulate a score of 100 or higher wins.

Credit: Larry Moore

Like anything with dice (even pig-shaped dice), Pass the Pigs is a game of chance. That means, with a little effort, we should be able to figure out the probabilities of certain things happening in the game, and develop some optimal strategies. So how do you win at Pass the Pigs? Read on to find out.

Often, complex problems can be adequately solved by simple rules that provide an acceptable solution, even if they don’t necessarily get you to the optimal point. The cell suppression problem (summarized in last week’s post) is a perfect example of this – using a methodology that would be readily apparent to any human faced with tackling the problem with pen and paper, we can create a computerized solution that can appropriately suppress data sets containing tens of thousands of records disaggregated over dozens of dimensions. This heuristic method will likely suppress more data than it really needs to, but when all is said and done, it will finish the job quickly and without completely mangling your statistics.

Heuristics are kind of like Fermi estimation. Or, more accurately, I needed an image for this post and this was the best thing I could come up with.Image credit: XKCD

We’ll start with an explanation of the basic idea, then move on to implementing it in code.

The “cell suppression problem” is one type of “statistical disclosure control” in which a researcher must hide certain values in tabular reports in order to protect sensitive personal (or otherwise protected) information. For instance, suppose Wayout County, Alaska has only one resident with a PhD – we’ll call her “Jane.” Some economist comes in to do a study of the value of higher education in rural areas, and publishes a list of average salaries disaggregated by county and level of education. Whoops! The average salary for people with PhDs in Wayout County is just Jane’s salary. That researcher has just disclosed Jane’s personal information to the world, and anybody that happens to know her now knows how much money she makes. “Suppressing” or hiding the value of that cell in the report table would have saved a lot of trouble!

No, not that kind of suppression.

Over the next couple weeks, I’ll be blogging about some algorithms used to solve the cell suppression problem, and showing how to implement them in code. For now, we’re going to start with an introduction to the intricacies of the problem.

Last weekend, I decided to build a bed. I looked up some plans online, made some modifications, drew up a list of the lengths and sizes of lumber I needed, and went to the store to buy lumber. That’s when the trouble started. The Lowe’s near me sells most of the wood I needed in 6ft, 8ft, 10ft, and 12ft lengths, with different prices. And I needed a weird mix of cuts – ranging from only 10 or 11 inches up to 5 feet. How was I supposed to know which lengths to buy, or how many boards I needed?

Of course, I could have just planned out my cuts on a sheet of paper, gotten close to something that looked reasonable, and called it a day. But I figured there had to be a better way. Turns out, there is, and there’s a huge body of academic literature on the subject. The problem I was facing was simply an expanded version of the classic “cutting stock problem.” It’s a basic integer linear programming problem that can be solved pretty easily by commercial optimization software. So, I decided to try out some optimizations!