Predictably Unpredictable

Posted by Q Ethan under exploration

Talk to a Chicagoan about their hometown and the topic will quickly move to weather: “You can get all four seasons in the same day.” “Nobody moves here for the weather.” “It’s always changing, so it’s a good conversation piece.” I simply say that Chicago’s weather is predictably unpredictable.

These quips wouldn’t be quite so funny if they weren’t true. Chicago weather is especially a gamble during the changeover seasons of autumn and spring. (The occasional April or October snow says it all.) You never quite know what to wear until the day is halfway over.

Thankfully, that doesn’t stop Chicago from planning outdoor activities. Ask anyone who has run the Chicago Marathon: race-day temperatures are often in question till the morning of.

To celebrate the most recent Chicago Marathon, held just this past Monday, we took a quick peek at race-day temperatures over the years. Here’s what we saw:

No, your eyes do not deceive you: the high temperatures go as low as 40 degrees, while the low temperatures go as high as 70. Be sure to pack mittens and light clothing!

Top Eleven NYC Restaurant Inspection Violations

Posted by Q Ethan under exploration

After our brief knit interlude (“knitterlude?”) we’re back on the restaurant inspection beat.

Restaurant owners have plenty of reasons to want to pass an inspection. For one, if an inspector from the city’s Department of Health and Mental Hygiene (DOHMH) finds enough problems, the restaurant may be subject to a follow-up inspection or even a temporary closing. Two, restaurants will soon have to display letter grades based on their inspection scores. This means potential customers will see not only a pass/fail mark but also how well a restaurant passed, and who wants the reputation of being a C-student eatery?

According to a recent New York Times article, proactive restaurants are bringing in consultants to help them stay inspection-ready. The consultants, some of them former DOHMH inspectors themselves, essentially perform a mock inspection to spot potential health code violations before a DOHMH team drops in. That gives restaurant owners a chance to remedy the problems before they are subject to an official inspection. It’s like getting an advance look at the big midterm exam, except no one will get expelled for it.

We think this is a great idea, as it never hurts to have experienced guidance. We also think it’s helpful to improve one’s understanding of the situation before calling in expert help. To that end, we explored the restaurant data in search of the top eleven most common critical inspection violations. (Why top eleven, instead of top ten? Well, because it’s one more.) Specifically, we reviewed the inspection data from 02 January 2009 through 27 March 2010. This data covered roughly 55,000 inspections across 23,000 restaurants. If a restaurant could tackle these issues on its own, ahead of time, then its inspection consultants could focus on other issues.

Hand washing facility not provided in or near food preparation area and toilet room. Hot and cold running water at adequate pressure not provided at facility. Soap and an acceptable hand-drying device not provided.

Rank 9, code 04N: Evidence of, or live roaches in facility’s food and/or nonfood areas.

Rank 8, code 04A: Food Protection Certificate not held by supervisor of food operations.

Rank 7, code 06E: Sanitized equipment or utensil, including in-use food dispensing utensil, improperly used or stored.

Rank 6, code 04I: Raw, cooked or prepared food is adulterated, contaminated, cross-contaminated and/or not discarded in accordance with HACCP plan.

Rank 5, code 06D: Food contact surface not properly maintained, or not washed, rinsed and sanitized after each use and following any activity when contamination may have occurred.

Rank 4, code 06C: Food not protected from potential source of contamination during storage, preparation, transportation, display or service.
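For readers who want to build a ranking like this themselves, here is a minimal Python sketch of the tally. It assumes each inspection record has already been reduced to a (restaurant ID, violation code) pair; the function name `top_violations` and the sample records are made up for illustration and are not taken from the DOHMH data.

```python
from collections import Counter


def top_violations(records, n=11):
    """Return the n most common violation codes as (code, count) pairs."""
    counts = Counter(code for _restaurant_id, code in records)
    return counts.most_common(n)


# Hypothetical sample records: (restaurant ID, violation code).
sample = [
    ("r1", "06C"), ("r2", "06C"), ("r3", "06C"),
    ("r1", "06D"), ("r4", "06D"),
    ("r5", "04A"),
]

# Print a small ranking, most common code first.
for rank, (code, count) in enumerate(top_violations(sample, n=3), start=1):
    print(rank, code, count)
```

Passing `n=11` against the full data set would give a top-eleven list in the same shape.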

We interrupt our coverage of New York City restaurant inspections to preview an upcoming post.

The knitting warriors are at it again.

You may recall our coverage of Sockwars IV, the knitting assassin game, some time ago. (In case you missed it: Sockwars is a take on the old assassin game, in which you “assassinate” your victim by knitting and mailing them a pair of socks. The organizer describes it as “the largest, bloodiest, extreme knitting tournament in the world.”) Sockwars V is now underway and there has been some serious yarn carnage. While the competition isn’t quite over, we couldn’t help but take a preliminary peek at the body count. Call it morbid curiosity.

Here we see the number of assassinations per week, since the start of the competition:

Sockwars V: kills by week

The bloodletting peaked in the second week and has dropped off since then. Reviewing this by day, we get a slightly different picture:

Sockwars V: kills by date

Here we see that the assassinations come in waves. They’re closer together toward the start of the competition but become more spread out over time. This makes sense. In such a large competition I would expect a lot of people to get picked off early on, then the victors of those early battles are left to slug it out with one another.

This next chart shows just how mean these knitters can be:

Sockwars V: kills by weekday

See that? Friday has been the biggest day for assassinations, by far. We always figured Monday was the week’s designated low point; now these crazy knitters go ruining Friday! Is nothing sacred?

That’s all for now. We’ll bring you more in-depth analysis after the dust clears and one killer knitter has declared victory.

New York City Restaurants: Day by Day

Posted by Q Ethan under exploration

In our previous post, we noted that all 23,000 New York City eating establishments are subject to surprise inspections by the Department of Health and Mental Hygiene (DOHMH). Before, we used the restaurant inspection data to learn a little more about New York itself. This time let’s assume the role of a dodgy restaurant owner: what tricks can we tease out of the data to avoid our due diligence in food handling and kitchen cleanliness?

(Just as a side note, how do the inspections begin? Do the DOHMH inspectors swing open the doors, Wild West-style? Do they pose as everyday patrons, then pull off their coats to reveal gleaming DOHMH badges? Do the inspectors even get badges? If not, Mayor Bloomberg, we’d like a moment of your time…)

We first asked whether any particular time of year was light on inspections. The charts hinted that some months may be more favorable than others; but after some numerical digging we learned that the variations could very well be due to chance. We saw similar results on the week-by-week data.
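We don’t name the test behind that “numerical digging” here. One common way to ask whether month-to-month differences could be chance is a chi-square goodness-of-fit test against equal monthly counts, sketched below. The monthly counts are hypothetical, and the equal-count expectation is a simplification (months differ in length); 19.675 is the standard 5% critical value for 11 degrees of freedom.

```python
def chi_square_uniform(counts):
    """Chi-square statistic for observed counts against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


# Hypothetical inspection counts for twelve months (not the real data).
monthly_counts = [100, 95, 105, 98, 102, 110, 90, 100, 97, 103, 99, 101]

# 5% critical value of the chi-square distribution with 12 - 1 = 11 degrees of freedom.
CRITICAL_VALUE_DF11_5PCT = 19.675

statistic = chi_square_uniform(monthly_counts)
could_be_chance = statistic < CRITICAL_VALUE_DF11_5PCT
print(statistic, could_be_chance)
```

A statistic below the critical value means the monthly variation is not large enough to rule out chance at the 5% level.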

Looking at the number of inspections for each day, we saw something a little different:

This chart shows what could be a pattern of peaks and valleys. Those spots that appear blank? They’re really just small values, close to zero. Closer review reveals that those dips occur on Sundays. What happens if we group the inspection counts by day of the week?

Wednesdays and Thursdays look like pretty good days for DOHMH inspectors to drop in, yes. What really stands out is the Sunday value. There were just 116 total Sunday inspections, compared to thousands of inspections for the other days. Hmm…
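Grouping by day of the week takes only a few lines of Python. The sketch below assumes the inspection dates have already been parsed into `datetime.date` objects; the function name `inspections_by_weekday` and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def inspections_by_weekday(inspection_dates):
    """Count inspections per weekday, keeping days with zero inspections."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    counts = Counter(WEEKDAYS[d.weekday()] for d in inspection_dates)
    return {day: counts.get(day, 0) for day in WEEKDAYS}


# Hypothetical sample: one Monday, two Wednesdays, one Sunday.
sample_dates = [date(2009, 1, 5), date(2009, 1, 7),
                date(2009, 1, 7), date(2009, 1, 11)]

by_weekday = inspections_by_weekday(sample_dates)
print(by_weekday)
```

Keeping the zero-count days in the result matters here, since a quiet day is exactly what we’re looking for.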

Using a slightly different chart, we can get a better idea of the distribution of inspections across each day of the week:

If it’s been a while since your last statistics class, this is a box-and-whisker plot, or simply a box plot. For our purposes it’s a tad more useful than a standard bar plot or histogram. The box plot reflects the same general shape as our bar chart, but it also shows the spreads (the highs and lows) of the data as well as the median values (the lines in the middle of each box). Not only were there few Sunday restaurant inspections, but the number of Sunday inspections also varied little from week to week.
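For the curious, the numbers behind one box can be computed with Python’s standard `statistics` module. This is a sketch only: the weekly Sunday counts are hypothetical, the helper name `box_plot_summary` is ours, and plotting tools differ in how they compute quartiles and draw whiskers (many stop the whiskers short of extreme outliers rather than at the minimum and maximum).

```python
import statistics


def box_plot_summary(values):
    """Return the minimum, quartiles, median, and maximum of a list of counts."""
    # quantiles(n=4) returns the three cut points between the four quarters.
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "max": max(values),
    }


# Hypothetical number of Sunday inspections in five different weeks.
sunday_counts = [1, 2, 2, 3, 2]

summary = box_plot_summary(sunday_counts)
print(summary)
```

A short distance between `q1` and `q3` is what “varied little” looks like in numbers.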

So far we’ve been looking at the totals across the entire city. Will we see a different pattern if we break apart the dataset by borough? For example, we expect certain parts of Manhattan will be very quiet on the weekends. A picture tells the story of how the boroughs stand on their own:

Aside from a dip in Staten Island on Fridays, the pattern is similar across all five boroughs. While we can hardly say this is a definite trend, we may be on to something. So for all you dodgy restaurant owners out there, save that Saturday night kitchen cleanup for Sunday night! Who will notice? Except for the customers, of course …

In all seriousness, we hope that restaurant owners don’t take these findings to heart. Please keep your kitchens in order all seven days of the week. The LocalMaxima crew likes to dine out. A lot. Food-borne illnesses aren’t on our list of take-out favorites.

Finally, the city requests those of us who use the DOHMH data to include the following disclaimer:

“The City of New York can not vouch for the accuracy or completeness of data provided by this web site or application or for the usefulness or integrity of the web site or application. This site provides applications using data that has been modified for use from its original source, NYC.gov, the official web site of the City of New York.”

Have some interesting data you’d like us to check out? Need our help making sense of your company’s data? Please drop us a line. Thanks for reading.

New York City Restaurants: Running the Numbers

Posted by Q Ethan under exploration

A big city is a haven for analysis because there’s so much going on in a relatively small space: people, public transit, real estate, and everything else all represent data points. Since New York is formally divided into boroughs, city data analysis gets an additional categorical dimension: we get to see how the five closely related parts compare to the whole. Whereas neighborhood boundaries can get fuzzy, and zip codes are too narrow, boroughs are distinct subsets of the overall city that have developed their own flavors and, some would argue, odors.

On the topic of flavors and odors, this time around we’re looking at restaurants. The city’s Department of Health and Mental Hygiene (DOHMH) pays surprise visits to restaurants to test how well they manage basics such as food handling and cleanliness. Here, the term “restaurant” is a wide net that includes everything from greasy spoons to fancy white-linen tablecloth numbers to corner coffee shops. If they serve anything to eat or drink, DOHMH will check it out.

DOHMH makes the raw inspection data available via the NYC Data Mine for anyone to review. We recently asked the data what it could tell us about New York City and its residents.

(Our numbers are based on census projections and the last twelve months’ DOHMH inspection data, from September 2008 through August 2009. We hope to give an update once DOHMH releases the rest of the 2009 data.)

The first thing we noticed is just how many restaurants are out there. We counted about 23,000. This is based on the number of unique restaurants in the data set and the (hopefully reasonable!) assumption that every restaurant in the city gets inspected at least once a year. While some places have no doubt closed and new ones have opened, those fluctuations should be mild compared to the totals.

How are these restaurants distributed throughout the city? A pretty picture shows us the counts by borough:

I expect few people will be shocked that Manhattan and Staten Island hold the extremes. That aside, the standalone numbers don’t tell us a whole lot. A restaurant client or owner or patron may be curious to know how these raw counts relate to other information, such as population.

Based on census estimates, about 8.2 million people live in New York City. That means roughly 360 people for every restaurant, and that’s not counting the tourists! Once again, a chart will show us how the boroughs compare:

With the greatest number of restaurants, Manhattan also has the smallest ratio of population to restaurant count. At only 200 people per establishment, I don’t understand why I have so much trouble getting a table. Maybe it’s me?

Whether you want to run a restaurant or just eat in one, you may also want to know about the competition: how many other restaurants are there within a given space? That is, what is the city’s restaurant density?

New York City is about 300 square miles in size, so on the whole we have about 75 restaurants per square mile. The borough breakdown tells a different story:

If you’re hungry and on foot, Manhattan is the place to be! At almost 400 eating establishments per square mile, you’ll practically trip over restaurants. (Of those, there are 8 Starbucks per square mile. That’s a lot of caffeine.) By comparison, the other boroughs may require that you know where to look.
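The citywide ratios quoted above are simple divisions, and they are easy to check. The sketch below uses only the rounded citywide figures from this post (8.2 million people, 23,000 restaurants, 300 square miles); the function names are ours. With those rounded inputs the results land near the figures in the text rather than exactly on them.

```python
def people_per_restaurant(population, restaurants):
    """Average number of residents for each restaurant."""
    return population / restaurants


def restaurants_per_square_mile(restaurants, area_sq_miles):
    """Average number of restaurants in each square mile."""
    return restaurants / area_sq_miles


# Rounded citywide figures quoted in the post.
CITY_POPULATION = 8_200_000
CITY_RESTAURANTS = 23_000
CITY_AREA_SQ_MILES = 300

ratio = people_per_restaurant(CITY_POPULATION, CITY_RESTAURANTS)
density = restaurants_per_square_mile(CITY_RESTAURANTS, CITY_AREA_SQ_MILES)

print(round(ratio))    # 357, which the post rounds to 360
print(round(density))  # 77, in line with the post's "about 75"
```

The same two functions apply to each borough once its own population, restaurant count, and area are plugged in.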

To help compare the boroughs, and to produce the obligatory pretty color chart, here we see how the boroughs break down in terms of their percentages of New York City’s restaurants, area, and population:

In future posts, we’ll explore what the inspection data means for restaurant owners and consumers.


Exploring the Five Boroughs

Posted by Q Ethan under exploration

LocalMaxima has roots in New York City so we’re always on the hunt for information about it. Thanks to the NYC Data Mine, the city’s catalog of public data, we should have plenty of material. (Think of the Data Mine as a local cousin to the Federal government’s Data.gov project.)

In future posts we’ll take some peeks at Data Mine data, both on its own and mixed with other sources. As always, we’ll share our findings with you here on our website.

One topic we won’t cover, though, is the correlation between emergency-response times and graffiti. That’s not because of a lack of interest. Quite the contrary. It was our first idea when we browsed the Data Mine’s catalog, and was slated for today’s post. While hunting around for some additional information on that topic, though, we stumbled onto some folks at NYU who had published their findings. (Kudos, by the by.) So please, give them a read, and come back to us next time for another New York topic.

Calculating Cinematic Displeasure

Posted by Q Ethan under exploration

What’s your reaction to a bad movie? Do you mock it, MST3K-style? Perhaps you storm out of the cinema and attempt a refund? If you’re part of one particularly prickly crew, you post your thoughts on the “Mr. Cranky Rates the Movies!” website. (Warning: the reviews’ language will likely trip workplace internet filters. You were warned.) Providing a fistful of internet vigilante justice, the Mr. Cranky site is home to more than two thousand scathing movie reviews written by just a handful of people.

For fun, we’re taking a look at the Mr. Cranky reviews and we decided to share our initial results with you, our faithful readers.

As always, we began with some simple charts to get a feel for the data set. We first broke down the list of reviews based on the films’ release dates:

Mr Cranky: Ratings by release date

We see here that most of the reviews cover releases from the past fifteen years. There are a few outliers though, including a small crop of films released in the early 1970s. There’s also a curious gap in 2007. Perhaps Team Cranky needs to rest up before they take on more pain?

We removed the outliers on either end and focused on the 1995-2006 region. This still comprises about 96% of the data set (about two thousand movies).
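Trimming to a range of release years and checking how much of the data survives can be sketched in a few lines of Python. The function name `keep_years` and the sample release years are hypothetical; only the 1995-2006 bounds come from the text.

```python
def keep_years(release_years, first_year, last_year):
    """Keep years inside the inclusive range and report the share that was kept."""
    if not release_years:
        raise ValueError("release_years must not be empty")
    kept = [year for year in release_years if first_year <= year <= last_year]
    return kept, len(kept) / len(release_years)


# Hypothetical release years, one per review (not the real data).
sample_years = [1972, 1995, 1999, 2003, 2006, 2008]

kept_years, share_kept = keep_years(sample_years, 1995, 2006)
print(kept_years, share_kept)
```

On the real review list, the second value is the figure to compare against the 96% mentioned above.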

If a movie is bad enough, we may say that it “bombed.” Instead of the standard one- to four-star ratings, Mr. Cranky’s system is based on explosives: from one bomb (hated the least) to four bombs, then dynamite, and finally, the nuke (hated the most).

Charting the review ratings, broken down by year, we see that Mr. Cranky has been judicious in dishing out the pain. Each year’s reviews form something of a bell shape, with most movies taking the middle rating of a three-bomb score (green in the chart below).

Mr Cranky: Ratings by release date

In fact, if we plot these years as lines instead of separate breakdowns, we see plenty of overlap. The familiar bell shape is rather consistent over time:

Mr Cranky: Movies reviewed by release date

Given such consistency in the distribution, one would expect the Mr. Cranky crew to have some strict criteria for how to assign the ratings. The data we have here yield little insight into that process; all we can see is that there’s probably no bias in terms of a film’s release date. Perhaps it’s based on the number of child sidekicks, or car chases, or even child sidekicks chasing cars?

With more data, we would ideally be able to model the Mr. Cranky system and predict a movie’s rating. This makes for a fun party game, yes; but for movie executives such a formula could be useful. We wager Hollywood studios already have a screening process for scripts, but clearly some clunkers still make it through. What if the studios could employ a predictive model to enhance their accept/reject process? A successful model would permit them to divert funds toward sure-fire blockbusters and pull the plug on failures before they go too far.