May 2009 Archives

Google's statisticians routinely use randomized experiments to improve their products (and profit), but did you know they also conduct quasi-experiments when random assignment is not feasible? I receive Amstat News, the American Statistical Association's (ASA) membership magazine. Daryl Pregibon, a Google statistician (or "engineer" as they are called internally), was invited to write about the company's statistical practices in the May issue. He writes that Google users can be randomly assigned to treatment conditions, but

"it is usually not possible to randomly assign advertisers to treatment groups due to contractual obligations and/or their willingness to be 'experimental units' for a service for which they are paying. In such cases, we ... use statistical methods that try to tease out causal inferences. Propensity score matching, inverse propensity weighting, and double robust estimates are some of the methods established in social and biological sciences currently in use at Google when randomization is not possible."

That approach mirrors best practices in quantitative evaluation. Randomized field trials are considered the gold standard for judging the degree to which a program or its components cause a desired outcome; when random assignment is not feasible, quasi-experiments provide a valuable alternative. Evaluation researchers rarely have as much control over conditions as Google's "engineers." Consequently, evaluators must rely more on quasi-experiments to "tease out causal inferences." Another key difference is that no matter how enormous a program data set may seem and no matter how many parameters a client might want an evaluator to estimate, those amounts will never reach the terabytes of data or the millions of parameter estimates that Pregibon describes as commonplace in the life of a Google statistician.
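To make one of the methods Pregibon names more concrete, here is a minimal Python sketch of inverse propensity weighting. It is an illustration under stated assumptions, not a description of what Google does: the data are invented, the single confounder is a discrete stratum, and the propensity score is simply the treated share within each stratum (real applications usually fit a model such as logistic regression). The names `ipw_ate`, `naive_difference`, and `RECORDS` are hypothetical.

```python
from collections import defaultdict


def naive_difference(records):
    """Unadjusted difference in mean outcomes, treated minus control."""
    treated = [y for _, t, y in records if t]
    control = [y for _, t, y in records if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_ate(records):
    """Estimate the average treatment effect by inverse propensity weighting.

    records: list of (stratum, treated, outcome) with treated in {0, 1}.
    Assumption: the propensity score is estimated as the treated share
    within each stratum, which only suits a few discrete strata.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_units, n_treated]
    for stratum, treated, _ in records:
        counts[stratum][0] += 1
        counts[stratum][1] += treated
    propensity = {s: n_treated / n for s, (n, n_treated) in counts.items()}

    # Normalized (Hajek) estimator: a weighted mean outcome for each arm.
    num_t = den_t = num_c = den_c = 0.0
    for stratum, treated, outcome in records:
        p = propensity[stratum]
        if not 0.0 < p < 1.0:
            raise ValueError(f"no treated/control overlap in stratum {stratum!r}")
        if treated:
            weight = 1.0 / p
            num_t += weight * outcome
            den_t += weight
        else:
            weight = 1.0 / (1.0 - p)
            num_c += weight * outcome
            den_c += weight
    return num_t / den_t - num_c / den_c


# Invented data: "large" units are more likely to be treated AND have
# higher outcomes, so the naive comparison is confounded. Within each
# stratum the treated outcome is exactly 2 points higher.
RECORDS = [
    ("small", 1, 12), ("small", 0, 10), ("small", 0, 10), ("small", 0, 10),
    ("large", 1, 22), ("large", 1, 22), ("large", 1, 22), ("large", 0, 20),
]

naive = naive_difference(RECORDS)   # about 7.0, inflated by confounding
adjusted = ipw_ate(RECORDS)         # about 2.0, the within-stratum effect
```

Weighting each unit by the inverse of the probability of the condition it actually received rebalances the two arms so that each resembles the full sample. The adjustment works only if every relevant confounder is captured in the propensity score, which is an untestable assumption in real quasi-experiments.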

By the way, my master's paper involved applying inverse propensity weighting to account for self-selection into a local public school district. Does that mean a career as a Google statistician is in my future?

With spring semester in my rearview mirror, I found some time to use the maptools package for R to make a proficiency map that can be displayed in Google Earth. It's essentially a choropleth map in Keyhole Markup Language (KML) format. Google Earth takes visualizing educational outcomes to a whole new level. Distributing proficiency maps in KML format would make it easy for parents, school district employees, policy makers, and students themselves to explore their district's test scores and those of nearby districts. Additionally, KML proficiency maps could help evaluators of educational programs involve stakeholders and frame questions.
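For readers curious about what a KML choropleth looks like under the hood, here is a rough Python sketch that writes one using only the standard library. It is not the maptools workflow used for the map in this post; it only illustrates the file format. The district names, proficiency shares, and rectangular boundaries are made up, and `kml_color`, `build_kml`, and `DISTRICTS` are hypothetical names.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def kml_color(share, alpha=0xCC):
    """Map a proficiency share in [0, 1] to a KML color string.

    KML colors are hex in aabbggrr order (alpha, blue, green, red),
    so low shares come out red and high shares come out green.
    """
    share = min(max(share, 0.0), 1.0)
    red = round(255 * (1.0 - share))
    green = round(255 * share)
    return f"{alpha:02x}00{green:02x}{red:02x}"


def build_kml(title, districts):
    """Return a KML document with one shaded polygon per district."""
    kml = ET.Element("kml", xmlns=KML_NS)
    folder = ET.SubElement(kml, "Folder")
    ET.SubElement(folder, "name").text = title
    for district in districts:
        placemark = ET.SubElement(folder, "Placemark")
        ET.SubElement(placemark, "name").text = district["name"]
        # The description is what appears in the pop-up balloon.
        ET.SubElement(placemark, "description").text = (
            f"{district['proficient']:.0%} proficient"
        )
        style = ET.SubElement(placemark, "Style")
        poly_style = ET.SubElement(style, "PolyStyle")
        ET.SubElement(poly_style, "color").text = kml_color(district["proficient"])
        polygon = ET.SubElement(placemark, "Polygon")
        boundary = ET.SubElement(polygon, "outerBoundaryIs")
        ring = ET.SubElement(boundary, "LinearRing")
        # KML coordinates are lon,lat,alt tuples; the ring must end
        # on the same point it starts with.
        ET.SubElement(ring, "coordinates").text = " ".join(
            f"{lon},{lat},0" for lon, lat in district["boundary"]
        )
    header = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return header + ET.tostring(kml, encoding="unicode")


# Made-up districts with rectangular boundaries, for illustration only.
DISTRICTS = [
    {"name": "District A", "proficient": 0.62,
     "boundary": [(-93.6, 45.0), (-93.4, 45.0), (-93.4, 45.1),
                  (-93.6, 45.1), (-93.6, 45.0)]},
    {"name": "District B", "proficient": 0.81,
     "boundary": [(-93.4, 45.0), (-93.2, 45.0), (-93.2, 45.1),
                  (-93.4, 45.1), (-93.4, 45.0)]},
]

kml_text = build_kml("Example proficiency map", DISTRICTS)
```

Saving `kml_text` to a file with a `.kml` extension should give something Google Earth can open, with each placemark listed in the Places sidebar. Real district boundaries would come from a shapefile rather than hand-typed rectangles.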

Try it for yourself. You can click on the image below to explore Minnesota's 3rd grade math proficiency results with Google Earth. After you've gotten your fill of zooming around the map, try the following:

Click on a district to activate a pop-up window containing the district's name and results.

Click on the "+" sign next to "2007 Minnesota Comprehensive Assessments (MCA-II)" under the Places sidebar. The district results will expand downward, showing results in tabular format with map links. Double-click a district name in the sidebar to zoom to that location.

Use the Search -> Find Business section of the sidebar to find an after school tutoring program in a district of your choice.