In an otherwise pointless comment thread the other day, Dan Lakeland contributed the following gem: A p-value is the probability of seeing data as extreme or more extreme than the result, under the assumption that the result was produced by a specific random number generator (called the null hypothesis). I could care less about p-values […] The post “Null hypothesis” = “A specific random number generator” appeared first on Statistical…
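The definition quoted above can be made concrete with a small simulation. This is a minimal sketch, not anything taken from the original post: it assumes the "specific random number generator" is a fair coin, takes the number of heads as the result, and measures "as extreme or more extreme" by distance from the expected count. The function name `simulated_p_value` and all parameters are hypothetical.

```python
import random


def simulated_p_value(observed_heads, n_flips, n_sims=20_000, seed=0):
    """Estimate a two-sided p-value by running the null 'random number generator'.

    Assumption (not from the post): the null hypothesis is a fair coin, and
    'extreme' means far from the expected number of heads, n_flips / 2.
    """
    rng = random.Random(seed)
    expected = n_flips / 2
    observed_dev = abs(observed_heads - expected)

    extreme = 0
    for _ in range(n_sims):
        # One draw from the null generator: n_flips fair coin flips.
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        # Count simulated results at least as extreme as the observed one.
        if abs(heads - expected) >= observed_dev:
            extreme += 1
    return extreme / n_sims


if __name__ == "__main__":
    # 60 heads in 100 flips, compared against the fair-coin generator.
    print(simulated_p_value(60, 100))
```

The estimate is just the fraction of simulated data sets that land at least as far from expectation as the observed one, which is the quoted definition read literally.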

Attention conservation notice: An academic promoting his own talk. Even if you can get past that, only of interest if you (1) care about statistical methods for comparing network data sets, and (2) will be in Seattle on Friday. Since the coin came ...

Here's to the NBER's ongoing Conference on Research in Income and Wealth (CRIW), unsung hero, home of down-and-dirty measurement mavens since 1935. Yes, since 1935! Check out Chuck Holten's fascinating CRIW description in the NBER ...

For a few years now I have given a guest lecture on time series analysis in our School’s Environmental Epidemiology course. The basic thrust of this lecture is that you should generally ignore what you read about time series modeling, either in paper...

Over the years I've posted a number of times about various aspects of using dummy variables in regression models. You can use the "Search" window in the right sidebar of this page if you want to take a look at those posts. One of my earlier working papers o...

What DataUsa is doing could be – I guess – the next step in the evolution of Open Government Data websites. It’s the step from offering file downloads to presenting data (and not files) interactively. And it’s a kind of presentation many official statistical websites would surely be proud of. César A. Hidalgo from MIT discusses … Continue reading Next Step in OGD Websites

Yesterday, in the context of a post about news media puffery of the latest three-headed monstrosity to come out of PPNAS, I promised you a solution. I wrote: OK, fine, you might say. But what’s a reporter to do? They can’t always call Andrew Gelman at Columbia University for a quote, and they typically won’t […] The post A template for future news stories about scientific breakthroughs appeared first on…

In a previous post I showed how to download, install, and use packages in SAS/IML 14.1. SAS/IML packages incorporate source files, documentation, data sets, and sample programs into a ZIP file. The PACKAGE statement enables you to install, uninstall, and manage packages. You can load functions and data into your […] The post Create a package in SAS/IML appeared first on The DO Loop.

Attention conservation notice: I have no taste. Guido W. Imbens and Donald B. Rubin, Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction While I found less to disagree with about the over-all approach than I anticipated...

My fifth Ph.D. student is defending his thesis towards the end of the month: Lawrence Wang, Network Comparisons Using Sample Splitting Abstract: Many scientific questions about networks are actually network comparison problems: Could two networks ha...

Attention conservation notice: Only of interest if you (1) care about running large simulations which are actually good for something, and (2) will be in Pittsburgh on Tuesday. Kary Myers, "Partitioning a Large Simulation as It Runs" (Technometrics ...

In the comment thread to today’s post on journalists who take PPNAS papers at face value, Mark asked, in response to various flaws pointed out in one of these papers: How can the authors (and the reviewers and the editor) not be aware of something so elementary? My reply: Regarding the authors, see here. Statistics […] The post PPNAS: How does it happen? And happen? And happen? And happen? appeared…

This is Hilary’s and my last New York-Baltimore episode! In future episodes, Hilary will be broadcasting from California. In this episode we discuss collaboration tools and workflow management for data science projects. To date, I have not found a pr...

Pretty regularly – usually in the middle of one of those interminable fixed-vs-random effects discussions – someone will pipe up that “Of course, for Bayesians this random vs fixed effect distinction makes no sense because all parameters are random”. To the extent it can be made to make sense, the claim is false. It’s also […]

Here's a question that appeared recently on the Reddit statistics forum: If effect sizes of coefficient are really small, can you interpret as no relationship? Coefficients are very significant, which is expected with my large dataset. But coeffic...
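The situation in the question, a coefficient that is highly significant yet very small, is easy to reproduce. The sketch below is an illustration under stated assumptions, not the original poster's data or the blog author's answer: it simulates a simple linear regression with a hypothetical true slope of 0.02, unit-variance noise, and 400,000 observations, and uses a normal approximation for the p-value (reasonable at this sample size). The function name `tiny_effect_demo` is made up for this example.

```python
import math
import random


def tiny_effect_demo(n=400_000, true_slope=0.02, seed=1):
    """Fit y = a + b*x by least squares on simulated data.

    Returns (estimated slope, two-sided p-value, R^2). All settings are
    assumptions chosen for illustration only.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0.0, 1.0) for x in xs]

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    rss = syy - slope * sxy                 # residual sum of squares
    se = math.sqrt(rss / (n - 2) / sxx)     # standard error of the slope
    z = slope / se
    # Normal approximation to the t distribution (fine for large n).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    r_squared = 1 - rss / syy
    return slope, p_value, r_squared


if __name__ == "__main__":
    slope, p_value, r_squared = tiny_effect_demo()
    print(f"slope={slope:.4f}  p={p_value:.2e}  R^2={r_squared:.5f}")
```

With this much data the standard error shrinks enough that even a slope explaining a tiny share of the variance is clearly distinguishable from zero, so significance and practical size have to be judged separately.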

Journalists are suckers. Marks. Vics. Boobs. Rubes. You get the picture. Where are the classically street-trained reporters, the descendants of Ring Lardner and Joe Liebling, the hard-bitten journos who would laugh in the face of a press release? Today, nowhere in evidence. I’m speaking, of course, about the reaction in the press to the latest […] The post Journalists are suckers for anything that looks like science. And selection bias…

One of the most common tasks in statistical computing is computation of sample variance. This would seem to be straightforward; there are a number of algebraically equivalent ways of representing the sum of squares \(S\), such as \[ S = \sum_{k=1}^n (...
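The excerpt is cut off before it lists the formulas, so the two forms below are assumptions chosen for illustration rather than the post's own: a one-pass "sum of squares minus squared sum" form and a two-pass "sum of squared deviations from the mean" form. They are algebraically equivalent, yet they can behave very differently in floating point when the data have a large mean relative to their spread. The function names are hypothetical.

```python
def sum_squares_one_pass(xs):
    """S = sum(x^2) - (sum(x))^2 / n, accumulated in a single pass.

    Prone to catastrophic cancellation when the mean is large.
    """
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x
    return total_sq - total * total / n


def sum_squares_two_pass(xs):
    """S = sum((x - mean)^2), computing the mean first and then deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs)


def sample_variance(xs, sum_squares=sum_squares_two_pass):
    """Sample variance S / (n - 1) using the chosen sum-of-squares routine."""
    return sum_squares(xs) / (len(xs) - 1)


if __name__ == "__main__":
    # Deviations from the mean are -6, -3, 3, 6, so the exact S is 90
    # and the exact sample variance is 30.
    data = [1e9 + d for d in (4.0, 7.0, 13.0, 16.0)]
    print("two-pass:", sample_variance(data, sum_squares_two_pass))
    print("one-pass:", sample_variance(data, sum_squares_one_pass))
```

On this data the squares are near 1e18, beyond the range where doubles can represent every integer, so the one-pass form loses the digits that carry the answer while the two-pass form works with small, exactly representable deviations.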

To busy readers: Skip to the tl;dr summary at the end of this post. A psychology researcher sent me an email with subject line, “There’s a hell of a paper coming out in PPNAS today.” He sent me a copy of the paper, “Physical and situational inequality on airplanes predicts air rage,” by Katherine DeCelles […] The post Ahhhh, PPNAS! appeared first on Statistical Modeling, Causal Inference, and Social Science.

We are delighted to announce that the programme for the 4th R in Insurance conference at Cass Business School in London, 11 July 2016, has been finalised. Register by the end of May to get the early bird booking fee. The organisers gratefully acknowled...

We’ve turned the understanding of charts into formulas instead of encouraging people to think and ask questions. That doesn’t produce better charts, it just gives people ways of feeling superior by parroting something about chart junk or 3D being bad. There is little to no research to back these things up. The Trivapro 3D Bar Chart This 3D … Continue reading 3D Bar Charts Considered Not That Harmful

Some of the discussion of yesterday’s post reminded me of a wonderful bit from Life on the Mississippi: When I was a boy, there was but one permanent ambition among my comrades in our village on the west bank of the Mississippi River. That was, to be a steamboatman. We had transient ambitions of other […] The post Macassar appeared first on Statistical Modeling, Causal Inference, and Social Science.

At first glance, this Wall Street Journal chart seems unlikely to impress as it breaks a number of "rules of thumb" frequently espoused by dataviz experts. The inconsistency of mixing a line chart and a dot plot. The overplotting of...