Simple statistics: counting blank results

When this posts I’ll be on my way to a week long conference, and while I knew this conference was coming I always forget how much time it takes beforehand to prepare, and afterwards to catch up. So while I’m gone I’ll be posting a few things that I started some time back, but which somehow never got posted.

As I’ve noted previously, sometimes a cell will have a formula, but that formula will return a blank result. The simplest of these is =””. Its result is everything that’s between the quotation marks, which is to say: nothing.

Sometimes it can be very useful to find out how many of these there are in a given range. If you were using an if() statement to put things into categories you might want to make sure that everything gets put into a category. For example, suppose you were a teacher of a number of different classes, with different numbers of students, and had set up the following set of rules.
– All grades > 90 are A
– All grades < 90 and >80 are B
– All grades < 80 and >70 are C
– All grades < 70 and >60 are D
– All grades < 60 are F
– Everything else is blank

It looks like a reasonable set of rules, but any grade which is exactly 90 wouldn’t match either of the first two rules, so it would be blank. That’s fine if you meant it to be blank, but not so good if you didn’t. So what you want to know is not how many cells are blank, but how many are not empty but have blank results. If the number is zero, you’re good, but if the number is greater than zero you need to check your rules.

If you only had one class, you could just set the range to be the same number of rows as you have students. With multiple classes, though, if you want to re-use the formula then it’s best to set the range to the largest possible number of students — or simply to the entire column. That means you have to filter out the number of blank lines, in order to find the actual blank results.

In Excel, you can use the fact that blank results are counted by both countblank() and counta() to figure out how many cells are in both categories. =counta(A:A)+countblank(A:A)-rows(A:A) would return the number of formulas in column A returning blank results, because the cells containing empty results would be counted twice – once as blank, and once as non-blank. If you knew there were 30 students you could also just use =countblank(A2:A31), but the more general formula can be anywhere, and it will still return the correct number no matter how many students there are, or whether there’s a column header, or anything else. The general formula will, however, give an inaccurate result if the formula for assigning letter grades extends further than the list of student names.

Google Docs Spreadsheet also treats blank results as blanks, so if you knew there were 30 students and a column header you could use =countblank(A2:A31). But if you wanted a broader version that would work without knowing the number of students you couldn’t use the same formula as in Excel. Google Docs Spreadsheet treats cells as either blank or non-blank, but never both, so you can’t just count the overlap. However, a broader version is still possible. To create one you need a different column that’s guaranteed to have a non-blank entry in every row, to create a fake overlap. In this example it can be pretty much assumed that every student will have a name, so you could do something like: =counta(Names)-counta(Grades)
Essentially this is just finding the blanks by process of elimination. Count the names, and count the grades, and the difference between them is the number of names that don’t have grades, or, in other words, the number of names that have blanks instead of grades.

In Open Office Calc, however, there’s no way to count blank results using a combination of counta() and countblank(), but you can do something much simpler. You can use =countif(A:A;””). I talked about the countif() function here. Although this is a much simpler formula I didn’t suggest it for the others because it won’t work for them. In both Google Docs and Excel it will count all of the empty cells as well as the blank results, same as countblank().