OpenOffice.org: The Limits of Readability and Grammar Extensions

As a professional writer, my software needs are simple. Give me a text editor -- preferably Bluefish, but vim or OpenOffice.org Writer will do -- and I have all I need.

However, judging by the number of aids available for writers, I am obviously in the minority. Novel-plotting databases, daily word counters, character generators -- if you can imagine the software, you can probably find at least one example. I am fascinated by all the ingenuity, but most of the time I conclude that, if you know enough to use any of these tools without them leading you into greater difficulties, you can do without them. The OpenOffice.org extensions Readability Report and LanguageTool are two applications that illustrate my point perfectly.

The readability tests that Readability Report includes all differ in details, but all of them look at such characteristics as the number of words per sentence and the number of complex and multi-syllable words to produce an approximate level of schooling the audience would need to understand a passage. In addition, Readability Report includes its own Weirdness Metric, which gives an average sentence score, as well as reporting on the least and most readable sentences in the passage.
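As a rough illustration of how a test of this kind turns counts into a grade level, here is a minimal Python sketch of the Flesch-Kincaid grade-level formula. The article does not say which formulas Readability Report uses, so the choice of formula is an assumption, and the function names and the vowel-group syllable counter are hypothetical stand-ins, not the extension's code.

```python
import re


def count_syllables(word):
    """Crude estimate: one syllable per run of vowels (an assumption,
    not how any particular tool counts syllables)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Approximate U.S. school grade needed to read `text`."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    # Published Flesch-Kincaid grade-level coefficients.
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


simple = flesch_kincaid_grade("The cat sat. The dog ran.")
dense = flesch_kincaid_grade(
    "Mounting filesystems requires administrative privileges."
)
print(round(simple, 2), round(dense, 2))
```

The only inputs are word, sentence, and syllable counts. Nothing in the calculation knows what the words mean or who is reading them.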

All these tests are available from a top-level menu, either in a brief report for the entire document, or a detailed report that gives test results for each paragraph. The reports open in separate documents, so that you can save or print them.

Aside from the fact that Readability Report is too minor a feature to place in its own top-level menu, nothing is wrong with the extension itself. But the tests it includes are all severely limited tools, suffering from an overly mechanical view of readability.

For one thing, none of the things the tests measure are, in themselves, strong indications of readability. A sentence that is three or four lines long can be highly readable if it is properly punctuated and makes use of basic rhetorical tricks such as parallelism. Similarly, the readability of words depends less on the number of syllables than on how common they are; "impossibility," for instance, should not be lumped in with "duodecahedron" and is more widely understood than a short word like "gormless."
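The point about syllables can be seen in a small Python sketch. It assumes a test that marks any word of three or more syllables as "complex," a cutoff borrowed from the Gunning fog index; whether Readability Report uses that cutoff is an assumption, and the helper names and the vowel-group syllable estimate are hypothetical.

```python
import re


def count_syllables(word):
    """Crude estimate: one syllable per run of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def is_complex(word, threshold=3):
    """Flag a word purely by its estimated syllable count."""
    return count_syllables(word) >= threshold


words = ["impossibility", "duodecahedron", "gormless"]
flags = {w: is_complex(w) for w in words}

for w in words:
    print(w, count_syllables(w), flags[w])
# impossibility 6 True
# duodecahedron 5 True
# gormless 2 False
```

A syllable-based rule puts "impossibility" in the same bin as "duodecahedron" and lets "gormless" pass as easy, because it has no notion of how common a word is.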

Moreover, the tests assume that the lower the score, the more readable a document is. In practice, however, readability depends heavily on context and audience. Write about mounting drives, devices, or filesystems for a general audience, and you risk being incomprehensible. Use the same terms with an audience of free software users, and they will probably be understood by everyone.

Because of such limitations in the tests themselves, you could use Readability Report to pare down your word choice and sentence length until it was theoretically readable by a third grader and still not write adequately. Writing is a craft, not an art, so in the end such tests mean very little.

About the most you can say is that Readability Report's detailed analysis can tell you how your readability varies from paragraph to paragraph. I don't know about anyone else, but that seems too minor a benefit when you can get much the same results by constantly reminding yourself to write simply and clearly.

LanguageTool

A grammar checker is one of the most requested features in OpenOffice.org, so LanguageTool's attempt to provide one makes it a popular extension. However, LanguageTool is not only incomplete in itself, but also fails to overcome the obstacle that makes every grammar checker I've ever seen inadequate -- the fact that in English, with its weak declensions and conjugations, knowing what part of speech a particular word might be is next to impossible except in context. And, unfortunately, LanguageTool is as blind to most context as Readability Report.

LanguageTool installs a sub-menu in the language section of the Tools menu. It runs with the spell-checker, underlining offending elements in blue if you have automatic checking turned on, or separately from its own sub-menu.

In the sub-menu, you can choose Configuration to see a list of the offenses that LanguageTool watches for. Many of the list items are not grammatical at all, so much as stylistic, such as starting a sentence with a capital letter or avoiding slang. Other items are common typos that could go into AutoCorrect. Yet, even here, LanguageTool is largely style deaf. It ignores, for example, the possibility that you might want to use redundant phrases like "the reason why" or "each and every one" for emphasis -- a widespread and perfectly acceptable habit in a language like English that has Germanic roots.

However, even within pure grammar, LanguageTool is weak. It catches subject-verb agreement only in specified cases, and, when it catches instances of using the wrong form of the verb "to be," it suggests that you use "be" as an alternative, leading you into an error. Its checks for pronoun reference and agreement in number are similarly inconsistent, while other problems, such as faulty parallelism, are not mentioned at all. These lapses make LanguageTool, in its own way, as unreliable a feature as Readability Report.

Tempting, but not there yet

You may think that I am being too hard on these extensions. After all, readability and grammar are complex matters, and programming for them is difficult. Since spell-checkers are not infallible either, why should I be so negative about the effort to provide features that many readers want?

The answer is simple. A spell-checker does serve to catch the more obvious typos, and its limitations are well known. Most people who have spent any time around computers now know that after you run a spell-checker, you need to do additional proofreading.

By contrast, tools like Readability Report and LanguageTool present their findings with an air of objectivity. Users are likely to reason that if a readability test tells them that their document is clear, or a grammar checker flags an error, the software must be right. Add a precise figure, the way that the readability tests do, and you can easily be seduced by the false sense of precision. The temptation to believe such things must be especially strong if English is not your first language or if you lack confidence in your writing ability.

However, any rule-based effort to improve your writing is going to be wrong a significant part of the time. Readability and grammar tools can be refined, but only to a limited extent. Both have been available in office suites for over twenty years, and neither is anywhere near as reliable as an experienced editor. By now, it seems likely that they never will be until we develop human-level artificial intelligence.

The idea of a shortcut is tempting, which is why such tools are so popular. Sadly, though, none of them is a substitute for skill and personal knowledge -- and certainly not Readability Report or LanguageTool.

As someone already noted, LanguageTool is not even called a "grammar checker". It's an automated proofing tool.

Moreover, the author seems to be completely ignorant of the 15 other languages supported by LanguageTool. Some of them have many more rules, so the tool is more reliable for Polish or German.

The article tries to follow all these boring criticisms of grammar checkers or general proofing tools that fill the web. Yes, these tools don't play chess or understand your text. But they can tell you that you're confusing "it's" and "its".

I typically go over my writing myself. No software can replace walking away and coming back to read it... but there are times when having the tools can be useful. Making better tools only seems sensible.

There is the potential for a 'style checker' that could work for writers. It would learn *from* the writer. But you'd need to really tax a CPU to do it... or maybe not? Something to think on.

Not quite; LanguageTool's website says: "You can think of LanguageTool as a tool to detect errors that a simple spell checker cannot detect, e.g. mixing up there/their, no/now etc. It can also detect some grammar."

"The temptation to believe such things must be especially strong if English is not your first language"

Actually, LanguageTool is the only proofreading tool I've ever seen that contains features specifically designed for second language speakers. If I had selected German as my native language, LanguageTool would have questioned my use of 'actually' in the previous sentence.

"It ignores, for example, the possibility that you might want to use redundant phrases like "the reason why" or "each and every one" for emphasis"

Not quite; as you noted, it has a menu of options. You can switch off checks for redundant phrases.

Being a writer myself, a poet actually, and experimenting with language, especially French, I sure don't want any function or program getting in my way, so I disabled every "checker" on my OpenOffice (and any other writing program I use for that matter): spelling, grammar, capital letter at the beginning of a sentence, you name it, they're all disabled. OpenOffice down to the bare bones. I have dictionaries and grammar books to do the checking with. In English and in French.
So, basically, I don't rely on any of these functions to proofread my writing. I do it myself.

This is a problem that begins in our schools. Kids don't write anymore. It seems like everything is now taught with some automated, digital "make it easy" tool. I've run into high school seniors from schools ranked as top-level in the nation, and they can't even do basic arithmetic (add, subtract, multiply, divide) on paper. They're not taught to, but boy, they know that calculator forwards and backwards. A student actually told me that they're not taught this anymore.

This problem extends to writing as well. These same kids, even the really brilliant ones, can't write something in longhand on paper; they always run straight for the computer, even for the smallest things. Too many would rather text in shorthand than actually talk to people and express their thoughts in a coherent manner. When on a computer doing writing, they cannot function without their spelling or grammar checkers.

I was taught the "old fashioned" way, without digitization, and as a result, I am a very good writer, be it on a computer or on paper. It also means I can 5p3l pr0p3rl33, ppl. :-) That doesn't make me "more intelligent" than other folks. It simply means I was taught how to write. That's what we need to be doing in our schools.

Technology's great, and I love it. But Bruce is right, in that no amount of technology can replace the human brain.