Software for recall petition database needs human assistance

Madison - In their effort to review 1.9 million recall signatures, state election officials are embarking on a project unlike any they have done before, relying on newly purchased software that can convert handwritten names into entries in six searchable databases.

Experts say that the type of software the state is using can produce databases in a short time, but that officials must be ready to address numerous errors because computers sometimes misread handwritten letters.

"Handwriting recognition software is not great," said Daniel Lopresti, a computer science professor at Lehigh University in Pennsylvania. "A lot of the names are going to have errors in them."

Democrats submitted signatures to recall Gov. Scott Walker, Lt. Gov. Rebecca Kleefisch and four Republican state senators Tuesday. Under state law, the Government Accountability Board, which runs elections, has 31 days to determine whether enough valid signatures were filed to force recall elections, but asked a judge Friday for more time.

The state is spending about $100,000 to buy Artsyl Technologies' docAlpha software and get technical advice. State workers are electronically scanning the petitions so the software can read the printed names on them and convert them into typewritten characters.

The computer determines what each name is and then shows it to a computer operator alongside an image of how it appeared on the petition. The operator will fix any errors before allowing the name to be entered into the database, said Jeff Moore, chief sales officer for Artsyl.

"There's a visual verification of everything," he said. "Someone's going to be looking to say, 'That's John Doe, not Jean Doe.' "

In the past, the accountability board has left it to those facing recalls to find problem signatures. But Waukesha County Circuit Judge J. Mac Davis recently sided with Walker's campaign in a lawsuit and said the board must do more to find duplicate and fictitious names.

Board officials determined the only way they could find duplicate names was to create databases.

The petitions include signatures, printed names, addresses and dates. Moore declined to say how much of that information would be put in the databases.

Despite the likely delays, the incumbents will have difficulty stopping the recall elections because of the sheer number of signatures turned in. For instance, more than a million signatures were submitted for Walker, almost double the 540,208 needed.

Rep. Mark Pocan (D-Madison), a past critic of other computer problems at the accountability board, said he was concerned that the current plans to check signatures could go over budget and beyond deadlines.

"I have great concern that GAB has an overly burdensome process that is going to go way beyond what's needed," Pocan said.

Errors likely

Lopresti, the computer science professor, said software similar to what the state is using has an accuracy rate of 90% to 95% per character. When that rate is extended across all the letters of a name, there is a good chance for error, he said.
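To see how a per-character rate compounds over a whole name, here is a rough sketch. It assumes each character is misread independently of the others, a simplification that Lopresti did not state, and the function name is purely illustrative. Under that assumption, a 10-letter name would come through without a single error only about 35% of the time at 90% accuracy, and about 60% of the time at 95%.

```python
def name_error_probability(char_accuracy: float, name_length: int) -> float:
    """Chance that at least one character in a name is misread.

    Assumes every character is read independently with the same
    accuracy, which is a simplification for illustration only.
    """
    all_correct = char_accuracy ** name_length
    return 1.0 - all_correct


if __name__ == "__main__":
    # Accuracy range quoted in the article; name lengths are arbitrary examples.
    for accuracy in (0.90, 0.95):
        for length in (5, 10, 15):
            p_error = name_error_probability(accuracy, length)
            print(f"accuracy {accuracy:.0%}, {length} letters: "
                  f"{p_error:.0%} chance of at least one error")
```

The longer the name, the more likely the software gets at least one letter wrong, which is why a human check of each entry matters.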

He stressed the importance of having humans review everything that is entered, but noted that doing so was time-consuming.

"You might as well just type it by hand," rather than have a computer read it, he said.

He and other experts noted that the database would not detect if someone had signed more than once using different names. It also would likely miss instances of someone signing twice using variations of a name, such as "Tom" and "Thomas."
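The "Tom" and "Thomas" problem can be illustrated with a small sketch. Everything here is hypothetical: the article does not say how the state's databases match entries or which fields they hold, so the use of an address, the sample entries and the tiny nickname table are all assumptions made for illustration. The sketch compares an exact-match check with one that first maps common nicknames to a full name.

```python
from collections import defaultdict

# Hypothetical nickname table; a real one would need to be far larger.
NICKNAMES = {"tom": "thomas", "bill": "william", "bob": "robert"}


def exact_key(first, last, address):
    """Match only when name and address are identical, ignoring case."""
    return (first.strip().lower(), last.strip().lower(), address.strip().lower())


def nickname_key(first, last, address):
    """Like exact_key, but map known nicknames to a full first name."""
    first_norm = first.strip().lower()
    first_norm = NICKNAMES.get(first_norm, first_norm)
    return (first_norm, last.strip().lower(), address.strip().lower())


def find_duplicates(entries, key_func):
    """Group entries by key and return the groups with more than one entry."""
    groups = defaultdict(list)
    for entry in entries:
        groups[key_func(*entry)].append(entry)
    return [group for group in groups.values() if len(group) > 1]


# Made-up sample entries: (first name, last name, address).
entries = [
    ("Thomas", "Smith", "12 Oak St"),
    ("Tom", "Smith", "12 Oak St"),
    ("Jean", "Doe", "4 Elm Ave"),
    ("Jean", "Doe", "4 Elm Ave"),
]

if __name__ == "__main__":
    print("exact match:", find_duplicates(entries, exact_key))
    print("with nicknames:", find_duplicates(entries, nickname_key))
```

The exact-match check flags only the repeated "Jean Doe" entry, while the nickname-aware check also pairs "Tom" with "Thomas." Neither approach would catch someone who signed twice under entirely different names.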

"You have to acknowledge the fact that you're not going to find (some) duplicates," he said.

Mike Tate, chairman of the state Democratic Party, said volunteers eliminated duplicates they found before turning in the petitions. He acknowledged the state would likely identify some invalid signatures, but said Democrats had turned in such an overwhelming number of signatures that there was no way to stop the recall election.

Computers will misread many of the names because the information is handwritten, rather than typed, said Richard Fateman, a computer science professor emeritus at the University of California at Berkeley. Fateman said machines do a much better job of reading handwriting when each character is entered into a box - such as on credit card applications - or when people know that they need to use their best handwriting. Collecting signatures for a recall is much different, he noted, because people signing the petitions may be in a hurry or signing while it is raining or snowing.

"People don't necessarily print their names totally readably, so it would be kind of a catch as catch can kind of thing," Fateman said.

That situation is why the state plans to have staff review each name before it is loaded into the database.

Another way to create the databases would be to crowd-source them, having large groups of people enter a small number of petitions each, Fateman said.

In fact, that is what tea party groups are doing to build their own database of the Walker signatures. Once the petitions are posted on the Web by the accountability board, thousands of their volunteers plan to begin entering them in a secure, online database so they can run their own checks.

The Democratic Party and recall group United Wisconsin created their own database before they turned in the petitions, and Walker's campaign plans to create its own database. Only the one created by the state will be publicly available.

About Patrick Marley

Patrick Marley covers state government and state politics. He is the author, with Journal Sentinel reporter Jason Stein, of "More Than They Bargained For: Scott Walker, Unions and the Fight for Wisconsin."