Thursday, January 29, 2009

KioskCrash: Introduction

Welcome to Leber Hall! I want this blog to be interactive and a learning experience for us all, so don’t be shy: go ahead and ask questions or request clarifications. I am going to try to walk the line between skipping too many details and plodding along, boring us all with too much information. These posts will get quite technical at times, but I will try to keep the information accessible so neophyte programmers can get some benefit from my writing.

The first series of posts I’m going to write will cover ways to find a crash in an unattended kiosk application. My plan is to present a sample program that crashes and then update it over time with the techniques being discussed. Along the way we will explore various options for finding the crash, and I will explain why I made particular choices.

In this series, an “unattended application” is a program that runs on a kiosk that has no keyboard, mouse, or other traditional user interface device. While there are “users” who use the kiosk, they don’t have access to technical support, web browsers, or email to report a problem and work with someone to fix it. At best, when the application crashes, the kiosk is able to recover and restart automatically. The next best option is to display an “out of service” message to the user. The worst option is to display error messages to the user and wait for their input.

What is a crash? In simple terms, it is an invalid operation that causes the OS to stop a program from executing. The most common crashes I’ve encountered are caused by attempting to read from or write to invalid memory, overflowing the stack, or dividing by zero. The Platform SDK for Windows defines 23 codes that can be generated by a hardware exception. I won’t explain them all. Read the documentation for EXCEPTION_RECORD if you would like to know more.
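To make the three common crash types a little more concrete, here is a minimal sketch that maps their exception codes to readable names. The numeric values are the ones the Windows headers assign to EXCEPTION_ACCESS_VIOLATION, EXCEPTION_INT_DIVIDE_BY_ZERO and EXCEPTION_STACK_OVERFLOW. The constant names and the describe_crash_code function are made up for this illustration, so that the snippet compiles without the Windows headers.

```cpp
#include <cstdint>
#include <string>

// Hypothetical local names. The values mirror the EXCEPTION_* macros
// from the Windows headers so this sketch builds on any platform.
constexpr std::uint32_t kAccessViolation = 0xC0000005u;  // EXCEPTION_ACCESS_VIOLATION
constexpr std::uint32_t kIntDivideByZero = 0xC0000094u;  // EXCEPTION_INT_DIVIDE_BY_ZERO
constexpr std::uint32_t kStackOverflow   = 0xC00000FDu;  // EXCEPTION_STACK_OVERFLOW

// Translate an exception code into a short description for a log file.
// Only the three crash types mentioned above are covered here.
std::string describe_crash_code(std::uint32_t code) {
    switch (code) {
        case kAccessViolation: return "access violation";
        case kIntDivideByZero: return "integer divide by zero";
        case kStackOverflow:   return "stack overflow";
        default:               return "other exception code";
    }
}
```

On Windows the code would typically come from the ExceptionCode member of an EXCEPTION_RECORD; calling describe_crash_code with that value gives a string that is easier to scan in a log than a raw hexadecimal number.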

What’s the difference between a crash and an exception? Depending on what kind of exception you mean, there isn’t any difference at all. Windows uses a mechanism called Structured Exception Handling (SEH) to handle both hardware and software exceptions. C++ has exception handling as a language feature. C++ exceptions are not the same as SEH exceptions, and I’ll need to discuss both kinds as I go along. To keep the two apart, I will call the event that causes the OS to kill a running program a crash. I will use the Structured Exception Handling features provided by Windows to capture the state of the application. C++ exception handling will also be used to provide a mechanism to recover from some types of crashes.
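The sketch below puts the two mechanisms side by side. It is only an illustration under some assumptions, not the sample application for this series: parse_config, run_with_cpp_handler and run_with_seh_handler are hypothetical names, and the SEH half uses the Microsoft-specific __try/__except keywords, so it is compiled only when building with Visual C++.

```cpp
#include <stdexcept>
#include <string>

#if defined(_MSC_VER)
#include <windows.h>
#endif

// Hypothetical helper that reports a problem with a C++ exception.
void parse_config(bool valid) {
    if (!valid) {
        throw std::runtime_error("bad config");
    }
}

// C++ exception handling: a language feature. The handler receives a
// typed object that the program itself threw.
std::string run_with_cpp_handler(bool valid) {
    try {
        parse_config(valid);
        return "ok";
    } catch (const std::exception& e) {
        return std::string("caught: ") + e.what();
    }
}

#if defined(_MSC_VER)
// Structured Exception Handling: an OS mechanism. The handler sees a
// numeric exception code such as EXCEPTION_ACCESS_VIOLATION.
// This function avoids C++ objects with destructors, because Visual C++
// does not allow __try in a function that needs object unwinding.
unsigned long run_with_seh_handler(volatile int* target) {
    __try {
        *target = 1;  // raises an access violation if target is invalid
        return 0;
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        return GetExceptionCode();
    }
}
#endif
```

With these definitions, run_with_cpp_handler(false) returns the string "caught: bad config". On Windows, passing a null pointer to run_with_seh_handler should return the access violation code instead of letting the OS terminate the program.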

OK, that’s enough rambling for now. Next time I’ll present the sample application that I’ll use as the base for this series of articles.