When you need to write a form, one of the first things that pops to mind is what kind of validations are required. A good example would be the new user registration form:

This is such an ingrained instinct that we typically seek to stop such invalid states as soon as possible. If the business rules says that you cannot have a certain state, it is easiest to just ensure that you can never get into such a state.

If you are working in a system that exists purely within the scope of your application, that might actually be useful property to have. This is certainly something that you want in your code, because it means that you can make assumptions on invariants.

If you have a system that reflects the real world, however, you need to take into account several really annoying things:

Your model of the real world is not accurate, and will never be accurate.

The real world changes, sometimes quite rapidly and in a very annoying fashion.

Sometimes the invariant is real, but you gotta violate it anyway.

If the system doesn’t let the users do their job, they will do the job in spite of your system. That will lead to a system of workarounds. Eventually, you’ll have to support these workarounds. This may result in hair loss and significant amount of aggravation.

Consider the case that the prison in question is a minimum security prison, expected to have white collar inmates that have already been tried. Now you get an inmate that is accused of murder and is currently on trial. (Side note, prisons have a very different structure for people who still undergoing trial, because they aren’t convicted yet and because the amount of pressure that they are under is very high. They’ll typically be held in different facilities and treated differently from inmates that have already been convicted and tried).

What do you think will happen if the system refuses to accept such an inmate into the prison? Well, the guy is right there, and the prison commander has already authorized letting him in. So telling him that he should go back home because they “computer won’t let you in” is not going to be a workable solution.

Instead, we use a very different model for validation and acceptance of data. Instead of stopping the bad input as soon as possible, we raise a flag about it during processing, but we do allow the user to proceed, assuming they explicitly authorize this. It will look like this:

At the same time, we flag such records and require additional review of the data.

In most cases, by the way, the fact that the inmate is in residence is not something that can be ignored and will be in all reports on the state of the prison until the state changes (transferred, verdict given, etc).

This kind of thinking is important, because it means that the software intrinsically does not trust the data, but continue runs validation on it. This is a good practice to be in when dealing with systems that reflect the real world.

This does lead to an interesting question, where do we run these validations, and when?

The ground rules for a good application is that it is like an amoeba, it may have multiple locations that it accepts input, but that is the only way to push data in, through well defined channels.

This is another way of saying that we don’t allow someone else to go and poke around in our database behind our back.

Any time that we accept new data, regardless of the channel it arrives in, we run it through the validation pipeline. And this validation pipeline can add (or mark as obsolete) validation issues that should be brought to the attention of the people in charge.

Note that these sort of validations are almost always very much business rules, not type issues. If someone’s birthday is in the future, you can feel free to very easily reject that data as purely invalid. But if someone’s release date is in the past, however, you might still need to accept them as an inmate (the paperwork are really in the mail, sometimes).

I think I’ll need another post just to talk about how to implement these validation rules and behaviors, but that is still down the line. The next post topic is going to be data flow between the different systems that we talked about so far.