Basically I have several big forms (lots of fields submitted) that need to be processed; they are very similar but may differ by one or two fields. First, all fields get escaped and assigned to a variable named after the field (thus $_POST['f_name'] becomes $f_name).
Then I need to validate the data: certain obligatory fields must be present, certain fields must match (confirming password/email), and certain fields must pass a regex check. I do this via a long if/else statement, where each failure has its own error message.
Now of course I would like to avoid repeating this clumsy code and replace it with some looping function that is easier to edit and maintain.

However, this poses a bit of a problem, especially when it comes to performing the checks and assigning individual error messages.

I would be keen to hear suggestions on how you would approach developing such a validation/error-reporting function.

Using PHP short tags is never a good idea.
– nick rulez, Mar 21 '11 at 0:50

Not bad. The problem with the sanitize function is that I still have to assign every single POST field to a var of the same name; I'm still trying to figure out how to avoid that.
– cyber-guard, Mar 21 '11 at 12:51

@nick rulez: short tags dramatically increase readability when used in views, and the only problem with them is decreased portability, which is not an issue for you if you control the server.
– notJim, Mar 21 '11 at 22:54

I'm not sure that there's one cut-and-dried approach to this problem. Here's how my company has addressed this problem:

1) Front-end validation. Yes, it can be bypassed. However, if you're only using it as the first line of defense it's a great solution (and acceptable to some of my biggest clients, including an international banking group). I love the simplicity of Cedric Dugas' inline validation script because it's basically just a few extra characters per field. Another HUGE benefit of inline validation: it allows us to use one centralized alert area for server-side validation errors, along with a simple alert trigger via CSS on individual elements, while the majority of errors are caught and flagged inline, which is FAR more user friendly.

2) A class that deals with "stuff". We refer to it as the "garbage in, garbage out" class. It takes an array of post data, sets fields based on element names, and deals with them accordingly. This includes data sanitizing, validation, etc. The problem with validation is that unless you have generic types of data to validate, you can get into a lot of specifics, which can really gum up the code in a hurry. It can also mean MORE work on the front end, because your field names have to line up with what the class expects. In our case, we deal a lot with external webform responses from clients who don't necessarily appreciate the need for standardized field names, and that can get to be a headache.

3) "Chunking" sections. In huge form scenarios, we've resorted to "chunking" submits into phases via Ajax to minimize the damage done to the server by one big submit. So the user updates profile information, and a submit happens. The user fills in the background info section, an update happens, and so on. It's not right for all situations, but in some it can work well, and it allows progressive validation as you move from start to finish. I certainly wouldn't ever recommend this approach for each individual question, though.

4) "Forced Sanitation". Sounds evil, huh? In cases such as zip codes, addresses, etc., you can simply fix the information for the client. Rather than barking about a missing zip code, you can look it up automatically and get it correct 100% of the time. That's the beauty of Google and the USPS: they're free and smarter than the average user.
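
To make point 2 a bit more concrete, here is a minimal sketch of a rule-driven validator that loops over the post data instead of using one long if/else. It is not our actual class, just an illustration: the function names (`field_value`, `validate_form`), the field names other than `f_name`, the error messages and the zip pattern are all made up for the example, and it assumes PHP 7.4+ for the arrow functions.

```php
<?php
// Hypothetical helper: read one field as a trimmed string ('' if missing or not a string).
function field_value(array $input, string $name): string
{
    $raw = $input[$name] ?? '';
    return is_string($raw) ? trim($raw) : '';
}

// Run every rule against the input; returns [field => first error message].
function validate_form(array $input, array $rules): array
{
    $errors = [];
    foreach ($rules as $field => $checks) {
        $value = field_value($input, $field);
        foreach ($checks as [$check, $message]) {
            if (!$check($value, $input)) {
                $errors[$field] = $message; // keep only the first failure per field
                break;
            }
        }
    }
    return $errors;
}

// Reusable checks. Each one receives the field value and the whole input array.
$required = fn(string $v): bool => $v !== '';
$matches  = fn(string $other) => fn(string $v, array $in): bool => $v === field_value($in, $other);
$regex    = fn(string $pattern) => fn(string $v): bool => $v === '' || preg_match($pattern, $v) === 1;
$email    = fn(string $v): bool => $v === '' || filter_var($v, FILTER_VALIDATE_EMAIL) !== false;

// Illustrative rule table: field name => list of [check, error message].
$rules = [
    'f_name'        => [[$required, 'First name is required.']],
    'email'         => [[$required, 'Email is required.'],
                        [$email, 'Email address is not valid.']],
    'email_confirm' => [[$matches('email'), 'Email addresses do not match.']],
    'zip'           => [[$regex('/^\d{5}$/'), 'Zip code must be five digits.']],
];
```

Usage would be something like `$errors = validate_form($_POST, $rules);`, then rendering `$errors` next to the fields or in the centralized alert area. Forms that differ by one or two fields can share the base table and add or `unset()` individual entries, which is where the approach pays off. Note that this only validates; escaping for the database or for HTML output is a separate step.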

I'd say it's better to do this on the client side using a JavaScript form validator, before anything gets submitted. Do a search for "javascript form validation". It'll save you a page load and force your users to correct errors before they even submit. Here's a simple example of one way, taken from the first Google hit for "javascript form validation":