Introduction to Object-Oriented Tiered Application Design

An introduction to some of the challenges and opportunities of object-oriented design.

Introduction

Most of the .NET code I see makes heavy use of datasets and a fairly simple client-server (2-tier) design. This is fine, because .NET has great support for using and displaying datasets on the screen. However, for more complex applications, the approach does not scale well in terms of maintainability. At some point, developers have to start coding business objects of some kind.

The problem is that using datasets insulates the developer from the basic incompatibility (called the impedance mismatch) between objects and databases. When you start using business objects, then you are suddenly faced with a huge range of options and challenges that you never encountered when using datasets. The good news is that, once they start using objects, few developers go back. It is an exciting and productive environment, with great opportunities for code re-use.

This article is an introduction to the subject, with a view to helping readers understand the ways in which you can design object-oriented applications, and some of the challenges that you may encounter. The focus is on small to medium-sized business applications that can benefit from an object-oriented design, but probably won't be needing Microsoft Enterprise Services any time soon.

Apologies in Advance (to the Experts)

There is very little standard terminology in this field. If I call a widget a wodget, and you are used to calling it a wadget, don't be upset with me! I try to define what I mean by every term I use, so there should be no confusion.

A Multi-tiered Application

You've probably (hopefully) heard of 3-tiered systems. The common tiers mentioned are presentation, business logic, and database. In practice, most object-oriented applications have more than 3 tiers -- they have a framework of interconnected components, typically found inside multiple DLLs, EXEs, and third party applications, generally categorized into layers/tiers. The components may all be located on a single computer, or they may be spread across multiple computers.

From the developer's perspective, a primary idea behind a tiered system is to split logic into different pieces of code, so that one or the other can be changed, extended, or rewritten without affecting the rest. There are other advantages too, such as scalability. Performance is not an advantage of object-oriented applications, except where it is obtained through scalability.

One of the driving forces used in the design and implementation of the tiers is that we should minimize duplication of logic. In a complex application, duplication of logic will lead to bugs. (This is true of a simpler system too, but in that case, it is easier to manage).

Some possible tiers of a multi-tiered system are:

Database -- this consists of the DBMS (e.g. SQL Server), the table data and structure, and logic that is coded inside of the database. The most obvious examples of logic coded in the database are stored procedures and triggers. However, it is worth noting that the structure of the database itself is logic, in the form of data type validation, relationship validation, etc.

Data Access Tier -- separate from the database, is the code that accesses the database. There are many reasons to keep this code in its own layer -- support for multiple DBMSs, automated data auditing, connection management, hiding of connection strings, retrieving of Identity values, etc.

Object/Relational mapping -- at some point in the application, we have to translate objects to SQL, and vice versa. An O/R Mapping component takes care of this requirement. We could just code the SQL inside of the higher level objects, but there are numerous benefits to encapsulating the logic at a single point in the system. Some of these benefits can include insulating developers from knowing the database structure, supporting powerful query interfaces for the users, and insulating other objects from database structure changes.

Business Domain objects -- (also called entity/business objects). These are objects that contain properties that reflect data. At this level, you will find objects named address, client, etc. Commonly, they also embed relationship logic -- for example, a client object may have a property or method that retrieves the address object for that client instance. In this case, the objects are known as Active Domain Objects, because they can actively access related data.

Service Layer -- Typically, service objects will not have names like client, or address. They are pieces of processing and workflow logic, that usually correspond closely to use case scenarios. For example, in a shopping cart application, the service layer could handle the processing of the order. This might include sending data to an external system, emailing the purchaser, and saving the order data in a database.

Controller Logic -- (also called UI Process logic, Presentation logic). When dealing with multiple User Interfaces, e.g., both WinForms and Web, it makes sense to try and split the non-UI logic into its own layer. This is very difficult to do, but there are pre-built frameworks that help support this type of logic. They are known collectively as MVC (Model-View-Controller). The Controller logic is the C in MVC. In a shopping cart application, the controller logic could handle the flow of the web pages as the user progressed through the order process.

UI/View -- This is the piece of the application that the user interacts with. When the controller logic is in its own layer, the UI typically only contains UI Mapping code, i.e., mapping object properties to and from screen fields. Of course, it also contains code to interact with the controller objects.

Disclaimer -- Most applications do not have all of these tiers. It is perfectly fine to combine tiers, according to your own unique needs.
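To make the separation concrete, here is a minimal sketch of three of the tiers listed above. It is written in Python rather than a .NET language purely to keep it short, and every name in it (InMemoryDataAccess, Client, OrderService) is hypothetical. A dictionary stands in for the database and a list stands in for the email system.

```python
from dataclasses import dataclass, field


class InMemoryDataAccess:
    """Data access tier: the only code that knows how rows are stored."""

    def __init__(self):
        self._tables = {}

    def save(self, table, key, row):
        self._tables.setdefault(table, {})[key] = dict(row)

    def load(self, table, key):
        return dict(self._tables[table][key])


@dataclass
class Address:
    address_id: int
    city: str


@dataclass
class Client:
    """Business domain object that can actively fetch related data."""
    client_id: int
    name: str
    address_id: int
    dal: InMemoryDataAccess = field(repr=False, default=None)

    def address(self):
        # Relationship logic: load the address that belongs to this client.
        return Address(**self.dal.load("address", self.address_id))


class OrderService:
    """Service layer: workflow logic that corresponds to a use case."""

    def __init__(self, dal):
        self.dal = dal
        self.outbox = []  # stands in for a real email gateway

    def place_order(self, order_id, client, item):
        self.dal.save("order", order_id, {"client_id": client.client_id, "item": item})
        city = client.address().city
        self.outbox.append(f"Order {order_id} for {client.name} ships to {city}")


dal = InMemoryDataAccess()
dal.save("address", 10, {"address_id": 10, "city": "Cape Town"})
client = Client(client_id=1, name="Ann", address_id=10, dal=dal)
service = OrderService(dal)
service.place_order(500, client, "widget")
```

Only InMemoryDataAccess knows how rows are stored, so swapping it for a real database component would not, in principle, touch Client or OrderService.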

Examples

At this point, I think we need some overly-simplistic examples:

Example                          What is it?

SQL Server, Oracle, MS Access    Database, DBMS

stored procedure                 If it has significant logic, Business Domain Object; otherwise Database

ASP pages encouraged the combination of UI and Controller logic. ASP.NET improves the situation through code-behind classes. The benefits are primarily that the code is easier to read and maintain, as is the HTML.

Stored procedures that contain business logic automatically break up your business logic into multiple places. Since stored procedures are not compatible with your other code, (i.e., you cannot make use of helper functions within a stored procedure), this can lead to duplicate code. It will be interesting to see how Yukon (support for C# on the database) opens up possibilities in this area.

Challenge #1 - Validation

The first challenge in a tiered system is validation. It is a challenge, because it is almost impossible to write validation without duplicating the logic. This is because multiple layers of the system need access to the same validation logic, and some layers intrinsically contain validation logic:

often, the user interface must be nice to the user, and show validation problems before the user presses the Save button.

the controller objects and service layer may need to enforce validation.

the business domain objects must enforce validation.

the data objects usually have validation logic already, in the form of data types on the properties.

the database has validation logic already, in the form of data types, foreign key relationships, and field length limits.

Often, application designers will just ignore this challenge, and accept that validation will have to be duplicated across multiple pieces of code. However, there are some other approaches:

put the validation in a separate Rules component, that can be re-used by the UI and the other layers -- probably the best solution, but also the hardest to implement.

put the validation in the business domain objects, or the controller logic, and hope that programmers do not bypass the layer.

put validation in the UI, and in the database -- this is more common than you would think, because it is the simplest to implement.

Whatever the choice, it is practically impossible to achieve zero-duplication of validation logic. Depending on your particular needs, you will need to choose the best approach.
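As an illustration of the first approach (a separate Rules component), the sketch below keeps each rule in one place and lets both a UI helper and a domain-level save function call it. It is Python rather than .NET for brevity, and the rule names, the booking record, and the functions ui_can_enable_save and save_booking are all invented for this example.

```python
from datetime import date


def require_positive_quantity(record):
    if record.get("quantity", 0) <= 0:
        return "quantity must be a positive integer"
    return None


def require_start_before_end(record):
    if record["start_date"] >= record["end_date"]:
        return "start_date must be earlier than end_date"
    return None


# The shared Rules component: one list, used by every layer.
BOOKING_RULES = [require_positive_quantity, require_start_before_end]


def validate(record, rules):
    """Return a list of error messages; an empty list means the record is valid."""
    messages = (rule(record) for rule in rules)
    return [message for message in messages if message is not None]


def ui_can_enable_save(form_values):
    # UI layer: asks the shared rules whether the Save button should be enabled.
    return not validate(form_values, BOOKING_RULES)


def save_booking(record, storage):
    # Domain layer: enforces the same rules again before persisting.
    errors = validate(record, BOOKING_RULES)
    if errors:
        raise ValueError("; ".join(errors))
    storage.append(record)


good = {"quantity": 2, "start_date": date(2024, 5, 1), "end_date": date(2024, 5, 3)}
bad = {"quantity": 0, "start_date": date(2024, 5, 2), "end_date": date(2024, 5, 1)}
storage = []
save_booking(good, storage)
bad_errors = validate(bad, BOOKING_RULES)
```

The database would still carry its own type and length checks, so this reduces duplication rather than eliminating it.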

Challenge #2 - Security

Like validation, security needs to be enforced at multiple levels.

At the UI, we may want to disable a Save button, or certain fields/menus/buttons.

in our business domain logic, we may want to prevent certain actions based on the context of the actions - for example, we may only allow the save of an address object if it is linked to the user's own user-id.

in our database, we may want to apply permissions to tables and/or stored procedures.

Security is difficult when it is dynamic (specified in detail by an admin user), and context-sensitive, i.e., you cannot always tie security directly to a specific table or data object. If your security is simpler than that, then you may be able to make do by implementing your security at the Controller, and/or at the database.

For the more complex, dynamic cases, an approach that I have used in the past was to attach security information to the data objects. The business domain objects attach the security data, based on the current context. The security is then enforced by the code that persists the data objects, i.e., the O/R mapping layer. This worked well for me, but YMMV (Your Mileage May Vary).
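A rough sketch of that idea follows, in Python rather than .NET. It is not the original implementation: DataObject, AddressDomain, and Mapper are hypothetical names, and the only rule shown is the one mentioned earlier (a user may save an address only if it is linked to their own user-id).

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    table: str
    values: dict
    # Security information attached by the domain layer, not by the mapper.
    allowed_actions: set = field(default_factory=set)


class AddressDomain:
    """Business domain logic: decides permissions from the current context."""

    @staticmethod
    def load_for(current_user_id, row):
        obj = DataObject("address", dict(row))
        obj.allowed_actions.add("read")
        if row["owner_id"] == current_user_id:
            obj.allowed_actions.add("save")
        return obj


class Mapper:
    """Stand-in for the O/R mapping layer: enforces whatever was attached."""

    def __init__(self):
        self.saved = []

    def save(self, obj):
        if "save" not in obj.allowed_actions:
            raise PermissionError(f"save not permitted on {obj.table}")
        self.saved.append(obj.values)


mapper = Mapper()
own = AddressDomain.load_for(7, {"owner_id": 7, "city": "Durban"})
other = AddressDomain.load_for(7, {"owner_id": 8, "city": "Paarl"})
mapper.save(own)
try:
    mapper.save(other)
    denied = False
except PermissionError:
    denied = True
```

The mapper never needs to know why an action is allowed; it only checks what the domain layer attached.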

In addition to the problem of security enforcement, you have to deal with database connection strings -- if a user has access to the connection string, they can then establish a database connection directly (using SQL Query Analyzer), and are able to bypass object-based security. In that case, the only effective security is at the database level -- a very strong argument for using stored procedures, (because of the fine-grained control of security that they offer).

Another approach to the connection string dilemma is to disallow direct connections to the database. This is done by placing the data access tier on a server machine, together with the connection string. Security is then enforced at the data access level, through the use of a separate security component, usually using security tokens/tickets to communicate the user's permissions.

A final approach is to encrypt the connection string at the client. This is difficult to do well, because it requires that a private key be stored somewhere, and it is hard to find a place that is secure. It is possible to do, but in .NET 1.1, you have to use some unmanaged APIs.

Challenge #3 - to MVC or not to MVC

MVC (Model-View-Controller) is a powerful technique to separate controller logic from the UI. You'll find MVC used most often in complex, workflow oriented web applications. This is because web-based applications lend themselves well to MVC -- MVC operates in a state-driven manner, just like a web application. It is a much less intuitive pattern for a WinForms application.

In the MVC acronym, M refers to the business data objects or service layer, V refers to the UI, and C to the controller logic. The intention of the C is to encapsulate input controller logic, i.e., the code-behind of the page or form that interacts with the model. Often though, the meaning is interpreted as application controller logic, i.e., controlling the flow of pages and forms.

Many consider MVC to be the most poorly understood and implemented pattern, primarily because of the confusion over what the C means. The problem is that application controller code can be re-used across different UIs, whereas an input controller is specific to a particular UI. Many MVC implementations combine the two types of controllers, which is OK, but not re-usable across different UIs.

For example, a web page has a completely different interface to a WinForms app, yet in MVC, we may code a single piece of controller logic to handle both. When we do this, the UI has a tendency to be created for the lowest common denominator, which leads to a very uninspiring application on the higher end interfaces, i.e., WinForms.

Of course, this can be managed, primarily by creating input controllers for each platform, and generic, shared application controllers -- but it is still a challenge.
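The split between a shared application controller and per-platform input controllers might look roughly like this. The sketch is in Python rather than .NET, and OrderFlow, WebInputController, and WinFormsInputController are illustrative names, not part of any real MVC framework.

```python
class OrderFlow:
    """Application controller: UI-independent flow of the order process."""

    STEPS = ["cart", "shipping", "payment", "confirmation"]

    def __init__(self):
        self.step = "cart"

    def advance(self):
        index = self.STEPS.index(self.step)
        if index < len(self.STEPS) - 1:
            self.step = self.STEPS[index + 1]
        return self.step


class WebInputController:
    """Input controller for the web UI: translates form posts into flow calls."""

    def __init__(self, flow):
        self.flow = flow

    def handle_post(self, form):
        if form.get("action") == "next":
            self.flow.advance()
        return f"/order/{self.flow.step}"  # URL to redirect to


class WinFormsInputController:
    """Input controller for a desktop UI: translates button clicks into flow calls."""

    def __init__(self, flow):
        self.flow = flow

    def on_next_clicked(self):
        return f"{self.flow.advance().title()}Form"  # name of the form to show


web = WebInputController(OrderFlow())
desktop = WinFormsInputController(OrderFlow())
web_target = web.handle_post({"action": "next"})
desktop_target = desktop.on_next_clicked()
```

Each input controller is free to exploit its own platform, while the order of the steps lives in one place.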

My experience with MVC is that it is nice and convenient to have the controller logic separated, especially when dealing with a web application with processes, e.g., an order process. However, I was at the mercy of the MVC framework that I chose, and in the end, I prefer control over convenience. But that's just me -- a control freak.

Challenge #4 - Transactions

The functions that the user performs drive the need for transactions, and functions are defined at the level of the controller or service layer logic -- so it makes sense that transactions should be initiated and committed at that level. In most environments, this implies that database connections should be initiated and terminated at the same level, since few platforms support transactions that transcend database connections.

Thus, the challenge is to provide database specific functionality (transactions), at a level of the application that may be substantially removed (in terms of intermediate layers) from the database. It may even be that the data access tier is on a completely separate machine.

One solution is conceptually simple - create an abstraction that represents a database connection/transaction, and can be used inside of the controller logic to begin, commit, and rollback transactions. This has a profound effect on the design of the data access tier. It means that the data access tier cannot automatically open and close database connections -- it becomes a managed resource at the level of the controller logic. In .NET terms, the data access tier has to support the IDisposable interface.

Alternatively, you can make use of automatic transactions, which are supported in .NET, COM+, and MTS. This has the additional benefit of supporting distributed transactions.

Achieving Reuse

Within your own organization, it can be extremely beneficial to establish some reusable components that fulfill the needs of particular components of the system. Some you can code yourselves, but others (e.g., Object/Relational mappers) are usually easier to purchase.

For a single application, it may make sense to code everything from scratch. But after the first, it gets tiresome real fast, especially typing SQL strings, doing O/R mapping, and creating domain objects. I'd rather concentrate on the real application logic. As an example, in my own work environment, I work primarily with semi-complex web-based applications that have their own databases. Some of my reusable components are:

OID generator - this is an article by itself, but basically, I prefer to use GUID fields instead of INT fields as primary keys in my database. This component generates guaranteed-unique GUIDs, for assigning to a Primary Key before saving to the database.

Data Access DLL - a simple data access component that works with the OID generator, encourages parameterized queries, and allows management of transactions.

Simple Security component - supports adding, editing, and authenticating of users, with support for roles. Passwords are salted and hashed in the database.

Audited Security component - for those projects with more stringent security requirements, extends the simple security component by adding the ability to audit logins, and lock users out after x failed logins.

POP Server component - based on a 3rd party component, supports forwarding of email based on configurable rules.

Domain object templates - using a 3rd party tool, initial sets of domain objects and O/R mappings are generated based on the database structure.
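To illustrate the first two components, here is a small sketch of client-side key generation combined with parameterized queries. It uses Python's uuid and sqlite3 modules rather than .NET, the new_oid function and the table layout are made up for the example, and uuid4 values are random (collisions are vanishingly unlikely rather than strictly guaranteed impossible).

```python
import sqlite3
import uuid


def new_oid():
    """Generate a primary key in the application, before the row is saved."""
    return str(uuid.uuid4())


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE client (id TEXT PRIMARY KEY, name TEXT)")
connection.execute(
    "CREATE TABLE address (id TEXT PRIMARY KEY, client_id TEXT, city TEXT)"
)

# Both keys are known up front, so related rows can be linked
# without first asking the database for an identity value.
client_id = new_oid()
address_id = new_oid()
connection.execute(
    "INSERT INTO client (id, name) VALUES (?, ?)", (client_id, "Ann")
)
connection.execute(
    "INSERT INTO address (id, client_id, city) VALUES (?, ?, ?)",
    (address_id, client_id, "Cape Town"),
)

row = connection.execute(
    "SELECT c.name, a.city FROM client c "
    "JOIN address a ON a.client_id = c.id WHERE c.id = ?",
    (client_id,),
).fetchone()
```

Because values are passed as parameters rather than concatenated into the SQL string, the same pattern also avoids SQL injection.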

Microsoft promotes their own re-usable "Application Blocks". These are of varying quality, and I do not specifically recommend any of them. I do recommend against the Microsoft Data Application Block, because it is SQL Server specific and does not encourage good data access practices.

References

Data Access Patterns: Database Interactions in Object-Oriented Applications - by Clifton Nock. This is an excellent reference that breaks down the different components of data access. It tackles everything from object relational mapping, down to data access tiers.

Patterns of Enterprise Application Architecture - By Martin Fowler et al. - This is the definitive book on designing object-oriented business systems. It tackles everything from the UI down to the data access, in a nicely clear and concise manner.

Application Architecture for .NET - Designing Applications and Services - By Microsoft Patterns and Practices division. It is a good, short book, focused on creating very complex business applications, making use of Microsoft tools and products.

Expert One-on-One Visual Basic .NET Business Objects - by Rockford Lhotka. A very practical approach to developing a complex tiered application. The author presents his own vision of an architecture that can be reused across multiple applications.

This is a very nice introduction to Object-Oriented Tiered Application Design. It is easy to get the basics, but how can I apply them to a real-world application? I think you should provide a good sample project with this article.

An alternative to MVC is Presentation-Abstraction-Control (PAC). It has quite a few good features compared to MVC.

Presentation -- displays stuff to the user

Control -- takes care of the state of the object, displaying the correct stuff, notifications, etc.

Abstraction -- this one has the data

Examples can be found on Allen Holub's web site, and there are plenty of examples on the web.

In my experience (about 3-5 years now in the "wild"), things don't go so well with MVC. It is an excellent design pattern if the application is very well designed and the programmer knows what the customer wants. In real life, however, this is almost never the case. The customer will always require modifications, which means modifications to the model, controller, and view. With PAC, the coupling between components is looser, so fewer modifications are needed.

And yes, the concept works. I recently made a little application that uses the Visitor pattern to display data (double, int, boolean, range, string, plus objects combined from the previous ones). I just had to write a different Visitor for desktop and www. I combined control and abstraction and used delegation, so when I added a field to object X, it was displayed correctly on the UI. Modifications were made in one place only. Sure, it needs some extra work to make a "real" application, but overall things worked amazingly well.
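The commenter's code is not shown, but the general shape of a Visitor-based display might look like the sketch below. It is in Python, and every class name (IntField, BoolField, HtmlVisitor, TextVisitor) is hypothetical; only two of the field types mentioned are covered.

```python
class IntField:
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def accept(self, visitor):
        return visitor.visit_int(self)


class BoolField:
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def accept(self, visitor):
        return visitor.visit_bool(self)


class HtmlVisitor:
    """Renders fields for a web page."""

    def visit_int(self, f):
        return f'<input type="number" name="{f.name}" value="{f.value}">'

    def visit_bool(self, f):
        checked = " checked" if f.value else ""
        return f'<input type="checkbox" name="{f.name}"{checked}>'


class TextVisitor:
    """Renders fields as plain text, standing in for a desktop UI."""

    def visit_int(self, f):
        return f"{f.name}: {f.value}"

    def visit_bool(self, f):
        return f"{f.name}: {'yes' if f.value else 'no'}"


def render(fields, visitor):
    # Adding a field to the list is the only change needed to display it.
    return [f.accept(visitor) for f in fields]


fields = [IntField("quantity", 3), BoolField("gift", True)]
html = render(fields, HtmlVisitor())
text = render(fields, TextVisitor())
```

Supporting another UI means writing one more visitor class; the field objects themselves stay unchanged.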

One of the points of the article is that there are no hard-and-fast rules. For small projects, it is fine to have UI directly accessing databases. Larger projects need more layers of abstraction.

One way of thinking about it is in terms of areas of knowledge/responsibility. If the "business layer" is responsible for understanding the data structure, then it is not recommended that we give other layers that knowledge.

The code becomes harder to maintain once knowledge is disseminated throughout the code, because changes (added fields, changed field types, changed database provider) affect multiple areas of the system in unpredictable ways.

If the business layer accesses the DALC layer, then it is tightly coupled to the database used, is it not? What if I want to switch to a new database? Would I have to change the business logic layer as well?

If multi-database support was one of your goals, then using layers to abstract the DB-knowledge away from the business knowledge would be a good thing.

Ideally, your business objects should have very little knowledge of the database, or even what a database is. They should know that they have to call some API to persist and retrieve the objects, but that is about all. If you need that level of abstraction, then ORMappers are the best bet (e.g. NHibernate).

If you want to abstract the database to a lesser extent (the business objects know a little more about the database, or deal with data readers/datasets), then you can use a simple DB service layer, which just abstracts away the actual SQL. Then, when you need support for multiple databases, you can have multiple implementations of the DB service layer. If you want to see an example of this, take a look at DotNetNuke (www.dotnetnuke.com).
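As a sketch of the simpler option, the code below defines a small DB service API with two interchangeable implementations. It is written in Python with the built-in sqlite3 module, it is not taken from DotNetNuke, and the names ClientStore, SqliteClientStore, DictClientStore, and register_client are invented for illustration.

```python
import sqlite3
from abc import ABC, abstractmethod


class ClientStore(ABC):
    """DB service layer API: the only thing the business logic depends on."""

    @abstractmethod
    def add(self, client_id, name):
        """Persist a client."""

    @abstractmethod
    def find_name(self, client_id):
        """Return the stored name, or None if the client is unknown."""


class SqliteClientStore(ClientStore):
    """One implementation: all SQL for this database lives here."""

    def __init__(self):
        self._connection = sqlite3.connect(":memory:")
        self._connection.execute(
            "CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def add(self, client_id, name):
        self._connection.execute(
            "INSERT INTO client (id, name) VALUES (?, ?)", (client_id, name)
        )

    def find_name(self, client_id):
        row = self._connection.execute(
            "SELECT name FROM client WHERE id = ?", (client_id,)
        ).fetchone()
        return row[0] if row else None


class DictClientStore(ClientStore):
    """A second implementation, standing in for another database provider."""

    def __init__(self):
        self._rows = {}

    def add(self, client_id, name):
        self._rows[client_id] = name

    def find_name(self, client_id):
        return self._rows.get(client_id)


def register_client(store, client_id, name):
    # Business logic: knows only the ClientStore API, not the SQL behind it.
    store.add(client_id, name.strip().title())
    return store.find_name(client_id)


results = [
    register_client(store, 1, "  ann smith ")
    for store in (SqliteClientStore(), DictClientStore())
]
```

Switching databases then means adding another ClientStore implementation, while register_client stays as it is.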

I have used both approaches, and each was appropriate for the types of projects I used them on.

I am not sure it's a good idea for a presentation layer to instantiate business logic. As far as I know, the most common architecture is MVC with session controllers and the Publisher-Subscriber design pattern for the UI to observe the domain layer objects; both layers are instantiated separately.

You touch on an interesting point about validation. Where should it take place?

Validating in the GUI means it will be fast. However, it's not so easy to code inter-field validations; i.e., it's easy to code up "enter a positive integer", but less easy to code up start_date < end_date.

I quite like putting field validations in the GUI, and having a Validate button that runs everything, including the inter-field validations. This is done as a call to the server.

The main reason concerns other validations, such as name existence validations. For example, a value must exist, or must not exist, in the database in order for the change to commit.

So for standard CRUD (create, read, update, and delete) screens, it becomes easy so long as you call the server.

Now, if you build the interface properly between the GUI and the server, the server doesn't have to know it is being called by the GUI. You could hook up some other process to extract and insert data. For example, via a message queue.

I am working on a larger application for the first time now. My previous applications had only two or three tables and almost no relations. And now I already got 12 tables with a lot of relations.

Your article made clear what I was looking for. It provides a handful of handholds and tips to go on with my application.

I think we can drop the data access application block, since it doesn't work that well for most bigger applications. I designed some data access components that were a lot easier to use and didn't have more functions than just getting a datareader or executing a query without getting the results. And these work just fine for the job they do.

"Every rule in a world of bits and bytes can be bent or eventually be broken"

I'm supposed to be in "retreat", but I just had to say that I really enjoyed your article. Your discussion on the MVC pattern was right on target, and I appreciated how you enumerated different possible tiers concisely.

I'm curious, do you plan on writing about your re-usable components?

Also, you said:

I do recommend against the Microsoft Data Application Block, because it is SQL Server specific and does not encourage good data access practices.

Can you elaborate on that? A lot of people ask me what I think of the Data Application Block, and not having used it, I don't have a good answer, so I'm interested to know why it doesn't encourage good data access practices.

In general, I don't think any of the VS wizards and the entire "drag and drop" adapter/view/connection/etc. onto a form is a good practice, as it tightly couples all three layers together. What were they thinking, except to make something that someone with no programming experience could use to create bad designs?

Thanks for the kind words. I had seriously considered shortening the article by removing the MVC section, but now I'm glad I didn't.

I have considered writing about the OID generator, because I think it's a great technique, and there are not many articles (anywhere) on how to do it.

I also considered writing about the data access layer, but there are so many articles on that that I thought it would just be lost in the crowd. Anyway, it's still a work-in-progress, because it does not support nested transactions :(

Regarding the Microsoft Data block, I dislike it because it ties clients to the SQL Client provider, and it does not really do anything except provide a facade for ADO.NET. In addition, it encourages the use of non-strongly typed sql parameters, as well as stored procedures. (Nothing wrong with stored procedures, except that most people use them for the wrong reasons).

Well, now that leads to my next question, as I can never get a good answer from people as to guidelines for when to SP and when not to:

Steven Campbell wrote: (Nothing wrong with stored procedures, except that most people use them for the wrong reasons).

So, what are good reasons, and what are bad reasons? Might make for an interesting (and probably hotly debated) article in itself!

Let me see if I can put my foot in my mouth. IMO:

Wrong reasons are when an SP is just a simple wrapper to a basic SQL statement. This kind of a thin wrapper doesn't buy you anything, but some people keep professing the argument that it creates a single repository for an SQL action. But, IMO, if you change that action, you most likely are going to have to fix up all the SP calls.

Right reasons, IMO, are when an operation can benefit from the performance gain of working on the server side, rather than a lot of client-server I/O. Another possible right reason is when an operation is complex but can be encapsulated in a simple call, like doing DB admin stuff.

I think that it's one of these things where you need a good reason to use them. I mean, they tie you down to a specific database vendor, so they already have something going against them.

I agree with you, that your basic CRUD (create read update delete) stored procedures are a waste of time. In addition, they make it harder to find any stored procedures that may actually contain some logic.

Historically, stored procedures were advertised as having large performance benefits. That is no longer the case (at least not as much as in the past), because the database vendors are optimizing dynamic queries much better than before. Also, you can now precompile your queries in code. So, the major wrong reason to use stored procedures is because someone said that they are faster. They are, but not enough that it will make a difference.

Good reasons to use stored procs:

* the ones you said are both good

* security - you can insulate the database tables from the users, so that they can only do what you allow them to do through the stored procs.

* performance tweak - you have some data-related function that would be too slow in the application, so you move it to a stored procedure (this is basically the same as one of your points)

* for the DBA - when all your SQL is on the server, it makes the job of the DBA easier. They are comfortable tweaking the performance of stored procedures.

"Regarding the Microsoft Data block, I dislike it because it ties clients to the SQL Client provider, and it does not really do anything except provide a facade for ADO.NET. In addition, it encourages the use of non-strongly typed sql parameters, as well as stored procedures. (Nothing wrong with stored procedures, except that most people use them for the wrong reasons)."

I use the MS Data block because it saves me time from having to write the code it takes to create all the parameters and then assign values to them. The methods exposed by my data access class only accept strongly typed parameters.

If you use some other data namespace, you can modify the source of the Data block. When the first version was released, I modified it to work with datasets. For the second version, I added an overloaded method ExecuteTypedDataset.

I have some ideas on not using stored procs. When I have some time to flesh them out, maybe I'll post them.

Regarding the Microsoft Data block, I dislike it because it ties clients to the SQL Client provider,

A few months ago I completed a project, with a MS consultant, using the DAB with an Oracle database. We did have a problem with the MS supplied Oracle provider but we were certainly not tied to SQL Server.

Truly was a great article. It is very good to know all the components that you are working with. And it is especially useful to have code such as database stuff in its own tier, which you can use in other programs as well.