Software Calamity

Tuesday, December 31, 2013

Stateless : A Best Practice That Isn't Absolute

One of the best practices to follow when designing RESTful APIs is to make the requests stateless.

In simple terms, it means that an API request should contain all the parameters it needs for the server to execute it without depending on the result of a previous request.

In theory, stateless requests brings the following benefits:

Scaling: Any server in a cluster can handle the request.

Simplified Caching: It's easier to cache the result of a request by using the explicit request state as a cache key.

Simplified Troubleshooting: When parts of the request the state aren't explicitly stated, additional investigation is required to figure it out.

Reduced Learning Curve: Each API request is self contained. To call the required API, developers don't need to learn or use multiple APIs in sequence.

Following this best practice can be quite challenging since the software requirements sometimes conflict with it. For instance, any API that requires authentication usually relies on one API request to authenticate the user in order to supply valid credentials to subsequent requests.

Since this best practice isn't entirely "black" or "white", I found that defining different levels of statefulness helps when I need to develop new APIs or explain the concept.

Statefulness Levels

Permanent Data

Changing Data

User Specific Data

Session Specific Data

Random Behavior

Level 1: Permanent Data

The request passes explicit state data, the response will always be the same for a specific state.

Permanent stateless resources: A data request done on a read-only data warehouse that stores historical information will never change.

Permanent resources with a state: A resource may be delivered in different languages, the language selection is explicitly in the request so that the following requests will yield a potentially unique result. Example:

http://server.com/product/1234&language=french

http://server.com/product/1234&language=english

Caching those responses is trivial.

Level 2: Changing Data

The request passes explicit state data, the response may change over time, even if the state combinations are the same.

Examples:

Files: A picture stored at: http://server.com/picture.bmp , that picture might change over time and might eventually be taken offline.

Dynamic Data: A business object stored at : http://server.com/product/1234 , that specific object might change over time and might eventually be deleted.

Caching those responses is relatively easy. The challenge lies in defining caching rules to ensure that entries get invalidated in a reasonable time frame.

Level 3: User Specific Data

The response may look different based on the user performing the request.

Access Control: Different users might get a different result for the request.Example: Given http://server.com/product/1234, if some users don't have access to it, they will receive a "403 Forbidden" response while the other users with access will receive the information on the object.

Hidden Data: The response data is based on information hidden from a user. Example: The API http://server.com/publicity is used to retrieve the publicity you see on a page, the content of this publicity usually depends on information associated with your user account. This information isn't available to the client performing the request.

Personal URL: A URL for which the response is bound to be different between users. Example: The URL http://server.com/my-cart. The response data will change based on the user performing the request.

Note: this can be avoided if the URL could explicitly state the cart identifier in the request such as : http://server.com/cart/5678. This way, API developers can deal with more than one cart. Users might also be able to share their cart with other users.

User Preferences : An API that leverages user preferences stored on the server. Example: Users store their preferred language on the server. When different users perform the request http://server.com/product/1234 they each get a different result based on their preferred language.

Note: This can be avoided by tracking the state on the client side. The client application can read the user preferences and then use them explicitly in every request:

Load the current user preferences at http://server.com/user/4567/preferences, returns language=french

Reuse the loaded preference in the request http://server.com/product/1234&language=french

Caching those responses is more difficult. On the server side, extra steps are required to determine the correct cache key to use. Edge caching can be impossible since the edge servers won't have all the data they need to return the right response.

Level 4: Session Specific Data

The response data may look different based on the user session performing the request.

Session Preferences: The server stores user preferences based on their session. Example: A user logs in twice to the same server using two different applications. He configures one application to be French and another in English. The user then performs the request http://server.com/product/1234 on both application and gets a different result.

Note: this can be avoided by making the client application track the selected language state.

Caching those responses is as difficult as those of level 3.

Level 5: Unpredictable/Random Behavior

The response may look different, even if executed with the same parameters with the same user in the same session.

Random Behavior: The behavior isn't predictable.Example: An API that returns a random number.

Lower is better

With those levels, the goal is to reach the lowest possible level to gain the benefits of statelessness while satisfying the application requirements.

Sunday, September 23, 2012

MVC (or as I prefer : MCV ) proved to be a very useful design pattern for compiled applications. Using this pattern, a developer could create a tool to visualize how the application is working internally. Each layer would have a dedicated part presenting the current state of each layer.

What about dynamic schemas ?

A problem with this pattern starts to emerge once application logic can be changed while the application is running.

Suddenly, three layers aren't enough. A schema layer needs to be inserted somewhere in the pattern if we want to visualize parts such as the model definition or the rules defined in a controller. The role of the schema layer is to contain the definition of object structures and the logic that govern them. In other words; the class and the method definitions.

While functional, this layer representation has a few dependency problems : It implies that the layers above the schema layer can depend on all the content in the schema. For instance, the model can depend on controller or view definitions.

The solution to all software problems : Add more layers !

To clean up dependency cycles, a better solution is to split the schema in a dedicated layer for the model, the controller and the view.

The only problem left is the terrible acronym for the pattern : SMSCSV.

This view allows multiple users or developers to collaborate on a running application to monitor and apply change on any layer, demonstrating another interesting side effect that dynamic schemas introduce on traditional software design patterns.

Thursday, July 12, 2012

Smelly MVC

There are multiple approaches to the MVC design pattern, but few of them ever felt right to me.

One of my main gripes with the pattern described on wikipedia is the dependency cycle between the Controller layer and the View layer. Specifically:

A controller can send commands to its associated view.

A view must communicate to the controller to activate commands.

Dependency cycles are usually a design smell. While something seems wrong, the answer isn't always obvious.

Another problem arises on the controller layer where the code is usually a mix of commands used to manipulate the model and commands to manipulate the view. I can't help but think that those commands don't belong in the same layer.

The V and C switcharoo

The solution I apply to my designs is to switch the View and Controller layer order, leading to Model Controllers Views or the MCV acronym.

The extra 's' aren't there by mistake, but I'll get to it in a moment.

To make it work, I apply a few extra restrictions.

The Controller layer contains no "View" concepts or commands. It is now responsible to modify the model consistently and also to apply security to any changes being done.

The View layers become a bit more thick. They now contain the required logic to build the visual components and interact with their associated control layer.

A new twist comes from the Controller layers that now generates change events consumed by the View layers. This is necessary for model changes that might trigger controller changes. It also adds the benefit that a developer could modify a control layer while the application is running, and the view would keep reacting appropriately.

A RESTful of shmuck

The constraints imposed by REST usually make most developers queasy. How the hell can we make APIs without context ? How can we accomplish every possible change using only PUT, POST and DELETE?

When it comes to monitoring the controller layer, the context restriction hits straight in the teeth. Most controllers need their context to know which user is executing a command. So how do you offer a REST API to monitor a control layer?

Each user gets their instance of a control layer. Just like every view on a system gets a different view layer instance.

This means a RESTful API can be built to fetch each control layer independently. An administrator or a developer could query the control layer of a specific user to ensure that everything is behaving as expected.

Yo dawg, I heard you like MCV...

...so we put a MCV layer in your MCV layer.

If we have a RESTful API on top of our model, our controllers and our views, what stops us from wrapping all of those things together as a single model? Slap development control layers and development view layers on top of it all and use them to monitor, modify and debug all the layers of a running application?

I applied this in my pet project : The Deduced Framework. It turned out to be quite convenient; allowing the framework to offer a development environment where everything could be changed while the application is running. Whether it's a view, a controller or a model; everything is laid out in plain view. Developing an entire application without ever stopping it to compile is quite a peasant feeling.

It feels like flirting with the future of development tools, where the idea of stopping an application to compile will be as obsolete as manually managing memory allocation or keeping track of the registries on a CPU.

Friday, July 6, 2012

Building Blocks

The first time I learned the concept of object oriented, I had an epiphany; I envisioned building software the same way I assembled Lego blocks as a kid. Data and logic blocks would interconnect seamlessly to form applications quickly and efficiently. I would write a piece of code once and I'd be able to reuse it in almost any context. Gone would be the days of duplicated code.

Back to Reality

Reality turned out to be a tad more complex.

C++ allowed multiple inheritance in multiple ways using the optional virtual inheritance. This lead to the creation of new problems and their corresponding solutions.

To simplify things, some languages enforced limitations on how object inheritance would work. Java enforced a single inheritance hierarchy, using interfaces to expose multiple functional aspects.

Data protection concepts emerged to prevent the outside world from modifying internal data. Getters and Setters emerged as best practice to encapsulate data.

Extending existing objects ranges from very easy to impossible depending on whether or not the author of an object anticipated the need to override a specific aspect of his design.

My Ideal World

In an ideal world, I'd like to be able to assemble a class definition by pulling multiple aspects together to form a class definition.

Small data aspects to identify object fields that are common.

Small logic aspects that would pull data together in a reusable fashion.

Experiments

When I started experimenting with the Deduced Framework, one of my goals was to create an easy way to create data structure. Multiple inheritance was a key piece of functionality that allowed users to define those structures and assemble them.

When I added deduction rules and actions to the schema, it soon became apparent that logic could also be encapsulated in small aspects, provided that a few simple rules were followed.

Object inheritance had to occur in a predictable fashion.
For instance, if a type inherited from 3 other types, the order in which each overrides applies had to be well defined.

Object inheritance had to remain simple.In C++ terms, all inheritance would be virtual. The idea of non-virtual inheritance where you might get more than one instance of a specific field usually adds more confusion than value.

Data and Logic had to be encapsulated in small unitsIf too many pieces of data or logic are contained in a single type, reusing parts of this type becomes an all or nothing type of deal. You either get all the data and logic or you get none. The idea of separating data from logic in different types also brings values as not all the users of a type will want to reuse the same logic.

Applying those principles became quite handy while defining the data structure used to represent parts of a user interface.

Defining the Data

For instance, when defining the concept of a visual tree, I defined the "Tree Item" type using four data aspects:

Named Collection : To apply a name to my tree item.

Component Container : To store the visual component used in the tree item.

Defining the Logic

One of my first requirement in defining the logic of my "Tree Item" was to build it based on a configuration. The type "Configured Collection" indicates that aspects of the current object are derived from a configuration. I therefore defined 4 new logic types to configure each data aspect.

Configured Name : To apply a name to my tree item based on a configuration.

Configured Component Container : To build the component within the tree node based on a configuration.

Configured Style : To apply a visual style to the tree item based on a configuration.

Configured Tree Item List Container : To create the list of child tree nodes based on a configuration.

Resulting object hierarchy

Benefits

One might argue that a hierarchy composed of 11 different types is complex. However, I believe that complexity in software is usually driven by what we want to accomplish with it. I'd rather accomplish a complex task by aggregating 11 simple concepts together than by creating a few types where the same 11 concepts are regrouped together.

Reusability is also greatly enhanced. When I need to create a new "Tree Item" type where only parts of the fields are driven from a configuration, I can compose a new type by inheriting only the configuration aspects I need.

Best of all, I feel like I'm building software by linking blocks together once again.

Thursday, April 19, 2012

Less code to write, less code to test

I don't think anyone needs to be convinced that writing less code to reach the similar levels of functionality is beneficial.

Fewer lines of code usually results in few tests to run and faster development time.

In this post, I'll explain how much impact the Deduced Framework had on my code size.

The Observer Pattern

I've always been a fan of the observer pattern. It consists of model objects that notify listeners when they change.

For instance, if a model is displayed in multiple widgets in a user interface, then each widget would listen to the model and update themselves when a change happens.

Besides ensuring that the application remains consistent, it also encourages good software layering practices like Model View Controller (MVC) where a view could listen to a model without causing the model to depend on the view's implementation.

Simple Use Case : Adding Two Values

Another usage of the observer pattern is to manipulate data based on changing inputs.

For instance, if an application has a model with the following 3 fields:

Money in my Wallet

Money in my Bank Account

Total Money Available

We could then leverage the observer pattern to listen to the wallet and bank account fields.

Things work well for simple scenarios, but they get complicated when required inputs need to be fetched throughout an object tree.

The code then need to manage listeners across a network of object and react to every change that occurs.

For instance, if the model wants to listen to the money value stored in a list of bank accounts, it would need to:

Add a listener to the list when it's set.

Remove the listener on the old list if it gets changed.

Add a listener to objects added to the list

Remove listeners on objects removed from the list

Spreadsheets do this without breaking a sweat...

Now compare the same feature implementation on a spreadsheet.

The only formula required here is :

B3=B2+B1

In one line, the formula defines it's inputs, listen to them and executes itself when required.

Compared to the ~7 lines of code written above, it's easy to see how much time and code is saved.

Yet, for all their benefits, spreadsheets have their drawbacks. Every time you need a new instance of a formula, it needs to be copied. It's an obvious violation of the Don't Repeat Yourself (DRY) principle. If it turns out there is a defect in the formula, the fix needs to be applied everywhere it was copied.

Deduction Rules

The Deduced Framework attempts to bring the efficiency of spreadsheet formulas into the realm of object oriented development using Deduction Rules.

They consist of the following parts:

a list of inputs

a single output

the java code to execute

In the example above, a deduction rule could have the following values:

walletInput : a link to the wallet property

bankInput : a link to the bank property

totalOutput : a link to the total property

Code :
return walletInput + bankInput;

Once defined, the rule will automatically listen to it's inputs, execute the code when they change and update the output value.

We are now back to 1 line of code. On par with spreadsheets. Yet we still have the benefits of object oriented programming where we can create multiple instances of the object without creating multiple copies of the rule definition.

A Bigger Example : Building a Tree

The first time I implemented a visualization tree, I did it with the observer pattern. The code maintained a network of listeners across the model it wanted to represent. It reacted to all possible events that might have an impact on the view. It took about 2000 lines of code to accomplish it's function.

Once deduction rules were introduced, I used them to create a new implementation of the visualization tree. Rules were used to create child tree nodes, set their display text and set their icon. It took only 50 lines of code to accomplish the same function.

That's 40 times less code !

The number of defects I found in the code was proportional to the number of lines. The code is also much easier to learn and extend.

If you're curious to learn more about deduction rules, I encourage you to read on the Deduced Framework, post a comment below or send me an e-mail at d-duffATusersDOTsourceforgeDOTnet.