Background

Liskov Substitution Principle and subtyping

Simply put, the Liskov Substitution Principle says that if we have a bottle for liquids, we can pour water, milk, cola or acid into it without expecting the bottle to explode.

The formal definition of the Liskov Substitution Principle (LSP) states:

If S is a subtype of T, then objects of type T may be replaced with objects of type S without altering any of the desirable properties of the program.

The definition of subtyping itself sounds very similar:

If S is a subtype of T, then any term of type S can be safely used in a context where a term of type T is expected.

The key difference between the two lies in the words desirable and safely. The Liskov Substitution Principle should be considered more restrictive than subtyping. It is all about type safety and the spectrum from weak typing to strong typing, where LSP sits at the strict end.

Liskov Substitution Principle and type safety

Type safety is like security control at an airport. The guards will not let through a person who is potentially dangerous. It doesn't mean that the person really intends to destroy something, but according to the rules there is a considerable risk. With strong type safety and the Liskov Substitution Principle, the rules are very restrictive.

Programming languages differ in how they enforce type safety. In practice, the stronger the typing, the more compile errors while writing code; the weaker the typing, the more runtime exceptions during program execution.

In certain circumstances, programmers can follow programming principles to enhance type safety in their applications beyond what the language's own rules enforce.

Violating Liskov Substitution Principle in OOP

C# supports the object-oriented programming style. OOP, in turn, supports inheritance and polymorphism. This means that we can declare a MyArray class that derives from MyCollection, and the compiler will accept a MyArray wherever a MyCollection is expected.

It seems desirable to be able to use MyArray in place of MyCollection, since they both have a Count property. Notice, however, that the Count of MyArray never changes, since arrays have a fixed number of elements. That is still valid: subtypes may provide stronger guarantees.

The trouble starts once MyCollection also provides an Add method. That is how we violate the Liskov Substitution Principle. The code still compiles, but since we can treat a MyArray as a MyCollection, we are also able to add a new item to a fixed-size array, and that is incorrect. This is the issue that arises when violating the Liskov Substitution Principle. In such cases we have to throw a runtime exception.
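The situation described above can be sketched as follows. This is a minimal reconstruction, not the original listing: the class names MyCollection and MyArray come from the text, while the Items field, the Demo class and the choice of NotSupportedException are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public class MyCollection
{
    // Backing storage (hypothetical; the original listing is not available).
    protected readonly List<int> Items = new List<int>();

    public virtual int Count { get { return Items.Count; } }

    // The base class promises that any item can be added.
    public virtual void Add(int item) { Items.Add(item); }
}

public class MyArray : MyCollection
{
    public MyArray(params int[] items) { Items.AddRange(items); }

    // A fixed-size array cannot honor the base-class promise,
    // so all it can do is throw (or silently ignore the call).
    public override void Add(int item)
    {
        throw new NotSupportedException("MyArray has a fixed size.");
    }
}

public static class Demo
{
    // Written against the base type; it cannot know that Add may fail.
    public static void AppendZero(MyCollection collection)
    {
        collection.Add(0);
    }
}
```

Calling Demo.AppendZero(new MyCollection()) works as expected, whereas Demo.AppendZero(new MyArray(1, 2, 3)) compiles just as well but fails at run time.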

An interesting fact is that a .NET array also violates LSP, since it implements ICollection<T>:

The Add method is not visible as an array member because it is implemented explicitly. But we can always cast an array to ICollection<T> and operate on the instance through the interface.

We could also violate the Liskov Substitution Principle in a different way, by tightening the constraints on method parameters in a derived class. Here is an example. Suppose that there are these further types:
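A short sketch of that behavior, using only the standard int[] and ICollection<int> types (the ArrayAsCollection class and AddThrows method are hypothetical names for the example):

```csharp
using System;
using System.Collections.Generic;

public static class ArrayAsCollection
{
    // Returns true if calling Add through the interface throws.
    public static bool AddThrows()
    {
        int[] numbers = { 1, 2, 3 };

        // numbers.Add(4);  // does not compile: Add is not a visible array member

        // Viewing the array through the interface exposes Add.
        ICollection<int> collection = numbers;
        try
        {
            collection.Add(4);
            return false;
        }
        catch (NotSupportedException)
        {
            // The array cannot grow, so the call is rejected at run time.
            return true;
        }
    }
}
```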

MyAlmostPositiveInts - allows adding only integers that are greater than or equal to -5;

MyAlmostNegativeInts - allows adding only integers that are less than or equal to 5;

MyOverlappingInts - allows adding only integers that are between -10 and 10 inclusive;

For these specific integer collections, the first step is the same. We have to disallow adding a new item at the base class level, because the base class places no restriction on the range of the integer being added. If the user receives a reference to MyCollection, he assumes that he may insert any integer, and that is not always true (e.g. a MyAlmostPositiveInts could be assigned to a parameter of type MyCollection). Thus the specific integer collections should also derive from MyCollectionBase, which contains only Count.

If we consider the current situation, we will notice that we have to check the actual type behind a MyCollectionBase reference each time we want to add an item. In practice that would be very inconvenient. There are also still runtime ArgumentException occurrences. However, they do not result from violating the Liskov Substitution Principle. They are there because the compiler doesn't know how to read the if statements in our Add methods, not because we tightened the limitations in a derived class. We will deal with that issue later.

How can we get access to the Add method in a safe way, without explicit casting? To achieve that, the integer collections must derive from the same base class or implement the same interface that contains Add. Unfortunately, that was already tried and turned out to be unsafe because it violates the Liskov Substitution Principle. We are still able to extract a common base class to a certain extent, though. Notice that the ranges of the integer collections intersect. We can make use of that.
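A rough sketch of this intermediate state, under stated assumptions: the type names come from the text, but the Items field, the Demo.AddTo helper and the exception messages are hypothetical, and only one of the restricted collections is shown.

```csharp
using System;
using System.Collections.Generic;

// The common base exposes only Count, so it makes no promise about Add.
public abstract class MyCollectionBase
{
    protected readonly List<int> Items = new List<int>();
    public int Count { get { return Items.Count; } }
}

public class MyCollection : MyCollectionBase
{
    public void Add(int item) { Items.Add(item); }
}

public class MyAlmostPositiveInts : MyCollectionBase
{
    public void Add(int item)
    {
        // The compiler cannot reason about this check, hence the runtime exception.
        if (item < -5) throw new ArgumentException("Item must be >= -5.", "item");
        Items.Add(item);
    }
}

public static class Demo
{
    // Add is not available on the base type, so the caller has to
    // inspect the actual type every time, which is the inconvenience described above.
    public static void AddTo(MyCollectionBase target, int item)
    {
        MyCollection any = target as MyCollection;
        if (any != null) { any.Add(item); return; }

        MyAlmostPositiveInts almostPositive = target as MyAlmostPositiveInts;
        if (almostPositive != null) { almostPositive.Add(item); return; }

        throw new NotSupportedException("Unknown collection type.");
    }
}
```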

Each level of inheritance relaxes the restrictions on the item being added. That is allowed; what we must not do is tighten the rules. Regrettably, the compiler does not follow our guidelines, and it still lets us insert any integer into our MySafeIntsBase collection. The good point is that our inheritance hierarchy and class names indicate our intentions.

MySafeIntsBase defines the safe range as <-5;5> and thus can be used in places where we expect an instance of MyAlmostPositiveInts <-5; +∞), MyAlmostNegativeInts (-∞; 5> or MyOverlappingInts <-10;10>. This gives the developer a clue that only integers valid for all of those types should be inserted there.

MyPositiveSafeIntsBase defines the safe range as <-5;10> and thus can be used in places where we expect an instance of MyAlmostPositiveInts or MyOverlappingInts.

MyNegativeSafeIntsBase defines the safe range as <-10;5> and thus can be used in places where we expect an instance of MyAlmostNegativeInts or MyOverlappingInts.

Notice that, theoretically, we could derive MyOverlappingInts from both MyPositiveSafeIntsBase and MyNegativeSafeIntsBase, since the union of the sets <-10;5> and <-5;10> gives <-10;10>. We can build a less restrictive type from more restrictive ones. But multiple inheritance is not allowed in C#. We could achieve the goal by using interfaces, but I will combine that step with the next subsection, which discusses Code Contracts.
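The hierarchy described above could be sketched like this. It is only an illustration under assumptions: the class names and ranges come from the text, while the Min and Max properties, the Items field and the decision to derive MyOverlappingInts from MyPositiveSafeIntsBase (rather than MyNegativeSafeIntsBase) are choices made for the example.

```csharp
using System;
using System.Collections.Generic;

// Safe range <-5;5>: valid for every collection below.
public class MySafeIntsBase
{
    protected readonly List<int> Items = new List<int>();
    public int Count { get { return Items.Count; } }

    // Inclusive bounds; derived classes may only widen them.
    protected virtual int Min { get { return -5; } }
    protected virtual int Max { get { return 5; } }

    public void Add(int item)
    {
        if (item < Min || item > Max)
            throw new ArgumentException("Item is outside the allowed range.", "item");
        Items.Add(item);
    }
}

// Safe range <-5;10>.
public class MyPositiveSafeIntsBase : MySafeIntsBase
{
    protected override int Max { get { return 10; } }
}

// Safe range <-10;5>.
public class MyNegativeSafeIntsBase : MySafeIntsBase
{
    protected override int Min { get { return -10; } }
}

// <-5; +inf)
public class MyAlmostPositiveInts : MyPositiveSafeIntsBase
{
    protected override int Max { get { return int.MaxValue; } }
}

// (-inf; 5>
public class MyAlmostNegativeInts : MyNegativeSafeIntsBase
{
    protected override int Min { get { return int.MinValue; } }
}

// <-10;10>; only one base class is possible without multiple inheritance.
public class MyOverlappingInts : MyPositiveSafeIntsBase
{
    protected override int Min { get { return -10; } }
}
```

Code that holds a MySafeIntsBase reference and stays within <-5;5> is safe whichever concrete collection it was given. The compiler still accepts a call such as Add(100) through that reference, so the range is a convention expressed by the type names, not something that is enforced at compile time.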

Code Contracts

Since .NET Framework 4.0 there has been additional support for program safety called Code Contracts. In particular, they add static analysis on top of what the compiler checks. In our case, they can show a compilation warning if we try to put an invalid integer value into a MySafeIntsBase collection. I will not explain how to configure and use Code Contracts, but a solution built on them is highly secure and almost rules out violating the Liskov Substitution Principle (almost, because Code Contracts generate compilation warnings, not errors).
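As a minimal sketch of the idea, and not the author's full solution, a precondition on the most restrictive base type might look like the code below. Contract.Requires is part of System.Diagnostics.Contracts; the items field is an assumption, and how the complete solution relaxes the range in derived types is not reproduced here.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Contracts;

public class MySafeIntsBase
{
    private readonly List<int> items = new List<int>();

    public int Count { get { return items.Count; } }

    public void Add(int item)
    {
        // Precondition that the Code Contracts static checker can read.
        // Without the Code Contracts tooling (CONTRACTS_FULL not defined),
        // this call is compiled away and nothing is checked.
        Contract.Requires(item >= -5 && item <= 5);
        items.Add(item);
    }
}
```

With the static checker enabled, a call such as Add(7) on this type is the kind of code that would be reported as a warning.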

Summary

Writing bulletproof code always requires a lot of effort. The larger the project, the more type safety should be applied to keep everything easy to maintain. In this article I have tried to explain what the Liskov Substitution Principle is and how it can be violated by an inappropriate inheritance hierarchy, and to show a possible solution.

Comments and Discussions

...I don't like the example with MyArray and MyCollection because it includes a design flaw in your class hierarchy in the first place. An array should not be a subtype of a collection; it has to be the other way round, because a collection is a more specialized kind of array, namely an array with add/remove/clear capabilities, or a mutable array if you will, and not vice versa. Therefore, it is a bad example for explaining an LSP violation when the actual error lives in the design of your class hierarchy. If MyCollection were a subtype of MyArray, you wouldn't have to deal with an LSP violation at all.

The fact that a .NET array implements ICollection (and even IList!) is just bad design as well. If they had been smart in designing their collection interfaces, they would have made the Count property a member of IEnumerable, or even better: of an interface that lives in between IEnumerable and ICollection and serves that purpose, something like IReadOnlyCollection.

An interesting point of view. But I've never heard of collections being specialized versions of arrays. Could you please provide a reference to the source from which you've taken such a definition? As for me, according to http://dictionary.reference.com/, a collection is a number of things collected or assembled together. They are considered unordered and their count can be undefined. They are associated solely because they are gathered together in one place. On the other hand, the main property of an array is that it is ordered and can therefore be indexed. It means that an array is an ordered collection, and that means that an array can be treated as a subtype of a collection.

However, those are only naming conventions. Ideally they should match the class definition. But ultimately, a class is defined by its members, and even if there are some discrepancies between the linguistic meaning of its name and the class's actual implementation, the implementation takes precedence. Sometimes programmers have to use worse names merely because better names are already reserved elsewhere.

I can agree with you that it is difficult to understand some design decisions of the .NET designers. They could have explained them in the documentation.

If we define a collection as having add/remove/clear methods and we constrain an array to be of fixed length (as is generally accepted), then we cannot establish a hierarchy between them at all!
Option 1: An array as a subtype of a collection. It fails because a collection has mutable size owing to its add/remove/clear methods, whereas an array has fixed size. So we could have an instance of an array assigned to a pointer to a collection, and we could use that collection pointer to change the size of its underlying array.
Option 2: A collection as a subclass of an array. It fails even more, because now we have a pointer to a collection and we can manipulate its size even though its underlying base class is of an array type.

You're right and I agree with you that a collection is an unordered set of items while an array has indexing capabilities, and that makes it hard (if not impossible) to find a least common denominator in the class hierarchy, but that's when interfaces come into play and I think the ones defined in the .NET Framework, namely IEnumerable, ICollection, and IList, provide an abstraction that is too simple and not well thought through. Let me explain this in a little more detail:

First of all, like I said, there should be an interface that lives in-between IEnumerable and ICollection, which defines members such as the Count property, as well as methods like Contains() which are neutral to mutability - I call it IVector for the sake of this example.

Then there is the indexing problem. There should be an interface especially defined for this purpose, something like IIndexed, for example, which defines the Indexer as well as methods that are common to indexing, like IndexOf().

If you only had these two additional interfaces separating out the specialized functionality, you could make an array implement IVector (which derives from IEnumerable) and IIndexed, a collection only ICollection (which derives from IVector), and a list IList (which derives from ICollection and implements IIndexed). This would circumvent the LSP violation quite elegantly when you look at your objects via interfaces, because if you see it as an array, it can be indexed but is immutable; if you see it as a collection, it is mutable but not indexed; and from a list's point of view, it is indexed and mutable.
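The interface layout proposed in this comment might look roughly like the sketch below. IVector and IIndexed are the commenter's names; IMyCollection, IMyList and FixedArray are hypothetical names chosen to avoid clashing with the real .NET interfaces, and the get-only indexer is an assumption based on the array being described as immutable.

```csharp
using System;
using System.Collections.Generic;

// Members that are neutral to mutability.
public interface IVector<T> : IEnumerable<T>
{
    int Count { get; }
    bool Contains(T item);
}

// Members related to indexing only.
public interface IIndexed<T>
{
    T this[int index] { get; }
    int IndexOf(T item);
}

// Mutable, but not indexed.
public interface IMyCollection<T> : IVector<T>
{
    void Add(T item);
    bool Remove(T item);
    void Clear();
}

// Mutable and indexed.
public interface IMyList<T> : IMyCollection<T>, IIndexed<T>
{
}

// An array-like type: indexed, fixed size, and no Add to misuse.
public sealed class FixedArray<T> : IVector<T>, IIndexed<T>
{
    private readonly T[] items;

    public FixedArray(params T[] items) { this.items = (T[])items.Clone(); }

    public int Count { get { return items.Length; } }
    public T this[int index] { get { return items[index]; } }
    public int IndexOf(T item) { return Array.IndexOf(items, item); }
    public bool Contains(T item) { return IndexOf(item) >= 0; }

    public IEnumerator<T> GetEnumerator()
    {
        return ((IEnumerable<T>)items).GetEnumerator();
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

Because FixedArray<T> does not implement IMyCollection<T>, there is no interface through which a caller could reach an Add method on it.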

Maybe my argument that a collection is a kind of specialized array was wrong, because I was looking at it from the perspective I've just laid out above (which is really just my thinking and point of view, and therefore I cannot give you any references). It might be more accurate to say that a collection is a different kind of array that only shares some common functionality that is neutral to mutability and indexing.

So, maybe we cannot establish an object hierarchy and prevent LSP violation, but we can circumvent it using interfaces.

My problem with this, and with all of SOLID, though, is that it needlessly involves a lot of brain power. Yes, you can derive a square from a rectangle, or you can choose not to. Just do one or the other and stop worrying about it. Usually as developers we have enough to think about just solving our specific problem and don't have time to consider the academic merits of whether a is strictly a subtype of b.

That's just my opinion though, and although not my cup-of-tea you've done a good job explaining this.

As a pragmatic programmer I would never provide an Add method on a collection which is based on a fixed-size array, so I am never in danger of violating LSP. On the other hand, if I implement the Add method so that it creates a new, larger array, copies the old one and sets the new value as the last element of the array (like the String class does), I also don't violate LSP. The example of the square and rectangle you gave in the thread with FatCatProgrammer is much more suitable for explaining the LSP.

The purpose of Add here is to modify the underlying collection state and that violates Liskov Substitution Principle when MyArray derives from MyCollection.

You're right that we could implement a resizing capability in the collection by defining the Add method as immutable: MyCollection Add instead of void Add. It will not violate the Liskov Substitution Principle if we override such a method in MyArray. Eric Lippert gives an example of how to implement an immutable stack.
However, that is not the case for an implementation of void Add. The standard signature of Add is void Add, and in most cases that method is expected to alter the state of the underlying object.

String doesn't implement ICollection<char>. It implements only IEnumerable<char>, so it doesn't contain a void Add method. Instead, it implements a Concat method whose semantics are different. Concat takes two collections and returns a new one containing the items of both. You were probably thinking of that method.

I've used examples of collections for the purpose of the further considerations about intersections.
I stand by my view that my example is a good one.

It's good that different people have different opinions. From the academic view you are right, and beginners may fall into the trap. From the pragmatic view I would never write such code, because the collections provided in the .NET Framework will mostly satisfy my needs.

Please show me where it is written that throwing an exception is defined as a violation of LSP, and I will correct that.

So far, I have checked the text and found only a statement that violating the Liskov Substitution Principle weakens type safety and allows passing to a method an argument that is valid for the base class but invalid for the derived class. For that reason, we have to check whether the given argument violates the requirements that have been narrowed down in the derived class. If it does, we have to throw an exception. LSP in this case is about creating an inheritance hierarchy that doesn't tighten requirements in derived classes. A safely created class hierarchy only allows requirements to be loosened.
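To illustrate the difference, here is a small sketch of the non-mutating variant. It assumes an array-backed implementation; the Items field and the Appended helper are hypothetical, and only the MyCollection Add signature comes from the discussion.

```csharp
using System;

public class MyCollection
{
    protected readonly int[] Items;

    public MyCollection(params int[] items) { Items = (int[])items.Clone(); }

    public int Count { get { return Items.Length; } }

    // Builds a new array holding the existing items plus one more.
    protected int[] Appended(int item)
    {
        int[] copy = new int[Items.Length + 1];
        Items.CopyTo(copy, 0);
        copy[Items.Length] = item;
        return copy;
    }

    // Returns a new collection instead of changing this one.
    public virtual MyCollection Add(int item)
    {
        return new MyCollection(Appended(item));
    }
}

public class MyArray : MyCollection
{
    public MyArray(params int[] items) : base(items) { }

    // The fixed-size instance is never modified, so the override
    // keeps the base-class contract.
    public override MyCollection Add(int item)
    {
        return new MyArray(Appended(item));
    }
}
```

Here new MyArray(1, 2).Add(3) yields a second MyArray with three items, while the original instance still has two.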

"That is how we have violated Liskov Substitution Principle. The code still compiles, but since we are able to cast down MyArray into MyCollection, we are also capable of adding a new item into a fixed sized array and that is incorrect. So here is an issue that arises when violating Liskov Substitution Principle. In such cases we have to throw a runtime exception."

Am I mistaken? We are not violating the principle, we are in fact using it; it just happens that in this case we are throwing an exception, which by the way is perfectly fine. Exceptions are useful things.

Consider the following:
We have a Square and a Rectangle. A Square is a Rectangle, so we could think that deriving a Square from a Rectangle is correct. On the other hand, a Rectangle is less restrictive in terms of its edge lengths than a Square. A Square requires that its Height has the same length as its Width. However, as long as we keep the setters of those two private, we can use a Square instance as a substitute for a Rectangle instance and nothing bad can happen. It means that in a scenario where the Width and Height setters are private, deriving a Square from a Rectangle is valid.
Things change when we make the Width and Height setters public. Now someone could cast an instance of Square to Rectangle and try to modify its Width without updating its Height. The type is not safe anymore. We can't safely use a Square in place of a Rectangle if we allow their edge lengths to be changed. In such a scenario, if we derive a Square from a Rectangle, we violate the Liskov Substitution Principle.

Deriving MyArray from MyCollection when MyCollection provides an Add method is a violation of the Liskov Substitution Principle, because arrays have a fixed size and do not allow adding new items. We can't safely substitute an array for a collection instance in all places. This is the reason. How we handle an invocation of the Add method in MyArray doesn't matter. We can throw an exception or just ignore the call, but it is still semantically a fault.

If we see that a virtual method makes no sense in a derived class, or that it can't handle the whole range of arguments that are valid in its base implementation, it is a sign that our type hierarchy should be refactored. The only argument for leaving things as they are is the word desirable in the definition of the Liskov Substitution Principle. If the changes require much effort, or the probability of the type being misused is low, or there are other important reasons, we are allowed to violate the Liskov Substitution Principle and just throw an exception in case of inappropriate use.
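The public-setter scenario can be sketched as follows. The constructors, the IsStillSquare property and the Demo.Widen method are assumptions added for the example; only Square, Rectangle, Width and Height come from the discussion.

```csharp
public class Rectangle
{
    public Rectangle(int width, int height)
    {
        Width = width;
        Height = height;
    }

    // Public setters: any caller may change one edge independently.
    public int Width { get; set; }
    public int Height { get; set; }
}

public class Square : Rectangle
{
    public Square(int side) : base(side, side) { }

    // The invariant a Square is supposed to keep.
    public bool IsStillSquare { get { return Width == Height; } }
}

public static class Demo
{
    // Perfectly valid for a Rectangle; knows nothing about Square.
    public static void Widen(Rectangle rectangle)
    {
        rectangle.Width = rectangle.Width * 2;
    }
}
```

Passing a new Square(3) to Demo.Widen compiles without complaint, yet afterwards the object has Width 6 and Height 3, so it is no longer a square.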