Pages

Wednesday, April 20, 2011

Your coding conventions are hurting you

If you take a walk in my hometown, and head for the center, you may stumble into a building like this:

From a distance, it's a relatively pompous edifice, but as you get closer, you realize that the columns, the ornaments, everything is fake, carefully painted to look like the real thing. Even the shadows have been painted to deceive the eye. Here is another picture from the same neighborhood: from this angle, it's easier to see that everything has just been painted on a flat surface:

Curiously enough, as I wander through the architecture and code of many systems and libraries, I sometimes get the same feeling. From a distance, everything is object oriented, extra-cool, modern-flexible-etc, but as you get closer, you realize it's just a thin veneer over procedural thinking (and don't even get me started about being "modern").

Objects, as they were meant to be
Unlike other disciplines, software development shows little interest in its classics. Most people are more attracted to recent works. Who cares about some 20-year-old paper when you can play with Node.js? Still, if you have never taken the time to read The Early History of Smalltalk, I'd urge you to. Besides the chronicles of some of the most interesting times in hw/sw design ever, you'll get little gems like this: "The basic principal of recursive design is to make the parts have the same power as the whole. For the first time I thought of the whole as the entire computer and wondered why anyone would want to divide it up into weaker things called data structures and procedures. Why not divide it up into little computers, as time sharing was starting to? But not in dozens. Why not thousands of them, each simulating a useful structure?"

That's brilliant. It's a vision of objects like little virtual machines, offering specialized services. Objects were meant to be smart. Hide data, expose behavior. It's more than that: Alan is very explicit about the idea of methods as goals, something you want to happen, unconcerned about how it is going to happen.

Now, I'd be very tempted to write: "unfortunately, most so-called object-oriented code is not written that way", but I don't have to :-), because I can just quote Alan, from the same paper: "The last thing you wanted any programmer to do is mess with internal state even if presented figuratively. Instead, the objects should be presented as sites of higher level behaviors more appropriate for use as dynamic components. [...] It is unfortunate that much of what is called “object-oriented programming” today is simply old style programming with fancier constructs. Many programs are loaded with “assignment-style” operations now done by more expensive attached procedures."

That was 1993. Things haven't changed much since then, if not for the worse :-). Lots of programmers have learned the mechanics of objects and forgotten (or ignored) the underlying concepts. As you get closer to their code, you'll see procedural thinking oozing out. In many cases, you can see that just by looking at class names.

Names are a thinking device
Software development is about discovering and encoding knowledge. Now, humans have relatively few ways to encode knowledge: a fundamental strategy is to name things and concepts. Coming up with a good name is hard, yet programming requires us to devise names for:
- components
- namespaces / packages
- classes
- members (data and functions)
- parameters
- local variables
- etc
People are basically lazy, and in the end the compiler/interpreter doesn't care about our beautiful names, so why bother? Because finding good names is a journey of discovery. The names we choose shape the dictionary we use to talk and think about our software. If we can't find a good name, we obviously don't know enough about either the problem domain or the solution domain. Our code (or our model) is telling us something is wrong. Perhaps the metaphor we chose is not properly aligned with the problem we're trying to solve. Perhaps there are just a few misleading abstractions, leading us astray. Still, we'd better listen, because we are doing it wrong.

As usual, balance is key, and focus is a necessity, because the clock is ticking as we're thinking. I would suggest that you focus on class/interface names first. If you can't find a proper name for the class, try naming functions. Look at those functions. What is keeping them together? What can you apply them to? That's the class name :-). Can't find it? Are you sure those functions belong together? Are you thinking in concepts or just slapping an implementation together? What is that code really doing? Think method-as-a-goal, class-as-a-virtual-machine. Sometimes, I've found it useful to think about the opposite concept and move from there.

Fake OO names and harmful conventions
That's rather bread-and-butter, yet it's way more difficult than it seems, so people often tend to give up, think procedurally, and use procedural names as well. Unfortunately, procedural names have been institutionalized in patterns, libraries and coding conventions, therefore turning them into a major issue.

In practice, a few widely used conventions can seriously stifle object thinking:
- the -er suffix
- the -able suffix
- the -Object suffix
- the I- prefix

Of these, the I- prefix might seem the most harmless in theory, except that in practice, it's not :-). Tons of ink have been spilled on the -er suffix, so I'll cover that part quickly and move to the rest, with a few examples from widely used libraries.

Manager, Helper, Handler...
Good ol' Peter Coad used to say: Challenge any class name that ends in "-er" (e.g. Manager or Controller). If it has no parts, change the name of the class to what each object is managing. If it has parts, put as much work in the parts that the parts know enough to do themselves (that was the "er-er Principle"). That's central to object thinking, because when you need a Manager, it's often a sign that the Managed are just plain old data structures, and that the Manager is the smart procedure doing the real work.
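As a minimal sketch of the er-er Principle in Java (all class and method names here are hypothetical, invented for illustration, not taken from Coad's book):

```java
import java.util.List;
import java.util.ArrayList;

// Before: a Manager doing all the work on a dumb data structure.
class AccountData {
    double balance;
    double rate;
}

class AccountManager {
    void applyInterest(AccountData a) { a.balance += a.balance * a.rate; }
}

// After: the "managed" object knows enough to do the work itself,
// so the Manager (and the exposed fields) disappear.
class Account {
    private double balance;
    private final double rate;

    Account(double balance, double rate) {
        this.balance = balance;
        this.rate = rate;
    }

    // the method is a goal ("accrue interest"), not a procedure poking at state
    void accrueInterest() { balance += balance * rate; }

    double balance() { return balance; }
}
```

The point is not the arithmetic, of course, but where the knowledge lives: once the behavior moves into Account, nothing outside needs access to its state.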

When Peter wrote that (1993), the idea of a Helper class was mostly unheard of. But as more people jumped on the OOP bandwagon, they started creating larger and larger, uncohesive classes. The proper OO thing to do, of course, is to find the right cooperating, cohesive concepts. The lazy, fake-OO thing to do is to take a bunch of methods, move them outside the overblown class X, and group them in XHelper. While doing so, you often have to weaken encapsulation in some way, because XHelper needs privileged access to X. Ouch. That's just painting an OO picture of classes over old-style coding. Sadly enough, in a Wikipedia article that I won't honor with a link, you'll read that "Helper Class is one of the basic programming techniques in object-oriented programming". My hope for humanity is only restored by the fact that the article is an orphan :-).
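Here is a tiny, hypothetical sketch of that encapsulation leak (all names are made up for illustration): the Helper needs access to internals that a cohesive class would have kept private.

```java
import java.util.List;
import java.util.ArrayList;

// The lazy route: Report grows fat, so methods are dumped into ReportHelper,
// which then needs privileged access to Report's internals.
class Report {
    final List<String> lines = new ArrayList<>(); // exposed for the helper: encapsulation weakened
}

class ReportHelper {
    static int totalLength(Report r) {
        int n = 0;
        for (String s : r.lines) n += s.length();
        return n;
    }
}

// The OO route: find the cohesive concept that owns both the data and the behavior.
class Body {
    private final List<String> lines = new ArrayList<>();

    void add(String line) { lines.add(line); }

    int totalLength() { return lines.stream().mapToInt(String::length).sum(); }
}
```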

I'm not going to say much about Controller, because it's so popular today (MVC rulez :-) that it would take forever to clean this mess. Sad sad sad.

Handler, again, is an obvious resurrection of procedural thinking. What is a handler if not a damn procedure? Why does something need to be "handled" in the first place? Oh, I know, you're thinking of events, but even in that case, EventTarget, or even plain Target, is a much better abstraction than EventHandler.
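A minimal Java sketch of that shift in perspective, with hypothetical names: the event becomes a goal addressed to a target object, not a procedure that "handles" something.

```java
// The event is sent to a target; the target reacts as part of its own behavior.
interface KeyTarget {
    void keyPressed(char key); // a goal on the target, not a "handler" procedure
}

class TextBox implements KeyTarget {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void keyPressed(char key) { text.append(key); }

    public String text() { return text.toString(); }
}
```

The difference looks cosmetic in ten lines, but the name steers the design: a Target accumulates responsibilities naturally, while a Handler tends to stay a bag of procedures.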

Something-able
Set your time machine to 1995, and witness the (imaginary) conversation between naïve OO Developer #1, who for some reason has been assigned to design the library of the new wonderful-language-to-be, and naïve OO Developer #2, who's pairing with him along the way.
N1: So, I've got this Thread class, and it has a run() function that gets executed in that thread... you just have to override run()...
N2: So to execute a function in a new thread you have to extend Thread? That's bad design! Remember we can only extend one class!
N1: Right, so I'll use the Strategy Pattern here... move the run() to the strategy, and execute the strategy in the new thread.
N2: That's cool... what do you wanna call the strategy interface?
N1: Let's see... there is only one method... run()... hmmm
N2: Let's call it Runnable then!
N1: Yes! Runnable it is!

And so it began (no, I'm not serious; I don't know how it all began with the -able suffix; I'm making this up). Still, at some point people thought it was fine to look at an interface (or a class), see if there was some kind of "main method" or "main responsibility" (which is kinda obvious if you only have one), and name the class after that. Which is a very simple way to avoid thinking, but it's hardly a good idea. It's like calling a nail "Hammerable", because you know, that's what you do with a nail, you hammer it. It encourages procedural thinking and leads to ineffective abstractions.

N1: So, I've got this Thread class, and it has a run() function that gets executed in that thread... you just have to override run()...
OT: So to execute a function in a new thread you have to extend Thread? That's bad design! Remember we can only extend one class!
N1: Right, so I'll use the Strategy Pattern here... move the run() to the strategy, and execute the strategy in the new thread.
OT: OK, so what is the real abstraction behind the strategy? Don't just think about the mechanics of the pattern, think of what it really represents...
N1: Hmm, it's something that can be run...
OT: Or executed, or performed independently...
N1: Yes
OT: Like an Activity, with an Execute method, what do you think? [was our OT aware of the UML 0.8 draft? I still have a paper version :-)]
N1: But I don't see the relationship with a thread...
OT: And rightly so! Thread depends on Activity, but Activity is independent of Thread. It just represents something that can be executed. I may even have an ActivitySequence and it would just execute them all, sequentially. We could even add concepts like entering/exiting the Activity...
(rest of the conversation elided – it would point toward a better timeline :-)
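The alternate timeline can be sketched in Java. Activity, the execute method, and ActivitySequence come from the dialogue above; the rest of the plumbing is my own assumption:

```java
import java.util.List;
import java.util.ArrayList;

// Activity represents something that can be executed; it knows nothing about threads.
interface Activity {
    void execute();
}

// An ActivitySequence is itself an Activity: it executes its parts sequentially.
class ActivitySequence implements Activity {
    private final List<Activity> steps = new ArrayList<>();

    ActivitySequence add(Activity a) { steps.add(a); return this; }

    @Override
    public void execute() {
        for (Activity a : steps) a.execute();
    }
}

// Thread depends on Activity, not the other way around (a hypothetical utility).
class ActivityThread {
    static Thread start(Activity a) {
        Thread t = new Thread(a::execute);
        t.start();
        return t;
    }
}
```

Note how the dependency arrow matches OT's point: an Activity can be composed, sequenced, or run inline without ever mentioning threads.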

Admittedly, some interfaces are hard to name. That's usually a sign that we don't really know what we're doing, and we're just coding our way out of a problem. Still, some others (like Runnable) are improperly named just because of bad habits and conventions. Watch out.

Something-Object
This is similar to the above: when you don't know how to name something, pick some dominant trait and add Object to the end. Again, the problem is that the "dominant trait" is moving us away from the concept of an object as a virtual machine, and toward the object as a procedure. In other cases, Object is dropped in just to avoid more careful thinking about the underlying concept.

Just like the -able suffix, sometimes it's easy to fix, sometimes it's not. Let's try something non-trivial, where the real concept gets obscured by adopting a bad naming convention. There are quite a few cases in the .NET framework, so I'll pick MarshalByRefObject.

If you don't use .NET, here is what the documentation has to say: "Enables access to objects across application domain boundaries in applications that support remoting".
What?? Well, if you go down to the Remarks section, you get a better explanation: "Objects that do not inherit from MarshalByRefObject are implicitly marshal by value. […] The first time [...] a remote application domain accesses a MarshalByRefObject, a proxy is passed to the remote application".
Whoa, that's a little better, except that it should be "are marshaled by value", not "are marshal by value"; but then, again, the name should be MarshaledByRefObject, not MarshalByRefObject. Well, all your base are belong to us :-)

Now, we could just drop the Object part, fix the grammar, and call it MarshaledByReference, which is readable enough. A reasonable alternative could be MarshaledByProxy (Vs. MarshaledByCopy, which would be the default).
Still, we're talking more about implementation than about concepts. It's not that I want my object marshaled by proxy; actually, I don't care about marshaling at all. What I want is to keep a single object identity across appdomains, whereas with copy I would end up with distinct objects. So, a proper one-sentence definition could be:

Preserve object identity when passed between appdomains [by being proxied instead of copied]
or: Guarantees that methods invoked in remote appdomains are served in the original appdomain

Because if you pass such an instance across appdomains, any method call would be served by the original object, in the original appdomain. Hmm, guess what, we already have similar concepts. For instance, we have objects that, after being created in a thread, must have their methods executed only inside that thread. A process can be set to run only on one CPU/core. We can configure a load balancer so that if you land on a specific server first, you'll stay on that server for the rest of your session. We call that concept affinity.

So, a MarshalByRefObject is something with appdomain affinity. The marshaling thing is just an implementation detail that makes that happen. AppDomainAffine, therefore, would be a more appropriate name. Unusual perhaps, but that's because of the common drift toward the mechanics of things and away from concepts (because the mechanics are usually much easier for techies to get). And yes, it takes more clarity of thought to come up with the notion of appdomain affinity than just slapping Object at the end of an implementation detail. However, clarity of thought is exactly what I would expect from framework designers. While we're at it, I could also add that AppDomainAffine should be an attribute, not a concrete class without methods or fields (!) like MarshalByRefObject. Perhaps I'm asking too much from those guys.

ISomething
I think this convention was somewhat concocted during the early COM days, when Hungarian was widely adopted inside Microsoft, and having ugly names was therefore the norm. Somehow, it was the only convention that survived the general clean up from COM to .NET. Pity :-).

Now, the problem is not that you have to type an I in front of things. It's not even that it makes names harder to read. And yes, I'll even concede that after a while you'll find it useful, because it's easy to spot an interface just by looking at its name (which, of course, is relevant information when your platform doesn't allow multiple inheritance). The problem is that it's too easy to fall into the trap, take a concrete class name, put an I in front of it, and lo and behold!, you've got an interface name. Sort of like calling a concept IDollar instead of Currency.

Case in point: say that you are looking for an abstraction of a container. It's not just a container, it's a special container. What makes it special is that you can access items by index (a more limited container would only allow sequential access). Well, here is the (imaginary :-) conversation between naïve OO Developer #1, who for unknown reasons has been assigned to the design of the base library for the newfangled language of the largest software vendor on the planet, and himself, because even naïve OO Developer #2 would have made things better:

N1: So I have this class, it's called List... it's pretty cool, because I just made it a generic, it's List<T> now!
N1: Hmm, the other container classes all derive from an interface... with wonderful names like IEnumerable (back to that in a moment). I need an interface for my list too! How do I call it?
N1: IListable is too long (thanks, really :-)))). What about IList? That's cool!
N1: Let me add an XML comment so that we can generate the help file... "IList<T> Interface: Represents a collection of objects that can be individually accessed by index."

So, say that you have another class, let's call it Array. Perhaps a SparseArray too. They both can be accessed by index. So Array IS-A IList, right? C'mon.

Replay the conversation, drop in our Object Thinker:

N1: So I have this class, it's called List... it's pretty cool, because I just made it a generic, it's List<T> now!
N1: Hmm, the other container classes all derive from an interface... with wonderful names like IEnumerable. I need an interface for my list too! How do I call it?
OT: What's special about List? What does it really add to the concept of enumeration? (I'll keep the illusion that "enumeration" is the right name so that N1's mind won't blow away)
N1: Well, the fundamental idea is that you can access items by index... or ask about the index of an element... or remove the element at a given index...
OT: so instead of sequential access you now have random access, as it's usually called in computer science?
N1: Yes...
OT: How about we call it RandomAccessContainer? It reads pretty well, like: a List IS-A RandomAccessContainer, an Array IS-A RandomAccessContainer, etc.
N1: Cool... except... can we put an I in front of it?
OT: Over my dead body.... hmm I mean, OK, but you know, in computer science, a List is usually thought of as a sequential access container, not a random access container. So I'll settle for the I prefix if you change List to something else.
N1: yeah, it used to be called ArrayList in the non-generic version...
OT: kiddo, do you think there was a reason for that?
N1: Oh... (spark of light)
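The conclusion of that conversation could look like this in Java (the names, apart from RandomAccessContainer itself, are my assumptions for the sketch):

```java
import java.util.ArrayList;

// The abstraction: a container whose items can be accessed by index.
interface RandomAccessContainer<T> {
    T itemAt(int index);
    int size();
}

// A growable sequence (the artist formerly known as List) IS-A RandomAccessContainer.
class GrowableSequence<T> implements RandomAccessContainer<T> {
    private final ArrayList<T> items = new ArrayList<>();

    void add(T item) { items.add(item); }

    @Override
    public T itemAt(int index) { return items.get(index); }

    @Override
    public int size() { return items.size(); }
}

// A fixed array IS-A RandomAccessContainer too; the sentence reads naturally.
class FixedArray<T> implements RandomAccessContainer<T> {
    private final T[] items;

    FixedArray(T[] items) { this.items = items; }

    @Override
    public T itemAt(int index) { return items[index]; }

    @Override
    public int size() { return items.length; }
}
```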

Yes, give me the worst of both worlds!
Of course, given enough time, people will combine those two brilliant ideas, turn off their brain entirely, and create wonderful names like IEnumerable. Most .NET collections implement IEnumerable, or its generic descendant IEnumerable<T>: when a class implements IEnumerable, its instances can be used inside a foreach statement.

Indeed, IEnumerable is a perfect example of how bad naming habits thwart object thinking and, more in general, abstraction. Here is what the official documentation says about IEnumerable<T>: "Exposes the enumerator, which supports a simple iteration over a collection of a specified type."
So, if you subscribe to the idea that the main responsibility gives the -able part, and that the I prefix is mandatory, it's pretty obvious that IEnumerable is the name to choose.

Except that's just wrong. The right abstraction, the one that is completely hidden under the IEnumerable/IEnumerator pair, is the sequential access collection, or even better, a Sequence (a Sequence is more abstract than a Collection, as a Sequence can be calculated and not stored, see yield, continuations, etc). Think about what makes more sense when you read it out loud:

A List is an IEnumerable (what??)
A List is a Sequence (well, all right!)
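A quick Java sketch of the point that a Sequence can be calculated rather than stored (Sequence is the name proposed above; the rest is illustrative plumbing of my own):

```java
import java.util.Iterator;

// A Sequence: something that can be traversed; it need not be stored anywhere.
interface Sequence<T> extends Iterable<T> {}

// A computed sequence: the squares 0, 1, 4, 9, ... produced on the fly.
class Squares implements Sequence<Integer> {
    private final int count;

    Squares(int count) { this.count = count; }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int i = 0;

            @Override
            public boolean hasNext() { return i < count; }

            @Override
            public Integer next() { int v = i * i; i++; return v; }
        };
    }
}
```

Read aloud, "Squares is a Sequence" works; "Squares is an IEnumerable" would not.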

Now, a Sequence (IEnumerable) in .NET is traversed ("enumerated") through an iterator-like class called an IEnumerator ("iterator" as in the iterator pattern, not like that thing they called iterator in .NET). Simple exercise: what is a better name than IEnumerator for something that can only move forward over a Sequence?

Is that all you got?
No, that's not enough! Given a little more time, someone is bound to come up with the worst of all worlds! What about an interface with:
- an I prefix
- an -able suffix
- an Object somewhere in between

That's a challenging task, and strictly speaking, they failed it: they had to put -able in between and Object at the end. But it's a pretty amazing name, fresh from the .NET Framework 4.0: IValidatableObject (I wouldn't be surprised to discover, inside their code, an IValidatableObjectManager to, you know, manage :-> those damn stupid validatable objects; that would really close the circle :-).

We can read the documentation for some hilarious time: "IValidatableObject Interface - Provides a way for an object to be invalidated."

Yes! That's what my objects really want! To be invalidated! C'mon :-)). I'll spare you the imaginary conversation and go straight to the point. Objects don't want to be invalidated. That's procedural thinking: "validation". Objects may be subject to constraints. Yahoo! Constraints. What about a Constraint class (to replace the Validation/Validator stuff, which is so damn procedural). What about a Constrained (or, if you can't help it, IConstrained) interface, to replace IValidatableObject?
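A minimal sketch of the Constraint/Constrained idea in Java; the two names come from the text above, while every member name and the example class are my assumptions, not an actual API:

```java
import java.util.List;

// A constraint an object may be subject to.
interface Constraint<T> {
    boolean holdsFor(T subject);
    String description();
}

// Something subject to constraints; usable anywhere, not just in a UI layer.
interface Constrained<T> {
    List<Constraint<T>> constraints();
}

// A hypothetical domain class declaring its own constraints.
class Order implements Constrained<Order> {
    final int quantity;

    Order(int quantity) { this.quantity = quantity; }

    @Override
    public List<Constraint<Order>> constraints() {
        return List.of(new Constraint<Order>() {
            @Override
            public boolean holdsFor(Order o) { return o.quantity > 0; }

            @Override
            public String description() { return "quantity must be positive"; }
        });
    }
}
```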

Microsoft (and everyone else, for that matter): what about having a few more object thinkers on your class library teams? Oh, while we're at it, why don't you guys consider that we may want to check constraints in any given place, not just in the front end? Why not move all the constraint checking away from the UI or Service layers and make the whole thing available everywhere? It's pretty simple, trust me :-).

Bonus exercise: once you have Constraint and IConstrained, you need to check all those constraints when some event happens (like receiving a message on your service layer). Come up with a better name than ConstraintChecker, that is, something that does not end in -er.

And miles to go...
There would be so much more to say about programming practices that hinder object thinking. Properties, for instance, are usually misused to expose internal state ("presented figuratively" or not), instead of being just zero-th methods (hence, goals!). Maybe I'll cover that in another post.

An interesting question, for which I don't really have a good answer, is: could some coding convention promote object thinking? Saying "don't use the -er suffix" is not quite the same as saying "do this and that". Are your conventions moving you toward object thinking?

59 comments:

Anonymous said...

OOP is fine as long as you don't allocate objects (see Google's recommendations) and don't activate method code when retrieving values (kills performance). Also, avoid dependencies on classes of the same or higher level, or your code will corrupt. Oh, and keep things simple: the KISS principle.

Exercises apart, thanks for the post. I haven't managed to read most of the 'classics' yet; I really need to improve my time management :S But let me ask the one-billion-dollar question... Is there even one piece of software that can claim to be object oriented (obviously I'm talking about something with public evidence)? Even in Design Patterns, many of them are called something-or (mediator, iterator, decorator, visitor) and resemble functional programming with 'fancier constructs'. Many areas of computer science have a solid background theory based on functional decomposition (I'm thinking about compiler construction, for instance, where Parser, Lexer and ASTBuilder are the main concepts). Sadly, it seems that a lot of OO gurus have taken this path :(

Fulvio: I'll start with the simple stuff. Compiler construction is often approached with a half-hearted OO approach, where a recursive descent parser is built using inheritance, but after that, it's all procedural programming. However, it's quite simple to fix, basically like Peter Coad said: remove the -er, look for the "managed" object and put the responsibilities there. Lexer -> LexicalGrammar, Parser -> Syntax, etc.

About the million dollar question, I'm afraid I don't know any. Maybe I'm spending too little time reading code :-).

About the exercises: ForwardStride sort of loses part of the responsibility (providing the current value). I'd suggest that you try drawing a picture of the sequence and of that "thing" you are trying to name. Not a UML diagram, but a visual representation of the concept. See if you get a reasonably precise picture and if that reminds you of something :-)

The ConstraintChecker requires an overall design; you can't just try a name without assigning responsibilities. Hint: what is triggering a "checker"? What is the difference between a checker and a Constrained?

Mmm, well, about IEnumerator, the only image I can think of is the head (IEnumerator) reading a magnetic tape (the sequence).

For ConstraintChecker, I have to think about it a little more, but some hints could be useful...

What is triggering a "checker"? I believe a change of state inside the object. (Am I right?)

What is the difference between a checker and a Constrained? The constrained is an object subject to constraints; the checker (from a procedural point of view) has the responsibility to validate constraints.

Previous Anonymous: cursor, or the slightly more precise/verbose ForwardCursor, is indeed a reasonable option (it was also my first idea when I thought of alternative names for IEnumerator). Some would say that -or is not much better than -er (in some languages there is in fact no difference) but I still like Cursor more.

Fulvio: this may seem too much "Dead Poets Society" :-)), like "stand on your desk to see things in a different way", but I actually think that the (traditional) choice of "head", "iterator", "cursor", etc. comes from focusing too much on the thing that moves, and too little on the underlying value.

Consider this picture: it seems like most people think of (a) when presented with the concept, focusing too much on that little thing on top, and their choice of names is heavily influenced from that.

What if you draw it like (b)? Does that suggest any alternatives to you?

Since we're playing with this little problem, what if I change the picture (and therefore the concept) to (c)? Does it remind you of something? It's a known structure in computer science, used (with different names) in a few algorithms but also in operating systems (also at the kernel API level). A good name for (c) may suggest a good name for (b), which after all is a special case of (c).

(of course, it's much easier to turn off our brain and call it an IEnumerator, or an iterator, despite the fact that by itself it is not enumerating or iterating at all; curiously enough, I've lately discovered that the Wikipedia page on iterators includes a similar criticism of the name :-)

About the constraint checker: this is not the kind of exercise that you can solve with an educated guess :-). You actually have to design that thing to really understand the forcefield and come up with good names. Still: why isn't a Constrained object checking its own constraints? Why is a checker needed at all? (Note that I've subtly, but intentionally, moved from IConstrained to Constrained, because an interface can't really do stuff :-)

Mmm, I discarded cursor 'cause it's really similar to iterator (well, I knew the solution had to be harder :)

But even after your picture, I can't figure out how to name something that provides access to an element in a sequence and can move forward.

Honestly, I can't find a name even for (c) (it reminds me of "slice", but it's not the answer, I know; maybe I'm so tired my brain is off anyway). For now I give up... but I will try again in a more relaxed moment.

I'm not confident I'll ever find out the names, so if the author gives the solution I'd really appreciate it :D As a side thought, I was thinking about your statement "an interface can't really do stuff". So many times I see projects that separate interfaces from an abstract base class implementation that in the end is required as a base for each interface implementor: something like having a Figure interface and then an AbstractFigure abstract class that factors out the common base implementation, with concrete figures derived from AbstractFigure. What do you think about that?

Note: I'm not claiming this is the "perfect" name. It's a reasonable name that is not overly focused on the procedural aspect of moving.

That said: a very common name for (c) in my previous picture is "moving window" or "sliding window" because it represents a window over a larger data set / sequence. Now, the difference between (b) and (c) is that the window is restricted to a single value. So, SlidingValue seems like a sensible name.

As an aside, a SlidingValue class would suggest a different interface. It's ugly to ask a SlidingValue for its Value :-), it should BE a Value.
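Here is one way the SlidingValue idea might look in Java. The method names are my assumptions, and since Java has no implicit conversions, the "it should BE a value" part can only be approximated with an accessor:

```java
import java.util.Iterator;

// A value that slides along a sequence: a window restricted to a single element.
class SlidingValue<T> {
    private final Iterator<T> source;
    private T current;
    private boolean exhausted;

    SlidingValue(Iterable<T> sequence) {
        source = sequence.iterator();
        slide(); // position on the first value, if any
    }

    // move the window one step forward
    void slide() {
        if (source.hasNext()) current = source.next();
        else exhausted = true;
    }

    boolean exhausted() { return exhausted; }

    // ideally the object would BE the value; Java forces an accessor
    T get() { return current; }
}
```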

About your question: sometimes, that's the natural consequence of single implementation inheritance. The (mild) pressure to add mixins or traits (not C++ traits) to SII languages comes directly from this kind of discomfort. Having an interface in place is also handy for testing, as you may not want to inherit from the abstract class in that case. Sometimes, however, the interface-abstract-concrete layering is just the consequence of blindly following "rules" and putting interfaces everywhere, without a real need. Context, as always, is the key :-)

Sliding Window... well, after you get it, it seems obvious :S!! Just as a marginal note... maybe SlidingItem could be good if we don't wanna change the interface; furthermore, it can remind us that we are visiting a collection :)

If you were to rename Lexer to LexicalGrammar or Parser to Syntax, wouldn't one assume that the single purpose of these classes is to hold the definitions of the lexical grammar and the syntax? That is, as opposed to actually being classes that do something. Or would that be a result of procedural thinking as well?

Honestly, I can't seem to find the issue with names like Parser or Controller. Maybe I just have a wrong concept of what Object Oriented Programming actually entails. If so, I would honestly like to be educated on the subject.

Maybe I don't think Procedural Programming is Very Evil, as some OOP-preachers seem to say, and hence am not disgusted by names like IEnumerator.

"Wouldn't one assume that the single purpose of these classes is to hold the definitions of the lexical grammar and the syntax?"
No, unless one thinks of a class as a data structure. The behavior / responsibility of a class is exposed through meaningful methods like Check, BuildExpressionTree, etc. for a Syntax class.
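For instance, a toy LexicalGrammar in Java, where the grammar itself carries the tokenizing behavior rather than being a passive data bag (the regex-based implementation and all names beyond LexicalGrammar are purely illustrative assumptions):

```java
import java.util.List;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The grammar is the object; tokenizing is one of its goals.
class LexicalGrammar {
    private final Pattern tokenPattern;

    LexicalGrammar(String tokenRegex) {
        tokenPattern = Pattern.compile(tokenRegex);
    }

    // break the input into the tokens this grammar recognizes
    List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = tokenPattern.matcher(input);
        while (m.find()) tokens.add(m.group());
        return tokens;
    }
}
```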

The issue with Controller is well known: you easily end up with a godlike-class controlling a lot of stupid classes. That's why so many people preach about having a skinny controller and a fat object model: because most people tend to do exactly the opposite (you never have to preach something people are already keen on doing). Unfortunately, all the logic you put in a controller is usually not reusable (controllers are normally highly customized).

I began this post with a quote from Alan Kay, one of the founding fathers of OOP. I think that whoever is honestly interested in appreciating the nuances of object orientation should spend some time pondering early works like that. I certainly did, both because of age :-) and because I don't subscribe to the notion that being "modern" means rejecting the past :-).

As an aside: I failed to mention that I don't consider procedural programming intrinsically "wrong" or "evil". I expect some properties from procedural code, and different properties from OO code. For instance, procedural code is often structurally simpler compared to OO code, has fewer delocalized plans, etc. What I consider wrong is claiming that something is OO when it is, in fact, procedural. Fake OO code gives me the worst of both worlds. That's evil :-).

I would quibble about your grammar complaints--there's a long tradition and a consistent system underlying names like "MarshalByRefObject". ("We marshal these objects by reference." becomes "This is the marshal-by-reference style of marshaling.", which becomes "These are marshal-by-reference objects." The original is probably the call-by-value/call-by-reference distinction.)

Other than that, great article. Anything to avoid another AbstractKeywordRemovalInitiatorFactory....

Yeap, I guess I'd have to redesign JSF from the ground up :-) to fix that. Of course, one could start by wondering why we can't just rename EditableValueHolder into "input" and start from there...

In fact, the JSF is a perfect example of "bad abstraction" (in the sense of my "cut the red wire" post) where concepts have been abstracted to the point where they bear little resemblance with the problem domain. It's not impossible to fix, but a complete overhaul would be needed...

Just a couple of days ago I was watching a video by Kevlin Henney on infoQ (http://www.infoq.com/presentations/It-Is-Possible-to-Do-OOP-in-Java, from 39:00) where he covers your point about finding good names.

Reading your article made me think of domain-driven design and "Services" in particular. From my understanding, services are typically stateless and inherently procedural in nature.

Would you agree that these should not follow your naming rules, and that even when working within the OO paradigm there are parts of the system that are best modelled procedurally, by using services, for example?

An in-depth discussion of DDD and some of its choices would take at least an entire (long) post, so I'll try to avoid theory and go straight to the practice. That is, I'll take a couple of examples from the DDD book and tinker with them. (The answer turned out so long anyway that I had to split it into two comments :-))

Example 1 (simple): "For example, if an application needs to send an e-mail, some message-sending interface can be located in the infrastructure layer and the application layer elements can request the transmission of the message. This decoupling gives some extra versatility. The message-sending interface might be connected to an e-mail sender, a fax sender, or whatever else is available." (page 73; unfortunately it's not on Google Books)

So one may proceed with an EmailSender or FaxSender service. Sorry, that sucks :-). There is a nice OO abstraction there, waiting to be found. It's called MailBox, or OutBox if we're only concerned with sending. Now, one could easily contend that I'm just fiddling with words here, but I'm not. The reason I want OO concepts to emerge is that it's way more natural to add responsibilities to an OO concept like the MailBox than to the EmailSender. Because in practice, you'll also want to configure the mailbox. You'll want to know its current state (queued messages, failures, whatever). You may want to specify a callback function to be called when some message has been delivered. Etc. etc.
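To make the difference concrete, here is a minimal sketch of what a MailBox might look like; all names and signatures here are mine, invented for illustration, not taken from the book:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Illustrative sketch: a MailBox object with behavior and state, as opposed
// to a stateless EmailSender service. All names are made up for this example.
class Message {
    final String to, body;
    Message(String to, String body) { this.to = to; this.body = body; }
}

class MailBox {
    private final Queue<Message> outgoing = new ArrayDeque<>();
    private Consumer<Message> onDelivered = m -> {};

    void post(Message m) { outgoing.add(m); }       // queue a message for delivery
    int queuedCount() { return outgoing.size(); }   // inspect current state
    void whenDelivered(Consumer<Message> cb) { onDelivered = cb; }

    // In a real system this would talk to an e-mail or fax transport;
    // here we just drain the queue and fire the callback.
    void flush() {
        while (!outgoing.isEmpty()) onDelivered.accept(outgoing.poll());
    }
}
```

Note how state inspection and callbacks find a natural home on the MailBox, while bolting them onto an EmailSender would feel forced.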

Lest someone think that I'm making this up, let's move forward a few pages (181 and following; it's partially on Google Books http://books.google.it/books?id=7dlaMs0SECsC&lpg=PP1&dq=domain%20driven%20design&pg=PA181#v=onepage&q&f=false but most of it is missing).

An external Sales Management System is wrapped by a service, called AllocationChecker. The initial responsibility seems function-like: given a cargo, how much of this type of cargo may be booked? After a short discussion, however, the allocation checker takes on two responsibilities:

- deriveEnterpriseSegment(Cargo), which provides a way to translate concepts from the SMS domain into our own concepts
- mayAccept(Cargo, Quantity)

Now, are we seriously going to ask an AllocationChecker to deriveEnterpriseSegment? It just doesn't fit. Moreover, a few pages later the AllocationChecker potentially becomes stateful, so that it can cache the enterprise segments. Besides, the service is crying out to be an object: it's taking a Cargo on every call (feature envy).

We have [at least] two ways to fix this. The simplest is to identify a decent domain concept, like SalesCategory (since cargoes are grouped by a category known to the SMS). So our code would look like this (using Java and short variable names):
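A sketch of what such a class might look like, keeping the book's method names; the actual SMS query is stubbed out here, and the caching mirrors the statefulness discussed above:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, keeping the book's method names; the real SMS query
// is stubbed out, and enterprise segments are cached per cargo.
class EnterpriseSegment { }
class Cargo { }

class SalesCategory {
    private final Map<Cargo, EnterpriseSegment> segments = new HashMap<>();

    // Translate an SMS concept into our own domain vocabulary (cached).
    EnterpriseSegment deriveEnterpriseSegment(Cargo c) {
        return segments.computeIfAbsent(c, k -> new EnterpriseSegment());
    }

    // Given a cargo, may this quantity be booked?
    boolean mayAccept(Cargo c, int quantity) {
        EnterpriseSegment s = deriveEnterpriseSegment(c);
        return s != null && quantity > 0;  // stand-in for the real SMS check
    }
}
```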

I've kept the book terminology and signatures. Having a class nicely scales to the most reasonable evolutions, like asking for how much could be actually booked when a cargo is rejected, etc.

The second way to fix it goes much deeper. Of course, the most "natural" place for those functions is the Cargo itself. We don't want to put them there because those responsibilities are not strictly part of "our" application domain: they're placed on a boundary with the sales management system. Well, that's a linguistic problem, isn't it? It's because we look at the problem through the rather narrow perspective of statically typed languages with closed classes. So a different solution is to allow objects to be supplemented when needed. This is possible (in limited ways) in Objective-C, in C# (with extension methods), and soon in Java 8 as well (http://blogs.oracle.com/briangoetz/resource/devoxx-lang-lib-vm-co-evol.pdf). Stretching the idea a little, we come to the concepts of subject-oriented programming, and ultimately to the realm of aspect-oriented programming.

Stepping on the philosophical soapbox for a moment: Alan Kay used to say that he expected Smalltalk to be just a step in an evolutionary chain, something other people would improve upon, but that didn't happen. I actually consider AOP+OOP the next step, providing much of the flexibility that we currently lack. I see no problem in having my cargo enhanced with more responsibilities (methods) when used within the context of the Booking Application.

A theme I'd like to discuss at some point: mainstream languages keep reinforcing the idea that in order to separate the artifacts (source files) we have to separate the run-time instances (objects). That's a byproduct of languages and technologies, and it's not inherent in the paradigm.
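Lacking open classes or extension methods, one way to approximate this idea in plain Java is a context-local wrapper that supplements Cargo only within the Booking application; a rough sketch, with all names hypothetical:

```java
// Rough sketch (all names hypothetical): the Booking context supplements
// Cargo with a responsibility that does not belong in the core domain class.
class Cargo {
    private final int size;
    Cargo(int size) { this.size = size; }
    int size() { return size; }
}

// A context-specific view: a Cargo "enhanced" for the Booking application.
class BookableCargo {
    private final Cargo cargo;
    BookableCargo(Cargo cargo) { this.cargo = cargo; }

    // Behavior added only within this context.
    boolean fitsIn(int remainingCapacity) { return cargo.size() <= remainingCapacity; }
}
```

With open classes, extension methods, or aspects, the wrapper disappears and the method attaches to Cargo itself within the Booking context, which is the point being made above.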

Trying to answer your question now: I use services in some of my architectures, meaning the SOA notion of services. The interface of those services (which are usually called from remote systems) is not OO, because I'm doing services, not distributed objects. The implementation of those services, however, is usually OO, not procedural. Even the infrastructure of those services is OO, as there is much to be gained from that. So, in a word, no, I do not agree, sorry :-).

Once I rewrote a project (yes, usually a bad idea) and realized that OOP creates a lot of superfluous code. I used a PersonRole pattern, for example, and it turned into a nightmare over time: lots of creator objects, wrappers, etc. for different customer or user types, ugly direct dependencies to each Role scattered all over the system. I rewrote the whole thing using a hub-and-spoke design with events/messages, and separated (immutable) data and behavior into "services". That resulted in less code, more flexibility, better maintenance, less coupling, easier evolution. No object hierarchies or association nightmares, just pretty old event handlers. Events, btw, make a stable API, as they are solution agnostic. Now a change_in_the_problem_domain_occured and independent parts of the system save it, send an email, delete or check other stuff, and so on... I guess even though object thinking might better fit the real world, it doesn't mean it's easier or more elegant when it comes to computer solutions. Sorry if this is a little bit random, but I have to go to bed now ;P

Green: unfortunately, without a chance to see your code, I guess we'll never know.

For an example of a controller-free OO design which I do not believe is creating "a lot of superfluous code", see my recent post "life without a controller, episode 1" (and episode 2).

The troubles you experienced seem to indicate that the wrong decomposition was chosen in the OO approach. A domain-driven decomposition is often natural, but not necessarily best. That has little to do with OO. "pretty old event handlers", for instance, fit the OO paradigm very well, without resorting to switch/cases or [function] pointer tables etc.

Why don't you explain your problem, design and code somewhere, so we can all learn something?

Thank you for some good examples of names that could use improving. I worry, though, when people frame the issue as "your coding convention is hurting you", rather than "your coding convention gives you a chance to learn something", because I don't think people need to abandon their naming conventions, but rather expand their view.

At the risk of looking like I'm providing link bait -- I promise this reference relates -- compare with my summary at http://link.jbrains.ca/nP9Fvk

About my choice of words: interestingly, the original title was different. I usually tend to use more neutral tones. Still, that day I decided to try a different style, as a communication / marketing experiment I would say.

In the end, it worked. While I’m sure I’ve written better things, this post spread more rapidly, got more clicks and more tweets than any other.

That said, I tend to use the concept of backtalk quite often. My “Listen to your tools and materials” (http://eptacom.net/pubblicazioni/pub_eng/ListenToYourToolsAndMaterials.pdf, appeared in IEEE Software back in 2006) is an exploration of backtalk in software development.

You’re also right to say that in some cases, it is much better to expand our views (for instance, by adopting a multi-paradigmatic approach) than to ask for change.

However, I said some cases, because there are exceptions. If you see someone driving in the wrong lane, you can join him, start a lengthy conversation, persuade him to consider the cultural issues of driving into another country, and die while trying, or you can tell him to change lane, fast. Your choice, of course :-).

Within the context of object-oriented development, adopting the –er, I-something, something-able conventions as a starting point is very much like driving in the wrong lane. It’s fine if you change country (paradigm). It’s not if you want to drive here. So, yeah, sorry, I think in some cases you actually have to abandon some naming conventions (which, by virtue of being conventions, are implicitly suggesting that it’s ok to do things that way).

I’ve seen that picture of yours before – it’s very nice; I like the intellectual process one has to go through to build it. In practice, however, you’re describing a process, but it’s not very clear if it is what you consider a natural process, something that you see happening in the wild, or a recommended process, something that you propose people should go through as they design software.

Honestly, I haven’t seen that process in the wild. It requires a unique combination of OO-naïveté (so that one starts very far from the optimal), persistence (so that one goes through all those steps), professionalism (so that one actually cares doing all that) and intellectual sophistication (so that one can actually appreciate the transformational process itself). In the immortal words of Dilbert’s boss, “we want them brilliant, but clueless”. It’s a rare combination around here (your mileage may vary, of course :-).

In practice, there is a serious risk that people will stop at the structurally accurate / vague stage, especially when patterns, libraries, and conventions seem to suggest that it’s ok to stay there.

If you meant that as a recommended process, I would say that it is an interesting proposal to get out of sub-optimal saddles, ponder on how you got there and get some ideas on how to get out and where to go. It’s not the way I tend to reason, and not the way I see many effective designers think, but I certainly appreciate the thinking behind it, and I have no doubts some people may want to work that way.

Still, I don’t really see it as antithetical to suggesting avoidance strategies from the very beginning (like: stay away from –er, I-, -able, -object within the OO paradigm).

Emphasis on improvement as a continuous process does not need to go against emphasis on starting with the right foot (or in the right lane :-).

maxidr: the original work on subject-oriented programming (1993) can still be found here: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.4805&rep=rep1&type=pdf

The idea of SOP was to emphasize the notion of an object as something with an identity, but with a behavior "composed" from the different views expected by different callers (subjects).

There is, of course, some overlap with the later notions of aspect orientation. On a small scale, and within the context of a traditional OO language, you can also see some subject-orientation in the solution I've adopted in my post on living without stupid objects, by composing on the same identity (created on the data layer) the behavior (business logic) expected by another subject (the application layer) through inheritance + IOC. Of course, to fully support SOP, different notions are needed (like dynamic mixins). One can build quite a decent SOP system in javascript, given some motivation :-).

Honestly, I tried watching some conference videos on DCI some time ago, and didn't like it much. The emphasis on the MVC paradigm in the papers was depressing, but I tried anyway, out of respect for good ol' Cope. In the end I didn't like it, as it lacked the conceptual integrity, sharpness, etc. that I'm looking for in new paradigms / approaches.

That said, there is certainly an overlap between DCI and some applications of AOP (although they seemed to claim the opposite in that conference, that is, that AOP is a subset of DCI, to which I would seriously object), so yeah, in a sense, there is some overlapping with SOP as well. I would say that SOP and DCI look at [parts of] the same problem, but end up proposing different solutions.

Carlo, thanks for the very insightful post. I think I'm going to need to read it a few more times before it really sinks in. I'm an OO skeptic, probably because I still don't understand OO that well (even after many years working with OO languages). But I'm wondering how you feel about the following problems:

1. Serialization: How do you serialize an object for transmission in an OO way, taking into account that the same object can be the source of multiple representations?

2. One criticism that I heard Rich Hickey make towards OO is that you cannot use an object beyond what the original designers thought it could do. For example, say I have a class that represents a sequence of objects. In Ruby you can easily add a few useful methods to an object by extending the Enumerable (I know, I know) module. Then you can, say, use #map, #select, #reject, etc. However, a user of my sequence-like class will be unable to use random access, for example, because I didn't think to include it. How do we deal with this? (Related to "It's better to have 100 functions operate on a single abstraction than 10 functions operating on 10 abstractions", see http://stackoverflow.com/a/6160116/79103)

Before I get to your points, I'd like to add some perspective to this post, especially for those unfamiliar with my work and my general line of thinking.

I'm not an OO zealot. Actually, I believe in multi-paradigm programming. I had my good share of LISP back in the 80s, I've immersed myself in Prolog longer than most sensible people do :-), etc. So the purpose of this post is not to say "do it the OO way or die", but instead "if you do it OO, do it right".

I tend to liken paradigms, languages, and the structures we build with those languages to materials, with a set of expected properties (hence my work on the physics of software). If you build something using steel, you expect some properties. If you use wood, you expect different properties. It's not that steel "is wrong" or wood "is wrong". They're wrong if you expect a different set of properties. And of course, trying to shape wood the way you do with steel ain't gonna work, and you may end up with a burned piece of wood. Fake OO is like claiming, at that point, that burned wood is steel. Except it's not.

In this sense, I see most criticism of programming paradigms as rather myopic. You frequently see a dangerously good :-) speaker / writer saying the equivalent of "see, stainless steel cannot be shaped into foil thinner than 0.01mm, so we never should use metals at all", which is very much like finding a case that cannot be perfectly covered by one language, and then claiming that the entire paradigm is a failure, and of course they propose a different paradigm ("you should always use graphene instead!"). They argue so convincingly that you really want to believe them.

This kind of rhetoric, in my opinion, is keeping the entire software design field stuck in the dark ages, but ok, it's the de-facto technique to get market share. It's obviously skewed to the trained eye. For instance, functional programming and immutable objects are all the rage now. Interestingly, there is a theoretical result which is suspiciously ignored by that community, which proves (see Pippenger, "Pure versus Impure Lisp", http://www.cs.princeton.edu/courses/archive/fall03/cs528/handouts/Pure%20Versus%20Impure%20LISP.pdf) that for a category of problems, the best solution using immutable structures runs in O(n log n), while using mutable structures you can get O(n). Does that mean failure for Clojure or the entire functional paradigm? No, of course not, because every material has its own properties, its own field of application, etc. For instance, LISP-like languages are pretty good when you want to write programming tools, and excel when you want to target LISP-like languages :-). They're also very good in a few more problem frames, of course.

Still, in the real world, we build complex structures using many materials. One may criticize concrete for not being transparent, but I would not want to live in a house built entirely using glass. I'd like glass for my windows though, thank you. So even in multi-paradigm programming we need to adopt a paradigm for the overall structure, and adopt sub-paradigms where they're a better fit. For instance, in my "cut the red wire" post (http://www.carlopescio.com/2011/06/cut-red-wire.html) I basically used an OO shell around a FP core built using immutable objects, all in C#. And it was a piece of cake : ). Of course, it's not the only choice, though I've found it a pretty good combination in many cases.

Now, coming to your specific questions, they seem to be more about encapsulation than OO (so any paradigm based on modules hiding implementation details will suffer from the same kind of issues), but ok.

Serialization: How do you serialize an object for transmission in an OO way, taking into account that the same object can be the source of multiple representations?--I hope you see that it's very much like the foil thing. I mean, given the set of programs one may want to write, given the subset where serialization is needed, given the subset where one may want multiple representations, objects are not the best fit and we usually have to expose data and not only services. Which is, by the way, the default condition in other paradigms : ). Hardly a failure, sorry; someone may even claim victory, saying: my worst case is your default case :-). Over time, the OO community has identified a few "standard" ways to do it (double dispatch / visitor; reflection w/ annotations; or a repository-like concept, possibly complemented by an IoC like I've shown in my "living w/out stupid objects" post (http://www.carlopescio.com/2012/07/life-without-stupid-objects-episode-1.html), with multiple repositories).
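As an illustration of the double-dispatch route, here is a minimal sketch (names mine, not from any particular library) where the object exposes its state to a format-specific writer without making its fields public; a different Writer yields a different representation of the same object:

```java
// Sketch of serialization via double dispatch: the object decides what to
// expose, each Writer decides how to represent it. Names are illustrative.
interface Writer {
    void field(String name, String value);
    String result();
}

class JsonWriter implements Writer {
    private final StringBuilder sb = new StringBuilder();
    public void field(String name, String value) {
        if (sb.length() > 0) sb.append(',');
        sb.append('"').append(name).append("\":\"").append(value).append('"');
    }
    public String result() { return "{" + sb + "}"; }
}

class Account {
    private final String owner;  // hidden data, no getter
    Account(String owner) { this.owner = owner; }
    // The object drives its own serialization; an XmlWriter or a
    // fixed-format Writer would produce other representations.
    void writeTo(Writer w) { w.field("owner", owner); }
}
```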

One criticism that I heard Rich Hickey make towards OO is that you cannot use an object beyond what the original designers thought it could do.--Though I like Hickey quite a lot, he epitomizes the "dangerously good speaker" I was talking about before :-). Of course, having access to implementation details gives you more freedom. In exchange, you get a more rigid material, as now I can't change that thing anymore because you depend on it. That's an age-old debate, probably dating back to Brooks vs. Parnas (with Brooks saying, 20 years later: I was wrong, information hiding is the way to go). Note that OO usually comes with a set of concepts (like inheritance + protected, or open classes, or mixins, etc.) that help you deal with this kind of issue, without exposing data / implementation as a default.

I have a class that represents a sequence of objects. In Ruby you can easily add a few useful methods to an object by extending the Enumerable (I know, I know) module. Then you can, say, use #map, #select, #reject, etc. However, a user of my sequence-like class will be unable to use random access, for example, because I didn't think to include it. How do we deal with this?--Well, let's make an important distinction. It would be one thing to say that the language / library (because it's not a paradigm issue) won't even let you expose random access in a uniform way (for those containers where it makes sense). That would be easily dealt with. It is a different thing to say that your class could expose random access, but you didn't want it to, because you want to reserve yourself the freedom to change implementation (say, from an array to a tree). It is, finally, a different thing to say that you could expose random access, you don't plan to change the implementation at all, but you forgot to expose it. So you have a design error, and somehow we expect the paradigm to fix it :-). Very much like: I have a long function; I forgot a conditional in the middle of it, so you can't really use it for all input (but it's hard to know which input). Is it a paradigm problem that I can't fix it from the outside?

Oversimplifying these kinds of things by saying that when everything is public or everything is a list you don't get into trouble is, well, more of a marketing strategy than a calm and reasoned assessment of the problem.

That said, the OO paradigm has necessarily to face the issue of unexpected usage / extension. Besides implementation inheritance and protected, most languages include now some form of "open class" technique, although there is usually a tendency to favor mechanisms that don't break encapsulation. Because if you didn't include random access, and I see by popping open your class that you could have included random access, does that mean that you'll always use that internal implementation, or that you reserved yourself the right to change it (and so you didn't expose it, intentionally?). Another interesting approach would be that of the selective violation of encapsulation (just because I want to add one method doesn't mean everyone has to be granted access to everything). There is a lot of pragmatic space here, that languages are filling over time.

Yet another approach is to supplement OOP with AOP, and mix-in a random-access interface into your class. Of course, that aspect is exactly a local violation of encapsulation :-). In the spirit of multi-paradigm programming, OOP + AOP is again an interesting combination.
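Staying within plain Java 8, a rough analogue of such a mix-in is an interface with a default method: any sequence can opt into uniform (though O(n) here) random access with a single implements clause. A sketch, with all names hypothetical:

```java
import java.util.Iterator;

// Sketch (hypothetical names): a mixin-style interface that adds uniform,
// though O(n), random access to any Iterable via a default method.
interface RandomAccessible<T> extends Iterable<T> {
    default T at(int index) {
        Iterator<T> it = iterator();
        T current = null;
        for (int i = 0; i <= index; i++) current = it.next();
        return current;
    }
}

// A sequence class opts in with one "implements" clause; it could still
// override at() with a faster version if its internals allow it.
class Range implements RandomAccessible<Integer> {
    private final int n;
    Range(int n) { this.n = n; }
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int i = 0;
            public boolean hasNext() { return i < n; }
            public Integer next() { return i++; }
        };
    }
}
```

Note that the default method only uses the public iterator, so encapsulation is preserved; a true aspect-style mix-in could instead reach into the representation, which is exactly the local violation of encapsulation mentioned above.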

The real issue, it seems, is more like "if I have encapsulation I cannot use the underlying data structure at my will". On one side, I hope you'll see that this is very much like coming to steel with a wood mindset, insisting on cutting steel along the fibers, and expressing surprise and scorn because you can't find fibers. If you come to OO with a LISP mindset where everything is a list and you express your logic through clever composition of abstract list operations, you'll never find yourself at home in OO land. It's not the way you shape that material. Of course, you get several interesting properties once you stop looking at things that way (like: I can later go in and change the damn data structure to something better, and you won't notice; I usually have a better match with domain concepts; etc.), and you also lose some things (like: programs don't look like math anymore :-). It's hard to cover this kind of mindset in a comment, but I hope it's somewhat clear.

In practice, one may find that some things are better made of wood, and others made of steel, embrace a multitude of paradigms, learn the inner power of each, and combine them to resolve forces in the best way. Or one may find he wants to belong to a tribe and celebrate the fearless leader and say that his language is pure and referentially transparent and spit on those poor slobs dealing with mutable state. It's a free world : ).

Carlo, thank you very much for the time and effort you took to reply to my questions. I deeply appreciate it.

Like you (I believe) I'm not interested in strictly adhering to any particular philosophy. I understand there's much to learn still, and I think your blog will be a very good aid in this.

That said, it seems that what sets programmers apart is the quality of their analytic thinking, plus the body of knowledge they accumulate over the years. I'm aware that I need to learn how to properly think analytically (which will also help me not be swayed by whatever the fad of the moment is). So, if you don't mind, I'm going to ask two more questions:

1. Since the quality of your solution is a direct consequence of the quality of your analysis and thought process (as you illustrated in your post above), how do you get better at analyzing and thinking through a problem to get to a solution?

2. What reading material would you recommend for someone that is trying to become a better programmer/analyst/thinker?

Hmm, actually I just found http://www.carlopescio.com/2011/10/youre-solving-wrong-problem.html which sorta/kinda answers my first question (in a somewhat disheartening way). Wondering if you have something more to add to that?

Nice post. I agree with most of it. I think the sequence terminology is not the best abstraction, though. In this case, enumerable is more appropriate; otherwise you'd have to say that mathematical naming conventions are not good enough either. Anyway, it's good to think about it.

The discussion on names for objects and methods is interesting. And it applies to more than just objects.

But the hypothesis used there - that object-oriented programming is somewhat better - might not always be true.

Objects are well known to be good for modeling mainly behavioral things, like UI elements. You want to hide the internal plumbing, the inherent internal state of these elements, and provide a simple interface to use them. And you really do have a complex hierarchy of widgets.

But most of the time we don't build new GUI APIs. IT is more focused on data: you enter data into the system; it stores the data and returns it later on request. Scientific computing is also often very data centered: you have data as input, you process it, and you have data as output.

Traditionally in IT, modeling was done around data (database schemas, input/output formats). This is still true today with XML/JSON as exchange formats, and even if we hide the database schema by using an ORM, those domain objects are typically behaviorless.

In this type of architecture, we use services that tend to be more and more stateless (think of all this REST stuff, for example), that take input, process it, and provide output. This is a very procedural/functional design. But this is no accident: the tasks to perform are naturally a good fit for procedural/functional programming.

We don't have to use one way to model everything. Sometimes you really want encapsulated state and single dispatch; then you use OOP. Sometimes you prefer to promote a stateless architecture and explicit data formats, and want to free yourself from the limitations of single dispatch; then you'd be better off with some functional thinking.

In many schools, OOP and UML are seen as the only true way. People then try to do everything with them, even when it is obviously a less-than-perfect fit. One could think that we need to improve our OOP modeling skills, and that's true. But a more interesting approach, for me, is to improve our modeling skills so as not to always think in terms of OOP, and to avoid designing non-OOP things with OOP.

Foudres, if you just scroll a bit up in the comments, you'll find my answer to David Leal. It should clarify my position on objects and multi-paradigm programming.

I don't really agree with your classification of what is good for OO and what is not (see for instance my post "life without stupid objects" for an example of doing database-oriented stuff without behaviorless structures, while preserving strict layering), but this is sort of marginal to the discussion, and would lead to the usual, endless debate.

I believe in multi-paradigm programming, with OO being one of the paradigms. The point of this post, as I said to David, is not to say "do it the OO way or die", but instead "if you do it OO, do it right".

Can we call it a Strainer? I'd like to put my Constrained objects in a Strainer and see which ones are caught (valid) and which ones leak through (invalid).

Jokes aside, I often find it more useful to "validate" when performing transformation from an input container to a model object. That way the model object can enforce its own constraints and avoid ever being invalid. The trick of course is providing enough specificity during failure to give the user or data source enough feedback to fix the problem, and then distinguishing between constraint violation and some other kind of transformation failure.

Jonah: as I mentioned to Fulvio above, the Constraint / Constrained thing was mostly meant as: find a better design where a ConstraintChecker is not needed at all :-).

About the reading list: David's question was more about "analysis" than OO, but every so often someone asks me for a list of "must read" books on OOP as well. It's a tough question, because the real "must reads", in my opinion, are books from the 80s or early 90s; but when I began making a list, I realized that, yeah, sure, a book is a "must read" but perhaps only chapter 2, because the rest is obsolete, etc. So a good list should be more detailed and reasoned than just a sequence of titles. At that point, to be honest, I gave up :-(. Guess I need to give myself a motivational kick :-))

Not sure if this helps, and I don't know what Carlo's opinion is, but one of the books I really enjoyed reading lately is "Practical Object Oriented Design in Ruby". It's a very down to earth approach, it's an easy read and covers a lot of ground.

@carlo: I see the comment now. And yes, it makes sense for the constraint logic to reside in the Constrained objects, but don't you still need a container to hold the list of those objects, and to iterate through them, sending a message to each like "validate()"?

@david: Funny you mention that book. I purchased it about 2 weeks ago and just finished it. I think it is a fabulous book and gave me a number of new insights even after many years of OO programming and reading a decent number of other books.

I think next up might be Chamond Liu's "Smalltalk, Objects, and Design" -- any thoughts on this? It seems most of the best OO books are the old ones

but don't you still need a container to hold the list of those objects--Simple answer: the Constrained object may very well hold its own Constraints.

and to iterate through them, sending a message to each like "validate()"?--Simple answer: the Constrained object can do this.

More interesting / challenging design: a Constraint is probably checking some Condition, based on some Variables. When a Variable changes, the Constraint must be re-evaluated. Do we need a manager / engine to do that? Why isn't the Variable informing the Constraint?
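A minimal sketch of the simple answer (all names here are hypothetical): the Constrained object holds and checks its own Constraints, so no external checker or manager is needed:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Sketch (hypothetical names): the Constrained object holds its own
// Constraints and validates itself; no external ConstraintChecker needed.
class Constraint<T> {
    final String description;
    final Predicate<T> condition;
    Constraint(String description, Predicate<T> condition) {
        this.description = description;
        this.condition = condition;
    }
}

class Order {  // a Constrained domain object
    int quantity;
    private final List<Constraint<Order>> constraints = new ArrayList<>();

    Order() {
        constraints.add(new Constraint<>("quantity must be positive", o -> o.quantity > 0));
    }

    // The object iterates its own constraints and reports violations.
    List<String> validate() {
        List<String> violations = new ArrayList<>();
        for (Constraint<Order> c : constraints)
            if (!c.condition.test(this)) violations.add(c.description);
        return violations;
    }
}
```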

In practice, some languages get in the way when implementing smart variables. For instance, two languages I'm using a lot both have an int and an Integer type; the former is the usual "fast" thing, the latter is "an object", but they wasted the opportunity to make that object a little smarter :-). Understanding the physics of software, and the basic idea that very often we want to react when a value is changing, could bring some simple (yet very valuable) changes to programming languages. When we have to create a different class, set up an observer, etc., just to get a notification, we might be tempted to go the "manager class" way. Things would be different with a more supportive language :-)
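As a sketch of what a slightly "smarter" integer could look like (not any real library; names are mine), notifying interested parties, such as a Constraint, whenever its value changes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Sketch (not any real library): an integer "object" that notifies
// interested parties, e.g. a Constraint, whenever its value changes,
// with no manager / engine class in between.
class SmartInt {
    private int value;
    private final List<IntConsumer> observers = new ArrayList<>();

    SmartInt(int value) { this.value = value; }
    int get() { return value; }

    void onChange(IntConsumer observer) { observers.add(observer); }

    void set(int newValue) {
        if (newValue == value) return;  // no spurious notifications
        value = newValue;
        for (IntConsumer o : observers) o.accept(newValue);
    }
}
```

With language support along these lines, a Constraint could subscribe directly to the Variables it depends on, and the "manager class" temptation goes away.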