
Configuration files and switches considered harmful

I was just wondering why gpsd doesn’t have a configuration file in /etc/gpsd.conf, like most other Unix/Linux software?

Because configuration files are evil, and not to be countenanced unless they become an absolutely necessary evil. Which in gpsd’s case is not yet, and I sincerely hope not ever.

Yes, I know this sounds like heresy coming from a Unix programmer. But the trouble with configuration files is that they too easily become an excuse for not doing the right thing. (The same is true of command-line switches; all of what I have to say here against config files applies to switches as well, which is why gpsd has so few of them.)

Can’t decide whether that new feature is a good idea or not? Create an option. Can’t settle a design issue? Create an option. Can’t be bothered to figure out how to autoconfigure your software properly? Create an option. Want to double your software’s test complexity? Create an option.

Lots of options and elaborate config files are sometimes necessary for Swiss-Army-knife-like system utilities; I implemented a boatload of them in fetchmail because I had to. But doing that made me unhappy and I junked a lot of delivery options when I figured out that forwarding fetched mail to SMTP was the right thing.

It’s better to design so you don’t need a configuration file. One of gpsd’s goals is to be zero-configuration; when it’s installed properly, you plug in a USB GPS and it just works. Nothing to tweak, nothing to configure, nothing you have to mess with.

While gpsd sometimes departs from this ideal, we treat those departures as bugs to be actually fixed rather than kluged around with an option. So the right question to ask yourself is not “how might an option fix this?” but “how can we teach the program to figure out the right thing itself?”.

226 thoughts on “Configuration files and switches considered harmful”

I really appreciate that you pointed this out. Most things in most configuration files never change, which means they shouldn’t be there, adding complexity and increasing what corporate programmers would call production support documentation.

“how can we teach the program to figure out the right thing itself?”

This is why I like standardized directory structures and programs which figure out where they are, and therefore, where everything they need is. But I really need to dramatically reduce the number of things in my configuration files, that’s for sure.

I have to disagree. While software should do the right thing by default, it must also have mechanisms to alter that behavior when the user so desires. I found out the hard way that people use your software in ways you’ve never considered. What you believe is the right thing may very well be so in 95% of all cases, but that remaining 5% will be SOL. Which is why I try to go for sane defaults that work out of the box, but make the configuration options available for those who need them.

I couldn’t agree with this more. And I think with chagrin of my beloved Vim, which typically gets installed with an /etc/vim/vimrc full of options that no sane person would turn off these days. Fortunately I’m not in charge of testing that… though I *am* on QA for a large commercial product with much the same problem. Le sigh.

While I agree with the general sentiment of UI simplification and autoconfiguration, having to break out a compiler to change assumptions is an even worse UI than a seldom-used command-line switch. Also, beware of Mac-ism – hiding configurability from users makes it more difficult for them to troubleshoot, fix, or work around bugs.

> Can’t decide whether that new feature is a good idea or not? Create an option. Can’t settle a design issue? Create an option. Can’t be bothered to figure out how to autoconfigure your software properly? Create an option. Want to double your software’s test complexity? Create an option.

> Lots of options and elaborate config files are sometimes necessary for Swiss-Army-knife-like system utilities; I implemented a boatload of them in fetchmail because I had to. But doing that made me unhappy and I junked a lot of delivery options when I figured out that forwarding fetched mail to SMTP was the right thing.

Speaking of mail, Sendmail dug itself quite deep into this particular hole. In its early days, it wasn’t yet apparent that electronic mail would come to be equated with SMTP, so Sendmail has a million options for just how mail should be handled. That, and compiled config files with a lunatic syntax.

The world is much better when things are no more complex than they must be.

I’m with Winter and Inkstain on this. If something’s never going to change, sure, hard-code it. But for things that users may want to mess with (the one that comes to mind is a fullscreen toggle on games), give them the option. I can’t speak to whether GPSD needs them, having never used it, but options are generally a good thing.

The ideal of NEVER needing a configuration file is not attainable in many cases. Most people are familiar with von Moltke’s “No campaign plan survives first contact with the enemy”. Monster’s Corollary to von Moltke is “No software design survives first contact with the user”.

When we were talking about the USB-to-serial chipsets, it was clear that autoconfiguration isn’t possible, because it is not possible to know a priori whether there is some device on one of these converters that will go absolutely apeshit if you send it “AT\n”. For instance, I know from experience that enabling logins on a serial tty connected to a UPS, with certain monitoring software running on SCO OpenServer, could cause a system shutdown or instant reboot without any shutdown at all. Very ugly business.

Therefore, there must be some mechanism by which you can tell gpsd “Do not try to probe this device.” Without that mechanism, your program will be rightly blamed for device rape.

You can do that with command-line switches, a config file, and/or via environment variables, which in turn can be saved in a “config file” and dot-included before the application is executed. A config file should be implemented to allow default values to be omitted, making the default config file an empty file or no file at all.
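That pattern (defaults baked into the program, config file optional and possibly empty) is easy to sketch in shell. Everything here is illustrative: the file path and the variable names are made up, not anything gpsd actually reads.

```shell
#!/bin/sh
# Hypothetical daemon startup: every setting has a built-in default,
# so the "config file" may be empty or absent entirely.
CONF=${CONF:-/etc/example.conf}

# Dot-include the file only if it exists; it may override any variable.
if [ -r "$CONF" ]; then
    . "$CONF"
fi

# Anything the file (or the environment) left unset falls back to a default.
: "${PROBE:=yes}"          # e.g. PROBE=no protects fragile serial devices
: "${LISTEN_PORT:=2947}"

echo "probe=$PROBE port=$LISTEN_PORT"
```

An empty or missing config file yields pure defaults; a one-line `PROBE=no`, either in the file or in the environment, flips exactly one knob without touching anything else.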

To further clarify my position: ideally, users should have the full Luxury of Ignorance. Most of the time, things should just work. However, there must also be fully documented and supported alternative configuration mechanisms for those cases where the “magic” is inappropriate or, worse yet, inadequate.

If you want a good example of “magic” failing spectacularly, try to install HP’s driver for an XQX-protocol printer without having said printer connected to the local USB port.

I’m seeing a strong consensus on usability here. Everyone seems to be drawing the line in the right place: hardcode the stuff that should never change, sanely default the stuff that should rarely change, and put a switch on everything else.

It should also be noted that many programmers put in config files and options because they couldn’t possibly know what the right thing would be, because standards for behavior were still evolving. Sometimes you can autodetect a format (gif vs. jpeg; au vs. wav). But not always. (But if you can and you didn’t, your program is suboptimal.)
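Magic-byte sniffing of that kind takes only a few lines of shell. This is a sketch, not exhaustive; real tools such as file(1) do the same job far more thoroughly.

```shell
#!/bin/sh
# Guess a file's format from its first four bytes instead of
# requiring a --format switch. Illustrative, not exhaustive.
sniff() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        47494638) echo gif ;;     # "GIF8"
        ffd8ff*)  echo jpeg ;;    # JPEG start-of-image marker
        2e736e64) echo au ;;      # ".snd"
        52494646) echo wav ;;     # "RIFF" (the WAV container)
        *)        echo unknown ;;
    esac
}
```

A program that can sniff like this never needs to ask the user which format it was handed.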

It took me several years as a Linux user before I even knew configuration files existed. (Ubuntu really is great for newbies.)

But now that I know about them, I’m quite happy with them. Linux, to me, is all about configurability and customization. There are so many things I just can’t do with my Windows computer at work. With Linux, there is always a way. And a few lines of text in a config file, in my opinion, is so much easier than a million GUI menus full of checkboxes, even if I need to use Google to help me out sometimes (okay — almost always).

So by all means, create the right default configurations and simplify your program to where command-line options aren’t always necessary, but please give me the maximum amount of control over what your program does on my machine. There’s no tradeoff here. With the right design, it is easy to get the best of both worlds.

Command-line options present up to a 2^N set of interfaces to the outside world, assuming the options are mutually independent. With full mutual exclusivity it collapses to roughly N; real systems fall somewhere between, with the real number probably closer to 2^N.

Every new option introduced doubles the complexity. It doubles the number of theoretical possibilities that can break the system. They’re evil – plain and simple.
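That doubling is easy to make concrete with a small sketch that just counts the matrix of N independent on/off switches:

```shell
#!/bin/sh
# N independent boolean switches imply 2^N distinct configurations,
# every one of which is, in principle, a separate case to test.
count_cases() {
    n=0
    for _opt in "$@"; do n=$((n + 1)); done
    echo $((1 << n))
}

count_cases -a -l -t -r -h   # five switches: 32 configurations
```

Add a sixth switch and the matrix hits 64; that is the cost side of the ledger every new option quietly runs up.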

There’s also the quandary of when you’re doing a “dist upgrade” and the new package wants to install a new “default” configuration file. Keep current, use default version, attempt merge – all the possible choices are wrong.

I think it should be one of the founding principles of Nix that all programs must ship with an empty config file (or one with all the defaults listed but commented out – basically the same thing). Like the comment about Vim above, if you’re shipping with non-default settings, you’re doing it wrong.

In any case: why not all of them? Most of ls(1)’s options are just filters, so why not use actual filter programs for that and have ls(1) do only what it’s meant to do: list files? I’ll give a special exception to -d (to return a directory itself and not that directory’s contents, although even that could be resolved by explicitly indicating either path or path/). I was about to add -R, but that’s what find(1) is for.

>Most of ls(1)’s options are just filters, so why not use actual filter programs for that and have ls(1) do only what it’s meant to do: list files?

Because typing in those grep/sed/awk filters by hand every time would be a pain in the ass. ls is kind of a special case, at the extreme of my category of “Swiss-Army-knife-like system utilities” in my original post.

If you mean canned filter programs, so you end up typing things like “ls | ls-shortlst” all the time, it’s hard to see how this would be an improvement.

>Most of ls(1)’s options are just filters, so why not use actual filter programs for that and have ls(1) do only what it’s meant to do: list files?

The options I actually use with ls are usually -l, -t, -r, and -a. I’ll accept -r as duplicable by a simple filter, but the others aren’t, really (unless you’re going to do something stupid like have ls only ever output -la, and require a filter to get your nice simple short list without dotfiles back).

On config files: It pisses me off having to do a whole lot of tweaking of config files before I can do anything, but I really like having them available to mess with later (more knobs are always a good thing). My thought is that most of the required autoconfiguration could simply output a good configuration file, which could then be manually tweaked later. Autoconfiguration is nice for users, but it’s often fun to simply mess around with a program and see what it can do.

@esr
> If you mean canned filter programs, so you end up typing things like “ls | ls-shortlst” all the time, it’s hard to see
> how this would be an improvement.

I actually wrestle with the idea a lot, which is why I will always phrase it as a question, heh.

I see it more as having extremely small generalized programs. `ls` to list files, `sort` for sorting (options here for HOW to sort make sense), `stat` of some sort for getting file details (which you could perhaps feed to a colorizing filter if you like ANSI color chrome (I do)), `mc` for columnizing, and so forth. Then just tie it all together into a tidy shell script to get your ls(1) analog, except you’ve got all these nice little components you can reuse elsewhere and without having to rely on dynamic linking and such. Plus smaller, cleaner code bases and much easier debugging.
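A sketch of that kind of composition, rebuilding a rough `ls -t` out of smaller pieces (GNU stat format syntax is assumed here; BSD stat spells the same thing `stat -f '%m %N'`):

```shell
#!/bin/sh
# List a directory newest-first by composing small tools:
# enumerate, annotate each entry with its mtime, sort numerically,
# then strip the annotation back off.
list_by_mtime() {
    for f in "$1"/*; do
        [ -e "$f" ] || continue
        stat -c '%Y %n' "$f"      # GNU stat: "<epoch-seconds> <path>"
    done | sort -rn | cut -d' ' -f2-
}
```

The sort step is the reusable part: the same `sort -rn` serves any other tool that annotates lines with a numeric key.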

I admit to being too young to have Been There, but I can’t help seeing a lot of the BSD additions to the old Unix stuff as symptoms of ‘missing the point.’ GNU further improved the BSD additions by making that paradigm smoother and more usable (in many cases, not all), but maybe it was the wrong path to begin with.

Purely academic pondering, though, since what we have now Works and definitely works well enough.

I agree with you. It is always better to compose with chained/piped functions than with command-line options that increase the 2^N complexity. The Swiss-Army-knife utilities would become a library of filter functions that you invoke separately, while keeping the basic form of “ls” simple and beautiful.

>I admit to being too young to have Been There, but I can’t help seeing a lot of the BSD additions to the old Unix stuff as symptoms of ‘missing the point.’

There’s some justice to this charge, at least if you’re talking about program composition via shell scripts. On the other hand, under the technological constraints of the time (text-only, non-pixel-addressable displays, primitive or nonexistent support for text-screen interfaces a la vi) a lot of the Berkeley additions to command-line switches were actually optimizations for usability by human fingers. Nowadays they look less useful in aggregate because human-finger usage has substantially shifted to text-screen interfaces and GUIs.

It seems like it would be useful to develop a taxonomy of programs (a la Design Patterns) to help programmers identify situations in which there’s clearly a Right Thing and when configuration files are absolutely necessary. For instance, some configuration files are really declarative programs, and in some situations all the crazy knobs truly are necessary; this shows up most obviously in “framework” programs like xinetd/tcpd or Apache, where the configuration is a description of how to wire up some environment of files or documents and a declarative file is more human-friendly than an imperative setup script would be.

Anything that can do without a config file* shouldn’t have one – and this is doubly true of utility software and utility daemons and the like.

However, complex end-user software almost always should have a config file*, even if it’s just an optional one to store user preferences. And as Christopher Smith says, there are server platforms where you just can’t live without one.

gpsd shouldn’t have a config file. A word processor should.

(* Where “config file” includes whatever means of storing config data your host OS supports, be it a plain file, a registry hive, or a .plist. Doesn’t matter if it’s a discrete file or not, naturally.)

> On the other hand, under the technological constraints of the time (text-only, non-pixel-addressable displays, primitive
> or nonexistent support for text-screen interfaces a la vi) a lot of the Berkeley additions to command-line switches were
> actually optimizations for usability by human fingers.

That makes a lot of sense, and I can feel my (lack of) age showing in that I hadn’t considered that angle before, spoiled as I am by the ubiquity of curses and even the most minimal of window managers.

Off-topic: Apparently, due to the constraints of being contractually obligated to do development for IBM (presumably to deliver non-GPLed code) and not really wanting to play any more, Oracle has agreed to spin off OpenOffice to the Apache Foundation.

It will be interesting to see what happens with OpenOffice v LibreOffice — this is a very rare instance of GPL competing head-on with permissive licensing, with nominally the same starting codebase, on a large and important project. One could argue that the developers that Oracle and IBM will presumably bring to bear on the Apache version tilt the playing field, but then again, (a) that’s a fact of life; and (b) the GPL being what it is, it will be far easier for the LibreOffice folks to take OpenOffice code than the other way around…

Personally, I think that if a company isn’t interested in playing any more, and wants to do right by its loyal customers, using a permissive license and letting the market decide on whether the dominant player will be copyleft or not is really the right thing to do, so despite my distaste at Oracle’s Java machinations (and despite the fact I don’t really know any details of the OpenOffice deal yet), I think I might like where the Apache Foundation can take OpenOffice.

> On the other hand, under the technological constraints of the time (text-only, non-pixel-addressable displays, primitive or nonexistent support for text-screen interfaces a la vi) a lot of the Berkeley additions to command-line switches were actually optimizations for usability by human fingers.

This is a good point. Isn’t it fair to say though that a lot of times the unix philosophies of “small is beautiful”, and “do one thing and only one thing but do it well” get confused with hanging on to relics of the past that only make sense in the context of their day? Either the philosophy is more important, or the relics themselves. Since plan9 was mentioned, wouldn’t we all have been much better off if we had chucked the old unix away and moved on with life to plan9? I think we would have.

Hard-coding shouldn’t be too problematic if your source code is well-written and easy for anyone to understand. Which leads to the compromise of embedded scripting systems – you still have to modify the source code in order to change some things, but source code modification is well-supported and easy.

Well, I think I made it clear in The Art of Unix Programming that I value the philosophy more than the relics. But one has to be careful about dismissing the relics too lightly; as with biological evolution, sometimes a feature that evolved under one set of pressures becomes important in a different way when the environment changes.

For example, the Unix tradition of programs that emit no gratuitous messages on success and are generally sparing of output dates from an era when humans could read faster than a TTY could print. Nowadays, on fast machines with fast displays, the same design heuristic guides us away from interfaces that needlessly distract the human user.

And as for Plan 9…it failed to take over because, while it was better, it wasn’t enough better.

@The Monster: “When we were talking about the USB-to-serial chipsets, it was clear that autoconfiguration isn’t possible, because it is not possible to know a priori whether there is some device on one of these converters that will go absolutely apeshit if you send it ‘AT\n’.”

This is true, but it’s a bad example because it’s a case of bad device design. In a world where devices used good design principles, *ALL* devices would be designed to be probed via an agreed-on standard, because only in such a world can autoconfiguration “just work”.

There are many programs out there that contain ugly hacks not because the program designer loved hacks, and not because he/she was a bad coder, but because the piece of software had to interface with other programs/chipsets that were not designed properly.

It’s unbelievable that in 2011, we *still* have devices that “go apeshit” when probed, thereby blocking all attempts at autoconfiguration.

> In a world where devices used good design principles, *ALL* devices would be designed to be probed via an agreed-on standard, because only in such a world can autoconfiguration “just work”.

While this idea works in theory, in practice device design (from the outside looking in, admittedly) resembles some form of ultimately dysfunctional design-by-committee process, in which the majority of committee members will only talk to a small subset of the others and at least one member is specifically trying to subvert the process to gain a differentiation advantage. That we get any useful standards at all out of this is nothing short of miraculous.

A recently linked thread from Linus re: ARM architectures is a good example of this.

As regards the topic, I echo multiple comments up-thread. Autoconfiguration is brilliant, but when the magic doesn’t work I want to be able to override it, and that needs to potentially apply to anything that could need autoconfiguring.

> And as for Plan 9…it failed to take over because, while it was better, it wasn’t enough better.

Plus Bell Labs had overly-restrictive licensing for a good chunk of Plan 9’s life. And, really, by the time it got into full ‘doing really cool stuff’ mode, it was far too late to make any kind of waves. There’s still good people working on it, though, and more than a few great ideas still being developed in and from it. 9P I think could gain a strong following if it got more visibility, and something like Plumber coupled with Go would be Really Interesting, even outside of Plan 9, if Linux-land could be hacked so programs return in P9 style (strings) instead of Unix style (ints).

> Well, I think I made it clear in The Art of Unix Programming that I value the philosophy more than the relics. But one has to be careful about dismissing the relics too lightly; as with biological evolution, sometimes a feature that evolved under one set of pressures becomes important in a different way when the environment changes.

That is a fair point. Most likely the reason why unix rules the world today. It evolved the right set of genes back in its day and its biological makeup proved to be ever so useful and versatile as the environment changed.

> And as for Plan 9…it failed to take over because, while it was better, it wasn’t enough better.

The “good enough” effect at play again. I hate that. I wonder if one could say the same of Windows too. That in spite of its blue screens, viruses, registry, DLLs – all the goodies that make up Windows – it was “good enough” to rule the PC world. Windows is still not as usable as the first computer I ever laid fingers on (a Commodore Amiga, late 80s), and definitely far less stable. But it was “good enough”.

@jsk:

> Plus Bell Labs had overly-restrictive licensing for a good chunk of Plan 9’s life

I think that is the real reason. If they had BSD’d the thing from day 1, history would have unfolded differently.

It is only my own humble experience and I cannot speak with a great deal of wisdom, but in 90% of programs I’ve used or worked on, I have found a situation where having the option was better than not having the option.

Auto-configuration might work in 99% of the cases, but in the rare 1% case (it does happen) it goofs up and you need a fall-back. Programs or software with the fall-back of manual configuration has saved me on numerous occasions over the years – sometimes in critical situations.

I go with the consensus that 99% sane defaults + option to change them is the most practical arrangement.

Oversimplification can be as much of a devil as excessive customizability.

> Oversimplification can be as much of a devil as excessive customizability.

The question is where the optimum point lies. I think anything over 4 or 5 command-line options is way too much complexity. If you think of your utility as a mathematical function with all the cmd line options as arguments to that function, you will find that most people (the human mind indeed!) cannot easily reason about the ins and outs of that function if the argument list exceeds 4 or 5. If functionality beyond what is possible with 4 or 5 arguments is required, then the problem needs to be abstracted differently so that both “coverage” and “complexity” are sanely managed.

One possible way of breaking up the problem is like this:

utility-sane
utility-special-case-1
utility-special-case-2

What will happen is that over time utility-sane will survive. utility-special-case-1 and utility-special-case-2 will be ditched after the convergence effect standardizes the world on utility-sane. There should always be some place where you can download utility-special-case-1/2. Rather than linearly growing the cmd line option space with a zillion useless options (that may well stay with us for the next 30 years), you would have managed both the complexity and coverage problems differently.

Yes, but who decides the special case? What is special case to the programmer might be the sane option for a particular user. And over time if the special cases are going to be relegated to the sidelines and not maintained like the sane default program, you have a problem.

In some cases this might work out well, but if you branch out different versions of essentially the same program with the same functionality just to cover some special cases, then maintenance of all the common code over a period of time will become a big issue. Yes, you can handle this with VCS systems and so on, but it’s just an ugly solution. Why not just incorporate those special cases as an option?

Yes, and the point of my original post was that (especially for user-facing software) programmers need to start seeing options as a cost rather than as a benefit. A cost one has to pay sometimes, but a cost.

@esr
“Yes, and the point of my original post was that (especially for user-facing software) programmers need to start seeing options as a cost rather than as a benefit. A cost one has to pay sometimes, but a cost.”

Yes, indeed.

For instance, the ls options. I very often do
> ls -1 | awk '{…..}' | bash

and sometimes
if [[ `ls -1tr | tail -1` = "Myfile" ]]; then …

Now, for me, the cost of the command line options in ls are less than the cost of adding extra cut and sort filters which require their own options. Options which cannot be ignored.

Many programs have a state which you want to preserve between uses. Such things end up in config and rc files. Many things cannot be decided beforehand, like the colors of the ls output (monitor dependent). They have to be user configurable, hence options.

The downside is that every project seems to creep towards Turing-complete configuration/options. Which is bad, because that is what scripting languages were created for. The application is written precisely because the user does not want to use a Turing-complete tool, but because this application does something very specific.

So I am always asking myself whether any option/configuration is actually supporting the specific task of the application, or whether it is extending the tasks it can be used for. Things that extend the task are very suspect. For other tasks, there are other applications.

I have always found that the line between configuration and programming is a very subtle one. Most configuration, be it file, command-line or GUI, is basically assigning values to variables. Many more complicated configurations, e.g. rules in e-mail clients, also have conditionals. All you need to add is loops, and basically your configuration is a programming language.

This is necessary as long as your actual application is written in a difficult, compiled programming language like C. And it is closed source. It was drummed into me at school to always avoid “hardcoding”; that’s the worst thing a programmer can do, because users cannot change it. My favourite is configuring SAP R/2: it is so damn extensive that usually they did not configure it manually but imported an industry template and went on from that. Why did it have to be so complicated? Because they wanted to avoid at all costs having to customize compiled, closed-source ABAP code (much like COBOL).

Now when you are developing something open source in an easy, interpreted language like Python, it suddenly becomes unnecessary. You can just have a “config file” in Python, assigning values to some variables there and providing short documentation on how to change it. Or generally write your application so that the general user-facing logic is separated into one fairly simple file, so that it can be customized without digging too deep into it.

IMHO the most interesting part of LISP-like languages is that they can make a hierarchical config or data file directly executable, so it basically becomes the application itself. This is highly exciting.

This is decided by design: the designer of the utility gets to make that call.

> but if you branch out different versions of essentially the same program with the same functionality just to cover some special cases, then maintenance of all the common code over a period of time will become a big issue

You don’t branch anything. You compose all your utilities, including the special case ones, from smaller underlying functions. The codebase remains one. Any shared functionality will be shared in the underlying functions. You build the entire utility set with 2 or 3 layers. If your design exceeds that, then your design needs to change. 3 layers is the maximum for humans to reason about programs, just like 5 cmd line options is the maximum for a utility.

> Yes, you can handle this with VCS systems and so on, but it’s just an ugly solution. Why not just incorporate those special cases as an option?

Rather than just the programmer himself and the few uber-geeks who know how to google stuff up and download what they need, you wanna burden all of humanity with mile-long man pages. That is bad! Wicked and evil too! What we need is simplicity, manageability, and composability out of small, reliable, rugged pieces. Not gigantic monoliths with vast swathes of dead code that will be carried on our backs for decades to come.

> This is done by design -> Designer of the utility gets to make that call.

But most UNIX programs aren’t designed. They seem to be hacked and coded by people familiar with the environment and what they want to achieve and assume that their target audience desires the same.

> You don’t branch anything. You compose all your utilities, including the special case ones, from smaller underlying functions. The codebase remains one. Any shared functionality will be shared in the underlying functions. You build the entire utility set with 2 or 3 layers. If your design exceeds that, then your design needs to change. 3 layers is the maximum for humans to reason about programs, just like 5 cmd line options is the maximum for a utility.

But if there are only a few lines of extra code for a single option, what you are suggesting comes down ultimately to just a mechanism for the end user – not an actual coding technique. As a matter of fact, the same thing can be done with several one-liner front-end shell scripts calling the program with different switches or options to hide a single complex command. This is often done in many command-line programs in UNIX. Do you suggest this is not a viable alternative? How different is this from just creating separate tools?

I don’t see how merely hiding options under a thin layer is different from providing actually different programs to achieve slightly different goals.

What I said was that the second approach means more redundancy and more maintenance.
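Those one-liner front ends are worth seeing concretely. A sketch with invented names, each wrapper freezing one blessed combination of switches:

```shell
#!/bin/sh
# Thin front ends over ls: the option space stays in one place,
# and users see only named behaviors.
# 'lld' and 'llnewest' are made-up names, not standard tools.
lld() { ls -la "$@"; }                    # long listing, dotfiles included
llnewest() { ls -1tr "$@" | tail -n 1; }  # newest entry in a directory
```

The same idea scales to installing each wrapper as its own script, or to one script that dispatches on `basename "$0"` so that each installed name selects a behavior.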

> Rather than just the programmer himself and the few uber-geeks who know how to google stuff up and download what they need, you wanna burden all of humanity with mile-long man pages. That is bad! Wicked and evil too! What we need is simplicity, manageability, and composability out of small, reliable, rugged pieces. Not gigantic monoliths with vast swathes of dead code that will be carried on our backs for decades to come.

Rather than commenting on the length of the man page, I would think what is important is how understandable it is. You talk of documentation, which is another subject.

I believe most programs should be self-documenting – either through the command line, a config file or maybe even a GUI/Wizard for novice user.

@Michael Hipp
“If you want a classic case of what not to do, ponder the ‘find’ command.”

I would not be so sure. It is a classic example, but not necessarily of bad design. The point is that the find command is not really used interactively. I use only the simple cases directly. For more complex searches, I would construct shell scripts calling find. And if you use shell script wrappers (tcl/tk), it is a very, very powerful command.

I see commands like find more as engines. You have to supply a front end to make them useful. But if you care to make a front end, it can do almost everything you would care to put the label “find stuff” on.

If you want a classic case of what not to do, ponder the ‘find’ command.

Meh. The find command is one of those “swiss army knife” utility programs. Sure, find has a lot of options, but there’s good reason why it does have all those options.

Before GNU grep’s -R option (and I humbly bow to whomever coded it), I used to use ‘find’ a lot to search for strings in directory source trees, as in ‘find . -type f -exec grep -l "foo" \{\} \;’. I still use it every now and then out of habit and on some *nix machines where installing GNU grep isn’t practical or possible. find is a sort of universal -R switch when used like that.

But I also use it a lot to find changed files within a directory tree — less so in source trees as I tend to use version control software in those and those tools have better functionality for that.

Personally, however, I find that avoiding the need for switches or configuration items is a very useful trait in any program. OTOH, having no ability to configure or at least override can be very annoying.

One piece of software I think is a good example of what I consider the ideal is SANE. My saned configuration contains exactly nothing but the default comments. So long as the proper USB items are configured on the host OS and there is driver support for the scanner in question, saned does just work, but I can always override things like network defaults if I need it. Even with network clients, the only thing that I need to specify is a list of hostnames to attach to, though I suppose it would be nice if these were discoverable via zeroconf or something like that.

@Tom, “Most things in most configuration files never change, which means they shouldn’t be there, adding complexity and increasing what corporate programmers would call production support documentation.”

I can’t entirely agree with your argument that this is a reason why configuration files are bad. Because most things in configuration files never change once they are set per machine. So the configuration files provide the simplest and easiest method to customize a configuration for each machine.

I don’t want a Windows Registry thank you very much.

But ESR has a great point that configuration files are misused. And you can count me in as a guilty party.

Morgan Greywolf Says:
Meh. The find command is one of those “swiss army knife” utility programs. Sure, find has a lot of options, but there’s good reason why it does have all those options.

You both miss the point. The first major fault with ‘find’ is that the simplest and most likely use case requires switches just to get it to work.

If I want to find a file named “Eric” in the current directory tree I should be able to say “find eric”. But instead I have to first type “man find” to once again immerse myself in the myriad of swiss army knife switches until I find the magic option that tells it to search by name (!) and to be smart enough to ignore case unless told otherwise. And to ponder that I might have occasion to use ‘find’ often enough to memorize even a small number of them is too horrible to contemplate. (A possible corollary of Eric’s post is that tools should be smart by design and only revert to stupid when told to do so.)

Or perhaps as Morgan suggests, ‘find’ is really a god-like everything-plus-the-kitchen-sink swiss army knife and no-one has yet bothered to write the tool that deals with the simple and most common use case of wanting to just say “find eric”, but some legacy bozo already used up the correct name for it. (No, ‘locate’ is no help.)
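
To make the complaint concrete, here is a minimal, purely hypothetical "smart find" sketched in Python: case-insensitive substring matching on names is the default, with no switches needed for the common case. The function name and behavior are my invention, not a real tool.

```python
import os

def smart_find(pattern, root="."):
    """Yield paths under root whose name contains pattern, ignoring case.

    Smart by default: no switch needed to say 'match by name' or
    'ignore case' -- those are the common case, so they cost nothing.
    """
    needle = pattern.lower()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if needle in name.lower():
                yield os.path.join(dirpath, name)
```

With this, `smart_find("eric")` matches `Eric.txt` and `ERIC/` alike; a case-sensitive or exact-match mode would be the option, not the default.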

This is true, but it’s a bad example because it’s a case of bad device design.

In your perfect world, there would be a standard command that can be sent to any serial device to interrogate it, which is guaranteed to not bork the device. But we don’t live in your perfect world. Those bad devices exist in the real world, and work OK as long as nothing talks to them that doesn’t know how to do it safely. They’ve been that way for years, and everyone who works with serial devices knows it. One of the biggest drivers behind USB was how crappy it is to deal with serial devices.

Along comes gpsd, which sends strings of unsolicited data to any serial device on a USB converter that gets plugged into the computer. Because it isn’t possible to know what that device is simply based on the USB identification, this can’t be guaranteed not to do Horrible Things. It’s bad enough if this behavior is “opt in”; it’s even worse if no opportunity to opt out is provided. Is it gpsd‘s fault that serial devices work this way? No. But given that serial devices are the incumbent, it’s gpsd‘s responsibility to, at a bare minimum, provide that opt-out mechanism to protect devices from unwanted probing.

This is a lose-lose situation where there is NO way for the program to figure out the right behavior for itself. There simply has to be some way for a human to tell it what’s right.

This is true, but the program can still certainly help the user out here. For example, if gpsd doesn’t find a known GPS device when invoked with no options, it can print a gpsd command with the option to use a particular port for each unknown serial port it finds.

Then, one triple-click followed by a middle mouse button and an enter-key later, the user is up and running…

gpsd may already do something like this. I have no idea, since the only GPS device I have is a Garmin GTU-10, permanently affixed to an expensive mobile machine I’d rather not see go walkabout…

I would vote with Hari. From what you tell me, gpsd does very well almost all the time, call it 99.9% (substitute a different number if you wish).
If you have a config file option (normally nonexistent or empty) that could tell gpsd “Ignore device x, do not talk to it”, you could probably add another 9 to the ‘just works’ case.
If you could say in the config file ‘Treat device x as special case y’, you should be able to get it down to where the only folks still broken are those who need two or more usb devices that all ID as ‘x’ with no way to differentiate between them until you poke at one and see if you get smoke.

How much of this would be worth the effort to implement is a question left to the student — I would think that the first case would be fairly straightforward to implement and would have sufficient benefit to be worth the trouble.

the only folks still broken are those who need two or more usb devices that all ID as ‘x’ with no way to differentiate between them until you poke at one and see if you get smoke.

This is not that unusual, though. There are many devices that use a USB RS232 interface and assuming that a USB RS232 interface belongs to a GPS device is just not a good plan. Many of those front-panel LCD displays, for example, use such a thing. Copy protection dongles. Mobile devices. Modems. UPSes. The list goes on…

@Morgan Greywolf
>This is not that unusual, though. There are many devices that use a USB RS232 interface

Possibly true. On the other hand, adding the ability to deal with the ones you can distinguish can’t be a bad plan.
Just because it isn’t perfect doesn’t mean it’s useless. Adding a 9 or three to the 99.9% that works is still worthwhile; you won’t ever get them all anyway. I’m counting the percentage of users, not the percentage of possible devices — I’m sure there are vast multitudes of poorly behaved devices available, just that few people will have more than one or two of them. Note that the device not only must be badly behaved, it must also lack some sort of identifying characteristic that is unique to this particular situation before the config file option will be unable to cope. It isn’t necessary for the device ID to be unique in the larger world, just within the set of gadgets you are trying to work with here and now.

The 99.9% users won’t need it at all, just that last few that are attempting to get a cheap gps gadget to work along with a usb-serial adapter, a usb-parallel adapter, and a usb camera — or some other sort of edge case.

>So do you have a command-line switch to tell it to use a specific device it would otherwise not know about?

No. It just turns out that the only devices to need wakeup strings before you can sniff incoming data are RS232. So those probes don’t get shipped if the device is USB; instead, we wait for packets to come in and definitively identify the device as a GPS before issuing subtype probes.

I’m underwhelmed by the argument that I should include switches and configurability because I don’t know how users are going to use my software. If I don’t know how they are going to use my software, what confidence do I have that I can add the correct switches or configurability?

>If I don’t know how they are going to use my software, what confidence do I have that I can add the correct switches or configurability?

Sometimes you can apply general principles. For example, one of the GPSD programs is gpsdecode, which does batch decoding of GPS/AIS data streams and logs. It has both a default AIS decoding to JSON objects and a switch to emit the same information in a simple DSV format. I don’t know who uses that, but I do know that many data-analysis tools that don’t parse JSON do accept DSV data.
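
The switch in question is a one-liner at heart. A sketch (this is not gpsdecode’s actual code; the record and field names are invented) of emitting the same decoded record either way:

```python
import json

def emit(record, fields, dsv=False, sep="|"):
    """Render one decoded record as JSON (the default) or as DSV."""
    if dsv:
        # Same information, lowest-common-denominator framing.
        return sep.join(str(record[f]) for f in fields)
    return json.dumps({f: record[f] for f in fields}, sort_keys=True)

fix = {"type": 1, "mmsi": 366999712, "lat": 42.35, "lon": -71.04}
fields = ["type", "mmsi", "lat", "lon"]
print(emit(fix, fields))              # a JSON object
print(emit(fix, fields, dsv=True))    # 1|366999712|42.35|-71.04
```

The general principle does the work: any tool that reads tabular data can consume the second form, whoever the eventual user turns out to be.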

@Jeremy Bowers
IMHO, the developer must learn to challenge his own assumptions. Never assume your software is the only thing running on the machine. Never, ever, ever hard-code URLs/IP addresses/ports.

For example, a particular vendor whose name I will withhold in an effort to protect the guilty, shipped a software package that came with a Sybase back-end. It never occurred to said vendor that users might have other things running on their respective machines that also relied on Sybase. And so they hard-coded the port of the database. For most of their customers, it was all OK and things “just worked”. But for a considerable minority, “hilarity ensued”.

I’m underwhelmed by the argument that I should include switches and configurability because I don’t know how users are going to use my software.

No kidding. You pretty much have to have some idea of how users are going to use your software. Even in esr’s example, where he doesn’t know who uses DSV, the thing is DSV is one of those lowest-common-denominator formats that any program that emits tabular data is going to use because, if nothing else, you can always import it into a spreadsheet for analysis and printing and so forth. Or even import into a database, etc. IOW, you don’t have to know how users are going to use DSV because it’s almost a guarantee that someone will need it. And the switch is a no-brainer because you have to have some way of signaling to gpsdecode to emit the non-default format.

If you don’t know what your users are doing with your program, you shouldn’t be writing it.

If you don’t know what your program is supposed to do, you’d better not start writing it.

— Edsger Dijkstra

If you don’t know what your users are doing with your program, you shouldn’t be writing it.

— Morgan Greywolf

I used to believe these things. I still believe that you should have a glimmer of an idea of what your program is supposed to do and what your users will do with it before you get started, but for some problem domains, memory-managed dynamic languages like Python dramatically reduce the up-front design required before you start coding.

The hard part of writing throwaway code is working up the courage to throw it away. After you’ve spent 5 hard days writing something in C, it psychologically represents a large investment. But it’s often true that you can get to the same functionality in an hour or two in Python, and then it’s not such a huge investment.

Also, dynamic languages can make it much easier to refactor code, to get from where you are to what your experience now tells you is a better design.

IMHO the most interesting part of LISP-like languages is that they can make a hierarchical config or data file directly executable and basically become the application itself. This is highly exciting.

Hierarchy is undoubtedly nice. But a lot of end-users might choke at the syntax.

Ordinary people from all walks of life are used to using indentation to represent hierarchy. But for some reason, some programmers will simultaneously claim that they indent their code consistently and that it’s really a horrible idea for that indentation to be meaningful to the computer (a la Python). Go figure.

Anyway, YAML contains the kernel of a few good ideas, but it spun out of control and got way too complicated. That’s one of the reasons I developed rson. You can see rson in action with rst2pdf stylesheets. For example, the default stylesheet.

Shenpen Says:
“IMHO the most interesting part of LISP-like languages is that thy can make a hierarchical config or data file directly executable and basically become the application itself. This is highly exciting.”

I don’t find that at all exciting. Actually it sounds like a great description of what not to do. Hence all the “Eval Considered Harmful” articles that dot the landscape. Whether as a user or a programmer I don’t want my data to be executable. Call me old school: code and data should be distinct.

> I don’t find that at all exciting. Actually it sounds like a great description of what not to do

If functions are first class citizens in your language, then you can store functions, pass them around like any other argument. You can essentially write programs that themselves compose programs. If you do that, the distinction between code and data becomes less meaningful.
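
For instance, a toy Python sketch (invented names, nothing more): a pipeline built by composing functions is a program assembled out of parts that are stored and passed around like ordinary data.

```python
def compose(*fns):
    """Build a new function: the left-to-right pipeline of fns.
    The functions are first-class values here -- stored in a tuple,
    passed as arguments, returned as a result."""
    def pipeline(value):
        for fn in fns:
            value = fn(value)
        return value
    return pipeline

def strip_comment(line):
    return line.split("#", 1)[0]

def normalize(line):
    return line.strip().lower()

clean = compose(strip_comment, normalize)   # a program composed from data
print(clean("  GPSD # the daemon  "))       # gpsd
```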

> Call me old school: code and data should be distinct.

Tell that to the OOP loonies. Code and data should definitely be distinct in imperative programming.

uma Says:
“Tell that to the OOP loonies. Code and data should definitely be distinct in imperative programming.”

OOP encapsulation isn’t a violation of distinct code and data. Notice I said “distinct”, not “separate”. But as you point out the various “factory” patterns and other “programs that write programs” are a wonderful way to cross the line – if the objective is to write unmaintainable code.

But even in bazaar-mode development or extreme programming, you aren’t writing code and throwing it away over and over. You may throw away one or two, but you’re basically writing code and avoiding nailing features down until they’re needed.

But notice the difference between what I said and what Dijkstra said. Dijkstra thinks you should know what the program will do before you start writing it. I’m not recommending that; I’m simply saying that proper software development methodologies necessarily entail communication with your end users and an understanding of how they are using your code. What shifts is when you understand that, not what you understand.

The main difference in bazaar-mode development over XP is that bazaar development presumes that at least some part of your end users are themselves hackers and that they will code what they need. That means that you, as lead developer, may not understand how your users are using your code until after some part of it is written; but total ignorance is still impossible.

>“programs that write programs” are a wonderful way to cross the line – if the objective is to write unmaintainable code.

Quite often programs that write programs are the only way not to have unmaintainable code.

Case in point: the AIS support in GPSD. The AIS protocol (especially the International Maritime Organization extension messages) is a design disaster. Tight-packed binary datagrams, weird field encodings, 62 major message types with literally tens of thousands of sub-variants. If I had to write all the bitfield-extraction and JSON-generation code by hand I’d go insane. More to the point, the defect rate would be unacceptable.

The fix: four little code generators written in Python and driven by declarative specification of the message layouts (the code is in devtools/tablegen.py in the source tree). These generate C code that I can paste into the driver and a few other places with minimal hand-tweaking.
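
The shape of such a generator is simple even when the real one is not. A toy sketch of the idea, not GPSD’s actual tablegen.py: a declarative layout table drives generation of C extraction statements (the `ubits()` call and the field entries here are illustrative assumptions).

```python
# Declarative message layout: (field name, bit offset, bit width).
SPEC = [
    ("msgtype", 0, 6),
    ("repeat",  6, 2),
    ("mmsi",    8, 30),
]

def generate_c(spec, struct="ais"):
    """Emit one C assignment per field, unpacking via a ubits()-style helper.
    Changing the table regenerates the code; no hand-written bit-twiddling."""
    return "\n".join(
        f"{struct}->{name} = ubits(buf, {offset}, {width});"
        for name, offset, width in spec
    )

print(generate_c(SPEC))
```

The maintainability win is that the specification stays declarative and reviewable, while the tedious, error-prone C is mechanically derived from it.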

esr Says:
“The fix: four little code generators written in Python and driven by declarative specification”

To me this isn’t quite “code writing code”. Often the best way to solve a thorny problem like this is exactly what you say – by an external declarative or table-driven description.

But that’s a long way from Shenpen’s “config or data file directly executable and basically become the application itself”. Yours smacks of sound engineering, the latter of “we did it ’cause it would be cool”.

> But for some reason, some programmers will simultaneously claim that they indent their code consistently and that it’s really a horrible idea for that indentation to be meaningful to the computer (a la Python). Go figure.

Holy wars over what the “best” indent style is.

In fact, I’d expect that those programmers who are the most consistent about how they indent are the ones with the biggest objections. The more consistently they indent, the more inflexible they’re likely to be about switching to a “heretical” indent-style demanded by a meaningful-indentation language.

> But notice the difference between what I said and what Dijkstra said.

For a lot of the exploratory programming I do, “If you don’t know what your program is supposed to do, you’d better not start writing it” and “If you don’t know what your users are doing with your program, you shouldn’t be writing it” seem to be functionally identical statements to me, if only because I am the only user, and I don’t know what my program is supposed to do because I haven’t yet figured out what my user (me) is doing with it…

@Deep Lurker:

> The more consistently they indent, the more inflexible they’re likely to be

That may be true for some people, but I haven’t actually found that to be the case for the actual people I know. What I do find is a deep disconnect between what people say they do and what they do, and an allergy to a system that forces them to do what they say they do…

AFAIK, there are holy wars over where you put the first brace, but Python doesn’t care about those. The holy wars over indentation seem to have to do with number of spaces and spaces vs tabs, but Python doesn’t care about any of that either. It just wants you to be consistent.

I would say that it is very rare to know exactly what a new program should do and be when I start it.
First off, the specifications frequently change as the project goes along.
Second, the understanding of what we are trying to do will change as I and my users (typically two or three) start to understand the task better.

I mostly try to do it with the extreme programming paradigm — get a minimal program to work, then fix the problems and gaps as they become obvious. I’ve learned to mostly avoid adding features until a user requests them and can give a coherent reason why they want it.

Then I modify the program to meet that request and wait for the next one. Overall it works pretty well. We seem to have one of the better production data systems in the window industry.

The more consistently they indent, the more inflexible they’re likely to be about switching to a “heretical” indent-style demanded by a meaningful-indentation language.

That’s part of it, but for me it’s more a matter of “the very notion of One True Indentation Style” is wrong. My style is
blah
{
  blahblah
  blahblah
  {
    this-n-that
    this-n-that
  }
}
blah
so that the opening and closing brace are vertically aligned with each other; the braces and a column of space between them could be pretty-printed to have the same background color, making for a very obvious block.

But I make exceptions to that style where I think strict adherence to fixed rules actually gets in the way of readability. A compiler that treats indentation as significant can’t make exceptions ever.

I do that because otherwise the entire body of every function would be indented, wasting precious space when I’m working in an 80-column, 24/25-line environment (which is a majority of the time). I also use only two spaces per level of indentation most of the time because tabbing produces wrap-around, which is VERY hard to read in that environment.

But I don’t want a compiler/interpreter enforcing my personal preferences on every programmer any more than I want a government mandating my religion on every citizen.

That’s exactly what I’m talking about. You don’t have to know what the program will do and be when you start out, but at some point, you do know how your users are using your code. As you just pointed out, it’s impossible to know how to implement a request for a new feature until the user gives you a coherent reason why he or she needs it.

Yes, and the point of my original post as that (especially for user-facing software) programmers need to start seeing options as a cost rather than as a benefit. A cost one has to pay sometimes, but a cost.

Replace “programmers” with “game designers” and you will understand why many games become less enjoyable at approximately f(op)^1.414 where f(op) is the function of the number of additional options added to the game.

OOP encapsulation isn’t a violation of distinct code and data. Notice I said “distinct”, not “separate”

It is. Objects by definition hide data (state) from you. And their behavior often depends on that hidden state. The distinctness between code and data might be valid inside the object but not outside. Outside you might call an object.method with the exact same arguments and get completely different results on every call. This does not happen in a procedural program unless the code of the function is dependent on some global state (which is rare). Unlike hidden state, global state is transparent to the entire program. Because it is transparent the mental model (of code and data distinctness) is preserved in traditional procedural programming, but not in OOP.

It is. Objects by definition hide data (state) from you. And their behavior often depends on that hidden state. The distinctness between code and data might be valid inside the object but not outside. Outside you might call an object.method with the exact same arguments and get completely different results on every call. This does not happen in a procedural program unless the code of the function is dependent on some global state (which is rare). Unlike hidden state, global state is transparent to the entire program. Because it is transparent the mental model (of code and data distinctness) is preserved in traditional procedural programming, but not in OOP.

I don’t think you mean what i would normally understand you to mean when you say “distinct code and data”.

If your argument is that when you access something in, say, c# that looks like a property you end up executing a function… ok that’s blurring the distinction between code and data. If you’re talking about things like Reflection and generating classes out of data, yeah that might be defensible. But how is encapsulated state violating distinct code and data? The distinction between this is data (e.g. exists in the data segment) and this is compiled code (e.g. exists in the code segment) isn’t affected by whether that data is private or public, that’s basically syntactic sugar and compiler errors.

When i think “distinct code and data” i think the opposite of what ASM programmers would do when they would reduce the size of their programs by using the op-codes of the program to initialise data. I think of building trampoline function calls that replace the pointer at the calling location with a different (usually more appropriate) pointer when being first run.

Also are you arguing that world visible global state is a good thing? Because if so, i don’t think i could possibly disagree with you more. Global state is a useful thing, very rarely it’s even a necessary thing, but i don’t think i’m being excessive when i say it’s never a good thing or a desirable thing. And i’m not alone in thinking this, ACM had a paper (quoted by wikipedia) published about this in 1973. C2 has a page about it here. In the follow-on link about the ACM paper it describes global variables as being “slightly better than … GoTo”.

if your argument is that when you access something in, say, c# that looks like a property you end up executing a function… ok that’s blurring the distinction between code and data

Not just that. There is hidden state, that you may NOT have access to in any way, that impacts global behavior. I can call the same public method with the exact same argument list and get two completely different results in every call. This is hideous. It is evil. It’s bad. And YES, it is far worse than global variables.
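
A tiny Python illustration of the complaint (the class and names are invented): the method’s answer depends on state the caller cannot see, while the procedural counterpart carries all of its state in the argument list.

```python
class LabelMaker:
    """Hidden state: two identical calls give two different answers."""
    def __init__(self):
        self._count = 0              # invisible to the caller

    def label(self, prefix):
        self._count += 1             # hidden state mutates on every call
        return f"{prefix}-{self._count}"

def label_pure(prefix, count):
    """Procedural counterpart: same arguments, same answer, always."""
    return f"{prefix}-{count}"

m = LabelMaker()
print(m.label("job"))   # job-1
print(m.label("job"))   # job-2 -- same arguments, different result
```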

Also are you arguing that world visible global state is a good thing? Because if so, i don’t think i could possibly disagree with you more.

Compared to hidden state of objects which is also global state: YES. I’d rather have visible/transparent state than hidden state controlling behavior any time of the day. Global variables are a lot less harmful than global objects with hidden state, and if used sparingly, intelligently, and with a thin API access/manipulation layer around them they would be far better than the hidden state of objects which in effect is global state too (that affects global behavior), only that it is scattered all over the place and infinitely more difficult to debug.
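
That thin-layer discipline is easy to sketch in Python (the names are invented; the point is that the global lives in exactly one place and is only touched through two audited functions):

```python
_SETTINGS = {}   # the single global; by convention, touched only via the API below

def get_setting(key, default=None):
    """Read access to the shared state."""
    return _SETTINGS.get(key, default)

def set_setting(key, value):
    """Write access; returns the previous value, which helps when debugging."""
    previous = _SETTINGS.get(key)
    _SETTINGS[key] = value
    return previous
```

A nightly grep for `_SETTINGS` outside this module is then enough to enforce the convention, exactly as described above.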

ACM had a paper (quoted by wikipedia) published about this in 1973. C2 has a page about it here. In the follow-on link about the ACM paper it describes global variables as being “slightly better than … GoTo”.

Sure. And they gave us what instead ? They gave us C++ -> which is WORSE than GoTos. Isn’t Bjarne a fellow of the ACM and recipient of ACM awards ?

I imagine there is some degree of hyperbole in Eric’s claim. There is no doubt that configuration options are a path of least resistance to either making a design decision or automatically detecting something. However, clearly there is a balance to be had. For example, it is far from obvious how one would re-engineer Apache to work without a configuration file (or files.)

Some have pointed out the exponential growth introduced by command line options, but that growth is, to some extent anyway, representative of real user needs. I remember when I discovered that many file utilities (mv, cp etc.) were in fact exactly the same program, just hooked up with links (I don’t know if that is still true today, but it was when I looked.) The program name was kind of like a command line option. Here the designer choose to multiply the number of programs rather than the number of program options, trading off two different kinds of complexity (the cost of learning a new program verses the cost of learning some new options to an existing program.)

I think another example of that would be grep/egrep/fgrep/agrep. Should they be separate programs? I think it is a judgment call (and I’m going to bet the first three are identical with links too.) Same deal — incorporate some options into the name of the command itself.
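
The argv[0] trick is easy to demonstrate. A toy Python sketch of such a multi-call program, where the name it is invoked under selects the behavior (the names and the trivial behaviors here are invented for illustration):

```python
import os

def do_copy(args):
    return "copy " + " ".join(args)

def do_move(args):
    return "move " + " ".join(args)

# One program, many names: the command name acts as the "option".
DISPATCH = {"cp": do_copy, "mv": do_move}

def run(argv):
    name = os.path.basename(argv[0])
    return DISPATCH.get(name, do_copy)(argv[1:])

print(run(["/bin/cp", "a", "b"]))   # copy a b
print(run(["/bin/mv", "a", "b"]))   # move a b
```

Installed under hard links named `cp` and `mv`, the one binary behaves as two tools, trading option complexity for name complexity just as described.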

The plain fact is there are certain types of information that can’t be auto-detected. Should this directory allow executable files on my web site? What is the IP address of my SMTP server? What is my security certificate? What blog posters do I want to automatically throw in the spam bucket? Etc. What screen saver does this user want to display on his desktop?

GPSD is a simple program in a sense — not to say it doesn’t have huge internal complexity. However, ultimately its spec is simple: produce reports of your position on this port in a reliable format. Most programs have less clear user level specifications.

For example, it is far from obvious how one would re-engineer Apache to work without a configuration file (or files.)

thttpd :-P

Well, okay, it has a config file, but it’s very, very small.

The plain fact is there are certain types of information that can’t be auto-detected. Should this directory allow executable files on my web site? What is the IP address of my SMTP server? What is my security certificate? What blog posters do I want to automatically throw in the spam bucket? Etc. What screen saver does this user want to display on his desktop?

Should this directory allow executable files on my web site?

Use the noexec option to mount, then you don’t have to worry about it.

What is the IP address of my SMTP server?

Can you say “MX record”? (yes, both sendmail and postfix support using an unspecified smarthost this way)

What is my security certificate?

There are ways of doing this; certificates could come from LDAP, for example.

What blog posters do I want to automatically throw in the spam bucket?

This, like several others (including the address of the SMTP server and the server certificates), isn’t necessarily a configuration item; it is actually data.

In terms of your original argument, I can see a point of view where you get OO == “mixed code and data” but i wouldn’t term it that way at all. Instead OO is about making unmediated access to data a thing of the past. Why unmediated? Because unmediated access to data makes it much more difficult to enforce certain constraints on the data, the most obvious one being concurrency management.

Sure you don’t _have_ to enforce constraints, but not doing so just means when someone has a brain fart the result is a broken product and an angry support call. Don’t be that guy.

Not just that. There is hidden state, that you may NOT have access to in any way, that impacts global behavior. I can call the same public method with the exact same argument list and get two completely different results in every call. This is hideous. It is evil. It’s bad. And YES, it is far worse than global variables.

I doubt i’ll convince you otherwise but let me take one run at it.

Part of being a developer is all about controlling dependencies. The only way that you can keep developing software (past the toy program stage) is to reduce the range of things that can be affected by a given change. I say this because any place that is affected by a change has to be known about and checked for consistency with the change.

Global variables, by their nature, have the potential of a complete graph in terms of dependencies. So if i make a change to how I use that global variable, i have a potential set of affected code modules of “everything”. “But surely we only need to look at things which include the affected module?” i hear you say. True, unless someone’s created (knowingly or not) a colliding global object which makes our problem worse because now we’ve got the same block of memory that’s actively being used for two different purposes.

Our only mechanism to combat this is social in nature (“start all global variables with ____ and don’t ever redefine them unless you want to debug when things go wrong”).

Don’t get me wrong, i’m not suggesting that we should get rid of globals as a reflex. They serve a purpose that most languages don’t solve. But at the same time I don’t think we should accept any use of them as being “good” or “right”, especially when there’s another option.

They gave us C++ -> which is WORSE than GoTos. Isn’t Bjarne a fellow of the ACM and recipient of ACM awards ?

Oh, the fallacies… but let’s move past that.

This comparison is best described as “Apples and Oranges”, especially when you consider that C++ itself supports goto. C++ isn’t a particularly good implementation of OOP at the best of times and has other issues besides. Short version: C++ isn’t necessarily OOP, and not all of C++’s shortcomings are because of OOP.

And don't get me wrong: like all methodologies, OOP has its benefits and its drawbacks. In a million years coders will look back at OOP and feel much the same about it as many coders feel about waterfall now... "under what flawed understanding of the universe could this possibly be considered right?" Except that we weren't there to see the wild-west development processes in use before waterfall, just as those million-year coders won't have been here to see the horrors OOP is intended to combat; one of those horrors is the spaghetti code that comes about, in part, from an over-reliance on global variables.

Saying "OOP encourages encapsulation and encapsulation is worse than global state" is, to me, about as backward as saying "Science encourages medicine and medicine is worse than therapeutic bleeding". As I said above, however, I doubt my ability to convince you of that. Time will tell.

Instead OO is about making unmediated access to data a thing of the past. Why unmediated? Because unmediated access to data makes it much more difficult to enforce certain constraints on the data, the most obvious one being concurrency management.

Unmediated access to exposed global data is far better than mediated access to a black box where that black box has its own hidden state, where a whole bunch of other black boxes have their own hidden state too, and where the interdependencies and interactions between all these objects are subtle and dependent on that aggregate hidden state spread across those objects. The bugs that result from OOP are far more subtle and devilish than anything you will see from global variables or even from GoTo statements.

The solution to GoTos is easy (a grep script that runs every night). The solution to global variables is easy: use them sparingly, manipulate them with a transactional library (a thin layer), and have a grep script to make sure that no one tries to mess with the global vars directly inside his/her subroutines. Both solutions are manageable.

On the other hand, there is no solution to OOP. The only solution I can think of is putting all your global state in one single object and manipulating that state via methods while keeping all your other objects stateless. But if you do that, you might as well ditch OOP altogether, because your stateless objects then become nothing but libraries of subroutines and not *real* living objects. You'd practically be back in the procedural paradigm if you did this.

Part of being a developer is all about controlling dependencies. The only way that you can keep developing software (past the toy program stage) is to reduce the range of things that can be affected by any given change.

And how exactly does OOP succeed in this? Controlling dependencies is all about tightly controlling global state, not distributing it across a maze of global objects and pretending that it doesn't exist.

Global variables, by their nature, have the potential of a complete graph in terms of dependencies

It is the other way around. It is distributed stateful objects, interdependent on one another, that are the recipe for the kind of subtle dependency graphs that can make your head spin. A neatly organized group of global variables can far more easily be tracked vis-a-vis their relationship to the program logic and one another.

Yes. It can be harmful. The Windows registry, for example, where they took a large group of uncoupled entities (applications) and tightly coupled them through one point. The damage wasn't from the singleton itself as much as it was from forcefully coupling otherwise uncoupled entities.

Unmediated access to exposed global data is far better than mediated access to a black box where that black box has its own hidden state, where a whole bunch of other black boxes have their own hidden state too, and where the interdependencies and interactions between all these objects are subtle and dependent on that aggregate hidden state spread across those objects. The bugs that result from OOP are far more subtle and devilish than anything you will see from global variables or even from GoTo statements.

I completely disagree, and the last twenty years of computer science disagree, but I will leave it there. I will not be able to convince you.

So you are saying the way to eliminate configuration files is to stop calling them configuration files?

No. I’m saying that what you are calling “options” aren’t necessarily options, but are actually data — inputs to the program. And data is discoverable. A program wanting to know where the SMTP server is can simply discover that via zeroconf DNS entries. A Web server doesn’t need an option for “this is my CGI directory” — simply allow CGI in all directories and use filesystem and mount permissions to control which folders actually allow CGI programs to run.

IOW, Unix has a rich infrastructure for doing things like this. Use it.

> I completely disagree, and the last twenty years of computer science disagree, but I will leave it there. I will not be able to convince you.

I also disagree. I am very careful about not using globally visible variables in the programs I write, with a limited exception for debugging flags.

GPSD represents what I think is good style. The place of global variables is taken by a context structure which points to an array of session objects managing each attached device (which, this being C, are just structs with method pointers in them). There is another array of structures for managing client connections – and that’s it. These structures are owned by the daemon’s dispatcher layer, the top level of the program, and are not directly visible anywhere else. Everything else passes around structure pointers.

By organizing my `globals' in this way I ensure that all my common storage is in one place that I can easily eyeball, so potential interactions among different bits are easier to spot. It also makes writing diagnostics that dump all this state relatively easy. And because all the functions that mutate this state have explicit pointer arguments to a session or context, the data dependencies internal to the code don't get swept under the rug.

Morgan Greywolf Says:
> No. I’m saying that what you are calling “options” aren’t necessarily options, but are actually data

It doesn’t seem to me there is always a bright line between the two.

> — inputs to the program. And data is discoverable. A program wanting to know where the SMTP
> server is can simply discover that via zeroconf DNS entries.

Isn’t the DNS just a fancy config file?

> A Web server doesn’t need an option for “this is my CGI directory” — simply allow CGI in all directories
> and use filesystem and mount permissions to control which folders actually allow CGI programs to run.

I think most webmasters reading along just broke out in hives. Certainly what you suggest is doable; it just isn't such a great idea. Config files allow you to gather all the important configuration data in one place, rather than distributing it over multiple different configuration systems.

I think what you’re saying is actually closer to what I am saying than to what JonB is saying if you read the whole exchange.

Your global state is concentrated not distributed. You’ve a used a good technique to isolate it and to make it owned only by your daemon (or what I called the thin transactor layer). I don’t disagree with that. You’ve used global state sparingly (whether pointers to structs or Full Status/Debug Globals). I don’t disagree with that. You pass it around as pointers to any function that has a business of reading/altering that global state. That is the right thing to do.

You’ve done the 1st part of managing global state, namely isolating that state, concentrating it in one place, and managing access to it.
The 2nd part imo is to limit the number of functions that receive these struct pointers (the handles to the global state). Any subroutine that receives a handle to the globals might produce completely different results if called with the exact same argument list.

In other words if you think of your subroutine as a math function, you no longer have a 1-to-1 mapping between inputs and outputs. The output of your routine might be dependent on the global state. That violates the cleanliness of functions as mathematical constructs which should always accomplish a 1-to-1 mapping between inputs and outputs. By limiting the number of functions which violate the 1-to-1 rule you drastically reduce the number of places that can cause serious bugs in the program.

All the techniques you described are good "procedural" techniques. OOP on the other hand is about distributing global state, not concentrating it and abstracting it out into one place. It's about creating living objects, each with its own encapsulated state. If JonB were to organize his OOP program the way you described, it would no longer be OOP; it would be procedural. The only thing that would be OOPish about it is the subroutine calling syntax. The syntax of calling his subroutines would become myLib.myMethod(arg1 ... argN) rather than the less verbose myFunction(arg1 ... argN) :-)

How about LDAP and/or Active Directory? What about a database? The Windows registry? GNOME’s gconf? You’re right that it’s a wide fuzzy line.

Personally, I view DNS as yet another kind of directory, similar to LDAP, rather than as a fancy shared /etc/hosts file.

I think most web masters reading along just broke out in hives. Certainly what you suggest is doable, it just isn’t such a great idea. Config files allow you to gather all the important configuration data in one place, rather than distributing it over multiple different configuration systems.

Obviously it’s not standard practice, and it’s certainly not what I do on any of the public facing Web servers I maintain professionally. After all, there is something to be said for tradition and standard practice. However, I pointed it out to show that you could make a Web server that requires little or no configuration data.

I take a more holistic view of the system than most IT people. Ask your network admins if they know how to code in C, Python or Java. Ask your Windows admins if they know the best way to tune an Oracle database. Ask your storage admins what they know about carving out VLANs on your Cisco switches. I'm not very much like those people. Ask your Unix admins to debug live applications or design a new N-tier architecture. Watch the funny looks they give you. I know all of those things. I just happen to be called a "Unix administrator" right now. :)

Transactor layer = the layer that reads/writes the global state (whether by direct assignment or through struct pointers)
————————————————————-

Functional layer = the layer that implements the program logic
————————————————————-
a) Satisfy the 1-to-1 rule in as many of the functions as your imagination and creative design allow.
b) Limit the number of functions that can call the transactor layer to the absolute minimum.

The traditional procedural paradigm allows you to achieve this picture far better than OOP. That is pretty much what I am saying.

All the techniques you described are good “procedural” techniques. OOP on the other hand is about distributing global state not concentrating it and abstracting it out into one place. It’s about creating living objects each with its own encapsulated state.

I’m having a hard time holding on to what you mean by global state.
By global state I mean variables (of any kind) created with global scope (e.g. ones you can declare extern in C/C++).

OOP does not lead to the distribution of this state. I believe OOP implies the emptiness (or as close as possible) of this state. The closest you would get are items that are technically in that layer (in terms of implementation details) but are only accessible from a small set of files (part of the inheritance tree), which is enforced by the compiler.

Yes this kind of state is distributed but this is not global state because it’s already limited in scope.

This is what i’d call global state and (as is the case with GPSD) it should be empty (or as empty as possible). This is what i’m railing against, this is what is worse by a long margin than anything that OOP may throw at me. Everything in here is a potential source of pain.

I’m not railing against variables that you create on the heap in the local scope and pass around through function arguments (including constructors).

I’m having a hard time holding on to what you mean by global state.
By global state I mean variables (of any kind) created with global scope (e.g. ones you can declare extern in C/C++).

"Scope" does not define global state. It merely defines the accessibility of a variable. A computer program is nothing but a clocked state machine. The clock comes from the time reference, which in turn increments the program counter as the program executes. That "global" state is a snapshot of all mutables that affect the global behavior of the program at any given point in time, regardless of how/where that state is distributed within the program and the means by which you access that state. That is what I mean by "global state".

It is not scope that defines the "globality" of state, and that is the fallacy most people fall into. What defines "globality" is whether that mutable can affect the global behavior of the program. That is why other programming paradigms make it their primary goal to combat mutables through and through. Mutables can easily fool you into thinking they are not "global" simply because their scope is limited.

OOP is far worse than traditional procedural programming on that count, because of the mental model it encourages. OOP encourages the creation of numerous *living* objects, each with its own hidden state. It views the world as objects, not as processes. The objects interact with each other through the program logic in very subtle ways. And yet much of the "global state" is hidden, distributed, and oftentimes inaccessible to you if you want to debug something. It is in that context that I proclaim global variables (those with global scope) as the lesser of the two evils, because at least with a global variable I can set a breakpoint on it, while with hidden state that impacts global behavior (which I also call global state) you might not even be able to do that.

uma, your argument entirely baffles me. Frankly I find it amazing that you would post your opinions here on this blog: a blog populated by many people who have actually written and maintained many large programs, who know from experience how deeply difficult global variables make it to manage and maintain such software, and who simply don't experience the phantom "hidden state" problems you are referring to. I think part of the problem is that your argument is so hard to follow. It is this hand-wavy theoretical thing, saying things like "A program is nothing but a clocked state machine."

Can you give me a real world example of a program you wrote where this putative hidden state caused you serious problems, that would clearly not have happened had you been using global variables instead? Can we perhaps talk in concrete terms rather than theory?

I also want to add that I think "global variables" and "object oriented programming" are in fact entirely orthogonal concepts. "Global variables" means variables outside of the stack (where "variables" means both values and references to heap data). Object oriented programming is a way of thinking about representing problem domains in software. Which is to say, an object is not a program artifact, but a representation of a real-world domain-level item. Of course it is not a perfect mapping, but without referring to the source code I am sure Eric has objects that represent GPS devices, connections to his server, GPS data formatting conventions and so forth. These objects are not program artifacts but domain artifacts. Whether they are global or stack is entirely irrelevant.

"Object oriented programming" is not about code and data together. That is just syntactic sugar. The benefits of object oriented programming arise out of this simple fact. For example, regardless of how the spec for GPSD changes, it is very unlikely that it will stop calling for GPS devices and user connections, even though their precise definition might change. This makes the design much less brittle to the inevitability of spec changes.

But not to let you off the hook and dive into this theoretic fluff I just talked about, lets talk concrete examples: bugs that nasty hidden state caused that were hard to find and would have been candy with global variables. Lets hear them.

But not to let you off the hook and dive into this theoretic fluff I just talked about, lets talk concrete examples: bugs that nasty hidden state caused that were hard to find and would have been candy with global variables. Lets hear them.

Uma’s argument is technically correct. But in reality? I’m not sure so many subtle “hidden state” bugs exist. I’ve had a few pop up over the years that left me scratching my head for a bit, but nothing I couldn’t eventually trace back to an ID ten T error on my part.

“Scope” does not define global state. It merely defines the accessibility of a variable. A computer program is nothing but a clocked state machine. The clock comes from the time reference, which in turn increments the program counter as the program executes. That “global” state is a snapshot of all mutables that affect the global behavior of program at any given point of time, regardless how/where that state is distributed within the program and the means by which you access that state. That is what I mean by”global state”.

The behaviour of a program at any given point in time (sans concurrency) is solely affected by its inputs and any globally accessible variables. This is relatively uniform across all imperative languages, certainly all that I have used. The only thing that OO brings to the table is an implied pointer to the owner of the method (method being defined as a function attached to an instance of an object).

Personally, your definition still leads me to globally scoped variables being worse global state than anything OO could do... if only because of the potential that a function I call in a different (unrelated) module is capable of modifying that state variable. Perhaps, as Jessica mentions, your horror story will clarify where the specific pain point is.

BTW, I expect the common response to be that your example is either a poorly designed OO class or violates some obscure principle that applies half of the time. An issue with OO is that it is a "paradigm in flux", and most training in it is both incomplete and old. For example: early OO was all about modelling the real world in classes; more modern OO has a tendency to shelve that idea as unworkable in practice and has moved more towards a "model what matters while maintaining simplicity" stance, and yet uni students are still taught the "bikes, cars and trucks are all vehicles with 2, 4 and many wheels" style examples.

> An issue with OO is that it is a “paradigm in flux” and most training in it is both incomplete and old.

This. In fact, when I was hastily skimming your post, I mentally read “incompetent” for “incomplete.” On further reflection, that’s still probably a better word.

I have deliberately done OO programming in plain C on multiple occasions. In C you can declare a pointer to an unknown opaque type, and no functions outside your library know what’s in the type.

For embedded systems, you can even have a function that returns the size of the underlying type, so you can allocate memory for your object outside the library. Then you can pass around the pointer to the object to the library functions.

I have also done OO, including virtual function tables, in the distant past in plain assembly language. The MS and Turbo assemblers were works of art, and you could do things in macros that any competent lisp programmer would grok but that would completely baffle C programmers. (Mind you, I have also written C macros that completely baffled some C programmers, but that’s another story — the C macro processor is not Turing complete, and doesn’t have any knowledge of the code it is helping to build.)

It seemed amazing to some of my co-workers back in the early 90s that I could add an additional assembly language file to the link, and it would wire itself in and change the behavior of the rest of the driver without any explicit calls to it.

Having said all that, I never really got into C++. Most of the features always seemed like ill-conceived solutions in search of problems. These days, C is my seldom-used assembly language and Python is my high level language. (When I’m not writing documentation or Verilog.)

OO doesn’t entirely prevent the problems that arise from global variables. What it does is to force the programmer to think about what variables really have to be globally accessible, and to create explicit methods for reading/setting those variables. The mindset is that each object in the program is on a “need to know” basis. Every bit of information that it really needs to do its job is explicitly passed either when one of its methods is called or it calls the methods of some other object. Then each object class can be programmed by a different person/team; they have no concern for the internals of another class. So long as the class reliably does the job it’s specified to do, the other programmers don’t, and generally shouldn’t, care how it does it.

You can still have problems if you don’t implement those methods carefully enough, and there’s always the possibility of one thread having used a Get method right before another used a Set method, so that different parts of the program are using different values of that variable. But there are ways to address that as well.

In a non-object language, the project manager will have to issue a document to the team, describing the rules for handling global information. He might even order a header file that declares macros for setting and reading globals, and hides from the programmers the actual names under which the variables are stored. The compiler won’t enforce all those rules for him, although it’s probably not too difficult to have a makefile scan the source files for modules that access those variable names rather than the macros, and abort the build after ratting out the offending programmer.

It’s just a question of whether you are willing to accept the overhead imposed by the compiler doing all of this for you, or the responsibility for doing it yourself. Like everything else, it’s a trade-off, and there is no single right answer for all situations.

Morgan Greywolf Says:
> I’m not sure so many subtle “hidden state” bugs exist. I’ve had a few
> pop up over the years that left me scratching my head for a bit,

Just to be clear, the question is not “give an example of the putative hidden state causing a problem,” it is “give an example of the putative hidden state causing a problem that would not have arisen had a global variable design been used.”

Since turn about is fair play, let me offer a counter example.

A while ago I was writing a server that listened on a socket for a client connection. When the connection took place it initiated a conversation with the client, which gathered data (user name, statistics, request response) and when complete saved that data. Since clients connected only rarely the original design was that the server would handle only one client at a time, and if a second client happened along it would have to hang out in the TCP buffer until the first was done.

However, anyone who knows software knows what comes next. Yes, the spec changed and I had to handle multiple simultaneous clients. In the original design, I could have stored the collected data from the session in a global variable. Fortunately, I did not. After the server accept on the socket, I passed an object around the stack, and stored the object in the "hidden state" of the server listening object. To implement multiple threads, I changed my single item into a list, and continued to pass around a single object as before. Since there were no other global variables, the code changes were minimal. Yay me. If I had stored that session state in a global variable, this change would probably have required a touch to just about every single function in the server.

In this case, it is patently clear that global variables are really, really bad.

I chose this example not because it is particularly bad, but because it represents a very common class of problems with global variables. I’ve seen this sort of problem many times, and debugged it out of crappy legacy code more times that I care to remember.

Picture this lovely world with me... where everything is object-oriented. What I mean is that even taking a shit in the bathroom would be object-oriented in this world. EVERYTHING would be an object in this world.

Now let's try and picture what things would look like. In this world, rather than having data files for storing our word processing documents (e.g. ".doc", ".odt", etc.), the whole thing would be object-oriented. Every document file would not only contain the data but would also be bundled with the binary that would open it. Now, some OOP smartass would remark "Oh, you don't actually have to bundle the whole binary. Ha ha ha! All you'd have to bundle is an object wrapper that would wrap around the real app, and if the real app is installed on your system your document would open. If not, you'd get a messagebox."

That is the OOP world right there for you. The most idiotic idea ever invented in computer *science*. You'd have to binary-patch every document to open it up with something else (e.g. the OpenOffice OOP app). Oh, but wait a second, I forgot. Since we live in an OOP world there would be no such thing as binary patches, since those patches would be nothing but methods that can only be bundled up with the data/objects they're supposed to operate on.

That is the OOP world right there for you. In the real world that we live in, sanity has prevailed. A Word document on your system is in reality a "global variable". Any application can try to open it or muck around with it. But how many in reality try to do that!

If you understand this post, then you’ll understand everything else I posted here.

The Monster Says:
> Like everything else, it’s a trade-off, and there is no single right answer for all situations.

For sure, there are very few absolutely absolute rules. "Reducing global variables decreases program complexity" comes about as close as you can get, as programming rules go. If your girlfriend asks you if she looks fat in these pants, my advice is that in pretty much every situation a "no" would be the appropriate answer. But I agree, there are no absolutely absolute rules.

> Every document file would not only contain the data but would also be bundled with the binary that would open it.

That’s probably an apt description of some versions of MS Word documents.

As others have pointed out, this may reflect the way that object orientation is taught, but in no way reflects the way that object orientation is usefully employed by sane practitioners.

> A word document on your system is in reality a “global variable”.

You may consider it that, but it would be extremely silly for an application that loaded a document to keep information about it in global variables, because then it would be extremely difficult to open two documents simultaneously…

uma Says:
> A word document on your system is in reality a “global variable”.

I was looking for an actual example of a real programming problem, not fluffy theoretical stuff like “a word document is a global variable.” I was thinking along the lines of “I was writing this type of program and had this problem when I used OOP. If I had used global variables that problem would not have existed for this reason” How about a real, concrete example from a large program you have actually, personally, worked on. I gave you one from my personal experience.

In regards to your analogy, it is patently flawed. Every object in an object oriented program does not have a copy of the code. It has references to the code. Exactly the same is true in a GUI shell. In this case the pointer to the function is generally a command line, or an indirect reference based on something in the filename. But it is pretty much a close analog. So I think you are wrong. GUIs are actually an example of an object oriented paradigm that works pretty well.

But please, don’t spin your wheels on my theoretical vacuous fluff. I’m profoundly interested in programming, I am always interested in learning something new, so I really do want to hear a real world concrete example of what you are talking about.

As others have pointed out, this may reflect the way that object orientation is taught, but in no way reflects the way that object orientation is usefully employed by sane practitioners.

For every sane practitioner there are 10 insane ones and all they do is litter up the world with tangled webs of classes and utterly undebuggable shit. FOSS projects have more sanity and quality a) because they’re done by people who care b) because of what esr describes as the “peer-review” effect.

You may consider it that, but it would be extremely silly for an application that loaded a document to keep information about it in global variables, because then it would be extremely difficult to open two documents simultaneously…

Your global variable can be an array of pointers. The array would be truly global and static. Every time you open a new document you'd malloc the required data and assign one fresh pointer in the global array to point to it. The only limitation under this approach would be the maximum number of documents you can open, which would be defined by the size of your pointer array. Another useful global would be an integer (n_active_doc) that would index the location in the array which points to the active document. Every time you change the active document this variable would be changed to index the proper location in the pointer array.

If you make all your operations on this global state “transactional and atomic” (database style) and if you access it only through the transactor layer, you can manage your global state sanely, while at the same time sparing yourself an insane amount of handles that would have to be passed all over the place and up/down the hierarchy. The problem with the handles isn’t the extra protection they give you. The problem comes when/if for some reason you have to change your type, and if that change has to ripple all over the place.

Sure. In theory the global vars are *visible* everywhere. But you can solve that problem with a nightly script that you run to make sure that no one tries to muck around with these variables directly (ie without invoking a transactor function and without doing that at an appropriate layer in the design).

The example that I gave you was for a real system, that you, I, and everyone else use on a daily basis. It’s the most proliferated computing system on the face of the planet.

In regards to your analogy, it is patently flawed. Every object in an object oriented program does not have a copy of the code. It has references to the code.

So you are saying that you wouldn't have to binary-patch those references in order to open up your OOP doc in another program? You'd still have to patch the references, Jessica, and make them point somewhere else. I am pretty sure of that :-)

I think, though, that references wouldn't be enough, and that you'd actually have to bundle up a blob of binary executable with every document (what I called the wrapper with the message box) to truly conform with OOP methodology. That is another discussion, though.

But your example isn’t about writing code, and you didn’t write the programs you referenced. I was asking about your personal experience creating programs, where, apparently, you are appalled by this terrible object oriented paradigm. Forgive me for being cynical, but it sounds to me like you are one of these squawking crows who sit on the wall laughing at everyone else’s stupidity, while not doing anything themselves. Perhaps I have misjudged you, and if so you have my apologies.

If you have an actual programming example, I’d love to hear it. I’m very open to learning new things, but since I have both written and managed the writing of many large programs, and my experience directly contradicts your claims, I’m going to need some significant data to make me take you seriously.

It wasn’t a helpful example. Give us an example of a problem someone had while programming, please.

I can sympathize with you. I hate hidden state, especially when debugging. But your general complaint is true for many programs that read a file, query a DB or respond to a series of messages. The functions in the program will behave differently depending on old input data which changed the program state. This is not an OO problem though. It’s a lack of source problem, because you can have the same problem with procedural code using global variables, or even procedural code using a struct. If you don’t have the source to the procedural code, the state is hidden by obscurity.

We had an example like this today. The problem was a configuration parameter that needed to be updated to reflect a hardware change. But it was for a third party piece of software, so we couldn’t use a debugger. The instructions for that particular hardware change did not reference this rarely used configuration parameter so the production support people did not update it. Eventually someone read the documentation on special cases and fixed it, but until then it was a mystery. Yet that global configuration parameter had been visible the whole time if someone had known to look.

Conversely, if you have the source for an OO program, the state isn’t hidden. You debug right into the object which owns the state and the value is easy to see.

OTOH, I seldom use a debugger. I just don’t need one most of the time, and setting up a debugging session for enterprise systems on the rare occasions when I might is such a PITA that it is almost never done. Perl 5 doesn’t have the greatest debugger, and when I’m working with Java I’m writing code on my laptop but running it on some server, with one team member building, a second installing and another doing production support. Another entire team is sending us data and another entire team is receiving our results, all in real time. Working for a really large company with hundreds of interconnected apps means that, even if we all wrote procedural code with global variables, it would be exactly like hidden state in OOP. I don’t have either physical or login access to the hundreds of servers and switches I would need to debug the data flow. No one does. Our world is a very large set of black boxes.

For every sane practitioner there are 10 insane ones and all they do is litter up the world with tangled webs of classes and utterly undebuggable shit. FOSS projects have more sanity and quality a) because they’re done by people who care b) because of what esr describes as the “peer-review” effect.

This applies equally to globally scoped variables as well. For every sane coder there are 10 insane ones who will see nothing wrong with putting “extern int i” into their code. At least I’ll have the peace of mind that if I don’t use globally scoped variables, they’re not likely to smash what I’m writing directly.

If you make all your operations on this global state “transactional and atomic” (database style) and if you access it only through the transactor layer, you can manage your global state sanely, while at the same time sparing yourself an insane amount of handles that would have to be passed all over the place and up/down the hierarchy.

You know that you just described hidden state in OO right?
Perhaps the biggest difference is that this kind of hidden state is advisory whereas the “real deal” is enforced by the compiler.

Sure. In theory the global vars are *visible* everywhere. But you can solve that problem with a nightly script that you run to make sure that no one tries to muck around with these variables directly (i.e. without invoking a transactor function at the appropriate layer in the design).

I trust the compiler more than I trust your nightly script. (no offense intended)

*Do you* or *do you not* have to binary-patch in the example I gave you to open the document in another program?

In that example, by implementing every document as a true “object”, the document would no longer be a “global variable”. It would only be accessible to its methods, and it’s only those methods that would be able to operate on it. How is that better than the abstraction that treats the word document as a “global variable” accessible to everybody? How is complexity being “increased” by treating the document as a “global variable” in the systems we all run and work on daily?

Not only do you have to show that OOP (POO) is better in my example, you’d have to show that it is *much* better. Because you (not someone else) gave that girlfriend analogy when describing global variables. Otherwise, you may have to acknowledge that perhaps you’ve had a bit too much of the OOP koolaid, and that maybe (just maybe) there are other lifeforms out there (which you condescendingly label as squawking crows and barking dogs and theory types) who are simply grossed out by your OOP/POO methodology.

> Too bad you can’t delete that last comment. What a waste of bits. Give us a real example or stop writing.

What I gave you was a real example. The simpler the example the better. The example is a pretty clearcut case of how the “global var” abstraction shines over the OOP abstraction in a given problem/setting that we all interact with on a daily basis.

The example is a clear illustration of what it means for code and data to *really* be separate. In OOP, code and data may be separate as far as the memory layout and representation in memory are concerned. They are not separate in the abstractions they force on you. The word doc example is a great example of the kind of systems we’d end up with in the real world if we OOP’d the living shit out of everything.

In this world, rather than having data files for storing our word processing documents (e.g. “.doc”, “.odt” etc), the whole thing would be object-oriented

Actually, the way the Windows Registry deals with file extensions is a pretty object-oriented approach:
If you right-click a file with a registered file type, you see a list of things you can do with that file. (The one that is bolded is what would happen if you just double-clicked it.) Creating a file $foo.$bar is instantiating an object of class $bar, and the Context Menu provides a list of public methods for that class. The actual code to implement those methods is not stored in each file. The relationship between the objects and the methods is enforced by explorer.exe.
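That dispatch model can be sketched as a table mapping an extension (“class”) to its verbs (“methods”), with the bolded verb as the double-click default. All names here are invented for illustration; nothing reflects the actual Registry format:

```python
# Toy "registry": file extension -> class, with a default (bolded) verb
# and a table of public verbs. Handlers are stand-ins for real programs.
registry = {
    ".txt": {
        "default": "open",
        "verbs": {
            "open": lambda f: f"opening {f}",
            "print": lambda f: f"printing {f}",
        },
    },
}

def invoke(filename, verb=None):
    """Right-click dispatch: look up the 'class', pick a 'method', call it."""
    ext = filename[filename.rfind("."):]
    cls = registry[ext]              # the extension plays the role of a class
    verb = verb or cls["default"]    # double-click -> the default verb
    return cls["verbs"][verb](filename)
```

Note that, as the comment says, the method code lives in one place (the table), not in each file — the file carries only its “class tag”, the extension.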

So you are saying that you wouldn’t have to binary-patch those references in order to open up your OOP doc in another program? You’d still have to patch the references, Jessica, and make them point somewhere else. I am pretty sure of that :-)

Example of global-var abstractions being superior to OOP methodology (given earlier):

If you are asking for examples of Distributed State I can give you one: SIMULATORS which simulate phenomenon X vs time.

If you model everything as an object with hidden state, where the interaction between the objects happens via the public methods they export to the outside world, you will end up hanging yourself. Object A calls a method in object B. Object B updates some hidden state. Then based upon that it decides to call some public method in object C and so on. You end up with BOTH, circular dependencies (between the living objects), and hidden state that is near impossible to keep track of.

OOP encourages this, because it encourages lumping state and code together rather than abstracting out the state in a very clear cut and clean way.
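By way of contrast, the centralized-state alternative being argued for can be sketched like this: the whole simulator state lives in one value, and a single pure step function advances it, so every update happens at one visible point. A toy constant-velocity model, with all names invented:

```python
def step(state):
    """Compute the next simulator state from the current one.

    Pure function: no hidden per-object state, no side effects, no
    object A calling object B calling object C.
    """
    return {
        "t": state["t"] + 1,
        "position": state["position"] + state["velocity"],
        "velocity": state["velocity"],  # constant-velocity toy model
    }

# The entire state, in one neat place, advanced by one loop.
state = {"t": 0, "position": 0.0, "velocity": 2.0}
for _ in range(3):
    state = step(state)
```

Whether this actually scales better than interacting objects for a large simulator is exactly the question under dispute; the sketch only shows the shape of the alternative.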

Let us also not pretend “procedural” means “C” either. Procedural can be done using dynamic languages with very powerful set operations and list/table support. It doesn’t even have to be interpreted; you can have dynamic and compiled. If you judge by hacker preferences and their coding styles in dynamic languages like Perl and Python, it’s pretty obvious (to me at least) that they have voted for good old procedural over OOP. Maybe – just maybe – that has something to do with some real flaws in the OOP way of viewing/abstracting the world.

uma says:
> The word doc example is a great example of the kind of systems we’d end
> up in the real world if we OOP’d the living shit out of everything.

I hate to pile on, but this statement is such a simple illustration of the problem. Who suggested we OOP everything, scatologically or otherwise? The rest of us were talking about writing programs; it is you who keeps changing the subject to GUIs and such. I understand your desire to give a simple example, but simplistic is different from simple, and moving the goalposts is different from scoring a goal.

If you can’t even give us one single example of a real world programming issue that arises out of choosing OOP rather than globals, then I am afraid you are flapping your gums. Something like “I chose this OOP design, it broke in this way, and clearly if I had used a global variable design I would not have had that problem.”

It is a simple enough request. No theory, just a practical example. We are a bright bunch around here, don’t worry, we should probably be able to follow along with a real concrete example. I’ve given you one from my perspective, and Eric also gave one earlier. It’s your turn now. Something about actually writing or designing programs and software architecture. That would be a big help. Thanks.

> What I gave you was a real example. The simpler the example the better. The example is a pretty clearcut case of how the “global var” abstraction shines over the OOP abstraction in a given problem/setting that we all interact with on a daily basis.

Nope. It’s a hypothetical. Not real. Give us a real example.

> The example is a clear illustration of what it means for code and data to *really* be separate. In OOP, code and data may be separate as far as the memory layout and representation in memory are concerned. They are not separate in the abstractions they force on you. The word doc example is a great example of the kind of systems we’d end up with in the real world if we OOP’d the living shit out of everything.

Code and data are never separate, no matter what the paradigm. Never. You have to know the data format, which is always expressed in code. Data formats are always abstractions forced on you. Do you really think a voltage is the same thing as a bit? Your arguments aren’t making any sense and you have completely failed to respond to my examples. Do you program? If so, give us real examples from your programming life, not fake hypotheticals.

With global variables what you have is a big pile of spaghetti, and action at a distance. You never know whether calling some function will affect a global variable (and forget about being thread-safe and reentrant). Think for example about the troubles with the global variable errno… if you know what it is.

What success ? Perhaps I am blind. But I don’t see it.
Is there any place where OOP is successful outside GUIs (which for some reason Jessica seems to think I mentioned earlier) ?

> With global variables what you have is a big pile of spaghetti, and action at distance

In a language like C, yes. In other languages and paradigms, the ENTIRE STATE of the program is global and is carried by the program as a whole. In fact, in these languages, the only place you can define state is at global scope (i.e. you cannot have variables defined within your functions). The only thing the language does (by design) is police you into writing your program logic as functions that have no side effects (i.e. meeting the 1-to-1 rule mentioned earlier).

The entire state of the program (which is all global) is available to you if you need it. And yet, these paradigms (where the entire state is GLOBAL) have been used to build real world systems with millions of lines of code and reliability in the five 9s (99.999%).

Ask a hardware designer like Patrick Maupin if he’s building a state machine whether he’d rather keep all his flip flops in one neat place (Global), nice and tidy, and carefully design his combinational logic around them, or whether he’d rather distribute the flip flops all over the place in a tangled web of dependencies. Just because languages like C have no mechanisms for enforcing the “nice and tidy” part doesn’t mean that other languages/paradigms suffer from the same limitation.
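The “flip flops in one neat place” picture can be sketched as a Mealy machine: all state lives in one register value, and the logic around it is a pure combinational function of (state, input). A toy “two ones in a row” detector, purely illustrative:

```python
def combinational(state, bit):
    """Pure next-state/output logic: no mutation, a 1-to-1 mapping from
    (state, input) to (next_state, output)."""
    if state == "seen1" and bit == 1:
        return "seen1", 1  # two 1s in a row -> output 1
    return ("seen1" if bit == 1 else "idle"), 0

def run(bits):
    """The single clocked update point: the only place state changes."""
    state, outputs = "idle", []
    for b in bits:
        state, out = combinational(state, b)
        outputs.append(out)
    return outputs

run([1, 1, 0, 1, 1])  # -> [0, 1, 0, 0, 1]
```

All the “registers” are in one place (the `state` variable in `run`); `combinational` itself could be tested exhaustively, exactly as one would apply test vectors to hardware.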

I must admit I love short cuts myself, and using global variables is a delicious, if dangerous and messy, short cut in many situations.

However, I am now totally confused about the whole OOP vs global variables issue. Is it even possible to argue this properly? OOP is a higher level of thought and more appropriate in higher level programming methodology discussions. Global variables are just specific tools (for the lack of a better term) in any general programming toolkit. In my limited experience, I most humbly admit that I cannot figure out the debate at all.

Having said that I agree that OOP can be really a difficult beast to master and the temptation for casual programmers is to fall back on plain old procedural programming.

So still no actual answer then? Still bleating on about Word documents and bathroom habits?

You come on a blog populated by many serious programmers, who have written between them millions of lines of code, produced hundreds of applications that have tens of millions of satisfied users. You tell us that our design philosophy is deeply, fundamentally flawed. You tell us that pretty much everything that has been written about software architecture for the past 30 years is wrong. And we are programmers, and so we are curious. We learn new things — that is what we do — so we are open to listen to this new thing. Nobody here is closed-minded; on the contrary, we are all pretty much iconoclastic. However, we are also practical people. Hand-wavy theory and architecture-astronaut talk does nothing for us.

And so we say to you: “show us the code.” And we keep asking, and asking, and you keep talking about everything except actual practical examples. Apparently you don’t have any, and you’ve got nothing to show us. The only reasonable conclusion is that you are full of shit, or you are a troll. And both are rather pathetic.

> Ask a hardware designer like Patrick Maupin if he’s building a state machine whether he’d rather keep all his flip flops in one neat place (Global), nice and tidy, and carefully design his combinational logic around them, or whether he’d rather distribute the flip flops all over the place in a tangled web of dependencies.

Utter, total, ahistorical nonsense. Hardware designers love using other people’s chips and cores to solve problems. The whole IC revolution is based on private state. Software designers got the idea of OO from hardware. Haven’t you heard of “Software ICs”?

Ask a hardware designer like Patrick Maupin if he’s building a state machine whether he’d rather keep all his flip flops on one neat place (Global) nice and tidy, and carefully design his combinational logic around them,

I think this example actually shows, as Jessica and others have been explaining, the orthogonality of the concepts of global state and object-orientedness.

Hardware designs are, practically by definition, object oriented. Each synthesizable Verilog or VHDL file represents stuff that will actually, truly become an object.

I haven’t been paying too much attention to the discussion, but I think most of the people arguing with you usefully employ a milder form of object orientation than what you seem to think is required in order to achieve the label “object oriented”.

Others have alluded to the fact that there was a lot of bathwater, but lots of people have found the baby in there. And the exact shape and capabilities of the baby depends heavily on the problem domain and, yes, the implementing language.

In Verilog, for example, a synthesizable module does (as you point out) represent both code (e.g. combinatorial interconnects) and global state (e.g. flops). But it is also an object. If I want two of them, I simply instantiate two of them, and voila! I now have two copies of my object. You can’t do this very easily in C — take a design that was very simply designed using global variables, and then decide you want multiple copies of it. This is why it is useful to code as much as possible to not directly rely on globals, but to rely instead on some sort of magic handle that can eventually access a global array, or structures allocated dynamically on the heap, or whatever.
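The “magic handle” idea in that last sentence can be sketched like so: state is owned by a handle returned from a constructor, so instantiating a second copy costs nothing, where a globals-based design would need rework. Names invented for illustration:

```python
def make_counter():
    """Constructor: returns a fresh handle owning its own state."""
    return {"count": 0}

def increment(handle):
    """All operations go through the handle, never through a bare global."""
    handle["count"] += 1
    return handle["count"]

# Two independent instances -- the Verilog "instantiate two of them" move.
a = make_counter()
b = make_counter()
increment(a); increment(a); increment(b)
```

Had `count` been a single global, getting the second independent copy would mean duplicating or threading state through every function that touches it.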

A lot of the useful syntactic sugar of object orientation just takes this common-sense stuff and makes it easy. There is a bit more that is somewhat more advanced and useful for some things, and then additional stuff that is extremely esoteric. With a language like C++, the Pareto principle and Sturgeon’s law both apply in spades.

I think you missed the point of the hardware example. Of course, hardware that you synthesize is a real object.

If you are building a Mealy machine, your “next state” is always a function of your current state, plus whatever the combinational logic yields at the clocking point.

This is exactly the way functional/logic languages view the world. All your state is global in those languages. All of it. However, it forces you by design to write your code such that all your code is combinational. They don’t allow you to sneak your own flip flops inside your combinational chunk. The combinational chunk is your “program logic”. Combinational by definition has no mutables (registers, flip flops) and fits the 1-to-1 criterion.

The reason you have “provable” programs in these functional languages is exactly the same mechanism by which you have “coverage” and test vectors (which at the end of the day is provability) for your hardware.

One example of these provable programs that might interest you is the OKL4 kernel – a mathematically proven kernel. Proving a program in something like a pure functional language is very much like applying test vectors in HDLs. It literally tests all possible states that the program can be in and proves that it complies with specification at all times! Of course OKL4 is written in C, so the C code has to be written in a way such that it exactly maps to the way functional languages work. It is a painstaking process, but it is doable for something on the order of 10 or 20 kLoc (the size of a uKernel). There was even a project to remove the Linux kernel from Android and replace it with OKL4 + hacked libraries/utilities around it, and from what I remember reading in the published material they reported gains across the board. The only thing that made it a wash at the user level was Java and the VM, which as we all know is slow as Christmas.
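The test-vector style of proof works because the state space is finite: enumerate every (state, input) pair and check the invariant in all of them. A toy sketch of that idea (invented names; not from OKL4 or any real verification effort):

```python
STATES = ("idle", "seen1")
INPUTS = (0, 1)

def next_state(state, bit):
    """Pure transition function of a toy sequence detector."""
    return "seen1" if bit == 1 else "idle"

def output(state, bit):
    """Pure output function: 1 only when a second 1 arrives."""
    return 1 if (state == "seen1" and bit == 1) else 0

# "Proof" by total enumeration: check the invariant that an output of 1
# can only occur from the seen1 state, over every reachable (state, input).
violations = [(s, b) for s in STATES for b in INPUTS
              if output(s, b) == 1 and s != "seen1"]
```

For real kernels the state space is astronomically larger, which is why the C has to be constrained to map onto pure functions before this style of reasoning becomes tractable.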

@ Jessica:

> Show us the code, or go away.

You’re just babbling. That is all. You’re also imagining lots of things that I never even talked about (GUIs). The example I gave you, is what would happen if we treated everything like an object and OOP’d everything. Which is exactly what OOP calls for. If you don’t practice that, then you are not a true practitioner of OOP/POO. And if you’re not a faithful practitioner, you should not lecture us about your religion.

Here is what Joe Armstrong (developer of Erlang) has to say about OOP. I came across this last night. Pay attention to objections 1 and 4. Pretty much exactly what I said here.

If OOP really works like hardware, how come our software hasn’t become 10,000 times faster since the 386/486 days of the mid 90s? I assure you, the hardware people delivered the goods, and we may not be far from the day when you will have a teraflop and 100 cores on your machine. Yet software is still as slow as in the 486 days. Why do you suppose that is?

If OOP really works like hardware, we should have seen that scalability in software speed! But we didn’t. Why do you suppose that is?

No one is asking you for an example of what would happen if. They want an example of what did happen when….

That is to say, no hypotheticals here. Name some actual OO program where this “hidden state” you keep going on about has actually been a problem. Alternatively, show a program implemented without OO because to do so would be to hide some state that must be visible to all objects, and sharing that state information via public methods won’t work.

I can’t speak for Jessica etc., but for myself, the reason I want you to do it is that I don’t believe it is possible to show an actual working non-OO program that couldn’t be written to use OO or vice versa. Any program I can imagine writing in either class of language could be written in the other, but one may require a great deal more work on the part of the programmers because of the language design.

As I said before, the only question is whether the project manager for the program wants to let the language itself enforce discipline about which parts of the program are allowed to modify which parts of the data, or do so via other means.

how come our software hasn’t become 10,000 times faster since the 386/486 days of the mid 90s.

Software that does exactly the same things is that much faster. But just as one’s clothing and other personal belongings will expand to fill any amount of closet space, one result of faster hardware is that people expect software to do more (including things that are really just eye candy).

Have you seen any of the previews of Windows 8? It’s going to have an interface not terribly unlike Windows Phone, where all of your Win8 apps are never really completely closed; they will have their own bit of real estate on a virtual surface (larger than the display) upon which they can display some basic state info even when not fully opened up. Those multi-core CPUs we’re all getting now will be fully loaded down just letting these apps keep refreshing their little “panels” or whatever they’re called (bigger than icons; smaller than windows).

No, that’s you. You’ve stated your opinion and given us no real evidence you are correct. We want evidence.

> Yet software is still as slow as the 486 days. Why do you suppose that is ?

Did you think about this argument before you made it? Hardware has sped up because of advances in materials and manufacturing. But advances in materials and manufacturing don’t make software faster except by making hardware faster.

> Here is what Joe Armstrong (developer of Erlang) has to say about OOP.

Well, he doesn’t give any examples either. I love Erlang, but I have no reason to think Armstrong’s completely unsupported opinion on this matter is definitive.

> Name some actual OO program where this “hidden state” you keep going on about has actually been a problem.

Please go back and read my earlier posts. I gave an example of distributed state and how hidden state cannot be ignored and how hidden state is indeed “global state”. The example that I gave was SIMULATORS that simulate some phenomenon vs. real time and the kind of dependencies that result from modeling everything as an object. People simulate everything from biomedical phenomena to hardware in real time. If the example is not clear or you fail to understand something let me know.

> Software that does exactly the same things is that much faster.

I can’t agree with this. Unless your definition of “does exactly the same thing” is a 10-line Fortran benchmark or something like that. Why is my word processor not 10,000 times faster? If OOP delivered the goods and indeed works like hardware, our big software apps would indeed be *that much* faster.

This guy says OOP is better when you have a fixed set of operations and an increasing number of new things, but functional programming is better when you have a fixed set of things and an increasing number of new operations.

> 1. Nothing can change your data. Once you get it, you’ve got your copy and it’s yours. Nothing can mysteriously change it while you are operating on it. Conversely, a function cannot affect data other than what it received (no side effect).

> 2. It’s threadsafe. There are no concurrency issues since nothing can have a reference to the same data. There’s no way for one function to interfere with the data in another function. And there’s no way for code to have stale references.

> 3. No Null Pointer Exceptions. NPEs happen when a piece of code knows the definition of an object (its class) but makes the mistake of trying to invoke methods on a non-existent instance of that class. In functional programming, there are no object instances of code. Function signatures are checked statically and then called, so there’s no chance of a code instance not existing, once those functions have been imported.

> 4. Expressions can be compounded. This one might take a little effort to grasp at first. Since there are no instances of objects, and since all data is always passed by value from one function to another, then you can have one function call another and so on without having to worry about instantiating objects or checking for null values. I can pass a string value to a substring function to a concatenation function to a parsing function to a date constructor to a date formatter, etc. So my code can be made of compound expressions that elegantly build the logic I need, rather than having to write a new function (or object or method) that does what I want it to do.

> It’s well known in the Java world that stateless applications will perform better than stateful ones, everything else being equal. Enterprise Java Beans and web apps are often designed so that they do not need to maintain state information across requests if possible. In functional programming, there’s really no way to maintain state in memory (although you could write it to the database). Since there are no references to objects, there is a built-in performance advantage. Because of this, you can combine and compound several functions and still maintain high performance.
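Point 4 above can be illustrated with a one-line compound expression over pure functions; the three trivial functions are invented for the example, not from any quoted source:

```python
# Three tiny pure functions: each takes a value, returns a value,
# touches nothing else.
def trim(s):        return s.strip()
def upper(s):       return s.upper()
def first_word(s):  return s.split()[0]

# One compound expression instead of a chain of constructed objects:
# no instances to build, no nulls to check between the steps.
result = first_word(upper(trim("  hello functional world  ")))
```

Because each function is a pure value-to-value mapping, the composition needs no setup or teardown, and each piece can be tested in isolation.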

I guess my last point was really to say: Start convincing us! It can be done. You have the language tools, which makes me think you have the other tools needed. You might have to use some Google-Fu or Bing-Fu. You got a start by quoting Armstrong.

>If OOP delivered the goods and indeed works like hardware our big software apps would indeed be *that much* faster.

Conclusion doesn’t follow from premise, or rather you’ve snuck in a premise the originators of OOP wouldn’t recognize. The promise of OOP wasn’t that software would “work like hardware” as in “as fast as hardware”, it was that OOP would tame software complexity and thus reduce the defect rate.

OOP failed to deliver on this promise, but the failure had nothing to do with your fixation on “hidden state”. It failed to deliver because amplification of our capacity to handle software complexity gets almost immediately used in building software systems that are enough larger to bring defect rates up to where they’re just barely tolerable again. Also, early OOP proponents misunderstood what the most important sources of defects were (and still are) and neglected the single most important one – dynamic storage allocation. The colossal mess that is C++ is the tragic consequence of that neglect.

I never bought into the OOP religion, because I understood from the beginning that its zealots were like the proverbial drunk searching for his car keys near a lamppost a block from his car because that’s where the light is better. OOP is not salvation, it’s just another tool fit for some uses and unfit for others. That said, your attack on it is completely wrongheaded. You’ve confused several problems orthogonal to the OOP/non-OOP distinction with problems created by OOP, and failed to identify the problems that OOP genuinely does cause by itself.

> You’ve stated your opinion and given us no real evidence you are correct. We want evidence.

What would this evidence look like? Are you looking for code fragments of how OOP fails to deliver? Is that what you’re looking for? The examples that I have given can easily be mapped into code/abstractions etc. Do you think that people did not try to build systems where things like word documents are done in an OOP way, exactly the way I have explained? They HAVE! And their abstractions fail. There is some guy in Russia I came across the other day who is building an OOP operating system. EVERYTHING is an object in his system. EVERYTHING. In other words, if you have a raw text file it is not only saved as a raw text file but actually bundled with code (wrappers or references) that would open up the file.

It’s pretty straightforward to understand everything I posted in terms of code. What matters more is the abstractions, and which of these abstractions fail and which of them succeed in modeling some real world thing. The code is detail.

If what you want is actual code fragments on how OOP fails, I can post links of people who have written extensively on this. With code fragment examples and everything. But is that what you’re looking for ?

> The promise of OOP wasn’t that software would “work like hardware” as in “as fast as hardware”, it was that OOP would tame software complexity and thus reduce the defect rate.

I did not claim that the premise of OOP was that “software works like hardware”. I think it was Tom who made that statement. So it was logical for me to question why the speed gains did not materialize as they did in hardware. I agree with your statement that OOP was sold as the magic formula that would “tame software complexity and thus reduce the defect rate” and ultimately cure all software ills (complexity + bugs).

> The examples that I have given can easily be mapped into code/abstractions etc.

That would be hypothetical. No thank you.

> Do you think that people did not try to build systems where things like word documents are done in an OOP way exactly the way I have explained? They HAVE!

I know.

> And their abstractions fail.

Those would be the real world examples we want.

> There is some guy in Russia I came across the other day who is building an OOP operating system. EVERYTHING is an object in his system. EVERYTHING. In other words if you have a raw text file it is actually not only saved as a raw text file but actually bundled with code (wrappers or references) that would open up the file.

So give us some examples from his OOP operating system which illustrate your points, if they are handy.

> What matters more is the abstractions, and which of these abstractions fail and which of them succeed in modeling some real world thing.

Failing to properly model a problem is just as easy using global variables. I, personally, have failed to properly model a problem using global variables, local variables, structures, objects, and Perl lists and dictionaries. I can even fail to properly model a problem in assembly language. Getting your data abstractions wrong is so easy people even do it when talking to each other, and in writing all the way back to Sumeria.

> If what you want is actual code fragments on how OOP fails, I can post links of people who have written extensively on this. With code fragment examples and everything. But is that what you’re looking for ?

Yes.

> I did not claim that the premise of OOP was that “software work like hardware”. I think it was Tom who made that statement.

I said this:

> Utter, total, ahistorical, nonsense. Hardware designers love using other people’s chips and cores to solve problems. The whole IC revolution is based on private state. Software designers got the idea of OO from hardware. Haven’t you heard of “Software ICs”.

The idea of a Software IC was that you would purchase an object which would be a black box to you, like an IC is to a hardware developer. You would only be able to interface with it via its defined inputs and outputs. The internals would be completely inaccessible, just like they are in an IC. You would then compose your design out of these Software ICs.

This didn’t work the way the advocates of Software ICs thought, BTW. The problem is that while hardware designers may want to tweak the inside of an off-the-shelf IC they purchase, they know it’s impossible, so they don’t try. Software designers, however, know that it would be relatively easy to change / fix a Software IC if only they had the source. So someone always wants the source. However people do successfully use object oriented libraries all the time. I regularly do, in both Java and Perl. And I cannot recall an instance where I needed to look at their private state to solve a problem. I can recall once or twice wanting to do so, but really, I have enough code to learn without having to dig into some Perl or Java library.

Black boxes are a feature! I do not want to read source code all the way down through the microcode to the bare metal.
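The black-box discipline being described can be sketched in a few lines; the class and its interface here are invented for illustration, not taken from any actual Software IC product:

```python
# A hypothetical "Software IC": callers see only the public interface,
# never the internals, just as a hardware designer sees only an IC's pins.
class Counter:
    def __init__(self):
        self._count = 0  # private state: the "inside" of the IC

    def increment(self):
        self._count += 1

    def value(self):
        return self._count

c = Counter()
c.increment()
c.increment()
print(c.value())  # callers compose with the interface, not the internals
```

The point is that nothing outside the class needs (or gets) to read `_count` directly; the object is used purely through its defined inputs and outputs.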

No, but automatic memory management in dynamic languages comes pretty darn close to being a silver bullet for a huge class of problems, especially for one-off infrequently executed complicated stuff where programmer time is much more important than CPU time.

> No, but automatic memory management in dynamic languages comes pretty darn close to being a silver bullet for a huge class of problems, especially for one-off infrequently executed complicated stuff where programmer time is much more important than CPU time.

It’s more like sealant for wooden buckets where you use wooden buckets to hold everything – very useful – than a silver bullet to kill werewolves.

It doesn’t solve the modeling problem. It doesn’t solve the large organization communication problem. In other words, it doesn’t solve the big problem that everyone fights every day where I work: the knowledge problem.

That’s what Brooks was complaining about. He didn’t have as many problems with automatic memory management either, because his systems avoided using heap memory like the plague.

Check this page out. I came across this last night also. Beautiful page dedicated to anti OOP. Full of code fragments. The author apparently loves global variables as much as I do. Even for GUIs (where OOP has made its success) he argues there are better ways of doing GUIs than OOP (which is pretty much my viewpoint also, only that I believe in a different approach than his).

I will look up that Object oriented OS page for you. Where editing a text file or even printing it out happens through calling it methods.

Global state is not a silver bullet either.

I happen to think that most people are naturally hardwired to think in a procedural way (the majority) or in a functional way (the minority – which I happen to belong to). Neither of these two paradigms were hyped as “silver bullets” the way OOP was. That is definitely one of the reasons why OOP generates such strong emotions amongst those of us who always were in the naysayer camp.

> On the other hand, those big problems don’t tend to show up as the kind of point defects that the OOP people were thinking of, either.

Well they sure claimed they were going to attack the knowledge problem. OOP was supposed to make it easier to completely encapsulate things so that you don’t have to tell everyone about them. And you were supposed to be able to buy Software ICs to handle all your data abstraction needs so that you could reuse code and not have to know about it. And you would be able to use inheritance so that you could customize your brand new Software IC without having to know its internals and rewrite it.

I’m pretty sure I internalized Jeff Duntemann on OOP before I internalized Brooks on bullets. I remember the excitement.

Later, XML was supposed to provide the same thing. Everyone was going to share their data models via xml, so that all billing systems (for example) would just pass around data using the same super-duper billing XSD.

Again, for a certain class of problem, it actually puts a bigger dent in these than you might think. I have a lot of internal customers. If I were going to spend 3 weeks coding something in C for them, we would have to spend about half a week over the course of a couple of weeks discussing exactly what I was going to build.

But if I just dash off a prototype in a day, the code itself becomes the communication. Not in the sense of “use the source, Luke!” but in the more concrete sense of “run this and describe to me if you need anything else.”

It’s a bit more complicated than that when I’m coding something in 3 weeks that would have taken 6 months. But, still, the knowledge that I will have a prototype soon and that modifications are mostly trivial means that I can get started much sooner with a lot less communication overhead.

Brooks had the worst of all worlds. A memory-managed OS would probably have been way too slow and resource hungry, especially on the hardware of the time, so that wasn’t really an option for him. But for a lot of other problems where multiple programmers are required, if memory management gives you enough of a boost in individual programmer productivity to reduce the total number of team members, then yes, it’s a modeling and communication win.

> Sure, global variables sometimes cause name conflicts and other problems, but is sure bloat and roundaboutness better than a relatively slim possibility of a name conflict?

> * There are procedural ways to group routines and scope to reduce or eliminate what would be global variables, but these rarely get any attention. Perhaps because name conflicts are not common enough to justify the extra syntax and expert proceduralists know this. Good procedural designs use the database as the “global domain model”, not tangled webs of lingering classes such that the namespace rarely needs to go outside of a given task. I am not promoting mass use of globals. Used judiciously, they are rarely a problem in small and medium applications in my experience.

> I just heard someone say that they found an old procedural program of theirs that used too many global variables and too many parameters. Rather than blame his bad programming or lack of knowledge about procedural/relational organization techniques, he blamed the paradigm and used it as a sorry excuse to proceed with OOP.

I don’t read that as love of global variables. I read that as: Global variables can be useful, but they cause problems if you aren’t careful. Maybe I’m projecting, because that’s what I think about them.

It looks like the article will be interesting, but I haven’t seen the part where he complains about hidden state.

Good procedural designs use the database as the “global domain model”, not tangled webs of lingering classes such that the namespace rarely needs to go outside of a given task. I am not promoting mass use of globals. Used judiciously, they are rarely a problem in small and medium applications in my experience.

End Quote.

This happens to be almost exactly what I was saying. The part that I love the most in reference to global vars is this: “Used judiciously, they are rarely a problem in small and medium applications in my experience.” I agree! I agree!

> It looks like the article will be interesting, but I haven’t seen the part where he complains about hidden state.

Sure. That might be more of a complaint for those of us in the FPL crowd. But I gave a pretty good example of that: Simulators + what Joe Armstrong has to say about hidden state which is pretty much what I have been saying, and which btw is supported by very solid and rigorous theory.

If my example is not clear we can dig deep into that example and I can give evidence from real world implementations of these simulators where they not only implemented the state as global variables, but where there hasn’t been a single successful attempt to build an OOP simulator for these applications, IN SPITE OF THE FACT that the underlying system is made of nothing but interacting objects and in spite of the problem being heavily researched in academia/industry.

Still though he complains of “tangled webs of lingering classes such that the namespace rarely needs to go outside of a given task”. I love that description of OOP. That is exactly how I always felt about OOP.

OK, so I really tried to read uma’s link, but when I came to the section on why “OOP reminds me of communism”, I’m sorry, I had to stop. I’m wondering if it is really a fake site, like The Onion or something like that. Doesn’t Godwin’s law apply here?

What I asked you for, uma, is something from your own experience. My question was not “list the faults of OOP”; anyone who has worked with OOP knows, for example, that there are multiple ways to break down a polymorphism hierarchy, or that there is sometimes an impedance mismatch between internal and external systems. The question is not whether OOP has flaws, it is whether global variables somehow solve these flaws.

Exactly how are global variables a superior solution to the fact that you sometimes need converters to map from particular file formats, or database tables, into internal representations? FWIW, anyone who has developed real software that does a lot of this sort of thing knows that encapsulation and polymorphism are important tools for making this sort of thing maintainable. And anyone who works with modern OO languages knows that there are tools that automatically build entity relationships between internal structures and database tables using the full plethora of metadata in a database.

Exactly how do global variables eliminate the fundamental cause of polymorphic representational ambiguity, which is, after all, a design problem not an implementation problem?

But I think what we are all curious about uma is how come you are not talking about your own personal experience writing programs. How come you are busy citing other people rather than telling us the hurdles you have personally faced with OOP, and how they were solved by the catharsis of global variables. I, for one, am wondering if you ever got past fizz buzz.

And since global variables share the wealth of their values to everyone in the programming community, since no process really owns any data, and since, to make that work, you need a politburo of a mediation layer, it seems obvious that global variables are the real communist solution.

(That was irony for the humor impaired.)

And FWIW, I am not even an acolyte of OOP. My views on this are pretty close to Eric’s.

It doesn’t sound like what you were saying. It sounds like the opposite of what you were saying. You were making global variables sound like fine wine or a good cigar, like they were God’s gift to programmers, the silver bullet that would slay the OOP beast. When I read what he said, it sounds like global variables are a necessary evil.

Maybe this entire discussion has been a bunch of folks in violent agreement with one another.

> You were making global variables sound like fine wine or a good cigar like they were God’s gift to programmers

I think you ought to read back what I said. And read very carefully. I used very measured words when talking about global vars, and described a rigorous methodology for using them based on very strict criteria (e.g. 1-1 functions, aka no-side-effect functions, implementing the bulk of program logic; a transactional layer that defines transaction+protocol in a manner that is easy to reason about across the entire program; etc). I expressed my preference for global vars over OOP. Which is very similar to the preferences that he is making in his anti OOP page (for small and medium applications).

> When I read what he said, it sounds like global variables are a necessary evil.

Sure they are. All state is a necessary evil, including global vars. We have different conceptions though about what constitutes “global state” (a concept that is broader than global vars) and which is more harmful and which is less harmful. I happen to think that distributed/hidden state is THE WORST form of global state (I understand that you don’t call it global but I do). Joe Armstrong in his OOP sucks page seems to make a very similar point about hidden state!

> Maybe this entire discussion has been a bunch of folks in violent agreement with one another

I am not so sure. I think though that this anti OOP link that I posted is a truly fantastic cure for anyone who thinks OOP and the last 20 years were great. Damn! I should have googled this shit earlier.

The example that I gave was SIMULATORS that simulate some phenomenon vs. real time and the kind of dependencies that result from modeling everything as an object. People simulate everything from biomedical phenomena to hardware in real time. If the example is not clear or you fail to understand something let me know.

What “global state” exists in a simulator that is better dealt with as global variables rather than public methods of an object that act as gatekeeper to that state?

# The Monster Says:
> What “global state” exists in a simulator that is better dealt with as
> global variables rather than public methods of an object that act as gatekeeper to that state?

As a matter of fact, simulators practically beg for an object oriented design. If, for example, you are developing a simulator for traffic flow on a freeway, it just demands that you have car objects that maintain their internal state, and interact on a freeway object. It just screams out to be written that way so that you can easily add more cars, or use polymorphism to implement trucks or Priuses, or add other freeway objects to simulate flow from one to another, or whatever else. If my memory is correct, C++ was originally designed specifically for writing simulators.
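A minimal sketch of the design being described, with all classes and numbers invented for illustration:

```python
# Sketch of the OO traffic-simulator design: car objects carry their own
# state (position, speed) and interact on a freeway object. All names
# and numbers here are invented for illustration.
class Car:
    def __init__(self, position, speed):
        self.position = position
        self.speed = speed

    def advance(self, dt):
        self.position += self.speed * dt

class Freeway:
    def __init__(self):
        self.cars = []

    def add_car(self, car):
        self.cars.append(car)

    def step(self, dt):
        # each car updates its own internal state
        for car in self.cars:
            car.advance(dt)

road = Freeway()
road.add_car(Car(position=0.0, speed=30.0))
road.add_car(Car(position=50.0, speed=25.0))
road.step(dt=1.0)
print([c.position for c in road.cars])  # [30.0, 75.0]
```

Adding more cars, or a `Truck` subclass with different behavior, requires no change to the freeway loop; that is the extensibility claim being made above.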

>If my memory is correct, C++ was originally designed specifically for writing simulators.

That sounds like a slightly telescoped version of the prehistory of C++. What is certainly true is that the OO features in C++ were modeled on the language Simula-67, which as the name implies was heavily oriented towards writing simulations. The early C++ papers, however, don’t indicate any special interest in that area.

> What “global state” exists in a simulator that is better dealt with as global variables rather than public methods of an object that act as gatekeeper to that state?

The global state would be represented in one gigantic matrix visible everywhere.

The alternative (modeling every physical object as an object with its own hidden state) a) distributes that state across the objects b) makes it extremely difficult to follow program flow (which effectively would amount to hopping from one object to another).

The easiest/best thing to do in this kind of simulation is to represent the entire system by one gigantic matrix (global), that would be solved and re-solved every time it’s needed (e.g. some event happens).

The global matrix is a beautiful snapshot of the entire system at any given point. The alternative (modeling every object as object) would make you totally lost as your execution hops from one object to the other. It makes it totally impossible for you to parallelize your code (if say you’re running on a supercomputer) because the arbitrary “hopping” execution is extremely difficult to visualize in a structured way.

Mind you, all we are dealing with here is real physical objects, a problem that -in theory at least- should map into OOP as easy and as cleanly as GUIs and widget objects.

If OOP really works (even in modeling real physical objects) things like finite-element modeling or a wide range of other numerical modeling techniques (which effectively all rely on one gigantic global state) would have vanished. The FORTRAN folks would have gone extinct and instead, we would have ended up with OOP simulators for every system out there that is made up of underlying/interacting physical objects.
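A toy sketch of the global-matrix style being described; the update rule (a simple diffusion-like averaging) is invented purely for illustration, where a real simulator would solve actual physics:

```python
# Sketch of the "one global matrix" style: the whole system state lives
# in a single global array, and one solver function updates it in place.
# The averaging rule below is made up for illustration only.
state = [0.0, 100.0, 0.0, 0.0]  # global snapshot of the entire system

def solve_step():
    global state
    new = state[:]  # work on a copy so the update is consistent
    # each interior cell moves toward the average of itself and neighbors
    for i in range(1, len(state) - 1):
        new[i] = (state[i - 1] + state[i] + state[i + 1]) / 3.0
    state = new

for _ in range(10):
    solve_step()
# at any point, `state` is a complete, inspectable snapshot of the system
```

The claimed advantage is exactly this: at any step, the entire system is visible in one place, rather than scattered across object instances.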

It makes it totally impossible for you to parallelize your code (if say you’re running on a supercomputer) because the arbitrary “hopping” execution is extremely difficult to visualize in a structured way.

Keeping things in large arrays might make it possible for multiple CPUs to efficiently do things in parallel, until you run into memory bandwidth problems.

Then you resort to non-shared memory or NUMA, and then you wind up using something like MPI or Linda for your IPC, and all of a sudden huge global arrays aren’t the best solution…

uma Says:
> The global matrix is a beautiful snapshot of the entire system at any given point. The alternative (modeling every object as object) would make you totally lost as your execution hops from one object to the other.

I have no idea what “hopping” means in this context. There is no more hopping than there is when the processor hops from one function to another.

The idea that oop makes parallel programming harder than using a global variable is so totally backward I barely even know where to begin. The key to parallel programming is managing state in small controlled chunks, relevant only to the running thread as much as possible. This prevents process/thread dependencies, reduces the need for interlocks and their consequential deadlocks, and reduces the possibility of the bane of parallelization: race conditions. Putting all the state into one huge global variable is exactly the wrong thing to do for parallel programming.
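The “small controlled chunks” approach can be sketched like this; the work split and the numbers are invented for illustration:

```python
# Sketch of parallelism via small, controlled chunks of state: each
# worker gets its own slice of data and its own local accumulator, so
# no locks are needed; only results cross task boundaries.
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    local_total = 0          # state private to this task
    for x in chunk:
        local_total += x
    return local_total       # the only thing shared with the outside

data = list(range(1000))
chunks = [data[i:i + 250] for i in range(0, 1000, 250)]
with ThreadPoolExecutor() as pool:
    total = sum(pool.map(partial_sum, chunks))
print(total)  # same answer as sum(data), with no shared mutable state
```

Because no task ever writes to state another task can see, there is nothing to interlock and no opportunity for a race condition; that is the point being made above about keeping state in small, thread-local chunks.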

Keeping things in large arrays might make it possible for multiple CPUs to efficiently do things in parallel, until you run into memory bandwidth problems.

Then you resort to non-shared memory or NUMA, and then you wind up using something like MPI or Linda for your IPC, and all of a sudden huge global arrays aren’t the best solution…

I agree. Parallelizing over matrices isn’t easy or straightforward. This comes from the tightly coupled nature of matrices. It’s still far easier though to parallelize matrices than doing your simulation with objects. Usually the most difficult decision is where to make your cut and partition. Partitioning in the “right place”, building interconnect matrices between the partitioned chunks, and the use of MPI does tend to work out. Still you’re not achieving true scalability with this approach (i.e. your execution time does not linearly scale down with CPU cores). You’re as slow as your slowest partitioned matrix chunk (Amdahl’s law).

There is no more hopping than there is when the processor hops from one function to another.

True. But remember we have modelled all our state as global. ‘Hopping’ between functions here (e.g. the solver functions) is the equivalent of ‘hopping’ between the internal methods of one object, not ‘hopping’ between different objects.

It doesn’t sound like it because you are not providing the details. What particular problem were you trying to solve? How did OOP make it hard to solve in ways that global data would have made easier.

> I think you ought to read back what I said. And read very carefully. I used very measured words when talking about global vars

Maybe you should read your words very carefully. Such as “measured” words like these:

> The author apparently loves global variables as much as a do.

You also spoke of how a global data solution “shines”.

Your problem is a lack of source and visibility, not OOP. There is no hidden data in OOP when you can debug into the OO code and see the hidden state. It’s no different, if you don’t have the source, from a record in Pascal, or a struct in C, or even the COMMON data area in a COBOL program. If someone writes a carefully constructed piece of code which pulls the data out of a database the way you like and gives you a web API to it, all built with procedural or functional code, but which does not reveal to you all the data it is using to make its decisions, you have the same problem. Where I work, I’ve seen this problem when interacting with a COBOL program, for goodness sake. Our non-COBOL code wants to know why the COBOL code made a particular decision, but the COBOL program doesn’t make that data available.

We see this all the time, because we have hundreds of applications interacting, each with their own databases and their own local states. They cannot put all their data in one big matrix for your programs to look at. For one thing, some of the data is terabytes! For another thing, some of the data is private customer data which is protected by federal law and your program does not need to know all of it, so you are not allowed by federal law to know it. Other data is merely protected by contractual agreements with our corporate partners, but again, you and your particular program do not need to know it (although, again, your program may need to know some of it), so access to that data is restricted. For another thing, giving your application access to every piece of data my application uses (including configuration variables, etc, etc) would involve a great deal of expensive work on my part which my company would have to pay for.

Believe me, when I work a production support issue, I would love to have root access to all the servers I’m interacting with and DBA access to all the databases involved, along with all the source code for all the apps. That ain’t gonna happen, because the security people at my company aren’t idiots.

Hidden state isn’t OOP. It’s real life in the real world. Deal with it.

The link I posted yesterday is very specific to business logic applications – which seems to be the line of work that you’re doing. That site is about as thorough as one can get.

> It doesn’t sound like it because you are not providing the details.

I have given you examples from system architecture at large, to real world simulators that a) I have written myself b) I have been around for a long long time. Objects fail in modeling real objects in the real world in just about any situation not involving strict/clean hierarchies. They also fail because of the evils caused by their hidden state. When an object has hidden state, yet it arbitrarily screws with everything around it based on that hidden state, then maybe that state shouldn’t be hidden after all. Maybe it should be global.

What if that hidden state causes a nasty bug? What if that hidden state leads to a subtle proliferation of other “bad” hidden states around the system? All you’re saying is that this is the “real world. Deal with it”. And all I would tell you is “BS”. Just because one paradigm fails in how it approaches the question of “state” in the program, doesn’t mean that others do.

> You also spoke of how a global data solutions “shines”.

And I gave real world examples for that. Read all of what I posted about simulators again. I have given you everything you asked for: links with the same point of view. Links which thoroughly debunk OOP for business apps and databases. Examples from my experience(s) with hidden distributed state vs global state in simulators. What else do you need ? Be specific. That way I can hit the nail on its head.

I have no problem with “hidden state” that does not impact your relationship to the outside world. For example, some internal loop counter that the compiler automatically generates for you (on say a ‘foreach’ statement) which you use for some trivial internal housecleaning which does not in any way impact the outcome of calling on your public methods. I have no problem with *that* kind of hidden state because that kind of hidden state is harmless at a global/program level. It causes no *side-effects*. It doesn’t have any impact whatsoever on my interaction with your object. I call your method with the exact same arguments. I get back the exact same result ALL THE TIME. I would call TomDeGisi.Add(a,b) and I would get the exact same result all the time. All the time. Now, if I call TomDeGisi.Add(a,b) and instead of getting (a+b) I get a message box telling me that you’ve erased the hard disk (because of some internal state that you have), then that internal state of yours that you fucked our world with SHOULD BE GLOBAL.

Now how to manage all that global state as a whole is another legitimate question which we can answer once we agree on the evils of hidden state.

Just because languages like Java or C have no mechanisms of enforcing that doesn’t mean that other paradigms suffer from the same problem. Does that make sense?

In other paradigms, while state is carried by the entire program as a whole, it is only special functions that do the modifications.

A good analogy from hardware is the “Mealy machine” once again. If the Mealy machine is implemented with D flip flops it is only the gates (functions) that drive the “D” signals directly that cause a change in the global state. Even if the entire state machine (program) has “visibility” of the state.
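The Mealy-machine discipline can be sketched directly: state is visible to everything, but only one transition function is allowed to produce the next state. The toy machine below (it outputs 1 whenever the input bit differs from the stored bit) is invented for illustration:

```python
# Sketch of the Mealy-machine discipline: state is globally visible,
# but only the transition function computes the next state, just as
# only the gates driving the D inputs change a flip-flop's contents.
def step(state, bit):
    output = state ^ bit  # Mealy: output depends on state AND input
    next_state = bit      # only this function decides the next state
    return next_state, output

state = 0
outputs = []
for bit in [1, 1, 0, 1]:
    state, out = step(state, bit)  # state changes in exactly one place
    outputs.append(out)
print(outputs)  # [1, 0, 1, 1]
```

Everything can read `state`, but because only `step` ever produces a new one, reasoning about how state evolves reduces to reasoning about one function.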

1) I have white hair.
2) It was a billing system.
3) The company that sells the billing system never throws any code away. They merely wrap a layer of the silver bullet du jour around what they already had.

So –

Yes.

uma,

> Now, if I call TomDeGisi.Add(a,b) and instead of getting (a+b) I get a message box telling me that you’ve erased the hard disk (because of some internal state that you have), then that internal state of yours that you fucked our world with SHOULD BE GLOBAL.

You can wish in one hand and spit in the other and see which one comes true. You can’t have global data. I don’t care how much you want it. You can’t have it. It has nothing to do with OOP, at all. Nothing. You can’t do without hidden state. It exists. It is written in the fabric of the universe. It isn’t OOP. It’s life. When we were writing code to read wheel sensors (which sense when a railroad car / locomotive wheel is above the sensor) and using that to determine speed and using that to determine the length between the wheels and using that to divide the train into cars, the changes in the acceleration were hidden state. Similarly, I can want to know the current state of one of our customers in one of our providers’ databases. It would be very useful to me to either audit the data or to provide it to our customer service reps so they can see when things are wrong and fix them. But our old provider does not give us an API so we can see it. That would cost extra. None of the existing providers for this service provide such an API. We now have a new provider. They don’t provide such an API yet, but they will, because, for once, an IT guy (me) asked for one. But if they weren’t very hungry for our business we still would not have one. Your global state wish costs real money. That’s why it’s a fantasy.

> Is this explanation detailed enough ?

Not to answer the question we want answered, which is: why is OOP the problem? I know that hidden state is a necessary evil in all the ways you have stated. My examples make that clear. The last time I ran into an evil hidden state in an evil Java program the evil hidden state was encapsulated in a database just the way you like. What isn’t clear is how OOP is the problem. OOP does nothing to create hidden state that every other programming paradigm I’ve ever used doesn’t do. Give details that explain how OOP did this to you.

Sure. It just isn’t correct from a system perspective. Your super duper program written in your purely transparent language does not expose all its data to my super duper program written in the same language but on a different box ’cause your bosses told you (for very good legal reasons) not to share it.

> What isn’t clear is how OOP is the problem. OOP does nothing to create hidden state that every other programming paradigm I’ve ever used doesn’t do

Not *every other*. And that is where we disagree. Procedural has the same problem too. Yet it is less. The only way for my add(a,b) subroutine to return the messagebox with the “erased hard disk” message is if, say, it polls a global and decides to do that based on what it finds in that global var.

Since global variables are equally frowned upon in both procedural and OOP, and since OOP encourages/celebrates hidden state, it is safe to assume that the chances of some hidden state screwing up our existence are far higher than the chances of some subroutine doing the exact same thing on its own.

It is also another demonstration of how your hidden state, buried 17 fucking levels down in your hierarchy, is indeed GLOBAL. Because it only takes a global for my subroutine to be able to fuck our world up in the same way.

Now having said all that, back to your *every other* statement. This is simply flat out wrong. *Every other* paradigm does not have or (worse) encourage hidden state. Pure FPLs do not create hidden state -> They don’t have mutables of any sort. period.

> Not *every other*. And that is where we disagree. Procedural has the same problem too. Yet it is less. The only way for my add(a,b) subroutine to return the messagebox with “erased hard disk” message is if say it polls a global and decides to do that based on what it finds in that global var.

Almost. Structs / records have the same problem as objects. Good procedural code uses structs to organize data.

> It is also another demonstration of how your hidden state, buried 17 fucking levels down in your hierarchy, is indeed GLOBAL. Because it only takes a global for my subroutine to be able to be able to fuck our world up in the same way.

Yes, we are in violent agreement again. I know all the ways data can hide.

> Pure FPLs do not create hidden state -> They don’t have mutables of any sort. period.

True. It’s also true that pure FPLs would not have been useful for any application I have ever worked on. The only FPLs I would ever consider using in real life are not pure.

Is the Message Box example I gave to Tom clear as far as what the code would look like?

@Tom DeGisi:

> True. It’s also true that pure FPLs would not have been useful for any application I have ever worked on.

I happen to think that FPLs (pure or not) fit the DB business logic type of applications far better than OOP. That is just my opinion. They won’t solve your cobol black box problem. But they would certainly help on their own.

> I happen to think that FPLs (pure or not) fit the DB business logic type of applications far better than OOP. That is just my opinion. They won’t solve your cobol black box problem. But they would certainly help on their own.

Non-pure FPLs are wicked cool and sound fine for business applications. I would love to work on something written in Erlang.

Pure FPL + DB != Pure FPL

In addition, in a large business IT environment pure FPL with a DB will have hidden state.

uma Says:
> Is the Message Box example I gave to Tom clear as far what the code would looks like ?

No of course not. It bears no resemblance to reality. I have never had an Add function erase my hard disk. It is just more hand-wavy theory. I am looking for an actual report from the field of a bug you had because of the accused OOP that a global variable would have solved. You know, a bug that was reported in Bugzilla that you actually had to fix, in a real program that did real things.

Suppose you’re using this routine to add some numbers in a banking application. Does that make it a bug similar to these Bugzilla field report bugs of yours? It’s a bug. It might even be worse than erasing the harddisk and the messagebox that tells you that it did that :-) And it is probably buried 17 layers deep into some godforsaken OOP/POOPOO hierarchy.

In a subroutine, the only way you can create that same behavior is:

if (myGlobal == ABC) {
    return 1000000;
} else {
    return a + b;
}

It takes a global for my routine to replicate your hidden state bug.

But, since this last fragment (the subroutine version) can also exist inside your object’s method, that makes your OOP/POOPOO methodology twice as capable of generating the same sinister bug. It can create the bug with a global (as did my subroutine). Or it can do it with the hidden state. Now since globals are universally frowned upon, but hidden state is celebrated by the POOPOO crowd, that makes the *real world* likelihood of your POOPOO code generating this kind of bug perhaps 10 times as likely versus if your application was written in good old procedural. That might also explain why Java hasn’t managed to replace all that COBOL :-)
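For concreteness, here are the two variants sketched in Python (all names invented; the thread’s fragments are C/Java-ish pseudocode, but the point is language-agnostic):

```python
# Hypothetical sketch: the same "sinister add" bug, two ways.

# Variant 1: a procedural function hijacked by a global.
my_global = "ABC"

def add_with_global(a, b):
    if my_global == "ABC":          # behavior depends on global state
        return 1000000
    return a + b

# Variant 2: an OO method hijacked by hidden instance state.
class Adder:
    def __init__(self):
        self._mode = "ABC"          # hidden state, invisible to callers

    def add(self, a, b):
        if self._mode == "ABC":     # same hijack, now encapsulated
            return 1000000
        return a + b

print(add_with_global(2, 3))        # 1000000, not 5
print(Adder().add(2, 3))            # 1000000, not 5
```

Either way, a caller who only looks at the two explicit arguments cannot predict the result.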

This is what Joe Armstrong meant when he said that POOPOO has the “worst possible model”. And that bug Ms. Jessica is real. As real as it gets!

First of all, your example is not real, it is made up. “We’ll make it return…” tells me it is not an instance of something you have actually seen, but is again a theoretical thought experiment. All design methodologies have pros and cons, the question is how do these balance out in the real, practical world of writing programs, and unless you are talking real examples, you end up in silly, unimportant local minima traps.

Nonetheless, your example is so silly, I’ll run with it. You say what if some internal state affects the output of the function; it might be hard to find. That is true: whenever a function’s output is not a direct calculation from its explicit inputs, it is more complicated than a pure function.

However, somehow you discount exactly the same effect of the global state. The global state is just as likely to affect the result, and this is, in my practical experience, much worse. Why? Certainly the global state is more visible, but that is a problem, not a benefit. It means that it necessarily can be changed from a long distance away (regardless of whether it is changed through an access layer or not). So the things that affect the function’s output are distributed all through your code.

When the state is “hidden”, which is to say encapsulated, all access to the state is localized into code right near the problem function or in one of its base classes. It is easy to see all the things that might affect the function’s output, because all the functions to which the internal state is accessible are right next to the affected function.

So your example is exactly backward. State affects things; however, localized, encapsulated state is much more controllable and tractable. If you can change the state everywhere in your program, you need to look everywhere in your program to fix a bug. If you can only change it in a limited number of places, you only have to look there. This is encapsulation, and it is one of the core benefits of object oriented programming.
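A minimal sketch of what I mean by localized state (Python, hypothetical names):

```python
# Sketch: encapsulated state can only change in a few known places.

class Account:
    def __init__(self):
        self._balance = 0           # private by convention; only the
                                    # methods of this class touch it

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def balance(self):
        return self._balance

# To audit every write to _balance, you read this one class, not the
# whole program. With a module-level global, any line anywhere could
# have mutated it.
acct = Account()
acct.deposit(50)
print(acct.balance())               # 50
```

The same property is what makes the object testable in isolation: the test controls every input to the state.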

This is particularly important when it comes to unit testing, and especially test-driven development. If the object under test has its input state spread over many different global variables, it is almost impossible to control that set of states in such a way as to generate tightly controlled, isolated test cases. Without solid, white-box testing instrumentation in place, your software will quickly spiral into a never-ending bug list.

If you wrote large programs and worked with teams of programmers, I wouldn’t even need to tell you this, because it would be obvious to you.

Despite its name, myGlobal doesn’t have to be a globally-scoped variable. For instance, it can in fact be exposed only to other functions in the same source file, whose object code is then linked into your application. It could be part of a library where the linking is done at load time, and you don’t even necessarily have source code for it.

But even if it is a global variable, there is no place in the program other than this subroutine that privileges a certain value of this variable to be able to cause a function that normally adds its two arguments to instead arbitrarily return the value One Meeeelion <holds pinky to corner of mouth while stroking weird-looking cat in lap>.

If your source code gives this special meaning to a certain value of myGlobal it is a problem regardless of whether you’re writing in an OO language or not. OO doesn’t magically make incredibly stupid code better. It just forces you to explicitly specify what information is allowed to go where.
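The first point — state that is wider than one function but still not program-global — can be sketched in Python with a closure, roughly analogous to a C file-scope static (names made up):

```python
# Sketch: "shared" state that is not program-global. `count` is
# visible only to the nested function, the way a file-scope static
# in C is visible only to functions in that translation unit.

def make_counter():
    count = 0                       # no code outside can name this

    def next_value():
        nonlocal count
        count += 1
        return count

    return next_value

counter = make_counter()
print(counter(), counter())         # 1 2
# No other code can read or mutate `count`; it is hidden state
# without being an object attribute or a true global.
```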

Object-oriented programming is eliminated entirely from the introductory curriculum, because it is both anti-modular and anti-parallel by its very nature, and hence unsuitable for a modern CS curriculum. A proposed new course on object-oriented design methodology will be offered at the sophomore level for those students who wish to study this topic.

Jessica dear: I DON’T write code as part of “teams of large programmers” (presumably in corporations). I QUIT THAT. A long time ago I did.

> Jessica dear: I DON’T write code as part of “teams of large programmers” (presumably in corporations). I QUIT THAT. A long time ago I did.

I see why your opinions and evidence are so completely useless for our purposes. You do different work. Maybe your opinions are worthwhile in that domain. What is it? Organic chemistry? Astronomy? Linguistics?

The example is clear. We’re talking about two things here a) global state b) hidden state

The example was given in terms of these two. What you ramble about isn’t invalid. It is just a “distraction”.

What you call “distraction” is my attempt to demonstrate that the question of whether the information is global is orthogonal to the question of the language being OO. The procedural languages have ways to make this information non-global, and the OO languages have ways to make information global. The difference is how hard you have to work to do one or the other. In general, OO makes it easy to do “hidden state”; to expose information to the world, you must do so explicitly. Depending on the specific procedural language, it can be rather difficult to do hidden state, but none make it impossible.

Your “unit testing” and “real world bug” rants sound a lot like those corporate HR ladies and their silly checklists. If some employee(s) do not fit their checklists and corporatese babble, the problem is with the employees not their checklists.

I have posted evidence/examples/links for every single claim I made, including the last quote about how they are dropping OOP from curricula because it is both “anti-modular” and “anti-parallel” by its very nature. In other words, I am not the only one making those claims. What have you posted other than silly rants about my background?

I know who ESR is thank you very much. If I didn’t know who he was, and if I wasn’t (among so many others) thankful for his work and contribution to mankind I wouldn’t be spending time on his blog.

> Your “unit testing” and “real world bug” rants sound a lot like those corporate HR ladies and their silly checklists. If some employee(s) do not fit their checklists and corporatese babble, the problem is with the employees not their checklists.

Your dismissal of the “unit testing” and “real world bug” issues makes you sound like you know nothing about either large non-corporate open software or corporate software. All the cool kids in open source, most of whom would rather not give the time of day to “corporate HR ladies and their silly checklists” absolutely love good unit tests and spend their time on real world bugs.

What do you do for a living?

> I have posted evidence/examples/links of every single claim I made.

Most of which is of low quality. The links mostly aren’t helpful, since they seldom address the hidden state problem.

Let me offer you this suggestion. Maybe the “fault” is with your understanding, not my opinion or my examples. Have you actually considered that possibility? Have you considered the possibility that some knowledge may actually be beyond the grasp of the Jessicas of our world? (e.g. the kind of knowledge that led to dropping OOP from the curriculum because it is BOTH “anti-parallel” and “anti-modular”).

You ought to consider this possibility, man. Either Jessica and her common/shared wisdom are right, or those who claim OOP to be “anti-modular and anti-parallel by its very nature” are. It cannot possibly be both ways.

Well, since you haven’t been able to explain yourself to Jessica, or Monster, or esr, or me, I’m going with the problem being you. For example, in the last comment you tried changing the subject again. We want to know about how your hidden state issue is solved by global variables. None of us is in love with OOP. You bring up “anti-parallel”. Well, I’ve already mentioned that I like Erlang. One reason is the way FP makes parallel processing easier. But your global variables don’t solve that problem. They may make it worse. Not helpful in explaining hidden state and global variables. Rather the opposite. OOP is anti-parallel? Sure, just like procedural. And completely beside the point.

Anti-modular? Global variables are more anti-modular than OOP. FP may be more modular than OOP. Is pure FP more modular than impure FP? Dunno. Depends on the definition of modular, I guess. Again, beside the point.

You keep trying to change the subject to “OOP is bad”. Not interested. The problems I have aren’t going to be solved by switching to FP. They are going to be solved by people who care about the quality of their work, no matter what programming paradigm we use.

I want to understand your hidden state / OOP / global variable point. I can use that information. FP? Cool, but it isn’t happening where I get paid anytime soon. Get rid of Java? OK, but it isn’t happening where I get paid anytime soon either.

You/Jessica are definitely more in love with OOP than I am with global mutable variables.

> Well, I’ve already mentioned that I like Erlang. One reason is the way FP makes parallel processing easier.

But you haven’t asked yourself the question why the Erlangs of our world excel at parallel programs, when state (all state) in languages like Erlang gets carried by the entire program as a whole (i.e. is global and transparent, albeit cleverly managed). If we believe Jessica, we should be in deep doo-doo, and not able to write programs that beautifully and linearly scale with CPU cores the way we can with Erlang.

> Global variables are more anti-modular than OOP.

If you allow anything to touch them and modify them, then yes, they are at the same level of anti-modularity as OOP (they still offer the benefit of transparency though). But if you manage them well (allow their modification, but under very strict rules), they actually help modularity. Case in point: FPLs, which treat state (all of state) as one gigantic global, yet manage the system/protocol by which it is accessed/modified.

It is the “random-access, random-mutate” languages like C, which offer no mechanisms/rules for managing that global state that have/create a problem. Those are the languages that have given globals a bad name.

If you *get* this subtle point, then you will get everything that I have said here.

Explained in other words: The reason you don’t want me to know about your hidden state isn’t because that hidden state impacts me and therefore I am entitled to know something about it. It is because you’re scared shitless that I (or some other arbitrary thing) might modify it. In languages like C and Java, that risk is very real and is there. In FPLs it isn’t. Somehow, some way (by sheer magic), they make your state transparent to everything it impacts, yet they don’t allow arbitrary modification of it. They only allow modifications via special mechanisms/rules. The failure of OOP is NOT in its desire to protect that state from tampering by everybody else (encapsulation). The failure comes from the lack of transparency when that internal state impacts the global behavior of the program as a whole.
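A rough sketch of that discipline, in Python rather than a real FPL (names invented): state is passed explicitly, and “modification” means constructing a new value, so everything the state impacts can see it, but nothing can mutate it behind your back:

```python
# Sketch of explicit state-threading: no in-place mutation anywhere.

def deposit(state, account, amount):
    # Return a NEW state dict; the old one is untouched.
    new_state = dict(state)
    new_state[account] = state.get(account, 0) + amount
    return new_state

s0 = {}
s1 = deposit(s0, "alice", 100)
s2 = deposit(s1, "alice", 25)

print(s0, s1, s2)   # {} {'alice': 100} {'alice': 125}
# Every version of the state is transparent to anything handed it,
# yet no caller can arbitrarily modify a version someone else holds.
```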

> The problems I have aren’t going to be solved by switching to FP. They are going to be solved by people who care about the quality of their work, no matter what programming paradigm we use.

People who do quality work are needed everywhere. You’re stating the obvious here. The reasons problems do get solved by switching to FP are:
a) FP is incredibly expressive (possibly an order of magnitude fewer lines of code than Java)
b) FP understands the question of state like no other paradigm.
c) FP is inherently modular. It is the ideal paradigm for unit testing. A function is the ideal “unit testing” unit.
d) FP is inherently scalable.

> I want to understand your hidden state / OOP / global variable point. I can use that information.

That is a fair question. What you are asking me essentially is this: “How can I use Java to write code in a functional style?” I don’t have a short answer for this.

I have an Erlang book. I read it. It was fascinating. I just don’t have a way to apply it at work yet.

> But you haven’t asked yourself the question why the Erlangs of our world excel at parallel programs, when state (all state) in languages like Erlang gets carried by the entire program as a whole (i.e. is global and transparent, albeit cleverly managed).

I knew that. It’s also true to a certain extent in Perl and Java, via reflection and other deep magic.

> But if you manage them well (allow their modification but on very strict rules), they actually help modularity. Case in point FPLs which treat state (all of state) as one gigantic global, yet they manage the system/protocol by which it is accessed/modified.

Yes, and if globals are write once they parallelize beautifully. In any paradigm.

> That is a fair question. What you are asking me essentially is this: “How can I use Java to write code in a functional style” ? I don’t have a short answer for this.

I do. Jeez. Don’t save any state. Have everything return a value. You can write pure functional code in almost any language. You can depurify it with your big world global just by having one class, call it, say, Global, with all the variables in it as public statics. Easy. Want to make some of the variables write-once? OK. Make them private, add a public getter and setter, but put a private guard variable on the setter.
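Sketched in Python rather than Java, since it’s shorter (the names are made up):

```python
# Sketch of the "write-once global with a guard" idea: a single
# Global class holds the shared variable; a guard flag makes the
# setter refuse a second write.

class Global:
    _config = None
    _config_set = False             # the private guard variable

    @classmethod
    def set_config(cls, value):
        if cls._config_set:
            raise RuntimeError("config is write-once")
        cls._config = value
        cls._config_set = True

    @classmethod
    def get_config(cls):
        return cls._config

Global.set_config("production")
print(Global.get_config())          # production
# A second set_config() raises, so the "global" stays effectively
# immutable after initialization and parallelizes safely.
```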

A Java or Perl guru (unlike me) could do some fancy things to make this even easier.

FP languages make writing functional programs easier, like OO languages make writing OO programs easier. You can write either in C, or even assembly. It’s just harder.

Perl 6 HAS functional programming features and they are migrating into Perl 5.

> The failure comes from the lack of transparency when that internal state impacts global behavior of the program as a whole .

Yeah, that’s the part we want to know about, and you haven’t explained it. Everything above is interesting, but I already knew it. It’s why I like Erlang and am interested in it. It’s also beside the point. Explain, giving a real example, a problem about internal state. I maintain that getting rid of hidden state can be a good idea, but it’s generally unavoidable, and there are reasonable design decisions where you can choose to have internal state. In the case of OOP, if you have the source it isn’t a significant debugging problem. If you don’t have the source, all states are hidden, in OOP, procedural or FP. This is why you have to explain it. Because it doesn’t make any sense.

> Yes, and if globals are write once they parallelize beautifully. In any paradigm.

That is why I always thought a good implementation of globals in languages like C would have been with something like Unix file system privileges (rw). Only one or two functions would be allowed to write. The rest of the system only has the privilege of observing.
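A toy sketch of that rw-privilege idea in Python (a hypothetical design, not any real language mechanism):

```python
# Sketch: a "global" that everyone may read, but only holders of
# the write capability may write.

class Guarded:
    def __init__(self, value):
        self._value = value

    def read(self):                 # world-readable
        return self._value

    def _write(self, value):        # private by convention; handed
        self._value = value         # out only via writer()

    def writer(self):
        # Give this to the one or two functions allowed to write.
        return self._write

G = Guarded(0)
set_g = G.writer()                  # the sole write privilege

set_g(42)
print(G.read())                     # 42
# Everything else in the program only ever calls G.read().
```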

> It’s why I like Erlang and am interested in it

Erlang has a truly elegant design. It also somewhat departs from other FPLs, especially in its concurrency model, which is based on lightweight, fully isolated processes that are inexpensive to spawn and which communicate via message passing. Other FPLs use software transactional memory and atomic transactions to do concurrency. I prefer the Erlang model of concurrency (the actor model) to the STM model. It also has an elegant way of handling global state and making it visible to all the lightweight processes that are always doing their concurrent thing.

> I maintain that getting rid of hidden state can be a good idea, but it’s generally unavoidable, and there are reasonable design decisions where you can chose to have internal state.

The only criterion that works is this: If the state has only local impact, then it deserves the right to be local. If that state impacts the return value of any public methods or impacts global behavior in any other way, then it doesn’t. It has to be made transparent.

> In the case of OOP, if you have the source it isn’t a significant debugging problem.

It is if some kind of subtle bug develops based on internal state. It is very difficult to debug. We have already seen how some very local state can have very non-local effects. If those non-local effects build up in a gentle way and ripple through the system, they can be very difficult to trace.

For example, let us say that the add routine makes a subtle mistake rather than the drastic ones that I gave. Let us say that subtle mistake causes the program execution flow to go in some other direction (other than what would have happened had the precisely correct value been returned). As the execution flow takes its diversion, it results in yet another small error due to some other internal state of another object (say the parent object). And so on. Let’s say your errors accumulate gently over a month-long period before you commit your records to a database. These types of bugs are so difficult, you don’t even know where to start.
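A toy illustration of the buildup (Python; the numbers are entirely made up):

```python
# Sketch: a routine with a tiny state-dependent bias, invisible per
# call but ruinous over enough calls.

class BiasedAdder:
    def __init__(self):
        self._calls = 0             # hidden state

    def add(self, a, b):
        self._calls += 1
        # a 0.001 error on every 100th call: far too small to notice
        bias = 0.001 if self._calls % 100 == 0 else 0.0
        return a + b + bias

adder = BiasedAdder()
total = 0.0
for _ in range(100_000):            # weeks of "transactions"
    total += adder.add(1.0, 1.0)

print(total - 200_000.0)            # ~1.0 of silently accumulated error
```

No single call looks wrong in the debugger; only the aggregate drifts.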

> Explained in other words: The reason you don’t want me to know about your hidden state, isn’t because that hidden state impacts me and therefore I am entitled to know something about it. It is because you’re scared shitless that I (or some other arbitrary thing) might modify it

That’s not true. There are also scenarios in which I don’t want you reading “state” that belongs to some other part of the program because I don’t want you to rely on that state. You may even read the source code for one version of my code, think you completely understand what I’m doing with a certain variable, and feel entirely justified in assuming that when that variable’s value is equal to your variable ABC it should modify the behavior of your Uma.Add() routine. But then I change the behavior of my part of the program, and that particular value of that variable no longer has the special significance you thought it had.

If I want a globally-accessible state variable/function/method, I’d better explicitly document the definition of it. If I don’t, the project manager will ask me why I’m exposing this information without writing that spec for you to follow when you write your part of the code. And none of this has a damned thing to do with whether we’re using an OO language.

> For example, let us say that the add routine makes a subtle mistake rather than the drastic ones that I gave.

Allowing that routine to depend on variables other than the two explicitly passed to it is what causes the mistake. That seems like a really strong argument against leaking that information, even read-only.

> That is why I always thought a good implementation of globals in languages like C would have been with something like Unix file system privileges (rw). Only one or two functions would be allowed to write. The rest of the system only has the privilege of observing.

Perl 6 has features like this. In Perl 5, if you want these features you go with Moose, an OO library. Heh.

> The only criteria that works is this: If the state has only local impact then it deserves the right to be local. If that state impacts the return value of any public methods or impacts global behavior any other way, then it doesn’t. It has to be made transparent.

The Monster covers this. You are dead wrong. In many cases it is a really bad idea to make it transparent. This is actually a classic problem in API development for all sorts of OS and library calls. If you expose your internal state, people WILL write code that depends on it, even if you document that it may change in future releases. You end up being unable to improve your OS / library because you will break somebody’s code. Fail.

> It is if some kind of subtle bug develops based on internal state. It is very difficult to debug. We have already seen how some very local state can have very unlocal effects. If those unlocal effects build up in a gentle way and ripple through the system they can be very difficult to trace.

> For example, let us say that the add routine makes a subtle mistake rather than the drastic ones that I gave. Let us say that subtle mistake causes the program execution flow to go in some other direction (other than what would have happened had the precisely correct value been returned). As the execution flow takes its diversion, it results in yet another other small error due to some other internal state of another object (say the parent object). And so on. Let’s say your errors accumulate gently over a month-long period before you commit your records to a database. These type of bugs are so difficult, you don’t even know where to start.

More hand waving. Give us an example.

The truth is that hidden state is easy for me to deal with if I have source and hard otherwise.

What do I do if I have OO code? Add a getter and print the state to the log. C code? Make the static external and print the state to the log. FP code with your monster global? Print the state to the log. Gee, I saved a few lines. The real work is the same. Finding the code to modify is the same. Modifying the code is slightly easier. Regression testing is the same. Testing to see if what I need is being written to the log is the same. Filling out the $%@#@ paperwork to migrate to production is the same. Engaging the integration team is the same. Writing the readme file is the same. Watching the results in production is the same. FP hasn’t bought me much.

Notice that was the easy branch. If I don’t have source I get to sit on a call with dozens of other people while we try to figure it out. Getting a vendor to add a logging statement to their code is almost impossible, even if they wrote it in FP.

You aren’t making your case. You are making a great case for open source, though.

> The Monster covers this. You are dead wrong. In many cases it is a really bad idea to make it transparent. This is actually a classic problem in API development for all sorts of OS and library calls. If you expose your internal state, people WILL write code that depends on it, even if you document that it may change in future releases. You end up being unable to improve your OS / library because you will break somebody’s code. Fail.

Sure. That isn’t a transparency problem. That is a dependency/design problem. In other words your design either wants to screw me over with your internal stuff, or creates a dependency on your mutable state that makes code more change-unfriendly. Either way it is a design problem that originates from your object, not from my need to know about something that impacts me. It isn’t a transparency problem.

The principle that I am entitled to know about something that impacts me isn’t what is wrong. That is a fundamentally correct principle. It is the change-unfriendliness of the code that results from this situation that is the problem. That is in fact another reason why internal mutable state is as evil as global variables and no different in many of its global effects. The best thing to do is to eliminate the internal state altogether in your design to break the dependency. That requires some creativity! And puts the burden in the right place (your object and your design).

> More hand waving. Give us an example

No handwaving. Read the example again. And ask more specific questions about what you don’t understand. Is it the way the errors accumulate that you don’t understand?

> Sure. That isn’t a transparency problem. That is a dependency/design problem. In other words your design either wants to screw me over with your internal stuff, or creates a dependency on your mutable state that makes code more change-unfriendly. Either way it is a design problem that originates from your object, not from my need to know about something that impacts me. It isn’t a transparency problem.

> It isn’t that I am entitled to know about something that impacts me that is wrong. That is a fundamentally correct principle. It is the change-unfriendliness of the code that results from this situation that is the problem. That is in fact another reason why internal mutable state is as evil as global variables and no different in many of its global effects. The best thing to do is to eliminate the internal state altogether in your design to break the dependency. That requires some creativity! And puts the burden in the right place (your object and your design).

In cases where people are taking design shortcuts, yes. Most of those are taken because neither time, nor money, nor design skill, are limitless. In other words, they are good designs, too, given the constraints real people operate under. Y2K was like this. We saved a lot of money by only saving two digits. In most cases though it is reflecting this truth: Reality isn’t pure functional programming. It is, for example, completely impossible to have a disk system or database that does not have mutable state. And no, you do not need to know the internals of the file system or the database, even though it certainly can impact you like mad.

> No handwaving. Read the example again. And ask more specific questions what you don’t understand. Is it the way the errors accumulate that you don’t understand ?

Not a real example. Handwaving. Hypotheticals. I understand your example and I understand that it is made up. In the real world, when I have had a problem that has taken months to develop it has always been related to human errors accumulating, with usually only one machine error. Rarely do I see two or three machine errors, and most of those happen bang, bang, bang. The sequence of events MUST depend on internal state because I am reflecting the real business world, which is not functional. Events are required by law, contract and business methods to proceed sequentially through a series of steps. That’s state. Vast amounts of the system must be hidden to me, like: “What was that customer representative thinking?” or “What was that system, which does not belong to me, doing?”

Trust me, having done tons of prod support, I want to see as many things as I can. I am well aware of how horrible it is not to be able to see old state. Shoot, I frequently can’t see old logs! But neither FP nor anyone’s fantasy design skills will fix this.

One more thing. In FPLs you won’t need logs because there is no internal state to log.

The best analog I can give you is Unix pipes. Imagine this. Some program in Unix reads a text file. Then pipes it to another. Then pipes it to another. And so on. At the other end of the pipe you redirect output into a file that gets stored on the disk. No mutable state is created along the way. It’s just that one function feeds its output to the other. The other in turn feeds its output to another, and so on.
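The pipe analogy rendered as a Python sketch (invented helper names):

```python
# Sketch: each stage is a pure function; nothing is mutated, data
# just flows left to right like a shell pipeline.

def read_lines(text):
    return text.splitlines()

def grep(pattern):
    return lambda lines: [ln for ln in lines if pattern in ln]

def count(lines):
    return len(lines)

def pipe(value, *stages):
    for stage in stages:
        value = stage(value)
    return value

text = "error: disk\ninfo: ok\nerror: net\n"
print(pipe(text, read_lines, grep("error"), count))   # 2
```

This is roughly `cat file | grep error | wc -l` with functions in place of processes.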

Another analogy is spreadsheets. Spreadsheets are like FPLs. Your cells are expressed as functions of other cells. There are no mutables. You make a change to one cell, and the other cells that depend on it change appropriately. If you got your functions written right (the composition), and your sheet does what it is supposed to, then you are done. There is no “internal state” to be logged. There is no mutable state to debug anywhere. There are only inputs, outputs, and functions that do transformations between inputs and outputs.

Not only this, “unit testing” becomes a breeze. Why? Because testing individual functions like MAX, or MIN, or SUM for correctness is sufficient to guarantee that any composition using them will also yield correct results. If MAX, MIN, or SUM are correct, then any composition made of them is also correct. You wouldn’t have to go debug inside MAX or MIN. You just have to do your composition right. That is why FPL is fundamentally modular, while OOP isn’t.
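A sketch of that compositional claim (Python, made-up function names):

```python
# Sketch: verify the small pure functions once; then a composition
# needs no re-testing of their internals, only of the composition.

def f_max(xs): return max(xs)
def f_min(xs): return min(xs)
def f_sum(xs): return sum(xs)

# "Unit tests" for the primitives:
assert f_max([3, 1, 2]) == 3
assert f_min([3, 1, 2]) == 1
assert f_sum([3, 1, 2]) == 6

# A composition, like a spreadsheet formula: range = MAX - MIN
def spread(xs):
    return f_max(xs) - f_min(xs)

print(spread([10, 4, 7]))           # 6
# There is no hidden state inside f_max/f_min that could make
# spread() behave differently on a second call with the same input.
```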

> Not a real example. Handwaving. Hypotheticals. I understand your example and I understand that it is made up.

If that is your objection: I have had that stuff happen to me in the real world. Specifically in writing simulators (of biological systems). Slow, gentle buildups of errors that make your head spin. Especially if the errors are too small to notice initially, but only become noticeable after they accumulate.

> It is, for example, completely impossible to have a disk system or database that does not have mutable state.

Sure. Programs that have no “side effects” (i.e., that produce no mutable state, e.g., writes to your hard disk) do nothing other than get your CPU hot. The reason we need computer programs is the “side effects” they produce that are useful to us humans. FPLs have mechanisms for dealing with input, output, global state, etc., for the places where you need state. That is what all the ‘theory’ is about: how you can make a whole paradigm that is “stateless” by default, yet still capable of producing outputs that are useful to us, including mutable output state.

> One more thing. In FPLs you won’t need logs because there is no internal state to log.

I was talking about globals in an FP which isn’t pure.

> The best analog I can give you is unix pipes.

Sometimes I have to save those piped results to find defects, even though they are, as you said, purely functional results.

> That is why FPL is fundamentally modular. While OOP isn’t.

Your example shows that pure FP is modular at the function level, while OOP is modular at the class level. This is well known, and does not bother me. I don’t need modularity all the way down to the function level.

> Not only this, “unit testing” becomes a breeze. Why? because testing individual functions like MAX, or MIN, or SUM for correctness, is sufficient to guarantee that any composition using them will also yield correct results. If MAX, MIN, or SUM are correct, then any composition made of them is also correct. You wouldn’t have to go debug inside MAX or MIN. You just have to do your composition right. That is why FPL is fundamentally modular. While OOP isn’t.

Sure. Any time you save state you increase the number of execution paths which must be tested. Exponentially. This is also well known.

> If that is your objection, I have had that stuff happen to me in the real world. Specifically in writing simulators (of biological systems). Slow gentle buildups of errors that make your head spin. Especially if the errors are two small to notice initially, but only become noticeable after they accumulate.

Great! Can you happen to remember any details? What the problem was and how you eventually solved it? I know that’s hard. I don’t remember anything close to the number of defects I’ve worked on in my life.

> Your example shows that pure FP is modular at the function level, while OOP is modular at the class level.

I don’t agree with the 2nd part of the statement. I don’t agree that OOP is modular at all, even at the class level. Simply because of the “inconsistent” behavior resulting from hidden mutable state: I may call your public methods with the exact same arguments yet get two completely different results. This would not happen in FP. For something to be modular, it has to guarantee consistent behavior at its interface points. OOP doesn’t.

> Great! Do you happen to remember any details?

This was > 15 yrs ago. I have since moved to a completely new career, so I don’t remember the exact details. But the language I used was Object Pascal (Delphi). This was all done for prototyping, vetting of our models, etc. I would notice, for example, that a week or two into my runs I would be seeing strange behavior. In the end I found it was due to a gentle buildup of errors similar to what I explained.
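That kind of slow buildup is easy to reproduce in miniature. This is a hedged sketch, not the commenter’s actual simulator: naive floating-point accumulation drifts over many steps, while compensated (Kahan) summation keeps the drift down.

```python
import math

# Naive accumulation: tiny rounding errors compound over many
# steps, too small to notice at first, visible only after they
# pile up.
def naive_sum(step, n):
    total = 0.0
    for _ in range(n):
        total += step
    return total

# Kahan (compensated) summation carries a running correction
# term so the rounding error does not accumulate.
def kahan_sum(step, n):
    total, c = 0.0, 0.0
    for _ in range(n):
        y = step - c
        t = total + y
        c = (t - total) - y
        total = t
    return total

n = 1_000_000
exact = math.fsum([0.1] * n)   # correctly rounded reference
drift = abs(naive_sum(0.1, n) - exact)
```

After a million steps the naive drift is already measurable; run the simulation for “a week or two” and it becomes the head-spinning behavior described above.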

However, I just wanted to correct a misunderstanding that you seem to have about Erlang. Saying that any state in Erlang is global is about as wrong as you can get and still breathe.

Nothing in Erlang is global, and anything you think is global is probably an actor, that is, a separate Erlang process that anyone can message and get replies from. You could probably do a really good approximation of what you’re talking about as an actor. But the state is still hidden from anything that’s not that specific actor. The only way to access it would be to send messages to that Erlang process and get a response back, and that would be equivalent to writing a .getX() method in OO.
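The actor idea translates outside Erlang, too. Here is a rough, hedged approximation in Python using a thread and queues; the `counter_actor` name and message protocol are made up for illustration. The state lives only in a local variable of the actor; the sole way in or out is a message.

```python
import queue
import threading

# A minimal actor: 'count' is hidden state that nothing else can
# touch directly.  The only interface is message passing.
def counter_actor(inbox):
    count = 0  # private to this "process"
    while True:
        msg, reply_to = inbox.get()
        if msg == "incr":
            count += 1
        elif msg == "get":
            reply_to.put(count)   # reply message, not shared memory
        elif msg == "stop":
            break

inbox = queue.Queue()
threading.Thread(target=counter_actor, args=(inbox,), daemon=True).start()

inbox.put(("incr", None))
inbox.put(("incr", None))
reply = queue.Queue()
inbox.put(("get", reply))
value = reply.get(timeout=5)   # the message-passing analogue of .getX()
inbox.put(("stop", None))
```

The round trip through `reply` plays the role of the `.getX()` method: you never see the state itself, only a copy sent back to you.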

Using Joe Armstrong to support your point of view is also wrong-headed. Joe’s primary and principal thesis is that any state that is shared means that your code is now potentially unsupportable and unscalable. He’s even gotten a PhD based on this thesis. He doesn’t really believe the OO hype, and I don’t blame him. The OO hype is as overblown as any IT hype.

Note that I stopped monitoring this thread when I realised we were mostly in violent agreement; we just disagree as to what we’d prefer. Your experiences tell you that global variables are supportable and OO is evil. My experience with Erlang, and listening to Joe evangelise the dangers of shared state, is just one of the reasons why I hold that globals are pure evil. Another reason is that I expect programmers to follow the path of maximum laziness. Global variables are easy. Over-deconstruction of the problem space is much harder (indeed, it seems the bigger issue is splitting apart functionality that has no business being coupled together).

> Nothing in erlang is global and anything you think is global is probably an actor, that is a separate erlang process that anyone can message and get replies from. You could probably do a really good approximation of what you’re talking about as an actor.

Yes. I agree with you. Erlang by design is a truly stateless system. Everything is fully isolated from everything else by design and messages are the fundamental primitive for sharing information and state. But there is such a thing as global state that is carried by the system as a whole. And it is made available to everyone who needs it or better yet automatically delivered to everyone impacted by it.

> The only way to access it would be to send messages to that erlang process and get a response back and that would be equivalent to writing a .getX() method in OO.

This is the part that I have difficulty agreeing with. Some objects hide state that impacts me, and do not make it available to me even with a .Get(). If they did, that would create the kind of dependency that Tom DeGisi + Monster don’t like.

> Joe’s primary and principle thesis is that any state that is shared means that your code is now potentially unsupportable and unscalable.

Ah, this is where I don’t entirely agree. Even though I prefer the Erlang world view, the truth is that it is mutability that makes the code unsupportable/unscalable, not the shared state itself. Suppose the only time you wrote something into your global vars (i.e. mutated them) was when you initialized your program. Does that make your code unscalable? Not really! That is the mindset that the STM people also take (languages like Clojure).
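The “mutate only at initialization” discipline can be made mechanical rather than a matter of trust. A hedged Python sketch, with hypothetical config keys, using the standard library’s read-only mapping view:

```python
from types import MappingProxyType

def load_config():
    # Mutation happens exactly once, at initialization...
    settings = {"retries": 3, "timeout_s": 30}
    # ...then the dict is published through a read-only view.
    return MappingProxyType(settings)

CONFIG = load_config()

# Any number of readers can share CONFIG safely; writes are
# rejected by the proxy itself, not by convention.
def attempt_write():
    try:
        CONFIG["retries"] = 99
        return "mutated"
    except TypeError:
        return "read-only"
```

The global is still shared by everyone, but because it cannot be mutated after startup, it carries none of the dangers usually attributed to globals.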

> Note that i stopped monitoring this thread when i realised we were mostly in violent agreement, we just disagree as to what we’d prefer.

I am the kind of person who prefers something that I know is harmful (because it puts me in the right state of mind to manage it), rather than something that looks harmless but beneath the surface can in effect be just as harmful. If I have succeeded in making one point clearly, namely that internal, hidden mutable state can indeed be as destructive and bad as global vars, then I have made the point I wanted to make. Ask anybody about global vars, and everyone will tell you of their dangers and how you can hang yourself. Ask anyone about internal hidden state in objects, and everybody will tell you it’s the greatest thing since sliced bread. It’s this latter statement that I so vehemently disagree with.

“But you haven’t asked yourself the question why the Erlangs of our world excel at parallel programs, when state (all state) in languages like Erlang gets carried by the entire program as a whole (i.e. is global and transparent, albeit cleverly managed).”

There is no global in the C sense of global (i.e. one memory location that all people see via direct memory addressing). There is global in the higher-level sense of global (e.g. some process that fans out a global state to all kinds of spawned processes that are impacted by it).

Having said that, if you actually restrict mutating globals and put it in the hands of a few empowered entities (those with the privileges to do so), it doesn’t matter much whether the rest of the program only sees copies of that global (e.g. via messaging or some other copying mechanism) or sees the global directly, as long as they have no means of modifying/mutating it. Hence the rw privileges on globals that Tom and I discussed.

So to tie the post and all the discussions together, I will now describe a small system I have that uses what essentially amounts to lots of configuration files (to the extent those are indistinguishable from a declarative programming language), lots of objects, and lots of global state :-)

One of my hats at work is that I am “the emulation guy.” I stuff things into FPGAs to be able to test before we build a chip.

The FPGA build flow is contorted. Sometimes there are loops, for example when you are doing slight modifications to place and route to achieve timing closure. Sometimes there are alternate build paths. For example the loop described above, or perhaps the replacement of a firmware ROM image inside an existing FPGA image that doesn’t require resynthesis or a new place and route.

make is completely useless to me.

make’s dependency analysis is all about conjunctions. There are no disjunctions, no concept of “this process can start with A as inputs or B as inputs, and it does slightly different things depending on which is available.”
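That disjunction is easy to state in a real language even though make can’t express it. A hedged sketch with entirely hypothetical file names (the `.ngc`/`.ncd` names are just stand-ins for FPGA build artifacts):

```python
import os

# make only expresses conjunctions ("need A AND B").  This task
# sketch expresses a disjunction: start from A if it exists, else
# from B, and behave slightly differently depending on which was
# found.
def place_and_route(workdir):
    netlist = os.path.join(workdir, "design.ngc")     # fresh synthesis output
    checkpoint = os.path.join(workdir, "routed.ncd")  # earlier routed result
    if os.path.exists(checkpoint):
        return ("incremental", checkpoint)  # re-route from the checkpoint
    elif os.path.exists(netlist):
        return ("full", netlist)            # route from scratch
    raise FileNotFoundError("no usable input for place and route")
```

A make rule has one fixed prerequisite list; this function’s prerequisite is “A or B”, with the behavior keyed to whichever was available.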

The out-of-date analysis is worse than useless: it’s actively dangerous. For some bigger emulations, I essentially have a bimodal distribution of task times. They’re all either under 10 seconds or above 2 hours (and sometimes above 8 hours). A rebuild due to a false problem like network time synchronization can easily lose a day’s worth of work. To be fair, a checksum-based out-of-date scheme like the ones scons, waf, and other modern make replacements use would remove that problem, but still wouldn’t really solve anything. I know which non-parallelizable long-running processes are going to have to be invoked when I type at the command line, and I don’t care if a few 2-second processes “wastefully” get re-executed. I just want a fully repeatable, easy way of specifying the recipe.

Another problem that make doesn’t solve (but modern systems help with) is that of “wrapping” external tasks. I want to be able to have a standard method of setup and teardown that allows creating directories, copying files, creating config files, destroying directories etc., a standard way of teeing output to logfiles and to the terminal (through filters to remove excessive warnings in some cases), etc.

(By the way, to uma’s point, some of the Xilinx tools maintain their own state inside their build directories, and have their own little invisible mini-make language, where they try to save me time by not rebuilding things that don’t need rebuilding. This is the biggest time-waste in the world because when it breaks, it breaks silently and badly. The most important “wrapping” of external tasks my system performs is that of destroying any state that the external task might want to mistakenly rely on.)

Unfortunately, for me, scons and ruffus and waf completely go about it the wrong way. Though I dearly love Python, and it’s great to use as a DSL for some problem domains, I think it sucks for other problem domains, and there is no way on $DEITY’s green earth that I’m going to compose the order of things in a pipeline using Python and decorators (as with ruffus). I would also much prefer not to use straight Python as a declarative language, like you can do with scons and waf. (And don’t get me started on the problems with using XML as a declarative language, either.)

So, I have two completely different paradigms inside my build system. How to build something is described in pure object-oriented Python, inheriting from a Task class that allows efficient description of lots of standard stuff like creating directories and stuffing configuration files inside them, but yet allows overrides for any kind of special setup and teardown, and allows easy parameterization of the external tool that is to be called. Each external tool is wrapped in a subclass of Task, in its own file in the directory that describes the available tasks.

What to build is described in a hierarchical declarative language, based on my RSON parser. Each build recipe is essentially a script in RSON that invokes the generic python build script to “execute” it.

The subclassed RSON parser stuffs the entire RSON file into an object tree, then (using task data from the tree), composes the tasks, letting each task’s installer read stuff from the object tree and update the object tree with information about what outputs it will make available for subsequent tasks. So when a task object gets initialized, it can make subtle decisions about what to build based on the objects that will be available to it. Once all the tasks have been initialized, then they are run in order.

Conceptually, the tree is very similar to, e.g. the Windows registry, in that it has a hierarchical set of nodes that can each simultaneously contain almost any type of object and also have sub-nodes. Tasks may be passed the top of the tree, or if a task involves subtasks, those may be passed a root that is farther down in the tree.
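The registry-like tree can be sketched in a few lines. Again a hedged, hypothetical illustration (the `Node` API and the `fpga/synth` path are invented): each node holds arbitrary values and named sub-nodes, and a task can be handed either the root or a subtree farther down.

```python
# Registry-style hierarchical tree: every node can simultaneously
# hold values and named sub-nodes.
class Node:
    def __init__(self):
        self.values = {}
        self.children = {}

    def child(self, name):
        # Auto-create on first access, like creating a registry key.
        return self.children.setdefault(name, Node())

    def lookup(self, path):
        # A path like "fpga/synth" walks down the hierarchy.
        node = self
        for part in path.split("/"):
            node = node.children[part]
        return node

root = Node()
# A task's installer might publish its outputs like this, for
# later tasks to discover:
root.child("fpga").child("synth").values["effort"] = "high"
```

Passing `root.lookup("fpga/synth")` to a subtask instead of `root` is the “root that is farther down in the tree” idea: the subtask sees only its own corner of the global state.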

As with any make system, an error in a task will abort the make. Quite often, there are no useful intermediate build products. For example, the synthesis tool performs best if it sees the whole design, so you don’t really individually compile modules, and if the synthesis fails, you have to restart it after you fix whatever missing semicolon caused it to choke. But sometimes there are useful intermediate build products. For example, maybe I can re-run place and route with new timing constraints without having to resynthesize. For those few cases, having a separate recipe is no big deal, especially if the recipe format is small and easily grokked.

So to recap. For this particular problem, for my way of working, the program I have evolved using my limited skills and knowledge has:

– A huge, global, shared object-based data structure that any task can examine and alter, that describes what is going to be built.

– Task descriptions in a main-class-per-file paradigm that rely heavily on inheritance to define default behavior and standard capabilities.

– “Evil” configuration files for each build recipe, because some things are just better to express declaratively than procedurally.

– The build order, which even simple tools like make have been able to auto-discover for years, is explicitly defined (at a high level) in the configuration file.

In other words, it’s not a system that just anybody can find fault with. Noooo, I’ve achieved the holy grail and coded a system that everybody can find fault with.

I’m a little surprised that nobody’s linked to Jonathan Rees’ `definitions of OO’ write-up yet; maybe it’s a given that most people here have already read it, but there do appear to be some that haven’t….

> If you are, then you will probably find my reasons for despising it (other than the above) to be very touchy-feely, and they are. I think comprehensive OO (occasional OO is fine) is a very poor metaphor for the world, a poor tool for problem solving, and an unnatural way to think. Here are some particulars off the top of my head:

> – It accounts poorly for symmetric interaction, such as chemical reactions and gravity.

> – It deals poorly with information-oriented phenomena such as mathematical objects, copying, broadcast, encryption, caching, and signing. E takes a step in the right direction by introducing the “selfless” notion, but this isn’t enough.

> – It usually has a very imperative, operational flavor to it. It begs you to write a recipe, not to describe the result that you want as functional languages do. This isn’t really inherent, and E is much better on this count than C++/Java.

> – It forces you to always decide who’s on top in any interaction, i.e. which object should define the method, a decision that’s not always easy and one that you often want to change.

> – It’s a poor match both syntactically and semantically with natural language, making it awkward for expressing things that are in people’s minds.

What I like most about most OO languages is their modularity. I don’t want everything to be an object, but modules are great. More recent OO attempts, such as those in Perl 6 and Moose for Perl 5, emphasize roles rather than inheritance. This makes for Actor-oriented programming, or maybe Anthropomorphic-oriented programming. Anthropomorphic-oriented programming is a natural way for people to think; in fact we can hardly keep people from thinking anthropomorphically when it is a bad idea.
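Python has no direct equivalent of Perl roles, but mixins give a rough analogue of the idea: small bundles of behavior composed into a class without a deep inheritance chain. A hedged sketch with invented names:

```python
# Each "role" is a small mixin contributing one capability and
# assuming only that the consuming class provides key().
class Comparable:
    def __lt__(self, other):
        return self.key() < other.key()

class Printable:
    def describe(self):
        return f"{type(self).__name__}({self.key()})"

# A class composes the roles it wants, flat rather than via a
# tall is-a hierarchy.
class Version(Comparable, Printable):
    def __init__(self, n):
        self.n = n

    def key(self):
        return self.n
```

This is only an approximation; real roles (as in Moose) are checked at composition time and can detect conflicting methods, which plain Python mixins do not.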