On COBOL

According to Gartner, [...] there are 250 billion lines of COBOL source code being used, with 15 billion new lines each year. A major AAA national company has some 35,000 COBOL modules plus supporting COPY books and so on in its inventory. A major airlines has 848 COBOL modules in its crew management system with some 3,000,000+ SLOC of code. Merrill-Lynch runs 70 percent of its daily business on COBOL systems.

News from Suse/Novell

Novell is changing the file system software used by default in its Suse Linux operating system, aligning with rival Red Hat and moving away from a project whose future has become entangled with the fate of a murder suspect.

That lede from news.com pretty much summarizes the story. One other relevant detail: Suse is moving away from ReiserFS and adopting ext3 due to "customer demand".

Saturday October 07, 2006

05:34 PM

More from Steve Yegge

If you want real supporting evidence, you should construct the scientific kind, using experiments. Yes, I'm afraid that means looking at the failures too. Ouch! So bad for marketing. Whenever you hear Agile people asking around for "success stories", remind them politely that only looking at the positives is pseudoscience.

(emphasis added)

My only response: s/Agile//;

Thursday October 05, 2006

08:47 PM

More on Perl and XML

As I mentioned in my previous post, using XML to avoid writing a parser is a bad thing to do.

This reminds me of a conversation I had with gnat many years ago. He asked why Perl was always lagging behind in its XML support, since Java was all over XML, and everything was moving over to XML faster than light speed. My response was that perhaps this was just a sign that Perl doesn't need bleeding edge XML support because Perl isn't used in the places where everything is moving to XML.

On the surface, that explanation seems right. Perl has always been strongly associated with system administration, and system administration is one of those domains that's XML-hostile (and for good reason; where's the benefit?). Perl is also very good at slicing and dicing raw text (like log files, random bits of malformed HTML, nm output, whatever), so it doesn't need everything shoved into an XML format just to avoid writing a parser or using a regex.

Now, ~6 years later, the answer is clearer. Perl doesn't cleave tightly to XML because it doesn't need to.

If you look at how XML is abused as a means to avoid writing a parser, one of the most common abuses is as a means to add a measure of dynamism to languages like Java. Perl doesn't need to do that, because it's a dynamic language. Need to extend a class with a new method? Just do it. No need to implement a forest of classes just to use XML to pick one and reconfigure it on the fly.
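To make "just do it" concrete, here is a minimal sketch of adding a method to an existing class at run time. It is written in Python rather than Perl, and the `Report` class and `total` function are hypothetical names invented for the illustration.

```python
class Report:
    """Hypothetical class standing in for any existing type."""

    def __init__(self, rows):
        self.rows = rows


def total(self):
    """New behaviour, written after the class was defined."""
    return sum(self.rows)


# Attach the function to the class at run time: no XML, no factory classes.
Report.total = total

report = Report([1, 2, 3])
print(report.total())  # prints 6
```

Every instance of `Report`, including ones created before the assignment, picks up the new method.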

Another common XML abuse: serialization. If you have no natural means of representing data, XML is as good as anything. It's more readable than ASN.1 or other binary representations, though not as terse or as clear as JSON or YAML. But if you have a natural way to evaluate expressions (like, say, eval), then all that's necessary is a module to serialize data in a manner that plugs into the evaluator (like, say, Data::Dumper). Sure, this is a means of using eval to avoid writing a parser, but it doesn't require using XML just to avoid writing a parser. (This may not be a bad thing; Lispers have done this for decades.)
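The same dump-then-evaluate round trip can be sketched in Python, with `pprint.pformat` playing roughly the role of Data::Dumper and `ast.literal_eval` standing in for eval. This is an analogy rather than the Perl idiom itself, and the `config` data is made up for the example.

```python
import ast
from pprint import pformat

# Hypothetical data to round-trip.
config = {"host": "localhost", "ports": [8080, 8081], "debug": False}

# Serialize: pformat emits the structure as a Python literal.
text = pformat(config)

# Deserialize: literal_eval accepts only literals, so unlike a bare eval()
# it will not run arbitrary code that sneaks into the input.
restored = ast.literal_eval(text)

print(restored == config)  # prints True
```

The restriction to literals is the design choice worth noting: handing untrusted input to a full evaluator is a much bigger risk than handing it to a parser.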

So, Perl and XML aren't joined at the hip simply because they don't need to be.

08:18 PM

Some skills are just necessary

Knowing how to build a compiler is certainly one of the skills on this need-to-know list. Compilers are fundamental to what we do every day as a programmer. Knowing how the compiler works will let you make intelligent decisions about program structure, decisions that have real impact on the quality of our programs. More to the point, most programs have to parse input (either from a human being or from a machine) and make sense of it. To do that, you have to build a small compiler. Corrupting XML for this purpose, simply because you happen to have an XML parser lying around, is inappropriate at best.

Basically, you’re selfishly making your life easier at an enormous cost to everyone else. For every hour you save, you’re subjecting every one of your users to many hours of needless grappling with overly complex, hard-to-learn, hard-to-maintain, impossible-to-read, XML-based garbage. This is no way to make friends and influence people.

There are lots of good examples where XML is the perfect tool, and many, many examples where XML is just an easy way to avoid writing a parser. And there are many XML languages that scream poorly-designed-vocabulary-designed-to-simplify-implementations.
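For a sense of how small the "small compiler" can be, here is a rough sketch of a recursive-descent parser that evaluates integer arithmetic as it goes. The grammar and the function names (`tokenize`, `parse`) are my own choices for the illustration, not anything from the quoted article.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Evaluate +, -, *, / and parentheses over non-negative integers."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():  # expr := term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():  # term := factor (('*' | '/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():  # factor := NUMBER | '(' expr ')'
        token = advance()
        if token == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


print(parse("2 + 3 * (4 - 1)"))  # prints 11
```

That is a tokenizer, a grammar and an evaluator in about sixty lines, with error reporting that an XML schema would not give you for free.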

Of course, parsing and compiling aren't the only skills a programmer needs to know. At $WORK, we're looking at database validation -- auditing a database to prove that all referenced records are present, and all records are referenced properly. (This is above and beyond the guarantees that referential integrity can enforce.) The process sounds complex, up until you realize that it's just a mark-sweep garbage collector, more or less.
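A rough sketch of the idea, assuming the database can be viewed as a mapping from record ids to the ids each record references. The `audit` function, the `records` layout and the sample ids are all hypothetical; this is not the code we run at $WORK.

```python
def audit(records, roots):
    """records maps an id to the ids it references; roots are entry points.

    Returns (dangling, orphans): ids that are referenced but missing, and
    ids that exist but are unreachable from any root.
    """
    marked, dangling = set(), set()
    stack = list(roots)
    # Mark phase: follow references outward from the roots.
    while stack:
        rid = stack.pop()
        if rid in marked:
            continue
        if rid not in records:
            dangling.add(rid)  # referenced, but the record is absent
            continue
        marked.add(rid)
        stack.extend(records[rid])
    # Sweep phase: anything never marked is an orphan.
    orphans = set(records) - marked
    return dangling, orphans


records = {"order1": ["cust1", "item9"], "cust1": [], "item3": []}
dangling, orphans = audit(records, ["order1"])
print(dangling, orphans)  # prints {'item9'} {'item3'}
```

A real collector would free the orphans; an audit just reports both sets.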

Computer Science degrees may seem like a bunch of BS at the time, but they're actually quite necessary. You never know when you're going to walk into a problem that has a textbook solution.

Shared Nothing is in your future

Let's be clear that the JVM has been lurching toward a shared-nothing model already for well over five years.

The servlet model, the EJB model, the Jini model... these are all in various ways attempting to provide semi-shared-nothing models so we can each contribute our own isolated parts to the whole. Unfortunately they're each anachronistic, one-off, partial solutions. JSR 121 is attempting to make this even more general for all kinds of pojo scenarios.

Given the isolation model that just comes for free with Erlang, the vast majority of the complexity and variation of these lurches toward isolation for Java would just go away. (Erlang itself has a *little* cruft but that's nitpicking.)

Haskell pushes you in this direction, because "shared nothing" is the default state; making a multithreaded program is somewhat easy, and making a shared memory multithreaded system actually takes a decent amount of effort and wizardry. PHP also pushes developers into a "shared nothing" model, which is rather telling.

Even if Erlang isn't in your future, chances are that lessons from Erlang are in your future. There's something to be said for high performance, highly scalable systems that are ready for massively multiprocessor (and multi-node) systems.

05:40 PM

HOP in Action

Of course he's a little biased, but the examples are screaming out for (a) iteration over a lazy list, and (b) extracting a socket I/O pattern from repetitive code, and using a closure generator to connect a generic server to a simple function to manage requests.
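Both patterns fit in a few lines. The sketch below is a Python approximation that uses in-memory streams in place of real sockets so it stays self-contained; `lines`, `make_server` and `shout` are names invented for the illustration.

```python
import io


def lines(stream):
    """Lazily yield stripped lines; nothing is read until a caller asks."""
    for line in stream:
        yield line.rstrip("\n")


def make_server(handle):
    """Closure generator: return a serve function wired to `handle`."""
    def serve(reader, writer):
        # The generic I/O loop lives here, once.
        for request in lines(reader):
            writer.write(handle(request) + "\n")
    return serve


# The mundane application code is just a function from request to reply.
shout = make_server(str.upper)

out = io.StringIO()
shout(io.StringIO("hello\nworld\n"), out)
print(out.getvalue(), end="")  # prints HELLO and WORLD on separate lines
```

Swapping in a different handler gives a different server without touching the loop.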

(OK, the second example is a mouthful to describe, but the intent is to separate the wizardly bits from the mundane application code, and it succeeds admirably in that regard.)

Wednesday September 27, 2006

10:04 PM

Lisp is dying. Film at 2300

Before Paul Graham, Lisp was dying. It really was, and let's not get all sentimental or anything; it's just common sense. A language is always either gaining or losing ground, and Lisp was losing ground all through the 1990s. Then PG came along with his "I'm not talking to you if you're over 26 years old" essays, each a giant slap in our collective face, and everyone sat up and paid attention to him in a hurry. And a TON of people started looking very seriously at Lisp.

Lisp might or might not have experienced a revival without Paul's essays, but it's moot: he showed up, and Lisp got real popular, real fast. And then he said: "Don't use it!" Sort of. I mean, that's effectively what he said, isn't it? By deciding to pre-announce Arc, he Microsofted Lisp. Killed it with vaporware. It's a great strategy when you're an evil empire. I don't think that's exactly what Paul had in mind, but let's face it: that's what happened.

Ovid, and Coding Standards

The first big problem comes in defining “standard practices”. Any Perl code which doesn’t run under taint mode is immediately suspect. Buffer overflows using untrusted data should not be tolerated. Home brewed encryption? Out. [...] But there are problems there. Any of the aforementioned “issues” could potentially be defended. Someone has to be the first person to try a new encryption method. Also, there are too many other areas where standard practices is a terribly ephemeral thing. It’s not a problem easily solved.

Sorry, Ovid, but you're using a strawman to tear down your main point. Which, from a rhetorical perspective, is rather odd.

The problem does boil down to defining standard practices. Anyone who violates those standard practices, either out of malice, negligence or ignorance, is guilty of malpractice. Period.

There is no loophole for homebrewed encryption. There is no loophole for being the first to use a brand new encryption algorithm. And this loophole cannot be used as proof by induction that any new endeavor needs an exemption from the strictures of good practice.

Why? Because cryptography is a branch of information theory, which is a branch of mathematics. If you set forth to build a new encryption system, you need to follow the rigorous mathematical practices for designing encryption, not the lackadaisical hacking process of running rot13 over a stream of input an even number of times and declaring it "encrypted".
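In case the rot13 joke needs spelling out: rot13 is its own inverse, so an even number of passes hands back the plaintext. A quick check using the `rot_13` codec from Python's standard library (the `rot13` wrapper and the sample message are just for the illustration):

```python
import codecs


def rot13(text):
    """Rotate each ASCII letter 13 places; other characters pass through."""
    return codecs.encode(text, "rot_13")


message = "Attack at dawn"
once = rot13(message)
twice = rot13(once)

print(once)              # prints Nggnpx ng qnja
print(twice == message)  # prints True
```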

In fact, Phil Zimmermann of PGP fame did this a few times before he sat down with a cryptographer, who showed him exactly how weak his homebrewed encryption schemes were. So Phil gave up and just did a plain old public key cryptography system. Ignoring the body of work on strong crypto systems would have been malpractice out of ignorance.

Similarly, when NIST was poking around for something to replace DES, they didn't throw a half dozen homebrewed algorithms against the wall and hope for the best. They used the best practices for developing crypto algorithms (publishing papers, formulating attacks, detailed proofs, etc.), and determined that AES was good because it was provably strong enough to replace DES for the next few years.

So there are no loopholes when it comes to enforcing good practice. When Robert Jarvik developed the artificial heart, he didn't get a free pass from his medical responsibilities because no one had built an artificial heart before. Instead, he was still bound by his ethical obligations as a physician before experimenting on a human subject.

The same thing goes for other licensed professions, like engineers and lawyers.

The difference between licensed professions and software developers is that there is no agreed "standard body of practice" to draw from, nor is there a "best standard practice" that practitioners must uphold or be found guilty of malpractice. For example, what exactly belongs in the "standard body of practice"? Database design? Stored procedures? Race conditions? Taint mode? Class hierarchy design? Design pattern abuse? Secure data handling? Good crypto? Dropping permissions? Numerical analysis? Testing?

Is that list complete? Is it all taught in a 4-year degree program? Reliably?

Are software developers certified against that body of practice? Do we sit for the equivalent of a bar exam, medical boards, or engineering certifications?

Until we have a consensus view on all of these issues as an industry, any talk of software malpractice is premature at best, misleading and distracting at worst.

</rant>

Monday September 25, 2006

01:13 PM

Tcl is dead. Film at 2300.

Newbie: Hi! I'm in computer science at the local university. My
grandfather told me that I should learn Tcl and Tk because it'll allow
me to do all kinds of cool stuff. But when I mention that to my friends
at school, they all laugh at me and tell me that Tcl is totally obsolete
and for losers. Now I'm worried! Is Tcl dying out?

Jeff sighs, and picks up a desk calendar from the shelf behind him. He
looks down, quietly says "right on time", and puts a big check mark next
to a handwritten note on the current day. He then makes another entry
on the calendar three months later.