Arguing for code generation, I am looking for some examples of ways in which it increases code quality. To clarify what I mean by code generation, I can talk only about a project of mine:

We use XML files to describe entity relationships in our database schema, so they help us generate our ORM framework and HTML forms which can be used to add, delete, and modify entities.

To my mind, it increases code quality because human error is reduced. If something is implemented incorrectly, it is broken in the model, which is good because the error might appear sooner since more generated code is broken too.
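To make the setup above concrete, here is a minimal sketch of a model-driven generator in Python. The XML layout (`entity`, `field`, `name`, `type`) and the function names are assumptions made for illustration; they are not the schema or tooling of the actual project.

```python
import xml.etree.ElementTree as ET

# Hypothetical model format; the real project's XML schema is not shown here.
MODEL = """
<entity name="Customer">
  <field name="name" type="str"/>
  <field name="age" type="int"/>
</entity>
"""

def generate_class(entity):
    """Emit Python source for a plain class with one attribute per field."""
    fields = [(f.get("name"), f.get("type")) for f in entity.findall("field")]
    params = ", ".join(f"{name}: {typ}" for name, typ in fields)
    lines = [f"class {entity.get('name')}:",
             f"    def __init__(self, {params}):"]
    lines += [f"        self.{name} = {name}" for name, _ in fields]
    return "\n".join(lines) + "\n"

def generate_form(entity):
    """Emit an HTML form with one input element per field."""
    rows = [f'  <input name="{f.get("name")}">' for f in entity.findall("field")]
    return "<form>\n" + "\n".join(rows) + "\n</form>\n"

entity = ET.fromstring(MODEL)
class_source = generate_class(entity)  # source text of the entity class
form_html = generate_form(entity)      # matching HTML form
```

Both outputs derive from the same model, so a mistake in the model shows up in the class and in the form at once.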

Since I was asked for a definition of code quality, let me clarify: what I meant is software quality.

Software Quality:
It is not one attribute but many, e.g. efficiency, modifiability, readability, correctness, robustness, comprehensibility, usability, portability, etc., which affect each other.

@EmmadKareem I added a short definition of it on the original question.
–
platzhirsch Mar 31 '12 at 12:09

I think that automated code generation will help increase consistency and uniformity in your code. In some cases that does increase quality but I don't think it's a catch-all.
–
joshin4colours Mar 31 '12 at 14:49

10 Answers

Code generators cannot generate better code than the person who wrote the generator.

My experience with code generators is that they are just fine as long as you never have to edit the generated code. If you can hold to that rule, then you're good to go. This means you can reliably re-generate that part of the system with confidence and speed, automatically adding more features if needed. I guess that could count for quality.

I once heard an argument for code generators that a single programmer can produce so-and-so many lines of code per day and with code generators, they could produce thousands of lines! Obviously that is not the reason we are using generators.

+ the italicized section a thousand times. Code generators should function like your compiler or the C++ template mechanism: you should never have to manually edit their output. The only time you should ever read the output is if you suspect a bug.
–
anon Mar 31 '12 at 23:38

@anon: Humans should not generally edit the output of code generators or compilers, but it can sometimes be perfectly reasonable to have a build process which entails running one piece of machine-generated code through a program which applies some modification to it. There are also occasions when it may be necessary to have a human hand-edit the output of a build process if it is necessary to patch fielded code while changing a minimal number of bytes, but when code is tweaked by hand in such fashion one should also archive all files from the build process (not just source!) and...
–
supercat Mar 26 at 15:40

...also update the source code to match the semantics of the hand-edited object code.
–
supercat Mar 26 at 15:40

I totally agree with those who say that code generation is fine as long as you never have to edit (preferably, never have to look at) the generated code.

If we can accept that the generated code is approximately the same number of lines as hand-written code, and if we can say that it is bug free, then the number of lines which might potentially contain bugs has decreased. Ergo, code quality ought to have increased.

Addendum: of course, other factors, such as execution time, may play a role.

Personally, I have written quite a few code generators, but never as an initial approach.

It has always been when I noticed a repetitive pattern in existing code, so my generator takes some existing code when adding new, but similar, code and parametrizes some variable parts of it.

To that extent, my generated code is almost identical to existing hand-written code (except that it tends to be better laid out visually and more uniform, which I find aids legibility, if it ever does have to be looked at).

Btw, I advocate inserting opening/closing comments which indicate that the code was generated, including details of the tool and its maintainer.
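As a rough sketch of such markers, assuming the generator emits Python and using a made-up tool name and maintainer address:

```python
def wrap_generated(body, tool, maintainer):
    """Surround generated source with opening/closing marker comments."""
    header = (f"# BEGIN GENERATED CODE: {tool} "
              f"(maintainer: {maintainer}). Do not edit by hand.")
    footer = f"# END GENERATED CODE: {tool}"
    return f"{header}\n{body.rstrip()}\n{footer}\n"

# Hypothetical tool name and contact, for illustration only.
wrapped = wrap_generated("x = 1\n", "gen_entities.py", "tools@example.invalid")
```

Anyone who opens the file then sees where the generated region starts and ends, and whom to ask about the tool.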

In addition to Martin's answer, I would add that SQL code generation is very good when you work on a record-by-record basis (select * from tab1 where tab1.pkcolumn = :parameter, update tab1 set [any number of columns] where tab1.pkcolumn = :parameter, etc). And your ORM will shine in that scenario, because the SQL that needs to be generated is indeed repetitive.
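A minimal sketch of that kind of repetitive, record-by-record SQL generation might look like the following. The helper names are hypothetical, and naming each bind parameter after its column is an assumption made here, not something any particular ORM does:

```python
def select_by_pk(table, pk):
    """Generate a single-record select keyed on the primary key."""
    return f"select * from {table} where {table}.{pk} = :{pk}"

def update_by_pk(table, pk, columns):
    """Generate a single-record update for any number of columns."""
    assignments = ", ".join(f"{col} = :{col}" for col in columns)
    return f"update {table} set {assignments} where {table}.{pk} = :{pk}"

select_sql = select_by_pk("tab1", "pkcolumn")
update_sql = update_by_pk("tab1", "pkcolumn", ["name", "price"])
```

The table and column names are interpolated directly, so a generator like this should only ever be fed identifiers from a trusted model, never user input.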

My main worry is metaqueries - queries on an object's properties that the ORM translates to SQL using whatever algorithm. Very similar metaqueries can generate completely different SQL - and there is no guarantee that the generated SQL performs well.

Consider the chain: a metaquery language is translated to another language (SQL), which is in turn translated to a query plan that actually executes the data gathering. And the generated result must be objects, so the ORM must instantiate the affected objects - which can trigger another rain of queries to fill the attributes of the objects not brought by the metaquery itself...
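The "rain of queries" can be illustrated without any ORM. The sketch below uses Python's built-in sqlite3 module with made-up `author` and `book` tables; the `run` helper is a stand-in for whatever an ORM would issue behind the scenes, and simply counts statements:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table author (id integer primary key, name text);
create table book (id integer primary key, author_id integer, title text);
insert into author values (1, 'A'), (2, 'B'), (3, 'C');
insert into book values (1, 1, 'a1'), (2, 2, 'b1'), (3, 3, 'c1');
""")

query_count = 0

def run(sql, params=()):
    """Execute one statement and count it."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()

# Lazy loading: one query for the authors, then one more per author.
authors = run("select id, name from author")
lazy = {name: run("select title from book where author_id = ?", (author_id,))
        for author_id, name in authors}
lazy_queries = query_count

# The same data fetched with a single hand-written join.
query_count = 0
joined = run("select a.name, b.title from author a "
             "join book b on b.author_id = a.id")
join_queries = query_count
```

With three authors the lazy path issues four statements where the join issues one, and the gap grows with the number of rows.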

I'd argue the opposite -- presuming you are writing interesting applications, code generation decreases code quality. The nature of code generation rewards very cookie-cutter, overblown, over-specified frameworks that become very hard to deal with without continual reliance upon the code generation tool to generate bigger, more complex, uglier bunches of code. While it can be a good tool, it really shouldn't be the primary tool in the box.

Agreed, the garbage that comes out of some ORMs (garbage from the standpoint of someone who knows how to write well-performing database code) is a good example. It often sort of works enough for someone who doesn't know what they are doing to think it works. And new programmers don't get the skill to do the harder stuff that needs to be done outside the generator because they don't understand basic concepts.
–
HLGEM Apr 25 '11 at 21:29

Ooh, +1 or -1.... on one hand code generation is very useful to remove boringly repetitive code where you have a definition that is simply expanded into code, but then you're right that it gets overused into all manner of 'time saving' complexity that ends up as an anti-pattern in itself.
–
gbjbaanb Mar 26 at 15:14

and that the generated code transparently maps back to the original, so you do not have to debug on generated code.
–
user1249 Apr 2 '12 at 14:20

@Thorbjørn: I agree. On one app I've had to maintain there is generated Fortran. The need to be able to debug it was lost along the years, and I'm the only one stupid enough to still be around to field the service calls :)
–
Mike Dunlavey Apr 2 '12 at 14:32

I disagree that the code generator should be flexible. It needs to be targeted - do one thing well, not lots of things. It should take a small, well-defined input and write a chunk of code for you. When it starts to be the program, it's headed for failure.
–
gbjbaanb Mar 26 at 15:16

@gbjbaanb: I agree. That's why I said enough flexibility. To me, the issue is not the code generator itself, but the domain-specific-language that serves as its input. If that DSL is too flexible, the user has to swim around in options. If it is not specific enough, the user has to work around its limitations. I can give examples of these.
–
Mike Dunlavey Mar 26 at 16:08

The code generation rules are written once; they are not hard coded for every instance of code generated, and thus reduce the potential of human error in copy/pasting the content with slight modifications.

Unless you have to edit the generated code - which is not DRY at all.... I had to do that recently - it's not pleasant at all. If I have to manually edit an autogenerated code base again, I'll charge thrice!!!
–
Fabricio Araujo Apr 3 '12 at 19:14

You should not have to ever edit that code; edit the code that did the generation itself and augment it with additional rules if necessary. Editing generated code should be the last resort.
–
earlNameless Apr 8 '12 at 22:35

I would like to have that choice.. I didn't.
–
Fabricio Araujo Apr 9 '12 at 17:51

I used to work in a shop that relied on code generation heavily. In my mind it made the code for the project very uniform. And in that respect, the quality was OK.

However, when you are no longer allowed to write custom code because everything has to go through the generator then I think you lose some of the edge of being a programmer.

So I think this is a double-edged sword for sure. Yes, generators are great because they reduce errors and raise code standards; however, they also make "some" of the programmers dumb, because they rely on the generators instead of having to get their hands dirty.

Assembly language programmers used to say this about compilers. So I'm not sure this is a great argument. Being forced to get your hands dirty can be a good learning experience, but once you've learned you should use the most productive tool available.
–
MarkJ Mar 30 '12 at 18:59

@MarkJ: Sometimes assembly can actually be better than a compiled language for uniformity. For example, in some embedded systems it's useful to be able to code the equivalent of x=someValue ^ 0xFF ^ 0xFF ^ 0xFF ^ 0xFF; and have it be coded with four XOR-literal instructions. If the code storage medium can only write blank (0xFF) bytes, the above construct will allow four arbitrary changes to the value. Even if one rewrote the expression as x=someValue; x = x ^ 0xFF ^ 0xFF ^ 0xFF ^ 0xFF; and the compiler evaluated all the xors at runtime, it might still use a "complement all bits"...
–
supercat Mar 26 at 15:52

...instruction rather than an xor-immediate.
–
supercat Mar 26 at 15:52

Code generation doesn't affect code quality, per se, so much as code consistency.

Generated code will be consistent between instances of generation. If the generator is designed to emit good quality code, then the generated code will be of consistently good quality. If, however, the code generator emits bad quality code, then you'll get consistently bad code.

Code generation may also be used to build code faster. Faster, however, does not mean better... It could just mean you get your bad quality code that much quicker.

I would say in your case it might increase quality a little bit, but it reduces development time by a lot. Sometimes the generated code is flaky, awkward, or just plain bad. In those cases, the generated code can decrease quality and add more testing / fixing / regression testing time to the project. And some tasks are just too complex to be easily generated - the generator becomes a whole separate system (possibly bigger and more complex than the main project) unto itself.

I think automated code generation and code quality are somewhat orthogonal and do not necessarily correlate.

Code generation is merely a way to solve a specific technical task. Whether it results in increased code quality very much depends on what you're doing.

Your situation is a good example of code generation resulting in increased code quality through early detection of potential errors.

I can give you another example, where automated code generation diminishes code quality. It's our almighty ASP.NET WebForms. It does automated code generation by translating a hierarchy of UI controls into HTML markup, which is anything but stable, predictable and manageable.

To draw the conclusion, automated code generation can help increase code quality when used properly.