
nickirelan writes "Why Learning Assembly Language Is Still a Good Idea by Randall Hyde -- Randall Hyde makes his case for why learning assembly language is still relevant today. The key, says Randall, is to learn how to efficiently implement an application, and the best implementations are written by those who've mastered assembly language. Randall is the author of Write Great Code (from No Starch Press)."

It is a big deal when you can do it without root permissions. Running a webserver on Linux that allows clients ssh access? Oops, one of them can take down the whole box. You're right that it's not very on-topic, though...

"This bug is confirmed to be present when the code is compiled with GCC version 3.3 and 3.3.2 and used on Linux kernel versions 2.4.2x and 2.6.x. It has been tested to work on, and crash, several lame free-shell provider servers."

If it affects all 2.4 and 2.6 Linux kernels, I wouldn't call the affected servers 'lame'. Especially not free-shell provider servers. What's lame is testing a local exploit on a public shell server.

Just because you have access to source code does not mean you can do source-level debugging. Under Windows at least, the target binary must be built correctly, the correct symbols must be available, the source used to actually build the target must be available, etc.

While I was at Microsoft, most of my debugging was done using the console-based debuggers: i386kd/alphakd/etc for kernel-mode, and cdb/ntsd for user-mode. For many years, these debuggers were incapable of any form of source-level debugging, so we did without.

Knowing how to read disassembled code in the debugger and match it up with source code is a vital skill, far more important than the ability to write assembly language from scratch.

Disclaimer: I learned to debug before I learned to code.

With extremely few exceptions, machine code performs exactly as advertised. When things are not exactly as they should be, it helps to be able to see exactly what is going on.

Performance is much more a matter of structure (exponential complexity) than language (poor linear complexity). As to level, "high level" languages limit you to their implementation of a few concepts. Depending on where the heavy lifting is, Perl could easily outperform optimized C.

Depending on where the heavy lifting is, Perl could easily outperform optimized C.

But if Perl is written in C, wouldn't that mean that Perl can never be "faster" than C?

To put it more concretely, couldn't I just write a program in C that does EXACTLY what the Perl program does, down to the last data structure? And if I did, wouldn't that mean that Perl can't ever (theoretically) be faster than C?

You could even take it a step further. You could write an exact duplicate of the Perl program, leave out the parts of the Perl interpreter that you don't use, and you probably would end up with an overall faster program. Thus, in most cases, C could trump Perl.

Actually, there's a Perl module that does this. But if you wanted to write said C program from scratch, it could take you a while--writing the Perl program would be much faster. And then there's the choice of algorithms involved. Have fun writing your own sorting functions in C, your own regular expression library in C, your own arbitrary precision number library in C, your own reference-counting garbage collector in C, your own closures in C...

Of course, there are other pre-existing C libraries you could use that do all of these things, but there's no telling whether or not they're faster than what Perl uses internally, and since you're re-implementing everything in C anyhow, you might as well just write your own!

So the point I'm trying to make here is this--Perl is convenient because it has all these things written for you and integrated together. Sure you could write a C program that does the same thing, but you'd end up re-inventing the wheel many times over, and you'd have to work hard to make it a better wheel.

Or you could target Parrot instead; for a lot of things it's already faster than Perl. It also has a JIT compiler, so who knows--it might generate some code here and there that's faster than what your C compiler generates. :)

That seems to be a bit of a straw-man argument, arguing against the use of libraries in C, but for them in Perl.

All things being equal, any CPU-bound program written in C is highly likely to be faster than the same in Perl.

The reason I write lots of Perl and hardly any C these days is because virtually everything I write is I/O bound and just doesn't need the performance. I may as well use Perl and save myself a heap of time.

Actually, HP's Dynamo project showed how JIT'ing machine code into itself (like JIT'ing x86 code INTO x86 code) can give you speed advantages. The reason is that the JIT can look at the runtime characteristics of the code and do optimizations based on that. It can move code so it's on the same page as other code that is called in sequence, and optimize branches to improve branch prediction and save the pipeline.

Your point is taken. However time is the most valuable resource of the programmer. A tool which can provide the fastest solution in both development and execution can be an attractive option. Computer resources are meant to be spent, but a programmer's should be invested.

As tools have matured, I think we can see that the emphasis on time has only increased. Reusability and abstraction for more reliable interfacing are important OOP goals. Standard and broad libraries, IDEs, GC, etc. are all geared toward saving someone time. There are less obvious perspectives to this paradigm as well - low-cost maintenance and error-free code, for example.

I think a lot of programmers have gotten into the habit of sacrificing time for performance, and vice versa. In some form, this will always be true. However it is an asset of the programmer to have the choice of exactly what component of his project he should spend the most time on. Hopefully, this is the design.

But if Perl is written in C, wouldn't that mean that Perl can never be "faster" than C?

Superficially, that seems an obvious truth, but it doesn't necessarily hold in practice for several reasons:

By definition, high-level languages have more expressive features than a glorified assembly language. Those features allow the programmer to express certain concepts directly, in a form whose semantics the compiler can understand. That in turn may allow the HLL compiler to optimise the output code it produces in a way that the LLL compiler couldn't. Consider the number of optimisations that no current C compiler can perform because of the risk of variable aliasing, which simply can't happen in some high-level functional languages because effectively there are no variables, only immutable values.

Moreover, in practice, pretty much nobody goes to the effort to implement an optimised high-level language feature in a C program unless they're writing a compiler for that HLL anyway.

Finally, there is the issue of run-time optimisation, which is to an extent just a variation on the themes above. So-called just-in-time compilers can potentially look at the run-time behaviour of a program, and dynamically adjust the executable code to better handle what is really happening, with the real data being provided. How else do you think an overweight beast like Java now gets performance at least comparable to the C and C++ world? Since no compiler can ever know what real data will be provided, the only way to achieve this effect in a LLL like C is to write your own virtual machine, and effectively create your own new language with a JIT compiler. Not a lot of projects do that, because it's insanely complicated.

In other words, with today's compiler technology, and more importantly today's run-time environments, C is no longer automatically the king of performance, and it is both theoretically and practically possible for much higher level languages to outperform even hand-optimised compiled C code.

Of course, the price you pay is the initial overhead of the JIT compilation process, usually when a program first loads. However, this is one area where rapidly increasing hardware speed really tells, because it directly reduces the overhead of that bootstrapping process, so the playing field gets more level the faster hardware gets.

I expect traditional, compile-only technologies to fade into the background over time; in the programming language "performance vs. safety+power" spectrum, they aim at a target nobody will need to hit any more. There will always be a need for LLLs, if only to write the underlying platforms to support HLLs, but for regular application development, their days are numbered.

The parent's point was that if Perl takes a second, then perfectly-optimized C cannot take more than a second, because the Perl interpreter is written in C. That said, it's a fairly moot point, because you could probably have written 500 Perl programs in the time it takes to optimize that one C prog.

No, you can't exactly debug a FUBAR'd memory stack with just printf. Maybe in your hello-world program, but not when things get complicated. :) Trust me, I know. I'm writing a rather large network application at the moment, and somewhere along the line I must've overshot an array. One mistake can ruin a whole application, and printf'ing won't help you.

Learning how to debug is just as precious to a programmer as learning how to code.

This is called a heisenbug [astrian.net], in case you are wondering. They occur mostly due to a smashed stack and are indeed damn hard to track.

You can of course use assembly to track the bug, but I myself find that tedious. If you are programming in plain C (and not C++), you can run lint, a tool that statically checks source code, very often. When lint reports no more possible problems, you are done.

If you happen to use C++ you'll probably have to shell out big bucks for a linter or be out of luck because there are only commercial linters available.

Though, that's why I always have a Linux system with valgrind [kde.org] available, which is among other things a memory debugging tool (unfortunately valgrind does not work on any of the BSDs). Valgrind will scream and give a stack backtrace when your program does something wrong - be it an off-by-one error, be it memory being read uninitialized, or whatever. A truly ingenious tool.

"No, you can't exactly debug a FUBAR'd memory stack with just printf. Maybe in your hello-world program, but not when things get complicated. :) Trust me, I know. I'm writing a rather large network application at the moment, and somewhere along the line I must've overshot an array. One mistake can ruin a whole application, and printf'ing won't help you."

Okay, I'm also writing a large network application, and I find printf statements very helpful indeed. It's Windows, so my main debugging tools tend to be message-boxes and fprintf to files. Even though there's a good "debugger" available, it's quite often at too low a level to see what's actually happening in the program.

One problem is running multi-process, multi-threaded code on several computers at once. Sure, debuggers can be made to work that way if you install Visual Studio on each machine you're testing on, but it can be inconvenient to say the least.

With print statements, your program can alert you in real time, using all the functions available in code but not in a debugger, and doesn't need to be carefully compiled as debug with the appropriate modules running in a debugger with the right break-points set. Just add a message box, and your program will tell you what it's doing.

Debuggers are great for examining memory structures, or for when you really don't know what the hell your program is doing, but for most purposes a few well-placed print statements and a logical series of tests can help you find the problem.

Also, don't forget, a good deal of programming is still done in assembly. Both in a job I've had coding stuff and in my current research (crypto), I did/do a lot of assembly programming.
Yes, learning assembly will make a better programmer out of those who will never code assembly again, but for some people, assembly is a valuable and often-used skill.

I agree. It sucks that a lot of programmers think learning assembly is just something they should do some day to gain better insight.

Personally I think the curriculum for coding should begin with asm, and the student should work his way up to the higher-level languages: C, Pascal, and finally Java or Perl.

Then it wouldn't be an afterthought that some things are actually easier to accomplish in asm (or most things, depending on your line of work). Not to mention the eventual nested, nested, nested loop that just needs to be optimized - it shouldn't be a roadblock for any programmer.

It's a shame that schools are phasing assembly classes out of their computer science curriculums. If anything, it makes for a great foundation on which to learn more modern languages while teaching students things about computers that they probably wouldn't take the time to learn otherwise.

I was fortunate enough to learn assembly on a few architectures. Admittedly all of them are microcontrollers (PIC, AVR, HC11), but it really does force you to understand how the machine is working.

Some of the more serious assembly languages still scare the hell out of me, though; have you ever looked at the assembly for the TI C6x DSPs? It took me quite a while to come to terms with something simple, and the C compiler can use the 8 parallel execution units better than I can...

Knowing something about the low level hardware / machine instructions and what compilers do to translate high level languages definitely helps programmers make better programming decisions.

Given the rise of byte code environments like Java and .NET and the sophisticated tools available for working with them, I would think schools would do well to teach a class on programming at that level. While obviously not a true assembly language, something like Java byte code is a lower-level look at how programs really work, and since Java is highly portable you wouldn't have to worry about what hardware the students used like you would with an actual assembly language course. I still think doing some work with assembly and getting some exposure to hardware architectures is best, but this might be a reasonable alternative.

I have to agree. I run into quite a few people looking for programming jobs who don't understand what the CPU has to do to execute their code. They do dumb things like multithread something that is CPU bound because they have no understanding of how what they write actually gets executed. Same thing with regard to data representation mistakes.

I'm not saying that everyone has to become a proficient assembly level programmer but I think a lot of people would be a lot better HOL programmers if they understood something about assembly language. I wonder how many Windows buffer overflow exploits are simply the result of someone not understanding that just because you can't express it in a HOL doesn't mean you can't exploit it from assembly code.

Not that I disagree with your point (actually I agree with it a lot! ;-) ), but I'd guess almost half the people in this world thinking in multithreading terms (definitely half the total IQ, no offense to anyone ;-) ) use it to solve CPU-bound problems on SMP boxes/multithreaded CPUs (Tera SMT; Intel's Hyper-Threading on a massive (ASCI) scale, things like that...)

When you just start throwing processors (or thread contexts) on the problem, you will find soon enough that any problem is I/O bound...;-)

I am assuming you said this to be funny, otherwise it is indeed a shame.

ECE (especially those with a heavy electrical engineering lean) people deal with microprocessors. Motorola chips have special features that you can't access with most C compilers and thus it is necessary to know assembly.

Also, until recently, finding a good C compiler wasn't cheap. Now, of course, there are free ones.

Coming from an ECE program without a microprocessors class in which you apply assembly will make you less competitive than graduates coming from schools where engineers are taught both practical assembly application and high-level languages.

Hate to say, but the kind of optimization you learn about by knowing
assembly language is just not necessary for most programmers these days.

I learned programming in the 80's, and I did learn assembly language, starting with 6502
assembly. I would subconsciously code my C so that it would produce faster
code. Every block of code I wrote would be optimized as much as practical. My
code was fast and confusing.

When coding Perl or Java I would keep in mind the details of the underlying
virtual machine so I could avoid wasteful string concatenation or whatever. I
cache things whenever possible, use temp variables all the time, etc.,
etc.

I've spent the last few years trying to UNLEARN this useless habit. There
is just no need. And in highly dynamic languages like Ruby, it's pointless.
You can't predict where the bottlenecks will show up.. almost every project
I've worked on has either had no performance problems, or had a couple major
performance problems that were solved by profiling and correcting a bad
algorithm.

Stuff like XP and agile development have it right: code as simply as
possible, don't code for performance, then when you need performance you can
drill down and figure out how to do it.

To me a beautiful piece of code is one that is so simple it does exactly
what it needs, and nothing more, and it reads like pseudo-code. Minimalism is the name of the game.

So my advice is, don't learn assembly language. Learn Lisp or another
abstract language. Think in terms of functions and algorithms, not registers
and page faults. Learn to program minimally.

On another note, the tab in my Konqueror for this article reads: "Slashdot
| Why Learning Ass...". Heh.:-)

Use of assembly doesn't preclude thinking in terms of functions and algorithms.
Like nearly any form of programming, it pretty much requires such thought -- in abundance.
But given that I have a limited amount of attention to spend on each line of code, focusing on registers, branches, instruction sets and memory layout takes away from time better spent on clarity, modularity, and algorithmic sophistication.

It's much easier to take code written for clarity and correctness and make it fast, than take code written for speed and make it clear and correct.
That's what profilers and coverage tools are for.
Once you've measured, code your inner loops and bit-fiddling in assembler if you must, but only after your program is working and well-tested.

I have seen FAR FAR too many students in my various college programming classes who think nothing of calling functions with 15 parameters and copies of large data structures (not references) and other such things. I really think that assembly should be one of the FIRST things taught to future programmers. So many people I've run up against don't have any idea how computers work. Sure, things are "mentioned" in classes, but so much is lost on them. Something as simple as "passing 2 things is much MUCH faster/easier than passing 10" doesn't get taught.

By passing 10 things, their job is easy. That's all they see. They don't know about registers (other than that they exist and are sorta like variables on the CPU). So they don't know that to pass 10 things you might put some in registers but the rest will have to be passed in memory (which is slow), as opposed to putting everything in registers (if at all possible), which is faster (especially for simple functions).

The only problem with assembly is the catch-22 mentioned in the article: you have to do all sorts of "magic" to print out to the screen or read from the keyboard, which can be confusing. And it takes a while to get students up to the point where they can start to understand that magic. My school teaches assembly (sorta) on little 68HC12 development boards that have built-in callable routines equivalent to printf and such, so there is little voodoo involved, which is nice.

I'm not saying assembly is necessary, but I DEFINITELY think it's important for programmers to learn how things work under the compiler. I have seen FAR too many hideous bits of code that no one who understood the underpinnings of assembly would ever dream of writing.

By passing 10 things, their job is easy. That's all they see. They don't know about registers (other than that they exist and are sorta like variables on the CPU). So they don't know that to pass 10 things you might put some in registers but the rest will have to be passed in memory (which is slow), as opposed to putting everything in registers (if at all possible), which is faster (especially for simple functions).

SO WHAT! Programs should only be optimized if:

1) the program is doing stuff so intensive that it runs slow
or
2) It is being run all the time in the background by the system and can slow down the system as a whole.

98% of the time it just does not matter.

Easily readable code is FAR MORE IMPORTANT.

I have written code in more than half a dozen different languages, but my favorite language these days is Python. It runs ten times slower than C, but in most cases, it just doesn't matter. Most of the time, the code feels instantaneous.

SO WHAT! Programs should only be optimized if: 1) the program is doing stuff so intensive that it runs slow, or 2) it is being run all the time in the background by the system and can slow down the system as a whole. 98% of the time it just does not matter.

I agree with points 1 and 2, however, if you're doing any non-trivial programming, I wholeheartedly disagree with the 98% figure. Not every bit of code is a throw-away piece. If your code only runs occasionally, and it's not performance critical, then yes, make readability your main priority.

But a lot of times your code is performance critical, or at least will be in the future. Code has this tendency to stick around once it's actually working. Too often have I seen code that, when it was written, was not getting used a lot. It worked, so nobody thought anything of it. As they scaled the system up, that code became a major bottleneck and eventually somebody had to go back and optimize the hell out of it, or simply throw it out and rewrite it to be fast.

It's good to write readable code. But it's also good to write code that is reasonably optimized at the same time. No need to go to extremes, just don't do stupid things like passing around huge 4 kilobyte variables to functions and such (example I've seen). Pass a pointer instead. Or a reference. Just write smart code. You can still make readable code while making it optimal enough to scale pretty well. Only very, very rarely do you have something that needs to be super well optimized, and then you usually are better off writing the critical sections in machine code anyway.

Easily readable code is FAR MORE IMPORTANT.

Easy readability is far more important when that code scales to the level you need it to scale to. Readable code that doesn't actually work in the system you're trying to put it in is worse than useless.

What you are advocating is that the students should understand the cost of the code that they write and you are saying that understanding assembly is the way to do that.

But here is where you are missing the point.

It is not the only way and sometimes it is the wrong way.

With virtual machines and interpreted languages, knowing the machine code of the CPU becomes pointless. You need to know what is costly for _your_ language.

What I think that you _really_ meant was that your students should understand compiler technology. That is how you understand the cost of different language constructs in a way that is portable across compiled, interpreted and byte code languages. This is unfortunately not something that you can have beginners learn, it is more of a third year thing.

Also, I've programmed professionally for 10+ years in most programming languages known to man, and I agree with your parent poster. Write simple and sensible code. Optimize when needed. More often than not, you will find that the code is fast enough. If it isn't, then you saved so much time writing the code originally, that you can spend a lot of time optimizing the problem areas.

If you disagree with this approach (write simple code and optimize when needed) then I'm willing to bet that you've never programmed outside a university environment.

I must disagree. Without having any sources to cite, I would suggest the bulk of software developers are building business applications. You know - the non-computer-science stuff. Not compilers, not operating systems, not the latest whiz-bang game, etc.

A number of us are true computer science students, and we cut our teeth on assembly, so to speak. That being said, I disagree that it is necessary (or even good) to understand the machine at the low level. I have never done x86 development (the instruction set and memory models never made sense to me) and I have never seen the JVM byte code that I use daily. Nor do I care to.

If you're writing code that is supposed to be optimized for the machine, you've missed close to a decade of compiler development. Dealing with multiple pipelines, delayed branching, etc is best left to a machine. I have more pressing issues to solve - like delivering good software.

The compiler optimizations are pretty astounding today. The JVM run-time optimizations are amazing. My knowledge of hardware architecture is 20+ years old. I'll trust the compiler writers as well as the JVM designers.

The focus for the bulk of us is on maintainable applications that can be delivered "on time, within budget, blah blah blah." Illogical algorithms and/or writing code for the computer and not for the human don't help anybody. In fact, I'd probably just throw it out and start again - it's the fastest and least stressful way to deal with it.

The most important tool to hone and keep tuned is your mind. Those with good logical reasoning and critical thinking are going to do well. They are the ones *I* look up to.

I would suggest teaching unit testing (ie, JUnit) - including what to test and how to test correctly (both difficult topics) - and debugging skills (which I wished I had more of when I started) instead.

If you want to cover hardware, use a book like CODE (by Charles Petzold) to give people an idea of computer structure. Nothing more than that - and even that isn't required.

While the general idea you're expressing - that people should understand how computers work to program them - is a good one, I think you're missing the point to a large degree.

The things that are most important in performance are, generally speaking, algorithms. It's important to understand things like:

Cache - the computer operates faster on small working sets.

Algorithm efficiency - O(n log n) is better than O(n^2) for most problems.

Latency - network or disk round trips are bad.

Etc. The sort of thing you're proposing, with stuff like function call arguments, loop conditionals, etc. are micro-optimizations, and are very seldom worthwhile for programmers. Micro-optimizations are almost always best left to the compiler writers, who can, in effect, program them once and let everyone reap the benefits.

Consider your example in particular: A function with 2 arguments instead of 10 isn't really faster. First off, it's only slower on an x86 - many architectures have these things called registers, which you can use for things like function arguments. Second, those function arguments are spilled to the stack just before the function call jump. This means they're extremely hot in the D$, and will hardly even be any more expensive than a register to reload at all. Third, if you break the big function into a lot of little ones, you're incurring more call overhead and more pressure on the I$. Fourth, breaking the function up causes multiple copies of the function prologues and epilogues, which will easily overwhelm the register spilling cost. Etc. etc. etc.

In other words, those 10 arguments are only microscopically slower, and may even be faster!

In this case, the student should only avoid writing a function with 10 arguments if doing so makes the code clearer. The value of this sort of incredibly trivial micro-optimization is fundamentally dwarfed by the value of readable code - if the student can read her code, that means she'll have fewer bugs and can spend more time optimizing it. And that's what'll really make the program faster.

You should only consider worrying about optimizations at this level if you've already optimized your program at an algorithm level fully, have profiled it, and determined some particular pieces are extremely dense hot-spots that need to be improved by hand. But if you're doing that, you may want to consider recoding the hot-spots in assembly anyway.

These days, these sorts of hot spots tend to be media codecs, and the way to speed those up is to use SIMD instructions - which can only be used properly from asm. So even before worrying about these sorts of extremely tight micro-optimizations, you'd want to recode in assembly just to use the special vector instructions! And in asm, readability is even harder to obtain, so you'll probably avoid a lot of the sort of micro-optimizations a crazy compiler will do just so you can make sure the code works right.

That is exactly the point! Programming in assembly is another of these facets, and just as important as all the others. Instead of shouting at each other, maybe you should recognize that you're both right!

While the majority of the Unreal engine is C++, we often write assembly-code versions of critical functions for specific platforms. Of course this is done after the C++ versions are tried and tested, and the bottlenecks are identified.

To take full advantage of processor features like SSE or AltiVec you don't really have a choice.

For example, UT2004 contains SSE and AltiVec assembly versions of some of the vector and matrix manipulation functions, some of the visibility culling code, etc. The amount of work Dan Vogel put into this kind of optimization is one of the reasons that UT2004 runs better than UT2003 on the same hardware.

Learning assembly language is useful, as it's sometimes the right tool for the job.

So basically you don't have time to do it right, but you might have time to do it twice?

Learning assembly isn't all about optimization, either. Being familiar with how the machine works right down to the core will make you a better programmer, period. Personally speaking, it also helps develop that zen-like ability to "think like the computer", and that helps you program not just more efficiently but more effectively, since you can think things out better. You can't tell me you're not a better programmer for having been exposed to it... it simply changes the way you think about the machine.

It can also be argued that "beautiful" code has no bearing on performance. It's also this kind of "oh, performance isn't an issue anymore" and "make the source code pretty" thinking that's the reason we now need gigahertz+ machines with 128MB of RAM just to write a goddamn letter... it's really quite sad that so many programmers just let their applications fill the hardware vacuum they think their users will have, or should have, just because they didn't take an extra day to think about what they're doing and write their code a little more efficiently. =Smidge=

Much more useful in most systems is knowledge of the system components at a level higher than the CPU--details about how the OS works (scheduler, memory management, etc.), how the language you're programming in is designed (is tail recursion done without growing the stack?), and so on.

Of course, some of these, even if you don't HAVE to know assembly language to understand them, knowing assembly language makes it easier to understand. Most people who know assembly language have a much more concrete view of the differences between pointers and values. When you have personally had to think about whether to push the value or the memory location, when you have to think about which addressing mode you need to use in that situation, it makes the idea of pointers and stacks and calling conventions a TON more concrete. It also makes many of the ideas of sequencing and linearity a lot more concrete. This is something that I've found a lot of new programmers have difficulty with - they have trouble thinking in straight linear fashion, and assembly language absolutely forces you to think that way.

Anyway, that's the reason I wrote my book on assembly language. See my sig for more info. Randall Hyde actually wrote me a pretty good review on barnesandnoble.com. I got a good one from Joel Spolsky, too.

To me a beautiful piece of code is one that is so simple it does exactly what it needs, and nothing more, and it reads like pseudo-code. Minimalism is the name of the game.

From an old fart who likes assembly language: total agreement. Assuming the primary goal is performance, the blunt reality is that about 90% of the code is irrelevant to that performance. Any screwy tricks in that code, particularly attempts to "improve" performance, will have indirect deleterious effects on that performance.

I learned 6502 assembly in the 1980's on a Commodore 64. I even have it all imported into my system, into a few D64's full of software I wrote myself - to run in an emulator for old time's sake. (Gee, hard to believe that some of the programs are almost 20 years old now.)

I had a lot of the habits that you describe, and I now program simply in C++ for either Linux or XP.

However, I had run into some performance issues with certain critical loops that were executed millions of times, such as a loop that iterates through pixels in image processing, and I wanted to view the disassembly of it. I understood enough assembly to be able to optimize a tight loop in plain C, and verified that the compiler's assembly was just as good as handcoded non-MMX assembly. (Some compilers do an amazing job now.) The only way to improve the performance further in my case would have been to write MMX/SSE/SSE2 code for that 0.05% of the program, and even so, I deemed it not worth the effort.

Now, if you are talking about realtime video filters, such as deinterlacing and sharpening (think Adobe Photoshop style plugins executed for every interlaced video field, 60 times per second for NTSC, 50 for PAL), you still need SIMD assembly such as MMX/SSE/SSE2 for the matrix math if you want to do lots of video enhancement in realtime on a live video source.

One example program is the open-source dScaler project - dScaler Realtime Video Processor [dscaler.org]. You can do REALTIME sharpening filters, denoising filters, motion-compensated deinterlace filters, 3D-like chroma filters, diagonal-jaggie removal filters, etc., all of the above simultaneously, on a LIVE real-time video source from a cheap $30 PCI TV tuner card, on today's high end Pentium 4 and Athlon systems. None of this would be possible without assembly language. Now they are talking about adding realtime HDTV enhancement (1080 interlaced -> 1080 progressive). Run your cable/satellite/DVD box through a home theater PC running dScaler, hook the PC to your HDTV, and the live homemade "upconversions-on-the-fly" you see are shockingly better looking than the bad-quality upconverted video you watch on TV. (Important: don't use S-Video output; connect the VGA output directly to the TV using a component-output adaptor. It's 6 times sharper than S-Video. See AVSFORUM's Home Theater Computers Forum [avsforum.com] for more information about getting HDTV-quality video out of your computer, especially if your HDTV does not have a native VGA input.)

(For watching live realtime videoprocessed video, I don't recommend a $30 TV tuner card; power users like to get more expensive cards such as the approx-$250 PDI Deluxe, a Conexant 23882-compatible card that actually has a Y-Pr-Pb component input for computers! Supposedly better analog signal-to-noise ratio, better A/D converter electronics, better power filtering.)

The point is that you don't need assembly language most of the time, but there definitely are times when it's exceedingly, absolutely critical.

1) You are a programmer, and knowing how the computer functions is your job

2) Many of the high-level constructs are better understood when you know what it is they are trying to abstract. It will also keep you from doing stupid things like making everything in Java a BigNum or whatever that is.

3) The ideas of references and pointers are a lot fuzzier for programmers who never learned assembly language. The difference between a pointer and a value is harder to grasp.

4) Debugging is a lot easier when you know assembly language, because you know how the parts fit together. You understand what a calling convention is, you understand how memory mapping works, you understand how the stack works - you just can see the whole picture of how the machine is processing your data.

There are even some optimizations that you can still do in higher-level languages that you get from knowing assembly language. For example, in C, the first member of a struct is accessed faster because the compiler can just do straight indirect addressing rather than base pointer addressing. It might also convince you to rewrite your loops so they have a better chance of fitting entirely into the instruction cache. But even without these things, knowing assembly language is useful for the four reasons I outlined above. It's also useful for people who are having trouble learning to code, because it forces them to think on a much more exacting, step-by-step, concrete level.
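The struct-layout point above can be sketched in C. The struct and function names here are purely illustrative; the language-level fact is that the first member is guaranteed to sit at offset zero, which is what lets the compiler emit a bare indirect load instead of a base-plus-displacement address:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct for illustration. C guarantees the first member
 * sits at offset 0, so `p->x` needs no displacement -- just a plain
 * indirect load through the struct pointer. Later members need
 * base-plus-offset addressing, e.g. 4(%reg) on x86. */
struct particle {
    float x;   /* offset 0: bare indirect addressing */
    float y;   /* offset 4: base pointer + displacement */
    float z;
};

static int first_member_at_offset_zero(void) {
    return offsetof(struct particle, x) == 0;
}
```

Whether the displacement actually costs a cycle depends on the CPU; on modern x86 the difference is usually nil, so this is best treated as a historical micro-optimization.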

You know, when you're trying to get a game out the door that runs at 30fps and your competitor has a similar game running at 60fps because they coded their inner loops in assembly you begin to realize why optimization is important after all.

Come on - all that may be great, but what really matters is what gives the most bang (features) for the buck (cheapest)... and in the minds of the average CEO with 10 billion options, this is done by outsourcing to the lowest cost provider (India, China, etc.) regardless of code training, language, etc.

Right. Because after reading that article, I thought, "Boy, if *I* was a CEO, this is how my guys would program everything." Actually, he says in there that writing everything in Assembly is not for every project.

So what's great about it is that if someone is looking to take a step toward becoming a better programmer, this would be a good direction to check out so that you truly understand your code. If you are happy being a VB code jockey (or any code jockey for that matter), then so be it.

Also, given that modern optimizing C compilers can often optimize better than humans, it may make sense to embed critical sections of assembly into C code, and let the compiler optimize the rest...

In the past, I used to do a lot of assembly language programming, but would always end up being burnt by having to completely rewrite everything for a new CPU/graphics card. It's much more productive to write a generic algorithm in C/C++ and use the assembly output to identify where the optimisations can be made.

Misuse of high level languages such as Visual Basic, as well as off-the-shelf components for everything, has led to a level of code bloat in today's applications that is inexcusable.

You, sir, are insane. Much of my job involves pushing around regular expressions and hash tables (aka associative arrays aka dictionaries). I know several flavors of assembler on distinct hardware platforms (x86, 68k, 6502, MIPS) so I say this out of experience rather than fear of the unknown: I'd rather swallow my own tongue than write anything non-trivial in a low-level language.

Seriously, a lot of people who know what they're doing have provided a huge library of functionality for me to pick and choose from. If I need to write a GUI app, I'll do it in Python with GTK or QT bindings. I am competent to build it in assembler, but why? It wouldn't be portable, it'd shave a very small amount of size from the end product (most of the project's resources are likely to be spent in the GUI libraries and not the core of the program), and would take 20 times longer than necessary.

There are a very few areas where low-level languages make sense. I haven't touched any of them in years.

While I learned assembly, and found it useful for learning to understand exactly how the machines think, I'm not sure I agree with his basic premise. Namely, that great code (code that is well designed for its job, and easy to work with and under) is always the efficient code, in machine terms.

The machine thinks one way. A human thinks in another. Code that is well designed for easy updating, and extending, is code that is easy for a human to understand. If that is not the most efficient way for the machine to do it, that may be the price for 'great' code in this project. (The ideal balance depends on the project, of course. A kernel should be machine-efficient, for example.)

Efficiency in terms of coding is a wonderful art and I think it's still applicable today. Kernel-level routines, games, drivers, etc. all benefit from tight coding in assembly language.

But let's be honest here. Computer Science 101: an efficient algorithm coded in an inefficient way will always beat out an inefficient algorithm coded by hand in 100% optimized assembly. I'll put my crudely coded Javascript quicksort algorithm against your finely honed 100% assembly bubblesort algorithm any day. Not only will my algorithm beat the pants off of your algorithm, but I'll also code it in far less time and with way fewer debugging sessions than you would. Also, the higher-level the language, the better it is for security. How easy is it to introduce things like buffer overflows and array-out-of-bounds errors in assembly? How easy is it to do that in Java, C#, etc.?

So yes, writing in assembly language is still good and has its places. But let's keep it to those places, shall we?
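The quicksort-vs-bubblesort argument above can be sketched in C: a naive O(n^2) bubble sort against the libc `qsort` (typically O(n log n)). For large inputs the growth rate dominates any constant-factor gain from hand-tuned assembly. The function names are mine, not from any post:

```c
#include <assert.h>
#include <stdlib.h>

/* Naive O(n^2) bubble sort -- the algorithm the hypothetical
 * assembly expert is stuck with in this thought experiment. */
static void bubble_sort(int *a, size_t n) {
    for (size_t i = 0; i + 1 < n; i++)
        for (size_t j = 0; j + 1 < n - i; j++)
            if (a[j] > a[j + 1]) {
                int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
            }
}

static int cmp_int(const void *p, const void *q) {
    int a = *(const int *)p, b = *(const int *)q;
    return (a > b) - (a < b);
}

/* Both produce identical output on small inputs;
 * only the growth rates differ as n gets large. */
static int demo_both_agree(void) {
    int a[] = {5, 3, 1, 4, 2}, b[] = {5, 3, 1, 4, 2};
    bubble_sort(a, 5);
    qsort(b, 5, sizeof *b, cmp_int);   /* libc quicksort */
    for (int i = 0; i < 5; i++)
        if (a[i] != b[i]) return 0;
    return a[0] == 1 && a[4] == 5;
}
```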

I'll put my crudely coded Javascript quicksort algorithm against your finely honed 100% assembly bubblesort algorithm any day. Not only will my algorithm beat the pants off of your algorithm, but I'll also code it in far less time and with way fewer debugging sessions than you would.

You're on. After my exams are over, I'll code a bubblesort algorithm in assembly language. I wonder how large the dataset will have to be before you win? Mail me [mailto].

Yes, this really is the crux of it all, and I left that out. I participated in a very interesting challenge to generate 10 unique random numbers in a scripting language. The goal was minimal time. As it turns out, a simple array check of whether or not the number had already been included worked the fastest, due to the fact that you're generating only 10 numbers. As soon as it got up to 100 or more, the O(n^2) array approach broke down.

So for a small dataset, I'll award you your prize already. :) For a large, random dataset, I think I'd win out on that one. Check out this sorting algorithm demo [rit.edu] page (uses Java applets). Looks like the Shear sort kicks ass over all of them.

I'll take that bet... but since you choose the algorithms, I choose the architectures, and I choose a base-line PIC microcontroller. It has a 2-level deep hardware stack. Let's see your recursive Javascript code run on that.

Learning it is good, even if one never ends up actually seriously developing in it. It advances your ability to solve problems in a language-independent fashion and equips you to more rapidly model solutions to brand new problems, even in the languages that you may already know.

Developers smarter than you have spent decades building useful higher-level layers to speed up the development of complex code. You would be wise to leverage this incredible infrastructure for the 99.999% of projects that do not benefit from obsessively tweaking the finest details.

Knowing what assembly is and how it works is beneficial. Mastery of assembly is completely pointless for anyone outside of OS kernel, compiler construction and embedded development... which probably means you. Your time will be better spent figuring out how to make Java programs 10% faster most of the time.

For some reason I have a strange desire to teach someone 6502 assembly language. I'm not sure why that is, and the rational side of me knows that I'm never going to find anybody who's even half-interested in learning it. I think that perhaps the reason I want to teach it to someone is that it'd be nice to experience someone else coming to the realization of the power of abstraction, the awareness that so much is possible using such simple building blocks. And yes, I strongly believe that knowing how to program in assembly language (and how it relates to the underlying machine language) makes one an instinctively better programmer. And it's frikkin' neat. It's like driving a standard versus an automatic. You become one with the computer.

I've programmed a few embedded systems in assembly and it's not very fun at all.

To make matters worse, each CPU has its own instruction set, and special set of commands that you must learn before you can even sit down and start writing code.

With C++ or at least a C compiler, you don't need to worry about so many implementation details. You should only resort to assembly if you absolutely must have the performance. Maybe the author of this article forgets how difficult it is to debug assembly code, or how difficult it is to implement abstract concepts such as OO at such a low level.

I don't agree at all that writing "efficient code" necessarily creates better code. Writing "clearer" code is better from a quality standpoint.

We have compilers for a reason: to produce assembly code as efficiently as possible from a higher level language. 99% of the time, the compiler will optimize the code just as well as, or better than, you can.

I would still recommend learning assembly language to C++ programmers simply so they understand how the computer is actually working. But to require anyone to program in assembly requires a great deal of justification.

I think there is more at stake here than just writing efficient applications. For one thing, writing proper multi-threaded code often requires thinking at the assembly level. Many of my coworkers who are all high-level-language-only programmer types couldn't understand (until I explained it) how the Double-Checked-Locking Java example was broken.
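For readers who haven't seen it, the broken idiom the parent refers to can be sketched in C (the Java version fails the same way). Acquire/release publication is one standard fix; the variable names `storage` and `init_calls` are illustrative, not from any real codebase:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Naive double-checked locking:
 *
 *   if (!instance) { lock(); if (!instance) instance = create(); unlock(); }
 *
 * is broken because the plain store to `instance` can become visible to
 * another thread before the stores that initialize *instance. Publishing
 * the pointer with release semantics and reading it with acquire pairs
 * the orderings correctly: */
static int storage;                        /* stand-in for the real object */
static _Atomic(int *) instance = NULL;
static int init_calls = 0;

static int *get_instance(void) {
    int *p = atomic_load_explicit(&instance, memory_order_acquire);
    if (p == NULL) {
        /* (a real version would also take a mutex here and re-check) */
        storage = 42;                      /* construct the object first... */
        init_calls++;
        p = &storage;
        /* ...then publish: release makes the construction visible too */
        atomic_store_explicit(&instance, p, memory_order_release);
    }
    return p;
}
```

Knowing that stores can be reordered below the source level is exactly the kind of thing assembly-level thinking makes obvious.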

Assembly language will always be needed to optimize certain types of algorithms that don't translate efficiently into C. Try writing an FFT algorithm in C on a DSP, and compare it to what can be done in native assembly. The difference can be an order of magnitude or more. Some processors have special purpose modulo registers and addressing modes (such as bit reverse) that don't translate well into C, at least not without extending the language. Fixed point arithmetic operations are not supported in ANSI C either, but are a common feature on special purpose processors.
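The fixed-point gap in ANSI C can be illustrated with the usual workaround: emulating a Q15 fractional multiply with integer arithmetic. This is a minimal sketch; real DSP instructions do this (with saturation) in a single cycle:

```c
#include <stdint.h>

/* Q15 fixed point: value = raw / 32768, range [-1, 1). Many DSPs have a
 * single-cycle fractional multiply-accumulate for this format; portable
 * C has to spell it out with a widening multiply and a shift. */
typedef int16_t q15_t;

static q15_t q15_mul(q15_t a, q15_t b) {
    /* 16x16 -> 32-bit product carries 30 fractional bits;
     * shift back down to 15 (no saturation handling here). */
    int32_t p = (int32_t)a * (int32_t)b;
    return (q15_t)(p >> 15);
}
```

For example, 0.5 * 0.5: `q15_mul(16384, 16384)` yields 8192, which is 0.25 in Q15.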

For low power/embedded applications, efficiency makes sense as well. Every CPU cycle wasted chips away at battery power. A more efficient algorithm means a smaller ROM size, and the CPU can either be clocked slower (can use cheaper memory and/or CPU) or put into a halted state when it isn't needed. (longer battery life) Coding small ISRs in assembly makes sense as well, as C compilers often must make worst case assumptions about saving processor context.

That being said, only a fool would try and re-write printf or cout in assembly, if they have a C/C++ compiler handy. Hand optimization is best used as a silver bullet, for the most computationally intensive or critical functions.

Intel could give many kickbacks to university programs, but they appear to get criticized for chips with too much baggage and backward compatibility.

The RISC PowerPC processor has potential, but the number of consumer desktops with it has been on the decline (Is anyone but Apple left?). Computers might be too expensive for some students.

A Palm Pilot / Handheld sounds like a great choice to me. They're cheap and can be synced with whatever consumer desktop the user has (I can't imagine coding assembly in Graffiti). The limited hardware is probably a plus for academic purposes.

I think this fellow makes some great points, but what platform and tools would you choose to learn assembly with?

I would start with an emulated 8-bit microprocessor or microcontroller, such as the Z-8 or the 68HC908. This way they can run the emulation on a platform they already have, and such devices, embedded within ASICs are the most likely target for a pure assembly effort anyway.

Just a couple of years ago I did a fairly large 68HC908 application for a housekeeping processor entirely in assembler.

I have very fond memories of writing Pentium-optimised asm... the rules were complicated enough to make things interesting, but still comprehensible.

Nowadays, the x86 ISA is just an API...god knows how the core actually executes instructions and in what order, which makes it very hard to optimise code beyond a certain point. You get more mileage from optimising memory access patterns and doing other such dull, dull, dull work. I get my asm coding fix elsewhere nowadays.

Quoting: "the best implementations are written by those who've mastered assembly language".

I haven't read this book, but I'd hope that there would be some pretty good justification of the above statement. I suspect that it's not, though. First of all, who defines what the "best implementation" is?

As Knuth says, the first rule of program optimization is: "Don't do it". Trying to optimize a program while you're writing it leads to all sorts of problems, including difficult-to-maintain code and increased time and budget for the project, and often the code in question isn't even a hot spot anyway.

I used to be very concerned about making my code fast, but have (over the decades) decided that making it obvious is much more important than speed, particularly in the initial implementation. Profiling allows you to concentrate on the 20% of the code that the program is actually spending 80% of its time in, instead of guessing where the hot spots are going to be.

I've found that another benefit of using simpler code is that I'm more likely to throw away whole sections of it and try radically different algorithms or mechanisms. More complicated code I find I'll try to just tweak instead of dumping wholesale. Radically different approaches can lead to 10x speedups, where tweaks of existing code may give you 2x speedups if you're lucky.

Don't get me wrong, I'm all for trying different approaches. I'm not sure I would have come to the same conclusion I have now if I hadn't spent quite a long time trying to write optimized code. It was a very different world back then, but I know I wasted a lot of time optimizing code that didn't at all need it. It was an experience though.

I think learning compilers and how they will take your code and mangle it into machine code is more important than learning assembly, specifically. Building your own compiler will require you to learn some assembly (or at least the notion of it) which is sufficient for this purpose.

In the real world, you generally don't write an application targeted to a specific CPU. You trust that your compiler is generally going to produce efficient machine code for your algorithm. Sometimes it won't, but if there isn't a performance problem with that particular section of code, it usually isn't worth the effort to do anything about it.

The point of using a profiler to optimize your application is that usually you're going to identify a couple key areas where you need to do some tuning because your algorithm is not efficient regardless of the architecture on which it's running.

Further, within the x86 family, different CPUs have different performance characteristics. The most efficient machine code on one may be the slowest on another. So to write the most efficient programs for x86, according to this guy's definition, I would end up having to implement run time CPU checking with different code paths depending on the result.

Guess what? It isn't worth the additional development time unless the result is substantial. How do we know which improvements would be substantial? We profile the code.

This guy needs to take a leave of absence from teaching and work as a programmer for a while before he shoots off his mouth on topics where he clearly has no clue.

I'm fully in favor that most programmers should learn some assembly. But learning how to do efficient code is not the reason for learning assembly. Assembly should be learned thoroughly by systems programmers (who write operating systems, core libraries, compilers, etc) and certain embedded programmers because they might actually need to use that skill directly. Other programmers should learn some to the extent that it teaches them what's really going on inside the machine, but they should not dwell on it (unless they find it fun). Efficiency should focus on choosing (or developing) the proper algorithms for the application being developed.

If one is going to do programming where pointers would be used (systems programming and lower level applications programming, such as in C), then I suggest learning assembly as the first language. Two or more decades ago, that advice worked because most people didn't learn to program until they took a class in it or such. Nowadays, people destined to be programmers are learning some programming by around age 10 (usually in whatever language is easiest to get started in, which is generally not the best for developing larger applications). By the time they've done a lot of programming, they either "get it" with regard to pointers, or don't, and are set that way for life. This is unfortunate (and results in much insecure programming).

With the incredible power provided to us by modern CPUs, efficiency is just about completely irrelevant for 99% of non-game applications. Think... when was the last time you thought "This word processor just doesn't respond to my keypresses fast enough" or "AIM takes way too long to open a new IM window"? The reason these programs aren't getting "faster" (as the article complains) is that there is no way to do so. They spend 99.9% of their time waiting for user input already.

Optimizing code which doesn't need optimization is Bad with a capital 'B'. When optimizing code, there is almost always a tradeoff between efficiency and maintainability. Efficiency often requires cutting corners, killing opportunities for future expansion, or, at the very least, writing ugly code. When that added efficiency does not lead to any noticeable benefit to the user, why do it?

Now, granted, you shouldn't use an O(n) algorithm when an O(lg n) one exists to solve the same problem. However, knowing the difference between O(n) and O(lg n) has nothing to do with knowing assembly. The only benefits you can get out of knowing assembly are constant-multiplier speed increases. And, frankly, shaving off 50% of 0.1% CPU time used is not going to help much.

Really, the speed of modern CPUs is sickening. I can't count the number of times I've written a piece of code, thought "This is going to be so slow...", then watched it execute near instantaneously. Even when running programs in a prototype programming language I'm working on -- which currently runs about 40x slower than C, because it's a crappy prototype -- this happens to me regularly. The only time your code is going to be noticeably slow is if you are processing a very, very large data set or you are using slow algorithms. In the former case, sure, knowing assembly will help, but such cases are extremely rare in typical applications. In the latter case, find a better algorithm.
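The O(n) vs O(lg n) point a few paragraphs up can be made concrete with a sketch in C: a linear scan probes up to n elements, a binary search about lg n, and no amount of assembly tuning on the scan closes that gap as n grows.

```c
#include <assert.h>
#include <stddef.h>

/* O(n): probe every element until the key turns up. */
static long linear_search(const int *a, size_t n, int key) {
    for (size_t i = 0; i < n; i++)
        if (a[i] == key) return (long)i;
    return -1;
}

/* O(lg n): halve the sorted range on each probe. */
static long binary_search(const int *a, size_t n, int key) {
    size_t lo = 0, hi = n;              /* half-open interval [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < key)      lo = mid + 1;
        else if (a[mid] > key) hi = mid;
        else return (long)mid;
    }
    return -1;
}
```

Shaving a constant factor off `linear_search` with hand assembly still leaves it a factor of n/lg n behind.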

All too often, high-level language programmers pick certain high-level language sequences without any knowledge of the execution costs of those statements. Learning assembly language forces the programmer to learn the costs associated with various high-level constructs. So even if the programmer never actually writes applications in assembly language, the knowledge makes the programmer aware of the problems with certain inefficient sequences so they can avoid them in their high-level code.

Fair enough. Like he says, it works with any speed processor to make things faster.

Most of the rest sounds like praise of free software. Free software does not suffer "unrealistic software development schedules." Free software authors can go read the source code to gcc, and gcc and the GNU debugger have both had more attention lavished on them than any proprietary equivalent.

I used to have two binary search utilities. One written in assembler and one written in a shell script. Same logic. The script was faster. Performance isn't just a matter of number of lines of code.

In this case I think the reason was the system having to load the binary vs. the script interpreter already being resident in memory. The startup overhead dominated the actual runtime overhead - binary searches are very quick.

I agree with the underlying premise that more programmers should learn an assembly language and the intimate details of how a computer works. However, I disagree with several parts of the article.

- He faults inefficient coding for the failure of software speed to keep up with CPU speed (or at least, it's a "large part"). This is much less true than he lets on; Amdahl's Law means that the CPU is less and less responsible for the speed of an application, while things such as disk seek/transfer times, memory access times, and network latency all play huge roles in the speed of your computer's software.

- He seems to think that it's not terribly hard to become an "efficient" assembly language programmer. Bzzt, wrong! In the modern era of superscalar architectures, pipelining, processor specific instructions, branch delays, and memory hierarchies, it takes a hell of a lot of knowledge and experience to beat the performance of a good compiler.

- He apparently hasn't tried any large assembly language porting efforts lately. I'd love to see the effort involved in porting a large x86 assembly language program to a MIPS architecture, all the while maintaining that coveted "ultra-efficiency". The reality is that a good compiler can be reasonably efficient at porting a program to a new architecture, while a programmer usually isn't.

- He also apparently hasn't tried debugging a large chunk of assembly code lately. It is a fact of life that it is very difficult to debug assembly. By using a high-level language, you are increasing the readability of your software, which tends to decrease the number of bugs.

I could go on, but needless to say, I'm not impressed with the numerous assumptions and generalizations about assembly that he makes. Learning assembly will make your high-level programming better, and limited use of it can be appropriate, but using it all over the place is a huge mistake.

Having said that, knowing assembler is useful because it teaches you how the machine works.

However, most modern compilers can generate code that is much faster than handwritten assembler - especially because they know how to take advantage of specialized processor architectures (hyper-threading, pipelining, etc).

I've been reading this site since it was chips n dip, but this is the first time I've ever felt the need to comment.

I can't believe that any developer could write any code without some knowledge of ASM. Disclaimer: I'm self-educated, so have no bias except my own.

More importantly, if you don't know ASM and can't understand machine language, you'd never get through Knuth's tome "The Art Of Computer Programming".

In my opinion, this is the most important work ever written in our field. Any developer worth his/her salt should have at least read and understood these books, and completed at least the simple exercises.

The examples in TAoCP are written in machine language for a fictional machine, but the depth one learns by reviewing what that machine does with its data is important in any project.

I've never programmed professionally in ASM. In fact, I usually work in Perl/PHP/Python. But I would not be able to write quality code in those languages if my mind was not constantly thinking of the machine. After all, I'm a computer programmer, not a linguist or scientist.

Not knowing assembly, or at least having some idea as to how a computer processor works, would make a programmer useless in my eyes. Leave them to Access or VBA, and leave the coding to us pros. :)

I believe that ASM should be taught first. If you can't understand ASM, you'll never be a good programmer, so why bother learning Java/C++ or whatever? Would you trust a doctor who didn't know how the body works?

One thing that many programmers take for granted these days is that compilers produce correct code nearly all the time. They've gotten really good over the years and are really a testament to the quality of compiler engineers. Even so...

I've been a programmer for over a decade and I've always found the worst problems to debug are when the problems aren't in your code but in the compiler. Compilers are programs too and have their own bugs. They aren't always 100% accurate at generating correct machine code for your source. And until the compiler gets fixed in the next patch or rev, you may be stuck with broken code unless you switch compilers.

Sometimes disassembly of the problem code and inlining correct assembly can be the difference between shipping a product or missing a deadline because you've spent months sitting around for the next compiler version to fix your problem for you.

I didn't start programming until I was in my mid 20s (degree in Chemistry). I liked fiddling with Java but always felt uncomfortable about what was going on underneath the hood. So I took a few classes. The first was Computer Systems Architecture. We wrote a game in x86 assembler. That class completely opened up my understanding of how programs actually worked. To understand the stack, the state-machine nature of things, and memory was an awakening experience.

Now (~5yrs later) I'm a fully capable programmer and an even better designer. My preference is C, binary file formats, networking protocols, crafting elegant solutions for multiplexing IO. I'm lead on a project used in production by many companies large and small.

I genuinely feel assembler is a vital part of the learning process for a programmer.

If you don't understand what's going on at the machine level, you are going to run into trouble eventually because your perception of the runtime environment is slightly or even wildly incorrect.

Example: When programming in languages like C or C++, you have to know what a stack frame is and basically how it's implemented, so that when something goes wrong you can correctly diagnose the problem. If you just know the corresponding language syntax (i.e. the scoping rules), you won't have the first clue where to start.
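A concrete instance of the kind of bug meant above, sketched in C (function names are illustrative): returning a pointer into a dead stack frame produces symptoms that look like random corruption unless you know how frames work.

```c
#include <string.h>

/* The classic stack-frame bug:
 *
 *   char *bad_greeting(void) {
 *       char buf[16];
 *       strcpy(buf, "hello");
 *       return buf;     // buf lives in this call's stack frame,
 *   }                   // which is reused as soon as we return
 *
 * The caller sees garbage that changes with every subsequent call --
 * baffling without a mental model of the stack. The fix is storage
 * whose lifetime outlives the frame, e.g. a caller-owned buffer: */
static void good_greeting(char *out, size_t outsize) {
    strncpy(out, "hello", outsize - 1);
    out[outsize - 1] = '\0';
}

static int greeting_ok(void) {
    char buf[16];                      /* caller's frame owns the storage */
    good_greeting(buf, sizeof buf);
    return strcmp(buf, "hello") == 0;
}
```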

This applies to Java as well - just replace "machine code" with "bytecode" and "CPU" with "virtual machine".

In all these cases, a compiler takes your program specification (the source code) and produces the *real* program (in machine code or bytecode) - and that is what is executed and that is what you will be debugging and analysing. If you don't understand basically what machine code is and how it works, you will keep running into brick walls. I've seen this over and over again - the new graduates who just can't see why their program is behaving the way it does, because they never did assembly programming, or studied the run-time environment of programming languages, and so have these bizarre ad-hoc mental models of what's happening that bears little or no relation to reality.

I'm not saying that assembler should be used any more than it is currently, but if we are going to be using compiled languages (C, C++, Java), then it simply *must* be taught. There is simply no way to avoid this if you want to be a half-way productive programmer in those environments.

Imagine a mechanic who has built his own engine from scratch, or a painter who has made his own paint, or a musician who crafted his own instrument. Those seeking to exercise their art on the lowest possible level will always have superior insight into those that don't.

I look after pnm2ppa, a print processor that converts pnm image bitmaps from Ghostscript to PPA, the format used by HP's worst printers ever. Ever. They are so dumb they make Bush look like a Mensa candidate.

When I first came to the code, it was written by someone who thought they knew better than the compiler, and structured the code accordingly.

We had hand-unrolled loops, unusual and rampant use of the "register" keyword, the occasional volatile, and strange padding in structures to try to align the data to what he thought the processor would use. There were arithmetic "if"'s, nasty pointer usage, throwing away type information (ie casting to void *), and strange methods of going through the data.

When I hand-simplified all the code, it went about 15% faster. One inner loop got over 100% faster just by re-rolling it, because the person who unrolled it didn't understand how branch prediction works, and understood even less about walking large data structures and L1/L2 cache interaction. gcc 3.3 improved the performance of the code by about another 15%.

But you know what made the biggest change? A simple replacement of floating point gamma correction with a lookup table ordered in the simplest possible way. That shaved literally 30+ seconds off every page render on my PIII/800.

And you know what? The new gamma LUT is shorter and more readable, and is easier to tune for color-correct output. It costs about 4 MB of RAM.

Assembly has no place in the modern-day programmer's skillset. Humans do not know how to schedule instructions properly. They do not know how branch prediction will work unless the data they use is static. They should not waste their time on understanding the differences in L1 cache strategies (which are wildly different across the x86 families and AMD Opterons). They cannot work out how best to keep the data pipeline full on a wide range of processors. But you can help compilers work this out for you by doing the following:

* Design the system the correct way the first time - what do you actually need to do? Don't do anything else
* Learn and keep up with the best generic algorithms for a wide range of activities (such as sorting, arrays, dictionaries, etc.) and keep a library of well-tested, bug-free examples
* Write simple, clear, maintainable code
* Never, ever, ever throw away type information
* Never, ever, ever throw away data aliasing
* Never, ever use the "register" keyword
* Never use "volatile" unless you know why you need it
* Document tests, data and code properly. This pays off big time every time you come to add new features or fix old ones

Lastly, program like a software engineer not a cowboy. Code must be correct then fast. Not fast and wrong.

It is vastly more important to be able to read assembly code than to write it. In most cases this involves reading the code the compiler generates.

An earlier poster mentioned how such a skill can help you find compiler bugs. This can be the case, but it is rare; I have located two such bugs in 20 years of programming. A more common use is to locate bugs in your own code. When your brain refuses to see the missing braces around the wrongly indented code, or a spurious semicolon at the end of an if or while statement, reading the generated assembly code can save some extra hours of frustration. You will be able to see that the code the compiler generates differs from the code you think you wrote, and this will point you to the bug's location.

As I argue in Code Reading [spinellis.gr], other cases where reading assembly code can be of use are:

* to understand how an implementation-defined operation (e.g. a type cast) is actually performed,

* to see how compiler optimizations affect the resulting code, and

* to verify that hardware-specific code written in C/C++ actually performs the operations you expect.

An appreciation of how compilers generate code for modern
structured languages.
The key ideas here are the stack as a way to handle nested function
and method calls, the frame pointer as a way to access function arguments
and local variables, and the virtual function table often
used for implementing dynamic dispatch in OO languages.
The Red Dragon Book [amazon.com] is the venerable classic here.

A way to obtain an assembly code listing of your high-level language code.
The Unix compiler's -S switch, and the Java SDK "javap -c" command
are two methods I often use.
In gdb you can use the disassemble, nexti, and stepi commands
to examine your program at the level of discrete processor instructions.

So, once again everyone is coming out of the woodwork to explain why this article is either right or wrong based entirely on the tiny subset of programming tasks that they actually do.

For some applications - embedded systems, drivers, games (my specialty), or any other real-time application - assembly is either very important to understand or actually essential. There is (or was) no other way to program the PS2's vector units, for instance.

For database work, batch or text processing, network admin, or anything else where speed is nice but not a show-stopper, "make it work" is much more important than "make it work fast."

I've always felt knowledge of assembly makes one a better programmer regardless of the application. Even on a high level, understanding why (unnecessarily) using data types larger than the system's register size is going to hurt performance can only be a good thing. Understanding assembly is fairly fundamental to understanding computers, as opposed to just using them.

...I changed my mind several times and decided I agree with the author of the original article.

I do disagree on several points that have been raised, but they don't defeat the final conclusion:

- I do agree that premature optimization has been lethal to many software projects. But I have met as many people who commit PO in HLL's as assembler, so this is not an argument for or against the language.

- The comparisons of startup times and code sizes with the '80s (the 80's! Why, in the 70's we had only... never mind) are amusing, but uninformative; there are a lot more services embedded in the average OS or word processor today. There is a degree of bloat, but the statistics are misleading.

- Hand-crafted assembly code is unlikely to be optimal in light of processor pipelining, multiple execution units, and scheduling. I used to know how many clock cycles each instruction in the PDP-11 instruction set would take to execute for each addressing mode; this information is not nearly so useful for today's processors.

- There are architectural considerations beyond assembly. As early as 1983 a colleague of mine brought a VAX-11/780 (a screamer for its day) to its knees, and came to me complaining bitterly about the processor and/or compiler performance. It turned out that the code in question, which used massive multi-dimensional arrays (in FORTRAN), had compiled into a two-instruction loop (three-operand multiply and an increment/branch), but the code was generating six page faults per iteration! He would not have avoided the problem just by using assembler, but my deeper understanding of the machine led to the identification of the problem.

All that being said, the title of the article is "Why Learning Assembly Language is Still Good." At the end of the day, while I opt to write in Java (or Objective-C, which I'm just picking up), I am better equipped to write good code knowing assembler, and a few other things behind the language and runtime I'm using.

A number of years ago, while I was working for a Baby Bell, we were building a system, and integrating it with BEA's Tuxedo. One day a couple of the consultants came to me, to tell me that it was crashing, and they couldn't figure out why.

I pulled it up in the debugger on a Sun box and stepped through the code... and when I found it was crashing in Tuxedo, I did something the consultants (young guys) had no clue you could do: I stepi'd *into* the binary.

Now, I didn't know Sun assembly language, but that was irrelevant. I nexti'd and stepi'd my way through, and found the name of the function it was crashing in, which will *always* be there, even in a stripped binary, and where in the function the crash was happening.

I could then call BEA (I was senior technical, and Rank Hath Its Privileges), and get info from their developers.

Turned out to be the environment, not a bug, but the point is: once in a while, *knowing* how things work Down There will save your butt, and maybe even lead you to better code.

mark "and I pushed my kids to know what
happens under the hood of the car, too"

We evaluated assembler vs. C for an 8086-hosted system about 15 years ago. We looked at several compilers and wrote some simple functions in C and assembler to compare the code generated. The best compiler (Watcom) produced code that was tighter than the hand-coded assembler. The compiler did things with the code like sharing parts of a return block and optimizing register usage in ways that would make maintaining the code extremely difficult. With more complex chips, I doubt that a human developer could track all of the details required to produce tight, efficient code.

In theory, a really good assembler programmer could produce more highly optimized code, but not on a consistent basis and within schedule constraints.

I don't argue that assembler isn't useful. I learned more about how computers work when I took an 8080 assembler class in college. And for certain problem domains like embedded systems, assembler is often necessary. But I don't write any more code in assembler than absolutely necessary.

I was reading hex dumps and hand-coding small assembler routines in high school (read: 24 years ago), and it's been an invaluable skill over the years. I've written C programs that call assembly routines to access OS functions no library routines existed for, and understood how parameters got passed on the stack, so when something got corrupted I could look at a hex dump and figure it out within minutes. I took the TCP/IP software for an old minicomputer at my old job, licensed to a particular CPU, and figured out how to defeat the licensing so it would run on any machine - all with no source, just by decoding/hacking the assembler and changing a few BNZ (Branch if Not Zero) instructions to straight branches. I've played with building my own boards and writing drivers for them.

From the standpoint of knowing how things work.. having the base knowledge of how the underlying hardware works, I can pretty much pick up any language on the fly.

no concept of the difference between 6 compares & 3 logical ANDs vs. two compares. Not that his way wouldn't work... but *efficiency*. In a lot of ways, in my mind, mastery of assembly language can bring great insight into the *best* way to accomplish something.

I agree that knowing how to code at the assembler level is often the hallmark of the best programmers.

But I'd turn the premise around - I think the best programmers learn how to code at the lowest level because they want to and are interested in it. Then they learn about both the benefits and the complications (pipeline stalls, cache effects...).

But on another level, teaching assembler in college is increasingly difficult. Students in many CS programs are hard pressed to learn much more than Java and C# - very few know any language outside the C/C++/Java/C# family plus Perl and Python. Instead they learn all about GUIs, IDEs, .NET and so on.

I'd love to see students really learn assembly language (though ideally it would be for something other than the plug-ugly x86 series architectures), but then I'd also love to see them learn Lisp, Haskell and a few more languages, as well as Unix, Windows, VMS and a few more OS's, as well as HTML, XML, TeX and a few more ways to mark up information, as well as OpenGL, Postscript, X windows library calls and a few more graphics systems, as well as Calculus, Linear Algebra and a bit more math, as well as.... (well, you get the idea).