I'm a student, fresh into programming and loving it, from Java to C++ and down to C. I moved backwards to the barebones and thought to go further down to Assembly.

But, to my surprise, a lot of people said it's not as fast as C and there is no use. They suggested learning either how to program a kernel or writing a C compiler. My dream is to learn to program in binary (machine code) or maybe program bare metal (program micro-controller physically) or write bios or boot loaders or something of that nature.

The only thing I have found after a lot of research is that a hex editor is the closest thing to machine language available in this day and age. Are there other things I'm unaware of? Are there any resources for learning to program in machine code? Preferably on an 8-bit micro-controller/microprocessor.

This question is similar to mine, but I'm interested in practical learning first and then understanding the theory.



What exactly is the problem here? If you're asking if it's possible to code in machine code then the answer is probably "yes". If you're asking for tutorials then a) make it clear that's what your question is, but b) it's not a constructive question.
– ChrisF♦ Dec 20 '11 at 9:17

@SK-logic, yeah, the machine code programming would get insufferable after about 1 hour. You're right, a better and more productive idea is to get down to the CPU implementation. There are also virtual versions of the 6502 (visual6502.org) as well as folks who have built, or aspire to build, CPUs using modern discrete logic (bradrodriguez.com/papers/piscedu2.htm)
– Angelo Dec 20 '11 at 12:50

9 Answers

People don't program in machine code (unless they are masochistic). They use (or develop) tools to generate machine code (compiler or assembler, including cross-development tools), or perhaps libraries generating machine code (LLVM, libjit, GNU lightning, ....). So resources about machine code generation, compilation, optimizers, and micro-architectures are also relevant.

And very often, a good optimizing compiler generates better machine code than you could. You probably won't be able to write 200 lines of assembly code better than a good optimizer can.

If you want to understand machine code, learn assembly first. It is very close to machine code. Use it wisely, only for things you cannot code in C (or in some higher-level language, like OCaml, Haskell, Common Lisp, Scala). A good way is often to use asm statements (notably GCC's extended assembly feature) inside a C function. Reading the assembly code generated by the compiler (with gcc -S -O2 -fverbose-asm) can also be helpful.

Current processors' instruction set architectures (i.e. the sets of instructions understood by the chips) are quite complex. Common ones are x86 (a typical PC in 32-bit mode), x86-64 (a desktop PC in 64-bit mode), ARM (smartphones, ...), PowerPC, etc. They are all quite complex (for historical and economic reasons). Perhaps first learning a hypothetical instruction set, e.g. Knuth's MMIX, is simpler.

What about those who want to create an assembler? There are reasons to learn machine code, though they aren't that common.
– Jetti Dec 20 '11 at 15:06

I would say that it is learning an instruction set architecture (using the assembly mnemonics). You rarely learn explicitly the exact encoding of instructions (e.g. that the x86 NOP is 0x90). You may need to know it when writing an assembler or a machine code generator. (Likewise, you rarely need to learn by heart the UTF-8 encoding of Unicode.)
– Basile Starynkevitch Dec 20 '11 at 15:08

An assembly language is a low-level programming language for computers, microprocessors, microcontrollers, and other programmable devices. It implements a symbolic representation of the machine codes and other constants needed to program a given CPU architecture.

So Assembly is a symbolic representation of machine code.
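To make the "symbolic representation" idea concrete, here is a minimal sketch of a toy assembler in Python. It knows only three 6502 instructions (the opcode bytes 0xA9, 0x8D and 0x60 are the real 6502 encodings); the function names and the tiny syntax it accepts are assumptions made for illustration, not part of any real assembler.

```python
# Toy assembler for three real 6502 instructions; everything else is omitted.
# Function names and the accepted syntax are illustrative assumptions.

OPCODES = {
    ("LDA", "immediate"): 0xA9,  # load the accumulator with a constant
    ("STA", "absolute"): 0x8D,   # store the accumulator at a 16-bit address
    ("RTS", "implied"): 0x60,    # return from subroutine
}


def assemble_line(line):
    """Translate one line such as 'LDA #$01' into its machine-code bytes."""
    parts = line.split()
    mnemonic = parts[0].upper()
    if len(parts) == 1:
        return bytes([OPCODES[(mnemonic, "implied")]])
    operand = parts[1]
    if operand.startswith("#$"):
        value = int(operand[2:], 16)
        return bytes([OPCODES[(mnemonic, "immediate")], value & 0xFF])
    if operand.startswith("$"):
        address = int(operand[1:], 16)
        # The 6502 stores 16-bit addresses low byte first (little-endian).
        return bytes([
            OPCODES[(mnemonic, "absolute")],
            address & 0xFF,
            (address >> 8) & 0xFF,
        ])
    raise ValueError(f"unsupported operand: {operand}")


def assemble(source):
    """Assemble a multi-line program into one bytes object."""
    out = bytearray()
    for line in source.strip().splitlines():
        line = line.strip()
        if line:
            out += assemble_line(line)
    return bytes(out)


program = """
LDA #$01
STA $0200
RTS
"""
machine_code = assemble(program)
print(machine_code.hex(" "))  # a9 01 8d 00 02 60
```

The six bytes printed are exactly what you would type into a hex editor to "program in machine code"; the mnemonics are just a readable spelling of them.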

You may now be asking "Ok, so how do I learn all of that?" I am so glad you asked:

Understand what it is. It is very low-level and will give you a very in-depth understanding of a computer. You might want to start with Wikipedia and then read this short passage.

If the original poster builds a computer from scratch, he will have to define (not just learn) his own assembly.
– Basile Starynkevitch Dec 20 '11 at 14:49

@daniels I understand the reasoning behind learning addition from the bit level up, which is truly low level. +1
– AceofSpades Dec 20 '11 at 18:50

An alternative to building a computer from scratch could be learning some old processor (and its assembly language) like the Z80 or 6502 that is still simple enough to be understood. I guess there are even emulators that you can play with.
– Giorgio Sep 6 '12 at 8:20

@AceofSpades A great way to easily build CPUs and CPU components (eg. an adder) is with redstone in Minecraft, I would recommend that. I've started working on some simple machines in Minecraft, and it has highly boosted my understanding of the theory and logic behind computers.
– Aaron Feb 15 '13 at 0:26

My suggestion? Learn MIPS and learn how to build a (simple) MIPS processor. It's actually easier than it seems.

The advantage of MIPS over some of the other architectures is simplicity. You won't get caught up in a ton of little details, but you will still learn all of the big ideas you need to write code in other architectures.

Coincidentally, this was the final project for my (third) intro CS class. If you want, you can read the assignment and browse through the lectures as videos or slides.

Among other things, we did cover how MIPS code gets turned into binary; we even had to decode some (very simple) machine code on the exams.
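As a rough idea of what decoding simple MIPS machine code looks like, here is a small Python sketch that splits a 32-bit R-type instruction word into its fields. The field layout, register numbering and funct codes follow the standard MIPS32 encoding; the function name and the handful of supported operations are my own choices for illustration, not anything from the course.

```python
# Decode a 32-bit MIPS R-type instruction word into assembly text.
# Only four funct codes are handled; decode_r_type is a hypothetical helper.

REGISTER_NAMES = [
    "$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3",
    "$t0", "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
    "$s0", "$s1", "$s2", "$s3", "$s4", "$s5", "$s6", "$s7",
    "$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra",
]

FUNCT_NAMES = {0x20: "add", 0x22: "sub", 0x24: "and", 0x25: "or"}


def decode_r_type(word):
    """Return e.g. 'add $t0, $t1, $t2' for a supported R-type word."""
    opcode = (word >> 26) & 0x3F  # bits 31..26, always 0 for R-type
    rs = (word >> 21) & 0x1F      # bits 25..21, first source register
    rt = (word >> 16) & 0x1F      # bits 20..16, second source register
    rd = (word >> 11) & 0x1F      # bits 15..11, destination register
    funct = word & 0x3F           # bits 5..0, selects the operation
    if opcode != 0:
        raise ValueError("not an R-type instruction")
    if funct not in FUNCT_NAMES:
        raise ValueError(f"unsupported funct code: {funct:#x}")
    return (
        f"{FUNCT_NAMES[funct]} "
        f"{REGISTER_NAMES[rd]}, {REGISTER_NAMES[rs]}, {REGISTER_NAMES[rt]}"
    )


word = 0x012A4020
print(f"{word:032b}")       # the raw bit pattern
print(decode_r_type(word))  # add $t0, $t1, $t2
```

Doing this by hand on paper is the same exercise: write out the 32 bits, cut them into 6/5/5/5/5/6-bit groups, and look each group up in a table.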

Even if you don't want to cover everything, most of the lectures were given by one of the students' favorite lecturers and are fun to watch by themselves.

I'm a student, fresh into programming and loving it, from Java to C++ and down to C. I moved backwards to the barebones and thought to go further down to Assembly.

Excellent path to take. My jump (fall?) from C to Assembly and lower was a university course Computer Organization and Design, based on the book by the same name.

Highly recommend this book for the first chapters on basic MIPS assembly, all the way through pipelining and memory architecture. Even better would be to take a course around the same theme, or find some lectures online.

I have an instruction set that was made for this, a simulator, and some tutorials on the basics, with one instruction or concept per lesson. Just type in the program, run it, learn what it does, then move on to the next lesson.

I also have simulators for a few mainstream instruction sets. Any or all of them are good for learning asm (if you really feel you have to learn x86, learn it last, and use a simulator like the one I have forked: 8088/86 first, then move forward). Learning against a simulator has pros and cons. One major pro, especially when starting out, is that you don't crash anything and you have great visibility. If you jump head first into an embedded platform, microcontroller, etc. to learn a new instruction set, you have to overcome the hurdle of not being able to see what is going on, which leads to a long list of ways to fail...
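To illustrate the "great visibility" point, here is a minimal Python sketch of a simulator for a made-up 8-bit accumulator machine. The instruction set (four opcodes), the `run` function and the memory size are all hypothetical, invented for this example; it is not the instruction set or simulator mentioned above.

```python
# A hypothetical 8-bit accumulator machine, invented for this sketch.
# Every instruction except HALT is two bytes: opcode, then operand.
LOAD, ADD, STORE, HALT = 0x01, 0x02, 0x03, 0xFF


def run(program, memory_size=16, trace=True):
    """Execute program from address 0 and return (accumulator, memory)."""
    memory = bytearray(memory_size)
    memory[:len(program)] = program
    pc = 0   # program counter
    acc = 0  # accumulator
    while True:
        opcode = memory[pc]
        if opcode == HALT:
            if trace:
                print(f"pc={pc:02x} HALT       acc={acc:02x}")
            return acc, memory
        operand = memory[pc + 1]
        if opcode == LOAD:
            acc = operand
            name = "LOAD"
        elif opcode == ADD:
            acc = (acc + operand) & 0xFF  # wrap around like real 8-bit hardware
            name = "ADD"
        elif opcode == STORE:
            memory[operand] = acc
            name = "STORE"
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at address {pc:#04x}")
        if trace:
            # Full machine state after every step: nothing is hidden.
            print(f"pc={pc:02x} {name:<5} {operand:02x}   acc={acc:02x}")
        pc += 2


# LOAD 200; ADD 100; STORE at address 0x0F; HALT
program = bytes([LOAD, 200, ADD, 100, STORE, 0x0F, HALT])
acc, memory = run(program)
# acc ends as 44, because 300 wraps modulo 256
```

A bad program here raises a Python exception with the offending address, instead of silently hanging a board you cannot see inside.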

I first learned 6502 Assembly Language on the BBC Microcomputer (Model B, 32K). It had an awesome BASIC implementation that included a macro assembler. We had them at school so I wrote all sorts of mischievous programs that would do things like direct screen buffer manipulation to make a Lemming walk across each screen, around the room (they were networked) if the machines hadn't been used for 10 minutes. It resulted in fits of giggles among my Year 7 friends.

When I got a Commodore 64 at home, I learned that it had a 6510 CPU which also ran 6502 assembly language but with some interesting extras. I had to buy an assembler (came on a cartridge) and invoke the programs via BASIC. With grand visions of writing a best-selling game, I eventually managed to create several demos that bit-twiddled video display hardware registers on interrupt to do interesting colour bar effects that animated to funky chip music. Impressive, but not that useful.

I then got an Acorn Archimedes A310 which had an ARM2 CPU so I used the same awesome BASIC implementation with built-in macro assembler as the BBC Micro (same heritage). I managed to put together a couple of games which an arty friend provided graphics for, plus some sinusoid-based trippy demos. Both of these were hard work to program and bad code could take down the machine (accidentally trip hardware reset register, etc), losing everything if I hadn't saved (to floppy!).

At University I was introduced to C++ and thus C. I was able to use it to program Sun/Solaris and some other large mainframe computers. I have no idea what CPU architectures these machines ran on - I never needed to use assembler or read machine code as the C++ tools gave me the power I needed to produce professional applications.

After Uni, I worked on Windows and several flavours of Unix. C and C++ worked on all these machines and eventually Java did too.

I then worked on Windows and Dreamcast using C++ with DirectX, with a comprehensive tool chain for debugging.

I then took a job working with ARM-based chipsets for Smart TVs (in 2000). Although my experience with ARM2 may have been relevant here, the job was C based. I found that all the poking about with hardware that I'd done on the Archimedes could also be done in C using straightforward bit-twiddling operations. Part of my role was to migrate the code base to Windows, Playstation 2, Linux, other TV and mobile chipsets. All of these platforms were available with both a C compiler (often GCC) and some level of API to write to the underlying machine - the embedded world is rarely a kernel O/S. I didn't ever need to know the full machine code for any particular platform beyond writing a boot loader and mini-BIOS, both of which jumped into C code at the first available opportunity (after setting up trap vectors, ensuring endian-ness and instruction-mode and establishing a stack).

The next job was working with C++, C# and JavaScript on Windows. No machine code.

The current job is working with C++, JavaScript, Python, Lua, HTML and other languages on various platforms. I've no idea what machine code these platforms run, nor do I need to know - the compiler translates our code into whatever it needs to be. If it crashes, I catch the error in a debugger or through runtime diagnostics (exceptions, signals, etc).

For fun, I develop iOS applications in the little spare time I have at home. It uses Objective-C and an API that works across multiple chipsets. Apparently they're ARM-based, but I've never seen any machine code in my development.

While it's a fascinating exercise to learn assembly language, there are now much higher-level tools and languages that allow you to be an order-of-magnitude (or two) more productive.

The number of job opportunities available to an amazing assembly language/machine code programmer is minuscule compared to something like JavaScript, Java, C#, C++ or ObjC.

I'd advise you make this a hobby/side-interest rather than a main goal.

Code by Charles Petzold is a very good introduction to the subject and describes the process of building a computer including how to construct adders, counters and RAM arrays and introduces machine code and assembly language and their relation to higher level languages. It's also a great read on the history of computing.
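As a taste of the adder construction, here is a minimal Python sketch that builds an 8-bit ripple-carry adder out of nothing but AND, OR and XOR on single bits. This is my own illustration of the general technique, not code from the book; the function names are made up.

```python
# Build an 8-bit adder from single-bit logic gates.
# All names here are illustrative; only the gate logic is standard.

def and_gate(a, b):
    return a & b


def or_gate(a, b):
    return a | b


def xor_gate(a, b):
    return a ^ b


def half_adder(a, b):
    """Add two bits; return (sum_bit, carry_bit)."""
    return xor_gate(a, b), and_gate(a, b)


def full_adder(a, b, carry_in):
    """Add two bits plus an incoming carry using two half adders."""
    partial_sum, carry1 = half_adder(a, b)
    sum_bit, carry2 = half_adder(partial_sum, carry_in)
    return sum_bit, or_gate(carry1, carry2)


def ripple_carry_add(x, y, width=8):
    """Chain full adders from bit 0 upward; return (result, carry_out)."""
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        sum_bit, carry = full_adder(a, b, carry)
        result |= sum_bit << i
    return result, carry


total, carry_out = ripple_carry_add(200, 100)
print(total, carry_out)  # 44 1
```

The result 44 with a carry of 1 is 300 truncated to 8 bits, which is the same overflow behaviour you see in an 8-bit CPU's ADD instruction.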