Suppose you're a CPU, and you're presented with a bunch of 1s and 0s at a certain point in memory. How do you know which are instructions and which are data? It can't possibly be as simple as "even addresses are data; odd addresses are instructions", because that would be a huge waste, wouldn't it?

"Each instruction consists of an opcode of N bytes, which then expects the subsequent M bytes to be data (memory pointers etc.). So the CPU uses each opcode to determine how many of the following bytes are data.

Certainly for old processors (e.g. old 8-bit types such as 6502 and the like) there was no differentiation. You would normally point the program counter to the beginning of the program in memory and that would reference data from somewhere else in memory, but program/data were stored as simple 8-bit values. The processor itself couldn't differentiate between the two.

It was perfectly possible to point the program counter at what had been deemed data, and in fact I remember an old college tutorial where my professor did exactly that, and we had to point the mistake out to him. His response was "but that's data! It can't execute that! Can it?", at which point I populated our data with valid opcodes to prove that, indeed, it could."

Even modern CPUs can't. StuG is somewhat correct on the opcode bit: the opcodes do encode how much data follows, but that data is considered part of the instruction when an opcode includes immediate data. For example:

0xA9 42 LDA #$42

This is the load accumulator (LDA) instruction from the 6502 with an immediate value of hexadecimal 42. The opcode is 0xA9 and the data is 0x42, but the instruction is 0xA942. This may seem a pedantic distinction, but only because in this case the opcode and the data both fall on byte boundaries. On a processor design I was involved in, the instruction word was 36 bits. In the Add Immediate (ADDI) instruction, the immediate data was 10 bits wide, occupying bits 13:4 of the instruction word.
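To make the variable-length point concrete, here's a quick Python sketch of just the decode step. It's not an emulator; it only shows that the opcode byte tells you how many operand bytes belong to the instruction. The three opcodes used are genuine 6502 encodings (LDA immediate, JMP absolute, NOP):

```python
# Operand-byte counts for three real 6502 opcodes.
OPERAND_BYTES = {
    0xEA: 0,  # NOP: no operand bytes
    0xA9: 1,  # LDA #imm: one immediate byte
    0x4C: 2,  # JMP abs: two address bytes
}

def decode(memory, pc):
    """Return (opcode, operand bytes, address of the next instruction)."""
    opcode = memory[pc]
    n = OPERAND_BYTES[opcode]
    operands = memory[pc + 1 : pc + 1 + n]
    return opcode, operands, pc + 1 + n

program = [0xA9, 0x42, 0x4C, 0x00, 0x10, 0xEA]
pc = 0
decoded = []
while pc < len(program):
    op, args, pc = decode(program, pc)
    decoded.append((op, args))
print(decoded)  # three instructions of three different lengths
```

The decoder never asks "is this byte data?"; the operand bytes are simply consumed as part of whatever instruction the opcode says they belong to.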

This lack of distinction between data and instructions, even when the data really is just data, is something of a requirement for a von Neumann architecture: there is only one address and data bus, so at the system level there is no difference between data and instructions. If the processor's instruction pointer is pointing at it, it will be treated as an instruction and executed. This is actually how stack overflow vulnerabilities occur. The stack is a special chunk of memory used by the program for branching and subroutines. One of the things stored there is the saved value of the instruction pointer (the return address). If you can force that value to be "corrupted", the program will start executing code from somewhere it shouldn't, including memory that supposedly contains only data.
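Here's a toy illustration of that in Python. The two opcodes are invented for this sketch and have nothing to do with any real ISA; the point is just that the machine has no idea which bytes were "meant" to be data:

```python
# Toy von Neumann machine: one flat memory holds both "program" and
# "data". Opcodes are made up for this sketch:
#   0x01 n -> add n to the accumulator
#   0xFF   -> halt
def run(memory, ip):
    acc = 0
    while memory[ip] != 0xFF:
        if memory[ip] == 0x01:
            acc += memory[ip + 1]
            ip += 2
        else:
            raise RuntimeError(f"illegal opcode {memory[ip]:#x} at {ip}")
    return acc

memory = [
    0x01, 0x05,  # the intended "program": add 5
    0xFF,        # halt
    0x01, 0x2A,  # "data" at address 3 -- which also happens to be a valid instruction
    0xFF,
]
print(run(memory, 0))  # -> 5, the intended program
print(run(memory, 3))  # -> 42, the "data" executes just fine
```

Point the instruction pointer at the data region and the machine happily executes it, exactly like the professor's program in the story above.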

The NX flag doesn't really denote instructions versus data. It simply marks a chunk of memory as invalid to fetch instructions from. As morphine noted, it takes a program (read: the OS) to set it.

Aside from the above-mentioned immediate operands, code and data are typically stored in separate areas of memory. But other than the NX bit (which blocks instruction fetches from data pages) and the "read only" bit (which is typically used to block writes to code pages once they've been loaded into memory), there's no real distinction between code and data areas; they're still in the same address space, just in different ranges of it.
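A rough sketch of the NX idea in Python (a pretend page table, nothing like real MMU hardware): a plain data read only needs read permission, while an instruction fetch additionally checks an execute flag:

```python
# Toy page-permission check. Page size and flag encoding are invented
# for illustration; real page tables look nothing like a Python dict.
PAGE_SIZE = 256

class Fault(Exception):
    pass

def check(page_flags, addr, want_exec=False):
    flags = page_flags[addr // PAGE_SIZE]
    if "r" not in flags:
        raise Fault(f"unreadable address {addr:#x}")
    if want_exec and "x" not in flags:
        raise Fault(f"NX fault: instruction fetch from {addr:#x}")

page_flags = {0: "rx", 1: "rw"}  # page 0 = code, page 1 = data (no execute)

check(page_flags, 0x0010, want_exec=True)   # fetch from the code page: fine
check(page_flags, 0x0110)                   # data read from the data page: fine
try:
    check(page_flags, 0x0110, want_exec=True)
except Fault as e:
    print(e)  # fetching instructions from the data page faults
```

Note that the same byte is perfectly readable as data and forbidden as an instruction; the distinction lives entirely in the page flags the OS set up, not in the byte itself.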

As an aside, in a Harvard architecture CPU (e.g. Microchip's 8- and 16-bit microcontrollers) code and data are kept in completely separate address spaces, and are fully segregated. Harvard architecture makes sense when you're dealing with an embedded device where the code being executed typically resides in a small on-chip flash memory instead of in system RAM.

In standard processors (skipping oddities like the Harvard architecture JBI mentioned), the instructions are wherever the instruction pointer is pointing. It's up to you to ensure that it really is instructions there and not data. The NX bit will help protect certain memory pages, but there is immediate data mixed in with the actual instructions, as SecretSquirrel shows. If you get your program wrong, the instruction pointer ends up pointing at the wrong stuff and the processor will try executing it; most of it will not make sense, and the program will eventually crash or hang. Another trick modern processors do to help avoid getting hacked is to mark code pages as read-only.

People talk about self-modifying code, where you treat the instructions as data, overwrite them and then execute the new data/instructions, but it is very rare. More common are compressed executables: a small block of code at the start and a large block of compressed data. The code at the start decompresses the data into the original program and then jumps to it.
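A toy Python version of self-modifying code, on an invented three-opcode machine (not any real ISA): the program stores a byte into its own instruction stream, then executes the instruction it just rewrote:

```python
# Invented opcodes for this sketch:
#   0x01 n    -> add n to the accumulator
#   0x02 a v  -> store value v at address a (the same memory the code lives in)
#   0xFF      -> halt
def run(memory):
    ip, acc = 0, 0
    while memory[ip] != 0xFF:
        op = memory[ip]
        if op == 0x01:
            acc += memory[ip + 1]
            ip += 2
        elif op == 0x02:
            memory[memory[ip + 1]] = memory[ip + 2]  # can write into "code"
            ip += 3
        else:
            raise RuntimeError(f"illegal opcode {op:#x}")
    return acc

memory = [
    0x02, 4, 0x07,  # store 0x07 at address 4 -- the operand of the next ADD
    0x01, 0x00,     # ADD 0 ... except the store above just rewrote that 0
    0xFF,
]
print(run(memory))  # -> 7, because the program modified its own instruction
```

A decompressor stub works the same way at a larger scale: the "stored" bytes are the decompressed program, and the final step is a jump into them.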

They're only oddities in the context of general desktop/server computing. PIC microcontrollers are ubiquitous in embedded control applications; there's a good chance you own dozens of them without realizing it, in your home appliances and car. Wouldn't surprise me if they outnumber "normal" CPUs 10 to 1.

notfred wrote:People talk about self-modifying code, where you treat the instructions as data, overwrite them and then execute the new data/instructions, but it is very rare. More common are compressed executables: a small block of code at the start and a large block of compressed data. The code at the start decompresses the data into the original program and then jumps to it.

One particularly clever use of self-modifying code was back in the MS-DOS days, when some x86 processors had an FPU (x87) and some did not. The Microsoft C compiler would generate FPU instructions for floating point math, and pad them with enough trailing NOPs that there was room to patch a subroutine call in their place if needed. At runtime, executing an x87 instruction on a CPU without an FPU would trigger an exception (a software interrupt); the exception handler (also compiled into the binary) would patch the code on the fly, replacing the offending x87 instruction with a call to a software emulation routine, then resume execution at the point where the exception occurred.

This way you could have a single binary that could run on x86 processors lacking an FPU, yet still take advantage of the FPU if it was present. The in-place patching meant you only took the overhead hit of running the exception handler once per FPU instruction on each run of the program; after the first time through, the x87 instruction was emulated with a simple subroutine call. On CPUs with an FPU, the only additional overhead was a couple of extraneous NOPs after each FPU instruction, and the extra memory occupied by the (unused) emulation code. (If you knew you were only going to run on CPUs with an FPU, you could set a compiler option to turn all this off...)

I remember thinking that this was really cool (in a sick sort of way) when I figured out what was going on.
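The patch-on-first-fault pattern above can be sketched in Python. All the names here are invented for illustration; the real scheme patched x87 machine code in place from an interrupt handler, whereas this sketch just rewrites a dispatch slot:

```python
# Sketch of "patch on first fault": a slot starts out pointing at a
# handler that (1) rewrites the slot to the emulation routine and
# (2) redoes the operation. Later calls skip the handler entirely.
def emulated_fmul(a, b):
    return a * b  # stand-in for a software floating-point routine

calls = {"handler": 0}

def fault_handler(a, b):
    calls["handler"] += 1
    slots["fmul"] = emulated_fmul  # the "patch": future calls go direct
    return emulated_fmul(a, b)

slots = {"fmul": fault_handler}    # state before any patching

for _ in range(3):
    result = slots["fmul"](2.0, 4.0)

print(result, calls["handler"])  # -> 8.0 1  (the handler ran only once)
```

Same economics as the MS-DOS trick: you pay the expensive handler path once per patched site, and every subsequent execution is just a plain call.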