Over the last few years, support for RAM encryption has been added to processors used in consumer electronics devices such as game consoles and set-top boxes. A good example is the Xbox 360, in which RAM encryption was a significant challenge for hackers, as described in this classic presentation by Felix Domke and Michael Steil (video).

To minimize the impact on the processor performance, the RAM decryption algorithm will probably not be a strong block cipher like AES but a synchronous stream cipher.
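As a rough illustration of the stream-cipher idea, here is a toy sketch in which each memory block is XORed with a keystream derived from a secret key and the block's address. Everything in it is an assumption made for illustration: the names `keystream` and `xor_block` are hypothetical, and SHA-256 merely stands in for whatever lightweight keystream generator real hardware might use.

```python
import hashlib

def keystream(key: bytes, address: int, length: int) -> bytes:
    """Derive `length` keystream bytes for the memory block at `address`.

    Toy construction (an assumption, not a real memory controller design):
    hash the key, the block address and a counter, and concatenate digests.
    """
    out = b""
    counter = 0
    while len(out) < length:
        material = key + address.to_bytes(8, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(material).digest()
        counter += 1
    return out[:length]

def xor_block(key: bytes, address: int, data: bytes) -> bytes:
    """Encrypt or decrypt `data` stored at `address` (XOR is its own inverse)."""
    ks = keystream(key, address, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

key = b"\x01" * 16                      # hypothetical on-chip secret
plain = b"secret cache line data, 32 bytes"

stored = xor_block(key, 0x1000, plain)       # what an attacker sees on the bus
recovered = xor_block(key, 0x1000, stored)   # what the CPU gets back
elsewhere = xor_block(key, 0x2000, plain)    # same data at another address
```

Because the keystream depends on the address, identical plaintext stored at two addresses looks different on the bus. Note that this toy reuses the same keystream every time a given address is rewritten, which is exactly the kind of weakness a cheap scheme may have.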

Such memory encryption technology is intended to defend against an attacker who does not control the CPU but can access the physical memory directly, e.g. by probing the bus that connects the CPU to the physical memory.

Yes, there is. In fact, it's highly likely that you posted from one. But encryption in the processor is not a silver bullet.

An x86 CPU (PC processor) or high-end ARM CPU (smartphone processor) contains a small amount of cache. There is enough room to fit code to encrypt and decrypt instruction and data memory on the fly. This way, the external RAM effectively becomes encrypted storage and swap space.¹

For most attacks, there is no difference between the processor itself and the innermost cache level (the L1 cache). The L1 cache is on the same silicon die as the processor, and a physical attack on the cache would be about as difficult as a physical attack on the processor itself. So for security purposes, the processor and the L1 cache are the same. In many processors, even the L2 cache is inside the same package and about as difficult to attack, which gives more space to work in. If you're worried about attackers who can go inside the package, you really need to put your processor inside a tamper-resistant box, i.e. an HSM, in which case you might as well put enough RAM inside the box.

The benefit of encrypting data outside the CPU is to avoid attacks on the external RAM. There are two main kinds of attacks: snooping on or modifying data on a live RAM bus, and dumping the RAM content.

There are several problems with encrypting data outside the processor cache.

It's slow. Not only are you spending a lot of time on encryption, but you're wasting a lot of cache space.

This requires support deep in the bowels of the operating system. It's not a trivial undertaking; with limited coding resources, it is often worth spending more time fighting more common attacks, for example looking for buffer overflows and string injection vulnerabilities.

There is a bootstrap problem: how does the processor know it's getting encryption code that doesn't have a backdoor, and where does it get the key? If the keys are stored on the hard disk, they might as well be kept in RAM.

One possibility is that the user types a high-entropy passphrase, one that the attacker can't realistically brute-force. Given that the attacks on the key would be offline, the assumption that the user would have a passphrase with sufficient entropy is only realistic among a small proportion of users.
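To put rough numbers on "sufficient entropy", here is a back-of-the-envelope sketch. It assumes every symbol of the passphrase is chosen uniformly at random (which real users rarely do), and the guess rate of 10^12 per second is a purely hypothetical figure for an offline attacker; the function names are made up for this example.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600

def passphrase_entropy_bits(symbols: int, alphabet_size: int) -> float:
    """Entropy of `symbols` independent, uniformly random picks from an alphabet."""
    return symbols * math.log2(alphabet_size)

def years_to_exhaust(bits: float, guesses_per_second: float) -> float:
    """Time to try every candidate at the given (assumed) guess rate."""
    return 2 ** bits / guesses_per_second / SECONDS_PER_YEAR

GUESS_RATE = 1e12  # hypothetical offline attacker, guesses per second

# Eight random lowercase letters: about 37.6 bits.
weak_bits = passphrase_entropy_bits(8, 26)
# Six random words from a 7776-word list: about 77.5 bits.
strong_bits = passphrase_entropy_bits(6, 7776)

print(f"8 lowercase letters: {weak_bits:.1f} bits, "
      f"{years_to_exhaust(weak_bits, GUESS_RATE):.2e} years")
print(f"6 wordlist words:    {strong_bits:.1f} bits, "
      f"{years_to_exhaust(strong_bits, GUESS_RATE):.2e} years")
```

Under these assumptions the short password falls in well under a second, while the six-word passphrase holds out for thousands of years, which is why the scheme only works for users willing to type something long.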

Another possibility is to store the key inside the package. As far as I know, current Intel processors do not have any persistent memory. Some high-end ARM processors do (they have a small amount of write-once bits or a few kilobytes of on-chip flash memory).

There are also processors that have direct hardware support for encryption in the CPU, and where all communication buses going out of the CPU are encrypted. (I know they exist, but I can't name a part number.) These processors tend to be used in specialized applications.

This is an active research area (e.g. see this). It is in fact a rather hard problem, because the security model assumes that the attacker is the host system, which can observe every single memory access. It is hard not to leak information under these conditions...

For instance, imagine a CPU which computes an RSA signature. During the course of the algorithm, the CPU will evaluate some instructions, some of which are conditional jumps, and the CPU works with a pipeline with some branch prediction: to make things a tad faster, the CPU tries to guess whether a given conditional branch will be taken or passed through, based on what happened the last times this jump was encountered. When the CPU guesses wrong, the execution stalls for a few cycles: the CPU must discard the current contents of its pipeline and refill it with the actual instructions.

Now, let's imagine a fairly limited attacker who can only observe whether jumps are predicted correctly or not. That's not much power; in the case of your encrypted CPU, the attacker can learn much more than that. But, at least, since both opcode and data come from memory, observing the memory fetches and writes (in particular target addresses and time of access with cycle accuracy) will tell the attacker what happens with branch prediction.

It turns out that such information is sufficient to recover the RSA key. That's a rather strong and scary result, because branch prediction analysis can be done on modern CPUs with hyper-threading from an unprivileged process running on the same processor (so this means that in the Cloud, your neighbours, i.e. other customers of the same Cloud, are dangerous...). The model you envision, where the host architecture itself is hostile, makes it only easier (much easier) for the adversary.
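To see why a branch trace is so damaging, consider a textbook square-and-multiply exponentiation, where the "multiply" step is guarded by a branch on each bit of the private exponent. This is only a sketch with toy numbers: the function names are hypothetical, and the `trace` list is a stand-in for what an attacker would infer from timing or memory accesses. Real attacks are noisier, and careful implementations avoid key-dependent branches altogether.

```python
def modexp_with_trace(base: int, exponent: int, modulus: int):
    """Left-to-right square-and-multiply, recording whether each branch is taken."""
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:          # most significant bit first
        result = (result * result) % modulus
        if bit == "1":                     # key-dependent conditional branch
            result = (result * base) % modulus
            trace.append(1)                # branch taken
        else:
            trace.append(0)                # branch not taken
    return result, trace

def exponent_from_trace(trace) -> int:
    """Rebuild the exponent from the observed taken/not-taken sequence."""
    value = 0
    for taken in trace:
        value = (value << 1) | taken
    return value

# Toy RSA parameters (n = 61 * 53), far too small for real use.
n, d = 3233, 2753
message = 65

signature, observed = modexp_with_trace(message, d, n)
leaked_d = exponent_from_trace(observed)
```

In this idealized setting the observed branch sequence is the private exponent, bit for bit, so `leaked_d` equals `d`.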

Hard facts have never prevented business from being done. Thus, there are microcontrollers which at least pretend to do the kind of automatic encryption that you want. See, for instance, the DS5002FP from Maxim Integrated. Note that I do not claim that this product is broken: this is a very small 8-bit CPU which is unlikely to do branch prediction or other similar sources of leaks, and it embeds its whole RAM in the chip itself. This makes the security claim at least possible, even plausible. It also means that computing power will remain low. Such a microcontroller will offer about as much power as my first computer (from 1984).

A virtual CPU, then? It is theoretically possible (or, more accurately, it has not been proven impossible) to offload arbitrary computations to an untrusted computer through the use of fully homomorphic encryption. There is even an open-source implementation, but it is even slower than the 8051-compatible CPU from the previous paragraph. There again, this is an active research area.
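To give a flavour of what "computing on encrypted data" means, here is a toy example using textbook RSA, which happens to be homomorphic for multiplication only. To be clear, this is not fully homomorphic encryption (which must support both addition and multiplication, hence arbitrary circuits), and unpadded RSA with such tiny numbers is not secure; the parameters and function names are chosen purely for illustration.

```python
# Toy textbook RSA parameters: n = 61 * 53, e * d = 1 mod 3120.
n, e, d = 3233, 17, 2753

def enc(m: int) -> int:
    """Encrypt with the public key (no padding, illustration only)."""
    return pow(m, e, n)

def dec(c: int) -> int:
    """Decrypt with the private key."""
    return pow(c, d, n)

a, b = 7, 12

# The untrusted host multiplies two ciphertexts without ever seeing a or b.
product_ciphertext = (enc(a) * enc(b)) % n

# Only the key holder can decrypt, and obtains a * b (mod n).
product = dec(product_ciphertext)
```

The host only ever handles `enc(a)`, `enc(b)` and their product, yet the key holder recovers `a * b`. Fully homomorphic schemes extend this idea to arbitrary computations, at a very large performance cost.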

Not in a single instruction, no. But maybe you can achieve something very close to it with a CPU supporting AES-NI instructions (modern Intel and AMD CPUs) and a program written in x86_64 assembly language. I cannot give you a definitive answer, because I am not an assembly programmer. I presume you want to avoid writing unencrypted information into system RAM and keep it entirely contained within the CPU. This is pretty low-level stuff. On a related note, there is a patch for the Linux kernel that allows the full-disk encryption key to be stored only within CPU registers and not in RAM. It provides some mitigation against cold-boot attacks, which rely on the encryption key being stored in RAM.