So, UNIX’s predecessor, MULTICS, had a stack where data grew away from the stack pointer, so an overflow dropped incoming data into new memory if available. It also had CPU-enforced segments to isolate memory regions. I’m not sure about the heap but imagine it was isolated/segmented as well. Secure kernels such as GEMSOS did something similar on custom and Intel CPU’s since it had worked before. Imagine my shock when I first read a paper saying the stack and heap flowed toward each other… in systems using unsafe languages, ignoring CPU security features such as segments, and possibly doing this in privileged code. I predicted problems, since problems are always there by default when different things not designed strongly for security smack into each other.

And here they are in a nicely-written report. Apparently, they’re still doing weak mitigations against problems whose root causes were eliminated by 1960’s designs. It’ll be interesting to see whether cleverness while keeping the root causes works this time, or whether the penetrate-patch-and-pray game continues. :)

Unfortunately, “penetrate, patch, and pray” has some kind of evolutionary durability worked into it, given the constraints of late capitalism.

That plus Richard Gabriel’s Worse is Better seems to describe the situation quite well. The other thing is the bandwagon effect, where people jump on whatever solves their problems. If it worked well, they’ll defend it. If some properties were incidental, they might also come to believe those were designed on purpose for some advantage. Prior advantages might also be a detriment later on, the main example being that some aspects of C and UNIX came straight from the PDP-11’s limitations. The low-level nature of the language plus huge piles of code needing to be rewritten meant they kept the old approaches when new hardware came out.

I wonder what is going to happen to operating systems like Secure64’s SourceT micro OS (more info available here: https://secure64.com/secure-operating-system/) now that the Itanium 2 has been EOL’d. As mentioned, the OS takes advantage of Itanium-specific features such as independent read/write/execute privileges per page, a protected stack architecture, and 4 ring levels.

It seems that many of the design decisions relegated the “enterprise” features to Itanium chips only, and they were left out of x86_64 specifically to keep the higher-end systems locked into the more expensive CPU. For example, only 2 ring levels and differing implementations of memory protection keying.

That’s not really consistent with history. x86 came out long before Itanium and does have four rings. x86-64 was developed by AMD as a direct competitor to Itanium. They weren’t leaving features out in a deliberate attempt to gift Intel more market share.

I’ve also heard, but not investigated - so consider these unsubstantiated - claims that many of the Itanium-specific features that are unlike x86 were made to keep parity with, and thus ease the transition from, PA-RISC chips. This seems likely, as PA was very popular with large enterprise, supported HP-UX, etc.

The other take on that history I’ve heard is that Itanium was used as a testing ground for weird features prior to potentially bringing them into x86. That way, Intel avoided getting the huge customer base hooked, via backwards compatibility, on something that might later turn out not to be useful enough to justify the expense in transistors, mm², design effort, power, or whatever. I have no clue about whether that’s true.

The marketing said it had RAS features x86 didn’t, plus a speed boost from piles of registers and their VLIW-or-whatever stuff that didn’t pan out, on top of security improvements. It was advertised as better than x86.

Then, backward compatibility prevailed and high-end x86 achieved parity on the Itanium stuff. Now it’s a liability to Intel. OpenVMS is also getting hit since they got on the Itanium bandwagon. SGI did too in the past. Those two had other problems though haha.

I’ve wondered that exact thing. They chose it due to the advanced security and the fact that the CTO (or some high-up) had helped design those chips and knew them well. Now it’s EOL’d. I didn’t see any news release on it. Their contact page has no email, a toll-free number that hung up, and a contact form. I sent them a request for comment about current plans to deal with the Itanium EOL, plus a few options. I’ll message you if they reply. The “thank you” screen was blank, too, but that might have been NoScript’s fault. (shrugs)

I wonder if making alloca() automatically segfault the process if asked for more than 2kB would defeat most of the “jumping over the guard page” tricks? Most of them seem to take advantage of code that will alloca() buffers sized to contain attacker-controlled data.
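A minimal sketch of that idea, assuming a macro named checked_alloca and a 2 kB cap (both invented here, not a real API). It has to be a macro because alloca() must run in the caller’s frame:

```c
#include <alloca.h>
#include <stdlib.h>

/* Hypothetical sketch: cap alloca() at 2 kB and kill the process on
 * anything larger, so an attacker-sized request can't jump the guard
 * page. The name and the cap are assumptions for illustration. */
#define ALLOCA_CAP 2048
#define checked_alloca(n) \
    ((size_t)(n) > ALLOCA_CAP ? (abort(), (void *)0) : alloca(n))
```

Small requests behave exactly like alloca(); oversized ones abort (SIGABRT rather than a literal segfault, but the effect is the same: the process dies before the overflow can land).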

Maybe better would be if you could move big stack allocations into malloc()s (automatically inserting matching free() calls). Ignoring the performance difference, I think the main compatibility problem you’d have is that alloca()ted memory is expected to be freed by longjmp() calls that go past it.
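A sketch of that transformation on an invented example function (copy_bounded is not from any real codebase; dstlen is assumed to be at least 1): the attacker-sized alloca() becomes a malloc() with a matching free() before the return.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative: a big alloca() of attacker-controlled size rewritten
 * as malloc() plus the free() the compiler would have to insert. */
size_t copy_bounded(const char *src, char *dst, size_t dstlen)
{
    size_t n = strlen(src);
    /* was: char *tmp = alloca(n + 1);   <- unbounded stack growth */
    char *tmp = malloc(n + 1);
    if (tmp == NULL)
        return 0;
    memcpy(tmp, src, n + 1);
    size_t copied = (n < dstlen - 1) ? n : dstlen - 1;
    memcpy(dst, tmp, copied);
    dst[copied] = '\0';
    free(tmp);  /* the inserted free(); a longjmp() past here would leak it */
    return copied;
}
```

The comment on the free() is exactly the compatibility problem above: nothing frees the buffer if control leaves the frame via longjmp() instead of return.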

I have a vague recollection that I might have once heard about something like atexit() but for individual layers of stack unwinding, so that resources could be freed when the stack is being unwound by exceptions, longjmp() or function returns.
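GCC and Clang’s cleanup variable attribute is one existing instance of that idea: a handler tied to a single variable’s scope, run when the frame is left by a normal return (and by unwinding, with the right flags), though notably not by longjmp(). A minimal sketch:

```c
#include <stdlib.h>

/* The handler receives a pointer to the variable going out of scope. */
static void free_ptr(void *p)
{
    free(*(void **)p);
}

int demo(void)
{
    /* buf is freed automatically on every exit path from this scope. */
    __attribute__((cleanup(free_ptr))) char *buf = malloc(64);
    if (buf == NULL)
        return -1;   /* cleanup still runs; free(NULL) is a no-op */
    buf[0] = 'x';
    return 0;        /* free_ptr(&buf) runs here, inserted by the compiler */
}
```

It’s a compiler extension rather than standard C, but it shows the per-frame-resource shape the comment above is recalling.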

I think focusing on alloca is missing the big picture. And 2kB is pretty small for userland code that likes to use stack. Think of all the PATH_MAX sized arrays on Linux.

Something like -fstack-check sounds like a much more comprehensive yet less intrusive solution, especially in combination with a reasonably large stack guard. The compiler can statically determine that most functions are not going to use that much stack, so in practice only VLAs and allocas, along with the odd function with ridiculously large constant-size arrays, end up being checked.
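A hand-written sketch of what that probing amounts to for a variably sized frame (the real compiler emits this below the stack pointer while growing the frame; this user-level version, with an assumed 4096-byte page size, just illustrates the sequential touching):

```c
#include <string.h>

#define PAGE_SIZE 4096   /* assumed page size, for illustration only */

/* Touch one byte in every page of the buffer, in order, so the guard
 * page faults before any access can land beyond it. Returns the probe
 * count just so the behavior is observable. */
size_t probe_and_use(size_t n)
{
    char buf[n ? n : 1];                       /* VLA sized by the caller */
    size_t probes = 0;
    for (size_t off = 0; off < sizeof buf; off += PAGE_SIZE) {
        buf[off] = 0;                          /* one touch per page */
        probes++;
    }
    memset(buf, 'x', sizeof buf);              /* now safe to use normally */
    return probes;
}
```

Because the touches are strictly in order, no request size can skip from one side of the guard page to the other.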

Yes, using malloc for potentially large allocations is a very good idea.

They didn’t mention it, but Windows requires the equivalent of stack checking and always has. If you try to touch a page two below the bottom of the stack, you crash. Every page must be touched in sequence, which forces the program to do such probes and ensures you’ll hit the guard page rather than skipping over it.