@DutchUncle: personally, I design CPUs just for fun, but I do see a lot of new processors out there, with different ISAs and different architectures.

Let me be honest: I don't think that multi-core/SMP is the way to go. These two techniques exploit parallelism, but they have a lot of drawbacks (synchronization, lack of predictability).

There's another technique that does much the same thing: superscalar execution.

The hardware is always inherently parallel - we have combinational circuits feeding synchronous elements, all at the same time. What we have failed to do so far is figure out how to exploit this massively parallel infrastructure to our benefit without turning the general-purpose computer into a special-purpose one - meaning we can indeed exploit this parallelism, but only for certain tasks.
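To make the "all at the same time" point concrete, here is a minimal sketch of how clocked hardware is usually modelled in software. The register names `a` and `b` and the function `clock_tick` are made up for illustration. Every next value is computed from the current state (the combinational part), and then all registers latch together (the synchronous part).

```python
def clock_tick(state):
    """Advance a toy two-register circuit by one clock edge.

    Both next values are computed from the *current* state, so the
    order of the two expressions does not matter, just as two blocks
    of combinational logic settle side by side in real hardware.
    """
    next_state = {
        "a": state["b"],               # a <= b
        "b": state["a"] + state["b"],  # b <= a + b
    }
    # All registers latch simultaneously on the clock edge.
    return next_state


state = {"a": 0, "b": 1}
for _ in range(10):
    state = clock_tick(state)

print(state)  # {'a': 55, 'b': 89}
```

Written as ordinary sequential code ("a = b", then "b = a + b"), the second assignment would see the already-updated `a`. The hardware needs no temporary and no ordering, which is the parallelism we get for free and mostly fail to expose to the programmer.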

Perhaps we are doing it all wrong. Perhaps we don't need "general purpose" registers. Perhaps we don't need stacks. We still approach programming and (general-purpose) CPU architecture the way it was done in the early days (in the late fifties).

Restricting the set of instructions has its benefits: it simplifies the design and reduces size and power consumption, at some cost in performance - a trade-off that might pay off for simpler execution streams.

I do think we are specializing CPUs too much: I don't think it makes much sense to have a CPU doing "complex" tasks like vector arithmetic, SIMD, and others. There might be a solution for this: not multi-core, but multiple specialized execution units that can be somewhat independent of each other. This would require a new programming model, though.

@DutchUncle
I think Professor May does want to build the processor on the lab bench at the register level.
He makes the point that no schoolchildren today have ever seen any form of mechanical calculator; neither slide-rule nor hand-cranked calculator nor even the three-position dialing machine for calculating the remainder to score in a darts game.
As a result, first-year students have no idea what sits between a high-level language and the 1s and 0s toggling on a digital chip's pins... to them it is simply "magic."
Modern processor architectures are so complex and full of exceptions and special cases that it would be highly wasteful of time to teach processor architecture in that context.

Why bother creating a new processor, unless the point is to build it out of 74xx NAND chips just to show that you can? Restricting oneself to a subset of the available instruction set, or being forbidden to use a particularly helpful built-in operation, was one of the tricks used by my professor in CS205 class back in the 1970s.

But in the interests of energy efficiency it is generally better to complete the task in a reasonable time and then put the system to sleep.
So it's not pointless with regard to energy consumption... or you could be doing other things.

If anyone wants a simple computer, think about this: structured programs evaluate a relational expression (condition) and either continue with the next sequential statement or jump/branch to a target.
The next statement is either an assignment or another relational expression.
The location of the expression, its first two operands, and the operator are all that are required.
Dual-port embedded RAM can deliver the two operands simultaneously while a second RAM delivers the operator and next address.
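As a rough sketch of that idea (my own guess at an encoding, not a worked-out design): one list stands in for the dual-port data RAM, a second list for the control RAM holding the operator and target. The tuple layout `(kind, op, a, b, extra)` and the names `run`, `OPS` and `PROGRAM` are assumptions made up for this example.

```python
import operator

# Hypothetical control word: (kind, op, a, b, extra)
#   "assign": data[extra] = op(data[a], data[b]); continue at pc + 1
#   "test":   if op(data[a], data[b]) is true, continue at pc + 1,
#             otherwise jump to address `extra`
OPS = {"+": operator.add, "-": operator.sub,
       "<": operator.lt, "==": operator.eq}


def run(control, data, max_steps=10_000):
    """Execute control words until pc leaves the control memory."""
    pc = 0
    steps = 0
    while 0 <= pc < len(control):
        if steps >= max_steps:
            raise RuntimeError("step limit exceeded")
        kind, op, a, b, extra = control[pc]   # control RAM read
        x, y = data[a], data[b]               # dual-port data RAM read
        result = OPS[op](x, y)
        if kind == "assign":
            data[extra] = result
            pc += 1
        elif kind == "test":
            pc = pc + 1 if result else extra
        else:
            raise ValueError(f"unknown kind: {kind!r}")
        steps += 1
    return data


# data[0] = i, data[1] = total, data[2] = constant 1, data[3] = limit
PROGRAM = [
    ("test",   "<", 0, 3, 4),  # while i < limit (else jump past the end)
    ("assign", "+", 1, 0, 1),  #     total = total + i
    ("assign", "+", 0, 2, 0),  #     i = i + 1
    ("test",   "<", 2, 2, 0),  # 1 < 1 is false, so always jump back to 0
]

memory = run(PROGRAM, [1, 0, 1, 6])
print(memory[1])  # 15, the sum 1 + 2 + 3 + 4 + 5
```

There is no register file and no stack here: every step is two operand reads, one operator, and a next address. A real design would still have to settle word widths, how constants get loaded, and how the machine halts; running off the end of the control memory is just a shortcut for the sketch.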

I would not disagree with that.
But keeping multiprocessors coherent probably only works for a relatively small number of processors or applications where maintaining coherence is not overly burdensome.
A more scalable form of multiprocessing may require the adoption of incoherence... like the Internet.