lkcl writes about his effort to go further than others have, and actually have a processor designed for Free Software manufactured: "A new processor is being put together — one that is FSF-Endorseable, contains no proprietary hardware engines, yet an 800MHz 8-core version would, at 38 GFLOPS, be powerful enough on raw GFLOPS performance figures to take on the 3GHz AMD Phenom II X4 940, the 3GHz Intel i7 920 and other respectable mid-range 100-watt CPUs. The difference is: power consumption in 40nm for an 8-core version would be under 3 watts. The core design has been proven in 65nm, and is based on a hybrid approach, with its general-purpose instruction set designed from the ground up to help accelerate 3D graphics and video encode and decode. An 8-core 800MHz version would be capable of 1080p30 H.264 decode, and have peak 3D rates of 320 million triangles/sec and a peak fill rate of 1600 million pixels/sec. The unusual step in the processor world is being taken to solicit input from the Free Software community at large before going ahead with putting the chip together. So have at it: if given carte blanche, what interfaces and what features would you like an FSF-Endorseable mass-volume processor to have? (Please don't say 'DRM' or 'built-in spyware.')"
There's some discussion on arm-netbook. This is the guy behind the first EOMA-68 card (currently nearing production). As a heads-up, we'll be interviewing him live next Tuesday, similarly to the Woz interview (although intentionally this time).

DRM, in some aspects - trusted computing - can be a positive thing.
My ideal system would have a root key I can set; without software signed by that key, it is a rock.

No, trusted computing is pointless. Let me explain: exploits are caused by bugs in your software. Even if your software is signed and encrypted, if bugs exist that allow stack smashing or heap pointer overwrites (buffer overruns), then your signed and encrypted "trusted computing" can end up actually being a remote code execution vulnerability. See also: return-oriented programming.
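To make that concrete, here is a minimal sketch of the bug class (a hypothetical parser, not code from any real product):

    #include <string.h>
    #include <stdio.h>

    /* Hypothetical record parser. This code could be signed, encrypted,
     * and verified at boot, and it would still hand control to an
     * attacker, because the flaw is in the logic, not in the delivery
     * mechanism. */
    static void parse_record(const char *untrusted)
    {
        char name[32];
        /* Classic stack smash: no bounds check on attacker-controlled
         * data. Input longer than 31 bytes overwrites the saved return
         * address; with return-oriented programming the attacker then
         * reuses the *signed* binary's own instructions as the payload. */
        strcpy(name, untrusted);
        printf("hello, %s\n", name);
    }

    int main(void)
    {
        parse_record("world");   /* fine */
        /* parse_record(<long attacker string>) -> code execution */
        return 0;
    }

Signing verifies who shipped the code, not whether the code is correct.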

Now perhaps you could brick your machine if the boot sector's been tampered with, but why would an exploit writer bother when they can jus

The reality is that signed executables are going to interact with unsigned data during bootup or normal operation, and the exploit to run unsigned code can be triggered at that point.

For example, the original Xbox could be convinced to run unsigned code through exploited game saves, and then system files (fonts, the audio DB) could be replaced with corrupted versions meant to trigger an exploit on bootup. This is how soft-modding was performed on the Xbox.

IMHO, they really need to push this for scientific computing initially, as that market tends to buy in bulk and is not very binary-dependent. They are claiming it is so low-power (2.7 W) that it would be easy to put an array of, say, eight of them on a 1U motherboard for 64 cores.

So I'm not seeing a power advantage here. More questions: does the chip do double precision, and at what rate? What's the memory bandwidth? Is there support for ECC/scrubbing, which is essential for Big Deal calculations? (The 7970 doesn't support ECC. The Tesla does, and it had better given the amo

For the record, the Tesla K20X TDP numbers include the memory (it's for the entire card).

A comment below says that it uses DDR3-1333. Total bandwidth of that, being extremely generous and giving them six memory channels (unlikely), puts you in the neighborhood of about 1/10th the memory bandwidth of the K20.

Combine that with the "how do you connect this to other things" problem, and this chip has no chance in scientific computing.

As a follow-up - it's one DDR3 channel - maybe 2. That puts it at about 1/30th of a K20.
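Back-of-envelope, assuming the 32-bit DDR3-1333 interface mentioned elsewhere in the thread and the K20's published ~208 GB/s:

    DDR3-1333, one 32-bit channel:  1333 MT/s x 4 bytes = ~5.3 GB/s
    DDR3-1333, two 32-bit channels:                       ~10.7 GB/s
    Tesla K20, 320-bit GDDR5:                             ~208 GB/s

    208 / 10.7 = ~19x        208 / 5.3 = ~39x

So "about 1/30th" is the right ballpark.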

People have tried to creep into Scientific Computing with processors like this (tile-based perf-per-watt SoCs). They haven't succeeded (see: Adapteva, Tilera, etc.). And they have much bigger budgets. :)

I always wondered why it is assumed that separate CPU and GPU are somehow the most efficient use of silicon. It just seemed counterintuitive to me. If the proposed processor is as efficient as claimed, it looks like I was right to wonder: it would absolutely annihilate Intel and AMD on a performance-per-watt basis.

Hopefully the FSF also patents it, so no troll can extort license fees for use of the technology. In fact, the FSF should patent it all, make the blueprints available RFC-style, and not bother with anything else.

Those performance numbers are pure fantasy. First off, the 38 GFLOPS figure is undoubtedly referring to single-precision operations, while the x86 processors mentioned in TFS do that much in *double* precision. Second, the 38 GFLOPS number is a simple arithmetic estimate of what the magic chip could do IFF every functional unit on the chip operated at 100% perfect efficiency. Guess what: a real memory controller that could keep the chip fed with data at that rate would use > 3 watts all by itself. This chip won't have a real memory controller though, so you can bet the 38 GFLOPS performance will remain a nice fairy tale instead of a real product.
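You can see where the figure comes from with back-of-envelope arithmetic (my own decomposition of the claimed numbers, not anything from the proposal):

    38 GFLOPS / (8 cores x 800 MHz) = ~5.9 FLOPs per core per cycle

In other words, every core would have to retire roughly six single-precision operations every single cycle, sustained, with zero memory stalls. That is exactly the "100% perfect efficiency" assumption.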

Indeed, high gigaflops is easy; useful high gigaflops is hard. You could easily build a processor that supports only float addition and nothing else, with a 1024-bit SIMD register clocked at 4 GHz. Voila: 32 lanes x 4 GHz = 128 GFLOP/s per core. Problem is, it's useless.

The question is not how many adds or muls you can do per second in an ideal application for your architecture. The question is how many adds or muls (or whatever you need to measure) you can do per second on a real application.

For instance, the Top500 uses Linpack, which measures how fast one can multiply dense matrices. That problem is of interest to only a small number of people.

Compare it to a more modern processor. You want floating-point performance? Take a look at a Sandy/Ivy Bridge. My 2600K, which I have set to run at 4GHz, gets about 90 GFLOPS in Linpack. The reason is Intel's new AVX extension, which really is something to write home about for DP FP. Ivy Bridge is supposedly a bit more efficient per clock (I don't have one handy to test).
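For reference, the peak arithmetic behind that 90 GFLOPS (standard published AVX figures for Sandy Bridge; nothing measured here beyond the Linpack number above):

    256-bit AVX register = 4 doubles
    1 add + 1 mul issued per cycle = 8 DP FLOPs per core per cycle
    8 FLOPs x 4 cores x 4 GHz = 128 GFLOPS peak

    90 / 128 = ~70% of peak sustained in Linpack

That ~70% is what a well-fed, well-tuned architecture achieves on the friendliest workload there is, which is worth keeping in mind when reading the summary's 38 GFLOPS claim.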

If you are bringing out a processor at some point in the future, you need to compare to the latest products your competitors have, since that is realistically what you face. You can't look at something two generations old, as the 920 is, and say "Well we compete well with that!" because if I'm looking at buying your new product, it is competing against other new products.

The summary is building expectations so much that I can't help feeling this is a massive flop (yup, I did that) waiting to happen.

I'd be really impressed if they did match the performance of the 920, even if it'll probably be somewhere between 5-10 years old by the time this Free CPU sees production and gets into consumer hands. That's quite a complex, performant CPU right there to match. But the summary has so many holes, I really have a hard time believing they'll get anywhere near the 920 for general-pur

The one anecdotal piece I have to complement the above: I was recently doing some work on a C application to improve the performance of some legacy Fast Fourier Transform code compiled with GCC. The original code did a bunch of heavy lifting with double-precision floats. I optimized the algorithm as far as I could without changing any data types and, as a last step, changed the doubles to pure 32-bit integer arithmetic, expecting at least twice the execution speed compared to the doubles on

* The proposal is dated December 2, 2012 for an advanced kitchen sink SoC with silicon in July 2013? Really?

* Their never-released-to-market CPU design that beats an ARM on one video-decoding benchmark is ready to go, except they need to move it to a new process, double the number of cores, and speed it up by 30%. Trivial, I'm sure.

* This bit here:

What's the next step?

Find investors! We need to move quickly: there's an opportunity to hit Christmas sales if the processor is ready by July 2013. This should be possible to achieve if the engineers start NOW (because the design's already done and proven: it's a matter of bolting on the modern interfaces, compiling for FPGA to make sure it works, then running verification etc. No actual "design" work is needed).

The design is done! They just have to, you know, grab their perfectly-working peripheral IPs from unstated sources, "bolt them on" to their heavily-modified CPU, and then compile for FPGA. And maybe some timing simulations for their new 40nm process, but I'm sure that won't turn up any problems. And "verification, etc." (aka the part where you actually make it work). And fixing any problems found in silicon. But no *actual* design work is needed.

I have spent the last three months in my day job on a team of a dozen people writing design verification test cases for a new SoC. Fuck you for talking like that's nothing.

* They're going to hit "Christmas sales"? So despite being a real honest for-profit multi-million-selling product, we swear, they're still targeting a consumer shopping season. Hint: you want your chip to go into other products. Products sold at Christmas time are designed long before Christmas. Probably more than six months before, i.e. July 2013. Oops.

* No mention of post-silicon testing, reliability studies, or even whether they've got a test facility lined up, or what kind of resources they need for long-term support. I said it when OpenCores pulled this crap [slashdot.org], and I'll say it again. Hardware is not software. You have to think about this stuff. Yield and reliability are what determine whether other companies buy your stuff and whether you make money from it.

Let me offer some advice to anyone who wants to change the semiconductor world overnight with the magic of open source: start small. Really small. Even Linus Torvalds didn't start out planning to conquer the world. Maybe you could start by trying to get open source IP blocks into commercial products. Once there's a bench of solid, field-tested designs, *then* we can talk about funding an attempt to put it all together. But coming out of nowhere and asking for $10 million is not the way to start. Just ask OpenCores -- their big donation drive got them a grand total of $20 thousand [opencores.org].

Thanks for that post... extremely informative, and it's good to know that people who really have to deal with these issues on a daily basis are paying attention.

As I said above: I have no problem with a project to build an "open" chip for education & hobbyists, but scam artists who know how to fool their marks with the correct buzzwords and hype are not doing anyone any favors.

pay attention 007: we're aiming for mid-2013, not yesterday :) literally yesterday: today's the 4th, right? also, we're open to all kinds of investment opportunities. this article is a heads-up.

also, bear in mind: the core design's already proven. mid-2013, whilst pretty aggressive, is doable *SO LONG AS* we *DO NOT* do any "design" work. just building-blocks, stack them together, run the verification tools, run it in FPGAs to check it works, run the verification tools again... etc. etc.

* The proposal is dated December 2, 2012 for an advanced kitchen sink SoC with silicon in July 2013? Really?

Perhaps my phrasing was unclear. I am skeptical of a six-month development process.

also, bear in mind: the core design's already proven.

By whom? To what specs (temperature, voltage, operating life)? Using what methodology?

mid-2013, whilst pretty aggressive, is doable *SO LONG AS* we *DO NOT* do any "design" work. just building-blocks, stack them together, run the verification tools, run it in FPGAs to check it works, run the verification tools again... etc. etc.

You know you can't go straight from RTL to silicon, right? You need timing sims and physical layout. Those are not trivial and they cannot be totally automated.

the teams we're working with know what they're doing. me? i have no clue, and am quite happy not knowing: this is waaay beyond my expertise level and time to learn.

Okay, here's the part that confuses me. You came up with an idea, talked to other people with expertise about doing it, and it sounds like you know who's working on it. All of that is fine. What I don't understand is why you are acting as the leader/spokesman for a project you know almost nothing about. Who are these other groups? The link at the bottom of your proposal is to a no-name Chinese semiconductor company that formed last year and has no products listed. Are they doing the RTL, layout, and verification? Who's doing the silicon testing? What foundry will you use?

The reason I'm being so harsh here is because you're asking for a lot of money with very little credibility. There is nothing in your proposal, your CV, or your comments to suggest that you are competent to work on a project like this. So who's doing the work? Why aren't their names on the proposal? Who has the experience and leadership to make sure the project actually gets done? Why are you "quite happy not knowing" what they're doing when you're the one trying to secure funding?

If you come back here in 2013 with a working chip I'll be the first to apologize, but right now I see very little reason to take this seriously.

Thanks for the info! I had a feeling EOMA-68 was nonsense too, but I stopped reading after discovering that A) his first big hardware project was developing an "industry standard", and B) they had to change the name from EOMA/PCMCIA because it wasn't actually compatible with PCMCIA.

The only thing I might be inclined to worry about is the possibility that he might sucker gullible people into donating to his obviously doomed project. (I'm not quite cynical enough to believe he's a scammer, but intent doesn't matter when the money's been flushed and donors can't ever get it back.)

Yeah, that was why I commented in the first place. There are too many overly optimistic software people here to let this sort of thing slide.

p.s. I also work for a fabless semi company. HATE YOU if you work for a direct competitor. (okay, not really ;)

Fabless, heck, I work for TI! We have plenty of fabs. Although we like foundries too. Ev

unless you consider 1333MHz 32-bit DDR3 not to be a real memory controller?

Thanks for filling in that detail, since I didn't know the precise specs (and for proving me right). To reiterate: no, this thing does not have a real memory controller compared to the 128-bit (two-channel 64-bit) or 192-bit (three-channel 64-bit) memory controllers in the AMD and Intel chips, respectively, that are mentioned in TFS.

You can go on and on about some busy-loop you were able to code that gets all those gigaflops. I can get a 386 to tell me the result of 100 quadrillion quad-precision add-muls where the only operands are zero in less than a second too... but it isn't useful work.
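To make the "not useful work" point concrete, here is a hypothetical micro-benchmark (mine, not anything from this thread's chip). Every FLOP below is real as written, yet computes nothing, and an optimizing compiler may even fold the whole loop away:

    #include <stdio.h>

    int main(void)
    {
        double acc = 0.0;
        /* A billion "add-muls" whose operands are all zero. The result
         * is provably 0.0, so these gigaflops represent zero useful
         * work (and a compiler may delete the loop entirely at -O2). */
        for (long long i = 0; i < 1000000000LL; i++)
            acc += 0.0 * 0.0;
        printf("%f\n", acc);
        return 0;
    }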

Trust me, if a chip even remotely like the one you are describing could do all that useful computational work in less than 3 watts using a previous generation process, then it would already have been deployed in supercomputers years ago and this wouldn't be some pie in the sky FSF project.

I have no problem with a hobby project to build a CPU with an open architecture, but frankly hyperbole and outright dishonesty about performance expectations are not doing you or anyone else in the project any favors. Being "open" should include being honest & realistic first and foremost.

First of all: Lots of non-x86 high-performance computers have similar memory controller layouts. Look at high-end SPARC or Power architecture systems.

Second of all: thanks for proving me right with your screed about how ARM chips don't have good memory controllers. Guess what: you're right! They don't! And guess what: the Cortex-A15 is the first ARM chip capable of beating a four-year-old Atom when clocked north of 1.5 GHz! So that's the type of performance that even the supposedly miraculous ARM gets with its architecture and a similar memory controller! You are now claiming to be insanely smarter than everyone at ARM and Intel simultaneously... if chips could be designed and built based solely on arrogance & ego, you'd put ARM & Intel out of business by next Tuesday.

So basically you have been trolling this thread, calling everybody who has pointed out flaws in your grandiose promises "007" in a smarmy and condescending manner, while presenting zero facts to back up your arguments and contradicting yourself at every turn.

From your annoying and repetitive use of "007", do you perchance speak with a British accent? Do you appear in infomercials at 2AM pushing whatever fake product of the day some insomniac can buy for $19.95? Because that's exactly how you come across in these discussions, and if you actually are associated with this project and aren't just a troll, then I'd highly recommend that the FSF immediately disavow this project before they end up getting sued when you make off with somebody's money.

And there aren't any processors AFAIK outside of the x86/x64 world that can match Intel and AMD designs in raw performance-per-watt. Trying to claim otherwise is dishonest, and as the parent mentioned, if it were true the top supercomputers wouldn't be wasting their time on Intel and AMD parts.

well, tell you what, rather than accusing, why don't you ask me to ask them?

It's not a matter of asking. If someone could match even a two-generation-old i7 design on 3 watts, they would have done so by now, undercut Intel, and made zillions. They can't, because Intel processors are really good, their R&D budget dwarfs the budget of most US states, and they own their own fabs and are 1-2 generations ahead of literally everyone else in process scale.

Thanks for proving my point: the x86 chips run at 1/2 the effective throughput for double-precision operations because the operands are twice as large. I never said a single word about instruction latency; you just invented that to make yourself sound smart while actually being stupid.

Tell me, do you go around to kindergarten classes and call the kids stupid when they say that 1 + 1 = 2?

H.264 can't (legally) be encoded without paying for a license... an interesting choice for an example. Yes, decoding is free at the moment, but these patents will be in effect until around 2020 or later and are part of the highly patented MPEG-4 standard.

I doubt very much that the people who control the HDMI spec would allow an FSF-endorsed CPU to do this anyway -- the FSF has no interest in enforcing DRM, and HDCP pretty much requires you implement it end to end.

I doubt very much that the people who control the HDMI spec would allow an FSF-endorsed CPU to do this anyway -- the FSF has no interest in enforcing DRM, and HDCP pretty much requires you implement it end to end.

I'm not sure you could reconcile those two views.

funny you should mention this. i raised it with Dr Stallman because the same sort of thing occurred to me: why support DRM?? well... his answer was: the DRM in HDMI is so utterly broken that it's as if it didn't matter. therefore, he's okay with it.

which i find absolutely hilarious. DRM is okay, as long as the keys are available, one way or the other [thus making the DRM irrelevant, one way or the other]. this is primarily what the fuss over the GPLv3 is about, because of the endemic tivoisation that o

you could have a radioactive sample that emits particles randomly, and use that as the basis for your random number generator. the features i would like to see in this chip, though, are virtualization acceleration similar to what the better x86 and x86_64 chips now have, and maybe hardware decoding of open media formats like Ogg too.

I write software that requires randomness to seed some key-generation routines, for inverse DRM -- where the user can validate mods other users make, or that my dev patches are valid (security as a value-add, not in the "prevent the game from running" sense). When I do need randomness, I simply ask for it: I require the user to pound on the keyboard and randomly shake the mouse about, using the inputs to generate a bit of randomness to generate state and bit selection of the other random inputs for constructing t
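A minimal sketch of that idea in C. Everything here is a stand-in: FNV-1a substitutes for a real cryptographic hash, a 64-bit seed for a real entropy pool, and since canonical terminal mode delivers a whole line at once, a real implementation would switch the terminal to raw mode to capture genuine inter-keystroke timing:

    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    /* Fold one value into the pool with FNV-1a (illustrative only). */
    static uint64_t mix(uint64_t h, uint64_t v)
    {
        h ^= v;
        h *= 0x100000001b3ULL;          /* FNV-1a 64-bit prime */
        return h;
    }

    int main(void)
    {
        uint64_t seed = 0xcbf29ce484222325ULL;   /* FNV offset basis */
        printf("mash the keyboard, then press Enter:\n");

        int c;
        while ((c = getchar()) != '\n' && c != EOF) {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts); /* arrival-time jitter */
            seed = mix(seed, (uint64_t)c);
            seed = mix(seed, (uint64_t)ts.tv_nsec);
        }
        printf("seed: %016llx\n", (unsigned long long)seed);
        return 0;
    }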

> The deadline: July 2013 for first mass-produced silicon
> The cost: $USD 10 million

This poster either has no idea or is dreaming. In 6 months he will not have an SoC through potentially several tape-outs, having first done System Engineering, Design, Synthesis, Layout, Verification, Validation, Documentation... and seemingly all without an existing organization. Or are SoC manufacturers lately doing short-term build-to-order processors? And the 10 million is not going to cover the necessary cost of all of the above. The masks alone might be that expensive, depending on the number of tape-outs necessary (which - without an existing organization and working design flow - will be a lot).

> The deadline: July 2013 for first mass-produced silicon
> The cost: $USD 10 million

This poster either has no idea or is dreaming.

both. i have no clue - that's why i posted this article online, as a way to solicit input and to double-check things - and i'm dreaming of success.

In 6 months he will not have an SoC through potentially several tape-outs, having first done System Engineering, Design, Synthesis, Layout, Verification, Validation,

what i haven't mentioned is that one of my associates (my mentor) used to work for LSI Logic, and he later went on to be Samsung's global head of R&D. he knows the ropes - i don't. we've been in constant communication, and also in touch with some people that he knows - long story but we have access to some of the best people who *have* done this sort of thing.

Documentation,

ahh, my old enemy: Documentation. [kung fu panda quote. sorry...] - yes, this is probably going to lag. at least there will be source code which we know already works. not having complete documentation has worked out quite well for the Allwinner A10 SoC, wouldn't you agree?

also, because this is going to be a Rhombus Tech Project, the CPU will *not* be available for sale separately. it will *ONLY* be available as an EOMA-68 module. no arguments over the hardware design. no *need* to do complex hardware designs. the EVB Board will *be* the "Production Unit" - just in a case, instead.

so by deploying that strategy, Documentation is minimised. heck, most factories in China have absolutely no clue what they're making. it might as well be shoes or handbags, for all they know. heck, many of the factories we've seen actually *make* shoes and handbags, and their owners have gone "i know, let's diversify, let's make tablets". you think they care about Documentation? :) ... ok, i know what you mean.

... and seemingly all without an existing organization.

yeah. it's amazing what you can do if you're prepared to say "i don't know what i'm doing" and ask other people for help rather than try to keep everything secret, controlled and "in-house". my associates are tearing their hair out, i can tell you :)

Or are SoC manufacturers lately doing short-term build-to-order processors? And the 10 million is not going to cover the necessary cost of all of the above. The masks alone might be that expensive, depending on the number of tape-outs necessary (which - without an existing organization and working design flow - will be a lot).

well, because i know nothing, i've asked people who do know and have a lot of experience. the procedure we'll be following is to get an independent 3rd party - one that partners with the foundry - and get them to do the verification, even if the designers themselves have run the exact same tools. if it then goes wrong, we can tell them to fix it... *without* the extra cost of another set of masks. a kind of insurance, if you will.

but the other thing we are doing is: there will be *no* additional "design". it's a building-block exercise. the existing design is already proven in 65nm under the MVP Programme: USB-OTG works, DDR3-1333MHz works, RGB/TTL works, the core works, PWM works, I2S works, SD/MMC works and so on. all we're doing is asking them to dial up the macros to put down a few more cores, and surround it with additional well-proven hard macros (HDMI, USB3, SATA-II).

does that sound like a strategy which would, in your opinion, minimise the costs and increase the chances of first time success?

> yes, this is probably going to lag. at least there will be source code which we know already works.
> not having complete documentation has worked out quite well for the Allwinner A10 SoC, wouldn't you agree?

I don't know the A10 with the euphemistic name, but the typical SoC MCU I know has documentation in the thousands of pages. And most of it is on internal blocks, not external connections - which might see a reduced need by delivering it only on a board, although then you need to document t

"So have at it: if given carte blanche, what interfaces and what features would you like an FSF-Endorseable mass-volume processor to have?"Standard size chip socket, with adapter springs and guides for using off the shelf cooling implements (like zalman fans, and watercooling), for other CPUs.

need PCI and PCI Express, preferably at least 24 lanes, hopefully as many as 48 lanes.

Behind this, fast northbridge/southbridge buses to keep up with the following. I think AMD open-sourced HyperTransport, so front-side bussing should not be an issue.

"So have at it: if given carte blanche, what interfaces and what features would you like an FSF-Endorseable mass-volume processor to have?"

thank you for taking me literally! really appreciated!

Standard-size chip socket, with adapter springs and guides for using off-the-shelf cooling implements (like Zalman fans, and watercooling) for other CPUs.

ah. this is going to be a 15mm x 15mm BGA with only around 320 pins. it's tiny. ok, that might have to be revisited now that i've thought about doing an 8-core monster - 3 watts in a 15 x 15mm package is hellishly hot. i'm still debating whether it should have dual 32-bit DDR3 lanes. even so, that only adds an extra... 75 or so pins, bringing it up to maybe 19 x 19mm.

need PCI and PCI Express, preferably at least 24 lanes, hopefully as many as 48 lanes.

ahhh... PCI Express is a bug-bear. that many lanes would, on their own, turn this into a 12-to-30-watt part: right now we're aiming at a different market. i'm happy to be steered in a different direction if it can be shown that it's a genuinely good idea, with a high chance of return on investment.

Behind this, fast northbridge/southbridge buses to keep up with the following. I think AMD open-sourced HyperTransport, so front-side bussing should not be an issue.

ah, this is an embedded processor: they don't have northbridge/southbridge buses [at all]. those are reserved for CPUs in the 10+ watt market.

If you're still mulling over the instruction set, a built-in crypto processing unit would ROCK. Implement Intel's AES-NI or something similar, plus more for Twofish, Serpent, and other fairly mainstream, modern, unbroken Free/Open encryption algorithms. Then add hash instructions for the entire SHA family of hashes, MD6, Whirlpool, Tiger, RIPEMD, and GOST.

ok - this is a general-purpose processor that *happens* to have been designed to be capable of doing a GPU and a VPU's job. hmmm... i wonder whether their instruction set can do crypto primitives.. hmmm.... yeah, that's a great question to ask. i'll get back to you on that.

GOOD USB 3 support, with legacy support for 1 and 2. Not only do I want some ports on the back, I want at least 3-4 banks of header pins on a theoretical motherboard for front-panel devices and ports. They should be USB 1, 2, 3. Solid high-speed memory controller at a premium.

definitely going to have 1x USB-OTG, probably 2x USB2-HOST, and at least one USB-3.

Universal SATA support for revisions 1, 2 and 3 (1.5 Gb/s, 3.0 Gb/s and 6.0 Gb/s respectively), built-in RAID controller. eSATA would help too.

i'm reluctant to push this IC towards 6 Gb/s - it'd be far and away the fastest bit of I/O on the chip. RAID, i'd be concerned about pushing up the cost for the mass-volume uses [which wouldn't use it]. eSATA is _great_ - i'd forgotten about that.

scalable audio chipset capable of up to 7.1 surround, stereo input, S/PDIF and all the other great audio features.

S/PDIF - i'd not *entirely* forgotten about that - will remember to make a mental note. for audio, i would like to rely on the processor itself for basic things (headphones and the like), otherwise handing off to a standard I2S/AC97 audio IC for cases where people really want more complex audio. there are 3 I2S interfaces, i think.

so, yeah - i want audio to be done more like the TI McBSP. DMA-driven, but use the main processor for audio handling. keep it simple.

DDR3 RAM, or something comparable.

already done. 1333MHz. bit concerned personally about the power consumption of 1333MHz; i know that 800MHz is about 0.3 watts, for example: 1333MHz is starting to get to 1, maybe 1.5, watts all on its own!

this chip is more like MIPS-with-3D-ASE, or Ingenic-with-XBurst. you *can't* separate the GPU from the CPU: they're one and the same. ok, you could... but you'd end up with two identical processors connected by some sort of fast bus... why bother? why not just double the number of cores?

0) A proper MMU and at least 1MB of cache
1) 64-bit - if not, there will be a need for yet another version at some point. Just do this.
2) Double-precision floating point in hardware (for + - * / and preferably rsqrt)
3) GCC support
4) LLVM support
5) LLVMpipe for OpenGL support
6) It would be nice if some instructions were optimized for running virtual machines.

I haven't looked into what makes sense for #6, but with all the VMs around it would be nice to have them run efficiently.

So will this 100% free processor follow a 100% free fabrication process? What is the use in worrying about dependencies on proprietary vendors' architectures when the ability to replace a 3rd- or 4th-generation processor with an equivalent part requires production through a proprietary vendor's manufacturing process?

So will this 100% free processor follow a 100% free fabrication process?

interesting question! if it became an issue, i'd get quite pissed and would, if forced to, look for alternative processor designs. that's the whole point of the EOMA-68 and the Rhombus-Tech strategy: the products are *not* dependent on one particular CPU - processors are on *modules* that are completely interchangeable. but... i like the idea. i'll have to think how to handle this one - it's not actually our design.

So what is this to be attached to? A virtual motherboard with non-Nvidia / Intel / Marvell / Broadcom... virtual chipsets? This will be quite a long march to the desktop....

not really. the plan is to release it exclusively as an EOMA-68 module, which itself will be both the EVB *and* the mass-production PCB (just in a metal case). what we'll do is the same thing as done with the Allwinner A10 card: make the module power-able from the USB-OTG as a stand-alone computer that also has an HDMI output. so it'd be a larger version of these USB-dongle computers like the MK-802, except with more "oomph" and the option of being able to plug it directly into desktop chassis, tablet

First, a boatload of cores. 8 is a good start, but I want 128 or 1024. One idea would be to have variable cores: a handful of crazy-powerful cores and then another 1000 lightweight cores for all kinds of lightweight stuff.

But the difficult question is how compatible with existing things to make it. If you venture too far into the land of cool, you might end up with only a tiny bunch of hardcore followers, like Lisp and Erlang presently have. I am not saying that Lisp or Erlang are good or bad but wh

Can we please move away from x86? That architecture is horribly outdated, loaded down with things that sort-of made sense in the 1970s. Today's x86 CPUs are just dressed up RISC machines; let's free up some of that chip space and just use RISC.

Today's ARM architecture is just a dressed-up CISC architecture; let's move away from ARM's lame attempts at copying AVX with NEON and just use the real thing!

(You see how the door swings both ways there? Trust me, if an architecture designer from the early 1990s were frozen in a block of ice, thawed out today and then shown the ARMv8 ISA, he would never in a million years call it "RISC".)

We pretty much _have_ moved away from x86. It really only lives on in server, desktop and laptop form. Tablets, phones, and appliances are close to 100% non-x86 and vastly outweigh the x86 market in terms of units in service and probably total market value.

Tablets, phones, and appliances are close to 100% non-x86 and vastly outweigh the x86 market in terms of units in service and probably total market value.

I'm a nerd, not an MBA. I want to tinker. I don't give two shits about market value or items in service, and I have no idea why you do. However, I also don't care what instruction set a chip is running; they're so fast these days that emulation works. Nobody writes in assembly these days; as long as you have a compiler for the chip, it's all good.

That architecture is horribly outdated, loaded down with things that sort-of made sense in the 1970s. Today's x86 CPUs are just dressed up RISC machines; let's free up some of that chip space and just use RISC.

this team came from the perspective of what makes a good GPU, then turned it into a CPU. it's about as far from x86 as you can possibly get. luckily they've done the hard part of porting at least one OS (android), so they have proven the tools, the compiler, the kernel, everything.

with linux now being the main OS it's hard for me to even remember that windows and x86 were relevant at one point. not that i'm ruling out the possibility of MS porting windows to this chip: if they want to, that's great: they'll just have to bear in mind that there will be no DRM, so they won't be able to lock everyone out.

If you want to run x86 binaries, use a dynamic translation tool.

who was it... i think it was ICT who put 200 special instructions into the Loongson 2H, which allow it to accelerate emulation of the most common x86 instructions: they got 70% of the main processor speed.

It's time to put this one to rest. It's been a few decades, and we've seen the argument go from theory, to practice, to conclusion.

x86 (and its, er... extensions/evolutions) IS the better general-purpose arch. But not for the reasons anyone conceived of. I think it's best put this way.

1. RISC (for example) is very good at running good code.
2. Most code is bad. (No really, it's awful. Ask any programmer.)
3. x86 processors, it turns out, are very good at running bad code.

Many other arches were created under the premise that good code could be created for them automatically. It turns out that compilers that can do this are like unicorns: they don't exist. It's an NP-hard problem.

It's what killed Itanium. The magic compilers never turned up. The amount of developer effort required to write good software isn't worth it.

*Why is most code bad, you ask? Easy. Programming, put crudely, is a bullshit art. Just ask Dijkstra (well, not anymore; he's dead now). Programs are math. Few programs, however, are proven to be "correct" mathematically; it's impractical for most applications. Sure, you have rules you call "Practices" that tend to generate better code. But everyone knows how code is really developed nowadays: lay it down, slap it around until the show-stoppers are reduced to a bearable frequency, and patch up anything you missed after it ships.

I'm not saying this approach is necessarily bad. It has advantages. It's very fast, and you can get a lot of useful work out of it. If your idea or application is good or novel or productive enough, you can put up with some bugs and still come out ahead at the end of the day. If you set out to write a program that's mathematically provable from start to finish, your competitors will have buried you years before your first release.

How about implementing just a few of the most common C-library functions in dedicated hardware? For example, atoi(), strlen(), or printf(). Although the software routines are highly optimised, they still take hundreds to thousands of cycles. Dedicated libc functions would require a significant amount of chip die space, BUT they would be really power-efficient: powered off most of the time, and simply used when needed. Imagine being able to use these functions as single-cycle commands... even if the core ran at 100MHz, the performance would be amazing. Essentially it lets us trade a few hundred thousand transistors (now very cheap) for a few mW (still rather valuable).

How about implementing just a few of the most common C-library functions in dedicated hardware? For example, atoi(), strlen(), or printf(). Although the software routines are highly optimised, they still take hundreds to thousands of cycles. Dedicated libc functions would require a significant amount of chip die space, BUT they would be really power-efficient: powered off most of the time, and simply used when needed. Imagine being able to use these functions as single-cycle commands... even if the core ran at 100MHz, the performance would be amazing. Essentially it lets us trade a few hundred thousand transistors (now very cheap) for a few mW (still rather valuable).

Yeah, but how do we decide which functions those are? And why C functions? And once we hard-code those functions into silicon, we have to jump through extra hoops to change their behavior.

All three of your examples make a weak argument for this. atoi() is out of favor, since it doesn't detect errors like the strtol() function does. strlen() has no safety or bounds checking, and printf() is horribly complex.
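The atoi()/strtol() difference is easy to demonstrate (plain C, nothing specific to any project):

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const char *input = "123abc";
        char *end;

        errno = 0;
        long v = strtol(input, &end, 10);
        if (end == input)
            printf("no digits at all\n");
        else if (errno == ERANGE)
            printf("out of range\n");
        else if (*end != '\0')
            printf("parsed %ld, trailing junk: \"%s\"\n", v, end); /* taken here */
        else
            printf("clean parse: %ld\n", v);

        /* atoi() returns 123 for "123abc" and "123" alike, with no way
         * to tell the cases apart (and undefined behavior on overflow). */
        printf("atoi says: %d\n", atoi(input));
        return 0;
    }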

BTW, some instructions in the x86 family are very specific for things exactly like this already.

Yes. Lots of them. Not only are these instructions slow, they're useless. No one needs the ASCII or decimal adjust instructions AAA, AAD, AAS, AAM, DAA, or DAS anymore, and they were never much use to start with. There have been a few cases in which these instructions were cleverly used for other than their intended purpose, but those are rare. Then there's the REP with CMPSB, CMPSW, SCASB, and SCASW instructions. They're useless for st

I couldn't care less if it is x86-compatible (I assume it is emphatically not). I'm sure the FSF does not care, either. I would use this in a heartbeat for my main desktop, and since I haven't had any significant dealings with Windows in at least 8 years, all I need is a free POSIX OS (probably Linux) and a C/C++ compiler.

Contrary to Linux zealots' belief, Windows is not the only proprietary software on the planet.

I don't really use Linux, although that's going to change soon as I will be working with it daily. I haven't used Windows in almost a decade.

But what other proprietary code do you have in mind? How many proprietary operating systems are there?

Windows - already mentioned.
RTOS - if you need this, stay with i386.
OS X - the license does not allow you to run it on anything but Apple hardware, so why would you care?
If you are amongst the remaining 0.1% using a niche proprietary OS, then I guess x86 compatibility might be nice.

If you are using proprietary code on an open-source OS (e.g. Flash on Linux) then x86 code execution would be nice, but in the long run, the ability (and documentation) to create an open-source Flash implementation has been available for years now. If anything, having an x86 proprietary Linux Flash player has hampered development of FOSS Flash implementations, as there is less of a perceived need for one.

Quartus, Maya, MATLAB... there are many more closed-source packages than just Flash. And all the upcoming Steam games for Linux will be binary x86.

It's beyond time to dump x86. It was a bad processor from the beginning. I learned the Motorola 6809 as my first assembly-language processor, and then when I went to learn Intel's x86, I was like, WTF?! The inconsistent way the instructions and registers were used blew me away and made me appreciate the Motorola way a lot more. But the popularity of the PC kept the processor going and growing. I know things in the x86 world have improved some, but it all still maintains that backward compatibility and the CI

If this processor is going to be designed and licensed under GPLv3, I guess one won't be able to build any license-compatible proprietary software for it either. Curious - but count me out :)

ah interesting. no, it wouldn't be. i believe there are two separate misunderstandings here.

first: i did actually look some time ago at LEON... v2 i think it is, which is LGPL-licensed, i think, by Gaisler Research, but the amount of work needed to turn it into a modern GPU/VPU-competitive processor would be too costly. then there is the stuff on http://opencores.org/ [opencores.org], but it's not really ready for prime time - i've been keeping an eye on the projects there for quite some time [none of them are SMP-capable, for example]

instead, i kept hunting, spoke to tensilica about their core (which is superb btw!), talked to Synopsys about their core (ARC), and even came up with a way to do software-interrupt-driven SMP (yes, i ran it by alan cox on LKML!). when this current design popped up, and i saw both its capabilities and that they are willing to respect the GPL regarding the toolchain, i jumped at the chance.

second misunderstanding is over the design of *hardware* impacting what *software* it can run. it would be necessary to have a modified version of the GPL, stating "all and any software programs running on this hardware *must* be GPL-licensed". the impact of that would be extremely problematic, as well as being rather fascist and not in the spirit of free software at all.... and, as it would be a modified version of the GPL, it wouldn't *be* the GPL, so it could not be FSF-Endorsed.

with that as background, to answer the question directly: this is a proprietary design just like all other proprietary designs, using off-the-shelf completed and *tested* hard macros (including the core processor itself albeit only under the MVP Programme), where there is no restriction of any kind on the software that can be run on that processor, be it free software or proprietary software.

Hmm, one problem I have with the full GPL is that it *is* by design rather intent on spreading itself virally, to the exclusion of other legitimate models, and thus a restriction on what software the hardware would be allowed to run would unfortunately be in keeping with the GPL.

I agree that that would be excessive, but then I think that the full GPL is generally excessive.

You may guess that I prefer to license my stuff under BSD licences to allow fully commercial uses. B^>

Hmm, one problem I have with the full GPL is that it *is* by design rather intent on spreading itself virally, to the exclusion of other legitimate models, and thus a restriction on what software the hardware would be allowed to run would unfortunately be in keeping with the GPL.

you are absolutely, absolutely dead wrong. waaayyyy off base.

I agree that that would be excessive, but then I think that the full GPL is generally excessive.

You may guess that I prefer to license my stuff under BSD licences to allow fully commercial uses. B^>

Rgds

Damon

and how's that working out in the android community? you've seen the list of GPL violations as people mistake "android equals linux", yeah? it's a serious problem, and it's why i started the whole rhombus-tech initiative: to get free software developers involved right from the beginning in the mass-volume industry, right the way through to sales in hypermarket retail stores. the "dream", if you will, is for free software people to be able to walk into a supermarket and go "fuckin' A! i helped write the software for that! you wanna buy one of these, grandma, i can replace the OS in no time, with something that i can manage remotely for you".

you have to remember that the BSD license was designed and written at a time when everyone in the industry trusted everyone else (because they knew them personally). *everyone* shared source code. then fuckers like apple came along and went "thank you very much. BYE". at one point, microsoft's NT team took the BSD-licensed TCP/IP stack and put it directly into MSRPC (because winsock was so shit). it's taken almost 20 years for Wine to finally reverse-engineer MSRPC. i really don't understand people who don't understand why the GPL is so necessary, i really don't.

Why would that be? The GPL would only cover the processor's design. So unless you are removing the die from the case and grafting on additional logic gates to add some new proprietary function to the CPU, you are fine. Running proprietary code on the processor is just using the processor. Saying your code has to be GPL would be like saying every document you write in a GPL'd office suite has to be GPL. It doesn't make any sense!

This is a tricky thing, not much different from signed bootloaders. In theory, it can be great for users; in practice, it is likely to be exploited by people wishing to push DRM and other non-free systems on us. See: