Posted
by
Soulskill
on Saturday January 18, 2014 @05:55PM
from the and-how-quickly-could-EC2-win-the-crown dept.

Hugo Villeneuve writes "What piece of code, in a non-assembler format, has been run the most often, ever, on this planet? By 'most often,' I mean the highest number of executions, regardless of CPU type. For the code in question, let's set a lower limit of 3 consecutive lines. For example, is it:

A UNIX kernel context switch?

A SHA2 algorithm for Bitcoin mining on an ASIC?

A scientific calculation running on a supercomputer?

A 'for-loop' inside an obscure microcontroller that runs in all GE appliances since the '60s?"

That actually breaks the C standard, but I suppose control systems aren't much worried about portability.

The ANSI C standard defines two types of implementations: "hosted" and "freestanding". An embedded system would most likely be considered a freestanding implementation, in which case the entry point function can be whatever the implementation defines it to be. It might not even be named "main" (but if it is, it could return void if that's what the implementation says). That said, C99 allows main() to return void, even in a hosted implementation: 5.1.2.2.1 [coding-guidelines.com] gives "some other implementation-defined manner" as one of the options for main's definition. It notes in 5.1.2.2.3 [coding-guidelines.com] that "If the return type is not compatible with int, the termination status returned to the host environment is unspecified."

Lines of code I've written in HDL execute in hardware once every clock cycle, at, say 100 MHz on maybe 50,000 devices for at least a few years of active use each. That's like 3x10^20 executions alone, and I work for a specialty hardware company which has only sold ~50,000 devices I've designed over the past 13+ years. I'm quite certain other hardware developers have far, far more, and the original question doesn't necessarily seem to require code that executes in a processor versus inferring hardware.

Not sure, but OpenBSD has this at the very end of its main() [bxr.su]:

while (1)
        tsleep(&proc0, PVM, "scheduler", 0);
/* NOTREACHED */

I tried finding the FreeBSD equivalent, but their (Newbus?) code makes it entirely non-obvious where the loop is. Feel free to try your luck: only the comment describing what the startup function is supposed to do matches; the rest is quite different, down to the function name, mi_startup() on FreeBSD [bxr.su].

I work with a ton of electrical/controls engineers. Yes, it is still probably true, mostly because it is still cheaper and easier to do this through ladder logic. I forget the context, but one day, while talking to one of our SENIOR (30+ years) controls engineers, I was explaining some logic that would have taken probably 300 lines of code to implement in C#. His reply was simply: psh, I could do that in 3 rungs, don't bother.

I disagree. This may be the superlative of something, but I don't think "dumb" is it.

I actually think it's an interesting thought experiment. It immediately forces the reader to think about how pieces of code are used in the real world, both within and beyond their intended application. But it is also likely impossible to settle to anyone's satisfaction. I would trust a proposed answer to this question even less than I would an answer to "What was the size of the internet at the time of the Morris worm", or "How many lines of C code are there in existence".

Just because something's hard to measure doesn't make it dumb, though.

Personally, I think the biggest problem with Slashdot is the abundance of comments like this. Seriously, it might not meet your standards. I understand. Now get over it and stop wasting my time by writing it for the thousandth time, or actually submit an article that raises the bar. Whining is not really going to change anything.
Sorry, but I really had to.

What's even more hilarious: people like the GP who claim to hate a type of article, then proceed to post in each and every one of them, are in reality incrementing two counters in a database somewhere when they #1 click on the article and #2 post a comment. This indicates to Slashdot that the article was both interesting to read and interesting enough to have participation, and the interpreted result is that readers want MORE articles of that nature!

Every Ask Slashdot gets a comment pointing out that it's the dumbest Ask Slashdot ever, I know.

This time, it's really, really the case.

On the contrary. Unless you have a definitive and provably correct answer to this particular Ask Slashdot, which I didn't notice you providing, I would assert that it's an interesting question and you're just being a jackass.

I would have to guess some code in the BIOS that's pretty much the same on every platform. The POST routines for memory checking, for instance. That might actually get disqualified, as they may be written in assembler.

I would probably have to say whatever the inner loop is of the System Idle Process in Windows.

Ding, we have a winner.
Not supercomputer code. Sure, supercomputers are... super and all, but the biggest one only has around 1 million processing cores. How many windoze machines are out there, idling away?

What platform has the most computational power (number of CPUs x speed)? Due to the increase in speed, we can disregard any CPUs built before 2000.

In sheer numbers, mobile phones are the largest platform. So I would reckon it's some GSM codec/cipher.

I think, for now, microcontrollers can be ignored, because they have much lower computational power.

Desktops and supercomputers have more power per machine, but do they exceed the mobile phones in aggregate? If they are a relevant portion, then across mobile phones and desktops, perhaps some code related to network access is the most-run.

I doubt it would be something kernel-related (like bootup, context-switching), because the kernel usually does not (or should not) take up a lot of the computing time. If we go by number of entries only, then perhaps some networking code.

If so, I'm not sure which layer to look into though. The lower ones are called more often, but media is not the same across use cases.

It almost has to be a video / image codec if we are talking about internet era code. I mean the internet is what, 90% porn?

But in all seriousness, I would still say video codec code. All the devices out there consuming video at (usually) 24+FPS have to decode each frame. The line kind of blurs with DXVA / VDPAU and hardware decoding though.

Come to think of it, it could also easily be an audio codec, either in portable music players or cell phones.

That is usually done by an ASIC though, so it's not code per se. Parts of the radio in mobile phones are programmable, although they tend to be FPGAs rather than CPUs at the core, because that reduces power when doing DSP-type stuff. I'm not sure if FPGA code counts, because it's not really executed the way CPU code is.

There are a lot of similar candidates that fall into this trap. Hashing code, encryption code, checksumming code... Whenever it needs to be high performance, it's usually better to create a hardware implementation.

You are right about the hardware/BIOS aspect, but you aren't looking at the right device.

(nearly) Every computer has a video device with a loop running over the frame buffer, outputting pixels to the display output port. Even in the days of regular CGA 320x200 graphics on 60 Hz monitors, that amounted to 3,840,000 iterations per second. We are talking over three decades of this going on, on nearly every desktop and laptop computer built during that time (vector displays worked differently). And even in those early days of CGA, most of the time those machines were in a text mode with a pixel resolution of 720x240, still putting out a 60 Hz video signal (10,368,000 pixels per second).

A single CGA desktop machine in text mode, left on since January 1984, would have output 9,816,000,000,000,000 pixels to its display port so far. That's nearly 10 quadrillion pixels. Even if the average number of running desktop computers over the period were only 1 million (a severe lowball), all using that shitty low resolution at only 60 Hz, that's still over a sextillion iterations of that simple pixel-outputting loop.

I would say the average number of running desktops over the period since 1984 is more like 50 million, the average resolution over the period was 1024x768, and the average monitor refresh was 70 Hz. My guesstimate is about 2.606E+24 iterations of the framebuffer loop, over 2 septillion iterations.

(nearly) Every computer has a video device which has a loop running over the frame buffer, outputting pixels to the display output port.

Unless you're using a Sinclair ZX80/81 or some other peculiar device that's too cheap to include a graphics chip, that's hardware, not 'code'. If you expand the definition of 'code' to VHDL and other hardware design languages, there must be 'code' doing far more than a graphics chip would.

Emphasis mine. There generally was only one *C*PU, but there may have been other ALUs or peripheral controllers (which includes graphics chips). Processors, yes, but not CPUs in the context of those systems. My mobile phone has at least 5 fully-functional ARM processors in it (not cores, processors), for example, but only one of those is the CPU.

It’s not really executable as I understand it, but I am not a biologist. The translation from DNA to RNA is hard to construe as ‘execution’. Then in the next step the RNA goes to ribosomes to construct proteins. So maybe DNA is ‘compiled’?

The field of computational biology would probably have a good metaphor to map the ideas from biology to computer science.

It's not compiled, it's interpreted. If you had a single gigantic mRNA consisting of all your genes, that would be compilation.

You can think of DNA as source (in an extremely low-level language), mRNA as machine code, and ribosomes as microcontrollers. RNA polymerase transcribes DNA into RNA. In eukaryotes, snRNPs are optimizers (written by a lunatic, but no analogy is perfect) that rearrange the RNA; ribosomes interpret the RNA. You've got lots of ribosomes in each cell, so think of each cell as a massively parallel machine.

How could this ever be more than a guess? How could it ever be determined, documented, or verified?

And for that matter, what is the definition of whether something is "the same" piece of code? For example, if the same source code compiles to different instructions on two platforms, are they running the same code?

How about if one of them actually compiles code that gets executed, and the other optimizes it out?

Any time a mainframe does anything with a dataset in a batch job (i.e. allocate, delete, whatever) it runs IEFBR14, a null program, as a target program to satisfy a requirement in how jobs are created.

This means that banks, retailers, governments, you name it--when they process the back-end records that make modern life functional, IEFBR14 usually gets invoked somewhere.

As an aside, this program (which did absolutely nothing and, in binary form, was originally only 2 bytes long) had the dubious reputation of being the shortest program with a bug: it failed to clear the register that returns the completion code. Oops.

Error? More like bad coding: it originally relied on the return/exit-status register already being clear, and once the 'error' was 'corrected' the program doubled in size (from a single BR 14 instruction to a register-clearing instruction followed by the branch).
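For reference, the whole saga fits in a couple of lines. The original IEFBR14 was the single S/360 instruction BR 14 (branch to the return address in register 14); the fix prepended SR 15,15 to zero the return-code register. Sketched as C (the real thing is assembler, of course):

```c
/* The "null program", post-fix. Before the fix it was the moral
 * equivalent of a non-void function that falls off the end: the
 * caller read whatever garbage happened to be in the return-code
 * register (R15 on S/360). */
int iefbr14(void)
{
    return 0;   /* SR 15,15: explicitly clear the return code */
}
```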

As most of us know, and the rest of us ought to, x86 and many other CISC architectures have their instruction set decoupled from the internal microarchitecture by using microcode [wikipedia.org].

Since multiple microcode instructions can run for one machine instruction, there's likely a sequence of three or more instructions used by many common instructions (I'm guessing something pertaining to checking for cache misses?) that thus gets executed more often than any single opcode on that machine.

Most OSes have some code that runs when other processes aren't running, to measure the idle time. Certainly in Windows, this is a process in its own right. If the CPU is only 1% utilised, then the idle process is consuming most of the remaining 99% (with the kernel using a bit of that).

There are a bunch of problems with the question, especially how you define your minimum chunk of code. If we really define it as "any piece of code," then I'd go with some system functionality.

General search criteria:
- runs on many machines
- runs all the time
- execution time is extremely low
- has already been running for a long time

My personal guess would be a version of memcpy, because:
- it is used for virtually everything, everywhere
- the functionality has been there since forever (so one can assume a stable code base with little change)
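To make the candidate concrete, the inner loop in question is tiny. This is a naive byte-by-byte sketch, not any particular libc's implementation (real ones copy word-sized or SIMD chunks), but the loop structure is this simple:

```c
#include <stddef.h>

/* A minimal memcpy: three lines of loop that have shipped, in some
 * form, with nearly every C library ever built. */
void *my_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char       *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}
```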

For a GIF of less than two million pixels, say 1600x1200, and each pixel's colour selected from a palette of 256 colours and dependent on neighbouring pixels' colour values derived from the same iterative algorithm run 1 million times to maximise value stability per pixel, you're looking at running the same line of code 4.9152*10^14 times.

Assuming 100% (as in perfect) saturation of a 2 GHz processor core, that'll take just over 68 hours. Or, to use the old industry yardstick, 63.2 P90-days.

Based on the number of graphics cards out there, the highly repetitive nature of their workload, and the fact that that's all they do, it's probably something related to them. I thought of supercomputers running very small recursive routines, but they usually have a limited lifetime; older machines aren't fast enough and haven't kept running in any event.

Graphics though? I'd guess something in a very common graphics card would probably be in the scale to achieve the title of most-run code.

Most bitcoins are mined with ASICs or FPGAs, so that doesn't count as "code" since the algorithm runs directly in hardware. If you believe that algorithms implemented in Verilog or VHDL should count, then the "winner" still wouldn't be bitcoin, but something like the VHDL that implements the system clock on a billion x86s.