Samsung releases code of WebCL implementation for WebKit


The WebCL standard is still a work in progress, but the first experimental implementations have already arrived. Samsung has opened the source code of its WebCL prototype for WebKit, which is designed to run on Mac OS X. The company has also published some videos that demonstrate the efficacy of WebCL in action.

WebCL—not to be confused with WebGL—is a new Web standard that is being devised by the Khronos Group. It will provide JavaScript bindings for OpenCL, a framework that allows software to offload general-purpose computing operations to a GPU. The goal behind WebCL is to bring OpenCL to the Web—making it possible for sophisticated Web applications to significantly accelerate computationally intensive workloads.
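Since the WebCL API is still being drafted, the snippet below is only a sketch of the programming model rather than the actual bindings: the GPU program is written in OpenCL C and carried as a JavaScript string, and a plain-JavaScript function shows what the kernel computes. The `vadd` and `vaddReference` names are mine, not taken from Samsung's examples.

```javascript
// OpenCL C kernel source, carried as a JavaScript string. This is how GPU
// programs are supplied to OpenCL, and how the WebCL proposals pass them in
// as well. The kernel adds two float arrays element-wise.
const kernelSource = `
__kernel void vadd(__global const float* a,
                   __global const float* b,
                   __global float* c) {
  int i = get_global_id(0);
  c[i] = a[i] + b[i];
}`;

// A WebCL implementation would compile kernelSource, copy the input buffers
// to the GPU, and enqueue one work-item per element. The exact JavaScript
// entry points are still being drafted, so here is a plain-JS reference of
// the same computation instead:
function vaddReference(a, b) {
  const c = new Float32Array(a.length);
  for (let i = 0; i < a.length; i++) {
    c[i] = a[i] + b[i]; // each GPU work-item computes exactly this line
  }
  return c;
}

const out = vaddReference(new Float32Array([1, 2, 3]),
                          new Float32Array([4, 5, 6]));
console.log(Array.from(out)); // [ 5, 7, 9 ]
```

The key difference from an ordinary JavaScript loop is that the kernel body runs once per work-item, with `get_global_id(0)` standing in for the loop index.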

OpenCL was largely developed by Apple and NVIDIA, but it's an open standard with relatively broad industry support. The Khronos Group, which maintains the OpenCL specification, is hosting the effort to guide WebCL through the standards process. There is no official WebCL specification yet because it is still being drafted, but preliminary JavaScript APIs for the standard have been proposed.

Samsung and Nokia are both building prototypes that demonstrate the viability of integrating WebCL in mainstream browsers. Samsung opened the source code of its WebKit-based WebCL implementation last week and announced its availability in a message posted to the WebKit developer mailing list. Nokia has its own separate WebCL implementation, which was built for Mozilla's Firefox Web browser.

To illustrate the potential performance gains that developers can get out of WebCL, Samsung showed how the technology can be used to increase the frame rate of an animated N-body simulation on the Web. A version of the simulation that used conventional JavaScript for the necessary computations rendered at only 5-6 frames per second, but the WebCL version ranged between 78 and 114 frames per second.
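For a sense of what the benchmarked JavaScript version has to do each frame, here is a minimal sketch (my own illustration, not Samsung's demo code) of the O(n²) gravity computation at the heart of an N-body simulation. Each body's acceleration is independent of the others', which is exactly why WebCL can farm the outer loop out to GPU work-items.

```javascript
// Minimal O(n^2) gravity step: the per-frame work an N-body demo does in
// plain JavaScript, and the kind of data-parallel loop WebCL can hand to
// the GPU (one work-item per body i). Names and the softening constant are
// illustrative, not from Samsung's actual demo.
function accelerations(pos, mass, softening = 0.01) {
  const n = mass.length;
  const acc = new Float32Array(n * 3);
  for (let i = 0; i < n; i++) {          // parallelizable across bodies
    let ax = 0, ay = 0, az = 0;
    for (let j = 0; j < n; j++) {
      if (i === j) continue;
      const dx = pos[3 * j] - pos[3 * i];
      const dy = pos[3 * j + 1] - pos[3 * i + 1];
      const dz = pos[3 * j + 2] - pos[3 * i + 2];
      const d2 = dx * dx + dy * dy + dz * dz + softening;
      const inv = mass[j] / (d2 * Math.sqrt(d2)); // m_j / |d|^3
      ax += dx * inv; ay += dy * inv; az += dz * inv;
    }
    acc[3 * i] = ax; acc[3 * i + 1] = ay; acc[3 * i + 2] = az;
  }
  return acc;
}
```

With thousands of bodies the inner loop dominates, so moving it to the GPU is what produces the order-of-magnitude frame-rate jump described above.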

Samsung's demo

The code of Samsung's WebCL implementation is distributed under the BSD license and is available for download from the project's version control repository on Google Code. The repository also includes some examples that show how WebCL can be used in JavaScript.

While WebCL sounds interesting, I have to ask what sort of stuff would be possible in a web browser as a result of this. N-body simulations are good tech demos of parallel processing power, but they're not exactly useful.

What would the general use case of WebCL in a browser be? I am not saying there isn't one, simply I can't imagine one at the moment.

Um, what? You can't imagine one? Really!? The "general use case," as stated directly in the article, is applications. Duh. What's the use for OpenCL anywhere else? Answer: accelerating applications. The goal of WebCL is simply to continue the march of capability of web apps, simple as that, and thus enable web apps to solve ever more problems that would previously have required a native app.

Image processing is better done with WebGL, and OpenCL isn't well suited for audio processing, as most audio processing algorithms are iterative and depend on the result of the computation on the previous sample. And why would you want to do any of these, especially video processing, in your browser?
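The point about audio being hard to parallelize can be made concrete with a toy example (mine, not from the thread): a one-pole low-pass filter, where each output sample depends on the previous output, so there is no way to assign one GPU work-item per sample the way an image filter assigns one per pixel.

```javascript
// One-pole low-pass filter: y[n] = y[n-1] + a * (x[n] - y[n-1]).
// Each output sample depends on the PREVIOUS output, so the loop carries a
// serial dependency. You cannot hand one work-item per sample to the GPU
// the way you can with per-pixel image filters. (Toy illustration only.)
function onePoleLowpass(input, a) {
  const out = new Float32Array(input.length);
  let y = 0;
  for (let n = 0; n < input.length; n++) {
    y = y + a * (input[n] - y); // needs y from the previous iteration
    out[n] = y;
  }
  return out;
}
```

Parallel audio processing is still possible at coarser granularity (per channel, or per independent block), but the sample-by-sample recurrence itself is inherently sequential.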

I shudder at the thought. Brain-dead web developers are already able to eat up gobs of memory and CPU cycles with the tools they already have (in order to try to do something as straightforward as showing an advertisement or fetching some data via AJAX). Why would I give them the capability to communicate directly with my hardware?

Wouldn't it be better to implement full hardware acceleration, instead of adding more and more Javascript APIs? I'm satisfied with the speed of web applications based on HTML5 canvas running in IE9+. Or is this something unrelated to hardware acceleration?

For the people questioning the why, in your case it doesn't matter. This sort of capability will be exploited by those that innovate.

Yep, I can't wait for advertising that not only uses tons of CPU cycles but also abuses the GPU. No, seriously: I can understand the reasoning behind OpenGL (games), but what computation-intensive application do you have in mind that would work better in a browser than as a desktop application?

JM_ wrote:

Wouldn't it be better to implement full hardware acceleration, instead of adding more and more Javascript APIs?

Does not compute. The whole idea behind this is to implement hardware acceleration, and what does the API that is used to call it have to do with that? You need some way to invoke the stuff after all, and JavaScript is the obvious candidate for websites. The JavaScript will just call some C++ that talks to the graphics driver and does some sanity checks, I assume.

What would the general use case of WebCL in a browser be? I am not saying there isn't one, simply I can't imagine one at the moment.

Um, what? You can't imagine one? Really!? The "general use case," as stated directly in the article, is applications. Duh. What's the use for OpenCL anywhere else? Answer: accelerating applications. The goal of WebCL is simply to continue the march of capability of web apps, simple as that, and thus enable web apps to solve ever more problems that would previously have required a native app.

Applications certainly, but given security concerns what sort of unsigned code on the public web do you want using your OpenCL capability directly? I am likely to want to limit this sort of code to executing in the local domain, or to trusted domains. I can imagine scientific applications using this, but without the ability to save the calculations locally, what are we achieving?

It seems to me that ~10x faster than JScript for this kind of operation isn't all that impressive. I imagine a good parallel native code CPU implementation using SSE3 would do at least as well, let alone a real GPU accelerated implementation.

Also, implementing video decoders on the GPU is harder than it sounds. Codecs like H.264 often use CPU-intensive linear-only stages for entropy coding (CABAC in H.264's case). There've been hybrid CPU/programmable GPU decoders (like the Xbox's H.264 decoder for HD DVD), but those rely on having a lot of CPU horsepower available.

Just like with WebGL, we're reaching dangerous complexities here; web development is (ideally) rapid and agile, and the result is pretty much in a sandbox, is (theoretically) lightweight, easy to use, and accessible. I'm not saying that I can't imagine a use for this feature, but I find it disturbing that we're trying to make the web browser act like a full application launcher environment, ultimately trying to achieve total portability (which has a high price: just think about Java applets for a moment).

Samsung actually making contributions! Unheard of! Snarkiness aside (they do make good hardware, and have some of the best display and flash memory tech), this will be great for further adoption of HTML5: 5-6 frames jumping to 70-110 is a huge improvement, and I think we can only expect better things to happen. I think this also demonstrates how the GPU is taking over work from the CPU; from being used in various simulations and password-cracking algorithms, it's moving more and more into practical everyday computing.

Image processing is better done with WebGL, and OpenCL isn't well suited for audio processing, as most audio processing algorithms are iterative and depend on the result of the computation on the previous sample. And why would you want to do any of these, especially video processing, in your browser?

I can imagine lots of interesting uses for this. Purely website-based folding, bitcoin mining, etc. come to mind. In terms of security I'm sure there are plenty of issues, but I wouldn't accept such a plugin from any site I didn't trust. I don't see any security issues from that angle; I would have trusted a stand-alone application from the same site anyway.

Image processing is better done with WebGL, and OpenCL isn't well suited for audio processing, as most audio processing algorithms are iterative and depend on the result of the computation on the previous sample. And why would you want to do any of these, especially video processing, in your browser?

What would the general use case of WebCL in a browser be? I am not saying there isn't one, simply I can't imagine one at the moment.

Um, what? You can't imagine one? Really!? The "general use case," as stated directly in the article, is applications. Duh. What's the use for OpenCL anywhere else? Answer: accelerating applications. The goal of WebCL is simply to continue the march of capability of web apps, simple as that, and thus enable web apps to solve ever more problems that would previously have required a native app.

EXACTLY !

put simply, new languages like C++ AMP extensions and OpenCL/WebCL are gateways to using your hardware more efficiently for existing and new applications. who would rather have their GPU sitting idle most of the time (seriously, have a look at its usage over 24hrs) compared to having it work for them just like an additional CPU?

it's about increasing the efficiency of the bloody hardware, which currently has so many dedicated forms of silicon and protocols that it's very inefficient and rarely realizes its potential.

standardizing languages that can access ALL your hardware, and later perhaps the cause-and-effect of making hardware suited to these languages, means that we can have more performance and lower power requirements, and benefit from smaller, quicker, easier-to-make hardware, and have a life where we don't need large devices to do our work, because we will have evolved past the point where we stupidly put 20 different chips in a PC and have a mess of protocols, languages, drivers, etc. to make them all work together.

you only have to look at Intel's efforts to standardise communication buses, starting with USB and evolving through USB3 and later Light Peak, to understand that consolidating features and amplifying performance is the way to decrease the complexity and increase the adoption of that pattern.

today we have an efficiency that's similar to the internal combustion engine: a lot of wasted power, heat, noise, vibration, etc. considering the amount of fuel we give the hardware.

having more efficient use of the hardware will mean we won't need this current pattern of "over-provisioning", using dedicated silicon to make sure we have the resources when we need them by having it sit idle most of the time.

heterogeneous computing evolving into homogeneous computing is the future; it's just taken this long because we wasted so much energy, material and time making dedicated silicon before we realized there was a better way to use our resources.

right now most people's hardware is idle most of the time... initiatives like OpenCL and WebCL are beginning to change that.

I shudder at the thought. Brain-dead web developers are already able to eat up gobs of memory and CPU cycles with the tools they already have (in order to try to do something as straightforward as showing an advertisement or fetching some data via AJAX). Why would I give them the capability to communicate directly with my hardware?

Pretty much this. Web development is generally approached in a way where no one does the due diligence when it comes to engineering. If it looks right, ship it! These sorts of APIs aren't exactly "web developer" friendly.

Oh, c'mon. They're making ARM chips that support OpenCL. It's only a matter of time before OpenCL-capable chips make it into Chromebooks.

If you're Google, and you want to replace all native apps with a web browser, you want that web browser to tap as much hardware power as it can. OpenCL helps you do that. Google would be fools if they didn't pressure their hardware partners into adopting chips that support OpenCL.

Oh, c'mon. They're making ARM chips that support OpenCL. It's only a matter of time before OpenCL-capable chips make it into Chromebooks.

If you're Google, and you want to replace all native apps with a web browser, you want that web browser to tap as much hardware power as it can. OpenCL helps you do that. Google would be fools if they didn't pressure their hardware partners into adopting chips that support OpenCL.

agreed.

glad someone realises the potential of languages that can access more hardware, so developers can just "get on with it" instead of worrying about which silicon they need to address.

Oh, c'mon. They're making ARM chips that support OpenCL. It's only a matter of time before OpenCL-capable chips make it into Chromebooks.

If you're Google, and you want to replace all native apps with a web browser, you want that web browser to tap as much hardware power as it can. OpenCL helps you do that. Google would be fools if they didn't pressure their hardware partners into adopting chips that support OpenCL.

agreed.

glad someone realises the potential of languages that can access more hardware, so developers can just "get on with it" instead of worrying about which silicon they need to address.

except that completely ignores how dev-friendly it is or isn't to "just get on with it"...

Oh, c'mon. They're making ARM chips that support OpenCL. It's only a matter of time before OpenCL-capable chips make it into Chromebooks.

If you're Google, and you want to replace all native apps with a web browser, you want that web browser to tap as much hardware power as it can. OpenCL helps you do that. Google would be fools if they didn't pressure their hardware partners into adopting chips that support OpenCL.

agreed.

glad someone realises the potential of languages that can access more hardware, so developers can just "get on with it" instead of worrying about which silicon they need to address.

except that completely ignores how dev-friendly it is or isn't to "just get on with it"...

sure, compare an embryonic language that's less than 3 years old with languages that have had... how many years/decades of development?

you're absolutely right, we should never accept anything that doesn't immediately show an improvement on today's technology; everything should be started from scratch and perfected immediately with no development time whatsoever.

please show me this magic that allows ideas to be perfectly realised on the first attempt, so that we can have jet-packs and time-machines right now. why bother wasting our time with evolution and testing when we can just skip all that and short-cut to the point where we go voila! and have instant perfection.

i guess we should stop bothering to improve our current cars and other vehicles, because although what we use today has had over a century of development to get to this point, these alternative-fuel vehicles are clearly not better than our fossil-fuel infrastructure, so we should obviously shut them down.

WebCL has many uses. Any time a native app can tap into the GPU, Web apps will be able to as well.

Imaging webapps are a prime example, and there are lots of them ALREADY. Nokia has a nice showcase based on just those.

But more generally, WebCL is __just__ allowing webapps to use GPU horsepower. And GPUs are better than CPUs at tasks that can be parallelized (no amount of SSE3 can beat that).

Also, WebCL is __just__ computing. No access to your personal data or filesystem (insert here all those Flash and ActiveX features that made them "dangerous"). So all you risk is denial of service when a webapp consumes the whole GPU.

But solving that IS THE JOB of the OS and GPU manufacturers. ATI did a nice job of anti-DoS hardening in their drivers even for XP. Microsoft, on the other hand, waited until Vista to propose (and enforce) anti-DoS mechanisms.

It is not the job of WebGL, WebCL, Molehill (Flash), or Silverlight 5's XNA 3D to protect against DoS attacks. That is the OS's and the GPU drivers' work. The big holes that allow DoS have been there since the beginning of GPU computing, and they won't disappear until OS/GPU devs fix them.

And that is IT. No more security risks. And "but it will run directly on my hardware" is just ignorant of what actually gets EXECUTED on your GPU.

WebCL has many uses. Any time a native app can tap into the GPU, Web apps will be able to as well.

Imaging webapps are a prime example, and there are lots of them ALREADY. Nokia has a nice showcase based on just those.

But more generally, WebCL is __just__ allowing webapps to use GPU horsepower. And GPUs are better than CPUs at tasks that can be parallelized (no amount of SSE3 can beat that).

Also, WebCL is __just__ computing. No access to your personal data or filesystem (insert here all those Flash and ActiveX features that made them "dangerous"). So all you risk is denial of service when a webapp consumes the whole GPU.

But solving that IS THE JOB of the OS and GPU manufacturers. ATI did a nice job of anti-DoS hardening in their drivers even for XP. Microsoft, on the other hand, waited until Vista to propose (and enforce) anti-DoS mechanisms.

It is not the job of WebGL, WebCL, Molehill (Flash), or Silverlight 5's XNA 3D to protect against DoS attacks. That is the OS's and the GPU drivers' work. The big holes that allow DoS have been there since the beginning of GPU computing, and they won't disappear until OS/GPU devs fix them.

And that is IT. No more security risks. And "but it will run directly on my hardware" is just ignorant of what actually gets EXECUTED on your GPU.

Alright, how do you propose to protect against DoS? A good portion of a regular CPU's operations handle things such as context switches, hardware exceptions, timing, access restriction, scheduling, and handling the VM. These operations make multi-application environments stable and allow the operating system to prevent a single process from using all the resources of the computer. Most of these operations require at least kernel-level privileges (higher than root) to execute.

GPUs do not have any of these operations. A GPU only knows of its memory and its command buffer. I am not even aware of a GPU that natively supports context switches; they are usually emulated client-side on the host's CPU. So if a WebCL kernel or WebGL shader equates to something like while(1){}, the GPU will happily continue to execute that command, ignoring the other commands in the command buffer, even if the next command is something important like rendering the GUI. The driver can't signal the GPU to stop executing one command buffer and start executing another. GPUs don't support hardware exceptions; they can't execute X commands for 100 cycles, then execute Y commands for 10, then go back to X. GPUs don't support scheduling.

The only way a driver can prevent a shader from entering an endless loop is to insert a hidden branch inside every loop detected during shader compilation that checks a memory location and breaks if a certain value is set there (by the driver). But this has one problem: GPUs don't have out-of-order execution or branch prediction, so each branch requires a pipeline stall on the GPU. So if implemented, it would significantly decrease performance.

So each branch requires a pipeline stall on the GPU. So if implemented, it would significantly decrease performance.

if (uniform_counter > 100000) ContextSwitchOrExit(); when patched in by a shader compiler does not compile to (using the x86 ISA for ease):

cmp ...
jle NotGreater
; lots of code here, or a call / jump-with-link-register
NotGreater:

instead, it's a branch-not-taken, which is like a nop:

jg _UhOh

And besides, pipeline stalls only happen the first time the 2x2...32 simultaneous threads (that share instruction decoding) need to be split. A counter like this is shared, or has the same value across all those threads, so it won't introduce splitting. And besides, what stalls? GPUs do thread-switching on EVERY CYCLE, and can delay switching back to a specific thread-group for anywhere from 4 to a million cycles while continuing to execute other groups. There are GPFs on GPUs, too (y'know, GPUs have their own MMU, too...).

Last-gen GPUs detect if a thread has been executing for too long and soft-reset the GPU transparently. Weren't the latest WDDM requirements that the GPU be able to do fine-grained context switching? At least Fermi supported that, IIRC. Each context has its own dozen command/data buffers, too.

Drivers simply need to protect their firmware from shaders and from buggy server-side memsets (I had a GPU die of this...), and implement limits on loops and such. And be damn well tested.
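The "hidden branch" approach debated above can be sketched in JavaScript (all names here are hypothetical; a real driver would do this inside the shader compiler): every loop gets an extra check against a driver-controlled limit, turning a runaway while(1){} into a bounded loop.

```javascript
// Sketch of the watchdog transform the comments describe: a compiler
// rewriting every loop so it also checks a driver-controlled abort
// condition. Here the "condition" is a simple iteration budget; in a real
// driver it would be a memory location the driver can set asynchronously.
function runGuardedLoop(condition, body, budget) {
  let iterations = 0;
  while (condition()) {
    if (++iterations > budget) {      // hidden branch inserted per loop
      return { completed: false, iterations: iterations - 1 };
    }
    body();
  }
  return { completed: true, iterations };
}

// An accidental while(1){} now terminates instead of hanging the GPU:
const hung = runGuardedLoop(() => true, () => {}, 1000);
console.log(hung); // { completed: false, iterations: 1000 }

// A well-behaved loop runs to completion under the same guard:
let i = 0;
const ok = runGuardedLoop(() => i < 5, () => { i++; }, 1000);
console.log(ok); // { completed: true, iterations: 5 }
```

Whether the extra branch costs anything in practice is exactly the point of contention in the exchange above; the sketch only shows the shape of the transform, not its hardware cost.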

New technologies are always great, that I totally agree on. However, like some others, I wonder who would use JavaScript for that sort of high-performance application instead of good ol' C. Yes, JavaScript is faster than ever, but dynamic garbage-collected languages tend to have a notable performance hit. Any time the high-performance application has to run a stretch of general-purpose code, it will have to go through that slow language. Web applications are totally awesome for a lot of use cases, that I agree on. But I still think some things are better done as a traditional application.

Also, OpenCL isn't exactly the easiest language to program in. This is far from the lightweight event-handling code we find today. JavaScript today reminds me a bit of Visual Basic: add a click handler on this or that, react to what happens on a form, do some treatment on the data, then send it to the database/web server. And both tend to attract the same kind of developer, who knows more about the business problem they're trying to solve than about technology. Those aren't, in my mind, the prime candidates for a complex GPU programming language. I am sure someone somewhere will find an awesome application for this, but I seriously doubt it will be used much outside of a minority of programmer prodigies. Most web applications do not require solving the n-body problem.

If they can make it secure and someone somewhere uses it, then good! But I can hardly label this technology a game changer. I believe its complexity will keep most devs away for a long time. And they had better not introduce security flaws with it; I prefer a slow but secure browser to the opposite.