
This is very nice. My first thought was: generic 64-bit binaries will not benefit from the AVX2 extension, will they? But then I realized this is actually an implementation of an algorithm. Did anyone look at the patch? It's over my head, but most of it is assembler. How does this work? Is the platform detected at runtime, with the optimizations picked up on Haswell / Broadwell processors?

Nothing NSS does cannot be massively improved by using OpenCL. Presto! You don't have to buy this new CPU!

Easy to say, but where's the code for that? Intel's contribution might be self-serving, but it's still a contribution - real code that can be used today. And that's more useful than a wishlist that says OpenCL would be better...

How does this work? The platform is detected at runtime, and the optimizations get picked up on the Haswell / Broadwell processors?

Looks like it, yeah. I can't read the assembler, but the C parts of the patch (assuming they haven't been disabled at compile time) seem to check at runtime whether the current CPU supports the AVX2 extensions.
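For anyone wondering what that kind of runtime check looks like, here is a minimal sketch in C. It is not taken from the NSS patch: the function names (`sum_generic`, `sum_avx2_stub`, `select_sum`) are made up for illustration, the "AVX2" routine is only a stand-in that runs the same portable loop, and it assumes GCC or Clang on x86, whose `__builtin_cpu_supports` builtin queries the CPU's feature flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature shared by every implementation of the routine. */
typedef uint32_t (*sum_fn)(const uint8_t *buf, size_t len);

/* Portable fallback that runs on any CPU. */
static uint32_t sum_generic(const uint8_t *buf, size_t len)
{
    uint32_t total = 0;
    for (size_t i = 0; i < len; i++)
        total += buf[i];
    return total;
}

/* Stand-in for a hand-written AVX2 routine. A real one would live in
 * assembler or intrinsics; this stub reuses the portable loop so the
 * sketch stays runnable everywhere. */
static uint32_t sum_avx2_stub(const uint8_t *buf, size_t len)
{
    return sum_generic(buf, len);
}

/* Returns nonzero if the running CPU reports AVX2 support.
 * Assumes GCC or Clang on x86; other targets report "no". */
static int cpu_has_avx2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return 0;
#endif
}

/* Pick the implementation once, at runtime, based on the CPU. */
static sum_fn select_sum(void)
{
    return cpu_has_avx2() ? sum_avx2_stub : sum_generic;
}
```

A caller would do something like `sum_fn f = select_sum();` once at startup and then call `f(buf, len)` everywhere. That is how a single generic 64-bit binary can still use the faster path on CPUs that have it.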

Nothing NSS does cannot be massively improved by using OpenCL. Presto! You don't have to buy this new CPU!

I'm not sure that's true. If you have a discrete card, you have to ship the data all the way from the CPU over to the GPU and back. That's widely known to be extremely slow, and it's why you don't invoke OpenCL kernels unless you've got a decent amount of data to crunch through at once, enough to amortize the transfer latency.
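To make the trade-off concrete, here is a rough back-of-the-envelope model in C. Everything in it is an assumption for illustration: the struct, the function names, and the idea that each cost scales linearly with the buffer size. It says nothing about real NSS or GPU numbers; it only shows why a fixed latency plus a bus transfer makes small jobs a loss.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical throughput and latency figures; fill in measured values. */
struct offload_model {
    double cpu_bytes_per_s;  /* CPU processing rate */
    double gpu_bytes_per_s;  /* GPU processing rate */
    double bus_bytes_per_s;  /* host <-> device transfer rate */
    double fixed_latency_s;  /* per-call launch and setup cost */
};

/* Time to process n bytes on the CPU, with no transfer needed. */
static double cpu_time_s(const struct offload_model *m, size_t n)
{
    return (double)n / m->cpu_bytes_per_s;
}

/* Time to process n bytes on the GPU: fixed latency, plus moving the
 * data across the bus, plus the actual compute. */
static double gpu_time_s(const struct offload_model *m, size_t n)
{
    return m->fixed_latency_s
         + (double)n / m->bus_bytes_per_s
         + (double)n / m->gpu_bytes_per_s;
}

/* True only when the GPU path is faster end to end. */
static bool offload_pays_off(const struct offload_model *m, size_t n)
{
    return gpu_time_s(m, n) < cpu_time_s(m, n);
}
```

With made-up rates where the GPU is ten times faster than the CPU, a 1 KB buffer still loses to the CPU because the fixed latency dominates, while a buffer of around 100 MB wins easily.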

I have no idea if the work Firefox is doing would qualify for that or not, but I suspect in many cases it probably wouldn't.

Now, when AMD releases their HSA CPUs, things might be different. There the GPU sits on the same chip as the CPU and can address the same memory, which should make OpenCL a lot more viable for these types of applications. It also might mean that the AVX2 code is just running on the same hardware that the OpenCL code is, though, making an extra OpenCL version pointless.

Although I agree it is a good thing that Intel is participating in this project, I'm also a little skeptical. If you have a modern Haswell CPU, is NSS processing really going to be a noticeable bottleneck for your average browsing session? I don't think so. Perhaps we are indeed looking at a technology demo with little impact.


I doubt it, except under special circumstances. NSS is fast enough to run smoothly on low-power ARM chips (as part of Firefox mobile, for example), and a Haswell Atom will surely be at least on par with the performance of the fastest ARM chip.