LibreOffice aims to speed spreadsheets with AMD GPU optimization

GPU horsepower will be harnessed in rewrite of LibreOffice code.

The makers of LibreOffice are teaming up with AMD so that the open source office suite can take greater advantage of graphics processing units (GPUs). The partnership is geared toward optimization for AMD's upcoming Heterogeneous System Architecture (HSA), but LibreOffice developers say their work will make spreadsheets go faster for users of just about any type of computer.

The news, announced today, was spurred by AMD joining the Document Foundation board—the group behind LibreOffice—and providing financial assistance. "We traditionally had a big performance problem in Calc [the LibreOffice spreadsheet application] for large data sets," Michael Meeks, LibreOffice developer and distinguished engineer for Attachmate's SUSE business unit, told Ars. "My hope is we not only eliminate that problem but that we do significantly better."

LibreOffice has not generally been able to take advantage of the horsepower in GPUs, Meeks said; AMD's HSA helps address this problem. "HSA is an innovative computing architecture that enables CPU, GPU, and other processors to work together in harmony on a single piece of silicon by seamlessly moving the right tasks to the best suited processing element," the Document Foundation said in an announcement. "This makes it possible for larger, more complex applications to take advantage of the power that has traditionally been reserved for more focused tasks."

LibreOffice's development team is refactoring the core of Calc to take advantage of HSA in AMD's GPUs and APUs (accelerated processing units). LibreOffice's spreadsheet calculations have generally been "done in a very unfortunate way, with huge amounts of redundant and repetitive work done right inside the most time critical piece," Meeks said. In the new setup, LibreOffice will change the order of calculations to make them more efficient and will convert tasks into OpenCL so they can be run on GPUs.
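To make the "redundant and repetitive work" complaint concrete, here is a minimal sketch, not taken from Calc's source. It assumes a toy formula syntax of the form `A * B` and hypothetical function names (`parse`, `recalc_per_cell`, `recalc_batched`). The first version re-parses the same formula for every row inside the hot loop; the second parses once and then applies the operation across the whole column, which is also the uniform, data-parallel shape that could in principle be handed to an OpenCL kernel.

```python
import operator

# Toy operator table for formulas of the form "<col> <op> <col>" (an assumed syntax).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def parse(formula):
    """Parse a tiny 'A op B' formula into (function, left column, right column)."""
    left, op, right = formula.split()
    return OPS[op], left, right


def recalc_per_cell(formula, columns, n_rows):
    """Naive recalculation: the same formula is re-parsed for every single row."""
    out = []
    for i in range(n_rows):
        fn, left, right = parse(formula)  # redundant work inside the hot loop
        out.append(fn(columns[left][i], columns[right][i]))
    return out


def recalc_batched(formula, columns, n_rows):
    """Parse once, then apply the same operation across the whole column."""
    fn, left, right = parse(formula)  # hoisted out of the loop
    return list(map(fn, columns[left][:n_rows], columns[right][:n_rows]))


if __name__ == "__main__":
    data = {"A": [1, 2, 3], "B": [10, 20, 30]}
    print(recalc_per_cell("A * B", data, 3))
    print(recalc_batched("A * B", data, 3))
```

Both functions return the same values; the difference is only in how much bookkeeping happens per cell.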

When asked if systems with Intel or Nvidia graphics would benefit, Meeks said that "if a vendor has a good OpenCL implementation we'll use that, and that should be fine." Because Calc's code will ultimately be more efficient, even people who own computers without GPUs—"if you can find such a person," Meeks said—should see performance improvements.

The first of these code changes will show up in LibreOffice 4.1 in about a month, but it will take about six months to get substantial benefits to users. Although Meeks is expecting "very significant performance improvements for spreadsheet users," he said it's too early to provide an estimate of just how much speedier spreadsheets will become.

A faster LibreOffice, he joked, will help users "find a business reason to buy the very fastest 3D graphics card you can for your computer. I think that's an important value add to many business users of technology. When your boss comes and says, 'Why have you got this graphics card that's the size of a fridge?' you can say, 'My spreadsheet has got to get faster.'"

Ouch, I'm always complaining about our 2,500 code module Access "Application" (I know the specs say you can't have more than 1,000 but you can apparently). I suppose it's all about the same thing though, rightsizing. 10MB of data? Access is a good idea. 1GB of data... then you need to look for something bigger.

The partnership is geared toward optimization for AMD's upcoming Heterogeneous System Architecture (HSA), but LibreOffice developers say their work will make spreadsheets go faster for users of just about any type of computer.

On the box for mine they call it "APP acceleration". So far I haven't found anything beyond Folding @home that uses it.

A short list: "Photoshop CS6, GIMP, Media Converter/Media Espresso and WinZip 16.5." Handbrake is supposedly accelerated these days as well... and I'm sure there's plenty of other APP Accelerated software I don't know about.

More on topic though, this is very good news! In my opinion, "calc" has been the weak link in LibreOffice. Once they get calc performing well, it might be even more viable for real world businesses! To the point where they can't ignore it as an option. This is good.

If you need GPU processing power for your spreadsheets, you are using the wrong tools for your work.

Rather, they're taking advantage of a tool many products ignore. The GPU does very little work in an office app. Why not put it to use crunching numbers, and take some of the load off the CPU?

This. Just because it's specifically designed to do calculations related to graphics doesn't mean it can't do other calculations as well. It's like shipping with two half empty trucks to the same destination because one is the "banana" truck, and the other the "apple" truck, and never the twain shall meet.

LibreOffice's spreadsheet calculations have generally been "done in a very unfortunate way, with huge amounts of redundant and repetitive work done right inside the most time critical piece," Meeks said.

That doesn't sound like moving the calculations to the GPU is the real fix. Not having crappy code is the fix.

Of all the applications that need GPU-acceleration, I'd have thought office software would be the last.

Clearly you have never worked in an office.

If your spreadsheet isn't 10MB with an extra 5MB in embedded code, you are doing it wrong.

Not sure what tone of voice you said that with but, yeah, IT folk love to sneer at business folk. The trouble is that IT departments withhold proper tools from users and then start mocking when business users make the best of the lousy tools they've been given.

If you're not allowed to use a database or a programming language, Excel or Calc might actually be the very best available tool for the job.

Haha, no. Read other people's posts. Some people see a screw and think "I know how to use hammer, I bet a hammer can sort this out."

It's all nice and all, but as a LibreOffice user on a daily basis, I think they have more important issues (a.k.a., bugs) to look into than using GPUs to improve performance.

There's always something "more important", but putting every developer on the "most important" issue doesn't get it solved any faster than putting one or two on it and giving the rest other tasks.

Further, this is a problem that purportedly renders their spreadsheet software unusable on any but the simplest data sets. That seems like a pretty big problem to me.

I never said they need to put all their resources on fixing the most important issue, but that there are other issues that are more urgent than GPU acceleration. Actually, the problem is worse than this, because if the spreadsheets are unusable on any but simple data sets, they should be looking into the way it is coded and review/refactor it. After all, their competition did not use GPU acceleration (maybe they do now, on the latest versions) on older versions and you could definitely work with complex data sets.

Not sure what tone of voice you said that with but, yeah, IT folk love to sneer at business folk. The trouble is that IT departments withhold proper tools from users and then start mocking when business users make the best of the lousy tools they've been given.

If you're not allowed to use a database or a programming language, Excel or Calc might actually be the very best available tool for the job.

Yep, we sure do love generating extra work for ourselves (because we're so very bored, you see) by preventing our customers from being able to do their jobs well because we get off on a self-generated sense of superiority.

On the flipside, business folk LOVE to claim they need every thousands-of-dollars-per-license software under the sun to do their job, because they want to find a way to turn work into playtime, or find that one app that will do their job for them.

See how stereotyping and projecting motive can be hurtful? In reality, most of the time when someone says they "need" something for their job, they can't provide any specific justification that would make the purchase of said software/equipment a net gain for the company. We don't buy all the cool stuff and hoard it. Can't speak for everyone's IT, but most of the time, we'd love for people to use the right tools for their job because it means we have to do less of the mind-numbing, hair-tearing, "why in the name of all that is holy are you trying to do THAT!?" sort of support work.

I never said they need to put all their resources on fixing the most important issue, but that there are other issues that are more urgent than GPU acceleration. Actually, the problem is worse than this, because if the spreadsheets are unusable on any but simple data sets, they should be looking into the way it is coded and review/refactor it. After all, their competition did not use GPU acceleration (maybe they do now, on the latest versions) on older versions and you could definitely work with complex data sets.

If you need to rework your code anyway, why not design it to take advantage of a piece of hardware that is incredibly good at number-crunching? Rather than prioritizing one thing at the expense of the other, why not combine two efforts into one solution?

The biggest problem at the moment is that code isn't even multi-threaded. I regularly deal with a 200+k row csv and do exploration using pivot tables. (The csv is regenerated each day.) LibreOffice loads the file using only a single cpu which is pegged at 100% during the minute or so of loading. Changing anything in the pivot table (which you do a lot when exploring data) results in long delays, again pegging a single cpu.
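The single-threaded loading described above can be contrasted with a chunked approach. The following is only a sketch of the general technique, not how LibreOffice imports files: `parse_chunk` and `parse_parallel` are hypothetical names, and the split assumes no quoted field contains an embedded newline. In CPython, threads will not speed up CPU-bound parsing because of the global interpreter lock, so the point here is the structure (split, parse independently, recombine in order) rather than a measured gain.

```python
import csv
from concurrent.futures import ThreadPoolExecutor


def parse_chunk(lines):
    """Parse a list of CSV lines into rows of string fields."""
    return list(csv.reader(lines))


def parse_parallel(lines, workers=4, chunk_size=50_000):
    """Split the input into chunks, parse each chunk in a worker, keep row order.

    Assumes every record fits on one line (no newlines inside quoted fields).
    """
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parsed = pool.map(parse_chunk, chunks)  # map() yields results in chunk order
        return [row for chunk in parsed for row in chunk]


if __name__ == "__main__":
    sample = ["id,name,amount", "1,alpha,10.5", "2,beta,20.0", "3,gamma,7.25"]
    print(parse_parallel(sample, workers=2, chunk_size=2))
```

A native implementation would use real threads or processes per chunk; the ordering guarantee is what keeps the rows lined up with the original file.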

That doesn't sound like moving the calculations to the GPU is the real fix. Not having crappy code is the fix.

I second this. LibreOffice's Calc is far far slower than, say, Excel, when looking at the same spreadsheet. Using more hardware inefficiently is not, in my mind, the right way to go.

I find it amazing that a simple spreadsheet program runs no faster on today's incredible computers than other, similar software did on hardware from over a decade ago. Then again, I'm trying to use LibreOffice/NeoOffice on a Mac, so I guess I get what I deserve.

When asked if systems with Intel or Nvidia graphics would benefit, Meeks said that "if a vendor has a good OpenCL implementation we'll use that, and that should be fine." Because Calc's code will ultimately be more efficient, even people who own computers without GPUs—"if you can find such a person," Meeks said—should see performance improvements.

From memory both Intel and Nvidia have OpenCL implementations, at least for hardware of the last couple of years. I guess there could be issues with the levels of the implementations; did the developers say what OpenCL level they expect? Nvidia was trying to promote CUDA in preference, but apart from Adobe I can't offhand think of many that took advantage of it.

"Even for people who own computers without GPUs" ... how does the spreadsheet render on screen without a GPU? Doesn't Windows pretty much require one now?

It's all nice and all, but as a LibreOffice user on a daily basis, I think they have more important issues (a.k.a., bugs) to look into than using GPUs to improve performance.

There's always something "more important", but putting every developer on the "most important" issue doesn't get it solved any faster than putting one or two on it and giving the rest other tasks.

Further, this is a problem that purportedly renders their spreadsheet software unusable on any but the simplest data sets. That seems like a pretty big problem to me.

I never said they need to put all their resources on fixing the most important issue, but that there are other issues that are more urgent than GPU acceleration. Actually, the problem is worse than this, because if the spreadsheets are unusable on any but simple data sets, they should be looking into the way it is coded and review/refactor it. After all, their competition did not use GPU acceleration (maybe they do now, on the latest versions) on older versions and you could definitely work with complex data sets.

...They are changing the way it's coded. They're writing more efficient algorithms and also adding OpenCL support.

It's all nice and all, but as a LibreOffice user on a daily basis, I think they have more important issues (a.k.a., bugs) to look into than using GPUs to improve performance.

I totally agree. This is cool, but a good programmer picks the low-hanging fruit first. You don't put a nitrous injector in your car if you still have stock high-mileage tires and bad spark plugs. You fix those first.

The point is that the reason those spreadsheets have been slow is that the core calc engine hasn't been properly optimized to use the CPU fully. That's what you fix first, before you get all nitrous-y and go GPGPU. Primarily because uh, dur, lots of people, especially office workers, have machines that won't support this.

Did you, perchance, read the article? Or any of the comments beyond the one you quoted?

Perchance, I did, quite thoroughly. Did you have some substantive criticism, or are you just sniping?

As a programmer, I believe my point stands. Very few spreadsheets, even enormous ones, actually are doing enough core "computation" to require a GPU. The spreadsheet isn't spending all that time "adding" and "multiplying". It is so inefficient, most of the CPU time is wasted on the infrastructure code that manages recalculation, not on the calculating itself. The right and most general way to optimize is first optimize away that infrastructure code, which probably includes doing some kind of simple Just-In-Time compiler to generate what is effectively an optimized machine code routine that does the recalculation.

Only after that optimization infrastructure is in place is it useful to do what they are doing. That alone would probably bring enough performance improvement that they would not need to then do GPGPU, and if they haven't done it first, then the speed improvements will be lost in overhead.

So, the only way they can do this properly and use the GPU is to do BOTH of those tasks at once. But that is just stupid, like adding both a turbo and nitrous to your car at the same time before testing it with the turbo alone to see if that was all you needed. If you have time and resources to waste, then go for it. But these developers do not, and this sounds like a wasteful development approach. You do optimizations ONE AT A TIME, and you test after each one before wasting time on further optimizations, and you STOP when you have reached your performance goal. Why do you stop? Because you have other things to do.
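The "simple Just-In-Time compiler" idea raised in this comment can be illustrated in miniature. This is a sketch under stated assumptions, not a description of any spreadsheet engine: `compile_formula` is a hypothetical helper, formulas are assumed to be plain Python arithmetic expressions, and Python's built-in `compile()` and `eval()` stand in for real machine-code generation. The formula text is turned into a callable once, so the recalculation loop only calls the function and never touches parsing or lookup code.

```python
def compile_formula(expr, arg_names):
    """Turn a formula string into a Python function once, before the recalc loop.

    Illustration only: eval() must never be used on untrusted formula text.
    """
    source = f"lambda {', '.join(arg_names)}: {expr}"
    code = compile(source, "<formula>", "eval")
    # Builtins are emptied because plain arithmetic does not need them.
    return eval(code, {"__builtins__": {}})


def recalc_column(fn, *columns):
    """Apply an already-compiled formula to each row of the input columns."""
    return [fn(*row) for row in zip(*columns)]


if __name__ == "__main__":
    row_total = compile_formula("price * qty * (1 + tax)", ["price", "qty", "tax"])
    prices = [10.0, 4.0]
    quantities = [2, 5]
    taxes = [0.5, 0.25]
    print(recalc_column(row_total, prices, quantities, taxes))
```

Timing this against a version that re-interprets the formula string on every row is one way to check, as the comment suggests, how much of the cost sits in the infrastructure rather than in the arithmetic.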