Posted by timothy on Monday February 23, 2004 @08:47AM
from the me-too dept.

Jody Goldberg writes "A recent study of analytic quality and responsiveness to problems strongly preferred Gnumeric over MS Excel. With new problems popping up in Office XP, the case for spreadsheet users to migrate is only getting stronger.
In some related Gnumeric quickies, a new stable version 1.2.6 was released, and Open has done an interview with the Maintainer."

I love that one of the "failings" of Gnumeric was that the random number generator function RND was *too* random - Gnumeric uses the /dev/urandom device, which generates random numbers from noise sources in the system (noise diodes, interrupt events, user input, etc.), rather than a pseudo-random number generator with a predictable sequence.

True, there are times it is nice to have a "random" number generator that you can re-run for testing, but having a really random number generator is better for a host of problems.
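
To make the distinction concrete, here is a minimal Python sketch (not Gnumeric's or Excel's actual code). random.SystemRandom draws from the operating system's entropy source (/dev/urandom on Linux) and cannot be replayed, while random.Random(seed) is a pseudo-random generator that repeats the same sequence for the same seed. The seed value 42 and the helper name draw are arbitrary choices for illustration.

```python
import random


def draw(rng, n=5):
    """Return n floats in [0, 1) from the given generator."""
    return [rng.random() for _ in range(n)]


# Seeded PRNG: the same seed always replays the same sequence.
run_a = draw(random.Random(42))
run_b = draw(random.Random(42))
print(run_a == run_b)  # True

# OS entropy source: there is no seed, so a run cannot be reproduced.
os_rng = random.SystemRandom()
print(draw(os_rng))
```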

I use Excel at work all day and I have to say that no open source solution comes close to providing what I expect a spreadsheet to do.

The idea that one should switch from Excel to an open source solution because a small set of statistics problems cannot be properly solved by Excel seems a bit like throwing the baby out with the bath water (unless you do nothing but statistical modelling all day).

It looks like Gnumeric improved or stayed the same on every data point except Pidigits, Numacc2, and Origin1 (whatever those are). Note that the LRE is the negative of the log of a value less than one, so a larger LRE means a smaller relative error. It's just the number of digits that agree with the correct answer. Really bad values would even have a negative LRE.
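
For anyone who wants to check a figure themselves, here is a rough Python sketch of the LRE as described above. The function name lre is my own, and the two special cases (exact agreement, and a correct value of zero) are assumptions about how one might handle them, not something taken from the study.

```python
import math


def lre(computed, correct):
    """Log relative error: roughly the number of digits that agree."""
    if computed == correct:
        return math.inf  # assumption: treat exact agreement as unlimited digits
    if correct == 0:
        # Assumption: fall back to the absolute error when the
        # correct value is zero, since the relative error is undefined.
        return -math.log10(abs(computed))
    return -math.log10(abs(computed - correct) / abs(correct))


good = lre(3.14159, math.pi)  # six digits agree, so the LRE is a bit above 6
bad = lre(42.0, math.pi)      # relative error above 1, so the LRE is negative
print(good, bad)
```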

I've e-mailed a well-informed and helpful Microsoft developer, whom I first encountered on this very forum, on several occasions. I'm told a number of bug reports have been filed against the application in question as a result of my e-mails, and some of the things I've mentioned to him have certainly been fixed in a later version of the product.

Some people at Microsoft do listen; you just have to make a bit of an effort to find them. Curiously, a comment from the developer in question was that the dev teams love direct contact with customers prepared to give them helpful information about bugs or feature requests; they just wish the PR people would stop getting in the way. :-)

Advocates of new software, particularly OSS, often seem to forget that market share counts for a huge amount. Some studies we looked at back when I was in academia suggested that you need the "10x factor" to force a switch from an established product: your alternative must provide 10x the perceived benefits, or be 1/10 the price. That's a very big barrier to entry, and having a product that's only just become a challenger on technical merit and reliability is nowhere near it. (It's a good start, though!)

I've been using spreadsheets for over 20 years, since
Lotus-1-2-3 ver1A on a 128 KB (sic) 8088. I think MS-Excel
is unsuitable for any serious use. Aside from ease-of-use
issues (regression and other stats not easily accessible)
there seem to be serious defects in the core calculation
engine.

I've seen spreadsheets where MS-Excel would
miscalculate results by 20%. MS-Excel also has enormous
problems handling circular spreadsheets. Both are
probably related to defects in the order-of-calculation
algorithm.

That sounds great, but I live in a world of a locked down PC controlled by my IT division. I'm a power user in Excel, using pivot tables, mild VBA, mostly for automation between linked files, and in general using the 80% of the features most people don't use at all.

True - VBA shouldn't be used for extremely complex items, but for my use, and other power uses - it's tremendous in its automation abilities.

If I had to use MS-Excel to manipulate serious figures, for instance huge budgets, I wonder how well I would sleep. And if I had people under my responsibility who manipulate serious numbers, I would ask them to prefer accuracy to spectacular pie-charts. Am I that weird?

By the way, if your business gets into trouble because of MS-Excel bugs which have been well known for years, can you sue MS? Of course, the EULA tells you you can't, but in the real world?

I didn't read all of the linked article -- so whatever...however, I will say this: Anything that makes Microsoft Office look bad and (insert cheaper solution here) look better, I like.

For a $1000 computer, I pay ~$400 per license for MS Office Professional -- that's 40% of the cost of the computer. If I could convince management and our user base, I'd change to anything else because anything else would be cheaper (Star Office, Lotus Smart Suite, OpenOffice, whatever). I checked out OpenOffice with one of our accounting guys, and it worked just fine with all of his macros. Peace of mind against FUD just isn't worth that much. MS Office is a fine product, just not worth the price. If there were anything with a remotely competitive amount of market share, I'm sure that MS would drop their prices to stay competitive.

Okay, "frequently" may be overkill. :-) I was trying to drive home the point that it's not unreasonable.

Say you want to do up a spreadsheet containing some tables that contain data based on random data. You *could* either include fifteen megs or whatever of random numbers, or you could just include a random number function with a seed if your spreadsheet uses a standard RNG. As long as the variability with different seeds isn't significant for your work, you shouldn't have a problem. However, you *may* have insignificant digits shifting around in your table, which doesn't look all that professional (a bunch of different people with slightly different tables).

The other major use for pseudo-random numbers with a known seed that I can think of off-the-cuff is in finding problems between two environments. Say Bob has a copy of Excel 98 and you have a copy of Excel 2003, and for some reason Bob is getting different numbers than you are. Possibly you're relying on something that you shouldn't (e.g. Bob has an old version of some other file that this one uses), or perhaps Microsoft broke something between versions. If you can fix the seeds, you can find where the divergence is creeping in. If you can't fix the seed, then you have no way of finding and fixing what's wrong.
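
Here's roughly what I mean, as a toy Python sketch rather than anything to do with Excel's internals. Everything in it is hypothetical: simulate stands in for a spreadsheet recalculation, drift_after fakes a version difference that kicks in at a given step, and the seed 2004 is arbitrary. With the seed fixed, the two runs agree until the fault appears, so you can point at the exact step.

```python
import random


def simulate(seed, steps, drift_after=None):
    """Hypothetical recalculation: a running total of seeded random draws."""
    rng = random.Random(seed)
    total = 0.0
    trace = []
    for step in range(steps):
        total += rng.random()
        if drift_after is not None and step >= drift_after:
            total += 0.001  # stand-in for a bug in the other environment
        trace.append(total)
    return trace


def first_divergence(trace_a, trace_b, tol=1e-12):
    """Return the index of the first step where the traces differ, else None."""
    for i, (a, b) in enumerate(zip(trace_a, trace_b)):
        if abs(a - b) > tol:
            return i
    return None


mine = simulate(seed=2004, steps=10)
bobs = simulate(seed=2004, steps=10, drift_after=3)
print(first_divergence(mine, bobs))  # 3
```

With an unseeded generator the two traces would differ from step 0 onward, and the comparison would tell you nothing.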

'Wah, Excel doesn't have some bloody obscure stats function' - Go get MATLAB or similar then. What, short of dosh? Funny that.

'Wah, Excel doesn't have a real language' - It's got VBA, aka VB6. It's enough for most office jobs. You're not ray-tracing or writing an OS, you're importing a text file or updating a Word template. Hell, I wrote a 3270 screenscraper in VBA. Fuck off and write your app in TASM/Smalltalk/Lisp/Perl/C/Java/Fortran/what fucking ever. I'll be finished and gone home. Don't forget the lights.

Excel fills a niche, and does it reasonably well. Don't like it then don't use it. Yeah, bits of it suck. VLookups suck. Smart tags suck. Handling named ranges sucks. Hey parent, you know what any of this means? No? Funny that.

Excel also has (had? this might be fixed in 2003) some problems with things like standard deviations, etc. Run a standard deviation of 666666666123, 666666666246, and 666666666369, and you won't get the expected value of 123. Rather, you'll get 0. The high absolute values of the three numbers cause rounding problems by pushing significant bits out of the mantissa of IEEE numbers.

Despite the fact that this has an easy fix (mean center the data before computing the deviation), Excel has had this problem for years.
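
You can reproduce the effect outside Excel. The Python sketch below assumes the culprit is the one-pass "calculator" formula (sum of squares minus the squared sum over n); I can't confirm that's exactly what Excel does internally. The function names are mine, and the clamp to zero in the naive version is only there so a slightly negative variance doesn't raise an error.

```python
import math


def naive_stdev(xs):
    """One-pass formula: loses the low-order digits for large values."""
    n = len(xs)
    total = sum(xs)
    sum_sq = sum(x * x for x in xs)
    variance = (sum_sq - total * total / n) / (n - 1)
    # Rounding can push the variance below zero; clamp it so sqrt works.
    return math.sqrt(max(variance, 0.0))


def centered_stdev(xs):
    """Two-pass formula: subtract the mean first, then square."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))


data = [666666666123.0, 666666666246.0, 666666666369.0]
print(naive_stdev(data))     # nowhere near 123
print(centered_stdev(data))  # 123.0
```

The squares are around 4.4e23, where adjacent doubles are tens of millions apart, so the 30258 that the subtraction is supposed to recover is lost before it ever happens. After centering, the values are just -123, 0 and 123, and everything is exact.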

"If you are using VBA in your spread sheet you need to move to a better solution - a dbms and a decent programming language. You are doing the equivilant of using a table knife for a screwdriver."

However, you're suggesting we use a bulldozer when a shovel will work just fine.

I work for a pharmaceutical company as a software developer. Our scientists use Excel spreadsheets as reports; they enter in some raw data (or it's streamed in from an external program) and a combination of VBA and Excel formulas do the rest. These spreadsheets summarize data, predict flows, highlight trouble data, etc.

THEN, in some cases (at least those that are needed), we have the ability to export the data stored in the Excel spreadsheets into Oracle tables.

The spreadsheet acts as an intermediary for the scientists. It gives them something visual. They can modify things themselves, look at graphs for select data, etc. In some cases, they've even written their own VBA code to perform certain tasks. It's a horrible language, but simple enough for someone to pick up.

Try writing software to allow them to do all of this, and to work with about 150 different macros that were written in the past. A biologist is not going to try to learn C++ or Java, because it's too time consuming and overkill for what they need. And any application, as simple as you make it, will not be as customizable and visual as Excel. You'd be robbing them of that important aspect.

Sure, VBA is a pain in the ass; I wish it would go away forever. But it's made its niche; it allows the non-computer-savvy to do complex things. Anything better would be overkill and would reduce functionality.

Some people at Microsoft do listen; you just have to make a bit of an effort to find them. Curiously, a comment from the developer in question was that the dev teams love direct contact with customers prepared to give them helpful information about bugs or feature requests; they just wish the PR people would stop getting in the way. :-)

...a perfect example of the difference between OSS and closed source.

I've worked both sides of the fence, and realize the differences. There are base motivations that drive each to do things differently. Still, I was stunned when I asked Theodore Tso a question in email a few years back, and he not only responded quickly but even sent a patch for me to try out!

The only thing I use Excel for is Solver. Solver turns Excel into the world's easiest-to-use linear/non-linear optimizer for ANY function you can put in a spreadsheet. I use Gnumeric a lot, but I always have to go back to Excel for Solver...