Posted
by
kdawson
on Tuesday May 20, 2008 @05:59PM
from the no-relation dept.

Leemeng writes "I'm looking for a simple, free, and F/OSS flat-file database program. I'm storing info about Wi-Fi access points that I come across, maybe 8-9 fields per entry. I've outgrown Notepad. This info is for my own reference only; it is not going on a Web server. Googling was unhelpful, with results skewed towards SQL, Access (MS), and Oracle, all of which would be overkill for my purposes. My criteria are: it must be simple, F/OSS, must work in Windows Vista, preferably use a portable format, must not be an online app, and must not require Java. Does such a beast exist?"

The only thing that comes to mind is Mac only. Bento [filemaker.com] (which from the write up is exactly what you want). I really think the only thing on Windows that really comes close is Microsoft Access and even that isn't what you're looking for.

If you are just doing tabulated data in a piece meal form, Excel with an Access back-end will do the trick, I'd be willing to be you're going to be able to find templates that help you start. You could also use File-maker (which is overkill, but easy to use) and go to the user community for starting templates.

YAML meets the flat file requirement and can be made to be the persistent DB behind Python, Perl, Ruby.

If it were not for the Flat File requirement then a simple Python shelve or Perl Tie would be the most logical solution since they are both part of the standard library so don't require installing libs on random computers you might use or port to.

of those two Perl Tie is probably the most suited because it's backed by a real DB operating off the disk not fully in memory.

But why not do both: use YAML as the DB backing the Perl Tie.
then you get a nice human readable flat file.

You may think you're keeping it simple by using a flat file, but you're really not. It may be somewhat easier to manually edit, but it's also easier to screw up, and I've never heard of one with the ability to undo changes.

Class dbtxt includes the ability to undo changes, and features human-readbility for the flat database files. It's trivial and not error-prone at all to mod a database file to delete, or undelete a record, and of course you can do it through the database engine as well.

Or being purposely obtuse which you (probably rightfully) feel you can get away with because this is a very GPL friendly forum. BSD, MIT, etc licenses allow you to do whatever you want with the code with none of the strings that the GPL has. Your false dichotomy between GPL and closed source doesn't really help promote a useful dialog about freedom of source.

Couldn't you have at least done an example that doesn't promote bad practices with quoting? My experience with sqlite (admittedly using the Tcl binding) is that it gets this stuff far more correct and easy to use in practice than any other database API that I've seen. I don't know if the python bindings are of that same quality; if they are, they're top notch. The code I would have written? Like this:

set first "Spooky"set last "Monster"set email "spook@spammity.spam"# In practice it's really easy to put values into variablesmydb eval {insert into contacts values (:first,:last,:email)}

The advantage? That code is now totally armour-plated against SQL injection attacks as well as being fast. Which is nice, really really nice.

Your problem is that you are looking for a "database". This is a very complex kind of program that is total overkill for your purpose. Will you really be storing hundreds of thousands of records and wanting to do some complex querries? Ifnot why not just use a speadsheet? If you want a free spreadsheet look at OpenOffice.

For most GPL projects, this is what it means. How is that unreasonable??

Did I say it was unreasonable? It's your code, you're free to put any conditions you want on it. I encourage you to do so. PD, GPL, BSD, commercial, etc., that's your call. With these choices come consequences to the end users of the application and/or the source code, though, and there's no squirming out of them.

What I said was that the GPL imposes "you're free to do as we tell you" conditions while holding up PD in contrast. Which is precisely accurate. PD doesn't impose such conditions; hence PD is a feature offered as compared to any GPL'd alternative. Since it is a feature that directly improves usability and lightens the conditions under which the user can sell, modify and redistribute the source code, I find it well worth mentioning.

Just because you don't necessarily want/need to make it public doesn't mean you want to distribute it as binary-only or not be able to use other GPL code in the development or expand it in unforeseen ways in the future.

I'm not a programmer, but in one of my past jobs I used to hire developers for projects. One of the caveats I applied to everything we did was that it had to be GPLed, partly for flexibility down the line if we needed to modify it or distribute it to partners (gov't and universities), but also because the nature of the work was such that 95% of the foundations already existed in other GPL projects, all we needed was customization to our particular workflow. It's no secret that a lot of free software is only usable by techs, and much of what I was doing was developing user-friendly interfaces for our employees, and then paying to have the backend made functional.

But we never had the software available for the public, just because they'd have little interest in it as it was so customized for our particular use and I'm sure some of the very situation-specific code was fairly hacky. I know the developers did push patches upstream when it made sense, and I encouraged them to do so and even paid for more hours here and there to make sure the code was clean enough to do that. I strongly suspect that most developers who make a living with GPL/BSD software (other than employees of Red Hat and Mozilla and the like) pay their rent on projects like this. It's a great symbiotic relationship between the GPL software community and private enterprise.

I'm late to the party, and there are certainly a ton of other good suggestions, but I just had to pipe in and cast another vote for YAML [yaml.org].

Even if you don't end up using it for this particular app, it definitely deserves a look. Although it'll never displace XML, it definitely answers at least some of the questions XML attempts to answer, reduces "tag bloat," and is easy on the eyes.

Having a db that is human readable can be a good thing--and a bad thing (anyone can change its contents manually?). Being small is a good thing. But even if you are doing embedded programing, is 800k such a burden that you would give up this feature set?

Sure, it can be. For instance, adding 800k of stuff to your application means that you are giving up control. You might have the source (which is going to be somewhat larger in this case)... but do you have the time to dig into it if there's a problem? Does reporting the problem necessarily mean it's even going to get fixed? For instance, I've reported many problems to various authors, some of which got fixed, some didn't. Likewise, adding a feature can be non-trivial with a large (source code) project. Compare that to a very lightweight thing that, let's say, almost does what you want but you just need feature X. Might be a whole lot easier to add that to a 20k "almost-there" set of source than megabytes of source, is that not possible?

Distribution of a class module with a python app is an entirely trivial thing to do. For that matter, you can take the class and put it in the same file with your application if that appeals, giving you the advantages of atomic distribution -- harder for an end-user to separate things and make whole app, as you handed it to them, fail to work.

And I suppose the elephant in the room is that the end user may have no need for the features. I sell a commercial program with features not available in other programs in the same general application space, functionality-wise; that doesn't mean it will sell to anyone on that basis unless they actually need those features, right? In this specific case, based on the article, the needs described seemed to be met by what I'd written, and hence my post.

PostgreSQL whips the living hoo-ha out of SQLite; it is larger, but offers more features (sound familiar?) Yet SQLite has a place, as your advocacy clearly indicates. I would venture that the difference between SQLite and class dbtext is similar, just occupies another place on the size/features scale. In the end, the user, based on their requirements, may go "what a POS" and step up the ladder. From either one.

OK, dbtxt looks interesting, but SQLite is very stable

dbtxt also seems to be very stable. It's never failed for me, nor have I ever received any indication that it wasn't doing what it was supposed to do when used as indicated. Also -- as a user myself -- I'm quite interested in fixing it, should it turn out to have problems.

Also, just as an aside, If someone can explain cross-platform file locking in python to me in terms I can understand, and if python's ability to lock can be made to implement a reasonable form of "wait for lock to release", I'd like to add that as a feature, too. I found python's docs on locking to be more than a little opaque. I'm sure it's just me.

He said he wanted a flat-file database. Why, on this Earth, would he ever need "standard SQL including joins." SQL is designed for relational databases and, in particular, joins are only EVER used in a multi-table, relational database. In fact, the guy even said "SQL is overkill".

First of all, my point was that in comparison to a python-only tool like dbtxt, sqlite is probably worth the additional memory overhead of a couple hundred kilobytes.

Second, I'm not sure that you know any better than I do what the poster really needs. Open office calc might be an option, but certainly not if 800k is too much space; open office is what, like 200 megs? And dbtxt might also be an option, if doing everything in Python is what the person wants; I love Python, but there are plenty of people who don't. I'm sure that all three proposals are equally valid.

Now, to defend SQLite: in most cases, dealing with a relational database is "overkill" not because of SQL per se, but because of the fact that software needs to be installed and configured*, and because complex sets of processes need to be managed and maintained. SQLite is useful, in part, because it provides the functionality of a relational database without the complexity of the engine. Now, if you don't want or need SQL, then you might not consider that particular feature to be of particular benefit. Nonetheless, there are many situations where SQL might not be required, but where it might be useful, where in the past SQL might not have been used, but where now, because of SQLite, it will be used. There are very few cases where real-world information is not most accurately represented as relational data (in comparison to flat data). SQLite allows you to avoid making compromises regarding the way that that data is stored. And SQLite is fast.

*(Although some interpreted languages might require that sqlite's libraries be installed on a system for it to be used by a program, that is not the case for compiled languages like C, C++, Obj-C, etc., where the sqlite libraries can be compiled directly into the code distributed to end users.)

With PD code, you *do* own the code. Everyone owns the code. So you can do anything you want. Now that is good manners.

There is a philosophical position that I think is important. The term "share and share alike" is something that most kids have learned in childhood for centuries for the greater good of society. The GPL simply puts that in the license.

I am a capitalist, seriously, I honestly think that real people deserve real compensation for their work. Actually, let me rephrase that, people have the right to try to get compensation for their work.

The BSD license makes no philosophical statement, and that is sad. I honestly believe that if you put your code out there for someone to use you should try to get something for it. Your life, the time you spend writing the code, is valuable. Make it worth something.

TextDB: If you don't have access to a database but you do have access to PHP, and you want your web data stored in a database, then this is what you're looking for! Full sql compatibility is planned as well as a perl port.
This is OS-independent and is licensed under GNU GPL, has a web-based interface and is developed using PHP. (check out: http://sourceforge.net/projects/textdb [sourceforge.net]).
Berkeley DB (BDB): This is a computer software library that provides a high-performance embedded database, with bindings in C, C++, Java, Perl, Python, Ruby, Tcl, Smalltalk, and many other programming languages. Berkeley DB is redistributed under the Sleepycat Public License, an OSI and FSF approved license and runs on a variety of operating systems including most Unix-like and Windows systems, and real-time operating systems. It comes in three different editions: Berkeley DB(originally written in C), Berkeley DB Java Edition, Berkeley XML DB (check out: http://en.wikipedia.org/wiki/Berkeley_DB [wikipedia.org])

Check out Notetab Pro.. should have no problem dealing with your text files.. and you can get fancy and make imput dialog boxes, and little clips (macros) to deal with repetitive inputs... Deals well with text, if I remember right the clip engine was modeled after Perl.

Accurate and very full SQL implementation, fully transactional and ACID compliant, stores everything on disk in a single file, and can be archived in a transportable format for restoring on different OS/Architectures.