Transparent persistence for JavaLanguage business objects. Orders of magnitude faster and simpler than a traditional database. Write plain Java classes (albeit in accordance with a CommandPattern and a few constraints): no pre- or post-processing required; no weird proprietary VirtualMachine required; no inheritance from base-class required. Clear documentation and demo included.

Apples and oranges. What MySql and OracleDatabases do (across a network connection, I might add) is different from what ThePrevayler does. Although some applications that use the former could instead use the latter, and might do their jobs faster as a result, saying ThePrevayler is "faster" than MySql is like saying a telephone is faster than a typewriter. They simply don't do the same thing.

Well, whether they overlap or not is perhaps debatable. But the primary reason it is fast is that it is language-specific. Most RDBMS are not. Data tends to outlive languages, by the way. There are a lot of other issues to consider. For example, for read-only operations MySql tends to be faster than Oracle. This is because Oracle is tuned for a mix of reads and writes. Thus, a benchmark that favors reads will give a potentially misleading score. Further, if you later need some of the features that RDBMS typically provide, you may end up having to implement them yourself, and then no longer win speed races unless you spend years using all the optimization tricks that database companies spend a lot of effort on. It is often said that human maintenance is more costly than machines.

KentBeck, RobMee, HumbertoSoares? and KlausWuestefeld paired up for a weekend in December 2002, on the island of Florianopolis, Brazil, and implemented Florypa, a minimal prevalence layer for Smalltalk (VisualWorks) based on Prevayler.

Is there a write-up anywhere on prevayler that doesn't use this mock question and answer format? Asking yourself leading questions and then answering them is a really annoying way of presenting anything.

No.

Aren't you simply dropping all code from DBMs and stuffing it all on the application?

No. I have seen prevalent systems that were thousands of lines of code. Most DBMs are hundreds of thousands of lines of code. --KlausWuestefeld

Citations please? Many open source relational databases are significantly less than hundreds of thousands of lines of code.

First things that come to my mind are referential integrity, schema evolution, etc.

The JavaVirtualMachine takes care of referential integrity. Don't worry, it will not let you put a Pineapple object in a Person variable. As for schema evolution, does your DBM write your data migration scripts for you? --KlausWuestefeld

Type checking is not the same as referential integrity. The JVM will not enforce a NOT NULL constraint, or UNIQUE, or any of a number of things that might be expressed as a CHECK constraint, of course.
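
To make the distinction concrete, here is a minimal sketch of what enforcing NOT NULL and UNIQUE by hand might look like in plain Java. The names `Person` and `PersonRegistry` are hypothetical and are not part of Prevayler; the point is only that the compiler rejects a Pineapple in a Person variable, whereas these two rules exist only if someone writes them.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical business object; nothing here comes from Prevayler.
class Person {
    final String email;
    final String name;

    Person(String email, String name) {
        // Hand-written equivalent of NOT NULL: the type system alone accepts null.
        this.email = Objects.requireNonNull(email, "email");
        this.name = Objects.requireNonNull(name, "name");
    }
}

// Hypothetical collection that plays the role of a table with UNIQUE(email).
class PersonRegistry {
    private final Map<String, Person> byEmail = new HashMap<>();

    void add(Person person) {
        // putIfAbsent returns the existing entry, if any, and leaves it in place.
        if (byEmail.putIfAbsent(person.email, person) != null) {
            throw new IllegalStateException("duplicate email: " + person.email);
        }
    }

    int size() {
        return byEmail.size();
    }
}
```

Every further rule that a CHECK constraint would express needs similar code in whichever class owns the data.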

Reinstating this question, using GreencoddsTenthRuleOfProgramming: "won't a sufficiently complicated database application written using PREVAYLER contain an ad hoc, informally-specified bug-ridden slow implementation of half of a DBMs?"

No. Your DBMs probably didn't allow that but with Prevayler, you are finally free to use other people's OO code. --KlausWuestefeld

That doesn't answer the question. You mean "yes, but you can borrow someone else's code to help build that 'ad hoc, informally-specified bug-ridden slow implementation'". What's porridge got to do with anything here?

Are the best practices one finds in today's database theory all best forgotten?

While there is no "inheritance from base class required" there is a rather important architectural restraint involved. Applications utilizing prevayler are required to adhere to a simple CommandPattern. No great burden, and in fact a nice guideline, but the statement above could easily be misinterpreted.

The client code has to issue transactions to the business objects through the simple CommandPattern. The business objects themselves, where the real gain is, have no such restriction. -- KlausWuestefeld

Seems interesting, but it doesn't seem to provide multi-user support (i.e. locking / concurrency). I assume this is ignored by requiring commands to be executed sequentially.

Commands (transactions) are executed sequentially so you don't have to worry about concurrency between them. You do have to worry about concurrency between commands and queries, though. Prevayler does not have to specifically provide for that. Remember: all your business classes are regular Java classes. You can use all the Java language synchronization constructs normally. -- KlausWuestefeld
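
A minimal sketch of the arrangement described above, under stated assumptions: this is not Prevayler's actual API, and every name here (`Command`, `Bank`, `Deposit`, `TinyPrevalenceLayer`) is hypothetical. Clients wrap each change in a serializable command; the layer journals the command and then applies it, one command at a time. The business class itself has no persistence code beyond being serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; this is not the Prevayler API.
interface Command<S> extends Serializable {
    void executeOn(S system);
}

// A plain business object: no base class, no persistence logic.
class Bank implements Serializable {
    private final Map<String, Long> balances = new HashMap<>();

    void deposit(String account, long amount) {
        balances.merge(account, amount, Long::sum);
    }

    long balanceOf(String account) {
        return balances.getOrDefault(account, 0L);
    }
}

// The client-side transaction, expressed as a command object.
class Deposit implements Command<Bank> {
    private final String account;
    private final long amount;

    Deposit(String account, long amount) {
        this.account = account;
        this.amount = amount;
    }

    @Override
    public void executeOn(Bank bank) {
        bank.deposit(account, amount);
    }
}

class TinyPrevalenceLayer<S> {
    private final S system;
    // Stand-in for an on-disk journal: one serialized command per entry.
    private final List<byte[]> journal = new ArrayList<>();

    TinyPrevalenceLayer(S system) {
        this.system = system;
    }

    // Commands never interleave: journal first, then apply in memory.
    synchronized void execute(Command<S> command) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(command);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        journal.add(bytes.toByteArray());
        command.executeOn(system);
    }

    synchronized int journalSize() {
        return journal.size();
    }

    // Queries read the system directly; guarding them is the application's job.
    S system() {
        return system;
    }
}
```

Note that `system()` hands out the live object, which is exactly where the concurrency question between commands and queries comes from.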

"Command/transactions are executed sequentially" - doesn't that make the system really hard to scale?

No. Every command takes only a few microseconds to run. This is partly because all data is stored in RAM.

There's no guarantee that a command will take a few microseconds. A command could take years to run.

It could. But it doesn't have to.

Remember that read-only operations can still run (depending on your synchronization solution).

What synchronization solution? How do you perform a read-only operation while a command is executing without a chance of getting partially modified data?

In Java, the most common solution is to use the "synchronized" keyword. Just solve it the way you usually solve concurrency issues in your code.

SnapshotPrevayler? synchronizes on itself for each command. If another thread reads from the PrevalentSystem? during that command there's no guarantee of data integrity. Are you suggesting that code called by the command has to grab locks before modifying data that might be read? If so, how can read-only operations be processed while a command is executing?

I don't really understand what you mean here, but: Yes, depending on the semantics of your business objects, you may need to use "synchronized". Just like in any other Java application the other threads can still access the remaining, non-locked, objects in the system.

Read-only operations (a.k.a "queries") don't need to use Commands, but instead access the system directly. Maybe that's where the confusion is?
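
One possible answer to the question of how queries avoid seeing partially modified data, sketched here with `java.util.concurrent.locks.ReentrantReadWriteLock` rather than the `synchronized` keyword mentioned above. This is an illustration under assumptions, not how SnapshotPrevayler works; `GuardedSystem` and `Accounts` are hypothetical names. Queries may run concurrently with each other, but a command waits for running queries to finish and blocks new ones while it executes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical business object with a multi-step update.
class Accounts {
    private final Map<String, Long> balances = new HashMap<>();

    void open(String name, long initial) {
        balances.put(name, initial);
    }

    void transfer(String from, String to, long amount) {
        // Between these two lines the total is temporarily wrong.
        balances.merge(from, -amount, Long::sum);
        balances.merge(to, amount, Long::sum);
    }

    long total() {
        long sum = 0;
        for (long balance : balances.values()) {
            sum += balance;
        }
        return sum;
    }
}

// Hypothetical wrapper: commands take the write lock, queries the read lock.
class GuardedSystem<S> {
    private final S system;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    GuardedSystem(S system) {
        this.system = system;
    }

    void execute(Consumer<S> command) {
        lock.writeLock().lock();
        try {
            command.accept(system);
        } finally {
            lock.writeLock().unlock();
        }
    }

    <R> R query(Function<S, R> query) {
        lock.readLock().lock();
        try {
            return query.apply(system);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The cost is the one raised in the objection: a long-running command stalls every query for its whole duration, so this only works well when commands are short.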

I think you will find http://pot.forgeahead.hu/ both faster and better for prototyping as it lets you use any POJO data structure you can come up with. -- Bjorn Blomqvist

You might also want to look at JavaSpaces. The Sun contributed version is an "all in memory" implementation of an AssociativeMemory like Linda. Of course the problem that we found writing enterprise applications was the prevalence hypothesis. IOW, right now, today, there IS NOT enough ram. If you are dealing with Gigabytes of data, you will eventually fall over with today's machines. However, for those not dealing with large datasets, I would definitely recommend something like Prevayler or JavaSpaces or any other in memory log-based PersistenceMechanism. My feeling is that the correct solution (as always) will end up being a hybrid solution. After all, some things you want to keep on disk. I can think of no good reason to keep years of transaction history for some application FOO in memory. The only time you'd want to bring it into memory is when you are going to run some rare report. Another example is an archiving or versioning system - why keep binary images of files in memory? Just because it's data? No. Again, a hybrid approach is probably the best. -- RobertDiFalco

You could easily wrap your binary data using some scheme where the large chunks of data are brought into memory only when they're needed. This requires a few extra lines of code, and I don't see how one could conclude from this that one needs a full-fledged RDB implementation.
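
A sketch of what such a wrapper might look like, assuming the large data lives in ordinary files next to the prevalent system. `LazyBlob` is a hypothetical name, not something Prevayler provides. Only the path is part of the persistent object; the bytes are read on first use and held through a `SoftReference` so the garbage collector may reclaim them under memory pressure.

```java
import java.io.IOException;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.lang.ref.SoftReference;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical wrapper: the business object keeps this small handle in RAM.
class LazyBlob implements Serializable {
    private final String path;
    // Not serialized; rebuilt on demand after a restart or after GC clears it.
    private transient SoftReference<byte[]> cache;

    LazyBlob(String path) {
        this.path = path;
    }

    synchronized byte[] bytes() {
        byte[] data = (cache == null) ? null : cache.get();
        if (data == null) {
            try {
                data = Files.readAllBytes(Paths.get(path));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            cache = new SoftReference<>(data);
        }
        return data;
    }

    synchronized boolean isLoaded() {
        return cache != null && cache.get() != null;
    }
}
```

This keeps the file contents outside the journal and snapshot, so keeping the files themselves consistent with the object model is left to the application.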

-- I agree that a hybrid approach wins just as it always has. However the important thing to note is that an RDBMS is no longer a part of that hybrid.

What about history? How many years of history should I keep in RAM? Do I have to use different classes if I want to load some state from ram and other state from a persistent archive?

If you want (?). There would probably be well-factored, not-very-intrusive ways of accomplishing this, with aspects, etc. But it sounds more like you just don't like this whole idea. I can think of many (most) of the applications I've worked on, in fact most of the "serious" enterprise apps, that would have no trouble fitting in RAM. Of course there are an infinite number of possible applications that wouldn't fit in RAM without the modifications I suggest. This is all stated up front; no one is saying this thing is supposed to be all things to all people (unlike, say, EJB or RDB marketing).

[Summarize] :
Me: It doesn't do what rdB's do (platform independent, fast bulk queries, arbitrary associations, multi-access).
Anonymous: Do you need to do those things?
Me: Yes. Perhaps not all of the time, but most of the time. Exactly. You need a tank, which requires lots of maintenance, slows down progress, etc, but has features you need. Most people just need cars -- even for "enterprise" apps.

I think a simple form of object persistence is an excellent idea for saving session state. If it can be transactional too then that is a plus, but depends on the application working hard to externalize updates as transactions. A simpler scheme, such as checkpointing, would seem enough.--RichardHenderson.

I'm actually using this in a real world project and have had none of the problems RDB people are positing. The only thing I've missed about relational databases is cascade deletes, and I plan to move to something like RDF in the future to solve that problem. What they don't mention is all of the hoops you have to jump through to build and maintain and change a RDB, the extra overhead you incur running unit tests, and depending on your organization, the DBA bottleneck. The bottom line is if you need crash-proof persistence - which is all most applications really require - something like Prevayler is all you need. And in the future if you find you require a RDB, you can always refactor.

What if there are other resources, e.g., JMS channels, transactional email services, etc., that need to be involved in the transaction? Is ThePrevayler XA-compliant? --RandyStafford

Not today, so this would be a requirement that would lead you to use an XA-compliant PersistenceMechanism, possibly a RDB. Many of your questions might be answered faster by simply downloading the source and taking a look at what it does, there's not much of it to grok.

Actually it is very much not so. But make up your mind: you either contribute here to WardsWiki and allow your claims to be discussed and possibly debunked here, or stop making claims supported by external links only, as people here are unlikely to contribute to yet another wiki. By the way, have you studied TransactionalInformationSystems ? --CostinCozianu

Isn't ThePrevayler based on two phase commit? It writes each deterministic transaction to a log before it performs it. The problem is that it can only deal with deterministic transactions, and JMS, email, etc. aren't deterministic. -- EricHodges
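
To illustrate why determinism matters for a replayed log, here is a minimal sketch; `LoggedCommand`, `RecordEvent` and `Recovery` are hypothetical names, not Prevayler classes. The command carries its timestamp as data instead of reading the clock when it runs, so replaying the journal after a crash rebuilds exactly the same state. Sending an email or a JMS message inside `applyTo` would break this, because replay would send it again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical command type: everything it needs is carried as data.
interface LoggedCommand {
    void applyTo(List<String> auditTrail);
}

class RecordEvent implements LoggedCommand {
    private final String what;
    // Captured once when the command is created, never re-read during replay.
    private final long timestampMillis;

    RecordEvent(String what, long timestampMillis) {
        this.what = what;
        this.timestampMillis = timestampMillis;
    }

    @Override
    public void applyTo(List<String> auditTrail) {
        auditTrail.add(timestampMillis + " " + what);
    }
}

class Recovery {
    // Rebuilds the in-memory state by re-running the journal from the start.
    static List<String> replay(List<LoggedCommand> journal) {
        List<String> state = new ArrayList<>();
        for (LoggedCommand command : journal) {
            command.applyTo(state);
        }
        return state;
    }
}
```

Had `applyTo` called `System.currentTimeMillis()` itself, two replays of the same journal would produce different audit trails.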

Actually the TwoPhaseCommit is the algorithm for resolving commit commands for distributed transactions. What prevayler does is a basic recovery mechanism, and it sweeps concurrency control under the carpet by serializing all concurrent accesses to the global object model. Well, it works in some situations, but you have to be very careful in analyzing whether it works for your particular problem. If you have a distributed transaction situation, the lack of support for TwoPhaseCommit can be a very serious deficiency, but again, depending on the particular situation it may not be too big a problem, because what TwoPhaseCommit does at the infrastructure layer, transparent to the developer, could be done at the application layer with application specific business logic. I guess that's what Klaus' article is trying to say: that you can deal with the issues at the business logic level. However, depending on the situation that extra business logic you have to add can be extremely simple or it can be extremely complex, and I'm afraid the examples he presented are not very representative of how and why one would be in a distributed transaction situation and how to handle it.

But TwoPhaseCommit is implemented just like ThePrevayler. Write a message to a log indicating what you're about to do, then do it.

Not at all, there are fundamental differences in implementations. Of course they share some similarities because everything that has to do with transactions has got to write something into some log. But other than that, the two algorithms are fundamentally different. And by the way they solve different problems. Prevayler solves the problem of persisting the effects of transactions against a single system in case of a soft crash, and even that implementation has problems. The distributed transaction means that a transaction that spans multiple database managers has to be either committed by all or rolled back by all the participants, and of course the effects persisted, in the presence of soft failures, hardware failures and communication failures. TwoPhaseCommit doesn't offer an absolute guarantee that it will meet those requirements, as it is provably impossible, but it comes very close, and in the unlikely event that it fails, manual intervention by the DBA with database specific tools can solve the remaining problems.

Like for example the ability to roll back, and the need to contact a central transaction coordinator and deal with communication errors. See TransactionalInformationSystems, for more details.

We're using different definitions of TwoPhaseCommit. The two phase commits I've implemented had no roll back and no transaction coordinator. They were used for disaster recovery. If the system crashed during a transaction it could resume the transaction when it started again. I see from google that my usage is in the minority. Nevermind.

"Actually, two-phase commit is unnecessary"

After having tested Prevayler (and Prevayler-like IMDB's) we came to this conclusion: It is useful for prototyping, but fails miserably in terms of a lot of OODBMS/RDBMS issues.

In fact, we concluded that when all the problems with Prevayler were solved, we had an object-oriented database.

As a side note, a lot of people don't use Prevayler because of the elistic (unrealistic?? elitist??) "it-solves-everything" tone presented by Klaus and other Prevayler junkies.

I agree with you there. I use Prevayler and like it a lot but the web site is very offputting.

Well the benchmarks are a little silly too. There are a ton of things that Prevayler is not faster at than RDBMS systems. Not only transaction processing but there are many complex sorted tree queries that I'm pretty certain would be slower on Prevayler than a well designed RDBMS when operating on my multi-terabyte datastore.

-9983 times faster than ORACLE. -- You should be specific about the operations.

Prevayler, and much of the above, seems to miss the fundamental point that traditional DBMSes provide reliable, language-neutral, integrity-maintaining storage of vital enterprise data in a manner that supports a variety of applications, not just those written in Java. Enterprise databases tend to long outlive most applications -- or even the popularity of the languages the applications are written in -- because they represent genuine business assets; they are the valuable knowledge of the business captured in an accessible form. They are worth money. In some cases, they're worth a lot of money. In most businesses, the applications that access the enterprise databases -- or even that just require persistent storage -- with few exceptions are nothing but expensive overhead. They cost money. In many cases, they cost a lot of money. They are merely the necessary (and, from an executive or stakeholder point of view, unpleasant) expense of presenting and updating databases in a user-friendly manner. The databases -- i.e., the all-important data -- will still be there, delivering business value long after Prevayler, Java, and any associated applications and their language-du-jour development environments have come, gone, and been written, re-written, and replaced numerous times to suit technical fashion and petty (again, from a stakeholder point of view) tweaks to requirements.

Prevayler and its ilk do nothing but increase overhead -- due to having to port serialized Java objects (or whatever) to a new environment -- when it comes time to replace the Java applications with something better, newer, or cooler. Far smarter would be to reduce the cost of using Java (or whatever) with traditional SQL (or truly relational) DBMSes, so that the expense of using a DBMS is minimised while ensuring that the database needn't be ported to a new environment when the Java application is (inevitably!) replaced. -- DaveVoorhis

Language neutral?!? I've yet to meet two databases that speak the same language.

I think you mean you've yet to meet two DBMSes that speak exactly the same language, and indeed most don't. By "language-neutral", I mean that DBMSes are not dependent on any particular client-side language. You may switch from COBOL to C to C++ to Java to C# to Ruby to the NextBigThing, as many enterprises have done, or use all of them simultaneously, against the same DBMS. Prevayler is dependent on Java.

{Prevayler is dependent on JVM, not on Java. You could use Prevayler with Clojure or Scala if you wish to do so. I'm not so certain that this 'language neutrality' is a real benefit in any case, especially when what it really means is "Use MetaProgramming!". Excluding (not insignificant) concerns about existing documentation, how much more difficult would it be, relative to crafting SQL, to craft a small Java program each time I wish to query a Prevayler database? Heck, ThePrevayler already follows a CommandPattern; I don't need to go nearly so far as expressing a whole program - just crafting a command will do. Move Prevayler up to a 64-bit address space and have it handle the paging issues, possibly add SoftwareTransactionalMemory to allow concurrent transactions, and ThePrevayler could probably meet the scalability requirements for enterprise use.}

Despite the fact that Prevayler might be shoehorned onto some of the obscure languages that run on top of the Java Virtual Machine, the Prevayler home page states that "Prevayler is an open source object persistence library for Java." For. Java. I'll take Mr. Wuestefeld's word for it (literally), even if it isn't strictly accurate. As for your notion that it might be appropriate to "craft a small Java program" each time you wish to query a ... database, that may indeed be achievable in a theoretical sense, but to be practically reasonable it would mean making Java available in every context where SQL is used today, and the subsystems that host it would need to exhibit the same behaviour as a DBMS. Until that happens, Prevayler remains dependent on Java, the Java Virtual Machine, whatever.

{I think you can readily grok the difference between "Prevayler is for Java" (a statement of purpose) and "Prevayler is dependent on Java" (a statement of dependency). As far as "making Java available in every context where SQL is used today", that isn't necessary. Right now, SQL is 'dead code' in most C++ programs. The crafted Java 'command objects' could equally be 'dead code' in a C++ program that accessed the equivalent of a Prevayler DBMS. One strictly only requires "making Java available in every context where SQL is implemented today." That said, a cheap JVM suitable for Prevayler business objects (which are more constrained than arbitrary Java objects) wouldn't take unreasonable effort at all to build if one wishes to return encapsulated objects in queries.}

You're quibbling and you know it.

{It might have been mere quibbling if I didn't have a great deal of interest in Scala, EeLanguage, Clojure, and other languages that run atop the JVM. As is, you're throwing a nephew I'm somewhat fond of out with the bathwater...}

The notion of using Prevayler for anything but trivial object persistence for small applications is patently ludicrous, and that won't change until it can support 100,000 simultaneous users on terabyte databases, be integrated with the full set of enterprise application development environments, and be queried via TOAD and support reports generated using CrystalReports. Sure, it's theoretically possible and I suppose it could happen, but what are the real chances? Stop arguing and go write some code.

{The above statements were already qualified. See (page_anchor qualify). If the question is whether Prevayler can unseat an incumbent, the answer is "no fucking way, because when it comes to SystemsSoftware you need to be at least twice as good as the incumbent." If the question is whether Prevayler has a better data-model than the RDBMS, the answer is "I think not. Indeed, I've repeatedly expressed opposition to BusinessObjects (see ObjectVsModel, OoNotForDomainModeling?)." If the question, however, is whether Prevayler or a system like it can support 100,000 simultaneous users on terabyte databases, and can be integrated with enterprise application development environments, and can be supported by report generating software, the answer is 'fuck yeah', albeit requiring some upgrades to implementation. Object systems can scale very effectively, plus offer an inherent security model (ObjectCapabilityModel) suitable for scaling to multiple untrusted parties.}

Prevayler requires external concurrency control to support any more than one simultaneous user.

Otherwise, the fact that it can support very large databases is an unsurprising and uninteresting note, because Prevayler is essentially nowt but a crude storage engine layer for a DBMS. As for your other point, almost any unreasonable and pointless effort can be "justified" (I use the term loosely) with a handy "can be supported ... albeit requiring some upgrades to implementation." I can turn a scooter into a 747 with a few "upgrades to implementation" too. By the way, is your colourful language ("fuck yeah," etc.) really necessary? Are you drunk &/or stoned?

{I'm not drunk, but I'm slightly pissed. Pissed off, that is. I'm not talking about the sort of 'upgrades' that it takes to turn a scooter into a 747. I'm talking about moving to 64-bit support, adding SoftwareTransactionalMemory, and making ThePrevayler responsible for paging and GarbageCollection. In terms of implementation, those things take some careful design effort, but can be done in a few thousand lines of code. ThePrevayler already handles replication-based distribution as well as most enterprise-quality RDBMS. I'm moderately pissed off because, despite being explicit about the technical effort required, that has been repeatedly ignored by people who point at the incumbency advantage as though it were a technical problem then wave their hands and compare upgrades to ThePrevayler with turning scooters into 747s.}

{Why? For scalability, especially to multiple applications and users, of course. Not that I care about ThePrevayler in particular, but I am implementing my very own PersistentLanguage, and my own goal for scaling is integrated systems at the Internet scale. I need to go far above even the proposed modifications to ThePrevayler in terms of supporting non-replicated distribution, locating objects on a network, etc. So, suffice to say, I've spent a lot of time on related problems. Even if the obvious question is 'why', an equally important question is 'why not'? Is there a good reason that writing up an 'RDBMS' shouldn't be as straightforward as modeling Relations and enforcing Consistency rules in a PersistentLanguage that has SoftwareTransactionalMemory?}

{The main benefit of RDBMS is not this dubious notion of 'language neutrality'. The main benefit of RDBMS is the RelationalModel, which offers simplicity, predictability, optimizability, and joins (which in turn support a rudimentary form of LogicProgramming). Support for inter-relvar 'consistency' rules is also nice.}

Fine for RDBMS. However, I referred to DBMSes in the original paragraph. The main benefit of a DBMS is its role of maintaining long term independence of the corporate application infrastructure. The physical independence of the DBMS means its asset value can be relatively easily and inexpensively maintained over the long term (decades, now, in many cases) without risk of being contaminated by the expensive fashion-driven application development churn.

{Prevayler won't be replacing the RDBMS. But I would not be surprised if the RDBMS is eventually 'endangered' by PersistentLanguages - especially as they grow into the CloudComputing era with composable security models, ACID transactions, and high levels of scalability. A language with an advanced LogicProgramming subset would be raring for a takeover. Performance and features are likely to both be advantaged under such a design due to fewer serialization steps and better type and object integration.}

{I honestly wonder how well Oracle's years of fine-tuning its B-tree layouts, page-locking mechanisms, and in-block record-shifting (which is remarkably expensive, but necessary to maintain ordering, and tends to fight making blocks very large) would hold up against new designs for GarbageCollection, OptimisticConcurrency, and shared-structure writes. I'm guessing Oracle would do better (at least initially) due to specialized block structure, good query optimizers, and sheer experience of thousands of DBAs hammering out and documenting performance tweaks for many years. But that'd be excluding serialization, and the advantages of type integration, potential code specialization, and JustInTimeCompilation...}

"Language neutrality" aside, it doesn't "fit" other languages nearly as closely as it fits Java, removing one of its main advantages. Sure, it may "run" in some other languages, but merely running is not enough to dethrone RDBMS's.

{SQL RDBMS doesn't "fit" other languages nearly as closely as it fits SQL. Thus, by your own logic, it also lacks the "main advantage" that ThePrevayler loses. So, that simply puts ThePrevayler accessed from external languages on even ground with respect to SQL RDBMS accessed from external languages. It's also why we've got 'procedures' and 'triggers' and such being added to SQL standards - attempts to reclaim some of this advantage.}

Compare what would be involved to access and share data among many different languages and tools. How much effort to hook it up and access from COBOL, MS-Access, CrystalReports, Python, Perl, C, Dot.NET, etc.? Even if an ODBC-like infrastructure existed for Prevy, what would the query syntax look like for common tasks? For non-programming tools? -t

{Other languages would probably need to add their own equivalent to ODBC-for-ThePrevayler, though, and a bit of standardization effort might be necessary. So long as you're comparing apples-to-apples, i.e. accessing the same types of data stored in SQL RDBMS vs. in ThePrevayler, there'd be no reason to expect trouble. If you wished to deal with features more advanced than SQL itself ever offered, such as passing object code around, then ODBC-for-ThePrevayler is free to carry a cheap (e.g. small footprint, no JIT, single-thread) JVM. The cost of a cheap JVM is less than the cost of a typical state-machine serializer/reader for SQL results on an ODBC connection. Besides, due to the various constraints ThePrevayler places upon its persisted 'business' objects (we aren't talking about arbitrary Java objects) it isn't as though these objects can reference anything but one another.}

SQL reader? I'm not sure what you mean. Typically the client doesn't parse SQL. It merely passes it to the server as a string and gets a data-set (rows and columns) back. This is the beautiful KISS of the concept and why thousands of languages and tools can hook up to it. (True, Microsoft's ODBC drivers sometimes try to parse SQL, but that's merely MS's odd dance, perhaps a way to give their SQL dialect an advantage.) -t

{I mean the bit on the client that reads the data-set. You seem to have in your mind a belief that writing an evaluator for a simple language - such as bytecode - and a small GC is something of a challenge. That's simply untrue. A java bytecode interpreter and simple semi-space heap+GarbageCollection can be written in less than a page of code each in most app languages. Performance will suck, and thread-safety will be non-existent, but that's okay.}
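
For a sense of scale only, here is a toy stack-machine evaluator. It is emphatically not a Java bytecode interpreter: the three-opcode instruction set and the name `ToyStackMachine` are invented for illustration, and it has no objects, method calls, heap or garbage collector. It shows the shape of the fetch-decode-execute loop being talked about, and says nothing about how much more a usable JVM subset would need.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Invented instruction set; real JVM bytecode is far larger.
class ToyStackMachine {
    static final int PUSH = 0; // followed by one operand: the value to push
    static final int ADD = 1;  // pops two values, pushes their sum
    static final int MUL = 2;  // pops two values, pushes their product

    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc++];
            switch (opcode) {
                case PUSH:
                    stack.push(code[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("unknown opcode at " + (pc - 1));
            }
        }
        // The result is whatever is left on top of the stack.
        return stack.pop();
    }
}
```

For example, the program `PUSH 2, PUSH 3, ADD, PUSH 4, MUL` evaluates (2 + 3) * 4.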

It's still not clear to me what issue you are trying to address.

{I'm addressing your previous comment: (1) I said the cost of a cheap JVM is less than the cost for a serializer/reader for SQL. (2) You asked for clarification about "SQL reader". (3) I answered, plus made it clearer just how 'simple' it is to add a cheap JVM to the ODBC-library-equivalent for ThePrevayler. Where are you confused?}

I don't know what a "typical state-machine serializer/reader for SQL" is. What is "serialized" SQL? Why is it needed? At what stage is it needed? You talk about reading SQL, and then reading a data-set. They are very different things. Thus, I don't know whether you are talking about apples, oranges, or fruit salad. -t

{What stage is it needed? Behind the scenes, transport layer - ODBC drivers for example. I'd have thought that obvious, since I have been talking about a sort of equivalent to 'ODBC-for-ThePrevayler'. I've clarified it a bit above.}

{4 and a half - loading input stream to produce a language-dependent record set. (And that's ignoring cursors and whatnot.)}

Usually that's done by the driver, not the app developer. It seems a relatively minor thing anyhow. I disagree that Java does it inherently any better, but it's too small an issue to squabble about. A "cheap" driver could simply generate a CSV-style file/pipe and then let a lang-specific parser turn it into arrays/maps/cursors. It's not rocket-science. I'm more interested in how a Prevy version would do stuff. For example, does it also accept a text query? -t

{I was comparing it to having a cheap JVM implemented in the driver, not by the app-developer. So it seems apt to compare that to something done by the ODBC driver, not the app-developer, does it not? And to be clear, I have never claimed Java is inherently "better". Stop implying I have claimed Java "better" by 'disagreeing' with it. The question isn't whether Prevayler does "better", but only whether it does "worse" in some significant and critical way that makes it unsuitable for queries and sharing data. A "cheap" Prevy-version of the driver would simply serialize the return objects, including strings of bytecode for 'methods', then load it into a new instance of a micro-JVM locally. One could then visit or interact with the returned objects. Queries could return lists or arrays for simple visiting. And yes, one could make a text query and expect the remote Prevayler object to compile it in the same manner that one sends SQL text queries and expect the RDBMS process to compile them.}

If so, what does this textual language look like and what generates it? Lang-specific API's? -t

{That's a rather ephemeral question, a bit like asking what an RDBMS 'text language' would look like before SQL was chosen, which itself was (by historical coincidence) before PluginArchitecture was used for everything. But if it helps you imagine, assume that you're writing a class in Java, complete with 'import' statements at the top (except for the common ones, which might be automatically included), and that the class creates an object on the 'Prevayler' DBMS that can interact with the database, talk with your process by passing objects along (for long-term interactions and transactions), and eventually return a value (which might be the root of an object graph). One could advance this via plugins to other JVM languages, too, such as Scala and Clojure.}

In my opinion, Unix-style files/text/pipes are preferable to PluginArchitecture, but that's probably another topic. And, non-OOP fans, such as P&R and FP fans, will not be so happy about being forced to use the OOP paradigm. And even within OOP, loose and tight typing models tend to not get along so well. An API that works smoothly under both is difficult. -t

Top, you remind me of the guy who learned just enough automotive engineering to think it's a good idea to build an internal combustion engine out of wood.

You remind me of the guy who built wood out of an engine. Complificatify stuff for no reason out of an idealism obsession. API's usually take longer to use and learn than a table interface because they are navigational in nature. More specifically, to use a table-based interface you only have to know the schema and a standard query language. With API's you have to learn the API's, which takes longer in part because, due to encapsulation, they reinvent DatabaseVerbs from scratch and in different ways for each API. This lack of collection standards means more relearning. Reinvention of collection idioms is not logical. If that's "wood", then paint me guilty. (Yes yes, I know, your GodLanguage will fix all that.) -t

You are speculating, with zero evidence, on what takes longer to learn. You should probably stick to things you understand. You'll be happier there.

I am not speculating with regard to collection idioms; I'm stating a fact: OOP API's either reinvent DatabaseVerbs in different and inconsistent ways, or lack them altogether, making one have to roll their own. Consistency is better than inconsistency, all else being equal. And the ability to readily apply DatabaseVerbs is better than lack of that ability, all else being equal. Encapsulation inherently limits OO in that regard. -t

{In my experience, OOP collections and operations over them are invented once per language. GenericProgramming and polymorphism handle the rest. This results in an acceptable degree of consistency of API within the language, even though the collections might be less convenient than relations or [extended sets in] ExtendedSetTheory. Encapsulation doesn't "inherently limit OO in this regard" any more than encapsulation of table-implementations limits database verbs on tables. Since you claim to be stating facts, TopMind, I would like to see your sources. (Complex idioms like TreeInSql also make me doubt the alleged 'consistency' achieved by DatabaseVerbs.) My problem with collections in OOP has not been consistency of API; a much greater issue has been supporting concurrency.}
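
(Illustration: the "once per language" claim can be shown with the standard Java collections and streams library. In this minimal sketch the helper name selectSorted is made up; the library calls are the stock java.util and java.util.stream ones. The same filter-and-order code runs unchanged over a list and over a set.)

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class OncePerLanguageVerbs {
    // One generic "select ... where ... order by" for any Collection,
    // whatever its concrete implementation.
    public static <T extends Comparable<? super T>> List<T> selectSorted(
            Collection<T> source, Predicate<? super T> where) {
        return source.stream()
                .filter(where)   // the 'where' verb
                .sorted()        // the 'order by' verb
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> asList = Arrays.asList(5, 2, 8, 1);
        Collection<Integer> asSet = new HashSet<>(asList);
        // Same call, two different collection implementations.
        System.out.println(selectSorted(asList, n -> n > 1));
        System.out.println(selectSorted(asSet, n -> n > 1));
    }
}
```

This only shows consistency within one language; it says nothing about the cross-language sharing question raised elsewhere in this thread.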

But we are talking about sharing among different languages/tools. I've already agreed that if you are certain to only use Java for all your collection-handling, then Prevy makes sense. (Some dialects of SQL support built-in tree operations. But being proprietary or narrow puts it on par with specific API's or languages.)

{Widespread market support is NOT the same as technical consistency. In terms of 'consistency of API', SQL ain't especially consistent with anything but SQL. If there are a few languages it is moderately consistent with, they'd probably be DataLog and MercuryLanguage and the like, only because the data-model fits. From my perspective, SQL is a specific language, and using SQL queries from a non-SQL language is no less 'narrow' than is using Prevayler/Java queries from a non-Java language. Would the Prevayler/Java queries 'invent' a few DatabaseVerbs? Probably. But that is not worse than SQL having 'invented' DatabaseVerbs. It's once per language. Once per language. Once per language. Pay attention! SQL did not do better here in terms of "per language" re-invention of Database verbs.}

I was talking about PluginArchitecture in general, not so much about Prevy. To settle such, we'd have to have a "share-off" where a typical set of data-sharing scenarios are tested under both SQL and Prevy. And I ask again, would Prevy be different API's for each language and tool (yik), or a textual query language? Which are you entering into the share-off? (By some accounts, relational won attention over "navi-DBs" by winning query-offs.) -t

{If I were entering a 'share-off', I'd probably follow the same pattern as SQL just to keep as many variables as constant as possible. Wouldn't you? That's independent of what would be better or worse in practice (not everyone favors current SQL APIs, either). And the question overall is ridiculously stupid, like asking whether your SQL sharing examples will include per-language ODBC or not. Also, can you stop using that hideous baby-talk 'Prevy'?}

I don't find any of the above clear. What is the "same pattern"?

{If "to keep as many variables as constant as possible" failed to clarify that for you, I don't know what will. I don't really feel like explaining scientific method and the reason for minimizing differences between experiments, or explaining how you couldn't attribute to SQL vs. another language differences that come from other API elements...}

I agree that ODBC adapters are different per language/tool, but because most of the query complexity and power is in the query language and not the API, the difference is not that great, not significantly different than say FTP API's: they don't build a copy of the file structure at the client, for example. It's mostly just send-command-wait-get-result.

{The complexity is in building or describing the query, in managing any transactional interface, and then in processing results. Same would be true for a Prevayler DB.}

And the fact that you waste time complaining about "Prevy" is silly. There's lots of little things that bother me about your style, but if I stopped and pointed out every little one, we'd waste all our text on that alone. Stop being anal. Focus on the bigger things.

{You'll note only one sentence was spent on my request, and more than 80% of my last post was 'focusing on bigger things'. Can you say the same?}

So your speculation is better than my speculation? At least I can relate with and help fellow practitioners like myself because I know how they think and what they want. You design shit only a mumbly academic's mother could love. Thus, it will be ignored like 99.9% of the other dusty "computer science" papers one finds at a university basement. Maybe it takes a dumb blond to sell to a dumb blond. You sell to the 5 people in the world who think like you. In other words, nobody wants your goofy shit.

I've had 4500+ downloads of the RelProject, so apparently a few more than 5 people want my goofy shit. And, I am a practitioner. I went into academia after over 15 successful years developing custom, database-driven business applications.

Okay, only some of what you design is shit. My apologies. -t

A step toward enlightenment is recognising that what you perceive to be shit is not necessarily shit. Perception is fallible.

Nothing, but you wrote that I "design shit only a mumbly academic's mother could love" and I "sell to the 5 people in the world who think like" me. I'm pointing out that you're wrong, and I'm providing evidence that you're wrong.

{Down at the ABI/VirtualMachine level (as opposed to the language-level) the two type-systems get along quite easily. That's why ForeignFunctionInterface can work. Clojure is an example of a language with DynamicTyping that runs on the JVM. It's entirely feasible for, say, the local ODBC-driver JVM to handle basic type-conversions quite readily to allow useful interaction with the application language.}

Why would it be better than SQL? You have to look at the total IT picture, not just what makes Java devs happier. -t

{With regards to pointing out that SQL RDBMS is already starting without this "main advantage", thus putting ThePrevayler on even ground, why would ThePrevayler need to be "better" than SQL? And you shouldn't argue about "the total IT picture" when the point being argued is just a small fragment of it. In "the total IT picture", the RelationalModel still strikes me as superior to ThePrevayler's persisted-business-objects model.}

I believe it rational to keep the existing standard unless a contender is clearly better. True, some choice and competition is good, but you have yet to show that it can even approach the utility of the current champ. In other words, demonstrate that it's at least close to even if infrastructure history is ignored. It's still competing with Xpath (XpathLanguage) and other query-like challengers to SQL. You have to win the runner-up race before challenging the champ. -t

I'm not sure what you mean by "small fragment". Non-shared data? If you can be fairly certain that it won't be shared, then I have no major complaint as long as you can make a good no-share case. -t

{I've not argued that ThePrevayler is a 'good' query or data-manipulation language, or that it in particular has a real chance of replacing RDBMS. In this sub-argument, I've only pointed out that ThePrevayler has the same sort of dubious 'language neutrality' enjoyed by SQL, and that 'sharing' isn't nearly as difficult as you (with your gloom and doom pessimism) seem to be predicting, and that losing its "main advantage" simply puts ThePrevayler on the same ground as SQL which never had that main advantage in the first place. By "on the same ground", what I mean is you'll be comparing quality of data models, scalability, distribution, performance, etc. Only when we start talking about PersistentLanguages in a more general sense - specifically those with advanced LogicProgramming facilities as we move into a CloudComputing era - have I suggested that RDBMS may become endangered.}

The incumbent tends to have the burden of evidence. Let's study some scenarios. Let's start with one. Say I want to hook up a tool like CrystalReports to your Prevy data to make some reports for managers. What is needed? What does the user of CR have to do to prepare for it? If you have reasons to believe that RDBMS can't scale, then perhaps a different/new topic would be a better fit.

{What are you attempting to prove by such a question? That existing applications like CrystalReports do not currently support Prevayler-as-a-database? Or that 'persistent-business-objects' data model leaves much to be desired? (Both of those things I'd happily agree with.) I assume by "Prevy data" you refer to ThePrevayler specifically. Ultimately, though, you craft a query, send it across a process link, get a reply, and decide how to display that reply. CrystalReports would need some updates to handle the new database and query types. ThePrevayler could report strings, numbers, dates just as easily as SQL so apples-to-apples on that feature. If you wished to take advantage of complex object-graphs and method-calls, you might also be able to produce 'interactive' reports - e.g. an object-graph that displays an interactive scene within the report. (Not something you could print, though.) But that would hardly be necessary... one could simply say that CrystalReports is restricted to displaying the same sorts of data it always has displayed before, and one simply wouldn't craft queries that return complex object graphs.}

Oh great, your "query language" is morphing into your pet GodLanguage/typebase again. Surprise Surprise. This is where I get off the bus, for I've been to that theme park already, and don't like Goofy's teeth. (See CrossToolTypeAndObjectSharing.) But let's back up a bit. As a textual query language to compete with SQL, do you feel that it can hold its own there? Or is its advantage that it can eventually take everyone to your Xanadu? -t

{Oh great, you're doing that HandWaving assume-everyone's-talking-about-TypeSystems thing again. Surprise, surprise. Let me say this much: if you use a Prevayler database to do EXACTLY what you're already doing with SQL, such as relational queries, it won't be very competitive because Java doesn't support relations, joins, etc. all that well. On the other hand, if you take advantage of the features that are unique to an OODB, such as support for executing FunctorObjects as part of a query (approximately equivalent to executing scripts held in a ControlTable as part of an SQL query), it isn't any longer a direct competition with SQL. Either way, useful reports could be generated for CrystalReports so long as it has the minimal support for connecting to and querying Prevayler databases.}
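
(Illustration: "executing FunctorObjects as part of a query" might look like the sketch below, where each stored record carries its own rule object and the query simply invokes it. The names FunctorQuery, LineItem, and discountRule are hypothetical. The sketch also assumes the rules live in memory; plain lambdas are not serializable by default, so persisting them would need serializable functor classes.)

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongUnaryOperator;

public class FunctorQuery {
    // Hypothetical stored record: data plus a functor that knows how to price it.
    public static class LineItem {
        public final String sku;
        public final long priceCents;
        public final LongUnaryOperator discountRule;

        public LineItem(String sku, long priceCents, LongUnaryOperator discountRule) {
            this.sku = sku;
            this.priceCents = priceCents;
            this.discountRule = discountRule;
        }
    }

    // The query: sum prices after running each item's own stored rule.
    public static long discountedTotal(List<LineItem> items) {
        long total = 0;
        for (LineItem item : items) {
            total += item.discountRule.applyAsLong(item.priceCents);
        }
        return total;
    }

    public static void main(String[] args) {
        List<LineItem> items = Arrays.asList(
                new LineItem("A", 1000, price -> price - price / 10), // 10% off
                new LineItem("B", 500, LongUnaryOperator.identity())); // no discount
        System.out.println(discountedTotal(items));
    }
}
```

The rough SQL analogue would be looking up a script per row in a ControlTable and evaluating it as part of the query.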

Waitaminit... Isn't "as long as it has the minimal support for connecting to and querying Prevayler databases" essentially the same as "as long as CrystalReports is rewritten from scratch"?

{I prefer to think of it as: "as long as ThePrevayler is a major player when CrystalReports is written". I don't really know about the architecture of CrystalReports, but I do know it supports SQL - not because SQL is great, but because SQL is and was (ahem) prevalent.}

I don't think Oracle is the best comparison. There are many medium- and smaller-scale RDBMS that may be more comparable.

{The paragraph to which you respond is about PersistentLanguages and competing with RDBMS on an enterprise scale. (My exact words were: "I would not be surprised if the RDBMS is eventually 'endangered' by PersistentLanguages.") For that, I think Oracle is (if not the best) one of the best comparisons.}

{How could it relate? It isn't as though anyone here is arguing that PersistentLanguage = just Persistence, or that RDBMS = just Storage...}

Well, then call it a "collection language" or "database language" or whatnot. The emphasis on "persistence" suggests that somebody is focusing too narrowly.

{I disagree. That "persistence" is important, and it still specifies "language" (as in "programming language"). And if you wanted further clarification about other language features, well, the complete statement remains available earlier in the page.}

The real issue is cross-app info sharing. Data tends to outlive languages even if you are currently just a Java shop. In my experience, data of any lasting value often needs to be shared by other apps or tools, such as CrystalReports. One needs to compare Prevayler to other RDBMS with that in mind, not just Java+Prevy versus RDBMS.

{That's nothing special. It isn't as though the normal storage format for an RDBMS is any less proprietary binary. Data is made available through a protocol (e.g. ODBC) by a shared service that outlives individual applications. Query and DML facilities implicitly provide the necessary features to back up and restore data, thus allowing the data to live longer than any individual language. Data can be transformed without loss. There is no reason to believe that a dedicated process or object in a PersistentLanguage would perform poorly with regards to sharing or maintaining long-lived data.}

I wasn't really addressing machine performance. It's more about IT effort to work with the technology, to hook things into it, to write drivers for it, etc.

{I wasn't talking about speed, either. What does "with regards to sharing or maintaining long-lived data" mean to you? At this point, it'd be a serious challenge to unseat an incumbent even with insidious beachhead strategies. But, if you're talking about raw effort to support applications, drivers, backups, data outliving languages, sharing data with an application similar to CrystalReports and so on, then I can't think of any valid reason to be pessimistic. (It's a lot of effort on an absolute scale, true, but to be fair you must compare the effort that was required to integrate SQL with all this technology.)}

Think of it this way: if Java was replaced by some GreatNewLanguage? that was sweeping the industry, would Prevayler still be advantageous over an RDBMS? SQL is the de-facto standard for sharing info between multiple apps, languages, and tools inside a given org. It's familiar to IT workers and works pretty well despite some warts. You need to show Prevy competing well in that aspect. It's my opinion that Prevayler is merely Java developers thinking too narrowly and being selfish (perhaps unconsciously). -top

{If ThePrevayler were the incumbent, you could ask the same question in reverse: "Java is the de-facto standard for sharing info between multiple apps, languages, and tools inside a given org. Do you really want to switch to SQL? Sure, it has those nice LogicProgramming features, but how is it going to scale to multi-organization data sharing? What if it is replaced by some GreatNewLanguage?? Those DataLog and MercuryLanguage guys look like mighty powerful competitors to this upstart SQL, it might be gone before you know it! Besides, all the IT workers are familiar with it, and it works pretty well despite some warts." You have been claiming that a design is technically flawed on the basis of historical coincidence. I can't agree with such logic. I do not need to show Prevayler competing well against historical coincidence unless I'm making a non-technical argument (e.g. about market success, adoption, getting support integrated into CrystalReports).}

{I do agree that ThePrevayler has no chance to unseat the incumbent on the basis of historical coincidence. Technical merit, after all, rarely determines market success, and I don't favor ThePrevayler even in terms of technical merit. But I'm confident in my hypothesis that someone over the next three decades will invent a PersistentLanguage, designed for CloudComputing and DistributedSystems, with a LogicProgramming subset with sufficient technical merit and the marketing edge (in terms of scalability, performance, and security sufficient for multi-organization data sharing) to put the 'dedicated' RDBMS down like an old dog. My own efforts at a GreatNewLanguage? are promising on this front.}

Good for you. I'm working toward similar goals -- a distributed, persistent language for general-purpose programming (including, perhaps, what is currently called CloudComputing), but fully taking into account corporate political and cultural concerns. I hope to do this as PhD research, but if that is not possible (for purely bureaucratic reasons; I'll find out in two weeks) I'll do it anyway. Within a decade, it will eat the world and you and Top will be begging me for jobs. :-)

Hopefully I'll be retired by then so that I can watch the Titanic from a distance instead of having to be a passenger. -t

{I'm interested in learning more about the bases for your language design, and how you aim to account for corporate, political, and cultural concerns. I have my own fairly extensive accounting for what I consider to be relevant facets of those social concerns (and military concerns, too), but it may be neat to see how you're thinking different (e.g. what you consider to be problems needing partial solution, and what you consider to be the partial solutions). Do you have any blog or reference for your ideas?}

No blog or reference yet. There is my three page research proposal, but it's only been distributed internally at work and it's more sales pitch than information. In two weeks, I'll find out what restrictions, if any, there will be on informally publishing ideas and work-in-progress. However, I foresee a significant basis of my work being ExtendedSetTheory, but I may find reasons (I'm doing research, not advocacy) for that to change.

{Then I guess I'll see in a couple weeks. Good luck!}

It should be pointed out that ODBC does not require the use of SQL. The client can send ANY query language as long as it's text. Thus, a new query language could use ODBC as long as it's represent-able as text and can live with the "table shape" of the result. -t

That's funny, everything I can find indicates that ODBC requires the use of SQL as part of its specification. One could certainly create something ODBC-like that didn't use SQL, but then one wouldn't need to represent anything as text, nor would they need to live with the "table shape" of the result. (Although both would be likely as the first is a language in and of itself, and the results are naturally table shaped).

[ODBC was designed for connecting to SQL DBMSes, but there's essentially nothing that fundamentally precludes using the ODBC API with other table-oriented or relational query languages.]

"ODBC defines a standard SQL grammar. In addition to a standard call-level interface, ODBC defines a standard SQL grammar. This grammar is based on the Open Group SQL CAE specification. Differences between the two grammars are minor and primarily due to the differences between the SQL grammar required by embedded SQL (Open Group) and a CLI (ODBC). There are also some extensions to the grammar to expose commonly available language features not covered by the Open Group grammar.

Applications can submit statements using ODBC or DBMS-specific grammar. If a statement uses ODBC grammar that is different from DBMS-specific grammar, the driver converts it before sending it to the data source. However, such conversions are rare because most DBMSs already use standard SQL grammar."

So, it appears that SQL is a required part, although it's allowed to use a DB specific language in addition to the SQL.

It's still not clear if non-SQL can be sent. As a last resort "trick", I suppose it could send a string constant as part of a dummy expression that contains query-language-specific content. As long as the server knows how to extract that string, it can use it.
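
(Illustration: the last-resort "trick" above, carrying a foreign query as a string constant inside a dummy SQL expression, might be sketched like this. The class name SmuggledQuery and the column alias foreign_query are invented for the example, and it assumes the server recognises exactly this wrapper shape. No ODBC or JDBC calls are involved; this only builds and unpacks the text.)

```java
public class SmuggledQuery {
    // Assumed wrapper shape that client and server agree on in advance.
    private static final String PREFIX = "SELECT '";
    private static final String SUFFIX = "' AS foreign_query";

    // Client side: embed the foreign query as a SQL string literal,
    // doubling single quotes as SQL string literals require.
    public static String wrap(String foreignQuery) {
        return PREFIX + foreignQuery.replace("'", "''") + SUFFIX;
    }

    // Server side: recognise the wrapper and recover the original text.
    public static String unwrap(String sql) {
        boolean shapeOk = sql.length() >= PREFIX.length() + SUFFIX.length()
                && sql.startsWith(PREFIX)
                && sql.endsWith(SUFFIX);
        if (!shapeOk) {
            throw new IllegalArgumentException("not a wrapped foreign query");
        }
        String literal = sql.substring(PREFIX.length(), sql.length() - SUFFIX.length());
        return literal.replace("''", "'");
    }

    public static void main(String[] args) {
        String wrapped = wrap("/customers[name='Ann']");
        System.out.println(wrapped);
        System.out.println(unwrap(wrapped));
    }
}
```

A tool that only speaks SQL would see a syntactically ordinary SELECT, while a server that knows the convention can pull the embedded text back out.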

[Non-SQL can be sent. I've seen ODBC APIs & drivers used with non-SQL applications, though it's not conforming to the standard.]

[Yes, the previous comment is true: You can send any language you'd like using ODBC and JDBC. Relational databases use this to support vendor-specific SQL syntax. But if your driver does not support SQL then most 3rd party tools will fail, making your "support" of ODBC/JDBC questionable. -- JeffGrigg]

It's not ODBC limiting them, it's that they use SQL to communicate. For example, they may have a QueryByExample engine that generates SQL under the hood and send that to a remote DB of your choice via ODBC. After all, SQL is the Lingua Franca of query languages (for good or bad), and thus lots of tools are built around that.

[An ODBC driver is required to accept a particular variant of SQL. If it doesn't, it isn't really an ODBC driver, regardless of how closely it adheres to the rest of the standard. This doesn't prevent an ODBC driver from accepting other languages.]

[To be strictly accurate, the ODBC standard requires that an ODBC driver accept ODBC SQL, but there is no technical requirement to do so. As I noted above, I have seen ODBC infrastructure used to connect to external services via non-SQL languages. These were not intended for use with third-party ODBC tools. It was done mainly to provide a common API for both the SQL-based data sources and the non-SQL-based data sources, and probably because ODBC driver source code already existed that could be relatively easily converted for similar purposes.]