If there are performance problems because of Wine, it is usually because some of the implemented routines are just not fast enough. The only reason Wine can be considered 'slow' is that there is another implementation -- Microsoft's Win32, on Windows -- to compare it to.

The main performance gaps that fall into Wine territory -- the ones the project could not close unilaterally by improving its user-space implementation -- are those where Linux requires multiple system calls to accomplish something similar to one system call in Windows. Otherwise, Wine is "just a library", albeit one that includes its own linking and loading (like binutils), graphical widgets (like GTK), and sound/video abstractions (like SDL). There are also more mundane performance issues in other components in general: for example, video driver routines, which affect any user-space program that uses them, be it Wine or, say, SDL.

However, because the inner loops of most games don't issue system calls (that would be slow and insane) and instead execute plain userspace instructions, this is rarely a problem for the game use case. Usually the culprit is missing or slow routines in DirectX/DirectSound.

In spite of this general argument that there is no technical reason for programs under Wine to be slower, most programs running under Wine are slower than on Windows. Why? The cause is probably more social than technical: the developers of a Windows game are always developing with and against Microsoft's implementations of the same functions: they avoid the slow ones and optimize for the specific behavior of the fast ones. Almost nobody would block a release because the Wine implementation of those functions has performance problems. Also, Microsoft has far more resources to invest in its Win32/Win64 API implementation. The result is that the program runs faster on Windows, and anything that runs fast in Wine is a combination of luck and the good graces of the Wine developers, who must find time to profile and optimize slow implementations amid everything else everyone asks of them (implement more APIs, Win64, fix bugs).

You are correct: you do have to remember these rules, but it is precisely because the type system is primitive that you must. One might simply say that the type system is not expressive: it doesn't tell you which typed expressions cause mutations, might cause dereferences, or whatnot. So, not unlike C, one must memorize the semantics of each type and operator. You, the programmer, try your best to know these facts that are opaque to the compiler and apply them correctly in your head, rather than formalizing them in a way that satisfies a more advanced type system or checker.
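A minimal Go sketch of what I mean (the names are made up for illustration): both functions below have the same shape of signature, and nothing in the types says which one mutates its argument. You just have to know.

    package main

    import "fmt"

    type Account struct{ Balance int }

    // Reads through the pointer; does not mutate. The signature
    // does not (and cannot) say so.
    func report(a *Account) int { return a.Balance }

    // Writes through the pointer. Again, invisible in the types.
    func debit(a *Account, amount int) { a.Balance -= amount }

    func main() {
        a := &Account{Balance: 100}
        fmt.Println(report(a)) // 100
        debit(a, 30)           // mutates a; the signature gave no hint
        fmt.Println(report(a)) // 70
    }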

Because of the simplicity of the type checker, it pretty much has to assume that any kind of dereferencing or mutation can take place, and more powerful transforms for optimizing boxing and the like are hard or impossible to write (many such optimization passes apply, in the small, an optional variant of a more advanced type system, since they have to prove a property of the code under optimization).

The upside of this is that there's minimal baggage, and the caveats can be predicted fairly easily. For example, proving that a non-toy program will not suffer a divide by zero is generally beyond most pedestrian type checkers, and one might imagine that for some programs, failing to prove that this error condition cannot occur renders all other exercises pointless. Another problem is that type checkers of any real power cannot type-check every correct program, and sometimes require massaging that reduces the lucidity of the code.
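To make the divide-by-zero point concrete, a tiny Go example (the function is hypothetical): it type-checks cleanly, and the error only ever surfaces at run time.

    package main

    import "fmt"

    // The type checker makes no attempt to prove b is nonzero;
    // this compiles without complaint.
    func ratio(a, b int) int { return a / b }

    func main() {
        fmt.Println(ratio(10, 2)) // 5
        fmt.Println(ratio(10, 0)) // panic: runtime error: integer divide by zero
    }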

So, the Go approach was basically to do nothing, and eat the predictable caveats, which is not inconsistent with their approach on, say, generics.

It's not a primary source, but is written by a credible author and has an extensive bibliography:

"The Bottom Billion", by Paul Collier

That too much aid, or the wrong kind of aid, can destroy local economies is a phenomenon he seems to support with evidence. For more secondary sources, a Wikipedia article (sub-section "Effectiveness") outlines the common viewpoints.

The Go type system is really simple: arguably simplistic. I think that is good, because it leaves more room for advancement -- better than caving to a complication too soon.

For example, [512]byte and [1024]byte are different types. They really do not subsume one another in any way; typically one talks to them through slices, which all do have the same type, under the deceptively similar notation []byte.
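A quick sketch of the distinction:

    package main

    import "fmt"

    func main() {
        var a [512]byte
        var b [1024]byte

        // a = b // compile error: [1024]byte is not assignable to [512]byte.
        // The length is part of an array's type.

        // Slices over both, however, have the identical type []byte:
        s := a[:]
        s = b[:] // fine: []byte is []byte, whatever the backing array's length
        fmt.Println(len(s)) // 1024
    }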

The one area where people get confused is interfaces, which at first blush look like static type information, but are not -- they are things, or values, just like a pointer is. An interface is a reference that also knows the method set of its referenced value. It just so happens that the pervasive auto-coercion of references into interfaces, and the overloading of the . operator to handle dispatch on pointers, structs, and interfaces, make the difference less apparent.
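For instance (Stringer is defined locally here to keep the sketch self-contained):

    package main

    import "fmt"

    type Stringer interface {
        String() string
    }

    type Point struct{ X, Y int }

    func (p *Point) String() string { return fmt.Sprintf("(%d, %d)", p.X, p.Y) }

    func main() {
        p := &Point{1, 2}

        // This assignment builds a run-time value: an interface
        // holding a reference to p plus the method set of *Point.
        var s Stringer = p
        fmt.Println(s.String()) // dispatch happens through the interface value
    }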

Here's an example people find counter-intuitive at first: an array of pointers to a concrete type X that implements interface Y is not also an array of things that implement interface Y -- not what one would expect coming from, say, Java. To get a similar effect, one must allocate a slice and fill it with interface values (and this is wordy: one nominally writes a loop for each concrete type used to fill such a slice, just like in C), as interface values are run-time entities.
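A minimal sketch of that wordiness (the types are invented for illustration):

    package main

    import "fmt"

    type Animal interface {
        Noise() string
    }

    type Dog struct{}

    func (d *Dog) Noise() string { return "woof" }

    func main() {
        dogs := []*Dog{{}, {}}

        // var animals []Animal = dogs // compile error: []*Dog is not
        //                             // assignable to []Animal

        // Instead, construct the interface values one by one:
        animals := make([]Animal, len(dogs))
        for i, d := range dogs {
            animals[i] = d // each assignment builds an interface value
        }
        fmt.Println(animals[0].Noise()) // woof
    }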

I'm talking about the kind of scale where you stop thinking about vertical scaling and are fully transitioned into horizontal scaling.

Facebook/Google scaling, etc.

I think there is a complex mix of reasons for this, but the biggest one is that MySQL AB shipped pretty dodgy good-enough features earlier and had more early adoption, and as a result familiarity and inertia took care of the rest. They don't really have any advanced multi-MySQL-management machinery (special exemption: MySQL NDB, which is quite interesting). It was masterful market placement. Postgres, by contrast, is a project, not a product, and projects can have a harder time acting strategically -- sometimes for better and sometimes for worse for their users.

For example, Postgres 8.3 was the first release that did not blow through transaction IDs in read-only transactions, which had contributed mightily to constant VACUUMing. That was 2008, when MySQL was also just implementing row-based replication (RBR), the first logical replication system it had with a shot at being sound (yet a Google search for mysql rbr is a very sad place after the first five or so results). By then, MySQL had many years under its belt with an unsound logical-replication solution based on statement replication, which was enough to lure many operations into an operational purgatory: something as simple as an UPDATE with a LIMIT but no ORDER BY causes the standby to drift from the primary in a subtle way -- but hey, it worked well enough until it didn't. However, it was easy enough for simple cases and gave you time to figure out how to avoid the corner cases, something Slony did not do -- Slony is sound (it runs the .org registry, among others), but it is very complicated.

Since then, MySQL releases have grown increasingly infrequent, with no obvious movement toward increased quality -- although I think Oracle Corporation has actually done a pretty good job getting this back on track -- as the company changed hands to Sun and subsequently Oracle.

Meanwhile, at the end of 2010 and two releases after 8.4, Postgres 9.0's streaming replication was released: a very exact "physical" replication approach (with trade-offs, of course). Copying data is now a lot easier, but this was only two years ago: the investment is still bearing fruit.

Although it has yet to be seen, there is now a lot of activity around logical replication for the 9.3 release, which would allow the fine-grained filtering that is also possible with MySQL's RBR implementation. I'm not sure what the odds of its acceptance are -- the project is awfully picky -- but it does seem that now, many years later, this tenacious pickiness is paying off. Release momentum is better than it ever was.

So, all in all, there are specific features that are helpful for horizontal scalability, but they're all pretty half-busted in MySQL, and unfortunately an unbusted version takes a lot longer to flesh out. Postgres is not a company capable of strategically releasing busted stuff and iterating -- culturally, the project would rather not have a busted feature at all. So it didn't.

MySQL was able to do this, and even with a loss in soundness it let your business grow so you'd have more resources to pour into overcoming its shortcomings later. Unfortunately, MySQL's technical debt has mounted, and the social infrastructure has become somewhat awkward because of MySQL fragmentation and the controversy over Oracle's ownership (note: Oracle already owned one of the more sound optional components of MySQL, InnoDB, for many years; now it owns the pretty hairy optimizer and executor too, which need a lot of un-damaging).

Like you, I think table inheritance should be retired, modulo its abuse as a partitioning system.

However, I'd like to rant a bit: there is no plausible clean mirror of OO inheritance, because that idea has failed in my eyes. Multiple inheritance is complex and brittle; single inheritance is restrictive and brittle. Some Java programmers began to speak out against this blight nearly ten years ago, and that thinking has only become more popular, judging by modern Java frameworks and design idioms. That's because bundles of operators are not naturally reused in taxonomies, but as general relations: traits (commonly seen in Scala, a language I find too complicated, though I really like the shape of the traits idea in it) or Golang interfaces just seem like much better ideas.

I think postgresql.org promises relatively little in the documentation (and lists the caveats there), but a lot of people get quite excited about the feature, maybe because it seems unorthodox. The weaknesses are mostly in the cleverness of the planner, but I think there may be some semantic weaknesses too, and it's not clear what the best way to fix them is.

I think another system like this is the CREATE RULE rewrite system, for non-trivial cases like UPDATE/DELETE. In practice rules do work exactly as they say on the tin, but it's generally advised to avoid their proliferation because it's easy to get them quite wrong when considering change over time.

Getting back to table inheritance: in practice, it seems that most seasoned users only recommend it as a workaround for partitioning, which is slated to be one of the areas where the system should get a real treatment (the plans are much better than "Oracle-style" partitioning, IMO).

I think that if someone found a good, non-partitioning set of very convincing use cases for table inheritance as-is, support would improve. Alternatively, a set of convincing use cases that justified adjusting the table-inheritance feature set to make it more advisable would be very interesting. But so far I have not seen an attempt to do this, and there may be a better take on the problem than hierarchical inheritance (I think Golang's interfaces, for example, are an intriguing idea that may serve a similar purpose).

Commonality is not the same as not having a bug (the bug in question was not on obscure hardware). However, I'm not saying this is the cause of your problem: only that the existence of such a class of problem makes it hard to justify chasing down opaque symptoms.

However, it sounds like you have first-hand experience with a specific recurring symptom. Perhaps report it to pgsql-bugs, if you haven't already, with any observations you have from multiple occurrences of the problem. Also, it sounds like you are running pg_resetxlog, which is not to be taken lightly. The part about the locking sounds very suspicious in particular.

Also, consider pg_check, a very handy extension for cross-checking indexes and the heap, as well as the well-formedness of tuples in both. It's a bit rough, but I've found it useful. Be aware that it takes some heavy locks, so I recommend splitting off a physical replica (aka a "point in time recovery") from the live system when poking at it.

Bottom line, PostgreSQL is great when it is working, but you are pretty much on your own when things go wrong. If you are considering using PostgreSQL, you better plan on having someone in your organization becoming an expert in it and you better have backups because fixing even slight corruption in PostgreSQL is frequently impossible.

There is a reason for this: the bugs that cause subtle and infrequent corruption tend to be really, really hard, and are in the realm of likelihood where other obscure bugs -- in, say, your RAID controller firmware (not a made-up case!) -- are the cause. The latter kind of bug can take a very qualified team at a database company months to figure out (after eliminating the database and the kernel first, of course). Such is part of the reason why proprietary database licenses cost so much -- besides buying yachts -- and why proprietary database companies certify on certain hardware.

However, pretty hard bugs are reported and fixed with some regularity, sometimes with coordination from multiple reporters.

So the problem is that unless one wants to put in that kind of dough (it is a community project, after all, although one with exacting quality standards), it is simply cheaper to rely on continuous archives or standbys. This is the economics of the situation for most people, and is probably why you receive this advice.

On further reflection, even if that kind of dough materialized, it's not clear to me that any organization is structured in a way to absorb it efficiently and take on such obscure problems. Maybe EnterpriseDB. The community seems to be growing, though, so I think quality, features, and vendors are probably headed in a positive direction.

For an example of a hard bug that required quite some time to figure out, see #6200 and its connection to #6425. For one that eluded the project for nine years, see http://archives.postgresql.org/pgsql-hackers/2011-11/msg00718.php. That bug would reproduce a handful of times (I'd fix the catalogs carefully by hand, and it was not a major use of my time) across my employer's entire deployment, which is probably one of the world's largest.

While I'm quite happy to see anyone spending their time on user-experience-oriented software for my database management weapon of choice, do you find psql lacking any of those features, other than colors and perhaps better defaults? I have to say... psql is way, way better than the MySQL CLI from what I can see. It may well be the best CLI database tool I have ever used (it helps that proprietary database vendors can't seem to give a care about the user experience of tools that won't sell to management, so those tend to be... terrible. The GUI ones are much better...)

The one feature that I'd like is some use of color.

psql already has support for:

Control-C to interrupt the query

linestyle can set unicode box styles

output-sensitive pagination

in 9.2, output-sensitive expanded mode (\x)

In fact, expanded mode itself is really handy

Online help, which saves tons of time: \h CREATE TABLE

I hope AltSQL will try to do something more wild and crazy, because the existing feature set, portability, and speed of psql will probably make it hard to go up against unless something is drastically better or serves a different use case. It will also be hard to compete with some of the special commands like \h in a flat-out usability slugfest unless one sinks quite a bit of time into Postgres support.

The good news is that really crazy stuff is totally possible without linking against Postgres' source directly, because the Postgres catalogs reflect (actually, more than reflect: they are the implementation) a ton of things about the system... apparently, much more than most database catalogs do. One can troll the psql source and steal its SQL queries wholesale -- all the client-side backslash commands are implemented this way -- or run psql with the -E flag, which echoes the queries those backslash commands generate.

I think using libwine (Wine, but as a library) is totally fine. What is less fine is just slapping libwine into a game without targeting the game against libwine to meet it halfway on bugs, features, or performance issues, or being inflexible about introducing little bits of platform-specific code into the game when necessary. For example, having to see "C:\" in a Linux game is a pretty lousy user experience.

Many libraries are used by just about every game, and to a large extent nobody cares what they are, as long as they are tested and stable. The problem is buggy games (handing blame to libwine is not really acceptable in this scenario, even if libwine has a bug), not libwine intrinsically.

I think use of libwine is a fine way to get portability to multiple platforms.

A comment, not a question: Thank you so much for supporting Linux. I have a Mac laptop and Linux workstations and it's great that I can use both. The only thing that could be better is a synchronization service so my save games would be nicely and transparently copied around.

Between the ages of 12 and roughly 23 I played a lot of games. I am now 28. From 23 onwards I have had much less time to play or even to anticipate good games. These days, I'm not really attracted to games that are familiar but complicated (how many samey dungeon slashers or semi-realistic FPSes does one really want to play?), and I like games that run on non-exotic hardware, are well put together, but are shorter. My time and monetary constraints have changed: purchasing video games represents a not-even-visible part of the overall monetary budget of simple survival, and the time spent both playing and researching them is much more limited.

If not for the Humble Indie Bundle, I would not even come close to buying these games: when I see a Humble Bundle, there's a 50% chance I have not heard of even one game in the lineup, much less be able to name its genre -- and I work as and with a bunch of software engineers in San Francisco, where we play games before they're cool. For me, the Humble Bundle represents both a price point I find agreeable and a curation without which finding games to buy would be totally infeasible. In any Humble Bundle I buy -- which has been nearly all of them -- there will be at least one game I like and get around to playing (and even then I don't always finish). The uncertain return from purchasing a game makes researching and buying single games, even at $10 apiece, an unappealing proposition.

I think that as people who played a lot more games when they were younger get older and take on more obligations, they might tend to become more like me. I don't want to over-generalize, but I think your worries come from the viewpoint of someone who follows games very closely, is pretty sure what they want, and sees the Humble Indie Bundle as selling too low. In my counter-case, I am an example of a customer who would otherwise never exist in the first place, and as once-serious gamers age there might be an increasing number of people like me.

I hate to be "that guy" (as I have to say it whenever Shen is mentioned), but the Shen license is no normal beast:

The license applies to all the software and all derived software and must appear on such.

Alright.

It is permitted for the user to change the software, for the purpose of improving performance, correcting an error, or porting to a new platform, and distribute the modified version of Shen (hereafter the modified version) provided the resulting program conforms in all respects to the Shen standard and is issued under that title. The user must make it clear with his distribution that he/she is the author of the changes and what these changes are and why.

Derived versions of this software in whatever form are subject to the same restrictions. In particular it is not permitted to make derived copies of this software which do not conform to the Shen standard or appear under a different title.

Want to derive a new feature? Well, too bad: you can't distribute it, not even under a different name, even with acknowledgement to the original author and project (it's a derived work). The presumed rationale: if the original spec is any good, fragmentation will not be an issue.

It is permitted to distribute versions of Shen which incorporate libraries, graphics or other facilities which are not part of the Shen standard.

...except fragmentation will still be an issue, because in practice those things matter, and programs using those useful features will be non-portable, except by ad-hoc agreement within the community (and that same standard should be applied to the language itself). It's the most hubristic and, in some ways, legally toxic licensing I could imagine, and it only resembles something rational if the issuer of the copyright were employing its own legion of compiler hackers and library writers to grow the thing, make it useful, and make it perform well -- and you trusted said organization, the way some people accept Oracle's relationship with Java, or more so Microsoft's with .NET.

Rant off. I'm mostly annoyed because I was interested in the predecessor language, Qi, and in this one too, but I'm not basing anything serious on an implementation that is controlled by an inescapable, iron-fisted committee.

I guess one could re-implement the thing, clean-room, and escape that confounded license.

That's a separate problem, and one that's easy to solve...you can bubble up as much information as you want in the exception.

You can also bubble it up using return codes and, in theory, catch every exception and turn it into a return code, but that would not be idiomatic in most languages that have exceptions as their major category of error handling.

That's a separate problem, and one that's easy to solve.

No, it isn't. It's possible, but it's ugly enough that it doesn't get done, because some error-handling cases can involve continuing the rest of the calculation with a substituted value, for example. Both return-code-based and exception-based systems have their ugly cases, so I choose the one with less mechanism: return values are harder to justify removing than exceptions.
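To make that concrete, a hedged Go sketch (the error type is invented): a returned error is a plain value that can carry as much context as you care to pack into it, with no unwinding machinery in between.

    package main

    import "fmt"

    // ParseError is an illustrative error type carrying whatever
    // context a caller might want for recovery.
    type ParseError struct {
        Line, Column int
        Msg          string
    }

    func (e *ParseError) Error() string {
        return fmt.Sprintf("line %d, col %d: %s", e.Line, e.Column, e.Msg)
    }

    func parse(input string) (int, error) {
        if input == "" {
            return 0, &ParseError{Line: 1, Column: 1, Msg: "empty input"}
        }
        return len(input), nil
    }

    func main() {
        n, err := parse("")
        if pe, ok := err.(*ParseError); ok {
            // Recovery happens locally, with full context in hand:
            // substitute a value and continue the calculation.
            fmt.Println("substituting for:", pe)
            n = 0
        }
        fmt.Println(n)
    }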

The "Rust" language by Mozilla also doesn't support try/catch exceptions, and likely will continue not to. I asked them this precise question, and this was the response, which I felt was very well-informed:

In practice, I do see a clear use for both "major" faults (where just about everything can abort) and "minor" faults (where pushed-down recovery selections give the best result). It's a somewhat unsatisfying dichotomy (where's the in-between?), yet it maps well to the code I've written most of the time.

I have found that dealing with errors locally is generally the only way to get a satisfying solution, because by the time the stack has unwound I've lost the information useful for doing the recovery. That's why I carve out a little exception (no pun intended) for Common Lisp conditions, which let you define recovery at low levels while a higher-level function chooses whether to use it.

Many times when dealing with standard try/catch/finally-class languages I'm left frustrated, because I have to push error-recovery code down in some way anyway, so why bother with a special flow-control construct that doesn't satisfyingly get the job done? In Go the problem is not solved; rather, it is left basically untreated, falling back to return codes, which do not act at a distance. On balance, a win for me.
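Here's roughly how that dichotomy falls out in Go, as I tend to write it (a sketch, not gospel): minor faults come back as ordinary return values and get handled at the call site; major faults panic and abort nearly everything.

    package main

    import "fmt"

    // Minor fault: recovery is pushed down to the call site.
    func lookup(m map[string]int, k string) (int, bool) {
        v, ok := m[k]
        return v, ok
    }

    func main() {
        // Major fault: just about everything aborts. We recover
        // here only to report it before dying.
        defer func() {
            if r := recover(); r != nil {
                fmt.Println("aborting:", r)
            }
        }()

        m := map[string]int{"a": 1}
        v, ok := lookup(m, "b")
        if !ok {
            v = -1 // local recovery: substitute a value and continue
        }
        fmt.Println(v)

        panic("unrecoverable state") // the major-fault path
    }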

Okay, good to know. I just wanted to understand the style of caching it uses. I know these trade-offs are useful in practice, but they can be problematic if one doesn't take note of the hazards that are almost always necessary to get improved performance. (I do wish caching strategies would quickly enumerate their hazards versus a direct database connection, so one could bin the solution quickly.)

I hadn't even thought about the data-modification issue, but you are correct: that'd be surprising under the same statement schedule. However, avoiding the cache in transactions still has a corner case that is probably more acceptable, but not strictly the same as a real backend connection:

Process one issues query 'a'

Process two runs BEGIN, and then issues query 'b'

Process two issues query 'a', it is not fetched from the cache

There are different surprises if 'a' updates the cache here, or not

Process one's request for 'a' finishes, updating the cache

Process two COMMITs or ROLLBACKs

Process two issues query 'a' again

The problem here is that, nominally, the following SQL-ish program should, in all situations, see the modified data when issuing QUERYA the second time:

BEGIN;
SELECT QUERYB;
SELECT QUERYA;
COMMIT;
SELECT QUERYA;

or should never see any modification if COMMIT is replaced by ROLLBACK. That makes caching the result of QUERYA within the transaction problematic, because at any later point in the transaction there can be an error or an explicit ROLLBACK; and if it's not cached, a stale cached copy would not act quite like the real thing. This is fixable if one implements database-like MVCC (but that occupies a lot of buffers...) or System R-style locking (low concurrency) to put aside the new cached values and only make them visible upon successful COMMIT. I think another solution would be to attach snapshot information (which is generally pretty dense, just a few numbers) to query results, so numerical comparisons could provide a view of the data without backwards-timewarp caveats; but this may hurt the cache hit rate too much, since the snapshot is global. Sigh... if only the snapshot were confined to a partition.
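For what it's worth, a sketch in Go of the simplest defensible policy (all the types here are hypothetical): route around the cache entirely while a transaction is open, and never publish results computed inside one.

    package querycache

    import (
        "database/sql"
        "sync"
    )

    // querier is satisfied by both *sql.DB and *sql.Tx.
    type querier interface {
        Query(query string, args ...interface{}) (*sql.Rows, error)
    }

    type cache struct {
        mu      sync.Mutex
        results map[string][]byte // query text -> serialized result
    }

    type session struct {
        db    *sql.DB
        tx    *sql.Tx // non-nil while a transaction is open
        cache *cache
    }

    func (s *session) query(q string) ([]byte, error) {
        if s.tx != nil {
            // In a transaction: go straight to the database, and
            // don't publish the result (it may be rolled back).
            return run(s.tx, q)
        }
        s.cache.mu.Lock()
        r, ok := s.cache.results[q]
        s.cache.mu.Unlock()
        if ok {
            return r, nil
        }
        r, err := run(s.db, q)
        if err == nil {
            s.cache.mu.Lock()
            s.cache.results[q] = r
            s.cache.mu.Unlock()
        }
        return r, err
    }

    // run executes the query and serializes its rows; elided here.
    func run(db querier, q string) ([]byte, error) { return nil, nil }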

Sort-of-related: synchronized scans can help if you're doing sequential scans, which is not what a lot of OLTP systems do. I have seen, however, some people who are probably only getting by because they keep attaching queries to a probably perpetually-running synchronized scan ring. I wouldn't call the performance "preferred", but for data sets that are not large, yet not trivial relative to memory, it was saving them from thrashing.

Synchronized scans also wouldn't help with CPU-heavy operations, so there would still be an advantage to caching, say, complex PostGIS query results.

Hmm. Do you think you have a lot of duplicate queries to merge? It looks to me like it's somewhat smarter than pgbouncer, and somewhat dumber than pgpool (which is also smart enough to be just a bit scary in its caveats).

I'll give you a corner case for query caching right now. I think it's solvable:

Process one issues a query 'a'

Process two issues query 'b', it completes

Process two COMMITs, it completes

Process two issues query 'a', blocking because that query is already in flight (it will be merged with process one's)

Query 'a' completes, notifying process one and two

Process two receives results for an old snapshot of 'a', because the query runs under the snapshot process one had when it started query 'a'

This could be fixed by ensuring that any cached result served to a process was computed under a snapshot at least as new as the last one that process observed, so that simpler inductions/temporal logic can be applied by the programmer. It could also vastly reduce the cache hit rate.
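A sketch of that fix in Go (the snapshot numbering is hypothetical, standing in for whatever dense snapshot identifier the database exposes): tag each cache entry with the snapshot it was computed under, and refuse to serve an entry older than the newest snapshot the session has already observed.

    package snapcache

    import "sync"

    // entry is a cached result tagged with the snapshot it was
    // computed under (modeled as a globally ordered counter).
    type entry struct {
        snapshot uint64
        result   []byte
    }

    type Cache struct {
        mu      sync.Mutex
        entries map[string]entry
    }

    // Get serves a hit only if the entry is at least as new as the
    // newest snapshot this session has observed; otherwise it is a
    // miss, forcing a fresh execution. No backwards time travel, at
    // the cost of hit rate.
    func (c *Cache) Get(query string, sessionLastSeen uint64) ([]byte, bool) {
        c.mu.Lock()
        defer c.mu.Unlock()
        e, ok := c.entries[query]
        if !ok || e.snapshot < sessionLastSeen {
            return nil, false
        }
        return e.result, true
    }

    // Put records a fresh result and the snapshot it ran under.
    func (c *Cache) Put(query string, snapshot uint64, result []byte) {
        c.mu.Lock()
        defer c.mu.Unlock()
        if c.entries == nil {
            c.entries = make(map[string]entry)
        }
        c.entries[query] = entry{snapshot: snapshot, result: result}
    }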

The source code is interesting because it implements a pretty small SQL parser and has its own small overlay catalog. Thus it can probably be adapted to many systems, although it lacks the type analysis and executor of a full-blown implementation. It also means that quirks or more powerful SQL constructs may need to be hacked into the grammar.

It depends on the not-SQL system. Some have ordering semantics, others do not. Those that adopt partial-ordering semantics can break that barrier. Even those that do not may find themselves bottlenecked on a system call on their platform of choice. Still others give you the option of some ordering if one writes to a quorum, but blocking on writes to a full quorum is not speedy at all...

Many NoSQL systems are still basically monolithic in nature, because a write() followed by a read() that returns results from the past is just not a very convenient behavior, and it requires a lot of error handling. Even without transactions, logic against data that does not run in the same direction as time is more complicated than logic against data guaranteed to appear in the order of time's arrow with respect to some observer, such as the database.

I think in the long term that relational databases will be taught ways to apply their isolation model over a tiny subset of the data, thus allowing the free flow of partitions in a cluster.