Adobe Photoshop already has a very useful history feature that allows branching. It's quite a nice tool, but when you enable branching, it's much more difficult to use. I don't see how you could make branching simple enough to make users want to use it on a regular basis.

On an unrelated note, isn't Apple's new Time Machine the first step towards what a modern undo should be (assuming it will be as painless to use as advertised)?

Yes, absolutely. What we really need is automated version control running on the local system. Slap a decent GUI on it so people can browse previous versions of the file (if needed) and it should be good to go. Just hijack the save dialog to also commit a change to the versioning system, and give people the option of also putting the undo stack in the version control.

EditPad has had an intuitive, visual, unlimited undo function for years. It undoes and redoes right past saves (unlike MS Office, for crying out loud). Granted, it doesn't do branched undo/redo trees, which would certainly come in handy at times.

Also, I think Office has something like "Track Changes", which isn't an undo but turns a doc into something like a wiki, giving you something like an infinite undo. This feature has gotten some people in trouble because data that was supposed to be "buried" could actually be recovered and reveal some incriminating things.

So, if an infinite undo is to avoid this problem, any app with a persistent (especially infinite) undo would have to either use some sort of impenetrable encoding (such that even people who have the app can't reverse engineer it) or store the undo history in a separate file (hardly a graceful solution; I hate when emacs, etc. does this). I like the idea, but there are certainly issues involved.

From a technical standpoint, does anybody know how "undo" is implemented in these text editors (or in Wikipedia)? I can see that, if you get the implementation right, it wouldn't be too hard to capture, but I'm wondering what the most graceful implementation has been.

Undo isn't too difficult if you are only trying to support rollback of changes made to text buffers. It gets a lot more hairy once you are dealing with arbitrarily complicated graphs of objects. Part of the problem is delimiting what changes instigated by the user should be recorded into the undo record and which ones shouldn't. This requires carefully partitioning the application state.

Once you have thought long and hard about state partitioning you still need to implement the actual transactional memory. In one big project I worked on we used the existing and very powerful serialization framework. You delimit transactions with begin()/end() calls, and before making any change to an object that should be recorded into the undo log you need to call dirty() on that object. The first time dirty() is called on a given object in a transaction it has the effect of serializing that object into the undo log. This lets us later deserialize the object to restore it to its original state. The system in question had a powerful garbage collector which we used for handling the kind of case where you have an existing object A that grows a pointer to a newly created object B whose creation is undoable. When undoing the creation of B the garbage collector was used to NULL out any pointers to B.
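To make the begin()/end()/dirty() idea concrete, here is a minimal sketch in Python. It is not the code from that project: UndoLog and Actor are made-up names, copy.deepcopy stands in for the serialization framework, and the garbage-collector trick for undoing object creation is left out entirely.

```python
import copy


class UndoLog:
    """Hypothetical undo log: snapshot an object the first time it is dirtied."""

    def __init__(self):
        self.transactions = []  # completed transactions, oldest first
        self.current = None     # id(obj) -> (obj, saved attributes) while a transaction is open

    def begin(self):
        self.current = {}

    def dirty(self, obj):
        # Only the first dirty() per object per transaction records a snapshot.
        # deepcopy is a stand-in for "serialize into the undo log".
        if id(obj) not in self.current:
            self.current[id(obj)] = (obj, copy.deepcopy(obj.__dict__))

    def end(self):
        self.transactions.append(self.current)
        self.current = None

    def undo(self):
        # Restore every object touched by the most recent transaction.
        for obj, saved in self.transactions.pop().values():
            obj.__dict__.clear()
            obj.__dict__.update(saved)


class Actor:
    """Illustrative application object."""

    def __init__(self, name, x):
        self.name = name
        self.x = x


log = UndoLog()
actor = Actor("crate", 0)

log.begin()
log.dirty(actor)   # snapshot taken here (x == 0)
actor.x = 10
log.dirty(actor)   # already recorded in this transaction, no second snapshot
actor.x = 20
log.end()

log.undo()         # actor.x is 0 again
```

Note that nothing forces you to call dirty() before mutating an object; forget it once and that change silently escapes the undo log, which is a large part of why this stays hard in practice.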

Even with this kind of general purpose transaction framework, undo is still a mean bitch.

90% of undo/redo is text-based in a text editor. The rest of my post was about undo/redo in a more general, more challenging context.

The whole point is that you don't want to take global snapshots. The application I was referring to is the suite of editing tools for Unreal Engine 3, an engine for making modern video games. There is an extremely complex and rich object model with dozens of tools that can be opened at any given time. Each must support undo/redo within its own context. This is feasible as long as the objects they touch don't overlap in a conflicting way. It's a pretty hard problem in general but a transactional memory can do a lot of the legwork. What Taladar talks about is a transactional memory.

1) (more important) version control. Allowing people to version their text without needing a global snapshot. From an analytic perspective, maximum storage efficiency seems to involve storing the diffs, but it's tough to say if the space saved (HD space is cheap) would be more cost effective than the cycles involved in reconstructing documents.
2) some form of undo. I imagine I can do something similar to text editors and keep a buffer file server side.

Anyway, my thing is that I haven't worked with these types of applications and these sorts of features, so I have little conception of how these things are typically implemented.

From a "get this shit working as fast as possible" perspective I think it's important you don't get bogged down by premature optimization. It's easy to isolate the implementation details behind an interface. In your initial version don't worry about diffing or compressing the text files; just store every revision of a file as a complete snapshot. It might never become enough of an issue. If it does you have plenty of options available to you: diffing (which will probably require regular complete snapshots anyway to prevent having to compose all diffs starting at revision 0), gzipping, etc.

In a system like this, undoing/reverting is just a matter of copying the contents of a previous revision into the current revision.
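As a rough illustration of the snapshot-per-revision approach, something like the following would do for a first version. SnapshotStore and its method names are invented for this sketch, and it keeps everything in memory where a real server would write to disk or a database.

```python
class SnapshotStore:
    """Hypothetical revision store: every commit keeps a complete copy of the text."""

    def __init__(self):
        self._revisions = {}  # file name -> list of full-text snapshots, oldest first

    def commit(self, name, text):
        """Store a new revision and return its revision number."""
        revisions = self._revisions.setdefault(name, [])
        revisions.append(text)
        return len(revisions) - 1

    def get(self, name, revision=-1):
        """Return the text of a revision (the latest one by default)."""
        return self._revisions[name][revision]

    def revert(self, name, revision):
        # Reverting copies the old contents into a brand new revision,
        # so the revisions in between are still there afterwards.
        return self.commit(name, self.get(name, revision))


store = SnapshotStore()
store.commit("notes.txt", "first draft")
store.commit("notes.txt", "second draft")
store.revert("notes.txt", 0)   # latest revision now reads "first draft" again
```

Because callers only see commit/get/revert, swapping the list of snapshots for diffs or gzipped blobs later would not change any code outside the class.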

In the simple case of a text editor you just create a data structure for any editing operation the user can do. So you might have

add_text(startpos, text)
delete(from, to)

You then keep a stack of all editing operations performed. When the user does an undo, you pop the top item from this stack, undo the changes it represents from the document and push it onto the redo stack. When the user does a redo, you pop the redo stack, execute the changes represented by that command and push it onto the other stack.
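Here is roughly what that looks like in Python, as a sketch rather than how any particular editor does it. The class names are made up, delete's "from"/"to" are renamed start/end because "from" is a reserved word in Python, and clearing the redo stack on a fresh edit is my assumption about the usual behaviour.

```python
from dataclasses import dataclass


@dataclass
class AddText:
    startpos: int
    text: str

    def apply(self, doc):
        return doc[:self.startpos] + self.text + doc[self.startpos:]

    def revert(self, doc):
        return doc[:self.startpos] + doc[self.startpos + len(self.text):]


@dataclass
class Delete:
    start: int
    end: int
    removed: str = ""  # filled in by apply() so revert() can put the text back

    def apply(self, doc):
        self.removed = doc[self.start:self.end]
        return doc[:self.start] + doc[self.end:]

    def revert(self, doc):
        return doc[:self.start] + self.removed + doc[self.start:]


class Editor:
    def __init__(self, text=""):
        self.text = text
        self.undo_stack = []
        self.redo_stack = []

    def do(self, op):
        self.text = op.apply(self.text)
        self.undo_stack.append(op)
        self.redo_stack.clear()  # assumption: a fresh edit discards the redo stack

    def undo(self):
        if self.undo_stack:
            op = self.undo_stack.pop()
            self.text = op.revert(self.text)
            self.redo_stack.append(op)

    def redo(self):
        if self.redo_stack:
            op = self.redo_stack.pop()
            self.text = op.apply(self.text)
            self.undo_stack.append(op)


ed = Editor()
ed.do(AddText(0, "hello world"))
ed.do(Delete(5, 11))   # text is now "hello"
ed.undo()              # text is "hello world" again
```

The important detail is that a delete has to remember what it removed; an operation that cannot be reverted from its own data does not belong on the stack.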

Of course if you want to have branching then you need a tree of editing operations instead of a stack, but the principle is the same.
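A bare-bones sketch of the tree variant, with hypothetical names throughout. To keep it short, each node stores the whole document text instead of an editing operation; a real editor would store operations as described above.

```python
class Node:
    def __init__(self, text, parent=None):
        self.text = text      # full document text at this node (simplification)
        self.parent = parent
        self.children = []    # one child per branch started from this state


class UndoTree:
    def __init__(self, text=""):
        self.current = Node(text)

    def edit(self, new_text):
        # A new edit always adds a child, so older branches are kept.
        node = Node(new_text, self.current)
        self.current.children.append(node)
        self.current = node

    def undo(self):
        if self.current.parent is not None:
            self.current = self.current.parent

    def redo(self, branch=-1):
        # By default follow the most recently created branch.
        if self.current.children:
            self.current = self.current.children[branch]


tree = UndoTree("a")
tree.edit("ab")
tree.edit("abc")
tree.undo()        # back at "ab"
tree.edit("abX")   # starts a second branch from "ab"
tree.undo()        # back at "ab", which now has two children
tree.redo(0)       # pick the older branch: "abc"
```

The data structure is the easy part. The redo(branch) argument is where the usability problem mentioned at the top of the thread shows up, since the user now has to choose which future to go back to.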

Rather than "intelligent undo" I'd rather have a "checkpoint version" feature that enabled me to save specific versions of a file while working on it, then discard them when I did a "final save" or some such.

Checkpointing "accomplishments" corresponds more to how I create work as opposed to intelligent undo's emphasis on wholesale retention of intermediate steps, some of which I don't necessarily want to see again…

“Checkpoints” are actually supported in the e text editor. You can “Commit” the document, which equates to saying “this batch of changes is a single change”. This creates a milestone of the document's current state to which you can add comments and labels. All milestones can be viewed on a nice timeline.

But checkpoints don’t help you in the time between committing milestones; that is why we also want an undo that ensures that users never lose any parts of their work.

It seems like the visual aspect of undoing is unneeded for the most part. The key feature is keeping users from losing a huge part of their undo history when they undo a bunch, do something, and then find themselves unable to go "forward" in time anymore. The solution should be pretty clear, though. If I do 1 2 3 4, undo twice to get back to just 1 2, add a 5 (resulting in 1 2 5), and then undo again, it should first go back to 1 2 (as everyone does now). But if I undo again, it should go back to 1 2 3 4 instead of back to 1 (as most do now). To get back to 1, I should have to go all the way through 2 3 4 again. The point is that I can still get any part of my history back later through enough undo-ery: all new changes are put on the end of a stack and nothing is ever popped from it. Only the position in the stack moves during a series of undo commands, and it goes back to the top when something new is done.
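One way to read that scheme, sketched in Python with made-up names. It stores whole states for brevity. Re-recording the state you are on when you edit after some undos is my own assumption; without it the first undo after adding 5 would not land on 1 2.

```python
class LinearHistory:
    """Hypothetical append-only history: states are only ever added, never popped."""

    def __init__(self, initial):
        self.states = [initial]
        self.pos = 0  # index of the state currently shown

    @property
    def current(self):
        return self.states[self.pos]

    def edit(self, new_state):
        if self.pos != len(self.states) - 1:
            # Editing after some undos: re-record the state we are on, so the
            # undone states stay reachable by undoing further back.
            self.states.append(self.current)
        self.states.append(new_state)
        self.pos = len(self.states) - 1  # position goes back to the top

    def undo(self):
        # Only the position moves; the list itself is left alone.
        if self.pos > 0:
            self.pos -= 1
        return self.current


h = LinearHistory("")
for state in ["1", "1 2", "1 2 3", "1 2 3 4"]:
    h.edit(state)
h.undo()
h.undo()               # back at "1 2"
h.edit("1 2 5")
trail = [h.undo() for _ in range(5)]
# trail == ["1 2", "1 2 3 4", "1 2 3", "1 2", "1"]
```

The cost is that the history only ever grows, and getting far back can take a lot of undo presses.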