The Essence of Google Dart: Building Applications, Snapshots, Isolates

With thousands of programming languages floating around, why is Google introducing Google Dart? What can it possibly add? The short answer: the Google Dart team wanted a language well suited to modern application development, both on the server and the (mobile) client.

Some of Dart's features address problems that languages like Java or Javascript have long had. Dart's Snapshots resemble Smalltalk images, allowing (nearly) instant application startup without some of the problems images bring. Isolates keep code single threaded and provide shared-nothing, message passing concurrency like Javascript's WebWorkers or Erlang's processes. Some language features and decisions allow for scalable, modular development. Dart can be compiled to plain Javascript with the DartC compiler or executed on the Dart VM.

InfoQ takes a look at the most interesting aspects of Dart for application development, with a focus on the Dart VM and some of the notable language features.

Dart is an Application Language: Snapshots And Initialization

Is an application's startup time really relevant? How many times a day do users restart their IDE or word processor? With the rise of memory constrained mobile devices, application startup happens a lot; the Out Of Memory (OOM) killer process is very trigger happy on mobile OSes and will kill suspended applications without hesitation. iOS's multitasking model and the prominent physical home button have also shortened the average life span of mobile apps. Before iOS 4, pressing the Home button always killed the running application; with iOS 4 the situation has become a bit more complicated, but applications still have to be prepared to die at any time, be it at the hand of the user or the OOM process.
This behavior won't stay on mobile OSes. "Sudden Termination" and "Automatic Termination" are application properties introduced in recent OS X versions that declare an application can handle being killed at any point (eg. when the available memory is low) and then restarted, all transparent to the user.

Slow startup has plagued Java GUI applications since Java 1.0. Booting up a large Java application is a huge amount of work: thousands of classes need to be read, parsed, loaded and linked; before Java 1.6, that process included generating the stack map of methods for bytecode verification. And once classes are loaded, they still need to be initialized, which includes running static initializers.

That's a whole lot of work for a modern Java GUI application - just to show an initial GUI. The introduction of a SplashScreen API in Java 6 shows that the problem hasn't been solved, and that it's affecting developers and their users.

Dart addresses this with the heap snapshot feature, which is similar to Smalltalk's image system. An application's heap is walked and all objects are written to a file. At the moment, the Dart distribution ships with a tool that fires up a Dart VM, loads an application's code, and just before calling main, it takes a snapshot of the heap. The Dart VM can use such a snapshot file to quickly load an application.
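The idea of walking a heap and writing it to a file can be sketched with a rough analog in Python. This is not Dart code and says nothing about the Dart VM's actual snapshot format; it uses Python's standard pickle module, and the names build_application_state, take_snapshot and load_snapshot are hypothetical, chosen only for illustration.

```python
import pickle


def build_application_state():
    """Stand-in for expensive startup work (hypothetical example data)."""
    settings = {"theme": "dark", "recent_files": ["a.txt", "b.txt"]}
    window = {"title": "Editor", "settings": settings}
    # A back reference creates a cycle, as real heaps often contain.
    settings["owner"] = window
    return window


def take_snapshot(root):
    """Walk everything reachable from root and serialize it to bytes."""
    return pickle.dumps(root)


def load_snapshot(data):
    """Rebuild the object graph directly, without re-running the startup code."""
    return pickle.loads(data)


state = build_application_state()
snapshot = take_snapshot(state)
restored = load_snapshot(snapshot)

print(restored["title"])                          # Editor
print(restored["settings"]["owner"] is restored)  # True
```

The restored graph is a separate copy with the same shape, including the cycle, which is the property that lets a snapshot stand in for re-executing initialization code.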

The snapshot facility is also used to serialize object graphs sent between Isolates in the Dart VM.

In the initial tech preview of Dart, there doesn't seem to be a Dart language API for initiating a snapshot, although there shouldn't be a fundamental reason for that.

Technical Details of Snapshots

The Dart team put a lot of effort into the snapshot format. First off, it can be moved between machines, whether they're 32 bit, 64 bit or something else. The format is also designed to be quick to read into memory, with a focus on minimizing extra work like pointer fixups.
For details see runtime/vm/snapshot.cc, and runtime/vm/snapshot_test.cc for some uses of the Snapshot system, ie. writing out full snapshots, reading them back in, starting Isolates from snapshot, etc.

Snapshots vs Smalltalk Images

Smalltalk's images are not universally popular; Gilad Bracha wrote about the problems of Smalltalk images in practice. Smalltalk development usually takes place in an image which is then stripped of unused code and frozen for deployment. Dart's snapshots are different because they're optional and need to be generated by loading up an application and then taking a snapshot. Dart's lack of dynamic code evaluation and other code loading features can allow the stripping process to be more thorough.

Dart's snapshots aren't currently supported in code compiled to Javascript with DartC.

Currently Snapshots are used in message passing between Isolates; objects sent across in messages are serialized using SnapshotWriter and read in on the other side.

In any case, the snapshot facility is in the Dart VM and tools, and as with many other features of Dart, it's up to the community to come up with uses for it.

Finally, a snapshot feature is already present in Google's V8, where it's used to improve startup speed by loading the Javascript standard library from a snapshot.

Initialization

Even without snapshots, Dart has been designed to avoid initialization at startup if possible. Classes are declarative, ie. no code is executed to create them. Libraries can define final top level elements, ie. functions and variables outside of a class, but they must be compile time constants (see section 10.1 in the language spec).

Compare this to static initializers in Java or languages that rely on various metaprogramming methods at startup to generate data structures, object systems or the like. Dart is optimized for applications that start up quickly.
Dart doesn't come with a Reflection mechanism at the moment, although one based on Mirrors (PDF) is supposed to come to the language in the near future, possibly with the ability to construct code using an API and load it in a new Isolate, bringing metaprogramming to Dart.

The Units of Concurrency, Security and the Application: Isolates

Concurrency

The basic unit of concurrency in Dart is the Isolate. Every Isolate is single threaded. In order to do work in the background or use multiple cores or CPUs, it's necessary to launch a new Isolate.
Google V8 has also recently gained Isolates, although it's a feature mostly interesting for embedders of V8 and for implementing cheaper Web Workers by launching them in the same OS process; the feature is not exposed to Javascript code.
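The single threaded, message passing model of Isolates can be sketched in Python. This is only an analog, not the Dart Isolate API: a thread stands in for the Isolate, queues stand in for ports, and every message is serialized so the receiver gets a copy and nothing is shared. All names (isolate_main, spawn_isolate, send, receive) are hypothetical.

```python
import pickle
import queue
import threading


def send(port, message):
    """Serialize the message so the receiver gets a copy, never a shared object."""
    port.put(pickle.dumps(message))


def receive(port):
    """Block until a message arrives and deserialize it."""
    return pickle.loads(port.get(timeout=5))


def isolate_main(inbox, outbox):
    """Worker loop: touches only data that arrived through its inbox."""
    while True:
        message = receive(inbox)
        if message is None:           # sentinel: shut down
            break
        numbers = message["numbers"]
        numbers.append(sum(numbers))  # mutates the worker's private copy only
        send(outbox, {"result": numbers})


def spawn_isolate():
    """Start a worker and return it with the two 'ports' used to talk to it."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=isolate_main, args=(inbox, outbox))
    worker.start()
    return worker, inbox, outbox


worker, to_worker, from_worker = spawn_isolate()
original = {"numbers": [1, 2, 3]}
send(to_worker, original)
reply = receive(from_worker)
send(to_worker, None)                 # ask the worker to stop
worker.join()

print(reply)     # {'result': [1, 2, 3, 6]}
print(original)  # {'numbers': [1, 2, 3]}
```

The sender's data is unchanged after the exchange because the worker only ever saw a copy; that is the shared-nothing property in miniature.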

The model of having multiple independent Isolates for concurrency is similar to Javascript or Erlang. Node.js also needs to use processes to make use of more than one CPU or core; a host of solutions for managing Node.js processes has popped up.

Other single or green threaded languages have similar process herding solutions. Ruby's Phusion Passenger is an example which also tried to fix the overhead problem when loading the same code in multiple processes: Phusion Passenger loads up a Rails application and then uses the OS' fork call to quickly create multiple processes with the same program contents, thus avoiding parsing and initializing the same application many times over. Dart's snapshot feature would be another way to solve the problem.

Reliability

The first tech preview of Dart uses one thread per Isolate, although other modes are being considered, ie. multiplexing multiple Isolates onto one thread or having Isolates run in different OS processes, which would also allow them to run on different machines.
Splitting up an application into independent processes (or Isolates) helps with reliability: if one Isolate crashes, other Isolates are unaffected and a clean restart of the Isolate is possible. Erlang's supervision trees fit this model well: they make it possible to monitor the life and death of groups of processes and to write custom policies to handle their death.
This interview with the creators of Akka and Erjang gives a good overview of the advantages of Erlang's model.

Security

Untrusted code can be run in its own Isolate. All communication with it must take place over message passing, which will be enhanced with a capability-style mechanism that makes it possible to restrict who can talk to which ports in Isolates. An Isolate must be given a port to send messages to; without one, it can't do anything.

Compartmentalization of Memory

Another benefit of splitting an application into Isolates: each Isolate's heap is standalone; all objects in it clearly belong only to it, unlike the objects in a shared memory environment. The key benefit: if an Isolate was launched for a task and it's done - the whole Isolate can be deallocated in one go; no GC run necessary.

What's more: if an application is split into Isolates, that means the application's used memory is split into smaller heaps, ie. smaller than the total amount of memory the application uses. Each heap is governed by its own GC with the effect that a full GC run in one Isolate only stops the world in that Isolate, the other Isolates won't notice. Good news for GUI apps as well as server applications that are sensitive to GC pauses: time sensitive components are unaffected by one or a few messy, garbage spewing Isolates that will keep the GC busy. Hence, having one heap per isolate improves modularity: each Isolate controls its own GC pause behavior and is not affected by some other component.

While GCs in Java and .NET have been improving a lot, GC pauses are still an important issue for GUI applications and time sensitive server applications. Solutions like Azul's GC have managed to make pauses manageable or even nearly disappear, but they need either special hardware or access to low level OS infrastructure, as in their x86-based Zing. Realtime GCs do exist, but they also slow down execution in exchange for predictable pauses.
Splitting up the memory into separate heaps means that GC implementations can be simple yet still be fast enough. Of course, it all depends on the developer - to benefit from these characteristics, an application must be split into multiple Isolates.

No more Dependency Injection Ceremony: Interfaces and Factories in Dart

"Program To Interface" is common advice, but in practice it gets a bit harder, as someone has to call new with an actual class name. In the Java world this issue has led to the creation of Dependency Injection (DI) frameworks. Adopting a DI framework first means injecting a dependency on the specific DI framework into a project.
What problem does DI solve? Calling new on a specific class hardcodes the class, creating problems for testing and for the general flexibility of the code. After all, if all code is written to an interface, the specific implementation shouldn't matter and someone should choose the right implementation for a use case.

Dart now ships with one DI solution, making it unnecessary to choose from a host of different options. It does so in the language by linking an interface to code that can instantiate an object for it. All flexibility that's required can be hidden in that Factory, whether it's deciding which class to instantiate or whether to allocate a new object at all rather than just return a cached object.
The interface refers to a factory by name, which can be provided by a library; different implementations of a factory can live in their own libraries and it's up to the developer to include the best implementation.
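The relationship between an interface and its factory can be sketched in Python. This is an analog only and does not show Dart's own interface and factory syntax; the names Logger, MemoryLogger and CachingLoggerFactory, as well as the _factory hook, are hypothetical.

```python
from abc import ABC, abstractmethod


class Logger(ABC):
    """The 'interface' callers program against."""

    # Hypothetical hook: the factory that backs Logger.create().
    _factory = None

    @abstractmethod
    def log(self, message):
        """Record one message."""

    @classmethod
    def create(cls):
        """Callers ask the interface for an instance, never a concrete class."""
        return cls._factory()


class MemoryLogger(Logger):
    """One possible implementation; callers never name it directly."""

    def __init__(self):
        self.lines = []

    def log(self, message):
        self.lines.append(message)


class CachingLoggerFactory:
    """Decides which class to instantiate, and whether to allocate at all."""

    def __init__(self):
        self._cached = None

    def __call__(self):
        if self._cached is None:
            self._cached = MemoryLogger()
        return self._cached


# Wiring the factory to the interface happens in one place.
Logger._factory = CachingLoggerFactory()

first = Logger.create()
second = Logger.create()
first.log("started")

print(first is second)  # True
print(second.lines)     # ['started']
```

Swapping in a different factory, for example one that returns a test double, changes what every caller receives without touching the call sites.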

Language

Google Dart is a new language, but it's designed to look familiar to many developers. The language resembles curly braced languages, and comes with OOP that focuses on interfaces. Dart's OOP system does come with classes, unlike some other recent languages like Clojure (which does OOP with a combo of protocols and types) or Google Go (Go has interfaces, but no classes). One advantage of having one OOP system built into the language is not getting a new OOP system and paradigm with every other used library, like in Javascript.
For details see the official Dart language specification, or for a quick overview see 'Idiomatic Dart' on the Dart website.

Modularity

Namespacing in Dart is done with the library mechanism, which is different to Java where the class names are the only way to namespace things like methods or variables. One consequence: libraries in Dart can contain top level elements other than classes, ie. variables and functions outside of classes.

The print function is one example of a class-less top level function. The library system also provides a solution to avoid name clashes: library A can import another library B, and to avoid names from A and B clashing, all names imported from B can be prefixed. For example, #import("foo.dart", "foo") will import the library and make all its elements available with the prefix "foo.".

Optional Typing

The key word in "Optional Typing" is "Optional". The developer can add type annotations to the code, but these annotations have no impact at all on the behavior of the code. As a matter of fact, it's possible to specify nonsensical types - the code will still execute fine.

Having the types in the code allows for the various type checkers to do their work. The editor shipped with Dart has a type checker and can highlight type errors as warnings. Dart also comes with a checked mode in which the type annotations are used to check the code and violations will be reported as warnings or errors.
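Python's type annotations behave in a comparable way and make a convenient illustration: they are ignored when the code runs unless something checks them explicitly. The sketch below is Python, not Dart, and checked_add is a hypothetical, hand-rolled stand-in for the idea of a checked mode rather than a description of how the Dart VM implements it.

```python
def add(a: int, b: int) -> int:
    """Annotated for ints, but the annotations are never enforced at run time."""
    return a + b


def checked_add(a: int, b: int) -> int:
    """A hand-rolled 'checked mode': verify the annotated types before running."""
    for name, value in (("a", a), ("b", b)):
        if not isinstance(value, int):
            raise TypeError(f"{name} should be int, got {type(value).__name__}")
    return a + b


print(add(1, 2))      # 3
print(add("a", "b"))  # ab

try:
    checked_add("a", "b")
except TypeError as error:
    print(error)      # a should be int, got str
```

The unchecked call with strings runs fine despite the "wrong" annotations; only the variant that explicitly consults the declared types reports a violation.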

The optional type annotations make it possible to put type information in the code where it's useful for documentation purposes; no more hunting for documentation that explains that an argument must implement a certain list of methods in order to be considered an acceptable duck. The presence of interfaces, ie. a named set of methods with method signatures, and optional type annotations makes it possible to document APIs.
Crucially, the language is always dynamic and arguments can be specified as dynamic, ie. of type Dynamic.

Runtime Extensibility and Mutability - or lack thereof

Let's get it out of the way: No Monkeypatching. No eval. No Reflection at the moment, although a Mirror-based system is in the works (for details see this paper introducing Mirrors). The plan seems to be to limit construction of new code to a new Isolate, not the currently running process.

noSuchMethod

Dart allows some dynamic magic with the noSuchMethod feature, similar to Ruby's method_missing, Smalltalk's DNU or other similar language features. Future versions of Javascript are also supposed to have a similar feature in the form of Dynamic Proxies, which are slowly making their way into current Javascript VMs, such as V8.
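Python has a hook in the same family, __getattr__, which makes for a compact illustration of what such a feature allows. The sketch is Python rather than Dart, and the RecordingProxy class is hypothetical.

```python
class RecordingProxy:
    """Handles calls to methods that were never defined on the class."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only invoked when normal attribute lookup fails,
        # so self.calls above is found the usual way.
        def handler(*args):
            self.calls.append((name, args))
            return f"handled {name}"
        return handler


proxy = RecordingProxy()
result = proxy.save("report.txt", 3)

print(result)       # handled save
print(proxy.calls)  # [('save', ('report.txt', 3))]
```

The same hook is what typically powers proxies, mocks and remote method stubs in languages that offer it.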

Closed Classes, no Eval

Languages like Ruby allow classes to be changed, even at runtime, which is referred to as open classes. Not having this feature helps with performance: all members are known at compile time, which makes it possible to analyze code and remove functionality that's never referenced. Refer to the 'Criticisms' section below to see the current status and what current solutions exist in other languages.

Future Language Features

An async/await-style extension is being considered to facilitate writing I/O code. Many of the I/O APIs in Dart are async, and hence some support with making that easier is welcome. The reason to stay away from adding features like Coroutines, Fibers and their variants, is to avoid adding synchronization features. Once coroutines are in the system, it's possible to schedule and interleave their execution and in order to write correct code, it's necessary to synchronize shared resources. Hence the focus on single threading; concurrency is done with Isolates: explicit communication, no sharing, Isolates can be locked away etc.

Criticisms

Nothing riles up developers more than a new programming language. A quick look at some common criticisms.

DartC compiles Dart to huge Javascript files

A link's been going around showing a Dart "Hello World" application that's compiled into thousands of lines of Javascript code. The short answer is: adding optimizations such as tree shaking, ie. removing unused functions, is on the Google Dart and DartC team's ToDo list.

Certain characteristics of Dart make these optimizations possible, in particular the closed classes, which mean that all functions are known at compile time. The lack of eval means that at compile time, the compiler knows which functions are used, and more importantly: which are not. The latter can be safely removed from the output.
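At its core, tree shaking is a reachability computation over the program's call graph. The following Python sketch shows that idea only; it is not how DartC is implemented, and the call graph and function names in it are made up for illustration.

```python
def tree_shake(call_graph, entry_point):
    """Return the set of functions reachable from entry_point."""
    reachable = set()
    pending = [entry_point]
    while pending:
        name = pending.pop()
        if name in reachable:
            continue
        reachable.add(name)
        pending.extend(call_graph.get(name, ()))
    return reachable


# Hypothetical program: each function maps to the functions it calls.
call_graph = {
    "main": ["print", "formatGreeting"],
    "formatGreeting": ["toUpperCase"],
    "print": [],
    "toUpperCase": [],
    "parseJson": ["toUpperCase"],  # never called from main
    "sortList": [],                # never called from main
}

kept = tree_shake(call_graph, "main")
removed = sorted(set(call_graph) - kept)

print(sorted(kept))  # ['formatGreeting', 'main', 'print', 'toUpperCase']
print(removed)       # ['parseJson', 'sortList']
```

The approach is only sound when the call graph is complete, which is exactly what eval or runtime class modification would undermine.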

Users of Google's Closure tools will know this as the Advanced Compilation feature. Closure brings a class system to Javascript and makes it possible to annotate classes with information. In the advanced mode, the Closure compiler assumes the developer conforms with certain rules, and with that knowledge can assume that if a function isn't explicitly referenced in code, it can be dumped. Obviously, if the programmer breaks these rules, uses eval or other features, the code will break.

Google Dart doesn't need to rely on the programmer to stick to the rules, the language's restrictions provide the necessary guarantees to the compiler.

Another example of a language that makes use of Google Closure's Advanced Compilation is ClojureScript (that's Clojure with a 'j'). ClojureScript is also meant to be an application language and lacks eval or other dynamic code loading features. It's compiled to Javascript that complies with Google Closure's Advanced Compilation tools in order to allow the compiler to remove unused library functions.

Why not use static typing for runtime optimization?

Why is the typing optional, and when it's present, why isn't it used to improve the generated code? Surely, knowing that something's an int must help optimize the generated code.
As it turns out, the team behind Dart knows about these ideas; they've done a VM or two in the past, and Google V8 and Oracle's HotSpot are just two examples.

Using static type information in Dart doesn't help with the runtime code for several reasons. One is: the types the developer specifies have no impact on the semantics at all and, as a matter of fact, they can be totally incorrect. If that's the case, the program will run fine, although you'll get warnings from the type checker. What's more, since the given types can be nonsensical, the VM cannot use them for optimizations as is, because they're unreliable. Generating, say, int specific code just because the developer specified it is wrong if the actual objects at runtime are really Strings.
The static type system is an aid for tooling and documentation, but it has nothing to do with the executed code.

There is another reason why the static types aren't very helpful in generating optimized code: Dart is interface-based. Operators for, say, ints are actual method calls - method calls on an interface. Dart isn't kidding, eg. int is actually an interface, not a class.
Calling interface methods means resolving them at runtime, based on the actual object and its class. Concepts like (polymorphic) inline caches at callsites can help remove the overhead of method lookup. StrongTalk and its direct descendant HotSpot use feedback based optimization to figure out what code is actually executed and generate optimized code. V8 has also gained these optimizations recently in the form of Crankshaft.
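The inline cache idea can be illustrated with a toy model in Python. It is a simplified, monomorphic cache (one remembered class per call site), not a polymorphic one, and it is not how the Dart VM, HotSpot or V8 implement the technique; the CallSite class and its counters are hypothetical.

```python
class CallSite:
    """A monomorphic inline cache for one method call site."""

    def __init__(self, method_name):
        self.method_name = method_name
        self.cached_class = None
        self.cached_method = None
        self.hits = 0
        self.misses = 0

    def call(self, receiver, *args):
        cls = type(receiver)
        if cls is self.cached_class:
            # Fast path: same receiver class as last time, skip the lookup.
            self.hits += 1
        else:
            # Slow path: full lookup, then remember the result.
            self.misses += 1
            self.cached_class = cls
            self.cached_method = getattr(cls, self.method_name)
        return self.cached_method(receiver, *args)


# The '+' operator modeled as a method call, resolved per receiver class.
site = CallSite("__add__")
results = [
    site.call(1, 2),      # miss: first time int is seen
    site.call(3, 4),      # hit: still int
    site.call("a", "b"),  # miss: receiver is now a str
    site.call(5, 6),      # miss: back to int, the single slot was overwritten
]

print(results)                 # [3, 7, 'ab', 11]
print(site.hits, site.misses)  # 1 3
```

The last miss shows why polymorphic caches exist: a call site that alternates between a few receiver classes needs more than one remembered entry to stay on the fast path.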

Where's my favorite language feature?

Google has released Dart at a rather early stage. It's easy to get fooled by the language spec, IDE, VM, DartC etc., but the clear message from the Dart team is: now is the time to try Dart and provide feedback. A lot of features are already planned but haven't been finished or implemented yet; Reflection and Mixins are but a few ideas that have been mentioned as potential future features.

If a feature isn't in the Dart repository or language spec, now is the time to provide feedback and suggest fixes or changes to the language and runtime environment.

Wrapping Up

A lot of work has been done on Dart: the language spec, the Dart VM, the DartC compiler to compile Dart to plain Javascript, the editor that's based on SWT and some Eclipse bundles, etc.

However, the initial release of Dart is a technology preview and the language, APIs and tools are very much a work in progress. Now is the time to give the Dart team feedback and actually have a chance to have an impact on the language. The language will change; some of the proposed and planned changes were mentioned in this article.

Some have already started experimenting with Dart, for instance a Java port has been started with the JDart project, which makes heavy use of Java 7's invokedynamic features.

In this article, we focused on the language and features of the Dart VM, but with DartC it's possible to compile Dart code to plain Javascript; some of the samples shown at the introduction, including the one running on the iPad, were actually Dart applications compiled to Javascript, running in standard browsers.

While the initial development of Google Dart was done in secret, the whole project, source, tools, ticket system, etc. is now out in the open. It remains to be seen if and where Dart will be adopted. As mentioned, the Dart VM comes with features that will make Dart appealing to both client and server developers.

Links:

Instructions to get the Google Dart source. Note: some people have complained about having to enter a Google password in order to fetch the necessary tools and sources. As the linked page says, this will only happen if you access the sources as a committer via https; the code can be downloaded anonymously.