
Bio: Kresten Krab Thorup is CTO of Trifork, where he's responsible for technical strategy, researching future technologies, and the JAOO and QCon conferences. Kresten has worked on open source projects like GCC, GNU Objective-C, gcj, etc. and used to work at NeXT Software on the Objective-C tool chain, the debugger, and the runtime. His latest project is Erjang; he blogs at http://www.javalimit.com/.

The Erlang Factory is an event that focuses on Erlang - the computer language that was designed to support distributed, fault-tolerant, soft-realtime applications with requirements for high availability and high concurrency. The main part of the Factory is the conference - a two-day collection of focused subject tracks with an enormous opportunity to meet the best minds in Erlang and network with experts in all its uses and applications.

That is true. That is really my passion. I’ve been doing language design and language implementation for the last 20 years, starting when I picked up a copy of the GNU compilers and there was this kind of non-functional (meaning not working) implementation of Objective-C. I started hacking on Objective-C, and then I did that professionally.

Yes, I was here in the valley for two and a half years doing Objective-C, and then I have been working on Java. I’ve been part of some of the expert groups, the one on parameterized classes, and I did my PhD in Aarhus, mostly with the BETA programming language group. BETA is one of these odd object-oriented languages that grew out of the Simula and Kristen Nygaard lineage of programming languages. So I’ve been influenced a lot by programming language implementers and designers, and that continues to be my passion.

I don’t really think Java failed. I mean, obviously it’s a big success, but it’s stalling; it’s not going anywhere further. That probably happens to all hugely successful languages: there is a fine design and there are lots of programmers adopting it, and then very soon it gets so difficult to make improvements, because there is a huge installed base and there are lots of developers using it, and it just has an inertia that drives it almost to a halt.

So obviously, if Java is as successful a language as COBOL or C, it’s never going to die. But I think it was designed at a time when we did not yet face many of the problems we are starting to face now, and it doesn’t capture those.

I think there is the whole concurrency issue, obviously, and that is something people talk about a lot. It is obviously very difficult to write concurrent programs in Java. I mean, there are lots of books on concurrency in Java, very nice books, but they are survival guides; they are not design guides for how to make beautiful concurrent programs.

And there is a whole mindset that I see in many discussions on concurrency: that concurrency is the problem, that we need to engineer our way out, that we need to engineer a way to solve these things, to make correct programs, to make fast programs; an engineering point of view. And I think there is a huge opportunity in making abstractions that relate to concurrency but that will make programs a lot easier to understand, just like object-oriented programming, which Java obviously took mainstream, gave us a lot of ability to manage complexity and structure.

Everybody has to care about concurrency one way or another. You can push the concurrency issues down in your stack, so you are using a framework or a high-level language or something where concurrency disappears: you can use a functional language, or you can use frameworks to do map-reduce and all kinds of data parallelism that abstract the concurrency away; you push that into the platform. But there is a huge set of problems where you cannot abstract the concurrency away, and that’s all the coordination stuff.
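
As a rough illustration of pushing concurrency into the platform, here is a minimal Python sketch of a data-parallel map followed by a reduce. The function names `word_count` and `total_words` are made up for this example; the point is only that the caller never sees a thread or a lock.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def word_count(line):
    """Pure function: no shared state, so calls can run in any order."""
    return len(line.split())


def total_words(lines):
    # The pool decides how the work is spread across threads; the caller
    # only supplies a function and the data.
    with ThreadPoolExecutor(max_workers=4) as pool:
        counts = pool.map(word_count, lines)            # "map" step
        return reduce(lambda a, b: a + b, counts, 0)    # "reduce" step
```

This only works so neatly because each call is independent; as soon as the pieces have to coordinate with each other, the abstraction no longer hides the concurrency.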

Lots of systems are interactive by nature; they interact with outside stateful things, they have to coordinate many sources of events, and we are starting to see more and more of them. I like to say Java was designed in the client-server age, and then people picture a stone axe or something like that. But really, back when Java was designed, and in most usages of Java still, it is about client-server systems, where you can let this synchronous coordination paradigm prevail of sending a request and waiting for a response. And there are essentially only two parties in most distributed systems.

As soon as you get more than two parties in a distributed system you need to coordinate, and that coordination stuff is just really tricky to get right. And that is obviously happening more and more, because practically every system that we build these days integrates many web services that are out of your control. Typically, many of the systems that we build at Trifork have so many interface points that there is always one of them that doesn’t work, and your system has to continue running.

So that’s one area where there has been a big development over the last 15 years since Java’s inception. I think we need much more coordination. And the other area is obviously at the low level, the multicore stuff, because it’s the same kind of thing, just at a micro level. If you have things actually happening in parallel you still need to coordinate. Running stuff in parallel is pretty easy, but there is always the difficulty of coordinating things at the end.

I think coordination is starting to fill more and more of our programs, just like you talked about releasing your algorithm, the talk you are going to do tomorrow. So the context for this is really that I’ve been looking at a lot of old C code recently and translating it into Java, because I am working on the Erjang VM, and the interesting part is that it’s amazing how much error handling code there is. I mean, it just fills up so much that you can’t see what the piece of code is supposed to do.

I think the really interesting opportunity here is to make a leap just like we did with objects and object thinking; that really gave us a sense that it’s very easy to handle some kinds of complexity. I think with concurrency we can do the same thing, because there are so many problems that are by nature concurrent. So we can start describing our domain models as actors, essentially, as concurrent entities that have isolation properties, many of the things you typically find in a language like Erlang. Erlang has many of the core concepts.

I’ve been thinking about how to view actor programming as a progression of object-oriented programming, and several people have talked, both in the past and specifically in the last couple of years, about how Erlang is really the right way to do object-oriented programming. So an Erlang actor (let me use the term actor, because I really want to promote the term actor): you think of it as an object, and you can send it messages; it has its own internal state that you can’t mutate from the outside, and inside the actor there is only one thread, to use that terminology.

An actor is a thing that has one thread inside of it and has some state, and that is conceptually very close to the concept of an object. Actually it’s a nicer object, because it has real data encapsulation. We’ve been talking about all these concepts of object-oriented programming like polymorphism and inheritance and data encapsulation, etc., except that the data encapsulation was never really there.
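
The description above (one thread, private state, messages as the only way in) can be sketched in a few lines of Python. This is an illustration under assumptions, not how Erlang or Erjang implement processes: `CounterActor`, `ask_count`, and the tuple message format are hypothetical, and Python only hides `_count` by convention, so the isolation here is not enforced.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state, one thread, one mailbox."""

    def __init__(self):
        self._mailbox = queue.Queue()   # messages wait here until handled
        self._count = 0                 # touched only by the actor's thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        """The only way to interact with the actor from the outside."""
        self._mailbox.put(message)

    def _run(self):
        # Messages are handled one at a time, so no locks are needed here.
        while True:
            message = self._mailbox.get()
            if message[0] == "add":
                self._count += message[1]
            elif message[0] == "get":
                message[1].put(self._count)   # reply on the caller's queue
            elif message[0] == "stop":
                return

    def join(self):
        self._thread.join()


def ask_count(actor):
    """Send a "get" message and wait for the actor's reply."""
    reply = queue.Queue()
    actor.send(("get", reply))
    return reply.get(timeout=5)
```

Many threads can call `send` at once; the count stays consistent because only the actor's own thread ever updates it.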

You can always easily end up passing a pointer out to some data which is then mutated later on. With the actor model you truly isolate your actors; your objects become little protection areas, and that in itself, regardless of the concurrency and all that, makes it so much easier to reason about a large system. And when you add concurrency and say each actor manages its own state concurrently with other actors, then so much of the complexity in reasoning about concurrent systems goes away, because you don’t have to worry about shared mutable state at all.
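
A small Python sketch of the leak described above: an object that hands out a reference to its internal data, next to a variant that hands out a copy, which is roughly what copying a message between isolated actors gives you. The `Account` class and its method names are invented for this example.

```python
import copy


class Account:
    def __init__(self):
        self._history = []

    def record(self, amount):
        self._history.append(amount)

    def balance(self):
        return sum(self._history)

    def history_leaky(self):
        # Hands out a reference to internal state: callers can mutate it.
        return self._history

    def history_isolated(self):
        # Hands out a copy, so the caller's changes stay with the caller.
        return copy.deepcopy(self._history)


account = Account()
account.record(10)

leaked = account.history_leaky()
leaked.append(1000)           # silently changes the account's balance

safe = account.history_isolated()
safe.append(5000)             # has no effect on the account
```

After the leaky call, the balance has changed without `record` ever being called, which is exactly the kind of action at a distance that makes large systems hard to reason about.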

The issue obviously is that this is quite new for me and for people in general to think this way. The other thing, when you do concurrent programming in a language like Erlang, is how these actors communicate. In Erlang there is an infinite-size queue of messages, and I think that brings in another factor of complexity in understanding what goes on, because there might be a long delay between sending a message and receiving it.

And getting your head wrapped around what happens when you have all these queues is really hard. They might overrun, they might fail somehow, because obviously there have to be some limits to them, so that one single actor’s queue cannot starve the entire program. So I’ve been doing a lot of looking back at what people have done, and I think there is some really interesting work that descends from the realtime operating system stuff.
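
To make the queue-limit concern concrete, here is a minimal Python sketch of a bounded mailbox that rejects messages once it is full instead of growing without limit. The helper `offer` and the limit of 3 are arbitrary choices for the example; Erlang mailboxes do not behave this way.

```python
import queue


def offer(mailbox, message):
    """Try to enqueue without blocking; report whether it was accepted."""
    try:
        mailbox.put_nowait(message)
        return True
    except queue.Full:
        return False


mailbox = queue.Queue(maxsize=3)    # maxsize=0 would mean unbounded
accepted = [offer(mailbox, n) for n in range(5)]
```

With a bound in place the sender learns immediately that the receiver is overloaded, and has to decide whether to drop, retry, or slow down; with an unbounded queue that decision is only deferred until memory runs out.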

Many people might have worked with QNX. QNX is a microkernel operating system for doing control systems which really implements an actor model. It’s an operating system with very lightweight processes, so you can easily create many processes, and it’s almost like a framework for writing actor programs, except it’s not a language, it’s an operating system. But they have this very nice way of doing synchronous sends and asynchronous replies that makes it something to explore, to figure out ways of interacting between actors.
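
The synchronous-send, asynchronous-reply idea can be sketched as follows. This is loosely modelled on the pattern described above and is not QNX's API: `Channel`, `send`, `receive`, and the doubling server are hypothetical. The sender blocks until its message is answered, while the server is free to receive several messages before replying to any of them.

```python
import queue
import threading


class Channel:
    """Hypothetical rendezvous channel (illustration only, not QNX)."""

    def __init__(self):
        self._requests = queue.Queue()

    def send(self, payload, timeout=5.0):
        """Block the caller until the server replies to this message."""
        reply_slot = queue.Queue(maxsize=1)
        self._requests.put((payload, reply_slot))
        return reply_slot.get(timeout=timeout)

    def receive(self):
        """Server side: take the next message plus a slot for the reply."""
        return self._requests.get()


def server(channel, n):
    # Receive all n requests first, then reply in reverse order, to show
    # that a reply does not have to follow its receive immediately.
    pending = [channel.receive() for _ in range(n)]
    for payload, reply_slot in reversed(pending):
        reply_slot.put(payload * 2)


def run_demo():
    channel = Channel()
    results = {}

    def client(x):
        results[x] = channel.send(x)    # blocks until the server replies

    threads = [threading.Thread(target=client, args=(x,)) for x in (1, 2)]
    threads.append(threading.Thread(target=server, args=(channel, 2)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because every sender is blocked while its message is outstanding, each sender can have at most one message in flight, which bounds the queue without an explicit limit.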

So I think the important aspect is really the isolation. Getting back to that isolation and dealing with the fact that you can’t really reach in and mutate each other’s objects.

Yes, you have to model it in a slightly different way, but I think once people see it they will think: "It is a lot easier." And just like when objects came out and started to become mainstream, it wasn’t fast and there were new concepts you had to learn, but standing here looking back you say: "There are at least many applications where it makes a lot of sense to use objects, because it just makes your program simpler in certain ways."

And I think the same will apply for modeling with concurrency. Concurrency modeling, or actor modeling, is just a very simple progression of object modeling that includes real isolation. And I think we’ll be standing there in 5 or 10 years and say: "It’s really amazing how simple it is to model things here." Joe Armstrong, one of the designers of Erlang, usually likes to say it’s so simple to model the processes; you just look around and see the processes everywhere.

And it’s kind of the same sense I have with objects and classes: I just look around and I can easily see how I would model something. I think lots of people will get to the point where Joe is in 5 to 10 years, once we have the real languages for that.

Yes, it’s not an easy answer. I think there are very different problem domains that fit each of these languages. There are the functional languages; they’re obviously very useful, and there are lots of people doing really good things with them, making concurrent programs really fast and making it easy to write correct programs and stuff like that, but there are only certain classes of problems where they are a natural fit.

All the big batch processing things are a perfect fit. The map-reduce stuff. It’s very nice for doing huge functions that might take a day to compute; as long as it’s essentially one function, the whole mindset that comes with that fits very nicely. At the other end of the scale, I think, are all the interactive systems, anything from GUI programs to control systems to programming integration between different services. For that whole area of interactive systems, we don’t have an application language that hits "smack" in the middle of that.

There are mostly system programming languages. Both Go and Erlang are at least being seen as mostly system programming languages, or integration platform languages. You don’t see typical applications of Erlang building desktop apps, or building applications with big domain models, or these kinds of things that we typically see application programs do. And yet application programs have to deal with all this stuff, all this concurrency.

I think GUI programming is the killer app for a good concurrent actor-based language, because right now with GUI programming you have to manage all these queues of events, and you have to make the event queues very specific. Typically, working with Swing or Objective-C, it’s a big hassle having to always run things on the main thread and making sure you don’t mutate the GUI components from worker threads, etc.
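
The main-thread rule mentioned above looks roughly like this when written out by hand. The sketch uses no real GUI toolkit: `label` is a plain dictionary standing in for a widget, and `ui_events`, `worker`, and `run_event_loop` are hypothetical names. Workers never touch the widget; they post a small update function that the UI thread runs later.

```python
import queue
import threading

ui_events = queue.Queue()       # the explicit event queue
label = {"text": ""}            # stand-in for a widget owned by the UI thread


def worker(job_id, ran_on):
    result = f"job {job_id} done"       # pretend this was slow work

    def update():
        label["text"] = result
        ran_on.append(threading.current_thread().name)

    # The worker never touches `label`; it only posts the update.
    ui_events.put(update)


def run_event_loop(expected_events):
    """Drain a fixed number of events on the calling (UI) thread."""
    for _ in range(expected_events):
        update = ui_events.get(timeout=5)
        update()
```

Nothing stops a worker from writing to `label` directly, which is the point being made: the discipline lives in the programmer's head rather than in the language.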

And all that stuff that you do there, that is the coordination stuff that ends up filling a lot of your brain and a lot of your code. I think an actor application programming language could really solve a lot there, but obviously creating a new language is a huge thing. And I don’t think that any of the languages that are there right now are "smack" in the middle of that space. There are a number of different languages, like Scala, which is very likely to succeed big time because it leans on Java, and we can do the actor programming as a framework in Scala.

I think Scala has an issue with having so many different ways of doing things. When you’ve written a Scala program, it’s difficult to know: is this right? And I think even though it has an actor framework, it doesn’t do real actors, in the sense that it doesn’t provide isolation of actors and it doesn’t provide safe messaging, and that means it’s very easy to cheat and pass a pointer.

Sure, that is the price of it. From an engineering point of view it’s a great toolbox for doing all kinds of things and integrating, but from a slightly more scientific, empirical point of view I think it’s going to be very difficult to reason about the soundness of the resulting systems. I am a language geek myself, and every individual feature of Scala is so nice and well thought out, but there are just too many, I think.

I am not so sure, because I am kind of leaning towards liking adding stuff as frameworks, but only to the extent that you can really do it. When you are on a Java platform, or at least when you are on a Scala platform, you fundamentally can’t provide isolation, so it’s just an engineering, half-baked kind of solution to the problem. If you are very careful and write your actor programs correctly, obviously it can work, and work very well, but it’s very tempting to break the invariants.

Actually, a thing I’d really like to see in a core platform is some notion of isolation units. It’s been proposed for Java to have these isolates; Erlang obviously has it in its process concept. It’s been very successful for decades in microkernel operating systems. The iPhone runs a Mach operating system, which is a microkernel system that has its library processes, and all the control systems are using the same paradigm. Bringing that paradigm of memory isolation units into a core platform, I think that is very important.

As for what the language does on top of that, I think it’s fine to have a language where you can do stuff in terms of frameworks, and then you can argue whether the resulting framework and its usage is nice enough, whether you can provide really good tooling, and whether you can understand the framework. Scala is really nice from a language point of view, but it’s also very complicated. There are so many things in it. So there is a tradeoff there.

But it doesn’t have to be so complicated. There are lots of simpler languages that succeed in providing things as frameworks, like Ruby and Lisp, and you can do lots of stuff in Haskell. One of the high points of the BETA programming language, where I kind of grew up, was being able to write new control structures and stuff like that. You see that in Smalltalk: the if construct is not in the language, it’s just something you do with the core abstractions.
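
A rough Python rendering of the Smalltalk idea mentioned above: conditional execution as an ordinary method on two boolean objects that take closures, rather than as built-in syntax. The class and function names (`TrueObject`, `FalseObject`, `if_true_if_false`, `less_than`, `smaller`) are invented for this sketch, and it only mimics the shape of Smalltalk's approach.

```python
class TrueObject:
    def if_true_if_false(self, then_block, else_block):
        return then_block()     # only the "then" closure is evaluated


class FalseObject:
    def if_true_if_false(self, then_block, else_block):
        return else_block()     # only the "else" closure is evaluated


def less_than(a, b):
    # Pick the boolean object by indexing (False is 0, True is 1),
    # so no `if` statement appears anywhere in this example.
    return (FalseObject(), TrueObject())[a < b]


def smaller(a, b):
    return less_than(a, b).if_true_if_false(lambda: a, lambda: b)
```

The same mechanism lets a library author define loops, retries, or transactions as ordinary methods that take blocks, which is what makes new control structures possible without changing the language.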

So I kind of like that way of being able to write new control structures, new abstractions, in the language, but obviously you need tool support to do that, and that often was lacking. That is one of the issues in Ruby, for instance: there is all this meta programming of adding new DSLs and stuff, but it is very difficult to provide tooling for it.

I think the real problem is that there is meta programming without a meta model. You end up in Ruby, for instance, doing meta programming, and that is really nice and easy, but then there is no way that tools can reason about the things that you see as new abstractions coming out of it. So there is obviously a big challenge now. We are starting to see more and more DSLs, but building languages and tools around languages is quite difficult.

If you just build the language without tooling support, then it can often be very difficult to maintain and reason about the resulting code. That is really one of the dangers of DSLs, as well as the danger of having languages where you can add your own control structures. You need to have a way to reason about the things that you express. You need to have a meta model for the meta programming that you do.

I think having an optional type system is a very interesting area. I started using that when I started working on Objective-C almost 20 years ago. Objective-C has this property where you can write classes and functions, methods and whatnot, without type declarations, and then you can add some type declarations, and when you add some, you get type checking for those; you get compile-time warnings. That provides you with a spectrum of typefulness.

So when you start out doing something, maybe you don’t want to use types, because you have everything in your brain and it’s important to just get some stuff done. But as the code matures and you want to expose it to more developers, it makes a lot of sense to add those types, because it really improves the readability; it provides the context for understanding what some code does when looking at it from the outside.
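
Python's optional annotations give a similar spectrum and can serve as an illustration, although the mechanism differs from Objective-C's compile-time warnings: annotations are ignored at runtime and are only checked if you run a separate type checker. The function names below are made up for the example.

```python
from typing import List


def mean_untyped(values):
    # Early, exploratory version: no declarations at all.
    return sum(values) / len(values)


def mean_typed(values: List[float]) -> float:
    # Same code once it is exposed to other developers: the signature now
    # documents what goes in and what comes out, and tools can use it.
    return sum(values) / len(values)
```

Both functions behave identically when called; the difference is what an IDE or a reader can learn from the signature without opening the body.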

Having a spectrum of types, meaning there is an optional type system somehow, is very interesting. So there was Objective-C, and there is a language called Strongtalk, which is Smalltalk with types. The guys that did the HotSpot VM actually did a high-performance Smalltalk before they were acquired by Sun, and one of the guys from there, Gilad Bracha, is continuing work on optional type systems.

He is working on this language Newspeak, which is really interesting. Erlang also has a kind of optional type system, where you can add type specifications to functions, though it is not promoted as much and it’s not tooled as much. It’s really useful when you come to a framework or a library and you go into the IDE and it will let you navigate and know what you can do with it, because it has type information. I think that is really where the biggest value of type systems is: allowing developers to navigate, because that is really the costly part.

Reasoning about global correctness, I don’t think that is the most important. It’s always been the vision, but there are so many other kinds of complexity that kick in, so reasoning about global correctness on the basis of types I don’t think is really interesting for most applications. We are not willing to pay the price of putting all the precise type information everywhere that is needed to be able to do that. So I think optional types really have a big future, and obviously they are already quite successful with stuff like Objective-C.

People are very effective with that. You work exactly that way: for your own classes that you are not intending to reuse you might not type them, but if you are packaging something up as a nice library you write fine types for everything, and that enables you to write really nice documentation and makes it easy to navigate your code. I think that is a more important part of typing that is really not emphasized enough; people like to have it all.

Types for performance are really not that relevant anymore, because typically VMs will infer types anyway. The VM will know more about your program than you do; it has very specific knowledge about specific combinations of applications and stuff, so it will reason much more specifically about your types. I don’t think types for performance are that important. Performance is something that we should offload to the VM.