Talking Garbage Collection with Gil Tene, Azul Systems

We chat to Azul Systems CTO Gil Tene about a Java topic all developers want to understand further – garbage collection

After selling out a London Java Community meetup (so to speak),
it seemed rude not to catch up with one of the key men behind the
radical rethinking of JVM garbage collection, Azul Systems’ Gil
Tene. Here’s Part 1 of our chat, delving into different approaches,
the origins and upcoming trends. Make sure to check back on
Wednesday for part two of our exclusive interview.

JAX: Obviously garbage collection has been around for
decades – can you detail the origins of GC?

Gil Tene: Garbage collection, or automatic memory management more
broadly, has been around for quite some time, probably a good four
decades, with a lot of good academic work on it – interestingly, a
lot of that was done during the 70s and 80s, with more
groundbreaking stuff recently.

What is garbage collection good for? It seems sort of obvious but a
lot of people seem to think of it as a way to let lazy programmers
who are not very skilled still program. But I think there are very
important industry trends that come out of it and now that Java and
other platforms are dominant in parts of the industry, we can see
the effect that has on actual productivity.

JAX: Does it differ depending on the working
environment?

GT: The first thing is that if you work in a mixed environment, or
compare environments where people code in Java or C# to those
programming in C and C++, what they do is build applications that
they support, maintain and enhance over time. They don’t look at the
time to initial delivery so much as a year and a half or two years
into a project. My experience of embedded systems, servers and
software you ship is that when the stuff is built in C and C++,
you get it stable, you get it shipping and then you have these rare
things that crash and you spend about half of all your engineering
effort trying to figure out what the rare crashes are.

When you look at core Java, they spend their time debugging other
things. It’s not that they don’t have bugs, but it’s very rare that
they look for crashing bugs, they usually end up looking for
functional bugs, enhancements and slowness and other things. The
other part is not so obvious: automatic memory management is almost
a core requirement for an effective framework ecosystem and
leverageable software. That is not an obvious thing to most
people.

JAX: So do you think Java developers, in general, care
enough about GC? Do they put it at the back of their
minds?

GT: I think they predominantly don’t care. And I think that’s the
right way – they shouldn’t care. When they run into tuning in
production, they end up caring for practical reasons but I actually
think the term GC-friendly programming or heap-friendly programming
is a symptom of a really bad execution on the garbage collection
side, not a good thing for programmers to do. Saying heap-friendly
programming is kind of like saying 16bit-friendly programming. It
should be considered that way.

If a collector can’t give you the invisible experience, then the
collector’s broken, the JVM’s broken and it’s a toy that isn’t
working right and you should be programming better.

JAX: Shall we move on to more specific garbage collectors
and what typical things they do?

GT: I think that if you look at garbage collection implementations
in modern, mature runtimes such as Java, .NET, Ruby and others,
they have evolved garbage collectors which are pretty good. Usually
you’ll see precise garbage collectors, ones that know where all the
pointers are whenever they collect, so they can move objects around;
for example, they can compact memory. You can’t do that if you can’t
fix the pointers.
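
To make the point about fixing pointers concrete, here is a minimal sketch of a compaction pass over a toy heap. It is an illustration only: the class name ToyCompactor, the array-based "heap" and the single reference per object are all assumptions made for the example, and no real JVM lays out memory this way.

```java
import java.util.Arrays;

/** Toy sliding compaction over a simulated heap (illustrative only). */
public class ToyCompactor {
    static final int NULL = -1;

    /**
     * heapRefs[i] is the slot that the object in slot i points to (or NULL).
     * live[i] says whether slot i holds a live object.
     * Returns the references of the live objects after they have been
     * slid together, with every pointer rewritten to the new slot numbers.
     */
    static int[] compact(int[] heapRefs, boolean[] live) {
        int n = heapRefs.length;
        int[] forwarding = new int[n];
        int next = 0;
        // Pass 1: give every live object its new address.
        for (int i = 0; i < n; i++) {
            forwarding[i] = live[i] ? next++ : NULL;
        }
        // Pass 2: move each live object and fix the pointer it carries.
        // This only works because we know exactly where every pointer is.
        int[] compacted = new int[next];
        for (int i = 0; i < n; i++) {
            if (!live[i]) {
                continue;
            }
            int target = heapRefs[i];
            compacted[forwarding[i]] = (target == NULL) ? NULL : forwarding[target];
        }
        return compacted;
    }

    public static void main(String[] args) {
        // Slots 1 and 3 are garbage. Live chain: 0 -> 2 -> 4 -> null.
        int[] refs = {2, NULL, 4, 0, NULL};
        boolean[] live = {true, false, true, false, true};
        System.out.println(Arrays.toString(compact(refs, live))); // prints [1, 2, -1]
    }
}
```

After compaction the live chain occupies slots 0, 1 and 2, and each reference has been rewritten through the forwarding table. A collector that could not locate every pointer precisely would have no safe way to do that rewrite.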

As far as techniques go, there are no commercial JVMs that don’t do
that; it’s a given. Generational collection, the observation that
focusing on the young generation of recently allocated objects
produces efficiency, is again a universal practice in every JVM on
a server. There are ways to run without it, but usually you don’t
get very good throughput or scalability.
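
One way to see generational collection at work is to ask a running JVM which collectors it has registered. The sketch below uses the standard java.lang.management API; the class name ListCollectors is made up for the example, and the collector names, memory pools and counts it prints depend entirely on the JVM and the flags in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.List;

/** Lists the garbage collectors the current JVM exposes through JMX. */
public class ListCollectors {
    static List<GarbageCollectorMXBean> collectors() {
        return ManagementFactory.getGarbageCollectorMXBeans();
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : collectors()) {
            // getCollectionCount() and getCollectionTime() return -1 when undefined.
            System.out.printf("%s pools=%s collections=%d timeMs=%d%n",
                    gc.getName(),
                    Arrays.toString(gc.getMemoryPoolNames()),
                    gc.getCollectionCount(),
                    gc.getCollectionTime());
        }
    }
}
```

On a generational collector you will often see separate entries for the young and old generations, with the young one reporting far more collections.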

JAX: What particular trends appear to be coming through,
particularly in a Java context?

GT: Over the past 15 years, certainly since Java started, we keep
improving garbage collection but the way people tend to improve it
is by taking the really bad thing that’s happening and trying to
make it happen a little less often. So, pushing it into the future
and further into the future rather than taking it on and figuring
how to solve it.

JAX: Does that mean the mindset needs to change
dramatically?

GT: I think so. In effect, our design for GC started nine years
ago. It was started from the simple statement of saying that that
trend is not sustainable – we have to go in the exact opposite
direction. The trend is to figure out how to take a problem and
ignore it for a little longer: instead of a few seconds, it’s going
to be a few minutes, maybe hours. That’s the Microsoft Windows 95
approach to stability. And we stopped making fun of Windows laptops
and rebooting because Java needs to be rebooted more often than a
Windows laptop.

The opposite direction is to find that thing, solve it, and maybe by
solving it, all this other crud you had to do to avoid it doesn’t
even have to be done. That’s the choice we made nine years ago. So
far, I don’t see it as something the industry has followed but I do
think that it is unavoidable that everybody will have to choose
that other path. The path of delaying the problem even longer is
running out of steam. Or arguably already has run out of steam five
or six years ago.

Within that, there’s various modes. You’ve got non-compacting
collectors, the fall-back compaction, incremental compaction,
default and full compaction. Then there’s mostly concurrent, mostly
incremental, mostly non-stop-the-world. The word “mostly” means:
sorry, I can’t deliver, except for some of the time. All those are
common exercises in delaying. Every time you add to them you buy a
little more time.

JAX: To an outsider, the jargon with garbage collection can
be a bit perplexing at first. Can you attempt to classify the key
terms?

GT: I have this talk (see below) which defines terms, types and
classifies metrics and puts it all together to classify modern
collectors. I like to distinguish concurrent collection from
parallel collection. Concurrent collection happens without stopping
the application, parallel collection can use more than one thread
at a time.

JAX: Obviously Azul Systems has a spin on things – care
to explain how Zing differs?

GT: The C4 collector has two main behaviors that are visible from a
classification perspective. First, it has a single-pass
concurrent marker. Regardless of how fast you change or mutate the
heap, the marker is guaranteed to complete and never has to
revisit what you’ve touched. Because of that it is independent of
application throughput.

It has a concurrent compactor that allows us to move objects
without stopping the application and more importantly allows us to
fix pointers without stopping. That’s one of the key differences
from the other commercial collectors. It has this for both the old
gen and the new gen – it’s actually the same collector. It has zero
stop-the-world fallback code. In practical terms, this results in a
compactor that is not sensitive to heap size, it’s not sensitive to
allocation rate or mutation rate. None of those things will affect
pause time or response time.
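
Pause time is something you can observe from inside an application on any JVM. The following is a minimal sketch, not a tool described in this interview: the class name HiccupMeter and its method are invented for the example. It sleeps for a fixed interval over and over and records how much longer than requested the worst sleep took. Overshoot can come from the collector, the OS scheduler or anything else that stalls the thread, so treat the number as an upper bound on pauses rather than a pure GC measurement.

```java
/** Measures how late a thread wakes up from short, regular sleeps. */
public class HiccupMeter {
    /**
     * Sleeps intervalMs, iterations times, and returns the worst overshoot
     * in nanoseconds (0 if no sleep ran longer than requested).
     */
    static long maxOvershootNanos(int iterations, long intervalMs) {
        long expected = intervalMs * 1_000_000L;
        long worst = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop measuring early.
                Thread.currentThread().interrupt();
                break;
            }
            long overshoot = (System.nanoTime() - start) - expected;
            worst = Math.max(worst, overshoot);
        }
        return worst;
    }

    public static void main(String[] args) {
        long worst = maxOvershootNanos(1000, 1);
        System.out.printf("worst overshoot: %.3f ms%n", worst / 1_000_000.0);
    }
}
```

Running it alongside an allocation-heavy workload, once with a small heap and once with a large one, gives a rough feel for how sensitive a given collector’s pauses are to heap size.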

JAX: I’m guessing it thrives under high pressure
situations?

GT: It does very well but actually it was designed to work across
the whole spectrum.

JAX: What mistakes are common when tackling garbage
collection?

GT: From a developer perspective, obviously I’d like to see people
use those collectors and never look at them. But Zing in
its native Linux form is a new thing, relatively young at less than a
year, and the same techniques aren’t available from other vendors. I
think developers do what they have to do and use a lot of duct tape
and work with reality and make their applications work as well as
they could be expected to. So when I see people program in a
GC-conscious way, I don’t think they’re doing something wrong. I
think it’s sad that they have to do that. But good engineers have
to solve real world problems.

Part 2 of our interview with Gil will appear on Wednesday,
where we discuss his company’s role in the Java Community Process,
and Java 8. Photo by
brianfuller6385.

Gil Tene (Twitter: @giltene) is CTO and co-founder of Azul Systems. He has been involved with virtual machine technologies for the past 20 years and has been building Java technology-based products since 1995. Gil pioneered Azul’s Continuously Concurrent Compacting Collector (C4), Java Virtualization, Elastic Memory, and various managed runtime and systems stack technologies that combine to deliver the industry’s most scalable and robust Java platforms. In 2006 he was named one of the Top 50 Agenda Setters in the technology industry by Silicon.com. Prior to co-founding Azul, Gil held key technology positions at Nortel Networks, Shasta Networks and at Check Point Software Technologies, where he delivered several industry-leading traffic management solutions including the industry’s first Firewall-1 based security appliance. He architected operating systems for Stratus Computer, clustering solutions at Qualix/Legato, and served as an officer in the Israeli Navy Computer R and D unit. Gil holds a BSEE from The Technion Israel Institute of Technology, and has been awarded 28 patents in computer-related technologies.