Whoa… Geth 1.5

The Go Ethereum team is very proud to finally release Geth 1.5, which can almost be called a complete internal rewrite of the Go Ethereum (go-ethereum) codebase.

We’ve packed a huge number of changes into this release, and simply listing them wouldn’t do them justice. Instead, we’ve decided to write them up in a more informal way, explaining not only what’s new, but also why it’s needed, and why it’s awesome!

Go Ethereum website

The go-ethereum project never really had a website. There was something auto-generated a long time ago by GitHub, but it couldn’t really be called a decent website: it didn’t contain valuable information, didn’t look particularly good, and there was nobody to properly maintain it. At the time that was OK, as the hardcore developers cared more about the source repository and wiki pages than about a website.

However, as Ethereum gains popularity and traction, we are now making efforts to make Geth, its code, and associated resources more accessible and streamlined for everyone involved, not just a handful of core developers. As a first step in this direction we’ve begun to put together a new website for go-ethereum. You can see it at: https://geth.ethereum.org.

The website still has a long way to go, but we’ve done our best to include information that is not available elsewhere, yet we feel is essential for anyone starting out with go-ethereum: a detailed installation guide for all platforms, and a downloads section gathering all our binaries from every build service we maintain. You can expect a detailed developer guide in the next few weeks, and a detailed user guide afterwards.

Library access

Go Ethereum, one of three original clients along with C++ Ethereum and Py Ethereum, evolved alongside the Ethereum networking and consensus protocol specification. This process entailed fast prototyping, frequent rewrites and binned features. The net effect was a codebase that worked well, but was difficult to embed into other projects due to its messy internals.

In the Geth 1.4.x series we started untangling go-ethereum, but it took longer than anticipated to clean up most of the public API pathways. With Geth 1.5, we’ve finally arrived at the point where we can stand behind our programmatic APIs both as usable and as something we would like to support long term. The final pieces are still being polished, but we’re confident you’ll like the result a lot!

Our main areas of focus were: a) simplified client-side account management; b) remote clients via HTTP, IPC and WebSockets; c) contract interactions and binding generation; and d) in-process embedded nodes. With these four main use cases covered, we’re confident most server-side or mobile applications can go a long way.

Mobile platforms

With Geth 1.5 focusing on library reusability, it is only natural to see how far we can push the envelope. There has been ample exploration of running (or at least interfacing with) Ethereum from browsers; our current release focused on doing so from desktop/server processes. The only missing piece of the puzzle was mobile devices… until now.

The 1.5 release of go-ethereum introduces our first experimental attempt at providing true Android and iOS library reusability of our codebase. This comes in the form of a native Java and ObjC wrapper around our code, bundled up officially as an Android archive and iOS Xcode framework. The former is more mature, while the latter requires some API polish due to the difficulty of automatically wrapping Go in ObjC/Swift code.

We’re also providing native dependencies for both platforms in the form of Maven Central packages (or Sonatype for develop snapshots) for Android, and CocoaPod packages for iOS. Since this is the very first time we’re pushing to these package managers, a few hurdles may arise, so we’ll make a separate announcement when both are reliable to use. Until then, we recommend sticking to the downloadable library bundles.

Experimental protocols

The 1.5 release of Geth is an attempted foundation for the future direction and features we’d like to work on and stabilize in upcoming releases. In our opinion, the best way to push the desired new features forward is to ship them as experimental (solely opt-in) protocols so that anyone can play with them and provide feedback. In light of this, we’ve merged in quite a few things we (and hopefully the community) had been looking forward to for quite some time.

Discovery v5

If you’ve played with joining the official testnet (Morden) or running a publicly reachable private testnet, you know it can sometimes take quite a long time to synchronize, as the node often seemingly just sits there doing nothing.

One of the root causes for testnet sync issues is that the peer discovery protocol cannot differentiate between machines running different blockchains, or even different network protocols altogether. The only way to find suitable peers is to connect to as many peers as possible and keep the ones that make sense. This approach works for the mainnet, but for smaller protocols (testnet, light clients, swarm, whisper) it’s like looking for a needle in a haystack of advertised peers.

Geth 1.5 contains a new version of the peer discovery protocol that extends the “shooting in the dark” approach with topic based peer-querying. In short, peers can actively search for other peers that have specifically advertised feature sets, protocols or configurations. This new discovery protocol should enable nodes to instantly find others of interest, even when there are only a handful among thousands of “boring” ones.
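
To make the idea of topic-based querying concrete, here is a toy in-memory sketch in Go. It does not implement the real v5 wire protocol: the `Peer` and `Registry` types and the topic strings are hypothetical names chosen purely for illustration.

```go
package main

import "fmt"

// Peer is a hypothetical stand-in for a discovered node.
type Peer struct {
	ID     string
	Topics []string // feature sets or protocols the peer advertises
}

// Registry maps an advertised topic to the peers that announced it.
type Registry map[string][]Peer

// Advertise records the peer under every topic it announces.
func (r Registry) Advertise(p Peer) {
	for _, t := range p.Topics {
		r[t] = append(r[t], p)
	}
}

// Search returns only the peers that advertised the given topic,
// instead of sampling random peers and hoping one of them fits.
func (r Registry) Search(topic string) []Peer {
	return r[topic]
}

func main() {
	reg := Registry{}
	reg.Advertise(Peer{ID: "a", Topics: []string{"eth/mainnet"}})
	reg.Advertise(Peer{ID: "b", Topics: []string{"eth/mainnet"}})
	reg.Advertise(Peer{ID: "c", Topics: []string{"les/testnet"}})

	for _, p := range reg.Search("les/testnet") {
		fmt.Println("found peer of interest:", p.ID)
	}
}
```

In the sketch, the single interesting peer is found directly, no matter how many unrelated peers have been advertised alongside it.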

Please note: the v5 discovery protocol is experimental, hence it is currently only enabled for light clients and light servers. This will allow us to gather valuable information and analyze its behavior/anomalies without influencing the main Ethereum P2P network in the slightest.

Light client

Blockchains are large beasts, there’s no denying it. Irrespective of optimizations, there will always be devices that are too resource-constrained to play an active role in blockchain networks (e.g. mobile phones, IoT devices). Although unexpected, we’ve seen this effect during the DoS attack, which caused nodes running on HDDs to have trouble syncing.

The only meaningful solution for running a blockchain on tiny embedded devices is for them to become light clients, where they do not bear the full burden of sustaining the network, but rather only the burden of their own operation. Not only is this beneficial for the small devices, but it also benefits the network as a whole, as it removes slow links and thus makes the core network smaller, tighter and more performant.

We’re proud to finally include an alpha version of a light client inside Geth 1.5. It can sync in minutes (or less) and consume only megabytes of disk space, but nonetheless fully interacts with the Ethereum blockchain and is even usable through the Mist browser (although there have been hiccups there).

You can run Geth as a light client via the --light flag. If you are maintaining a full node, feeling a bit generous, and aren’t running a sensitive production system, consider enabling the light server protocol to help out small devices in the network via the --lightserv 25 --lightpeers 50 flags (the first sets the percentage of system resources allowed to be used by light clients, and the second sets the number of light clients allowed to connect).

Swarm

Along with the consensus protocol, the Ethereum vision also consists of two other pillars: real time dark messaging (Whisper) and decentralized file storage (Swarm). All three are needed to create truly decentralized, high availability applications. Whisper is more or less available as an experimental protocol, but Swarm always looked like a far away dream.

With the arrival of 1.5, we’re very excited to include an initial proof-of-concept implementation of the Swarm protocol for developers to play with. It is included as a separate daemon process (and inherently executable binary), not embedded inside Geth. This allows users to run Swarm against any Ethereum client while also preventing any issues from interfering with the main node’s functionality.

RPC subscriptions

If you’ve written a more complex DApp against a Geth node (or any other Ethereum node for that matter), you may have noticed that polling the node for data over RPC can have adverse effects on performance. Not polling it, on the other hand, has adverse effects on user experience, since the DApp is less responsive to new events.

The issue is that polling for changes is a bad idea since most of the time there’s no change, only the possibility of one. A better solution, instead of querying the node for changes every now and then, is to subscribe to certain events and let the node send a notification when there’s a change. Geth 1.5 enables this via a new RPC subscription mechanism. Any DApp (or external process) can subscribe to a variety of events and leave it to the node to notify it when needed. Since this mechanism cannot work over plain HTTP (unlike IPC, where it can), the 1.5 release also includes support for running the RPC API via WebSockets.
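
As a rough illustration of what a subscription exchange looks like on the wire, the Go sketch below builds a subscribe request and decodes a pushed notification using only the standard library. The method names `eth_subscribe` and `eth_subscription` and the `newHeads` event follow our understanding of the Geth pub/sub API and should be checked against the RPC documentation; the sample notification payload is made up, and no connection to a node is opened.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is a JSON-RPC 2.0 call.
type request struct {
	JSONRPC string        `json:"jsonrpc"`
	ID      int           `json:"id"`
	Method  string        `json:"method"`
	Params  []interface{} `json:"params"`
}

// notification is the assumed shape of a message pushed by the node.
type notification struct {
	Method string `json:"method"`
	Params struct {
		Subscription string          `json:"subscription"`
		Result       json.RawMessage `json:"result"`
	} `json:"params"`
}

// subscribeRequest encodes a subscription call for the given event name.
func subscribeRequest(id int, event string) (string, error) {
	b, err := json.Marshal(request{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "eth_subscribe", // assumed method name
		Params:  []interface{}{event},
	})
	return string(b), err
}

// subscriptionID extracts the subscription identifier from a pushed message.
func subscriptionID(raw string) (string, error) {
	var n notification
	if err := json.Unmarshal([]byte(raw), &n); err != nil {
		return "", err
	}
	return n.Params.Subscription, nil
}

func main() {
	req, err := subscribeRequest(1, "newHeads")
	if err != nil {
		panic(err)
	}
	fmt.Println("send once:", req)

	// Made-up example of what the node might push later, unprompted.
	pushed := `{"jsonrpc":"2.0","method":"eth_subscription","params":{"subscription":"0xabc","result":{"number":"0x1"}}}`
	id, err := subscriptionID(pushed)
	if err != nil {
		panic(err)
	}
	fmt.Println("notification for subscription:", id)
}
```

The point of the pattern is that the request is sent a single time, after which the client only reads messages as they arrive, which is why a persistent transport such as IPC or WebSockets is needed.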

JavaScript tracing

During the DoS attacks in recent months, we spent an inordinate amount of time analyzing different transactions to better understand how they work. These efforts entailed trying to create various traces, looking at exactly what the EVM executes, and how that influences the underlying implementation.

Although Geth has featured an EVM tracing API endpoint for quite some time, it wasn’t very configurable. It ran the EVM bytecode and returned the executed opcodes, any errors that occurred, and optionally a diff of the stack, memory and storage modifications made by the transaction. This is useful, but expensive resource-wise both to create and to pass through the RPC layer.

With the 1.5 release, we’re introducing a new mechanism for tracing transactions, a JavaScript map-reduce construct. Instead of the usual trace options available until now, you will be able to specify two JavaScript methods: a mapper invoked for every opcode with access to all trace data, and a reducer invoked at the end of the trace to specify the final data to return to the caller.

The advantage of the JavaScript trace approach is that it’s executed inside the Go Ethereum node itself, so the tracer can access all information available for free without performance impact, and can collect only what it needs while discarding everything else. It is also a lot simpler to write custom trace code instead of having to parse some predefined output format.
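
The real tracers are written in JavaScript and run inside the node, but the map-reduce shape itself can be sketched in Go. In the sketch below, the `Step` and `Tracer` types, the `Trace` driver, and the gas figures are all hypothetical illustrations, not the actual tracing API.

```go
package main

import "fmt"

// Step is a hypothetical, simplified view of one executed EVM instruction.
type Step struct {
	Op  string // opcode name
	Gas uint64 // illustrative gas charged for this step
}

// Tracer pairs a per-step mapper with a final reducer.
type Tracer struct {
	gasByOp map[string]uint64
}

// NewTracer returns a tracer with empty accumulated state.
func NewTracer() *Tracer {
	return &Tracer{gasByOp: make(map[string]uint64)}
}

// Map is invoked for every opcode and keeps only what it needs.
func (t *Tracer) Map(s Step) {
	t.gasByOp[s.Op] += s.Gas
}

// Reduce is invoked once at the end and decides what the caller receives.
func (t *Tracer) Reduce() map[string]uint64 {
	return t.gasByOp
}

// Trace drives the tracer over a recorded sequence of steps.
func Trace(steps []Step, t *Tracer) map[string]uint64 {
	for _, s := range steps {
		t.Map(s)
	}
	return t.Reduce()
}

func main() {
	steps := []Step{{"PUSH1", 3}, {"SLOAD", 200}, {"PUSH1", 3}, {"SLOAD", 200}}
	res := Trace(steps, NewTracer())
	fmt.Println("gas spent in PUSH1:", res["PUSH1"])
	fmt.Println("gas spent in SLOAD:", res["SLOAD"])
}
```

Only the small summary map leaves the tracer, rather than a full per-step dump that would then have to cross the RPC layer.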

Vendored dependencies

Until the 1.4.x release cycles of Geth, the go-ethereum codebase used the godep tool as its dependency manager because Go itself did not provide a viable alternative other than manually copying dependencies or relying on upstream repositories to not break over time.

This situation was unfortunate due to a number of drawbacks: a) building go-ethereum required both a custom tool as well as knowing the quirks of said tool, b) dependency updates via godep were very painful due to them dirtying the local workspaces and not being able to work in temporary folders, and c) using go-ethereum as a library was extremely hard as dependencies weren’t an integral part of the Go workflow.

With the Geth 1.5 release, we’ve switched over to the officially recommended way of vendoring dependencies (fully supported starting with Go 1.6), namely by placing all external dependencies into locations native to the Go compiler and toolchain (vendor), and switching to a different dependency management tool to more cleanly handle our requirements (called trash).

From an outside perspective, the main benefit is no longer having to muck around with some random dependency management tool that we happen to use, whether when building go-ethereum or when using it as a library in other projects. Now you can stick to the plain old Go tools and everything will work out of the box!

Build infrastructure

From the beginning of the Ethereum project, all official clients depended on a build infrastructure that was built and maintained by @caktux based on Amazon EC2 instances, Ansible and a sizeable suite of Python scripts (called the Ethereum Buildbot).

Initially, this infrastructure worked well when the original implementations all shipped a handful of major platform, architecture and deliverable bundles. However, as time passed and projects started to focus on smaller unique builds, the maintenance burden grew and the buildbot began to crumble. When the maintainer left the Ethereum project, it became clear that we needed to transition to new build flows, but creating them was a non-trivial effort.

One of the major milestones of the Geth 1.5 release is the complete transition from the old build infrastructure to one that is fully self-contained within our repositories. We moved all builds on top of the various continuous integration services we rely on (Travis, AppVeyor, CircleCI), and implemented all the build code ourselves as an organic part of the go-ethereum sources.

The end result is that we can now build everything the go-ethereum project needs without depending on particular service providers or particular code outside of the team’s control. This will ensure that go-ethereum won’t have strange missing packages or out-of-date package managers.

Build artifacts

Starting with Geth 1.5, we are distributing significantly more build artifacts than before. Our two major deliverables are archives containing Geth only, and bundles containing Geth and any other tools deemed useful for developers and/or users of the Ethereum platform. These artifacts are pre-compiled for every stable release as well as every single develop commit to a very wide variety of targets: Linux (386, amd64, arm-5, arm-6, arm-7 and arm64), macOS (amd64) and Windows (386, amd64).

One of our feature updates is library bundles for using go-ethereum in mobile projects. On Android we’re providing official builds for .aar archives containing binaries for 386, amd64, arm-7 and arm64, covering all popular mobiles as well as local simulator builds. On iOS we’re providing official Xcode Framework bundles containing binaries for amd64, arm-7 and arm64, covering all iPhone architectures as well as local simulator builds.

Besides the standalone binary archives, we’re also distributing all of the above in the form of Homebrew bundles for macOS, Launchpad PPA packages for Ubuntu, NSIS installers for Windows (Chocolatey distribution still has further administrative hurdles to overcome), Maven Central dependencies for Android and CocoaPods dependencies for iOS!

Digital signatures

For a long time our binary distributions were a bit chaotic, sometimes providing checksums, sometimes not, depending on who made the release packages and how much time we had to tie up loose ends. The lack of checksums often led to users asking how to verify bundles floating around the internet, and more seriously it resulted in a number of fake developer and project clones popping up that distributed malware.

To sort this out once and for all, from Geth 1.5 and on, all our officially built archives will be digitally signed via a handful of OpenPGP keys. We will not rely on checksums any more to prove authenticity of our distributed bundles, but will ask security-conscious users to verify any downloads via their attached PGP signatures. You can find the list of signing keys we use at our OpenPGP Signatures section.

Repository branches

A bit before the Frontier release last July, we switched to a source repository model where the master branch contained the latest stable code and develop contained the bleeding edge source code we were working on.

This repository model however had a few drawbacks: a) people new to the project wanting to contribute always started hacking on master, only to realize later that their work was based on something old; b) every time a major release was made, master needed to be force-pushed, which looked pretty bad from a repository history perspective; c) developers trying to use the go-ethereum codebase in their own projects rarely realized there was a more advanced branch available.

Beginning with Geth 1.5, we will no longer maintain a separate master branch for latest-stable and a develop branch for latest-edge. Instead, master will become the default and development branch of the project, and each stable release generation will have its own indefinitely living branch (e.g. release/1.4, release/1.5). The release branches will allow people to depend on older generations (e.g. 1.4.x) without running into surprising git issues caused by history rewrites, and having master as the default development branch will allow developers to use the latest code.