Running an Ethereum Full Node on a Raspberry Pi 4 (model B)

Introduction

My wife recently gave me the latest Raspberry Pi 4 (model B) with 4GB of RAM, so I was really excited to try syncing an Ethereum full node on the number one Single-Board Computer (SBC).

Syncing Ethereum has always been a pain point for many people because it is a complex process with multiple options, including different verification modes that require different setups, can cost a lot of money and can take weeks to sync.

Here is a summary of the different options available to synchronise the Ethereum blockchain with Geth (Go-Ethereum):

Blockchain sync mode [--syncmode]:

full sync: A full sync downloads all the data (block headers and block bodies), processes the entire blockchain one link at a time, and replays all transactions that ever happened in history (transaction processing and PoW verification). This method is the most traditional and stable, but it can take a very long time (up to a few weeks) and requires a more powerful machine. At the end of the process, the node is a full node.

fast sync: A fast sync also downloads all the data (block headers and block bodies) but trades processing power for bandwidth usage. Instead of processing all the transactions that ever happened, fast sync downloads all the transaction receipts and the entire recent state database, and performs a PoW verification.
When the chain reaches a recent state (head - 1024 blocks), Geth switches to full sync mode, imports the remaining blocks and processes them as in the classical (full) sync to obtain a full node.

light sync: Light mode syncs directly to the last few blocks and does not store the whole blockchain in its database. Unlike full and fast sync, the result is not a full node: it stores only the block headers and depends on full nodes for everything else. This approach is less secure and better suited to IoT and mobile devices, but it uses only about 100MB of space.

Blockchain garbage collection mode [--gcmode]:
Garbage collection is used to discard old state tries and save some space.

--gcmode full enables the garbage collection to keep in memory only the latest 128 tries. This saves a lot of space and it takes less than 200 GB at this stage (Sept 2019) to run a full node in this setup.

--gcmode archive disables garbage collection and keeps all the historical state data, block after block, since the genesis block (bear in mind that this takes more than 2.3 TB of space). Very few people (such as block explorers) need an archive node.

In this guide we will follow the second synchronisation mode, fast (with full garbage collection), to run a full node on a Raspberry Pi 4. Some people might ask what the benefits of running your own node are. Here are some examples:

You will own a trusted Ethereum stack you can rely on to manage your assets and send transactions to the network yourself (remote nodes are generally reliable but are controlled by 3rd parties and typically throttle heavy usage).

You can help secure the network; the more independent nodes running the more copies there are of the blockchain and the more resilient it is.

You want to make the network faster and more secure; the more nodes the lower the latency in sharing blocks and the more copies of the blockchain that exist.

It is fun!

Hardware

We will start with an example setup using a Raspberry Pi 4, an SSD and all the necessary components. You can also try alternative, equivalent solutions, which should work as long as they meet the following requirements:

Memory: 4GB RAM DDR3

Fast SSD (an NVMe SSD is recommended if the board has a PCIe interface, which is not the case with the RPi4)

SD Card

Disk SSD

In order to store the large Ethereum state database, which requires very high Disk IO performance, we connect a Samsung SSD T5 (500GB) to the board via USB3.0.

It is recommended that you use at least a 500GB SSD because the actual size of the Ethereum mainnet after a fast sync is about 200GB. That should give you a few years before rebuilding the whole thing on a larger disk.

We enabled SSH by default during step 5, so it is possible to connect to the system remotely from a Linux terminal (or using PuTTY if you use Windows).

$ ssh pi@192.168.0.38
pi@192.168.0.38's password: raspberry
Linux raspberrypi 4.19.57-v7l+ #1244 SMP Thu Jul 4 18:48:07 BST 2019 armv7l
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
SSH is enabled and the default password for the 'pi' user has not been changed.
This is a security risk - please login as the 'pi' user and type 'passwd' to set a new password.

10. Change the default password of user pi

The default password configured by Raspbian is well known, so it is highly recommended to change it to something else:

The default hostname assigned by Raspbian is raspberrypi. This can cause confusion if you have multiple devices, so it is recommended to rename the machine with a more specific hostname related to its purpose.

Open the file /etc/hostname and replace its content with geth:

$ sudo vi /etc/hostname

Then edit /etc/hosts and replace the line 127.0.1.1 raspberrypi with 127.0.1.1 geth

$ sudo vi /etc/hosts
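
If you prefer not to edit the two files by hand, the same change can be scripted. This is only a minimal sketch: the helper name rename_host is hypothetical, it assumes GNU sed (as shipped with Raspbian), and it takes the file paths as arguments so that it can be tried on copies before touching /etc.

```shell
#!/bin/sh
# rename_host OLD NEW HOSTNAME_FILE HOSTS_FILE
# Hypothetical helper: writes NEW into the hostname file and rewrites the
# 127.0.1.1 entry of the hosts file from OLD to NEW.
rename_host() {
    old="$1"; new="$2"; hostname_file="$3"; hosts_file="$4"
    printf '%s\n' "$new" > "$hostname_file"
    # Only touch the 127.0.1.1 line, and only when it names the old host.
    # The original separator (spaces or tabs) is kept via the \1 group.
    sed -i "s/^\(127\.0\.1\.1[[:space:]]\{1,\}\)$old[[:space:]]*\$/\1$new/" "$hosts_file"
}

# Example (run as root on the Pi, then reboot):
#   rename_host raspberrypi geth /etc/hostname /etc/hosts
```

Every other line of the hosts file, including the 127.0.0.1 localhost entry, is left untouched.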

12. Upgrade the OS

Upgrade the system in order to get the latest patches.

$ sudo apt-get update && sudo apt-get upgrade

13. Configure a static IP

We are now going to configure the static IP 192.168.0.24 so the router won't assign a different IP each time the Raspberry Pi restarts.
You can do this via your router's DHCP configuration, directly in the network configuration of the machine, or both.

a. Assign a static private IP address to Raspberry Pi with a router

Go to your network router console and configure the static IP in the DHCP section.
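
For the other option, configuring the address on the machine itself, Raspbian uses dhcpcd, which reads static settings from /etc/dhcpcd.conf. The sketch below only generates the stanza. The interface name eth0, the /24 netmask, the router address 192.168.0.1 and the use of the router as DNS server are all assumptions to adapt to your network, and static_ip_stanza is a hypothetical helper name.

```shell
#!/bin/sh
# static_ip_stanza IFACE ADDRESS ROUTER
# Prints a dhcpcd.conf stanza that pins ADDRESS (CIDR notation) on IFACE.
# Assumption: the router also acts as the DNS server.
static_ip_stanza() {
    cat <<EOF
interface $1
static ip_address=$2
static routers=$3
static domain_name_servers=$3
EOF
}

# Example: append the stanza to /etc/dhcpcd.conf, then reboot.
#   static_ip_stanza eth0 192.168.0.24/24 192.168.0.1 | sudo tee -a /etc/dhcpcd.conf
```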

Mount the SSD

In the second part of this guide, we will mount an SSD connected to one of the two USB3.0 ports.

As explained, only SSDs are fast enough (I/O speed) to sync Geth to the Ethereum mainnet.

1. Plug the SSD into the USB3.0 (blue) port

2. Find the disk name (drive)

Run the command fdisk -l to list all the disks connected to the system (the output also includes the RAM disks) and identify the SSD. In our case, the disk with a size of 465.6 GiB and the model name Portable SSD T5, located at /dev/sda, is our SSD.

Geth can consume a lot of memory during the syncing process so it is highly suggested that you create a swap file (overflow RAM) to prevent any kind of OutOfMemory error. It is also strongly advised to put the swap file on the fastest disk, which is the SSD in our case.

Edit the file /etc/dphys-swapfile

Replace CONF_SWAPSIZE=100 with CONF_SWAPSIZE=8192 to allocate an 8GB swap

Replace CONF_SWAPFILE=/var/swap with CONF_SWAPFILE=/mnt/ssd/swap.file to locate the swap on the SSD
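
The two replacements can also be applied with sed. This is a minimal sketch, assuming GNU sed and that both settings already appear in the file (commented out or not). The helper name configure_swap is hypothetical, and the file path is a parameter so the edit can be rehearsed on a copy first.

```shell
#!/bin/sh
# configure_swap FILE SIZE_MB SWAPFILE
# Rewrites the CONF_SWAPSIZE and CONF_SWAPFILE lines (commented out or not)
# of a dphys-swapfile configuration file. Lines that do not exist are not
# added.
configure_swap() {
    sed -i \
        -e "s|^#\{0,1\}CONF_SWAPSIZE=.*|CONF_SWAPSIZE=$2|" \
        -e "s|^#\{0,1\}CONF_SWAPFILE=.*|CONF_SWAPFILE=$3|" \
        "$1"
}

# Example (run as root on the Pi), followed by a restart of the swap service:
#   configure_swap /etc/dphys-swapfile 8192 /mnt/ssd/swap.file
#   /etc/init.d/dphys-swapfile restart
```

Depending on the dphys-swapfile version, the CONF_MAXSWAP setting may cap the swap at 2048 MB. If the resulting swap is smaller than expected, that setting is worth checking.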

Below 50MB/s (write/read), I wouldn't recommend trying to sync a Geth node because you might never be able to reach the head and complete the sync.

Remove /mnt/ssd/deleteme.dat after the performance test.
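
A rough way to run such a performance test is dd, which prints the transfer rate when it finishes. The sketch below is an assumption about how the test could look, not necessarily the exact commands used for this guide. The helper name disk_speed_test is hypothetical, the 2 GiB default size is arbitrary, and bs=1M assumes GNU dd.

```shell
#!/bin/sh
# disk_speed_test FILE [COUNT_MB]
# Writes COUNT_MB mebibytes of zeros to FILE, reads them back, then deletes
# the file. dd reports the throughput of each pass on stderr.
disk_speed_test() {
    file="$1"; count="${2:-2048}"
    # conv=fsync makes dd flush to disk before reporting the write speed.
    dd if=/dev/zero of="$file" bs=1M count="$count" conv=fsync || return 1
    # The read pass may be served partly from the page cache and therefore
    # look faster than the disk really is.
    dd if="$file" of=/dev/null bs=1M || return 1
    rm -f "$file"
}

# Example on the mounted SSD:
#   disk_speed_test /mnt/ssd/deleteme.dat
```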

Other configuration

Port-forwarding

In order to communicate correctly with other peers, Geth needs to accept connections on port 30303 from outside. You will need to configure your firewall to allow incoming requests on port 30303 to reach the machine, via port-forwarding or port-triggering.

Example - VirginMedia Hub (Port-forwarding)

Required software

Install the following software which might be needed during the procedure.

$ sudo apt-get install git sysstat -y

Recommended options to stabilise the node

Decrease the RAM allocated to the GPU

Edit /boot/config.txt and add or edit the following line:

gpu_mem=16

Enable the 64-bit kernel

Edit /boot/config.txt and add or edit the following line:

arm_64bit=1

Install and configure Geth

Now our system is ready for you to install and configure Geth.

a. Install and configure Golang

Download the archive in ~/download

For a Raspberry Pi 4, we need to download Golang for Architecture ARMv6: go1.13.1.linux-armv6l.tar.gz

c. Configure and run Geth

We first need to configure Geth to synchronise in fast mode using the flag --syncmode fast.

Geth also has a --cache option which specifies the amount of RAM the client can use. Raspberry Pi 4 has 4GB RAM so we can use --cache 256 without running into Out Of Memory errors.

By default, all the data are stored in ~/.ethereum/geth/ located on the SD Card. We want to store the Ethereum data on the SSD. For that, we can use the option --datadir /mnt/ssd/ethereum to tell Geth to read/write the datastore on the SSD.
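
Putting the three options together gives a single command line. The sketch below only assembles and prints it, so the values can be reviewed before starting the node. The variable names are illustrative.

```shell
#!/bin/sh
# Values taken from the text above.
SYNCMODE=fast                 # --syncmode: fast sync
CACHE_MB=256                  # --cache: memory allowance in MB
DATADIR=/mnt/ssd/ethereum     # --datadir: chain data on the SSD

GETH_CMD="geth --syncmode $SYNCMODE --cache $CACHE_MB --datadir $DATADIR"
echo "$GETH_CMD"

# To start syncing in the foreground, run the printed command, for example:
#   $GETH_CMD
```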

d. Configure Geth as a service (systemd)

We want to run Geth as a service so that the process keeps running in the background after we close the session and recovers from crashes automatically.
To do this, we need to install a systemd service.
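
A unit file for this could look like the sketch below. The binary path /usr/local/bin/geth, the pi user, the 30-second restart delay and the unit name geth.service are assumptions, since they depend on how Geth was installed. The helper name write_geth_unit is hypothetical.

```shell
#!/bin/sh
# write_geth_unit FILE
# Writes a minimal systemd unit that starts Geth with the options used in
# this guide and restarts it whenever it exits.
write_geth_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=Geth Ethereum node
After=network-online.target
Wants=network-online.target

[Service]
User=pi
ExecStart=/usr/local/bin/geth --syncmode fast --cache 256 --datadir /mnt/ssd/ethereum
Restart=always
RestartSec=30

[Install]
WantedBy=multi-user.target
EOF
}

# Example (run as root on the Pi):
#   write_geth_unit /etc/systemd/system/geth.service
#   systemctl daemon-reload
#   systemctl enable --now geth
#   journalctl -u geth -f      # follow the logs
```

Restart=always is what provides the automatic recovery from crashes, and enabling the unit makes it start again after a reboot.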

Syncing

We've installed and configured Geth, so now we have to wait a few days until the sync ends. In the meantime, let me share some insights about the syncing process and what's going on under the hood.

First of all, in fast sync mode, the syncing process is composed of two phases running in parallel: block sync and state trie download. Both phases need to complete before we have a full node and Geth switches to full mode, where every transaction is executed and verified.

The block sync downloads all the block information (header, transactions). This phase uses a lot of CPU and space to store all the data. You can observe this process in the logs with the mention of "Importing block headers and block receipts".

However, in fast mode no transactions are executed, so we do not have any account state available (i.e. balances, nonces, smart contract code and data). Geth needs to download the state trie and cross-check it against the latest block. This phase is called state trie download and usually takes longer than the block sync.
This phase is described in the logs by the following statements:

The charts below show some metrics during the syncing process. We can observe that once the block sync has finished, we are storing less data and consuming less CPU and memory. However, Geth is still downloading and writing the state entries at a high rate.

During the process, you will observe some strange behaviours which are common to many people.

Between 64 and 128 blocks behind
After the block sync phase has finished and during the state trie download phase, the block number will keep oscillating between 64 and 128 blocks behind the latest block mined on Ethereum.
This is normal until the state trie download phase ends and your node is fully synced.
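
To see how far behind the node is, Geth's JavaScript console exposes eth.syncing, which reports currentBlock and highestBlock while a sync is in progress. The helper below is a hypothetical sketch (sync_lag is not part of Geth) that turns those two numbers into a lag and flags whether it falls in the 64 to 128 range described above.

```shell
#!/bin/sh
# sync_lag CURRENT HIGHEST
# Prints how many blocks CURRENT is behind HIGHEST and whether that lag is
# within the 64-128 window that is normal during the state trie download.
sync_lag() {
    lag=$(( $2 - $1 ))
    if [ "$lag" -ge 64 ] && [ "$lag" -le 128 ]; then
        echo "$lag blocks behind (expected during state download)"
    else
        echo "$lag blocks behind"
    fi
}

# The two values can be read from the running node, for example
# (assuming the data directory used in this guide):
#   geth attach --exec 'eth.syncing' /mnt/ssd/ethereum/geth.ipc
```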

Pivot became stale
If you can’t download all the state in 30 minutes (spoiler alert: you can’t), then you need to “pivot.” Pivoting means switching to a new launch block, and starting to sync again. Pivoting doesn’t mean starting from scratch, but it does increase the time spent downloading and verifying state.

Dropping peer
Geth is connected to multiple peers in order to retrieve the information necessary to run a full node. However, a peer can sometimes be dysfunctional, which is why Geth automatically drops a peer when it detects an anomaly.

The website ethstats shows the latest state of the Ethereum mainnet in real time, so we can compare it with our node to see whether we are synced.

Conclusion

In conclusion, this article shows how simple and affordable it is to run a full node on the Ethereum mainnet and contribute to the good health of the network.

Special thanks

This guide began from a discussion about how hard it is to keep an Ethereum node stable and synced on a Single-Board Computer. So thank you for the interesting discussions and for your help in the last few weeks to make this experiment a success!