This short tutorial will show you how to upgrade an ODROID-XU4 running OpenMediaVault 2 to OpenMediaVault 3.
OMV 3 (Erasmus) uses Debian 8 (Jessie) instead of 7 (Wheezy). So alongside OMV you will get new packages as well.

Make a backup of your current installation!

Make sure your system is up to date by running:
    apt-get update && apt-get upgrade

Reboot

Uninstall all plugins including the OMV-Extras package

Before you upgrade you should change your boot.ini and /etc/fstab.
Mount your boot partition like this:

    mkdir /media/boot
    mount /dev/mmcblk0p1 /media/boot

Then edit your /media/boot/boot.ini and change ro to rw in the setenv line. The line should look like below:

It’s time for another small blog about the ODROID-XU4.
This is just a quick tip to improve your network and USB performance even more. It will optimize your hardware interrupts (IRQ) affinity on your ODROID-XU4.
This guide is for the 3.10.y kernel and Debian 8. For other kernel versions the interrupts may have different numbers.

Description

Whenever a piece of hardware, such as disk controller or ethernet card, needs attention from the CPU, it throws an interrupt. The interrupt tells the CPU that something has happened and that the CPU should drop what it’s doing to handle the event. In order to prevent multiple devices from sending the same interrupts, the IRQ system was established where each device in a computer system is assigned its own special IRQ so that its interrupts are unique.

Starting with the 2.4 kernel, Linux has gained the ability to assign certain IRQs to specific processors (or groups of processors). This is known as SMP IRQ affinity, and it allows you to control how your system will respond to various hardware events. It allows you to restrict or repartition the workload that your server must do so that it can do its job more efficiently. (Source)
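The kernel exposes the affinity of each IRQ in two equivalent forms: a CPU list (smp_affinity_list, e.g. 4-6) and a hexadecimal bitmask (smp_affinity, where bit N stands for CPU N). As a small sketch of how the two relate, the helper below converts a CPU list into the matching mask. The function name cpu_list_to_mask is made up for this illustration.

```shell
#!/bin/sh
# Convert a CPU list such as "4-6" or "0,4-5" into the hexadecimal bitmask
# format used by /proc/irq/<n>/smp_affinity (bit N set = CPU N allowed).
# cpu_list_to_mask is a hypothetical helper name, not an existing tool.
cpu_list_to_mask() {
    mask=0
    old_ifs=$IFS
    IFS=,                              # split the list on commas
    for part in $1; do
        first=${part%-*}               # "4-6" -> 4, "5" -> 5
        last=${part#*-}                # "4-6" -> 6, "5" -> 5
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            mask=$(( mask | (1 << cpu) ))
            cpu=$(( cpu + 1 ))
        done
    done
    IFS=$old_ifs
    printf '%x\n' "$mask"
}

cpu_list_to_mask 4-6    # prints 70
```

So writing 4-6 to smp_affinity_list has the same effect as writing 70 to smp_affinity for that IRQ.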

It’s always a good idea to spread your interrupts evenly across all CPUs. In my case I want to achieve the best performance possible. Therefore I want to use the faster A15 CPU cluster for all important interrupt handling.

There are basically 3 different interrupts on a headless ODROID-XU4 server you should take into consideration:

the USB2 port

the first USB3 port

the second USB3 port (the 1 Gigabit ethernet adapter is connected to this one)

By default all 3 interrupts for these devices are handled by CPU0, which is an A7 core, as you can see in the output below:

    lscpu -e
    CPU SOCKET CORE ONLINE MAXMHZ    MINMHZ
    0   0      0    yes    1400.0000 200.0000
    1   0      1    yes    1400.0000 200.0000
    2   0      2    yes    1400.0000 200.0000
    3   0      3    yes    1400.0000 200.0000
    4   1      4    yes    2000.0000 200.0000
    5   1      5    yes    2000.0000 200.0000
    6   1      6    yes    2000.0000 200.0000
    7   1      7    yes    2000.0000 200.0000

    grep -E 'CPU0|usb' /proc/interrupts
             CPU0     CPU1     CPU2     CPU3     CPU4     CPU5     CPU6     CPU7
    103:        1        0        0        0        0        0        0        0  GIC  ehci_hcd:usb1, ohci_hcd:usb2
    104:    12853        0        0        0        0        0        0        0  GIC  xhci-hcd:usb3
    105:     7489        0        0        0        0        0        0        0  GIC  xhci-hcd:usb5
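Because the IRQ numbers can differ between kernel versions, it can help to look them up rather than hard-code them. A minimal sketch, assuming the USB controllers show up with "usb" in their /proc/interrupts description as in the output above. The function name list_usb_irqs is hypothetical.

```shell
#!/bin/sh
# Print the IRQ number of every interrupt line whose description mentions "usb".
# The input file defaults to /proc/interrupts; another path can be passed
# to try the function against a saved copy.
list_usb_irqs() {
    awk '/usb/ { sub(/:$/, "", $1); print $1 }' "${1:-/proc/interrupts}"
}
```

On the system shown above, list_usb_irqs would print 103, 104 and 105, one per line.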

IRQ Tuning

First of all make sure that automatic IRQ balancing is disabled:

    systemctl disable irqbalance

For Debian, add the following to your /etc/rc.local file to pin the interrupt handling to A15 cores 4-6 (CPU4-6):

    # Move USB and network irqs to A15 CPU cluster
    # usb2
    echo 6 > /proc/irq/103/smp_affinity_list
    # usb3
    echo 5 > /proc/irq/104/smp_affinity_list
    # network (usb3)
    echo 4 > /proc/irq/105/smp_affinity_list
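If an IRQ number does not exist on a given kernel, a plain echo fails, and since rc.local on Debian usually runs with sh -e, this can abort the rest of the file. The following is a hedged sketch of a slightly more defensive variant. pin_irq is a hypothetical helper, and the optional third argument only exists so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# pin_irq IRQ CPULIST [BASE]
# Writes CPULIST to BASE/IRQ/smp_affinity_list and reports a missing IRQ
# instead of failing silently. BASE defaults to /proc/irq.
pin_irq() {
    irq=$1
    cpus=$2
    base=${3:-/proc/irq}
    target="$base/$irq/smp_affinity_list"
    if [ ! -w "$target" ]; then
        echo "pin_irq: $target not writable, skipping IRQ $irq" >&2
        return 1
    fi
    echo "$cpus" > "$target"
}

# Usage in /etc/rc.local (IRQ numbers taken from the 3.10.y output above):
#   pin_irq 103 6 || true   # usb2
#   pin_irq 104 5 || true   # usb3
#   pin_irq 105 4 || true   # network (usb3)
```

The "|| true" keeps rc.local running even if one of the IRQs is not present.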

After a reboot and some file transfer you should see something like this:

    grep -E 'CPU0|usb' /proc/interrupts
             CPU0     CPU1     CPU2     CPU3     CPU4     CPU5     CPU6     CPU7
    103:        1        0        0        0        0        0        0        0  GIC  ehci_hcd:usb1, ohci_hcd:usb2
    104:     8355        0        0        0        0   249689        0        0  GIC  xhci-hcd:usb3
    105:      436        0        0        0  4396187        0        0        0  GIC  xhci-hcd:usb5

Note the numbers for CPU4 and CPU5. CPU0 handled some initial interrupts during the boot, because rc.local isn’t executed immediately.

Benchmarks

Tuning without measuring performance before and afterwards is useless. So, here are some iperf results:

    # without irq tuning
    iperf -c 192.168.0.2 -i 2 -r
    ------------------------------------------------------------
    Server listening on TCP port 5001
    TCP window size: 85.3 KByte (default)
    ------------------------------------------------------------
    ------------------------------------------------------------
    Client connecting to 192.168.0.2, TCP port 5001
    TCP window size:  272 KByte (default)
    ------------------------------------------------------------
    [  5] local 192.168.0.121 port 57696 connected with 192.168.0.2 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  5]  0.0- 2.0 sec   198 MBytes   830 Mbits/sec
    [  5]  2.0- 4.0 sec   198 MBytes   830 Mbits/sec
    [  5]  4.0- 6.0 sec   201 MBytes   842 Mbits/sec
    [  5]  6.0- 8.0 sec   199 MBytes   835 Mbits/sec
    [  5]  8.0-10.0 sec   199 MBytes   835 Mbits/sec
    [  5]  0.0-10.0 sec   995 MBytes   834 Mbits/sec
    [  4] local 192.168.0.121 port 5001 connected with 192.168.0.2 port 41073
    [  4]  0.0- 2.0 sec   206 MBytes   865 Mbits/sec
    [  4]  2.0- 4.0 sec   207 MBytes   870 Mbits/sec
    [  4]  4.0- 6.0 sec   210 MBytes   881 Mbits/sec
    [  4]  6.0- 8.0 sec   211 MBytes   883 Mbits/sec
    [  4]  8.0-10.0 sec   210 MBytes   882 Mbits/sec
    [  4]  0.0-10.0 sec  1.02 GBytes   876 Mbits/sec

    # with irq tuning
    iperf -c 192.168.0.2 -i 2 -r
    ------------------------------------------------------------
    Server listening on TCP port 5001
    TCP window size: 85.3 KByte (default)
    ------------------------------------------------------------
    ------------------------------------------------------------
    Client connecting to 192.168.0.2, TCP port 5001
    TCP window size:  289 KByte (default)
    ------------------------------------------------------------
    [  5] local 192.168.0.121 port 57702 connected with 192.168.0.2 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  5]  0.0- 2.0 sec   224 MBytes   941 Mbits/sec
    [  5]  2.0- 4.0 sec   223 MBytes   936 Mbits/sec
    [  5]  4.0- 6.0 sec   223 MBytes   935 Mbits/sec
    [  5]  6.0- 8.0 sec   223 MBytes   937 Mbits/sec
    [  5]  8.0-10.0 sec   223 MBytes   934 Mbits/sec
    [  5]  0.0-10.0 sec  1.09 GBytes   936 Mbits/sec
    [  4] local 192.168.0.121 port 5001 connected with 192.168.0.2 port 41076
    [  4]  0.0- 2.0 sec   219 MBytes   920 Mbits/sec
    [  4]  2.0- 4.0 sec   220 MBytes   924 Mbits/sec
    [  4]  4.0- 6.0 sec   220 MBytes   924 Mbits/sec
    [  4]  6.0- 8.0 sec   220 MBytes   924 Mbits/sec
    [  4]  8.0-10.0 sec   220 MBytes   924 Mbits/sec
    [  4]  0.0-10.0 sec  1.08 GBytes   923 Mbits/sec

Up to 100 Mbit/s faster. Not bad for such an easy fix 🙂
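To put the 10-second averages into relation, the relative gain can be worked out with a one-line awk calculation. This is just arithmetic on the figures from the iperf runs above. gain_percent is a hypothetical helper name.

```shell
#!/bin/sh
# Relative gain in percent between two throughput figures (Mbit/s).
gain_percent() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

gain_percent 834 936   # first direction:  prints 12.2
gain_percent 876 923   # reverse direction: prints 5.4
```

That is roughly 12% more throughput in one direction and about 5% in the other.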

Read my post in the ODROID forum to get some more information and tuning tips.

By default the Mali-T628 GPU inside the ODROID-XU4 will run at maximum speed (600MHz) the whole time.
This doesn’t consume that much energy if there is no load on the GPU. But nevertheless you can tune this a little bit if you don’t need the GPU at all.

Your kernel needs the following commit: mali: restore sysfs entries
Newer 3.10 ODROID-XU4 kernels should have this activated. I tested this with my own custom-built kernel. You can grab it here. Source.
To flash this kernel you can use a script like this one.

To lock the GPU frequency to the lowest frequency possible (177MHz) do the following. This will automatically lower the voltage as well (see cat /sys/devices/11800000.mali/vol before and after the change).

I did some measurements with a power meter and this change reduced the power consumption by 0.7 – 0.8W. At first glance, this doesn’t sound like much, but it’s a reduction of about 20% compared to the idle power consumption of an ODROID-XU4 with the ondemand governor, which is just 3-4W!
The SOC will be 1-3°C cooler as well 🙂 Perfect for headless servers.

Preface

Nearly 3 years ago I bought myself a PogoplugV2 (see Post). It is still an awesome device for this price and has worked without any issue the whole time.

However it is time for an upgrade. USB 2.0 was a little bit too slow for me (~30MB/s). In addition I have a lot of external USB 3.0 HDDs and it is time to use their full potential. So there are 4 key points a device needs to become my next NAS (Network-attached storage) and home server system:

reasonably fast ARM processor with at least 2 cores (I prefer ARM over x86 for this use case because of its low energy consumption)

Gigabit ethernet

at least 2 USB 3.0 ports because I want to attach 2 active 4-port USB hubs. Almost all USB hubs >4 ports are cascaded 4 port hubs. This is quite bad because it can cause a lot of compatibility and of course performance issues.

altogether <150€. I do not want to pay 300-400€ for a simple NAS with USB disks…

ODROID-XU4

After a bit of research I bought an ODROID-XU4.
Let’s look at the specs:

Samsung Exynos5422 Cortex™-A15 2Ghz and Cortex™-A7 Octa core CPUs

Mali-T628 MP6(OpenGL ES 3.0/2.0/1.1 and OpenCL 1.1 Full profile)

2Gbyte LPDDR3 RAM PoP stacked

eMMC5.0 HS400 Flash Storage

2 x USB 3.0 Host, 1 x USB 2.0 Host

Gigabit Ethernet port

HDMI 1.4a for display

Size : 82 x 58 x 22 mm

Price: ~80 € + PSU ~8 € + Case ~8€ = ~96 €

I do not need the GPU and display output but whatever… the price is quite good for this performance. It is less expensive in the US than it is in Germany. But that is always the case 😉

Additional equipment

I am using a microSD card for the OS because eMMC is quite expensive. Boot times and program loading times are not that important for my use case.
With two additional 4 port active USB 3.0 hubs I have 8 USB 3.0 ports and 1 USB 2.0 port. At the moment 5 disks are connected with a total of 7.5 TB storage.
Connecting a 2 TB disk to each USB 3.0 port would be 16 TB storage which should be good enough for some time.

OS Choices

The ODROID-XU4 SOC is the same as that of its predecessor, the ODROID-XU3. That is why they share a common kernel and OS images are compatible.
Nevertheless, the Exynos5422 SOC is not fully integrated in the mainline kernel yet. That is why you have to use a custom kernel from Hardkernel. But that is not a big issue because there are quite a few OS choices with the custom kernel, like Android (ofc pretty useless for a NAS), Ubuntu 15.04, Arch Linux, Fedora, Kali Linux, …

Although I really like Arch Linux, I have chosen a different path this time. There is OpenMediaVault (OMV) for some Odroids. I thought: hey, let’s give it a try, the web interface looks quite nice.
After a few days I can say I really like it. The web interface is really good and looks modern.
OMV is running on Debian Wheezy 7.9.

Basic configuration

Colorful Shell

Because we will use the shell for quite some time let’s add some color to it:

vi ~/.bashrc and uncomment the following lines:

    # You may uncomment the following lines if you want `ls' to be colorized:
    export LS_OPTIONS='--color=auto'
    eval "`dircolors`"
    alias ls='ls $LS_OPTIONS'
    alias ll='ls $LS_OPTIONS -l'
    alias l='ls $LS_OPTIONS -lA'

vi /etc/vim/vimrc and uncomment the following line:

    syntax on

and any of the other set options at the bottom of the file as you like.

Note: If you are not familiar with vi/vim you can use nano to edit all files.

Performance tuning

Performance with default settings was really bad. Disk read and write was around 30MB/s to ext4 and SSH was laggy. After a few minutes I found the issue: by default OMV sets the conservative governor. This may work well with x86 CPUs or other ARM CPUs, but with the Odroid it is a pain.

In the web interface under “Power Management” there is an option called “Monitoring – Specifies whether to monitor the system status and select the most appropriate CPU level.” This sounds quite good; the problem is that this option sets the CPU governor to conservative. The conservative governor with default settings works really badly on the Odroid in combination with I/O.
Disabling this option sets the governor to performance. All 8 cores at max clock speed the whole time produce quite a bit of heat (the fan spins a lot) and it is not really energy efficient.

CPU governor

But no problem: we can change the governor to ondemand, and with a little bit of tuning your Odroid will fly. The following settings will replace the conservative governor with ondemand if you enable the Power Management option.

vi /etc/default/openmediavault and append the following lines to this file:

    # Ondemand Scheduler
    OMV_CPUFREQUTILS_GOVERNOR="ondemand"

Then regenerate the config with

    omv-mkconf cpufrequtils

Note: There seems to be a bug in OpenMediaVault. After disabling Power Management and enabling it again it does not change cpu governor anymore. To fix this do the following:

    update-rc.d cpufrequtils defaults
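To check which governor is actually active after such a change, the cpufreq sysfs entries can be read directly. A minimal sketch, assuming the standard scaling_governor files under /sys/devices/system/cpu. The function name show_governors is made up, and the optional argument only exists to allow a dry run against another directory.

```shell
#!/bin/sh
# Print the active cpufreq governor for every CPU that exposes one.
# SYSROOT defaults to /sys.
show_governors() {
    sysroot=${1:-/sys}
    for f in "$sysroot"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue            # skip CPUs without cpufreq support
        cpu=${f%/cpufreq/scaling_governor}
        cpu=${cpu##*/}                     # keep only "cpuN"
        printf '%s: %s\n' "$cpu" "$(cat "$f")"
    done
}
```

Calling show_governors without arguments prints one "cpuN: governor" line per CPU.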

Ondemand governor tuning

I did a few benchmarks and they showed that I/O performance (my main focus) depends a lot on the CPU frequency. Therefore we have to tune the ondemand governor further to get full I/O throughput. To do so, do the following:

    apt-get install sysfsutils
    vi /etc/sysfs.conf

Copy the following to this file:

    # cpu0 sets cpu[0-3], cpu4 sets cpu[4-7]
    devices/system/cpu/cpu0/cpufreq/ondemand/io_is_busy=1
    devices/system/cpu/cpu4/cpufreq/ondemand/io_is_busy=1
    devices/system/cpu/cpu0/cpufreq/ondemand/sampling_down_factor=10
    devices/system/cpu/cpu4/cpufreq/ondemand/sampling_down_factor=10
    devices/system/cpu/cpu0/cpufreq/ondemand/up_threshold=80
    devices/system/cpu/cpu4/cpufreq/ondemand/up_threshold=80

Afterwards change to ondemand governor and activate these values with

    cpufreq-set -g ondemand -c 0
    cpufreq-set -g ondemand -c 4
    service sysfsutils start
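To confirm that the values from /etc/sysfs.conf really ended up in sysfs, the file can be compared against the live entries. The following is a sketch under the assumption that the file only contains plain "path=value" lines as above (mode and owner lines are not handled). check_sysfs_conf is a hypothetical helper.

```shell
#!/bin/sh
# check_sysfs_conf [CONF] [SYSROOT]
# For every "path=value" line in CONF, compare the value with the file of the
# same relative path under SYSROOT and print a line for each mismatch.
check_sysfs_conf() {
    conf=${1:-/etc/sysfs.conf}
    sysroot=${2:-/sys}
    # Drop comments and blank lines and remove whitespace around "=".
    sed -e 's/#.*//' -e 's/[[:space:]]*=[[:space:]]*/=/' \
        -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$conf" |
    while IFS='=' read -r path want; do
        have=$(cat "$sysroot/$path" 2>/dev/null)
        if [ "$have" != "$want" ]; then
            echo "MISMATCH $path: want '$want', have '$have'"
        fi
    done
}
```

No output from check_sysfs_conf means every listed setting is active.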

I did benchmarks with all these settings and for me this is the sweet spot: nearly the same performance as the performance governor, but lower frequency and less power consumption when idle.

Some explanation of all 3 settings:

sampling_down_factor: this parameter controls the rate at which the kernel makes a decision on when to decrease the frequency while running at top speed. When set to 1 (the default) decisions to reevaluate load are made at the same interval regardless of current clock speed. But when set to greater than 1 (e.g. 100) it acts as a multiplier for the scheduling interval for reevaluating load when the CPU is at its top speed due to high load. This improves performance by reducing the overhead of load evaluation and helping the CPU stay at its top speed when truly busy, rather than shifting back and forth in speed. This tunable has no effect on behavior at lower speeds/lower CPU loads.

up_threshold: defines what the average CPU usage between the samplings of ‘sampling_rate’ needs to be for the kernel to make a decision on whether it should increase the frequency. For example when it is set to its default value of ’95’ it means that between the checking intervals the CPU needs to be on average more than 95% in use to then decide that the CPU frequency needs to be increased.

io_is_busy: if set to 1, waiting for I/O will increase the calculated CPU usage. The governor will count iowait as busy time and not as idle time. Thus the CPU will reach higher frequencies faster under I/O load.
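How io_is_busy and up_threshold interact can be illustrated with a deliberately simplified model of one sampling interval. This is not the kernel's code, only a sketch of the idea described above. Both function names and the sample numbers are made up for illustration.

```shell
#!/bin/sh
# Simplified model of the load figure for one sampling interval.
# Arguments: time spent busy, in iowait and idle (same unit), io_is_busy (0 or 1).
ondemand_load() {
    busy=$1; iowait=$2; idle=$3; io_is_busy=$4
    if [ "$io_is_busy" -eq 1 ]; then
        busy=$(( busy + iowait ))     # iowait counts as busy time
    else
        idle=$(( idle + iowait ))     # iowait counts as idle time
    fi
    echo $(( 100 * busy / (busy + idle) ))
}

# Succeeds when the load exceeds up_threshold (5th argument),
# i.e. when the governor would raise the frequency.
would_scale_up() {
    [ "$(ondemand_load "$1" "$2" "$3" "$4")" -gt "$5" ]
}

# An I/O-heavy interval: 30 busy, 55 iowait, 15 idle.
ondemand_load 30 55 15 0   # prints 30
ondemand_load 30 55 15 1   # prints 85
```

With up_threshold=80 the same interval only triggers a frequency increase when io_is_busy is 1, which matches the behaviour seen with disk and network transfers.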

NTFS mount options

Besides the ondemand governor we will add the big_writes mount option to all NTFS mounts. In addition we will add noatime.

big_writes: this option prevents fuse from splitting write buffers into 4K chunks, enabling big write buffers to be transferred from the application in a single step (up to some system limit, generally 128K bytes).

noatime: this option disables inode access time updates which can speed up file operations and prevent sleeping (notebook) disks spinning up too often thus saving energy and disk lifetime.

vi /etc/default/openmediavault and append the following:

    # Optimize NTFS Performance
    OMV_FSTAB_MNTOPS_NTFS="defaults,nofail,noexec,noatime,big_writes"

Then, in the web interface, unmount all your NTFS volumes, apply, mount them again, and apply once more.
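Whether the new options made it into the generated mount entries can be checked by scanning /etc/fstab. A minimal sketch, assuming the NTFS volumes appear there with a filesystem type starting with "ntfs". The function name ntfs_missing_opt is hypothetical.

```shell
#!/bin/sh
# ntfs_missing_opt OPTION [FSTAB]
# Print the mount point of every ntfs entry in an fstab-style file
# whose option list does not contain OPTION.
ntfs_missing_opt() {
    opt=$1
    file=${2:-/etc/fstab}
    awk -v opt="$opt" '
        /^[ \t]*#/ || NF < 4 { next }          # skip comments and short lines
        $3 ~ /^ntfs/ {
            n = split($4, opts, ",")
            found = 0
            for (i = 1; i <= n; i++) if (opts[i] == opt) found = 1
            if (!found) print $2
        }' "$file"
}
```

An empty result from ntfs_missing_opt big_writes means every NTFS entry carries the option.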

With default mount options you get around 16 MB/s write. With big_writes you get up to 62 MB/s write. See this comparison below:

This looks fairly fast but keep in mind NTFS is very cpu intensive on such a system. Therefore real network throughput via samba (which is cpu heavy as well) is way less compared to a native linux filesystem. I tested this disk with samba and measured only ~30MB/s read and write speeds. This is considerably less than a native linux filesystem (see Samba benchmarks).

If you want to get full performance you have to use a native linux filesystem like ext4 or xfs. You really should!

More Monitoring

I really like the monitoring setup of OMV with rrdtool. Nevertheless I miss 2 graphs which I am interested in. It would be nice to have CPU frequency and CPU temperature graphs, therefore I extended the existing monitoring plugin.

Monitoring cpu frequency is no big deal because there is a native collectd plugin for CPU frequency. For CPU temperature we have to write our own collectd plugin which looks like this:
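As a rough idea of what such a plugin can look like, here is a minimal sketch (not the original listing) built on collectd's exec plugin, which reads PUTVAL lines from a script's standard output. The temperature path, the assumption that it reports millidegrees Celsius, and the identifier exec-cputemp/temperature-soc are all assumptions for illustration, since the sysfs location and unit differ between kernels.

```shell
#!/bin/sh
# Format one PUTVAL line for collectd's exec plugin.
# Arguments: hostname, temperature in millidegrees Celsius, interval in seconds.
format_putval() {
    awk -v host="$1" -v milli="$2" -v interval="$3" 'BEGIN {
        printf "PUTVAL \"%s/exec-cputemp/temperature-soc\" interval=%d N:%.1f\n",
               host, interval, milli / 1000
    }'
}

# Main loop, meant to be started by collectd's exec plugin.
# TEMP_FILE is an assumed location; adjust it to the running kernel.
report_temperature() {
    temp_file=${TEMP_FILE:-/sys/class/thermal/thermal_zone0/temp}
    host=${COLLECTD_HOSTNAME:-localhost}
    interval=${COLLECTD_INTERVAL:-10}
    while sleep "$interval"; do
        format_putval "$host" "$(cat "$temp_file")" "$interval"
    done
}
```

When used as a real script, report_temperature would be called on the last line so that collectd receives one value per interval.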

The default diagrams are a little small for my taste, that is why I have increased the size a bit:

vi /etc/default/openmediavault and add the following to this file:

    # RRDTool graph width and height
    OMV_COLLECTD_RRDTOOL_GRAPH_WIDTH=800
    OMV_COLLECTD_RRDTOOL_GRAPH_HEIGHT=200

Afterwards update OMV’s config files with

    omv-mkconf collectd
    omv-mkconf rrdcached

Afterwards you will have 2 new tabs which show graphs like these:

Finally, reboot to see if everything is working as expected.

Benchmarks

All benchmarks were done with performance governor to get consistent results.

Disks

I have several USB disks connected to my Odroid. Except for one disk, all are 2.5″ USB 3.0 disks. All USB 3.0 disks are connected to two active USB 3.0 hubs. The USB 2.0 disk is connected to the USB 2.0 port.
In the following you can find hdparm read and dd read/write benchmarks for all connected disks. As you can see the performance is quite good and should be near the maximum the disks can handle.
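For reference, a dd write test of this kind can be wrapped in a small helper. This is a hedged sketch, not the exact invocation used for the results: the block size, the default test size and the function name dd_write_test are assumptions, and conv=fdatasync requires GNU dd.

```shell
#!/bin/sh
# dd_write_test DIR [SIZE_MB]
# Writes SIZE_MB megabytes of zeros to a scratch file in DIR, forcing the data
# to disk before dd reports its timing, prints dd's summary line and removes
# the scratch file again.
dd_write_test() {
    dir=$1
    size_mb=${2:-1024}
    scratch="$dir/ddtest.$$"
    dd if=/dev/zero of="$scratch" bs=1M count="$size_mb" conv=fdatasync 2>&1 | tail -n 1
    rm -f "$scratch"
}
```

A sequential read test can be done with hdparm -t on the corresponding block device; running both with the performance governor keeps the numbers comparable.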