Is this a kernel memory leak (kernel 4.16.7-1)?

Recently I noticed that an abnormal amount of RAM was being used on my Arch machine. The machine has 64GB of RAM, and 61GB is being consumed. This has happened multiple times, each time after a couple of days of uptime. No RAM-intensive applications were running. While trying to figure out where my RAM was going, I ran into the "linux ate my RAM" page. The buff/cache being consumed on the computer is significantly less than what's being used, so I don't think that's the reason for the high memory usage. Is this a kernel memory leak? Would I need to compile a new kernel to debug this?

Top below shows that I'm using 61GB/64GB. I've sorted on memory usage, and it can be seen that the total memory consumption of the applications I'm running gets nowhere near 95%. The buff/cache is 1GB, and the available memory is 2GB...
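For anyone wanting to check the same thing: when process usage and buff/cache don't add up, the kernel-side counters in /proc/meminfo are one more place to look. Below is a minimal sketch, not a definitive diagnostic. The helper name `meminfo_kernel_summary` is made up for illustration, and the choice of fields is an assumption about which counters are most likely to be informative.

```shell
# Sketch: print selected kernel-side counters from a meminfo-style file, in MB.
# meminfo_kernel_summary is a hypothetical helper name; the file defaults to
# /proc/meminfo when no argument is given.
meminfo_kernel_summary() {
    awk '
        # Store each "Name:   value kB" line in a table keyed by Name.
        { sub(":", "", $1); kb[$1] = $2 }
        END {
            n = split("Slab SReclaimable SUnreclaim PageTables KernelStack VmallocUsed", f, " ")
            for (i = 1; i <= n; i++)
                printf "%s %d\n", f[i], kb[f[i]] / 1024   # kB -> MB, truncated
        }
    ' "${1:-/proc/meminfo}"
}
```

Running `meminfo_kernel_summary` with no argument reads the live system. A large and growing SUnreclaim would point at slab allocations; memory that shows up in none of these counters is harder to attribute.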

Re: Is this a kernel memory leak (kernel 4.16.7-1)?

Yup, the kernel has no explanation for that huge gap. How fast does this happen (does it take days or weeks to build up or do you get there within minutes/hours after the boot)? Also "lsmod" output please, for suspicious modules.

Re: Is this a kernel memory leak (kernel 4.16.7-1)?

I'm seeing the same on Arch ARM (aarch64) for a month or so now; that machine is running the 4.16.x series of kernels. I do not see it manifesting on my Arch boxes though, just the ODROID-C2. I haven't been able to track down the cause of it.

Re: Is this a kernel memory leak (kernel 4.16.7-1)?

If you're not using it anyway, I'd suggest trying to unload (and not auto-load) the vbox kernel modules. And of course it would be interesting to hear whether graysky has vboxdrv loaded as well. (Actually, any overlapping kernel modules would be of interest; there should not be that many with an odroid.)
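Finding the overlapping modules can be sketched as follows, assuming each machine's `lsmod` output has been saved to a file first (for example with `lsmod > host-a.txt`). The helper name `common_modules` and the file names are hypothetical.

```shell
# common_modules is a hypothetical helper: it prints the module names that
# appear in both of two saved `lsmod` outputs.
common_modules() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # Skip the lsmod header line and keep only the module-name column.
    awk 'NR > 1 { print $1 }' "$1" | sort > "$a"
    awk 'NR > 1 { print $1 }' "$2" | sort > "$b"
    comm -12 "$a" "$b"    # -12 hides lines unique to either file
    rm -f "$a" "$b"
}
```

Usage would be `common_modules host-a.txt host-b.txt`; whatever it prints is loaded on both boxes and is a candidate for closer inspection.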

Re: Is this a kernel memory leak (kernel 4.16.7-1)?

seth wrote:

...And of course it would be interesting to hear whether graysky has vboxdrv loaded as well.

I do not. The Arch ARM box I have just runs pihole and openvpn... very minimal. Typically, it's only using 150-175 MB of RAM but ever since aarch64 went to 4.16.x, it has been "leaking" memory very similar to what the OP reported.

Re: Is this a kernel memory leak (kernel 4.16.7-1)?

I do have a couple virtual machines and I do want to use them (one of the reasons I have so much RAM), but since I'm not doing anything with them right now I've disabled the virtualbox kernel modules from loading and will see if that has any effect. Given that graysky doesn't have virtualbox kernel modules running, this may not be the issue.

Re: Is this a kernel memory leak (kernel 4.16.7-1)?

I have not gotten to the bottom of the leak. It has been ~7 days since I rebooted. My system has not consumed all of my RAM in that time, but it is still consuming more than it should be. 18GB of 64GB is currently in use. 2GB of the 18GB is being used for buff/cache, which leaves 16GB/64GB = 25% used by something else. Top shows that no processes are using anything near 25%. Perhaps unloading the VirtualBox modules helped a little bit. This time the resources are disappearing much more slowly.

Re: Is this a kernel memory leak (kernel 4.16.7-1)?

I can't offer much help since I can't even track what process is consuming the memory on my machine but I will say that for me the bug occurs on aarch64 which runs the 4.16.x kernel (confirmed on odroid-c2 and raspberry pi 3 running aarch64). When I switch over to armv7h on raspberry pi3 which runs the 4.14.x kernels, I do not see the leak. Can you try booting in the linux-lts kernel package on x86_64 box? It too runs the 4.14.x series.

Re: Is this a kernel memory leak (kernel 4.16.7-1)?

I did a few tests using an older x86_64 box (Intel E5200), a few aarch64 boxes (ODROID-C2 and Raspberry Pi3), and a few boxes running armv7h (RPi2 and RPi3).

tl;dr summary: I see memory "leaks" for the x86_64 and aarch64 boxes, but not for the armv7h boxes.

Details: For x86_64, the test was to boot it under 4.16.11 and 4.14.41 and log the mem usage `free --mega | grep Mem | awk '{ print $3 }'` once per hour via a cronjob. The box was just sitting idle, with mysqld, kodi, vncserver/lxqt, and openvpn/server all running but no user interaction, with the exception of openvpn, which has my family hitting it. The result was that used memory went up under each kernel:

Under 4.16.11, used memory over a 24 h period increased in a linear fashion at a rate of approx 6 MB/hour.
Under 4.14.41, used memory over a 24 h period increased in a linear fashion at a rate of approx 5 MB/hour.
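The hourly logging described above could look roughly like the sketch below. This is not the exact script used for these measurements: the function names, the log format (epoch seconds followed by used MB), and any script path are assumptions made for illustration.

```shell
# Extract the "used" column (third field of the Mem: row) from `free` output
# read on stdin.
used_from_free() {
    awk '/^Mem:/ { print $3 }'
}

# Append one "epoch_seconds used_MB" line to the log file given as $1.
# log_used_mem is a hypothetical name.
log_used_mem() {
    printf '%s %s\n' "$(date +%s)" "$(free --mega | used_from_free)" >> "$1"
}
```

Saved as a script that ends with a call to `log_used_mem` and a log path of your choosing, it can be run from an hourly crontab entry such as `0 * * * * /path/to/script` (path hypothetical).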

For aarch64: ODROID-C2 under kernel 4.16.11. Sitting idle with just systemd networking, sshd, and ufw running, memory increased over 24 h at a very low rate of 0.3 MB/hour. Running pihole and openvpn in lxcs in addition to the base services, memory increased in a linear fashion over 24 h at a rate of 10.5 MB/hour.

Note that, if I let the ODROID-C2 run without a reboot, the used memory will eventually grow to consume the entire free amount.

I also tried running Arch ARM aarch64 on a Raspberry Pi 3, measured under kernel 4.16.10. Running pihole and openvpn in lxcs in addition to the base services, memory increased in a linear fashion over 36 h at a rate of 9.9 MB/hour.

If I let it run without a reboot, it too will go until there is no free RAM.

In contrast, running armv7h on a Raspberry Pi 2 or 3 (similar results) does not show the memory leak. I haven't run with the log file yet (doing it now), but one example is an RPi3 box running 4.14.39. All it does is run pihole, nginx, and php in an lxc. It has 13 days of uptime currently and `free --mega` shows 103 used. When I booted this machine the used memory was 82, so in 13 days (312 hours) the rate of memory growth was about 0.07 MB/hour (21 MB over 312 h). That's over 100 times less than either of the aarch64 boxes.
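For reference, a MB/hour figure like the ones in this post can be estimated from a log with a least-squares fit rather than from two endpoints. The sketch below assumes a log with one `epoch_seconds used_MB` pair per line; that format and the helper name `mem_slope_per_hour` are hypothetical, not necessarily what was used for the numbers above.

```shell
# mem_slope_per_hour is a hypothetical helper: least-squares slope of used MB
# against time, read from "epoch_seconds used_MB" lines on stdin, in MB/hour.
mem_slope_per_hour() {
    awk '
        NR == 1 { t0 = $1 }
        {
            x = ($1 - t0) / 3600; y = $2     # x: hours since the first sample
            n++; sx += x; sy += y; sxx += x * x; sxy += x * y
        }
        END {
            d = n * sxx - sx * sx
            if (n < 2 || d == 0) {
                print "need at least two distinct timestamps" > "/dev/stderr"
                exit 1
            }
            printf "%.2f\n", (n * sxy - sx * sy) / d
        }
    '
}
```

Usage would be `mem_slope_per_hour < used-mem.log` (file name hypothetical). A fit over all samples is less sensitive to a single noisy reading than dividing the first-to-last difference by the elapsed hours.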

Re: Is this a kernel memory leak (kernel 4.16.7-1)?

I think the culprit could be systemd-journald. Right now I am seeing the used memory (`free --mega`) track linearly with the resident set size (`ps --sort -rss -eo pid,pmem,rss,vsz,comm`) of systemd-journald. I need to collect more data and will post back.
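To make that comparison repeatable, the resident set size of one process can be pulled out of `ps` output and logged next to the `free` reading. This is only a sketch; the helper name `rss_mb_for` is hypothetical. Note that the kernel truncates command names to 15 characters, so systemd-journald typically appears in the `comm` column as `systemd-journal`.

```shell
# rss_mb_for is a hypothetical helper: sum the RSS (ps reports it in KiB) of
# every process whose command name equals $1, reading "rss comm" lines on
# stdin, and print the total in MB with one decimal place.
rss_mb_for() {
    awk -v name="$1" '$2 == name { kb += $1 } END { printf "%.1f\n", kb / 1024 }'
}

# Example invocation (not run here):
#   ps -e -o rss= -o comm= | rss_mb_for systemd-journal
```

If the journal really is the source, this number should grow at about the same rate as the used column of `free --mega`.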

Re: Is this a kernel memory leak (kernel 4.16.7-1)?

It's been about 7 days again for me and I'm not seeing a memory leak with the 4.14 kernel, "4.14.41-1-lts". I started running virtualbox VMs again as can be seen in the output of top. The total memory consumption is about 7GB, and it all seems to be explained by running processes and buff/cache usage.