fun with lua-nginx-module, openssl, strace, gdb, glibc and Linux VM

07 Mar 2017

tl;dr: A memory trick in lua-nginx-module leads to redundant large memory allocations.

Recently I made a one-line Nginx configuration change and it resulted in the Out of Memory (OOM) Killer killing the Nginx process before it completed loading the new configuration. Here’s the line added to the configuration file:

lua_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;

In this post, I’m going to explain how I tracked down the root cause of the issue and document, chronologically, which tools I used and what I learnt during the process. The post will be a bit of everything.
Before going any further, here are the details of the software stack I used:

OpenSSL version: 1.0.2j

OS: Ubuntu Trusty with Linux 3.19.0-80-generic

Nginx: Openresty bundle 1.11.2

glibc: Ubuntu EGLIBC 2.19-0ubuntu6.9

First of all, let’s start with the OOM Killer. It is one of the many functions of the Linux kernel and gets triggered when the kernel cannot allocate more memory. The job of the OOM Killer
is to detect which process is the most damaging for the system (refer to https://linux-mm.org/OOM_Killer for more details on how the badness score is calculated),
and once it is detected, kill that process to free up memory. This means that in my case Nginx was asking for more and more memory, until eventually the kernel failed to allocate it and triggered the OOM Killer, which in turn killed the Nginx process. Alright, now let’s try to see what Nginx has been doing while it was reloading its new configuration. strace can be used for that. It’s a great tool for seeing what a program is doing
without reading its source code. In this case I executed
without reading its source code. In this case I executed

sudo strace -p `cat /var/run/nginx.pid` -f

and then

sudo /etc/init.d/nginx reload

The -f flag tells strace to follow the child processes too. You can get a better overview of strace at http://jvns.ca/zines/#strace-zine.
Following is an interesting piece from the output of strace after executing the above commands (reconstructed here in abridged form; the exact addresses and buffer contents will differ, but the 1048576-byte mmap is the important part):
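
open("/etc/ssl/certs/ca-certificates.crt", O_RDONLY) = 5
read(5, "-----BEGIN CERTIFICATE-----\nMIIH"..., 4096) = 4096
brk(0x2d40000)                          = 0x2d1f000
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6dc8166000
read(5, "-----BEGIN CERTIFICATE-----\nMIIE"..., 4096) = 4096
...
read(5, "", 4096)                       = 0
close(5)                                = 0
munmap(0x7f6dc8266000, 4096)            = 0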

which means the kernel was being asked to allocate 1MB of memory in the middle of the reads. Another interesting detail in the strace output was that this allocated memory never gets munmapped (freed); notice that only 0x7f6dc8266000 gets munmapped, right after the call to close. These facts (and me having almost no prior experience with low-level debugging) made me think that Nginx was leaking memory when lua_ssl_trusted_certificate is set! A memory leak in Nginx, isn’t that exciting?! Not so fast.

In order to find out which exact component of Nginx was leaking memory I decided to use gdb. gdb becomes much more useful if you compile your program with debug flags enabled. As I mentioned above, I was using the Nginx Openresty bundle. To compile it with debug flags enabled I used a configure invocation along these lines (a sketch; the --with-openssl-opt value is the part that matters here, adjust the rest to your setup):
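
./configure --with-debug \
            --with-cc-opt="-g3 -O0 -fno-omit-frame-pointer" \
            --with-openssl-opt="-d no-asm -g3 -O0 -fno-omit-frame-pointer -fno-inline-functions"
make && sudo make install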

--with-openssl-opt="-d no-asm -g3 -O0 -fno-omit-frame-pointer -fno-inline-functions" makes sure that OpenSSL will also be compiled with debug flags enabled. Now that I have a debug-enabled Openresty executable, I can run it through gdb and find out which exact function was triggering the mmap call mentioned above. First we need to start gdb for debugging the Openresty executable:

sudo gdb `which openresty`

This command will open the gdb console, which looks like the following:

GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/openresty/bin/openresty...done.
(gdb)

As a next step I’ll set the command line arguments of my program:

(gdb) set args -p `pwd` -c nginx.conf

This makes sure that gdb, when instructed to do so, will start Openresty/Nginx using the given command line arguments. The next step is to configure breakpoints to pause the program execution at a given line in a given file, or at a given function call. Because I wanted to find out who called the suspicious mmap after the trusted certificate file was opened, I’ll first add a breakpoint to pause when the call

open("/etc/ssl/certs/ca-certificates.crt", O_RDONLY) = 5

is made. Here’s how I did it:

break open if strcmp($rdi, "/etc/ssl/certs/ca-certificates.crt") == 0

If you have not realized it yet, gdb is super cool: it even lets you add a custom condition to create more flexible breakpoints. In this case, we are telling gdb to pause the execution of the program if the open function is called and the data in the rdi register is “/etc/ssl/certs/ca-certificates.crt”. I don’t know if there’s a better way, but just by playing around a bit I found out that open’s first argument (the path to the file) is kept in the rdi register (that is where the x86-64 calling convention puts the first integer argument). Hence the breakpoint condition. Now we can tell gdb to execute the program:

(gdb) run

In the first occurrence of the open("/etc/ssl/certs/ca-certificates.crt", O_RDONLY) call, gdb will pause the program execution. Now we can use different gdb helper commands to examine the internal state of the program at that point in time. This is roughly how it looked when the program execution hit the breakpoint (abridged to the interesting frames; addresses and arguments elided):
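
(gdb) bt
#0  open () at ../sysdeps/unix/syscall-template.S:81
...
#10 SSL_CTX_load_verify_locations (ctx=..., CAfile="/etc/ssl/certs/ca-certificates.crt", CApath=0x0)
#11 ngx_ssl_trusted_certificate (cf=..., ssl=..., cert=..., depth=...)
#12 ngx_http_lua_set_ssl (cf=..., llcf=...)
#13 ngx_http_lua_merge_loc_conf (cf=..., parent=..., child=...)
#14 ngx_http_merge_servers (cf=..., cmcf=..., module=..., ctx_index=...)
#15 ngx_http_block (cf=..., cmd=..., conf=...)
...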

How awesome, gdb shows us the complete backtrace of function calls with the passed parameters! To examine the data in the registers at this point, you can use the info registers command. In order to better understand and parse this backtrace I looked into how Nginx (remember, Openresty is just Nginx with extra modules) works internally. Everything in Nginx (except its core) is implemented as a module, and those modules register handlers or filters. Nginx’s config file consists of three main sections, namely main, server and location. Let’s say your custom Nginx module introduces a new config directive; then you also need to register a handler to do something with the value of that directive. So what happens is that Nginx parses the config file and calls the registered config handlers after each config section is parsed (there are actually more callback types).
Here’s how lua-nginx-module (the core module of the Openresty Nginx bundle) does it; the module context below is quoted, slightly abridged, from ngx_http_lua_module.c:
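
static ngx_http_module_t ngx_http_lua_module_ctx = {
    NULL,                             /*  preconfiguration */
    ngx_http_lua_init,                /*  postconfiguration */

    ngx_http_lua_create_main_conf,    /*  create main configuration */
    ngx_http_lua_init_main_conf,      /*  init main configuration */

    ngx_http_lua_create_srv_conf,     /*  create server configuration */
    ngx_http_lua_merge_srv_conf,      /*  merge server configuration */

    ngx_http_lua_create_loc_conf,     /*  create location configuration */
    ngx_http_lua_merge_loc_conf       /*  merge location configuration */
};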

This is how an Nginx module registers its handlers. As you can see from the comments too, ngx_http_lua_merge_loc_conf will be called once Nginx parses a location section config and merges it with the enclosing server section’s configuration. Now, if we go back to the gdb output above, we can see that this function was called at frame #13. So by definition this function will be called for every location section configured. The function is quite straightforward to read: what it mainly does is inherit the relevant config pieces from the server section into the location section (every section has its own configuration context) and set default values. It also calls the ngx_http_lua_set_ssl function, which in turn calls the ngx_ssl_trusted_certificate function from Nginx’s SSL module if the lua_ssl_trusted_certificate directive is set. ngx_ssl_trusted_certificate is a simple function that sets the verify depth on the SSL context of the given configuration section (a location section in this case) and calls an OpenSSL API to load the certificate file (well, and some error handling). Slightly abridged, it looks like this:
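
ngx_int_t
ngx_ssl_trusted_certificate(ngx_conf_t *cf, ngx_ssl_t *ssl, ngx_str_t *cert,
    ngx_int_t depth)
{
    SSL_CTX_set_verify_depth(ssl->ctx, depth);

    if (cert->len == 0) {
        return NGX_OK;
    }

    if (ngx_conf_full_name(cf->cycle, cert, 1) != NGX_OK) {
        return NGX_ERROR;
    }

    if (SSL_CTX_load_verify_locations(ssl->ctx, (char *) cert->data, NULL)
        == 0)
    {
        ngx_ssl_error(NGX_LOG_EMERG, ssl->log, 0,
                      "SSL_CTX_load_verify_locations(\"%s\") failed",
                      cert->data);
        return NGX_ERROR;
    }

    return NGX_OK;
}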

The complete code for Nginx’s SSL module can be found in src/event/ngx_event_openssl.c in the Nginx source tree. Alright, now we are halfway through the backtrace and already out of the Nginx world. The next function called is SSL_CTX_load_verify_locations, which comes from OpenSSL. At this point the program has opened the trusted certificate file and paused. As a next step it will start reading it (refer to the strace output above). As my initial purpose was to find out what was initiating the suspicious mmap call, naturally the next breakpoint I created was:

(gdb) b mmap

b is a shortcut for break. Then (gdb) c to continue the program execution. And the program hits the next breakpoint (again roughly, with addresses and arguments elided):
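
(gdb) bt
#0  mmap ()
#1  sysmalloc ()
#2  _int_malloc ()
#3  __libc_malloc ()
#4  CRYPTO_malloc ()
...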

At this point I was completely hyped!!! I “found” a memory leak in OpenSSL! With so much excitement I started to read and try to understand OpenSSL code that’s been under development since the nineties. I spent the next couple of days and nights trying to understand those functions and to find the memory leak that I was so so so sure was caused by a bug in one of them. Seeing a lot of memory leak bug reports for OpenSSL (some specifically related to the functions above), my confidence doubled, so I spent a couple more nights trying to find the bug!

So basically what those function calls do is first open the trusted certificate file, then allocate a buffer (4096 bytes), read the next 4KB from the file into the buffer, decode the data, convert it to the respective internal representation of OpenSSL and save it in the certificate store of the given SSL context (which belongs to a location section context). Whenever Nginx later needs to do SSL client verification in the context of that location section, it will call the SSL_get_verify_result function from OpenSSL and pass in the saved SSL context. That function will then use the loaded and internalized trusted certificates to verify the client.

So the outcome of those nights is that I learnt how all these things fit together, but I still could not find any bug! I also learnt that the suspicious mmap call was initiated by malloc, which was triggered by CRYPTO_malloc from another OpenSSL function, in order to expand the cert store so that it can fit the decoded and internalized certificate data. Now that I knew what exactly was happening, I realized that not freeing that allocated memory is expected: OpenSSL keeps it for potential future use during the lifetime of the process. But the main issue, why Nginx’s memory consumption grew so fast when lua_ssl_trusted_certificate was set, was still a mystery. With all the data gathered so far, though, it was obviously the result of those mmap calls happening once per location section. At this point I decided to extract the problem out of the Openresty/Nginx context and write a standalone C program that uses the exact same OpenSSL API calls to load the same trusted certificate file repeatedly. Loading it repeatedly would simulate multiple location sections (5000 locations in this case). A minimal reconstruction of that program looks like this:
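
/* simulate 5000 location sections: load the same trusted certificate
 * bundle into 5000 fresh SSL contexts, one per "location" */
#include <stdio.h>
#include <openssl/ssl.h>

int main(void)
{
    int i;

    SSL_library_init();
    SSL_load_error_strings();

    for (i = 0; i < 5000; i++) {
        SSL_CTX *ctx = SSL_CTX_new(SSLv23_method());

        if (ctx == NULL) {
            fprintf(stderr, "SSL_CTX_new failed\n");
            return 1;
        }

        if (SSL_CTX_load_verify_locations(
                ctx, "/etc/ssl/certs/ca-certificates.crt", NULL) != 1)
        {
            fprintf(stderr, "SSL_CTX_load_verify_locations failed\n");
            return 1;
        }

        /* deliberately not freed: Nginx also keeps one SSL_CTX per
         * location section alive for the lifetime of the process */
    }

    return 0;
}

Compile it with something like gcc -g -O0 certs.c -lssl -lcrypto and run the result under strace.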

If I could solve the issue here, then I could solve it in the context of Openresty/Nginx too, as this was equivalent to the original problem. But guess what: the strace output for this program was different from what I expected! Roughly (again a reconstructed sketch):
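
open("/etc/ssl/certs/ca-certificates.crt", O_RDONLY) = 3
read(3, "-----BEGIN CERTIFICATE-----\nMIIH"..., 4096) = 4096
brk(0x1a2d000)                          = 0x1a2d000
read(3, "-----BEGIN CERTIFICATE-----\nMIIE"..., 4096) = 4096
brk(0x1a4e000)                          = 0x1a4e000
...
close(3)                                = 0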

Notice that the brk calls are not followed by mmap calls anymore, and memory was not growing unexpectedly either! Alright, I was so tired of this shit at this point and almost gave up. But my curiosity did not let me. I decided to read more on how memory allocation works.

Normally, when your program tries to allocate more memory, it calls the malloc function (or some variant of it) from glibc. glibc abstracts a lot of the dirty memory management work away from user space programs and provides an API to work with virtual memory. By default, when a program calls malloc to allocate more memory on the heap, malloc will first try to use the brk function to move the program break up in the address space as requested. But brk won’t work if there’s a hole in the heap. What does that mean? Let’s say you have 1GB of heap and it’s all free. To create a hole in it you can call the mmap function directly with a specific address A and the size of memory you want, and it will allocate that area of the heap (starting at address A) for you. But because the break is still at the beginning of the heap, when (s)brk is later called to grow the heap past A, it won’t be able to complete the request, because part of the address range it needs has already been allocated (the hole). In this situation malloc uses mmap instead to allocate the requested memory. And because mmap calls are expensive, to reduce the number of future mmap calls malloc allocates at least 1MB even if the requested size is smaller than that. This is documented in the source code at https://code.woboq.org/userspace/glibc/malloc/malloc.c.html#406. If you scroll up you can see that the suspicious mmap call was allocating 1048576 bytes of memory, which is exactly 1MB, the default value malloc uses when it falls back to mmap after brk fails. Exciting!!!

Putting the pieces together, the next obvious hypothesis to test was: brk was followed by an mmap call in the context of Openresty, but not in the standalone C program, because Openresty creates a hole in the heap somewhere before loading the certificate file. This was not too hard to find just by grepping through the PRs, issues and source code of lua-nginx-module. It turns out LuaJIT needs to own the lower address space to work efficiently, and this is why the lua-nginx-module folks decided to do the following at the very beginning of the program execution (paraphrased from the module source; see the repository for the exact code):
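
/* reserve the low address range for LuaJIT: if the break is still below
 * the 1GB boundary, pin a one-byte read-only mapping right above it so
 * the data segment cannot grow up there */
static void
ngx_http_lua_limit_data_segment(void)
{
    if (sbrk(0) < (void *) 0x40000000LL) {
        mmap(ngx_align_ptr(sbrk(0), getpagesize()), 1, PROT_READ,
             MAP_FIXED|MAP_PRIVATE|MAP_ANON, -1, 0);
    }
}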

For the complete source code please refer to the original repository. Now, I have not yet figured out exactly how this lets LuaJIT own the lower address space (I would appreciate it if someone explained it in the comments), but it is definitely the source of the main issue. To prove that, I copied this code into my standalone C program, with the Nginx helper macro swapped for plain page alignment:
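
/* the same trick in plain C; call this at the very top of main() */
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

static void limit_data_segment(void)
{
    uintptr_t page = (uintptr_t) sysconf(_SC_PAGESIZE);
    uintptr_t brk0 = (uintptr_t) sbrk(0);
    void *hole = (void *) ((brk0 + page - 1) & ~(page - 1));

    if (brk0 < (uintptr_t) 0x40000000LL) {
        /* one read-only byte pinned right above the break: the hole */
        mmap(hole, 1, PROT_READ, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS,
             -1, 0);
    }
}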

When I compile this program and run it through strace, I can see the exact same behaviour as in the context of Openresty. To be even more sure, I edited the Openresty source code, commented out the ngx_http_lua_limit_data_segment call, compiled it, and confirmed that the memory growth does not happen. Eureka!!!

This was a lot of learning for me. As a result of the investigation I also created an issue in the lua-nginx-module repository to let them know about this behaviour. This becomes a real issue only if you have a lot of location sections defined. For example, if you have a big Nginx configuration file with 4k location sections and you add the lua_ssl_trusted_certificate directive to the main configuration, then during a reload/restart/start Nginx’s memory consumption will grow by up to ~4GB (4k × 1MB) and will stay like that.