This article is intended as a primer for developers who are interested in using core dumps to help them debug Node.js problems. If you are a C/C++ developer who is familiar with ulimits, gcore and debuggers then you are probably already comfortable with using core dumps. Being able to integrate JavaScript debugging with C/C++ may be useful if you are working on native npm modules or other C/C++ code that interacts with JavaScript. If you’re a pure Node.js developer wondering whether core dumps are something you should care about, or just be aware of in case of an emergency later on, then hopefully this will provide a starting place.

In this article I’ll give a basic demonstration of how you can use llnode to give you a better view of how your JavaScript code is invoked by Node.js. In future posts I’ll cover other commands and how they can help with specific types of problem.

A core dump is a snapshot of a process’s memory at a particular point in time. There are many ways to create one, but the most common is a crash within the process that causes the operating system to create a core dump and terminate the process. Generally they are accessed using a debugger such as lldb or gdb.

llnode is a plugin for the lldb debugger that lets it make sense of many of the v8 structures within a core dump. You can use it to help understand what the state of your program was at the time the core dump was generated.

You can find instructions on how to compile and install llnode on the project’s GitHub page.

Core dumps

If you aren’t really familiar with core dumps there are a couple of things you should know about them, although really they are the same thing:

Core dumps are a snapshot of your process’s entire memory, so core dumps are big. Because of that, although most operating systems can create one when a process crashes, this is often disabled by default. It’s bad if a process crashes frequently and needs restarting. It’s disastrous if you set it up to be restarted automatically but it keeps crashing and fills your disk.

Core dumps are a snapshot of your process’s entire memory, which includes things like usernames and passwords and other things you don’t want to share. Giving someone a core dump of your application gives them access to all the data it had in memory at the time the core dump was created.

There are several ways to get a core dump. The best known is the OS generating one when your application crashes. Node.js has a flag, --abort-on-uncaught-exception, which makes it easy to get a core dump from JavaScript code by forcing an abort when an uncaught exception crashes your program. That’s a good way to experiment with llnode, but we won’t need it for our examples below. Since core dumps are often disabled to stop crashing applications filling up your disk, we should make sure they are enabled. The basic way to re-enable them is to change the ulimit for core files from 0 to unlimited by running ulimit -c unlimited in your terminal.

Setting the ulimit at the terminal in this way will change the setting for processes started within that terminal session, but it is not a permanent system-wide change.

On Mac OS this should cause core files to be created in /cores. The core file will be named /cores/core.[process_id].

On Linux, by default, core files are created in the current directory. However, a number of other configuration settings can change what happens with a core dump: whether it is created, where it is created and what it is called. In particular, the setting in /proc/sys/kernel/core_pattern can change the file name or even pipe the core dump to another tool. You can probably get up-to-date instructions for your distribution just by Googling “enable core dumps [distro-name]”. A good rule of thumb is to check the syslog after a core dump has been generated, as there will often be a message in there telling you where it went.

This will only run once, because the exception thrown inside the requestListener function crashes the process. (I’ve named my request listener as it makes for a better stack trace. I don’t know of any technical reasons not to do that and it does make debugging easier, native or otherwise.)
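To illustrate why naming the listener helps, here is a minimal sketch. It is not the script that produced the core dump discussed in this article; the error message and the helper innermostFrame are hypothetical, and the server is created but never started so the sketch exits on its own. It compares the innermost stack frame reported for a named listener with the one reported for an anonymous function.

```javascript
const http = require('http');

// Hypothetical named listener, mirroring the one discussed above.
// It throws on purpose so that we have a failure to look at.
function myRequestListener(req, res) {
  throw new Error('forced failure for demonstration');
}

// Passing a named function means its name shows up in stack traces.
// listen() is deliberately not called, so this sketch exits on its own.
const server = http.createServer(myRequestListener);

// Call a listener directly and return the innermost frame of the stack
// trace attached to the error it throws (hypothetical helper).
function innermostFrame(listener) {
  try {
    listener({}, {});
  } catch (err) {
    // Line 0 of err.stack is the message; line 1 is the innermost frame.
    return err.stack.split('\n')[1].trim();
  }
  return null;
}

const namedFrame = innermostFrame(myRequestListener);

// A function expression passed inline as an argument gets no inferred name.
const anonymousFrame = innermostFrame(function (req, res) {
  throw new Error('forced failure for demonstration');
});

console.log(namedFrame);     // mentions myRequestListener
console.log(anonymousFrame); // does not mention a listener name
```

The same naming benefit carries over to the JavaScript frames that llnode fills in, which is why the listener is easy to spot in the combined backtrace.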

Now we can open this with lldb. We specify the executable and the core file to use (with -c) and then run the standard backtrace command “bt” to produce a backtrace of our failure as a native call stack. There is also an llnode command called “v8 bt”, which does the same as backtrace but tries to fill in any stack frames that lldb can’t find symbols for by checking whether they are JavaScript frames.

The problem with this backtrace is obvious when you look at stack frames #3 to #13. That’s where v8 is executing your JavaScript code. Because it’s running code that v8 has generated at runtime, not compiled C/C++, the debugger has no way to find symbols (names) for the functions that are being executed, and the backtrace produced by “bt” has gaps. If we use the llnode command “v8 bt” to walk the stack instead we get a better result:

llnode uses the lldb API to walk the stack exactly as before, but when lldb can’t provide any details for a frame it checks whether the frame is JavaScript and, if it is, fills it in.

With the native C/C++ stacks and JavaScript stacks integrated we can also see the calls made from libuv down into the JavaScript code where we forced an exception. This is obviously a contrived example, but when you are attempting to debug a problem that you suspect began in some native code and ended up in JavaScript (or vice-versa), having the combined stacks is a big advantage.

Because we named it we can easily see that the callback myRequestListener was being run at frame #6. We can select that frame in lldb and use the llnode extensions to dump the source out:

We specify the object to inspect using its address in memory, not its name. We can see that llnode lets us inspect variables and drill down to see the fields inside them. As well as being incredibly useful, the “v8 inspect” and “v8 source list” commands also highlight that everything is in the core dump: your users’ data and even your code.

There’s much more you can do with llnode, including ways to investigate problems at a higher level than just examining individual objects. Brendan Gregg has written an excellent article on using llnode for Node.js Memory Analysis. If there’s anything in the article you have questions about, feel free to leave a comment here or ask me on Twitter.

That looks like a bug in lldb’s architecture handling code. We’ve seen a couple of issues around opening core dumps generated in different ways. Was the core from your production server generated via gcore or a signal/crash?
You may get the same result, but it might be worth copying the core file from the production machine to your development environment, where you have been able to open a core dump before.