Profiling Node.js applications

In previous articles, I talked about Node.js internals, why Node.js is so fast, V8 internals and its optimization tricks. That’s cool, but… Understanding these things is not enough to write high-performance Node.js applications. You still need to know how to profile your Node.js application, find the bottleneck and optimize it, knowing how Node.js and V8 optimizes it.

Profiler

The main goal of a profiler is to measure all the CPU ticks spent on executing functions in your application. There are also memory profilers, which can be used to find memory leaks, but in this article I’ll be talking about performance issues only.

For instance, Google Chrome (or any other modern browser) has a built-in profiler in DevTools, recording all the information about functions and how long it takes to execute them into a log file. Afterwards, Google Chrome analyzes this log file, providing you with human-readable information about what’s happening (Flamechart, stack, whatever), so you can understand it and find the bottleneck.

Node.js has a built-in profiler, but with one difference. It doesn’t analyze log files as Google Chrome does. Instead, it just collects all the information into the log file and that’s it. It means that you need to have some separate tool that can understand this log file and provide you with human-readable information.

Let’s skip the theory for now and start with a simple example. I recommend you do everything I do step by step, so you can see what’s going on on your local machine as well (my Node.js version is 5.10.1 and npm 3.8.7, I can’t guarantee this example to work for you if you have a different version of Node.js).

Project with a bottleneck

Let’s imagine that the following example is some big project with performance issues. I have chosen crypto module and sync version of pbkdf2 specifically, because they decrease performance a lot and that’s what we want in this example.

const crypto = require('crypto');

function hash(password) {
  const salt = crypto.randomBytes(128).toString('base64');
  const hash = crypro.pbkdf2Sync(password, salt, 10000, 512);

  return hash;
}

// Imagine that loop below is real requests to some route
for (let i = 0; i < 50; i++) {
  hash('random_password');
}

Running this snippet gives me:

$ time node server.js
9.24 real 9.21 user 0.02 sys

We definitely have an issue with performance here — 9.24 seconds.

Let’s profile it!

Collecting ticks information into a log file

Node.js has a flag which enables profiler — prof. When you run your application with this flag, it collects the CPU ticks into the log file in the same folder where your script is.

Run Node.js with profiler and wait for it to finish:

$ node --prof server.js

Now, you have the log file with tick information. This log file gets a name something like isolate-*.log. If you try to see the content of the log file with a tool like cat, you will see a lot of un-readable information. That’s why we need a tool for analyzing it.

Analyzing log file

Until NodeJS 4.x you have to use separate packages like node-tick-processor. Since we have NodeJS 5.10.1, we can use a built-in tool.

Run Node.js with flag prof-process and provide path to log file, generated before. Filename will be different, so replace it with yours.

$ node --prof-process isolate-0x101804a00-v8.log

After some time, you will get all information about your application, including amount of processor ticks spent for each function (since we have a small example, we have few lines only).

Statistical profiling result from isolate-0x101804a00-v8.log, (7436 ticks, 1 unaccounted, 0 excluded).
[Shared libraries]:
ticks total nonlib name
7431 99.9% /usr/local/bin/node
2 0.0% /usr/lib/system/libsystem_malloc.dylib
1 0.0% /usr/lib/system/libsystem_platform.dylib
[JavaScript]:
ticks total nonlib name
1 0.0% 50.0% LazyCompile: ~runInThisContext node.js:306:28
[C++]:
ticks total nonlib name
[Summary]:
ticks total nonlib name
1 0.0% 50.0% JavaScript
0 0.0% 0.0% C++
5 0.1% 250.0% GC
7434 100.0% Shared libraries
1 0.0% Unaccounted
[C++ entry points]:
ticks cpp total name
[Bottom up (heavy) profile]:
Note: percentage shows a share of a particular caller in the total
amount of its parent calls.
Callers occupying less than 2.0% are not shown.
ticks parent name
7431 99.9% /usr/local/bin/node
7425 99.9% /usr/local/bin/node
7389 99.5% LazyCompile: ~pbkdf2 crypto.js:557:16
5144 69.6% LazyCompile: ~exports.pbkdf2Sync crypto.js:552:30
5144 100.0% LazyCompile: ~hash /Users/ghaiklor/Library/Projects/test.js:5:14
5144 100.0% Function: ~<anonymous> /Users/ghaiklor/Library/Projects/test.js:1:11
2245 30.4% LazyCompile: *hash /Users/ghaiklor/Library/Projects/test.js:5:14
2245 100.0% Function: ~<anonymous> /Users/ghaiklor/Library/Projects/test.js:1:11
2245 100.0% LazyCompile: ~Module._compile module.js:380:37

prof-process analyzed ticks log file and now we have readable information about what’s happening in our script. It has six sections: Shared libraries, JavaScript, C++, Summary, C++ entry points, and Bottom up (heavy) profile.

I’ll explain these sections another time, for now, let’s look at Bottom (up) heavy profile. That’s the place where you get information about more heavy functions.

As we can see, 99.9% of CPU ticks are spent in pbkdf2Sync function in crypto.js module which is called from our hash function. This is our bottleneck. Knowing that, we can optimize it, replacing synchronous function with asynchronous and provide a callback that will do some stuff after hashing.

const crypto = require('crypto');

function hash(password, cb) {
  const salt = crypto.randomBytes(128).toString('base64');
  const hash = crypto.pbkdf2(password, salt, 10000, 512, cb);
}

// Imagine that loop below is real requests to some route
for (let i = 0; i < 50; i++) {
  hash('random_password');
}

Run our changed snippet via time:

$ time node server.js
2.64 real 9.91 user 0.07 sys

9.24 seconds versus 2.64 seconds. It runs 3.5 times faster. Now, let’s profile it again and compare profiling results with the un-optimized version before:

$ node --prof server.js
$ node --prof-process isolate-0x101804a00-v8.log
Statistical profiling result from isolate-0x101804a00-v8.log, (107 ticks, 1 unaccounted, 0 excluded).
[Shared libraries]:
ticks total nonlib name
53 49.5% /usr/local/bin/node
47 43.9% /usr/lib/system/libsystem_kernel.dylib
2 1.9% /usr/lib/system/libsystem_malloc.dylib
1 0.9% /usr/lib/system/libsystem_platform.dylib
1 0.9% /usr/lib/system/libsystem_c.dylib
[JavaScript]:
ticks total nonlib name
1 0.9% 33.3% LazyCompile: formatValue util.js:243:21
1 0.9% 33.3% LazyCompile: COMPARE native runtime.js:73:17
[C++]:
ticks total nonlib name
[Summary]:
ticks total nonlib name
2 1.9% 66.7% JavaScript
0 0.0% 0.0% C++
4 3.7% 133.3% GC
104 97.2% Shared libraries
1 0.9% Unaccounted
[C++ entry points]:
ticks cpp total name
[Bottom up (heavy) profile]:
Note: percentage shows a share of a particular caller in the total
amount of its parent calls.
Callers occupying less than 2.0% are not shown.
ticks parent name
53 49.5% /usr/local/bin/node
47 88.7% /usr/local/bin/node
16 34.0% LazyCompile: ~runInThisContext node.js:306:28
16 100.0% LazyCompile: ~NativeModule.compile node.js:387:44
16 100.0% LazyCompile: NativeModule.require node.js:321:34
5 31.3% LazyCompile: ~startup node.js:13:19
3 18.8% Function: ~<anonymous> module.js:1:11
2 12.5% LazyCompile: ~<anonymous> node.js:222:48
2 12.5% Function: ~<anonymous> tty.js:1:11
1 6.3% LazyCompile: ~setupGlobalVariables node.js:201:32
1 6.3% LazyCompile: Module._load module.js:285:24
1 6.3% Function: ~<anonymous> stream.js:1:11
1 6.3% Function: ~<anonymous> assert.js:1:11
2 4.3% LazyCompile: ~Revive native json.js:19:16
2 100.0% LazyCompile: ~Revive native json.js:19:16
2 100.0% LazyCompile: ~parse native json.js:43:19
2 100.0% LazyCompile: ~setupConfig internal/process.js:50:21
2 4.3% Function: ~<anonymous> timers.js:1:11
2 100.0% LazyCompile: ~NativeModule.compile node.js:387:44
2 100.0% LazyCompile: NativeModule.require node.js:321:34
2 100.0% LazyCompile: ~setupGlobalTimeouts node.js:211:31
2 4.3% Function: ~<anonymous> net.js:1:11
2 100.0% LazyCompile: ~NativeModule.compile node.js:387:44
2 100.0% LazyCompile: NativeModule.require node.js:321:34
2 100.0% Function: ~<anonymous> tty.js:1:11
2 4.3% Function: <anonymous> crypto.js:1:11
2 100.0% LazyCompile: ~NativeModule.compile node.js:387:44
2 100.0% LazyCompile: NativeModule.require node.js:321:34
2 100.0% LazyCompile: Module._load module.js:285:24
1 2.1% LazyCompile: ~startup node.js:13:19
1 100.0% Function: ~<anonymous> node.js:10:10
1 2.1% LazyCompile: ~slowToString buffer.js:414:22
1 100.0% LazyCompile: ~Buffer.toString buffer.js:480:37
1 100.0% LazyCompile: fs.readFileSync fs.js:417:27
1 100.0% LazyCompile: ~Module._extensions..js module.js:420:37
1 2.1% LazyCompile: ~setupConfig internal/process.js:50:21
1 100.0% LazyCompile: ~startup node.js:13:19
1 100.0% Function: ~<anonymous> node.js:10:10
1 2.1% LazyCompile: ~onwrite _stream_writable.js:323:17
1 100.0% LazyCompile: ~WritableState.onwrite _stream_writable.js:88:26
1 100.0% LazyCompile: ~Socket._writeGeneric net.js:628:42
1 100.0% LazyCompile: ~Socket._write net.js:697:35
1 2.1% LazyCompile: ~match native string.js:122:23
1 100.0% LazyCompile: ~<anonymous> node.js:353:58
1 100.0% LazyCompile: ~InnerArraySome native array.js:967:24
1 100.0% LazyCompile: ~some native array.js:987:19
1 2.1% LazyCompile: ~fs.Stats fs.js:109:20
1 100.0% /usr/local/bin/node
1 100.0% LazyCompile: ~fs.lstatSync fs.js:885:24
1 100.0% LazyCompile: ~realpathSync fs.js:1458:40
1 2.1% LazyCompile: ~exports.pbkdf2 crypto.js:534:26
1 100.0% LazyCompile: ~hash /Users/ghaiklor/Library/Projects/test.js:5:14
1 100.0% Function: ~<anonymous> /Users/ghaiklor/Library/Projects/test.js:1:11
1 100.0% LazyCompile: ~Module._compile module.js:380:37
1 2.1% LazyCompile: ~exports.inherits util.js:779:28
1 100.0% Function: ~<anonymous> _stream_readable.js:1:11
1 100.0% LazyCompile: ~NativeModule.compile node.js:387:44
1 100.0% LazyCompile: NativeModule.require node.js:321:34
1 2.1% LazyCompile: ~emitPendingUnhandledRejections internal/process/promises.js:37:42
1 100.0% LazyCompile: ~_tickCallback internal/process/next_tick.js:87:25
1 100.0% LazyCompile: ~Module.runMain module.js:445:26
1 100.0% LazyCompile: ~startup node.js:13:19
1 2.1% LazyCompile: ~bind native v8natives.js:1200:22
1 100.0% LazyCompile: ~Console console.js:5:17
1 100.0% Function: ~<anonymous> console.js:1:11
1 100.0% LazyCompile: ~NativeModule.compile node.js:387:44
1 2.1% LazyCompile: ~WritableState.onwrite _stream_writable.js:88:26
1 100.0% LazyCompile: ~Socket._writeGeneric net.js:628:42
1 100.0% LazyCompile: ~Socket._write net.js:697:35
1 100.0% LazyCompile: ~doWrite _stream_writable.js:293:17
1 2.1% LazyCompile: ~WritableState _stream_writable.js:26:23
1 100.0% LazyCompile: ~Writable _stream_writable.js:138:18
1 100.0% LazyCompile: ~Duplex _stream_duplex.js:23:16
1 100.0% LazyCompile: ~Socket net.js:115:16
1 2.1% LazyCompile: ~Uint8Array native typedarray.js:124:31
1 100.0% LazyCompile: ~createBuffer buffer.js:65:22
1 100.0% LazyCompile: ~createPool buffer.js:71:20
1 100.0% Function: ~<anonymous> buffer.js:1:11
1 2.1% LazyCompile: ~ToPropertyDescriptor native v8natives.js:237:30
1 100.0% LazyCompile: ~defineProperty native v8natives.js:766:30
1 100.0% LazyCompile: ~<anonymous> internal/streams/lazy_transform.js:20:19
1 100.0% LazyCompile: ~InnerArrayForEach native array.js:942:27
1 2.1% LazyCompile: ~ReadableState _stream_readable.js:15:23
1 100.0% LazyCompile: ~Readable _stream_readable.js:81:18
1 100.0% LazyCompile: ~Duplex _stream_duplex.js:23:16
1 100.0% LazyCompile: ~Socket net.js:115:16
1 2.1% LazyCompile: ~NativeModule.compile node.js:387:44
1 100.0% LazyCompile: NativeModule.require node.js:321:34
1 100.0% LazyCompile: ~setupGlobalVariables node.js:201:32
1 100.0% LazyCompile: ~startup node.js:13:19
1 2.1% LazyCompile: ~Module._resolveFilename module.js:325:35
1 100.0% LazyCompile: Module._load module.js:285:24
1 100.0% LazyCompile: ~Module.runMain module.js:445:26
1 100.0% LazyCompile: ~startup node.js:13:19
1 2.1% LazyCompile: ~DefineObjectProperty native v8natives.js:464:30
1 100.0% LazyCompile: ~DefineOwnProperty native v8natives.js:633:27
1 100.0% LazyCompile: ~defineProperty native v8natives.js:766:30
1 100.0% Function: ~<anonymous> events.js:1:11
1 2.1% LazyCompile: ~<anonymous> internal/process/stdio.js:23:46
1 100.0% Function: ~<anonymous> console.js:1:11
1 100.0% LazyCompile: ~NativeModule.compile node.js:387:44
1 100.0% LazyCompile: NativeModule.require node.js:321:34
1 2.1% LazyCompile: SHR native runtime.js:266:13
1 100.0% LazyCompile: ~isFd fs.js:104:14
1 100.0% LazyCompile: fs.readFileSync fs.js:417:27
1 100.0% LazyCompile: ~Module._extensions..js module.js:420:37
1 2.1% LazyCompile: Module._load module.js:285:24
1 100.0% LazyCompile: ~Module.runMain module.js:445:26
1 100.0% LazyCompile: ~startup node.js:13:19
1 100.0% Function: ~<anonymous> node.js:10:10
1 2.1% Function: ~<anonymous> node.js:10:10
1 2.1% Function: ~<anonymous> fs.js:1:11
1 100.0% LazyCompile: ~NativeModule.compile node.js:387:44
1 100.0% LazyCompile: NativeModule.require node.js:321:34
1 100.0% Function: ~<anonymous> module.js:1:11
47 43.9% /usr/lib/system/libsystem_kernel.dylib
1 2.1% /usr/local/bin/node
1 100.0% LazyCompile: NativeModule node.js:311:24
1 100.0% LazyCompile: NativeModule.require node.js:321:34
1 100.0% LazyCompile: ~startup node.js:13:19
1 100.0% Function: ~<anonymous> node.js:10:10

We can see now, that in Bottom up (heavy) profile not a lot of CPU ticks, comparing to 7431 at the previous time. Also, there is no hash function that calls pbkdf2.

This means that we have an optimized version of our server and there are no heavy functions.

Summary

This was just a simple example to show you the basics of Node.js application profiling. Knowing how to profile your applications and tools you can use to do it helps you make right decisions about what you need to optimize.

Thanks for reading! I hope it helped you understand how to profile Node.js applications a little better. Tweet me on Twitter if you have questions or suggestions about my articles.


Eugene Obrezkov, Developer Advocate at Onix-Systems, Kirovohrad, Ukraine.

One thought on “Profiling Node.js applications

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.