How Node.js requires native shared objects

Recently, I faced an issue with requiring native bindings in JavaScript code, so I researched it. If you have ever written something like require('my_module.node') without knowing how it works from the JavaScript side, this article is for you.

What are .node files and why do we need them?

Sometimes we use npm packages that ship native bindings. Sometimes we write our own C++ code to extend Node.js. In both cases, the native code has to be compiled into an object that Node.js can load.

That’s what .node files are for. A .node file is a dynamically linked shared object that Node.js can load into its process. By analogy, .node files are very similar to .dll or .so files.

Where does the require method come from?

Before digging into internals, let’s remember where require() comes from.

All the JavaScript files are actually wrapped into functions:

const WRAPPER = [
  '(function (exports, require, module, __filename, __dirname) { ',
  '\n})'
];
const JS_SOURCE = 'script here';
const WRAPPED_SCRIPT = WRAPPER[0] + JS_SOURCE + WRAPPER[1];

So, let us say, you have some index.js file with the following content:

const fs = require('fs');

When Node.js tries to load it, it will look like this:

(function (exports, require, module, __filename, __dirname) {
  const fs = require('fs');
})

This means that every file/script is a function that Node.js calls when needed. It also means that require() is simply an argument supplied when that function is called.
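
To make the wrapping step concrete, here is a minimal sketch that wraps a source string by hand and calls the resulting function. The helper name runAsModule and the path /tmp/demo.js are made up for illustration; Node.js does considerably more than this internally.

```javascript
const vm = require('vm');
const path = require('path');

// Same shape as the wrapper shown above.
const WRAPPER = [
  '(function (exports, require, module, __filename, __dirname) { ',
  '\n})'
];

// Hypothetical helper: treat a source string as if it were a module file.
function runAsModule(source, filename) {
  const wrapped = WRAPPER[0] + source + WRAPPER[1];
  // Evaluating the wrapped text yields a plain function.
  const fn = vm.runInThisContext(wrapped, { filename });
  const mod = { exports: {} };
  // We decide what the script sees as exports, require, module, and so on.
  fn.call(mod.exports, mod.exports, require, mod, filename, path.dirname(filename));
  return mod.exports;
}

const result = runAsModule(
  "const os = require('os'); module.exports = { eol: os.EOL, file: __filename };",
  '/tmp/demo.js'
);
console.log(result.file);
```

The script never declares require or __filename itself; it only sees them because the caller passes them in as arguments.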

That function is called in the NativeModule.prototype.compile() method:

NativeModule.prototype.compile = function() {
  var source = NativeModule.getSource(this.id);
  source = NativeModule.wrap(source);
  this.loading = true;
  try {
    const fn = runInThisContext(source, {
      filename: this.filename,
      lineOffset: 0,
      displayErrors: true
    });
    fn(this.exports, NativeModule.require, this, this.filename);
    this.loaded = true;
  } finally {
    this.loading = false;
  }
};

As we can see, the require argument points to NativeModule.require().

However, there is another type of module loader. NativeModule loads internal modules, while Module loads your own (userland) modules.

Module.prototype._compile() has a similar implementation:

// Run the file contents in the correct scope or sandbox. Expose
// the correct helper variables (require, module, exports) to
// the file.
// Returns exception, if any.
Module.prototype._compile = function(content, filename) {
  // Remove shebang
  var contLen = content.length;
  if (contLen >= 2) {
    if (content.charCodeAt(0) === 35/*#*/ &&
        content.charCodeAt(1) === 33/*!*/) {
      if (contLen === 2) {
        // Exact match
        content = '';
      } else {
        // Find end of shebang line and slice it off
        var i = 2;
        for (; i < contLen; ++i) {
          var code = content.charCodeAt(i);
          if (code === 10/*\n*/ || code === 13/*\r*/)
            break;
        }
        if (i === contLen)
          content = '';
        else {
          // Note that this actually includes the newline character(s) in the
          // new output. This duplicates the behavior of the regular expression
          // that was previously used to replace the shebang line
          content = content.slice(i);
        }
      }
    }
  }
  // create wrapper function
  var wrapper = Module.wrap(content);
  var compiledWrapper = vm.runInThisContext(wrapper, {
    filename: filename,
    lineOffset: 0,
    displayErrors: true
  });
  if (process._debugWaitConnect && process._eval == null) {
    if (!resolvedArgv) {
      // we enter the repl if we're not given a filename argument.
      if (process.argv[1]) {
        resolvedArgv = Module._resolveFilename(process.argv[1], null);
      } else {
        resolvedArgv = 'repl';
      }
    }
    // Set breakpoint on module start
    if (filename === resolvedArgv) {
      delete process._debugWaitConnect;
      const Debug = vm.runInDebugContext('Debug');
      Debug.setBreakPoint(compiledWrapper, 0, 0);
    }
  }
  var dirname = path.dirname(filename);
  var require = internalModule.makeRequireFunction.call(this);
  var args = [this.exports, require, this, filename, dirname];
  var depth = internalModule.requireDepth;
  if (depth === 0) stat.cache = new Map();
  var result = compiledWrapper.apply(this.exports, args);
  if (depth === 0) stat.cache = null;
  return result;
};

Here, _compile() wraps the source into a function and calls it. In this case, the require argument is the function returned by internalModule.makeRequireFunction.call(this).

So, Node.js uses different loaders for different modules: NativeModule and Module. From here on, we will talk about Module only.
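
One way to observe the split between the two loaders from userland is to check that requiring a built-in module leaves the userland cache untouched. This is only a sketch: it relies on Module.builtinModules and require.cache, which are available in current Node.js releases but are not mentioned in the sources quoted here.

```javascript
const Module = require('module');

// Count userland cache entries before and after requiring a built-in module.
const before = Object.keys(require.cache).length;
require('zlib');
const after = Object.keys(require.cache).length;

// 'zlib' is listed as a built-in, so it is served by the internal loader.
const isBuiltin = Module.builtinModules.includes('zlib');
console.log(isBuiltin, before === after);
```

Built-in modules are handled by the internal loader, so they do not show up as entries in the userland module cache.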

Module has the following require() implementation:

Module.prototype.require = function(path) {
  assert(path, 'missing path');
  assert(typeof path === 'string', 'path must be a string');
  return Module._load(path, this, /* isMain */ false);
};

So the require() function we use so heavily ends up calling Module.prototype.require(). Dropping the details, all you need to remember is require() -> Module.prototype.require().
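
A quick way to check this delegation yourself is to temporarily replace Module.prototype.require and record what passes through it. This is a demonstration-only sketch (the seen array is made up for illustration); patching the prototype like this is not something to leave in production code.

```javascript
const Module = require('module');

const original = Module.prototype.require;
const seen = [];

// Temporarily intercept every require() call made through Module.
Module.prototype.require = function (id) {
  seen.push(id);
  return original.call(this, id);
};

try {
  require('os');
} finally {
  // Always restore the original method.
  Module.prototype.require = original;
}

console.log(seen.includes('os'));
```

If the plain require('os') call shows up in seen, the require function handed to our module really does go through Module.prototype.require().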

Requiring .node file

OK, so now we know what require() is in our code. What happens if we require a .node file?

const myBindings = require('./build/Release/mybinding.node');

What happened there? What was going on inside require()?

Well, first it goes into Module.prototype.require(), which calls Module._load() with the provided path, in our case ./build/Release/mybinding.node. Here is the implementation:

// Check the cache for the requested file.
// 1. If a module already exists in the cache: return its exports object.
// 2. If the module is native: call `NativeModule.require()` with the
// filename and return the result.
// 3. Otherwise, create a new module for the file and save it to the cache.
// Then have it load the file contents before returning its exports
// object.
Module._load = function(request, parent, isMain) {
  if (parent) {
    debug('Module._load REQUEST %s parent: %s', request, parent.id);
  }
  var filename = Module._resolveFilename(request, parent, isMain);
  var cachedModule = Module._cache[filename];
  if (cachedModule) {
    return cachedModule.exports;
  }
  if (NativeModule.nonInternalExists(filename)) {
    debug('load native module %s', request);
    return NativeModule.require(filename);
  }
  var module = new Module(filename, parent);
  if (isMain) {
    process.mainModule = module;
    module.id = '.';
  }
  Module._cache[filename] = module;
  tryModuleLoad(module, filename);
  return module.exports;
};
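
The cache check at the top of Module._load() can be observed from userland through require.cache. The sketch below writes a throwaway module into a temporary directory (the file name counter.js is made up for illustration) and requires it twice.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway module on disk so there is something to require.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'require-cache-'));
const file = path.join(dir, 'counter.js');
fs.writeFileSync(file, 'module.exports = { hits: 0 };');

const first = require(file);
first.hits += 1;
const second = require(file); // served from the cache, file is not re-run

// The cache is keyed by the resolved filename.
const resolved = require.resolve(file);
const sameObject = first === second;
const cachedExports = require.cache[resolved].exports === first;
console.log(sameObject, cachedExports, second.hits);
```

Both calls return the very same exports object, which is why the change made through first is visible through second.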

It checks whether our module exists in the cache and, if not, creates a Module instance and calls tryModuleLoad() with that instance and the filename of our binding. Essentially, all tryModuleLoad() does is call load() on that instance. Here is the implementation of load():

// Given a file name, pass it to the proper extension handler.
Module.prototype.load = function(filename) {
  debug('load %j for module %j', filename, this.id);
  assert(!this.loaded);
  this.filename = filename;
  this.paths = Module._nodeModulePaths(path.dirname(filename));
  var extension = path.extname(filename) || '.js';
  if (!Module._extensions[extension]) extension = '.js';
  Module._extensions[extension](this, filename);
  this.loaded = true;
};

Here, it looks up the file extension in Module._extensions. This object maps extensions to functions that handle loading of the corresponding file types. At the time of writing, it contains handlers for .js, .json, and .node files, and I doubt that will change much.

So, if it finds the extension in that list, in our case .node, it calls the handler with the path to the module you want to require. For the .node extension, the handler calls process.dlopen(), which is a binding from the Node.js C++ sources exposed to the JavaScript context.

// Native extension for .node
Module._extensions['.node'] = function(module, filename) {
  return process.dlopen(module, path._makeLong(filename));
};
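
The extension lookup in load() can be reproduced in a few lines. The helper pickExtension below is hypothetical and only mirrors the two lines of load() that choose a handler; it reads the real Module._extensions object but does not load anything. The file names are examples only.

```javascript
const path = require('path');
const Module = require('module');

// Hypothetical helper mirroring the handler selection in Module.prototype.load().
function pickExtension(filename, handlers) {
  let extension = path.extname(filename) || '.js';
  // Unknown extensions fall back to the JavaScript handler.
  if (!handlers[extension]) extension = '.js';
  return extension;
}

const handlers = Module._extensions;
const known = Object.keys(handlers);
console.log(known);

const forBinding = pickExtension('./build/Release/mybinding.node', handlers);
const forNoExtension = pickExtension('./bin/cli', handlers);
console.log(forBinding, forNoExtension);
```

A path ending in .node selects the native handler, while a file without an extension is treated as JavaScript.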

process.dlopen() works much like loading .dll or .so files on Windows and Linux. Here is the implementation of the C++ function that is exposed to the JavaScript context as process.dlopen():

// DLOpen is process.dlopen(module, filename).
// Used to load 'module.node' dynamically shared objects.
void DLOpen(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);
  uv_lib_t lib;
  CHECK_EQ(modpending, nullptr);
  if (args.Length() != 2) {
    env->ThrowError("process.dlopen takes exactly 2 arguments.");
    return;
  }
  Local<Object> module = args[0]->ToObject(env->isolate());  // Cast
  node::Utf8Value filename(env->isolate(), args[1]);  // Cast
  const bool is_dlopen_error = uv_dlopen(*filename, &lib);
  // Objects containing v14 or later modules will have registered themselves
  // on the pending list. Activate all of them now. At present, only one
  // module per object is supported.
  node_module* const mp = modpending;
  modpending = nullptr;
  if (is_dlopen_error) {
    Local<String> errmsg = OneByteString(env->isolate(), uv_dlerror(&lib));
    uv_dlclose(&lib);
#ifdef _WIN32
    // Windows needs to add the filename into the error message
    errmsg = String::Concat(errmsg, args[1]->ToString(env->isolate()));
#endif  // _WIN32
    env->isolate()->ThrowException(Exception::Error(errmsg));
    return;
  }
  if (mp == nullptr) {
    uv_dlclose(&lib);
    env->ThrowError("Module did not self-register.");
    return;
  }
  if (mp->nm_version != NODE_MODULE_VERSION) {
    char errmsg[1024];
    snprintf(errmsg,
             sizeof(errmsg),
             "The module '%s'"
             "\nwas compiled against a different Node.js version using"
             "\nNODE_MODULE_VERSION %d. This version of Node.js requires"
             "\nNODE_MODULE_VERSION %d. Please try re-compiling or "
             "re-installing\nthe module (for instance, using `npm rebuild` or "
             "`npm install`).",
             *filename, mp->nm_version, NODE_MODULE_VERSION);
    // NOTE: `mp` is allocated inside of the shared library's memory, calling
    // `uv_dlclose` will deallocate it
    uv_dlclose(&lib);
    env->ThrowError(errmsg);
    return;
  }
  if (mp->nm_flags & NM_F_BUILTIN) {
    uv_dlclose(&lib);
    env->ThrowError("Built-in module self-registered.");
    return;
  }
  mp->nm_dso_handle = lib.handle;
  mp->nm_link = modlist_addon;
  modlist_addon = mp;
  Local<String> exports_string = env->exports_string();
  Local<Object> exports = module->Get(exports_string)->ToObject(env->isolate());
  if (mp->nm_context_register_func != nullptr) {
    mp->nm_context_register_func(exports, module, env->context(), mp->nm_priv);
  } else if (mp->nm_register_func != nullptr) {
    mp->nm_register_func(exports, module, mp->nm_priv);
  } else {
    uv_dlclose(&lib);
    env->ThrowError("Module has no declared entry point.");
    return;
  }
  // Tell coverity that 'handle' should not be freed when we return.
  // coverity[leaked_storage]
}

It tries to load the shared object via the libuv API (uv_dlopen()) and, if everything works as expected, calls the module's registration function with the exports object, which is how the addon's bindings become visible in the JavaScript context.
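
The error branch of DLOpen is easy to trigger from JavaScript without building an addon. The sketch below calls process.dlopen() directly on a path that is assumed not to exist (the file name is invented for illustration) and captures the exception raised when uv_dlopen() fails. The exact message comes from the operating system's loader, so it differs between platforms.

```javascript
const os = require('os');
const path = require('path');

// A path that should not exist on any machine.
const missing = path.join(os.tmpdir(), 'does-not-exist-' + process.pid + '.node');

// process.dlopen() expects a module-like object with an exports property.
const target = { exports: {} };
let message = null;

try {
  process.dlopen(target, missing);
} catch (err) {
  // Thrown from the is_dlopen_error branch shown above.
  message = err.message;
}

console.log(message);
```

Calling require() on a broken or missing .node file surfaces the same kind of error, since the .node extension handler is just a thin wrapper around process.dlopen().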

Summary

Basically, that’s how require('binding.node') works, so you can compile C++ code into a shared object using node-gyp and require it in your JavaScript code.

Don’t forget to follow me here if you’re interested in such things. Get in touch with me on Twitter. Ask questions. Thanks for reading.

Related articles/videos

How does NodeJS work?
Creating Native Addons — General Principles
Addons API


Eugene Obrezkov, Senior Node.js Developer at Kharkov, Ukraine.
