Illumination #001: Debugging Rustler on illumos

Welcome to SYSTEM•ILLUMINATION! This is the first illumination I have written and the one that prompted me to start this space. This first session tackles several topics as you join me on the journey I took to debug Rustler misbehaving on OmniOS.

I'm a beginner with illumos. This page serves a twofold purpose: to help me document and clarify my learnings as I delve into the illumos/Solaris world. And to shine a bit of light into a system that is fairly obscure and hard to get good info on. However, because I'm also learning it, I may be wrong on many things. Use the information in these pages as a starting point for your own investigations.

From Linux to illumos

Recently I decided I wanted to delve deeper into the illumos ecosystem; I had briefly used it a couple of years ago for a small server I was running. Two weeks ago, I had to decide on and build the final infrastructure for Katarineko a personal project I'm about to release. I had the choice between using my trusted Linux or giving in to the voice in my head that said USE ILLUMOS!. I did give in, maybe not the sanest choice, but it is my choice.

Katarineko is built with Elixir, a language that runs on the Erlang virtual machine. For reasons I will not get into here, I also have a couple of NIFs written in Rust. NIFs are the main way you can call into compiled languages from Erlang/Elixir. The Rustler project is a very straightforward way of integrating NIFs written in Rust, it handles compilation lifecycle, exposing all the right structures and functions, and serialization between Erlang and Rust, including support for Serde. We'll get into how this works a bit further down.

Migrating was quite simple. I used pkgsrc to get Elixir packages, set up some zones, and I was up and running within a day. At first, everything appeared to be running properly: there were no errors, and the page was running smoothly. I shared the beta version with a friend that was testing and he tried to use the Spaced-Repetition system. :erlang.nif_error(:nif_not_loaded). The Sun gods had provided an opportunity to explore the famous observability tooling of Solaris earlier than I thought. Let's get right to it!

dtrace primer

One reason for moving to illumos was to learn dtrace(8). dtrace is a dynamic tracing framework to observe the state and evolution of the system and user programs. We will use it extensively during this debugging session, so it is better to become acquainted with its syntax and capabilities. The D language is a mix between awk and C. Let's check an example:

#!/usr/sbin/dtrace -s syscall::open:entry, syscall::openat:entry1 /pid == $target/2 { printf("target used open syscall on %s\n", copyinstr(arg0)); 3 }

Each D program consists of a set of clauses. Each clause has 3 sections:

  • Probes that enable the clause
  • Predicates are expressions enclosed in slashes that are evaluated at probe firing time. If the expression evaluates to true, the actions will execute. Predicates are optional and can be omitted.
  • Actions are a list of C-like statements that will be executed if the probe fires and the predicate evaluates to true.

Another useful feature we'll use soon is thread-local variables, this allows you to conditionally run some actions only after a previous condition has been met. You'll see how this becomes useful very soon. Thread-local variables are referenced by using the -> operator to the identifier self. For example self->trace = 1 sets a thread-local variable named trace to the value one.

You can read more about DTrace in the Dynamic Tracing Guide

Erlang NIFs with Rustler

NIFs didn't load. How do NIFs usually load? The only thing I knew is that NIFs are loaded from a shared library, an .so file in Unix-like systems. So let's use DTrace to figure out what Erlang code is responsible for this to try to find where it goes wrong. Since it is a shared library object, it is probably using dlopen. For reasons I cannot remember, I traced the open(2) syscall instead. Here's a very small script I wrote for this purpose.

erlang_nif.d syscall::open:entry /pid == $target && strstr(copyinstr(arg0), "libkatarineko_srs") != NULL/ { printf("OPENED LIBKATARINEKO %s\n", copyinstr(arg0)); ustack(); } You can run it like so: # dtrace -s erlang_nif.d -c 'mix test'

This D script contains only one clause, which will execute the actions whenever the target process (specified by -p or -c) executes the open syscall with a first argument that contains the string "libkatarineko_srs". If that matches we'll print what the argument was and print the user-space stack (with ustack()). Running it gives us the following stack trace:

dtrace: script 'erlang_nif.d' matched 8 probes CPU ID FUNCTION:NAME 28 618 open:entry OPENED LIBKATARINEKO /root/katarineko/_build/prod/lib/katarineko/priv/native/libkatarineko_srs.so ld.so.1`_syscall6+0x1b ld.so.1`open+0x33 ld.so.1`file_open+0x456 ld.so.1`find_path+0x96 ld.so.1`load_so+0x240 ld.so.1`load_path+0x87 ld.so.1`load_one+0x15c ld.so.1`dlmopen_core+0x11e ld.so.1`dlmopen_intn+0x93 ld.so.1`dlmopen_check+0x71 ld.so.1`dlopen+0x59 beam.smp`erts_sys_ddll_open_noext+0x2f beam.smp`erts_sys_ddll_open+0x73 beam.smp`erts_load_nif+0x5b1 beam.smp`_Z17beam_jit_load_nifP7processPKvPm+0x3a 0xfffffbffe20012b0 0x3b

We quickly find which function in Erlang is responsible for loading the nifs. Let's take a look at the function and see what could be going wrong or what else we could inspect to understand the problem better. The function seems to take care of most error conditions that were going on through my mind. It seemed there was no problem loading the shared library. In case I was wrong, I added a user function trace for load_nif_error, but it was never called.

pid$target::load_nif_error:entry { printf("ERROR: %s\n", copyinstr(arg0)); }

To confirm that it was working properly, I traced the user function patch_call_nif_early that is called only in the case everything goes well.

if (ret == am_ok) { ... patch_call_nif_early(entry, this_mi); ... }

It was called, so Erlang appeared to load everything correctly and to consider the operation successful; however, no NIF was actually being loaded. I needed to understand a bit more how the shared library actually exposes the NIFs and how Erlang discovers the functions. Here I also cloned the Rustler repo and checked if the tests failed for their repo as well or if it was my own setup. All of their tests failed as well.

NIF Shared Libraries

The entry above is of type ErlNifEntry.

typedef struct enif_entry_t { int major; int minor; const char* name; int num_of_funcs; ErlNifFunc* funcs; int (*load) (ErlNifEnv*, void** priv_data, ERL_NIF_TERM load_info); int (*reload) (ErlNifEnv*, void** priv_data, ERL_NIF_TERM load_info); int (*upgrade)(ErlNifEnv*, void** priv_data, void** old_priv_data, ERL_NIF_TERM load_info); void (*unload) (ErlNifEnv*, void* priv_data); /* Added in 2.1 */ const char* vm_variant; /* Added in 2.7 */ unsigned options; /* Unused. Can be set to 0 or 1 (dirty sched config) */ /* Added in 2.12 */ size_t sizeof_ErlNifResourceTypeInit; /* Added in 2.14 */ const char* min_erts; } ErlNifEntry;

When Erlang loads a NIF shared library, it looks for a symbol nif_init and calls it with no arguments. This function returns an ErlNifEntry that, among other information, contains a field useful for us: num_of_funcs. I was thinking that maybe, for some reason, under illumos Rustler was not properly populating the function array.

Now, I wasn't sure how to use structs in DTrace, and I was in an AVE heading to my first Go tournament and I had very bad internet so I went with what I knew. I created a small shared library that internally called the actual NIF shared library. The idea was to be able to introspect what the NIF shared library was returning in the ErlNifEntry.

#include <dlfcn.h> #include <stdio.h> #include <stdlib.h> typedef struct enif_entry_t { int major; int minor; const char *name; int num_of_funcs; // We don't need any more of the struct for now } nif_entry; void* _init_nif() { const char *lib_path = "./priv/native/rustler_serde_test.so"; void *hdl = dlopen(lib_path, RTLD_NOW); void *sym = dlsym(hdl, "nif_init"); nif_entry* (*initfn)(void) = (nif_entry* (*)(void))sym; nif_entry *entry = initfn(); printf( "NIF loaded: %s (version %d.%d)\n", entry->name, entry->major, entry->minor ); printf("# Funcs: %d\n", entry->num_of_funcs); return (void*)entry; } void* init_nif() { return _init_nif(); }

I compiled this as a shared library and loaded this library instead of the original. It all loaded fine, the name and version looked good (so it seems we did have a valid entry). However the number of functions was zero. It seems that whatever Rustler does to gather all the NIFs wasn't working on illumos.

Rustler Internals

I dove into the Rustler codebase. It is fairly small but it contains a lot of fairly hard to follow code (at least for me) with several macros and string-based code generation. Since I knew that at some point it had to generate a NIF entry I looked for places where entry was referenced. Finally I found this section of the code.

let nif_funcs: Box<[_]> = rustler::codegen_runtime::inventory::iter::() .map(rustler::Nif::get_def) .collect();

By some (for now unkown) mechanism the functions were available in this inventory iterator. A quick println! debugging showed that indeed, nif_funcs was empty. I also saw that the NIFs were added to this inventory in the #[nif] macro.

quote! { rustler::codegen_runtime::inventory::submit!( rustler::Nif { ... } } }

Somehow, at the top level, submit! was adding structs to this inventory. And this mechanism intuitively seemed to be the culprit. I validated that these macros were being called for the functions I was exposing, but when collecting the NIFs, they were empty. One more layer, enter the Inventory crate.

The Inventory Crate

Inventory defines itself as

a way to set up a plugin registry into which plugins can be registered from any source file linked into your application. There does not need to be a central list of all the plugins

It also explains a bit the mechanism as to how it works

Inventory is built on runtime initialization functions similar to __attribute__((constructor)) in C, and similar to the ctor crate. Each call to inventory::submit! produces a shim that evaluates the given expression and registers it into a registry of its corresponding type. This registration happens dynamically as part of life-before-main for statically linked elements. Elements brought in by a dynamically loaded library are registered at the time that dlopen occurs.

Inventory is a very small library, a little under 600 lines of code, most of it comments and documentation. The most interesting and relevant part is the __do_submit macro . Let's dissect what the macro is doing.

It requires you to define a Plugin type. In this case it is rustler::Nif. Once the library is loaded, when you call the collect! macro from the crate it will implement a Registry trait for your type. This trait will later allow submitting the structs at runtime and building the linked list dynamically. Let's see how that works.

For each submit! it generates an __INVENTORY value that contains the rustler::Nif struct.

static __INVENTORY: $crate::Node = $crate::Node { value: &{ $($value)* }, next: $crate::core::cell::UnsafeCell::new($crate::core::option::Option::None), #[cfg(target_family = "wasm")] initialized: $crate::core::sync::atomic::AtomicBool::new(false), };

A function __ctor() is generated; when called at library load time, it submits the node to the registry.

unsafe extern "C" fn __ctor() { unsafe { $crate::ErasedNode::submit(__INVENTORY.value, &__INVENTORY) } }

Calling all of these functions populates a linked list (using the next attribute from the Node type) that will then be walked by the iter() method we saw earlier. The last piece of the puzzle adds the address of __ctor() to the .init_array section of the ELF file in the case of illumos.

$($used)+ #[cfg_attr( all( not(target_family = "wasm"), any( target_os = "linux", target_os = "android", target_os = "dragonfly", target_os = "freebsd", target_os = "haiku", target_os = "illumos", target_os = "netbsd", target_os = "openbsd", target_os = "none", ) ), link_section = ".init_array", )] #[cfg_attr( target_family = "wasm", $crate::__private::attr( any(all(stable, since(1.85)), since(2024-12-18)), link_section = ".init_array", ), )] #[cfg_attr( any(target_os = "macos", target_os = "ios"), link_section = "__DATA,__mod_init_func,mod_init_funcs", )] #[cfg_attr(windows, link_section = ".CRT$XCU")] static __CTOR: unsafe extern "C" fn() = __ctor;

ld.so.1

ld.so.1(1) is the illumos runtime linker for dynamic objects, and it is responsible for loading the shared library objects like our NIF library. One of the tasks after loading and resolving symbols is calling the initialization functions, there are several sections for this, but as mentioned before the one we care about is .init_array. .init_array is an array of pointers to functions that is run sequentially. These functions are executed before dlopen(3C) returns. Let's check if those were properly populated in the binary.

$ elfdump priv/native/rustler_serde_test.so | grep CTOR [1726] 0x00000000001e87f0 0x0000000000000008 OBJT LOCL D 0 .init_array _ZN18rustler_serde_test4test1_6__CTOR17hff0bbe6e5632e929E [1815] 0x00000000001e87f8 0x0000000000000008 OBJT LOCL D 0 .init_array _ZN18rustler_serde_test1_6__CTOR17hb75d58ed18ea6865E [1816] 0x00000000001e8800 0x0000000000000008 OBJT LOCL D 0 .init_array _ZN18rustler_serde_test1_6__CTOR17hc62992ae880cf163E [1817] 0x00000000001e8808 0x0000000000000008 OBJT LOCL D 0 .init_array _ZN18rustler_serde_test1_6__CTOR17hd4e36f059b2ed20eE [1914] 0x00000000001e8810 0x0000000000000008 OBJT LOCL D 0 .init_array _ZN18rustler_serde_test4json1_6__CTOR17hc50c07f6139fceefE [1915] 0x00000000001e8818 0x0000000000000008 OBJT LOCL D 0 .init_array _ZN18rustler_serde_test4json1_6__CTOR17h600856d690bcf7dcE [1916] 0x00000000001e8820 0x0000000000000008 OBJT LOCL D 0 .init_array _ZN18rustler_serde_test4json1_6__CTOR17h47804ad1bc4e3b93E [1917] 0x00000000001e8828 0x0000000000000008 OBJT LOCL D 0 .init_array _ZN18rustler_serde_test4json1_6__CTOR17hd7df8ce4201058e0E [1918] 0x00000000001e8830 0x0000000000000008 OBJT LOCL D 0 .init_array _ZN18rustler_serde_test4json1_6__CTOR17h558cc91143bf9f1bE [1919] 0x00000000001e8838 0x0000000000000008 OBJT LOCL D 0 .init_array _ZN18rustler_serde_test4json1_6__CTOR17h63d284544c9c3173E

So they are where they are supposed to be; what's unclear is whether they are properly called. Since it is not working maybe the illumos runtime linker doesn't take into account .init_array? The other option is that the functions themselves are not working but I find that unlikely and I don't think there should be a difference between Linux and Ilumos in this case. Let's test the theory and check if they are being called at all. Since the symbols will not be available until dlopen has been called we'll need to employ a bit of a trick with dtrace.

We'll start with a script that stops the process when we are about to start the dlopen(3) initialization section, let's browse the linker code to find a good place to trace. Once we find it, the idea is to determine when we open the library we care about, and then stop before initialization (but after symbols are loaded; very important, since we need to be able to attach probes to the __CTOR symbols). The call_init function seems like a good place to do so. Let's write the D script to stop the process once we arrive there.

pid$target::dlopen:entry /strstr(copyinstr(arg0), "rustler_serde_test.so") != NULL/ { printf("DLOPEN: %s\n", copyinstr(arg0)); self->trace = 1; } pid$target::dlopen:return /self->trace == 1/ { printf("DLOPEN RETURN"); self->trace = 0; } pid$target::call_init:entry /self->trace == 1/ { printf("PREINIT: %s %d", copyinstr(arg0), pid); stop(); }

This script has three clauses, the first one checks when we are executing dlopen(3) for the target library we want. In that case we set a thread-local variable stating that we are intending to trace, the second clause removes that thread-local variable so no other calls to call_init stop the process. The third clause ensures we stop the process just before starting the initialization code. We need to call this script with DTrace's -w flag, to allow destructive operations (like stopping the process).

Once we have the process stopped right were we want to, we can do a user-function tracing like we did before for one of the __CTOR symbols.

pid$target::_ZN18rustler_serde_test1_6__CTOR17hc62992ae880cf163E:entry

Now we kill the first dtrace program since we can only trace a program once and run the second D script. That will automatically resume the process and output trace logs if any of the probes matched. They did not. Does ld.so.1(1) support .init_array?.

Et tu, ELF?

It does support it. I examined the code and, from the same call_init function we traced, we can see that it calls the functions in .init_array

void call_init(Rt_map **tobj, int flag) { ... call_array(INITARRAY(lmp), INITARRAYSZ(lmp), lmp, SHT_INIT_ARRAY); ... }

You can also check the code for call_array . It is very straightforward, iterate over the array and call each function in turn. So, why aren't they being called?. Let's try to get some more information from the call_init function. This function takes as its first parameter, an Rt_map. Let's see what that looks like and if we can get any extra information from there. For example the size of the array. We'll write a D script that introspects into this struct.

struct rt_map { ... Addr *rt_initarray; /* .init_array table */ Addr *rt_finiarray; /* .fini_array table */ Addr *rt_preinitarray; /* .preinit_array table */ ... uint_t rt_initarraysz; /* size of .init_array table */ uint_t rt_finiarraysz; /* size of .fini_array table */ uint_t rt_preinitarraysz; /* size of .preinit_array table */ ... };

I have zoomed in on the relevant fields, you can check the whole struct here. Let's modify the stop() script to print the rt_initarraysz that is passed to call_array.

pid$target::dlopen:entry /strstr(copyinstr(arg0), "rustler_serde_test.so") != NULL/ { printf("DLOPEN: %s\n", copyinstr(arg0)); self->trace = 1; } pid$target::dlopen:return /self->trace == 1/ { printf("DLOPEN RETURN"); self->trace = 0; } - pid$target::call_init:entry - /self->trace == 1/ - { - printf("PREINIT: %s %d", copyinstr(arg0), pid); - stop(); - } + pid$target::call_array:entry + /self->trace == 1/ + { + printf("CALL ARRAY (size: %d)", arg1); + }

When I ran this I got: CALL ARRAY (size: 8), now taking into account that this is in bytes, this means the array only has one element!

Let's check what elfedit(1) has to say about the .init_array section and why it reports a size of 1 when I see 10 __CTOR symbols inside.

$ elfedit -e 'shdr:dump .init_array' ./priv/native/rustler_serde_test.so Section Header[693]: sh_name: .init_array sh_addr: 0x1e87e8 sh_flags: [ SHF_WRITE SHF_ALLOC ] sh_size: 0x8 sh_type: [ SHT_INIT_ARRAY ] sh_offset: 0x1d87e8 sh_entsize: 0x8 (1 entry) sh_link: 0 sh_info: 0 sh_addralign: 0x8 Section Header[694]: sh_name: .init_array sh_addr: 0x1e87f0 sh_flags: [ SHF_WRITE SHF_ALLOC 0x100000 ] sh_size: 0x50 sh_type: [ SHT_INIT_ARRAY ] sh_offset: 0x1d87f0 sh_entsize: 0x8 (10 entries) sh_link: 0 sh_info: 0 sh_addralign: 0x8

Ooh.. we get TWO .init_array sections!! And the first one has one entry (0x8 bytes). Looking at the ld.so.1(1) call_init code again I can see that it only expects a single .init_array section, and I guess it takes the first one it sees? (The answer is no, but I didn't know yet).

Et tu, /usr/bin/ld?

Okay, so how do I fix this? Several things came to mind, first thing was to create a link script, however it seems that the illumos linker is a bit more limited in this aspect than the GNU one I'm (more or less) used to. There is a thing called a mapfile but the documentation is honestly pretty bad and I couldn't figure it out.

The next thing I tested was creating a C program using libelf(3lib) to try to merge both sections. However, when I did, elfdump(1) reported many errors about broken references. In between those complaints I saw one thing that caught my eye.

DT_INIT_ARRAY does not point to a section of type SHT_INIT_ARRAY. DT_INIT_ARRAYSZ is invalid

Hmmm, what is this? These are .dynamic table entries, and they tell the runtime linker where to obtain certain information. And these two entries in particular tell ld.so.1(1) which .init_array section to use! elfdump(1) has the -d flag to dump the dynamic section.

$ elfdump -d priv/native/rustler_serde_test.so Dynamic Section: .dynamic index tag value [0] NEEDED 0x564e9 libsocket.so.1 [1] NEEDED 0x56528 librt.so.1 [2] NEEDED 0x5653c libpthread.so.1 [3] NEEDED 0x56555 libumem.so.1 [4] NEEDED 0x56562 libgcc_s.so.1 [5] NEEDED 0x5658a libc.so.1 [6] INIT 0x1aac60 [7] FINI 0x1aac70 [8] RUNPATH 0x56630 /usr/gcc/14/lib/amd64 [9] RPATH 0x56630 /usr/gcc/14/lib/amd64 [10] HASH 0x2b930 [11] STRTAB 0x48130 [12] STRSZ 0x56846 [13] SYMTAB 0x398f8 [14] SYMENT 0x18 [15] SUNW_SYMTAB 0x306a0 [16] SUNW_SYMSZ 0x17a90 [17] SUNW_SORTENT 0x4 [18] SUNW_SYMSORT 0x9ff18 [19] SUNW_SYMSORTSZ 0x3b3c [20] CHECKSUM 0x2c97 [21] VERNEED 0x9e978 [22] VERNEEDNUM 0x6 [23] RELACOUNT 0x8ad [24] FINI_ARRAY 0x1e87e0 [25] FINI_ARRAYSZ 0x8 [26] INIT_ARRAY 0x1e87e8 [27] INIT_ARRAYSZ 0x8 [28] PLTRELSZ 0x3cc0 [29] PLTREL 0x7 [30] JMPREL 0xb1c00 [31] RELA 0xa3a58 [32] RELASZ 0x11e68 [33] RELAENT 0x18 [34] SYMINFO 0x29278 [35] SYMINSZ 0x26b4 [36] SYMINENT 0x4 [37] FLAGS 0 0 [38] FLAGS_1 0x208000 [ DISPLACE-RELOCS-DONE EDITED ] [39] SUNW_STRPAD 0x200 [40] SUNW_LDMACH 0x3e EM_AMD64 [41] PLTGOT 0x1e7088 [42-52] NULL 0

So, what happens if I change these values to the values for the other .init_array section? We can do that easily with elfedit(1). Note that we also set the sh_type of the "wrong" section to SHT_NULL, I'm not sure if it is needed but elfdump(1) complains if I don't do it.

$ elfedit priv/native/rustler_serde_test.so > dyn:value DT_INIT_ARRAY 0x1e87f0 > dyn:value DT_INIT_ARRAYSZ 0x50 > shdr:sh_type -shndx 693 SHT_NULL > sys:write > sys:quit

Aaaand... it works! NIFs now are properly loaded.

Next steps and open questions

For now this has unblocked the release of my project, I've added an extra step when building to change the dynamic table values so the corrent .init_array gets called. However, a question remains, why are two separate sections created while in Linux there's only one section? It must of course be a difference in the linkers. I haven't found a way to force the linker to merge both sections, if you know let me know at marce [at] dziban [dot] net. My best guess is that both sections have different flags, which prevents the sections from merging. The "wrong" section has sh_flags: [ SHF_WRITE SHF_ALLOC ] while the correct one has sh_flags: [ SHF_WRITE SHF_ALLOC 0x100000 ]. That 0x100000 flag is a Solaris flag that is not recognized by illumos, from my research it corresponds to SHF_SUNW_NODISCARD, which prevents the compiler and linker from discarding the section even if it appears unused. I wonder what's adding that flag? Is it LLVM? Maybe the #[used] macro from the Inventory crate?