Wkhtmltopdf / Wkhtmltoimage x CentOS x Xen = Segfault Mania

The Behance Network uses a number of nifty little binary files to power some of the useful services that our platform offers. Two of them in particular are wkhtmltoimage and wkhtmltopdf ( both 64-bit static binary files ). These two tools convert HTML to either a PDF or a thumbnail image of a full webpage and dump the output. They work flawlessly on both our sandbox environments ( Ubuntu 11.04 ) and our production image servers ( CentOS 5.5 ). However, when we try to execute them on one of our new Imageservice cloud servers ( CentOS 5.4 ), we receive the dreaded:

“Segmentation fault”

Let’s start off with the basics: what exactly is a segfault?

According to Wikipedia ( http://en.wikipedia.org/wiki/Segmentation_fault ):

A segmentation fault (often shortened to segfault) or bus error is generally an attempt to access memory that the CPU cannot physically address. It occurs when the hardware notifies a Unix-like operating system about a memory access violation. The OS kernel then sends a signal to the process which caused the exception. By default, the process receiving the signal dumps core and terminates.
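To make the "sends a signal ... and terminates" part concrete, here is a minimal sketch of how a parent process can tell that a child died from SIGSEGV rather than exiting normally. The crashing child is a stand-in (a Python process that sends itself the signal), not wkhtmltopdf, and the helper name run_and_describe is made up for this illustration. It assumes a POSIX system.

```python
import signal
import subprocess
import sys


def run_and_describe(cmd):
    """Run cmd and return a short description of how it ended."""
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    if result.returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as return code -N.
        return "killed by " + signal.Signals(-result.returncode).name
    return "exited with status %d" % result.returncode


# Stand-in for a crashing binary: a child process that sends itself SIGSEGV.
crasher = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)",
]
print(run_and_describe(crasher))
```

A shell reports the same event as exit status 139 ( 128 + 11, where 11 is SIGSEGV on Linux ).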

But surely we can obtain a little information about why this is happening, no? Let’s try a couple things.

First, let’s try to use “strace” ( trace system calls / signals … should help us figure out if we’re missing a dependency or something like that ):

$ strace ./wkhtmltopdf-amd64
execve("./wkhtmltopdf-amd64", ["./wkhtmltopdf-amd64"], [/* 24 vars */]) = 0
mmap(0x26d8000, 34316715, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, 0, 0) = 0x26d8000
readlink("/proc/self/exe", "/path/to/imageserice/library/wkhtml/wkhtmltopdf-amd64"..., 4096) = 72
mmap(0x400000, 36536320, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x400000
mmap(0x400000, 32300155, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x400000
mprotect(0x400000, 32300155, PROT_READ|PROT_EXEC) = 0
mmap(0x24cd000, 1950128, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0x1ecd000) = 0x24cd000
mprotect(0x24cd000, 1950128, PROT_READ|PROT_WRITE) = 0
mmap(0x26aa000, 188272, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x26aa000
brk(0x26d8000) = 0x26d8000
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

Well, that was useless. Right when it starts executing, we get hit with a Segfault ( SIGSEGV ).

Okay, let’s try using “gdb” to debug the binary and see what’s going on “inside” while it executes. The “bt” command ( backtrace ) should show us where the program was right before it crashed.

$ gdb
GNU gdb Fedora (6.8-37.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Use the "file" or "exec-file" command.
(gdb) file ./wkhtmltopdf-amd64
Reading symbols from /path/to/imageserice/library/wkhtml/wkhtmltopdf-amd64...(no debugging symbols found)...done.
(gdb) run
Starting program: /path/to/imageserice/library/wkhtml/wkhtmltopdf-amd64 ./wkhtmltopdf-amd64

Program received signal SIGSEGV, Segmentation fault.
0x0000000003166025 in ?? ()
(gdb) bt
#0 0x0000000003166025 in ?? ()
#1 0x0000000003166438 in ?? ()
#2 0x0000000000000000 in ?? ()
(gdb)

Again, useless. Crashes before it even starts.

What’s the big difference between our Imageservice in production and the new one we plan on using?

Imageservice ( CentOS 5.5 )    vs     Imageservice.2 ( CentOS 5.4 )

One word:       Xen

What the hell is Xen? Wikipedia ( http://en.wikipedia.org/wiki/Xen ):

The Xen® hypervisor, the powerful open source industry standard for virtualization, offers a powerful, efficient, and secure feature set for virtualization of x86, x86_64, IA64, ARM, and other CPU architectures
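If you are not sure whether a given box is a Xen guest, a rough check can be sketched like this. It relies on two Linux conventions: /sys/hypervisor/type containing "xen", and a /proc/xen directory being present on Xen guests. Treat it as a heuristic, since either may be missing depending on kernel version and configuration. The function name is_xen_guest is made up for this example.

```python
from pathlib import Path


def is_xen_guest(sys_type=Path("/sys/hypervisor/type"),
                 proc_xen=Path("/proc/xen")):
    """Best-effort guess at whether this Linux machine runs under Xen."""
    try:
        if sys_type.read_text().strip().lower() == "xen":
            return True
    except OSError:
        # The file is usually absent on bare metal and other hypervisors.
        pass
    # Fallback: Xen guests usually expose a /proc/xen directory.
    return proc_xen.is_dir()


print("Xen guest:", is_xen_guest())
```

The two paths are parameters only so the check can be exercised against fake files. In normal use you would call it with no arguments.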

Simply put, Xen is a hypervisor: our new cloud servers are virtual machines running on top of it, typically with a Xen-aware Linux kernel rather than the stock one. RightScale ( our web-based cloud computing management platform ) deploys its cloud servers this way. The creator of the wkhtmltoimage / wkhtmltopdf tools currently does NOT support Xen ( and probably won’t ). As a result, the 64-bit static binaries FAIL upon execution every single time. Currently the only solution, ironically, is to use the 32-bit versions of these tools. Hopefully this saves someone out there from spending a couple of days trying to debug this issue!
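One way to wire the workaround into a deploy script is to smoke-test each binary and keep the first one that survives. This is only a sketch under a few assumptions: the 64-bit file name is taken from the transcripts above, the 32-bit file name is a guess at how the static build is named, and the probe assumes the tool accepts --version and exits with status 0. The helper first_working_binary is hypothetical.

```python
import subprocess


def first_working_binary(candidates, probe_args=("--version",)):
    """Return the first candidate that runs probe_args and exits with 0."""
    for path in candidates:
        try:
            result = subprocess.run(
                [path, *probe_args],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=30,
            )
        except (OSError, subprocess.TimeoutExpired):
            # Missing file, not executable, wrong architecture, or a hang.
            continue
        # A segfault shows up as a negative return code, so it is skipped too.
        if result.returncode == 0:
            return path
    return None


# Assumed file names: prefer the 64-bit build, fall back to the 32-bit one.
choice = first_working_binary(["./wkhtmltopdf-amd64", "./wkhtmltopdf-i386"])
print(choice or "no usable wkhtmltopdf binary found")
```

Probing the binary directly, instead of checking for Xen, means the same script keeps working if a later 64-bit build starts running under Xen.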

Now back to your regularly scheduled programming.

– Malcolm
@bossjones