Hacking window titles to help OBS

This write-up presents the rationale and technical details behind a tiny project I wrote the other day, WTH, or WindowTitleHack, which forces a constant window name on apps that keep changing it (I'm looking specifically at Firefox and Krita, but there are probably many others).

Why tho?

I've been streaming on Twitch from Linux (X11) with a bare-bones OBS Studio setup for a while now, and while most of the experience has been relatively smooth, one particularly striking frustration has been dealing with window detection.

If we don't want to capture the whole desktop, whether for privacy reasons or simply to control the scene layout depending on the currently focused app, we need to rely on the Window Capture (XComposite) source. This works mostly fine, and it is actually able to track windows even when their title changes. But upon restart it can't find them again because both the window titles and the window IDs have changed, meaning we have to redo our setup by reselecting every window.

It would have been acceptable if that was the only issue I had, but one of the more advanced features I'm extensively using is the Automatic Scene Switcher (the builtin one, available through the Tools menu). This tool is a basic window title pattern matching system that allows automatic scene switches depending on the current window. Note that it does seem to support regex, which could help with the problem, but there is no guarantee that the app would leave a recognizable pattern in its title. Also, if we have multiple Firefox windows but only want to match one in particular, the regex wouldn't help.

Hacking Windows

One unreliable hack would be to spam xdotool commands to correct the window title. This could be a resource hog, and it would create quite a few race conditions. One slight improvement over this would be to use xprop -spy, but that wouldn't address the races either (since we would adjust the title after it has already been changed).

So how do we deal with that properly? Well, on X11 with the reference library (Xlib) there are various (actually a lot of) ways of changing the title bar. It took me a while to identify which call(s) to target, but I ended up with the following call graph, where each function is exposed publicly:

From this we can easily see that we only need to hook the deepest function XChangeProperty, and check if the property is XA_WM_NAME (or its "modern" sibling, _NET_WM_NAME).

How do we do that? With the help of the LD_PRELOAD environment variable and a dynamic library that implements a custom XChangeProperty.

First, we grab the original function:

#define _GNU_SOURCE /* must come first: needed for RTLD_NEXT */
#include <dlfcn.h>
#include <X11/Xlib.h>

/* A type matching the prototype of the target function */
typedef int (*XChangeProperty_func_type)(
    Display *display,
    Window w,
    Atom property,
    Atom type,
    int format,
    int mode,
    const unsigned char *data,
    int nelements
);

/* [...] */

XChangeProperty_func_type XChangeProperty_orig = dlsym(RTLD_NEXT, "XChangeProperty");

We also need to look up the _NET_WM_NAME atom ourselves, since unlike XA_WM_NAME it has no predefined constant:

_NET_WM_NAME = XInternAtom(display, "_NET_WM_NAME", 0);

With this we are now able to intercept every attempt to set the window name and override it with our own title:

if (property == XA_WM_NAME || property == _NET_WM_NAME) {
    data = (const unsigned char *)new_title;
    nelements = (int)strlen(new_title);
}
return XChangeProperty_orig(display, w, property, type, format, mode, data, nelements);

We wrap all of this into our own redefinition of XChangeProperty and… that's pretty much it.

Now, due to a long history of development, Xlib has been "deprecated" and superseded by libxcb. Both are widely used, but fortunately the APIs are more or less similar. The function to hook is xcb_change_property, and defining _NET_WM_NAME is slightly more cumbersome but not exactly challenging:

const xcb_intern_atom_cookie_t cookie = xcb_intern_atom(conn, 0, strlen("_NET_WM_NAME"), "_NET_WM_NAME");
xcb_intern_atom_reply_t *reply = xcb_intern_atom_reply(conn, cookie, NULL);
if (reply)
    _NET_WM_NAME = reply->atom;
free(reply);

Aside from that, the code is pretty much the same.

Configuration

To pass down the custom title to override, I've been relying on an environment variable, WTH_TITLE. From a user's point of view, it looks like this:

LD_PRELOAD="builddir/libwth.so" WTH_TITLE="Krita4ever" krita
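For illustration, here is a minimal sketch of how the library side could consume that variable. The helper names (wth_title_from_env, wth_pick_title) are hypothetical and not necessarily what the project uses, and leaving the app's own title untouched when WTH_TITLE is unset or empty is an assumption of this sketch:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the title requested through WTH_TITLE, or NULL
 * when the variable is unset or empty (meaning "leave titles alone").
 */
const char *wth_title_from_env(void)
{
    const char *title = getenv("WTH_TITLE");
    return (title && *title) ? title : NULL;
}

/*
 * Hypothetical helper: pick the bytes to hand over to the real property call.
 * The app's own data is kept as-is when no override is configured.
 */
void wth_pick_title(const char *override_title,
                    const unsigned char **data, int *nelements)
{
    assert(data && nelements);
    if (!override_title)
        return;
    *data = (const unsigned char *)override_title;
    *nelements = (int)strlen(override_title);
}
```

Guarding against a missing variable this way avoids calling strlen() on a NULL pointer if someone preloads the library without setting WTH_TITLE.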

We could probably improve the usability by creating a wrapping tool (so that we could have something such as ./wth --title=Krita4ever krita). Unfortunately I wasn't yet able to make a self-referencing executable accepted by LD_PRELOAD, so for now the manual LD_PRELOAD and WTH_TITLE environment variables will do just fine.
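To give an idea of what such a wrapper could look like, here is a rough sketch that sidesteps the self-referencing problem by keeping the library as a separate file: it only prepares the environment and then replaces itself with the target command. The function names and the way the library path is passed around are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: export WTH_TITLE and put lib_path in front of any
 * existing LD_PRELOAD value. Returns 0 on success, -1 on error.
 */
int wth_prepare_env(const char *lib_path, const char *title)
{
    const char *prev = getenv("LD_PRELOAD");
    char preload[4096];
    int n;

    assert(lib_path && title);
    if (prev && *prev)
        n = snprintf(preload, sizeof(preload), "%s:%s", lib_path, prev);
    else
        n = snprintf(preload, sizeof(preload), "%s", lib_path);
    if (n < 0 || (size_t)n >= sizeof(preload))
        return -1; /* resulting list does not fit in the buffer */

    if (setenv("LD_PRELOAD", preload, 1) != 0)
        return -1;
    return setenv("WTH_TITLE", title, 1);
}

/*
 * Hypothetical entry point for: wth --title=TITLE COMMAND [ARGS...]
 * Only returns on failure, since execvp() replaces the process on success.
 */
int wth_run(const char *lib_path, int argc, char **argv)
{
    static const char prefix[] = "--title=";
    const size_t prefix_len = sizeof(prefix) - 1;

    if (argc < 3 || strncmp(argv[1], prefix, prefix_len) != 0) {
        fprintf(stderr, "usage: wth --title=TITLE COMMAND [ARGS...]\n");
        return 1;
    }
    if (wth_prepare_env(lib_path, argv[1] + prefix_len) != 0) {
        fprintf(stderr, "wth: could not set up the environment\n");
        return 1;
    }
    execvp(argv[2], &argv[2]);
    perror(argv[2]); /* reached only if the exec failed */
    return 127;
}
```

A main() would then boil down to return wth_run("builddir/libwth.so", argc, argv); with the library path hardcoded or resolved at build time.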

Thread safety

To avoid a bunch of redundant function round-trips we need to globally cache a few things: the new title (to avoid fetching it from the environment all the time), the original functions (to save the dlsym call), and _NET_WM_NAME.

Those are loaded lazily at the first function call, but we have no guarantee regarding concurrent calls on that hooked function, so we must create our own lock. I initially thought about using pthread_once, but unfortunately the initialization callback mechanism doesn't allow any custom argument. Again, this is merely a slight annoyance since we can implement our own in a few lines of code:

/* The "once" API is similar to pthread_once but allows a custom function argument */
struct wth_once {
    pthread_mutex_t lock;
    int initialized;
};

#define WTH_ONCE_INITIALIZER {.lock=PTHREAD_MUTEX_INITIALIZER}

typedef void (*init_func_type)(void *user_arg);

void wth_init_once(struct wth_once *once, init_func_type init_func, void *user_arg)
{
    pthread_mutex_lock(&once->lock);
    if (!once->initialized) {
        init_func(user_arg);
        once->initialized = 1;
    }
    pthread_mutex_unlock(&once->lock);
}

Which we use like this:

static struct wth_once once = WTH_ONCE_INITIALIZER;

static void init_once(void *user_arg)
{
    Display *display = user_arg;
    /* [...] */
}

/* [...] */

wth_init_once(&once, init_once, display);
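As a sanity check, the following sketch (not part of the project) races a few threads on the same wth_once and counts how many times the callback runs. The demo_ctx structure and the wth_once_demo function are hypothetical names used only for this illustration:

```c
#include <assert.h>
#include <pthread.h>

/* Same "once" API as above, repeated to keep the example self-contained */
struct wth_once {
    pthread_mutex_t lock;
    int initialized;
};

#define WTH_ONCE_INITIALIZER {.lock=PTHREAD_MUTEX_INITIALIZER}

typedef void (*init_func_type)(void *user_arg);

void wth_init_once(struct wth_once *once, init_func_type init_func, void *user_arg)
{
    pthread_mutex_lock(&once->lock);
    if (!once->initialized) {
        init_func(user_arg);
        once->initialized = 1;
    }
    pthread_mutex_unlock(&once->lock);
}

#define NB_THREADS 8

/* Hypothetical context shared by all the threads of the demo */
struct demo_ctx {
    struct wth_once once;
    int counter; /* number of times the init callback ran */
};

static void count_init(void *user_arg)
{
    struct demo_ctx *ctx = user_arg;
    ctx->counter++; /* no extra locking: called with once->lock held */
}

static void *worker(void *arg)
{
    struct demo_ctx *ctx = arg;
    wth_init_once(&ctx->once, count_init, ctx);
    return NULL;
}

/* Race NB_THREADS threads and return how many times the init ran */
int wth_once_demo(void)
{
    struct demo_ctx ctx = {.once = WTH_ONCE_INITIALIZER};
    pthread_t threads[NB_THREADS];
    int i, started = 0;

    for (i = 0; i < NB_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, worker, &ctx) != 0)
            break;
        started++;
    }
    assert(started > 0);
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return ctx.counter;
}
```

wth_once_demo() should return 1 no matter how the threads get scheduled, since the initialized flag is only read and written with the lock held.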

The End?

I've been delaying doing this project for weeks because it felt complex at first glance, but it actually just took me a few hours. Probably the same amount of time it took me to write this article. While the project is admittedly really small, it still feels like a nice accomplishment. I hope it's useful to other people.

Now, the Wayland support is probably the most obvious improvement the project can receive, but I don't have such a setup locally to test yet, so this is postponed for an undetermined amount of time.

The code is released with a permissive license (MIT); if you want to contribute you can open a pull request but getting in touch with me first is appreciated to avoid unnecessary and overlapping efforts.
