The Sudo vulnerability CVE-2021-3156, found by Qualys and published as “Baron Samedit: Heap-Based Buffer Overflow in Sudo”, is a very interesting issue because Sudo is widely installed on Linux, BSD, macOS, and Cisco products (maybe more).
This post describes the exploitation of the vulnerability on Linux x64. For details of the vulnerability, please see the original Qualys advisory.
Trace Heap Usages
The vulnerability is a heap-based buffer overflow in a setuid binary. We have only one chance to trigger the heap overflow, and we do not know any memory address in advance (because of ASLR). So we should avoid overwriting pointers and focus on data that leads to code execution.
To understand the execution flow related to the heap, I traced heap usage on Ubuntu 18.04 by hooking the malloc, realloc, calloc, and free functions with a gdb script. Here is the flow from main to the vulnerable function.
Then the flow from the vulnerable function to the code that asks for a password.
Identify possible objects for overwriting
With the above flows, I found interesting objects that are allocated before the heap overflow and used after it (targets for overwriting).
- nss service_user object (mentioned in advisory)
- def_timestampdir path (mentioned in advisory too)
- compar function pointer in rbtree struct
A function pointer can be partially overwritten to bypass ASLR (with a little bruteforcing), but unfortunately the first argument is an empty string.
- userspecs object from parsing /etc/sudoers
It is possible to bypass authentication on sudo version >= 1.8.9, but many objects have to be faked.
If you have read the Qualys advisory, you might have noticed that the sudo_hook_entry overwrite is missing. In the heap trace on Ubuntu 18.04, I found that getenv(“SUDO_EDITOR”) is only called after authentication. Normally you cannot pass the authentication stage, so this method is not usable on Ubuntu 18.04.
glibc heap with/without tcache
Since glibc version 2.26, a tcache has been added to heap allocation. The tcache bins and fast bins are very similar. Here is a quick comparison.
What is the same:
- Chunks in both stay marked as in use
- Both are LIFO (Last In, First Out)
- A chunk can be allocated again only if the request size exactly matches the bin size
What is different:
- Max fast bin size is 0x80. Max tcache bin size is 0x410
- When the request size is larger than the small bin range (0x400), all fast bin chunks are moved to the unsorted bin. They might then get consolidated and moved to small or large bins.
There are many large allocations (many of them from file buffers in glibc) in the sudo program. A large chunk allocation has no effect on tcache bins, so free chunks of certain sizes might stay in tcache bins forever. We will see later that the vulnerability can be exploited reliably with tcache bins.
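The two tcache properties that matter for the rest of this post can be illustrated with a toy model. The sketch below is a simplified Python simulation, not glibc code: the class name TcacheModel and its methods are hypothetical, and it ignores details such as the per-bin entry limit. It only shows that a bin is LIFO, is reused only on an exact size match, and is untouched by requests of other sizes.

```python
class TcacheModel:
    """Toy model of tcache bins keyed by chunk size (hypothetical, simplified)."""

    MAX_TCACHE_CHUNK = 0x410  # largest chunk size kept in tcache, as stated above

    def __init__(self):
        self.bins = {}  # chunk size -> list used as a stack

    def free(self, label, chunk_size):
        """Put a freed chunk into its bin; return False if it is too large."""
        if chunk_size > self.MAX_TCACHE_CHUNK:
            return False
        self.bins.setdefault(chunk_size, []).append(label)
        return True

    def malloc(self, chunk_size):
        """Return the most recently freed chunk of exactly this size, if any."""
        stack = self.bins.get(chunk_size)
        if stack:
            return stack.pop()
        return None  # the real allocator would fall back to other bins


cache = TcacheModel()
cache.free("first", 0x40)
cache.free("second", 0x40)
miss = cache.malloc(0x400)    # different size: the 0x40 bin is not touched
newest = cache.malloc(0x40)   # LIFO: the last freed chunk comes back first
oldest = cache.malloc(0x40)
```

In this model a free 0x40 chunk stays available until something requests exactly that chunk size, which is the behavior the later sections rely on.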
From the trace log, I noticed that malloc and free are called many times inside the glibc setlocale function. The interesting point is that the allocation and free sizes are based on the LC_* environment variables, which are controllable before executing sudo.
Hoping to control the heap with the LC_* environment variables, I read the glibc setlocale source code. Here is a short summary of what matters for exploiting this vulnerability.
When setlocale is called with an empty string (at the start of the sudo program), the LC_* environment variables are used as its input, as seen in the _nl_find_locale function.
From the code, the priority for getting the locale name for each category is LC_ALL, then LC_<CATEGORY_NAME>, then LANG. If none is set, the special locale name “C” is used. If the locale name is “C”, _nl_find_locale returns immediately without touching the heap. Any other name is exploded into its parts.
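The lookup order described above can be written down as a few lines of Python. This is only a model of the precedence, not the glibc implementation; the function name pick_locale_name is hypothetical, and treating an empty value like an unset one is my assumption about how glibc handles it.

```python
def pick_locale_name(env, category):
    """Return the locale name used for one category, following the order above.

    env is a dict of environment variables, category is e.g. "LC_CTYPE".
    """
    for variable in ("LC_ALL", category, "LANG"):
        value = env.get(variable)
        if value:  # assumption: empty values are treated like unset ones
            return value
    return "C"  # special name: no heap usage in _nl_find_locale


example_env = {"LANG": "en_US.UTF-8", "LC_CTYPE": "C.UTF-8"}
ctype_name = pick_locale_name(example_env, "LC_CTYPE")      # category variable wins over LANG
numeric_name = pick_locale_name(example_env, "LC_NUMERIC")  # falls back to LANG
default_name = pick_locale_name({}, "LC_TIME")              # nothing set
```

The practical consequence is that each category can be given a different value (and so a different allocation size) as long as LC_ALL itself is left unset.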
The comment in the glibc source is very clear: a locale name is split into language, territory, codeset, and modifier parts. The return value, “mask”, is a set of flags indicating which parts exist in the given locale name. Then the _nl_make_l10nflist function is called to check whether the given locale name for the category is already loaded. Before checking, the full file name must be constructed from the parts and the category name, which requires a malloc. If the name is in the list, or the call is only a lookup, the buffer is freed and the function returns.
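To make the splitting step concrete, here is a rough Python sketch of exploding a name of the form language[_territory][.codeset][@modifier]. It is not a port of the glibc function: the name explode_locale_name and the flag values are made up for illustration, and codeset normalization is left out.

```python
# Hypothetical flag values for illustration; glibc defines its own constants.
HAS_TERRITORY = 1
HAS_CODESET = 2
HAS_MODIFIER = 4


def explode_locale_name(name):
    """Split a locale name into parts and build a mask of the parts present."""
    rest, _, modifier = name.partition("@")     # everything after '@'
    rest, _, codeset = rest.partition(".")      # between '.' and '@'
    language, _, territory = rest.partition("_")

    mask = 0
    if territory:
        mask |= HAS_TERRITORY
    if codeset:
        mask |= HAS_CODESET
    if modifier:
        mask |= HAS_MODIFIER

    parts = {
        "language": language,
        "territory": territory,
        "codeset": codeset,
        "modifier": modifier,
    }
    return parts, mask


parts, mask = explode_locale_name("C.UTF-8@AAAA")
```

Each flag set in the mask is one more part that can be combined into candidate file names, which is why longer and richer locale names cause more allocations.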
If it is not in the loaded list, _nl_make_l10nflist is called again to create all possible paths (the base directory is /usr/lib/locale) from combinations of the parts and the category name, by calling itself recursively with a modified “mask”. The algorithm might generate a duplicate entry (malloc) and remove it (free) after checking. The more parts a locale name has, the more malloc and free calls are made.
Below is an example of heap usage by the _nl_make_l10nflist function with “C.UTF-8@A…A”. There are many mallocs and frees in one function, which can help us manage the heap layout. But it is too complicated to predict in advance, so I simply try and look at the result. If I want a bigger free chunk, I increase the modifier length. If I want more free chunks, I add a territory.
Then _nl_find_locale tries loading locale data from the generated paths one by one. If valid locale data is found, it is returned. Then setlocale strdup()s the given locale name and saves it internally.
If an error occurs, setlocale frees all saved names, uses “C” as the default, and returns immediately. So a locale name cannot be random. At least the language and codeset must be valid.
After all category names are strdup()ed and the data is loaded, the LC_ALL name is created in the new_composite_name function. If all category names are the same, its value is just taken from the first one (same as setting the LC_ALL environment variable). If they are not all the same, its value is combined from all of them, such as “LC_CTYPE=C;LC_NUMERIC=C.UTF-8;…”.
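The composite name rule is simple enough to model directly. The Python sketch below follows only the description above; the function name composite_name is hypothetical and the real new_composite_name works on C strings for every category.

```python
def composite_name(category_names):
    """Build the LC_ALL value from a dict of category -> locale name.

    The dict is expected to list the categories in a fixed order.
    """
    names = list(category_names.values())
    if all(name == names[0] for name in names):
        return names[0]  # all equal: the single shared name is used
    return ";".join(f"{category}={name}" for category, name in category_names.items())


same = composite_name({"LC_CTYPE": "C", "LC_NUMERIC": "C"})
mixed = composite_name({"LC_CTYPE": "C", "LC_NUMERIC": "C.UTF-8"})
```

With mixed category names the result is one long string whose length depends on every category value.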
Control heap usages with environment
Besides the length of the LC environment values, I found some environment settings that help exploit this vulnerability.
- Set the environment variable “TZ=:”. This reduces heap usage in the glibc tzset() function to a few allocations and makes it completely predictable
- Append “;x=x” to any LC category environment variable. Here is what happens
  - The first setlocale(“”) does its mallocs and frees normally, and LC_ALL will be “…;x=x;…”
  - Then setlocale(NULL) is called to get the current LC_ALL value and save it
  - Then setlocale(“C”) free()s all the given locale names
  - Then setlocale(saved_LC_ALL) does nothing because ‘x’ is an invalid category name
  - Now LC_ALL in glibc is “C”
  - The next setlocale does nothing because LC_ALL is “C”
  - The result is free chunks with sizes controllable from the LC environment variables
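The reason the saved LC_ALL value is rejected can be shown with a small model of composite name parsing. This is a sketch in Python based on the behavior described in the list above, not glibc code; the function name parse_composite_name is hypothetical, and the category set is the list of standard glibc category names.

```python
KNOWN_CATEGORIES = {
    "LC_CTYPE", "LC_NUMERIC", "LC_TIME", "LC_COLLATE", "LC_MONETARY",
    "LC_MESSAGES", "LC_PAPER", "LC_NAME", "LC_ADDRESS", "LC_TELEPHONE",
    "LC_MEASUREMENT", "LC_IDENTIFICATION",
}


def parse_composite_name(value):
    """Parse "CAT=name;CAT=name;..." and return a dict, or None if it is invalid."""
    result = {}
    for field in value.split(";"):
        category, separator, name = field.partition("=")
        if not separator or category not in KNOWN_CATEGORIES:
            return None  # unknown category: the whole call fails, nothing changes
        result[category] = name
    return result


valid = parse_composite_name("LC_CTYPE=C;LC_NUMERIC=C.UTF-8")
rejected = parse_composite_name("LC_CTYPE=C.UTF-8@AAAA;x=x;LC_NUMERIC=C")
```

Because the call with the saved value fails as a whole, the locale stays at “C” and the previously freed name chunks are not allocated again.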
struct service_user overwrite with glibc tcache
From the trace log, I found only 2 nss_load_library calls. Then I checked where the service_user object is created.
It is created from the “group” line in “nsswitch.conf”. As seen, sudo created 2 service_user objects (from the 2 services in the passwd line) before the target one. So the exploit should read nsswitch.conf to determine the offset, or the number of free chunks of size 0x40 to create.
With the “TZ=:” and “;x=x” tricks, controlling heap usage before “/etc/nsswitch.conf” is parsed becomes easy. The malloc/free trace from the call to sudo_conf_read_v1 to the call to get_user_info looks like below.
For example, set the environment variables “LC_CTYPE=C.UTF-8@”+“A”*0x28 and “LC_NUMERIC=C.UTF-8@”+“A”*0x68. The chunk size for LC_CTYPE is 0x30+8 (8 is the heap metadata size), rounded up to 0x40, and the chunk size for LC_NUMERIC is 0x80. With these I got a layout similar to the one below for my “nsswitch.conf” with 2 services for passwd.
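The chunk size arithmetic above can be checked with a short helper. This is a minimal sketch assuming 64-bit glibc defaults (8 bytes of size metadata, 16-byte alignment, 0x20 minimum chunk); the function name chunk_size is hypothetical, and counting one extra byte for the terminating NUL of the strdup()ed name is my assumption.

```python
SIZE_SZ = 8            # per-chunk size field on x64
MALLOC_ALIGNMENT = 16  # chunk sizes are multiples of 16 on x64
MIN_CHUNK_SIZE = 0x20  # smallest possible chunk on x64


def chunk_size(request):
    """Return the chunk size glibc would use for a malloc(request) on x64."""
    padded = request + SIZE_SZ + MALLOC_ALIGNMENT - 1
    return max(MIN_CHUNK_SIZE, padded & ~(MALLOC_ALIGNMENT - 1))


lc_ctype = "C.UTF-8@" + "A" * 0x28
lc_numeric = "C.UTF-8@" + "A" * 0x68

# +1 for the NUL terminator stored with the string (assumption)
ctype_chunk = chunk_size(len(lc_ctype) + 1)
numeric_chunk = chunk_size(len(lc_numeric) + 1)
```

The same helper makes it easy to see which modifier lengths land in a given bin when a different chunk size is needed.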
So my plan was to place the target service_user object in the free LC_CTYPE chunk and overflow from the free LC_NUMERIC chunk to overwrite the service_user object.
I faced some problems with this plan. There are many heap allocations and frees before the heap overflow. If any object requires a chunk of size 0x80, the free LC_NUMERIC chunk will be taken. From testing on my VMs, I found some settings that might use chunk size 0x80.
- A user is in 9 or 10 groups
- A host with many IP addresses
- An object when loading library with dlopen function
- Maybe more
My first workaround was checking for these conditions and then adding dummy chunks (size 0x80) with other LC_ environment variables. Another workaround was bruteforcing the number of chunks of size 0x80.
But after checking the chunk sizes used in the heap traces of the tested machines, I found that chunk size 0xf0 is never requested. So I changed the LC_NUMERIC chunk size from 0x80 to 0xf0, and the uncertainty was gone.
Another problem is that the offset from the free LC_NUMERIC chunk to the free LC_CTYPE chunk is different on each Linux distribution. I guess this is because of the glibc version and compilation options. setlocale allocates heap chunks of different sizes even when the LC environment value lengths are the same.
Luckily the offsets do not differ too much, so we can spray partial fake service_user objects (only library, known, and name are important). We just need one of the fake service_user objects to overwrite the target one.
As a result, my exploit works in one shot on all tested VMs (Ubuntu 18.04, Ubuntu 20.04, Debian 10, and CentOS 8).
By testing on many VMs, I found one limitation of this method. If the nscd service is enabled for caching group, getting a shell by overwriting the service_user object is impossible, because after the heap overflow nss_load_library is only called with newly created service_user objects.
def_timestampdir overwrite with glibc tcache
According to the source code of the init_defaults function, the def_timestampdir (value is “/run/sudo/ts”) chunk size is only 0x20. This chunk size is very common, so we cannot control the heap the same way as in the service_user method.
There is a call to the “_” function, which is defined as “dcgettext”. The glibc “dcgettext” function uses LC_MESSAGES for localization. It also uses the “_nl_make_l10nflist” function to create all possible paths to “sudoers.mo”.
As mentioned above, “_nl_make_l10nflist” might malloc and free some duplicate paths. By controlling the LC_MESSAGES size, we can control the size of a hole before def_timestampdir.
On Ubuntu 18.04 and 20.04, glibc looks up a locale in “/usr/share/locale-langpack/”, while Debian 10 and CentOS 8 look it up in “/usr/share/locale/”. The result is different free chunks after dcgettext is done.
On Ubuntu, the hole is very near def_timestampdir. A heap overflow from the target hole can overwrite it with any value.
On Debian and CentOS, we have to overwrite a loaded_l10nfile object on the way. Its pointers can be overwritten with NULL, and its chunk metadata (size) must be valid too. Another problem is a free chunk of size 0x40 (0x5559ea0a45e0 in the picture). It will be used as an object with pointers (which cannot be NULL).
To solve this, I use another LC_ environment variable with the same chunk size. When setlocale is called again, the free 0x40 chunk is taken and can then be overwritten with any value.
As a result, I can overwrite def_timestampdir without a crash on Ubuntu 18.04, Ubuntu 20.04, Debian 10, and CentOS 8.
Next, the race condition exploitation steps are the same as described in the advisory, but I show the related sudo code to make them easier to follow. In the timestamp_open function, ts_secure_dir is called first.
sudo_secure_dir is called to make sure the target directory (if it exists) belongs to root. If the target directory is missing (the SUDO_PATH_MISSING case), ts_mkdirs is called. It makes the target directory with the code below.
It is fine if the directory being created already exists. The directory is then chown()ed to root. To win the race condition, we have to create the target directory after the check in ts_secure_dir and before mkdir is called.
Before making the target directory, the sudo_mkdir_parents function is called. It walks through each subdirectory name and creates it if it does not exist. We can take advantage of this function by overwriting def_timestampdir with “./././././././././a” (use more ./).
The result is a higher chance of winning the race condition because sudo_mkdir_parents takes longer to finish.
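To see why extra “./” components lengthen the window, it helps to list the parent prefixes that a component-by-component walk visits. The Python sketch below is a simplified model based on the description above, not sudo's code; the function name parent_prefixes is hypothetical, and each returned prefix stands for one mkdir/stat step before the final directory is created.

```python
def parent_prefixes(path):
    """List every parent prefix of path, one per '/' separator after the start."""
    start = 0
    while start < len(path) and path[start] == "/":
        start += 1  # skip leading slashes

    prefixes = []
    index = path.find("/", start + 1)
    while index != -1:
        prefixes.append(path[:index])  # one directory check/creation per prefix
        index = path.find("/", index + 1)
    return prefixes


default_steps = parent_prefixes("/run/sudo/ts")
padded_steps = parent_prefixes("./" * 9 + "a")
```

In this model the default path needs two parent steps while the padded path needs nine, and every additional “./” adds one more.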
Another problem is the chown to root. To be able to modify the directory content after winning the race, even though the directory owner and group are changed to root, I set umask(0) and mkdir with mode 0777. With this, we can create a symbolic link in the directory even though its owner is root.
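The effect of umask(0) on the directory mode is plain bit arithmetic, shown below in Python. The helper names effective_mode and others_can_create_entries are hypothetical; the rule that mkdir applies the requested mode with the umask bits cleared is standard POSIX behavior.

```python
import stat


def effective_mode(requested_mode, umask):
    """Permission bits a new directory gets: requested mode minus umask bits."""
    return requested_mode & ~umask & 0o7777


def others_can_create_entries(mode):
    """Creating entries needs write and search permission on the directory."""
    return bool(mode & stat.S_IWOTH) and bool(mode & stat.S_IXOTH)


with_default_umask = effective_mode(0o777, 0o022)  # typical umask strips write bits
with_zero_umask = effective_mode(0o777, 0o000)     # nothing is stripped
```

A later chown changes only the owner and group, not these permission bits, so a directory created as 0777 stays writable by other users.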
Then the timestamp file is opened (created if it does not exist) with the code below.
In our case, we want the timestamp file to be a symbolic link to /etc/passwd. If the /etc/passwd modification time is older than the boot time, the symbolic link is deleted (only one time). We have to create the symbolic link again before sudo creates the timestamp file. This race is easy. We can just call the symlink function until it succeeds 2 times, as in the code below. The sleep time before racing can also be adjusted automatically based on error checking.
With all of the above, I win the race condition and succeed in overwriting /etc/passwd in fewer than 20 tries (the average is fewer than 10).
A final note on the race condition method: exploiting a target machine with 1 processor is unlikely to work (I never tested it).
Exploit without glibc tcache
Exploiting this vulnerability without tcache depends on the sudo version and settings. The free chunks (including fast bins) from LC_* are likely to be taken very quickly, so I cannot control the heap hole. All I can do is find the hole size at the vulnerability point, then change the allocation size to fit the hole.
On Ubuntu 16.04 and Debian 9, I found an object that is created early in the heap (shortly after the service_user objects are created) and freed before the set_cmnd function is called. After trying several LC_* values, the overflown buffer address ended up before the target service_user, so I could overwrite the service_user object again.
But a name_database_entry object is allocated before the service_user object. After checking, just overwriting all of its pointers with NULL is enough. Then I got a root shell.
Next are CentOS 6 and 7. All I found is a big hole before the parsed “/etc/sudoers” data. I changed the argv length to fit the hole size, so I could overwrite the sudoers data. It is used after the heap buffer overflow for checking permissions (in the nss_file_lookup function).
The parsed data is composed of a few structs that contain pointers. The key is the member objects pointed to from the userspecs objects. We can bypass authentication by overwriting them. The most difficult part is freeing the overwritten data without a crash during cleanup. With carefully crafted fake objects on the stack and bruteforcing, I succeeded in bypassing authentication and modifying /etc/passwd on CentOS 7.
I found that exploitation by overwriting userspecs on sudo versions older than 1.8.9 is unlikely because they use the tq_pop function for cleaning up the list (in the init_parser function). We would need the address of the list head in the bss section. So I cannot exploit sudo 1.8.6 on CentOS 6 with the default configuration.
Last, this exploit method cannot be universal because the data layout/offset might be different when /etc/sudoers or the program version changes.
Avoid getting logged
Whenever the sudo program asks for a password, your action is logged even if you cancel it with Ctrl+C. This might alert an administrator if your exploit fails (without a segfault). The sudo askpass feature can help us. The relevant check is in the tgetpass function.
If the askpass option is set (-A on the command line) but the SUDO_ASKPASS environment variable is not set, sudo prints an error message and exits without logging. The error is handled like an invalid combination of options, but it is checked just before asking for the password. So we can abuse this to avoid log entries from exploit attempts (especially the race condition exploit or bruteforcing without a segfault).
On systems with glibc tcache (>= 2.26), the vulnerability can be exploited reliably. On systems with an older glibc, the vulnerability is still exploitable, but it depends heavily on the sudo version and the (OS/sudo) settings. With a small change, the exploit might have to be reworked.