Intro

simpleGC was a 107-point pwnable challenge at 34C3 CTF. I found it to be a very interesting challenge because it gave me my first exposure to the tcache thread-local caching mechanism, which was introduced fairly recently in glibc 2.26, the default libc version in Ubuntu 17.10 (Artful Aardvark). I learned a lot about how the tcache works and how to exploit it while doing this challenge, so I decided to do a writeup for it. (kileak solved it during the competition.)

Reversing

The binary is a menu-driven program that allows us to create users and assign them to groups, which we can also create.

A global array of user pointers, user *userArr[96], and a global array of group pointers, group *groupArr[96], are used to keep track of the different users and groups, respectively.

Reversing the relevant structures gives us the following definitions:

```c
struct user {
    int age;
    char *userName;
    char *groupName;
};

struct group {
    char *groupName;
    int numMembers; // refcount
};
```

We can also display information about specific users or display information about each of the users associated with a given group.
Additionally, we can edit groups and delete users.

When we edit a group, we must first specify an index into userArr[] to select the user whose groupName pointer we wish to modify.

Additionally, we must specify whether we want to change the contents of *groupName in place by propagating the change, or instead repoint the groupName pointer at an entirely different groupName chunk.

```c
cookie = __readfsqword(0x28u);
printf("Enter index: ");
readWrapper((__int64)&input, 4uLL);
idxInt = atoi(&input);
if (userArr[idxInt]) {
    printf("Would you like to propagate the change, this will update the group of all the users sharing this group(y/n): ", 4LL);
    readWrapper((__int64)&input, 2uLL);
    printf("Enter new group name: ", 2LL);
    if (input == 'y') {
        readWrapper(userArr[idxInt]->pGroupName, 0x18uLL); // overwrite original *(group->groupName)
    } else {
        readWrapper((__int64)&newGroupName, 0x18uLL);
        groupFound = findGroup(&newGroupName);
        if (groupFound)
            userArr[idxInt]->pGroupName = groupFound->pGroupName; // doesn't subtract original group->numMembers!
        else
            userArr[idxInt]->pGroupName = addGroup((__int64)&newGroupName)->pGroupName;
    }
}
return __readfsqword(0x28u) ^ cookie;
```

Let’s take a look at deleting users now.

Deleting a user will decrement the numMembers value, essentially a refcount, in all groups that share the same groupName that the user is associated with. Keep this in mind because this will come up later!

```c
v3 = __readfsqword(0x28u);
printf("Enter index: ");
readWrapper((__int64)&idx, 4uLL);
idxInt = atoi(&idx);
if (idxInt <= 0x5F) {
    if (userArr[idxInt]) {
        removeFromGroup((const char *)userArr[idxInt]->pGroupName); // decrements refcount of ALL groups that share the same groupName
        free(userArr[idxInt]);
        userArr[idxInt] = 0LL;
    }
} else {
    puts("invalid index");
}
return __readfsqword(0x28u) ^ v3;
```

Very important to note is that at the beginning of the main() function, a new thread is created that runs the watchGroup() function, which is essentially a garbage collector: it frees a groupName and the group it belongs to once that group's refcount has been decremented to 0.

As we can see, the least significant byte of the group->numMembers value is checked to see if it is 0, and if it is, both the groupName and group will be freed.

Vulnerabilities

There are several vulnerabilities in the program.

First, whenever thread 2, hereinafter t2, checks to see if any groups have a refcount of 0, it only checks the LSB. This means if we have added 0x100 users to a group, t2 will think that the group has no users associated with it, and free it as well as its associated groupName chunk! Meanwhile, all 0x100 users will still be allocated and contain a reference to the now freed groupName chunk, creating a dangling pointer situation.
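As a quick illustration (a Python sketch of my own, not code from the binary), here is the effect of comparing only the low byte of the refcount:

```python
# Hypothetical sketch of the GC's LSB-only liveness check; the function
# name is mine, not the binary's.
def gc_thinks_group_is_dead(num_members):
    # the collector only inspects the least significant byte
    return (num_members & 0xff) == 0

print(gc_thinks_group_is_dead(0x100))  # True: 256 live users look like 0
print(gc_thinks_group_is_dead(0xff))   # False: 255 users look alive
```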

Similarly, when a user’s groupName is edited, if we choose to propagate the change, the program doesn’t check whether the new groupName already exists. This means we can create 2 different groups that share the same string in their groupName chunks. This is problematic because when we delete a user and remove it from its group, ALL the groups in groupArr[] are iterated through, and if ANY of them share the same contents in their groupName chunk, they will have their refcount decremented. This means that if user A and user B are associated with different groups, but both groups have the same name, deleting user A will decrement user B’s group’s refcount. And if it is decremented to 0, t2 will free user B’s group and groupName even though user B is still allocated and still contains a reference to groupName. This also creates a dangling pointer situation!

Visually this dangling pointer situation looks something like this (green is allocated, blue is freed):

As we can see in the diagram, user A and user B are tied to different groups, group A and group B, but since both groups share the same name, group A and groupName A will be freed even if only user B is deleted.
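This refcount confusion can be modeled in a few lines of Python (class and function names here are my own simplification of the reversed code, not the binary's):

```python
# Minimal model of the bug: removing a user decrements num_members on
# EVERY group whose name matches, not just the user's own group.
class Group:
    def __init__(self, name):
        self.name = name
        self.num_members = 0  # refcount checked by the collector

group_arr = []

def add_user_to_group(group):
    group.num_members += 1

def remove_from_group(name):
    # mirrors the binary: iterates ALL groups and matches by string contents
    for g in group_arr:
        if g.name == name:
            g.num_members -= 1

# group A and group B are distinct objects that share the same name
group_a, group_b = Group("dup"), Group("dup")
group_arr.extend([group_a, group_b])
add_user_to_group(group_a)   # user A
add_user_to_group(group_b)   # user B

remove_from_group("dup")     # delete user A
print(group_b.num_members)   # -> 0: the GC would free user B's group too
```

Both refcounts hit 0 from a single deletion, even though user B is still alive and still holds a pointer into its group's (soon to be freed) groupName chunk.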

Another vuln I found: when we edit a user’s groupName and associate it with a different group, the original group’s refcount is never decremented. For the purposes of this writeup, we can ignore this one. The vuln we will be exploiting is the 2nd dangling pointer situation we just mentioned.

So now that we know the vulns, how can we exploit this binary?

As it turns out, we can use the 2nd dangling pointer condition to control the contents of free chunks in order to abuse the tcache caching mechanism.

Before we do this though, let’s first talk about this amazing new feature, tcache, and try to understand it a little better.

Tcache 101

The tcache is a structure that is initialized by malloc()ing a chunk, which means it is one of the first structures that exists on the heap.

As we can see in the above snippet, tcache is a tcache_per_thread_struct *-typcasted pointer.
And what is this new tcache_per_thread_struct structure, you ask?

```c
/* There is one of these for each thread, which contains the
   per-thread cache (hence "tcache_perthread_struct").  Keeping
   overall size low is mildly important.  Note that COUNTS and ENTRIES
   are redundant (we could have just counted the linked list each
   time), this is for performance reasons. */
typedef struct tcache_perthread_struct
{
  char counts[TCACHE_MAX_BINS];          // TCACHE_MAX_BINS = 64
  tcache_entry *entries[TCACHE_MAX_BINS];
} tcache_perthread_struct;
```

```c
/* We overlay this structure on the user-data portion of a chunk when
   the chunk is stored in the per-thread cache.  */
typedef struct tcache_entry
{
  struct tcache_entry *next;
} tcache_entry;
```

In reading the above snippet, we can see that the tcache_perthread_struct structure is just an array of 64 bins, called entries, and an array of 64 counts, which track the number of chunks in each bin.

Each bin contains a singly-linked list of tcache_entry objects, each of which starts with a next (FD) pointer to the following tcache_entry in the bin.
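To make the bookkeeping concrete, here is a toy Python model of the per-thread cache (my own simplification: plain objects stand in for chunks, and there is no real memory):

```python
# Toy model of tcache_perthread_struct: 64 bins, each a LIFO
# singly-linked list of entries plus a per-bin count.
TCACHE_MAX_BINS = 64

class TcacheEntry:
    def __init__(self):
        self.next = None

class Tcache:
    def __init__(self):
        self.counts = [0] * TCACHE_MAX_BINS
        self.entries = [None] * TCACHE_MAX_BINS

    def put(self, e, tc_idx):
        # mirrors tcache_put: push the freed chunk onto the bin's head
        e.next = self.entries[tc_idx]
        self.entries[tc_idx] = e
        self.counts[tc_idx] += 1

    def get(self, tc_idx):
        # mirrors tcache_get: pop from the bin's head
        e = self.entries[tc_idx]
        self.entries[tc_idx] = e.next
        self.counts[tc_idx] -= 1
        return e

t = Tcache()
a, b = TcacheEntry(), TcacheEntry()
t.put(a, 0)
t.put(b, 0)
assert t.get(0) is b  # LIFO: the most recently freed chunk is reused first
```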

Each thread has its own tcache!

This means that for our binary, since we have 2 threads running, we have 2 tcaches, and whenever the garbage collection function running in t2 frees a groupName chunk and its associated group chunk, they will both end up in t2’s tcache. Similarly, whenever a user chunk is freed in t1, the chunk will end up in t1’s tcache, NOT t2’s tcache.

In the above example, we can see there are 4 chunks in tcache_t2 and 1 chunk in tcache_t1. So, how did they get there in the first place?

Starting in glibc 2.26, when a chunk is freed, the memory allocator no longer tries to place the chunk in a fastbin first; instead, it first attempts to place the chunk in the current thread’s tcache.

```c
#if USE_TCACHE
  {
    size_t tc_idx = csize2tidx (size);

    if (tcache
        && tc_idx < mp_.tcache_bins
        && tcache->counts[tc_idx] < mp_.tcache_count)  // mp_.tcache_count = 7
      {
        tcache_put (p, tc_idx);
        return;
      }
  }
#endif

  /*
    If eligible, place chunk on a fastbin so it can be found
    and used quickly in malloc.
  */

  if ((unsigned long)(size) <= (unsigned long)(get_max_fast ())
      /* rest of fastbin code */
```

As we can see, the memory allocator checks that the size of p, the chunk being freed, maps to one of the tcache bins.
On x64, tcache bins can hold chunks with usable sizes from 24 to 1032 bytes. The memory allocator also checks that the bin in which p would be placed isn’t full.
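The size-to-bin mapping is easy to check by hand. Assuming the usual x64 values MINSIZE = 0x20 and MALLOC_ALIGNMENT = 0x10, csize2tidx works out to:

```python
# Python transcription of glibc's csize2tidx macro, assuming x64 constants.
MINSIZE, MALLOC_ALIGNMENT = 0x20, 0x10

def csize2tidx(csize):
    # maps a chunk size to a tcache bin index
    return (csize - MINSIZE + MALLOC_ALIGNMENT - 1) // MALLOC_ALIGNMENT

print(csize2tidx(0x20))   # 0  -> smallest bin (24 usable bytes)
print(csize2tidx(0x410))  # 63 -> largest bin (1032 usable bytes)
```

Anything whose index falls outside 0..63 skips the tcache entirely.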

Each tcache_bin can only hold 7 chunks!

If both of these conditions pass and the tcache exists, p will be placed in the appropriate tcache_bin.
If, however, one of these conditions is not met, then _int_free() continues as it would pre-2.26 and next attempts to place p in one of the fastbins. And if that fails, then it checks to see if p is not mmapped and so on and so forth.

This means if one of our fastbin-sized tcache bins is full (has 7 chunks in it), the next time we attempt to put p in it, p will instead be placed in main_arena.fastbinsY! This statement holds true even for chunks that are freed in t2!

Yes! You heard that right. If a chunk is freed in t2 and the memory allocator attempts to place it in tcache_t2 but the corresponding tcache_bin is full, the allocator will instead try to place the chunk in main_arena, which t1 uses!

Strange, I know.
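Putting the free() path together, a deliberately simplified Python model (my own sketch: bins are plain lists, and one shared list stands in for main_arena.fastbinsY) shows where t2's chunks end up once its tcache bin fills:

```python
# Simplified model of glibc 2.26's free() decision for a fastbin-sized
# chunk: try the freeing thread's tcache first, spill into the shared
# fastbin once the tcache bin holds 7 chunks.
TCACHE_COUNT = 7  # mp_.tcache_count

def free_chunk(chunk, thread_tcache_bin, shared_fastbin):
    if len(thread_tcache_bin) < TCACHE_COUNT:
        thread_tcache_bin.insert(0, chunk)  # tcache_put
    else:
        shared_fastbin.insert(0, chunk)     # falls through to the fastbin path

tcache_t2, fastbin = [], []
for chunk in range(9):                      # t2 frees 9 same-sized chunks
    free_chunk(chunk, tcache_t2, fastbin)

print(len(tcache_t2))  # 7: the tcache bin is capped at 7 chunks
print(fastbin)         # [8, 7]: the overflow lands in main_arena.fastbinsY
```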

Well, now that we have a basic understanding of how tcache works, let’s go back to the challenge and try to exploit it to give ourselves a shell.

Fastbin and Tcache Attack

So, going back to where we left off before we went on a tangent to discuss the tcache, thanks to our aforementioned dangling-pointer vulnerability, we can write into free’d groupName chunks to control the FD pointer, as we would in a typical fastbin attack.

But since chunks are only freed in t2, but never allocated, this won’t be very useful to us. Or will it?

Remember when we said when a thread’s tcache_bin is filled up and we try to add another chunk to it, that it will instead use the main_arena for storage? Well, we can abuse this fact to still perform a fastbin attack!

First, we will allocate and free enough chunks to fill up t2’s tcache_bin[0] and start placing chunks that we can later reallocate into t1’s main_arena.fastbinsY.

Notice that we also edit userArr[7]’s groupName so that it shares the same name as userArr[8]’s.
This is so that when we delete userArr[7], both userArr[7]’s groupName+group AND userArr[8]’s groupName+group are freed.

Since we’ve freed so many group chunks in t2, we can now control the contents of a free’d group fastchunk that has been placed in t1’s main_arena.fastbinsY.

Before we modify the contents of this freed fastchunk though, let’s first leak a heap pointer by printing out userArr[8]->groupName, which, as the FD ptr, now points to the next chunk in the fastbin.

Notice how the fastchunk 0x6020d0 now has an invalid fastchunk size of 0x0. In a normal fastbin attack this would be problematic, because the chunk wouldn’t pass the size check when we allocate it out. As we will see, though, this won’t matter to us: near the beginning of _int_malloc(), no size checks are performed before fastbin[0] is copied over to the tcache_bin, and no size check is performed when tcache chunks are later allocated out either!

```c
#if USE_TCACHE
  /* While we're here, if we see other chunks of the same size,
     stash them in the tcache.  */
  size_t tc_idx = csize2tidx (nb);
  if (tcache && tc_idx < mp_.tcache_bins)
    {
      mchunkptr tc_victim;

      /* While bin not empty and tcache not full, copy chunks over.  */
      while (tcache->counts[tc_idx] < mp_.tcache_count
             && (pp = *fb) != NULL)
        {
          REMOVE_FB (fb, tc_victim, pp);
          if (tc_victim != 0)
            {
              tcache_put (tc_victim, tc_idx);
            }
        }
    }
#endif
```
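In other words, once the tcache bin is empty, one request is served from the fastbin head and the remaining fastbin chunks are stashed into the tcache. A minimal Python sketch of that behavior (my own simplification with made-up addresses: lists instead of linked chunks, and no size checks, matching the code above):

```python
# Simplified model of _int_malloc's fastbin path with the tcache stash:
# serve the fastbin head, then tcache_put the rest of the fastbin.
TCACHE_COUNT = 7  # mp_.tcache_count

def malloc_from(tcache_bin, fastbin):
    if tcache_bin:
        return tcache_bin.pop(0)  # tcache hit, no size checks
    victim = fastbin.pop(0)       # REMOVE_FB: take the fastbin head
    while fastbin and len(tcache_bin) < TCACHE_COUNT:
        tcache_bin.insert(0, fastbin.pop(0))  # stash the rest via tcache_put
    return victim

tcache_bin = []
fastbin = [0x603790, 0x6020e0, 0x603050]          # hypothetical addresses
print(hex(malloc_from(tcache_bin, fastbin)))      # 0x603790, from fastbin[0]
print([hex(c) for c in tcache_bin])               # ['0x603050', '0x6020e0']
```

Note the order reversal: because tcache_put pushes onto the head of the bin, the stashed chunks end up in the tcache in the reverse of their fastbin order.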

In order to reach this code to copy the freed fastchunks over to the tcache_bin, we need to empty tcache_bin[0] by allocating all the tcache chunks out, and then allocate 1 more chunk out of fastbin[0].

To do this, we can allocate 6 chunks out of the tcache_bin by editing userArr[8]’s group 3 times, since each time we edit a group and give it a new name, 2 heap chunks are allocated.
After this, our fastbin and tcache_bin will look like this:

Now, if we edit userArr[8]’s group one more time, the following actions will be performed:

we will allocate 0x6033c0 out of the tcache_bin

the memory allocator will attempt to allocate another chunk out of the tcache_bin, but upon seeing no more tcache chunks available, will instead allocate 0x603790 out of fastbin[0]

move/push the rest of fastbin[0] into the tcache_bin, starting with 0x6020e0

This will result in our fastbin and tcache_bin looking like this:

```
# fastbin:
fastbin[0]: 0x0

# tcachebin:
[heap]: 0x603050 --> 0x6020e0 --> 0x603780 --> 0x0
```

As we can see, userArr now sits at the head of our tcache_bin, giving us the ability to write into it upon the next memory allocation!

Now, upon the next memory allocation, we can craft the following fake groupName chunk that sits on top of the userArr[] array:

```python
# alloc 8 chunks
# place userArr @ head of tcache_bin
editGroup(8, "n", "round1")
editGroup(8, "n", "round2")
editGroup(8, "n", "round3")
editGroup(8, "n", "round4")
# alloc 1 chunk out of tcache, 1 chunk out of fastbin, copy rest of fastbin into tcache
```

Sources

While I was doing this challenge, I relied heavily on reading the glibc 2.26 source as well as this article by _2can.
I found the latter to be an excellent summary of the new features and changes to malloc.c that tcache brings, and I highly recommend it to anyone interested in learning more about this new caching mechanism.