In this code, the ‘vp’ pointer is used to store a ‘vnode’ structure (defined in sys/vnode.h). The bug is a missing clean-up of that structure before returning. As you can read in the last ‘if’ clause, in case of an error in msleep(), the code decrements the writers’ reference counter and, if there are no others left, marks the socket ‘fip->fi_readsock’ as unable to receive any more data using socantrcvmore(), then takes a mutex to increment the ‘fip->fi_wgen’ counter and finally calls fifo_cleanup() on the ‘vp’ pointer to dispose of the FIFO resources, like this:

However, in the ‘if’ clause of fifo_open() for ‘ap->a_mode & FWRITE’, when the FIFO is opened in non-blocking mode and the readers’ reference counter is zero, the code unlocks the FIFO mutex and returns ENXIO (aka Device not configured) without releasing the resources. This results in a resource leak.
The suggested patch, as we can read in the original advisory, is to add the missing clean-up call.
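
The pattern is easier to see in a small userland model. The following is only a sketch under stated assumptions, not the kernel code: ‘struct fifo_state’, model_alloc(), model_cleanup() and the ‘patched’ switch are hypothetical names invented for illustration, and a plain malloc() stands in for the FIFO's sockets.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-FIFO state; not the kernel's struct. */
struct fifo_state {
    int readers;
    int writers;
    void *sockets;              /* stands in for the FIFO's socket pair */
};

static int live_allocations = 0; /* resources allocated and not yet freed */

/* Set up the backing resources on first open. */
static void model_alloc(struct fifo_state *f)
{
    if (f->sockets == NULL) {
        f->sockets = malloc(64);
        if (f->sockets != NULL)
            live_allocations++;
    }
}

/* Release the resources once nobody references the FIFO any more. */
static void model_cleanup(struct fifo_state *f)
{
    if (f->readers == 0 && f->writers == 0 && f->sockets != NULL) {
        free(f->sockets);
        f->sockets = NULL;
        live_allocations--;
    }
}

/* Model of a write-side, non-blocking open. With patched == 0 the early
 * ENXIO return skips the clean-up, which is the leak described above. */
static int model_open_write_nonblock(struct fifo_state *f, int patched)
{
    model_alloc(f);
    if (f->readers == 0) {
        if (patched)
            model_cleanup(f);
        return ENXIO;
    }
    f->writers++;
    return 0;
}
```

Calling model_open_write_nonblock() with ‘patched’ set to 0 leaves ‘live_allocations’ one higher after every failed open, while the patched variant returns the same ENXIO and leaves the counter unchanged.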

They retrieve the maximum IPC socket number using the previous wrapper routine and set ‘maxiter’ to twice that value, unless the user specified a value through the first argument of the program. The next code is this:

This loop iterates at most ‘maxiter’ times (twice the maximum IPC socket number), and only while the ‘notdone’ flag is non-zero. Inside the ‘while’ loop, it creates a FIFO at the previously unlinked path and sets its mode accordingly. Then, it opens that FIFO write-only and non-blocking, and immediately unlinks it. If the open(2) system call returns ‘ENXIO’, the ‘notdone’ flag is zeroed out. This simple code is enough to reach the fifo_open() bug discussed above, since the FIFO is opened for writing in non-blocking mode and has no readers.
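
A single iteration of such a loop can be sketched with standard POSIX calls. This is a minimal illustration, not the original test program: probe_fifo_once() and probe_in_tempdir() are hypothetical helper names, and the temporary directory under /tmp is an assumption made so the sketch cleans up after itself. POSIX specifies ENXIO for a write-only, non-blocking open of a FIFO that has no reader.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a FIFO at `path`, open it write-only and non-blocking while no
 * reader exists, then unlink it. Returns 0 if the open succeeded, or the
 * errno value of the call that failed. */
static int probe_fifo_once(const char *path)
{
    int fd, saved;

    (void)unlink(path);
    if (mkfifo(path, 0600) == -1)
        return errno;

    fd = open(path, O_WRONLY | O_NONBLOCK);
    saved = (fd == -1) ? errno : 0;   /* save errno before further calls */
    if (fd != -1)
        close(fd);
    (void)unlink(path);
    return saved;
}

/* Run one probe inside a freshly created temporary directory. */
static int probe_in_tempdir(void)
{
    char dir[] = "/tmp/fifoprobeXXXXXX";
    char path[64];
    int rc;

    if (mkdtemp(dir) == NULL)
        return errno;
    snprintf(path, sizeof(path), "%s/fifo", dir);
    rc = probe_fifo_once(path);
    (void)rmdir(dir);
    return rc;
}
```

Calling probe_in_tempdir() in a loop and watching the open socket count is roughly what the walkthrough above describes; the sketch itself only reports what a single open(2) returned.
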
Finally, the code continues…

These are just a few printf(3) calls that report the number of open IPC sockets, obtained through the sysctl(3) interface, plus an informative message saying whether or not the system returned ‘ENXIO’ (meaning it’s buggy) and consequently zeroed out ‘notdone’.

This is a serious issue and, to begin with, credits for this advisory go to John Baldwin, Konstantin Belousov, Alan Cox, and Bjoern Zeeb. The bug affects all of the FreeBSD releases prior to that patch. The concept is that user space processes are not restricted from mapping the NULL address, and this results in perfectly exploitable conditions for NULL pointer dereference vulnerabilities. For example, if kernel code attempts to access data at NULL or at an offset from it because of a NULL pointer dereference, a user could map that address and inject malicious data that could lead to code execution in the context of the kernel.
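
To make "mapping NULL" concrete, here is a minimal sketch that simply asks the kernel for an anonymous page at virtual address 0 and reports whether the request was granted. try_map_page_zero() is a hypothetical helper name; the result depends entirely on the operating system and its configuration, so no particular outcome is assumed.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Ask the kernel for an anonymous mapping at virtual address 0.
 * Returns 0 if the mapping was granted (it is released again before
 * returning), otherwise the errno value reported by mmap(2). */
static int try_map_page_zero(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    void *p;

    if (pagesz <= 0)
        return EINVAL;

    p = mmap((void *)0, (size_t)pagesz, PROT_READ | PROT_WRITE,
             MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return errno;

    munmap(p, (size_t)pagesz);
    return 0;
}
```

A return value of 0 means an unprivileged process was allowed to place data at address 0, which is exactly the precondition the paragraph above describes.
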
To fix this, a patch was developed which adds a new sysctl(8) option like this…

This is inserted in sys/kern/kern_exec.c and it initializes a new sysctl named “security.bsd.map_at_zero” that uses the static integer ‘map_at_zero’, which is set to 1 by default. From the SYSCTL_INT() we can easily deduce that mappings at virtual address 0 are allowed by default, since the value is 1. To disable NULL mappings, users could use something like:

security.bsd.map_at_zero="0"

in their /boot/loader.conf or /etc/sysctl.conf. The function exec_new_vmspace() from kern/kern_exec.c, which is responsible for destroying the old address space and allocating a new stack, was also changed to include a new variable:

The initial check required that the VM reference count is 1, that the minimum VM mapping is equal to ‘sv->sv_minuser’ and that the maximum VM mapping is equal to ‘sv->sv_maxuser’. As we can read in sys/sysent.h, ‘sv_minuser’ and ‘sv_maxuser’ represent:

So, the above check was simply checking the boundaries of the VM. The new check uses the newly added ‘map_at_zero’ variable: if it contains a non-zero value, ‘sv_minuser’ is initialized with the requested ‘sv->sv_minuser’. Otherwise, it is set to the larger of ‘sv->sv_minuser’ and the page size. That is, for a request to map at NULL, the result would be:

MAX(0x0, 0x1000) = 0x1000
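
The same selection can be written as a tiny function. This is a sketch of the logic as described here, not the kernel source: effective_minuser(), SKETCH_PAGE_SIZE and SKETCH_MAX are hypothetical names, and a 4 KiB page is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000u               /* assumed 4 KiB page */
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Lowest user address at which the new address space may start. */
static uintptr_t effective_minuser(uintptr_t requested_minuser, int map_at_zero)
{
    if (map_at_zero)
        return requested_minuser;              /* NULL mappings permitted */
    return SKETCH_MAX(requested_minuser, (uintptr_t)SKETCH_PAGE_SIZE);
}
```

With ‘map_at_zero’ set to 0, a requested minimum of 0x0 is raised to 0x1000, so the first page can no longer be mapped; requests that already start above the first page are left alone.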

The old check was moved after the NULL mapping check, as you can see in the above diff file. In addition to this, the else clause of that check was changed to use the appropriate min/max VM address values, since the old one was using the user-controlled ones directly.
This is the first protection against NULL pointer mappings in FreeBSD and I think it didn’t get the attention it should…

P. Frasunek noticed that in devfs_open(), in the same file, ‘fp->f_vnode’ is not initialized and thus remains zero during the execution of the above code. This routine, devfs_fp_check(), uses devvn_refthread() to initialize the ‘dswp’ pointer. Then, if ‘devp’ (which is the pointer to the requested device) isn’t NULL, it releases the device thread using dev_relthread() and returns ENXIO (aka Device not configured). Otherwise, it asserts that ‘(*devp)->si_refcount’ (which contains the number of references to that structure) is greater than zero and, if ‘dswp’ is NULL, immediately returns ENXIO. In any other case, it initializes ‘curthread->td_fpop’ with ‘fp’. ‘curthread’ is read through FS:[0] (on IA-32) or GS:[0] (on x86_64), which holds the currently executing thread’s structure (aka struct thread), and ‘td_fpop’, as we can read in sys/proc.h, contains the “file referencing cdev under op”.
Now, a closer look at dev_relthread(), which is called in case of a non-NULL device pointer. It can be found in kern/kern_conf.c and does this:

Basically, it simply decrements the device’s thread counter by one while holding a lock. The second routine called in devfs_fp_check() is devvn_refthread(), which is used to initialize the ‘dswp’ pointer. This is probably the most interesting one…

It takes a vnode and a cdev structure as arguments and, after a simple assertion, locks the device and sets the device pointer to ‘vp->v_rdev’. Since ‘fp->f_vnode’ was not properly initialized in devfs_open() and is used directly as the first argument of devvn_refthread(), this results in a NULL pointer dereference, and ‘devp’ ends up pointing to NULL->v_rdev, as P. Frasunek discovered. Next, if ‘devp’ isn’t NULL (which is our case), it initializes ‘cdp’ with ‘(*devp)->si_priv’, checks the CDP_SCHED_DTR flag and, if that check passes, initializes ‘csw’ to ‘(*devp)->si_devsw’. Finally, if this is not NULL, it increments ‘(*devp)->si_threadcount’.
This final operation allows the modification of an arbitrary user-controlled location but, unfortunately, the value is restored to its original state by the decrement that dev_relthread() performs when called in devfs_fp_check(). Nevertheless, P. Frasunek managed to write a really awesome exploit for that vulnerability. Before moving on with the analysis of his exploit code, here is how it was patched by the FreeBSD guys:

FILE_LOCK(fp);
fp->f_data = dev;
+ fp->f_vnode = vp;
FILE_UNLOCK(fp);

A simple initialization of ‘fp->f_vnode’ in devfs_open() was enough.
Now, to the exploit code…

So, he’s using a JE instruction in devfs_fp_check() as his target for the increment/decrement race condition, and ‘ap’ is initialized to point to ‘pages[2]’, which has the address of the ‘dsw->d_kqfilter()’ routine. This is filled with the contents of kernel_code(), which is this:

It retrieves the current thread structure on IA-32 systems (on x86_64 he should be using %%gs:0), sets the current thread’s UID to that of root (aka 0), and sets the pointer to the jail the current thread may be running in to NULL, in order to escape from a jail environment.
Back to main() we have…

He initializes ‘kq’ using the kqueue() system call and creates two threads that will execute do_thread() and do_thread2() respectively. Then he initializes a timespec structure and finally runs a simple sleep loop to wait for the threads to gain execution in the kernel context. Here is the code of do_thread():

As long as it has not gained root access, it initializes a kevent structure using the EV_SET() macro, with ‘fd’ as the identifier, an EVFILT_READ filter (which means that the event triggers when there is data available to read from ‘fd’) and EV_ADD to add the event to the kqueue. Finally, it invokes kevent() on the previously set event.
Now, do_thread2() goes like this:

So, if it is able to set its UID to 0 it will spawn a shell which would be a root-shell. :)
The goal of this exploit code is to follow these steps:
1) Make the location where the increment takes place coincide with the JE instruction of devfs_fp_check().
2) Open a device to trigger the increment. This turns the JE (which is 0x74) into a JNE (which is 0x75), and this results in the invocation of dsw->d_kqfilter() as we can see here:

3) The kernel will jump to dsw->d_kqfilter(), but this is where kernel_code() resides, which leads to privilege escalation and a possible jail escape.

By doing so, P. Frasunek avoids the dev_relthread() call (the decrement) in devfs_kqfilter_f(), as you can clearly see. The two threads are used to hit the race window between the increment and the decrement, one using kevent() on ‘fd’ and the other opening and closing ‘fd’.
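
The opcode arithmetic behind step 2 is worth spelling out. The sketch below only demonstrates that a +1 applied to the byte 0x74 yields 0x75 and that a following -1 restores it; increment_opcode() and decrement_opcode() are hypothetical helpers operating on an ordinary variable, not on kernel memory.

```c
#include <assert.h>
#include <stdint.h>

#define OPCODE_JE  0x74u   /* short conditional jump: jump if equal */
#define OPCODE_JNE 0x75u   /* short conditional jump: jump if not equal */

/* Apply the +1 that a reference-count increment performs to one byte. */
static uint8_t increment_opcode(uint8_t opcode)
{
    return (uint8_t)(opcode + 1u);
}

/* Apply the matching -1, as the later decrement would. */
static uint8_t decrement_opcode(uint8_t opcode)
{
    return (uint8_t)(opcode - 1u);
}
```

Because the two short-jump opcodes differ by exactly one, the branch is inverted only between the increment and the decrement, which is why the exploit needs a second thread to execute the affected code inside that window.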

This bug was reported by Aragon Gouveia on 19 July 2009 and it affects (at least) the 8.0-BETA1 release of FreeBSD. This should not be considered a critical vulnerability (in my opinion), since it only affects systems that use ppp(8) and use NAT to forward port 65535. The buggy code can be found in src/usr.sbin/ppp/nat_cmd.c like this:

As you can see, llocalport, haliasport, hremoteport and lremoteport are all declared as unsigned short integers, meaning they are 2 bytes long and consequently have a range of 0-65535. The while loop iterates as long as ‘laliasport’ is less than or equal to ‘haliasport’. However, as you can see, both counters are incremented regardless of their values. Because of this, if a user attempts to use port 65535, ‘laliasport’ wraps around to 0 instead of exceeding ‘haliasport’, the loop condition never becomes false, and the result is an infinite loop. To fix this, they changed the above loop like this:
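
As an aside, the wrap-around is easy to reproduce outside ppp(8). The following is a minimal userland sketch, neither the nat_cmd.c code nor the committed fix: count_port_iterations() is a hypothetical helper, ‘cap’ is an artificial bound added so the sketch terminates, and a 16-bit unsigned short is assumed.

```c
#include <assert.h>

/* Count the iterations of a loop shaped like
 *     while (lo <= hi) { ...; lo++; }
 * with unsigned short counters. Returns the number of iterations, or -1
 * if `cap` iterations were reached and the loop was still running. */
static long count_port_iterations(unsigned short lo, unsigned short hi, long cap)
{
    long n = 0;

    while (lo <= hi) {
        if (n == cap)
            return -1;   /* still looping: the counter has wrapped */
        lo++;            /* 65535 + 1 wraps around to 0 */
        n++;
    }
    return n;
}
```

For a range such as 80 to 90 the loop ends after 11 iterations, but with an upper bound of 65535 the condition can never become false, and only the artificial cap stops the sketch.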

As you can see, during the first arithmetic operation, it multiplies ‘len’ by some constant values. If ‘len’ is large enough, this could result in an integer overflow, and the subsequent call to os_malloc() would allocate an incorrect number of bytes, which would later result in heap memory corruption. The proposed patch is a simple check after the calculation.
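
For comparison, here is one conventional way to guard such a size computation. Note that this sketch checks before multiplying, which is a different technique from the after-the-fact check the patch is described as using. checked_alloc_size() and its ‘factor’ and ‘extra’ parameters are hypothetical and do not correspond to the constants in the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute len * factor + extra and store it in *out.
 * Returns 0 on success, or -1 if the result would not fit in a size_t
 * (in which case *out is left untouched). */
static int checked_alloc_size(size_t len, size_t factor, size_t extra, size_t *out)
{
    if (factor != 0 && len > (SIZE_MAX - extra) / factor)
        return -1;
    *out = len * factor + extra;
    return 0;
}
```

A caller would pass the result to the allocator only when the function returns 0, so an oversized ‘len’ is rejected instead of silently producing a small allocation.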

This bug was discovered and disclosed by Shaun Colley on 22 June 2009. This issue affects FreeBSD 6.0 as well as the 8.0 branch, and probably more releases as well. Here is the vulnerable code as seen in dev/ata/ata-all.c of FreeBSD 6.0:

The bug is really simple. If the IOCTL command is IOCATAREQUEST, it will immediately attempt to allocate ioc_request->count bytes, a value taken from the user-controlled data passed to that IOCTL call. Therefore, a user could request a huge amount of memory, which will panic the kernel. Shaun Colley also published atapanic.c, which makes that IOCTL call and requests 0xffffffff bytes. Of course, in order to do this you need read access to an ATA device.
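
The general defence against this class of bug is to validate a caller-supplied size before allocating. The sketch below illustrates that idea in userland only; it is an assumption about how one might guard such a request, not the actual FreeBSD fix. alloc_request_buffer() is a hypothetical helper and the 64 KiB cap in SKETCH_MAX_REQUEST is an arbitrary value chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_REQUEST (64u * 1024u)   /* hypothetical upper bound */

/* Allocate a buffer for a caller-supplied byte count, rejecting counts
 * above the cap instead of passing them straight to the allocator.
 * Returns 0 on success, EINVAL for a rejected count, ENOMEM on failure. */
static int alloc_request_buffer(size_t count, void **out)
{
    if (count == 0 || count > SKETCH_MAX_REQUEST)
        return EINVAL;
    *out = malloc(count);
    return (*out == NULL) ? ENOMEM : 0;
}
```

With such a check in place, a request for 0xffffffff bytes fails with EINVAL before any allocation is attempted.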

This is not a security-related issue. Nevertheless, it is still interesting. It affects the FreeBSD 7.2 release, and credits for that bug go to Pete French and David Christensen. The bug was officially disclosed on 24 June 2009 by the FreeBSD project. The bug appears in bce(4), which is a device driver for Broadcom NetXtreme II PCI/PCIe Gigabit Ethernet adapters. If you add a network adapter with that device driver as a lagg(4) member, the interface will stop working. In addition to this, when ZERO_COPY_SOCKETS is not defined, the packet length is not updated, which leads to incorrect values being passed to userspace. This was fixed simply by adding the missing #else clause after the #ifdef ZERO_COPY_SOCKETS like this: