I would like to use Mach threads in one of my applications. However, I'm having trouble getting a simple "Hello, world" program using Mach threads to work on OS X 10.9.4. The following MWE always crashes with a segmentation fault. Perhaps I am not setting up the stack correctly, but I can't figure out where I'm going wrong. Do you have any ideas?

My immediate suspicion is that you need to convert Mach threads to pthreads in order to use std* functions (like printf). You can look at the code injection sample I put up a while ago in the bonus section. That "promotion" wasn't explained in the book, but it surely will be in the second edition.

Thanks for the reply! I don't think it's the call to printf that's the problem: I removed the call to printf (so that the test function is empty), and allocated the stack, adjusted the stack pointer, and created the thread in the same way that is done in the code injection example. Unfortunately, the segmentation fault still persists. I have included the updated C++11 MWE below. In case you want to compile it, you can do so using clang++ -std=c++11 <file>.

- Without anything in test(), the function does execute and then returns, i.e. it pops the return address (RA). But you didn't put anything on the stack, so it pops a NULL value and jumps to it, hence the segfault.
- With something (e.g. printf()) in test(), you fail, as I mentioned earlier, because of pthread_getspecific, since you did not convert the Mach thread to a pthread.
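To make the first point concrete, here is a minimal sketch of how the initial stack pointer could be prepared so that the return from test() lands somewhere meaningful. The helper name prepare_initial_rsp and its parameters are hypothetical, not taken from the MWE or the injection sample. It assumes the usual x86-64 convention that at function entry the return address sits at the top of the stack, with RSP 8 bytes below a 16-byte boundary. The return address you pass in should point at a routine that never returns (for example one that terminates the thread); that routine is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: compute the value to load into RSP for a new thread
// and store a return address in the slot that the final `ret` will pop.
inline std::uintptr_t prepare_initial_rsp(unsigned char* stack_base,
                                          std::size_t stack_size,
                                          std::uintptr_t return_address) {
    assert(stack_base != nullptr && stack_size >= 64);

    // The stack grows downward, so start from the highest address.
    std::uintptr_t top =
        reinterpret_cast<std::uintptr_t>(stack_base) + stack_size;

    // Round down to a 16-byte boundary.
    top &= ~static_cast<std::uintptr_t>(15);

    // Reserve one pointer-sized slot, as a `call` instruction would have done.
    std::uintptr_t rsp = top - sizeof(std::uintptr_t);

    // Store the return address in that slot.
    std::memcpy(reinterpret_cast<void*>(rsp), &return_address,
                sizeof(return_address));
    return rsp;
}
```

The returned value is what would go into the rsp field of the thread state before the thread is started, with rip pointing at test(). Without a valid address in that slot, the behaviour matches what the debugger showed: test() runs, then returns to address 0.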

Thanks so much for taking the time to look into this, and for introducing me to LLDB! It has been immensely helpful.

I am now able to get the "Hello, world!" to show up using the raw write() system call. Since I don't want to return to anything from test(), I would like to tell the kernel to terminate the current thread. The following code results in a crash after printing "Hello, world!", despite the fact that mach_thread_self() is returning the correct thread ID (which I confirmed).

You mentioned in your book that "the implementation of get_active_thread() wraps CPU_DATA_GET(cpu_active_thread,thread_t), which is inline assembly (relying on the GS register)". I tried to search online to get any hints as to what the cause of the problem is, but was unable to solve the issue. Do you know how I would go about correctly terminating the thread at the end of test()?

As Administrator said, if you don't call pthread_set_self() at the beginning of your Mach thread, you will eventually crash. The issue is that a lot of functions call (g/s)et_thread_specifics(), and that crashes because GS is not properly set (it will be 0). The result is a segmentation fault or something similar.

You certainly can use native functions, but you will need to figure out which functions reference GS and which don't.

I apologize for the very late reply -- I only recently got the time to properly look into what you suggested. I have two questions regarding pthread_set_self that I could use some help answering. Based on information I gathered about x64 system call conventions for OS X and the value of the SYSCALL_MDEP_CONSTRUCT macro, I think I am supposed to call pthread_set_self (which is thread_fast_set_cthread_self64 for x64) as follows:

I obtained the constant 50331651 by expanding SYSCALL_MDEP_CONSTRUCT(3), since thread_fast_set_cthread_self64 occupies the third slot in the machine-dependent system call table on x64. The value moved into RDI should be replaced by the value to which we wish to set the thread ID. Is this right?
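For anyone who wants to check the arithmetic behind that constant, here is a small sketch of the expansion. The shift, mask, and class values are what I understand XNU's osfmk/mach/i386/syscall_sw.h to define (where, as far as I can tell, the macro is spelled SYSCALL_CONSTRUCT_MDEP); treat them as assumptions and verify them against the headers for your kernel version. The name construct_mdep is my own.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values, mirroring the syscall class encoding in XNU's i386 headers.
constexpr std::uint32_t kSyscallClassShift = 24;
constexpr std::uint32_t kSyscallClassMask  = 0xFFu << kSyscallClassShift;
constexpr std::uint32_t kSyscallNumberMask = ~kSyscallClassMask;
constexpr std::uint32_t kSyscallClassMdep  = 3;  // machine-dependent class

// Class in the top byte, index into the machine-dependent table in the rest.
constexpr std::uint32_t construct_mdep(std::uint32_t syscall_number) {
    return (kSyscallClassMdep << kSyscallClassShift) |
           (kSyscallNumberMask & syscall_number);
}

// Slot 3 in the machine-dependent table gives the constant quoted above.
static_assert(construct_mdep(3) == 50331651u, "decimal form");
static_assert(construct_mdep(3) == 0x3000003u, "hex form");
```

Under these assumptions the constant is 0x3000003: the machine-dependent class (3) in the top byte and table slot 3 in the low bits. This only confirms the number itself, not what argument the call expects in RDI.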

Also, to which value do I set the thread ID? I'm pretty sure it's not the value returned by thread_create, so what should it be?

I was finally able to find a solution to my problem, but the solution is not ideal. I am including this information here in case it is useful to others who are facing similar problems. While reading some system header files, I came across the function pthread_create_suspended_np, which sets up a pthread with the underlying Mach thread in a suspended state. This does the nontrivial task of filling in the fields of the _opaque_pthread_t structure with the right values and calling pthread_set_self with a pointer to this structure. (For more information about pthread_set_self, see The Mac Hacker’s Handbook, pages 301–305.) After using this function, I assigned each thread a distinct affinity tag, and then resumed all of the threads using thread_resume. As it turns out, it is still not possible to force each thread to run on its own processor. It looks like this is the best we can do for now =/