9.32 gauche.threads - Threads

If enabled at compilation time, Gauche can use threads
built on top of either POSIX threads (pthreads) or Windows threads.

Module: gauche.threads

Provides the thread API. You can ’use’ this module regardless of
whether thread support is compiled in; if threads are not
supported, many thread-related procedures simply signal a
"not supported" error.

If you want to switch code depending on whether threads are
available or not, you can use the feature identifier gauche.sys.threads
with the cond-expand form (see Feature conditional).

There are also feature identifiers gauche.sys.pthreads and
gauche.sys.wthreads defined for pthreads and Windows threads
platforms, respectively.
At the Scheme level, however, you rarely need to distinguish
the underlying implementations. It is recommended to use
gauche.sys.threads to switch the code according to
thread availability.
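
As a minimal sketch of this recommendation, the following defines a
hypothetical helper call-in-thread (not part of Gauche) that evaluates a
thunk in a new thread when gauche.sys.threads is available, and otherwise
falls back to calling it directly:

```scheme
(use gauche.threads)

;; call-in-thread is a hypothetical helper, not a Gauche procedure.
(cond-expand
 [gauche.sys.threads
  ;; Threads are available: run THUNK in a new thread and return its result.
  (define (call-in-thread thunk)
    (thread-join! (thread-start! (make-thread thunk))))]
 [else
  ;; No thread support: just call THUNK in the current thread.
  (define (call-in-thread thunk)
    (thunk))])

(call-in-thread (lambda () (+ 1 2)))
```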

To check if threads are available at runtime,
instead of compile time, use the following procedure.

Function: gauche-thread-type

Returns a symbol that indicates the supported thread type.
It can be one of the following symbols.

none

Threads are not supported.

pthread

Threads are built on top of POSIX pthreads.

win32

Threads are built on top of Win32 threads.

(Note: On pthreads platforms, it would be more consistent to return
pthreads instead of pthread; then the returned symbol would correspond to the
value given to the --enable-threads option at configuration time.
This is a historical oversight, kept for backward compatibility.)
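
A small sketch of the runtime check; threads-available? is a hypothetical
helper name introduced only for this illustration:

```scheme
(use gauche.threads)

;; threads-available? is a hypothetical helper, not a Gauche procedure.
(define (threads-available?)
  (not (eq? (gauche-thread-type) 'none)))

(if (threads-available?)
    (print "thread type: " (gauche-thread-type))
    (print "this Gauche was built without thread support"))
```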

9.32.1 Thread programming tips

What Gauche threads are for

Although the surface API of threads looks simple and portable,
you need to know how the threads are implemented in order to use
the feature to its full potential. Some languages support threads as
a built-in language construct and encourage programmers
to express computation in terms of threads.
However, in many cases there are
alternatives to threads for implementing the desired
algorithm, and you need to weigh the advantages and
disadvantages of using threads according to how the threads
are realized in the underlying system.

In Gauche, the primary purpose of threads is to write programs
that require preemptive scheduling and are therefore
difficult to express in other ways. Preemptive threads may
be required, for example, when you have to call a module that
does blocking I/O which you can’t intercept, or that may spend
an unpredictable amount of time in computation that you want
to be able to interrupt.

For each Gauche thread, an individual VM is allocated
and is run by a dedicated POSIX thread. Thus the
cost of a context switch is the same as that of a native
thread, but creating a thread costs much more than creating,
say, a lightweight thread built on top of call/cc.
So Gauche’s preemptive threads are not designed for
applications that want to create thousands of threads
for fine-grained calculation.

The recommended usage is the so-called "thread pool" technique:
you create a set of threads, keep them around for a
long time, and dispatch jobs to them as needed. Gauche provides
a thread pool implementation in the control.thread-pool module
(see Thread pools).
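
A minimal sketch of that usage follows. It assumes the control.thread-pool
procedures make-thread-pool, add-job!, wait-all and terminate-all! behave as
described in the Thread pools section; check that section for the exact
signatures. The variable names are made up for the example.

```scheme
(use gauche.threads)
(use control.thread-pool)

(define total (atom 0))               ; shared accumulator
(define pool (make-thread-pool 4))    ; four long-lived worker threads

;; Dispatch ten small jobs; each adds its own number to the total.
(for-each (lambda (i)
            (add-job! pool (lambda () (atomic-update! total (cut + i <>)))))
          (iota 10))

(wait-all pool)         ; wait until the queued jobs are finished
(terminate-all! pool)   ; shut the workers down

(atom-ref total 0)      ; sum of 0 through 9
```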

Preemptive threads have other difficulties
(e.g. see FairThreads),
and sometimes the alternatives may be a better fit
than the native preemptive threads.

If what you need is just a concurrent calculation, you
might be able to use a cooperative thread technique built
on top of call/cc. Creating call/cc-based threads
is much faster than creating native threads.

If what you need is to deal with blocking I/O, and you have
all your code at hand, it is sometimes easier to use good old
select-based dispatching (see Simple dispatcher,
for example).

If what you need is to control the resource consumption of a
subsystem, and the subsystem works fairly independently of
the main system, you may be able to use Unix processes instead of threads.
It may sound like a step backward, but a Unix process does provide
a stronger "shield" between the subsystem and the main system
(e.g. the main system can keep running even if the subsystem segfaults).

Of course, these techniques are not mutually exclusive with
native threads. You can use a dispatcher with the "thread pool" technique,
for example. Just keep in mind that native threads
are not the only way, but one of several ways, to realize those features.

Uncaught errors in a thread body

When you run a single-thread program that raises an unexpected (unhandled)
error, Gauche prints an error message and a stack trace by default.
So sometimes it perplexes programmers when a thread doesn’t print
anything when it dies because of an unhandled error.

What’s happening is this: An unhandled error in a thread body
causes the thread to terminate, and the error itself propagates
to the thread that is waiting for the result of the terminated thread.
So, you get the error (wrapped by <uncaught-exception>)
when you call thread-join! on a thread that has terminated
because of an unhandled error. This behavior is defined in SRFI-18.
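
The following sketch illustrates the behavior. It assumes the original error
is an ordinary <error> condition, whose text can be read from its message
slot; the variable names are made up for the example.

```scheme
(use gauche.threads)

;; The thread body raises an error that nothing inside the thread handles.
(define t (thread-start! (make-thread (lambda () (error "boom")))))

;; The error surfaces in the joining thread, wrapped in <uncaught-exception>.
;; uncaught-exception-reason extracts the original condition.
(define joined-message
  (guard (e [(uncaught-exception? e)
             (slot-ref (uncaught-exception-reason e) 'message)])
    (thread-join! t)))
```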

If you fire a thread for one-shot calculation, expecting to receive
the result by thread-join!, then this is reasonable; you can
handle the error situation in the “parent” thread. However,
if you run a thread that loops indefinitely to process something and
do not expect to retrieve its result via thread-join!, this becomes
a pitfall; the thread may die unexpectedly but you wouldn’t know it.
(If such a thread is garbage-collected, a warning is printed.
However, you can’t tell when that will happen, so you can’t count on it.)

For such threads, you should always wrap the thread body
with guard and handle the error explicitly. You can call
report-error to display the default error message and a stack
trace.

Note: As of 0.9.5, Gauche has a known bug in which a tail call in an
error-handling clause of guard doesn’t become a proper
tail call. So a loop that restarts itself from such a clause, which
should run safely in Scheme, could eat up the stack.
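
As a sketch of the pattern this note is about, and of one way to keep the
recursive call out of guard, consider the code below. run-jobs and
serve-forever are hypothetical names, and the job list is a finite stand-in
for a real service loop.

```scheme
(use gauche.threads)

;; The form the note warns about (shown only as a comment): the handler
;; clause re-enters the loop from inside guard.
;;
;;   (let loop ()
;;     (guard (e [else (report-error e) (loop)])
;;       (serve-forever)))          ; serve-forever is hypothetical

;; Alternative: let guard return a flag, then loop from outside it.
;; JOBS is a list of thunks; the result is the number of failed jobs.
(define (run-jobs jobs)
  (let loop ([jobs jobs] [failures 0])
    (if (null? jobs)
        failures
        (let ([ok? (guard (e [else (report-error e) #f])
                     ((car jobs))
                     #t)])
          (loop (cdr jobs) (if ok? failures (+ failures 1)))))))

(define worker
  (thread-start!
   (make-thread
    (lambda ()
      (run-jobs (list (lambda () 'fine)
                      (lambda () (error "job failed"))
                      (lambda () 'also-fine)))))))

(define failed-count (thread-join! worker))
```

The failing job is reported through report-error on the error port, and the
worker keeps processing the remaining jobs.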

9.32.2 Thread procedures

Builtin Class: <thread>

A thread. Each thread has an associated thunk which is evaluated by
a POSIX thread. When thunk returns normally, the result is stored
in the internal ’result’ slot, and can be retrieved by thread-join!.
When thunk terminates abnormally, either by raising an exception or
by being terminated by thread-terminate!, the exception condition is
stored in its internal ’result exception’ slot, and will be passed
to the thread calling thread-join! on the terminated thread.

Each thread has its own dynamic environment and dynamic handler stack.
When a thread is created, its dynamic environment is initialized by
the creator’s dynamic environment. The thread’s dynamic handler
stack is initially empty.

A thread is in one of the following four states at a time.
You can query the thread state by the thread-state procedure.

new

A thread hasn’t started yet. A thread returned from make-thread
is in this state.
Once a thread is started it will never be in this state again.
At this point, no POSIX thread has been created; thread-start!
creates a POSIX thread to run the Gauche thread.

runnable

When a thread is started by thread-start!, it enters this
state. Note that a thread blocked by a system call is still in
the runnable state.

stopped

A thread enters this state when it is stopped by thread-stop!.
A thread in this state can go back to the runnable state by
thread-cont!, resuming execution from the point where
it was stopped.

terminated

When a thread has finished executing its associated code, or
is terminated by thread-terminate!, it enters this state.
Once a thread is in this state, the state can no longer be changed.
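
A short sketch that observes the first and last of these states with
thread-state (the variable names are made up for the example):

```scheme
(use gauche.threads)

(define t (make-thread (lambda () 'done)))

(define state-before (thread-state t))   ; not started yet
(thread-start! t)
(define result (thread-join! t))         ; wait for the thunk to return
(define state-after (thread-state t))    ; the thunk has finished
```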

Access to the resources shared by multiple threads must be protected
explicitly by synchronization primitives.
See Synchronization primitives.

Access to ports is serialized by Gauche. If multiple threads attempt
to write to a port, their output may be interleaved but no output
will be lost, and the state of the port is kept consistent.
If multiple threads attempt to read from a port, a single read
primitive (e.g. read, read-char or read-line)
works atomically.

Signal handlers are shared by all threads, but each thread has
its own signal mask. See Signals and threads, for details.

A thread object has the following external slots.

Instance Variable of <thread>: name

A name can be associated with a thread.
This is just for the convenience of the application.
The primordial thread has the name "root".

Instance Variable of <thread>: specific

A thread-local slot for use of the application.

Function: current-thread

[SRFI-18], [SRFI-21]
Returns the current thread.

Function: thread? obj

[SRFI-18], [SRFI-21]
Returns #t if obj is a thread, #f otherwise.

Function: make-thread thunk :optional name

[SRFI-18], [SRFI-21]
Creates and returns a new thread to execute thunk.
To run the thread, you need to call thread-start!.
The result of thunk may be retrieved by calling thread-join!.

You can provide the name of the thread by the optional argument name.

The created thread inherits the signal mask of the calling thread
(see Signals and threads), and has a copy of
parameters of the calling thread at the time of creation
(see Parameters).

Other than those initial setups, there is no relationship between
the new thread and the calling thread; there’s no parent-child
relationship like that of Unix processes. Any thread can call thread-join!
on any other thread to receive the result. If nobody issues
thread-join! and nobody holds a
reference to the created thread, it will be garbage collected
after the execution of the thread terminates.

If a thread’s execution is terminated because of an uncaught exception,
and its result is never retrieved by thread-join!, a warning
saying “thread dies a lonely death” will be printed to the standard
error port. It usually indicates a coding
error. If you don’t collect the result of threads, you have to
make sure that all the exceptions are captured and handled within thunk.

Internally, this procedure just allocates and initializes a Scheme
thread object; the POSIX thread is not created until thread-start!
is called.

Function: thread-state thread

Returns one of the symbols new, runnable, stopped
or terminated, indicating the state of thread.

Function: thread-name thread

[SRFI-18], [SRFI-21]
Returns the value of name slot of thread.

Function: thread-specific thread

Function: thread-specific-set! thread value

[SRFI-18], [SRFI-21]
Gets/sets the value of the thread’s specific slot.

Function: thread-start! thread

[SRFI-18], [SRFI-21]
Starts the thread. It is an error if thread is already started.
Returns thread.

Function: thread-yield!

[SRFI-18], [SRFI-21]
Suspends the execution of the calling thread and yields CPU to other
waiting runnable threads, if any.

Function: thread-sleep! timeout

[SRFI-18], [SRFI-21]
Suspends the calling thread for the period specified by timeout,
which must be either a <time> object (see Time) that
specifies an absolute point in time, or a real number that specifies
a time relative to the moment this procedure is called,
in seconds.

After the specified time passes, thread-sleep! returns an
unspecified value.

If timeout points to a time in the past, thread-sleep! returns
immediately.

Function: thread-stop! thread :optional timeout timeout-val

Stops execution of the target thread temporarily.
You can resume the execution of the thread by thread-cont!.

The stop request is handled synchronously; that is,
the Gauche VM only checks for the request at a “safe” point
of the VM and stops itself. This means that if the thread is
blocked by a system call, it won’t enter the stopped state
until the system call returns.

By default, thread-stop! returns after the target
thread stops. Since that may take indefinitely long, you can give the optional
timeout argument to specify a timeout. The timeout
argument can be #f, which means no timeout, or
a <time> object that specifies an absolute point in time,
or a real number specifying the number of seconds to wait.

The return value of thread-stop! is thread if
it could successfully stop the target, or timeout-val
if the timeout is reached. When timeout-val is omitted, #f
is assumed.

If the target thread has already been stopped by the caller
thread, this procedure returns thread immediately.

When thread-stop! times out, the request remains
effective even after thread-stop! returns.
That is, the target thread may stop at some point in the future.
The caller thread is expected to call thread-stop!
again to complete the stop operation.

An error is signaled if the target thread has already been
stopped by another thread (including the “pending” stop
request issued by other threads), or the target thread
is in neither runnable nor stopped state.

Function: thread-cont! thread

Resumes execution of thread which has been stopped by
thread-stop!. An error is raised if thread
is not in stopped state, or it is stopped by another thread.

If the caller thread has already requested to stop the target
thread but timed out, calling thread-cont! cancels
the request.
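
A sketch of stopping and resuming a worker follows. The worker polls a flag
kept in an atom (see the Atom section), so it passes VM safe points
regularly; the names running and worker are made up for the example.

```scheme
(use gauche.threads)

(define running (atom #t))

;; The worker loops until the flag is cleared, sleeping briefly each time.
(define worker
  (thread-start!
   (make-thread
    (lambda ()
      (let loop ()
        (when (atom-ref running 0)
          (thread-sleep! 0.01)
          (loop)))
      'finished))))

(thread-stop! worker)                              ; returns once worker has stopped
(define state-while-stopped (thread-state worker))
(thread-cont! worker)                              ; worker resumes where it stopped

(atomic-update! running (lambda (old) #f))         ; ask the worker to finish
(define worker-result (thread-join! worker))
```

Note that the flag is touched only after thread-cont!, so the main thread
never waits for a lock held by a stopped thread.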

Function: thread-terminate! thread

[SRFI-18], [SRFI-21]
Terminates the specified thread thread.
The thread is terminated and an instance of
<terminated-thread-exception> is stored in the result exception
field of thread.

If thread is the same as the calling thread, this procedure
won’t return. Otherwise, this procedure returns an unspecified value.

This procedure should be used with care, since
thread won’t have a chance to call cleanup
procedures (such as ’after’ thunks of dynamic-wind).
If thread is in a critical section, it can leave some state
inconsistent. However, once a thread is terminated, any mutex
that the thread held enters the ’abandoned’ state, and an attempt
to lock such a mutex by another thread raises an ’abandoned mutex exception’,
so that you will know the situation. See Synchronization primitives.
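
A minimal sketch of terminating another thread and observing the outcome
from thread-join!; spinner and outcome are names made up for the example:

```scheme
(use gauche.threads)

;; A thread that never finishes on its own.
(define spinner
  (thread-start!
   (make-thread
    (lambda ()
      (let loop ()
        (thread-sleep! 0.01)
        (loop))))))

(thread-terminate! spinner)

;; Joining a terminated thread raises <terminated-thread-exception>.
(define outcome
  (guard (e [(terminated-thread-exception? e) 'terminated])
    (thread-join! spinner)))
```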

Function: thread-join! thread :optional timeout timeout-val

[SRFI-18], [SRFI-21]
Waits for the termination of thread, or until the timeout is reached
if timeout is given.

timeout must be either a <time> object (see Time)
that specifies an absolute point in time, a real number that specifies
a time in seconds relative to the moment this procedure is called,
or #f, which indicates no timeout (default).

If thread terminates normally, thread-join! returns
a value which is stored in the result field of thread.
If thread terminates abnormally, thread-join! raises
an exception which is stored in the result exception field of thread.
It can be either a <terminated-thread-exception> or
<uncaught-exception>.

If the timeout is reached, thread-join! returns timeout-val
if it is given, or raises <join-timeout-exception> otherwise.
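
A short sketch of both outcomes; the sleep lengths are arbitrary values
chosen so that the first join times out:

```scheme
(use gauche.threads)

(define slow
  (thread-start!
   (make-thread (lambda () (thread-sleep! 0.3) 'slow-result))))

;; The thread is still sleeping, so this join gives up and returns timeout-val.
(define first-try (thread-join! slow 0.01 'not-yet))

;; Without a timeout, thread-join! waits until the thunk returns.
(define second-try (thread-join! slow))
```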

9.32.3 Synchronization primitives

Mutex

Builtin Class: <mutex>

A primitive synchronization device. It can take one of four states:
locked/owned, locked/not-owned, unlocked/abandoned and unlocked/not-abandoned.
A mutex can be locked (by mutex-lock!) only if it is in an unlocked state.
An ’owned’ mutex keeps track of the thread that owns it.
Typically the owner thread is the one that locked the mutex,
but you can make a thread other than the locking thread own a mutex.
A mutex becomes unlocked either by mutex-unlock! or when the owner
thread terminates. In the former case, the mutex enters the unlocked/not-abandoned
state. In the latter case, it enters the unlocked/abandoned state.

A mutex has the following external slots.

Instance Variable of <mutex>: name

The name of the mutex.

Instance Variable of <mutex>: state

The state of the mutex. This is a read-only slot.
See the description of mutex-state below.

Instance Variable of <mutex>: specific

A slot in which an application can keep arbitrary data. For example, an application
can implement a ’recursive’ mutex using the specific field.

Function: mutex? obj

[SRFI-18], [SRFI-21]
Returns #t if obj is a mutex, #f otherwise.

Function: make-mutex :optional name

[SRFI-18], [SRFI-21]
Creates and returns a new mutex object.
When created, the mutex is in unlocked/not-abandoned state.
Optionally, you can give a name to the mutex.

Function: mutex-name mutex

[SRFI-18], [SRFI-21]
Returns the name of the mutex.

Function: mutex-specific mutex

Function: mutex-specific-set! mutex value

[SRFI-18], [SRFI-21]
Gets/sets the specific value of the mutex.

Function: mutex-state mutex

[SRFI-18], [SRFI-21]
Returns the state of mutex, which may be one of the following:

a thread

The mutex is locked/owned, and the owner is the returned thread.

symbol not-owned

The mutex is locked/not-owned.

symbol abandoned

The mutex is unlocked/abandoned.

symbol not-abandoned

The mutex is unlocked/not-abandoned.

Function: mutex-lock! mutex :optional timeout thread

[SRFI-18], [SRFI-21]
Locks mutex. If mutex is in the unlocked/not-abandoned
state, this procedure exclusively changes its state to a locked state.
By default, mutex enters the locked/owned state, owned by the
calling thread. You can specify another owner thread with the thread argument.
If the thread argument is given and is #f, the mutex enters the
locked/not-owned state.

If mutex is in the unlocked/abandoned state, that is, some other
thread has been terminated without unlocking it, this procedure
signals an ’abandoned mutex exception’ (see Thread exceptions)
after changing the state of mutex.

If mutex is in a locked state and
timeout is omitted or #f, this procedure blocks until
mutex becomes unlocked. If timeout is specified,
mutex-lock! returns when the specified time is reached,
even if it couldn’t obtain the lock. You can give timeout as
an absolute point in time (by a <time> object, see Time),
or a relative time (by a real number).

mutex-lock! returns #t if mutex is successfully
locked, or #f if the timeout is reached.
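
The following sketch walks a mutex through several of the states described
above, all from a single thread; the 0.05 second timeout is an arbitrary
value and the variable names are made up for the example.

```scheme
(use gauche.threads)

(define m (make-mutex 'demo))

(define locked? (mutex-lock! m))            ; lock succeeds
(define owner (mutex-state m))              ; the owner is the calling thread

;; The mutex is not recursive, so a second attempt can only time out.
(define second-attempt (mutex-lock! m 0.05))

(mutex-unlock! m)
(define state-unlocked (mutex-state m))

;; Passing #f as the thread argument locks the mutex without an owner.
(mutex-lock! m #f #f)
(define state-ownerless (mutex-state m))
(mutex-unlock! m)
```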

Note that mutex itself doesn’t implement a ’recursive lock’
feature; that is, if a thread that has locked mutex tries to lock
mutex again, the thread blocks. It is not difficult, however,
to implement recursive lock semantics on top of this mutex.
The SRFI-18 document shows an example that uses the mutex’s specific field.
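
A sketch of recursive locking along the lines of the SRFI-18 example is
shown below. It is written from the idea described there, so compare it
with the SRFI-18 text before relying on it. The nesting depth is kept in
the mutex’s specific slot.

```scheme
(use gauche.threads)

;; Lock MUTEX; if the calling thread already owns it, just count the nesting.
(define (mutex-lock-recursively! mutex)
  (if (eq? (mutex-state mutex) (current-thread))
      (let ((n (mutex-specific mutex)))
        (mutex-specific-set! mutex (+ n 1)))
      (begin
        (mutex-lock! mutex)
        (mutex-specific-set! mutex 0))))

;; Undo one level of locking; really unlock only at the outermost level.
(define (mutex-unlock-recursively! mutex)
  (let ((n (mutex-specific mutex)))
    (if (= n 0)
        (mutex-unlock! mutex)
        (mutex-specific-set! mutex (- n 1)))))
```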

Function: mutex-unlock! mutex :optional condition-variable timeout

[SRFI-18], [SRFI-21]
Unlocks mutex. The state of mutex becomes unlocked/not-abandoned.
It is allowed to unlock a mutex that is not owned by the calling thread.

If the optional condition-variable is given, mutex-unlock!
serves as the "condition variable wait" operation (e.g. pthread_cond_wait
in POSIX threads). The current thread atomically waits on
condition-variable and unlocks mutex.
The thread will be unblocked when another thread signals on
condition-variable (see condition-variable-signal!
and condition-variable-broadcast! below), or when the timeout
is reached if it is supplied. The timeout argument can be either
a <time> object to represent an absolute time point (see Time),
a real number to represent a relative time in seconds, or #f which
means never. The calling thread may be unblocked prematurely,
so it should reacquire the lock of mutex and check the
condition, as in the example given in the SRFI-18 document.
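
A sketch of that wait loop, modeled on the pattern in SRFI-18, is shown
below. The flag ready and the names m, cv and waiter are assumptions made
for the illustration.

```scheme
(use gauche.threads)

(define m (make-mutex))
(define cv (make-condition-variable))
(define ready #f)                     ; the condition, protected by m

(define waiter
  (thread-start!
   (make-thread
    (lambda ()
      (let loop ()
        (mutex-lock! m)
        (if ready
            (begin (mutex-unlock! m) 'saw-ready)
            (begin (mutex-unlock! m cv)   ; unlock m and wait on cv atomically
                   (loop))))))))          ; woken up: lock again and re-check

;; Signalling side: change the condition under the lock, then signal.
(mutex-lock! m)
(set! ready #t)
(mutex-unlock! m)
(condition-variable-signal! cv)

(define waiter-result (thread-join! waiter))
```

Because the waiter re-checks ready after every wakeup, the loop is correct
whether the signal arrives before or after the waiter starts waiting.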

The return value of mutex-unlock! is #f when it returns
because of timeout, and #t otherwise.

Function: mutex-locker mutex

Function: mutex-unlocker mutex

Returns (lambda () (mutex-lock! mutex)) and
(lambda () (mutex-unlock! mutex)), respectively.
Each closure is created at most once per mutex,
thus it is lighter than using literal lambda forms in a tight loop.
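
One place where these closures fit naturally is dynamic-wind. The sketch
below protects a shared counter that way; increment! and counter are
hypothetical names.

```scheme
(use gauche.threads)

(define m (make-mutex))
(define counter 0)

;; Lock m on entry and unlock it on exit, even on a non-local exit.
(define (increment!)
  (dynamic-wind
    (mutex-locker m)
    (lambda () (set! counter (+ counter 1)))
    (mutex-unlocker m)))

;; Four threads, 500 increments each.
(define threads
  (map (lambda (k)
         (thread-start!
          (make-thread (lambda () (dotimes [i 500] (increment!))))))
       (iota 4)))

(for-each thread-join! threads)
```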

Condition variable

Builtin Class: <condition-variable>

A condition variable keeps a set of threads that are waiting for
a certain condition to be true. When a thread modifies the state
of the concerned condition, it can call condition-variable-signal!
or condition-variable-broadcast!, which unblock one or more
waiting threads so that they can check if the condition is satisfied.

A condition variable object has the following slots.

Instance Variable of <condition-variable>: name

The name of the condition variable.

Instance Variable of <condition-variable>: specific

A slot in which an application can keep arbitrary data.

Note that SRFI-18 doesn’t have a routine equivalent to pthreads’
pthread_cond_wait. If you want to wait on a condition variable,
you can pass the condition variable to mutex-unlock! as an
optional argument (see above), then acquire mutex again by
mutex-lock!. This design is for flexibility; see the
SRFI-18 document for the details.

Function: make-condition-variable :optional name

[SRFI-18], [SRFI-21]
Returns a new condition variable. You can give its name by
optional name argument.

Function: condition-variable-name cv

[SRFI-18], [SRFI-21]
Returns the name of the condition variable.

Function: condition-variable-specific cv

Function: condition-variable-specific-set! cv value

[SRFI-18], [SRFI-21]
Gets/sets the specific value of the condition variable.

Function: condition-variable-signal! cv

[SRFI-18], [SRFI-21]
If there are threads waiting on cv, causes the scheduler to select
one of them and to make it runnable.

Function: condition-variable-broadcast! cv

[SRFI-18], [SRFI-21]
Unblocks all the threads waiting on cv.

Atom

An atom is a convenient wrapper to make operations on a
given set of objects thread-safe. Instead of defining
thread-safe counterparts of every data structure, you can easily
wrap an existing data structure to make it thread-safe.

Function: atom val …

Creates and returns an atom object with val … as the
initial values.

Function: atom? obj

Returns #t iff obj is an atom.

The following procedures can be used to atomically
access and update the content of an atom. They all
take optional timeout and timeout-val arguments,
both of which default to #f. With the default, those procedures
block until they acquire a lock.

Those arguments can be used to modify the behavior when
the lock cannot be acquired in a timely manner.
timeout may be a <time> object (see Time)
to specify an absolute point in time, or a real number
to specify the relative time in seconds. If timeout
expires, those procedures give up acquiring the lock,
and the value given to timeout-val is returned.

Function: atom-ref atom :optional index timeout timeout-val

Returns the index-th value of atom.
See above for timeout and timeout-val arguments.

(define a (atom 'a 'b))
(atom-ref a 0) ⇒ a
(atom-ref a 1) ⇒ b

Function: atomic atom proc :optional timeout timeout-val

Calls proc with the current values in atom,
while locking atom. proc must take
as many arguments as the number of values atom has.

The returned value(s) of proc is the result of
atomic, unless timeout occurs.
See above for timeout and timeout-val arguments.

For example, a ref/count procedure built on atomic
can count the number of times
a hash table is referenced, in a thread-safe way.
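
A sketch of such a ref/count follows. The layout of the atom, a hash table
plus a one-element list used as a mutable counter cell, is an assumption
made for this illustration.

```scheme
(use gauche.threads)

;; First value: the table itself. Second value: a cell holding the count.
(define a (atom (make-hash-table 'eq?) (list 0)))

;; Look up KEY and bump the reference count, all under the atom's lock.
(define (ref/count a key)
  (atomic a
    (lambda (table count-cell)
      (set-car! count-cell (+ (car count-cell) 1))
      (hash-table-get table key #f))))

;; Populate the table, also under the lock.
(atomic a (lambda (table count-cell) (hash-table-put! table 'x 10)))

(define found (ref/count a 'x))
(define missing (ref/count a 'y))
(define lookups (car (atom-ref a 1)))
```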

Function: atomic-update! atom proc :optional timeout timeout-val

Calls proc with the current values in atom
while locking atom, and updates the values in atom
by the returned values from proc.
proc must take as many arguments as the number of
values atom has, and must return the same number of
values.

The returned value(s) of proc is the result of
atomic-update!, unless timeout occurs.
See above for timeout and timeout-val arguments.

The following example shows a thread-safe counter.

(define a (atom 0))
(atomic-update! a (cut + 1 <>))
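
To see such a counter hold up under contention, here is a minimal sketch in
which several threads update one atom; bump-many and the thread and
iteration counts are made up for the example.

```scheme
(use gauche.threads)

(define counter (atom 0))

;; Increment the shared counter N times.
(define (bump-many n)
  (dotimes [i n]
    (atomic-update! counter (cut + 1 <>))))

;; Four threads, 1000 increments each.
(define workers
  (map (lambda (k)
         (thread-start! (make-thread (lambda () (bump-many 1000)))))
       (iota 4)))

(for-each thread-join! workers)
(define final-count (atom-ref counter 0))
```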

Note: The term atom in historical Lisps
meant objects that are not a cons cell (pair). Back then
cons cells were the only aggregated datatype and there were
few other datatypes (numbers and symbols), so having a
complementary term to cells made sense.

Although it still appears in introductory Lisp tutorials,
modern Lisps, including Scheme, have so many datatypes that it makes
little sense to have a specific term for non-aggregate types.

Clojure adopted the term atom for thread-safe (atomic)
primitive data, and we followed it.

Note: The constructor of atom is not make-atom
but atom, following the convention of list/make-list,
vector/make-vector, and string/make-string;
that is, the name without make- takes its elements as
variable number of arguments.