Pthreads in C

From RoggeWiki
Revision as of 22:59, 17 December 2011 by Tom (talk | contribs)

One very popular API for threading an application is pthreads, also known as POSIX threads or P1003.1c (IEEE Std 1003.1c-1995, later incorporated into ISO/IEC 9945-1:1996).

Data race

THREAD 1                THREAD 2
a = data;               b = data;
a++;                    b--;
data = a;               data = b;

Now if this code is executed serially (THREAD 1, then THREAD 2) there isn't a problem. However, threads execute in an arbitrary order, so consider this:

THREAD 1                THREAD 2
a = data;
                        b = data;
a++;
                        b--;
data = a;
                        data = b;
[data = data - 1!]

So data could end up one higher than it started, unchanged, or one lower, and there is NO way to know which, as the outcome is completely non-deterministic!
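The interleaving above can be replayed in a single thread to make the lost update repeatable. This is only a sketch: the function name replay_lost_update is hypothetical, and a and b stand in for the two threads' private copies.

```c
/* Replays the problematic interleaving serially, so the result is
 * deterministic. No real threads are involved in this sketch. */
int replay_lost_update(int data)
{
    int a, b;

    a = data;   /* THREAD 1 reads the old value            */
    b = data;   /* THREAD 2 reads the same old value       */
    a++;        /* THREAD 1 computes old + 1               */
    b--;        /* THREAD 2 computes old - 1               */
    data = a;   /* THREAD 1 writes old + 1                 */
    data = b;   /* THREAD 2 overwrites it with old - 1     */

    return data;
}
```

Starting from 0 this returns -1: THREAD 1's increment has been lost, which is the outcome shown in the table.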

In fact the problem is worse, because on many machines a = data is non-atomic (for example, when data is wider than the machine word). This means that when data is loaded into a, you could end up with the low order bits of the old data, and the high order bits of the new data. CHAOS.

The solution to this is to provide functions that will block a thread if another thread is accessing data that it is using.

Pthreads use a data type called a mutex to achieve this.

Creating a POSIX thread

Pthreads are created using pthread_create().

#include <pthread.h>
int pthread_create (pthread_t *thread_id, const pthread_attr_t *attributes,
                void *(*thread_function)(void *), void *arguments);

This function creates a new thread. thread_id points to a pthread_t, an opaque type which acts as a handle for the new thread. attributes points to another opaque data type which allows you to fine-tune various parameters; to use the defaults, pass NULL. thread_function is the function the new thread executes; the thread terminates when this function returns, or when it is explicitly terminated. arguments is a void * pointer which is passed as the only argument to thread_function. pthread_create() returns zero on success and an error number on failure.

Pthreads terminate when the function returns, or the thread can call pthread_exit() which terminates the calling thread explicitly.

void pthread_exit (void *status);

status is the return value of the thread. (Note that a thread_function returns a void *, so returning a value from thread_function is equivalent to calling this function with that value.)

One thread can wait for the termination of another by using pthread_join().

int pthread_join (pthread_t thread, void **status_ptr);

The exit status is stored in the location pointed to by status_ptr. Pass NULL if you don't need it.
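As a minimal sketch of pthread_create() and pthread_join() working together, the following starts one thread, passes it an argument, and collects its return value. The names square_worker and run_square_thread are hypothetical, and packing a small integer into the void * return value via intptr_t is just one common convention.

```c
#include <pthread.h>
#include <stdint.h>

/* Thread function: reads an int through its argument pointer and
 * returns the square, packed into the void * return value. */
static void *square_worker(void *arguments)
{
    int n = *(int *)arguments;
    return (void *)(intptr_t)(n * n);
}

/* Starts one thread, waits for it to finish, and returns the value it
 * produced, or -1 if the thread could not be created or joined. */
int run_square_thread(int n)
{
    pthread_t thread_id;
    void *status = NULL;

    /* NULL selects the default thread attributes. */
    if (pthread_create(&thread_id, NULL, square_worker, &n) != 0)
        return -1;

    /* n stays valid because we do not return before the join. */
    if (pthread_join(thread_id, &status) != 0)
        return -1;

    return (int)(intptr_t)status;
}
```

Calling run_square_thread(7) from main() should give 49. With gcc or clang the program is typically built with the -pthread flag.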

A thread can get its own thread ID by calling pthread_self().

pthread_t pthread_self (void);

Two thread IDs can be compared using pthread_equal().

int pthread_equal (pthread_t t1, pthread_t t2);

Returns zero if the threads are different threads, non-zero otherwise.
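A short sketch of both calls, assuming hypothetical names (main_id, compare_worker, check_thread_ids): the creating thread records its own ID, and a second thread compares that ID with its own.

```c
#include <pthread.h>

static pthread_t main_id;   /* ID of the thread that calls check_thread_ids() */

/* Runs in the new thread: compares its own ID with the creator's. */
static void *compare_worker(void *arguments)
{
    int *same = arguments;
    *same = pthread_equal(pthread_self(), main_id);
    return NULL;
}

/* Returns 1 if a thread compares equal to itself and different from
 * the thread it created, 0 otherwise. */
int check_thread_ids(void)
{
    pthread_t child;
    int same_in_child = -1;

    main_id = pthread_self();
    if (pthread_create(&child, NULL, compare_worker, &same_in_child) != 0)
        return 0;
    pthread_join(child, NULL);

    return pthread_equal(main_id, pthread_self()) != 0 && same_in_child == 0;
}
```

pthread_t is opaque, so comparing IDs with == is not portable; pthread_equal() is the supported way.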

Mutual Exclusion

Mutexes have two basic operations, lock and unlock. If a mutex is unlocked and a thread calls lock, the mutex locks and the thread continues. If however the mutex is locked, the thread blocks until the thread 'holding' the lock calls unlock.

There are five basic functions dealing with mutexes.

int pthread_mutex_init (pthread_mutex_t *mut, const pthread_mutexattr_t *attr);

Note that you pass a pointer to the mutex, and that to use the default attributes just pass NULL for the second parameter.

int pthread_mutex_lock (pthread_mutex_t *mut);

Locks the mutex :).

int pthread_mutex_unlock (pthread_mutex_t *mut);

Unlocks the mutex :).

int pthread_mutex_trylock (pthread_mutex_t *mut);

Either acquires the lock if it is available, or returns EBUSY.

int pthread_mutex_destroy (pthread_mutex_t *mut);

Deallocates any memory or other resources associated with the mutex.
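The difference between lock and trylock can be sketched as follows: one thread holds the mutex while a second thread tries it and gets EBUSY instead of blocking. The names try_worker and trylock_while_held are hypothetical.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t mut;

/* Runs in the second thread: tries the mutex without blocking and
 * reports the return code through the argument pointer. */
static void *try_worker(void *arguments)
{
    int *result = arguments;

    *result = pthread_mutex_trylock(&mut);
    if (*result == 0)
        pthread_mutex_unlock(&mut);   /* only unlock if we got the lock */
    return NULL;
}

/* Returns 1 if trylock reported EBUSY while the mutex was held by
 * another thread, 0 otherwise. */
int trylock_while_held(void)
{
    pthread_t t;
    int result = -1;

    if (pthread_mutex_init(&mut, NULL) != 0)   /* NULL: default attributes */
        return 0;

    pthread_mutex_lock(&mut);                  /* hold the lock ...          */
    if (pthread_create(&t, NULL, try_worker, &result) == 0)
        pthread_join(t, NULL);                 /* ... while the worker tries */
    pthread_mutex_unlock(&mut);

    pthread_mutex_destroy(&mut);
    return result == EBUSY;
}
```

Note that the pthread functions return the error number directly rather than setting errno.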

An example with mutex

Consider the problem we had before; now let's use mutexes:

THREAD 1                        THREAD 2
pthread_mutex_lock (&mut);
                                pthread_mutex_lock (&mut);
a = data;                       /* blocked */
a++;                            /* blocked */
data = a;                       /* blocked */
pthread_mutex_unlock (&mut);    /* blocked */
                                b = data;
                                b--;
                                data = b;
                                pthread_mutex_unlock (&mut);
[data is fine.  The data race is gone.]
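The table above can be turned into a small program along these lines. It is a sketch under assumptions: the names incrementer, decrementer, run_locked_counter and the ITERATIONS count are hypothetical, and each thread repeats its read-modify-write many times to give the scheduler plenty of chances to interleave them.

```c
#include <pthread.h>

#define ITERATIONS 100000   /* arbitrary; large enough to provoke interleaving */

static pthread_mutex_t mut;
static long data;

/* THREAD 1: a = data; a++; data = a; under the lock. */
static void *incrementer(void *arguments)
{
    (void)arguments;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&mut);
        long a = data;
        a++;
        data = a;
        pthread_mutex_unlock(&mut);
    }
    return NULL;
}

/* THREAD 2: b = data; b--; data = b; under the same lock. */
static void *decrementer(void *arguments)
{
    (void)arguments;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&mut);
        long b = data;
        b--;
        data = b;
        pthread_mutex_unlock(&mut);
    }
    return NULL;
}

/* Runs both threads to completion and returns the final value of data,
 * or -1 if setup failed. */
long run_locked_counter(void)
{
    pthread_t t1, t2;

    data = 0;
    if (pthread_mutex_init(&mut, NULL) != 0)
        return -1;
    if (pthread_create(&t1, NULL, incrementer, NULL) != 0) {
        pthread_mutex_destroy(&mut);
        return -1;
    }
    if (pthread_create(&t2, NULL, decrementer, NULL) != 0) {
        pthread_join(t1, NULL);
        pthread_mutex_destroy(&mut);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_mutex_destroy(&mut);
    return data;
}
```

With the lock in place the increments and decrements cancel exactly, so the function returns 0. Removing the lock/unlock pairs brings the data race back, and the result becomes unpredictable.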

Condition Variables

Mutexes allow you to avoid data races. Unfortunately, while they let you protect an operation, they don't let you wait until another thread completes an arbitrary activity.

Condition Variables solve this problem.

There are six operations which you can perform on a condition variable:

Initialisation
int pthread_cond_init (pthread_cond_t *cond, const pthread_condattr_t *attr);

Again to use the default attributes, just pass NULL as the second parameter.

Waiting
int pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mut);

This function always blocks. In pseudo-code:

pthread_cond_wait (cond, mut)
begin
        pthread_mutex_unlock (mut);
        block_on_cond (cond);
        pthread_mutex_lock (mut);
end

Note that it releases the mutex before it blocks (the release and the block happen as a single atomic step), and then re-acquires it before it returns. This is very important. Also note that re-acquiring the mutex can block for a little longer, and another thread may run in the meantime, so the condition which was signalled will need to be rechecked after the function returns. More about this later.
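The usual way to do that recheck is to call pthread_cond_wait() inside a while loop that tests a shared flag. The sketch below assumes hypothetical names (ready, payload, producer, wait_for_payload) and uses the standard static initialiser macros PTHREAD_MUTEX_INITIALIZER and PTHREAD_COND_INITIALIZER, which are not covered in the text above, in place of the init calls.

```c
#include <pthread.h>

/* Static initialisers: equivalent to the init calls with NULL attributes. */
static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

static int ready;     /* the condition being waited for; protected by mut */
static int payload;   /* data published together with the flag            */

/* Publishes a value, sets the flag, and signals, all under the mutex. */
static void *producer(void *arguments)
{
    (void)arguments;
    pthread_mutex_lock(&mut);
    payload = 42;
    ready = 1;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mut);
    return NULL;
}

/* Waits until the producer has published its value and returns it,
 * or -1 if the producer thread could not be started. */
int wait_for_payload(void)
{
    pthread_t t;
    int value;

    ready = 0;
    payload = 0;
    if (pthread_create(&t, NULL, producer, NULL) != 0)
        return -1;

    pthread_mutex_lock(&mut);
    while (!ready)                        /* recheck after every wake-up */
        pthread_cond_wait(&cond, &mut);   /* unlocks mut while blocked   */
    value = payload;
    pthread_mutex_unlock(&mut);

    pthread_join(t, NULL);
    return value;
}
```

The loop also covers the case where the producer runs first: ready is already set, so the waiter never blocks and no signal is missed.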

Signalling
int pthread_cond_signal (pthread_cond_t *cond);

This wakes up at least one thread blocked on the condition variable. Remember that they must each re-acquire the mutex before they can return, so they will exit the block one at a time.

Broadcast Signalling
int pthread_cond_broadcast (pthread_cond_t *cond);

This wakes up all of the threads blocked on the condition variable. Note again they will exit the block one at a time.
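As a rough illustration of broadcasting, the sketch below parks several threads on one condition variable and releases them all with a single call. The names go, released, waiter, release_all_waiters and the WAITERS count are hypothetical, and the static initialiser macros stand in for the init calls.

```c
#include <pthread.h>

#define WAITERS 3   /* arbitrary number of waiting threads */

static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

static int go;         /* the condition; protected by mut         */
static int released;   /* how many waiters got past the condition */

/* Blocks until go is set, then counts itself as released. */
static void *waiter(void *arguments)
{
    (void)arguments;
    pthread_mutex_lock(&mut);
    while (!go)
        pthread_cond_wait(&cond, &mut);
    released++;                 /* still holding mut here */
    pthread_mutex_unlock(&mut);
    return NULL;
}

/* Starts the waiters, wakes them all with one broadcast, and returns
 * how many of them were released. */
int release_all_waiters(void)
{
    pthread_t threads[WAITERS];
    int started = 0;

    go = 0;
    released = 0;
    while (started < WAITERS &&
           pthread_create(&threads[started], NULL, waiter, NULL) == 0)
        started++;

    pthread_mutex_lock(&mut);
    go = 1;
    pthread_cond_broadcast(&cond);   /* wake every thread blocked on cond */
    pthread_mutex_unlock(&mut);

    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return released;
}
```

Because each woken thread must re-acquire mut before returning from pthread_cond_wait(), the increments of released happen one at a time, and the function returns 3 when all three threads start successfully.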

Waiting with timeout
int pthread_cond_timedwait (pthread_cond_t *cond, pthread_mutex_t *mut, 
                       const struct timespec *abstime);

Identical to pthread_cond_wait(), except it has a timeout. This timeout is an absolute time of day, not an interval relative to the call.

struct timespec {
        time_t tv_sec;
        long tv_nsec;
};

If abstime has passed, then pthread_cond_timedwait() returns ETIMEDOUT.
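Since abstime is absolute, a relative timeout has to be added to the current time first. The sketch below does that with clock_gettime(CLOCK_REALTIME, ...), which matches the default clock of a condition variable. The function name wait_times_out and the done flag are hypothetical, and nothing in this example ever signals the condition, so the wait is expected to time out.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() in strict modes */
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Waits up to timeout_ms milliseconds for a condition that, in this
 * sketch, nobody ever signals. Returns 1 if the wait ended with
 * ETIMEDOUT, 0 otherwise. */
int wait_times_out(long timeout_ms)
{
    pthread_mutex_t mut;
    pthread_cond_t cond;
    struct timespec abstime;
    int done = 0;   /* the predicate; never set in this example */
    int rc = 0;

    if (pthread_mutex_init(&mut, NULL) != 0)
        return 0;
    if (pthread_cond_init(&cond, NULL) != 0) {
        pthread_mutex_destroy(&mut);
        return 0;
    }

    /* abstime = now + timeout, keeping tv_nsec below one second. */
    clock_gettime(CLOCK_REALTIME, &abstime);
    abstime.tv_sec += timeout_ms / 1000;
    abstime.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (abstime.tv_nsec >= 1000000000L) {
        abstime.tv_sec += 1;
        abstime.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&mut);
    while (!done && rc == 0)   /* loop again on wake-ups without a timeout */
        rc = pthread_cond_timedwait(&cond, &mut, &abstime);
    pthread_mutex_unlock(&mut);

    pthread_cond_destroy(&cond);
    pthread_mutex_destroy(&mut);
    return rc == ETIMEDOUT;
}
```

Because the deadline is absolute, the loop can simply pass the same abstime again after an early wake-up without recomputing how much time is left.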

Deallocation
int pthread_cond_destroy (pthread_cond_t *cond);

Bye Bye condition variable :).