Monday, June 1, 2009

Handling Semaphores in C Programs

A while back, Carlo Schroder, over at LinuxToday.com, put out a request for articles on programming. Now that I have put the downloads for Realeyes IDS version 0.9.5 up on SourceForge, I get to have some fun answering her call.

What I have found in programming books, including those by the late W. Richard Stevens (which I turn to most often) is usually a good start, but never the whole story. But since this is not a general programming text, I will focus on a single issue in detail. This post will cover using semaphores.

First, a little background on locks, in case you have never used them. In *nix, process locks are implemented with the semaphore system calls. Since I use child processes that share memory, I have to implement semaphores. Threads use pthread_mutex calls, which do essentially what these functions do, and then some.

The most common reason for implementing locks is if you have multiple concurrently running processes or threads that have access to the same variable in memory. Obtaining or releasing a lock is guaranteed by the operating system to be completed without interruption. A single line of C code, such as
 if (flag & 4)
requires a minimum of two machine instructions:
  • Get the value of flag from memory into a register

  • Compare the value to zero
It is possible the thread running that code could be swapped out after getting the value, but before comparing it, and the value of flag could be changed by another thread, making the comparison invalid. By requiring every thread to get the flag lock before reading or writing the value of flag, only one thread accesses flag at a time.

The rule of thumb for the code while a lock is held is to do only what requires holding the lock, and no more. Often there are less than ten instructions between getting and releasing a lock. However, sometimes there are a couple dozen, because all of them require holding the lock. My memory management is an example of this, and I will try to cover it down the road.

OK, now for how I implement semaphores. For some reason, the caller is required to define the semun structure. This definition is taken from the semctl man page and is in the rae_lock_mgmt.h file.

union semun {
int val; /* Value for SETVAL */
struct semid_ds *buf; /* Buffer for IPC_STAT, IPC_SET */
unsigned short int *array; /* Array for GETALL, SETALL */
struct seminfo *__buf; /* Buffer for IPC_INFO */
};
All of the following code is from the rae_lock_mgmt.c file. First, I define a global to keep track of a held lock. This is done so that if there is an interrupt, such as a segmentation fault, while the lock is held, it can be released in the signal handler by calling rae_release_lock. The caller must set these fields to the address of the variables used to track the semaphore ID and index.
/* Pointer to currently held lock */
int *rae_held_lock = NULL;
int *rae_hl_index = NULL;
Before a lock can be used, it must be initialized. I use a lot of locks and have found that different Linux distributions have different defaults for the maximum number of semaphores an application may allocate. To keep the number of allocations down, I have grouped the locks by functionality, and each group gets a semaphore set (or array) which only uses a single semaphore ID. Therefore, the number of locks in the group is passed to the init function.
int rae_init_lock(int il_size)
{
int i, il_semid = -1;
union semun sem_union;
The semget call returns a new semaphore set. Then each lock in the array is initialized to a value of 1 to indicate that it is available. If that fails, the semaphore ID is set to a negative value, which is -1 if the semaphore set is released, and -2 if it is not.
  if ((il_semid = semget(IPC_PRIVATE, il_size, (0660|IPC_CREAT))) > 0)
{
sem_union.val = 1;
for (i=0; i < il_size; i++)
{
if (semctl(il_semid, i, SETVAL, sem_union) == -1)
{
if (semctl(il_semid, 0, IPC_RMID, sem_union) == -1)
il_semid = -2;
else
il_semid = -1;
}
}
}
return(il_semid);
} /* end rae_init_lock */
When the application shuts down, the locks must be freed. If not, they remain allocated by the system. You can run 'ipcs -s' to see the locks that are held. If an application fails to relaese a lock, you can run ipcrm (read the man page) as root to release it.

Notice that the location of the semaphore ID in the application is passed to this function, and it is set to zero by the function. This is because the sem_ctl command, IPC_RMID, ignores the index and simply removes the entire semaphore. Also, I prefer to do as much as possible in a function so the caller does not have to worry about the details. This way, when I call the same function from different places, I reduce the risk of forgetting to set something.

int rae_free_lock(int *fl_semid, int fl_idx)
{
int fl_stat = 0;
union semun sem_union;

if (*fl_semid == 0)
goto out;
sem_union.val = 1;
fl_stat = semctl(*fl_semid, fl_idx, IPC_RMID, sem_union);
*fl_semid = 0;
out:
return(fl_stat);
} /* end rae_free_lock */
When a lock is needed for a memory location, the get lock function is called with the lock identifier, which consists of the semaphore ID and the index in its array. I have added a wait flag to allow some locks to be conditional. In my memory management code, I have three available buffer queues, and if the lock on one is held, the caller can simply try the next one without waiting.

The semop call gets and releases a lock by subtracting the supplied value from the semaphore's value. What this means is that you, the programmer, are responsible for defining what value indicates a held or released lock. By keeping all of this logic in a pair of functions, you have control over how it is implemented. All of the examples I have seen use 1 to indicate the lock is available, and 0 to indicate that it is held. I can imagine how other values might be used, but it seems ridiculously complicated and prone to error.
int rae_get_lock(int gl_wait, int gl_semid, int gl_idx)
{
int gl_stat = 0;
struct sembuf sem_b;

if (gl_semid == 0)
{
gl_stat = -2;
goto out;
}
The semaphore buffer structure is defined in the system headers. This is where the semaphore array index and operation are set. And as I said before, this function allows the caller to wait or not, which is accomplished by using the semaphore flag bit, IPC_NOWAIT.

The SEM_UNDO flag bit is supposed to reverse the operation when the process terminates, which implies that if the process fails while holding a lock, the system will release it (but not free it). However, in my experience, that doesn't always work, so I have included the capability to do this in my interrupt handlers, as I mentioned above.

  sem_b.sem_num = (short int) gl_idx;
sem_b.sem_op = -1;
if (gl_wait & raeLOCK_WAIT)
sem_b.sem_flg = SEM_UNDO;
else if (gl_wait & raeLOCK_NWAIT)
sem_b.sem_flg = SEM_UNDO | IPC_NOWAIT;
else
{
gl_stat = -1;
goto out;
}
This is the heart of the function. Read the semop man page for a lot more detail, but the general idea is as follows. The semop system call will attempt to subtract 1 from the current value of the lock. If that value is 1, the operation occurs immediately. Otherwise, the call will wait or return with the errno value of EAGAIN if the lock is held. Of course, there is the possibility the call will fail entirely, which must be handled.

If the lock value is set to 0, this means the lock is obtained, and this function sets the semaphore ID and index in the global lock tracking variables.

  if ((gl_stat = semop(gl_semid, &sem_b, 1)) == -1)
{
if ((gl_wait & raeLOCK_NWAIT) && errno == EAGAIN)
gl_stat = 1;
else if (errno == EIDRM)
gl_stat = -2;
}
if (!gl_stat && rae_held_lock != NULL)
{
*rae_held_lock = gl_semid;
*rae_hl_index = gl_idx;
}
out:
return(gl_stat);
} /* end rae_get_lock */
This is the reverse of the get lock function, in that it adds one to the lock value. There is no wait flag for releasing a lock, so only the semaphore ID and index are supplied. If the lock value is set to 1, the lock is released and this function clears the semaphore ID and index in the global lock tracking variables.
int rae_release_lock(int rl_semid, int rl_idx)
{
int rl_stat = 0;
struct sembuf sem_b;

sem_b.sem_num = (short int) rl_idx;
sem_b.sem_op = 1;
sem_b.sem_flg = SEM_UNDO;
rl_stat = semop(rl_semid, &sem_b, 1);
if (!rl_stat && rae_held_lock != NULL)
{
*rae_held_lock = 0;
*rae_hl_index = 0;
}
return(rl_stat);
} /* end rae_release_lock */
This set of functions makes using semaphores as easy as:
  • Init lock

  • Get lock

  • Release lock

  • Free lock
Of course, the caller code must be well thought out to prevent a deadly embrace. That is accomplished by keeping the code using the Get and Release calls as simple as possible, and making sure the instructions between them absolutely require the lock.

Later . . . Jim

1 comment:

Unknown said...

here is only one way to prevent 'deadly embrace' that is to ensure that two locks are always taken (and released) in the same order whenever used. IOW

foo()
{
//lock_a
//lock_b
}


bar()
{
//lock_a
//lock_b
}

is fine, while

baz()
{
//lock_b
//lock_a
}

will introduce the problem.