15 - ACE Semaphore
Semaphores are a powerful mechanism used to lock and/or synchronize access to shared resources
in concurrent applications. A semaphore contains a count that indicates the status of a shared
resource. Application designers assign the meaning of the semaphore's count, as well as its initial
value. Semaphores can therefore be used to mediate access to a pool of resources.
Since releasing a semaphore increments its count regardless of the presence of waiters, semaphores are
useful for keeping track of events that change shared program state. Threads can make decisions
based on these events, even if they have already occurred. Although some form of semaphore
mechanism is available on most operating systems, the ACE semaphore wrapper facades resolve
issues arising from subtle variations in syntax and semantics across a broad range of environments.
Class Capabilities
The ACE_Thread_Semaphore and ACE_Process_Semaphore classes portably encapsulate
process-scoped and system-scoped semaphores, respectively, in accordance with the Wrapper
Facade pattern. Their constructors differ slightly from those of the other ACE locks, however,
since semaphore initialization is more expressive than that of mutexes and readers/writer locks,
allowing the semaphore's initial count to be set. The relevant portion of the ACE_Thread_Semaphore
API is shown below:
class ACE_Thread_Semaphore
{
public:
  // Initialize the semaphore, with an initial value of <count>,
  // a maximum value of <max>, and unlocked by default.
  ACE_Thread_Semaphore (u_int count = 1,
                        const char *name = 0,
                        void *arg = 0,
                        int max = 0x7FFFFFFF);
  // ... same as pseudo <ACE_LOCK> signatures.
};
The ACE_Process_Semaphore has the same interface, though it synchronizes threads at the system
scope rather than at the process scope.
These two ACE classes encapsulate OS-native semaphore mechanisms whenever possible,
emulating them if the OS platform does not support semaphores natively. This allows applications
to use semaphores and still be ported to new platforms regardless of the native semaphore support,
or lack thereof.
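For example, the following minimal sketch uses ACE_Thread_Semaphore to mediate access to a
pool of interchangeable resources, as suggested at the start of this section (POOL_SIZE and
use_resource() are hypothetical names, not part of ACE):
#include "ace/Thread_Semaphore.h"

// Hypothetical pool of POOL_SIZE interchangeable resources shared
// among the threads of one process.
static const u_int POOL_SIZE = 4;
static ACE_Thread_Semaphore available (POOL_SIZE);

void use_resource ()
{
  available.acquire ();   // Blocks while all resources are in use.
  // ... borrow a resource from the pool and use it ...
  available.release ();   // Make the resource available again.
}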
The ACE_Null_Semaphore class implements all of its methods as "no-op" inline functions. Two
of its acquire() methods are implemented below:
class ACE_Null_Semaphore
{
public:
  int acquire () { return 0; }
  int acquire (ACE_Time_Value *) { errno = ETIME; return -1; }
  // ...
};
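The timed acquire() returns -1 with errno set to ETIME so that callers written against a
timed-wait protocol still behave consistently when the null semaphore is substituted. That
substitution is typically done via a template parameter; the following is a minimal sketch of
the idiom (Resource_Guard is a hypothetical class, not part of ACE):
#include "ace/Thread_Semaphore.h"
#include "ace/Null_Semaphore.h"

// Hypothetical class parameterized by its semaphore type.
template <class SEMAPHORE>
class Resource_Guard
{
public:
  void use () {
    sem_.acquire ();  // A no-op if SEMAPHORE is ACE_Null_Semaphore.
    // ... access the shared resource ...
    sem_.release ();
  }
private:
  SEMAPHORE sem_;
};

// A multithreaded configuration pays for real synchronization...
typedef Resource_Guard<ACE_Thread_Semaphore> Threaded_Guard;
// ...while a single-threaded configuration incurs no locking overhead.
typedef Resource_Guard<ACE_Null_Semaphore> Single_Threaded_Guard;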
Although semaphores can coordinate the processing of multiple threads, they do not themselves
pass any data between threads. Passing data between threads is a common concurrent
programming technique, however, so some type of lightweight intraprocess message queueing
mechanism can be quite useful. Therefore, the key parts of a Message_Queue class
implementation are shown below, showcasing the use of ACE_Thread_Semaphore.
class Message_Queue
{
public:
  // Default high and low water marks.
  enum {
    DEFAULT_LWM = 0,        // 0 is the low water mark.
    DEFAULT_HWM = 16 * 1024 // 16 K is the high water mark.
  };
  // Initialize.
  Message_Queue (size_t = DEFAULT_HWM, size_t = DEFAULT_LWM);
  // Destroy.
  ~Message_Queue ();

  // Check if the queue is empty or full.
  int is_empty () const;
  int is_full () const;

  // Interface for enqueueing/dequeueing ACE_Message_Blocks.
  int enqueue_tail (ACE_Message_Block *, ACE_Time_Value * = 0);
  int dequeue_head (ACE_Message_Block *&, ACE_Time_Value * = 0);

private:
  // Implementations that enqueue/dequeue ACE_Message_Blocks.
  int enqueue_tail_i (ACE_Message_Block *, ACE_Time_Value * = 0);
  int dequeue_head_i (ACE_Message_Block *&, ACE_Time_Value * = 0);

  // Implement the checks for boundary conditions.
  int is_empty_i () const;
  int is_full_i () const;

  // Water marks and current queue contents (in bytes and messages).
  size_t high_water_mark_, low_water_mark_, cur_bytes_, cur_count_;
  // Number of threads blocked waiting to enqueue or dequeue.
  size_t enqueue_waiters_, dequeue_waiters_;

  // Synchronization wrapper facades.
  mutable ACE_Thread_Mutex lock_;
  ACE_Thread_Semaphore notempty_, notfull_;

  // ... remaining details of the underlying message list omitted ...
};
The Message_Queue constructor shown below creates an empty message list and initializes the
ACE_Thread_Semaphores to start with a count of 0 (the mutex lock_ is initialized automatically
by its default constructor).
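A sketch of that constructor, consistent with the class declaration and the description above
(the message list itself is part of the omitted implementation details):
Message_Queue::Message_Queue (size_t hwm, size_t lwm)
  : high_water_mark_ (hwm),
    low_water_mark_ (lwm),
    cur_bytes_ (0),
    cur_count_ (0),
    enqueue_waiters_ (0),
    dequeue_waiters_ (0),
    notempty_ (0), // Both semaphores start with a count of 0.
    notfull_ (0)
{
}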
The following methods check if a queue is "empty," that is, contains no messages, or "full," that
is, contains more than high_water_mark_ bytes in it.
int Message_Queue::is_empty () const {
  ACE_GUARD_RETURN (ACE_Thread_Mutex, guard, lock_, -1);
  return is_empty_i ();
}
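The companion is_full() method, elided from the excerpt, presumably follows the same pattern:
int Message_Queue::is_full () const {
  ACE_GUARD_RETURN (ACE_Thread_Mutex, guard, lock_, -1);
  return is_full_i ();
}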
These methods acquire the lock_ and then forward the call to one of the corresponding
implementation methods.
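A sketch of those predicates, assuming cur_bytes_ and cur_count_ track the queue's current
contents as declared above ("full" meaning more than high_water_mark_ bytes):
int Message_Queue::is_empty_i () const {
  return cur_bytes_ == 0 && cur_count_ == 0;
}

int Message_Queue::is_full_i () const {
  return cur_bytes_ > high_water_mark_;
}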
These methods assume the lock_ is held and actually perform the work.
The enqueue_tail() method inserts a new item at the end of the queue and returns a count of the
number of messages in the queue. As with the dequeue_head() method, if the timeout parameter
is 0, the caller will block until action is possible. Otherwise, the caller will block only up to the
amount of time in *timeout. A blocked call can return when a signal occurs or if the time specified
in timeout elapses, in which case errno is set to EWOULDBLOCK.
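The excerpt below resumes partway through enqueue_tail(), just after the timed acquire on the
notfull_ semaphore. A sketch of the omitted opening, reconstructed from the description that
follows (the exact waiter bookkeeping is an assumption):
int Message_Queue::enqueue_tail (ACE_Message_Block *new_item,
                                 ACE_Time_Value *timeout)
{
  ACE_GUARD_RETURN (ACE_Thread_Mutex, guard, lock_, -1);
  int result = 0;

  // Wait while the queue is full.
  while (is_full_i ()) {
    // Release <lock_> and wait for a timeout, a signal, or for
    // room to free up in the queue.
    ++enqueue_waiters_;
    guard.release ();
    result = notfull_.acquire (timeout);
    guard.acquire ();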
    if (result == -1) {
      if (enqueue_waiters_ > 0)
        --enqueue_waiters_;
      if (errno == ETIME)
        errno = EWOULDBLOCK;
      return -1;
    }
  }

  // Enqueue the message at the end of the queue.
  int queued_messages = enqueue_tail_i (new_item);

  // Tell any blocked threads that the queue has a new item!
  if (dequeue_waiters_ > 0) {
    --dequeue_waiters_;
    notempty_.release ();
  }
  return queued_messages; // guard's destructor releases lock_.
}
The enqueue_tail() method releases the notempty_ semaphore when there is at least one thread
waiting to dequeue a message. The actual enqueueing logic resides in enqueue_tail_i(), which is
omitted here since it is a low-level implementation detail. Note the potential race condition in the
time window between the call to notfull_.acquire() and reacquiring the guard lock. It is possible
for another thread to call dequeue_head(), decrementing enqueue_waiters_ in that small window
of time. After the lock is reacquired, therefore, the count is checked to guard against decrementing
enqueue_waiters_ below 0.
The dequeue_head() method removes the front item from the queue, passes it back to the caller,
and returns a count of the number of items still in the queue.
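As with enqueue_tail(), the excerpt shows only the error-handling fragment of this method. A
sketch of the omitted opening, mirroring the enqueueing side, acquires the lock and waits on the
notempty_ semaphore while the queue is empty:
int Message_Queue::dequeue_head (ACE_Message_Block *&first_item,
                                 ACE_Time_Value *timeout)
{
  ACE_GUARD_RETURN (ACE_Thread_Mutex, guard, lock_, -1);
  int result = 0;

  // Wait while the queue is empty.
  while (is_empty_i ()) {
    // Release <lock_> and wait for a timeout, a signal, or a
    // new message to arrive.
    ++dequeue_waiters_;
    guard.release ();
    result = notempty_.acquire (timeout);
    guard.acquire ();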
    if (result == -1) {
      if (dequeue_waiters_ > 0)
        --dequeue_waiters_;
      if (errno == ETIME)
        errno = EWOULDBLOCK;
      return -1;
    }
  }
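The remainder, again a sketch that mirrors enqueue_tail(), removes the message and wakes an
enqueueing thread that may be blocked waiting for room:
  // Remove the message at the head of the queue.
  int queued_messages = dequeue_head_i (first_item);

  // Wake a blocked enqueueing thread now that there is room.
  if (enqueue_waiters_ > 0) {
    --enqueue_waiters_;
    notfull_.release ();
  }
  return queued_messages; // guard's destructor releases lock_.
}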
Servers can be categorized as iterative, concurrent, or reactive. The primary trade-offs in
this dimension involve simplicity of programming versus the ability to scale to increased service
offerings and host loads.
Iterative servers
Iterative servers handle each client request in its entirety before servicing subsequent requests.
While processing a request, an iterative server therefore either queues or ignores additional
requests. Iterative servers are best suited for either:
- Short-duration services, such as the standard Internet ECHO and DAYTIME services, that
have minimal execution time variation, or
- Infrequently run services, such as a remote file system backup service that runs nightly
when platforms are lightly loaded.
Iterative servers are relatively straightforward to develop. They execute their service
requests internally within a single process address space, as shown by the following pseudo-code.
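A sketch of this structure, in the style of the section's other pseudo-code examples:
void iterative_server()
{
  initialize listener endpoint(s)
  for (each new client request) {
    receive the request
    perform requested service
    if (response required) send response to client
  }
}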
Due to this iterative structure, the processing of each request is serialized at a relatively coarse-
grained level, for example, at the interface between the application and an OS synchronous event
demultiplexer, such as select() or WaitForMultipleObjects(). However, this coarse-grained level
of concurrency can underutilize certain processing resources (such as multiple CPUs) and OS
features (such as support for parallel DMA transfer to/from I/O devices) that are available on a
host platform.
Iterative servers can also prevent clients from making progress while they are blocked waiting for
a server to process their requests. Excessive server-side delays complicate application and
middleware-level retransmission time-out calculations, which can trigger excessive network
traffic. Depending on the types of protocols used to exchange requests between client and server,
duplicate requests may also be received by a server.
Concurrent servers
Concurrent servers handle multiple requests from clients simultaneously. Depending on the OS
and hardware platform, a concurrent server executes its services using either multiple threads or
multiple processes. If the server is a single-service server, multiple copies of the same service can
run simultaneously. If the server is a multiservice server, multiple copies of different services may
also run simultaneously.
Concurrent servers are well-suited for I/O-bound services and/or long-duration services that
require variable amounts of time to execute. Unlike iterative servers, concurrent servers allow
finer-grained synchronization techniques that serialize requests at an application-defined level.
Concurrent servers can be structured in various ways, for example, with multiple processes or
threads. A common concurrent server design is thread-per-request, where a master thread spawns
a separate worker thread to perform each client request concurrently.
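A sketch of the master thread in this model, with assumed dispatch details:
void master_thread()
{
  initialize listener endpoint(s)
  for (;;) {
    receive a request from a client
    spawn a worker_thread() and pass the request to it
  }
}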
The master thread continues to listen for new requests, while the worker thread processes the client
request, as follows:
void worker_thread()
{
  perform requested service
  if (response required) send response to client
  terminate thread
}
It's straightforward to modify this thread-per-request model to support other concurrent server
models, such as thread-per-connection.
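A sketch of the corresponding master thread, again with assumed dispatch details:
void master_thread()
{
  initialize listener endpoint(s)
  for (;;) {
    accept a new client connection
    spawn a worker_thread() and pass the connection to it
  }
}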
In this design, the master thread continues to listen for new connections, while the worker thread
processes client requests from the connection, as follows:
void worker_thread()
{
  for (each request on the connection) {
    receive the request
    perform requested service
    if (response required) send response to client
  }
}
Thread-per-connection provides good support for prioritization of client requests. For instance,
connections from high-priority clients can be associated with high-priority threads. Requests from
higher-priority clients will therefore be served ahead of requests from lower-priority clients since
the OS can preempt lower-priority threads.
Reactive servers
Reactive servers process multiple requests virtually simultaneously, although all processing is
actually done in a single thread. Before multithreading was widely available on OS platforms,
concurrent processing was often implemented via a synchronous event demultiplexing strategy
where multiple service requests were handled in round-robin order by a single-threaded process.
For instance, the standard X Windows server operates this way.
A reactive server can be implemented by explicitly time-slicing attention to each request via
synchronous event demultiplexing mechanisms, such as select() and WaitForMultipleObjects().
The following pseudo-code illustrates the typical style of programming used in a reactive server
based on select():
void reactive_server()
{
  initialize listener endpoint(s)
  // Event loop.
  for (;;) {
    select() on multiple endpoints for client requests
    for (each active client request) {
      receive the request
      perform requested service
      if (response is necessary) send response to client
    }
  }
}
Although this server can service multiple clients over a period of time, it is fundamentally iterative
from the server's perspective. Compared with taking advantage of full-fledged OS support for
multithreading, therefore, applications developed using this technique possess the following
limitations: