Implement barrier and double barrier #11725
pub fn enter(&self) {
    self.barrier.enter();
    self.barrier.exit();
I think this exit may just be unnecessary synchronization? You'll have all the tasks get into enter, but then they'll all immediately stampede to exit, and then they'll all be allowed out. The code above is pretty small, so perhaps this could just be reimplemented in terms of the enter above? I do think that the barrier should be able to get re-used after tasks have exited the barrier, though.
What do you think about calling this function wait instead?
Are there other libraries we can look to for the names of these types? These don't do what I expected from the names: I figured a 'barrier' would be a memory barrier and I wouldn't be able to guess what a double barrier is. Googling doesn't turn up hits for 'double barrier'. Seems like the documentation could be beefed up a lot too since I had to read the code to understand what these do.
For reference, these are the two places I have seen that discuss double barriers: Per @alexcrichton's previous comments, I added some comments and code examples. Also, my previous implementation was flatly wrong, as I overlooked the issue of spurious wakeups. |
What do you think about only implementing |
Do you mean: rename |
Not quite. I would imagine this sequence of events:
|
FWIW, this seems to be the behaviour the Pthreads barrier has: http://linux.die.net/man/3/pthread_barrier_wait
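For comparison, here is a minimal sketch of the same single-call, reusable behaviour using the barrier that ships in current Rust. It assumes modern `std::sync::Barrier` and `std::thread`, not the APIs in this PR; `run_rounds` and its parameters are hypothetical names chosen for illustration.

```rust
use std::sync::{Arc, Barrier};
use std::thread;

/// Runs `rounds` barrier cycles across `n` threads and returns how many
/// times any thread was reported as the leader of a cycle.
fn run_rounds(n: usize, rounds: usize) -> usize {
    let barrier = Arc::new(Barrier::new(n));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                let mut led = 0;
                for _ in 0..rounds {
                    // Blocks until all `n` threads have called `wait`; the
                    // same barrier is then reused on the next iteration.
                    if barrier.wait().is_leader() {
                        led += 1;
                    }
                }
                led
            })
        })
        .collect();
    // Sum the per-thread leader counts.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Exactly one thread per cycle is reported as leader, which plays the same role as the `PTHREAD_BARRIER_SERIAL_THREAD` return value, so `run_rounds(4, 3)` counts three leaders in total.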
@alexcrichton The standard implementation of a barrier, like that given in The Little Book of Semaphores (page 44), is to have two phases (which correspond to If you think about it, you can't simply reset the count. Consider this code:
pub fn enter(&self) {
    self.arc.access_cond(|state, cond| {
        state.count += 1;
        if state.count < self.num_tasks {
            cond.wait();
        } else {
            state.count = 0;
            cond.broadcast();
        }
    });
}
The problem with this code is that it doesn't guard against spurious wakeups. As a standard practice, a condition variable should wait only inside a while loop that checks the condition, so that a thread that wakes up before the condition has become true goes back to waiting. So, the code above could be rewritten like this:
pub fn enter(&self) {
    self.arc.access_cond(|state, cond| {
        state.count += 1;
        if state.count < self.num_tasks {
            while state.count < self.num_tasks {
                cond.wait();
            }
        } else {
            state.count = 0;
            cond.broadcast();
        }
    });
}
However, since we are resetting the counter to 0, the tasks being woken up might not be able to escape the while loop.
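To make the trap concrete, here is a small single-threaded sketch that replays the last arrival and then evaluates the loop condition a woken waiter would see. It is a model only, with no real threads or condition variables; `NaiveState`, `must_keep_waiting` and `waiter_stuck_after_release` are hypothetical names, not part of the PR.

```rust
/// Shared barrier state from the naive version: just a counter.
struct NaiveState {
    count: usize,
}

/// The predicate a sleeping task re-checks after being woken:
/// `true` means "go back to sleep".
fn must_keep_waiting(state: &NaiveState, num_tasks: usize) -> bool {
    state.count < num_tasks
}

/// Replays what the last arriving task does, then what a woken waiter sees.
/// Returns `true` if the waiter would go back to sleep.
fn waiter_stuck_after_release(num_tasks: usize) -> bool {
    // All but one task have arrived and are waiting.
    let mut state = NaiveState { count: num_tasks - 1 };
    // The last task arrives and sees a full barrier...
    state.count += 1;
    assert!(state.count >= num_tasks);
    // ...so it resets the counter and broadcasts.
    state.count = 0;
    // A woken waiter re-checks its loop condition against the reset counter.
    must_keep_waiting(&state, num_tasks)
}
```

For any barrier size the waiter observes `count == 0`, so the condition `count < num_tasks` still holds and it goes back to sleep.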
It's true that condition variables are often susceptible to spurious wakeups, but remember that this is an implementation detail that isn't necessarily true in all circumstances. Notably, our condition variables are not susceptible to spurious wakeups. Additionally, I believe that it's still possible to write a generic
fn wait(&mut self) {
    self.arc.access_cond(|state, cond| {
        let id = state.generation_id;
        state.count += 1;
        if state.count < self.num_tasks {
            while state.generation_id == id && state.count < self.num_tasks {
                cond.wait();
            }
        } else {
            state.count = 0;
            state.generation_id += 1;
            cond.broadcast();
        }
    });
}
The generation id ensures that a thread only waits in one usage of a barrier. With our condition variables as-is I don't believe that this is necessary, but we may be susceptible to spurious wakeups at some point.
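For readers on current Rust, the generation-id idea can be sketched roughly as follows. This assumes modern `std::sync::{Mutex, Condvar}` in place of the `access_cond` API used in the PR; `GenBarrier` and `exercise` are hypothetical names, and the wrapping increment is my own choice rather than something the thread specifies.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

struct State {
    count: usize,
    generation_id: usize,
}

/// Hypothetical reusable barrier built on a generation id.
pub struct GenBarrier {
    state: Mutex<State>,
    cond: Condvar,
    num_tasks: usize,
}

impl GenBarrier {
    pub fn new(num_tasks: usize) -> GenBarrier {
        GenBarrier {
            state: Mutex::new(State { count: 0, generation_id: 0 }),
            cond: Condvar::new(),
            num_tasks,
        }
    }

    /// Blocks until `num_tasks` callers have arrived, then releases them all.
    pub fn wait(&self) {
        let mut state = self.state.lock().unwrap();
        let id = state.generation_id;
        state.count += 1;
        if state.count < self.num_tasks {
            // Re-check after every wakeup: only a generation change means
            // this usage of the barrier has finished.
            while state.generation_id == id && state.count < self.num_tasks {
                state = self.cond.wait(state).unwrap();
            }
        } else {
            // Last arrival: reset for reuse and start the next generation.
            state.count = 0;
            state.generation_id = state.generation_id.wrapping_add(1);
            self.cond.notify_all();
        }
    }
}

/// Runs `rounds` cycles on `n` threads; returns `true` if no thread ever
/// left the barrier before all `n` arrivals of that round were recorded.
pub fn exercise(n: usize, rounds: usize) -> bool {
    let barrier = Arc::new(GenBarrier::new(n));
    let arrivals = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            let arrivals = Arc::clone(&arrivals);
            thread::spawn(move || {
                let mut ok = true;
                for round in 1..=rounds {
                    arrivals.fetch_add(1, Ordering::SeqCst);
                    barrier.wait();
                    ok &= arrivals.load(Ordering::SeqCst) >= n * round;
                }
                ok
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .fold(true, |acc, ok| acc && ok)
}
```

Because the waiter compares against the generation it recorded on entry, resetting `count` to 0 no longer traps it, and the same barrier can be reused across rounds.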
From my understanding, spurious wakeup is an inherent problem of Anyway, I changed the code according to your instruction. Note that instead of using |
We can get around spurious wakeups in two ways:
I think we may want to stick with an integer generation, though. Consider a sequence of events like this with a barrier of count 2:
Essentially, you don't know how many generations have passed between when you were signaled and when you actually woke up. If we use an integer, it's pretty unlikely that 4 billion generations pass in that span of time, but I think it would be much more likely for exactly 1 generation to pass.
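The difference between a flipped flag and an integer generation can be checked with a small model. This sketch assumes a 32-bit counter to match the "4 billion" figure above (the PR itself uses `uint`); both function names are hypothetical.

```rust
/// Boolean-flag variant: would a waiter that recorded `seen` still believe
/// it is in the same generation after `passed` further generations?
fn flag_looks_unchanged(seen: bool, passed: u32) -> bool {
    let mut flag = seen;
    for _ in 0..passed {
        // Each completed usage of the barrier flips the flag.
        flag = !flag;
    }
    flag == seen
}

/// Integer-generation variant of the same question.
fn counter_looks_unchanged(seen: u32, passed: u32) -> bool {
    // Each completed usage increments the counter, wrapping on overflow.
    seen.wrapping_add(passed) == seen
}
```

With the flag, any even number of intervening generations is indistinguishable from none, so a waiter that missed two generations would go back to sleep. With the 32-bit counter, the recorded value only reappears after a full 2^32 generations.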
You are absolutely correct. I missed the fact that it doesn't matter if |
// The inner state of a double barrier
struct BarrierState {
    priv count: uint,
    // For each usage of the barrier, we flip the flag
This comment is a little out of date now.
Thanks for reviewing. Fixed.
Could you rebase these commits into one? Other than that, this looks good to me!
@alexcrichton done.
@alexcrichton Fixed formatting issues. Retry?
Revert rust-lang#11490 Closes rust-lang#11725 rust-lang#11490 was a little misguided. Quoting the test name should be a client concern, since it's the client that actually runs `cargo`.