Implement barrier and double barrier #11725

derekchiang · 2014-01-22T15:46:04Z

No description provided.

alexcrichton · 2014-01-22T19:27:34Z

src/libextra/sync.rs

+
+    pub fn enter(&self) {
+        self.barrier.enter();
+        self.barrier.exit();


I think this exit may just be unnecessary synchronization? You'll have all the tasks get into enter, but then they'll all immediately stampede to exit, and then they'll all be allowed out. The code above is pretty small, so perhaps this could just be reimplemented in terms of the enter above? I do think that the barrier should be able to get re-used after tasks have exited the barrier, though.

What do you think about calling this function wait instead?

brson · 2014-01-23T01:03:56Z

Are there other libraries we can look to for the names of these types? These don't do what I expected from the names: I figured a 'barrier' would be a memory barrier and I wouldn't be able to guess what a double barrier is. Googling doesn't turn up hits for 'double barrier'. Seems like the documentation could be beefed up a lot too since I had to read the code to understand what these do.

derekchiang · 2014-01-23T02:49:46Z

For reference, these are the two places I have seen that discuss double barriers:

Per @alexcrichton's previous comments, I added some comments and code examples. Also, my previous implementation was flatly wrong, as I overlooked the issue of spurious wakeups.

alexcrichton · 2014-01-23T03:30:00Z

What do you think about only implementing Barrier, but allowing re-use as you do now? I would think that a DoubleBarrier is just trivially two waits on a single barrier.
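To make the "two waits on a single barrier" idea concrete, here is a minimal sketch. It assumes today's `std::sync::Barrier` rather than the libextra type under review, and `double_wait_demo` is a hypothetical helper name used only for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Barrier};
use std::thread;

/// Runs `n` threads through two rendezvous points on one reusable barrier.
/// Returns, per thread, how many threads had arrived by the time that
/// thread got past the first wait.
fn double_wait_demo(n: usize) -> Vec<usize> {
    let barrier = Arc::new(Barrier::new(n));
    let entered = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            let entered = Arc::clone(&entered);
            thread::spawn(move || {
                entered.fetch_add(1, Ordering::SeqCst);
                barrier.wait(); // plays the role of "enter"
                let seen = entered.load(Ordering::SeqCst);
                barrier.wait(); // plays the role of "exit", same barrier reused
                seen
            })
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Every thread increments the counter before its first wait, so each one should observe the full count `n` once it is released.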

derekchiang · 2014-01-23T03:36:39Z

Do you mean: rename DoubleBarrier to Barrier, and implement DoubleBarrier such that both enter and exit just call wait?

alexcrichton · 2014-01-23T03:39:32Z

Not quite. I would imagine this sequence of events:

Remove Barrier
Rename DoubleBarrier to Barrier
Remove exit
Rename enter to wait
Modify the broadcast() in wait to also reset the count.

c-a · 2014-01-23T13:29:46Z

Not quite. I would imagine this sequence of events:

Remove Barrier
Rename DoubleBarrier to Barrier
Remove exit
Rename enter to wait
Modify the broadcast() in wait to also reset the count.

FWIW, this seems to be the behaviour the Pthreads barrier has: http://linux.die.net/man/3/pthread_barrier_wait

derekchiang · 2014-01-24T04:23:30Z

@alexcrichton The standard implementation of a barrier, like the one given in The Little Book of Semaphores (page 44), has two phases (which correspond to enter and exit in my implementation), and wait() runs both phases. Note that the implementation in the book uses semaphores while mine uses condition variables, but conceptually they are the same.

If you think about it, you can't simply reset the count. Consider this code:

pub fn enter(&self) {
    self.arc.access_cond(|state, cond| {
        state.count += 1;
        if state.count < self.num_tasks {
            cond.wait();
        } else {
            state.count = 0;
            cond.broadcast();
        }
    });
}

The problem with this code is that it doesn't guard against spurious wakeup. As a standard practice, condition variables should wait only in a while loop which checks for a condition, to prevent a thread from accidentally waking up when the condition has not become true yet. So, the code above could be rewritten like this:

pub fn enter(&self) {
    self.arc.access_cond(|state, cond| {
        state.count += 1;
        if state.count < self.num_tasks {
            while state.count < self.num_tasks {
                cond.wait();
            }
        } else {
            state.count = 0;
            cond.broadcast();
        }
    });
}

However, since we are resetting the counter to 0, the tasks being woken up see state.count < self.num_tasks again and might not be able to escape the while loop.
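The trap can be traced without any threads. The sketch below is a single-threaded model of the counter logic only, not the PR's code; `State`, `arrive`, `must_keep_waiting`, and `waiter_is_trapped` are hypothetical names, and `num_tasks` is assumed to be at least 1.

```rust
/// Shared barrier state, mirroring the `count` field in the snippets above.
struct State {
    count: usize,
}

/// One arrival at the barrier. Returns `true` if this arrival was the last
/// one, in which case the count is reset to 0 before the broadcast.
fn arrive(state: &mut State, num_tasks: usize) -> bool {
    state.count += 1;
    if state.count < num_tasks {
        false // this task would now sit in the `while ... cond.wait()` loop
    } else {
        state.count = 0; // reset for reuse; cond.broadcast() would follow
        true
    }
}

/// The loop condition a woken waiter re-checks: `true` means "keep waiting".
fn must_keep_waiting(state: &State, num_tasks: usize) -> bool {
    state.count < num_tasks
}

/// Drives one full round and reports whether a waiter woken by the
/// broadcast would go straight back to sleep.
fn waiter_is_trapped(num_tasks: usize) -> bool {
    let mut state = State { count: 0 };
    for _ in 0..num_tasks - 1 {
        assert!(!arrive(&mut state, num_tasks)); // early arrivals wait
    }
    assert!(arrive(&mut state, num_tasks)); // last arrival resets the count
    must_keep_waiting(&state, num_tasks)
}
```

In this model the woken waiter always finds the count back at 0, so the loop condition on the count alone cannot tell "released" apart from "not yet full".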

alexcrichton · 2014-01-24T05:36:09Z

It's true that condition variables are often susceptible to spurious wakeups, but remember that this is an implementation detail that isn't necessarily true in all circumstances. Notably, our condition variables are not susceptible to spurious wakeups.

Additionally, I believe that it's still possible to write a generic wait function:

fn wait(&mut self) {
    self.arc.access_cond(|state, cond| {
        let id = state.generation_id;
        state.count += 1;
        if state.count < self.num_tasks {
            while state.generation_id == id && state.count < self.num_tasks {
                cond.wait();
            }
        } else {
            state.count = 0;
            state.generation_id += 1;
            cond.broadcast();
        }
    });
}

The generation id ensures that a thread only waits in one usage of the barrier. With our condition variables as-is I don't believe that this is necessary, but we may be susceptible to spurious wakeups at some point.
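For readers on current Rust, the same generation scheme can be sketched with `std::sync::Mutex` and `std::sync::Condvar`. This is an illustration under assumptions, not the code that was merged: `GenBarrier` and `run_rounds` are hypothetical names, and the loop checks only the generation, because within one generation a waiter always sees `count < num_tasks` anyway.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

struct BarrierState {
    count: usize,
    generation_id: usize,
}

/// A reusable barrier for `num_tasks` threads (hypothetical type name).
struct GenBarrier {
    state: Mutex<BarrierState>,
    cond: Condvar,
    num_tasks: usize,
}

impl GenBarrier {
    fn new(num_tasks: usize) -> Self {
        GenBarrier {
            state: Mutex::new(BarrierState { count: 0, generation_id: 0 }),
            cond: Condvar::new(),
            num_tasks,
        }
    }

    /// Blocks until `num_tasks` threads have arrived in this generation.
    /// Returns the generation the caller took part in.
    fn wait(&self) -> usize {
        let mut state = self.state.lock().unwrap();
        let id = state.generation_id;
        state.count += 1;
        if state.count < self.num_tasks {
            // The loop guards against spurious wakeups; the generation check
            // lets the waiter leave even though `count` was reset to 0.
            while state.generation_id == id {
                state = self.cond.wait(state).unwrap();
            }
        } else {
            state.count = 0;
            state.generation_id = state.generation_id.wrapping_add(1);
            self.cond.notify_all();
        }
        id
    }
}

/// Runs `num_tasks` threads through the barrier `rounds` times and returns
/// the generations each thread took part in.
fn run_rounds(num_tasks: usize, rounds: usize) -> Vec<Vec<usize>> {
    let barrier = Arc::new(GenBarrier::new(num_tasks));
    let handles: Vec<_> = (0..num_tasks)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || (0..rounds).map(|_| barrier.wait()).collect::<Vec<_>>())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

With exactly `num_tasks` threads calling `wait` in lockstep, each thread should take part in generations 0, 1, 2, and so on, which is what makes the barrier reusable.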

derekchiang · 2014-01-24T18:10:05Z

From my understanding, spurious wakeup is an inherent problem of pthread, so I'm not sure how Rust could avoid it. Could you explain a bit?

Anyway, I changed the code according to your instruction. Note that instead of using generation_id, I'm using a boolean flag to make the barrier truly reusable.

alexcrichton · 2014-01-24T18:19:58Z

We can get around spurious wakeups in two ways:

Don't use pthreads. We have scheduling primitives on the tasks themselves, so we can use those to implement a cvar
Have fine-grained knowledge about how a cvar is being used. For example, native tasks right now use a pthread cvar in order to implement deschedule, and they're able to protect against spurious wakeups because they have detailed knowledge about the usage pattern of the wakeup procedure (reawaken)

I think we may want to stick with an integer generation, though. Consider a sequence of events like this with a barrier of count 2:

Thread A blocks with the generation of false.
Thread B finishes the generation, waking up A, the generation is now true.
Threads C and D sleep on the barrier, flipping it back to false.
Thread A wakes up, sees the generation is still false, and the count is 0, so it goes back to sleep.

Essentially you don't know how many generations have passed between when you were signaled and when you actually woke up. If we use an integer it's pretty unlikely to have 4 billion generations between when you were signaled and when you woke up, but I think it would be more likely to have 1 generation happen in that span of time.
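The aliasing argument can be checked with a small model. This is a sketch under assumptions: the function names are hypothetical, and a `u8` stands in for the integer generation so that wraparound happens after 256 rounds instead of billions.

```rust
/// A waiter that recorded `seen` goes back to sleep if the generation it
/// observes on wakeup looks unchanged.
fn waiter_sleeps_again<G: PartialEq>(seen: G, current: G) -> bool {
    seen == current
}

/// Boolean generation after `rounds` completed barrier rounds.
fn advance_bool(start: bool, rounds: u32) -> bool {
    (0..rounds).fold(start, |g, _| !g)
}

/// Integer generation after `rounds` completed barrier rounds. Overflow
/// wraps around, which is harmless because only equality is checked.
fn advance_u8(start: u8, rounds: u32) -> u8 {
    (0..rounds).fold(start, |g, _| g.wrapping_add(1))
}
```

With a boolean, one extra round between the signal and the wakeup (two rounds in total) is enough to make the waiter sleep again; with the integer, the same mistake needs a full wraparound of the counter.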

derekchiang · 2014-01-24T23:11:24Z

You are absolutely correct. I missed the fact that it doesn't matter if generation_id overflows. Fixed.

alexcrichton · 2014-01-25T06:31:35Z

src/libextra/sync.rs

+// The inner state of a double barrier
+struct BarrierState {
+    priv count: uint,
+    // For each usage of the barrier, we flip the flag


This comment is a little out of date now.

derekchiang · 2014-01-25T06:53:04Z

Thanks for reviewing. Fixed.

alexcrichton · 2014-01-25T06:55:41Z

Could you rebase these commits into one? Other than that, this looks good to me!

derekchiang · 2014-01-25T07:02:24Z

@alexcrichton done.

derekchiang · 2014-01-25T17:01:46Z

@alexcrichton Fixed formatting issues. Retry?

Implement barrier.

a937d18

bors added a commit that referenced this pull request Jan 25, 2014

auto merge of #11725 : derekchiang/rust/add-barrier, r=alexcrichton

b85fe01

bors closed this Jan 25, 2014

bors merged commit a937d18 into rust-lang:master Jan 25, 2014
