Asynchronicity: An Introduction To Asynchronous Programming
A Lecture by
Matt Patenaude
Tuesday, March 12, 13
It's a Police joke. Get it? Get it?
@mattpat
We begin with a simple JavaScript example: three functions (doThis, doThat, and doSomethingElse),
which we then call in order. A very important, subtle point about this code, though, is that it actually
says this: doThis, THEN doThat, THEN doSomethingElse. There's something nice and comforting
about this programming model: it's predictable, it's simple, and it's intuitive. This is what's known as
synchronous programming. It does, however, have its obstacles.
At this point in the presentation, I asked Sam to go get me a glass of water, and then stopped
talking. I didn't resume the presentation until Sam came back with the glass of water. The problem
with synchronous programming is that if one operation takes a long time, you're stuck waiting, doing
absolutely nothing, until it finishes. It's a waste of everyone's time and resources, and there's no
reason I had to just sit there while I waited for the operation (Sam getting water) to finish.
Asynchronous programming provides the answer.
function doThis() {
    console.log("Hello, world!");
}
var myFunc = doThis;
myFunc();
// Hello, world!
Our ability to program asynchronously in JavaScript comes from this language feature: I can assign
the name of a function (NOTE: I did not use parentheses at the end) to a variable, and that variable
now becomes a new "alias" for the original function. If I then call my new variable as a function, it
does exactly what the original function did.
This is analogous to the concept of function pointers in C: the name of a function always acts as a
pointer to it.
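To make the distinction between naming a function and calling it concrete, here is a minimal sketch. The names greet, alias, and callTwice are made up for illustration and do not appear on the slides.

```javascript
function greet() {
    return "Hello, world!";
}

// No parentheses: alias now refers to the very same function object.
var alias = greet;

// Parentheses: the function runs, and result holds its return value.
var result = greet();

// Because a function is just a value, it can be handed to another function,
// which decides when (and how often) to call it.
function callTwice(fn) {
    return [fn(), fn()];
}

console.log(alias === greet);   // true
console.log(result);            // Hello, world!
console.log(callTwice(alias));  // [ 'Hello, world!', 'Hello, world!' ]
```

Passing alias to callTwice is the same move we make whenever we hand a callback to someone else's code.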
function doThis(callback) {
    // do important things
    callback();
}
function doThat() { /* ... */ }
// my callback
function afterDoThis() {
    doThat();
}
doThis(afterDoThis);
With that in mind, we find a solution to asynchronous programming using callbacks. Here, we
rewrite our original doThis function to take one parameter, callback, which we then call at the
end of the function to signal we're done. We then create a new function, called afterDoThis, that
will be our doThis callback; all it does is call doThat. Finally, we call doThis, passing in afterDoThis
as its callback.
The idea here is that this is effectively no different than if we had just called doThis(); doThat();
in sequence: doThat will happen after doThis finishes. The difference, though, is that we've now
freed things up to happen asynchronously. doThis now has the capability to tell us when it's done,
so our process can actually do other things in the meantime.
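The following sketch shows what "doing other things in the meantime" can look like. It assumes the slow work inside doThis is simulated with setTimeout, and the events array is a made-up device for recording the order in which things happen; neither is from the slides.

```javascript
var events = [];

function doThis(callback) {
    events.push("doThis started");
    // Assumption for this sketch: pretend the important work takes 20 ms.
    setTimeout(function() {
        events.push("doThis finished");
        callback();   // signal that we're done
    }, 20);
}

function doThat() {
    events.push("doThat");
}

// my callback
function afterDoThis() {
    doThat();
}

doThis(afterDoThis);

// This line runs right away, before doThis has finished its work.
events.push("doing other things");

// Print the recorded order once everything has had time to finish.
setTimeout(function() {
    console.log(events.join(" -> "));
}, 100);
```

doThat still runs after doThis finishes, but the program was free to do something else while it waited.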
The problem with the code we just wrote, though, is that it's pretty confusing and requires a lot of
work. Writing a whole new set of callback functions and then delicately weaving them together takes
a lot of care, and hinders productivity. The solution to this is a new coding style, continuation-passing
style, that allows us to "pass the baton" between functions.
Here, we have the same code as before, but written asynchronously. Note that we've modified
doThis and doThat to take callbacks (named cb), which they call at the end of their work. Then, we
use JavaScript anonymous functions (which can be created by just using the word function
without providing a name) and nest them into our function calls. The effect is that each new level
of indentation represents the next place where execution will resume after each operation
completes.
Continuation-passing style lets us write code like it's synchronous, but preserves the new
asynchronous nature of the callbacks.
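A minimal sketch of the nesting described above follows. It assumes each step simulates its work with setTimeout, and it records what happens in a made-up log array; the bodies of the three functions are placeholders, not the slide's code.

```javascript
var log = [];

// Each step does its (simulated) work, then calls cb to pass the baton.
function doThis(cb) {
    setTimeout(function() {
        log.push("doThis");
        cb();
    }, 10);
}

function doThat(cb) {
    setTimeout(function() {
        log.push("doThat");
        cb();
    }, 10);
}

function doSomethingElse() {
    log.push("doSomethingElse");
}

// Each level of indentation is where execution resumes after the
// operation above it completes.
doThis(function() {
    doThat(function() {
        doSomethingElse();
    });
});

// Nothing above has finished yet, so this is recorded first.
log.push("meanwhile");

setTimeout(function() {
    console.log(log.join(" -> "));
}, 100);
```

Read top to bottom, the nested calls look like the synchronous version, yet the "meanwhile" entry shows that the program did not block while the steps ran.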
Demo
In the code provided with these slides is a file called demo.js. Take any large file, rename it
bigfile.dat, and put it in the same folder as demo.js, then run the script (node demo.js). If you look
at the code, you can generally see what's happening: we request to read bigfile.dat into memory,
but meanwhile, we start counting up from 0, stopping when the file finishes loading.
You'll notice that, even though it only takes on the order of a few hundred milliseconds to read the
file into memory, we're still able to count into the tens of thousands in the meantime. While 500ms is
an instant for a human, it's an eternity for a computer, which is why it's important to write code
asynchronously (that way, we always make the most of the time that's given to us).
Also provided in the folder is a C version of the code using Blocks and GCD (two features available
only on the Mac). If you have Xcode installed on your machine, you can compile the code by
running make, and then execute it with ./demo
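The demo.js file itself is not reproduced in these notes, so the following is only a sketch of the idea, not the original script. It assumes Node's fs.readFile for the asynchronous read and setImmediate for counting between events, and it writes its own temporary file as a stand-in for bigfile.dat. The names countWhileReading and demoFile are hypothetical.

```javascript
var fs = require("fs");
var os = require("os");
var path = require("path");

// Start reading `file` asynchronously, then count upward until the read
// completes. `done` receives the error (if any), the count, and the size.
function countWhileReading(file, done) {
    var count = 0;
    var finished = false;

    fs.readFile(file, function(err, data) {
        finished = true;
        done(err, count, data ? data.length : 0);
    });

    // Count one step per turn of the event loop, so the read's callback
    // gets a chance to run between steps.
    function tick() {
        if (finished) return;
        count++;
        setImmediate(tick);
    }
    tick();
}

// Stand-in for bigfile.dat: 8 MB of zeros in the system temp directory.
var demoFile = path.join(os.tmpdir(), "bigfile-sketch.dat");
fs.writeFileSync(demoFile, Buffer.alloc(8 * 1024 * 1024));

countWhileReading(demoFile, function(err, count, bytes) {
    if (err) throw err;
    console.log("Read " + bytes + " bytes; counted to " + count + " while waiting.");
    fs.unlinkSync(demoFile);
});
```

The exact count depends on the machine and the file size, so no particular number should be expected.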
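[Diagram: a thread-per-request server. The parent thread runs "listen :80", then loops: "while true: accept_conn c; fork c". Each of three incoming Requests is handled by its own forked thread, which runs "read req", "compose res", "send res", then "exit". Memory labels on the slide: 150 MB, 100 MB, 100 MB, 100 MB.]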
Here we show an example of how web servers used to work. The idea is that old web servers, like
Apache, use threads to handle large numbers of clients. They start one thread initially, which listens
on a port (like 80), then just waits until clients try to connect. Because they are written
synchronously, though, once a client does connect, they can't respond to it in the same thread,
because that would prevent other clients from making new connections (think about it: if I need to
send a large file back, which will take a few seconds, other clients will have to wait those few seconds
before I get back around to checking if new clients are waiting to connect). In order to solve that
problem, then, Apache forks a new thread for every incoming request, which is responsible for
handling *just* that request.
Here's a scenario, though: imagine you pay $20 per month for web hosting, and you get 512MB of
RAM for that price. In the Apache model, each thread has its own copy of PHP in memory, which
means each thread weighs in at ~150MB of RAM. If three users try to access your site
simultaneously, you have effectively exhausted the capacity of your server, and other users will
have to wait.
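Event Loop
[Diagram: a single-threaded server. One thread runs "listen :80", then loops: "while true: accept_conn c; handle c". The "read req", "compose res", and "send res" steps for three requests all run inside that one loop. Memory label on the slide: 150 MB.]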
This works by structuring code in an event loop. The idea is that, initially, there's only one event in
the loop, which is to check for new connections. Once connections come in, new events get added
to the loop to handle each request.
Now that many of our important operations (like reading files and sending responses) are
asynchronous, though, thanks to Node's callback-friendly APIs, I no longer have to wait for one
request to finish before moving on to the next one. Consider, for example, the large file scenario: if I
get a request for a large file, I can start reading it in, but because it will let me know when it's done, I
no longer have to wait: I can move on and start handling the next request. Eventually, the
response to the first request will get sent once the file finishes loading. This maximizes the utility of
our one thread, rather than having lots of heavyweight threads that don't do all that much
individually.
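To illustrate how events get added to the loop, here is a toy event loop written by hand. It is only a sketch: Node's real loop is driven by operating-system I/O notifications rather than a plain array, and the names queue, enqueue, handle, and trace are all made up for this example.

```javascript
var queue = [];   // pending events, run in arrival order
var trace = [];   // record of what ran, for inspection

function enqueue(task) {
    queue.push(task);
}

// Handling a request is split into steps. Each step schedules the next one
// instead of running it immediately, so other requests can take a turn.
function handle(id) {
    enqueue(function() {
        trace.push("read req " + id);
        enqueue(function() {
            trace.push("compose res " + id);
            enqueue(function() {
                trace.push("send res " + id);
            });
        });
    });
}

// Initially there is only one event: accept whatever connections arrived.
enqueue(function() {
    [1, 2, 3].forEach(function(id) {
        trace.push("accept_conn " + id);
        handle(id);
    });
});

// The loop itself: one thread, one event at a time.
while (queue.length > 0) {
    var next = queue.shift();
    next();
}

console.log(trace.join("\n"));
```

In the printed trace, all three requests are read before any response is composed: no request monopolizes the single thread.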
Let's apply this to a typical example: an Express app. Here, we require Express, set up the app, tell it to use the
bodyParser middleware, and then start listening on port 8080. All of these operations happen synchronously: we
don't provide them callbacks, nor should we, because they're just part of our server start-up time.
However, the interesting things here are our app.get and app.post calls. First of all, it's important to note that
app.get() doesn't mean "get something"; it means "I'm going to tell you what to do when a *web browser* tries
to get something." This is how you associate callbacks with particular URLs, and this is the reason we use Express:
it affords us the convenience of linking arbitrary callbacks to URLs.
Inside each URL callback, we do any work we need to do, and then send a response. Note that if we don't call
res.send() (or something similar like res.end(), res.render(), or res.redirect()), the browser will just sit there
forever. This is a consequence of asynchronous programming: rather than just sending back an empty response at
the end of your callback, Express assumes that you might be waiting on some *other* callback before you want to
send a response.
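The idea that app.get() registers a callback rather than fetching anything can be sketched without Express at all. The toy below is a stand-in, not Express's implementation: routes, the app object, and simulateRequest are all hypothetical names invented for this illustration.

```javascript
var routes = {};   // maps "METHOD url" to a callback

// Toy stand-in for an Express app: get() only stores the callback.
var app = {
    get: function(url, handler) {
        routes["GET " + url] = handler;
    }
};

var handlerRuns = 0;

app.get("/hello", function(req, res) {
    handlerRuns++;
    res.send("Hello from the callback");
});

// Registration alone has not run the handler.
var runsAfterRegistration = handlerRuns;

// Pretend a browser asked for a URL: look up the callback and call it.
function simulateRequest(method, url) {
    var sent = null;
    var res = {
        send: function(body) { sent = body; }
    };
    routes[method + " " + url]({ url: url }, res);
    return sent;   // stays null if the handler never calls res.send()
}

console.log(runsAfterRegistration);            // 0
console.log(simulateRequest("GET", "/hello")); // Hello from the callback
```

If a handler never calls res.send(), this toy returns null; a real browser, as noted above, would simply keep waiting.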
var express = require("express");
var anyDB = require("any-db");
var db = anyDB.createConnection("sqlite3://db.db");
var app = express();
app.use(express.bodyParser());
app.get("/chats.json", function(req, res){
    var result = [];
});
app.listen(8080);
Closures
function returnX() {
    var x = 6;
    return function() {
        return x;
    };
}
var myFunc = returnX();
var x = myFunc();
console.log(x);
// 6
function alwaysReturn(x) {
    return function() {
        return x;
    };
}
var always7 = alwaysReturn(7);
var alwaysTrue = alwaysReturn(true);
var alwaysNull = alwaysReturn(null);
function respondWith(str) {
    return function(req, res) {
        res.status(200);
        res.type("text");
        res.send(str);
    };
}
app.get("/good-page", respondWith("Yay!"));
app.get("/bad-page", respondWith("Boo."));
This gets more interesting in our new example: a respondWith function. Here, we pass a
string into the respondWith function, which in turn returns a new function that follows the
standard pattern of an Express callback. Inside that callback, we send 200 OK, set the
response type to plaintext, and send the original string we passed in as a response.
The utility here is that we can very easily generate new callbacks for Express. Here, we
set up /good-page to respond with "Yay!", and /bad-page to respond with "Boo." While
ordinarily we would have to duplicate the anonymous function code to create handlers, we
use respondWith to make our code short.
It's important to notice here that respondWith *does not get called when we go to, e.g.,
/good-page*: respondWith is called *immediately* when we set up the callback. The return
value of respondWith, a new function, is what gets called when we visit /good-page.
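The timing can be checked without a running server. In this sketch, fakeResponse is a hypothetical stand-in for Express's response object that merely records which methods were called; it is an assumption made for illustration, not part of Express.

```javascript
function respondWith(str) {
    return function(req, res) {
        res.status(200);
        res.type("text");
        res.send(str);
    };
}

// Hypothetical stand-in for an Express response: records each call.
function fakeResponse() {
    var calls = [];
    return {
        calls: calls,
        status: function(code) { calls.push("status " + code); },
        type: function(t) { calls.push("type " + t); },
        send: function(body) { calls.push("send " + body); }
    };
}

// respondWith runs right here, once, and hands back the real handler.
var goodHandler = respondWith("Yay!");

// Nothing has been sent yet. The returned function is what would run
// when someone visits the page; we call it by hand to see its effect.
var res = fakeResponse();
goodHandler({}, res);

console.log(res.calls);   // [ 'status 200', 'type text', 'send Yay!' ]
```

The string "Yay!" is remembered by the returned function through its closure, which is why it is still available when the handler finally runs.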
Lazy Evaluation
setTimeout(function(){
    alert("Hi, John!");
}, 5000);
function alertGen(str) {
    return function(){
        alert(str);
    };
}
setTimeout(alertGen("Hi, John!"), 5000);
Here, we define a new function called alertGen (which takes a string), that returns a
new function that just calls alert with the given string (that inner function is the same
as the callback we used on the previous slide). Now, when we go to use setTimeout,
we call alertGen instead of alert. Everything works as expected.
What's important to notice here is that we now have the exact same semantics as our
first example (that didn't work): the only difference in our call to setTimeout here is
the three letters "Gen" in alertGen. Otherwise, it looks as though we're calling the alert
function, just delayed by 5 seconds.
The reason this works is that alertGen actually *does* execute immediately, like alert
did on our first slide. The difference is that alertGen doesn't do anything itself; it just
returns a new function that actually does the real work of showing the alert.
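The contrast between the two calls can be observed outside a browser with a few stand-ins. In this sketch, alert is replaced by a function that records its messages, and later/runPending play the role of setTimeout and the passage of time; all three are assumptions made so the example runs anywhere. The "direct call" line reflects what the first, non-working example presumably looked like, since that slide is not reproduced in these notes.

```javascript
var shown = [];

// Stand-in for the browser's alert: just record the message.
function alert(msg) {
    shown.push(msg);
}

// Stand-ins for setTimeout and for time passing.
var pending = [];
function later(callback) {
    pending.push(callback);
}
function runPending() {
    pending.forEach(function(cb) {
        if (typeof cb === "function") cb();
    });
    pending = [];
}

function alertGen(str) {
    return function() {
        alert(str);
    };
}

// Direct call: alert runs NOW, and later() receives its return value
// (undefined), so there is nothing left to call afterwards.
later(alert("too early"));
var afterDirectCall = shown.slice();

// Generator call: alertGen runs now, but only builds a function.
later(alertGen("on time"));
var afterGenCall = shown.slice();

// "Five seconds later":
runPending();

console.log(afterDirectCall);   // [ 'too early' ]
console.log(afterGenCall);      // [ 'too early' ]
console.log(shown);             // [ 'too early', 'on time' ]
```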
The code on the previous slide can be made very generic, which is called the generator pattern.
Imagine you want to call the function myFunc with parameters p1, p2, and p3, but you instead
want to do it in such a way that you can use it as a callback, i.e., you don't want to call it
*immediately*; you want to give it to someone else (like setTimeout or app.get) to call at a later
time.
To do that, you can pretty much copy the above code verbatim. Create a new function called
myFuncGen (or whatever, really) that takes the parameters of your original function. Then, return
an anonymous function (that takes no parameters) that just calls the original function with the
parameters passed to the generator.
If I now did var foo = myFuncGen(a, b, c), and then later did foo(), it would have the same effect
as calling myFunc(a, b, c). You create a new function that's bound to a set of parameters you
want to use later, and that new function can either be called directly, or used as a callback.
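Written out, the recipe looks like the sketch below. The body of myFunc is a placeholder chosen for illustration. Note that "generator" here is the lecture's name for a function that generates callbacks; it is unrelated to JavaScript's function* generator syntax.

```javascript
// Placeholder for whatever function you want to call later.
function myFunc(p1, p2, p3) {
    return p1 + "-" + p2 + "-" + p3;
}

// Takes the same parameters, but returns a no-argument function that
// performs the real call when it is eventually invoked.
function myFuncGen(p1, p2, p3) {
    return function() {
        return myFunc(p1, p2, p3);
    };
}

var foo = myFuncGen("a", "b", "c");   // nothing has been called yet

console.log(foo());                   // a-b-c
console.log(myFunc("a", "b", "c"));   // a-b-c

// JavaScript's built-in Function.prototype.bind can produce a similar
// pre-bound function without writing the wrapper by hand.
var bar = myFunc.bind(null, "a", "b", "c");
console.log(bar());                   // a-b-c
```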
});
ul.appendChild(li);
While lazy evaluation and the generator pattern are a little esoteric in and of themselves, they actually
have a common application. Consider the above code, running in a web browser, where you
make an Ajax request, then create a bunch of li elements with messages retrieved from the server.
For each message, you add a callback so that when someone clicks on the message, it
pops up an alert with the number of the message that you clicked.
Unfortunately, this code doesn't work. If you run this, and you have (e.g.) 10 messages,
clicking *any* message will pop up an alert that says "I'm message 10". Why is that?
The reason is that our anonymous callback functions that we create in the loop all close
over the same scope (the set of variables a function can see at a given time), which means
they all share the same i variable. Thus, as i keeps increasing with each iteration of the
loop, it increases in *all* functions that closed over it.
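The same behavior can be reproduced without a browser. This sketch is not the slide's code: it assumes a made-up messages array, stores the click handlers in a handlers array instead of attaching them to li elements, and replaces alert with a function that records its messages.

```javascript
var messages = ["first", "second", "third"];   // hypothetical data
var handlers = [];
var alerts = [];

// Stand-in for the browser's alert.
function alert(msg) {
    alerts.push(msg);
}

for (var i = 0; i < messages.length; i++) {
    // Every one of these functions closes over the SAME variable i.
    handlers.push(function() {
        alert("I'm message " + i);
    });
}

// "Click" each message, long after the loop has finished.
handlers[0]();
handlers[1]();
handlers[2]();

console.log(alerts);
// [ "I'm message 3", "I'm message 3", "I'm message 3" ]
```

By the time any handler runs, the loop has already pushed i up to messages.length, so every handler reports that final value.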
});
ul.appendChild(li);
function handlerGen(i) {
    return function() {
        alert("I'm message " + i);
    };
}
The solution is to use the generator pattern. Here, instead of writing our callback
inline, we moved it (unchanged) into a new function called handlerGen, that takes an i
parameter. Then, in our loop, we now call handlerGen with the current value of i.
handlerGen then returns the callback that we want to use.
The reason this works is that handlerGen, being a separate function, has its *own*
value for i; it doesn't use the same one that the loop uses. Thus, each call to
handlerGen returns a *different* callback bound to the value of i during that particular
iteration of the loop.
You don't have to fully understand how this works (though it's pretty cool). The
takeaway here, though, is that if you have this kind of loop, and you find somehow
that every item in your list is doing something that only the last item in your list
should do, you can usually fix it using the generator pattern.
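Here is a browser-free sketch of the fix. As an assumption for illustration, it uses a made-up messages array, collects the handlers in an array rather than attaching them to li elements, and swaps alert for a function that records its messages.

```javascript
var messages = ["first", "second", "third"];   // hypothetical data
var handlers = [];
var alerts = [];

// Stand-in for the browser's alert.
function alert(msg) {
    alerts.push(msg);
}

// Each call gets its own parameter i, separate from the loop's variable.
function handlerGen(i) {
    return function() {
        alert("I'm message " + i);
    };
}

for (var i = 0; i < messages.length; i++) {
    handlers.push(handlerGen(i));   // called now, with the current i
}

// "Click" each message after the loop has finished.
handlers[0]();
handlers[1]();
handlers[2]();

console.log(alerts);
// [ "I'm message 0", "I'm message 1", "I'm message 2" ]
```

In newer versions of JavaScript, declaring the loop variable with let instead of var gives each iteration its own binding and avoids the problem as well, but the generator pattern works everywhere.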
Q&A