|
| 1 | +--- |
| 2 | +layout: post |
| 3 | +title: "Surprises in GopherJS Performance" |
| 4 | +date: 2015-09-28 |
| 5 | +author: Dmitri Shuralyov |
| 6 | +--- |
| 7 | + |
| 8 | +The GopherJS project first caught my attention about 2 years ago, back when few parts of the Go spec were implemented. |
| 9 | +However, I noticed the incredible pace at which Richard was working, making multiple sophisticated commits per day, as well as fixing reported compiler issues within hours. |
| 10 | +A few months later, I decided to download it and give it a try on a relatively [large pure Go package](https://godoc.org/github.com/shurcooL/markdownfmt/markdown?import-graph&hide=1) for formatting Markdown, and I was quite shocked when it... [simply worked](https://github.com/shurcooL/atom-markdown-format/commit/6b5f21c4457309f8eba3a78b82e0c9a458ff13b4). |
| 11 | + |
| 12 | +Since then, GopherJS has made significant progress, both in feature support (by now, full support for goroutines, channels, the select statement, and the rest of the Go language spec) and in performance, with quite a few leaps. |
| 13 | +For example, in [issue 142](https://github.com/gopherjs/gopherjs/issues/142), I reported a case where the GopherJS performance was pretty bad, taking nearly 30 seconds to do what native Go did in under 100 ms. |
| 14 | +Fast forward just a few days, and Richard came up with optimizations that led to a [10x improvement](https://github.com/gopherjs/gopherjs/issues/142#issuecomment-68664354) in performance! |
| 15 | + |
| 16 | +One day, I was perusing the golang.org home page and decided to play with the [concurrent pi](https://play.golang.org/p/RdbPXQcZHi) sample. |
| 17 | +I wanted to see how much overhead the goroutines added (the sample uses them to demonstrate how lightweight they are compared to threads, but that is still suboptimal for performance), so I converted the program to a purely iterative one. |
| 18 | +It looked like this: |
| 19 | + |
| 20 | +```Go |
| 21 | +// Play with benchmarking a tight loop with many iterations and a func call, compare gc vs GopherJS performance. |
| 22 | +package main |
| 23 | + |
| 24 | +import ( |
| 25 | + "fmt" |
| 26 | + "math" |
| 27 | + "time" |
| 28 | +) |
| 29 | + |
| 30 | +func term(k float64) float64 { |
| 31 | + return 4 * math.Pow(-1, k) / (2*k + 1) |
| 32 | +} |
| 33 | + |
| 34 | +// pi performs n iterations to compute an approximation of pi using math.Pow. |
| 35 | +func pi(n int32) float64 { |
| 36 | + f := 0.0 |
| 37 | + for k := int32(0); k <= n; k++ { |
| 38 | + f += term(float64(k)) |
| 39 | + } |
| 40 | + return f |
| 41 | +} |
| 42 | + |
| 43 | +func main() { |
| 44 | + // Start measuring time from now. |
| 45 | + started := time.Now() |
| 46 | + |
| 47 | + const n = 50 * 1000 * 1000 |
| 48 | + fmt.Printf("approximating pi with %v iterations.\n", n) |
| 49 | + fmt.Println(pi(n)) |
| 50 | + |
| 51 | + fmt.Println("total time taken is:", time.Since(started)) |
| 52 | +} |
| 53 | +``` |
| 54 | + |
| 55 | +I ran the program on my computer: |
| 56 | + |
| 57 | +```bash |
| 58 | +$ go run main.go |
| 59 | +approximating pi with 50000000 iterations. |
| 60 | +3.1415926735902504 |
| 61 | +total time taken is: 8.358498915s |
| 62 | +``` |
| 63 | + |
| 64 | +8.35 seconds to perform 50 million iterations, not bad. |
| 65 | +Then I got curious how long it would take if compiled to JavaScript via GopherJS. |
| 66 | + |
| 67 | +I realized that this is a very tight loop, so any overhead incurred by the conversion of Go to JavaScript would be multiplied and be very visible. |
| 68 | +Still, I was curious, so I fired up GopherJS and ran the same program by compiling it to JavaScript and running it with node: |
| 69 | + |
| 70 | +```bash |
| 71 | +$ gopherjs run main.go |
| 72 | +approximating pi with 50000000 iterations. |
| 73 | +3.1415926735902504 |
| 74 | +total time taken is: 2.317s |
| 75 | +``` |
| 76 | + |
| 77 | +23 seconds, that's actually... wait, WHAT!? |
| 78 | + |
| 79 | +2.3 seconds! That's nearly 4 times faster than the native Go version. |
| 80 | +For a few minutes, I looked at the two numbers in disbelief. |
| 81 | +Then I decided to investigate what's going on. |
| 82 | +Is the same code running in both cases? |
| 83 | +Is the program correct? |
| 84 | +Is node doing something weird? |
| 85 | + |
| 86 | +I tried running it in the [GopherJS Playground](http://www.gopherjs.org/playground/#/K7r0-q_Jwc), which you can also do: |
| 87 | + |
| 88 | +http://www.gopherjs.org/playground/#/K7r0-q_Jwc |
| 89 | + |
| 90 | +I got the same time in the Chrome browser (stable channel). |
| 91 | + |
| 92 | +The calculated value of pi was the same, and after adding some debugging statements I was sure the calculation was indeed correct, and iterations were not being skipped. |
| 93 | + |
| 94 | +But how could it be that taking this Go program and compiling it to JavaScript and executing that would be 4 times faster? |
| 95 | +I had to get to the bottom of it. |
| 96 | + |
| 97 | +The first thing I needed to check: was the same code being run in both cases? |
| 98 | +The entire code is plain Go, with the exception of `math.Pow`. |
| 99 | +So I looked at how [Go implements it](http://gotools.org/math#Pow). |
| 100 | +Pretty straightforward Go code. |
| 101 | +Now I knew GopherJS uses some JavaScript native APIs to implement parts of the standard library, so I checked how [it implemented `math.Pow`](https://github.com/gopherjs/gopherjs/blob/master/compiler/natives/math/math.go#L157). |
| 102 | +Aha! It's not the same code after all. |
| 103 | +GopherJS implements it by using [JavaScript's `Math` object](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math), so it translates to the following JavaScript code: |
| 104 | + |
| 105 | +```JavaScript |
| 106 | +Math.pow(x, y) |
| 107 | +``` |
| 108 | + |
| 109 | +That's when it hit me. |
| 110 | +In this code, which was taken from a snippet optimized for brevity and demonstration purposes rather than performance, `math.Pow` was being called with a first argument of -1 and a second argument taking the values 0, 1, 2, 3, etc., in sequence. |
| 111 | +The output of that is an alternating sequence of 1, -1, 1, -1, 1, -1, etc. |
| 112 | +But using `math.Pow` for that is extremely inefficient, since it's meant to work with arbitrary inputs that are much harder to calculate. |
| 113 | +This can be trivially rewritten with an if statement. |
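To make that equivalence concrete, here is a minimal sketch that checks it for the first few values of k. The helper names `sign` and `agree` are hypothetical and not part of the original program; the sketch only assumes that `math.Pow(-1, k)` evaluates to exactly 1 or -1 for non-negative integer k.

```go
package main

import (
	"fmt"
	"math"
)

// sign is a hypothetical helper (not part of the original program): it
// returns 1 for even k and -1 for odd k.
func sign(k int32) float64 {
	if k%2 == 0 {
		return 1
	}
	return -1
}

// agree reports whether math.Pow(-1, k) and sign(k) match for every k in [0, n).
func agree(n int32) bool {
	for k := int32(0); k < n; k++ {
		if math.Pow(-1, float64(k)) != sign(k) {
			return false
		}
	}
	return true
}

func main() {
	// Print both values side by side for the first few k.
	for k := int32(0); k < 6; k++ {
		fmt.Println(k, math.Pow(-1, float64(k)), sign(k))
	}
	fmt.Println("all equal for k < 1000:", agree(1000))
}
```

A parity check does the same job as the general-purpose power function here, without any floating-point exponentiation.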
| 114 | + |
| 115 | +So, in order to ensure the same code runs in both cases, I did that and tried this program: |
| 116 | + |
| 117 | +```Go |
| 118 | +// Play with benchmarking a tight loop with many iterations and a func call, compare gc vs GopherJS performance. |
| 119 | +// |
| 120 | +// An alternative more close-to-metal implementation that doesn't use math.Pow. |
| 121 | +package main |
| 122 | + |
| 123 | +import ( |
| 124 | + "fmt" |
| 125 | + "time" |
| 126 | +) |
| 127 | + |
| 128 | +func term(k int) float64 { |
| 129 | + if k%2 == 0 { |
| 130 | + return 4 / (2*float64(k) + 1) |
| 131 | + } else { |
| 132 | + return -4 / (2*float64(k) + 1) |
| 133 | + } |
| 134 | +} |
| 135 | + |
| 136 | +// pi performs n iterations to compute an approximation of pi. |
| 137 | +func pi(n int) float64 { |
| 138 | + f := 0.0 |
| 139 | + for k := int(0); k <= n; k++ { |
| 140 | + f += term(k) |
| 141 | + } |
| 142 | + return f |
| 143 | +} |
| 144 | + |
| 145 | +func main() { |
| 146 | + // Start measuring time from now. |
| 147 | + started := time.Now() |
| 148 | + |
| 149 | + const n = 1000 * 1000 * 1000 |
| 150 | + fmt.Printf("approximating pi with %v iterations.\n", n) |
| 151 | + fmt.Println(pi(n)) |
| 152 | + |
| 153 | + fmt.Println("total time taken is:", time.Since(started)) |
| 154 | +} |
| 155 | +``` |
| 156 | + |
| 157 | +Let's try that: |
| 158 | + |
| 159 | +```bash |
| 160 | +$ go run main.go |
| 161 | +approximating pi with 1000000000 iterations. |
| 162 | +3.1415926545880506 |
| 163 | +total time taken is: 10.916861037s |
| 164 | + |
| 165 | +$ gopherjs run main.go |
| 166 | +approximating pi with 1000000000 iterations. |
| 167 | +3.1415926545880506 |
| 168 | +total time taken is: 6.585s |
| 169 | +``` |
| 170 | + |
| 171 | +I had to bump up the number of iterations to 1 billion, because this code runs so much faster than the naive `math.Pow`-using version, in both cases. |
| 172 | +But the GopherJS version is still faster. |
| 173 | + |
| 174 | +Aha, then I realized that [GopherJS emulates a 32-bit architecture](https://github.com/gopherjs/gopherjs#architecture). |
| 175 | +But I'm running native Go on a 64-bit machine. |
| 176 | +So the size of `int` is 32-bit for GopherJS code but 64-bit for Go code. |
| 177 | +Let's make it use `int32` consistently and try again: |
| 178 | + |
| 179 | +```bash |
| 180 | +$ go run main.go |
| 181 | +approximating pi with 1000000000 iterations. |
| 182 | +3.1415926545880506 |
| 183 | +total time taken is: 6.658s |
| 184 | + |
| 185 | +$ gopherjs run main.go |
| 186 | +approximating pi with 1000000000 iterations. |
| 187 | +3.1415926545880506 |
| 188 | +total time taken is: 6.549s |
| 189 | +``` |
| 190 | + |
| 191 | +As expected, the GopherJS time did not change because it was a no-op, but the native Go performance has now caught up to the GopherJS version! |
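The post does not show the `int32` edit itself, so the following is only a sketch of what it might look like, assuming the only change is swapping `int` for `int32` in `term` and `pi`. It also prints `strconv.IntSize`, which reports the width of `int` on the platform the program is compiled for. The iteration count is reduced here so the sketch finishes quickly.

```go
package main

import (
	"fmt"
	"strconv"
)

// term returns the k-th term of the series, using a fixed-width int32 index
// so the integer arithmetic has the same width under gc and GopherJS.
func term(k int32) float64 {
	if k%2 == 0 {
		return 4 / (2*float64(k) + 1)
	}
	return -4 / (2*float64(k) + 1)
}

// pi performs n iterations to compute an approximation of pi.
func pi(n int32) float64 {
	f := 0.0
	for k := int32(0); k <= n; k++ {
		f += term(k)
	}
	return f
}

func main() {
	// strconv.IntSize is the size in bits of int on the target platform.
	fmt.Println("int is", strconv.IntSize, "bits wide")

	const n = 1000 * 1000 // reduced from 1 billion to keep the sketch quick
	fmt.Println(pi(n))
}
```

Using an explicit width removes the platform-dependent size of `int` from the comparison, so both toolchains do the same integer work per iteration.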
| 192 | + |
| 193 | +Just to be sure, I wanted to see whether 6.5 seconds was as fast as these 1 billion iterations could run, even if implemented in a low-level language like C: |
| 194 | + |
| 195 | +```C |
| 196 | +#include <stdio.h> |
| 197 | +#include <time.h> |
| 198 | + |
| 199 | +double term(int k) { |
| 200 | + if (k%2 == 0) { |
| 201 | + return 4.0 / (2.0*(double)(k) + 1.0); |
| 202 | + } else { |
| 203 | + return -4.0 / (2.0*(double)(k) + 1.0); |
| 204 | + } |
| 205 | +} |
| 206 | + |
| 207 | +// pi performs n iterations to compute an approximation of pi. |
| 208 | +double pi(int n) { |
| 209 | + double f = 0.0; |
| 210 | + for (int k = 0; k <= n; k++) { |
| 211 | + f += term(k); |
| 212 | + } |
| 213 | + return f; |
| 214 | +} |
| 215 | + |
| 216 | +int main() { |
| 217 | + int n = 1000 * 1000 * 1000; |
| 218 | + printf("approximating pi with %d iterations.\n", n); |
| 219 | + printf("%.16f\n", pi(n)); |
| 220 | + |
| 221 | + return 0; |
| 222 | +} |
| 223 | +``` |
| 224 | + |
| 225 | +The [timing library](http://en.cppreference.com/w/c/chrono) of C isn't as friendly to use as the Go time package, so I gave up and just used `time` instead: |
| 226 | + |
| 227 | +```bash |
| 228 | +$ gcc main.c |
| 229 | +$ time ./a.out |
| 230 | +approximating pi with 1000000000 iterations. |
| 231 | +3.1415926545880506 |
| 232 | + |
| 233 | +real 0m11.385s |
| 234 | +user 0m11.377s |
| 235 | +sys 0m0.006s |
| 236 | +``` |
| 237 | + |
| 238 | +11.3 seconds? Slower? |
| 239 | +Ah, of course, I was too used to the `go` build tool, which optimizes by default, and forgot that C compilers don't do that unless asked. |
| 240 | + |
| 241 | +```bash |
| 242 | +$ gcc -O3 main.c |
| 243 | +$ time ./a.out |
| 244 | +approximating pi with 1000000000 iterations. |
| 245 | +3.1415926545880506 |
| 246 | + |
| 247 | +real 0m6.434s |
| 248 | +user 0m6.427s |
| 249 | +sys 0m0.004s |
| 250 | +``` |
| 251 | + |
| 252 | +Nice, it's the same time as the Go and GopherJS versions. |
| 253 | +That means a few things. |
| 254 | + |
| 255 | +The V8 JavaScript engine is incredible. |
| 256 | +It's able to take Go code that has been compiled to JavaScript, and just-in-time compile it to machine instructions that are as efficient as those produced by the native Go compiler. |
| 257 | + |
| 258 | +The JavaScript `Math.pow` implementation is faster than Go's `math.Pow` when the value of x is -1 and the values of y are integers. |
| 259 | +I haven't compared its performance for other inputs; let me know if you do. |
| 260 | +However, using `Pow` with such inputs is silly and should not be done, as you can see by the 50 million to 1 billion iteration increase when rewriting it with an equivalent if statement. |
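One way to see the gap on your own machine is a rough sketch like the one below. The function names `sumPow` and `sumBranch` and the iteration count are assumptions for illustration, and wall-clock timing of this kind is only a coarse measurement, not a rigorous benchmark.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// sumPow accumulates the series using math.Pow to produce the alternating sign.
func sumPow(n int32) float64 {
	f := 0.0
	for k := int32(0); k <= n; k++ {
		f += 4 * math.Pow(-1, float64(k)) / (2*float64(k) + 1)
	}
	return f
}

// sumBranch accumulates the same series using a parity check instead.
func sumBranch(n int32) float64 {
	f := 0.0
	for k := int32(0); k <= n; k++ {
		if k%2 == 0 {
			f += 4 / (2*float64(k) + 1)
		} else {
			f += -4 / (2*float64(k) + 1)
		}
	}
	return f
}

func main() {
	const n = 2 * 1000 * 1000 // kept small so the sketch finishes quickly

	started := time.Now()
	a := sumPow(n)
	powTime := time.Since(started)

	started = time.Now()
	b := sumBranch(n)
	branchTime := time.Since(started)

	fmt.Println("math.Pow:", a, powTime)
	fmt.Println("branch:  ", b, branchTime)
}
```

Running it with both `go run` and `gopherjs run` should show how each toolchain handles the two variants; the absolute numbers will depend on your hardware and tool versions.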
| 261 | + |
| 262 | +You can try the final optimized version of the program in your browser via the GopherJS Playground: |
| 263 | + |
| 264 | +http://www.gopherjs.org/playground/#/sDEYM2TwC7 |
| 265 | + |
| 266 | +It's fascinating to think about what happens when you do that. |
| 267 | +The GopherJS compiler, written in pure Go, has compiled itself to JavaScript, which runs in your browser. |
| 268 | +That compiler takes your input Go program, compiles it to JavaScript and runs it. |
| 269 | +The V8 engine (or whatever JavaScript engine your browser uses) takes the generated JavaScript and JITs it to machine code that performs on par with a low-level C implementation compiled with -O3, the max optimization setting. |
| 270 | + |
| 271 | +There are still cases where the code GopherJS generates does not translate to something JS engines can optimize really well. |
| 272 | +For example, in [issue 276](https://github.com/gopherjs/gopherjs/issues/276), the GopherJS version runs an unusual 1000x slower than the native version. |
| 273 | +But I'm sure with some work, significant performance improvements can be made there, and in most other cases the performance is much better. |
| 274 | + |
| 275 | +With the prospects of asm.js and the upcoming WebAssembly, I think there's a bright future for Go as a viable choice for the browser. |
| 276 | +I suggest you give it a try for your next little frontend project, or play with compiling any pure Go package to run in the browser. |
| 277 | +You may end up being pleasantly surprised, like I was. |