Potential performance improvements to our Jest test suite

Preface

After running a few tests locally, I noticed that my resource consumption was higher than usual. At first, I thought this was an issue with the tests I was working on, but it turns out it is because we use the default value for maxWorkers in our CI and local Jest base configurations: Jest's default behavior is to spawn a Node.js worker process for each core available on your machine, even if those processes don't end up being used.

From the Jest docs:

In single run mode, this defaults to the number of the cores available on your machine minus one for the main thread

https://jestjs.io/docs/cli#--maxworkersnumstring

Using my machine as an example

$ nproc
10

That means Jest will spin up 9 workers regardless of how many test files I want to run, which is not ideal if we also have the GDK running in the background.
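As a side note, the worker count can already be limited for a one-off local run through the CLI, since --maxWorkers accepts either a number or a percentage. A minimal example, using the same folder as the benchmarks below:

yarn jest --maxWorkers=50% ee/spec/frontend/ci/runner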

Benchmarks

Using /usr/bin/time, I ran the test suite in a single folder with different maxWorkers configurations.

To run the benchmark

/usr/bin/time -al yarn jest -f ee/spec/frontend/ci/runner --all

And on jest.config.base.js

diff --git a/jest.config.base.js b/jest.config.base.js
index 8642cc22b0bf..db82fbce35d4 100644
--- a/jest.config.base.js
+++ b/jest.config.base.js
@@ -280,5 +280,6 @@ module.exports = (path, options = {}) => {
       ...(IS_EE ? ['<rootDir>/ee/app/assets/javascripts/', ...extRootsEE] : []),
       ...(IS_JH ? ['<rootDir>/jh/app/assets/javascripts/', ...extRootsJH] : []),
     ],
+    maxWorkers: '75%'
   };
 };
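Instead of editing the config between runs, a small shell loop could drive the same comparison through the CLI flag. This is only a sketch: it assumes the --maxWorkers CLI option takes precedence over the config value, and it reuses the macOS /usr/bin/time -l invocation from above.

# Run the same suite at several worker counts; /usr/bin/time prints its report to stderr
for workers in 100% 75% 50%; do
  echo "--- maxWorkers=${workers} ---"
  /usr/bin/time -l yarn jest -f ee/spec/frontend/ci/runner --all --maxWorkers="${workers}"
done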

Default configuration benchmark

8.12 real        34.99 user         8.74 sys
           436453376  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              297795  page reclaims
                  89  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                 162  messages sent
                 123  messages received
                   0  signals received
                  58  voluntary context switches
               78733  involuntary context switches
          1338024450  instructions retired
           500167620  cycles elapsed
            71381888  peak memory footprint

The "control" run. Using all available workers, the tests consume around 436MB of memory and take 35 seconds to run.

With 50% of available workers

7.41 real        25.61 user         5.54 sys
           543260672  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              217827  page reclaims
                  73  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                 148  messages sent
                 111  messages received
                   0  signals received
                 117  voluntary context switches
               50687  involuntary context switches
          1338400168  instructions retired
           506652082  cycles elapsed
            69792640  peak memory footprint

This run uses half the available workers. Its maximum resident set size is actually a bit higher, around 543MB, yet it takes less time than the control run in both wall-clock and CPU terms 🤯

With 75% of available workers

8.14 real        30.93 user         7.34 sys
           483115008  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              257388  page reclaims
                  81  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                 155  messages sent
                 117  messages received
                   0  signals received
                  46  voluntary context switches
               63815  involuntary context switches
          1338417743  instructions retired
           530749738  cycles elapsed
            70283840  peak memory footprint

This run uses 3/4 of the available workers and consumes around 483MB of memory. Its wall-clock time is on par with the control run, it uses less CPU time, and on memory it sits between the other two: less than the 50% run, more than the control.

Where to next?

As a first iteration, I'd like to change the jest.config.base.js file to include a maxWorkers setting, which should make running test suites locally lighter. A much larger iteration would involve working with the Engineering Productivity team to introduce the same change in our CI environment and see whether it improves test run speeds or reduces costs. Perhaps trying this out in GitLab-UI first would be a better fit, as the risk there is lower.
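If we do land the config change, one quick way to confirm which value Jest actually resolves (and to double-check a CLI override) is to inspect the resolved configuration, for example:

yarn jest --showConfig | grep -i maxworkers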

Please let me know your thoughts.