
Martin Joo - Performance with Laravel

Measuring performance
ab
jmeter
Inspector
Telescope
OpenTelemetry
XDebug + qcachegrind
Clockwork
htop
How to start measuring performance?
N+1 queries
Solutions
Prevent lazy loading
Let the client decide what it wants
Multiple resources
Pagination
Cursor pagination
Database indexing
Theory
Arrays
Linked list
Binary tree
Binary search tree (BST)
Indexing in the early days
Single-level indexing
Multi-level indexing
B-Tree
Problems with B-Trees
B+ Trees
Index access types
const
range
range (again)
index
ALL
Select *
Composite indexes
Cardinality
Database indexing in practice
Listing posts by status
Feed
Publishing posts
Avoiding memory problems
Avoiding spamming the database
Measuring performance
Async workflows
Web scraping with jobs
Concurrent programming
fork
Concurrent HTTP requests
Queues and workers


supervisor
Multiple queues and priorities
Optimizing worker processes
Chunking large datasets
Exports
Imports
Generators & LazyCollections
PHP generators
Imports with generators
Imports with LazyCollections
Reading files
Deleting records
Miscellaneous
fpm processes
nginx cache
Caching static content
Caching fastcgi responses
MySQL slow query log
Monitoring database connections
Docker resource limits
Health check monitors


Measuring performance
Before we talk about how to optimize performance we need ways to effectively measure it. But even before
we can measure it we need to know what exactly we want to measure.

Here are some of the most important performance measures of an API/backend service:

Throughput: the number of requests the system can handle without going down.

Load time: the amount of time it takes for an HTTP request to respond.

Size: the total size of the HTTP response.

Server uptime: the duration of time the server is up and running usually expressed as a percentage.

CPU usage: the amount of CPU your system needs to run. It is usually expressed as load average which
I'm gonna explain later.

Memory usage: the amount of memory your system uses.

In this book, we're going to talk about backend and APIs but of course there are some frontend-related
metrics as well:

Load time: the amount of time it takes for the full page to load.

First byte: the time taken to start loading the data of your web application after a user requests it.

Time to interactive: this measures how long it takes a page to become fully interactive, i.e., the time
when layout has stabilized, key web fonts are visible, and the main thread is available enough to
handle user input.

Page size: the total size of the web page, which includes all of its resources (HTML, CSS, JavaScript, images, etc.).

Number of requests: the number of individual requests made to the server to fully load the web page.

These things are "black box measures" or "external measures." Take load time as an example and say the GET /api/products endpoint took 912ms to load, which is slow. Measuring the load time tells you that your system is slow, but it doesn't tell you why. To find out the cause we need to dig deeper into the black box.
We need to debug things such as:

Number of database queries

The execution time of database queries

Which function takes a long time to finish its job?

Which function uses more memory than it's supposed to?

What parts of the system can be async?

and so on

Measuring a system from the outside (for example load time of an API endpoint) is always easier than
measuring the internal parts. This is why we start with the external measures first.
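
When you do get to the internal parts, you don't necessarily need a full-blown tool right away. Even a few lines in a service provider can log every query and its duration while you're debugging locally. A minimal sketch (where you register it and what you log is up to you):

use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Log;

// e.g. in AppServiceProvider::boot(), for local debugging only
DB::listen(function ($query) {
    Log::debug('query executed', [
        'sql' => $query->sql,
        'time_ms' => $query->time,
    ]);
});

The tools in the next sections do the same thing in a much more convenient way, so treat this as a quick-and-dirty option.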


ab
The easiest tool to test your project's performance is ab or Apache Benchmark. It's a command line tool
that sends requests to a given URL and then shows you the results.

You can use it like this:

ab -n 100 -c 10 -H "Authorization: Bearer 1|A7dIitFpmzsDAtwEqmBQzDtfdHkcWCTfGCvO197u" http://127.0.0.1:8000/api/transactions

This sends 100 requests to http://127.0.0.1:8000/api/transactions with a concurrency level of 10.


This means that 10 of those requests are concurrent. They are sent at the same time. These concurrent
requests try to imitate multiple users using your API at the same time. It will send 10 requests at a time until
it reaches 100.

Unfortunately, in ab we cannot specify the ramp-up time. This is used to define the total time in which the
tool sends the requests to your app. For example, "I want to send 100 requests in 10 seconds." You cannot
do that with ab . It will always send requests when it can. So if the first batch of the concurrent requests
(which is 10 requests in this example) is finished in 3 seconds then it sends the next batch and so on. Other
than that, it's the perfect tool to quickly check the throughput of your application.

And now, let's interpret the results:

Concurrency Level: 10
Time taken for tests: 2.114 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 1636000 bytes
HTML transferred: 1610100 bytes
Requests per second: 47.31 [#/sec] (mean)
Time per request: 211.363 [ms] (mean)
Time per request: 21.136 [ms] (mean, across all concurrent requests)
Transfer rate: 755.88 [Kbytes/sec] received

As you can see, it sent a total of 100 requests with a concurrency level of 10. The whole test took 2114ms or
2.114 seconds. If we divide 100 by 2.114 seconds the result is 47.31. This is the throughput of the server.
It can handle 47 requests per second.

The next two numbers were quite hard for me to understand at first. They are:

Time per request: 211.363 [ms] (mean)

Time per request: 21.136 [ms] (mean, across all concurrent requests)


When you run ab -n 100 -c 10, ab creates 10 request "groups" that contain 10 requests each:

In this case Time per request: 21.136 [ms] (mean, across all concurrent requests) means that 1
request took 21ms on average. This is the important number.

The other Time per request: 211.363 [ms] (mean) refers to a request group, which contains 10
requests. You can clearly see the correlation between these numbers:

Time taken for tests: 2114 ms

All 100 requests took a total of 2114 ms

Time per request: 21.136 [ms] (mean, across all concurrent requests)

2114 ms / 100 requests = 21.14 ms per request

One request took 21.14 ms on average

Time per request: 211.363 [ms] (mean)

2114 ms / 100 requests * 10 = 211.4 ms

One request group of 10 requests took 211 ms on average

So if you use concurrency the last number doesn't really make sense. It was really confusing for me at first,
so I hope I gave you a better explanation.

ab is a fantastic tool because it is:

Easy to install

Easy to use

You can load test your app in minutes and get quick results

But of course it has lots of limitations.


jmeter
The next tool to load test your application is jmeter . It has more advanced features than ab including:

Defining ramp-up period

Building a "pipeline" of HTTP requests simulating complex user interactions

Adding assertions (response validation) to your HTTP tests

Better overview of your app's performance

Test results visualization using charts, graphs, tree view, etc.

Other useful testing features such as XPath, regular expression, JSON, script variables, and response
parsing, which help to build more exact and effective tests.

GUI

A quick note: if you're having trouble starting jmeter, try this command with the UseG1GC argument: java -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -jar ApacheJMeter.jar. You can also use this alias:
alias jmeter='JVM_ARGS="-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC" /path/to/jmeter/bin/jmeter'

To start load testing with jmeter you need to create a new Thread group that has the following options:

Number of threads refers to the number of users or, to put it simply, the number of requests.

Ramp-up period defines how much time jmeter should take to start the requests. If 10 threads are
used, and the ramp-up period is 100 seconds, then jmeter will take 100 seconds to get all 10 threads
up and running. Each thread will start 10 (100/10) seconds after the previous one.

And then we have Loop count. By default it's 1, meaning that jmeter runs your HTTP tests once. If you
set it to 100 it repeats all of the tests 100 times.

Note: you can find the example test plan in the source code 1-measuring-performance/simple-jmeter-
test-plan.jmx

Inside the Thread group we need to add an HTTP Sampler which can be found inside the Sampler
category. An HTTP request is pretty straightforward. You just need to configure the base URL, the endpoint,
and query or POST body parameters if you have any.

In order to measure the results we need to add Listeners as well. These are the components that can
display the results in various formats. Two of the most crucial listeners are the Summary Report and the
View Results Tree. Add them to the Thread group.

Summary report looks like this:


Average: the average response time in ms

Median: the median (50th percentile) response time

Max: the slowest response time

Std. Dev.: the standard deviation during the test. In general, standard deviation shows you the
amount of variation or dispersion of a set of values. In performance testing, the standard deviation
indicates how much individual sample times deviate from the average response time. In this example,
27 indicates that the individual sample times in the dataset are, on average, 27 units away from the
mean of the dataset. This means that there is a relatively high level of variability or dispersion in the
sample times.

Error%: the percentage of requests that resulted in an error (non-2xx)

Throughput: the number of requests per minute that can be served by the server

Received KB/Sec: the amount of data received per second during the test.

So the summary report gives you a quick overview of the overall results of your tests.

View Results Tree on the other hand enables you to check out individual requests which can be helpful if
you have 5xx responses. It looks like this:


The last thing you probably need is to send the Authorization header in your requests. In jmeter there's a
dedicated component to set header values. It's called HTTP Header Manager and can be found in the
Managers category. Setting it up is really easy: you just need to add the header's name and value.

So a simple but working test plan looks like this:

Note: you can find this example test plan in the source code 1-measuring-performance/simple-jmeter-
test-plan.jmx


Inspector
ab and jmeter are great if you want to understand the throughput and overall responsiveness of your
application. Let's say you found out that the GET /api/transactions endpoint is slow. Now what? You
open the project, go to the Controller, and try to find the slow part. You might add some dd or time() calls and so
on. Fortunately, there's a better way.

Inspector.dev allows you to visualize the internals of your applications.

For example, here's a time distribution of the different occurrences of the GET /api/transactions request:

I sent the request 11 times:

7 times it took only 20-50ms. You can see these occurrences on the left side on the 0ms mark.

3 times it took something between 500ms and 2500ms. These are the 3 smaller bars.

And then one time it took almost 15 seconds. This is the lonely bar at the right side.

If I click on these bars I can quickly see what's the difference between a 29ms and a 2300ms request:


In the left panel you can see that only 4 mysql queries were executed and the request took 29ms to
complete. On the right side, however, there were 100 queries executed and it took 2.31 seconds. You can
see the individual queries as well. On the right side there are these extra select * from products queries
that you cannot see on the left side.

In the User menu you can always check out the ID of the user that sent the request. It's a great feature
since user settings and different user data can cause differences in performance.

If it's a POST request you can see the body in the Request Body menu:

Another great feature of Inspector is that it also shows outgoing HTTP requests and dispatched jobs in the
timeline:


In my example application, the POST /api/transactions endpoint communicates with other APIs and also
dispatches a job. These are the highlighted rows on the image.

The great thing about Inspector is that it integrates well with Laravel so it can detect things like your queue
jobs:


You can dig into the details of a job just like an HTTP request:

You have the same comparison view with all the database queries, HTTP requests, or other dispatched
jobs:


The best thing about Inspector? This is the whole installation process:

composer require inspector-apm/inspector-laravel

It's an awesome tool to monitor your application in production. Easy to get started and easy to use. It gives
you a great overview and you can dig deeper if you need to. It integrates well with Laravel so you'll see your
HTTP requests, commands, and jobs out of the box.

Check out their docs here.


Telescope
Even though Inspector is awesome, it's a paid 3rd party tool, so I understand not everyone wants to use it.

One of the easiest tools you can use to monitor your app is Laravel Telescope.

After you've installed the package you can access the dashboard at localhost:8000/telescope. If you
send some requests you'll see something like this:

It gives you a great overview of your requests and their duration. What's even better, if you click on a
specific request you can see all the database queries that were executed:

If you click on an entry you can see the whole query and the request's details:


Telescope can also monitor lots of other things such as:

Commands

Jobs

Cache

Events

Exceptions

Logs

Mails

...and so on

For example, here are some queue jobs after a laravel-excel export:


Telescope is a great tool and it's a must have if you want to monitor and improve your app's performance. If
you want to use only free and simple tools go with ab and Telescope. ab tells you what part of the app is
slow. Telescope tells you why it's slow.
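
One thing to keep in mind: Telescope records a lot of data, which is perfectly fine locally but can be heavy in production. The TelescopeServiceProvider stub that php artisan telescope:install publishes contains a filter roughly like this one (shown here as a sketch; adjust the conditions to your own needs) so that outside of the local environment only the interesting entries are stored:

use Laravel\Telescope\IncomingEntry;
use Laravel\Telescope\Telescope;

// app/Providers/TelescopeServiceProvider.php
Telescope::filter(function (IncomingEntry $entry) {
    if ($this->app->environment('local')) {
        // Record everything on your local machine
        return true;
    }

    // In other environments only keep the entries you really care about
    return $entry->isReportableException() ||
        $entry->isFailedRequest() ||
        $entry->isFailedJob() ||
        $entry->isScheduledTask() ||
        $entry->hasMonitoredTag();
});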


OpenTelemetry
Both Inspector and Telescope track everything by default, which is a great thing. However, sometimes you
might want to control what's being tracked and what isn't.

To do that, the best option in my opinion is OpenTelemetry. OpenTelemetry is an Observability framework
and toolkit designed to create and manage telemetry data such as traces, metrics, and logs. It's
independent of languages, tools, or vendors. It offers a standardized specification and protocol that can be
implemented in any language.

There are two important OpenTelemetry terms, they are:

Traces

Spans

A trace is a set of events. It's usually a complete HTTP request and contains everything that happens inside.
Imagine your API endpoint sends an HTTP request to a 3rd party, dispatches a job, sends a Notification and
runs 3 database queries. All of this is one trace. Every trace has a unique ID.

A span is an operation inside a trace. The HTTP request to the 3rd party can be a span. The dispatched job
can be another one. The notification can be the third one. And finally you can put the 3 queries inside
another span. Each span has a unique ID and they contain the trace ID. It's a parent-child relationship.

We can visualize it like this:

So it's similar to Inspector, however, it requires manual instrumentation. Instrumentation means you need
to start and stop the traces manually, and then you need to add spans as you like. So it requires more
work but you can customize it as you wish.

OpenTelemetry offers a PHP SDK. However, the bare bone framework is a bit complex to be honest, so I'll
use a simple but awesome Spatie package to simplify the whole process. It's called laravel-open-telemetry.

The installation steps are easy:

composer require spatie/laravel-open-telemetry


php artisan open-telemetry:install


To start using it we need to manually add the spans:

Measure::start('Communicating with 3rd party');

Http::get('...');

Measure::stop('Communicating with 3rd party');

The start method starts a new span. Behind the scenes a unique trace ID will be generated at the start of
every request. When you call Measure::start() a span will be started that will get that trace id injected.
So we only worry about spans. Traces are handled by the package.

But what happens with these traces and spans? How can I view them? Great question!

The collected data needs to be stored somewhere and needs a frontend. We need to run some kind of store
and connect it to the Spatie package. There are multiple tracing systems that handle OpenTelemetry data.
For example, ZipKin or Jaeger. I'm going to use ZipKin since it's the simplest to set up locally. All we need
to do is this:

docker run -p 9411:9411 openzipkin/zipkin

Now the Zipkin UI is available at http://localhost:9411.

In the open-telemetry.php config file we can configure the driver:

'drivers' => [
    Spatie\OpenTelemetry\Drivers\HttpDriver::class => [
        'url' => 'http://localhost:9411/api/v2/spans',
    ],
],

Now Spatie will send the collected metrics to localhost:9411 where ZipKin listens.

Let's see an example of how we can add these spans. When you purchased this book (thank you very much!)
you interacted with Paddle even if you didn't realize it. It's a merchant of record, meaning you paid them
and they send me the money once a month. This way, I only worry about one invoice a month. They also
handle VAT ramifications.

So imagine an endpoint where we can buy a product; let's call it POST /api/transactions. The request
looks like this:


namespace App\Http\Requests;

class StoreTransactionRequest extends FormRequest
{
    public function rules(): array
    {
        return [
            'product_id' => ['required', 'exists:products,id'],
            'quantity' => ['required', 'numeric', 'min:1'],
            'customer_email' => ['required', 'email'],
        ];
    }
}

It's a simplified example, of course. When someone buys a product we need to do a number of things:

Calculating the VAT based on the customer's IP address

Inserting the actual DB record

Triggering a webhook meaning we call some user defined URLs with the transaction's data

Moving money via Stripe. We'll skip that part now.

Calculating the VAT involves talking to 3rd party services (for example VatStack). The transactions table
can be huge in an application like this so it's a good idea to place a span that contains this one query
specifically.

We can add spans like these:

public function store(
    StoreTransactionRequest $request,
    VatService $vatService,
    Setting $setting
) {
    Measure::start('Create transaction');

    $product = $request->product();

    /** @var Money $total */
    $total = $product->price->multiply($request->quantity());

    Measure::start('Calculate VAT');

    $vat = $vatService->calculateVat($total, $request->ip());

    Measure::stop('Calculate VAT');

    $feeRate = $setting->fee_rate;

    $feeAmount = $total->multiply((string) $feeRate->value);

    Measure::start('Insert transaction');

    $transaction = Transaction::create([
        'product_id' => $product->id,
        'quantity' => $request->quantity(),
        'product_data' => $product->toArray(),
        'user_id' => $product->user_id,
        'stripe_id' => Str::uuid(),
        'revenue' => $total,
        'fee_rate' => $feeRate,
        'fee_amount' => $feeAmount,
        'tax_rate' => $vat->rate,
        'tax_amount' => $vat->amount,
        'balance_earnings' => $total
            ->subtract($vat->amount)
            ->subtract($feeAmount),
        'customer_email' => $request->customerEmail(),
    ]);

    Measure::stop('Insert transaction');

    try {
        if ($webhook = Webhook::transactionCreated($request->user())) {
            SendWebhookJob::dispatch($webhook, 'transaction_created', [
                'data' => $transaction,
            ]);
        }
    } catch (Throwable) {}

    Measure::stop('Create transaction');

    return response([
        'data' => TransactionResource::make($transaction)
    ], Response::HTTP_CREATED);
}

Here's the result in ZipKin:

The trace is called laravel: create transaction where laravel comes from the default config of the
package and create transaction comes from the first span.

There are four spans:

create transaction refers to the whole method

calculate vat tracks the communication with VatStack

insert transaction tracks the DB query

And finally SendWebhookJob was recorded by the package automatically. It tracks every queue job by
default and puts them into the right trace. It's a great feature of the Spatie package.

Unfortunately, it's not perfect. You can see the duration is 1.307s in the upper left corner, which refers to the
duration of the whole trace. But it's not accurate since the operation took only 399ms plus 78ms for the job. Since
the job is async, there's a delay between dispatching it and the start of its execution by the worker process. I
honestly don't know how we can overcome this problem.

For comparison, here's the same endpoint in Inspector:


The duration is much more accurate and I think the timeline is also better. Of course, it's more detailed. If these
small 5ms segments are annoying I have good news. You can group them using segments:

$vat = $vatService->calculateVat($total, $request->ip());

inspector()->addSegment(function () {
    $feeRate = $setting->fee_rate;
    $feeAmount = $total->multiply((string) $feeRate->value);
    // ...
}, 'process', 'Group #1');

Group #1 will be displayed in Inspector as the name of this segment. Instead of 10 small segments you'll
see only one. It's a great feature if you're in the middle of debugging and you want to see less stuff to have a
better overview of your endpoint.

To sum it up:

OpenTelemetry is a great tool to profile your apps

You have to add your own spans which gives you great customization


On the other hand, you have to invest some time upfront

It's free. However, you still need to deploy ZipKin somewhere

Compare it to Inspector:

You just install a package

You'll see every detail

You can still add your own segments to customize the default behavior

But of course, it's a paid service


XDebug + qcachegrind
The next profiling tool is the lowest-level of all. It might not be the most useful but I felt I had to include at least
one low-level tool.

XDebug can be used in two ways:

Step debugging your code

Profiling your code

Lots of people know about the step debugging aspect of it. And it's great. I think you should set it up and
use it. Here's a Jeffrey Way video that teaches you the whole process in 11 minutes and 39 seconds.

The other feature of XDebug is profiling. When you send a request to your app it can profile every method
call and create a pretty big data structure out of it that can be viewed and analysed for performance
problems. The program that allows you to view these structures is called qcachegrind on Mac and
kcachegrind on Linux.

You can install these two programs on Mac by running these:

pecl install xdebug


brew install qcachegrind

If you run php -v you should see something like this:

After that we need to configure XDebug in php.ini :

zend_extension=xdebug
xdebug.profiler_enabled=1
xdebug.mode=profile,debug
xdebug.profiler_output_name=cachegrind.out.%c
xdebug.start_upon_error=yes
xdebug.client_port=9003
xdebug.client_host=127.0.0.1


xdebug.mode=profile makes it listen to our requests and create a function call map.
xdebug.profiler_output_name is the name of the file that it creates in the /var/tmp directory. client_port and
client_host are only needed for step debugging.

If you're not sure where your php.ini file is, run this:

php --ini

As you can see, I didn't add the XDebug config in the php.ini file but created an ext-xdebug.ini file in
the conf.d folder that is automatically loaded by PHP.

Now you need to restart your php artisan serve , or Docker container, or local fpm installation. If you did
everything right phpinfo() should include the XDebug extension.

Now all you need to do is send a request to your application. After that, you should see a new file inside the
/var/tmp directory:

Let's open it:

qcachegrind cachegrind.out.1714780293

We see something like that:


It's a little bit old school, it's a little ugly, but it's actually pretty powerful.

On the left side you see every function that was invoked during the requests.

On the right side you see the full call graph.

The great thing about the call graph is that it includes the time spent in the given function. Take a look at
this:


This is the part where Laravel dispatches my TransactionController class and calls the index method in
which I put a sleep function. 40% of the time was spent in the sleep function which is expected in this
case.

The great thing about XDebug+qcachegrind is that you can really dig deep into your application's behavior.
However, I think in most cases it's unnecessary. With Telescope or Inspector you'll get a pretty great
overview of your performance problems. In a standard, high-level, "business" application your problems will
most likely be related to the database, and Telescope or Inspector are just better tools to profile these kinds of
problems.

However, XDebug+qcachegrind can teach us a few things. For example, I never realized this:

These are the functions that were executed during the request. I highlighted four of them:

MoneyCast::set was called 500 times

MoneyForHuman::from was called 200 times

MoneyForHuman::__construct was called 200 times

MoneyCast::get was called 200 times

Let me give you some context. These examples come from a financial app. The request I was testing is the GET
/api/transactions. It returns 50 transactions. A transaction record looks like this:


ID    product_id    quantity    revenue
1     1             2           1800
2     1             1           900
3     2             1           2900

It contains the sales of a product. It has other columns as well.

The Transaction model uses some value object casts:

class Transaction extends Model
{
    use HasFactory;

    protected $casts = [
        'quantity' => 'integer',
        'revenue' => MoneyCast::class,
        'fee_rate' => PercentCast::class,
        'fee_amount' => MoneyCast::class,
        'tax_rate' => PercentCast::class,
        'tax_amount' => MoneyCast::class,
        'balance_earnings' => MoneyCast::class,
        'product_data' => 'array',
    ];
}

MoneyCast is just a Cast that uses the Money value object from the moneyphp package:


class MoneyCast implements CastsAttributes
{
    public function get(Model $model, string $key, mixed $value, array $attributes): mixed
    {
        return Money::USD($value);
    }

    public function set(Model $model, string $key, mixed $value, array $attributes): mixed
    {
        return $value->getAmount();
    }
}

Pretty simple. The database stores scalar values and this Cast casts them into value objects.

The TransactionController returns TransactionResource objects:

class TransactionResource extends JsonResource
{
    public function toArray(Request $request): array
    {
        return [
            'uuid' => $this->uuid,
            'quantity' => $this->quantity,
            'revenue' => MoneyForHuman::from($this->revenue)->value,
            'fee' => MoneyForHuman::from($this->fee_amount)->value,
            'tax' => MoneyForHuman::from($this->tax_amount)->value,
            'balance_earnings' => MoneyForHuman::from(
                $this->balance_earnings
            )->value,
            'customer_email' => $this->customer_email,
        ];
    }
}


There are these MoneyForHuman calls. It's just another value object that formats Money objects.

The important part is that the Controller returns only 50 transactions:

return TransactionResource::collection(
    $request->user()->transactions()->paginate(50),
);

Returning only 50 transactions resulted in 1,100 calls to these objects and functions!

It's crazy. If I put something in one of these classes that takes only 50ms the whole request will take an extra
55,000ms to complete. That is an extra 55 seconds.

Let's try it out!

These are the base results without slowing down the functions:

I sent only one request and it took 278ms to complete. Of course, it will vary but it's good enough.

And now I put 3 usleep(55000); calls in the code:

class MoneyForHuman
{
    public function __construct(private readonly Money $money)
    {
        usleep(55000);
        // ...
    }
}


class MoneyCast implements CastsAttributes
{
    public function get(): mixed
    {
        usleep(55000);
        // ...
    }

    public function set(): mixed
    {
        usleep(55000);
        // ...
    }
}

At the first try, ab's timeout was exceeded, which is 30 seconds by default:

Let's increase it:

ab -n 1 -s 300 -H "Authorization: Bearer 5|JkQOThREkfVgcviCdfEEAU74WRyGHo1ZuKujG4fA" http://127.0.0.1:8000/api/transactions

And the results are:


The request took 53.5 seconds to complete.

So even though XDebug+qcachegrind can be a little bit too low-level for 95% of our usual performance
problems, as you can see, it can help us see the small details that can ruin the performance of our
applications in some cases.

If you want to learn more about XDebug+qcachegrind check out this live stream from the creator of
XDebug.


Clockwork
There are some other useful tools to profile your applications. Clockwork and Debugbar are great examples.

Clockwork is very similar to Telescope. It's a composer package (itsgoingd/clockwork) that you can install, and after that you can
open 127.0.0.1:8000/clockwork and you'll get a page such as this:

It's the timeline of an API request showing all the database queries that were executed.

You can also check how many models are being retrieved to serve the request:

The great thing about Clockwork is that it also comes with a Chrome plugin. So you can see everything in
your developer tool:


I think Clockwork is the fastest way to start profiling your application on your localhost. You don't even have
to go to a separate page. Just open your console and the information is there.


htop
The last tool I'd like to talk about is htop . It's a simple but very powerful command line tool that I'll use in
the rest of this book. It looks like this:

You can check the utilization of your CPU and memory. It's a very important tool to debug performance
issues real-time. By real-time I mean two things:

When shit happens and there is some serious performance issue in your production environment you
can check out htop and see what's happening real-time.

When you're developing a feature on your local machine you can always check htop to get an idea
about the CPU load. Of course, it's highly different from your prod servers but it can be a good
indicator.

Other than the visual representation of the load of the cores we can also see the load average numbers.
They are 1.85, 2.27, 2.39 in my case. These numbers represent the overall load of your CPU. The three
numbers mean:

1.85 (the first one) is the load average of the last 1 minute

2.27 (the second one) is the load average of the last 5 minutes

2.39 (the last one) is the load average of the last 15 minutes

So we have an overview of the last 15 minutes.

What does a number such as 1.85 actually mean? It means that the overall CPU utilization was around 23%
on my machine in the last minute. Straightforward, right?

If you have only 1 CPU core a load average of 1 means your core is working 100%. It is fully utilized. If your
load average is 2 then your CPU is doing twice as much work as it can handle.

But if you have 2 CPU cores a load average of 1 means your cores are working at 50%. In this case, a load
average of 2 means 100% utilization.

So the general rule is that if the load average is higher than the number of cores, your server is overloaded.

Back to my example. I have 8 cores so a load average of 8 would be 100% utilization. My load average is 1.85
on the image so it means 1.85/8 or about 23% CPU load.
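
By the way, you can read the same three numbers from PHP as well, which can be handy if you want to log them or expose them on a health-check endpoint. A minimal sketch (the core count of 8 is an assumption based on my machine, adjust it to yours):

// sys_getloadavg() returns the 1, 5, and 15 minute load averages,
// the same three numbers htop displays
[$load1, $load5, $load15] = sys_getloadavg();

$cores = 8; // assumption: my machine has 8 cores

printf('CPU load in the last minute: ~%d%%', $load1 / $cores * 100);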


How to start measuring performance?


"That's all great but I've never been involved in optimizing and measuring performance. So how should I
start right now?"

If you're new to this stuff this is my recommendation.

Your home page

In a typical business application where users must log in, probably one of the most important pages is the
dashboard, the home page that is presented right after they log in. If it's a publicly available webpage then it is
the landing page. If you don't know where/how to start, this is the perfect place.

Determine how many users you want to/have to serve. Let's say it's 1,000

Open up ab or jmeter and send 1,000 requests to your home page.

Play with the ramp-up times

Play with the concurrency level

Come up with a reasonable goal. For example, "I want to be able to serve 100 concurrent users
with a maximum load time of 1.5 seconds" (these numbers are completely random, please don't
take them seriously)

Now open up Inspector or Telescope and identify what takes a long time

Pick up the low hanging fruit and solve it

Continue reading the book :)

Your most valuable feature

I know, I know, every feature of your app is "the most important at the moment" according to the product
team. However, we all know we can identify a handful of features that are the most critical in the application
no matter what. Try to identify them and measure them the same way as your home page. However, in this
case your target numbers can be lower because it's usually rare that 72% of your users use the same
feature at the same time. It's always true for the home page but usually it's not the case with other features.
Unless, of course, your feature has some seasonality such as booking.com, or it follows some deadlines such
as an accounting software. In this case, you know that on X day of every month 90% of users will use that one
feature.

Your heaviest background job

We tend to forget to optimize background jobs because they run in the background and they do not overly
affect the overall user experience. However, they still use our servers. They consume CPU and RAM. They
cost us money.

Just as with features, try to identify your most important/critical jobs and analyze them with Inspector
and/or Telescope the same way as if they were web requests. Try to reduce the number of queries, the
overall memory consumption, and the execution time with the techniques discussed in the book.

When you set target numbers (such as serving 100 concurrent users, loading the page within 500ms, etc.) it's
important to use the same hardware as your production. Usually the staging environment is a good starting point.


When you debug a specific feature or a job you can use your local environment as well. Of course, execution
times will be different compared to production, but you can think in percentages. For example, "this job
took 10s to finish but now it only takes 9s. I gained 10%." The number of queries and the overall memory
consumption will be similar to production.


N+1 queries
I put this chapter first because the N+1 query is one of the most common performance issues in lots of
projects. The good news is that it's relatively easy to fix.

This chapter doesn't discuss particular solutions; it only describes what an N+1 problem is. Feel free to skip it
if you already know it.

Take a look at this:

foreach ($user->posts as $post) {
    if ($post->comments->count() > 100) {
        echo "I'm famous!";
    }
}

Congratulations! If the given user has 500 posts we just executed 501 queries:

1 query to get the posts

500 queries to get the comments' count

This is why it's called an "N+1 query" problem. It always has the following elements:

An initial query (for the posts, in this case)

A loop

N additional queries. A relationship is usually involved (such as comments in this case)

Let's zoom out a little bit and see where the $user variable comes from:

foreach (User::all() as $user) {
    foreach ($user->posts as $post) {
        if ($post->comments->count() > 100) {
            echo "I'm famous!";
        }
    }
}

This is dramatically worse. This is now an N*M+1 query problem where N is the number of users and M is
the average number of posts per user. If you have 1,000 users and they have 30 posts on average this
function runs more than 31,000 database queries: 1 for the users, 1,000 for the posts, and 30,000 for the comment counts.


Another issue is that we load 1,000 users directly into memory all at once. It doesn't sound too much, right?
Can you guess the size of a User object in a fresh Laravel installation in bytes? We're going to talk about
that in another chapter.

N+1 queries come in different forms. For example:

class OrderController
{
    public function markOrdersPaid(Request $request)
    {
        $orders = Order::whereIn('id', $request->ids)->get();

        foreach ($orders as $order) {
            $order->markAsPaid();
        }
    }
}

class Order extends Model
{
    public function markAsPaid(): void
    {
        $this->status = OrderStatus::Paid;

        $this->save();
    }
}

In this case, the issue is "hidden":

In the Controller it's not obvious because the N+1 query happens inside the model. Of course,
markAsPaid indicates that at least one query is executed. However, if the function call was something
like calculateVat and it executed 5 additional queries for all orders the situation would be much
worse.

In the model, you don't know immediately where the function is being called. You don't know the
context so it's not obvious that an N+1 issue is happening.

Another sneaky form of N+1 query is HTTP resources:


class OrderController
{
    public function index()
    {
        return OrderResource::collection(Order::all());
    }
}

class OrderResource extends JsonResource
{
    public function toArray($request)
    {
        return [
            'id' => $this->id,
            'items' => OrderItemResource::collection($this->items),
            'user' => UserResource::make($this->user),
            // ...
        ];
    }
}

It's a hidden loop since the OrderResource class is being used N times where N is the number of orders
you return from the Controller. For each order, it executes two additional queries.


Solutions
As we have seen, one of the most frequent occurrences of N+1 queries is when additional queries are
executed to get related models. Fortunately, there's an easy fix to that problem. Eager loading:

$users = User::with('posts')->get();

foreach ($users as $user) {
    // Note: property access ($user->posts) uses the eager-loaded relation.
    // Calling $user->posts()->count() would run a new query for every user.
    echo $user->posts->count();
}

Instead of User::all() I use User::with('posts') which means that the posts relationship is loaded in
the original query.

If we just use User::all() here are the queries executed by Laravel:

Using with() the queries look like this:

Laravel runs only one additional query that gets all the related posts for the users.
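
Since in this example we only need the number of posts, it's also worth mentioning withCount(). It adds a posts_count attribute using a single aggregate query, so the post models don't have to be loaded into memory at all. A minimal sketch:

$users = User::withCount('posts')->get();

foreach ($users as $user) {
    // posts_count is computed by the database in the original query,
    // no Post models are hydrated at all
    echo $user->posts_count;
}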

In the case of resources we can do one more thing. Using the whenLoaded helper in the Resource:


class OrderResource extends JsonResource
{
    public function toArray($request)
    {
        return [
            'id' => $this->id,
            'items' => OrderItemResource::collection($this->whenLoaded('items')),
            'user' => UserResource::make($this->whenLoaded('user')),
            // ...
        ];
    }
}

In this case, if the items or user relationships are not eager-loaded using with the resource won't query
them at all. This way you can avoid N+1 queries in resources completely. However, I think this is still not the
best solution when it comes to resources but it's a fix to the problem. In the following chapter, I'll show you
my favorite way of handling API requests and relationships in resources.

What about the cases when the issue is not caused by relationships? Such as this example:

class OrderController
{
    public function markOrdersPaid(Request $request)
    {
        $orders = Order::whereIn('id', $request->ids)->get();

        foreach ($orders as $order) {
            $order->markAsPaid();
        }
    }
}

class Order extends Model
{
    public function markAsPaid(): void
    {
        $this->status = OrderStatus::Paid;

        $this->save();
    }
}

They are harder to generalize but in the upcoming chapters, we're going to talk about these kinds of
problems. For example, in the Async workflows/Concurrent programming chapter, you can see how to run
queries in a parallel way utilizing most of your CPU cores.
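
For this specific example there is also a simpler option worth knowing about: a single bulk update instead of saving the orders one by one. A sketch, assuming nothing else (model events, observers, extra logic in markAsPaid) needs to run for each order:

class OrderController
{
    public function markOrdersPaid(Request $request)
    {
        // One UPDATE statement instead of one SELECT plus N individual saves.
        // Note: update() on the query goes around markAsPaid() and does not
        // fire model events.
        Order::whereIn('id', $request->ids)
            ->update(['status' => OrderStatus::Paid]);
    }
}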


Prevent lazy loading


Fortunately, there's a way in Laravel to disable lazy loading completely. It means you can catch N+1 queries
while developing your application.

All you need to do is this:

namespace App\Providers;

class AppServiceProvider extends ServiceProvider
{
    public function register()
    {
        //
    }

    public function boot()
    {
        if (App::environment() === 'local') {
            Model::preventLazyLoading();
        }
    }
}

If we now try to run a function such as this one:

public function handle()
{
    $pageViews = PageView::whereBetween('id', [1, 100])->get();

    foreach ($pageViews as $pageView) {
        // Lazy loading the site relationship
        echo $pageView->site->id;
    }
}

We'll get the following exception:


So if you use preventLazyLoading in your local environment you can make sure there are no N+1
problems in your codebase.
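
A slightly shorter variant you'll often see (it's the one the Laravel documentation uses) passes the condition straight to the method, so lazy loading stays enabled only in production:

public function boot()
{
    // Lazy loading is allowed in production; everywhere else it throws
    Model::preventLazyLoading(! $this->app->isProduction());
}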


Let the client decide what it wants


This chapter is at the beginning of this book because it describes one of the most underutilized
optimizations that you can use in any project. It's also relatively easy to implement.

HTTP Resource classes tend to grow big over time. As the project grows, your User model (or any other
model) has more and more columns. It has more and more relationships. Usually, "more and more" is an
understatement. For example, here are some numbers from one of the projects I'm working on. The User
model has:

9 belongsToMany

2 belongsTo

3 hasManyThrough

28 hasMany

1 morphMany

12 hasOne

It has 55 relationships and it's not even a 10-year-old legacy project. As you add features to the project a
good portion of these relationships will appear in the UserResource :

class UserResource extends JsonResource
{
    public function toArray($request)
    {
        return [
            'username' => $this->username,
            'posts' => $this->posts,
            'comments' => $this->comments,
            // ...
        ];
    }
}

This will result in a lot of unnecessary database queries as it will execute a select * from posts where
user_id = 1 and a select * from comments where user_id = 1 query.

To avoid executing lots of queries, we usually eager load the necessary relationships in the Controller:


class UserController
{
    public function show(User $user)
    {
        $user->load(['posts', 'comments']);

        return UserResource::make($user);
    }
}

And then we use the whenLoaded helper in the Resource :

class UserResource extends JsonResource
{
    public function toArray($request)
    {
        return [
            'username' => $this->username,
            'posts' => PostResource::collection($this->whenLoaded('posts')),
            'comments' => CommentResource::collection(
                $this->whenLoaded('comments')
            ),
            // ...
        ];
    }
}

That solves the N+1 query problem, however it comes with a cost:

Everyone has to remember to use whenLoaded

You need to load the necessary relationships in every controller every time.

If you forget the first one, you'll end up with N+1 queries. If you forget the second one, you'll get null and
you might end up having bugs on the frontend.

What if the frontend changes and it doesn't show the comments anymore? Then hypothetically you can
remove the comments from the resource. But it is used in 9 other endpoints so you can't really remove it
because you're not sure when it's needed.

These things should be decided by the frontend. It should know exactly what it needs.


Instead of this:

GET /api/v1/users/1

The request should look like this:

GET /api/v1/users/1?include=posts,comments

The frontend tells the backend that this specific page needs two relationships: posts and comments.
Another page might only need the posts:

GET /api/v1/users/1?include=posts

The same resource is used in both cases and it looks like this:

class UserResource extends JsonResource
{
    public function toArray($request)
    {
        return [
            'username' => $this->username,
            'posts' => $this->when(
                Str::of($request->input('include'))
                    ->explode(',')
                    ->contains('posts'),
                PostResource::collection($this->posts),
            ),
            'comments' => $this->when(
                Str::of($request->input('include'))
                    ->explode(',')
                    ->contains('comments'),
                CommentResource::collection($this->comments),
            ),
            // ...
        ];
    }
}


If the expression returns true the given relationship is included, otherwise, it's not.
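
If you use this pattern in more than one resource, the repeated Str::of() check can be pulled into a small helper method, for example on a base resource class. A sketch (isIncluded is a hypothetical helper, not something Laravel ships with):

protected function isIncluded(Request $request, string $relation): bool
{
    // 'include=posts,comments' becomes ['posts', 'comments']
    return Str::of($request->input('include', ''))
        ->explode(',')
        ->contains($relation);
}

Then the resource can simply call $this->when($this->isIncluded($request, 'posts'), ...) for each relationship.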

However, it still has N+1 query problems. Every time you load a user there are two additional queries for
posts and comments. Of course, we could use whenLoaded :

'posts' => $this->when(
    Str::of($request->input('include'))->explode(',')->contains('posts'),
    PostResource::collection($this->whenLoaded('posts')),
),

But now, we have the same problem. The controller must eager-load the posts relationship.

Fortunately, there's an easy solution with Spatie's laravel-query-builder package and it looks like this:

class UserController
{
    public function index()
    {
        $users = QueryBuilder::for(User::class)
            ->allowedIncludes(['posts', 'comments'])
            ->get();

        return UserResource::collection($users);
    }
}

allowedIncludes respects the include parameter of the request. If it contains posts or comments it
eager loads the relationships.

It looks almost exactly as if we were eager-loading relations directly, right?

public function index()
{
    $users = User::with('posts', 'comments')->get();

    return UserResource::collection($users);
}


But it has a pretty important distinction:

If you rely on eager-loading and you forget a relationship in the controller you'll end up with an N+1
query problem.

If you use QueryBuilder and you forget to allow an include you'll get an exception instead of
performance problems.

For example, if I only allow the comments relationship I get this error:

Requested include(s) `posts` are not allowed. Allowed include(s) are `comments, commentsCount, commentsExists`.

This seems like a small difference, but believe me, in larger projects N+1 queries are one of the most common
and annoying performance problems. HTTP resources are also a common source of these kinds of problems. By
using the include query param and the laravel-query-builder package you can eliminate lots of N+1
query issues.
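
As a bonus (you can see it in the error message above), the package also allows count includes out of the box. If a page only needs the number of comments, the client can request that instead of the full relationship, and the package translates it into a withCount() call, so no comment models are loaded at all. A sketch, using the same controller as above:

// GET /api/v1/users?include=commentsCount
// laravel-query-builder turns this into ->withCount('comments'),
// so each user in the response gets a comments_count attribute
// without loading the comment models themselves.
$users = QueryBuilder::for(User::class)
    ->allowedIncludes(['posts', 'comments'])
    ->get();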


Multiple resources
If for some reason you can't or just don't want to use include parameters in your requests (for example, it
would be a huge refactor in a large project) you can start using multiple resources for the same model.

Let's say we're working on a real estate site, something like Zillow. We have a list and a detailed view of real
estate. On the list we only need to display 4 attributes:

Location

Price

Number of bedrooms

Area

But on the detailed page, we obviously need to show more attributes than this.

This is what a complete RealEstateResource would look like:

class RealEstateResource extends JsonResource
{
    public function toArray($request)
    {
        return [
            'id' => $this->id,
            'location' => $this->location,
            'price' => $this->price,
            'number_of_bedrooms' => $this->number_of_bedrooms,
            'area' => $this->area,
            'description' => $this->description,
            'number_of_levels' => $this->number_of_levels,
            'has_basement' => $this->has_basement,
            'has_cooling' => $this->has_cooling,
            'year_built' => $this->year_built,
            'number_of_views' => $this->number_of_views,
            'tags' => TagResource::collection($this->tags),
            'construction_materials' => ConstructionMaterial::collection(
                $this->construction_materials
            ),
            'heatings' => HeatingResource::collection($this->heatings),
            'pictures' => PictureResource::collection($this->pictures),
            // ...
        ];
    }
}

And the list goes on. If you take a look at Zillow I think they have at least 20+ more attributes for each
property. If we use only this one resource we waste lots of bandwidth, memory, and CPU just to show the
first 4 attributes on the index page.

This response:

{
    "data": {
        "id": 12379,
        "address": "11 Starrow Drive, Newburgh, NY 12550",
        "price": "$379,900",
        "number_of_bedrooms": "3",
        "area": "1,332",
        "picture": "https://shorturl.at/awDV4"
    }
}

is only 169 bytes and contains everything to replicate Zillow's home page.

So one of the solutions to avoid having large resources when you don't need them is to use many of them.
In this situation, we can create two:

RealEstateMinimalResource

RealEstateDetailedResource

RealEstateMinimalResource can be used on the home page and it would look like this:


class RealEstateMinimalResource extends JsonResource
{
    public function toArray(Request $request): array
    {
        return [
            'id' => $this->id,
            'address' => $this->address,
            'price' => $this->price,
            'number_of_bedrooms' => $this->number_of_bedrooms,
            'area' => $this->area,
            'picture' => $this->picture,
        ];
    }
}

RealEstateDetailedResource would be the one I showed you earlier and would be used on the detailed
page of a real estate with lots of information.

This is a pretty easy way to speed up your requests and save some bandwidth for your users.
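
To make the idea concrete, the two resources would simply be used in different controller actions. A sketch (the RealEstate model and the relationship names are assumptions based on the resource above):

class RealEstateController
{
    public function index()
    {
        // List view: small payload, only the attributes the cards need
        return RealEstateMinimalResource::collection(RealEstate::paginate(20));
    }

    public function show(RealEstate $realEstate)
    {
        // Detail view: the full payload, with the relationships eager-loaded
        return RealEstateDetailedResource::make(
            $realEstate->load(['tags', 'heatings', 'pictures'])
        );
    }
}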


Pagination
First of all, use pagination whenever it's possible. I'm not going to go through the basics because it's an easy
concept and Laravel has awesome documentation.

Secondly, did you know that there's a pagination technique that can be 400x faster than the one you're
probably using?

First, let's see the simplest form of pagination:

class ProductController extends Controller
{
    public function index()
    {
        return ProductResource::collection(Product::paginate(50));
    }
}

The Product::paginate(50) method returns a LengthAwarePaginator and executes the following query:

select * from `products` limit 50 offset 0

If you send a request such as 127.0.0.1:8000/api/products?page=1500 the query looks like this:

select * from `products` limit 50 offset 74950

It offsets 1,500 * 50 - 50 rows so it returns product #74,951 to #75,000.

While this pagination is simple and works well for smaller projects the problem is that MySQL has to go
through 75,000 records and discard the first 74,950.

If we run an explain the results are clear:


I'm using explainmysql.com built by Tobias Petry.

So the query results in a full table scan meaning that MySQL has to read everything from the disk, loop
through the first 74,950 records, discard them, and then return rows #74,951 to #75,000.

From this mechanism, you can quickly see that the more records you have the worse it gets. The higher the
requested page number is the worse it gets because MySQL needs to process more and more records. It
means two things:

Simple pagination does not perform particularly well with large datasets. It's hard to define "large" but
probably something like 100,000+

It's pretty good for smaller datasets. If you have a few thousand rows it's not gonna be a problem at all.
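
One more small thing worth knowing in this context: paginate() also runs an extra select count(*) query so it can render the page numbers. If your UI only needs "previous" and "next" links, simplePaginate() skips that count query entirely. A sketch:

class ProductController extends Controller
{
    public function index()
    {
        // simplePaginate() only provides prev/next links,
        // so it doesn't run the `select count(*)` that paginate() needs
        return ProductResource::collection(Product::simplePaginate(50));
    }
}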


Cursor pagination
What about a query such as this?

select * from `products`
where id > 74950
order by id
limit 50

It returns 50 products from #74,951 to #75,000 in an ordered way limiting the number of results by using a
where expression on the id column and a limit expression.

This is what the explain looks like:

Now MySQL can use the primary key index and it can perform a range query. Meaning, it can use the index
to perform a B-TREE traversal to find the starting point of a range and scan records from that point on. It
performs significantly fewer I/O operations than a full table scan.

However, when using this kind of pagination we don't directly use IDs in the query. Instead, we use MySQL
cursors. This is why this technique is called cursor pagination or relative cursor pagination. A cursor is an
object that allows for sequential processing of query results, row by row. It is a reference point that
identifies a specific position in the result set. It can be a unique ID, a timestamp, or any sortable value that
allows the database to determine where to continue fetching results. It's basically a pointer that points to a
specific row. It's important that the column has to be sortable (such as an ID) so the cursor knows how to
move forward in the result set.

This is how you can imagine a cursor:


DECLARE product_id INT;
DECLARE product_title VARCHAR(255);

DECLARE cur CURSOR FOR
    SELECT id, title FROM products;

OPEN cur;

FETCH cur INTO product_id, product_title;

When the fetch statement is executed the cursor runs the query and fetches the first row. Then the cursor
is set to that specific row. If we run fetch again the next row is going to be retrieved. So typically a cursor is
used inside a loop:

read_loop: LOOP
    FETCH cur INTO product_id, product_title;

    -- Process the fetched row
    SELECT CONCAT('ID: ', product_id, ', Title: ', product_title) AS product;
END LOOP;

Of course, if there are no results we need to leave the loop and close the cursor but that's not important
now. The important thing is that a cursor is a pointer to a row and we can fetch data row by row. By the
way, this sounds like a pretty good technique to implement infinite scroll just as social media sites do.

Fortunately, Laravel provides an easy method to use cursors:

class ProductController
{
    public function index()
    {
        return ProductResource::collection(Product::cursorPaginate(50));
    }
}

When you use the cursorPaginate method, instead of page numbers, Laravel returns a cursor ID:


{
    "path": "http://127.0.0.1:8000/api/products",
    "per_page": 50,
    "next_cursor": "eyJwcm9kdWN0cy5pZCI6NTAsIl9wb2ludHNUb05leHRJdGVtcyI6dHJ1ZX0",
    "prev_cursor": null
}

And then you can get the next page by sending this cursor ID:

{
    "first": null,
    "last": null,
    "prev": null,
    "next": "http://127.0.0.1:8000/api/products?cursor=eyJwcm9kdWN0cy5pZCI6NTAsIl9wb2ludHNUb05leHRJdGVtcyI6dHJ1ZX0"
}
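
The cursor itself is nothing magical, by the way. It's just URL-safe base64-encoded JSON that describes the position of the last item, which you can verify yourself:

$cursor = 'eyJwcm9kdWN0cy5pZCI6NTAsIl9wb2ludHNUb05leHRJdGVtcyI6dHJ1ZX0';

// Prints: {"products.id":50,"_pointsToNextItems":true}
echo base64_decode($cursor);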

A query such as this will be executed:

select * from `products` where (`products`.`id` > 50) order by `products`.`id` asc limit 51

Let's compare the two queries. Here's the one with offset :

30ms. It's pretty fast, right?

Now take a look at the query that uses a cursor:


It's 2.7ms.

The difference doesn't seem like much because these are pretty low numbers, but there's an 11x difference between the two queries. I used a demo database with just 100,000 records.

Let's see what happens if the database contains 700,000 rows and we want to get products starting at #500,000. This is the offset query:

Now it's red in Telescope and it took 262ms. We spend most of our time in the HTTP layer looking at HTTP requests and responses, where 262ms is pretty fast, but at the database level it's very, very slow.

Here's the cursor query's results:

Now the difference is 74x.

If we check the same query with an offset value of 50,000 instead of 500,000 the result is 41ms:

So here's the proof: the deeper you paginate and the more records you have, the worse your database performs.

Okay, I know these values (262ms vs 3.52ms) are so small that they sound abstract and negligible. Let's
put them into context!

Imagine for a minute that you have a MySQL server and there is only one connection to your Laravel app
(which is not recommended and unlikely in the real world). So every user uses the same MySQL connection
and they have to wait for each other's queries to finish.

If your home page gets 100 visitors and you use the offset query, the total execution time is 100 * 262ms or 26.2s. The 100th user has to wait almost half a minute.

If you use the cursor query the total execution time is 100 * 3.52ms or 0.35 seconds.

Of course, in the real world, we have more than one connection but the point remains the same. A 74 times
difference is huge.


Shopify experienced the same problem. They ran into pretty slow queries and even complete database
timeouts because of offset pagination. In their article, they explain how they managed to make a 400x
difference when adopting cursor-based pagination.

You can read the whole article here.


Database indexing
My goal in this chapter is to give you the last indexing tutorial you'll ever need. Please do not skip the following
pages.

Theory
This is one of the most important topics to understand, in my opinion. No matter what kind of application
you're working on, there's a good chance it has a database. So it's really important to understand what happens
under the hood and how indexes actually work. Because of that, this chapter starts with a little bit of theory.

In order to understand indexes, first we need to understand at least 6 data structures:

Arrays

Linked lists

Binary trees

Binary search trees

B-Trees

B+ Trees


Arrays
They are one of the oldest data structures. We all use arrays on a daily basis so I won't go into details, but
here are some of their properties from a performance point of view:

Operation Time complexity Space complexity

Accessing a random element O(1) O(1)

Searching an element O(N) O(1)

Inserting at the beginning O(N) O(N)

Inserting at the end O(1) or O(N) if the array is full O(1)

Inserting at the middle O(N) O(N)

An array is a fixed-sized contiguous data structure. The array itself is a pointer to a memory address and
each subsequent element has a memory address of x + (sizeof(t) * i) where

x is the first memory address where the array points at

sizeof(t) is the size of the data type. For example, an int takes up 8 bytes

i is the index of the current element

This is what an array looks like:

The contiguous memory addresses have an interesting implication: your computer has to shift the elements when inserting or deleting an item. This is why mutating an array is, in most cases, an O(n) operation.

Since it's a linear data structure with subsequent elements and memory addresses searching an element is
also an O(n) operation. You need to loop through all the elements until you find what you need. Of course,
you can use binary search if the array is sorted. Binary search is an O(log N) operation and quicksort is an
O(N * log N) one. The problem is that you need to sort the array every single time you want to find an
element. Or you need to keep it sorted all the time which makes inserts and deletes even worse.

What arrays are really good at is accessing random elements. It's an O(1) operation since all PHP needs to
do is calculating the memory address based on the index.

The main takeaway is that searching and mutating are slow.


Linked list
Since arrays have such a bad performance when it comes to inserting and deleting elements engineers
came up with a linked list to solve these problems.

A linked list is a logical collection of random elements in memory. They are connected only via pointers.
Each item has a pointer to the next one. There's another variation called doubly linked list where each
element has two pointers: one for the previous and one for the next item.

This is what it looks like:

The memory addresses are not subsequent. This has some interesting implications:

Operation Time complexity Space complexity

Accessing a random element O(N) O(1)

Searching an element O(N) O(1)

Inserting at the beginning O(1) O(1)

Inserting at the end O(1) O(1)

Inserting at the middle O(N) O(1)

Since a linked list is not a contiguous structure in memory, inserts always have better performance
compared to an array. PHP doesn't need to shift elements. It only needs to update pointers in nodes.

A linked list is an excellent choice when you need to insert and delete elements frequently. In most cases, it
takes considerably less time and memory.

However, searching is as slow as it was with arrays. It's still O(n).
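To make the "pointer" idea a bit more concrete, here's a minimal sketch of a singly linked list node in PHP. This is just an illustration of the data structure, not how PHP arrays or MySQL work internally:

// Each node only knows its value and a reference to the next node.
class Node
{
    public function __construct(
        public mixed $value,
        public ?Node $next = null,
    ) {
    }
}

// Inserting at the beginning is O(1): create a node and point it at the old head.
$head = new Node(2, new Node(3));
$head = new Node(1, $head);

// Searching is still O(n): we have to walk the pointers one by one.
function contains(?Node $node, mixed $needle): bool
{
    for (; $node !== null; $node = $node->next) {
        if ($node->value === $needle) {
            return true;
        }
    }

    return false;
}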


Binary tree
The term binary tree can be misleading since it has lots of special versions. However, a simple binary tree
a tree where every node has two or fewer children.

For example, this is a binary tree:

The only important property of this tree is that each node has two or fewer children.

This is also a binary tree with the same values:


Now, let's think about how long it takes to traverse a binary tree. For example, in the first tree, how many steps does it take to traverse from the root node to one of the leaf nodes (9, 5, 6, 5)? It takes two steps. If I want to go to the left-most node (9) it'd look like this (we're already at the root node):

Now let's do the same with the second tree. How many steps does it take to go to the leaf node (to 43,
starting from the root)? 6 steps.

Both trees have 7 nodes. Using the first one takes only 2 steps to traverse to one of the leaf nodes but using
the second one takes 6 steps. So the number of steps is not a function of the number of nodes but the
height of the tree which is 2 in the first one and 6 in the second one. We don't count the root node.

Both of these trees have a name. The first one is a complete tree meaning every node has exactly two
children. The second one is a degenerative tree meaning each parent has only one child. These are the two
ends of the same spectrum. The first one is perfect and the other one is useless.

In a binary tree, density is the key. The goal is to represent the maximum number of nodes in the smallest
depth binary tree possible.

The minimum height of a binary tree is about log n which is shown in the first picture. It has 7 elements and the height is 2.

The maximum height possible is n-1 which is shown in the second picture. 7 elements with a height of 6.


From these observations, we can conclude that traversing a binary tree is an O(h) operation where h is the height of the tree. In a well-balanced tree, h is about log n.

# of elements Height Time complexity

Complete tree 7 2 O(log n)

Degenerate tree 7 6 O(n-1)

To put it in context, if you have a tree with 100,000,000 elements and your CPU can run 100,000,000
operations per second:

# of iterations Time to complete

O(log n) 26 0.00000026 seconds

O(n - 1) 99,999,999 0.99 seconds

There's a 3,846,153x difference between the two, so engineers came to the following conclusion: if a tree is structured well, it can be traversed in O(log n) time, which is far better than arrays or linked lists.


Binary search tree (BST)


So binary trees can have pretty great time complexity when it comes to traversing their nodes. Can we use
them to efficiently search elements in O(log n) time?

Enter the binary search tree:

It has three important properties:

Each node has two or fewer children

Each node has a left child that is less than or equal to itself

Each node has a right child that is greater than itself

The fact that the tree is ordered makes it pretty easy to search elements, for example, this is how we can
find 5.

Eight is the starting point. Is it greater than 5? Yes, so we need to continue in the left subtree.


Is 6 greater than 5? Yes, so let's go to the left subtree.


Is 4 greater than 5? Nope. Each node has a right child that is greater than itself. So we go right.


Is 5 equal to 5? Yes.

We found a leaf node in just 3 steps. The height of the tree is 3, and the total number of elements is 9. This is the same thing we discussed earlier. The cost of the search is O(log N).

So if we take a plain binary tree and add two constraints so that it stays ordered at all times, we get O(log N) search.
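Here's a minimal sketch of that search in PHP, just to make the walkthrough above concrete. It's a toy example, not how MySQL stores its nodes:

class BstNode
{
    public function __construct(
        public int $value,
        public ?BstNode $left = null,
        public ?BstNode $right = null,
    ) {
    }
}

// Walk down the tree: smaller (or equal) values live in the left subtree,
// greater values in the right one. On a balanced tree this is O(log n).
function bstSearch(?BstNode $node, int $target): ?BstNode
{
    while ($node !== null) {
        if ($target === $node->value) {
            return $node;
        }

        $node = $target < $node->value ? $node->left : $node->right;
    }

    return null;
}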

Unfortunately, the constraints of a BST don't tell anything about balance. So this is also a perfectly fine BST:


Each node has two or fewer children. The left child is always less than or equal to the parent. The right child
is always greater than the parent. But the right side of the tree is very unbalanced. If you want to find the
number 21 (the bottom node in the right subtree) it becomes an O(N) operation.

Binary search trees were invented in 1960, 35 years before MySQL.


Indexing in the early days


Databases and storage systems were emerging in the 1960's. The main problem was the same as today:
accessing data was slow because of I/O operations.

Let's say we have a really simple users table with 4 columns:

ID name date_of_birth job_title

1 John Doe 2005-05-15 Senior HTML programmer

2 Jane Doe 1983-08-09 CSS programmer

3 Joe Doe 1988-12-23 SQL programmer

4 James Hetfield 1969-08-03 plumber

For simplicity, let's assume a row takes 128 bytes to store on the disk. When you read something from the
disk the smallest unit possible is 1 block. You cannot just randomly read 1 bit of information. The OS will
return the whole block. For this example, we assume a block is 512 bytes. So we can fit 4 records (4 * 128B)
into one block (512B). If we have 100 records we need 25 blocks.

Size of a record 128B

Size of a block 512B

# of records in one block 4

# of records overall 100

# of blocks needed to store the table 25

If you run the following query against this table (assuming no index, no PK):

select *
from users
where id = 50

Something like this happens:

The database will loop through the table

It reads the first block from the disk that contains row #1 - row #4

It doesn't contain user #50 so it continues

In the worst-case scenario, it executes 25 I/O operations scanning the table block-by-block. This is called a
full table scan. It's slow. So engineers invented indexing.


Single-level indexing

As you can see, the problem was the size and the number of I/O operations. Can we reduce it by introducing
some kind of index? Some kind of secondary table that is smaller and helps reduce I/O operations? Yes, we
can.

Here's a simple index table:

The index table stores every record that can be found in users . They both have 100 rows. The main benefit
is that the index is small. It only holds an ID that is equivalent to the ID in the users table and a pointer.
This pointer points to the row on the disk. It's some kind of internal value with a block address or something
like that. How big is this index table?

Let's assume that both the ID and ptr columns take up 8 bytes of space. So a record's size in the index table
is 16 bytes.


Size of a record 16B

Size of a block 512B

# of records in one block 32

# of records overall 100

# of blocks needed to store the index table 4

Only 4 blocks are needed to store the entire index on disk. To store the entire table the number of blocks is
25. It's a 6x difference.

Now what happens when we run?

select *
from users
where id = 50

The database reads the index from the disk block-by-block

It means 4 I/O operations in the worst-case scenario

When it finds #50 in the index it queries the table based on the pointer which is another I/O

In the worst-case scenario, it executes 5 I/O operations. Without the index table, it was 25. It's a 5x
performance improvement. Just by introducing a "secondary table."


Multi-level indexing

An index table made things much better, however, the main issue remained the same: size and I/O
operations. Now, imagine that the original users table contains 1,000 records instead of 100. This is what
the I/O numbers would look like:

# of blocks to store the data # of I/O to query data

Database table with 1,000 users 250 250

Index table with 1,000 users 40 41

Everything is 10x larger, of course. So engineers tried to divide the problem even more by chunking the size
into smaller pieces and they invented multi-level indexes. Now we said that you can store 32 entries from
the index table in a single block. What if we can have a new index where every entry points to an entire
block in the index table?

Well, this is a multi-level index:

Each entry in the second level index points to a range of records in the first level index:

Row #1 in L2 points to row #1 - row #32 in L1

Row #2 points to row #33 - row #64

etc


Each row in L2 points to a chunk of 32 rows in L1 because that's how many records can fit into one block of
disk.

If the L1 index can be stored using 40 blocks (as discussed earlier), then L2 can be stored using 40/32 blocks.
It's because in L2 every record points to a chunk of 32 records in L1. So L1 is 32x bigger than L2. 1,000 rows
in L1 is 32 rows in L2.

The space requirement for L2 is 40/32 or 2 blocks.

# of blocks to store the data

Database table with 1,000 users 250

L1 index with 1,000 users 40

L2 index with 1,000 users 2

What happens when we run:

select *
from users
where id = 50

The database reads L2 block-by-block

In the worst-case scenario, it reads 2 blocks from the disk

It finds the one that contains user #50

It reads 1 block from L1 that contains user #50

It reads 1 block from the table that contains user #50

Now we can find a specific row by just reading 4 blocks from the disk.

# of blocks to store the data # of I/O to query data

Database table with 1,000 users 250 250

L1 index with 1,000 users 40 41

L2 index with 1,000 users 2 4

They were able to achieve a 62x performance improvement by introducing another layer.

Now let's do something crazy. Rotate the image by 90 degrees:


It's a tree! IT'S A TREE!!


B-Tree
In 1970, two gentlemen at Boeing invented B-Trees, which was a game-changer for databases. This is the era when Unix timestamps looked like this: 1. If you wanted to query the first quarter's sales, you would write between 0 and 7775999. Black Sabbath released Paranoid. Good times.

What does the B stand for? They didn't specify it, but often people call them "balanced" trees.

A B-Tree is a specialized version of an M-way tree. "What's an M-way tree?" Glad you asked!

This is a 3-way tree:

It's a bit different than a binary tree:

Each node holds more than one value. To be precise, a node can have up to m-1 values (or keys).

Each node can have up to m children.

The keys in each node are in ascending order.

The keys in children nodes are also ordered compared to the parent node (such as 10 is at the left side
of 20 and 30 is at the right side)

Since it's a 3-way tree a node can have a maximum of 3 children and can hold up to two values.

If we zoom in on a node it looks like this:

cp stands for child pointer and k stands for key.


The problem is, however, that there are no rules or constraints for insertion or deletion. This means you can do whatever you want, and m-way trees can become unbalanced just as we saw with binary search trees. If a tree is unbalanced, searching becomes O(n) which is very bad for databases.

So B-Trees are an extension of m-way search trees. They define the following constraints:

The root node has at least 2 children (or subtrees).

Each other node needs to have at least m/2 children.

All leaf nodes are at the same level

I don't know how someone can be that smart but these three simple rules make B-trees always at least half
full, have few levels, and remain perfectly balanced.

There's a B-Tree visualizer website where you can see how insertion and deletion are handled and how the
tree remains perfectly balanced at all times.

Here you can see numbers from 1 to 15 in a 4-degree B-Tree:

Of course, in the case of a database, every node has a pointer to the actual record on disk just as we
discussed earlier.

The next important thing is this: MySQL does not use a standard B-Tree. Even though we use the word
BTREE when creating an index it's actually a B+ Tree. It is stated in the documentation:

The use of the term B-tree is intended as a reference to the general class of index design. B-tree
structures used by MySQL storage engines may be regarded as variants due to sophistications not
present in a classic B-tree design. - MySQL Docs

It is also said by Jeremy Cole multiple times:

InnoDB uses a B+Tree structure for its indexes. - Jeremy Cole

He built multiple forks of MySQL, for example, Twitter MySQL, he was the head of Cloud SQL at Google and
worked on the internals of MySQL and InnoDB.


Problems with B-Trees

There are two issues with a B-Tree. Imagine a query such as this one:

select *
from users
where id in (1,2,3,4,5)

It takes at least 8 steps to find these 5 values:

From 4 to 2

From 2 to 1

From 1 back to 2

From 2 to 3

From 3 to 2

From 2 to 4

From 4 to 6

From 6 to 5

So a B-Tree is not the best choice for range queries.

The other problem is wasting space. There's one thing I haven't mentioned so far. In this example, only the ID is present in the tree because it is a primary key index. But of course, in real life, we add indexes
to other columns such as usernames, created_at, other dates, and so on. These values are also stored in the
tree.

An index has the same number of elements as the table so its size can be huge if the table is big enough.
This makes a B-Tree less optimal to load into memory.


B+ Trees
As the available size of the memory grew in servers, developers wanted to load the index into memory to
achieve really good performance. B-Trees are amazing, but as we discussed they have two problems: size
and range queries.

Surprisingly enough, one simple property of a B-Tree can lead us to a solution: most nodes are leaf nodes.
The tree above contains 15 nodes and 9 of them are leaves. This is 60%.

Sometime around 1973, someone probably at IBM came up with the idea of a B+ Tree:

This tree contains the same numbers from 1 to 15. But it's considerably bigger than the previous B-Tree,
right?

There are a few important changes compared to a B-Tree:

Every value is present as a leaf node. At the bottom of the tree, you can see every value from 1 to 15

Some nodes are duplicated. For example, number 2 is present twice on the left side. Every node that is
not a leaf node in a B-Tree is duplicated in a B+ Tree (since they are also inserted as leaf nodes)

Leaf nodes form a linked list. This is why you can see arrows between them.

Every non-leaf node is considered as a "routing" node.

With the linked list, the range query problem is solved. Given the same query:

select *
from users
where id in (1,2,3,4,5)

This is what the process looks like:


Once you have found the first leaf node, you can traverse the linked list since it's ordered. Now, in this
specific example, the number of operations is the same as before, but in real life, we don't have a tree of 15
but instead 150,000 elements. In these cases, linked list traversal is way better.

So the range query problem is solved. But how does an even bigger tree help reduce the size?

The trick is that routing nodes do not contain values. They don't hold the usernames, the timestamps, etc.
They are routing nodes. They only contain pointers so they are really small items. All the data is stored at
the bottom level. Only leaf nodes contain our data.

Leaf nodes are not loaded into memory but only routing nodes. As weird as it sounds at first according to
PostgreSQL this way the routing nodes take up only 1% of the overall size of the tree. Leaf nodes are the
remaining 99%:

Each internal page (comment: they are the routing nodes) contains tuples (comment: MySQL stores
pointers to rows) that point to the next level down in the tree. Typically, over 99% of all pages are leaf
pages. - PostgreSQL Docs

So database engines typically only keep the routing nodes in memory. They traverse them to find the necessary leaf nodes that contain the actual data. If the query doesn't need other columns, it can essentially be served using only the index. If the query needs other columns as well, MySQL reads them from the disk using the pointers in the leaf node.

I know this was a long introduction but in my opinion, this is the bare minimum we should know about
indexes. Here are some closing thoughts:

Both B-Trees and B+ trees have O(log n) time complexity for search, insert, and delete but as we've
seen range queries perform better in a B+ tree.

MySQL (and Postgres) uses B+ Trees, not B-Trees.

The nodes in real indexes do not contain 3 or 4 keys as in these examples. They contain thousands of
them. To be precise, a node matches the page size in your OS. This is a standard practice in databases.
Here you can see in MySQL's source code documentation that the btr_get_size function, for
example, returns the size of the index expressed as the number of pages. btr_ stands for btree .

Interestingly enough MongoDB uses B-Trees instead of B+ Trees as stated in the documentation.
Probably this is why Discord moved to Cassandra. They wrote this on their blog:

Around November 2015, we reached 100 million stored messages and at this time we started to see
the expected issues appearing: the data and the index could no longer fit in RAM and latencies
started to become unpredictable. It was time to migrate to a database more suited to the task. -
Discord Blog


Now, let's apply what we've learned. When the following command is executed:

CREATE INDEX idx_name
ON users (last_name, first_name);

we know that the leaf nodes of the B+ Tree will contain last_name and first_name. This fact leads us to the most important rule of indexing: you create an index for a specific query. Or a few queries. But an index is not a generic thing that magically makes your whole application faster.
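If you manage your schema with Laravel migrations, the same index can be created like this (a quick sketch; idx_name is just the name used above):

use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

Schema::table('users', function (Blueprint $table) {
    // Composite B+ Tree index on (last_name, first_name)
    $table->index(['last_name', 'first_name'], 'idx_name');
});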

Let's take this query:

select last_name, first_name
from users
where id = 100

We know that the index lookup looks like this:

So it's easy to find user #100. We also know that last_name and first_name are present in the leaf nodes.
They are all loaded from the disk with that information. If the query only needs these two columns then
there's no need to load the entire row from the disk.

You can imagine node #100 like this:


Since the query needs only these two columns the DB engine won't touch the pointer and won't execute
extra I/O operations.

The moment you change the query, for example to this:

select job_title, last_name, first_name
...

The index cannot be used anymore. Or at least not in the same way with the same efficiency. In this case,
job_title cannot be found in the tree so MySQL needs to run I/O operations.

Let's now look at some actual examples.


Index access types


In this chapter, we'll explore the MySQL explain command and query access types. If you're not familiar
with it you can read the docs, but here's an executive summary: if you put explain before a query MySQL
will return the execution plan instead of running the query. The execution plan contains valuable
information about how the database will run the actual query.

The access type, which can be found in the type column of the output, is the most important part of explain. It tells us how MySQL will or will not use the indexes in the database.
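To follow along, you can get the execution plan straight from SQL or from Laravel. A quick sketch (the query is just an example):

use Illuminate\Support\Facades\DB;

// Raw explain through the database connection
$plan = DB::select('explain select * from users where id = 1');

// Or, on recent Laravel versions, directly on a query builder
$plan = DB::table('users')->where('id', 1)->explain();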

We'll use this simple table to explore how MySQL runs queries and uses indexes:

CREATE TABLE `users` (
    `id` bigint unsigned NOT NULL AUTO_INCREMENT,
    `first_name` varchar(255) DEFAULT NULL,
    `last_name` varchar(255) DEFAULT NULL,
    `job_title` varchar(20) DEFAULT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci


const
Given the following query:

select *
from users
where id = 1

MySQL provides this execution plan:

This is the output of explain select parsed with Tobias Petry's MySQL Explain tool.

In the type column we can see const . This type is used when MySQL is able to see that only one row can
be selected based on a unique key index. It is the most efficient type as it involves only a single constant
value lookup. In the possible_keys column, we can see what indexes MySQL considered to use. In the
key column we can see the index that was actually used. In both cases, there's the PRIMARY which is
created by MySQL, based on the unique, auto-increment ID column.

Quick note: if you use UUIDs as your primary keys, your index is going to be huge. Storing a bigint
unsigned requires 8 bytes, while storing a char(32) column requires 32 bytes. It's a 4x difference in size. A
bigger index requires more space, more memory, and is slower to search in.

So the const type is used because of the where id = 1 clause. It's super fast, and works like this:

In just O(log N) time it's able to find the node and it loads the data from the disk since the query uses
select *

It's fast, and you cannot improve it to be faster.


range
Given this query:

select *
from users
where id in (1,2)

The execution plan is:

It's a range type. This means that MySQL can traverse the B+ tree and find the first node that satisfies our
query. From that point on, it can traverse the linked list (leaf nodes) to get all the other nodes as well. It's a great access type since in just O(log N) time it's able to find the first node of the range. And from that
point on it only needs to inspect the x number of elements where x is the number of IDs in the in (1,2)
clause. It is often used for queries with range conditions such as between , in , etc.

This is what it looks like:

So the range access type can be pretty fast. However, the database still needs to perform I/O operations
after it finds the nodes.


range (again)
Let's make one small change in the previous query:

select id
from users
where id in (1,2)

The execution plan is:

It's the same range type. However, there's an important new item in the extra column: Using index . It
means using only the index. Since the query only needs the id column (not * ) the index is a covering
index meaning it contains everything the query needs. It covers the query.
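Just for reference, this is what such a covering-index-friendly query looks like from Laravel (a sketch of the same query as above):

use Illuminate\Support\Facades\DB;

// Only the id column is selected, so the primary key index alone
// can answer the query - no extra I/O to fetch full rows.
$ids = DB::table('users')
    ->whereIn('id', [1, 2])
    ->pluck('id');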

The visual representation looks almost the same but without the extra I/O lines on the bottom:


index
The new query is:

select id
from users
where id != 1

The execution plan is:

It's an index type. In this case, MySQL cannot identify a range or a single row that needs to be returned.
The best it can do is to scan the entire index. This sounds good, or at least better than a full table scan but
it's still an O(n) operation. Generally speaking, it's not that bad, however, it can cause problems if n (your
table) is large enough.

index queries also come in two flavors:

When Using index is present in the extra column it's a bit better since at least MySQL doesn't need
to perform extra I/O operations.

But when using index is not present it is generally speaking a slower query. According to MySQL it is
as bad as a full table scan:

The index join type is the same as ALL , except that the index tree is scanned. This occurs in two
ways:

If the index is a covering index for the queries and can be used to satisfy all data required from
the table, only the index tree is scanned. In this case, the Extra column says Using index . An
index-only scan usually is faster than ALL because the size of the index usually is smaller than
the table data.

A full table scan is performed using reads from the index to look up data rows in index order.
Uses index does not appear in the Extra column.

MySQL Docs

Visually speaking, it can be imagined like this:


Of course, 0001 is not selected but it is scanned and checked against the filter where id != 1


ALL
Finally, the last query is this:

select first_name, last_name, job_title
from users
where first_name = "John"

The execution plan is:

It's an ALL type. I'd like to quote from Kai Sassnowski's awesome video: "avoid at all costs."

This type of query runs a full table scan so MySQL doesn't use the index at all. In this example, it's because
we filter based on the first_name column which is not part of any index. MySQL essentially runs a for loop
and scans every row until it finds John.

A full table scan can be visualized like this:


If you want to try similar queries and check the explain output make sure your table contains at least a
few hundred rows. Otherwise, MySQL might choose to run a full table scan because the table is so small
that the full scan is actually faster than deciding between different optimization strategies.


Select *
From the examples above we can come to an observation: select * is usually not a great thing to do. It
can be the difference between traversing a B+ Tree and executing thousands of extra I/O operations. Here
are some things about select * :

Index usage. As we discovered it may prevent the optimizer from utilizing indexes efficiently.

Network traffic. MySQL connections are simple TCP (network) connections. When you retrieve every column instead of just a few, the difference in response size can be significant, which makes the TCP traffic heavier and slower.

Resource consumption. Fetching everything from disk just simply uses more CPU and memory as
well. The worst case scenario is when you don't need all the data and the query (without select * )
could have been served using only the index. In this case, the difference can be orders of magnitude.

So we can say, in general, it's a good idea to avoid select * queries and fetch only the columns you really need. Of course, one of the disadvantages is this:

public function sendReminder(User $user)
{
    $orders = $this->orderService->getAbandonedOrders($user);

    $user->notify(new AbandonedOrdersNotification($orders));
}

In the getAbandonedOrders method you only select the order ID and the order items' names that are being
used in the Notification. The possible bug is that you need to know that the $orders collection contains
only specific columns. You cannot use $order->total for example because it's not loaded. These
properties will default to null . So if you have a nullable column you might think everything is great, you
just have an order where column x happens to be null , but in fact, the column is not even loaded.
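For context, a getAbandonedOrders implementation like the one below is what I have in mind here. It's a hypothetical sketch (the Order model, the items relationship, and the status value are assumptions), but it shows where the "only some columns are loaded" surprise comes from:

public function getAbandonedOrders(User $user): Collection
{
    return Order::query()
        // user_id is needed so relationships can still be resolved
        ->select('id', 'user_id')
        // only load the item columns the notification actually uses
        ->with('items:id,order_id,name')
        ->where('user_id', $user->id)
        ->where('status', 'abandoned')
        ->get();
}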


Composite indexes
There's another topic I want to cover before jumping into more complicated queries and indexes. Let's talk
about composite indexes.

Given the following orders table:

CREATE TABLE `orders` (
    `id` bigint unsigned NOT NULL AUTO_INCREMENT,
    `total` int DEFAULT NULL,
    `user_id` bigint unsigned DEFAULT NULL,
    `created_at` timestamp NOT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

We can add a composite index to the user_id and created_at columns:

create index orders_user_id_created_at on orders(created_at, user_id)

Let's assume in this example we want to query orders by users in a given time period such as this:

select id
from orders
where user_id = 3001
and created_at between "2024-03-16 00:00:00" and "2024-04-16 23:59:59"

In my case, the table has 6,000 rows and 1,000 of them belong to user #3001.

Let's take a look at the execution plan:

It all looks good. It's a range type and Using index is present in the Extra column. Well, take a look at the
column called Rows . It says 5,513. Essentially, for some reason, MySQL thinks that it needs to look at every
single record in the database in order to execute the given query. It seems a bit weird especially since the
constraints in the query are quite strict:


where user_id = 3001
and created_at between "2024-03-16 00:00:00" and "2024-04-16 23:59:59"

If user #3001 has 1,000 records overall then the date filter should narrow it down even more, right? In fact,
this query returns only 513 rows. So why does MySQL think that it needs to scan every node in the index?

The answer is the order of the columns in the index. We already discussed that indexes are sorted. The
same thing is true if you have a composite index but now it is ordered by two columns. First, created_at
and then user_id .

If we imagine the index as a table this is what the order looks like:

created_at user_id

2024-03-15 1

2024-03-16 1

2024-03-16 2

2024-03-16 3

2024-03-17 1

User IDs are sorted only in relation to the created_at dates. Just look at the value 1 in this table. It's all
over the place. So they are effectively unordered from this query's perspective:

where user_id = 3001
and created_at between "2024-03-16 00:00:00" and "2024-04-16 23:59:59"

So this is what MySQL does:

Finds every node in the index between 2024-03-16 and 2024-04-16 which happens to be 5,500
records (90% of the table)

Traverses through them and discards the ones where the user ID is not equal to 3001

Even though it's a range query, in practice it is a full index scan ( index type) which is the second worst
query type.

This brings us to an observation: column order does matter in an index.

Let's change it to:

create index orders_user_id_created_at on orders(user_id, created_at)

The new execution plan is:


This is a much better execution plan because the database engine only wants to scan 513 rows. It's 10% of
the previous one. It's a 10x improvement. All of that is because of the column order in the index.

Now the index looks like this:

user_id created_at

1 2024-03-15

1 2024-03-16

1 2024-03-17

2 2024-03-16

3 2024-03-16

With this ordering, MySQL is able to select the following range: from row #1 to row #3 (which in the real
example is 500+ rows) and then perform the where filter on the created_at column.

This observation brings us to the next important database concept.


Cardinality
Cardinality means how unique your dataset is. How many unique values are there? For example:

In one month there are 2,678,400 unique timestamps (if we count only seconds). In this one-month
period, the created_at column has a cardinality of 2.6m

There's a good chance you don't have 2.6m users but less. The cardinality of the user_id is x where
x is the number of users. Maybe a few thousand, maybe tens of thousands.

If there are fewer users than timestamps in this example, then user_id has much fewer unique values. It
has a lower cardinality. The cardinality of a column determines the selectivity of the query. The more
selective a query is the fewer rows it returns. Or in human-readable form: fewer unique values = fewer
results = faster queries.

This is exactly what happened with the composite index in the previous example.

user_id has a lower cardinality (it has fewer unique values than created_at )

The index was ordered based on a column that has much fewer unique values

The result was that the optimizer was able to fetch a range of just 500 rows from the index

You should be able to exploit the fact that cardinality matters in some cases. One of the best examples is
when a column can have two or three values. Such as a bool, or a string label, for example, a status
column in a posts table where possible values are published , draft , or archived . These kinds of
columns can be excellent candidates for indexes and effective queries.
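If you're unsure about the cardinality of a column, you can simply ask MySQL. A quick sketch (the table and column names are from the examples above):

use Illuminate\Support\Facades\DB;

// MySQL's own (estimated) cardinality for every indexed column of the table
$indexStats = DB::select('show index from orders');

// The exact number of distinct values in a column
$distinctUsers = DB::table('orders')
    ->selectRaw('count(distinct user_id) as c')
    ->value('c');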


Database indexing in practice


In this chapter, we'll implement some features in a sample application and hope we can make a connection
between theory and practice.

In recent years, I worked on an application that had some social features. The app is being used in larger
companies and admins can post news, events, etc to users. This is one of the most used features and is a
good example of database indexing and queries.

So here's the example table:

CREATE TABLE `posts` (
    `id` int NOT NULL AUTO_INCREMENT,
    `title` varchar(150) DEFAULT NULL,
    `publish_at` timestamp NULL DEFAULT NULL,
    `user_id` bigint unsigned DEFAULT NULL,
    `status` varchar(10) DEFAULT NULL,
    `content` text,
    `created_at` timestamp NULL DEFAULT NULL,
    `updated_at` timestamp NULL DEFAULT NULL,
    PRIMARY KEY (`id`),
    KEY `idx` (`status`,`user_id`,`publish_at`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=12567 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

The table itself is pretty simple and self-explanatory but there are already some important things:

status is going to store three values: draft , published , and archived . If you know that the longest
word you are about to store is 9 characters, don't use a default varchar(255) column. If you use a
column in an index the size can have a big impact on performance. And if you have a status column
there's a good chance it's a candidate for an index.

content is a text column. In lots of projects somehow longtext became the default for storing
something like the content of a post. Do you know what is the size of a longtext column? It's
4,294,967,295 bytes. It's 4 billion characters. You probably don't need that much. A text column can
hold 65,535 characters.

The same goes for title . Usually, a column such as title , or name always ends up in an index. We
can save some space and set it to 150 . It's probably enough for users. If not, we can always resize.

Maybe now you say: "well, these are pretty minor stuff." Yeah, they are in some cases. In the Building an
analytics platform part I'll show you what these minor changes can do when you have a table with only a
few million records. TL;DR: save some space/memory/CPU and use the right data types.
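As a sketch, this is roughly what the corresponding migration could look like. The column sizes follow the reasoning above; the id type and nullability details are assumptions based on the CREATE TABLE statement:

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        Schema::create('posts', function (Blueprint $table) {
            $table->increments('id');
            $table->string('title', 150)->nullable();
            $table->timestamp('publish_at')->nullable();
            $table->unsignedBigInteger('user_id')->nullable();
            // 10 characters cover draft/published/archived
            $table->string('status', 10)->nullable();
            // text (64KB) instead of longtext (4GB)
            $table->text('content')->nullable();
            $table->timestamps();

            $table->index(['status', 'user_id', 'publish_at'], 'idx');
        });
    }
};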

The three main features:


Listing posts by status. Admins (the owner of a post) want to see their own posts by status.

Feed

Publishing posts. There's a publish_at column in the table. It can hold past or future dates. If it's a
future one it means we need to automatically publish it. So in a real-world app, I would add a
background job that queries "publishable" posts and publishes them. We need a query for that.

In these examples, we'll only discuss raw MySQL queries. Remember, the main goal of this chapter is MySQL
and MySQL indexes, not some fancy Eloquent function.


Listing posts by status


The first query that lists posts for a specific user in a given status looks like this:

select *
from posts
where user_id = 13268
and status = 'published'

Of course, the table has no indexes right now so it's going to do a full table scan. We can easily identify the
right index for this query:

create index posts_idx on posts(status,user_id)

After so much theory I think this index is self-explanatory. The column with the lower cardinality goes first.
As we discussed earlier, it helps in most cases (however, in some cases it doesn't make a big difference)

After adding this index the execution plan looks like this:

It's a ref type. We haven't seen that one before. I had a hard time understanding what it was based on the
MySQL Docs. Fortunately, PlanetScale's definition by Aaron Francis is much easier to understand:

The ref access type is used when the query includes an indexed column that is being matched by an
equality operator. If MySQL can locate the necessary rows based on the index, it can avoid scanning
the entire table, speeding up the query considerably. - Aaron Francis

As far as I know, it has similar performance to range . The optimizer is able to identify a range of records
(390 in this case) based on equality operators. A range access type occurs when we use a range operator
such as between , or > , etc.

So it is an efficient query that won't cause problems. We know from the select * part that MySQL will
perform lots of I/O operations to get every column from the disk. We can optimize this a bit:


select id, title, publish_at, created_at
from posts
where user_id = 13268
and status = 'published'

In a list we don't need the following columns:

content since it's not shown

user_id since the user already knows who he/she is (hopefully)

status since the user requested posts by status so he/she must know it (hopefully)

Using select col1, col2 over select * has two interesting properties:

It's an optimization since MySQL returns less data over the TCP connection

But on the I/O level it doesn't matter at all. As we discovered earlier, the OS always reads an entire
block of data from the disk. This block contains multiple rows. Each row contains every column. So
unfortunately we cannot save I/O operations with this technique. Even if you have a table with 10
columns and your query looks like this:

select col1
from my_table
where id = 1

MySQL reads probably hundreds of rows with 10 columns from the disk (given you don't have an index on
col1 ).

Back to the query. The next feature request is to add a date filter:

select id, title, publish_at, created_at
from posts
where user_id = 13268
and status = 'published'
and publish_at between "2024-03-16" and "2024-04-16"

The execution plan is this:


Only two things changed: the filtered column went down from 100 to 11.11 . And the extra column
contains Using where . What does it mean?

MySQL uses the index to get 390 rows. It uses the status and the user_id columns

These 390 rows have to be checked against the publish_at between ... expression

MySQL estimates that only 11.11% of the rows will remain in the result. This means about 43 records

This is also the reason why Using where appeared in the extra column.

It all makes sense, right? However, usually, a lower filtered value indicates a slower query. Just think
about it. If MySQL had to drop 88.89% of rows it means that the index is not very good for this query. In a
perfect world, we would be able to serve the request using only an index and save the time to filter out 88%
of records.

To test this theory we can add the publish_at column to the index:

create index posts_ids on posts(status,user_id,publish_at)

For this specific example, I seeded 177k posts of which 166k belong to user #13268. So the rows values
will be different:

As expected with the publish_at column being included in the index MySQL doesn't need to run extra
filters. The index can satisfy the query.

On my system, the execution times look like this:

84ms when publish_at is part of the index

105ms when publish_at is not part of the index and MySQL needs to execute an extra filter

These numbers are quite small, however there's a 20% difference between the two queries.

While the filtered column is certainly not the most important part of explain it can be a good indicator
for a better index. However, focusing on the access type is much more important, in my opinion.

The next interesting thing is Using index condition in the extra column. It means that MySQL was able
to use the index to run every filter we have in the query. Previously, it had to read full table rows because publish_at was not part of the index. Now, it is part of the index, so it can be used to run the necessary
filters. What's the difference between Using index and Using index condition ?


Using index means that MySQL only used the index to satisfy the whole query.

Using index condition means that MySQL used the index to run the filters. However, it still had to
read from disk because in the select statement, we have created_at and title and they are not part
of the index.

Why not add these columns to the index as well? Because that index would contain almost the whole table and it would consume so much memory and disk space that it could become problematic.

For example, if I add created_at and title to the index the query becomes a Using index one, and the
execution time is 77ms. Previously it was 84ms. So the time improvement is only 8.33% and I can guarantee
you that adding another timestamp and a varchar(255) to the index causes a memory consumption
increase that is larger than 8%.

Also, this query isn't worth optimizing anymore, I think. It's very unlikely that a single user has 166k posts in
an application. But even if that's the case:

The query runs under 100ms for 166k posts

It's a range query

It performs a minimum amount of I/O operations thanks to Using index condition

The number of rows examined looks healthy, filtered is 100

It's fine.


Feed
This feature is not too realistic but it demonstrates another cool access type. Let's say we need to show a feed to users that has no filter, only a date range. In a company environment, it's not realistic since you always
have some groups, visibilities, etc. But for a minute, assume this is a Facebook group and as one of the
members of the group, you'll see every post in a given time period.

The query is very simple:

select id
from posts
where publish_at between "2024-04-10" and "2024-04-17"
and status = "archived"

The execution plan looks like this:

It's a range query as it should be, but take a look at the extra column. It says Using index for skip
scan . This is called a Skip Scan Range access method and it's beautiful.

In this case, MySQL has three choices.

1. Range scan

As we discussed earlier, in a range scan MySQL identifies a range of nodes from the tree. It uses only the
index to achieve that.

However, in this case, it cannot be done. The index can be visualized as a table such as this one:

status user_id publish_at

archived 1 2024-04-17

archived 2 2024-03-28

archived 2 2024-04-01

draft 1 2024-03-01

draft 6 2024-02-28

published 4 2024-04-07


Let's say we're looking for archived posts in a given date range (2024-04-10 to 2024-04-17 in our query). First, MySQL traverses the tree to get only the archived rows:

status user_id publish_at

archived 1 2024-04-17

archived 2 2024-03-28

archived 2 2024-04-01

And now what? Now it has a subset of nodes ordered by user_id, because user_id is the second column in the index. The query, however, doesn't have a filter on user_id, only on publish_at. Since we don't care about user_id right now, let's take it out of the table:

status publish_at

archived 2024-04-17

archived 2024-03-28

archived 2024-04-01

Now you can see the problem: from this perspective, the timestamps are in completely random order.

This means that MySQL cannot just traverse the tree to perform a binary search on the nodes because it's
not ordered by publish_at .

This is why an index range cannot be used.

2. Full index scan

So MySQL doesn't have a choice other than performing a full index scan using the linked list at the bottom.
But it's such a waste. Using the subset shown above, there has to be a better way to select the required rows.
Fortunately, there is.

3. Skip scan range

This algorithm works in two steps:

Loops over the unique values of the skipped index part ( user_id )

Performs a subrange scan based on the remaining index part ( publish_at )

Let's emulate it step-by-step using this subset:


status user_id publish_at Row #

archived 1 2024-04-17 1

archived 2 2024-03-28 2

archived 2 2024-04-01 3

Get the first distinct value of user_id . It is 1

Construct a range based on this user_id and the date filter from the original query. So it will construct a look-up that can be imagined like this:

select id
from posts
where status = "archived"
and user_id = 1
and publish_at between "2024-04-10" and "2024-04-17"

In the real world, of course, it's not a MySQL query but an index lookup. This imaginary query examines Row
#1 which satisfies the filters. So we can skip to the next unique value of user_id and repeat the process:

Get the second distinct value of user_id . It is 2

Construct a range based on this user_id and the date filter from the original query:

select id
from posts
where status = "archived"
and user_id = 2
and publish_at between "2024-04-10" and "2024-04-17"

This imaginary query will examine Row #2 and Row #3. Both fail the date filter so they are rejected from the results.

The lookup is done in this specific example and only Row #1 made it.

As you can imagine, the cost of this operation is much lower than a full index scan which is O(n) . The skip
scan range requires O(m) iterations where m is the distinct number of users that have archived posts
which cannot be larger than the number of all users.

Once again, cardinality and the order of columns matter a lot in a composite index! Be aware of these
things.


Publishing posts
And finally, let's write some PHP/Laravel code! This example is not strictly related to database indexing but
it's an interesting one so I left it here.

The next feature is publishing scheduled posts. To do that we need to query all publishable posts, where
publishable means:

The post is draft

The publish_at date is less than or equal to now() .

Then we loop through the posts, mark them as published, and send notifications to the audience. I added a
new relationship to users. Each user can have subscribers that are stored in a subscriptions table:

author_id subscriber_id

1 10

1 11

2 23

Author #1 has two subscribers and author #2 has one in this example. Both columns are foreign keys to the users table. There are two relationships in the User model:

namespace App\Models;

class User extends Model
{
    public function subscriptions(): BelongsToMany
    {
        return $this->belongsToMany(
            User::class, 'subscriptions', 'subscriber_id', 'author_id'
        );
    }

    public function subscribers(): BelongsToMany
    {
        return $this->belongsToMany(
            User::class, 'subscriptions', 'author_id', 'subscriber_id'
        );
    }
}
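For completeness, the underlying pivot table can be created with a migration roughly like this (a sketch; the composite primary key and the cascade behavior are assumptions):

use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

Schema::create('subscriptions', function (Blueprint $table) {
    $table->foreignId('author_id')->constrained('users')->cascadeOnDelete();
    $table->foreignId('subscriber_id')->constrained('users')->cascadeOnDelete();

    // A subscriber can only subscribe to the same author once
    $table->primary(['author_id', 'subscriber_id']);
});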


It's an n-n relationship where each author (user) can have many subscribers (users) and each subscriber
(user) can subscribe to many authors (users).

Here's a very simple implementation of this feature:

namespace App\Jobs;

class PublishPostsJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function handle(): void
    {
        $posts = Post::query()
            ->select('id', 'title', 'user_id')
            ->where('status', 'draft')
            ->where('publish_at', '<=', now())
            ->get();

        foreach ($posts as $post) {
            Notification::send(
                $post->author->subscriptions, new PostPublishedNotification($post)
            );

            $post->status = 'published';

            $post->save();
        }
    }
}


This can cause performance problems for multiple reasons:

It tries to load all publishable posts into memory at once. It works perfectly for a small number of posts
but the moment you try to load tens of thousands of them it will slow down your server or exceed your
memory limit and fail.

It tries to execute potentially thousands of update queries in a pretty short time. It'll spam the
database which can cause a slowdown or even worse an outage.

Neither the author nor the subscriptions relationships are eager loaded. The for loop causes
potentially thousands of extra queries because of the $post->author->subscriptions line.

The goal here is to "divide et impera" or divide and conquer. We need to chunk this big task into smaller
more manageable pieces.


Avoiding memory problems

To avoid loading 10,000 posts into memory at once we can use one of Laravel's built-in helpers:

lazy and lazyById that give us a LazyCollection that can be used just like a normal one but it uses
generators and keeps only one record in memory at once

chunk and chunkById that work the same but instead of one collection they return chunks of data in
predefined sizes

We'll use chunkById since it's a better fit for the use case:

public function handle(): void
{
    Post::query()
        ->select('id', 'title', 'user_id')
        ->with('author.subscriptions:id,email')
        ->where('status', 'draft')
        ->where('publish_at', '<=', now())
        ->chunkById(100, function (Collection $posts) {
            PublishPostChunkJob::dispatch($posts);
        });
}

This is going to load 100 posts into memory at once and dispatch another job that handles the notification
and the status update. The new job is the reason why chunkById is a better fit than lazyById .

Please notice the:

->with('author.subscriptions:id,email')

line. It prevents N+1 queries and eager loads the required relationships. The :id,email at the end makes
sure we only load columns we actually need. Subscribers are required because of the notification so it only
needs an ID and an email address.

The memory problem is solved, now let's write the other job.


Avoiding spamming the database

This is what the PublishPostChunkJob looks like:

namespace App\Jobs;

class PublishPostChunkJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(private Collection $posts)
    {
    }

    public function handle(): void
    {
        DB::table('posts')
            ->whereIn('id', $this->posts->pluck('id'))
            ->update(['status' => 'published']);

        foreach ($this->posts as $post) {
            Notification::send(
                $post->author->subscriptions, new PostPublishedNotification($post)
            );
        }
    }
}

It takes a collection of Posts and then updates them in one query. After that, it sends the notifications. The
collection always has 100 posts which is not a lot and it won't cause problems in the query. If you have
25,000 elements then you end up with a query like this:

update posts
set status = "published"
where id in (1,2,3,4,5,6, ... 24997,24998,24999,25000)


This query is going to be huge and it'll take a long time to run and consume lots of memory. As a general
rule of thumb, a few hundred or a few thousand items won't cause problems but tens of thousands
probably will.

Before measuring performance let's think about the numbers if the job needs to publish 10,000 posts:

It always keeps only 100 posts in the memory.

It chunks posts by 100 so it dispatches 100 chunk jobs.

Each chunk job runs only one query. So overall 100 queries will be executed.

10,000 notifications will be sent

So we successfully scaled down the problem into smaller pieces.
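By the way, the snippets above don't show how PublishPostsJob gets triggered in the first place. A typical way to wire it up (an assumption on my part, not something shown in this chapter) is to run it from the scheduler so publishable posts are picked up every minute:

// app/Console/Kernel.php - hypothetical wiring for this example
protected function schedule(Schedule $schedule): void
{
    // Look for publishable posts every minute; the job chunks the work itself
    $schedule->job(new \App\Jobs\PublishPostsJob())->everyMinute();
}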


Measuring performance

First, let's see the performance before optimization.

The job needs to publish 11k posts. On the first try, it failed after about 15 seconds:

The interesting thing is that it did not fail when loading 11k models into memory but after that. It processed
10.5k records:

If the database update query is removed the job runs in 246ms:

So it looks like loading 11k rows wasn't a problem but updating them in one go caused the process to fail.

So the straightforward implementation wasn't good enough to handle 11k records. Now let's see the
optimized one.

The optimized solution

The whole process (meaning all the jobs) took 8-10 seconds to complete with 11k posts using 4 workers.
This might be surprising but the PublishPostsJob executed 40 queries:


As you can see, there are lots of subscriptions , users , and posts queries. The reason is that chunkById
or lazyById will run multiple database queries. It will generate a query such as this one:

select
`id`,
`title`,
`user_id`
from
`posts`
where
`status` = 'draft'
and `publish_at` <= '2024-04-18 01:14:14'
and `id` > 175810
order by
`id` asc
limit
100

It chunks the query based on the ID using the number you give it (100 in this case).

We used eager loading so it's not just 1 x n queries (where n is the number of chunks) but 3 x n since we
loaded two additional relationships. These are pretty small, performant queries and this job is executed
only once at the beginning so there's nothing to worry about.


Now let's take a look at the PublishPostChunkJob job:

For some reason, each job runs 50 queries. Lots of them target the users and the subscriptions table.
Well, it looks like an N+1 query caused by this row:

foreach ($this->posts as $post) {
    Notification::send(
        $post->author->subscriptions, new PostPublishedNotification($post)
    );
}

This is where we access the author (querying from the users table) and the subscriptions relationships
(reaching out to the subscriptions table).

But we directly used eager-loading to avoid this situation:


Post::query()
    // This line should avoid N+1
    ->with('author.subscriptions')
    ->select('id', 'title', 'user_id')
    ->where('status', 'draft')
    ->where('publish_at', '<=', now())
    ->chunkById(100, function (Collection $posts) {
        PublishPostChunkJob::dispatch($posts);
    });

Well, this is the harsh reality of queue jobs: models are serialized and eager-loaded relationships are lost.
They are not there when your worker processes the job.

Let me repeat that: eager-loaded relationships are not loaded in the worker process. It's an automatic
N+1 query.

Fortunately, it's very easy to fix:


class PublishPostChunkJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(private Collection $posts)
    {
    }

    public function handle(): void
    {
        DB::table('posts')
            ->whereIn('id', $this->posts->pluck('id'))
            ->update([
                'status' => 'published',
            ]);

        // This is an important line
        $this->posts->load('author.subscriptions');

        foreach ($this->posts as $post) {
            Notification::send(
                $post->author->subscriptions,
                new PostPublishedNotification($post),
            );
        }
    }
}

We can reload the relationships by adding this line to the job:

$this->posts->load('author.subscriptions');

Look at this beauty:


6 queries per job. 9.9ms spent on DB queries. That's what we want!

As you can see the update query is pretty fast as well. The overall time for the job is between 20-30ms:

This last feature was not strictly related to database indexes but I hope you found it useful.

Now that we have scratched the surface of jobs let's talk about parallelism, concurrency, and async
workflows. My all-time favorite topic since I was introduced to NodeJS and the event loop.


Async workflows
You probably already know what an async task means but here's a quick summary: asynchronicity means
that a program can perform tasks independently of the main execution flow. In a typical Laravel app, it
means we run background jobs while users can still interact with the application.

There's another concept that we don't usually use in PHP. It's called parallelism. It means simultaneous
execution of multiple tasks, where each task is split into subtasks that can be processed concurrently. Later,
I'm going to show you some examples of concurrent workflows in Laravel.

Web scraping with jobs


Web scraping is quite a good use case to demonstrate some interesting async workflows using jobs.

It's the process of extracting data from websites. It involves fetching the HTML content of a web page and
then parsing the data to extract the desired information, such as text, images, links, or other content. This
data can then be saved, analyzed, or used for various purposes.

I'd like to implement three features:

Discovering the available URLs on a website

Fetching the contents of these pages and extracting some information

Exporting everything to a CSV file

Usually, these kinds of scrapers are used to fetch product and price information or simply to fetch e-mail
addresses from websites and spam the hell out of them. In this example, we'll build a simple one that
discovers a site and then simply extracts H1 and H2 tags. It only handles that simple scenario and I only
tested it with Freek's and my blog.

Here's how it works:

1. Fetch the given website's home page in HTML

2. Find every <a> tag on the page

3. Go to that URL

4. Fetch h1 and h2 tags and their content

5. Repeat Step 2

6. Export the results into a CSV

There's one major problem and it's Step 5 . Can you guess why? Just think about it for a minute. Can you
tell how many URLs a given website has before you start the scraping process?

You cannot, unfortunately. It's not like importing a huge CSV with 1,000,000 records. You know it has
1,000,000 rows and you can dispatch 1,000 jobs each processing 1,000 rows.

But if we don't know how many jobs we should start, how can we tell if every job has succeeded and we can
start exporting the results?

On top of that, discovering URLs involves recursion which makes everything 10% more weird.


As far as I know, when you don't know how many jobs you need, you cannot natively determine whether they
all succeeded with Laravel's toolset.

We need two jobs:

DiscoverPageJob is the one with recursion. It fetches the content of a given URL and looks for <a> tags.
It dispatches another DiscoverPageJob for every href it finds.

ScrapePageJob is the one that finds h1 and h2 tags and fetches their content.

There are a number of different approaches to running these jobs. Here's an example website that helps us
understand these approaches:

The page /contact has no further links. Let's see two different solutions and compare which one would
be faster.

Discover first

Discover every page and then dispatch the scrape jobs. This would look like this:

DiscoverPageJob('/')

DiscoverPageJob('/blog')

DiscoverPageJob('/blog/first-article')

DiscoverPageJob('/blog/second-article')

DiscoverPageJob('/products')

DiscoverPageJob('/products/first-product')

DiscoverPageJob('/products/second-product')

DiscoverPageJob('/contact')

These jobs result in the URLs that the given website has. After that, we can dispatch 8 ScrapePageJobs for
the 8 URLs.


Is this a good approach? To answer that question we need more information:

What does "good" mean? Good means two things in this example:

We want the scraping process to be as fast and effective as possible.

We have two runners and we don't want to have idle time when it's not necessary.

What are the other alternatives? Later we'll talk about them.

So there are two runners and we want to use them effectively. Let's simulate a scraping process:

We start with a DiscoverPageJob('/') . The two workers look like this:

Worker #1 Worker #2

DiscoverPageJob('/') -

In the next "tick" the job dispatches another job:

Worker #1 Worker #2

DiscoverPageJob('/blog') -

And now it discovers the last level:

Worker #1 Worker #2

DiscoverPageJob('/blog/first-article') DiscoverPageJob('/blog/second-article')

This "branch" of the discovery process has ended. There are no other pages on the /blog branch. So it
goes back to the /products branch:

Worker #1 Worker #2

DiscoverPageJob('/products') -

And then goes to Level 3:

Worker #1 Worker #2

DiscoverPageJob('/products/first-product') DiscoverPageJob('/products/second-product')

The branch has ended. It goes back to the last leaf of Level 2:

Worker #1 Worker #2

DiscoverPageJob('/contact') -

/contact does not have links so the discovery process has been completed.


At this point, we have all 8 URLs so we can dispatch the 8 ScrapePageJob in 4 ticks. Both workers process
one job at a time.

So it took 6 ticks to discover the website and then 4 ticks to scrape it. Overall it's 10 ticks.

Let's see the utilization of the workers:

In the discovery process Worker #2 was idle in 4 ticks. It was idle 4/6 or 67% of the time. That's
probably not good.

Worker #1 was idle 0% of the time. Which is good.

Overall the two workers were utilized 8/12 or 67% of the time.

The scraping process utilizes both workers 100% which is perfect.

Number of ticks: 10

Overall utilization: 80.2% (weighted by number of ticks 6 vs 4)


Discover and scrape at the same time

This is the second approach. Discover one level of the website and scrape it immediately.

This is what it looks like:

Worker #1: DiscoverPageJob('/')
Worker #2: ScrapePageJob('/')
Pending: ScrapePageJob('/blog'), DiscoverPageJob('/blog'), ScrapePageJob('/products'),
DiscoverPageJob('/products'), ScrapePageJob('/contact'), DiscoverPageJob('/contact')

You can immediately see the difference. As soon as we start discovering a page we can scrape it at the same
time.

The DiscoverPageJob dispatches other Discover and also Scrape jobs. In the case of the home page, it
finds three links: /blog , /products , and /contact so it dispatches 6 jobs to discover and scrape these 3
pages. This results in 6 pending jobs in the queue waiting to be processed.

The second tick:

Worker #1: ScrapePageJob('/blog')
Worker #2: DiscoverPageJob('/blog')
Pending: ScrapePageJob('/products'), DiscoverPageJob('/products'), ScrapePageJob('/contact'),
DiscoverPageJob('/contact'), ScrapePageJob('/blog/first-article'), DiscoverPageJob('/blog/first-article'),
ScrapePageJob('/blog/second-article'), DiscoverPageJob('/blog/second-article')

Workers process the first jobs from the queue which is discovering and scraping the blog page. It has two
links, so the discover job dispatches 4 new jobs.

The third tick:

Worker #1: ScrapePageJob('/products')
Worker #2: DiscoverPageJob('/products')
Pending: ScrapePageJob('/contact'), DiscoverPageJob('/contact'), ScrapePageJob('/blog/first-article'),
DiscoverPageJob('/blog/first-article'), ScrapePageJob('/blog/second-article'),
DiscoverPageJob('/blog/second-article'), ScrapePageJob('/product/first-product'),
DiscoverPageJob('/product/first-product'), ScrapePageJob('/product/second-product'),
DiscoverPageJob('/product/second-product')

This is the same but with the /products page. We know that there are no other links on the webpage so
from here we can process everything.

And then the rest of the ticks look like this:


# of tick Worker #1 Worker #2

4 ScrapePageJob('/contact') DiscoverPageJob('/contact')

5 ScrapePageJob('/blog/first-article') DiscoverPageJob('/blog/first-article')

6 ScrapePageJob('/blog/second-article') DiscoverPageJob('/blog/second-article')

7 ScrapePageJob('/product/first-product') DiscoverPageJob('/product/first-product')

8 ScrapePageJob('/product/second-product') DiscoverPageJob('/product/second-product')

Here are the results:

Number of ticks: 8 (earlier it was 10)

Overall utilization: 100% (earlier it was 80.2%)

There's a 20% decrease in the number of ticks and a ~25% increase in utilization.

When I introduced the "Discover first" approach I asked the question, "Is this a good approach?" And then I
gave you a bit more information:

What does "good" mean? Good means two things in this example:

We want the scraping process to be as fast and effective as possible.

We have two runners and we don't want to have idle time when it's not necessary.

What are the other alternatives? Later we'll talk about them.

Now we can clearly see it was not a "good" approach. At least not the best.

The next question: can we do better? And the answer is no because:

The sample website has 8 pages

It means 16 jobs (8 discover and 8 scrape)

With 2 workers 16 can be processed in 8 ticks with 100% utilization

So 8 ticks and 100% utilization is as good as it can be

The point is that if you want to make something async by using jobs try to think parallel. Try to squeeze as
much work out of your workers as possible.


Let's start implementing the DiscoverPageJob :

class DiscoverPageJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels,
        Batchable;

    public function __construct(
        private Scraping $scraping,
        private ?string $currentUrl = '',
        private int $depth = 0,
        private int $maxDepth = 3,
    ) {
        if (!$this->currentUrl) {
            $this->currentUrl = $this->scraping->url;
        }
    }
}

First of all, let's talk about the arguments.

Scraping is a model that represents the scraping of a website. It has a HasMany relationship to the
ScrapingItem model that contains every discovered URL on the site. This is the scrapings table:

id url created_at

1 https://freek.dev 2024-01-31 14:25:00

2 https://martinjoo.dev 2024-01-31 15:01:43

I don't worry about user management and authentication right now but if it was a standard SaaS application
the scraping table would have contained a user_id column.

scraping_items has records such as these:

id: 1
scraping_id: 2
url: https://martinjoo.dev/blog
content: {"h1": "Hey", "h2s": []}
status: done
created_at: 2024-01-31

id: 2
scraping_id: 2
url: https://martinjoo.dev/how-to-measure-performance-in-laravel-apps
content: {"h1": "How to Measure Performance in Laravel Apps", "h2s": ["ab", "jmeter", "Inspector", "Telescope", "OpenTelemetry", "XDebug + qcachegrind"]}
status: done
created_at: 2024-01-31


A Scraping has many ScrapingItem records. One for each URL. The content contains the h1 and all the h2
tags on the given page.
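The book doesn't show the model classes themselves, so here's a minimal sketch of what they could look like, assuming conventional table names and a JSON-casted content column:

use Illuminate\Database\Eloquent\Model;
use Illuminate\Database\Eloquent\Relations\BelongsTo;
use Illuminate\Database\Eloquent\Relations\HasMany;

class Scraping extends Model
{
    protected $guarded = [];

    public function items(): HasMany
    {
        return $this->hasMany(ScrapingItem::class);
    }
}

class ScrapingItem extends Model
{
    protected $guarded = [];

    // content stores the scraped h1/h2 data as an array
    protected $casts = ['content' => 'array'];

    public function scraping(): BelongsTo
    {
        return $this->belongsTo(Scraping::class);
    }
}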

Before we start a scraping process and dispatch the DiscoverPageJob we have to create a Scraping model
and pass it to the job.

The second parameter to the job is the $currentUrl which is empty by default. It refers to the URL the job
has to discover. On the first run it's the homepage, so it defaults to the Scraping model's url property.

$depth refers to the current level that is being discovered just as I showed you earlier. $maxDepth sets a
limit where the job stops discovering the page. It's not necessary but it's a great way to avoid jobs running
for multiple hours or days (just imagine discovering 100% of Amazon).
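Kicking off a run could then look something like this (hypothetical; in this project the actual entry point is the ScrapeAction shown at the end of the chapter):

$scraping = Scraping::create(['url' => 'https://martinjoo.dev']);

DiscoverPageJob::dispatch($scraping);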

public function handle(): void
{
    if ($this->depth > $this->maxDepth) {
        return;
    }

    $response = Http::get($this->currentUrl)->body();

    $html = new DOMDocument();

    @$html->loadHTML($response);
}

It fetches the content of the current URL and then it parses it as HTML using PHP's DOMDocument class.

Then we have to loop through the <a> tags, scrape them, and discover further pages.

foreach ($html->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');

    if (Str::startsWith($href, '#')) {
        continue;
    }

    // It's an external href
    if (
        Str::startsWith($href, 'http') &&
        !Str::startsWith($href, $this->scraping->url)
    ) {
        continue;
    }
}

The two if conditions filter out links such as:

#second-header

http://google.com

These are links we don't need to discover and scrape.

As the next step, we can dispatch the scrape and the discover jobs to the new page we just found at $href .
However, links can be in two forms: absolute and relative. Some <a> tags contain links such as
https://example.com/page-1 while others have a relative URL such as /page-1 . We need to handle this:

if (Str::startsWith($href, 'http')) {
    $absoluteUrl = $href;
} else {
    $absoluteUrl = $this->scraping->url . $href;
}

The $absoluteUrl variable contains an absolute URL where we can send HTTP requests so it's time to
dispatch the jobs:

ScrapePageJob::dispatch($this->scraping, $absoluteUrl, $this->depth);

DiscoverPageJob::dispatch(
    $this->scraping, $absoluteUrl, $this->depth + 1, $this->maxDepth
);

ScrapePageJob fetches the content of the page while DiscoverPageJob discovers all links on the page and
dispatches new jobs.

ScrapePageJob is really simple:

class ScrapePageJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels,
        Batchable;

    public function __construct(
        private Scraping $scraping,
        private string $url,
        private int $depth
    ) {}

    public function handle(): void
    {
        $scrapingItem = ScrapingItem::create([
            'scraping_id' => $this->scraping->id,
            'url' => $this->url,
            'status' => 'in-progress',
            'depth' => $this->depth,
        ]);

        try {
            $response = Http::get($this->url)->body();

            $doc = new DOMDocument();

            @$doc->loadHTML($response);

            $h1 = @$doc->getElementsByTagName('h1')->item(0)->nodeValue;

            $h2s = collect(@$doc->getElementsByTagName('h2'))
                ->map(fn ($item) => $item->nodeValue)
                ->toArray();

            $scrapingItem->status = 'done';

            $scrapingItem->content = [
                'h1' => $h1,
                'h2s' => $h2s,
            ];

            $scrapingItem->save();
        } catch (Throwable $ex) {
            $scrapingItem->status = 'failed';

            $scrapingItem->save();

            throw $ex;
        }
    }
}

It uses the same DOMDocument class to find h1 and h2 tags on the page and then it creates the
scraping_items record. If something goes wrong it sets the status to failed .

So we have the main logic. We can discover and scrape webpages (all right, I didn't show 100% of the code
here because it has some edge cases and small details but that's not important from the async point-of-
view. You can check out the source code.)

Now look at this line again in the DiscoverPageJob :

ScrapePageJob::dispatch($this->scraping, $absoluteUrl, $this->depth);

DiscoverPageJob::dispatch(
    $this->scraping, $absoluteUrl, $this->depth + 1, $this->maxDepth
);

How can we tell if scraping has finished? Right now there's no way. It's just an endless stream of jobs
without any coordination.

Usually, when you want to execute a function or dispatch another job when a set of jobs has finished you
can use job batches:

Bus::batch([
    new FirstJob(),
    new SecondJob(),
    new ThirdJob(),
])
    ->then(function () {
        echo 'All jobs have completed';
    })
    ->dispatch();


This is a really convenient way of waiting for jobs to be finished.

However, in our case, we don't know exactly how many jobs there are because the DiscoverPageJob
recursively dispatches new ones.

So if we add a batch such as this:

foreach ($html->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');

    // ...

    Bus::batch([
        new ScrapePageJob($this->scraping, $absoluteUrl, $this->depth),
        new DiscoverPageJob(
            $this->scraping, $absoluteUrl, $this->depth + 1, $this->maxDepth
        ),
    ])
        ->then(function () {
            var_dump('Batch has finished');
        })
        ->dispatch();
}

The following happens:


In every loop, there's a batch with the jobs.

Another idea would be to create the batch before the loop, add the jobs to it inside, and then dispatch them
once the loop is finished:

$batch = Bus::batch([]);

foreach ($html->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');

    // ...

    $batch->add([
        new ScrapePageJob($this->scraping, $absoluteUrl, $this->depth),
        new DiscoverPageJob(
            $this->scraping, $absoluteUrl, $this->depth + 1, $this->maxDepth
        ),
    ]);
}


$batch
    ->then(function () {
        var_dump('Batch has finished');
    })
    ->dispatch();

But it's not a good solution either. The only difference is that each batch has more jobs but we still have 8
batches if we're using the sample website from earlier as an example. Each DiscoverPageJob created a
new batch with number of links * 2 jobs in it.

So using batches is the right path because they allow us to await jobs. However, as far as I know, our exact
problem cannot be solved with Laravel's built-in methods and classes.

What we want to do is count the number of jobs as we dispatch them and then decrease the counter as
workers process them.

First, I create a new function called dispatchJobBatch :

public function dispatchJobBatch(array $jobs)
{
    Bus::batch($jobs)
        ->dispatch();
}

This function is being used in the job's handle method:

$this->dispatchJobBatch([
    new ScrapePageJob($this->scraping, $absoluteUrl, $this->depth),
    new DiscoverPageJob(
        $this->scraping, $absoluteUrl, $this->depth + 1, $this->maxDepth
    ),
]);

The next step is to implement the counter. Since we have multiple workers it has to be a "distributed"
counter available to all workers. Redis and the Cache facade is an awesome starting point. I mean, the
database cache_driver is equally amazing:


public function dispatchJobBatch(array $jobs)
{
    $jobCountCacheKey = $this->scraping->id . '-counter';

    if (Cache::has($jobCountCacheKey)) {
        Cache::increment($jobCountCacheKey, count($jobs));
    } else {
        Cache::set($jobCountCacheKey, count($jobs));
    }

    Bus::batch($jobs)
        ->dispatch();
}

I'm using the Scraping object's ID as part of the cache key and I increment it with the number of jobs being
dispatched (usually 2).

So now we know exactly how many jobs are needed to scrape a given website:

Great! The next step is to decrease the counter as workers process the jobs. Fortunately, there's a
progress method available on the Bus object:

Bus::batch($jobs)
    ->progress(function () {
        var_dump("Just casually making some progress");
    })
    ->dispatch();

Now the output is this:


So the progress callback runs every time a job is processed in the batch. Exactly what we need:

public function dispatchJobBatch(array $jobs)
{
    $jobCountCacheKey = $this->scraping->id . '-counter';

    if (Cache::has($jobCountCacheKey)) {
        Cache::increment($jobCountCacheKey, count($jobs));
    } else {
        Cache::set($jobCountCacheKey, count($jobs));
    }

    Bus::batch($jobs)
        ->progress(function () use ($jobCountCacheKey) {
            Cache::decrement($jobCountCacheKey);
        })
        ->dispatch();
}


This is how it works:

Anytime you call this function and dispatch jobs the counter will be increased with the number of jobs

Then the batch is dispatched

After every job that has been completed the counter decreases

Now we're ready to add the then callback and run a callback when every job has been completed:

Bus::batch($jobs)
    ->then(function () use ($jobCountCacheKey) {
        if (Cache::get($jobCountCacheKey) === 0) {
            var_dump("Look! I'm ready");
        }
    })
    ->progress(function () use ($jobCountCacheKey) {
        Cache::decrement($jobCountCacheKey);
    })
    ->dispatch();

The output is:

Exactly what we wanted! On the screenshot the var_dump is not the very last row, I know, but it's only the
terminal output. Logically it works as it needs to.


Don't forget that we can only run our callback if the counter is zero. So this is a very important line:

if (Cache::get($jobCountCacheKey) === 0) {
    var_dump("Look! I'm ready");
}

Once again, the then function is called after a batch has been finished so we need this if statement.

One more thing we can do is delete the cache key entirely after all jobs have been completed:

->then(function () use ($jobCountCacheKey, $scraping) {
    if (Cache::get($jobCountCacheKey) === 0) {
        Excel::store(new ScrapingExport($scraping), 'scraping.csv');

        Cache::delete($jobCountCacheKey);
    }
})

And of course, instead of var_dumping I actually dispatch another export job but that's not important right
now, and there's a dedicated chapter for exports and imports.

Here's the whole function:


public function dispatchJobBatch(array $jobs)
{
    $jobCountCacheKey = $this->scraping->id . '-counter';

    if (Cache::has($jobCountCacheKey)) {
        Cache::increment($jobCountCacheKey, count($jobs));
    } else {
        Cache::set($jobCountCacheKey, count($jobs));
    }

    $scraping = $this->scraping;

    Bus::batch($jobs)
        ->then(function () use ($jobCountCacheKey, $scraping) {
            if (Cache::get($jobCountCacheKey) === 0) {
                Excel::store(new ScrapingExport($scraping), 'scraping.csv');

                Cache::delete($jobCountCacheKey);
            }
        })
        ->progress(function () use ($jobCountCacheKey) {
            Cache::decrement($jobCountCacheKey);
        })
        ->dispatch();
}


In these callbacks ( then , progress , etc.) we cannot use $this . This is why there's this line $scraping =
$this->scraping and then the $scraping variable is being used in the then callback.

And this is what the handle method looks like:

public function handle(): void
{
    if ($this->depth > $this->maxDepth) {
        return;
    }

    $response = Http::get($this->currentUrl)->body();

    $html = new DOMDocument();

    @$html->loadHTML($response);

    foreach ($html->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');

        if (Str::startsWith($href, '#')) {
            continue;
        }

        // It's an external href
        if (
            Str::startsWith($href, 'http') &&
            !Str::startsWith($href, $this->scraping->url)
        ) {
            continue;
        }

        if (Str::startsWith($href, 'http')) {
            $absoluteUrl = $href;
        } else {
            $absoluteUrl = $this->scraping->url . $href;
        }

        $this->dispatchJobBatch([
            new ScrapePageJob($this->scraping, $absoluteUrl, $this->depth),
            new DiscoverPageJob(
                $this->scraping, $absoluteUrl, $this->depth + 1, $this->maxDepth
            ),
        ]);
    }
}

The main takeaways from this example:

Use async jobs whenever it's possible

Plan your workflow because there's a big difference between different solutions

Batches are great

I included the sample website's source code in async-workflows/sample-website so you can watch/debug
the process if you'd like to. You can serve the site with:

php -S localhost:3000

And then you can start the whole process with an action:

php artisan tinker

$scraper = app(App\Actions\ScrapeAction::class);

$scraper->execute('http://localhost:3000');


Concurrent programming
PHP is traditionally a single-threaded, blocking language. But what does it mean?

If we have a program such as this:

sleep(5);

echo "Hello";

And we serve it using:

php -S localhost:3000

And then send two requests at the same time, this is what happens:

The first request took 5s to complete but the second one took almost 10s.

That's because PHP by default uses only one thread to execute your code. The first request occupied this
one thread for 5 seconds so the second one had to wait for 5s before it was processed.

This sounds lame. And fortunately, this is not what happens in production systems, right? I mean, if 10 users
use the application at the same time the app won't be 10 times slower for the last user. This is possible
because of PHP-FPM.

Earlier, I published a very detailed article about FPM, nginx, and FastCGI so if you don't know them very well,
please read this article. But here's the executive summary:

PHP-FPM acts as a process manager and load balancer

Whenever a new request comes in it goes to FPM (via nginx)

It has one master process and many worker processes

It balances the requests across the workers, but every request goes to one of them

The worker process boots up your application and executes your code

The main point is that in a production system, your application is not using only one thread. Each of those
worker processes uses different threads. So even though, in our codebase, we use one thread, the
execution environment runs multiple processes and multiple threads to handle multiple requests. And this
is a great architecture. Easy to write your code and easy to scale as well.

However, we can still end up in situations when one function takes a long time to run. No matter how great
FPM works this one function inside the request still needs to be executed and if it takes 10 seconds users
won't be happy. 90% of the time we can use jobs to process these long-running tasks in the background. In
an async way. However, what if we just can't do that? For example, there's a GET request and we need a
response immediately. So it's not like you can just dispatch a job because you need to respond in a sync
way, but the function takes seconds to run.

We can fork new processes. Before we do so, let's clarify three terms: program, process, and thread.

Program: A program is a set of instructions and data that are stored in a file and are intended to be
executed by a computer. Your index.php is a program.

Process: A process is an instance of a program that is being executed. It represents the entire runtime
state of a program, including its code, data, stack, registers, and other resources such as open files and
network connections. If you open htop you see processes. For example, this is an artisan command
(program) being executed as a process:

Thread: A thread is the smallest unit of execution within a process. A single process can contain
multiple threads, each running its own sequence of instructions concurrently. Eventually, a thread is
the actor that executes your code.

So now we know that a process is a running program with code and data. When we fork a new process, the
currently running process copies itself and creates a child process with the same code and data. I
highlighted the words copy and code for a reason that will be important in a minute.

Here's how we can fork processes in PHP:

pcntl_fork();

From that point, everything that follows pcntl_fork will run twice. For example:

var_dump('only once');

pcntl_fork();

var_dump('twice');

Take a look at this:


This is why it's called concurrent programming.

Remember, a process contains code and data. When we fork a new process, the currently running process
copies itself and creates a child process with the same code and data. So everything that comes after
pcntl_fork will be executed twice: by the parent and the child process as well.

But it's obviously not what we want. I mean, what's the point of having another process that does exactly
the same as the original one? Nothing.

Fortunately, pcntl_fork(); returns a value, the process ID or PID for short:

If we are in the parent process the PID is an actual number such as 25477

But if we are in the child process the PID is always 0

So we can do this:

$pid = pcntl_fork();

if ($pid === 0) {
    var_dump('child process');
} else {
    var_dump('parent process');
}


Of course, this is still weird. The whole code is one if-else statement without any user input and somehow
we managed to run both the if and the else branches at the same time. Imagine if this app had any
users.

But now you can see that we have different "branches." One for the child and one for the parent. Of course,
in a minute we'll have not one but 8 children and the basic idea is that:

A parent process does no heavy work. It only manages child processes.

A child process does the heavy work. And lots of children can do lots of heavy work.

Let's simulate that the child process does some long-running tasks:

$pid = pcntl_fork();

if ($pid === 0) {
    var_dump('child process');

    sleep(2);

    var_dump('child finished');

    exit;
} else {
    var_dump('parent process');
}

The output is:

As you can see, the parent process exits before the child process finishes its job.


One of the responsibilities of a parent is to wait for children. For that, we can use the pcntl_waitpid
function:

$pid = pcntl_fork();

if ($pid === 0) {
    var_dump('child process');

    sleep(2);

    var_dump('child finished');

    exit;
} else {
    var_dump('parent process');

    pcntl_waitpid($pid, $status);

    var_dump('parent finished');
}

And now the output looks better:

So now we have two "branches." One that does the work and another one that manages these workers. It
can wait for them, maybe receive some outputs, and so on.


Now, let's create 8 children:

$childPids = [];

for ($i = 0; $i < 8; $i++) {
    $pid = pcntl_fork();

    if ($pid === 0) {
        sleep(1);

        var_dump('child finished with PID ' . getmypid());

        exit;
    } else {
        $childPids[] = $pid;
    }
}

foreach ($childPids as $pid) {
    pcntl_waitpid($pid, $status);
}

var_dump('parent says bye bye');

The parent process collects all the PIDs in the $childPids array and waits for all of them in a simple
foreach . That's it and we have 8 worker processes.

By the way, you don't even need the else statement since the $pid is not 0 for the parent process and
there's an exit at the end of the if statement.


Maybe you're confused because I said that the $pid is always zero for a child process. Yes, it is. But it
doesn't mean that a child process doesn't have a PID. It means that the function pcntl_fork returns 0 for
the children. But they still have PIDs as you can see.

The next step is to actually do something in the child process. Let's start with something simple, but CPU-
bound such as calculating prime numbers. For now, let's count prime numbers in a given range:

$childPids = [];

for ($i = 0; $i < 8; $i++) {
    $pid = pcntl_fork();

    if ($pid === 0) {
        $start = $i * 1_000_000 + 1;

        $end = ($i + 1) * 1_000_000;

        $count = countPrimeNumbers($start, $end);

        var_dump('Count is ' . $count);

        exit;
    } else {
        $childPids[] = $pid;
    }
}

foreach ($childPids as $pid) {
    pcntl_waitpid($pid, $status);
}

Each child process handles 1,000,000 numbers. The first one processes numbers from 1 to 1,000,000, the
second one from 1,000,001 to 2,000,000, and so on up to 8,000,000.

This is the output:

The process took 5.7s to complete.

If we try the same function using only one process this is the result:

So it took 23.9s

Using 8 processes makes the whole calculation 4.1 times faster.
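The countPrimeNumbers() function itself isn't shown in these snippets. Here's a minimal sketch of what it could look like: a naive trial-division check, intentionally unoptimized so the task stays CPU-bound. The exact implementation doesn't matter for the forking logic:

function countPrimeNumbers(int $start, int $end): int
{
    $count = 0;

    for ($number = max(2, $start); $number <= $end; $number++) {
        $isPrime = true;

        for ($divisor = 2; $divisor * $divisor <= $number; $divisor++) {
            if ($number % $divisor === 0) {
                $isPrime = false;

                break;
            }
        }

        if ($isPrime) {
            $count++;
        }
    }

    return $count;
}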

If we take a look at htop we can clearly see that all the cores are busy running the parent-child.php
script:


When we don't use child processes the difference is clear:

The next step is to "return" values from the child processes because right now they just echo out the
results.

Unfortunately, it's not simple because we cannot just use variables such as this:

$sum = 0;

if ($pid !!+ 0) {
$sum += counnPrimeNumbers(1, 1_000_000);
}

echo "this is the parent process";


echo "this won't work: ";

echo $sum;

It's not going to work:


$sum remains zero because each process has its own code and data so there are no "shared" variables
across processes.

What we need is Inter-Process Communication or IPC for short. IPC refers to the mechanisms and
techniques used by processes to communicate and synchronize with each other. Processes can exchange
data, share resources, and coordinate their activities. Exactly what we need.

One technique is using sockets. Sockets are communication endpoints that allow processes to
communicate with each other, either on the same machine or across a network. Right now, they're on the
same machine so we don't need network sockets.

This is how you can create a socket in PHP:

$socket = socket_create(AF_UNIX, SOCK_STREAM, 0);

AF_UNIX means it's a local socket using a local communication protocol family. This is what you need if your
processes live on the same machine.

SOCK_STREAM provides sequenced, reliable, full-duplex, connection-based byte streams.

But in order to communicate in a two-way manner (child -> parent, parent -> child) we need two of these
sockets. They can be created with the socket_create_pair function:

socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $sockets);

[$socketToParent, $socketToChild] = $sockets;


This is an old school function so instead of returning a value we have to pass an array as the last argument.
The function puts two sockets into it.

This is how we can exchange data using sockets:

socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $sockets);

[$socketToParent, $socketToChild] = $sockets;

$pid = pcntl_fork();

if ($pid === 0) {
    socket_write($socketToParent, 'hello', 1024);

    socket_close($socketToParent);

    exit;
}

$childResult = socket_read($socketToChild, 1024);

echo $childResult;

socket_close($socketToChild);

The data you write into $socketToParent in the child process can be read from $socketToChild in the
parent process. We need to close the sockets as if they were files.

Now let's put everything together:

$childPids = [];

$primesCount = 0;

for ($i = 0; $i < 8; $i++) {
    socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $sockets);

    [$socketToParent, $socketToChild] = $sockets;

    $pid = pcntl_fork();

    if ($pid === 0) {
        $start = $i * 1_000_000 + 1;

        $end = ($i + 1) * 1_000_000;

        $count = countPrimeNumbers($start, $end);

        socket_write($socketToParent, $count, 1024);

        socket_close($socketToParent);

        exit;
    } else {
        $childPids[] = $pid;

        $childSockets[$pid] = $socketToChild;
    }
}

foreach ($childSockets as $pid => $socket) {
    $childResult = (int) socket_read($socket, 1024);

    $primesCount += $childResult;

    socket_close($socket);
}

foreach ($childPids as $pid) {
    pcntl_waitpid($pid, $status);
}

var_dump($primesCount);

Now the child processes can communicate and return data to the parent process. The parent process can
manage the flow of the program.


Both the single-process and the multi-process versions give the same result:

As you can see we can gain a lot from using multiple processes. However, there are a lot of technical details
we need to consider. Also, the code doesn't look too good, to be honest, and it's easy to mess it up.


fork
Fortunately, there's a Spatie package (who else) that makes the process seamless. It's called spatie/fork.

This is what the prime number example looks like:

$callbacks = [];

for ($i = 0; $i < 8; $i++) {
    $start = $i * 1_000_000 + 1;

    $end = ($i + 1) * 1_000_000;

    $callbacks[] = fn () => countPrimeNumbers($start, $end);
}

$results = Fork::new()
    ->run(...$callbacks);

var_dump(collect($results)->sum());

Well, that was easy, wasn't it?

Fork gives you a very high-level, easy-to-understand API and takes the low-level stuff away. But under the
hood, it uses pcntl_fork and sockets.

The run function takes a number of callback functions and it returns an array with the results of these
callbacks in order, for example:

$results = Fork::new()
    ->run(
        fn () => 1,
        fn () => 2,
    );

In this case, $results is going to be [1, 2] .

So how can we use this in a real-world project?

Sending multiple HTTP requests in a parallel way:


Fork::new()
    ->run(
        fn () => Http::get('https://foo.com'),
        fn () => Http::get('https://bar.com'),
    );

Splitting large SQL updates into chunks.

Let's say we are working on a financial application. There's a transactions table with hundreds of
thousands or millions of rows. We need to update a large chunk of these rows, let's say 10,000 rows:

$transactionIds = Transaction::getIdsForUpdate();

Transaction::query()
    ->whereIn('id', $transactionIds)
    ->update(['payout_id' => $payout->id]);

This is going to be a huge query with lots of memory consumption and it'll probably run for a long time.

The first type of refactoring we can do is using chunks:

$chunks = $transactionIds->chunk(1000);

foreach ($chunks as $chunk) {
    Transaction::query()
        ->whereIn('id', $chunk)
        ->update(['payout_id' => $payout->id]);
}

Now there are 10 smaller queries that update 1,000 rows each. It's usually a better solution than having one
query that updates 10,000 rows.

Even better we can parallelize these smaller queries:


$chunks = $transactionIds->chunk(1000);

$callbacks = [];

foreach ($chunks as $chunk) {
    $callbacks[] = fn () => Transaction::query()
        ->whereIn('id', $chunk)
        ->update(['payout_id' => $payout->id]);
}

Fork::new()
    ->before(fn () => DB::connection('mysql')->reconnect())
    ->run(...$callbacks);

It creates 10 callback functions for the 10 chunks and then runs them using Fork. If the child processes run
database queries you have to include this line:

->before(fn () => DB::connection('mysql')->reconnect())

The function passed to before will run before every callback (child process). The package requires a
reconnect if we want to use the database in the child processes.

How do the three different solutions compare in performance?

Let's compare the three solutions:

One huge query

Lots of small queries

Lots of small queries run in parallel

The main measurements are:

Execution time

Memory usage

Here are the results in Inspector:


As you can see, the parallel one is the clear winner by execution time. It's 2.2x faster than running one huge
query and 1.4x faster than using query chunks. However, there's no difference in memory usage.

In Inspector we can see the 10 update queries running concurrently:


These results are already great, in my opinion, but let's see what happens if we "scale" our app and try to
update 100,000 transactions instead of 10,000. The chunks are going to be the same size (1,000).

First of all, updating 100,000 rows in one database query is not possible on my system:

local.ERROR: SQLSTATE[HY000]: General error: 1390 Prepared statement contains too many placeholders

Second, the concurrent version wins with an even more impressive result:

Using query chunks took 4.9s. Using child processes took only 2.4s. So now there's a 2x time difference.

In general, the more tasks you have the more you can benefit from concurrent programming.

But as always, there's a trade-off. With child processes, you can process tasks faster but with a higher CPU
load.

This is the CPU usage of chunk queries:


And this is the CPU usage of the concurrent version:

Obviously, the goal is to take the most out of your CPU. However, if there are other important tasks on this
server, then it can cause problems because right now, this job is using ~80% of your server. There's not
much computing power left for other tasks.

Fortunately, there's an "in-between" solution that gives us the best of both worlds. We can limit the number
of concurrent tasks when using fork . Right now, it uses as many CPU cores as possible, which is eight in
my case. But we can limit that to, let's say four:


Fork::new()
    ->concurrent(4)
    ->before(fn () => DB::connection('mysql')->reconnect())
    ->run(...$callbacks);

Now the average CPU usage drops to ~66%

The time difference is almost negligible but of course, it depends on lots of things. You can play around with
these numbers and hopefully, you can find your sweet spot.


Concurrent HTTP requests


We've already seen how to send multiple HTTP requests at the same time with Fork. But did you know you
can do the same with only Guzzle?

use GuzzleHttp\Client;
use GuzzleHttp\Promise\Utils;

$client = new Client(['base_uri' => 'http://example.com/']);

$promises = [
    'image' => $client->getAsync('/image'),
    'png' => $client->getAsync('/image/png'),
    'jpeg' => $client->getAsync('/image/jpeg'),
    'webp' => $client->getAsync('/image/webp'),
];

$responses = Utils::unwrap($promises);

echo $responses['image'];
echo $responses['png'];

It's pretty similar to Fork's syntax but it uses an associative array instead and in the results, we can
reference the keys which is nice. Instead of the get function we need to use getAsync and then
Utils::unwrap to wait for the promises.

It's basically the same idea as Promise::all() in JavaScript:

const [result1, result2] = await Promise.all([
    promise1,
    promise2,
]);

With these techniques, concurrent programming is not hard at all. It can be pretty useful in some situations.
Whenever you cannot use queue jobs maybe you can use Fork or concurrent Guzzle requests instead. But
of course, you can write concurrent logic inside the job too.

For example, let's say you use some 3rd party systems such as MailChimp, a CRM, etc. You want to sync
your users' data with these 3rd parties on a scheduled basis. You have 5,000 users. For each user, you need
to send 3 HTTP requests to 3 different 3rd parties. Here's what you can do:


Dispatch 5,000 jobs

Send the 3 requests concurrently inside each job

You can seriously speed up a workflow such as this one. Just think about it. If each HTTP request takes
500ms to complete, traditionally it would take 5,000x3x0.5s=7,500s or 125 minutes or 2 hours. If you send
the requests concurrently then approximately each job would take only 500ms (instead of 1500ms) so the
whole workflow would take 5,000*0.5s=2,500s or 41 minutes.
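Here's a rough sketch of what each job could look like, reusing the Guzzle promise approach from above. The endpoint URLs and the payload are made up for illustration; the real 3rd-party APIs would obviously look different:

namespace App\Jobs;

use App\Models\User;
use GuzzleHttp\Client;
use GuzzleHttp\Promise\Utils;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class SyncUserWithThirdPartiesJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(private User $user)
    {
    }

    public function handle(): void
    {
        $client = new Client();

        $payload = ['json' => ['email' => $this->user->email]];

        // The three requests run concurrently so the job takes roughly as long
        // as the slowest response instead of the sum of all three
        $promises = [
            'mailchimp' => $client->postAsync('https://mailchimp.example/api/members', $payload),
            'crm' => $client->postAsync('https://crm.example/api/contacts', $payload),
            'analytics' => $client->postAsync('https://analytics.example/api/identify', $payload),
        ];

        Utils::unwrap($promises);
    }
}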


Queues and workers


We talked about async and concurrent workflows quite a bit. Now let's see how you can run and optimize
your workers in production.

supervisor
The next chapter is related to deployments but it's important to understand if we want to optimize worker
processes.

Whenever you're deploying worker processes it's a good idea to use supervisor .

The most important thing is that worker processes need to run all the time even if something goes wrong.
Otherwise, they'd be unreliable. For this reason, we cannot just run php artisan queue:work on a
production server as we do on a local machine. We need a program that supervises the worker process,
restarts them if they fail, and potentially scales the number of processes.

The program we'll use is called supervisor . It's a process manager that runs in the background (daemon)
and manages other processes such as queue:work .

First, let's review the configuration:

[program:worker]
command=php /var/www/html/posts/api/artisan queue:work --tries=3 --verbose --timeout=30 --sleep=3

We can define many "programs" such as queue:work . Each has a block in a file called supervisord.conf .
Every program has a command option which defines the command that needs to be run. In this case, it's the
queue:work but with the full artisan path.

As I said, it can scale up processes:

[program:worker]
command=php /var/www/html/posts/api/artisan queue:work --queue=default,notification --tries=3 --verbose --timeout=30 --sleep=3
numprocs=2

In this example, it'll start two separate worker processes. They both can pick up jobs from the queue
independently from each other. This is similar to when you open two terminal windows and start two
queue:work processes on your local machine.

Supervisor will log the status of the processes. But if we run the same program ( worker ) in multiple
instances it's a good practice to differentiate them with "serial numbers" in their name:


[program:worker]
command=php /var/www/html/posts/api/artisan queue:work --queue=default,notification --tries=3 --verbose --timeout=30 --sleep=3
numprocs=2
process_name=%(program_name)s_%(process_num)02d

%(program_name)s will be replaced with the name of the program ( worker ), and %(process_num)02d will
be replaced with a two-digit number indicating the process number (e.g. 00 , 01 , 02 ). So when we run
multiple processes from the same command we'll have logs like this:

Next, we can configure how supervisor is supposed to start or restart the processes:

[program:worker]
command=php /var/www/html/posts/api/artisan queue:work --queue=default,notification --tries=3 --verbose --timeout=30 --sleep=3
numprocs=2
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true


autostart=true tells supervisor to start the program automatically when it starts up. So when we start
supervisor (for example when deploying a new version) it'll automatically start the workers.

autorestart=true tells supervisor to automatically restart the program if it crashes or exits. Worker
processes usually take care of long-running heavy tasks, often communicating with 3rd party services. It's
not uncommon that they crash for some reason. By setting autorestart=true we can be sure that they are
always running.

[program:worker]
command=php /var/www/html/posts/api/artisan queue:work --queue=default,notification --tries=3 --verbose --timeout=30 --sleep=3
numprocs=2
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true

stopasgroup and killasgroup basically mean: stop/kill all subprocesses as well when the parent process
(queue:work) stops/dies.

As I said, errors happen fairly often in queue workers, so it's a good practice to think about them:

[program:worker]
command=php /var/www/html/posts/api/artisan queue:work --queue=default,notification --tries=3 --verbose --timeout=30 --sleep=3
numprocs=2
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
redirect_stderr=true
stdout_logfile=/var/log/supervisor/worker.log

redirect_stderr=true tells supervisor to redirect standard error output to the same place as standard
output. We treat errors and info messages the same way.

stdout_logfile=/var/log/supervisor/worker.log tells supervisor where to write standard output for
each process. Since we redirect stderr to stdout we'll have one log file for every message:


That was all the worker-specific configuration we need but supervisor itself also needs some config in the
same supervisord.conf file:

[supervisord]
logfile=/var/log/supervisor/supervisord.log
pidfile=/run/supervisord.pid

logfile=/var/log/supervisor/supervisord.log tells supervisor where to write its own log messages.
This log file contains information about the different programs and processes it manages. You can see the
screenshot above.

pidfile=/run/supervisord.pid tells supervisor where to write its own process ID (PID) file. These files
are usually located in the run directory:


By the way, PID files on Linux are similar to a MySQL or Redis database for us, web devs.

They are files that contain the process ID (PID) of a running program. They are usually created by daemons
or other long-running processes to help manage the process.

When a daemon or other program starts up, it will create a PID file to store its own PID. This allows other
programs (such as monitoring tools or control scripts) to easily find and manage the daemon. For example,
a control script might read the PID file to determine if the daemon is running, and then send a signal to that
PID to stop or restart the daemon.

And finally, we have one more config:

[supervisorctl]
serverurl=unix:///run/supervisor.sock

This section sets some options for the supervisorctl command-line tool. supervisorctl is used to
control Supervisor. With this tool, we can list the status of processes, reload the config, or restart processes
easily. For example:

supervisorctl status

Returns a list such as this:


And finally, the serverurl=unix:///run/supervisor.sock config tells supervisorctl to connect to
supervisor using a Unix socket. We've already used a Unix socket when we connected nginx to php-fpm.
This is the same here. supervisorctl is "just" a command-line tool that provides better interaction with
supervisor and its processes. It needs a way to send requests to supervisor .


Multiple queues and priorities


First of all, multiple queues do not mean multiple Redis or MySQL instances.

connection: this is what Redis or MySQL is in Laravel-land. Your app connects to Redis so it's a
connection.

queue: inside Redis, we can have multiple queues with different names.

For example, if you're building an e-commerce site, the app connects to one Redis instance but you can
have at least three queues:

payments

notifications

default

Since payments are the most important jobs it's probably a good idea to separate them and handle them
with priority. The same can be true for notifications as well (obviously not as important as payments but
probably more important than a lot of other things). And for every other task, you have a queue called
default. These queues live inside the same Redis instance (the same connection) but under different keys
(please don't quote me on that).
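To make the distinction concrete, this is roughly what the relevant part of config/queue.php looks like (trimmed, and your exact options may differ): a single redis connection, where the queue key is only the default queue name. payments and notifications are just other queue names on that same connection:

// config/queue.php (excerpt)
return [
    'default' => env('QUEUE_CONNECTION', 'redis'),

    'connections' => [
        'redis' => [
            'driver' => 'redis',
            'connection' => 'default',
            // Only the *default* queue name - jobs can still be dispatched
            // onto 'payments' or 'notifications' on this same connection
            'queue' => env('REDIS_QUEUE', 'default'),
            'retry_after' => 90,
        ],
    ],
];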

So let's say we have payments, notifications, and the default queue. Now, how many workers do we need?
What queues should they be processing? How do we prioritize them?

A good idea can be to have dedicated workers for each queue, right? Something like that:

[program:payments-worker]
command=php artisan queue:work --queue=payments --tries=3 --verbose --timeout=30 --sleep=3
numprocs=4

[program:notifications-worker]
command=php artisan queue:work --queue=notifications --tries=3 --verbose --timeout=30 --sleep=3
numprocs=2

[program:default-worker]
command=php artisan queue:work --queue=default --tries=3 --verbose --timeout=30 --sleep=3
numprocs=2

And then when you dispatch jobs you can do this:


ProcessPaymentJob::dispatch()->onQueue('payments');

$user->notify(
    (new OrderCompletedNotification($order))->onQueue('notifications')
);

// default queue
CustomersExport::dispatch();

Alternatively, you can define a $queue in the job itself:

class ProcessPayment implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public $queue = 'payments';
}

By defining the queue in the job you can be 100% sure that it'll always run in the given queue so it's a safer
option in my opinion.

And now, the following happens:

So the two queues (in fact three, because there's also the default) are being processed at the same time by
dedicated workers. Which is great, but what if something like this happens?


There are so many jobs in the notifications queue but none in the payments queue. If that happens we just
waste all the payments worker processes since they have nothing to do. But this command doesn't let them
process anything else:

php artisan queue:work --queue=payments

This means they can only touch the payments queue and nothing else.

Because of that problem, I don't recommend you have dedicated workers for only one queue. Instead,
prioritize them!

We can do this:

php artisan queue:work --queue=payments,notifications

The command means that if there are jobs in the payments queue, these workers can only process them.
However, if the payments queue is empty, they can pick up jobs from the notifications queue as well. And
we can do the same for the notifications workers:

php artisan queue:work --queue=notifications,payments

Now if the payments queue is empty this will happen:


Now payments workers also pick up jobs from the notifications queue so we don't waste precious worker processes. But of course, if there are payment jobs they prioritize them over notifications:

In this example, only one payment job came in so one worker is enough to process it. All of this is managed
by Laravel!

And of course, we can prioritize three queues as well:

# payment workers
php artisan queue:work --queue=payments,notifications,default

# notification workers
php artisan queue:work --queue=notifications,payments,default

# other workers
php artisan queue:work --queue=default,payments,notifications


With this setup, we cover almost any scenario:

If there are a lot of payment jobs, potentially all three worker groups (more than three processes, of course) will process them.

If there isn't any important job (payment or notification) there are a lot of workers available for default
jobs.

And we also cover the notifications queue.
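To wire this up in supervisor, the dedicated-worker config from before only needs the queue lists changed. This is just a sketch: I'm only showing the command and numprocs directives, the exact process counts are the same example numbers as before.

[program:payments-worker]
command=php artisan queue:work --queue=payments,notifications,default --tries=3 --verbose --timeout=30 --sleep=3
numprocs=4

[program:notifications-worker]
command=php artisan queue:work --queue=notifications,payments,default --tries=3 --verbose --timeout=30 --sleep=3
numprocs=2

[program:default-worker]
command=php artisan queue:work --queue=default,payments,notifications --tries=3 --verbose --timeout=30 --sleep=3
numprocs=2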


Optimizing worker processes


Number of worker processes

That's a tricky question, but a good rule of thumb: run one process for each CPU core.

But of course, it depends on several factors, such as the amount of traffic your application receives, the
amount of work each job requires, and the resources available on your server.

As a general rule of thumb, you should start with one worker process per CPU core on your server. For
example, if your server has 4 CPU cores, you might start with 4 worker processes and monitor the
performance of your application. If you find that the worker processes are frequently idle or that there are
jobs waiting in the queue for too long, you might consider adding more worker processes.

It's also worth noting that running too many worker processes can actually decrease performance, as each
process requires its own memory and CPU resources. You should monitor the resource usage of your
worker processes and adjust the number as needed to maintain optimal performance.

However, there are situations when you can run more processes than the number of CPUs. It's a rare case,
but if your jobs don't do much work on your machine you can run more processes. For example, I have a
project where every job sends API requests and then returns the results. These kinds of jobs are not
resource-heavy at all since they do not run much work on the actual CPU or disk. But usually, jobs are
resource-heavy processes so don't overdo it.
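If you're not sure how many cores a server has, a quick way to check on Linux (these standard commands should be available on most distros):

# number of CPU cores available to the machine
nproc

# alternative: count the processor entries in /proc/cpuinfo
grep -c ^processor /proc/cpuinfo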

Memory and CPU considerations

Queued jobs can cause some memory leaks. Unfortunately, I don't know the exact reasons but not everything is detected by PHP's garbage collector. As time goes on and your worker processes more jobs, it uses more and more memory.

Fortunately, the solution is simple:

php artisan queue:work --max-jobs=1000 --max-time=3600

--max-jobs tells Laravel that this worker can only process 1000 jobs. After it reaches the limit it'll be shut
down. Then memory will be freed up and supervisor restarts the worker.

--max-time tells Laravel that this worker can only live for an hour. After it reaches the limit it'll be shut
down. Then memory will be freed up and supervisor restarts the worker.

These two options can save us some serious trouble.

Oftentimes we run workers and nginx on the same server. This means that they use the same CPU and memory. Now, imagine what happens if there are 5,000 users in your application and you need to send a notification to everyone. 5,000 jobs will be pushed onto the queue and workers start processing them like there's no tomorrow. Sending notifications isn't too resource-heavy, but if you're using database notifications as well, it means at least 5,000 queries. Let's say the notification contains a link to your site and users start to come to it. nginx is left with few resources since your workers eat up your server.

One simple solution is to give a higher nice value to your workers:


nice -n 10 php artisan queue:work

These values go from 0 to 19 and a higher value means a lower CPU priority. This means that your server will prioritize nginx or php-fpm processes over your worker processes if there's a high load.

Another option is to use the rest flag:

php artisan queue:work --rest=1

This means the worker will wait for 1 second after it finishes with a job. So your CPU has an opportunity to serve nginx or fpm processes.

So this is what the final command looks like:

nice -n 10 php /var/www/html/posts/api/artisan queue:work --queue=notifications,default --tries=3 --verbose --timeout=30 --sleep=3 --rest=1 --max-jobs=1000 --max-time=3600

I never knew about nice or rest before reading Mohamed Said's amazing book Laravel Queues in Action.


Chunking large datasets


When it comes to working with larger datasets, one of the best techniques you can apply to any problem is chunking. Divide the dataset into smaller chunks and process them one by one. It comes in many different forms. In this chapter, we're going to review a few of them but the basic idea is always the same: divide your data into smaller chunks and process them.

Exports
Exporting to CSV or XLS and importing from them is a very common feature in modern applications.

I'm going to use a finance application as an example. Something like Paddle or Gumroad. They are merchants of record. This is what's happening:

A seller (or content creator) uploads a product to Paddle

They integrate Paddle into their landing page

Buyers buy the product from the landing page using a Paddle checkout form

Paddle pays the seller every month

I personally use Paddle to sell my books and SaaS and it's a great service. The main benefit is that you don't
have to deal with hundreds or thousands of invoices and VAT ramifications. Paddle handles it for you. They
send an invoice to the buyer and apply the right amount of VAT based on the buyer's location. They also
handle VAT ramifications. You, as the seller, don't have to deal with any of that stuff. They just send you the
money once every month and you have only one invoice. It also provides nice dashboards and reports.

Every month they send payouts to their users based on the transactions. They also send a CSV that contains
all the transactions in the given month.

This is the problem we're going to imitate in this chapter. Exporting tens of thousands of transactions in an
efficient way.

This is what the transactions table looks like:


id | product_id | quantity | revenue | balance_earnings | payout_id | stripe_id | user_id | created_at
1  | 1          | 1        | 3900    | 3120             | NULL      | acac83e2  | 1       | 2024-04-22 13:59:07
2  | 2          | 1        | 3900    | 3120             | NULL      | ...       | 1       | 2024-04-17 17:43:12

These are two transactions for user #1. I shortened some UUIDs so the table fits the page better. Most columns are pretty easy to understand. Money values are stored in cents so 3900 means $39 . There are other columns as well, but they are not that important.

When it is payout time, a job queries all transactions in a given month for a user, creates a Payout object,
and then sets the payout_id in this table. This way we know that the given transaction has been paid out.
The same job exports the transactions for the user and sends them via e-mail.

laravel-excel is one of the most popular packages when it comes to imports/exports so we're going to
use it in the first example.

This is what a typical export looks like:

namespace App\Exports;

class TransactionsSlowExport implements FromCollection, WithMapping, WithHeadings
{
    use Exportable;

    public function __construct(
        private User $user,
        private DateInterval $interval,
    ) {}

    public function collection()
    {
        return Transaction::query()
            ->where('user_id', $this->user->id)
            ->whereBetween('created_at', [
                $this->interval->startDate,
                $this->interval->endDate,
            ])
            ->get();
    }

    public function map($row): array
    {
        return [
            $row->uuid,
            Arr::get($row->product_data, 'title'),
            $row->quantity,
            MoneyForHuman::from($row->revenue)->value,
            MoneyForHuman::from($row->fee_amount)->value,
            MoneyForHuman::from($row->tax_amount)->value,
            MoneyForHuman::from($row->balance_earnings)->value,
            $row->customer_email,
            $row->created_at,
        ];
    }

    public function headings(): array
    {
        return [
            '#',
            'Product',
            'Quantity',
            'Total',
            'Fee',
            'Tax',
            'Balance earnings',
            'Customer e-mail',
            'Date',
        ];
    }
}

I've seen dozens of exports like this one over the years. It creates a CSV from a collection. In the
collection method, you can define your collection which is 99% of the time the result of a query. In this
case, the collection contains Transaction models. Nice and simple.

However, an export such as this one has two potential problems:

The collection method runs a single query and loads each and every transaction into memory. The
moment you exceed x number of models your process will die because of memory limitations. x of
course varies highly.


If your collection is not that big and the export made it through the query, the map function will run for each and every transaction. If you execute even one query here, it'll run n times where n is the number of rows in your CSV. This is the breeding ground for N+1 problems.

Be aware of these things because it's pretty easy to kill your server with a poor export.

The above job failed with only 2,000 transactions (1,958 to be precise). The result is an "Allowed memory size exhausted" error.

As you can see, it is executed in a worker. This is made possible by two things:

The export uses the Exportable trait from the package, which has a queue function

The method that runs the export uses this queue method:

(new TransactionsExport(
    $user,
    $interval,
))
    ->queue($report->relativePath())
    ->chain([
        new NotifyUserAboutExportJob($user, $report),
    ]);


This is how you can make an export or import queueable.

Fortunately, there's a much better export type than FromCollection: it's called FromQuery. This export does not define a Collection but a DB query instead, which will be executed in chunks by laravel-excel .

This is how we can rewrite the export class:

namespace App\Exports;

class TransactionsExport implements FromQuery, WithHeadings, WithCustomChunkSize, WithMapping
{
    use Exportable;

    public function __construct(
        private User $user,
        private DateInterval $interval,
    ) {}

    public function query()
    {
        return Transaction::query()
            ->select([
                'uuid',
                'product_data',
                'quantity',
                'revenue',
                'fee_amount',
                'tax_amount',
                'balance_earnings',
                'customer_email',
                'created_at',
            ])
            ->where('user_id', $this->user->id)
            ->whereBetween('created_at', [
                $this->interval->startDate->date,
                $this->interval->endDate->date,
            ])
            ->orderBy('created_at');
    }

    public function chunkSize(): int
    {
        return 250;
    }

    public function headings(): array
    {
        // Same as before
    }

    public function map($row): array
    {
        // Same as before
    }
}

Instead of returning a Collection the query method returns a query builder. In addition, you can also use
the chunkSize method. It works hand in hand with Exportable and FromQuery :

Queued exports (using the Exportable trait and the queue method) are processed in chunks

If the export implements FromQuery the number of jobs is calculated by query()->count() / chunkSize()

So in the chunkSize we can control how many jobs we want. For example, if we have 5,000 transactions for
a given user and chunkSize() returns 250 it means that 20 jobs will be dispatched each processing 250
transactions. Unfortunately, I cannot give you exact numbers. It all depends on your specific use case.
However, it's a nice way to fine-tune your export.

Using the techniques above, exporting 10k transactions (9,847 to be precise) is a walk in the park. The jobs run smoothly: there are 40 jobs, each processing 250 transactions.


The last jobs laravel-excel runs are CloseSheet and StoreQueuedExport .


Imports
This is what a basic laravel-excel import looks like:

namespace App\Imports;

class UsersImport implements ToModel
{
    public function model(array $row)
    {
        return new User([
            'name' => $row[0],
        ]);
    }
}

It reads the CSV and calls the model method for each row then it calls save on the model you returned. It
means that it executes one query for each row. If you're importing thousands or tens of thousands of
users you'll spam your database and there's a good chance it will be unavailable.

Fortunately, there are two tricks we can apply:

Batch inserts

Chunk reading

Batch inserts

Batch insert means that laravel-excel won't execute one query per row, but instead, it batches the rows
together:

namespace App\Imports;

class UsersImport implements ToModel, WithBatchInserts
{
    public function model(array $row)
    {
        return new User([
            'name' => $row[0],
        ]);
    }

    public function batchSize(): int
    {
        return 500;
    }
}

This will execute x/500 inserts where x is the number of users.

Chunk reading

Chunk reading means that instead of reading the entire CSV into memory at once laravel-excel chunks it
into smaller pieces:

namespace App\Imports;

class UsersImport implements ToModel, WithChunkReading
{
    public function model(array $row)
    {
        return new User([
            'name' => $row[0],
        ]);
    }

    public function chunkSize(): int
    {
        return 1000;
    }
}

This import will load 1,000 users into memory at once.

Of course, these two features can be used together to achieve the best performance:


namespace App\Imports;

class UsersImport implements ToModel, WithBatchInserts, WithChunkReading
{
    public function model(array $row)
    {
        return new User([
            'name' => $row[0],
        ]);
    }

    public function batchSize(): int
    {
        return 500;
    }

    public function chunkSize(): int
    {
        return 1000;
    }
}
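For completeness, this is roughly how such an import is triggered. This is just a sketch: the users.csv path is an assumption, and the Excel facade comes from the laravel-excel package.

use App\Imports\UsersImport;
use Maatwebsite\Excel\Facades\Excel;

// reads the CSV in chunks of 1,000 rows and inserts users in batches of 500
Excel::import(new UsersImport(), 'users.csv');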


Generators & LazyCollections


What if you can't or don't want to use laravel-excel to import a large CSV? We can easily read a CSV in
PHP:

public function readCsv(string $path): Collection
{
    $stream = fopen($path, 'r');

    if ($stream === false) {
        throw new Exception('Unable to open csv file at ' . $path);
    }

    $rows = [];
    $rowIdx = -1;
    $columns = [];

    while (($data = fgetcsv($stream)) !== false) {
        $rowIdx++;

        if ($rowIdx === 0) {
            $columns = $data;

            continue;
        }

        $row = [];

        foreach ($data as $idx => $value) {
            $row[$columns[$idx]] = $value;
        }

        $rows[] = $row;
    }

    fclose($stream);

    return collect($rows);
}

fgetcsv by default reads the file line by line so it won't load too much data into memory, which is good.

This function assumes that the first line of the CSV contains the headers. This block saves them into the
$columns variable:

if ($rowIdx === 0) {
    $columns = $data;

    continue;
}

$columns is an array such as this:

[
    0 => 'username',
    1 => 'email',
    2 => 'name',
]

fgetcsv returns an indexed array for each line, such as this:

[
    0 => 'johndoe',
    1 => '[email protected]',
    2 => 'John Doe',
]

The following block transforms this array into an associative one:


$row = [];

foreach ($data as $idx => $value) {
    $row[$columns[$idx]] = $value;
}

$rows[] = $row;

At the end, the function closes the file, and returns a collection such as this:

[
    [
        'username' => 'johndoe',
        'email' => '[email protected]',
        'name' => 'John Doe',
    ],
    [
        'username' => 'janedoe',
        'email' => '[email protected]',
        'name' => 'Jane Doe',
    ],
]

It's quite simple, but it has one problem: it holds every row in memory. Just like with laravel-excel it will
exceed the memory limit after a certain size. There are two ways to avoid this problem:

PHP generators

Laravel's LazyCollection

Since LazyCollections are built on top of generators, let's first understand them.


PHP generators
With a little bit of simplification, a generator function is a function that has multiple return statements. But
instead of return we can use the yield keyword. Here's an example:

public function getProducts(): Generator
{
    foreach (range(1, 10_000) as $i) {
        yield [
            'id' => $i,
            'name' => "Product #{$i}",
            'price' => rand(9, 99),
        ];
    }
}

foreach (getProducts() as $product) {
    echo $product['id'];
}

Any function that uses the yield keyword returns a Generator object which implements the Iterator interface so we can use it in a foreach .

Each iteration asks the generator for exactly one product. So it won't load 10,000 products into memory at once, but only one.

The function above behaves the same as this one:

public function getProducts(): array
{
    $products = [];

    foreach (range(1, 10000) as $i) {
        $products[] = [
            'id' => $i,
            'name' => "Product #{$i}",
            'price' => rand(9, 99),
        ];
    }

    return $products;
}

But this function will load 10,000 products into memory each time you call it.

Here's the memory usage of the standard (non-generator) function:

# of items Peak memory usage

10,000 5.45MB

100,000 49MB

300,000 PHP Fatal error: Allowed memory size of 134217728 bytes exhausted

It reached the 128MB memory limit with 300,000 items. And these items are lightweight arrays with only
scalar attributes! Imagine Eloquent models with 4-5 different relationships, attribute accessors, etc.

Now let's see the memory usage of the generator-based function:

# of items Peak memory usage

10,000 908KB

100,000 4.5MB

1,000,000 33MB

2,000,000 65MB

3,000,000 PHP Fatal error: Allowed memory size of 134217728 bytes exhausted

It can handle 2,000,000 items using only 65MB of RAM. It's 20 times more than what the standard function
could handle. However, the memory usage is only 32% higher (65M vs 49M).
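If you want to reproduce measurements like these yourself, here's a minimal sketch of a standalone script (not part of the sample app); the exact numbers will of course differ on your machine:

function getProducts(): Generator
{
    foreach (range(1, 1_000_000) as $i) {
        yield [
            'id' => $i,
            'name' => "Product #{$i}",
            'price' => rand(9, 99),
        ];
    }
}

foreach (getProducts() as $product) {
    // consume the items one by one
}

// peak memory of the whole script, in MB
echo round(memory_get_peak_usage(true) / 1024 / 1024, 2) . "MB\n";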


Imports with generators


Now let's add a generator to the readCsv function:

while (($data = fgetcsv($stream, 1000, ',')) !== false) {
    $rowIdx++;

    if ($rowIdx === 0) {
        $columns = $data;

        continue;
    }

    $row = [];

    foreach ($data as $idx => $value) {
        $row[$columns[$idx]] = $value;
    }

    yield $row;
}

The whole function is identical except that it's not accumulating the data in a $rows variable; instead, it yields every line as it reads it.

In another function we can use it as if it were a standard function:

$transactions = $this->readCsv();

foreach ($transactions as $transaction) {
    // ...
}

This is the equivalent of chunk reading in laravel-excel . Now let's implement batch inserts as well.

This would be the traditional one-insert-per-line solution:


$transactions = $this->readCsv();

foreach ($transactions as $transaction) {
    Transaction::create($transaction);
}

It runs one DB query for each CSV line. It can be dangerous if the CSV contains 75,000 lines, for example.

So instead, we can batch these into bigger chunks:

$transactions = $this->readCsv();
$transactionBatch = [];

foreach ($transactions as $i => $transaction) {
    $transactionBatch[] = $transaction;

    if ($i > 0 && $i % 500 === 0) {
        Transaction::insert($transactionBatch);

        $transactionBatch = [];
    }
}

if (!empty($transactionBatch)) {
    Transaction::insert($transactionBatch);
}

It accumulates transactions until it hits an index that can be divided by 500 then it inserts 500 transactions
at once. If there were 1,741 transactions, for example, the insert after the loop inserts the remaining 241.

With generators and a little trick, we achieved the same two things as with laravel-excel :

Loading only one line into memory at once

Chunking the database writes


Imports with LazyCollections


LazyCollection is a combination of collections and generators. We use it like this:

$collection = LazyCollection::make(function () {
    $handle = fopen('log.txt', 'r');

    while (($line = fgets($handle)) !== false) {
        yield $line;
    }
});

It works the same way as Generators but the make function returns a LazyCollection instance that has
lots of useful Collection methods such as map or each .

Our example can be rewritten as this:

public function readCsv(): LazyCollection
{
    return LazyCollection::make(function () {
        $stream = fopen(storage_path('app/transactions.csv'), 'r');

        if ($stream === false) {
            throw new Exception(
                'Unable to open csv file at ' . storage_path('app/transactions.csv')
            );
        }

        $rowIdx = -1;
        $columns = [];

        while (($data = fgetcsv($stream, 1000, ',')) !== false) {
            $rowIdx++;

            if ($rowIdx === 0) {
                $columns = $data;

                continue;
            }

            $row = [];

            foreach ($data as $idx => $value) {
                $row[$columns[$idx]] = $value;
            }

            yield $row;
        }
    });
}

The function that uses the readCsv method now looks like this:

$this->readCsv()
    ->chunk(500)
    ->each(function (LazyCollection $transactions) {
        Transaction::insert($transactions->toArray());
    });

We can leverage the built-in chunk method that chunks the result by 500.

Once again, we achieved two things:

Loading only one line into memory at once

Chunking the database writes


Reading files
Similarly to CSVs, reading a simple text file by chunks is pretty straightforward:

public function readFileByLines(string $path): Generator
{
    $stream = fopen($path, 'r');

    while ($chunk = fread($stream, 1024)) {
        yield $chunk;
    }

    fclose($stream);
}

Even when reading a 60MB file the peak memory usage is 20MB.

We can use this function as if it returned an array:

$contents = $this->readFileByLines("./storage/app/test.txt");

foreach ($contents as $chunk) {
    var_dump($chunk);
}


Deleting records
Deleting a large number of records can be tricky because you need to run a huge query that keeps MySQL
busy for seconds or even minutes.

For example, I have a table from which I'd like to delete 4.2m records. Just running a select count(*) on them took 1.47s (there's no index on the created_at column).

The Laravel command that removes page_views records looks like this:

public function handle()
{
    PageView::where('created_at', '<=', now()->subWeek())->delete();
}

Deleting 4.2m records took almost 1 minute.

For 58s the database was really busy executing this huge delete operation. While it was deleting records I ran the same count(*) query a few times to check the speed of MySQL. At some point, it took 4.4s to run. Normally it took 1.47s. So a huge delete query such as this one will make your database much slower. Or it might even bring it down completely.

A pretty simple and neat trick we can apply here is to chunk the query and add a sleep in the function. I
first read about this trick from Matt Kingshott on Twitter.

The solution looks like this:


public function handle()
{
    $query = PageView::where('created_at', '<=', now()->subWeek());

    while ($query->exists()) {
        $query->limit(5000)->delete();

        sleep(1);
    }
}

This is how it works:

The $query doesn't specify if it's a select or a delete . It's just a query builder object with a where
expression.

The while loop invokes exists which will run a select count(*) query to determine if there are
any rows between the given time range.

If records can be found we delete 5,000 and then sleep for 1 second.

Of course, 1 second is just an example, maybe in your specific use case you need to use more than that.

This way, the process will take a much longer time to complete, but you won't overload your database and it
remains fast during the execution.

The point is whenever you need to work with a large dataset, it's probably a good idea to apply "divide and
conquer," or in other words chunk your data into smaller pieces and process them individually. In the
"Async workflows" chapter, you can see more examples of this idea.


Miscellaneous
fpm processes
php-fpm comes with a number of configurations that can affect the performance of our servers. These are
the most important ones:

pm.max_children : This directive sets the maximum number of fpm child processes that can be
started. This is similar to worker_processes in nginx.

pm.start_servers : This directive sets the number of fpm child processes that should be started when
the fpm service is first started.

pm.min_spare_servers : This directive sets the minimum number of idle fpm child processes that
should be kept running to handle incoming requests.

pm.max_spare_servers : This is the maximum number of idle fpm child processes.

pm.max_requests : This directive sets the maximum number of requests that an fpm child process can
handle before it is terminated and replaced with a new child process. This is similar to the --max-jobs
option of the queue:work command.

So we can set max_children to the number of CPUs, right? Actually, nope.

The number of php-fpm processes is often calculated based on memory rather than CPU because PHP
processes are typically memory-bound rather than CPU-bound.

When a PHP script is executed, it loads into memory and requires a certain amount of memory to run. The
more PHP processes that are running simultaneously, the more memory will be consumed by the server. If
too many PHP processes are started, the server may run out of memory and begin to swap, which can lead
to performance issues.

TL;DR: if you don't have some obvious performance issue in your code php usually consumes more
memory than CPU.

So we need a few pieces of information to figure out the correct number for the max_children config:

How much memory does your server have?

How much memory does a php-fpm process consume on average?

How much memory does your server need just to stay alive?

Here's a command that will show you the memory used by each fpm process:

ps -ylC php-fpm8.1 --sort:rss

ps is a command used to display information about running processes.

-y is used together with -l ; it hides the flags column and shows the RSS (memory) column instead of the address.

-l instructs ps to display additional information about the process, including the process's state, the
amount of CPU time it has used, and the command that started the process.

-C php-fpm8.1 tells ps to only display information about processes with the name php-fpm8.1 .


--sort:rss : will sort the output based on the amount of resident set size (RSS) used by each process.

What the hell is the resident set size? It's a memory utilization metric that refers to the amount of physical
memory currently being used by a process. It includes the amount of memory that is allocated to the
process and cannot be shared with other processes. This includes the process's executable code, data, and
stack space, as well as any memory-mapped files or shared libraries that the process is using.

It's called "resident" for a reason. It shows the amount of memory that cannot be used by other processes.
For example, when you run memory_get_peak_usage() in PHP it only returns the memory used by the PHP
script. On the other hand, RSS measures the total memory usage of the entire process.

The command will spam your terminal with an output such as this:

The RSS column shows the memory usage. From 25MB to 43MB in this case. The first line (which has significantly lower memory usage) is usually the master process. We can take that out of the equation and say the average memory used by a php-fpm worker process is 43MB.

However, here are some numbers from a production (older) app:

Yes, these are 130MB+ numbers.
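If you don't want to eyeball the RSS column, a rough way to calculate the average is to pipe the output into awk. This is just a sketch: it assumes RSS is the 8th column of the ps -yl output (which it is on most Linux systems) and it includes the master process in the average:

# average RSS of all php-fpm8.1 processes, printed in MB
ps -ylC php-fpm8.1 | awk 'NR > 1 { sum += $8; n++ } END { if (n) printf "%.1f MB\n", sum / n / 1024 }'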


The next question is how much memory does your server need just to stay alive? This can be determined
using htop :

As you can see from the load average, right now nothing is happening on this server but it uses ~700MB of
RAM. This memory is used by Linux, PHP, MySQL, Redis, and all the system components installed on the
machine.

So the answers are:

This server has 2GB of RAM

It needs 700MB to survive

On average an fpm process uses 43MB of RAM

This means there is 1.3GB of RAM left to use. So we can spin up roughly 1300/43 ≈ 30 fpm processes.

It's a good practice to decrease the available RAM by at least 10% as a kind of "safety margin". So let's calculate with 1.17GB of RAM: 1170/43 ≈ 27.

So on this particular server, I can probably run 25-30 fpm processes.

Here's how we can determine the other values:

Config               | General              | This example
pm.max_children      | As shown above       | 28
pm.start_servers     | ~25% of max_children | 7
pm.min_spare_servers | ~25% of max_children | 7
pm.max_spare_servers | ~75% of max_children | 21

To be completely honest, I'm not sure how these values are calculated but they are the "standard" settings. You can search these configs on the web and you'll probably run into an article suggesting similar numbers. By the way, there's also a calculator here.

To configure these values you need to edit the fpm config file which in my case is located in /etc/php/8.1/fpm/pool.d/www.conf :

pm.max_children = 28
pm.start_servers = 7
pm.min_spare_servers = 7
pm.max_spare_servers = 21


After that, you need to restart fpm :

systemctl restart php8.1-fpm.service

Changing the number of child processes requires a full restart since fpm needs to kill and spawn processes.


nginx cache
There are different types of caching mechanisms in nginx. We're gonna discover three of them:

Static content

FastCGI

Proxy

Caching static content


It's probably the most common and easiest one. The idea is that nginx will instruct the browser to store
static content such as images or CSS files so that when an HTML page tries to load them it won't make a
request to the server.

Caching static content with nginx can significantly improve the performance of a web application by
reducing the number of requests to the server and decreasing the load time of pages.

nginx provides several ways to cache static content such as JavaScript, CSS, and images. One way is to use
the expires directive to set a time interval for the cached content to be considered fresh.

This is a basic config:

location ~* \.(css|js|png|jpg|gif|ico)$ {
access_log off;
add_header Cache-Control public;
add_header Vary Accept-Encoding;
expires 1d;
}
~* means a case-insensitive regular expression match, so this location matches URLs such as https://example.com/style.css

In most cases, it's a good idea to turn off the access_log when requesting images, CSS, and js files. It
spams the hell out of your access log file but doesn't really help you.

The other directives are:

add_header Cache-Control public; : this adds a response header to enable caching of the static files
by public caches such as browsers, proxies, and CDNs. Basically, this instructs the browser to store the
files.

add_header Vary Accept-Encoding; : this adds a response header to indicate that the content may
vary based on the encoding of the request.

expires 1d; : this sets the expiration time for the cached content to 1 day from the time of the request. There's no "perfect" time here. It depends on your deployment cycle, the usage, and so on. I usually use a shorter time since it doesn't cause too many errors. For example, if you cache JS files for 7 days because you deploy on a weekly basis, it means you cannot release a bugfix confidently, because browsers might cache the old, buggy version. Of course, you can define a dedicated location directive for JS, CSS files and another one for images. Something like this:

location ~* \.(css|js)$ {
access_log off;
add_header Cache-Control public;
add_header Vary Accept-Encoding;
expires 1d;
}

location ~* \.(png|jpg|gif|ico)$ {
access_log off;
add_header Cache-Control public;
add_header Vary Accept-Encoding;
expires 7d;
}

Assuming that images don't change that often.

As you can see, it was pretty easy. Caching static content with nginx is an effective way to improve the
performance of your app, reduce server load, and enhance the user experience.


Caching fastcgi responses


Caching fastcgi responses with nginx is a technique used to improve the performance of dynamic web
applications that use fastcgi to communicate with backend servers.

The fastcgi_cache directive can be used to store the responses from the fastcgi server on disk and serve
them directly to clients without having to go through the backend server every time. So this is what
happens:

The browser sends a request to the API

nginx forwards it to fastcgi which runs our application

nginx saves the response to the disk

It returns the response to the client

Next time when a request comes into the same URL it won't forward the request to fastcgi. Instead, it
loads the content from the disk and returns it immediately to the client.

Caching fastcgi responses can drastically reduce the load on backend servers, improve the response time of
web applications, and enhance the user experience. It is particularly useful for websites that have high
traffic and serve dynamic content that changes infrequently.

In the company I'm working for, we had a recurring performance problem. The application we're building is
a platform for companies to handle their internal communication and other PR or HR-related workflows.
One of the most important features of the app is posts and events. Admins can create a post and publish
them. Employees get a notification and they can read the post.

Let's say a company has 10,000 employees. They publish an important post that interests people. All 10,000 employees get the notification in 60 seconds or so. And they all hit the page within a few minutes. That's a big spike compared to the usual traffic. The post details page (where employees go from the mail or push notification) is, let's say, not that optimal. It's legacy code and has many performance problems such as N+1 queries. The page triggers ~80 SQL queries. 10,000 x 80 = 800,000 SQL queries. Eight hundred thousand SQL queries in 5-10 minutes or so. That's bad.

There are at least two things we can do in such situations:

Optimize the code and remove N+1 queries and other performance issues. This is outside of the scope
but fortunately, there's Laracheck which can detect N+1 and other performance problems in your
code! Now, that was a seamless plug, wasn't it?

Cache the contents of the API request in nginx.

In this case, caching is a very good idea, in my opinion:

The API response doesn't change frequently. Only when admins update or delete the post.

The response is independent of the current user. Every user sees the same title, content, etc so there's
no personalization on the page. This is required because nginx doesn't know anything about users and
their settings/preferences.

Since we're trying to solve a traffic spike problem, it's a very good thing if we could handle it on the
nginx-level. This means users won't even hit the API and Laravel. Even if you cache the result of a
database query with Laravel Cache 10 000 requests will still come into your app.


We can cache the posts for a very short time. For example, 1 minute. When the spike happens this 1
minute means thousands of users. But, using a short TTL means we cannot make big mistakes. Cache
invalidation is hard. Harder than we think so it's always a safe bet to use shorter TTLs. In this case, it
perfectly fits the use case.

I'll solve the same situation in the sample app. There's an /api/posts/{post} endpoint that we're gonna
cache.

http {
    fastcgi_cache_path /tmp/nginx_cache levels=1:2 keys_zone=content_cache:100m inactive=10m;

    add_header X-Cache $upstream_cache_status;
}

First, we need to tell nginx where to store the cache on the disk. This is done by using the
fastcgi_cache_path directive. It has a few configurations:

All cached content will be stored in the /tmp/nginx_cache directory.

levels=1:2 tells nginx to create 2 levels of subdirectories inside this folder. The folder structure will
be something like that:

4e/
    b45cffe084dd3d20d928bee85e7b0f4e
    2c322014fccc0a5cfbaf94a4767db04e
32/
    e2446c34e2b8dba2b57a9bcba4854d32

So levels=1:2 means that the first level of directories is named after 1 character from the end of the hashed file name (such as e for b45cffe084dd3d20d928bee85e7b0f4e ) and the second level is named after 2 characters from the end of the hash (such as 4e ).

If you don't specify the levels option nginx will create only one level of directories. Which is fine for
smaller sites. However, for bigger traffic, specifying the levels option is a good practice since it can boost
the performance of nginx.

keys_zone=content_cache:100m defines the name of this cache zone ( content_cache ) which we can reference later. The 100m allocates 100MB of shared memory for storing the cache keys.

inactive=10m tells how long to keep a cache entry after it was last accessed. In this case, it's 10
minutes.

And now we can use it:


location ~\.php {
    fastcgi_cache_key $scheme$host$request_uri$request_method;
    fastcgi_cache content_cache;
    fastcgi_cache_valid 200 5m;
    fastcgi_cache_use_stale error timeout invalid_header http_500 http_503 http_404;
    fastcgi_ignore_headers Cache-Control Expires Set-Cookie;

    try_files $uri =404;

    include /etc/nginx/fastcgi.conf;
    fastcgi_pass unix:/run/php/php8.1-fpm.sock;
    fastcgi_index index.php;
    fastcgi_param PATH_INFO $fastcgi_path_info;
}

fastcgi_cache_key defines the key for the given request. With this config it's the concatenation of the scheme, host, URI, and method, something like httpsmysite.com/posts/1GET . This is the string that will be the filename after it's hashed.

fastcgi_cache here we need to specify the key we used in the keys_zone option.

The fastcgi_cache_valid directive sets the maximum time that a cached response can be
considered valid. In this case, it's set to 5 minutes for only successful (200) responses.

The fastcgi_ignore_headers directive specifies response headers that nginx should not process when caching. By default, headers such as Cache-Control , Expires , or Set-Cookie coming from the backend would prevent or limit caching, so we tell nginx to ignore them.

The fastcgi_cache_use_stale directive specifies which types of stale cached responses can be used
if the backend server is unavailable or returns an error. A stale cached response is a response that has
been previously cached by the server, but has exceeded its maximum allowed time to remain in the
cache and is considered "stale". This basically means that even if the BE is currently down we can serve
clients by using older cached responses. In this project, where the content is not changing that often
it's a perfectly good strategy to ensure better availability.

All right, so we added these directives to the location ~\.php block so they apply to every request. Which is not the desired outcome. The way we can control which locations should use the cache looks like this:

fastcgi_cache_bypass 1;
fastcgi_no_cache 1;

If fastcgi_cache_bypass is 1 then nginx will not use cache and forwards the request to the backend.

If fastcgi_no_cache is 1 then nginx will not cache the response at all.


Obviously, we need a way to set these values dynamically. Fortunately, nginx can handle variables and if
statements:

set $no_cache 1;

if ($request_uri ~* "\/posts\/([0-9]+)") {
set $no_cache 0;
}

if ($request_method != GET) {
set $no_cache 1;
}

This code will set the $no_cache variable to 0 only if the request is something like this: GET /posts/12 . Otherwise, it'll be 1 . Finally, we can use this variable:

location ~\.php {
    fastcgi_cache_key $scheme$host$request_uri$request_method;
    fastcgi_cache content_cache;
    fastcgi_cache_valid 200 5m;
    fastcgi_cache_use_stale error timeout invalid_header http_500 http_503 http_404;
    fastcgi_ignore_headers Cache-Control Expires Set-Cookie;
    fastcgi_cache_bypass $no_cache;
    fastcgi_no_cache $no_cache;

    try_files $uri =404;

    include /etc/nginx/fastcgi.conf;
    fastcgi_pass unix:/run/php/php8.1-fpm.sock;
    fastcgi_index index.php;
    fastcgi_param PATH_INFO $fastcgi_path_info;
}

With this simple config, we can cache every posts/{post} response for 5 minutes. The 5 minutes is an arbitrary number and it's different for every use case. As I said earlier, with this example, I wanted to solve a performance problem that happens in a really short time. So caching responses for 5 minutes is a good solution to this problem. And of course, the shorter you cache something the less risk you take (by serving outdated responses).


An important thing about caching on the nginx level: it can be tricky (or even impossible) to cache user-dependent content. For example, what if users can see the Post in their preferred language? To make this possible we need to add the language to the cache key so one post will have many cache keys. One for each language. If you have the language key in the URL it's not a hard task, but if you don't, you have to refactor your application. Or you need to use Laravel cache (where you have access to the user object and preferences, of course).
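If the preferred language arrives in a request header instead of the URL, one possible workaround (a sketch, not something from the sample app) is to include that header in the cache key, so every post is cached once per language. nginx exposes any request header as a $http_* variable:

fastcgi_cache_key $scheme$host$request_uri$request_method$http_accept_language;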

There are other kinds of settings that can cause problems. For example, what if every post has an audience?
So you can define who can see your post (for example, only your followers or everyone, etc). To handle this,
you probably need to add the user ID to the URL and the cache key as well.

Be aware of these scenarios. There's also proxy_cache , the equivalent directive you can use when nginx acts as a reverse proxy (with proxy_pass ).


MySQL slow query log


You can configure MySQL to log every query that exceeds a certain threshold in execution time. This is a
very useful feature since you'll have a list of slow-running queries.

You need to add the following lines to my.cnf :

slow_query_log = 1
slow_query_log_file = /var/log/slow_query.log
long_query_time = 1

long_query_time is in seconds so 1 means MySQL logs every query that takes more than 1s. You should
adjust it to your preferences. With these settings, MySQL will log slow queries in /var/log/slow_query.log
after you restart it.
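If you want to experiment without restarting MySQL, the same settings can usually also be changed at runtime (they won't survive a restart unless you also put them into my.cnf, and you need a privileged user):

SET GLOBAL slow_query_log = 'ON';
SET GLOBAL slow_query_log_file = '/var/log/slow_query.log';
SET GLOBAL long_query_time = 1;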

Once it's enabled, MySQL will log entries such as these:

# Time: 2024-04-23T17:56:50.185184Z
# User@Host: root[root] @ [192.168.65.1] Id: 8
# Query_time: 2.103501 Lock_time: 0.000050 Rows_sent: 5 Rows_examined: 1557987
SET timestamp=1713895008;
select
    hashed_uri, count(*) as total
from
    `page_views`
where visited_at between "2024-04-07 19:00:00" and "2024-04-08 23:59:59"
    and site_id = 1
group by hashed_uri
order by total desc
limit 5;

# Time: 2024-04-23T17:58:48.187662Z
# User@Host: root[root] @ [192.168.65.1] Id: 8
# Query_time: 6.309928 Lock_time: 0.000009 Rows_sent: 1557982 Rows_examined: 1557982
SET timestamp=1713895121;
select *
from page_views
where created_at <= "2024-04-23";

You can enable this on your server and monitor your queries. After a time the log file will be huge so don't
forget to purge or rotate it.
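Instead of reading the raw file, you can also summarize it with mysqldumpslow, which ships with MySQL. For example, to see the top 5 queries sorted by query time:

mysqldumpslow -s t -t 5 /var/log/slow_query.log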


Monitoring database connections


There's a pretty useful Laravel command that can monitor the database:

php artisan db:monitor --databases=mysql --max=100

It counts the number of connections your database has at the given moment, and dispatches an event if it's greater than the --max argument, which is 100 in this case. We need to schedule this command to run every minute:

class Kernel extends ConsoleKernel
{
    protected function schedule(Schedule $schedule): void
    {
        $schedule->command('db:monitor --databases=mysql --max=100')
            ->everyMinute();
    }
}

The command will fail and dispatch an event if the number of connections is greater than 100. We can
handle the event like this:

class AppServiceProvider extends ServiceProvider
{
    public function boot(): void
    {
        Event::listen(function (DatabaseBusy $event) {
            Notification::route('mail', '[email protected]')
                ->notify(new DatabaseApproachingMaxConnections(
                    $event->connectionName,
                    $event->connections
                ));
        });
    }
}
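The DatabaseApproachingMaxConnections notification is not something Laravel ships with; you have to write it yourself. A minimal sketch could look like this (the subject and wording are just placeholders):

namespace App\Notifications;

use Illuminate\Bus\Queueable;
use Illuminate\Notifications\Messages\MailMessage;
use Illuminate\Notifications\Notification;

class DatabaseApproachingMaxConnections extends Notification
{
    use Queueable;

    public function __construct(
        private string $connectionName,
        private int $connections,
    ) {}

    public function via(object $notifiable): array
    {
        return ['mail'];
    }

    public function toMail(object $notifiable): MailMessage
    {
        return (new MailMessage())
            ->subject('Database is approaching the max number of connections')
            ->line("The {$this->connectionName} connection currently has {$this->connections} open connections.");
    }
}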


Docker resource limits


If you're using Docker and/or docker-compose you can limit how much CPU or memory a given container
can use:

app:
image: app:1.0.0
deploy:
resources:
limits:
cpus: 1.5

queue:
image: app:1.0.0
deploy:
resources:
limits:
cpus: 1

mysql:
image: mysql:8.0.35
deploy:
resources:
limits:
cpus: 1.5

In this configuration the following happens:

The app container can use up to 1.5 CPU cores

The queue container can use only 1 CPU core

mysql can use up to 1.5 CPU cores

This is an example with a 2-core machine. This config makes sure that no container will drive your CPU
crazy. Even if you write an infinite loop that calculates prime numbers your server will have at least 0.5 cores
available for other processes. In the queue, I used a smaller number. This is not a coincidence. In my
opinion, background processes should not use the entire CPU. This way, containers that serve user requests
always have some available CPU.

We can do the same with memory as well:


app:
image: app:1.0.0
deploy:
resources:
limits:
cpus: 1.5
memory: 128M

The same can be done using Docker:

docker run --cpus=2 app:1.0.0

These are great features so test them out and use them.
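To verify that the limits are actually applied, you can watch the live resource usage of your containers. docker stats and docker update are standard Docker commands; the container name here is just an example:

# live CPU / memory usage per container
docker stats

# change the CPU limit of a running container without recreating it
docker update --cpus=1 queue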


Health check monitors


There are two excellent packages:

spatie/laravel-health

pragmarx/health

Both of them do the same thing. They can monitor the health of the application. Meaning:

How much free disk space does your server have?

Can the application connect to the database?

What's the current CPU load?

How many MySQL connections are there?

etc

You can define the desired threshold and the package will notify you if necessary. I'm going to use the Spatie package but the other one is also pretty good. Spatie is more code-driven while the Pragmarx package is more configuration-driven.

Please refer to the installation guide here.

This is what a spatie/laravel-health configuration looks like:


<?php

namespace App\Providers;

class HealthCheckServiceProvider extends ServiceProvider
{
    public function boot(): void
    {
        Health::checks([
            UsedDiskSpaceCheck::new(),
            DatabaseCheck::new(),
            CpuLoadCheck::new()
                ->failWhenLoadIsHigherInTheLast5Minutes(2)
                ->failWhenLoadIsHigherInTheLast15Minutes(1.75),
            DatabaseConnectionCountCheck::new()
                ->warnWhenMoreConnectionsThan(50)
                ->failWhenMoreConnectionsThan(100),
            DebugModeCheck::new(),
            EnvironmentCheck::new(),
            RedisCheck::new(),
            RedisMemoryUsageCheck::new(),
            QuerySpeedCheck::new(),
        ]);
    }
}

We're using the following checks:

UsedDiskSpaceCheck warns you if more than 70% of the disk is used and it sends an error message if
more than 90% is used.

DatabaseCheck checks if the database is available or not.

CpuLoadCheck measures the CPU load (the numbers you can see when you open htop ). It sends you a failure message if the 5-minute average load is more than 2 or if the 15-minute average is more than 1.75. I'm using 2-core machines in this project, so a load of 2 means both cores run at 100%. If you have a 4-core CPU, a load of 4 means 100%.

DatabaseConnectionCountCheck sends you a warning if there are more than 50 connections and a
failure message if there are more than 100 MySQL connections.

DebugModeCheck checks the value of APP_DEBUG (in production it should be false ).


EnvironmentCheck makes sure that your application is running in production mode.

RedisCheck tries to connect to Redis and notifies you if the connection cannot be established.

RedisMemoryUsageCheck sends you a message if Redis is using more than 500MB of memory.

These are the basic checks you can use in almost every project.

To be able to use the CpuLoadCheck and the DatabaseConnectionCountCheck you have to install these
packages as well:

spatie/cpu-load-health-check

doctrine/dbal

The package can send you e-mail and Slack notifications as well. Just set them up in the health.php config file:

'notifications' => [
    Spatie\Health\Notifications\CheckFailedNotification::class => ['mail'],
],

'mail' => [
    'to' => env('HEALTH_CHECK_EMAIL', '[email protected]'),
    'from' => [
        'address' => env('MAIL_FROM_ADDRESS', '[email protected]'),
        'name' => env('MAIL_FROM_NAME', 'Health Check'),
    ],
],

'slack' => [
    'webhook_url' => env('HEALTH_SLACK_WEBHOOK_URL', ''),
    'channel' => null,
    'username' => null,
    'icon' => null,
],


Finally, you need to run the command provided by the package every minute:

$schedule->command('health:check')->everyMinute();

You can also write your own checks. For example, I created a QuerySpeedCheck that simply measures the
speed of an important query:

namespace App\Checks;

class QuerySpeedCheck extends Check
{
    public function run(): Result
    {
        $result = Result::make();

        $executionTimeMs = Benchmark::measure(function () {
            Post::with('author')->orderBy('publish_at')->get();
        });

        if ($executionTimeMs >= 50 && $executionTimeMs <= 250) {
            return $result->warning('Database is starting to slow down');
        }

        if ($executionTimeMs > 250) {
            return $result->failed('Database is slow');
        }

        return $result->ok();
    }
}

In the sample application I don't have too many database queries, but selecting the posts with authors is
pretty important and happens a lot. So if this query cannot be executed in a certain amount of time it
means the application might be slow. The numbers used in this example are completely arbitrary. Please
measure your own queries carefully before setting a threshold.

If you run the command it gives you an overview of every check and its current status.
