Making Video Intuitive: An Explainer
2020-05-12
Kyle Boutette
11 min read
https://blog.cloudflare.com/making-video-intuitive-an-explainer/
In this post, let's have some fun. Instead of fine-grained optimization work, we’ll
do the opposite. Today we’ll make it easy to see changes between different
versions of a video: we’ll start with a high-quality video and ruin it. Instead of
aiming for perfection, let’s see the impact of various video coding settings. We’ll
go on a deep dive on how to make some victim video look gloriously bad and
learn on the way.
Everyone agrees that video on the Internet should look good, start playing fast,
and never rebuffer regardless of the device they’re on. People can prefer one
version of a video over another and say it looks better. Most people, though,
would have difficulty elaborating on what ‘better’ means. That’s not an issue
when you’re just consuming video. However, when you’re storing, encoding, and
distributing it, how that video looks determines how happy your viewers are.
To determine what looks better, video engineers can use a variety of techniques.
The most accessible is the most obvious: compare two versions of a video by
having people look at them—a subjective comparison. We’ll apply eyeballs here.
So, who’s our sacrificial video? We’re going to use a classic video for the
demonstration here—perhaps too classic for people that work with video—Big
Buck Bunny. This is an open-source film by Sacha Goedegebure available under
the permissive Creative Commons Attribution 3.0 license. We’re only going to
work with 17 seconds of it to save some time. This is what the video looks like
when downloaded from https://peach.blender.org/download/. Take a moment to
savor the quality since we’re only getting worse from here.
For brevity, we'll evaluate our results by two properties: smooth motion and
looking ‘crisp’. The video shouldn’t stutter and its important features should be
distinguishable.
It’s worth mentioning that video is a hack of your brain. Every video is just an
optimized series of pictures— a very sophisticated flipbook. Display those
pictures quickly enough and you can fool the brain into interpreting motion. If you
show enough points of light close together, they meld into a continuous image.
Then, change the color of those lights frequently enough and you end up with
smooth motion.
Frame rate
Not stuttering is covered by framerate, measured in frames-per-second (fps). fps
is the number of individual pictures displayed in a single second; many videos are
encoded at somewhere between 24 and 30fps. One way to describe fps is in
terms of how long a frame is shown for—commonly called the frame time. At
24fps, each frame is shown for about 42 milliseconds. At 2fps, that jumps to
500ms. Lowering fps causes frames to trend rapidly towards persisting for the
full second. Smooth motion mostly comes down to the single knob of fps.
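The relationship between fps and frame time is simple enough to sketch in a few lines. The helper name `frame_time_ms` is ours, purely for illustration.

```python
def frame_time_ms(fps: float) -> float:
    """Return how long a single frame stays on screen, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Compare a few frame rates, including the 2fps clip used in this post.
for fps in (30, 24, 2):
    print(f"{fps:>2} fps -> {frame_time_ms(fps):.1f} ms per frame")
```

Halving the frame rate doubles the frame time, which is why low frame rates feel bad so quickly.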
Mucking about with framerate isn’t a sporting way to achieve our goal. It’s
extremely easy to tank the framerate and ruin the experience. Humans have a low
tolerance for janky motion. To get the idea, here’s what our original clip reduced
to 2fps looks like; 500ms per-frame is a long time.
2fps.mp4
Resolution
Making tiny features distinguishable has many more knobs. Choices you can
make include what codec, level, profile, bitrate, resolution, color space, or
keyframe frequency, to name a few. Each of these also influences factors apart
from perceived quality, such as how large the resulting file is plus what devices it
is compatible with. There’s no universal right answer for what parameters to
encode a video with. For the best experience while not wasting resources, the
same video should be tailored differently for a modern 4K display than for a
2007 iPod Nano. We’ll spend our time here focusing on what impacts a video’s
crispness since that’s what largely determines the experience.
We’re going to use FFmpeg to make this happen. This is the sonic screwdriver of
the video world; a near-universal command-line tool for converting and
manipulating media. FFmpeg is almost two decades old, has hundreds of
contributors, and can do essentially any digital video-related task. Its flexibility
also makes it rather complex to work with. For each version of the video, we’ll
show the command used to generate it as we go.
Let’s figure out exactly what we want to change about the video to make it a bad
experience.
You may have heard about resolution and bitrate. To explain them, let’s use an
analogy. Resolution provides pixels. Pixels are buckets for information. Bitrate is
the information that fills those buckets. How full a given bucket is determines how
well a pixel can represent content. With too few bits of information for a bucket,
the pixel will get less and less accurate to the original source. In practice, their
numerical relationship is complicated. These are what we’ll be varying.
The decision of which bucket should get how many bits of information is
determined by software called a video encoder. The job of the encoder is to use
the bits budgeted for it as efficiently as possible to display the best quality video.
We’ll be changing the bitrate budget to influence the resulting bitrate. Like people
with money, budgeting is a good idea for our encoder. Uncompressed video can
use a byte, or more, per pixel for each of the red, green, and blue (RGB) channels.
For a 1080p video, that means 1920x1080 pixels multiplied by 3 bytes to get
6.2MB per frame. We’ll talk about frames later, but 6.2MB is a lot: at 24fps, a
single-layer DVD (4.7GB) would only fit about 30 seconds of video.
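To make the uncompressed-video arithmetic concrete, here is a minimal sketch. The disc capacities (4.7GB single-layer, 8.5GB dual-layer) and the 24fps frame rate are our assumptions, and the function names are made up for this example; the exact number of seconds depends on which capacity and frame rate you plug in.

```python
BYTES_PER_PIXEL = 3  # one byte each for the red, green, and blue channels


def raw_frame_bytes(width: int, height: int) -> int:
    """Size of one uncompressed RGB frame in bytes."""
    return width * height * BYTES_PER_PIXEL


def seconds_that_fit(capacity_bytes: float, width: int, height: int, fps: float) -> float:
    """How many seconds of uncompressed video fit in the given capacity."""
    return capacity_bytes / (raw_frame_bytes(width, height) * fps)


frame = raw_frame_bytes(1920, 1080)
print(f"one 1080p frame: {frame / 1e6:.1f} MB")

# Assumed capacities: 4.7 GB (single-layer DVD) and 8.5 GB (dual-layer DVD).
for label, capacity in (("single-layer", 4.7e9), ("dual-layer", 8.5e9)):
    secs = seconds_that_fit(capacity, 1920, 1080, fps=24)
    print(f"{label} DVD at 24fps: about {secs:.0f} seconds")
```

Either way, the disc holds somewhere between half a minute and a minute of uncompressed 1080p, which is why nobody ships video this way.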
With our variables chosen, we’re good to go. For every variation we encode, we’ll
show a comparison to this table. Our source video is encoded in H.264 at 24fps
with a variety of other settings; those features will not change. Expect these
| | Resolution | Bitrate | File Size |
|---|---|---|---|
| Source | 1280x720 | 7.5Mbps | 16MB |
| Scaled to 140p | 248x140 | 2.9Mbps | 6.1MB |
By the numbers, we find some curious results. We didn’t ask for a different bitrate
from the source, but our encoder gave us one that is a bit more than a third of it.
Because the number of pixels was dramatically reduced, the encoder had far fewer
buckets to put our bitrate’s information into. It made its best attempt at using
the entire bitrate budget provided to it, but it ran out of buckets to fill first.
What did it do with the leftover information? Since it isn’t in the video, it tossed it.
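A quick sketch shows how lopsided the change was. The resolutions and bitrates come from the table above; the variable names are ours.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixel 'buckets' in one frame."""
    return width * height


source_pixels = pixel_count(1280, 720)
scaled_pixels = pixel_count(248, 140)

pixel_ratio = source_pixels / scaled_pixels  # how many times fewer buckets
bitrate_ratio = 7.5 / 2.9                    # how many times fewer bits (Mbps / Mbps)

print(f"source: {source_pixels} pixels, scaled: {scaled_pixels} pixels")
print(f"pixels shrank by about {pixel_ratio:.1f}x")
print(f"bitrate shrank by about {bitrate_ratio:.1f}x")
```

The bucket count fell far more than the bitrate did, so each remaining bucket ended up with many more bits than before.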
This would probably be an acceptable experience on a 4in phone screen. You
wouldn’t notice the sort-of grainy result on a small display. On a 40in TV, it’d be
blocky and unpleasant. At 40in, 140 rows of pixels become individually
distinguishable which doesn’t fool the brain and ruins the magic.
Bitrate
Bitrate is the density of information for a given period of time, typically a second.
This interacts with framerate to give us a per frame bitrate budget. Our source
having a bitrate of 7.5Mbps (millions of bits-per-second) and framerate of 24fps
means we have an average of 7500Kbps / 24fps = 312.5Kb of information per
frame.
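The same per-frame arithmetic works for any bitrate and frame rate. This is just the division above wrapped in a small helper whose name is our own.

```python
def average_kb_per_frame(bitrate_kbps: float, fps: float) -> float:
    """Average budget per frame, in kilobits, for a given bitrate and frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return bitrate_kbps / fps


# The source bitrate, plus the two much lower targets tried later in this post.
for bitrate_kbps in (7500, 100, 20):
    budget = average_kb_per_frame(bitrate_kbps, fps=24)
    print(f"{bitrate_kbps:>4} Kbps at 24fps -> {budget:.2f} Kb per frame on average")
```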
Different kinds of frames
There are different ways a frame can be encoded. It doesn’t make sense to use
the same technique for a sequence of frames of a single color and most of the
sequences in Big Buck Bunny. There’s differing information density and
distribution between those sequences. Different ways of representing frames
take advantage of those differing patterns. As a result, the 312Kb average for
each frame is both lower than the size of the larger frames and greater than the
size of the smallest frames. Some frames contain just changes relative to other
frames – these are P or B frames – those could be far smaller than 312Kb.
However, some frames contain full images – these are I frames – and tend to be
far larger than 312Kb. Since we’re concerned with the overall experience across
multiple seconds of video, we don’t need to worry about individual frame sizes
here. Knowing about frame types is still useful for their impact on bitrate for
different types of content, which we’ll discuss later.
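To see how one average can hide very different frame sizes, here is a sketch with made-up numbers. The 2000Kb I frame and the one-I-frame-per-second layout are purely hypothetical and were not measured from our clip; only the 7500Kbps and 24fps figures come from the source video.

```python
def average_non_i_frame_kb(bitrate_kbps: float, fps: int,
                           i_frame_kb: float, i_frames_per_second: int = 1) -> float:
    """Average size left for P/B frames after the I frames take their share.

    Works on one second of video: the budget for that second is bitrate_kbps
    kilobits, shared between I frames and all remaining frames.
    """
    other_frames = fps - i_frames_per_second
    if other_frames <= 0:
        raise ValueError("need at least one non-I frame per second")
    remaining_kb = bitrate_kbps - i_frame_kb * i_frames_per_second
    return remaining_kb / other_frames


# Hypothetical: a single 2000 Kb I frame each second, 23 P/B frames sharing the rest.
rest = average_non_i_frame_kb(7500, 24, i_frame_kb=2000)
print(f"overall average: {7500 / 24:.1f} Kb per frame")
print(f"P/B frames would average: {rest:.1f} Kb per frame")
```

With this invented split, the big frame sits well above the 312.5Kb average and every other frame sits below it.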
Our starting bitrate is extremely large and has more information than we actually
need. Let’s be aggressive and cut it down to 1/75th while maintaining the source’s
resolution.
ffmpeg -v info -y -hide_banner -i source.mp4 -c:v h264 -b:v 100k -c:a copy bitrate-100k.mp4
| | Resolution | Bitrate | File Size |
|---|---|---|---|
| Source | 1280x720 | 7.5Mbps | 16MB |
| Scaled to 140p | 248x140 | 2.9Mbps | 6.1MB |
| Targeted to 100Kbps | 1280x720 | 102Kbps | 217KB |
When you take a look at the video, fur and grass become blobs. There’s just not
enough information to accurately represent the fine details.
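The file sizes in the table follow almost directly from bitrate and duration. Here is a rough sketch, assuming our clip is 17 seconds long (as stated earlier) and ignoring container overhead and any other streams, so real files will differ somewhat. The function name is ours.

```python
CLIP_SECONDS = 17  # length of the Big Buck Bunny excerpt used in this post


def estimated_size_bytes(bitrate_bps: float, duration_s: float) -> float:
    """Rough file size: bits per second times seconds, divided by 8 bits per byte."""
    return bitrate_bps * duration_s / 8


variants = (
    ("Source", 7_500_000),
    ("Scaled to 140p", 2_900_000),
    ("Targeted to 100Kbps", 102_000),
)
for label, bitrate_bps in variants:
    size = estimated_size_bytes(bitrate_bps, CLIP_SECONDS)
    print(f"{label}: about {size / 1000:.0f} KB")
```

The estimates land close to the table's 16MB, 6.1MB, and 217KB, which is a handy sanity check when you pick a bitrate target.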
Source Video
take bitrate down to an even further extreme of 20Kbps while keeping the
resolution constant.
ffmpeg -v info -y -hide_banner -i source.mp4 -c:v h264 -b:v 20k -c:a copy bitrate-20k.mp4
| | Resolution | Bitrate | File Size |
|---|---|---|---|
| Source | 1280x720 | 7.5Mbps | 16MB |
| Scaled to 140p | 248x140 | 2.9Mbps | 6.1MB |
| Targeted to 100Kbps | 1280x720 | 102Kbps | 217KB |
| Targeted to 20Kbps | 1280x720 | 35Kbps | 81KB |
Now, this is truly unwatchable! There’s sometimes color but the video mostly
devolves into grayscale rectangles roughly approximating the silhouettes of what
we’re expecting. At roughly a third of the bitrate of the previous trial, this
definitely looks like it has less than a third of the information.
As before, we didn’t hit our bitrate target: we asked for 20Kbps and got 35Kbps.
The cause is again a mismatch between buckets and bits; this time our pixel
buckets were insufficiently filled with information. The encoder needed to start
making hard decisions at some point between 102 and 35Kbps. Most of the color
and the comprehensibility of the scene were sacrificed.
We’ll discuss why there are moving grayscale rectangles and patches of color in a
bit. They’re giving us a hint about how the encoder works under the hood.
What if we go just one step further and combine our tiny resolution with the
absurdly low bitrate? That should be an even worse experience, right?
ffmpeg -v info -y -hide_banner -i source.mp4 -vf scale=-2:140 -c:v h264 -
| | Resolution | Bitrate | File Size |
|---|---|---|---|
| Source | 1280x720 | 7.5Mbps | 16MB |
| Scaled to 140p | 248x140 | 2.9Mbps | 6.1MB |
| Targeted to 100Kbps | 1280x720 | 102Kbps | 217KB |
| Targeted to 20Kbps | 1280x720 | 35Kbps | 81KB |
| Scaled to 140p and Targeted to 20Kbps | 248x140 | 19Kbps | 48KB |
Wait a minute, that’s actually not too bad at all. It’s almost like a tinier version of
1280 by 720 at 100Kbps. Why doesn’t this look terrible? Having a lower bitrate
means there’s less information, which implies that the video should look worse. A
lower resolution means the image should be less detailed. The numbers got
| | Resolution | Bitrate | File Size | Bits/Pixel |
|---|---|---|---|---|
| Source | 1280x720 | 7.5Mbps | 16MB | 8.10 |
| Scaled to 140p | 248x140 | 2.9Mbps | 6.1MB | 83.5 |
| Targeted to 100Kbps | 1280x720 | 102Kbps | 217KB | 0.11 |
| Targeted to 20Kbps | 1280x720 | 35Kbps | 81KB | 0.03 |
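The last figure in each row of the table above appears to be the bitrate divided by the number of pixels in a frame. That reading is our assumption, but it reproduces the table closely; small differences are probably down to the table using unrounded bitrates. A minimal sketch, with a helper name of our own:

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int) -> float:
    """Bits available per pixel each second: bitrate divided by pixels per frame."""
    return bitrate_bps / (width * height)


variants = (
    ("Source", 7_500_000, 1280, 720),
    ("Scaled to 140p", 2_900_000, 248, 140),
    ("Targeted to 100Kbps", 102_000, 1280, 720),
    ("Targeted to 20Kbps", 35_000, 1280, 720),
    ("Scaled to 140p and Targeted to 20Kbps", 19_000, 248, 140),
)
for label, bitrate_bps, width, height in variants:
    print(f"{label}: {bits_per_pixel(bitrate_bps, width, height):.2f} bits per pixel")
```

Seen this way, the tiny 19Kbps version has more bits for each of its pixels than the 1280x720 version at 102Kbps, which fits with it looking better than expected.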
most video codecs, the bits of picture that can be shifted are called
‘macroblocks’, which subdivide each frame with NxN squares of pixels. The less
stingy the bitrate, the less obvious the macroblock shell game.
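To get a feel for how many macroblocks the encoder has to juggle, here is a small sketch. It assumes the 16x16 macroblock size used by H.264; other codecs use different block sizes, and the helper name is ours.

```python
import math


def macroblock_grid(width: int, height: int, block: int = 16) -> tuple[int, int]:
    """Columns and rows of block x block squares needed to cover a frame.

    Partial blocks at the right and bottom edges still count as whole blocks.
    """
    return math.ceil(width / block), math.ceil(height / block)


for width, height in ((1280, 720), (248, 140)):
    cols, rows = macroblock_grid(width, height)
    print(f"{width}x{height}: {cols} x {rows} = {cols * rows} macroblocks per frame")
```

At 140p there are far fewer blocks to move around, so each one covers a much larger share of the picture.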
To see this effect more clearly, we can ask FFmpeg to show us decisions it
makes. Specifically, it can show us what it decides is ‘motion’ moving the
macroblocks. The video here is 140p for the motion vector arrows to be easier to
see.
ffmpeg -v info -y -hide_banner -flags2 +export_mvs -i source.mp4 -vf scale=-2:140,codecview=mv=pf+bf+bb -c:v h264 -b:v 6000k -c:a copy motion-vector.mp4
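If you're curious how a motion vector could be found in the first place, here is a toy sketch of exhaustive block matching using the sum of absolute differences (SAD). This is a textbook technique chosen for clarity, not a description of what FFmpeg's encoder actually does; real encoders use much smarter searches. The frames are tiny grayscale grids invented for the example.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def find_motion_vector(prev, curr, x, y, size, search):
    """Find the offset (dx, dy) into prev that best matches curr's block at (x, y)."""
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), sad(curr, prev, x, y, x, y, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x + dx, y + dy
            # Skip candidate blocks that would fall outside the previous frame.
            if px < 0 or py < 0 or px + size > width or py + size > height:
                continue
            cost = sad(curr, prev, x, y, px, py, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


def blank_frame(width, height):
    return [[0] * width for _ in range(height)]


def draw_patch(frame, x, y, size, value):
    for dy in range(size):
        for dx in range(size):
            frame[y + dy][x + dx] = value


# A bright 2x2 patch moves from (1, 1) in the previous frame to (3, 2) in the current one.
prev_frame = blank_frame(8, 8)
curr_frame = blank_frame(8, 8)
draw_patch(prev_frame, 1, 1, 2, 200)
draw_patch(curr_frame, 3, 2, 2, 200)

vector, cost = find_motion_vector(prev_frame, curr_frame, x=3, y=2, size=2, search=2)
print(f"best match is offset {vector} in the previous frame, cost {cost}")
```

The returned offset points from the block's current position back to where it was in the previous frame, which is the kind of arrow the codecview filter draws.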
Even worse is that flat color and noise might only be seen in two different scenes
in the same video. That forces you to either waste your bitrate budget in one
scene or look terrible in the other. We give the encoder a bitrate budget it can
use. How it uses it is the result of a feedback loop during encoding.
Yet another caveat is that your resulting bitrate is influenced by all those knobs
that were listed earlier, the most impactful being codec choice followed by bitrate
budget. We explored the relationship between bitrate and resolution but every
knob has an impact on the quality and a single knob frequently interacts with
other knobs.
So far we’ve taken a look at some of the knobs and settings that affect visual
quality in a video. Every day, video engineers and encoders make tough decisions
to optimize for the human eye, while keeping file sizes at a minimum. Modern
encoding schemes use techniques such as per-title encoding to narrow down the
best resolution-bitrate combinations. Those schemes look somewhat similar to
what we’ve done here: test various settings and see what gives the desired result.
With every example, we’ve included an FFmpeg command you can use to
replicate the output above and experiment with your own videos. We encourage
you to try improving the video quality while reducing file sizes on your own and to
find other levers that will help you on this journey!
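If you'd like to try several resolution and bitrate combinations in one go, a small script can generate the commands for you. This sketch only builds the argument lists, using the same FFmpeg flags as the commands in this post; the list of heights and bitrates and the output naming scheme are hypothetical choices of ours.

```python
from itertools import product


def build_command(source: str, height: int, bitrate_kbps: int) -> list[str]:
    """Build an FFmpeg argument list that scales and re-encodes one variant."""
    output = f"variant-{height}p-{bitrate_kbps}k.mp4"  # hypothetical naming scheme
    return [
        "ffmpeg", "-v", "info", "-y", "-hide_banner",
        "-i", source,
        "-vf", f"scale=-2:{height}",   # keep aspect ratio, force an even width
        "-c:v", "h264",
        "-b:v", f"{bitrate_kbps}k",    # target video bitrate
        "-c:a", "copy",                # leave the audio untouched
        output,
    ]


heights = (140, 360, 720)      # hypothetical set of resolutions to try
bitrates_kbps = (20, 100, 1000)  # hypothetical set of bitrate targets

commands = [build_command("source.mp4", h, b) for h, b in product(heights, bitrates_kbps)]
for command in commands:
    print(" ".join(command))
```

To actually run a variant, pass one of the lists to `subprocess.run(command, check=True)` on a machine with FFmpeg installed, then compare the results by eye.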