Making Video Intuitive: An Explainer
2020-05-12
Kyle Boutette
11 min read

On the Stream team at Cloudflare, we work to provide a great viewing experience
while keeping our service affordable. That involves a lot of small tweaks to our
video pipeline that most people would find difficult to discern. And that makes the
results of those tweaks less intuitive.

https://blog.cloudflare.com/making-video-intuitive-an-explainer/

In this post, let's have some fun. Instead of fine-grained optimization work, we’ll
do the opposite. Today we’ll make it easy to see changes between different
versions of a video: we’ll start with a high-quality video and ruin it. Instead of
aiming for perfection, let’s see the impact of various video coding settings. We’ll
take a deep dive into how to make some victim video look gloriously bad and
learn along the way.

Everyone agrees that video on the Internet should look good, start playing fast,
and never rebuffer, regardless of the viewer’s device. People can prefer one
version of a video over another and say it looks better. Most people, though,
would have difficulty elaborating on what ‘better’ means. That’s not an issue
when you’re just consuming video. However, when you’re storing, encoding, and
distributing it, how that video looks determines how happy your viewers are.

To determine what looks better, video engineers can use a variety of techniques.
The most accessible is the most obvious: compare two versions of a video by
having people look at them, which is a subjective comparison. We’ll apply eyeballs here.

So, who’s our sacrificial video? We’re going to use a classic video for the
demonstration here (perhaps too classic for people who work with video): Big
Buck Bunny. This is an open-source film by Sacha Goedegebure available under
the permissive Creative Commons Attribution 3.0 license. We’re only going to
work with 17 seconds of it to save some time. This is what the video looks like
when downloaded from https://peach.blender.org/download/. Take a moment to
savor the quality since it only gets worse from here.

For brevity, we'll evaluate our results by two properties: smooth motion and
looking ‘crisp’. The video shouldn’t stutter and its important features should be
distinguishable.
It’s worth mentioning that video is a hack of your brain. Every video is just an
optimized series of pictures: a very sophisticated flipbook. Display those
pictures quickly enough and you can fool the brain into interpreting motion. If you
show enough points of light close together, they meld into a continuous image.
Then, change the color of those lights frequently enough and you end up with
smooth motion.


Frame rate
Smooth, stutter-free motion is governed by framerate, measured in frames per second (fps). fps
is the number of individual pictures displayed in a single second; many videos are
encoded at somewhere between 24 and 30fps. One way to describe fps is in
terms of how long a frame is shown for, commonly called the frame time. At
24fps, each frame is shown for about 42 milliseconds. At 2fps, that jumps to
500ms. As fps drops, the frame time climbs rapidly toward a full second per frame.
Smooth motion mostly comes down to the single knob of fps.

Mucking about with framerate isn’t a sporting way to achieve our goal. It’s
extremely easy to tank the framerate and ruin the experience. Humans have a low
tolerance for janky motion. To get the idea, here’s what our original clip reduced
to 2fps looks like; 500ms per frame is a long time.

ffmpeg -v info -y -hide_banner -i source.mp4 -r 2 -c:v h264 -c:a copy 2fps.mp4
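
The frame times quoted in this section come from dividing one second by the frame rate. Here is a minimal sketch of that arithmetic; the helper name `frame_time_ms` is ours for illustration and is not part of FFmpeg.

```python
def frame_time_ms(fps: float) -> float:
    """Return how long each frame stays on screen, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Compare common frame rates with the deliberately ruined 2fps clip.
for fps in (30, 24, 2):
    print(f"{fps:>2} fps -> {frame_time_ms(fps):.1f} ms per frame")
```

Halving the frame rate doubles the frame time, which is why a low fps becomes noticeable so quickly.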

Resolution
Making tiny features distinguishable has many more knobs. Choices you can
make include codec, level, profile, bitrate, resolution, color space, and
keyframe frequency, to name a few. Each of these also influences factors apart
from perceived quality, such as how large the resulting file is and what devices it
is compatible with. There’s no universal right answer for what parameters to
encode a video with. For the best experience without wasting resources, a
video intended for a modern 4K display should be encoded differently from the
same video intended for a 2007 iPod Nano. We’ll spend our time here focusing on what impacts a video’s
crispness since that’s what largely determines the experience.

We’re going to use FFmpeg to make this happen. This is the sonic screwdriver of
the video world: a near-universal command-line tool for converting and
manipulating media. FFmpeg is almost two decades old, has hundreds of
contributors, and can do essentially any digital video-related task. Its flexibility
also makes it rather complex to work with. For each version of the video, we’ll
show the command used to generate it as we go.

Let’s figure out exactly what we want to change about the video to make it a bad
experience.

You may have heard about resolution and bitrate. To explain them, let’s use an
analogy. Resolution provides pixels. Pixels are buckets for information. Bitrate is
the information that fills those buckets. How full a given bucket is determines how
well a pixel can represent content. With too few bits of information for a bucket,
the pixel becomes less and less accurate to the original source. In practice, the
numerical relationship between the two is complicated. Resolution and bitrate are what we’ll be varying.

The decision of which bucket should get how many bits of information is
determined by software called a video encoder. The job of the encoder is to use
the bits budgeted for it as efficiently as possible to display the best quality video.
We’ll be changing the bitrate budget to influence the resulting bitrate. Like people
with money, budgeting is a good idea for our encoder. Uncompressed video can
use a byte, or more, per pixel for each of the red, green, and blue (RGB) channels.
For a 1080p video, that means 1920x1080 pixels multiplied by 3 bytes to get
6.2MB per frame. We’ll talk about frames later but 6.2MB is a lot: at this rate, a
single-layer 4.7GB DVD would only fit about 30 seconds of video at 24fps.
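
To see where the uncompressed figures come from, here is a rough sketch of the arithmetic. The disc capacity (a single-layer 4.7GB DVD, in decimal gigabytes) and the 24fps frame rate are our assumptions; a different capacity or frame rate changes the duration proportionally.

```python
def uncompressed_frame_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Size of one uncompressed frame, assuming 1 byte per RGB channel."""
    return width * height * bytes_per_pixel


def seconds_that_fit(capacity_bytes: float, frame_bytes: int, fps: float) -> float:
    """How many seconds of uncompressed video fit in the given capacity."""
    return capacity_bytes / (frame_bytes * fps)


DVD_CAPACITY_BYTES = 4.7e9  # assumption: single-layer disc, decimal gigabytes
FPS = 24                    # assumption: same frame rate as our source clip

frame_bytes = uncompressed_frame_bytes(1920, 1080)
print(f"{frame_bytes / 1e6:.1f} MB per frame")
print(f"{seconds_that_fit(DVD_CAPACITY_BYTES, frame_bytes, FPS):.1f} seconds per disc")
```

Either way, the takeaway is the same: without an encoder making budgeting decisions, video is impractically large.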
With our variables chosen, we’re good to go. For every variation we encode, we’ll
show a comparison to this table. Our source video is encoded in H.264 at 24fps
with a variety of other settings; those features will not change. Expect these
numbers to get significantly smaller as we poke around to see what changes.


                                       Resolution  Bitrate  File Size
Source                                 1280x720    7.5Mbps  16MB
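
The bitrate and file size columns are tied together by the clip's duration. This sketch shows the relationship, assuming the 17-second clip length mentioned earlier, megabytes of 10^6 bytes, and no container overhead; the helper name is illustrative.

```python
def file_size_megabytes(bitrate_bps: float, duration_s: float) -> float:
    """Approximate file size: bits per second times seconds, converted to MB."""
    total_bits = bitrate_bps * duration_s
    return total_bits / 8 / 1_000_000


# Source clip: 7.5Mbps for 17 seconds.
print(f"{file_size_megabytes(7_500_000, 17):.1f} MB")
```

Because the duration stays fixed in every experiment, file size in the tables moves roughly in step with bitrate.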
To start, let’s change just resolution and see what impact that has. The lowest
resolution most people are exposed to is usually 140p, so let’s reencode our
source video targeting that. Since many video platforms have this as an option,
we’re not expecting an unwatchable experience quite yet.
ffmpeg -v info -y -hide_banner -i source.mp4 -vf scale=-2:140 -c:v h264 -b:v 6000k -c:a copy scaled-140.mp4

                                       Resolution  Bitrate  File Size
Source                                 1280x720    7.5Mbps  16MB
Scaled to 140p                         248x140     2.9Mbps  6.1MB
By the numbers, we find some curious results. We asked for a bitrate budget of
6,000Kbps, not far below the source’s, but our encoder gave us less than half of that. Given that
the number of pixels was dramatically reduced, the encoder had fewer buckets to
put the information from our bitrate budget in. Despite its best attempt at using the entire
bitrate budget provided to it, our encoder filled every bucket before the budget ran out. What
did it do with the leftover budget? Since it isn’t in the video, it tossed it.

This would probably be an acceptable experience on a 4in phone screen. You
wouldn’t notice the sort-of grainy result on a small display. On a 40in TV, it’d be
blocky and unpleasant. At 40in, 140 rows of pixels become individually
distinguishable, which doesn’t fool the brain and ruins the magic.
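
The 248x140 output size follows from the `scale=-2:140` argument, which asks for a height of 140 and a width that keeps the aspect ratio while staying divisible by 2. The sketch below approximates that calculation; rounding to the nearest multiple is our assumption about the filter's behavior, and the helper name is hypothetical.

```python
def scaled_width(src_w: int, src_h: int, target_h: int, multiple: int = 2) -> int:
    """Width that preserves aspect ratio, rounded to the nearest multiple."""
    # Assumption: round to the nearest multiple, as the scale filter appears to do.
    return round(target_h * src_w / (src_h * multiple)) * multiple


width = scaled_width(1280, 720, 140)
source_pixels = 1280 * 720
scaled_pixels = width * 140
print(f"{width}x140, {source_pixels / scaled_pixels:.1f}x fewer pixels than the source")
```

With so many fewer buckets to fill, it makes sense that the encoder left much of the budget unspent.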
Bitrate
Bitrate is the density of information for a given period of time, typically a second.
This interacts with framerate to give us a per frame bitrate budget. Our source
having a bitrate of 7.5Mbps (millions of bits-per-second) and framerate of 24fps
means we have an average of 7500Kbps / 24fps = 312.5Kb of information per
frame.
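
As a quick check of that division, here is a small sketch that computes the average per-frame budget for the source and for the two bitrate targets used later in this post. The function name is ours for illustration.

```python
def per_frame_kilobits(bitrate_kbps: float, fps: float) -> float:
    """Average number of kilobits available to each frame."""
    return bitrate_kbps / fps


# Source bitrate, then the 100Kbps and 20Kbps targets, all at 24fps.
for label, kbps in (("source", 7500), ("100k target", 100), ("20k target", 20)):
    print(f"{label:>11}: {per_frame_kilobits(kbps, 24):.2f} Kb per frame")
```

These are averages only; individual frames can be far larger or smaller than the average.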
Different kinds of frames

There are different ways a frame can be encoded. It doesn’t make sense to use
the same technique for a sequence of single-color frames as for most of the
sequences in Big Buck Bunny. Information density and distribution differ
between those sequences. Different ways of representing frames
take advantage of those differing patterns. As a result, the 312Kb average for
each frame is both lower than the size of the largest frames and greater than the
size of the smallest frames. Some frames contain just changes relative to other
frames (these are P or B frames) and can be far smaller than 312Kb.
However, some frames contain full images (these are I frames) and tend to be
far larger than 312Kb. Since we’re concerned with the overall experience across
multiple seconds of video, we don’t need to worry about individual frames.
Knowing about frame types is useful for their impact on bitrate for
different types of content, which we’ll discuss later.
Our starting bitrate is extremely large and has more information than we actually
need. Let’s be aggressive and cut it down to 1/75th while maintaining the source’s
resolution.
ffmpeg -v info -y -hide_banner -i source.mp4 -c:v h264 -b:v 100k -c:a copy bitrate-100k.mp4

                                       Resolution  Bitrate  File Size
Source                                 1280x720    7.5Mbps  16MB
Scaled to 140p                         248x140     2.9Mbps  6.1MB
Targeted to 100Kbps                    1280x720    102Kbps  217KB
When you take a look at the video, fur and grass become blobs. There’s just not
enough information to accurately represent the fine details.

[Figure panels: Source Video | 100 Kbps budget]


We provided a bitrate budget of 100Kbps but the encoder doesn’t seem to have
quite hit it. When we changed the resolution, we got a lower bitrate than we
asked for; here we got a higher one. Why would that be the case?

We have so many buckets that there’s some minimum amount the encoder wants
in each. Since it can play with the bitrate, it ends up favoring slightly fuller
buckets since that’s easier. This is somewhat the reverse of why our previous
experiment had a lower bitrate than expected.

We can influence how the encoder budgets bitrate using rate control modes.
We’re going to stick with the default ‘Average-Bitrate’ mode to keep things easy.
This mode is sub-optimal since it lets the encoder spend a bunch of budget up
front to its detriment later. However, it's easy to reason about.
Resolution + Bitrate
Targeting a bitrate of 100Kbps got us an unpleasant video but not something
completely unwatchable. We haven’t quite ruined our video yet. We might as well
take bitrate down to an even further extreme of 20Kbps while keeping the
resolution constant.
ffmpeg -v info -y -hide_banner -i source.mp4 -c:v h264 -b:v 20k -c:a copy bitrate-20k.mp4

                                       Resolution  Bitrate  File Size
Source                                 1280x720    7.5Mbps  16MB
Scaled to 140p                         248x140     2.9Mbps  6.1MB
Targeted to 100Kbps                    1280x720    102Kbps  217KB
Targeted to 20Kbps                     1280x720    35Kbps   81KB
Now, this is truly unwatchable! There’s sometimes color but the video mostly
devolves into grayscale rectangles roughly approximating the silhouettes of what
we’re expecting. At roughly a third the bitrate of the previous trial, this
definitely looks like it has less than a third of the information.

As before, we didn’t hit our bitrate target, and for the same reason: our pixel
buckets were insufficiently filled with information. The encoder needed to start
making hard decisions at some point between 102 and 35Kbps. Most of the color
and the comprehensibility of the scene were sacrificed.

We’ll discuss why there are moving grayscale rectangles and patches of color in a
bit. They’re giving us a hint about how the encoder works under the hood.

What if we go just one step further and combine our tiny resolution with the
absurdly low bitrate? That should be an even worse experience, right?
ffmpeg -v info -y -hide_banner -i source.mp4 -vf scale=-2:140 -c:v h264 -b:v 20k -c:a copy scaled-140_bitrate-20k.mp4

                                       Resolution  Bitrate  File Size
Source                                 1280x720    7.5Mbps  16MB
Scaled to 140p                         248x140     2.9Mbps  6.1MB
Targeted to 100Kbps                    1280x720    102Kbps  217KB
Targeted to 20Kbps                     1280x720    35Kbps   81KB
Scaled to 140p and Targeted to 20Kbps  248x140     19Kbps   48KB
Wait a minute, that’s actually not too bad at all. It’s almost like a tinier version of
1280 by 720 at 100Kbps. Why doesn’t this look terrible? Having a lower bitrate
means there’s less information, which implies that the video should look worse. A
lower resolution means the image should be less detailed. The numbers got
smaller, so the video shouldn’t look better!


Thinking back to buckets and information, we now have less information but also
fewer discrete places for that information to live. This specific combination of low
bitrate and low resolution means the buckets are nicely filled. The encoder
almost exactly hit our target bitrate, which is a reasonable indicator that it was at least
somewhat satisfied with the final result.

This isn’t going to be a fun experience on a 4K display but it is fine enough for an
iPod Nano from 2007. A 3rd generation iPod Nano has a 320x240 display spread
across a 2in screen. On it, our 140p video will be nearly indistinguishable from a much
higher quality video. Even more, 48KB for 17 seconds of video makes fantastic
use of the limited storage, 4GB on some models. In a resource-constrained
environment, this low video quality can be a large quality of experience
improvement.


CC BY 2.0 - image by nez


We should now have a decent intuition for the relationship between bitrate and
resolution, plus what the tradeoffs are. There’s a lingering question, though: do we
need to make tradeoffs? There has to be some ratio of bitrate to pixel count
that gives the best quality for a given resolution at a minimal file size.

In fact, there are such perfect ratios. In ruining the video, we ended up testing a
few candidates of this ratio for our source video.
                                       Resolution  Bitrate  File Size  Bits/Pixel
Source                                 1280x720    7.5Mbps  16MB       8.10
Scaled to 140p                         248x140     2.9Mbps  6.1MB      83.5
Targeted to 100Kbps                    1280x720    102Kbps  217KB      0.11
Targeted to 20Kbps                     1280x720    35Kbps   81KB       0.03
Scaled to 140p and Targeted to 20Kbps  248x140     19Kbps   48KB       0.55
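
The Bits/Pixel column appears to be the bitrate in bits per second divided by the number of pixels in one frame. That reading is our inference from the numbers, not something the table states. The sketch below recomputes the ratio from the rounded bitrates shown, so the results land close to the table's figures but may differ slightly in the last digit.

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int) -> float:
    """Bits per second available for each pixel position in a frame."""
    return bitrate_bps / (width * height)


# (label, width, height, bitrate in bits per second), taken from the table.
ENCODES = [
    ("Source", 1280, 720, 7_500_000),
    ("Scaled to 140p", 248, 140, 2_900_000),
    ("Targeted to 100Kbps", 1280, 720, 102_000),
    ("Targeted to 20Kbps", 1280, 720, 35_000),
    ("Scaled to 140p and Targeted to 20Kbps", 248, 140, 19_000),
]

for label, width, height, bitrate in ENCODES:
    print(f"{label:<38} {bits_per_pixel(bitrate, width, height):6.2f}")
```

The two encodes that looked tolerable at low bitrate sit within an order of magnitude of each other, while the unwatchable one has by far the lowest ratio.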
However, there are some complications.

The biggest caveat is that the optimal ratio depends on your source video. Each
video requires a different amount of information to be displayed. There are a
couple of reasons for that.

If a frame has many details then it takes more information to represent. Frames in
chronological order that visually differ significantly (think of an action movie) take
more information than a set of visually similar frames (like a security camera
outside a quiet warehouse). The former can’t use as many B or P frames, which
occupy less space. Animated content with flat colors requires encoders to make
fewer trade-offs that cause visual degradation than live-action content does.

Thinking back to the settings that resulted in grayscale rectangles and patches of
color, we can learn a bit more. We saw that the rectangles and color seem to
move, as though the encoder was playing a shell game with tiny boxes of
pictures.
What is happening is that the encoder is recognizing repeated patterns within
and between frames. Then, it can reference those patterns to move them around
without needing to actually duplicate them. The P and B frames mentioned earlier
are mainly composed of these shifted patterns. This is similar, at least in spirit, to
other compression algorithms that use dictionaries to refer to previous content. In
most video codecs, the bits of picture that can be shifted are called
‘macroblocks’, which subdivide each frame into NxN squares of pixels. The less
stingy the bitrate, the less obvious the macroblock shell game.

To see this effect more clearly, we can ask FFmpeg to show us decisions it
makes. Specifically, it can show us the ‘motion’ it decides on when moving
macroblocks. The video here is 140p so the motion vector arrows are easier to
see.
ffmpeg -v info -y -hide_banner -flags2 +export_mvs -i source.mp4 -vf scale=-2:140,codecview=mv=pf+bf+bb -c:v h264 -b:v 6000k -c:a copy motion-vector.mp4
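
To make the macroblock shell game more concrete, here is a toy block-matching search. It is only a sketch of the general idea, not how a real H.264 encoder works: the 8x8 frames, the 4x4 block size, the exhaustive search, and every function name are our assumptions for illustration.

```python
def make_frame(size, square_x, square_y, square_n=4):
    """Build a black frame containing one bright square."""
    frame = [[0] * size for _ in range(size)]
    for y in range(square_y, square_y + square_n):
        for x in range(square_x, square_x + square_n):
            frame[y][x] = 255
    return frame


def sad(prev, cur, bx, by, dx, dy, n):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `prev`."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(prev, cur, bx, by, n=4, search=2):
    """Try every displacement in the search window and keep the cheapest."""
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), sad(prev, cur, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the previous frame.
            if not (0 <= bx + dx and bx + dx + n <= width):
                continue
            if not (0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(prev, cur, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# The square moves 2 pixels right and 1 pixel down between frames.
prev_frame = make_frame(8, 1, 1)
cur_frame = make_frame(8, 3, 2)
vector, cost = best_motion_vector(prev_frame, cur_frame, bx=3, by=2)
print(vector, cost)
```

In this toy case the block in the current frame is found unchanged in the previous frame, so the encoder would only need to store the vector pointing back to it instead of the pixels themselves.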

Even worse, flat color and noise might show up in two different scenes
of the same video. That forces you to either waste your bitrate budget in one
scene or look terrible in the other. We give the encoder a bitrate budget it can
use. How it uses it is the result of a feedback loop during encoding.

Yet another caveat is that your resulting bitrate is influenced by all those knobs
that were listed earlier, the most impactful being codec choice followed by bitrate
budget. We explored the relationship between bitrate and resolution but every
knob has an impact on the quality and a single knob frequently interacts with
other knobs.

So far we’ve taken a look at some of the knobs and settings that affect visual
quality in a video. Every day, video engineers and encoders make tough decisions
to optimize for the human eye, while keeping file sizes at a minimum. Modern
encoding schemes use techniques such as per-title encoding to narrow down the
best resolution-bitrate combinations. Those schemes look somewhat similar to
what we’ve done here: test various settings and see what gives the desired result.

With every example, we’ve included an FFmpeg command you can use to
replicate the output above and experiment with your own videos. We encourage
you to try improving the video quality while reducing file sizes on your own and to
find other levers that will help you on this journey!