Chroma Subsampling Numbers Explained
By Sareesh Sudhakaran
Disclosure: Links in this post may be to our affiliates; sales through affiliate links may
benefit this site. Please help support wolfcrow and buy from Amazon. It won’t cost you
anything extra.
Everybody talks of luma sampling and chroma subsampling, but few understand what
these numbers really signify. When someone uses the notations 4:4:4, 4:2:2 and
4:2:0, etc., they usually mean chroma sub-sampling. But what do these numbers
really mean, and why?
The major problem they faced was the marriage between PAL and NTSC. The new
HDTV standard would need to account for both PAL and NTSC as a common factor.
E.g., if PAL is 2 and NTSC is 3, then HDTV couldn’t be 5 or 7 or 8. It had to be divisible
by both 2 and 3, in this case resulting in 6, the lowest possible value.
Standard definition luma (Y) was sampled at 13.5 MHz, and the lowest common factor
that pleased both PAL and NTSC was 2.25 MHz (6 x 2.25 = 13.5).
This new system, which saw many variations, is now established as having 1125
vertical lines, of which 1080 are reserved for the image. The highest frame rate at the
time for 1080 was 29.97 fps (NTSC), and for 720 was 59.94 fps (NTSC).
What about the horizontal size? The uncorroborated legend goes that 16:9 (1.78:1)
wastes the least space when you want to fit in every kind of aspect ratio, from 4:3 to
2.39:1.
For 1125 vertical lines, the horizontal size for this aspect ratio (16:9) is 2002.
Multiplying 2002 x 1125 x 30 (frame rate), you get 67.57 MHz, which is not divisible
by 2.25 MHz. The closest that you get is 74.25 MHz, while still keeping the horizontal
size a whole number. In this case it happens to be 2200.
Therefore, 74.25 MHz is the sampling rate for HDTV 1080p at 30fps, and 720p at
60fps. Out of 2200 pixels, we only need 1920 pixels for the image.
For SMPTE 292M (HD-SDI), i.e., 10-bit 4:2:2 1080p29.97 HDTV, the luma is sampled
at 74.25 MHz, while each chrominance value is sampled at half that, 37.125 MHz.
For SMPTE 372M (Dual HD-SDI) or 424M (3G-SDI), the sampling rate is doubled, to
148.5MHz, to account for the increased frame rate to 60fps.
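The sampling-rate arithmetic in this section can be checked with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and the figures (2002 and 2200 samples per line, 1125 lines, 30 fps, the 2.25 MHz base) are simply the ones quoted in the text.

```python
BASE_MHZ = 2.25  # common base frequency quoted in the text


def sample_rate_mhz(samples_per_line, total_lines, fps):
    """Total samples per second, expressed in MHz."""
    return samples_per_line * total_lines * fps / 1e6


def is_multiple_of_base(rate_mhz, base=BASE_MHZ, tol=1e-9):
    """True if rate_mhz is a whole-number multiple of the base frequency."""
    ratio = rate_mhz / base
    return abs(ratio - round(ratio)) < tol


naive_rate = sample_rate_mhz(2002, 1125, 30)   # the 16:9 width from the text
chosen_rate = sample_rate_mhz(2200, 1125, 30)  # the width that was adopted
chroma_rate = chosen_rate / 2                  # 4:2:2 chroma at half the luma rate
doubled_rate = chosen_rate * 2                 # rate for the 60 fps case

print(f"2002 wide: {naive_rate:.2f} MHz, multiple of 2.25? {is_multiple_of_base(naive_rate)}")
print(f"2200 wide: {chosen_rate:.2f} MHz, multiple of 2.25? {is_multiple_of_base(chosen_rate)}")
print(f"chroma: {chroma_rate} MHz, doubled: {doubled_rate} MHz")
```

With 2200 samples per line the rate works out to 33 times 2.25 MHz, which is why that width passes the check and 2002 does not.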
Why 4?
You’ve all seen 4:4:4, 4:2:0 and 4:2:2. First of all, just for the purposes of this article,
let’s call this method of notation the ‘4-system’. The 4-system is essentially a digital
notation, which isn’t supposed to have a direct analog counterpart (but it can be
equated or derived from it).
While defining color bit depth, we understood that in order to make digital words, the
formula for the number of combinations with n letters is 2^n. This is the hallmark of
digital numbers.
Let’s assume n = 1.
In this case, in the Y’CbCr model, Y’ can have only 2^1 = 2 combinations or values. It
can be a 0 or 1. Cb and Cr, too, can have only 2 values each – 0 and 1.
Now let’s assume n = 2. Each channel can then have 2^2 = 4 values.
This gives Y’ Cb Cr 4 values each, represented as 4:4:4 when not compressed. If you
want to sample chroma at half rate, you can represent them by 4:2:2. A quarter rate
will give us 4:1:1 or 4:2:0. It gives us four options for each channel in the
luma-chroma model.
Is the number 4 good enough? It works. Like all things in broadcast, this too, is a
compromise. Tomorrow, if someone feels the sampling rate will need to be changed,
this system will become obsolete.
Here are some of the digital chroma sub-sampling values of popular formats:
HDCAM 3:1:1
NTSC 4:1:1
On the surface, it looks easy. 4:2:2 is double 3:1:1, as far as color sampling is
concerned, so if you’re like me you’d say 4:2:2 is the winner, hands down.
But it’s not that simple. The sampling numbers in the 4-system don’t take into account
the size of the image. Here’s something to ponder over:
Since color values are only sampled at one-fourth the maximum frequency,
What do you see? The 3:1:1 1080p image has more color information (518,400) than
the 4:2:2 720p image (230,400). But in quality (color density per pixel), obviously
4:2:2 is better than 3:1:1.
So, which is better? An image with more color information, or an image with better
sampled color but less color information? I don’t know. The point of this little
exercise was to show you exactly this, that there is more to these things than meets
the eye.
Obviously, if you sample an image at 4:4:4 that gives you the best possible quality
based on your sampling frequency.
Nobody can prove them wrong, because they are giving you 4:2:2, just that this might
be an ‘empty shell’, sort of like a Ferrari chassis which is running on a crappy third-
party engine.
The only way you can know for sure whether you’ve been conned or not is to make
tests between the two.
A full color image is 4:4:4 = 4+4+4 = 12, or 100% of maximum possible quality.
From this, you can derive the rest:
4:2:2 = 4+2+2 = 8, which is 66.7% of 4:4:4 (12)
4:2:0 = 4+2+0 = 6, which is 50% of 4:4:4 (12)
4:1:1 = 4+1+1 = 6, which is 50% of 4:4:4 (12)
3:1:1 = 3+1+1 = 5, which is 42% of 4:4:4 (12)
So, if a 4:4:4 uncompressed frame is 24 MB, then a 4:2:2 frame will reduce to 16 MB,
a 4:2:0 or 4:1:1 image will be 12 MB, and a 3:1:1 image will be 10 MB. Now you know
why chroma sub-sampling is still around. For television and internet video, it reduces
the file size by half, even before any compression has been applied.
I’ve used this calculation methodology in the costs of working with 4K and 2K. It’s not
a foolproof method and should only be used as a general guideline for quick
calculations; but it gets you damn close.
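The add-up-the-digits shortcut above is easy to turn into a small helper. The sketch below is only the rule of thumb from this article, not a formal definition of chroma subsampling; the function name and the 24 MB reference frame are taken from the example above or invented for illustration.

```python
def relative_size(notation, reference="4:4:4"):
    """Sum the digits of a notation like '4:2:2' and compare to the reference."""
    total = sum(int(part) for part in notation.split(":"))
    ref_total = sum(int(part) for part in reference.split(":"))
    return total / ref_total


FULL_FRAME_MB = 24  # the uncompressed 4:4:4 frame size used in the text

for scheme in ("4:4:4", "4:2:2", "4:2:0", "4:1:1", "3:1:1"):
    fraction = relative_size(scheme)
    size_mb = FULL_FRAME_MB * fraction
    print(f"{scheme}: {fraction:.1%} of 4:4:4, about {size_mb:.0f} MB per frame")
```

The same fraction can be multiplied into any full-color frame size to get a quick estimate.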
Depending on your location on Earth, you might have access to one of these two major SD standards:
PAL (720×576)
NTSC (720×480)
In order to enhance the viewer experience by providing greater clarity and visual detail, high definition
systems were introduced, with greater resolutions. There are two broad flavors of high-definition (HD):
Is there anything better than HD? Of course there is. There’s 2K (with a horizontal resolution of about
2048), 4K (horizontal resolution of about 4096), and so on. Digital photo cameras can reach up to 10328
The future might even bring 8K to homes, with a standard called UHDTV, or ultra high definition
It is notated by the form a:b. E.g. 1.78:1, 1.33:1, etc. Sometimes, people drop
the ‘:1’ for brevity. It can also be less than one, as in 0.9 for NTSC 4:3.
The pixel aspect ratio changes the size of an image, and this can be illustrated
with an example.
If an image with the resolution 720×576 has a pixel aspect ratio of 1.0 (square
However, an image with the resolution 720×576 but with a pixel aspect ratio of
1.422:1 (length of the pixel is 1.422 times the breadth) will have a size of:
This means that even though the actual pixel count horizontally is only 720, by
In the above example, it is important to note that the resolution of the image is
still 720×576, but the size of the image has changed to 1024×576 – there is no
Squashed or stretched footage has its genesis in the selection of incorrect pixel
aspect ratios.
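As a rough illustration of how the pixel aspect ratio changes image size, here is a minimal sketch. It assumes the size is found by multiplying the horizontal pixel count by the pixel aspect ratio and rounding to a whole number; the function name is hypothetical.

```python
def display_size(width_px, height_px, pixel_aspect_ratio):
    """Return the (width, height) of the image as displayed.

    Only the width is scaled, because the pixel aspect ratio describes
    how wide each pixel is relative to its height.
    """
    return round(width_px * pixel_aspect_ratio), height_px


square = display_size(720, 576, 1.0)     # square pixels
wide = display_size(720, 576, 1.422)     # widescreen pixels from the example

print("Square pixels:", square)
print("1.422:1 pixels:", wide)
print("Image aspect ratio:", round(wide[0] / wide[1], 2))
```

Picking the wrong pixel aspect ratio in this calculation is exactly what produces squashed or stretched footage.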
Image Size
The size of an image is not the same as its resolution, but both are notated in the
The horizontal size of the image divided by its vertical size is the aspect ratio of
the image, notated in the form a:b. Aspect ratios are usually rounded off to two
decimal places.
What about a 720×576 image with a pixel aspect ratio of 1.422:1? What’s the
Bits
In computer terminology, the bit is the smallest possible thingy – it’s either a 0 or a 1.
If I’m shooting 8-bit 4:4:4 1080p, then the total number of bits in one frame is 1920 x 1080 x 3 x 8 =
49,766,400 bits. If I’m shooting at 24 frames per second, then the total bits per second = 1,194,393,600
One can see how talking in bits is tedious. The numbers are too large. What’s next?
Bytes
Scientists discovered pretty early that bits were too small to manage, even back in the early days of
computing. For better or for worse, they decided to group 8 bits together, which they called a Byte.
A byte is always represented by the capital letter ‘B’.
So, our 8-bit 4:4:4 1080p frame which is 49,766,400 bits is also 6,220,800 bytes. At 24 fps, the data rate
Bits vs Bytes
You’ll find many people using the letters b and B interchangeably, without realizing the ramifications of
their ignorance. Even eminent professionals on the internet do this without realizing their mistake.
MBps is not Mbps – They’re two totally different things. If you write one when you mean the other, you
are only making it worse for everyone. On wolfcrow, I strictly use the notation MB/s to mean Megabytes
per second, while I use Mbps to mean Mega bits per second. I use the ‘/’ notation because some
programs (or humans) might be inclined to make everything lower case or upper case while copying or
Just for the record, if you want to know, in traditional science, one uses the ‘/’ to indicate ‘per’, and avoids
the letter ‘p’. On a personal note I hate it when even camera manufacturers and others who should know
better continue to use notations like a fifth-grader. It makes their own engineers look bad.
A long time ago bytes were good enough. However, advancements in computer technology ensured bytes
Kilobytes
In traditional computing, everything is based on the number 2. To get multiples, the formula is 2^n. Even
the byte is 8 bits (2^3 = 8), and not 10.
Traditional science encourages the use of the word ‘kilo’ to mean 1,000. E.g., if 1000 grams (g) is a
kilogram (kg), in computing terms, they had to choose a number that could be the result of a direct
power of 2.
2^9 = 512 and 2^10 = 1024. 1024 was chosen as notation for kilo. Therefore, a kilobyte is not 1000 bytes,
So, our 8-bit image which is 6,220,800 bytes is also 6,075 kilobytes. At 24 fps, our data rate is 145,800
Megabytes
A megabyte is not 1000 kilobytes, but 1024 kilobytes.
One megabyte (MB) = 1024 kilobytes (KB)
So, our 8-bit image which is 6,075 kilobytes is also 5.93 Megabytes. At 24 fps, our data rate is 142.4
MB/s.
Gigabytes
One gigabyte (GB) = 1024 megabytes (MB)
So, our 8-bit image which is 5.93 Megabytes is also 0.0058 Gigabytes. Obviously, for our purposes, it’s
much easier to stop at Megabytes. At 24 fps, our data rate is 0.139 GB/s.
A minute of footage is 8.34 GB. This is how the sizes are used – based on how easy it is to remember or
talk about.
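The whole chain of conversions for the 8-bit 4:4:4 1080p example can be reproduced in a few lines. This is just a sketch using the 1024-based units described above; the variable names are my own.

```python
BITS_PER_BYTE = 8
K = 1024  # the traditional 'kilo' used throughout this article
FPS = 24

# 1920 x 1080 pixels, 3 color channels, 8 bits per channel
frame_bits = 1920 * 1080 * 3 * 8
frame_bytes = frame_bits // BITS_PER_BYTE
frame_kb = frame_bytes / K
frame_mb = frame_kb / K
frame_gb = frame_mb / K

rate_mb_per_s = frame_mb * FPS
rate_gb_per_s = rate_mb_per_s / K
minute_gb = rate_gb_per_s * 60

print(f"Frame: {frame_bits:,} bits = {frame_bytes:,} bytes")
print(f"Frame: {frame_kb:,.0f} KB = {frame_mb:.2f} MB = {frame_gb:.4f} GB")
print(f"Data rate at {FPS} fps: {rate_mb_per_s:.1f} MB/s = {rate_gb_per_s:.3f} GB/s")
print(f"One minute of footage: {minute_gb:.2f} GB")
```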
Terabytes
One terabyte (TB) = 1024 gigabytes (GB)
So, two and a half hours of our 8-bit 1080p video will need 8.34 x 150 = 1251 GB = 1.2 TB.
All this seems so easy, right? But just as marketing idiots who print brochures with disregard to correct
notations are guilty of crimes of omission, some manufacturers could be said to be guilty of crimes of
commission.
Imagine this: If I owed you $1,024 but only wanted to give you $1,000, I could call 1,000 1,024. Crazy?
Most people couldn’t be expected to know or care that computer engineers loved 1024 more than 1000,
so people began to use the term kilo like they do with kilograms. Many manufacturers added to this
Scientists shouldn’t have used the word kilo if they didn’t mean 1000 exactly. Now, half the world cannot
talk to the other half because one side means 1000 while the other 1024.
Did somebody try to correct the problem? Sure. They invented new names. Here’s the modern way, also
So, what about the 1024 faction? Computers still can only deal with powers of 2. Therefore, we have a
Does this improve matters? Nope. In popular usage, manufacturers and users continue to use the terms
Look at what they’ve done. They’ve called a mile a ‘mili’, while what we all know as a mile is now a new
value because a few corporations feel that will better their profit margins. Now, they can claim their cars
give more miles per gallon, because the length of a mile has been reduced – officially!
Thankfully, this crazy scheme hasn’t been widely accepted. There’s still some common sense left in the
world.
means 4 TB or 4 TiB.
4 TB = 32,000,000,000,000 bits
The difference?
increases in proportion. Stop taking it lightly. Write to your drive manufacturers asking them to clarify
their terms. At the very least, keep this in mind while estimating your disk drive sizes.
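To see how large the gap between the two interpretations is, here is a small sketch. For this comparison only, it uses the newer naming convention, in which TB means 1000^4 bytes (matching the 32,000,000,000,000 bits figure above) and TiB means 1024^4 bytes; the function names are invented for illustration.

```python
BITS_PER_BYTE = 8


def tb_to_bits(terabytes):
    """Decimal terabytes (1000^4 bytes each) to bits."""
    return terabytes * 1000**4 * BITS_PER_BYTE


def tib_to_bits(tebibytes):
    """Binary tebibytes (1024^4 bytes each) to bits."""
    return tebibytes * 1024**4 * BITS_PER_BYTE


decimal_bits = tb_to_bits(4)
binary_bits = tib_to_bits(4)
difference = binary_bits - decimal_bits

print(f"4 TB  = {decimal_bits:,} bits")
print(f"4 TiB = {binary_bits:,} bits")
print(f"Difference = {difference:,} bits ({difference / decimal_bits:.1%} more)")
```

The gap grows with every step up the scale, because each step multiplies in another factor of 1024/1000.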
As for this website, I avoid the IEEE notation scheme, and stick to the traditional scheme. Whenever I
use kilobytes, megabytes, gigabytes and so on, I’m always using the 1024 system as explained earlier.
Where
H = horizontal pixels
V = vertical pixels
Image size
= 1920(H)x1080(V)x3(c)x16(b)x1(s)
= 99,532,800 bits
Image size
= 1920x1080x3x8x0.5
= 24,883,200 bits
= 3,110,400 bytes
Image size
= 4096x1716x3x12x0.667
= 168,774,009 bits
= 21,096,751 bytes
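The three worked examples above follow the same pattern, so they can be wrapped in one function. In this sketch I am assuming that c is the number of color channels, b the bit depth per channel, and s the chroma sub-sampling factor (1 for 4:4:4, 0.667 for 4:2:2, 0.5 for 4:2:0); the function name is hypothetical.

```python
def image_size_bits(h, v, channels, bit_depth, sampling):
    """Image size in bits: H x V x c x b x s, rounded to a whole bit."""
    return round(h * v * channels * bit_depth * sampling)


examples = [
    ("1080p, 16-bit, 4:4:4", (1920, 1080, 3, 16, 1)),
    ("1080p, 8-bit, 4:2:0", (1920, 1080, 3, 8, 0.5)),
    ("4096x1716, 12-bit, 4:2:2", (4096, 1716, 3, 12, 0.667)),
]

for label, params in examples:
    bits = image_size_bits(*params)
    size_bytes = round(bits / 8)
    print(f"{label}: {bits:,} bits = {size_bytes:,} bytes")
```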
As you can see, GB is always a bigger number than GiB. Now you understand why manufacturers love
them so much.
Once you have the image size per frame, calculating the data rate is simple – just multiply by the frame
rate. To know how RAW files are calculated, check out Deconstructing RAW. To learn how image size
impacts data rate, take a look at Costs of working with 2K and 4K footage. Knowing this is the first step