A Quick Guide To Digital Video Resolution and Aspect Ratio Conversions

Uploaded by

Rupert Walker
Copyright
© Attribution Non-Commercial (BY-NC)
A Quick Guide to Digital Video Resolution and Aspect Ratio Conversions


Contents
1. Introduction
2. The Connection Between the Analog and the Digital
3. A Conversion Table for Digital Video Formats
4. Frequently Argued Questions
5. Related Links

Recent updates
15-Jan-2008

 Link-rot fixes. (Thanks, Jeff!)

25-Feb-2006

 Added a link to BBC – Commissioning – A Guide to Picture-Size


 Link-rot fixes.

28-Feb-2004

 Added the 544×576 resolution (as per DVB specifications) to the 625/50 table

22-Feb-2004

 Due to popular demand, I have now finished a major revamp of the conversion table as
regards the 525/59.94 systems. Formerly, all 525/59.94 calculations were based on
rounding up to 711 pixels (52.666... µs) in 13.5 MHz modes, then considering that figure
the exact 4:3 resolution. Now, all 525/59.94 numbers are based on the exact active line
length of 52+59/90 (52.6555...) µs, which more accurately follows the analog standards.
Note: Even though the 525/59.94 numbers in the table have now (seemingly) changed in
a big way, their practical approximations are still very close to the old values, so the
change does not have all that many practical effects. For example, the old pixel aspect
ratio value for 525/59.94 13.5 MHz pixels was 72/79 (0.91139...); now it is 4320/4739
(0.91158...). You are unlikely to see any difference even if you used the old values.
 I am sending my greetings to Andreas Dittrich and Chris Meyer who offered some
insightful comments and persuaded me to go forward with this change. They are the ones
you may now thank (or curse) for these new, more pedantically exact, more cumbersome
fractional numbers! :)
 There is still some academic controversy over the 575 vs 576, 485 vs 486 active lines
geometry issue. I may revisit the subject some time in the future.
 The table has now been divided into two sections – one listing the common 525/59.94
sampling grids and the other listing the common 625/50 sampling grids.
 Nothing has been changed on the 625/50 side. Nada. Zilch.

20-Feb-2004

 Added a link to an informative article about Determining the Capture Window of a
Capture Card and a respective FAQ section 4.11: Help! My capture card does not seem to
do it this way!

14-Feb-2004

 "Link rot" fixes. Some little changes in wording here and there.

26-Jun-2003

 Typos in the conversion table. The sampling matrix width for some resolutions was listed
as 52.33333 µs instead of 53.33333 µs.

11-Feb-2003

 Some "link rot" fixes. Thanks, Andy and Olafs!

26-Jun-2002

 The conversion table erroneously labeled the pixel aspect ratio column as if the values
were in y/x format, while they actually were in x/y format. The heading of the table has
been corrected. (Thanks, Colin!) The calculations given below the table were correct all
the time and have not been modified.

13-Apr-2002

 There was an unfortunate error in Section 4.7: the correct 525-line resolution is (of
course) 720×526 + 2/3, not 720×533.25! Sorry. It has been corrected now.

11-Apr-2002

 Added Section 4.7: What do you mean by saying it is better to avoid 720×540?

10-Apr-2002

 Enhanced the table by adding sampling matrix widths in microseconds.


 Various little touch-ups all over the document.

9-Apr-2002

 All references to 525/60 have been changed to 525/59.94 to be more pedantic.


 The table now calls 625/50 systems CCIR and 525/59.94 systems EIA (well, it is not the
perfect solution, but probably better than using mere numbers or separately listing each
and every letter designation for every broadcast standard in the world)
 Some H.261, H.263 resolutions added to the conversion table.
 Clarified the explanation of digitizing half lines.
 Added some links to the Related links section.
 Added some introductory links.
 Replaced Section 4.6 with a new one

6-Apr-2002

 Initial publication date

Abstract
Despite the ever-growing number of people working with digital video formats daily, there is still
a great deal of confusion regarding how their image geometry and aspect ratios actually work.
This document tries to shed some light on these issues.

Feel free to e-mail me any comments, corrections, suggestions, additions or opinions. Should
you come across a broken link, please let me know so I can fix it.

Acknowledgements
My warm thanks go to Colin Browell, Andy Furniss, Ole Hansen, Paul Keinanen, and Olafs,
who have provided valuable comments and feedback concerning this page.

Linking to this document


You are free to link to this document. If you do so, please use the URL
<http://www.iki.fi/znark/video/conversion/>. This ensures that the link will always work,
regardless of the actual physical location of this site.

1. Introduction
There is a fair number of mind-blowing, scary oddities and secrets in the world of digital video.
One of the very first a beginner will usually encounter is the fact that in digitized video data,
pixels are often not considered "square" in their form. In most real-world digital video
applications pixels have a width/height ratio – or aspect ratio, as it is more conveniently called –
that can be something completely different from 1/1!

The second great revelation usually comes when one runs into the concept of anamorphic 16:9
video for the very first time. If it was initially hard to grasp the idea of pixels changing their
shape when displayed in different environments, this one is even more baffling: the very same
pixel resolution you have only just learned to associate with 4:3 displays can now suddenly
represent another, totally different image geometry. In other words, the pixels have changed their
shape again!

Unfortunately, these two are often the only things most ordinary people will ever learn about
digital video and aspect ratios.

1.1 The dirty little secret revealed

Tutorials and manuals usually tend to keep very quiet and secretive about the finer technical
details of digital video, particularly when it comes to the topic of (pixel) aspect ratios and image
geometry.

Even if converting (resampling) video clips to other resolutions is discussed, the accompanying
explanation is usually troublingly simplistic and vague – often inaccurate and misleading – and
sometimes the suggested methods are just plain wrong. It is not uncommon that the examples
only deal with arbitrarily chosen ("x pixels by y pixels") frame dimensions and use ideal frame
aspect ratios such as 16:9 or 4:3 as the basis for calculations – not the actual pixel aspect ratios –
which is usually a good indicator that the writer may not actually take the real image geometry
into account at all.

It is almost as if the whole aspect ratio issue were considered some sort of dirty little secret of the
video industry; black magic you could not even begin to explain to mere mortals in reasonable
terms. This is a shame. In this case, there is really more to it than meets the eye. Confusing
people with incomplete and watered-down explanations does not do the industry any good.

Now that you have read this far, it is time to reward your effort with The Third Big Revelation
about aspect ratios and frame sizes - the one that is usually left unsaid:

Not a single one of the commonly used digital video resolutions exactly represents the actual
4:3 or 16:9 image frame.

Shocking, isn't it? 768×576, 720×576, 704×576, 720×480, 704×480, 640×480... none of them is
exactly 4:3 or 16:9; not even the ones you may conventionally think of as "square-pixel"
resolutions.

So there. Now you finally know the truth. Let's find out what it actually means.
2. The Connection Between the Analog and the Digital
Digital video standards do not live outside the realm of the analog world. On the contrary, all
commonly used modern (SDTV) digital video formats have a well-defined relationship with their
counterparts in analog video standards. You could really say they have their roots in analog soil.

And now, my friend, we are rapidly closing in on The Fourth Big Revelation:

It is really the analog video standards that define the image geometry and pixel aspect ratio
in digital formats.

Even if you did all of your video work solely in the digital domain, those pesky old analog video
standards still define the shape of your images and pixels.

How come?

From the video industry's point of view, the current (SDTV, as opposed to HDTV which is
another kettle of fish) digital video formats - those that actually get used in practical real-life
applications such as DVD, DV, VCD, SVCD, digital television etc. - are all about
interoperability. At the advent of digital video - the late 1970s, when committee work was started
on CCIR 601 (later to become ITU-R BT.601) - there was already a vast catalog of analog video
material in formats defined solely by analog standards. What is more, enormous amounts of
money had been poured into analog studio equipment such as cameras, video switchers, proc
amps, tape decks and other tools of the trade. What a waste it would have been if the "next
generation" digital video formats had been designed in such a way that they had absolutely
nothing in common with the old analog formats, and required ditching all the analog equipment!

It was clear from the beginning that the industry wanted a smooth, well-defined transition path
between the current analog systems and the brave new digital world without running into too
many compatibility issues. It was also considered necessary to be able to freely mix and match
digital and analog equipment. The result was that the digital (SDTV) video formats we now use
are based on the concept of digitizing old, analog video signals, thus interlocking to the analog
video standards.

This connection between the digital and analog domains is permanent. Some of the fundamental
features of digital video, such as image geometry, are actually defined in the analog standards.
Even if we go all-digital, the relationship is still there, as long as we use either ITU-R BT.601
pixels or "industry standard" square pixels.

2.1 What does it mean?

There are three basic sampling rates from which almost all modern digital video formats are
derived:

13.5 MHz        ITU-R BT.601 (aka CCIR 601 aka Rec. 601) non-square pixels for both 625/50
                and 525/59.94 systems. This sampling rate was originally designed for
                digitizing component video signals. Now used extensively in almost all
                modern digital video gear.
14.75 MHz       "Industry standard" square pixels for 625/50 systems. Originally designed
                for digitizing composite video signals.
12 + 3/11 MHz   SMPTE 244M "industry standard" square pixels for 525/59.94 systems.
                Originally designed for digitizing composite video signals.

Let's see how this works out with 13.5 MHz and both 525/59.94 and 625/50 systems:

If you have the B/W (luminance) part of a component video signal in a coaxial cable, you can
plug in an A/D converter and start metering (sampling) the voltage level in the cable at regular
intervals.

 ITU-R BT.601 defines a standard sampling rate for both 625/50 and 525/59.94 video
signals: 13.5 MHz
 13.5 MHz will give you a total of 13,500,000 samples per second, but we are only
interested in sampling the parts of the signal that actually contain image information. The
parts of the signal spent in horizontal or vertical blanking are of no interest to us, and can
be omitted.
 625/50 systems have a line length of 64 µs, of which 52 µs is the "active" part that
contains actual image information. (The rest is reserved for horizontal blanking.)
o 52 µs × 13.5 MHz = 702 samples (pixels) per scanline
o In the vertical direction, there are 574 complete scanlines and 2 half lines. Even
the half lines get digitized as if their "missing" other half belonged to the active
picture, giving a total of 576 scanlines.
o Thus, the active image area at 13.5 MHz sampling is 702×576 pixels. This is the
actual area that forms the 4:3 (or anamorphic 16:9) frame.
 525/59.94 systems have a line length of 63+5/9 (63.555...) µs, of which 52+59/90
(52.6555...) µs is the "active" part that contains actual image information. (The rest is
reserved for horizontal blanking.)
o 52+59/90 µs × 13.5 MHz = 710.85 samples (pixels) per scanline. 
o In the vertical direction, there are 484 complete scanlines and 2 half lines. As
above, all of them get digitized and half lines will be treated as if their missing
other half belonged to the active picture, giving a total of 486 scanlines.
o Thus, the active image area at 13.5 MHz sampling is 710.85×486 pixels. This is
the actual area that forms the 4:3 (or anamorphic 16:9) frame.
o However, we cannot use partial pixels in any practical video work. Therefore, the
number 710.85 needs to be rounded up to 711, and we get a 711×486 pixel frame
instead.
o 711 samples equal 52+2/3 (52.666...) µs at 13.5 MHz, so the rounded-to-the-nearest-pixel
active area is a little bit wider than it ideally ought to be.
Fortunately, the difference of 0.0111... µs is (for all practical purposes)
insignificant, and well within the tolerances of NTSC-M specifications.
It also works the same way for square-pixel sampling rates. You will just get a different number
of horizontal samples. The calculations are left as an exercise to the reader.
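
For readers who would rather check the arithmetic than do it by hand, the following is a minimal Python sketch of the same multiplication for all three sampling rates. It uses only the active line lengths and sampling rates quoted above; the names ACTIVE_US and active_samples are made up for this illustration, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

# Active line lengths in microseconds, as quoted in the text above.
ACTIVE_US = {
    "625/50": Fraction(52),
    "525/59.94": 52 + Fraction(59, 90),
}

# Sampling rates in MHz, i.e. samples per microsecond.
BT601 = Fraction(27, 2)             # 13.5 MHz
SQUARE_625 = Fraction(59, 4)        # 14.75 MHz
SQUARE_525 = 12 + Fraction(3, 11)   # 12 + 3/11 MHz


def active_samples(system, rate_mhz):
    """Number of samples (pixels) inside the active part of one scanline."""
    return ACTIVE_US[system] * rate_mhz


for system, rate in [
    ("625/50", BT601),
    ("525/59.94", BT601),
    ("625/50", SQUARE_625),
    ("525/59.94", SQUARE_525),
]:
    n = active_samples(system, rate)
    print(f"{system} at {float(rate):.4f} MHz: {float(n):.4f} samples per line")
```

The two 13.5 MHz results are the 702 and 710.85 figures derived above; the two square-pixel results come out as 767 and 646+5/22 samples.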

2.3 I am already lost!

If you did not understand a word of the above, you might want to take a look at the following
introductory links:

 A Note on CCIR / PAL-B Video Standard


 Basics of Video
 Conventional Analog Television - An Introduction
 The 625/50 PAL Video Signal and TV Compatible Graphics Modes.

Also see the Related Links section.

3. A Conversion Table for Digital Video Formats


The following is a frame size and aspect ratio conversion table, representing many commonly
used digital video formats:

The formats related to 625-line systems with a 50 Hz field rate


Sampling matrix Sampling rate     Pixel aspect  Matrix width  Active picture    Supports     Notes
width   height  (MHz)             ratio (x/y)   in µs         width     height  interlacing
768     576     14.75             768/767       52.06780      767       576     Y            "Industry standard" 625/50 square-pixel video
768     576     14 + 10/13²       1/1           52.00000      768       576     Y            "True" computer square-pixel resolution
768     560     14.75             768/767       52.06780      767       576     Y            CD-i³
720     576     13.5              128/117       53.33333      702       576     Y            D1, DV, DVB, DVD, SVCD³
720     540     ambiguous         1/1           ambiguous     720       540     N            Oddball compromise format. Better to avoid unless you really know what you are doing.
704     576     13.5              128/117       52.14815      702       576     Y            DVD, H.263 (4CIF), VCD³
702     576     13.5              128/117       52.00000      702       576     Y            Active picture frame for 625/50 systems in ITU-R BT.601-4 pixels.
544     576     10.125            512/351       53.72840      526+1/2   576     Y            DVB (3/4 of BT.601 sampling rate)
480     576     9                 128/78        53.33333      468       576     Y            SVCD (2/3 of BT.601 sampling rate)
384     288     7.375             768/767       52.06780      383+1/2   288     N            1/4 of "industry standard" 768×576
384     280     7.375             768/767       52.06780      383+1/2   288     N            CD-i
352     576     6.75              256/117       52.14815      351       576     Y            DVD
352     288     6.75              128/117       52.14815      351       288     N            VCD, DVD, H.261 + H.263 (CIF)
176     144     3.375             128/117       52.14815      175+1/2   144     N            H.261 + H.263 (QCIF)

The formats related to 525-line systems with a 59.94¹ Hz field rate


Sampling matrix Sampling rate     Pixel aspect  Matrix width  Active picture    Supports     Notes
width   height  (MHz)             ratio (x/y)   in µs         width     height  interlacing
720     540     ambiguous         1/1           ambiguous     720       540     N            Oddball compromise format. Better to avoid unless you really know what you are doing.
720     486     13.5              4320/4739     53.33333      710.85    486     Y            D1
720     480     13.5              4320/4739     53.33333      710.85    486     Y            DV, DVB, DVD, SVCD³
711     486     13.5              4320/4739     52.66667      710.85    486     Y            Active picture frame for 525/59.94 systems in ITU-R BT.601-4 pixels.
704     486     13.5              4320/4739     52.14815      710.85    486     Y
704     480     13.5              4320/4739     52.14815      710.85    486     Y            ATSC, DVD, VCD³
648     486     12 + 1452/4739²   1/1           52.65556      648       486     Y            "True" computer square-pixel resolution (all 486 active scanlines)
640     480     12 + 3/11         4752/4739     52.14815      646+5/22  486     Y            D2: "industry standard" 525/59.94 square-pixel video
640     480     12 + 1452/4739²   1/1           52.00549      648       486     Y            "True" computer square-pixel format (cropped)
480     480     9                 6480/4739     53.33333      473.9     486     Y            SVCD (2/3 of BT.601 sampling rate)
352     480     6.75              8640/4739     52.14815      355.425   486     Y            DVD
352     240     6.75              4320/4739     52.14815      355.425   243     N            VCD, DVD
320     240     6 + 3/22          4572/4739     52.14815      324       243     N            1/4 of 640×480
¹ 59.94 Hz is only a conventional approximation; the mathematically exact field rate is 60 Hz *
1000/1001.
² A calculated sampling rate, represented here only for completeness. Does not exist in actual
525/625 video equipment.
³ Only used for still images.
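
Most of the pixel aspect ratio, sampling matrix width and "true" square-pixel sampling rate entries in the table can be re-derived from the analog figures given in Section 2.1. The following Python sketch shows one way to do that, on the assumption that the 4:3 frame spans exactly the active line length horizontally and the active scanline count vertically. The function names and the line_divisor parameter are invented for this illustration only.

```python
from fractions import Fraction

# Analog figures from Section 2.1.
ACTIVE_US = {"625/50": Fraction(52), "525/59.94": 52 + Fraction(59, 90)}
ACTIVE_LINES = {"625/50": 576, "525/59.94": 486}


def pixel_aspect_ratio(system, rate_mhz, line_divisor=1):
    """Pixel aspect ratio (x/y) that makes the active area exactly 4:3.

    line_divisor=2 models formats that keep only half of the scanlines,
    such as 352x288 or 352x240.
    """
    active_width = ACTIVE_US[system] * rate_mhz            # in samples
    active_height = Fraction(ACTIVE_LINES[system], line_divisor)
    return Fraction(4, 3) * active_height / active_width


def square_pixel_rate(system):
    """Sampling rate (MHz) at which the pixels would be exactly square."""
    return Fraction(4, 3) * ACTIVE_LINES[system] / ACTIVE_US[system]


def matrix_width_us(matrix_width, rate_mhz):
    """Duration (in microseconds) covered by a sampling matrix of given width."""
    return Fraction(matrix_width) / rate_mhz


print(pixel_aspect_ratio("625/50", Fraction(27, 2)))      # 13.5 MHz, 625/50
print(pixel_aspect_ratio("525/59.94", Fraction(27, 2)))   # 13.5 MHz, 525/59.94
print(square_pixel_rate("625/50"), square_pixel_rate("525/59.94"))
print(float(matrix_width_us(720, Fraction(27, 2))))
```

With these inputs the sketch reproduces 128/117 and 4320/4739 for 13.5 MHz, and 192/13 (14 + 10/13) MHz and 58320/4739 (12 + 1452/4739) MHz for the calculated square-pixel rates.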

3.1 How to use the table for conversions

Let's assume you have a video clip in one format and wish to convert it to another, so that it
remains in correct aspect ratio throughout the process.

1. Locate your source and target formats in the table.


2. Calculate the vertical conversion factor by using the following formula:
vertical_conversion_factor = target_active_picture_height /
source_active_picture_height. (Be sure to use the active picture values from
the table, not the sampling matrix size values.)
o If vertical_conversion_factor is 0.5 and your source material is
interlaced, you will probably need to deinterlace before resampling. (I recommend
using a special smart deinterlacing algorithm, such as the one found in
VirtualDub's Smart Deinterlacer filter.)
o If vertical_conversion_factor is anything other than 0.5, 1 or 2, you
are probably trying to do a standards conversion between a 625/50 system and a
525/59.94 system. Standards conversion (when done right) is a highly demanding
process and outside the scope of this document. I recommend reading The
Engineer's Guide to Standards Conversion and The Engineers Guide to Motion
Compensation from Snell & Wilcox Engineering Guides to get a grasp of the
related issues. In short, merely converting the frame size and image aspect ratio is
not enough - you would also have to take interlacing into account and correct any
aliasing problems in temporal dimension (which means synthesizing new fields
out of thin air using motion compensation algorithms.)
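3. Calculate the horizontal conversion factor: horizontal_conversion_factor =
(source_aspect_ratio) / (destination_aspect_ratio) *
(vertical_conversion_factor), where both aspect ratios are the pixel aspect ratios (x/y)
listed in the table, not frame aspect ratios such as 4:3.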
4. Calculate the new horizontal size: target_sampling_matrix_width =
horizontal_conversion_factor * source_sampling_matrix_width
5. Calculate the new vertical size: target_sampling_matrix_height =
vertical_conversion_factor * source_sampling_matrix_height
6. Resample the image to the new size
7. Check if the new size matches the target resolution's sampling matrix dimensions. If not,
crop (i.e. cut at the edges) and pad (i.e., add black borders) accordingly so that it will.
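
Steps 2 to 5 above are plain arithmetic and can be sketched in a few lines of Python. The function below is only an illustration of the procedure, not part of any real tool: its name and parameter names are invented here, all inputs are meant to be read from the conversion table, and the result is kept as exact fractions so that rounding to whole pixels remains a separate, deliberate step.

```python
from fractions import Fraction


def resample_size(src_matrix, src_par, src_active_height,
                  dst_par, dst_active_height):
    """Return the exact (width, height) to resample the source matrix to.

    src_matrix         -- (width, height) of the source sampling matrix
    src_par, dst_par   -- pixel aspect ratios (x/y) from the table
    *_active_height    -- active picture heights from the table
    """
    # Step 2: vertical conversion factor from the active picture heights.
    vertical = Fraction(dst_active_height) / Fraction(src_active_height)
    # Step 3: horizontal conversion factor from the pixel aspect ratios.
    horizontal = Fraction(src_par) / Fraction(dst_par) * vertical
    # Steps 4 and 5: apply the factors to the sampling matrix dimensions.
    width, height = src_matrix
    return horizontal * width, vertical * height


# 640x480 "industry standard" square pixels to 525/59.94 BT.601 pixels.
print(resample_size((640, 480), Fraction(4752, 4739), 486,
                    Fraction(4320, 4739), 486))

# 720x576 (625/50 BT.601) to 525/59.94 BT.601 pixels.
print(resample_size((720, 576), Fraction(128, 117), 576,
                    Fraction(4320, 4739), 486))
```

The two calls correspond to the cases worked through in Section 3.2 and give 704×480 and (729+1/13)×486 respectively. Cropping or padding to the final sampling matrix (step 7) is not covered by this sketch.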

3.2 Some practical examples of the above

3.2.1 640×480 "industry standard" square pixels to 720×480 ITU-R BT.601 pixels

Let's say I have captured a video clip from 525/59.94 source using an old M-JPEG card that only
allows sampling in "industry standard" (12 + 3/11 MHz) square pixel format. The resolution of
the clip is 640×480. Now I would like to incorporate this into a DV project that uses ITU-R
BT.601 pixels and a resolution of 720×480.

1. The first step is to look up the correct source and target formats from the table.
o In this case, the source format is 640×480 in a 525/59.94 system, using the
sampling rate of 12 + 3/11 MHz and a pixel aspect ratio of 4752/4739.
o The target format is 720×480 (likewise in 525/59.94 system), using the sampling
rate of 13.5 MHz and a pixel aspect ratio of 4320/4739.
2. The second step is to calculate the vertical conversion factor. In our case, it is 486/486 =
1
3. Now we need a horizontal rescaling factor, which in this case is (4752/4739) /
(4320/4739) * 1, which equals 11/10.
4. Then we can calculate the new image width from the old one: 11/10 * 640 = 704 pixels
5. The image height will stay unchanged, since 1 * 480 is still 480.
6. Thus, we need to resample the 640×480 image to 704×480.
7. However, our original target resolution was 720×480. Now we need to pad the image
(with black vertical bars on the sides) so that the frame width will become 720 pixels. Since
720 - 704 = 16, we need to add 8 black pixels to each side edge.

3.2.2 720×576 ITU-R BT.601 pixels to 720×480 ITU-R BT.601 pixels

In other words, a "PAL" to "NTSC" conversion:

1. Again, the first step is to look up the correct source and target formats from the table.
o In this case, the source format is 720×576 in a 625/50 system, using the sampling
rate of 13.5 MHz and a pixel aspect ratio of 128/117.
o The target format is 720×480 in 525/59.94 system, using the sampling rate of 13.5
MHz and a pixel aspect ratio of 4320/4739.
2. We need to calculate the vertical conversion factor. In our case, it is 486/576 = 27/32
3. Now we need a horizontal rescaling factor, which in our case is (128/117) / (4320/4739)
* (27/32), which equals 4739/4680.
4. Then we can calculate the new image width from the old one: 4739/4680 * 720 =
729+1/13 pixels
5. The new image height will be 27/32 * 576 = 486 pixels.
6. Thus, we need to resample the 720×576 image to (729+1/13)×486. As we normally
cannot use subpixel sampling, we must round the figure 729+1/13 to some reasonable
number - in this case probably 729.
7. However, our original target resolution was 720×480. Now we need to crop the 729×486
image sufficiently from the edges so that the frame width will become 720 pixels and
frame height 480 pixels.
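
The final crop-or-pad step of both examples can be sketched as a small helper that centers the resampled image inside the target sampling matrix. This is a hedged illustration only: the function name fit_axis is invented here, and centering is an assumption (the examples above only say "from the edges" and "to both side edges").

```python
def fit_axis(resampled, target):
    """Center one axis of a resampled image inside the target matrix.

    Returns (crop_first, crop_last, pad_first, pad_last) in pixels, where
    "first" means left or top and "last" means right or bottom.
    """
    diff = target - resampled
    if diff >= 0:
        # Image is too small on this axis: add black borders.
        first = diff // 2
        return (0, 0, first, diff - first)
    # Image is too large on this axis: cut at the edges.
    excess = -diff
    first = excess // 2
    return (first, excess - first, 0, 0)


# Example 3.2.1: 704x480 into 720x480.
print(fit_axis(704, 720), fit_axis(480, 480))

# Example 3.2.2: 729x486 into 720x480.
print(fit_axis(729, 720), fit_axis(486, 480))
```

For the first example this gives 8 pixels of padding on each side. For the second it crops 4 and 5 pixels horizontally and 3 scanlines each from the top and bottom. With interlaced material, cropping an odd number of scanlines from the top changes which field each remaining line belongs to, so an uneven split may be preferable there.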

4. Frequently Argued Questions


4.1 Isn't 720 the real width of a 4:3 image? If not, then why are 720 pixels
sampled instead of 711 or 702 (or whatever)?

720 pixels are sampled to allow for a little deviation from the ideal timing values for blanking and
active line length in the analog signal. In practice, an analog video signal - especially if coming
from a wobbly home video tape recorder - can never be that precise in timing. It is useful to have a
little headroom for digitizing all of the signal even if it is of a bit shoddy quality or otherwise
non-standard.

720 pixels are also sampled to make sure that the signal-to-be-digitized has had the time to
slope back to blanking level at both ends. (This is to avoid nasty overshooting or ringing
effects, comparable to the clicks and pops you can hear at the start and end of an audio sample.)

Last but not least, 720 pixels are sampled because a common sampling rate (13.5 MHz) and
number of samples per line (720) make it easier for hardware manufacturers to design
multi-standard digital video equipment.

4.2 What does this mean, considering ITU-R BT.601 compliant equipment?

It means that the sampled horizontal range of the signal is a bit wider than the actual active
image frame:

 On 625/50 systems, only the centermost 702×576 pixels (of 720×576) belong to the
actual 4:3 (or anamorphic 16:9) frame.
 On 525/59.94 systems, only the centermost 710.85×486 pixels (of 720×486) belong to
the actual 4:3 (or anamorphic 16:9) frame. (For practical video applications, 710.85 will
have to be rounded up to 711 pixels.)

Yes, you understood correctly. 720x576 is not exactly 4:3, and neither is 720x480. The real 4:3
frame (as defined in the analog video standards) is a bit narrower than the horizontal range of
signal that actually gets digitized.
Yes, it is the same for all generally available digitizing equipment; tv tuner cards, digital video
cameras and such. It is true even for all-digital systems; otherwise they would not be compatible
with ITU-R BT.601.

4.3 You must be kidding! I am pretty sure there is a mistake in your calculations.
It says everywhere that 720×576 or 720×480 really is 4:3. Please stop propagating
this misinformation!

I admit that the figures presented on this web site are not very well-known facts even amongst
professional videographers, not to mention hobbyists. Aspect ratio is one of the most
misunderstood "black magic" issues in digital video. That is precisely why I constructed the web
site in the first place - to share the knowledge.

As for my calculations; feel free to prove them wrong. For starters, you might want to read the
documents in the Related Links section.

4.4 I have been doing digital video projects for the last 50 years. I know my stuff!
If you were correct, everything I have done to process my precious video has
always been wrong, aspect-ratio wise!

That may very well be the sad truth. Fortunately, even if you had used wrong methods for
scaling/resampling the image, the difference between the correct aspect ratio and a wrong aspect
ratio is often small enough to go unnoticed unless you really start looking for it.

4.5 It still does not make any sense. For starters, all the 525/59.94 equipment I
have only works in 720×480, not in 720×486 (and definitely not in 711×486)! How
do you explain that?

525/59.94 video signal has 486 active (image-carrying) scanlines, but modern digital video
equipment usually crops 6 of them off. Why? To get the height of the image down to 480 pixels,
which is neatly divisible by 16. See for yourself:

 486 / 16 = 30.375 whereas
 480 / 16 = exactly 30.

Also note that 720 / 16 equals exactly 45, so the width of the image is divisible by 16 as well!

4.5.1 Why is it important to have the height and width of the raster image divisible by 16?

Modern digital video applications such as DV, DVD and digital television (DVB, ATSC) often
use MPEG-1 or MPEG-2 formats (or their derivatives) which are all based on 16×16 pixel
macroblocks. Having the height and width of the image readily divisible by 16 makes it easier
and more efficient for an MPEG encoder to compress video.
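
To illustrate the point, the short Python sketch below counts the 16×16 macroblocks needed to cover a frame and how many extra pixels would be required to fill the last row or column of blocks. It assumes an encoder that rounds the frame up to whole macroblocks; the function name macroblock_grid is invented for this example.

```python
def macroblock_grid(width, height, block=16):
    """Return (columns, rows, extra_width, extra_height) for a block grid.

    extra_width/extra_height are the pixels that would have to be added
    to reach a whole number of blocks on each axis.
    """
    cols = -(-width // block)    # ceiling division
    rows = -(-height // block)
    return cols, rows, cols * block - width, rows * block - height


print(macroblock_grid(720, 480))   # fits exactly
print(macroblock_grid(720, 486))   # last row of blocks is only partly used
```

A 720×480 frame divides into 45×30 macroblocks with nothing left over, whereas 720×486 would need 31 rows of blocks, 10 scanlines of which carry no picture.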
4.5.2 Doesn't this mean that when capturing in 720×480, I will lose six scanlines worth of
valuable information that was once present in the original video signal?

Correct, but the information might not have been that valuable in the first place. Most 525/59.94
video work is already done solely in the digital domain and in the 720×480 format, so there is
usually nothing to digitize on those scanlines anymore. Moreover, in the good old days (when all
of those 486 scanlines were still in active use) most of the time the edges only carried flickering
VCR head noise.

The video image is masked by the overscan edges of a CRT based television, so you would not
normally see the "missing" scanlines, anyway.

4.5.3 You keep saying the "real" 4:3 resolution is at about 711×486 for 525/59.94 systems.
OK, maybe there really are 9 extra pixels on the sides, but how do I cope with the fact my
equipment only records 480 active scanlines, not 486?

Think of it this way:

 First, you have a frame of 720×480 pixels.


 There is another frame of 710.85×486 pixels, overlaid and centered on top of the first
one. This frame represents the "real" 4:3 resolution in 525/59.94 systems. (In any
practical real-world video application we would have to use 711 pixels, but 710.85 is the
ideal, mathematically exact number.)
 The parts of the first frame that go over the side edges of the second frame are excess
space that is outside the actual active image area. You can put picture there in digital
systems, but there is no guarantee it will survive on any analog system, or display on any
CRT monitor, even in underscan mode.
 The parts of the second frame that go over the top and bottom edges of the first frame are
the cropped 6 scanlines. As you only have 480 scanlines at your disposal, you cannot put
picture there, but aspect ratio wise this imaginary area counts as a part of the "real" 4:3
image.

There is also another way of thinking about it:

 Disregard the notion that 525/59.94 systems have traditionally had 486 active scanlines.
Instead, think that the new standard is now 480 scanlines.
 Now, your ideal 4:3 frame is 480 * (4/3) / (4320/4739) = 702 + 2/27 pixels. In the real world,
a minimum of 703 pixels would need to be sampled to convey all the information in the
active part of the scanline.
 703 is a nasty uneven number for computers. 704 is much better since it is divisible by 16
(again!)
 Now you have something like a frame of 704×480 pixels, inside which lives an-
approximately-702×480-frame, which in turn represents the real 4:3 image area. But
wait! 704×480 is a familiar number, isn't it? See the connection? It is used in VCD high-
res still images and in ATSC digital television! How convenient!
The latter way of thinking will also lead to cropping off the side edges of the image to get it
inside a 4:3 rectangle (albeit a bit smaller than the "real" one), but then again, if you are
restricted to using 704×480, that decision has already pretty much been made for you.
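
The chain of reasoning in this second way of thinking (ideal width, whole samples, multiple of 16) can be checked with a few lines of Python. This is just a sketch of the arithmetic quoted above; the variable names are invented for the illustration.

```python
import math
from fractions import Fraction

PAR_525 = Fraction(4320, 4739)   # 13.5 MHz pixel aspect ratio for 525/59.94

# Width of a 4:3 frame that is 480 scanlines high, in BT.601 pixels.
ideal_width = Fraction(4, 3) * 480 / PAR_525

# Whole samples needed to cover that width.
min_samples = math.ceil(ideal_width)

# Next width that is divisible by 16.
aligned_width = -(-min_samples // 16) * 16

print(ideal_width, float(ideal_width), min_samples, aligned_width)
```

The exact ideal width is 18956/27, which is the 702 + 2/27 figure above, and the two rounding steps lead to 703 and then 704.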

4.6 What about standards conversion? Isn't PAL 720×576 exactly equal to
NTSC 720×480?

As can be seen from the example in section 3.2.2, the answer is no. If you simply resample from
720×576 to 720×480, the analog active areas of the source and target formats will not match.
Fortunately, there is a bit of fool-proofing built into the relationship of these two frame sizes.
What you will actually get from the process is an image in which the original analog active area
(702×576 centermost pixels of 720×576) has become 702×480 in the target format's pixels. This,
in turn, almost represents a 4:3 area, albeit a bit smaller than what would be needed for a perfect
conversion.

The area that 702×480 covers is not the same as the actual analog active image frame (which
would be 710.85×486, or, in practical terms, 711×486). It is more like a smaller 4:3 frame inside
it.

In other words, the result is that the active 4:3 image frame in the source format has shrunk a bit
in the conversion: it has lost six (target) scanlines in vertical direction and the same relative
amount of width. However, for all practical purposes, it has still retained its original aspect ratio.
The easiest way to see this is converting 702×480 (in 13.5 MHz 525-line ITU-R BT.601 format)
to "true" square pixels: 639 + 4419/4739 square pixels by 480 scanlines is a close enough match
to 640×480, which is 4:3. Wonderful coincidence, isn't it? :)
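
The "wonderful coincidence" is easy to verify numerically. The following Python sketch converts the 702×480 area to square pixels and compares the resulting shape with 4:3; it uses only the 525-line pixel aspect ratio from the table, and the variable names are invented for this illustration.

```python
from fractions import Fraction

PAR_525 = Fraction(4320, 4739)   # 13.5 MHz pixel aspect ratio for 525/59.94

# After a direct 720x576 -> 720x480 resample the width is untouched, so the
# 702 active columns of the source are still 702 columns in the target.
square_width = 702 * PAR_525          # width in "true" square pixels
display_aspect = square_width / 480   # shape of the converted active area

print(square_width, float(square_width))
print(float(display_aspect), 4 / 3)
```

The square-pixel width comes out as 3032640/4739, which is the 639 + 4419/4739 quoted above, and the resulting shape differs from 4:3 by roughly one part in ten thousand.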

The same peculiar relationship applies to all 525/625 "sister resolutions" derived from 13.5
MHz:

 704×576 vs 704×480
 480×576 vs 480×480
 352×288 vs 352×240
 etc.

This holds true on two conditions:

1. The source sampling matrix width (in microseconds) must be exactly the same as the
target's.
2. You can only convert between a full-height 625-line resolution and a cropped-height
525-line resolution (i.e. use only those formats that represent exactly 480 scanlines worth
of 525/59.94 data, instead of the full 486.)

As direct resampling involves shrinkage (or when going in another direction, enlargement), I
cannot really recommend this method for any real standards conversion work. It is more like a
quick hack, suitable for use e.g. if the software does not allow proper resizing and cropping.
Note: Many people use direct resampling for all the wrong reasons: 1) They think that a
720×480 frame directly equals a 720×576 frame. 2) They also think that both aforementioned
frame sizes represent exactly the active 4:3 (or 16:9) picture area, edge to edge. As you already
know from Section 2.1, both of these assumptions are wrong. The fact that direct resampling
works at all is mostly a quirky coincidence.

4.7 What do you mean by saying it is better to avoid 720×540?

The problem with this resolution is that while you think you are editing in a format that is both 1)
4:3 square pixels and 2) easily convertible to a standard video resolution (either 720×576 or
720×480) just by vertical resampling, you are not. See the table. There is no real-world video
format that uses the full 720-pixel horizontal range as the width of the active 4:3 frame.

In order to get to a standard video format from this one, you need to take into account the actual
form of the sampling matrices. The 4:3 area in 625-line formats is 702×576, not 720×576. In
525-line formats it is 711×486, not 720×480. Resizing a 720 pixels wide 4:3 format directly to
720×576 or 720×480 simply won't work. You will either have to resample in both directions
(contrary to what you originally thought, you do not get to keep the image width neatly as 720
pixels at all times), or crop some top and bottom lines off.

If you need to construct an intermediary square-pixel resolution that is a) exactly 720 pixels wide
and b) covers exactly the same area as 720×576 or 720×480 (thus only having to resample in
vertical direction for conversions), you will end up with two separate resolutions, one for each
video standard:

 The 720 pixels wide square-pixel equivalent of 720×576 (ITU-R BT.601 pixels for 625-line
systems) is 720×526.5 pixels
 The 720 pixels wide square-pixel equivalent of 720×480 (ITU-R BT.601 pixels for 525-line
systems) is 720×(526 + 5/9) pixels (the exact height is 4739/9)

Fortunately, the numbers will nicely round up to 720×527 for both standards.
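Both heights come from dividing the number of scanlines by the pixel aspect ratio. A minimal sketch, assuming the 625-line ratio derived from 702×576 being 4:3 and the guide's 525-line value of 4320/4739 (the helper name is made up for illustration):

```python
import math
from fractions import Fraction

PAR_625 = Fraction(4, 3) * Fraction(576, 702)  # 128/117, from 702x576 being 4:3
PAR_525 = Fraction(4320, 4739)                 # as quoted in this guide


def square_pixel_height(lines, par):
    """Height in square pixels when the sampled width is kept unchanged.

    Keeping the width means the frame shape is preserved only if the
    height is divided by the pixel aspect ratio.
    """
    return Fraction(lines) / par


h625 = square_pixel_height(576, PAR_625)  # 1053/2, i.e. 526.5
h525 = square_pixel_height(480, PAR_525)  # 4739/9, i.e. 526 + 5/9

for name, h in (("625-line", h625), ("525-line", h525)):
    # math.ceil is used on purpose: Python's round() would turn 526.5 into 526.
    print(f"{name}: 720 x {float(h):.4f} -> 720 x {math.ceil(h)}")
```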

Note that the original interlaced field structure (if any) will go haywire as you mess around
scaling in the vertical direction.

4.8 Why does your table list two slightly different definitions for square pixels?

"Square pixels", as digitized by a TV tuner or an M-JPEG card, are not exactly square. The
"industry standard" sampling rates used in square-pixel video equipment actually give out pixels
that are almost square, but not exactly. As you can see for yourself in the table, the difference is
very small (for all practical purposes meaningless), but it is still useful to know that sampled
"video" square pixels differ a bit from ideal "computer" square pixels.

Converting "computer" square pixels to "video" square pixels is usually a futile effort. You will
not see the difference, anyway, and probably only lose some quality in the interpolation process.
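The size of the difference can be estimated from the sampling rate and the active line length. Treat this as a sketch resting on assumptions rather than figures taken from this section: the square-pixel sampling rates of 14.75 MHz (625-line) and 12 + 3/11 MHz (525-line) are the commonly cited industry values, the 625-line active line is assumed to be 52 µs, and the 525-line active line of 52 + 59/90 µs is the figure mentioned in this guide's update notes. The function name is hypothetical.

```python
from fractions import Fraction

ACTIVE_US_625 = Fraction(52)           # assumed active line length, microseconds
ACTIVE_US_525 = 52 + Fraction(59, 90)  # from this guide's update notes


def pixel_aspect_ratio(active_lines, active_us, rate_mhz):
    """PAR of sampled pixels for a 4:3 picture.

    A 4:3 frame with `active_lines` scanlines needs 4/3 * active_lines
    square pixels per line; sampling actually yields active_us * rate_mhz.
    """
    ideal_width = Fraction(4, 3) * active_lines
    sampled_width = active_us * rate_mhz  # microseconds * MHz = samples
    return ideal_width / sampled_width


# "Video" square pixels at the assumed industry sampling rates.
video_625 = pixel_aspect_ratio(576, ACTIVE_US_625, Fraction(59, 4))      # 14.75 MHz
video_525 = pixel_aspect_ratio(486, ACTIVE_US_525, 12 + Fraction(3, 11))

print(f"625-line video square pixel: {video_625} = {float(video_625):.6f}")
print(f"525-line video square pixel: {video_525} = {float(video_525):.6f}")

# Sanity check: at 13.5 MHz the same formula gives the guide's 4320/4739.
print(pixel_aspect_ratio(486, ACTIVE_US_525, Fraction(27, 2)))
```

Under these assumptions both results deviate from 1 by well under one percent, which is consistent with the advice above not to bother converting between the two.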

4.9 This is really scary and nasty stuff. I thought digital video was simple! Now
my head hurts!

But that's just the way video is. Fortunately, the conversions are not really that complicated once
you practice them a little.

4.10 I think you're just nit-picking. No-one will ever notice if I consider all "4:3"
video formats just 4:3, without doing any complicated aspect ratio or "active
image area" calculations.

Feel free to process your video just the way you like it. But there are still many people who
would like to get as close to the ideal aspect ratio correctness as possible, instead of only using
rough "ballpark figures" in their video work.

4.11 Help! My capture card does not seem to do it this way!

You may be correct. Professional video gear is very strict about conforming to the ITU-R
BT.601 standard, and you can also generally trust DV camcorders and DVD players/recorders
to use the correct sampling rates and pixel clocks. However, the PC hardware market is different:
cheap mass-marketed TV tuner cards and "TV out" cards often seem to have design flaws and
inaccuracies in their drivers. Sometimes they use the common, industry-standard frame
formats (such as 720×480) with sampling rates that are just plain wrong or sufficiently off the
mark to create problems.

It is usually not the hardware that is the culprit here – the chips on the card may be perfectly
capable of producing images (or digitizing them) using exactly the correct sampling rates and
pixel clocks, but the programmer who designed the driver that controls the hardware may have
taken some special liberties and shortcuts, leading to inaccuracies. (Possibly the drivers for these
problematic devices were designed by someone who has not studied the relevant video
standards.)

Fortunately, you can check out your devices and, if necessary, calibrate your capture workflow
by following these instructions. (The only way to detect these flaws for sure is to
compare test images as detailed in the above link, or to use a test card generator and an
oscilloscope.)
