Capture and Streaming Guide
STREAMING
VIDEO FORMATS AND CODECS
SVCONLINE.COM | MAY 2023 | SVC 33
CAPTURE AND STREAMING
The hybrid component is now ubiquitous. Yet, this was not the case in 2018 when I began asking my colleagues about expanding my company's services to include hybrid events—people told me I was crazy, that hybrid events would never be relevant.

Today, Event Stream Team is a full-service video and webcast production business with more than 120 active clients ranging from nonprofits to government and state agencies, providing hybrid events for everything from workshops and conferences to plays, talk shows, and more. I serve as the chief technology officer of our family-run business along with my wife, business partner and CEO Auset Reid, and our children, who are training as technical directors, audio engineers, and camera operators.

In order to ensure high-quality video and audio for hybrid events, we use Blackmagic Design's ATEM 2 M/E Constellation HD live production switcher with built-in Fairlight audio mixer at the core of our workflow. We previously used an ATEM Television Studio HD and recently upgraded to take advantage of the ATEM 2 M/E Constellation HD's newest features. The switcher's built-in Fairlight audio mixer can easily handle complex live sound mixing—the audio is de-embedded from all the SDI video inputs and passed to the audio mixer, where each input channel includes 6-band parametric EQ and compressor, limiter, expander and noise gate, as well as full panning.

Prior to using the ATEM 2 M/E Constellation HD's Fairlight mixer, our biggest obstacle was managing audio sync with video sources, particularly image magnification (or "IMAG"), common in many client events. The Fairlight mixer improved our workflow by providing the capability to create macros to control audio sources and settings.

We use the ATEM's Fairlight mixer to manage the audio mix of SDI and analog sources, including gain structure, EQ, dynamics, and audio sync. One of the main benefits is the ability to save the configuration file for the many types of gigs that we do; plus, the compact design of the ATEM 2 M/E Constellation HD is perfect for portable live production. As a bonus, the switcher is extremely quiet due to its high-efficiency thermal system, which comes in handy when we're working in a courtroom or museum.

Our team recently produced a three-day religious conference that included IMAG, as well as live streaming. Two Blackmagic Studio Camera 4K Pros, paired with ATEM Studio Converters, simplified both the workflow and the cable management. The cameras' dual native ISO ensured the quality of the IMAG, while the ATEM 2 M/E Constellation HD's Fairlight mixer—used with the front-of-house mixer—ensured smooth and consistent audio between the HyperDeck Studio HD Plus recorders and the audio from wireless mics.

With Blackmagic Design gear, we're confident that we'll deliver top quality in any situation. I am proud to say that Event Stream Team is now considered one of the go-to teams for live streams and hybrid events in the area.

Jimmy Reid is Chief Technology Officer at Event Stream Team, Washington, DC
There are so many video formats; so many frame sizes, frame rates, data rates, file sizes, compression algorithms. Terminology gets used and repurposed so much that it's easy to lose track of what it means. There are some actual standards behind video formats, as well as a lot of fudging around the edges as things constantly change.

This is not intended as a history lesson, but it's important to remember that television as a viable form of media goes back to around the 1940s--less than 100 years of advancing technology. For most of that time "television" meant programs created by a few major broadcast networks (plus local stations) and sent over the air to viewers at home. That model shifted in various ways with the advent of "cable" and satellite delivery, but changes happened relatively slowly and the underlying technical concepts were fairly stable. What's now known as "standard definition" (SD) video was the only video for 50+ years. One frame rate, one frame size, analog and eventually digital. But massive changes, driven by technology, culture and business, have remade the landscape.

HOW MUCH HISTORY DO WE NEED?
Some things, like the CRT (cathode-ray tube) television/monitor, are pretty much gone for good, but other "legacy" concepts persist. These three still affect how we deal with video: standard definition, fractional frame rates, and interlace.

In the US, and a few other countries, television was standardized to scan down 525 horizontal lines every frame. The frames updated at 30 per second, which is related directly to the 60Hz frequency of the US power grid (television in Europe and most other countries has frame rates related to 50Hz power). The terms NTSC and PAL are often used as shorthand for TV based on 60Hz or 50Hz, respectively, though those terms actually refer to how color is encoded in each system (and are mostly irrelevant now).

Outside of special cases, this was video until about 2009, when the "DTV Transition" brought HDTV to the public. It also changed over-the-air broadcasting from analog to digital, but that's another story. Today we deal mostly in HD formats, but standard def is still around, mainly because huge amounts of older programming are still in use.

In the US, color was added to the original black-and-white television signal as an "overlay" of sorts, and the frame rate was changed to 29.97fps (the reasons would require another article). It's still 30 complete frames, but runs slightly slower than the power line frequency. Most, but not all, equipment today will operate at both 29.97 (fractional) and 30 (integer) frame rates, but in the world of broadcasting everything is still fractional. Again, this has much to do with all the old content that's still around.

So does it matter? If video is going to a traditional broadcast outlet, probably yes. Fractional may also be required for content delivered to other entities. But within self-contained video systems--office, conference, education, entertainment--it really doesn't matter and, in fact, using integer frame rates means that the frame count of a recording will match real time. (Compensating for the frame count difference between 29.97 and 30 is the reason for "drop-frame" timecode.)

Lastly, there's interlace. Going back to standard def, it turns out that scanning all 525 lines in 1/30 of a second could cause noticeable flicker for the viewer. The fix was to scan all the odd lines, then the even lines, as two fields per frame (60 fields per second). The phosphor coating of a CRT monitor would continue to glow long enough that the interleaved lines appeared continuous, thus effectively doubling the refresh rate. A little trick of the human visual system.

Nobody watches interlace anymore because every display type in use today is progressive (scanned contiguously from top to bottom). But interlace is still used by traditional broadcasters because it was part of the standards introduced in 2009 and was easier to keep than change (plus there were lots of CRT televisions still in use). HD formats such as 1080i are available in lots of equipment, but again, there is arguably no reason to use interlace unless the video output is going somewhere that requires it.

WELCOME TO TODAY
The official standard for the DTV transition in 2009 covered roughly 36 possible frame size and rate combinations, including fractional and integer rates related to 60Hz, and those related to 50Hz (no fractional). We now deal regularly with resolutions from below SD to 4K and higher. Some of these are outlined in Table 1.

Table 1. SOME CURRENT VIDEO (TELEVISION) FORMATS

ID     Also Known As       Frame Size (pixels)   Frame Rates (integer shown)   Common Use
480i   Standard Def (SD)   640x480               25, 30                        Legacy content
UHD    2160p (Ultra HD)    3840x2160             24, 25, 30, 50, 60            General "4K" production

Notes: 24, 30 and 60 integer frame rates have fractional equivalents at 23.98, 29.97 and 59.94. 25 and 50fps are common outside the US and have no fractional variants. 480p is a legitimate format but not seen much since HD began. PsF (progressive segmented-frame) rates are the same as P, but may require conversion.

Progressive has both 30 and 60 frame-per-second formats, as well as 24, 25 and 50, and all those frame rates can be used with the two officially standardized HD resolutions, 1280x720 and 1920x1080. Those numbers define the width and height of a frame in pixels, with an aspect ratio of 16:9. Standard-def is usually denoted as 640x480, which is an aspect ratio of 4:3 (there are only 480 active picture lines out of 525).

Unfortunately, the longevity of interlace has led to confusion in terminology. In my view we should always be talking about frames, not fields, and thus 1080i30 is 1920x1080 interlaced at 30fps (which by definition means 60 fields). So what is 1080i60? It's just another name for 30fps interlaced that unfortunately became popular. The bottom line is that if it's interlaced it's 30 frames/60 fields (or 29.97/59.94) no matter what it's called.

Note that 4K has two different variants, both approximately 4000 pixels across. What we often call "4K" in shorthand is usually UHD (that's "Ultra HD" in marketing talk) with a frame size of 3840x2160. That is exactly four times the size of a 1920x1080 HD frame. The other 4K is 4096x2160, which is one of the formats for digital cinema (as opposed to television). There's a corresponding 2K format of 2048x1080. In general, if it's non-cinema video, it's UHD.

Another "legacy" signal format still seen in equipment menus is progressive segmented frame (PsF). Sony developed PsF in the early days of progressive video, when 1080p30 was coming into use but a lot of equipment could only process 1080i. PsF deconstructs a progressive signal into "pseudo-interlace" for transport between devices--say from a 30p camera to a 30p monitor. It doesn't actually change the image structure.

Getting outside of "production" video, there are resolutions commonly seen in computer displays, standardized by VESA. These include VGA (also 640x480) and various smaller frame sizes, up to 4K and above. Contemporary computer monitors can usually deal with a wide variety of formats, while some displays sold as consumer televisions (ie, for watching entertainment video) may be limited to the "official" SD and HD resolutions. This little detail can cause unexpected trouble!

So let's assume that video in the AV world is progressive and most likely at integer rates. From there, the choice of which gets used has to do with different applications, and what signals are supported by equipment. Frame rate affects the aesthetic appearance of video so, for example, 24 and 30 may be perceived as more "cinematic" because film has traditionally been shot at 24. Higher frame rates tend to be better for capturing fast motion but have a look that some find "hyper-realistic." Of course this is somewhat subjective.

Frame rate and resolution also affect the data rate of signals and the size of recorded files. Using SDI as the transport medium, 1080/30 (progressive or interlace) has an uncompressed data rate of 1.485Gb/sec, usually short-handed to 1.5G. As 1080p60 became viable, 3Gb SDI became common. Moving to UHD, multiply those by four to get 2160p30 at 6G and 2160p60 at 12G. Pushing 12Gb over SDI coax starts to get tricky, which is one reason that other approaches, such as quad-HD or video-over-IP, may be needed.

The size of recorded files grows in a similar way, moving from around 9GB/minute for uncompressed 1080p30 to 60GB/min for 2160p60. Those are bytes, not bits. Calculating true data rates and file sizes is complicated by factors such as how the luminance and color are encoded and the bit depth of color channels. That's why some HDMI data rates go beyond 12Gb.

With regard to 4K/UHD, more is not always better. 4K is incredibly hyped as something everyone needs, but it really pays to think about the application. The detail available in a 4K image is not visible unless the viewer is very close to the screen, or the screen is gigantic. Some use cases are valuable, like shooting in 4K to enable pulling HD sub-images out of the overall picture, but many are, IMHO, more hype than value--while the overhead cost in bandwidth and file size is considerable. A case can be made that high dynamic range (HDR) and expanded color range (gamut) provide more image improvement at a fraction of the cost.
CODECS & CONTAINERS
Although data speeds in equipment, networks and the internet keep going up, constant improvement in data compression is arguably what has made "video everywhere" possible. Finding ways to get better quality out of fewer bits is the magic that has put HD and 4K on all those screens. As an example, consider that a 1080p60 show, which may have started at 3Gb/s uncompressed in production, might stream to a viewer's computer or TV at 10Mb/s or less. That's a data rate reduction of roughly 300:1. Realistically, that show probably used some forms of compression throughout the production chain as well, because capturing and editing uncompressed is technically challenging.

The term codec refers to an algorithm for compressing and decompressing a signal for transport or storage. There are dozens of codec flavors for video and audio, and choosing a codec is very much application-specific. Acquisition and contribution-quality codecs exhibit the fewest visual or audible artifacts, but have high data rates. At the other end are codecs used for final delivery to viewers, which need to be extremely efficient (low data rate) to be sent over the internet and decoded easily--so more likely to show artifacts. Amazingly, the end product we see still looks pretty damn good, for which we can thank the scientists and engineers doing the heavy math!

There are many parameters besides data rate that differentiate codecs for different purposes. A critical one is lossless vs. lossy. A lossless codec will return the same data after decoding that was encoded originally, bit for bit. These are typically used in production environments. Lossy codecs use mathematical and perceptual tools to discard some of the original data in a way that is meant to be undetectable. A good example is the mp3 audio codec, which may discard parts of the audio that will be masked by louder sounds, or that fall outside average hearing ability. This is trickery that works surprisingly well, but there's always a tradeoff in absolute quality.

An important lossy compression tactic for video is frame prediction. For example, some variants of the H.264 (aka AVC or MPEG-4 Part 10) codec look at differences in pixels and motion between frames. A certain number of complete frames (I-frames) are kept, but between them are interpolated (B and P) frames that mathematically predict what is likely to occur. The algorithm looks backward and forward in the data stream to create a Group of Pictures (GOP) that the decoder will reassemble into the final output. More I-frames result in better quality, but increase the data rate, so codecs used for final viewing tend to be Long-GOP, meaning there are lots of interpolated frames. There are many other lossy codecs that use interpolation and other techniques to reduce the data rate.

So our hypothetical 1080p60 TV show might have been shot with a medium-rate codec that uses only I-frames, such as a variant of Apple ProRes, maybe 300-500Mb/s. That is quite viable for high-quality editing and color correction. Scenes that are destined for extensive special effects (such as green-screen shots) might use a higher-rate codec because having more original scene data results in better effect compositing. The final edited show might be output as files in several different codecs for different delivery requirements. It also might be transcoded to different HD flavors, upconverted to 4K, or converted to 1080p50 for international viewing (a process known as standards conversion). All of these conversions can exact a cost in quality, but the tools are good enough now that it's not usually a serious problem.

Some codecs, like mp3, contain all the information necessary to play the audio. But as codecs entered more areas of usage it became necessary to extend what could be carried in the file. This was accomplished by embedding the compressed essence of the content in a container or wrapper file. Some file types we use regularly, such as .mp4 and .mov (QuickTime), are not codecs in themselves, but wrappers. The wrapper is the file that's recognized by a playback device, and it contains the compressed audio/video essence, plus other types of information such as metadata about the essence file.

[Diagram: the recording and playback chains. RECORDING PROCESS: source signals (uncompressed) -> codec (encode): compression/data reduction -> wrapper/container: video essence, audio essence, other (eg: metadata) -> storage/memory: RAM, hard drive, USB stick, internet. PLAYBACK PROCESS: storage/memory -> wrapper/container -> codec (decode): de-compression -> picture & sound to display/speakers.]

This distinction is why someone may try to play a .mov file and get a message that "the codec is not supported." The player recognizes that .mov is a media file, but does not have the correct codec to decode the essence. Conversely, compressed essence files can often be encapsulated in more than one type of wrapper. This is important because some wrapper formats, such as MXF, can carry extensive metadata that might be useful in production but is not needed for simple viewing.

Tools like QuickTime Player and VLC are useful for finding out what's actually inside a file. Even "stats for nerds" on YouTube can be handy when you want to know why a video looks a certain way. But the difference between codecs and wrappers is not as well understood as it should be--even among professionals. This can lead to a confusing discussion when someone says they want a .mov file, and I respond with, "Okay, what codec?"
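The I-frame/Long-GOP tradeoff can be made concrete with a toy model. The frame sizes below are invented for illustration--real encoders vary enormously--but the structure shows why Long-GOP delivery codecs are so much leaner than all-I acquisition codecs.

```python
def avg_bits_per_frame(gop_pattern: str, sizes: dict) -> float:
    """Average bits per frame for a repeating GOP such as 'IBBPBBPBBPBB'.

    I = complete frame; P and B = predicted frames carrying only the
    differences. The bit counts in 'sizes' are illustrative only.
    """
    return sum(sizes[f] for f in gop_pattern) / len(gop_pattern)

sizes = {"I": 400_000, "P": 150_000, "B": 80_000}     # invented example sizes
all_i = avg_bits_per_frame("I", sizes)                 # all-I, ProRes-style
long_gop = avg_bits_per_frame("IBBPBBPBBPBB", sizes)   # Long-GOP delivery
print(f"Long-GOP needs about {long_gop / all_i:.0%} of the all-I data rate")
```

Swap in a pattern with more I-frames and the average climbs--exactly the "more I-frames, better quality, higher data rate" tradeoff described above.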
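The wrapper-vs-codec distinction shows up right in the raw bytes: a player identifies the container from its file signature before it ever worries about the codec inside. The signatures below are real, published magic numbers; the helper function itself is just a sketch of mine.

```python
def sniff_container(header: bytes) -> str:
    """Guess a media *container* (wrapper) from the first bytes of a file.

    This identifies only the wrapper, not the codec of the essence inside
    it -- which is exactly the distinction discussed above.
    """
    if header[4:8] == b"ftyp":                      # ISO base media: MP4/MOV
        return "MP4/MOV (ISO base media)"
    if header[:4] == bytes.fromhex("1A45DFA3"):     # EBML magic number
        return "Matroska/WebM"
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "AVI"
    return "unknown"

print(sniff_container(b"\x00\x00\x00\x18ftypmp42"))  # MP4/MOV (ISO base media)
```

Knowing the wrapper is MP4 still tells you nothing about whether the essence is H.264, H.265, or something the player can't decode--hence "Okay, what codec?"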
PRODUCTS
MAGEWELL ULTRA ENCODE AIO
Magewell's new Ultra Encode AIO live media encoders build on the flexibility of the original Ultra Encode family with expanded features, including HDMI and SDI input connectivity in a single unit; 4K (30fps) encoding and streaming from the HDMI input; simultaneous multi-protocol streaming; file recording; and more. Ultra Encode AIO supports H.264 and H.265 streaming in multiple protocols including RTMP, SRT, HLS and more. It can encode one live input source or mix the HDMI and SDI inputs (picture-in-picture or side-by-side) into a combined output. Users can also apply up to eight configurable overlays, including text, images, and a clock.