
Video Display Unit (VDU) : Analogue TV & Monitors Computer Monitors

The document discusses the evolution of display technologies from analog CRT displays to digital interfaces like DVI and HDMI. It provides details on how CRT displays worked using scan lines and blanking periods. It then describes how digital interfaces like DVI and HDMI transmit pixel data and additional information like audio in a digital format without the need for analog signals. The document also briefly discusses encoding schemes like 8b/10b that are used to transmit data reliably over high-speed serial links.

Uploaded by

Kevin Ndemo

University of Manchester School of Computer Science

Video Display Unit (VDU)


Historically derived from Cathode Ray Tube (CRT) technology
Based on ‘scan lines’

[Figure: raster scan pattern showing horizontal flyback and vertical flyback (vertical flyback takes several line times really); each scan line is divided into Blank, Active video, Blank.]

A CRT scanned the display incessantly, so needed a real time stream of pixel data.
Although in principle LCDs could be ‘abused’ here, the standard interface has been retained.

COMP32211 – Implementing System-on-Chip Designs Display systems Slide 1

Analogue TV & Monitors

For many decades televisions were analogue devices which worked as follows:
❏ A ‘spot’ was raster-scanned across the screen, ‘slowly’ in one direction (traditionally left-to-right when viewed from in front) then rapidly in the other direction to its starting point (‘flyback’).
❏ Each scan line was displaced by a ‘fixed’ distance, stepping between sweeps (traditionally from top to bottom); after sufficient scan-lines had been drawn the spot was rapidly returned to the top line.
❏ During these sweeps the spot’s intensity was varied to ‘shade’ different intensities. Before, during and after flyback – both horizontal and vertical – the spot was ‘blanked’ (i.e. set to minimum intensity – black) so the flyback was invisible.
❍ Blanking covered an interval from near the end of a line until the start of the next, and for several lines’ time for vertical flyback.
❍ Colour could be provided by directing three, independently controlled spots onto adjacent {red, green, blue} phosphors.

The display generated is digital in the vertical dimension (it has discrete scan lines) but analogue horizontally as the spot sweeps continuously and its intensity is varied in real time.

For a television it is important that the picture is displayed at the same rate as it is broadcast! To achieve this, synchronisation signals are sent to initiate each flyback. The display has Phase-Locked Loops (PLLs) which regulate its scan rates and these are governed by these ‘sync. pulses’. Blanking is initiated some time before the sync. pulse (known as the ‘front porch’) and persists for some time afterwards (the ‘back porch’).

Horizontal and vertical synchronisation is the same, in principle, although the vertical timing is much slower. The interval between vertical syncs. sets the frame rate of the display.

Analogue broadcast TV is an old standard and has/had relatively long blanking intervals when compared with the active video time.

Computer Monitors

CRT computer monitors were derived from TV technology. The more modern devices have proportionately smaller blanking times so most of the time the interface is carrying active (useful) signals, but the principle is the same.

An aging, though still useful, interface is the VGA which has analogue signals driving the colour intensities and digital (true/false) sync. indicators.

[Figure: 9-way VGA ‘D’ connector carrying Red, Green, Blue, Hsync and Vsync, with Gnd pins.]

The use of analogue signals necessitates Digital to Analogue Converters (DACs) for the colour signals and makes electrical noise pick-up more of a problem.

VDU controller

The VDU controller’s job is to:
❏ Generate the timing signals for active, blanking and sync. phases
❏ Generate addresses and read the frame store memory to determine pixel values (colours) when appropriate
❏ Serialise the pixel values and send them to the display at the pixel rate

It is a fairly simple state machine although the various parameters may be programmable for different display hardware and screen resolutions.
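
As a rough illustration of the timing side of a VDU controller, the active, blanking and sync. phases can be modelled in C with a column counter and a line counter. This is only a sketch under stated assumptions: the figures are the commonly quoted 640×480 at 60 Hz VGA timings (they do not appear in these notes), sync. polarity is ignored, and the names `vdu_phase`, `vdu_decode` and `vdu_step` are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed timings: the commonly quoted 640x480 @ 60 Hz VGA mode. */
enum { H_ACTIVE = 640, H_FRONT = 16, H_SYNC = 96, H_BACK = 48,
       H_TOTAL = H_ACTIVE + H_FRONT + H_SYNC + H_BACK };
enum { V_ACTIVE = 480, V_FRONT = 10, V_SYNC = 2, V_BACK = 33,
       V_TOTAL = V_ACTIVE + V_FRONT + V_SYNC + V_BACK };

typedef struct {
    bool active;  /* pixel data should be driven (not blanked) */
    bool hsync;   /* inside the horizontal sync pulse */
    bool vsync;   /* inside the vertical sync pulse */
} vdu_phase;

/* Decode the current phase from the column counter x and line counter y. */
vdu_phase vdu_decode(int x, int y)
{
    assert(x >= 0 && x < H_TOTAL && y >= 0 && y < V_TOTAL);
    vdu_phase p;
    p.active = (x < H_ACTIVE) && (y < V_ACTIVE);
    /* Sync starts after the front porch and lasts for the sync width. */
    p.hsync = (x >= H_ACTIVE + H_FRONT) && (x < H_ACTIVE + H_FRONT + H_SYNC);
    p.vsync = (y >= V_ACTIVE + V_FRONT) && (y < V_ACTIVE + V_FRONT + V_SYNC);
    return p;
}

/* Advance by one pixel clock, wrapping at the end of a line and of a frame. */
void vdu_step(int *x, int *y)
{
    if (++*x == H_TOTAL) {
        *x = 0;
        if (++*y == V_TOTAL)
            *y = 0;
    }
}
```

Stepping the counters through one whole frame and counting the clocks for which `active` is true shows how much of the frame time is spent blanked with these particular figures; a programmable controller would hold the eight timing constants in registers instead.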

Digital Visual Interface (DVI)


[Figure: DVI connector, with the analogue interface (Analogue I/F) pins labelled.]

❏ Primarily digitally coded channels


❏ Differential signal pairs for each colour
❏ Two channels (available) to increase potential bandwidth
❏ Digital communications for monitor type/status/control
❏ Analogue signals for backwards compatibility

High-Definition Multimedia Interface (HDMI)

[Figure: HDMI connector, with the CEC pin labelled.]


Digital Visual Interface (DVI)

There are various DVI standards: DVI-A is a backward-compatible Analogue interface, DVI-D is the Digital form and DVI-I Integrates the two. In some digital interfaces, having two parallel links improves bandwidth allowing higher definition displays.

DVI-D

High-speed digital data is conveyed on multiple serial channels. Each channel comprises a Current Mode Logic (CML) twisted-wire differential pair; this helps to improve noise immunity. The data is encoded using a form of 8b/10b encoding.

Data is sent uncompressed and in real time, thus the general pattern of the output scan is similar to that used for earlier displays such as CRTs.

Display Data Channel (DDC)

With the economic availability of LCD (Liquid Crystal Display) flat-panel displays came a wider range of displays. In particular, the typical aspect ratio of displays has moved from 4:3 to 16:9. Rather than distorting the picture, a better solution is to output the display in an appropriate form but this requires the computer to be aware of the type of display.

The first forms of data sensing merely detected the monitor type. Current communications are more sophisticated with the computer communicating with an embedded controller on the monitor allowing the downloading of information on a monitor’s aspect ratio, resolution, orientation et cetera. If you are sufficiently interested you can look up “EDID” (Extended Display Identification Data). With some it is also possible to write commands to the monitor, for instance to control brightness or contrast.

The usual DDC is based on the two-wire I2C bus which allows fairly low bandwidth communication using a bidirectional serial protocol. This only requires a small addition to the connector and wiring requirement.

The information available from a monitor can identify the manufacturer and model as well as the different resolutions which are supported, the timing characteristics, colour resolution etc. There may be a ‘preferred’ mode which the monitor is designed to use.

HDMI (High-Definition Multimedia Interface)

HDMI is basically similar to DVI-D, only providing a single set of digital channels (no analogue). HDMI uses the blanking period between active video scans to encode control and other information, such as audio channels.

The data may be encoded in ways other than RGB (e.g. YCrCb).

A single, serial Consumer Electronics Control (CEC) channel is also included to carry data such as that from ‘remote control’ handsets.

8b/10b encoding

In summary, when sending data across a synchronous serial line it is necessary to include enough information for the receiver to recover the clock (to discriminate between adjacent bits) as well as read the data itself. Clearly a pure binary signal is not adequate as it may consist of many consecutive ‘0’s or consecutive ‘1’s.

Many coding schemes have been devised. The link can only be switched at a certain maximum rate; to get the best useful bandwidth, a scheme needs to provide ‘enough’ information to recover the clock but not so much redundancy that the data rate is compromised.

8b/10b is one such scheme which codes 8 bits of data into 10 bit-time symbols (i.e. has a constant 25% overhead). It was first patented (now expired) by IBM in the 1980s.

An important property is that it has DC balance – meaning that, averaged over time, the symbols contain the same number of ‘0’s and ‘1’s. This requires two possible codes per 8-bit byte.

When symbols with insufficient transitions for clock recovery are discarded there are 268 legal codes, which allows any 8-bit data value to be sent plus allowing some control codes (“K-codes”) for the link (which the user need not know about).

8b/10b is in common use, including for DVI, PCI Express, InfiniBand, SATA …
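
The two properties that 8b/10b coding cares about, DC balance and a limit on runs of identical bits, can be measured on any 10-bit symbol with a few lines of C. This is a sketch for illustration, not an encoder: the function names are invented here, and the check covers a single symbol only (a real link also has to consider runs that straddle two symbols).

```c
#include <assert.h>
#include <stdint.h>

/* Disparity of a 10-bit symbol: (number of 1s) minus (number of 0s). */
int disparity10(uint16_t sym)
{
    assert(sym < (1u << 10));
    int ones = 0;
    for (int i = 0; i < 10; i++)
        ones += (sym >> i) & 1;
    return ones - (10 - ones);
}

/* Longest run of identical adjacent bits inside one 10-bit symbol. */
int longest_run10(uint16_t sym)
{
    assert(sym < (1u << 10));
    int best = 1, run = 1;
    for (int i = 1; i < 10; i++) {
        int same = ((sym >> i) & 1) == ((sym >> (i - 1)) & 1);
        run = same ? run + 1 : 1;
        if (run > best)
            best = run;
    }
    return best;
}
```

For example, the patterns 0011111010 (`0x0FA`) and 1100000101 (`0x305`), which are the two forms usually listed for the K28.5 control symbol (an outside fact, not taken from these notes), have disparities of +2 and -2. Sending one after the other restores the balance, which is why a value may need two possible codes.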

Frame store
Based on a 2D array of memory (frame store) with a ‘numeric’ representation of a pixel’s colour.

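[Figure: frame store addresses laid out as a 2D array, 640 locations per row.]

      0     1     2     3  …
    640   641   642  …
   1280  1281  …
      …

Example pixel value: R = FF, G = FF, B = 00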

Each location has an address; this may be a byte, or several bytes, or even less than a byte.
(The first address does not have to be 0000_0000.)
Each pixel’s data represents a colour: e.g. one byte/pixel gives 256 possible colours.
Colours are often separated into Red, Green and Blue intensities.
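
A minimal C sketch of one such ‘numeric’ representation follows, assuming one byte per colour packed into a 32-bit word. The layout chosen (0x00RRGGBB, top byte unused) and the function names are assumptions made for the example; real frame stores use a variety of layouts.

```c
#include <stdint.h>

/* Assumed layout: 0x00RRGGBB, with the top byte left unused. */
uint32_t pack_rgb(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* Recover the individual colour intensities from a packed pixel. */
uint8_t red_of(uint32_t pixel)   { return (uint8_t)(pixel >> 16); }
uint8_t green_of(uint32_t pixel) { return (uint8_t)(pixel >> 8); }
uint8_t blue_of(uint32_t pixel)  { return (uint8_t)pixel; }
```

With this layout the example value R = FF, G = FF, B = 00 packs to `0x00FFFF00`.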


Frame store

The display is made of pixels (‘picture elements’) which are ‘dots’; typically these are rectangular and preferably more-or-less square. The screen comprises a 2D array of pixels at a particular resolution (vertical & horizontal).

An LCD has physical pixels which determine its maximum resolution. Lower resolution is possible by shading adjacent groups of pixels in the same way: for example a square of four physical pixels could represent a single logical one. If the mapping is non-integer then some distortion may occur.

A standard, but now ‘low’, resolution display is the 640×480 VGA (Video Graphics Array). This is specified for the older 4:3 monitor aspect ratio.

The pixel shade/colour is held in a memory called a frame store. Pixels are read successively from the frame store and serialised onto the display. A complete frame refresh is done frequently enough to allow successive frames to give the impression of movement and to avoid disturbing flickering. For computer monitors typical frame rates are in the region 50-100 Hz.

Colour displays are now standard. Each pixel has a colour which is specified by a number of bits. The usual representation for computers is to code intensities of the colours Red, Green and Blue (RGB) separately. This works because human eyes have a limited range of colour sensors; the only colours we actually perceive are centred in these spectral bands and other colours (such as yellow) are perceived from appropriate mixtures of stimuli (red & green for yellow).

The colour outputs are ‘analogue’ (i.e. multi-levelled) outputs where the number of bits used determines how many shades are available. Human eyes are not very sensitive to colour intensities so 8 bits per colour is more than adequate (especially for blue, where perception is worse). Eight bits is, of course, a convenient number for digital computer implementation.

Having three colours is less convenient, so often the entire pixel is mapped into 32 bits; the extra 8 bits can find other uses which need not concern us.

Addressing

Note: the traditional address mapping is to have the lowest address at the top-left corner and increment addresses in rows. Thus the x axis runs left to right and the y axis top to bottom.

To move down one pixel (i.e. y := y + 1) requires adding the length of a row to the address.

Also note that if, for example, we have an ARM system with 32 bits/pixel, each pixel would occupy four addresses, so moving right one place (x := x + 1) would require adding 4 to the address. If the frame store width was 1024 pixels, moving down one pixel means adding 4*1024 = 4096 to the address.

To calculate the address of pixel (x, y):
address = screen_start_address + (y*width_in_pixels + x) * bytes_per_pixel

Typical screen widths (e.g. 640, 1024, 1280) are intended to make these multiplications easy.

A frame store can be larger than the displayed area, although this may waste some memory. It could be made smaller, too, but that would make little sense!

[Figure: the displayed area shown as a region within a larger frame store.]
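
The address calculation translates directly into C. The sketch below keeps the variable names from the formula; the function name and the use of 32-bit addresses are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Address of pixel (x, y), with the lowest address at the top-left corner
 * and addresses incrementing along each row. */
uint32_t pixel_address(uint32_t screen_start_address,
                       uint32_t x, uint32_t y,
                       uint32_t width_in_pixels,
                       uint32_t bytes_per_pixel)
{
    assert(x < width_in_pixels);  /* otherwise x would spill into the next row */
    return screen_start_address + (y * width_in_pixels + x) * bytes_per_pixel;
}
```

With a 1024-pixel-wide frame store at 4 bytes per pixel starting at address 0, `pixel_address(0, 1, 0, 1024, 4)` gives 4 (one step right) and `pixel_address(0, 0, 1, 1024, 4)` gives 4096 (one step down).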

Frame store accessing & bandwidth


Frame store can occupy significant memory.
Remember doubling the linear resolution multiplies the number of pixels by four.

[Figure: the drawing hardware and the VDU controller (VDUC) each present address and data to the frame store (a, d) through an arbitration/control block.]

❏ Pixels need to be read many times per second to keep the display stable. This impacts:
❍ The output rate to the DVI (or whatever).
❍ The need for RAM access to the frame store.
Frame store bandwidth is critical.


Screen update

Although it is not germane to the drawing process, the frame store is also constantly being read by hardware which is updating the display. This shares access to the frame store memory (typically by Time Division Multiplexing). Memory accesses are relatively slow so frame store bandwidth is always an ‘issue’, made worse as the screen resolution increases.

As it has to be shared, the frame store may not be available exactly when you want it. This influences the interface design. The highest priority for access goes to the VDU read-out because if that fails to meet its real-time constraint there will be glitches visible on the screen. More than one other device may share access too: for example in the lab. both the microprocessor and the graphics accelerator compete for the remaining bandwidth.

Double buffering

In a system which may animate a display there is a conflict between using the frame store for what can currently be seen and the future picture under construction. This is typically resolved by double buffering: having a larger-than-needed frame store and displaying from one area whilst drawing in another.

[Figure: a frame store divided into a ‘Draw’ area and a ‘Display’ area.]

In the absence of dual-port memory the accesses either must interleave in time (a typical solution) or two (smaller) separate and switchable frame store memories are needed (expensive).

Bandwidth requirements

Some sums … and sensible approximations

Let’s take a ‘High Definition’ (HD) display resolution of 1920×1080 pixels with 4 bytes per pixel. This requires 1920×1080×4 = 8294400 bytes of storage. Rather than reach for a calculator, let’s rough it out.

‘Almost 2000’ × ‘just over 1000’ is going to be around two million pixels so we need 2M×4 = 8 MB of frame store.

Let’s say (to keep the numbers easy) this supports a frame rate of 50 Hz: it has to be copied to the display 50 times a second, so there is a bandwidth requirement of around 400 MB/s. Note: that’s megabytes, not megabits. Minimum. It doesn’t allow for other data, pauses for blanking, sync. etc.

If you want a bit rate, multiply by 8 and add a bit more for overheads: calling it 4 Gb/s won’t be far wrong.

Looked at another way, the pixel rate will be two million times 50 plus whatever the overhead is, so something over 100 MHz – not too scary a frequency on-chip (these days) but quite aggressive on a PCB!

The frame store needs to be read to supply this demand. If a single pixel (32-bit word) were read at this rate the memory would need to cycle in <10 ns; not really feasible for the ‘big’ memory devices needed (multi-megabyte even assuming a single frame store, and there could be more than one). Thus there needs to be a means of increasing the memory bandwidth. Fortunately the read-out patterns are entirely predictable; it’s easy enough to read the frame store many words wide and then serialise this data.

Also note, if implementing animation, at least, there is another bandwidth requirement to allow concurrent writing of the pixels – and a real-time limit too.
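
For anyone who does want the exact figures behind the rough sums, the arithmetic is easy to express in C. The function names are invented for this sketch, and like the estimates it ignores blanking, sync. and any other traffic.

```c
#include <stdint.h>

/* Storage needed for one frame, in bytes. */
uint64_t frame_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    return (uint64_t)width * height * bytes_per_pixel;
}

/* Minimum read bandwidth in bytes per second to refresh the display. */
uint64_t read_bandwidth(uint32_t width, uint32_t height,
                        uint32_t bytes_per_pixel, uint32_t frames_per_second)
{
    return frame_bytes(width, height, bytes_per_pixel) * frames_per_second;
}

/* Minimum pixel rate in pixels per second (active pixels only). */
uint64_t pixel_rate(uint32_t width, uint32_t height, uint32_t frames_per_second)
{
    return (uint64_t)width * height * frames_per_second;
}
```

For 1920×1080 at 4 bytes per pixel and 50 Hz this gives 8294400 bytes per frame, 414720000 bytes per second and 103680000 pixels per second, in line with the estimates of 8 MB, 400 MB/s and ‘something over 100 MHz’.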



Drawing Straight Lines


An example of mapping an algorithm to hardware.

y = m.x + c
Line is aliased onto pixel array.

Constant ‘width’ of 1 pixel looks least lumpy


Shade in the ‘nearest’ pixel to the desired point.


How not to plot a line

Don’t calculate every point independently.

[Figure: a line drawn from (X0,Y0) to (X1,Y1).]

m = (Y1 - Y0) / (X1 - X0)

Coordinates
(X0, Y0)
(X0+1, int(Y0+m+0.5))
(X0+2, int(Y0+2m+0.5))
(X0+3, int(Y0+3m+0.5))
( … , … )

int(x+0.5) rounds x to the nearest integer

Problems:
❏ Division needed once
❏ Multiplication needed constantly
❏ Rounding errors

Anti-aliasing

Figures drawn in square pixels – especially at low resolution – end up ‘pixellated’; lines look stepped.

Anti-aliasing is a method of blurring these steps. All pixels the theoretical line crosses are shaded but the degree of shading is proportional to how much of the pixel the true line passes through. The line’s colour is blended with the background.

[Figure: the same line drawn not anti-aliased and anti-aliased.]

Anti-aliasing requires considerably more calculation and more memory operations (including reading the pre-existing background).
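
The blending step of anti-aliasing can be sketched in C for a single colour channel. The 0 to 255 coverage scale and the function name are assumptions made for this example; working out the coverage itself (how much of the pixel the line passes through) is the expensive part and is not shown.

```c
#include <stdint.h>

/* Blend one colour channel of the line with the existing background.
 * coverage runs from 0 (line misses the pixel) to 255 (pixel fully covered). */
uint8_t blend_channel(uint8_t line, uint8_t background, uint8_t coverage)
{
    return (uint8_t)((line * coverage + background * (255 - coverage)) / 255);
}
```

Note that `background` has to be read back from the frame store before the new value can be written, which is the extra memory operation mentioned above.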

Bresenham’s line algorithm


❏ Calculate each point iteratively from its predecessor
❏ Avoid multiplication/division (by using similar triangles)
❍ Uses only integers: no rounding problems
Principle

    x = X0;
    y = Y0;
    plot (x,y);
    length = X1 - X0;
    m = (Y1 - Y0) / (X1 - X0);
    e = 0;
    for (length)
        x = x + 1;
        e = e + m;
        if (e >= 0.5)
            y = y + 1;      // y integer step
            e = e - 1;      // Keep |e| < 0.5
        plot (x,y);

Integer code

    x = X0;
    y = Y0;
    plot (x,y);
    dx = X1 - X0;
    dy = Y1 - Y0;
    e = -dx;                // Starting offset
    for (dx)
        x = x + 1;
        e = e + 2*dy;
        if (e >= 0)         // Easy compare
            y = y + 1;
            e = e - 2*dx;
        plot (x,y);
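
The integer version can be turned into compilable C almost line for line. In this sketch the ‘plot’ operation is replaced by recording each point in a caller-supplied array so the result can be inspected; the `point` type, the function name and the `max` parameter are inventions for the example, and only lines with 0 ≤ dy ≤ dx are handled.

```c
#include <assert.h>

typedef struct { int x, y; } point;

/* Integer Bresenham for lines with 0 <= dy <= dx.
 * Stores the visited points in out[] and returns how many were stored. */
int bresenham_octant(int X0, int Y0, int X1, int Y1, point *out, int max)
{
    int dx = X1 - X0;
    int dy = Y1 - Y0;
    assert(dx >= 0 && dy >= 0 && dy <= dx);  /* slope between 0 and 1 only */
    assert(max >= dx + 1);                   /* room for every point */

    int x = X0, y = Y0, n = 0;
    int e = -dx;                             /* starting offset */
    out[n++] = (point){ x, y };              /* stands in for plot(x, y) */
    for (int i = 0; i < dx; i++) {
        x = x + 1;
        e = e + 2 * dy;
        if (e >= 0) {                        /* easy compare against zero */
            y = y + 1;
            e = e - 2 * dx;
        }
        out[n++] = (point){ x, y };
    }
    return n;
}
```

For the line from (0,0) to (5,2) this visits (0,0), (1,0), (2,1), (3,1), (4,2), (5,2), the same pixels that rounding y = 0.4x to the nearest integer would give.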


Octants

The foregoing assumes that the line is in the shaded octant, shown here. If it is not, the same approach can be followed with some slight variations.

[Figure: the eight octants around the starting point, with one octant shaded.]

In this example, x is incremented and y is incremented conditionally. For the octant immediately below the x axis, x is incremented and y is conditionally decremented. As long as the coordinates are modified in the correct way the signs of the internal variables are irrelevant.

Similarly, if the slope of the line is >1 (i.e. ‘steeper than 45°’) then x and y are exchanged. A similar transformation can be applied if the line is going ‘right’ or ‘down’.

Similar triangles

The gradient (‘m’) of a step from one pixel to the next is derived from the vertical/horizontal distances between end points. Although ‘m’ is typically fractional (0 ≤ m ≤ 1) the distances between endpoints are integers.

[Figure: similar triangles, a small one with sides 1 and m and a large one with sides dx and dy.]

Thus, when considering whether the y coordinate should change, instead of thinking of little steps (1, m) we can think of big ones (2dx, 2dy) and the decision will still be the same.

(The extra factor of 2 is convenient because we want to step when half-way to round to the nearest pixel and this avoids the 1/2.)

Optimisation

There is another optimisation which reduces the length of the loop by simplifying the ‘plot’ operation. Instead of translating coordinates on each iteration, simply work out the address of the starting point and retain that. Using the assumptions of ‘one address per pixel’ and ‘640 pixels per line’, the following translations take place:

x = x + 1 ⇒ address = address + 1
y = y + 1 ⇒ address = address + 640

The plot no longer needs to do any translation, just the store.

A disadvantage of this method is that running off the edge of the frame store is not apparent, as it may be if clipping the x and y coordinates.

If you have more than one pixel/word in the frame store (as in the lab.) then one can speed up drawing by writing several pixels at once. These pixels must be in the same word and so will form a horizontal group. This is not very useful when drawing single lines because there will often not be several adjacent pixels within the same word.

It is very useful when filling areas (e.g. clear screen) and similar (e.g. character drawing) where it can reduce drawing times by a factor of (e.g.) 4.
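
A C sketch of the address-stepping optimisation follows, under the same assumptions of one address per pixel and 640 pixels per line. The function name and the byte-per-pixel frame store are choices made for the example; as in the text, nothing here checks for running off the edge of the frame store, and only lines with 0 ≤ dy ≤ dx are handled.

```c
#include <assert.h>
#include <stdint.h>

enum { LINE_PIXELS = 640 };  /* assumed: 640 pixels per line, one address each */

/* Bresenham line that steps a frame store address instead of (x, y). */
void draw_line_by_address(uint8_t *frame_store,
                          int X0, int Y0, int X1, int Y1, uint8_t colour)
{
    int dx = X1 - X0;
    int dy = Y1 - Y0;
    assert(dx >= 0 && dy >= 0 && dy <= dx);  /* slope between 0 and 1 only */

    /* Translate the starting coordinates once, then only adjust the address. */
    uint8_t *address = frame_store + Y0 * LINE_PIXELS + X0;
    int e = -dx;
    *address = colour;
    for (int i = 0; i < dx; i++) {
        address += 1;                        /* x = x + 1 */
        e += 2 * dy;
        if (e >= 0) {
            address += LINE_PIXELS;          /* y = y + 1 */
            e -= 2 * dx;
        }
        *address = colour;                   /* plot is now just the store */
    }
}
```

The loop body contains no multiplication by the line length at all; the only cost of moving down a line is one extra addition.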

Parallelism
Identifying parallelism is a good plan: e.g. Bresenham’s line algorithm
2 clocks/iteration

    x <= X0;
    y <= Y0;
    dx <= X1 - X0;
    dy <= Y1 - Y0;
    e <= -dx;
    for (dx)
        plot(x,y);
        x <= x + 1;
        e <= e + 2*dy;
        if (e >= 0)
            y <= y + 1;
            e <= e - 2*dx;
    plot(x,y);

1 clock/iteration

    x <= X0;
    y <= Y0;
    dx <= X1 - X0;
    dy <= Y1 - Y0;
    e <= -dx;
    for (dx)
        plot(x,y);
        x <= x + 1;
        if (e + 2*dy >= 0)
            y <= y + 1;
            e <= e + 2*(dy - dx);
        else
            e <= e + 2*dy;
    plot(x,y);

Also note the pipelining here: plot overlaps with the next pixel calculation.

In the second example the critical path is likely to be longer (‘if’ calculation followed by multiplexer)
but not much worse (multiplexers are quick).
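
That the one-clock form computes the same thing as the two-clock form can be checked with a small software model in C. This is a sketch only: the `line_state` type and function names are invented, and C statements run in order, so the fused version copies the old state first to imitate every right-hand side reading values from before the clock edge.

```c
/* The part of the line-drawing state that changes conditionally. */
typedef struct { int y, e; } line_state;

/* Two-step update: add 2*dy first, then test and correct. */
line_state step_two_phase(line_state s, int dx, int dy)
{
    s.e = s.e + 2 * dy;              /* first clock */
    if (s.e >= 0) {                  /* second clock */
        s.y = s.y + 1;
        s.e = s.e - 2 * dx;
    }
    return s;
}

/* Fused update: test e + 2*dy directly and pick one of two new values for e. */
line_state step_fused(line_state s, int dx, int dy)
{
    line_state next = s;             /* right-hand sides below use the old s only */
    if (s.e + 2 * dy >= 0) {
        next.y = s.y + 1;
        next.e = s.e + 2 * (dy - dx);
    } else {
        next.e = s.e + 2 * dy;
    }
    return next;
}
```

Starting both from `y = 0, e = -dx` and stepping dx times gives identical `y` and `e` values at every step. The comparison and the choice between the two candidate values of `e` correspond to the ‘if’ calculation followed by a multiplexer mentioned above.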


Parallelism

Probably the biggest ‘mistake’ made by people starting to develop HDL code is to think serially, as in a conventional (imperative) programming language. In C, Java, assembly language etc. statements can be viewed as executing one after the other … because they need to (at least in principle).

In hardware the only needs are due to dependencies and resources – and resources shouldn’t be too much of an issue within this lab. Thus statements need to be mapped into time slots but as many statements as possible can go in the same time slot. This leads to a much faster implementation than a simple one-statement-per-clock machine.

The number of serial processing steps which take place in a single cycle (i.e. the critical path length) also concerns the designer; however the cycle is generous in the lab. so it is not likely to be a major concern when describing logic.

When developing your own code, design it before you implement. Plan what should happen (e.g. on a piece of paper) in each clock cycle.

Pay attention to which values are latched. A common problem is that a value is only available after a clock edge when you want it in the current cycle. The choice is then whether to derive the signal combinatorially so that it is available a bit earlier or whether to start work a cycle earlier. See the problem below.

Problem

Fill in the timing diagram for this module.

    reg [3:0] counter;
    reg carry;

    always @ (posedge clk)
      if (en && carry_in)            // Hint on fn. of ‘carry’
        begin
          if (counter == 9)
            begin
              counter <= 0;
              carry <= 1;
            end
          else
            begin
              counter <= counter + 1;
              carry <= 1;
            end
        end

[Timing diagram to complete: clk; counter = 7, 8, 9, 0; carry.]

The circuit is unlikely to be useful. Rewrite the Verilog in at least one way to do what the designer presumably intended.
