Unicode Input and Fonts

Unicode is a standard that assigns a unique number, called a code point, to each character, formatting symbol, and control code across many written languages. It was created to allow computers to consistently represent and process text in any language. Language input software allows users to type characters not on their keyboard layout by mapping keyboard keys to character code points. Fonts contain glyph images for displaying characters, but some non-Unicode fonts incorrectly assign glyphs, ignoring the Unicode standard. Character encoding specifies how code points are represented in text files and streams.

Uploaded by

Sim Tze Wei
Copyright
© Attribution Non-Commercial (BY-NC)

Part 1: UNICODE, INPUT AND FONTS

by Tze Wei Sim

Content
Unicode
Language Input Software
Fonts vs. Glyphs

What is Unicode?

Most early computers could handle only the characters used in the English language. The ASCII (American Standard Code for Information Interchange) chart defines just 128 characters, and these were all that could be typed with a keyboard. Each character is lined up in the chart and given a code (number). For example, A has code 65 in the chart and a has code 97.
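As a small illustration, the chart numbers can be looked up in Python, which is used here only as a convenient calculator and is not part of the original slides. The variable names are chosen for this sketch.

```python
# ord() returns the code (number) of a character; chr() goes the other way.
code_upper_a = ord("A")
code_lower_a = ord("a")
print(code_upper_a, code_lower_a)

# ASCII covers only codes 0 through 127, which is 128 characters in total.
ascii_chart = [chr(code) for code in range(128)]
print(len(ascii_chart))
```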

What is Unicode?

To cater to a multilingual world, a group of ambitious people called the Unicode Consortium extended the chart to include almost every single character in the world.

Characters from almost 100 scripts, about 110,000 characters in all, are lined up and each given a code (number).
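The extended chart can be explored the same way. The sketch below assumes Python and its standard unicodedata module; the helper name describe and the sample characters are chosen only for illustration.

```python
import unicodedata


def describe(ch):
    """Return the code (number), the U+ notation and the official name of a character."""
    code = ord(ch)
    return code, f"U+{code:04X}", unicodedata.name(ch)


# Characters from different scripts all live in the same chart.
for ch in ["A", "ñ", "क"]:
    print(ch, describe(ch))
```

The first 128 codes of the Unicode chart are the same as the ASCII chart, which is why A keeps the number 65.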

Language Input Software



We instruct a computer to show A by pressing the A key on a keyboard. The keyboard sends a signal that tells the computer to show character 65 in the Unicode chart. However, there are only about 105 keys on a standard keyboard. How do we type ñ (character 241) with a keyboard?

Language Input Software

The languages listed in the top right-hand corner of the screen are the available input languages.

When you change it to Spanish, you are not switching keyboards.


By choosing Spanish you are, in fact, changing the language input software, which helps you to transcode 59 (the semicolon key on the keyboard) to 241. When the keyboard sends the instruction "key in character 59", the Spanish input software translates that instruction to "key in character 241".
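The translation step can be sketched as a lookup table. This is a deliberately simplified model in Python, not how any real operating system implements input methods; the table names and the translate function are hypothetical, and only the 59 to 241 entry comes from the text above.

```python
# Simplified layout tables: the code sent by the key goes in, a character code comes out.
ENGLISH_LAYOUT = {}          # no overrides: every key keeps its own code
SPANISH_LAYOUT = {59: 241}   # semicolon key (59) becomes character 241


def translate(key_code, layout):
    """Return the character that the input software produces for a key code."""
    return chr(layout.get(key_code, key_code))


print(translate(59, ENGLISH_LAYOUT))
print(translate(59, SPANISH_LAYOUT))
```

The same physical key is pressed in both calls; only the table chosen by the input language differs.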

Font vs. Glyph

A font is a file that maps codes (numbers) to the pictorial glyphs shown on the computer screen.

It contains picture files of characters. The picture files are called glyphs.
How the picture files or glyphs are designed is totally up to font designers.
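A toy model of this idea, assuming Python: each "font" below is just a table from character codes to glyphs, with short text labels standing in for the actual pictures. The font names, the labels and the render function are invented for this sketch and do not reflect any real font file format.

```python
# Two toy fonts: same codes, different drawings.
FONT_SERIF = {65: "[serif drawing of A]", 241: "[serif drawing of ñ]"}
FONT_SANS = {65: "[sans drawing of A]", 241: "[sans drawing of ñ]"}

# Shown when a font has no glyph for a code.
MISSING_GLYPH = "[empty box]"


def render(text, font):
    """Look up the glyph for each character code in the given font."""
    return [font.get(ord(ch), MISSING_GLYPH) for ch in text]


print(render("Añ", FONT_SERIF))
print(render("Añ", FONT_SANS))
print(render("Z", FONT_SANS))
```

The text itself (the codes 65 and 241) never changes; only the pictures chosen for those codes do.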

Some font designers exploit the fact that they are free to design how the picture or glyph would appear. They completely ignore the character map prescribed by Unicode and jumble up the order of the characters.

Non-Unicode Fonts

Some fonts, e.g. Kruti Dev 010.ttf, assign the glyphs of some characters to the numbers already taken by other characters in the Unicode chart.

These fonts are called non-Unicode fonts.
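The effect can be sketched with the same kind of toy lookup table, assuming Python. The legacy table below is hypothetical and is not the actual character map of Kruti Dev or any other real font; it only shows what happens when a glyph is assigned to a number that Unicode has already given to another character.

```python
# In the Unicode chart, code 100 is the Latin letter d and code 0x0915 is Devanagari क.
UNICODE_FONT = {100: "[drawing of d]", 0x0915: "[drawing of क]"}

# Hypothetical non-Unicode font: it puts a Devanagari drawing on code 100.
LEGACY_FONT = {100: "[drawing of क]"}


def render(text, font):
    """Look up the glyph for each character code in the given font."""
    return [font.get(ord(ch), "[empty box]") for ch in text]


stored_text = "d"  # what is actually saved in the file: code 100

print(render(stored_text, LEGACY_FONT))
print(render(stored_text, UNICODE_FONT))
```

The saved text is still code 100, so it looks right only while the non-Unicode font is selected. With any Unicode font, or after copying the text elsewhere, the reader sees the letter d instead.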

Character Encoding

What is the difference between Unicode and UTF-8? What is character encoding? Please read part 2 of this presentation on character encoding:

CHARACTER ENCODING: How do computers deal with multiple languages?
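As a brief preview of that distinction, and assuming Python, the sketch below shows one character's Unicode number next to the bytes UTF-8 uses to store it. The variable names are chosen for illustration.

```python
ch = "ñ"

# Unicode: the number assigned to the character in the chart.
code_point = ord(ch)

# UTF-8: one way of writing that number as bytes in a file or stream.
utf8_bytes = list(ch.encode("utf-8"))

print(code_point)
print(utf8_bytes)
```

The character has a single number in the chart, 241, but UTF-8 stores it as two bytes, 195 and 177.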
