
Week 3 Unicode and Windows Architecture

Unicode is a 16-bit character encoding standard that can represent all languages worldwide. It extends ASCII by using 16 bits per character rather than 8. Windows uses Unicode as the default character encoding. A window in Windows is an object that processes messages. It is based on a window class, which identifies the window procedure that handles messages for that type of window, allowing multiple windows to use the same message handling code. Common window types include application windows, dialog boxes, and child window controls like buttons.

Uploaded by

fatah.ozil

Computer System and Multimedia (Week 3): Unicode and Windows Architecture
Textbook: Programming Windows / Charles Petzold. -- 5th ed.
Chapter 2. An Introduction to Unicode

Unicode
• Some aspects of C are not encountered in conventional character-mode
programming but play a part in Microsoft Windows. Wide-character sets
and Unicode certainly qualify in that respect.
• Unicode is an extension of ASCII character encoding. Rather than the 7 bits
used to represent each character in strict ASCII, or the 8 bits per character
that have become common on computers, Unicode as used by Windows stores
text in 16-bit units (UTF-16). Most characters take one 16-bit unit; less
common ones take two.
• This allows Unicode to represent all the letters, ideographs, and other
symbols used in all the written languages of the world that are likely to be
used in computer communication.
• Unicode was initially intended to supplement ASCII.
• In this course we will use the compiler's default character setting, which is
Unicode, but we will still have to deal with ASCII characters received from
standard sources, e.g. from files or the network, which must be converted
before they are passed to Unicode processing functions.
The char (ANSI) Data Type (1)
• The following statement defines and initializes a variable
containing a single character:
char c = 'A' ;
The variable c requires 1 byte of storage and will be initialized with
the hexadecimal value 0x41, which is the ASCII code for the letter A.
• You can define a pointer to a character string like so:
char * p ;
In a 32-bit Windows build, the pointer variable p requires 4 bytes of
storage (8 bytes in a 64-bit build).
• You can also initialize a pointer to a character string:
char * p = "Hello!" ;
The variable p is still only a pointer and has the same size as before. The
character string is stored in static memory and uses 7 bytes of
storage: the 6 bytes of the string plus a terminating 0.
The char (ANSI) Data Type (2)
• You can also define an array of characters, like this:
char a[10] ;
In this case, the compiler reserves 10 bytes of storage for the
array.
The expression sizeof (a) will return 10.
• If the array is global (that is, defined outside any
function), you can initialize an array of characters by
using a statement like so:
char a[] = "Hello!" ;
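• If you define this array as a local variable to a function,
you can define it as a static variable so that it is initialized
only once, as follows:
static char a[] = "Hello!" ;
In either case, the string is stored in static program memory with
a 0 appended at the end, thus requiring 7 bytes of storage.
(Standard C also accepts a non-static local array with an initializer;
its 7 bytes are then filled in each time the function is entered.)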
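The difference between the size of a pointer and the size of an array can be checked with a short sketch. The function and variable names below are hypothetical and chosen only for illustration; the pointer size depends on whether the build is 32-bit or 64-bit.

```c
#include <stddef.h>
#include <string.h>

/* A pointer to a string literal: sizeof gives the pointer size, not the text size. */
static const char *greeting_ptr = "Hello!";

/* An array initialized from a literal: sizeof counts the 6 letters plus the 0. */
static const char greeting_arr[] = "Hello!";

/* Bytes occupied by the array, including the terminating 0. */
size_t array_bytes(void) { return sizeof greeting_arr; }

/* Bytes occupied by the pointer variable itself (4 or 8, depending on the build). */
size_t pointer_bytes(void) { return sizeof greeting_ptr; }

/* Number of characters before the terminating 0. */
size_t text_length(void) { return strlen(greeting_ptr); }
```

Here array_bytes() gives 7 and text_length() gives 6, while pointer_bytes() only reports the size of the pointer and says nothing about the length of the text.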
Unicode (Wide Character) (1)
• The use of Unicode, or wide characters, does not alter the meaning of the
char data type in C.
• The char continues to indicate 1 byte of storage, and sizeof
(char) continues to return 1.
• In theory, a byte in C can be greater than 8 bits, but for most of
us, a byte (and hence a char) is 8 bits wide.
• Wide characters in C are based on the wchar_t data type,
which is defined in several header files, including WCHAR.H,
like so:
typedef unsigned short wchar_t ;
• Thus, the wchar_t data type is the same as an unsigned short
integer: 16 bits wide.
Unicode (Wide Character) (2)
• To define a variable containing a single wide character, use the
following statement:
wchar_t c = 'A' ;
• The variable c is the two-byte value 0x0041, which is the
Unicode representation of the letter A.
(However, because Intel microprocessors store multibyte values with the least-
significant bytes first (little endian), the bytes are actually stored in memory in the
sequence 0x41, 0x00. Keep this in mind if you examine memory storage of
Unicode text.)
• You can also define an initialized pointer to a wide-character
string:
wchar_t * p = L"Hello!" ;
Notice the capital L (for long) immediately preceding the first quotation
mark. It tells the compiler that the string is to be stored with wide
characters, that is, with every character occupying 2 bytes.
The pointer variable p requires the usual pointer size (4 bytes in a 32-bit
build), but the character string requires 14 bytes: 2 bytes for each
character plus 2 bytes of zeros at the end.
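The byte order remark above can be checked with a small sketch. It uses uint16_t as a stand-in for the 16-bit wchar_t of Windows (an assumption made so the code behaves the same on systems where wchar_t is wider), and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Copy the two bytes of a 16-bit code unit into out[0], out[1] in memory order. */
void code_unit_bytes(uint16_t unit, unsigned char out[2])
{
    memcpy(out, &unit, sizeof unit);
}

/* Returns 1 on a little-endian host (low byte stored first), 0 otherwise. */
int host_is_little_endian(void)
{
    unsigned char b[2];
    code_unit_bytes(0x0041, b);
    return b[0] == 0x41;
}
```

On an Intel machine, code_unit_bytes(0x0041, b) leaves 0x41 in b[0] and 0x00 in b[1], which is the order you would see in a memory dump of Unicode text.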
Unicode (Wide Character) (3)
• Similarly, you can define an array of wide characters this
way:
static wchar_t a[] = L"Hello!" ;
The string again requires 14 bytes of storage, and sizeof (a) will
return 14.
You can index the a array to get at the individual characters.
The value a[1] is the wide character 'e', or 0x0065.
Only with that L does the compiler know you want the string to be
stored with 2 bytes per character. Later on, when we look at
wide-character strings in places other than variable definitions,
you'll encounter the L preceding the first quotation mark again.
• You can also use the L prefix in front of single character
literals, as shown here, to indicate that they should be
interpreted as wide characters.
wchar_t c = L'A' ;
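A minimal sketch of the wide-character array follows. The names are hypothetical, and the byte count is expressed through sizeof (wchar_t) because wchar_t is 2 bytes with the Windows compilers but may be 4 bytes elsewhere.

```c
#include <stddef.h>
#include <wchar.h>

/* Seven elements: six letters plus the terminating wide zero. */
static const wchar_t wide_greeting[] = L"Hello!";

/* Number of wchar_t elements in the array, including the terminator. */
size_t wide_element_count(void)
{
    return sizeof wide_greeting / sizeof wide_greeting[0];
}

/* Total bytes of storage: 14 when wchar_t is 2 bytes wide, as on Windows. */
size_t wide_byte_count(void)
{
    return sizeof wide_greeting;
}

/* Indexing works exactly as with a char array; element 1 is L'e'. */
wchar_t second_character(void)
{
    return wide_greeting[1];
}
```

Dividing sizeof the array by sizeof one element is the usual way to get a character count that does not depend on the width of wchar_t.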
Library Functions
To find the length of a string
• char (ANSI)
char * pc = "Hello!" ;
iLength = strlen (pc) ;
strlen returns 6, the number of characters in the string (the
terminating 0 is not counted), and that value is stored in iLength.
• Unicode
wchar_t * pw = L"Hello!" ;
iLength = wcslen (pw) ;
wcslen also returns 6: the length is counted in characters, not
bytes, so it is the same for the wide string.
Maintaining a Single Source
• To declare a character or string, use
TCHAR
which resolves to char or wchar_t (Unicode).
• To get the length of a character string, use
_tcslen
which resolves to strlen or wcslen (Unicode).
• The compiler decides whether to use Unicode or char
(a.k.a. multibyte characters) from the Project Properties |
Configuration settings, where the default is Unicode
characters.
• In this course we will not change the default properties.
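The mechanism behind TCHAR and _tcslen can be sketched with ordinary preprocessor macros. The real definitions live in the Windows header TCHAR.H, together with a literal macro such as _T("..."); the names below (MY_UNICODE, MY_TCHAR, my_tcslen, MY_T) are hypothetical stand-ins so that the sketch compiles without the Windows headers.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for TCHAR, _tcslen and the _T("...") literal macro.
   Defining MY_UNICODE before this point selects the wide versions. */
#ifdef MY_UNICODE
typedef wchar_t MY_TCHAR;
#define my_tcslen wcslen
#define MY_T(s) L##s
#else
typedef char MY_TCHAR;
#define my_tcslen strlen
#define MY_T(s) s
#endif

/* The same source line compiles for either character type. */
static const MY_TCHAR title[] = MY_T("Hello!");

/* Length in characters; 6 in both configurations. */
size_t title_length(void) { return my_tcslen(title); }

/* Width of one character; 1 without MY_UNICODE, sizeof (wchar_t) with it. */
size_t title_char_size(void) { return sizeof(MY_TCHAR); }
```

Only the definition of MY_UNICODE changes between the two builds, which is the point of keeping a single source.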
Pointer data types (char)
• Defined in WINNT.H
– six data types you can use as pointers to 8-bit
character strings
typedef CHAR * PCHAR, * LPCH, * PCH, * NPSTR, * LPSTR, * PSTR ;
– four data types you can use as pointers to const 8-bit
character strings
typedef CONST CHAR * LPCCH, * PCCH, * LPCSTR, * PCSTR ;

Pointer data types (wide character)
• Defined in WINNT.H
– six data types you can use as pointers to 16-bit
character strings
typedef WCHAR * PWCHAR, * LPWCH, * PWCH, * NWPSTR, * LPWSTR, * PWSTR ;
– four data types you can use as pointers to const
16-bit character strings
typedef CONST WCHAR * LPCWCH, * PCWCH, * LPCWSTR, * PCWSTR ;

The Windows Function Calls (1)
• In WINUSER.H
– MessageBox identifier is defined
#ifdef UNICODE
#define MessageBox MessageBoxW
#else
#define MessageBox MessageBoxA
#endif

• Thus, all the MessageBox function calls that appear in
your program will actually be MessageBoxW calls,
since by default the compiler is set to Unicode characters.
• However, you can still call MessageBoxA explicitly
in a Unicode-configured project.
The Windows Function Calls (2)
• In WINUSER.H
– SetWindowText identifier is defined
BOOL WINAPI SetWindowTextA ( HWND hWnd, LPCSTR lpString );
BOOL WINAPI SetWindowTextW ( HWND hWnd, LPCWSTR lpString );
#ifdef UNICODE
#define SetWindowText SetWindowTextW
#else
#define SetWindowText SetWindowTextA
#endif

• If the source data is ANSI (char) text, you will
need to call SetWindowTextA explicitly in a
Unicode-configured project.
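An alternative to calling the A version is to convert the char text to wide characters first and then use the Unicode function. The sketch below uses the standard C function mbstowcs so that it is not tied to the Windows headers; in a Windows program the Win32 function MultiByteToWideChar is the usual choice. The helper name narrow_to_wide is hypothetical, and the conversion follows the current C locale, so plain ASCII input is assumed here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Convert a narrow (char) string to wide characters.
   Returns the number of wide characters written, not counting the
   terminating 0, or (size_t)-1 if the input is invalid or does not fit
   in dst together with its terminator. */
size_t narrow_to_wide(const char *src, wchar_t *dst, size_t dst_count)
{
    size_t n;

    if (dst_count == 0)
        return (size_t)-1;

    n = mbstowcs(dst, src, dst_count);
    if (n == (size_t)-1)
        return n;              /* invalid byte sequence in src */
    if (n == dst_count)
        return (size_t)-1;     /* buffer filled, no room for the terminator */
    return n;
}
```

After a successful call, the buffer could be handed to a W function such as SetWindowTextW in place of the original char string.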
Computer System and Multimedia (Week 2)
Textbook: Programming Windows / Charles Petzold. -- 5th ed.
Chapter 3. Windows and Messages

Window Architectural Overview (1)
• Windows programming is a type of object-oriented programming in which
the main object is the window
• Types of windows:
– application windows
• contain a title bar that shows the program's name, a menu, and perhaps a toolbar and a
scroll bar.
– dialog boxes
• may or may not have a title bar
– controls: push buttons, radio buttons, check boxes, list boxes, scroll bars, and
text-entry fields that adorn the surfaces of dialog boxes.
• called "child windows" or "control windows" or "child window controls."
• A window is always created based on a "window class."
• The window class identifies the window procedure that processes
messages to the window. The use of a window class allows multiple
windows to be based on the same window class and hence use the same
window procedure.
Window Architectural Overview (2)
• For example, all buttons in all Windows programs are based on the
same window class.
• This window class is associated with a window procedure located in
a Windows dynamic-link library that processes messages to all the
button windows.
• The user sees these windows as objects on the screen and interacts
directly with them using the keyboard or the mouse.
• The window receives the user input in the form of "messages" to
the window.
• A window also uses messages to communicate with other windows.
• Very often these messages inform a window of user input from the
keyboard or the mouse.

Window Architectural Overview (3)
• Every window that a program creates has an
associated window procedure.
• This window procedure is a function that could be
either in the program itself or in a dynamic-link
library.
• Windows sends a message to a window by calling
the window procedure.
• The window procedure does some processing
based on the message and then returns control to
Windows.
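The relationship between a window class, its window procedure, and messages can be modelled in a few lines of plain C. This is only a sketch of the idea and not the Win32 API: every name in it (WindowClass, Window, WindowProc, MSG_CLICK, send_message, make_button) is hypothetical.

```c
#include <stddef.h>

/* All names here are hypothetical; this models the idea, not the Win32 API. */
struct Window;
typedef long (*WindowProc)(struct Window *w, unsigned msg, long param);

/* A window class names the procedure that handles its windows' messages. */
struct WindowClass { const char *name; WindowProc proc; };

/* Each window points at its class and keeps its own state. */
struct Window { const struct WindowClass *cls; long click_count; };

enum { MSG_CLICK = 1, MSG_GET_CLICKS = 2 };

/* One procedure handles messages for every window of the "button" class. */
static long button_proc(struct Window *w, unsigned msg, long param)
{
    (void)param;
    switch (msg) {
    case MSG_CLICK:      return ++w->click_count;
    case MSG_GET_CLICKS: return w->click_count;
    default:             return 0;  /* message not handled */
    }
}

static const struct WindowClass button_class = { "button", button_proc };

/* Sending a message means calling the procedure of the window's class;
   the procedure does its processing and then returns control. */
long send_message(struct Window *w, unsigned msg, long param)
{
    return w->cls->proc(w, msg, param);
}

/* Create a window based on the shared button class. */
struct Window make_button(void)
{
    struct Window w = { &button_class, 0 };
    return w;
}
```

Two buttons created with make_button() share button_proc but keep separate click counts, which mirrors how all button windows share one window procedure while remaining distinct windows.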
Components of a Typical Application Window

Writing a Windows Program
• Generally, Windows programmers begin a new
program by copying an existing program and
making appropriate changes to it.
• HELLOWIN.CPP is an example program that
we will use to understand how to create a
basic Windows program.

Windows Programming Example
• Write the program "HELLOWIN.CPP" in a Windows project named
"Petzold".
• Build and run the program. You will not yet hear the sound that you should.
• Search for the file "Petzold.exe" and open the folder where it is located.
• Search for the file "hellowin.wav" and open the folder where it is
located.
• Copy "hellowin.wav" to the folder where "Petzold.exe" is located.
• Now run the program again and you will hear the sound.
• This program creates a normal application window and displays "Hello,
Windows!" at the centre of that window. If you have a sound board
installed, you will also hear the same greeting spoken.