ASCII Docx1

ASCII, or American Standard Code for Information Interchange, is a character encoding standard that represents 128 English characters as numbers from 0 to 127. It is widely used for text representation in computers, allowing for data transfer between different systems. While ASCII is limited to basic English characters, extended versions exist to accommodate additional symbols and characters from other languages.

Uploaded by

Legesse Samuel

ASCII

Pronounced ask-ee, ASCII is the acronym for the American Standard Code for Information Interchange. It is a code for representing
128 English characters as numbers, with each character assigned a number from 0 to 127. For example, the ASCII code for uppercase M is
77. Most computers use ASCII codes to represent text, which makes it possible to transfer data from one computer to another.
Text files stored in ASCII format are sometimes called ASCII files. Text editors and word processors are usually capable of storing
data in ASCII format, although ASCII format is not always the default storage format. Most data files, particularly if they contain
numeric data, are not stored in ASCII format. Executable programs are never stored in ASCII format.
The standard ASCII Character Set
The standard ASCII character set uses just 7 bits for each character. There are several larger character sets that use 8 bits, which gives
them 128 additional characters. The extra characters are used to represent non-English characters, graphics symbols, and mathematical
symbols.
Several companies and organizations have proposed extensions for these 128 characters. The DOS operating system uses a superset of
ASCII called extended ASCII or high ASCII. A more universal standard is the ISO Latin 1 set of characters, which is used by many
operating systems, as well as Web browsers.
Another set of codes that is used on large IBM computers is EBCDIC.

What is an ASCII Code


ASCII is an acronym for American Standard Code for Information Interchange.
It is a code that uses numbers to represent characters. Each character is assigned a number between 0 and 127. Uppercase and lowercase
characters are assigned different numbers. For example, the character A is assigned the decimal number 65, while a is assigned decimal
97, as shown below in the ASCII table.

The ASCII code predates the Internet and has been around since the days of teletypes and mechanical printers. ASCII decimal
numbers from 0 to 31 represent control codes, most of which are rarely used these days. However, if you are working with
communications protocols you will see these control codes in use. The ASCII Control Codes table explains what each of these control
codes is.
When is ASCII code used
When a computer sends data, the keys you press or the text you send and receive are sent as a sequence of numbers. These numbers
represent the characters you typed or generated. Because the range of standard ASCII is 0 to 127, each character requires only 7 bits,
which fits within 1 byte of data. For example, sending the string cactus.io as ASCII translates to 99 97 99 116 117 115 46 105 111.
Microprocessors only understand bits and bytes. To them, everything is a sequence of bits.
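The translation of cactus.io described above can be reproduced with a short Python sketch. The function name to_ascii_codes is only an illustrative choice, not something taken from the original text.

```python
def to_ascii_codes(text: str) -> list[int]:
    """Return the decimal ASCII code of each character in text."""
    # str.encode("ascii") raises UnicodeEncodeError for any character
    # outside the 0-127 range, so only standard ASCII gets through.
    return list(text.encode("ascii"))


codes = to_ascii_codes("cactus.io")
print(" ".join(str(c) for c in codes))  # 99 97 99 116 117 115 46 105 111
print(all(c < 128 for c in codes))      # True: every code fits in 7 bits
```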

What is the difference between an ASCII code and an HTML code


The original ASCII code had a range of only 128 characters, which is very limited. It basically supports only
the English character set. You could use the extended ASCII characters, which range from 128 to 255. Because the extended ASCII code
range is 0 to 255, each character still fits inside 1 byte of data.

The HTML code is based on different character sets, which can range from a single-byte character set such as Latin-1 (ISO-8859-1)
to UTF-8, which uses one or more bytes to represent a character. Using a character set such as UTF-8 gives us a much larger range of
characters.

When using a web browser, the web site we are using would normally specify the character set it is using. For example, in an HTML5
web page you might see the string <meta charset="utf-8"> in the page source. This tells the browser that the data being sent utilises
the UTF-8 character table.
The HTML code is usually in the format of &#169;. The & tells the browser that it is an HTML code and not part of a string. The #
after the & tells the browser that what follows is the numerical value of a symbol. The ; tells the browser that this is the end of the
code. In the case of &#169;, this is the HTML code that represents the copyright symbol ©.
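As a rough illustration of the & # ; format described above, the following Python sketch builds and decodes a numeric HTML code. The helper names are hypothetical; html.unescape is part of the Python standard library.

```python
import html


def to_numeric_reference(ch: str) -> str:
    """Build the &#NNN; form from a single character's code number."""
    return f"&#{ord(ch)};"


def from_numeric_reference(ref: str) -> str:
    """Decode an HTML code such as &#169; back into its character."""
    return html.unescape(ref)


print(to_numeric_reference("©"))         # &#169;
print(from_numeric_reference("&#169;"))  # ©
```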

Where would I use ASCII codes or HTML codes


You would use ASCII codes for all normal programming and communications when using your Arduino, Raspberry Pi or whatever
platform is in use. The only time you would use HTML codes is when you are communicating with a web browser.
ASCII Table
Ascii Hex Symbol
0 0 NUL
1 1 SOH
2 2 STX
3 3 ETX
4 4 EOT
5 5 ENQ
6 6 ACK
7 7 BEL
8 8 BS
9 9 TAB
10 A LF
11 B VT
12 C FF
13 D CR
14 E SO
15 F SI

Ascii Hex Symbol

16 10 DLE
17 11 DC1
18 12 DC2
19 13 DC3
20 14 DC4
21 15 NAK
22 16 SYN
23 17 ETB
24 18 CAN
25 19 EM
26 1A SUB
27 1B ESC
28 1C FS
29 1D GS
30 1E RS
31 1F US
Ascii Hex Symbol
32 20 (Space)
33 21 !
34 22 "
35 23 #
36 24 $
37 25 %
38 26 &
39 27 '
40 28 (
41 29 )
42 2A *
43 2B +
44 2C ,
45 2D -
46 2E .
47 2F /

Ascii Hex Symbol


48 30 0
49 31 1
50 32 2
51 33 3
52 34 4
53 35 5
54 36 6
55 37 7
56 38 8
57 39 9
58 3A :
59 3B ;
60 3C <
61 3D =
62 3E >
63 3F ?

Ascii Hex Symbol


64 40 @
65 41 A
66 42 B
67 43 C
68 44 D
69 45 E
70 46 F
71 47 G
72 48 H
73 49 I
74 4A J
75 4B K
76 4C L
77 4D M
78 4E N
79 4F O

Ascii Hex Symbol


80 50 P
81 51 Q
82 52 R
83 53 S
84 54 T
85 55 U
86 56 V
87 57 W
88 58 X
89 59 Y
90 5A Z
91 5B [
92 5C \
93 5D ]
94 5E ^
95 5F _

Ascii Hex Symbol


96 60 `
97 61 a
98 62 b
99 63 c
100 64 d
101 65 e
102 66 f
103 67 g
104 68 h
105 69 i
106 6A j
107 6B k
108 6C l
109 6D m
110 6E n
111 6F o

Ascii Hex Symbol


112 70 p
113 71 q
114 72 r
115 73 s
116 74 t
117 75 u
118 76 v
119 77 w
120 78 x
121 79 y
122 7A z
123 7B {
124 7C |
125 7D }
126 7E ~
127 7F DEL
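
The decimal, hex and symbol columns of the table can be regenerated for the printable range with a few lines of Python. This is only a sketch; ascii_row is a hypothetical helper name, and the (Space) label simply follows the table's own convention.

```python
def ascii_row(code: int) -> str:
    """Format one table row (decimal, hex, symbol) for a printable ASCII code."""
    if not 32 <= code <= 126:
        raise ValueError("only printable ASCII codes 32-126 are handled here")
    symbol = "(Space)" if code == 32 else chr(code)
    # :X formats the number as uppercase hexadecimal, as in the table
    return f"{code} {code:X} {symbol}"


print("Ascii Hex Symbol")
for code in range(32, 127):
    print(ascii_row(code))
```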

ASCII stands for American Standard Code for Information Interchange (pronounced 'as-key'). It is a standard set of characters
understood by virtually all computers, consisting mostly of letters and numbers plus a few basic symbols such as $ and %. It employs the
128 possible 7-bit integers to encode the 52 uppercase and lowercase letters of the Roman alphabet and the 10 numeric digits, plus
punctuation characters and some other symbols. The fact that almost everyone agrees on ASCII makes it relatively easy to
exchange information between different programs, different operating systems, and even different computers.
It also means you can easily print basic text and numbers on any printer, with the notable exception of PostScript printers. If you are
working in the MacWrite word processing application on the Mac and you need to send your file to someone who uses WordStar on
the PC, you can save the document as an ASCII file (which is the same as text-only). After you transfer the file to the PC (on a disk or
via a cable or modem), the other person will be able to open the file in WordStar.
In ASCII, each character has a number which the computer or printer uses to represent that character. For instance, a capital A is
number 65 in the code. Although an 8-bit byte allows 256 possible values, ASCII standardizes only 128 characters, and the first
32 of these are "control characters," which are supposed to be used to control the computer and don't appear on the screen. That leaves
only enough code numbers for all the capital and lowercase letters, the digits, and the most common punctuation marks.
Another ASCII limitation is that the code doesn't include any information about the way the text should look (its format). ASCII only
tells you which characters the text contains. If you save a formatted document as ASCII, you will lose all the font formatting, such as
the typeface changes, the italics, the bolds, and even the special characters like ©, ™, or ®. Usually carriage returns and tabs are
saved.
Unlike some earlier character encodings that used fewer than 7 bits, ASCII does have room for both the uppercase and lowercase
letters and all normal punctuation characters but, as it was designed to encode American English it does not include the accented
characters and ligatures required by many European languages (nor the UK pound sign £). These characters are provided in some 8-bit
EXTENDED ASCII character sets, including ISO LATIN 1 or ANSI 1, but not all software can display 8-bit characters, and some
serial communications channels still remove the eighth bit from each character. Despite its shortcomings, ASCII is still important as
the 'lowest common denominator' for representing textual data, which almost any computer in the world can display.
The ASCII standard was certified by ANSI in 1977, and the ISO adopted an almost identical code as ISO 646.



Why is ASCII important?

ASCII is important because it is our link between our computer screen and our computer hard drive, and that link is now the same
between all computers.

What is ASCII used for?

ASCII is used to translate computer text to human text.

All computers speak in binary, a series of 0s and 1s. However, just as English and Spanish can use the same alphabet but have
completely different words for similar objects, computers also had their own versions of languages. ASCII is used as a method to give
all computers the same language, allowing them to share documents and files.

ASCII is important because the development gave computers a common language.

What does ASCII stand for?

ASCII is an acronym that stands for American Standard Code for Information Interchange.
What are ASCII tables used for?

ASCII tables are well known in computer circles because they are the Babel fish that works between computer hard drives and
humans.

The Babel fish, if you don’t know, is a fish from The Hitchhiker’s Guide to the Galaxy that can be put in your ear to translate alien languages.
Having the common tables in ASCII was important for computers to be able to talk to each other.
Hard drives store information on magnets (or transistors), that only have two states, on and off. ASCII tables are how we go from a set
of eight 0s and 1s (or a byte of data) to the letter “a” or “A”, or the number “4”. The tables are commonly used across all computer
systems, which allows my computer to read word documents written on your computer, even if I use a PC and you use a Mac – and
no, it was not always like that! The tables include the ASCII alphabet, ASCII binary, ASCII symbols and more!
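
The step from a set of eight 0s and 1s to a letter can be sketched in Python as follows. The function names are illustrative only.

```python
def bits_to_char(bits: str) -> str:
    """Turn a string of eight 0s and 1s into the character it encodes."""
    return chr(int(bits, 2))  # base-2 string -> number -> character


def char_to_bits(ch: str) -> str:
    """Turn one character into its eight-bit binary form."""
    return format(ord(ch), "08b")  # pad with leading zeros to 8 bits


print(bits_to_char("01100001"))  # a
print(bits_to_char("01000001"))  # A
print(char_to_bits("4"))         # 00110100
```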

What are activities that include ASCII and binary?

We just answered why is ASCII important, and what is ASCII used for. If you want to learn more about ASCII tables and the binary
language we have two downloadable projects for you – stamping your initials in binary and writing I Love You in binary. Both
projects have an ASCII alphabet table to convert the binary to letters.
Ascii control codes (control characters, C0 controls)
The following document lists the control codes (control characters) in Ascii and in newer character code
standards like Unicode, which generally try to be compatible with Ascii in the Ascii code range (positions 0
through 127).

code pos. Unicode Description in C0 of ISO 646

dec. hex. abbr. name

0 0 NUL NULL A control character used to accomplish media-fill or time-fill. Null characters may be inserted into or removed from a stream of data without affecting the information content of that stream. But then the addition or removal of these characters may affect the information layout and/or the control of equipment.

ctl-A 1 1 SOH START OF HEADING A transmission control character used as the first character of a heading of an information message.

ctl-B 2 2 STX START OF TEXT A transmission control character which precedes a text and which is used to terminate a heading.

ctl-C 3 3 ETX END OF TEXT A transmission control character which terminates a text.

ctl-D 4 4 EOT END OF TRANSMISSION A transmission control character used to indicate the conclusion of the transmission of one or more texts.

ctl-E 5 5 ENQ ENQUIRY A transmission control character used as a request for a response from a remote station; the response may include station identification and/or station status. When a "Who are you" function is required on the general switched transmission network, the first use of ENQ after the connection is established shall have the meaning "Who are you" (station identification). Subsequent use of ENQ may, or may not, include the function "Who are you", as determined by agreement.

ctl-F 6 6 ACK ACKNOWLEDGE A transmission control character transmitted by a receiver as an affirmative response to the sender.

ctl-G 7 7 BEL BELL A control character that is used when there is a need to call for attention; it may control alarm or attention devices.

ctl-H 8 8 BS BACKSPACE A format effector which moves the active position one character position backwards on the same line.

ctl-I 9 9 HT HORIZONTAL TABULATION A format effector which advances the active position to the next pre-determined character position on the same line.

ctl-J 10 A LF LINE FEED A format effector which advances the active position to the same character position of the next line.

ctl-K 11 B VT VERTICAL TABULATION A format effector which advances the active position to the same character position on the next pre-determined line.

ctl-L 12 C FF FORM FEED A format effector which advances the active position to the same character position on a pre-determined line of the next form or page.

ctl-M 13 D CR CARRIAGE RETURN A format effector which moves the active position to the first character position on the same line.

ctl-N 14 E SO SHIFT OUT A control character which is used in conjunction with SHIFT IN and ESCAPE to extend the graphic character set of the code. It may alter the meaning of octets 33 - 126 (dec.). The effect of this character when using code extension techniques is described in International Standard ISO 2022.

ctl-O 15 F SI SHIFT IN A control character which is used in conjunction with SHIFT OUT and ESCAPE to extend the graphic character set of the code. It may reinstate the standard meanings of the octets which follow it. The effect of this character when using code extension techniques is described in International Standard ISO 2022.

ctl-P 16 10 DLE DATA LINK ESCAPE A transmission control character which will change the meaning of a limited number of contiguously following characters. It is used exclusively to provide supplementary data transmission control functions. Only graphic characters and transmission control characters can be used in DLE sequences.

ctl-Q 17 11 DC1 DEVICE CONTROL ONE A device control character which is primarily intended for turning on or starting an ancillary device. If it is not required for this purpose, it may be used to restore a device to the basic mode of operation (see also DC2 and DC3), or for any other device control function not provided by other DCs.

ctl-R 18 12 DC2 DEVICE CONTROL TWO A device control character which is primarily intended for turning on or starting an ancillary device. If it is not required for this purpose, it may be used to set a device to a special mode of operation (in which case DC1 is used to restore normal operation), or for any other device control function not provided by other DCs.

ctl-S 19 13 DC3 DEVICE CONTROL THREE A device control character which is primarily intended for turning off or stopping an ancillary device. This function may be a secondary level stop, for example, wait, pause, stand-by or halt (in which case DC1 is used to restore normal operation). If it is not required for this purpose, it may be used for any other device control function not provided by other DCs.

ctl-T 20 14 DC4 DEVICE CONTROL FOUR A device control character which is primarily intended for turning off, stopping or interrupting an ancillary device. If it is not required for this purpose, it may be used for any other device control function not provided by other DCs.

ctl-U 21 15 NAK NEGATIVE ACKNOWLEDGE A transmission control character transmitted by a receiver as a negative response to the sender.

ctl-V 22 16 SYN SYNCHRONOUS IDLE A transmission control character used by a synchronous transmission system in the absence of any other character (idle condition) to provide a signal from which synchronism may be achieved or retained between data terminal equipment.

ctl-W 23 17 ETB END OF TRANSMISSION BLOCK A transmission control character used to indicate the end of a transmission block of data where data is divided into such blocks for transmission purposes.

ctl-X 24 18 CAN CANCEL A character, or the first character of a sequence, indicating that the data preceding it is in error. As a result, this data is to be ignored. The specific meaning of this character must be defined for each application and/or between sender and recipient.

ctl-Y 25 19 EM END OF MEDIUM A control character that may be used to identify the physical end of a medium, or the end of the used portion of a medium, or the end of the wanted portion of data recorded on a medium. The position of this character does not necessarily correspond to the physical end of the medium.

ctl-Z 26 1A SUB SUBSTITUTE A control character used in the place of a character that has been found to be invalid or in error. SUB is intended to be introduced by automatic means.

ctl-[ 27 1B ESC ESCAPE A control character which is used to provide additional control functions. It alters the meaning of a limited number of contiguously following bit combinations. The use of this character is specified in International Standard ISO 2022.

ctl-\ 28 1C FS FILE SEPARATOR A control character used to separate and qualify data logically; its specific meaning has to be specified for each application. If this character is used in hierarchical order, it delimits a data item called a file.

ctl-] 29 1D GS GROUP SEPARATOR A control character used to separate and qualify data logically; its specific meaning has to be specified for each application. If this character is used in hierarchical order, it delimits a data item called a group.

ctl-^ 30 1E RS RECORD SEPARATOR A control character used to separate and qualify data logically; its specific meaning has to be specified for each application. If this character is used in hierarchical order, it delimits a data item called a record.

ctl-_ 31 1F US UNIT SEPARATOR A control character used to separate and qualify data logically; its specific meaning has to be specified for each application. If this character is used in hierarchical order, it delimits a data item called a unit.

127 7F DEL DELETE (not defined)

Notes:

• The first column shows the widely used "control-something" name used for control codes. It relates to
the fact that on a keyboard, it is often possible to generate a control code using the control (Ctrl, Ctl)
key and a normal key.
• The column C0 of ISO 646 quotes the definition in that document, with typos fixed, and with references
to characters and code positions changed to use Unicode names and modern terms.
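
The "control-something" names in the first column follow a simple pattern: the control code is the code of the named key with its upper bits cleared. The Python sketch below illustrates this under the assumption of the common terminal convention (Ctrl plus a key in the range @ to _); control_code is a hypothetical helper name, and real keyboards and terminals may behave differently.

```python
def control_code(key: str) -> int:
    """Return the control code conventionally produced by Ctrl + key."""
    code = ord(key.upper())  # letters map the same way in either case
    if not 0x40 <= code <= 0x5F:
        raise ValueError("key must be in the range '@' to '_'")
    # Keeping only the low five bits maps 64-95 onto 0-31
    return code & 0x1F


print(control_code("G"))  # 7  (BEL)
print(control_code("["))  # 27 (ESC)
```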

Historical table
The following table lists the original names of Ascii control codes as defined in 1963.

code pos. Ascii 1963

dec. hex. abbr. name

0 0 NULL Null/Idle

1 1 SOM Start of message

2 2 EOA End of address

3 3 EOM End of message

4 4 EOT End of transmission

5 5 WRU "Who are you...?"

6 6 RU "Are you...?"

7 7 BELL Audible signal



8 8 FE0 Format effector

9 9 HT/SK Horizontal tabulation/ Skip (punched card)

10 A LF Line feed

11 B VTAB Vertical tabulation

12 C FF Form feed

13 D CR Carriage return

14 E SO Shift out

15 F SI Shift in

16 10 DC0 Device control reserved for data link escape

17 11 DC1 Device control

18 12 DC2

19 13 DC3

20 14 DC4 (STOP) Device control (stop)

21 15 ERR Error

22 16 SYNC Synchronous idle

23 17 LEM Logical end of media

24 18 S0 Separator (information)

25 19 S1

26 1A S2

27 1B S3

28 1C S4

29 1D S5

30 1E S6

31 1F S7

127 7F DEL Delete/idle

Ascii 1963 assigned code position 126 to the ESC code. Later ESC was moved to position 27, and position
126 was assigned to tilde (~). Similarly ACK was moved from 124 to 6, making room for vertical
line (vertical bar, |).

Converting Control Codes to ASCII, Decimal and Hexadecimal


Ctrl ASCII Dec Hex Meaning
^@ NUL 000 00 Null character
^A SOH 001 01 Start of Header
^B STX 002 02 Start of Text
^C ETX 003 03 End of Text
^D EOT 004 04 End of Transmission
^E ENQ 005 05 Enquiry
^F ACK 006 06 Acknowledge
^G BEL 007 07 Bell
^H BS 008 08 Backspace
^I HT 009 09 Horizontal tab
^J LF 010 0A Line feed
^K VT 011 0B Vertical tab
^L FF 012 0C Form feed
^M CR 013 0D Carriage return
^N SO 014 0E Shift out
^O SI 015 0F Shift in
^P DLE 016 10 Data link escape
^Q DC1 017 11 Xon (transmit on)
^R DC2 018 12 Device control 2
^S DC3 019 13 Xoff (transmit off)
^T DC4 020 14 Device control 4
^U NAK 021 15 Negative acknowledge
^V SYN 022 16 Synchronous idle
^W ETB 023 17 End of transmission block
^X CAN 024 18 Cancel
^Y EM 025 19 End of medium
^Z SUB 026 1A Substitute
^[ ESC 027 1B Escape
^\ FS 028 1C File separator
^] GS 029 1D Group separator
^^ RS 030 1E Record separator
^_ US 031 1F Unit separator
SP 032 20 Space
The ^ symbol stands for Ctrl
Originally the Control codes were used for instruments like teleprinters, so many of the codes are now obsolete.

Groups of control characters


For the purposes of this document, control characters are divided into three groups.

1. ASCII control characters. The ASCII control character area covers code positions 0–31 (hex
00–1F). This area is also called the C0 set. Two additional controls appear at 32 and 127 (hex 20
and 7F). The ASCII control characters cover a wide range of uses, such as text layout, transmission
and device control, and more.

2. C1 control characters. C1 covers positions 128–159 (hex 80–9F). C1 is primarily for displays
and printers. This set is related to ANSI escape sequences and VT100.
3. ISO 8859 special characters. Two special characters, NBSP and SHY, are from ISO 8859.
They are also used in Windows and Unicode. They appear at 160 and 173 (hex A0 and AD).

Note: These control character sets are not the only control characters ever used. Other
C0 and C1 sets do exist. Alternative sets were defined for special uses. In them, a part
of the standard C0/C1 controls have been deleted or replaced by new controls. Even
totally different alternative sets exist. Alternative control characters are not discussed
in this article. One can find them in the International Register of Coded Character Sets.
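
The three groups listed above can be expressed as a small lookup on the code position. This Python sketch only restates the ranges given in the list; control_group and its return labels are illustrative names.

```python
def control_group(code: int) -> str:
    """Classify a code position into one of the three groups used here."""
    if 0 <= code <= 31:
        return "C0"
    if code in (32, 127):
        return "ASCII additional (SP/DEL)"
    if 128 <= code <= 159:
        return "C1"
    if code in (160, 173):
        return "ISO 8859 special (NBSP/SHY)"
    return "not a control character in this grouping"


for position in (0x1B, 0x20, 0x7F, 0x9B, 0xA0, 0x41):
    print(f"{position:02X}: {control_group(position)}")
```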

Control characters in standards

ASCII control characters

C0 = positions 0–31. Originates from the ASCII and ISO 646 character sets. The characters SP and DEL appear
together with C0.

The first group of control characters originates from ASCII. These characters consist of a set called
C0 and two additional characters. The C0 set is in locations 0 to 31. Two additional ASCII
characters, SP and DEL, fall outside the C0 area, but they are closely related to the C0 set. All of
these characters are defined by the same standards.

This set of control characters covers many uses. There are "Format Effectors" that control the
appearance of plain text. There are "Transmission Controls" for use with transmission protocols
and "Device Controls" to start, operate and stop auxiliary devices. There are "Information
Separators" that delimit various pieces of data. Other controls exist for producing alerts, filling
media, indicating the end of media, and for dealing with errors. There are even controls to create new
characters and controls. The C0 set was defined with perforated tape, punched cards and
typewriter-like devices in mind. Devices have changed since then, but the C0 controls have
survived.

History of ASCII control characters

The first version of ASCII was released in 1963. Like the ASCII of today, the 1963 version covered
some letters and symbols, as well as control characters. While many of those 35 control characters
were similar to those of modern ASCII, some were different. ASCII-1963 had some serious
shortcomings, such as no support for lower case letters. It quickly turned out that the standard
must be revised. Today, ASCII-1963 is practically forgotten. Since ASCII-1963 deviates a lot from
later ASCII versions in the control character area too, we will not go any deeper into it.

The next revision was ASCII-1965. This version, although formally accepted, was not published.
Another revision was going to take place. ASCII as we know it is based on the ASCII-1967 standard
(USAS X3.4-1967). This version was an important milestone. It was already very close to the
version that then became widely used.

In 1968 ASCII was slightly updated and released as USAS X3.4-1968 (later retroactively renamed ANSI
X3.4-1968). The actual updates were very small, only adding an option to use the character LF as
a "newline", and designating ASCII and USASCII as the names of the standard. (Later on, the name
USASCII was dropped, leaving ASCII as the official name.)

ASCII-1968 became immensely popular. Almost all of today's computer systems use ASCII or one
of its descendants. (A notable exception is EBCDIC used on IBM mainframes, very different from
ASCII.) The Internet is based on ASCII-1968 as well.

ASCII-1968 defined the 34 control characters that remained: the C0 set, SP and DEL. Included was
a short description of the intended functionality of each control character. These definitions also
made their way into RFC 20 word for word. Most of these definitions have remained materially
unchanged for decades. Later standards have updated the text, but the basic functionality is still
the same. So much for the standards: in practice, non-standard use is common and often contrary to
them.

When ASCII emerged, computing equipment was quite different from the equipment on which ASCII
later became popular. Computers were regularly operated through punched cards,
perforated tape and teletypewriters (TTYs). TTYs were typewriter-like devices, which were used as
interactive computer terminals. Instead of a monitor they produced output on paper. The ASCII
control characters were naturally designed considering the devices of those days. Since then, new
devices such as monitors have emerged. It hasn't always been that simple to accommodate the
control characters to the newer devices. Despite the challenges, the control characters of the
1960s are still with us.

ISO 646. ASCII evolved to an ISO standard, which is known as ISO 646. The first version came out
in 1967. ISO 646 is the "international edition" of ASCII, with a few differences. Despite the
differences, these standards were closely related. ISO 646 allowed national variants to support the
national characters required for each country. The US national variant was ASCII. Several other
national variants were released to support accented letters (à, ü and the like) and other symbols.
The ISO variants including ASCII were a common way to express text in the 1970s and 1980s.

As to the control characters, the ASCII control characters set also appeared in ISO 646. The
functionality of the control characters remained quite intact, even though the definitions were
updated.

More standards. ISO 646 was also released as ECMA-6. The control characters appear in ECMA-6
very similar to those of ISO 646.

Some of the C0 codes were further refined in other standards. SI, SO and ESC appeared as
character set extension controls in ANSI X3.41, ISO 2022 and ECMA-35. These characters became
widely used to invoke additional character sets. The Transmission Control characters (TC1 to TC10)
appeared in ISO 1745 in 1975, which gave a detailed description of where and how they should be
used. How widely ISO 1745 was actually used in transmission is another question.

Current status of ASCII control characters

ASCII was later updated in 1977 and again in 1986 to be in conformance with ISO 646. The control
characters in ASCII-1986 and ISO 646/ECMA-6 are very similar, even though minor differences do
exist.

The current ISO and ECMA versions, namely ISO 646:1991 and ECMA-6:1991, no longer define the
C0 control characters. The control characters didn't go away, however. They now appear in
ISO/IEC 6429:1992 and ECMA-48:1991, respectively. Simply put, the C0 set was lumped together
with other control characters, the C1 set, which follows below.

As to some specific control characters, the current detailed definitions of SI, SO and ESC can be
found in ANSI X3.41, ISO 2022 and ECMA-35. The current details for the Transmission Control
characters (TC1 to TC10) appear in the old ISO 1745 from 1975.

Even though the history of the various standards related to the ASCII control codes may sound
unnecessarily complicated, the standard functionality of the characters has not changed
dramatically. It's still mostly the same as back in 1967. That is the situation as far as the standards
go. The practice is totally different. Some control characters are indeed commonly used the standard way.
On the other hand, many are used contrary to the standards, or simply ignored. It's not uncommon
to find control characters forbidden in data. Control characters can have unwanted or unknown
side-effects. The easiest way for programmers to deal with them is to shut their eyes or deny such
characters altogether.
C1 control characters

C1 = positions 128–159. Primarily for displays and printers.

The C1 set appeared in the late 1970s. It is primarily designed for controlling display and printer
devices, even though some of the controls warrant other uses as well. The C1 set is intended for
use with the C0 set.

The C1 set includes "Format effectors" that control horizontal and vertical movement when
displaying or printing. There are "Presentation controls" for defining line-break behavior. There are
"Area definition" controls for form filling. There are "Introducers" and "Shift Functions" to support
extra controls and characters. Additional controls exist for sending command strings and setting
an indicator. Some of the controls were intended to cover for shortcomings in the C0 set. Some
controls were reserved: 2 controls are for private use, while 4 controls were (and still are) reserved
for future standardization.

The C1 set occupies positions 128–159 in 8-bit environments. There are also escape codes to use
the C1 set on 7-bit systems. The respective escape codes (ESC char) are given in the C1
list further below.
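
The 7-bit form can be sketched as follows. This is a minimal illustration, assuming the usual ISO 2022 / ECMA-48 convention that a C1 control at position 0x80–0x9F is written as ESC followed by the character 0x40 positions lower; the function names are hypothetical.

```python
ESC = 0x1B

def c1_to_7bit(code: int) -> bytes:
    """Return the two-byte ESC form of an 8-bit C1 control (0x80-0x9F)."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control code")
    return bytes([ESC, code - 0x40])

def c1_from_7bit(seq: bytes) -> int:
    """Return the 8-bit C1 code for a two-byte ESC sequence."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not a 7-bit C1 escape sequence")
    return seq[1] + 0x40

print(c1_to_7bit(0x9B))  # CSI becomes ESC [
print(c1_to_7bit(0x85))  # NEL becomes ESC E
```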

History of C1

In 1979 ANSI released additional controls for use with ASCII (ANSI X3.64). This came to be known
as the C1 set. A similar set was also released as ECMA-48. According to ANSI, the C1 controls were
intended for input/output control of two-dimensional character-imaging devices, including
interactive terminals of both the cathode ray tube and printer types, as well as output to microfilm
printers.

A bit later, in 1983, the C1 set was standardized as ISO 6429. Standard-wise, the C1 set has been
volatile. Both ISO 6429 and ECMA-48 were updated several times. New control characters were
added and definitions updated. One of the C1 characters (IND) was eventually deprecated and
removed.

The standards actually cover more control codes than those that fit in the C1 area. These
additional controls are used via control sequences (escape sequences). The sequences are beyond
the subject of this article. Let it suffice that the sequences are an important part of the standards
that should be used together with the C1 controls. The sequences, together with C1, are also
known as VT100 and ANSI escape sequences.

Current status of C1

The current standards for C1 are ISO/IEC 6429:1992 and ECMA-48:1991. These standards now
define both the C0 and C1 control characters.

Unicode allows the use of C1 (and C0 too). In fact, the C1 area has been entirely reserved for
control codes in Unicode. In contrast, the (somewhat outdated) DOS and Windows
codepages, i.e. character sets, have not reserved space for C1. Instead, they have included
additional graphic characters in the C1 area. This doesn't prevent the use of C1 controls on DOS
and Windows, though.

In practice, the C1 control characters are not very common. They are specialized codes for special
applications.

ISO 8859 special characters NBSP and SHY

Positions 160 and 173.


ISO 8859 is a group of 8-bit extended character sets. The sets cover various Latin characters and
also Cyrillic, Greek, Arabic, Hebrew and Thai characters. ISO 8859 is related to the Windows
character sets ("ANSI codepages"), but these are actually different from each other.

Two characters in ISO 8859 are of interest to us: Non-Breaking Space (NBSP) and Soft Hyphen
(SHY). They both have control-character-like properties, even though they are not actually called
control characters in ISO 8859.

NBSP appears in position 160 (hex A0) and SHY in position 173 (hex AD). The same positions, and
roughly the same meanings too, have been adopted by many of the Windows codepages and by Unicode.

Note: ISO 8859-8 Latin/Hebrew defines two additional special characters, namely LRM
(left-to-right mark) and RLM (right-to-left mark). These characters are not universal in
ISO 8859, but specific to Hebrew. Since LRM and RLM were not used in any other ISO
8859 character set, and since they do not appear in Unicode at the same positions,
they are not further presented in this article.

Current status of NBSP and SHY

Several current standards include NBSP and SHY. They appear at the same positions in all of the
following:

• ISO 8859-1 to 8859-16. Exception: ISO 8859-11 Latin/Thai does not include SHY.
• Windows codepages 1250–1258.
• Unicode, block U+0080 C1 Controls and Latin-1 Supplement.
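
The shared positions can be checked with Python's built-in codecs. This is only a sketch: the helper name is hypothetical, and the codec names (`iso8859_1`, `iso8859_11`, `cp1252`) are the ones Python's standard library happens to use.

```python
def nbsp_shy_positions(codec: str) -> tuple[bool, bool]:
    """Report whether bytes 0xA0 and 0xAD decode to NBSP and SHY in the codec."""
    text = bytes([0xA0, 0xAD]).decode(codec, errors="replace")
    return text[0] == "\u00a0", text[1] == "\u00ad"

for codec in ("iso8859_1", "iso8859_11", "cp1252"):
    print(codec, nbsp_shy_positions(codec))
```

The Latin/Thai set is included to show the exception: position 173 holds a Thai letter there instead of SHY.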

Control characters in Unicode


Control characters have made their way to Unicode as well. Unicode recognizes control characters
and explicitly allows their use. While Unicode doesn't obsolete control characters, it defines special
rules for just a handful of them. Let the standard speak for itself:

The Unicode Standard provides for the intact interchange of these code points, neither adding to
nor subtracting from their semantics. The semantics of the control codes are generally determined
by the application with which they are used. However, in the absence of specific application uses,
they may be interpreted according to the control function semantics specified in ISO/IEC
6429:1992. (Unicode 9.0 p. 822)

Unicode specifies semantics for the following control characters. The semantics appear to be in
line with their original semantics, even though some differences may exist.

1. ASCII control characters:
   • HT and SP are considered whitespace.
   • LF, VT, FF and CR are considered whitespace, and also mandatory line breaks in the line
     breaking algorithm.
   • FS, GS, RS and US are considered separators in the bi-directional algorithm.
2. C1 control characters:
   • NEL is considered a mandatory line break in the line breaking algorithm, even though
     supporting it is optional.
3. ISO 8859:
   • NBSP and SHY. These characters are not actually control characters in Unicode.
     Instead, NBSP is "Separator, space" and SHY is "Other, format". Both characters have
     features in the line-breaking algorithm. With SHY, Unicode is significantly more
     elaborate than ISO 8859 in that Unicode suggests more hyphenation features than just
     displaying a hyphen.

Note: While no new control characters appear in Unicode, it does define some of its
own special characters, such as formatting characters. These characters are beyond
the scope of this article.
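
These classifications can be inspected with Python's `unicodedata` module. The sketch below is illustrative only; the `CHARS` table and the function name are hypothetical, and the results reflect the Unicode database bundled with the Python version in use.

```python
import unicodedata

# A few of the characters discussed above, keyed by their mnemonics.
CHARS = {"HT": "\t", "LF": "\n", "US": "\x1f", "SP": " ",
         "NEL": "\x85", "NBSP": "\xa0", "SHY": "\xad"}

def general_categories() -> dict[str, str]:
    """Map each mnemonic to its Unicode general category (Cc, Zs, Cf, ...)."""
    return {name: unicodedata.category(ch) for name, ch in CHARS.items()}

for name, category in general_categories().items():
    print(f"{name:4} {category}")

# str.splitlines() treats LF, VT, FF, CR and NEL (among others) as line boundaries.
print("a\x0bb\x85c".splitlines())
```

The control characters report "Cc" (Other, control), while SP and NBSP report "Zs" (Separator, space) and SHY reports "Cf" (Other, format).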

From ASCII via ISO to Unicode

The following diagram summarizes the development of character standards. You can see how the
control characters were propagated from ASCII (X3.4) and other standards to Unicode.
Control characters in modern applications
With so many control characters coming from the 1960s and 1970s, are they still useful for
application programmers?

It depends on the application. Generally speaking, one needs control characters to work with old
interfaces or devices. New protocols and file formats tend to use some other mechanism than
control characters. Current formats typically use textual markup such as XML, which has little use
for control characters beyond whitespace. On the device control side, unless you are writing
device drivers, you control devices through operating system calls or library routines rather than
sending them control strings to do tricks.

The following is a subjective list of which characters are still in common use and which ones are
used less. The list is based on experience writing application software for Windows and DOS.

• ASCII control characters: some used, some not
   o NUL is still common in everyday use. NUL terminates a string in many programming
     languages and interfaces.
   o Transmission control characters (TC1 to TC10) are generally of little use. Data transfer is
     done through TCP/IP sockets, HTTP, FTP or some other protocol. Individual transmission
     control characters appear for special uses.
   o BEL probably no longer appears in its original use. Rather than sending BEL to produce
     beeps, applications play a sound via other means.
   o Format effectors (FE0 to FE5) are possibly the most important control characters these
     days. Some of them, such as CR and LF, are essential for a system to work at all. HT is
     also very common, especially in plain text files. BS and FF are less common. VT appears
     only rarely if ever.
   o Device control characters (DC1 to DC4) are not really required to control devices. To
     control a device from an application you make a system call instead. On the other hand,
     you might still need XOFF (^S) or XON (^Q) in a command line session from time to
     time.
   o SO, SI and ESC used to be common, but this has changed. One may find them from
     time to time, but presumably only on older systems.
   o CAN and EM are not in common use.
   o SUB might no longer appear as a substitute. You will more likely see something like "?"
     or the Unicode REPLACEMENT CHARACTER (U+FFFD) as a substitute for a bad
     character. Another use for SUB still exists, though. You could find it at the end of a text
     file.
   o Information separators (IS4 to IS1) are technically still valid. Whether anyone uses them to
     separate information is another question. Other techniques are used instead, such as
     XML or database systems. As a simple delimiter character a NUL, HT, CR/LF, comma or
     semicolon is more common than any of the information separators originally designed
     for the purpose.
   o SP must be the most heavily used control character of them all.
   o DEL: well, did you ever see one?
   o Characters ^A to ^Z (1 to 26) frequently appear as keyboard shortcuts in various
     applications and operating systems. The actual feature triggered by a keyboard
     shortcut is often unrelated to the respective control character. More of that follows
     below.
• C1 control characters: little use
   o NEL is the only C1 character for which Unicode specifies semantics. The most probable
     case to run into NEL is when EBCDIC compatibility is required.
   o The other C1 characters appear outdated now. That said, VT100-style terminal control
     (which uses C1 extensively) is still current in Unix shell sessions, so C1 is alive and may
     even be everyday business for you. From an application programmer's point of view,
     though, the C1 set is rarely used.
• ISO 8859 special characters: in use
   o NBSP is an everyday character to suppress a line break. It is supported by several
     current standards, including HTML and Unicode.
   o SHY seems to be less frequent.

Some frequently used characters, especially in a special field, may not have been mentioned. If
you know frequent current uses for any of the characters, let us know.

Many of the control characters only appear rarely. How did this affect the space efficiency of 7-bit
and 8-bit character sets? Instead of reserving space for control characters, it was possible to reuse
these areas for additional graphics. This was actually done by DOS, Windows and Mac, all of which
assigned graphic characters to the control character areas. Unicode chose to be different in this
respect. Since its code space is much larger than 128 or 256, it was possible to reserve the C0/C1
areas entirely for control characters. This has helped the control characters to survive, if not in
practical use, then at least in various code charts and lists.

Keyboards and control characters


Users can create many of the control characters from their keyboards. This usually happens in
combination with the Ctrl key, and, more rarely, with the Esc key. There are also some special
keys that produce control characters on their own. ←Backspace, Enter, Esc, Space and ↹ Tab are the
usual ones.

Key presses and control characters, while having some things in common, are usually unrelated.
Pressing a key combination doesn't generally trigger the functionality of the respective control
character. As an example, while it's possible to press Ctrl + O to create an SI (Shift In), pressing
the keys seldom runs the operation associated with SI (reinstate the standard character set).
Instead, Ctrl + O might start an operation beginning with an O, such as "Open".

In some cases a key press does trigger the respective control character feature. Pressing the
↹ Tab key, or Ctrl + I, can indeed produce an HT (Horizontal Tabulation) and move the cursor forward
on the line. This is an exception rather than the norm, though.

Some key combinations are more likely than others. Ctrl + A through Ctrl + Z (in other words, ^A to
^Z) are common keyboard shortcuts. Control key combinations with a symbol (^@, ^[, ^\, ^],
^^, ^_, ^?) are less common. There is a reason why such combinations should be avoided.
Considerable variation exists with symbol keys in different keyboard layouts. A Ctrl and symbol
key combination doesn't always produce the same control character, or any character at all, which
makes it less useful as a keyboard shortcut.
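
The caret notation used above follows a simple arithmetic rule, sketched here under the assumption that the key is one of the ASCII characters `?`, `@`, `A` to `Z`, `[`, `\`, `]`, `^` or `_`. The function name is hypothetical, and real keyboards and terminals may deviate from the rule.

```python
def ctrl_code(key: str) -> int:
    """Map a caret-notation key such as 'A', '[' or '?' to its control code."""
    code = ord(key.upper())
    if not 0x3F <= code <= 0x5F:
        raise ValueError(f"no control character for ^{key}")
    # Flipping bit 6 maps '@'..'_' to 0..31 and '?' to 127 (DEL).
    return code ^ 0x40

for key in "@IO[?":
    print(f"^{key} = {ctrl_code(key)}")
```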

In this article the focus is on the programmatic features of control characters. Less focus is put on
the use of keyboard shortcuts.

About the character list


Next we are going to list every control character in detail. The column Dec refers to the decimal
value of the control code ("ASCII value"). Hex is the same in hexadecimal, preceded by a dollar
sign for clarity. An octal value is also given. The column Pos shows the row/column of the
character in code charts.

The list shows key presses that (often) produce the control character on the keyboard. In addition,
C-style escape sequences (\c) are provided where available, as are special constants supported by
Visual Basic: classic version and Visual Basic .NET.

The last column lists mnemonics and graphic symbols. The symbols (in black) have been
standardized, but they have fallen into disuse. The 2-letter mnemonics are standardized for the
ASCII section. Additional 2-letter mnemonics for the C1 and ISO 8859 sections are taken from RFC
1345, which is not a standard, but is frequently referred to in this context.
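
The numeric columns are all views of the same code value. As a rough sketch (the function name is hypothetical), they can be derived like this, with Pos taken as the high four bits followed by the low four bits of the code:

```python
def chart_columns(code: int) -> str:
    """Format a code as in the list: Dec, $Hex, Octal and Pos."""
    return f"{code} ${code:02X} {code:03o} {code >> 4}/{code & 0x0F}"

print(chart_columns(10))  # LF
print(chart_columns(27))  # ESC
```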

Overview of the control characters (hex code below each mnemonic):

C0    NUL  SOH  STX  ETX  EOT  ENQ  ACK  BEL  BS   HT   LF   VT   FF
      00   01   02   03   04   05   06   07   08   09   0A   0B   0C

      DLE  DC1  DC2  DC3  DC4  NAK  SYN  ETB  CAN  EM   SUB  ESC  FS
      10   11   12   13   14   15   16   17   18   19   1A   1B   1C

      SP
      20

      ....

C1    PAD  HOP  BPH  NBH  IND  NEL  SSA  ESA  HTS  HTJ  VTS  PLD  PLU
      80   81   82   83   84   85   86   87   88   89   8A   8B   8C

      DCS  PU1  PU2  STS  CCH  MW   SPA  EPA  SOS  SGCI SCI  CSI  ST
      90   91   92   93   94   95   96   97   98   99   9A   9B   9C

8859  NBSP
      A0

Legend (character categories): Format effector, Info separator, Presentation control,
Area definition, Device control, Transmission control, Delimiter, Introducer, Shift, Graphic

Character list

ASCII control characters (C0)


The ASCII control characters work in 7-bit and 8-bit environments, as well as in Unicode. These
controls originate from a set of related standards: ASCII, ISO 646 and ECMA-6, and also ISO 6429
and ECMA-48. All of these characters are available in Unicode, too. The actual C0 set consists of
characters NUL through US (0–31). Two additional characters, SP and DEL, are a part of ASCII and
the related standards as well.

*) The 2-character mnemonics for the ASCII set are from ANSI X3.32, ISO 2047 and ECMA-17. So
are also the graphic symbols. The symbols are outdated and rarely used. A couple of the symbols
also have alternative forms.

Dec Hex Char Description Octal Pos *)

0 $00 NUL Null 000 0/0 NU

\0 ^@ NUL is defined in the standards as a filler character. It can be used as media-fill
or time-fill. NUL doesn't affect the information content of a data stream. It may
affect the information layout and the control of equipment, though.

Note: NUL was originally intended as an ignorable filler character with no
meaning. Especially convenient on paper tape, where a NUL equals no holes
punched, it could be used to reserve space for new information or correcting
errors. ASCII-1986 even suggests NUL as a "time-waster" character to be added
after a newline to accommodate mechanical devices where a carriage return
works slowly. Despite this, NUL has been used contrary to the standards in
null-terminated strings as an End-Of-String marker. Several programming
languages use this convention.

Constant in Visual Basic and VB.NET: vbNullChar, NullChar
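
The two roles of NUL can be contrasted in a short sketch. Both helper names are hypothetical, and the code only illustrates the conventions described above.

```python
def read_c_string(buffer: bytes) -> bytes:
    """Return the bytes before the first NUL, as a C-style string reader would."""
    end = buffer.find(b"\x00")
    return buffer if end == -1 else buffer[:end]

def strip_fill(data: bytes) -> bytes:
    """Drop NUL used as media-fill or time-fill; the remaining content is unchanged."""
    return data.replace(b"\x00", b"")

print(read_c_string(b"abc\x00junk"))  # NUL as End-Of-String marker
print(strip_fill(b"a\x00\x00b"))      # NUL as ignorable filler
```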

1 $01 SOH Start of Heading — TC1 Transmission control character 1 001 0/1 SH

^A Indicates the beginning of a heading in a transmission. The heading can be
terminated by STX. As per ASCII-1968, a heading constitutes a machine-sensible
address or routing information. Later standards have dropped the
explanation.

Note: SOH, along with STX and ETX, was intended for data transmission. It is
not intended for marking a heading in a document.

2 $02 STX Start of Text — TC2 Transmission control character 2 002 0/2 SX

^B STX has two functions in a transmission: it 1) indicates the beginning of a text
and 2) may terminate a heading (see SOH). As per ASCII-1968, text is what
should be transmitted to a destination. Later standards have dropped the
explanation.

3 $03 ETX End of Text — TC3 Transmission control character 3 003 0/3 EX

^C Terminates a text in a transmission. As per ASCII-1968, a text starts with STX
and ends with ETX. Later standards don't necessarily require the pairing of STX
with ETX.

Note: ETX may be used to call for reply from a slave station after a message
has been sent. ETX is also commonly used to terminate an interactive process
(keyboard: Ctrl + C ).
Ctrl + Break on PC keyboard produces this character code.

4 $04 EOT End of Transmission — TC4 Transmission control character 4 004 0/4 ET

^D Indicates the conclusion of a transmission. The transmission may have
contained one or more texts and associated heading(s).

Note: EOT can be used to end or abort a transmission. It can also be a reply to
indicate inability to receive further messages. EOT (keyboard: Ctrl + D ) is even
used as an End-Of-File control in a Unix shell session.

5 $05 ENQ Enquiry — TC5 Transmission control character 5 005 0/5 EQ

^E Requests a response from a remote station. The response may include station
identification or status. ENQ can be used as a "Who Are You" (WRU) to identify
a remote station, especially after a new connection has been established.

6 $06 ACK Acknowledge — TC6 Transmission control character 6 006 0/6 AK

^F An affirmative response. Transmitted from a receiver as a response to the
sender.

Note: ACK can indicate that a slave station has received a message correctly
and is ready to receive more.

7 $07 BEL Bell 007 0/7 BL


\a ^G Calls for human attention. BEL may control alarm or attention devices.

Note: BEL is the only control character with an audible effect. It has been used
to ring a bell (indeed) or produce a beep sound. A visual alarm is also possible.

In Unicode, this control character is abbreviated BEL but named ALERT, while
the name BELL is confusingly used for a graphic character (🔔).

8 $08 BS Backspace — FE0 Format effector 0 008 0/8 BS

\b ^H Moves one character position backwards (keeping the previous character).

Note: Contrary to the standards, BS has been used as a combined "move back
and delete" operation to remove the previous character. This is not the
standard meaning of BS, however. BS is defined as a non-destructive "move
back" or "move left" operation, similar to a backspace in mechanical
typewriters. To delete the previous character, BS should be followed by DEL. On
paper tape the result would be the previous character being completely
punched out (erased). BS followed by another character would strike two
characters in the same position. Overstriking was a way to produce combined
characters. This option was intended to internationalize ASCII. A letter followed
by BS followed by a diacritic symbol would produce an accented letter. As an
example, u BS ^ would produce û. Several ASCII characters (" ' ` ^ ~ ,) were
indeed defined to be used as diacritic symbols. Overstriking could also be
suitable with other characters, such as for underlining with the "_" character or
printing a slash "/" over "=" to produce "not equal". It could even be used to
achieve a strike-through effect (perhaps with -, / or X) to indicate removed text.
A boldface effect could be achieved by striking the same character several
times at the same position.

Overstriking was a useful option with printing devices, but displays hardly
support it. With the advent of more capable character sets and formatting
techniques overstriking can be considered outdated. ASCII-1986 does not
require overstriking capabilities and suggests that overstriking may be
proscribed in the future. ISO 8859 explicitly forbids overstriking.
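
How a printing device treats BS can be simulated in a few lines. This is a simplified sketch with a hypothetical function name: it tracks only a single line and records which characters have been struck at each position.

```python
def overstrike_cells(text: str) -> list[list[str]]:
    """Simulate one print line: each cell collects every character struck there."""
    cells: list[list[str]] = []
    pos = 0
    for ch in text:
        if ch == "\b":
            pos = max(pos - 1, 0)  # non-destructive move left
            continue
        if pos == len(cells):
            cells.append([])
        cells[pos].append(ch)
        pos += 1
    return cells

print(overstrike_cells("u\b^"))    # one cell holding both 'u' and '^'
print(overstrike_cells("a\b_b"))   # underlined 'a' followed by a plain 'b'
```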

←Backspace on PC keyboard produces this character code.

Constant in Visual Basic and VB.NET: vbBack, Back

See also: CCH

9 $09 HT Horizontal Tabulation — FE1 Format effector 1 (Character 011 0/9 HT


Tabulation)

\t ^I Advances to the next pre-determined character position (horizontal tab stop).
HT could also be used as a skip function on punched cards.

Note: HT is commonly also abbreviated TAB.

Even though the standards don't set a universal tab width, a typical fixed tab
width is 8 columns. Other tab widths, as well as custom tab positions, are used
as well. HT is a simple method of data compression: a single character can
represent several spaces in formatted text.
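
Expanding HT with fixed tab stops can be sketched as follows. The default width of 8 columns is the typical value mentioned above, not a value set by any standard, and the function name is hypothetical. Python's built-in `str.expandtabs` performs the same expansion.

```python
def expand_tabs(line: str, width: int = 8) -> str:
    """Replace each HT with spaces up to the next multiple of `width` columns."""
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            pad = width - col % width  # distance to the next tab stop
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)

print(repr(expand_tabs("a\tb")))
print(repr(expand_tabs("abc\td", width=4)))
```
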
The ↹ Tab key on the keyboard is consistent with HT in that it usually produces
the code HT. How the HT is treated in each application is another story. In
windowing environments, there are three common alternative uses. Pressing
↹ Tab can either add an HT character into text, indent text (possibly by adding an
appropriate number of spaces or shifting the margin), or do something
completely different: jump to the next field or control in a graphical user
interface. This way the key has been extended to cover more uses than what
HT was originally intended for.

The original name of HT is Horizontal Tabulation. It was later renamed as
Character Tabulation, first in ECMA-48:1986.

↹ Tab on PC keyboard produces this character code.

Constant in Visual Basic and VB.NET: vbTab, Tab

10 $0A LF Line Feed — FE2 Format effector 2 012 0/10 LF

^J LF has two alternative functions. It advances to the same character position on
the next line (move down), or optionally to the first position on the next line
(move to start of next line, i.e. newline). Originally LF was a move-down. A
newline option (NL) was added soon. The option allowed LF to be used as a
newline, which works like a combined CR LF. Use of LF as a newline requires
agreement between sender and recipient of data. Universal agreement has not
been reached.

Note: LF, having two alternative functions, has been a major source of
confusion. While LF was initially defined as a "move down" operator, standards
began to allow LF as a newline too. As a result, operating systems differ in their
definition of a newline. A newline is LF on Unix. Operating systems using CR LF
include CP/M, DOS, OS/2 and Windows. Naturally, this caused an
incompatibility. To solve the problem, control characters IND and NEL were
added to the C1 area. This did not solve the issue, resulting in IND being
removed later. ECMA-6:1985 and ASCII-1986 attempted to clarify the situation
by declaring LF deprecated for a newline and recommending CR LF instead.
ECMA-48:1991 no longer allows LF to function as a newline.

The escape sequence for newline and LF is another source of confusion. \n is
the common sequence for a newline, whereas there is no such sequence for a
line feed. The actual control character(s) represented by \n depend on the
system. In some cases, \n indeed represents LF, but it can also represent
another newline sequence.

Ctrl + Enter on PC keyboard produces this character code.

Constant in Visual Basic and VB.NET: vbLf, Lf

See also: CR IND NEL
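
Because systems disagree on the newline marker, programs that exchange text often normalize it. Here is a minimal sketch with a hypothetical function name, which converts CR LF and lone CR to LF. In Python, "\n" inside a string is always LF.

```python
def normalize_newlines(text: str) -> str:
    """Convert CR LF pairs and lone CR characters to LF."""
    # Replace the two-character CR LF first so it does not become two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")

print(repr(normalize_newlines("dos\r\nold mac\runix\n")))
```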

11 $0B VT Vertical Tabulation — FE3 Format effector 3 (Line 013 0/11 VT


Tabulation)

\v ^K Advances to the same character position on the next pre-determined line.
ASCII-1977 and ASCII-1986 optionally allow VT to advance to the first position
on the next pre-determined line, if agreed on.

Note: The original name of VT is Vertical Tabulation. It was later renamed as
Line Tabulation, first in ECMA-48:1986. VT has been used to jump down to the
next pre-defined line when printing on a paper form. According to some
sources, vertical tab stops were typically spaced 6 lines apart. VT is a simple
data compression method where a single VT represents several LF characters
(and optionally a CR too).

In modern use VT must be quite a rare character. As Bob Bemer, one of the
original designers of ASCII, put it: "This is a very dangerous character to use. It
cannot be used directly on any terminal that I know of. Even if it could, the
implementation rules are not supplied unambiguously in the ASCII standard."

Constant in Visual Basic and VB.NET: vbVerticalTab, VerticalTab

12 $0C FF Form Feed — FE4 Format effector 4 014 0/12 FF

\f ^L Advances to the next form or page. Standards differ in what column the
subsequent character position will be in. Originally, ASCII-1968 did not define
the column at all. ISO and ECMA standards declare that FF does not change the
column. ASCII-1977 and ASCII-1986 optionally allow, by agreement, moving to
the first column, as if FF was actually CR FF.

Note: FF has been used as "page break" in text files, "new page" on printers
and "clear the screen" on displays. The situation was originally unclear whether
FF was just a "new page" operator or "new page, move to column 1". ASCII-
1977 and ECMA-6:1985 attempted to clarify the situation by recommending the
use of CR FF. ASCII-1986 even implied that the "new page, move to column 1"
option might be deleted in a future edition of ASCII.
Constant in Visual Basic and VB.NET: vbFormFeed, FormFeed

13 $0D CR Carriage Return — FE5 Format effector 5 015 0/13 CR

\r ^M Traditional definition: Moves to the first position on the same line (ASCII, ISO
646, ECMA-6). Newer definition: Moves to the line home position or line limit
position of the same line (ISO 6429, ECMA-48).

Note: The standard meaning of CR is "move to beginning of current line". This
allows overprinting the line with new characters, which could be used to
achieve underlining, for example. For advancing to the next line CR would be
followed by LF. On CP/M, DOS, OS/2 and Windows the newline marker is CR LF,
which is according to the definition. CR alone has been used as the newline
character on some systems, such as Commodore and Apple, a use that does
not conform to the standards in question. The order CR LF (instead of LF CR)
may have been important on mechanical devices where a carriage return took
relatively long to execute. A non-printing LF was more suitable output while the
printing head was returning, rather than striking a graphic symbol in the middle
of the line.

Enter on PC keyboard produces this character code.

Constant in Visual Basic and VB.NET: vbCr, Cr

See also: LF

14 $0E SO Shift Out — LS1 Locking-Shift One 016 0/14 SO


^N Used to extend the character set. SO may alter the meaning of the following bit
combinations until an SI is reached. Between SO and SI, character positions 33-126
(decimal) may represent additional characters that would not otherwise fit
in the regular character set.

Note: SO (Shift Out) is the normal name of this control. LS1 (Locking-Shift One) is
used by ECMA-35 and ECMA-48. In those standards, SO is used in 7-bit
environments and LS1 in 8-bit environments. The mechanism to select the
alternative character set(s) was defined in ANSI X3.41, ISO 2022 and ECMA-35.
It includes the use of escape sequences starting with ESC. SO has also been
used on printers to select enlarged characters or another color.

15 $0F SI Shift In — LS0 Locking-Shift Zero 017 0/15 SI

^O Used in conjunction with SO. It may reinstate the standard meanings of the
characters following it.

Note: SI (Shift In) is the normal name of this control. LS0 (Locking-Shift Zero) is
used by ECMA-35 and ECMA-48. In those standards, SI is used in 7-bit
environments and LS0 in 8-bit environments. SI has also been used on printers
to select condensed characters or to reset color.
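
The locking-shift behavior of SO and SI can be sketched as a small decoder. The alternate character table passed in is purely hypothetical. In a real system it would be selected through the ISO 2022 escape sequences mentioned above, which this sketch does not implement.

```python
SO, SI = 0x0E, 0x0F

def decode_shifted(data: bytes, alternate: dict[int, str]) -> str:
    """Decode bytes, mapping positions 33-126 through `alternate` between SO and SI."""
    out = []
    shifted = False
    for byte in data:
        if byte == SO:
            shifted = True        # switch to the alternate set
        elif byte == SI:
            shifted = False       # reinstate the standard meanings
        elif shifted and 33 <= byte <= 126:
            out.append(alternate.get(byte, chr(byte)))
        else:
            out.append(chr(byte))
    return "".join(out)

greek = {ord("a"): "α", ord("b"): "β"}  # illustrative mapping only
print(decode_shifted(b"ab \x0eab\x0f ab", greek))
```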

16 $10 DLE Data Link Escape — TC7 Transmission control character 7 020 1/0 DL

^P Used to provide supplementary data transmission control functions. DLE
changes the meaning of a limited number of following characters.

Note: DLE is the "escape" character for transmission control. DLE can
potentially be put in the front of a transmission control character (TC1-TC10) to
pass it through "as is" instead of controlling the current transmission. This is
not always the case, though. It is possible to create new transmission control
sequences with DLE in a similar way ESC is used to create escape sequences
for other purposes. Contrary to the standards, ^P has been used as a keyboard
shortcut to echo console activity at the printer.
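
Passing control characters through "as is" can be sketched with a simple byte-stuffing scheme. The framing used here (STX ... ETX with DLE placed before any STX, ETX or DLE in the payload) is an assumption chosen for illustration, not a rule taken from ASCII or from any particular protocol, and the function names are hypothetical.

```python
STX, ETX, DLE = 0x02, 0x03, 0x10
RESERVED = {STX, ETX, DLE}

def frame(payload: bytes) -> bytes:
    """Wrap payload in STX/ETX, prefixing reserved bytes with DLE."""
    body = bytearray()
    for byte in payload:
        if byte in RESERVED:
            body.append(DLE)  # the next byte is data, not a control
        body.append(byte)
    return bytes([STX]) + bytes(body) + bytes([ETX])

def unframe(data: bytes) -> bytes:
    """Reverse frame(): strip STX/ETX and remove the DLE prefixes."""
    if len(data) < 2 or data[0] != STX or data[-1] != ETX:
        raise ValueError("missing STX/ETX")
    out = bytearray()
    escaped = False
    for byte in data[1:-1]:
        if escaped:
            out.append(byte)
            escaped = False
        elif byte == DLE:
            escaped = True
        else:
            out.append(byte)
    return bytes(out)

print(frame(b"a\x03b"))
print(unframe(frame(b"a\x03b")))
```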

17 $11 DC1 Device Control 1 — XON 021 1/1 D1

^Q Intended to turn on or start an ancillary device, to restore it to the basic
operation mode (see DC2 and DC3), or for any other device control function.

Note: DC1 is conventionally called XON when used in communication for
software flow control. The meaning of XON is to continue data transmission
after an XOFF (DC3) has been received. The name XON ("transmit on") does not
come from a standard, but it is commonly used.

18 $12 DC2 Device Control 2 022 1/2 D2

^R Intended for turning on or starting an ancillary device, setting it to a special mode
(restored via DC1), or for any other device control function.

19 $13 DC3 Device Control 3 — XOFF 023 1/3 D3

^S Intended for turning off or stopping an ancillary device. It may be a secondary
level stop such as wait, pause, stand-by or halt (restored via DC1). It can also
perform any other device control function.

Note: DC3 is conventionally called XOFF when used in communication for
software flow control. An XOFF is issued to stop transmission when a device
cannot accept more data. Transmission can be continued via XON (DC1). The
name XOFF ("transmit off") does not come from a standard, but it is commonly
used. The use of XOFF and XON is in line with the standards, even though not
directly defined in them.
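
Software flow control with XOFF and XON can be sketched as a sender that holds back its data while paused. The class and its in-memory "wire" are hypothetical stand-ins for a real serial link.

```python
XON, XOFF = 0x11, 0x13

class FlowControlledSender:
    """Queue outgoing data and release it only while the peer allows it."""

    def __init__(self) -> None:
        self.paused = False
        self.queue = bytearray()  # data waiting for XON
        self.wire = bytearray()   # data actually transmitted

    def on_receive(self, byte: int) -> None:
        """Handle a flow control byte coming from the receiving device."""
        if byte == XOFF:
            self.paused = True
        elif byte == XON:
            self.paused = False
            self._flush()

    def send(self, data: bytes) -> None:
        self.queue += data
        self._flush()

    def _flush(self) -> None:
        if not self.paused:
            self.wire += self.queue
            self.queue.clear()

sender = FlowControlledSender()
sender.send(b"ab")
sender.on_receive(XOFF)   # receiver cannot accept more data
sender.send(b"cd")        # held back
sender.on_receive(XON)    # transmission continues
print(bytes(sender.wire))
```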

XOFF (^S) is sometimes used as a pause command. Continuing requires
pressing XON (^Q). ^S also works as a pause on MS-DOS and in Windows
command prompt. Pressing any key continues.

20 $14 DC4 Device Control 4 (Stop) 024 1/4 D4

^T Intended to turn off, stop or interrupt an ancillary device, or for any other
device control function.

21 $15 NAK Negative Acknowledge — TC8 Transmission control 025 1/5 NK


character 8

^U Negative response. Transmitted from a receiver as a response to the sender.

Note: NAK can be sent as a response to indicate inability to receive a message,
or to request resending.

22 $16 SYN Synchronous Idle — TC9 Transmission control character 9 026 1/6 SY

^V Used as "time-fill" in synchronous transmission. Sent during an idle condition to
retain a signal when there are no other characters to send.

Note: SYN has been used by synchronous modems, which have to send data
constantly. Beginning each transmission with at least two SYN characters is a
way to achieve synchronization. The receiving station will possibly ignore SYN,
since it doesn't belong to the actual data content.

23 $17 ETB End of Transmission Block — TC10 Transmission control 027 1/7 EB
character 10

^W Indicates the end of a block of data. Used when data is divided into blocks for
transmission.

Note: ETB, when used to end a block, may call for a reply from a slave station.

24 $18 CAN Cancel 030 1/8 CN

^X Indicates that data is in error or should be disregarded. Affects "the data with
which it is sent" (ASCII-1968, ASCII-1977) or "the data preceding it" (ASCII-
1986, ISO 646, ECMA-6, ECMA-48).

Note: There are 2 alternative definitions for the data to be disregarded. The
actual scope of cancellation is undefined by the standards and should be
defined case by case. ^X has been used as a keyboard shortcut to cancel
(delete) the characters on the current line, a use that conforms to the
standards.

25 $19 EM End of Medium 031 1/9 EM

^Y Identifies 1) the physical end of a medium, 2) the end of the used portion of a
medium, or 3) the end of wanted data on a medium.

Note: EM may have been suitable for paper tape or magnetic tape to say "no
more data". Disk file systems use more sophisticated ways to keep track of the
used and unused areas of the medium.

This character is commonly abbreviated EM, except for Unicode, which provides
it as an alias with abbreviation EOM.

26 $1A SUB Substitute 032 1/10 SB

^Z Used in place of an invalid or erroneous character. Introduced by automatic
means in cases like a transmission error.

Note: When SUB is used as a substitution character, the reverse question mark
symbol seems quite good as its visual representation. Compare SUB to Unicode
U+FFFD REPLACEMENT CHARACTER.

SUB has often been used contrary to the standards. On CP/M and MS-DOS, it
appears as an End-Of-File marker for text files (^Z). On Unix, ^Z is typically a
keyboard signal to suspend a foreground process.

27 $1B ESC Escape 033 1/11 EC

\e ^[ The first character of an escape sequence. Provides either supplementary
characters or additional control functions. ESC changes the meaning of a
limited number of following characters.

Note: ESC is used to form escape sequences, which perform various control
functions or apply additional character sets. ESC can also be used to invoke the
C1 control characters on a 7-bit system that only supports character positions
0–127.

On the keyboard, sometimes the Esc key indeed produces the ESC control
character. In windowing environments, the key typically cancels a dialog or an
operation, rather than producing a control character or starting an escape
sequence. This kind of an "escape" is not based on the character standards,
however. The closest ASCII equivalent for canceling a dialog would be CAN, but
since there is no Can key on the common keyboards, it can't be used.

Esc on PC keyboard produces this character code.

28 $1C FS File Separator — IS4 Information separator 4 034 1/12 FS

^\ The four information separators (FS, GS, RS and US) are used to separate and
qualify data. Each separator has two alternative names: Information Separator
Four equals File Separator, Information Separator Three equals Group
Separator, Information Separator Two equals Record Separator and Information
Separator One equals Unit Separator. The separators can be used either
hierarchically or in a non-hierarchical manner. When used hierarchically, the
order is US (least inclusive), RS, GS and FS (most inclusive). The content and
length of a file, group, record or unit are not specified by the standards.

FS, when used in a hierarchical order, delimits a data item called a file. It can
also delimit anything else.
29 $1D GS Group Separator — IS3 Information separator 3 035 1/13 GS

^] GS, when used in a hierarchical order, delimits a data item called a group. It
can also delimit anything else.

30 $1E RS Record Separator — IS2 Information separator 2 036 1/14 RS

^^ RS, when used in a hierarchical order, delimits a data item called a record. It
can also delimit anything else.

31 $1F US Unit Separator — IS1 Information separator 1 037 1/15 US

^_ US, when used in a hierarchical order, delimits a data item called a unit. It can
also delimit anything else.

Note: The information separators were deliberately arranged next to SPACE,
which can also be used as an information separator (word separator).
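
As a rough illustration of the hierarchical use described above, the following Python sketch splits a byte string into files, groups, records and units. The function name split_hierarchy and the sample data are made up for this example; the standards do not prescribe the content or length of any of these items.

```python
# Code values of the four information separators, taken from the table above.
FS, GS, RS, US = b"\x1c", b"\x1d", b"\x1e", b"\x1f"


def split_hierarchy(data: bytes) -> list:
    """Split data into files > groups > records > units (hypothetical helper)."""
    return [
        [
            [record.split(US) for record in group.split(RS)]
            for group in file_.split(GS)
        ]
        for file_ in data.split(FS)
    ]


# One file holding one group with two records of two units each.
sample = b"Ann" + US + b"42" + RS + b"Bob" + US + b"17"
print(split_hierarchy(sample))
```

A non-hierarchical application could just as well split on a single separator only, since each of the four can delimit anything.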

32 $20 SP Space 040 2/0 SP

Moves one character position forwards. Space may also have a function
equivalent to that of an information separator.

Note: Space has a dual nature. It can be classified as both a control character
and a (non-printing) graphic character. SP is similar to a Format Effector. It can
also be used as a fifth Information Separator. Space is sometimes represented
by the symbol ƀ or ␢ (b with a stroke) or ␣ (open box). SP does not belong to
the C0 set.
Spacebar on PC keyboard produces this character code.

See also: NBSP

127 $7F DEL Delete 177 7/15 DT

^? Outdated. An ignorable character originally intended for erasing an erroneous
or unwanted character in punched tape. In this standard use, DEL wouldn't
affect the information content of data, even though it may have affected the
information layout and the control of equipment. Standards also allowed DEL to
be used as media-fill or time-fill (even though a NUL may be more appropriate).

Note: DEL is now outdated. It was removed from the latest standards (ECMA-48
in 1991 and ISO 6429 in 1992). The origin of DEL is with perforated paper. On
that, DEL was equal to "all holes punched", which is a way to invalidate an
erroneous character (rubout). In a sense, DEL is similar to NUL, since both
characters mean "nothing". ASCII-1977 suggests the use of DEL as a "time
waster" to accommodate mechanical devices where a carriage return takes
time to execute. ASCII-1986 recommends NUL as a time waster instead of DEL.
DEL does not belong to the C0 set, but is an individual control code.

Ctrl + ←Backspace on PC keyboard produces this character code.

See also: NUL
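
The rubout idea behind DEL can be sketched in Python by treating each of the seven code bits as a hole position. Holes can be added to tape but not removed, so punching all holes over any character always leaves DEL. The helper name overpunch is an assumption made for this illustration.

```python
DEL = 0x7F  # all seven code bits set, i.e. "all holes punched"


def overpunch(existing: int, new_holes: int) -> int:
    """Punching more holes can only add 1 bits, never remove them."""
    return existing | new_holes


# Rubbing out any 7-bit character leaves DEL on the tape.
assert all(overpunch(code, DEL) == DEL for code in range(128))
print(hex(overpunch(ord("M"), DEL)))  # prints 0x7f
```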

\x is what you write in a C program to produce the given control character. ^X means you
press Ctrl + X to produce the given control character.
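
The ^X column follows a regular pattern: the control code is the code of the printed key character with the 0x40 bit toggled, which also covers ^? for DEL. The Python sketch below shows this; the function name caret_to_code is hypothetical, and actual keyboards and terminals may handle some combinations differently.

```python
def caret_to_code(notation: str) -> int:
    """Convert caret notation such as '^X' to its control code."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError("expected a caret followed by one character")
    key = ord(notation[1].upper())
    if not 0x3F <= key <= 0x5F:
        raise ValueError("no control character for this key")
    # Toggle the 0x40 bit: '@'..'_' map to 0..31 and '?' maps to 127.
    return key ^ 0x40


for notation in ("^V", "^[", "^_", "^?"):
    print(notation, caret_to_code(notation))
```

The printed values (22, 27, 31 and 127) match the Dec column for SYN, ESC, US and DEL in the table above.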
C1 control characters

The C1 control characters work in 8-bit environments. These controls come from three related
standards: ANSI X3.64, ISO 6429 and ECMA-48. All of these characters are also available in
Unicode. There are three unassigned control characters: PAD, HOP and SGCI. Use was planned
for them in a failed draft DIS 10646, but they were not actually standardized or put to use. Despite
this, one can find these control characters in various C1 lists online, and also as aliases in later
Unicode standards.

†) The 2-character mnemonics for C1 are from RFC 1345. They are not standardized.

Dec Hex Char Description Octal Pos †)

128 $80 PAD unassigned, "Padding Character" 200 8/0 PA

ESC @ A reserved control code. Intended for use as PAD Padding Character in draft
DIS 10646, rejected, never standardized (not accepted to ISO 10646).

Note: Not part of ISO/IEC 6429 or ECMA-48.

Unicode lists this character as XXX and provides PAD as an alias.

129 $81 HOP unassigned, "High Octet Preset" 201 8/1 HO

ESC A A reserved control code. Intended for use as HOP High Octet Preset in draft DIS
10646, rejected, never standardized (not accepted to ISO 10646).
Note: Not part of ISO/IEC 6429 or ECMA-48.

Unicode lists this character as XXX and provides HOP as an alias.

130 $82 BPH Break Permitted Here 202 8/2 BH

ESC B A point where a line break may occur.

Note: Roughly equivalent to a soft hyphen except that the means for indicating
a line break is not necessarily a hyphen. Compare to Unicode U+200B ZERO
WIDTH SPACE.

131 $83 NBH No Break Here 203 8/3 NH

ESC C A point where a line break may not occur.

Note: Compare to Unicode U+2060 WORD JOINER.

132 $84 IND Index 204 8/4 IN

ESC D Moves to the next line keeping the current horizontal position.

Note: According to ECMA-48:1986, IND was provided for use in those cases
where LF was implemented as New Line. IND was deprecated in 1988 and
withdrawn in 1992 from ISO/IEC 6429 (1986 and 1991 respectively for ECMA-
48).

See also: LF RI
133 $85 NEL Next Line 205 8/5 NL

ESC E Moves to the first position of the next line. Alternatively, to line home or line
limit position.

Note: NEL maps to the control character NL (New Line) in the EBCDIC
character set used on IBM mainframes.

See also: LF

134 $86 SSA Start of Selected Area 206 8/6 SA

ESC F Starts a string of character positions whose contents can be transmitted. The
string ends at ESA (or end of display).

135 $87 ESA End of Selected Area 207 8/7 ES

ESC G Ends a string of character positions (started by SSA) whose contents can be
transmitted.

136 $88 HTS Horizontal Tabulation Set, Character Tabulation Set 210 8/8 HS

ESC H Sets a tab stop at the active position.

Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed HTS as
Character Tabulation Set.

137 $89 HTJ Horizontal Tabulation with Justification, Character 211 8/9 HJ
Tabulation with Justification

ESC I Moves text to the following tab stop. The text is what comes after the previous
tab stop up to the active position.

Note: This character has several names. ANSI X3.64 originally called it
Horizontal Tabulation with Justify. ISO 6429:1992, ECMA-48:1986 and ECMA-
48:1991 have renamed HTJ as Character Tabulation with Justification.

138 $8A VTS Vertical Tabulation Set, Line Tabulation Set 212 8/10 VS

ESC J Sets a vertical tab stop at the active line.

Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed VTS as
Line Tabulation Set.

139 $8B PLD Partial Line Down, Partial Line Forward 213 8/11 PD

ESC K Moves down so that following characters will appear as subscripts. Subscripts
end at the next PLU.

Note: ISO 6429:1992 and ECMA-48:1991 have renamed PLD as Partial Line
Forward. Sample: text PLD subscript PLU text.

140 $8C PLU Partial Line Up, Partial Line Backward 214 8/12 PU

ESC L Moves up so that following characters will appear as superscripts. Superscripts
end at the next PLD.

Note: ISO 6429:1992 and ECMA-48:1991 have renamed PLU as Partial Line
Backward. Sample: text PLU superscript PLD text.

141 $8D RI Reverse Index, Reverse Line Feed 215 8/13 RI

ESC M Moves to the previous line keeping the current horizontal position.

Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 renamed RI as
Reverse Line Feed, apparently related to the removal of IND.

See also: IND

142 $8E SS2 Single Shift Two 216 8/14 S2

ESC N Used to extend the character set. The next character will be from the currently
chosen G2 set.

Note: For more information see ISO 2022 or ECMA-35. The next character
should be in the decimal range 33-126 or 32-127.

143 $8F SS3 Single Shift Three 217 8/15 S3

ESC O Used to extend the character set. The next character will be from the currently
chosen G3 set.

Note: For more information see ISO 2022 or ECMA-35. The next character
should be in the decimal range 33-126 or 32-127.

144 $90 DCS Device Control String 220 9/0 DC


ESC P Starts a device control string. ST ends the string. The control string may
include commands to the receiving device, or a status report from the sending
device.

145 $91 PU1 Private Use One 221 9/1 P1

ESC Q Reserved for private use, no standardized meaning.

146 $92 PU2 Private Use Two 222 9/2 P2

ESC R Reserved for private use, no standardized meaning.

147 $93 STS Set Transmit State 223 9/3 TS

ESC S Notifies that data is ready for transfer from a device (ANSI X3.64), or
establishes the transmit state in the receiving device (ISO 6429, ECMA-48).
Doesn't initiate the actual transmission.

148 $94 CCH Cancel Character 224 9/4 CC

ESC T Ignore the preceding graphic character (and CCH itself too). If the previous
character is a control character or sequence, ANSI X3.64 says it should be
ignored, while ISO 6429 and ECMA-48 leave the action undefined.

Note: Destructive backspace. Intended to eliminate ambiguity about the
meaning of BS.

See also: BS

149 $95 MW Message Waiting 225 9/5 MW

ESC U Sets a message waiting indicator in the receiving device.

150 $96 SPA Start of Guarded Protected Area, Start of Protected Area, 226 9/6 SG
Start of Guarded Area

ESC V Starts a string of character positions that can't be altered manually or
transmitted. Optionally protects against erasure too. EPA will end the string.

Note: SPA is known as Start of Protected Area (ANSI X3.64, ECMA-48:1979),
Start of Guarded Protected Area (ISO 6429:1983, ECMA-48:1984) and Start of
Guarded Area (ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991).

151 $97 EPA End of Guarded Protected Area, End of Protected Area, End 227 9/7 EG
of Guarded Area

ESC W Ends the area started by SPA.

Note: EPA is known as End of Protected Area (ANSI X3.64, ECMA-48:1979), End
of Guarded Protected Area (ISO 6429:1983, ECMA-48:1984) and End of
Guarded Area (ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991).

152 $98 SOS Start of String 230 9/8 SS


ESC X Starts a control string. The string ends at ST. It cannot contain a SOS. The
interpretation of the string depends on the application.

153 $99 SGCI unassigned, "Single Graphic Character Introducer" 231 9/9 GC

ESC Y A reserved control code. Intended for use as SGCI Single Graphic Character
Introducer in draft DIS 10646, rejected, never standardized (not accepted to
ISO 10646).

Note: Not part of ISO/IEC 6429 or ECMA-48.

Unicode lists this character as XXX and provides SGC as an alias.

154 $9A SCI Single Character Introducer 232 9/10 SC

ESC Z A reserved control code. The name was standardized as SCI Single Character
Introducer, but the actual functionality was not implemented in the standards.

Note: SCI was to be followed by a single byte, which would represent a control
function or a graphic character. The functions or characters were not defined
in the standards.

155 $9B CSI Control Sequence Introducer 233 9/11 CI

ESC [ Starts a control sequence.

156 $9C ST String Terminator 234 9/12 ST


ESC \ Closes a string opened by APC, DCS, OSC, PM or SOS.

157 $9D OSC Operating System Command 235 9/13 OC

ESC ] Starts an operating system control string. The string ends at ST and is
interpreted subject to the operating system.

158 $9E PM Privacy Message 236 9/14 PM

ESC ^ Starts a privacy message. ST will end the message.

159 $9F APC Application Program Command 237 9/15 AC

ESC _ Starts an application program command string. ST will end the command. The
interpretation of the command is subject to the program in question.

ESC X means you press Esc followed by X to produce this control character.
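
The Char column of the C1 table shows a regular relationship: each C1 code from $80 to $9F has a two-byte 7-bit form consisting of ESC followed by the character whose code is $40 lower. The following Python sketch converts between the two forms; the function names c1_to_escape and escape_to_c1 are assumptions made for this illustration.

```python
ESC = 0x1B


def c1_to_escape(code: int) -> bytes:
    """Return the 2-byte ESC form of a C1 control code (0x80 to 0x9F)."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control code")
    return bytes([ESC, code - 0x40])


def escape_to_c1(seq: bytes) -> int:
    """Return the 8-bit C1 code for a 2-byte ESC form such as ESC [."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not an ESC form of a C1 control")
    return seq[1] + 0x40


print(c1_to_escape(0x9B))            # CSI as ESC [
print(hex(escape_to_c1(b"\x1b\\")))  # ESC \ as ST
```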

ISO 8859 special characters

The two special characters, NBSP and SHY, are not really control characters. They are graphic
characters with a special feature. The characters also appear in Unicode. They are included here
for the sake of completeness.

‡) The 2-character mnemonics for NBSP and SHY are from RFC 1345. They are not standardized.
Dec Hex Char Description Octal Pos ‡)

160 $A0 NBSP No-Break Space 240 10/0 NS

A space for use when a line break is to be prevented.

Note: NBSP can sometimes be produced by pressing Ctrl + Shift + SPACE . No
universally supported key combination exists.

In HTML you can write &nbsp; or &#160; to add a no-break space to a web page.

See also: SP

173 $AD SHY Soft Hyphen 255 10/13 --

Indicates an intraword break point for use when a word must be broken across
lines. The visual rendering either is a hyphen (ISO 8859) or varies (Unicode).

Note: SHY can sometimes be produced by pressing Ctrl + - . No universally
supported key combination exists.

In HTML you can write &shy; or &#173; to add a soft hyphen to a web page.
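
The Dec, Hex, Octal and Pos columns used in the tables above are four renderings of the same code value, where Pos gives the column and row of the code table (the high and low four bits). A minimal Python sketch, with the hypothetical helper name table_columns, reproduces them:

```python
def table_columns(code: int) -> tuple:
    """Return the (Dec, Hex, Octal, Pos) cells for one code value."""
    dec = str(code)
    hex_ = "$%02X" % code
    octal = "%03o" % code
    pos = "%d/%d" % (code >> 4, code & 0x0F)  # column / row of the code table
    return dec, hex_, octal, pos


print(table_columns(0x16))  # SYN
print(table_columns(0xAD))  # SHY
```

For SYN this yields 22, $16, 026 and 1/6, and for SHY it yields 173, $AD, 255 and 10/13, matching the table rows.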

[Chart: the C0 controls, SP, the C1 controls and NBSP laid out by hexadecimal code and marked by category: format effector, information separator, presentation control, area definition, device control, transmission control, delimiter, introducer, shift and graphic.]

Categories
A summary of character categories. Mostly based on ANSI X3.64, ISO 6429, ECMA-6 and ECMA-48.

Delimiters (Control string delimiters)


Delimiters start and end a control string. A control string consists of an opening delimiter, a
command string or a character string, and a terminating delimiter (ST).
APC DCS OSC PM SOS ST
Introducers
An introducer is a control character or escape sequence that begins a sequence. The
sequence is interpreted as a single graphic character or control.
CSI ESC SCI
Shift function characters (was: Code extension control)
Shift function characters are used to extend the character set of the code. They may alter
the meaning of one or more characters that follow them.
SI SO SS2 SS3
Format effectors (also: Layout characters)
Format effectors are mainly intended for the control of the layout and positioning of
information. Format effectors (most of them) are data which happen to have a format
representation rather than a graphic representation.
BS CR FF HT HTJ HTS IND LF NEL PLD PLU RI VT VTS
Information separators
Information separators separate and qualify data logically. They may be used either in
hierarchical order or non-hierarchically. Their specific meanings depend on the application.
FS GS RS US
Presentation control characters
Presentation control characters indicate where a line break may or may not occur.
BPH NBH
Graphic characters
Graphic characters appearing here are those that have control character like properties.
NBSP SHY SP
Area definition characters (was: Form filling)
Area definition characters are used for entering information into a preformatted visual
display.
EPA ESA SPA SSA
Device control characters
Device control characters are intended for the control of local or remote or ancillary devices.
They are not intended to control data communication systems; this should be done with
transmission control characters.
DC1 DC2 DC3 DC4
Transmission control characters (was: Communication control)
Transmission control characters are intended to control or facilitate transmission of
information over telecommunication networks.
ACK DLE ENQ EOT ETB ETX NAK SOH STX SYN
Miscellaneous
Miscellaneous control characters fall outside other categories.
BEL CAN CCH DEL EM MW NUL PU1 PU2 STS SUB
Not assigned
Unassigned control characters are ones that were not standardized. Their location was
reserved for future standardization. These characters are known by names that appeared in
a draft (DIS 10646), even though they didn't make it to the final standard.
PAD HOP SGCI
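
The category lists above can be turned into a simple lookup table. The Python sketch below copies the abbreviations from this section; the names CATEGORIES and CATEGORY_OF are assumptions made for this illustration, not part of any standard.

```python
# Category name -> member abbreviations, copied from the lists above.
CATEGORIES = {
    "Delimiters": "APC DCS OSC PM SOS ST",
    "Introducers": "CSI ESC SCI",
    "Shift function characters": "SI SO SS2 SS3",
    "Format effectors": "BS CR FF HT HTJ HTS IND LF NEL PLD PLU RI VT VTS",
    "Information separators": "FS GS RS US",
    "Presentation control characters": "BPH NBH",
    "Graphic characters": "NBSP SHY SP",
    "Area definition characters": "EPA ESA SPA SSA",
    "Device control characters": "DC1 DC2 DC3 DC4",
    "Transmission control characters": "ACK DLE ENQ EOT ETB ETX NAK SOH STX SYN",
    "Miscellaneous": "BEL CAN CCH DEL EM MW NUL PU1 PU2 STS SUB",
    "Not assigned": "PAD HOP SGCI",
}

# Reverse index: abbreviation -> category name.
CATEGORY_OF = {
    name: category
    for category, names in CATEGORIES.items()
    for name in names.split()
}

print(CATEGORY_OF["CSI"])
print(len(CATEGORY_OF))
```

The reverse index holds 68 entries: the 32 C0 controls, SP, DEL, the 32 C1 positions, NBSP and SHY, each in exactly one category.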

Translations
The translated terms are taken from the given standards. Several alternative translations may
exist.

English French Russian Spanish German Finnish

Standards: Unicode 5.0 (French); GOST 34.301-91 and GOST 34.302.2-91 (Russian); T.53, T.51, T.50 (Spanish); DIN 66003 (German); SFS 4017 * (Finnish)

NUL Null NUL nul ПУС пусто nulo Füllzeichen tyhjämerkki

SOH Start of Heading DET début d'en-tête НЗ начало заголовка comienzo de Anfang des otsikon alku
encabezamiento Kopfes

STX Start of Text DTX début de texte НТ начало текста comienzo de Anfang des tekstin alku
texto Textes

ETX End of Text FTX fin de texte КТ конец текста fin de texto Ende des Textes tekstin loppu

EOT End of FTR fin de transmission КП конец передачи fin de Ende der Über- tekstin loppu
Transmission transmisión tragung
ENQ Enquiry DEM demande КТМ кто там? pregunta Stationsauf- kysely
forderung

ACK Acknowledge ACC accusé de réception ДА подтверждение acuse de recibo Positive Rück- kuittaus
[positif] meldung

BEL Bell SON sonnerie ЗВ звонок timbre Klingel äänimerkki

BS Backspace EFF espace arrière ВШ возврат на шаг retroceso Rückwärtsschritt peruutus

HT Horizontal TAB tabulation horizontale ГТ горизонтальная tabulación de Horizontal- sarake-


Tabulation табуляция caracteres Tabulator ohjaus

LF Line Feed PAL changement de ligne ПС перевод строки cambio de Zeilenvorschub riviaskel
renglón

VT Vertical TAV tabulation verticale ВТ вертикальная tabulación Vertikal- rivitys


Tabulation табуляция vertical Tabulator

FF Form Feed SDP saut de page, page ПФ перевод формата página siguiente Formular- sivun vaihto
suivante vorschub

CR Carriage Return RC retour de chariot ВК возврат каретки retorno del carro Wagenrücklauf vaunun
palautus

SO Shift Out HC hors code ВЫХ выход cambio-salida Dauerum- koodinvaihto


schaltung

SI Shift In EC en code ВХ вход cambio-entrada Rückschaltung koodin-


palautus

DLE Data Link Escape ÉCT échappement AP1 авторегистр один escape de Datenüber- ohjaus-
transmission enlace de datos tragungsum- koodin poik-
schaltung keus

DC1 Device Control 1 CD1 commande d'appareil СУ1 символ устройства control de Gerätesteuerung laitteen
un один dispositivo uno 1 ohjaus 1

DC2 Device Control 2 CD2 commande d'appareil СУ2 символ устройства control de Gerätesteuerung laitteen
deux два dispositivo dos 2 ohjaus 2

DC3 Device Control 3 CD3 commande d'appareil СУ3 символ устройства control de Gerätesteuerung laitteen
trois три dispositivo tres 3 ohjaus 3

DC4 Device Control 4 CD4 commande d'appareil СУ4 символ устройства control de Gerätesteuerung laitteen
(Stop) quatre четыре dispositivo 4 ohjaus 4
cuatro

NAK Negative ACN accusé de réception НЕТ отрицание acuse de recibo Negative Rück- kielteinen
Acknowledge négatif negativo meldung kuittaus

SYN Synchronous Idle SYN synchronisation СИН синхронизация reposo síncrono Synchroni- tahditus
sierung

ETB End of FBT fin de bloc de КБ конец блока fin de bloque de Ende des Über- jaksonsiirron
Transmission transmission transmisión tragungsblocks loppu
Block

CAN Cancel ANN annulation ОТМ отмена cancelar Ungültig sanoman peruutus

EM End of Medium FS fin de support КН конец носителя fin del medio Ende der Auf- tietovälineen
físico zeichnung loppu

SUB Substitute SUB substitution ЗС замена символа substituto Substitution korvike

ESC Escape ÉCH échappement АР2 авторегистр два escape Umschaltung koodin
poikkeus

FS File Separator SF séparateur de fichiers РФ разделитель файлов separador de Hauptgruppen- tiedoston


fichero Trennung erotusmerkki

GS Group Separator SG séparateur de groupes РГ разделитель групп separador de Gruppen- ryhmän


grupo Trennung erotusmerkki

RS Record Separator SA séparateur d'enregis- РЗ разделитель записей separador de Untergruppen- tietueiden


trements, séparateur registro Trennung erotusmerkki
d'articles

US Unit Separator SSA séparateur de sous- РЭ разделитель separador de Teilgruppen- yksikön


articles элементов unidad Trennung erotusmerkki

SP Space ESP espace ПР пробел espacio Zwischenraum tyhjä

DEL Delete SUP suppression ЗБ забой suprimir Löschen merkin


poisto

PAD "Padding caractère de bourre


Character"

HOP "High Octet octet supérieur


Preset" prédéfini

BPH Break Permitted API arrêt permis ici РПС разрешение переноса corte permitido
Here строки aquí

NBH No Break Here PAI aucun arrêt ici ЗПС запрет переноса corte no
строки permitido aquí

IND Index IND index ИНД индекс

NEL Next Line NL à la ligne НС новая строка

SSA Start of Selected DZS début de zone НВО начало выбранной


Area sélectionnée области

ESA End of Selected FZS fin de zone КВО конец выбранной


Area sélectionnée области
HTS Horizontal TTH taquet de tabulateur УГТ установка горизон-
Tabulation Set horizontal тальной табуляции

HTJ Horizontal THJ tabulateur horizontal ГТВ горизонтальная


Tabulation with avec justification табуляция с
Justification выключкой

VTS Vertical TTV taquet de tabulateur УВТ установка


Tabulation Set vertical вертикальной
табуляции

PLD Partial Line Down IPav interligne partiel avant CCB смещение строки avance de línea
вперед parcial

PLU Partial Line Up IPar interligne partiel CCH смещение строки retroceso de
arrière назад línea parcial

RI Reverse Index IR index renversé, ОПС обратный перевод cambio de


interligne inversé строки renglón inverso

SS2 Single Shift Two RU2 remplacement unique ПЕ2 переключатель cambio
deux единичный два individual dos

SS3 Single Shift Three RU3 remplacement unique ПЕ3 переключатель cambio
trois единичный три individual tres

DCS Device Control CCA chaîne de commande УЦУ управляющая цепочка


String d'appareils устройства

PU1 Private Use One UP1 usage privé un ЧИ1 частное использова-
ние один

PU2 Private Use Two UP2 usage privé deux ЧИ2 частное использова-
ние два
STS Set Transmit MMT mise en mode УСП установка состояния
State transmission передачи

CCH Cancel Character ANC annulation du OTC отмена символа


caractère précédent

MW Message Waiting MES message en attente ОС ожидание сообщения


ATT

SPA Start of Guarded Protected Area DZP début de zone protégée НСО начало сохраняемой области

EPA End of Guarded FZP fin de zone protégée КСО конец сохраняемой
Protected Area области

SOS Start of String DC début de chaîne НЦ начало цепочки comienzo de


cadena

SGCI "Single Graphic introducteur de


Character caractère graphique
Introducer" unique

SCI Single Character ICU introducteur de ГЕС головной символ


Introducer caractère unique единичного символа

CSI Control Sequence ISC introducteur de ГУП головной символ introductor de


Introducer séquence de управляющей после- secuencia de
commandes довательности control

ST String Terminator FC fin de chaîne ТРЦ терминатор цепочки terminador de


cadena

OSC Operating System CSE commande de КОС команда


Command système d'exploitation операционной
системы
PM Privacy Message MP message privé ЧС частное сообщение

APC Application CO commande de КПП команда прикладной


Program PRO progiciel программы
Command

NBSP No-Break Space ESP espace insécable непрерывающий espacio anticorte yhdistävä
INS пробел välilyönti *

SHY Soft Hyphen CDN trait d'union гибкий дефис guión de corte pehmeä
conditionnel programable tavuviiva *

* Finnish terms marked with an asterisk are not from any standard, but from
recommendation Eurooppalaisen merkistön merkkien suomenkieliset nimet.

Character index
ACK Acknowledge NAK Negative Acknowledge

APC Application Program Command NBH No Break Here

BEL Bell NBSP No-Break Space

BPH Break Permitted Here NEL Next Line

BS Backspace NUL Null


CAN Cancel OSC Operating System Command

CCH Cancel Character PAD "Padding Character" (unassigned)

CR Carriage Return PLD Partial Line Down

CSI Control Sequence Introducer PLU Partial Line Up

DC1 Device Control 1 PM Privacy Message

DC2 Device Control 2 PU1 Private Use One

DC3 Device Control 3 PU2 Private Use Two

DC4 Device Control 4 (Stop) RI Reverse Index

DCS Device Control String RS Record Separator

DEL Delete SCI Single Character Introducer

DLE Data Link Escape SGCI "Single Graphic Character Introducer" (unassigned)

EM End of Medium SHY Soft Hyphen

ENQ Enquiry SI Shift In

EOT End of Transmission SO Shift Out

EPA End of Guarded Protected Area SOH Start of Heading

ESA End of Selected Area SOS Start of String

ESC Escape SP Space

ETB End of Transmission Block SPA Start of Guarded Protected Area

ETX End of Text SS2 Single Shift Two

FE0 Format effector 0 (Backspace) SS3 Single Shift Three

FE1 Format effector 1 (Character Tabulation) SSA Start of Selected Area

FE2 Format effector 2 (Line Feed) ST String Terminator

FE3 Format effector 3 (Line Tabulation) STS Set Transmit State

FE4 Format effector 4 (Form Feed) STX Start of Text

FE5 Format effector 5 (Carriage Return) SUB Substitute

FF Form Feed SYN Synchronous Idle

FS File Separator TC1 Transmission control character 1 (Start of Heading)

GS Group Separator TC2 Transmission control character 2 (Start of Text)

HOP "High Octet Preset" (unassigned) TC3 Transmission control character 3 (End of Text)

HT Horizontal Tabulation TC4 Transmission control character 4 (End of Transmission)

HTJ Horizontal Tabulation with Justification TC5 Transmission control character 5 (Enquiry)

HTS Horizontal Tabulation Set TC6 Transmission control character 6 (Acknowledge)

IND Index TC7 Transmission control character 7 (Data Link Escape)

IS1 Information separator 1 (Unit Separator) TC8 Transmission control character 8 (Negative Acknowledge)
IS2 Information separator 2 (Record Separator) TC9 Transmission control character 9 (Synchronous Idle)

IS3 Information separator 3 (Group Separator) TC10 Transmission control character 10 (End of Transmission Block)

IS4 Information separator 4 (File Separator) US Unit Separator

LF Line Feed VT Vertical Tabulation

LS0 Locking-Shift Zero (Shift In) VTS Vertical Tabulation Set

LS1 Locking-Shift One (Shift Out) XOFF Device Control 3

MW Message Waiting XON Device Control 1

Sources
 ASA standard X3.4-1963: American Standard Code for Information Interchange. Note: ASCII-
1963.
 USAS X3.4-1967: USA Standard Code for Information Interchange. United States of America
Standards Institute, New York, USA, 1967. Note: ASCII-1967.
 USAS X3.4-1968: USA Standard Code for Information Interchange. Reprinted as NIC 11246 in
Feinler & Postel (ed.): Arpanet Protocol Handbook. NIC 7104 Rev. Jan 1978. ADA-052 594.
Network Information Center, Menlo Park, California, USA. Note: ASCII-1968.
 ANSI X3.4-1977: American National Standard Code for Information Interchange. American
National Standards Institute, Inc, New York, USA, 1977. Also reprinted in McGraw Hill's
Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982. Note: ASCII-
1977.
 ANSI X3.4-1986: Coded Character Sets – 7-bit American National Standard Code for
Information Interchange. American National Standards Institute, Inc, New York, USA,
1986. Note: ASCII-1986.
 ANSI X3.32-1973: Graphic Representation of the Control Characters of American National
Standard Code for Information Interchange. Reprinted in McGraw Hill's Compilation of Data
Communication Standards, edition II, McGraw-Hill, 1982.
 ANSI X3.64-1979: Additional Controls for Use with American National Standard Code for
Information Interchange. American National Standards Institute, Inc, New York, USA, 1979.
 Bemer, R.W.: Inside ASCII. Best of Interface Age, Volume 2: General Purpose Software.
Oregon, USA (1980).
 Bies, Lammert: ASCII character map.
 Digital Research: An Introduction to CP/M Features and Facilities, version 1.3, 1976.
 ECMA-6: 7-bit Coded Character Set, 4th edition 1973, 5th edition 1985.
 ECMA-17: Graphic Representation of the Control Characters of the ECMA 7-Bit Coded
Character Set for Information Interchange, 1st edition (withdrawn).
 ECMA-35: Character Code Structure and Extension Techniques, 6th edition.
 ECMA-48: Control Functions for Coded Character Sets, 2nd, 3rd, 4th and 5th edition.
 Gerstung, Olaf: Tabellen — Verschiedenes. Bedeutung der Steuerzeichen im ASCII und nach
DIN 66003.
 GOST 34.301-91: Information technology. 7-bit and 8-bit coded character sets. Control
functions – ГОСТ 34.301-91 (ИСО 6429-88) Информационная технология. 7-битные и 8-
битные кодированные наборы символов. Управляющие функции.
 GOST 34.302.2-91: Information technology. 8-bit single-byte coded graphic character sets.
Latin alphabet No. 2 – ГОСТ 34.302-91 (ИСО 8859/2-87) Информационная технология.
Наборы 8-битных однобайтовых кодированных графических символов. Латинский
алфавит № 2.
 Helsingin yliopiston yleisen kielitieteen laitos: Eurooppalaisen merkistön merkkien
suomenkieliset nimet, 2. laitos, toukokuu 2004.
 ISO / R 646-1967 (E): 6 and 7-bit coded character sets for information processing
interchange, 1st edition December 1967. International Organization for Standardization,
Switzerland.
 ISO 646-1973 (E): 7-bit coded character set for information processing interchange. ISO
Standards Handbook 1: Information transfer, 1st edition, 1977. Also reprinted in McGraw
Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982.
 ISO 646:1991: Information technology – 7-bit coded character set for information processing
interchange.
 ISO 2022-1973 (E): Code extension techniques for use with the ISO 7-bit coded character set.
ISO Standards Handbook 1: Information transfer, 1st edition, 1977.
 ISO 2047-1975 (E): Information processing – Graphical representations for the control
characters of the 7-bit coded character set. ISO Standards Handbook 1: Information transfer,
1st edition, 1977.
 ISO/IEC 6429:1992 (E): Information technology – Control functions for coded character sets.
 ISO 1745-1975 (E): Information processing – Basic mode control procedures for data
communication systems. Reprinted in McGraw Hill's Compilation of Data Communication
Standards, edition II, McGraw-Hill, 1982.
 ISO/IEC 8859: Information technology – 8-bit single-byte coded graphic character sets. Note:
Mostly ISO/IEC 8859-1:1998: 8-bit single-byte coded graphic character sets -- Part 1: Latin
alphabet No. 1.
 ISO-IR 001: The set of control characters of the ISO 646. Note: ISO-IR 001 deviates slightly
from ISO 646-1973 in wording. DEL missing.
 ISO-IR 077: C1 Control Character Set of ISO 6429-1983.
 Jennings, Tom: An annotated history of some character codes, revised 29 October, 2004.
 RFC 20: ASCII format for Network Interchange. Note: Identical to USAS X3.4-1968 (ASCII-
1968). Missing Appendix A–D.
 RFC 1345: Character Mnemonics & Character Sets.
 SFS 4017: Tietojen vaihdossa käytettävä 7-bittinen koodi – 7-bit coded character set for
information processing interchange. Suomen standardisoimisliitto, Helsinki, Finland, 1977.
 UIT-T T.50 (04/92): Alfabeto internacional de referencia, (anteriormente alfabeto
internacional N.° 5 o IA5) – Tecnología de la información - Juego de caracteres codificado de
siete bits para intercambio de información.
 UIT-T T.51 (09/92): Juegos de caracteres codificados basados en el alfabeto latino para los
servicios de telemática.
 UIT-T T.53 (04/94): Funciones de control codificadas mediante caracteres para los servicios
telemáticos.
 Unicode, Inc.: Unicode 5.0, section française.
 Unicode, Inc.: The Unicode Standard, version 9.0.0, 2016.
 Unicode, Inc.: Unicode Character Database, NameAliases-9.0.0.txt.
 Whistler, Ken: Why Nothing Ever Goes Away (was: Re: Acquiring DIS 10646). Unicode Mail
List, 5 Oct 2015.
 Wikipedia: ASCII.
 Wikipedia: C0 and C1 control codes.
 Wikipedia: Control character.
 Wikipedia: Newline.
 Wikipedia: Software flow control.

Most of the sources have been consulted as of September/October 2011.

Special thanks for help to Douglas A. Kerr, the principal author and editor of the published
standards document of the first complete version of ASCII.

Last updated in August 2016: Unicode 9.0, CP/M, additional details on PAD, HOP and SGCI.
