ASSEMBLY LANGUAGE PRIMER for the
IBM® PC and XT
This book is one of the first to completely demystify assembly lan-
guage on the IBM PC or XT. Written in a lucid, easy-to-follow style, it
makes learning assembly language on this powerful 16-bit computer
not only a richly rewarding experience, but fun as well.
Assembly language is the fastest and most efficient language on
any computer. Once you've learned BASIC, assembly language is the
next logical step. It puts you in touch with the “soul” of the machine:
the way your computer really works. It's the only language to use if
you write programs that require great speed or precise control of
peripheral devices. Animated graphics, special sound effects, win-
dows, fast sorting, string handling, and enhancements to the operat-
ing system are just a few of the applications requiring assembly
language.
This book uses graphics and sound to provide interesting exam-
ples, and DEBUG and DOS functions to simplify your first programs
in assembly language. This book and your IBM PC are all you need
to move up to a new level of professional programming.
Robert Lafore
Robert Lafore is Managing Editor of The Waite Group, a company which
produces computer books in San Rafael, California. Mr. Lafore has worked
with computers since 1965, when he first learned assembly language on the
DEC PDP-5. He has programmed on many different machines, and is fluent
in a variety of computer languages. He holds degrees in mathematics and
electrical engineering, and is the co-author of Soul of CP/M, an assembly
language book for CP/M systems. Mr. Lafore founded Interactive Fiction, a
computer game company, and has also been a petroleum engineer in
Southeast Asia, a novelist, a newspaper columnist, a systems engineer for the
University of California’s Lawrence Berkeley Laboratory, and has sailed his
own boat to the South Pacific.
ASSEMBLY LANGUAGE PRIMER
for the
IBM PC & XT
by
Robert Lafore
A Plume/Waite Book
New American Library
New York and Scarborough, Ontario
NAL BOOKS ARE AVAILABLE AT QUANTITY DISCOUNTS WHEN USED TO PROMOTE
PRODUCTS OR SERVICES. FOR INFORMATION PLEASE WRITE TO PREMIUM MARKET-
ING DIVISION, NEW AMERICAN LIBRARY, 1633 BROADWAY, NEW YORK, NEW YORK
10019.
Copyright © 1984 by The Waite Group, Inc. All rights reserved. For information address New
American Library.
Several trademarks and/or service marks appear in this book. The companies listed below are the
owners of the trademarks and/or service marks following their names.
International Business Machines Corporation: IBM, IBM PC, IBM Personal Computer, IBM PC XT,
PC-DOS
Microsoft: MS-DOS, MBASIC
Digital Research: CP/M, CP/M-86
MicroPro International Corporation: WordStar
Apple Computer Inc.: Apple
Intel Corporation: Intel
Softech Microsystems: UCSD p-System
Epson Corporation: Epson
Atari Inc: ATARI
Lotus: Lotus 1-2-3
Information Unlimited Software: EasyWriter
AT&T Corporation: Bell Laboratories, Unix
ComputerLand
KayPro
Osborne
Xerox Corporation
LIBRARY OF CONGRESS CATALOGING IN PUBLICATION DATA
Lafore, Robert (Robert W.)
Assembly language primer for the IBM PC & XT.
“A Plume/Waite book.”
Includes index
1. IBM Personal Computer—Programming. 2. IBM Personal
Computer XT—Programming. 3. Assembler language (Computer
program language) I. Title
QA76.8.I2594L34 1984 001.64'2 84-3902
ISBN: 0-452-25711-5
PLUME TRADEMARK REG. U.S. PAT. OFF. AND FOREIGN COUNTRIES
REGISTERED TRADEMARK—MARCA REGISTRADA
HECHO EN HARRISONBURG, VA., USA
SIGNET, SIGNET CLASSIC, MENTOR, PLUME, MERIDIAN and NAL BOOKS are published in
the United States by New American Library, 1633 Broadway, New York, N.Y. 10019, in Canada by
The New American Library of Canada Limited, 81 Mack Avenue, Scarborough, Ontario M1L 1M8.
Book and cover design by Dan Cooper
Typography by Walker Graphics
First Printing, May, 1984
56789
PRINTED IN THE UNITED STATES OF AMERICA
Contents
Acknowledgments
Introduction
    Is Assembly Language Really So Hard to Learn?
    Why Is This Book Unusual?
    Why Learn Assembly Language on the IBM PC?
    Who This Book Is For
    The Equipment You Need to Use This Book
    The Approach Used in This Book
1 Assembly Language and Debug
    Assembly Language and Higher-Level Languages
    Microprocessors
    DEBUG Versus the Assembler
    The Window of the 8088's Soul
    Getting DEBUG Rolling
    Summary
2 Instant Program
    Writing Your First Program
    Running the Program
    What an Assembler Really Does
    Assembly-Language Instructions
    Summary
3 What Is Assembly Language?
    Filling in Details
    Registers
    ASCII Display Program
    Some Sound Advice
    Summary
4 Inside DOS: The Disk Operating System
    The Ports of DOS
    DOS Functions
    Writing to the Printer
    Summary
5 Introduction to the IBM MACRO Assembler
    MASM and ASM
    What Does an Assembler Do?
    Assembling Your First Program
    Assembling SMASCII2
    Deciphering Machine-Language Op-Codes
    Using a Batch File to Speed Assembly
    Summary
6 Using the IBM MACRO Assembler
    The BINIHEX Program
    New Instructions
    Using DEBUG's Trace Command
    The DECIBIN Program
    The DECIHEX Program
    Cross-Reference: Using the CREF Program
    Summary
7 How Does It Sound?
    Why Use Sound?
    The White Noise Program
    The Machine Gun Program
    Generating Sound with the Timer
    Controlling Sound with the Keyboard
    Summary
8 Memory Segmentation and EXE Files
    Memory Segmentation
    The PSTRING Program
    The PIANO Program as an EXE File
    The EXEFORM Program: A Nonprogram
    Segmentation and the String-Handling Instructions
    The Compare Strings Program
    Summary
9 Inside the ROM
    Scan Codes and the Keyboard
    Video ROM Routines
    Summary
10 Monochrome and Color Graphics
    Graphics Modes in the IBM PC
    Memory Mapped Graphics
    Color Graphics
    Drawing Lines
    Summary
11 Reading and Writing Disk Files
    The Historical Perspective
    Floppies and the Fixed Disk
    Sequential Access
    Random Access
    Random Block Access
    Summary
12 File Handle Disk Access
    Features of File Handle Access
    The ZOPEN Program
    The ZREAD Program
    Writing to a File
    Getting to the Middle of a File
    Summary
13 Interfacing to BASIC and Pascal
    General Interfacing Considerations
    Interfacing to BASIC with USR
    Interfacing to BASIC with CALL
    Interfacing to Pascal
    Summary
Appendix A: Hexadecimal Numbering
    What Is a Numbering System?
    What Numbering System Do Computers Like?
Appendix B: Supplementary Programs
    MEMSCAN
    HEXIDEC
    PRIME
    The Birthday Programs
    SAVEIMAG
Index
This book is dedicated to the Munchkins,
without whose patient support the Emerald City
would still be only a myth.
ACKNOWLEDGMENTS
The author would like to thank Mitchell Waite, who edited the
manuscript of this book, suggested many important improvements,
and provided moral support; John Angermeyer, whose technical
expertise eliminated a variety of errors; and Janet Hunter, whose
attention to detail was essential to the finished work.
Introduction
The purpose of this book is to teach you how to write programs in
assembly language. Why would you want to study a computer language
which has acquired the reputation of being somehow mysterious and
difficult to learn?
Assembly language is always the fastest and most powerful language
available for a given computer. It is essential in programs where pure
speed of operation is important, such as graphics, sorting, and sustained
number-crunching. It is also the only language that can make use of all
of a particular machine's hardware features. With higher-level languages,
such as BASIC or Pascal, the programmer is always insulated from the
computer by the language itself — you can only do what the writers of
the language decided you should be able to do. Inevitably, then, you
cannot tap the full power of the computer.
For these reasons, many types of programs, such as operating
systems, compilers, word processors, and graphics programs, are almost
always written in assembly language. So, if you want to do this sort of
programming, you need to know assembly language.
But assembly language is not just practical, it is also a fascinating and
rewarding field of study. It is so closely tied to the physical reality of the
computer that it does not suffer from the somewhat arbitrary quality of
higher-level languages. Everything you do in assembly language is the
result of the way the computer operates, not the way the designers of a
higher-level language decided to do things for the sake of ease and
convenience.
We can think of higher-level languages as being like stodgy luxury
sedans: they're comfortable and easy to use, but the steering is imprecise,
the suspension insulates you from the feel of the road, and if you try to
push them too fast they slide into the ditch.
Assembly language, on the other hand, is the sports car of computer
languages. In a sports car you're close to the road. The steering, brakes
and gears are light and precise, and the car is built for speed and
efficiency. It may not be quite as comfortable as a sedan, but it’s fast, and
more importantly, it’s fun to drive.
Assembly language is fun in the same way: it’s fast, it’s efficient, and it
gives you the satisfaction of having complete control over a powerful and
finely-tuned machine.
Is Assembly Language Really So Hard to Learn?
Unfortunately, assembly language has developed the reputation of
being difficult to learn. Many people — even those who had no trouble
learning a higher-level language such as BASIC — think that assembly
language is somehow beyond them. This belief is fostered by many books
on assembly language, which, strange as it may seem, appear to be
written with the assumption that the reader already knows all about the
subject the book is attempting to teach! For instance, many
assembly-language books start off by listing and describing all of the scores of
machine instructions. This is a good bit like giving a student in a first-
year French class a dictionary, and telling him that as soon as he's
memorized it he can go on to the next lesson! There must be an easier
way.
We believe that assembly language, in spite of its reputation, is
actually not too much harder to learn than any other computer language,
provided that it is presented gradually and easily, so that the reader does
not feel overwhelmed at the beginning. It’s this sort of easy, step-by-step
presentation that we have attempted to achieve in this book. For this
reason we've avoided “clever” programming; that is, shortcuts which
increase the speed or compactness of the program at the expense of
clarity. Once a program has been written in an obvious way, it can always
be modified to make it faster or smaller. Like poetry, very compact
programs can be beautiful once you understand them, but require far
more time to understand than a more obvious, less compact routine.
Why Is This Book Unusual?
Assembly Language Primer for the IBM PC and XT is unusual — and we
believe superior to other books on the market — in several respects. First,
it not only teaches assembly language, it teaches it in the context of a
particular computer: the IBM PC. As we'll see, this provides significant
advantages over books that try to cover all the computers which use a
particular microprocessor chip.
Second, this book makes use of the built-in DOS function calls, which
vastly simplify programming and can make even short programs
powerful.
Third, to make things easy for the complete novice, this book makes
extensive use of IBM’s DEBUG utility, which provides a far simpler and
less threatening introduction to assembly-language programming than
more conventional approaches that plunge you immediately into the
complexities of a full MACRO Assembler program.
Finally, this book uses the graphics and sound capabilities of the IBM
PC. This makes the learning experience more interesting. Plus, by
making use of these features, you can ensure that the programs you
write will be fun to use. If you decide to market your programs, graphics
and sound will make them more popular and profitable.
Why Learn Assembly Language on the IBM PC?
If you are interested in writing programs for commercial use, the
answer to the question posed above must be obvious: the IBM PC enjoys
unprecedented sales growth. If you write a popular program for the IBM
PC, you are guaranteed one of the largest markets in the personal
computer field. There are other reasons, however, why the IBM PC is an
especially appropriate computer on which to learn assembly language.
First, both the hardware and the software on the IBM PC are top
quality. They are solid and reliable. You don’t have to worry — as you do
on some machines — that a hardware failure will suddenly cause a
system crash and destroy a file you've spent hours creating, or that a
mysterious bug in the assembler will prevent your program from
assembling correctly, even though it is correctly written.
Second, if you want to be in the forefront of what’s happening in
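computers, it’s important to learn about the new 16-bit technology. Internally
the 8088 is a 16-bit processor: its registers and instructions work on 16-bit
quantities, although it exchanges data with memory 8 bits at a time. This makes
the IBM PC an ideal “stepping stone” to using a full 16-bit 8086 system.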
Finally, the IBM PC Disk Operating System (PC-DOS) is far more
powerful and versatile than earlier microcomputer operating systems. By
writing programs under this operating system you ensure not only that
your programs can make use of an extensive number of powerful DOS
functions, but that you are learning how a sophisticated, state-of-the-art
system operates. You also benefit from the fact that PC-DOS is very
similar to (and is in fact derived from) another operating system, MS-
DOS. By writing programs that run under PC-DOS you can (if you
follow a few simple rules) ensure that the same programs will run under
MS-DOS. MS-DOS is used on many non-IBM computers, so if you are
interested in marketing your product, you will have a program you can
sell to IBM PC owners and to owners of a host of other computers as well.
Who This Book Is For
This book is primarily intended for the person who has no previous
experience in assembly language programming: the rank beginner.
However, it will benefit the programmer who knows assembly language
for a different microprocessor, such as the 8085, Z-80, or 6502, and who
wants to learn how the 8088 family of chips works.
The Rank Beginner
If you have never written in assembly language, and have only a
vague idea what it’s all about, then this book is for you. We start at the
very beginning, without inflated expectations about your knowledge of
the subject.
Although the reputation that assembly language has for being
difficult to learn is largely undeserved, many people still find it a bit less
obvious than the simpler higher-level languages such as BASIC and
Pascal. For this reason, we recommend that you have some experience
with a higher-level language before you read this book. Although it is
possible to learn assembly language as a first computer language, it’s
probably easier to cut your teeth on BASIC.
Once you know a little about a higher-level language, you'll not only
understand in general what computer languages are supposed to do, but
you will also have picked up the jargon and some of the ideas that are
necessary for a real understanding of computers.
The Experienced Assembly Language Programmer
Although this book is oriented toward the beginner, you will still find
it valuable if you are an experienced assembly-language programmer
who is not yet familiar with the 8088 microprocessor and its
implementation in the IBM PC. You may whiz through the book faster
than the beginner, but even the initial chapters will be of interest, since
it’s here that you will learn how to use DEBUG — an essential tool —
and various other useful skills.
In fact, if you are used to 8-bit microprocessors, you will find the 16-
bit 8088 to be, in many ways, a whole new ball game. The use of memory
segmentation, the extensive instruction set, the implementation of
graphics and sound, the string-handling instructions, and the multiple
addressing modes all require thorough examination, which this book
provides.
The Equipment You Need to Use This Book
In this section we're going to discuss the equipment, both hardware
and software, you need to best profit from this book.
Hardware
This is very much a “hands-on” book. Although you can gain a
general understanding of assembly language by reading it without a
computer at your disposal, you will be far better off if you have the
computer on your desk before you start to read. As with other computer
languages (and non-computer languages) it’s only through practice that
real mastery is achieved.
So we'll assume that you have access to an IBM Personal Computer,
either a model with one or two floppy diskette drives, or the newer model
with the fixed disk: the PC XT. You definitely cannot use the cassette-based
version of the PC, since the assembler program, various other
software, and the entire DOS function approach used in this book, all
require the disk operating system. Very few IBM PCs are sold in the
cassette configuration, but if yours is one of them, rush out today and
buy a set of floppy disk drives. If you're serious about computers, you
won't regret it.
Memory Size and the Assembler
How big a memory do you need to create assembly-language
programs? That depends on which assembler you want to use. When you
buy the standard IBM MACRO-Assembler, you actually get two
assemblers in the same package: MASM and ASM. MASM stands for
“Macro-ASseMbler,” and is the full-scale assembler with all the bells and
whistles. If you use this program you'll need a minimum of 96K, and
you'll find that 128K is more useful.
ASM, which is sometimes called the “Small Assembler,” is a more
modest version of MASM. It leaves out some of MASM's more advanced
features, such as MACROs and conditional assembly, and in consequence
requires considerably less memory space. ASM will run in a 64K system
if you are using PC-DOS version 1.00 or 1.10, but again you will
probably be happier with more memory — 96K or 128K — especially if
you plan to write large programs. However, if you are using DOS version
2.00, then you will need a minimum of 96K, with 128K being preferable.
Since this book does not describe MACROs and conditional assembly,
there’s no problem using ASM. In fact, ASM even has some advantages:
since it’s smaller, it loads faster and takes up less space on your disk.
Thus we use ASM throughout the book (although you can use MASM if
you want, and if you have enough memory).
So the answer to how much memory you need is: an absolute
minimum of 64K, provided you are using DOS version 1.00 or 1.10, and
ASM. However, we recommend that you upgrade to 128K if you can.
Display Monitors
You can use this book with any of the display options available on the
PC: either the monochrome monitor used with a monochrome adapter
board, an RGB (red, green, blue) color monitor, a non-IBM black and
white monitor, or a TV set hooked up to the color graphics adapter
board via an RF (radio frequency) modulator. Any of these options will
permit you to operate the examples in this book with one exception: if all
you have is the monochrome display, you won't be able to make use of the
section on color graphics, in chapter 10. However, if you have any sort of
monitor connected to the color graphics adapter board, you will be able
to explore both color graphics and character graphics.
The examples used in this book are all based on an 80-column
display. With TV sets, and some low-quality color monitors, an 80-
column display isn’t practical because the screen resolution is so low that
the characters get fuzzy; if that’s the case then 40 columns must be used.
If you're using 40 columns, you will need to do a little mental
reformatting to compare the printouts in this book with those on the
screen, which will be “wrapped around”; but this should not be a major
problem.
Printers
It's very nice but not absolutely necessary to have a printer when
writing assembly language programs. Especially as your programs grow
longer, looking at a printed listing rather than at the same listing on the
screen will be much more convenient and will give you a better idea of
the overall operation of your program. Also, when debugging a program,
it’s nice to be able to look at the listing at the same time you're executing
the program and watching the results on the screen.
However, most of the programs in this book are short enough that a
printer isn’t really necessary. A printer is like a house in the country: if
you have one you'll love it, but if you don't you'll get along just fine
anyway.
As you become deeply involved in assembly language programming,
to the point where you're writing really long programs, then the ideal
printer would have more than 80 columns; say 132. This gives you room
on your listings for line numbers and extensive comments. Line numbers
are a useful addition to long programs because they can be used to
create a cross-reference file of symbolic names, as we'll see in chapter 6
when we discuss the CREF program.
One way to get a “wider” printer, if you have an IBM dot-matrix
printer or an Epson MX-80 or FX-80, is to set it to “compressed” mode.
(In chapter 4, we show you the techniques you'll need to write a program
to do this.) Compressed mode gives you a 136-character width. However,
the characters are somewhat harder to read.
Normally the standard 80-column printer is fine. The listings used in
this book were originally generated with a standard Epson MX-80 in
normal mode.
Documentation
Along with your IBM PC you'll want to have the IBM Personal
Computer Technical Reference manual, available from IBM. It is packed full
of details on the operation of the PC. Many of these details will become
important to us as we explore the things that assembly language can do.
Also, appendix A of the manual contains a complete listing of the ROM
routines built into the computer. After you learn about these routines in
chapter 9, you'll find that appendix A will make fascinating reading.
(You'll need various other manuals from IBM as well: we'll discuss them
in the section on software.)
IBM-Compatible Computers
Many computers claim to be “IBM-compatible,” meaning that they
will run the same software (and in some cases use the same hardware) as
the IBM PC. So do you absolutely have to have an IBM PC to use this
book? Maybe not — it depends on which computer you have, since there
are various degrees of compatibility.
As a minimum you need a computer that runs the MS-DOS
operating system (from which PC-DOS, the system used on the PC, is
derived). This way the DOS function calls — which form such an
important part of this book — will still apply. However, that’s only part of
the story. If you want to benefit from the chapter on color graphics, then
your computer will have to use the same approach to graphics as the
IBM PC does. If you want to understand ROM functions, then the ROM
in your computer should operate the same way the IBM PC’s does. If you
want to use the loudspeaker to generate sound, then your computer will
have to do it in a similar way to that of the IBM PC. And so on.
Some computers are compatible in most of these respects, and others
in only a few of them. If you have an IBM-compatible computer, you can
give this book a try and see how far you get. Most things will probably
work. But it takes a detailed understanding of the features of the IBM PC
and another particular computer to know in advance how compatible
they really are.
Figure I-1 summarizes the hardware needed to use this book.
Software
Let’s assume that you have an IBM PC or XT with the necessary
peripherals as described previously. What software do you need to use for
this book?
The Operating System
For starters you'll need the PC-DOS. At this writing, version 1.10 of
this system is included with the IBM PC, and version 2.00 is included
with the fixed-disk version of the IBM PC: the XT. For PC owners
version 2.00 is available as an option for a very reasonable price.
This book will not be very useful if you are running an operating
system other than IBM PC-DOS, such as CP/M-86 or UCSD p-System.
Why? Because (among other reasons) our programming examples make
extensive use of the specific DOS functions built into PC-DOS. If you use
a different operating system, these functions will in most cases be
different, and the programs won't work.
Figure I-1. Hardware needed for this book
    IBM PC or XT with 64K of memory or more
    Display: monochrome display, RGB color monitor, black and white monitor, or TV set
    Disk: single diskette drive, dual diskette drives, or fixed disk plus diskette drive
    Printer (very nice, but not essential)
Which Version of PC-DOS?
This book will work with any of the current releases of PC-DOS:
1.00, 1.10 and 2.00. We expect that it will also work with any future
releases. However, there are some advantages to using version 2.00 (or
later), instead of 1.00 or 1.10. First, IBM PC-DOS version 2.00 contains
a very useful enhancement to the DEBUG program which is part of the
DOS. This is a “mini-assembler,” built right into DEBUG. As we
mentioned earlier, we will write a number of programs using this
DEBUG mini-assembler rather than the more cumbersome ASM (or
MASM). It is possible to do this using the older versions of DEBUG that
do not have this mini-assembler capability (and we show you how to do
it), but it’s easier to create the program examples if you have it.
The second reason why IBM PC-DOS version 2.00 is preferable has
to do with the way disk files are accessed. Version 2.00 introduces a whole
new system of file access, called “file handle access,” which we cover in
chapter 12. File handle access is a very powerful and flexible system. It
uses pathnames rather than simple filenames, and is therefore the only
system that will work if you have a hard disk drive. Thus if you are
interested in learning about this latest file access method, you will need
version 2.00.
There are, however, some disadvantages to using PC-DOS 2.00. The
first is its size. If you have a small amount of memory, like 64K, you will
find that version 2.00 takes up so much space that you don’t have room
for the assembler and assembly-language programs. So if you have a 64K
system, stick to DOS version 1.10. This book will work fine with 1.10,
except for the slight inconvenience in writing programs in DEBUG, and
the inability to perform file handle disk access.
The second disadvantage of 2.00 is compatibility. If you write a
program in version 1.10, it will work on version 2.00. However, most
version 2.00 programs will not work on version 1.10 or 1.00. If you're
writing programs to be used on the widest possible range of
different PCs, then 1.10 should be your choice.
In sum, we recommend that you use PC-DOS version 2.00 if you can.
Its increased capabilities, especially the “mini-assembler” in DEBUG,
make it well worth the modest price.
DOS Utility Programs
Along with PC-DOS you get a number of utility programs, which are
referred to in IBM’s documentation as “external routines” (to distinguish
them from the functions built right into the PC-DOS program, which are
called “internal routines”). Three of these programs are essential to the
use of this book. They are:
1. DEBUG. This program is used to monitor, debug and edit
assembly-language programs. Learning to use it, which we teach you
in the first few chapters, is vital to an understanding of assembly
language.
2. LINK. This program is used to change an intermediate form of
assembly-language programs, called OBJ (object) files, into an
executable program called an EXE (executable) file. (These terms will
all be explained in the following chapters.)
3. EXE2BIN. This program converts EXE files to COM (command)
files. COM files are another, somewhat simpler, form of executable
program.
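As a rough preview of how these three utilities fit together, here is a minimal sketch of a one-character display program, with its build steps listed in the opening comments. The file name HELLO.ASM, the segment name and the label are assumptions made purely for illustration, and the command lines use the semicolon form that accepts the default file names; they may need adjusting for your version of the assembler.

```asm
; HELLO.ASM: hypothetical file name, chosen only for this sketch.
;
; Assumed build steps, typed at the DOS prompt:
;   ASM HELLO;                    HELLO.ASM (source) -> HELLO.OBJ (object file)
;   LINK HELLO;                   HELLO.OBJ -> HELLO.EXE (LINK warns that there
;                                 is no stack segment; that is expected here)
;   EXE2BIN HELLO.EXE HELLO.COM   HELLO.EXE -> HELLO.COM
;   DEBUG HELLO.COM               optional: examine or trace the program
;   HELLO                         run the finished COM file

cseg    segment
        assume  cs:cseg, ds:cseg
        org     100h            ; COM programs begin at offset 100h
start:  mov     ah, 2           ; DOS function 2: display the character in DL
        mov     dl, 'A'         ; the character to display
        int     21h             ; call DOS
        int     20h             ; return control to DOS
cseg    ends
        end     start
```

The program is written in COM-file form (a single segment with ORG 100h) because EXE2BIN can only convert programs laid out this way.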
Operating System Documentation
Along with the PC-DOS operating system described above, you'll also
need the IBM Personal Computer Disk Operating System manual which
accompanies it. The manual is the definitive word on the operating
system, and also on the various utility programs such as DEBUG, LINK,
and EXE2BIN. Although we explain how to use these utilities, you will
still find the manual important for reference. Also — and this is very
important — appendix D of the manual is a list of all the DOS functions
available in the operating system. We will explain how to use many of
these functions, but for those not covered, and as a reference to all of
them, the IBM manual is invaluable.
Assembler Programs
You will need the IBM PC MACRO-Assembler, a software package
offered as an option by IBM. This package contains two different
assemblers: ASM and MASM. If you have a 64K system you will have to
use ASM. If you have more memory you can use MASM, although (as we
noted earlier) we recommend using ASM for the examples in this book
because of its smaller size and faster loading.
Another program in the MACRO Assembler package which you may
find useful is CREF; it produces a cross-reference table of the variable
names used in your program.
Of equal importance to the assembler program itself is the IBM
Personal Computer MACRO Assembler manual which accompanies it. The
manual contains a complete list of all the 8088 instructions, a list of all
the pseudo-operations used with the assembler, and descriptions of the
various other conventions you'll need to know to use the assembler.
Text Editor or Word Processor Program
In order to create the source files for assembly-language programs
you'll need some sort of text editor or word-processing program. If you're a
Pascal programmer you're used to this process and you know what
“source files” are, but if you’ve only programmed in BASIC, the idea of
preparing a source file may be new to you. Source files (also called ASM
files) for assembly-language programs are text files, just like letters or
other documents. They constitute the first step in the assembly process
(unless you're using DEBUG). To create a source file you'll need a word-processing
program such as IBM’s Personal Editor, WordStar, EasyWriter,
or any one of the dozens of other excellent programs on the market.
There is a text-editing program which is one of the utility programs
that comes with PC-DOS. It’s called EDLIN (for EDit LINes). It is
possible to use EDLIN to create assembly source files. In fact, it works
fairly well for short programs. However, as your programs become longer,
EDLIN’s limitations become increasingly apparent.
For one thing, EDLIN is a “line-oriented” (as opposed to a “screen-
oriented”) text editor. This means that you have to specify what line you
want to edit, rather than simply move the cursor to that line; this makes
it difficult to “move around” in the file. This and other factors make
EDLIN suitable only for very short source files.
If you don't already have a good word-processing program, or a full-
screen text editor, our advice is to go out and buy one and become
familiar with it before you become deeply involved in assembly language.
However, it is beyond the scope of this book to recommend a word
processor, or to describe how to use it.
The Approach Used in This Book
As we mentioned above, this book is unusual in several respects. The
most important of these is that it teaches assembly language in the
context of a specific computer: the IBM PC, rather than for all computers
using a particular microprocessor. What's so unusual about a book that
teaches assembly-language programming for a particular computer? And
why is this a superior way to learn programming?
Assembly language consists of instructions to a particular
microprocessor. The microprocessor chip that powers the PC is the Intel
8088. This microprocessor is the “brains” of the computer. Physically it’s
very small, consisting of a slice or “chip” of silicon no bigger than your
thumbnail; but mentally it’s a giant. The microprocessor interprets
instructions you send it in machine language (created by assembly
language), and — based on these instructions — causes the computer to
do all the things computers do so well: getting data from the outside
world, processing it, and outputting it again.
So what would be wrong with a book that taught assembly language
for the 8088 microprocessor, without regard to a specific computer?
There are a number of books that attempt this approach; they are
supposed to work with any computer that contains an 8088 or 8086
microprocessor chip. But the fact is, it’s very difficult to learn assembly
language without reference to a particular computer. There are several
reasons for this.
First, while the actual instructions to the 8088 chip may be the same
on different computers, assemblers (the programs that translate these 8088
instructions into a form the computer can understand) may be different
on different machines. So an assembler format that works on one
computer may not work on another. If you’re reading a book that
describes the assembler on machine A, and you're using machine B, then
the programs you write may well not run.
Second, there are always a great many differences between computers
in such seemingly minor areas as the way the keyboard is used, the
format of the screen display, and the operating system commands
necessary to accomplish a given task. Since we already know in this book
what machine you're using, we can tell you exactly what keystrokes and
commands to use to accomplish a given task, such as assembling your
program, linking it, and trying it out. No general 8088 book can do that.
There's a third reason it’s more effective to teach assembly language
for a particular computer rather than for a particular chip. Many
computers — including the IBM PC — contain, buried deep within the
Disk Operating System (DOS), a collection of routines which can be used
by assembly-language programmers to vastly simplify the programs they
write. In fact, these routines are so powerful, and such an integral part of
today’s sophisticated computers, that their use is almost essential for all
but the most trivial programs. However, since these DOS routines differ
from one machine to another, no book which attempts to teach 8088
assembly language in general can make use of them. This book makes
extensive use of DOS functions. In fact one of the goals of the book is to
teach you everything you need to know to make full use of these
powerful software tools.
What it boils down to is this: given the advantages of our approach as
outlined above, we think you'll find it easy and enjoyable to learn
assembly language from this book.
1
Assembly Language
and Debug
Concepts
Assembly language versus higher-level languages
Using DEBUG
Memory
Memory addressing
ASCII codes
Debug Commands
D = Dump
F = Fill
In this chapter we're first going to talk about assembly language in
general. We'll explain how it differs from higher-level languages such as
BASIC or Pascal, and talk in a general way about the operation of an
assembler and how it differs from the interpreter or compiler used in
higher-level languages.
In the second part of this chapter we'll introduce you to DEBUG, the
utility program which will be your gateway into assembly-language
programming.
Assembly Language and Higher-Level Languages
As is true with most computer languages, it's hard to describe
assembly language meaningfully without reference to examples of specific
programs. In the next chapter you'll encounter your first
assembly-language program, and then you'll begin to see what assembly language
is all about. In the meantime, we'll provide an overview concerning what
assembly language is, and how it differs from other computer languages.
Higher-Level Languages — More Abstract
If you are familiar with a higher-level language such as BASIC or
Pascal, you know that there is a certain level of abstraction involved in
program statements in these languages. A BASIC statement such as
LET A = 3
or, in Pascal,
A := 3;
is operating on an abstract level in that we don’t usually know, or need to
know, where in the computer the “A” is, or what changes are taking place
in the computer when A is assigned the value 3. This is because
higher-level languages are oriented toward the handling of numbers with
algebra-like formulas. Thus FORTRAN, one of the earliest of the higher-
level languages, stands for FORmula TRANslator — a language in which
it is easy to express formulas. BASIC is a descendant of FORTRAN, and
it too, like Pascal, is oriented primarily toward processing numerical
data in this abstract, algebraic context. Programmers in these languages
want to be insulated from what's really going on inside the computer so
they can concentrate on the formulas.
Analogy — a Newspaper Office
As an analogy to a higher-level language, we can think of a
newspaper office. Reporters write stories about the affairs of the day: an
election in Pennsylvania, a flood on the Mekong River, a riot in Bombay.
This information is all transmitted to the newspaper office. There it is
edited, typeset, pasted up, printed, and finally distributed to newsstands
and tossed by small children into people’s driveways.
The data processed by the newspaper is abstract. Although you can
touch the medium (paper) that contains it, you can’t touch the actual
news: you can’t build houses out of headlines or drive gossip to work.
The value of news lies in the information itself.
In a similar way a computer program written in a higher-level
language is concerned with something abstract: variables representing
numbers and characters.
Assembly Language — More Concrete
In contrast, assembly language operates on a very concrete level. It
deals with bits, with bytes, with words (two bytes side-by-side), with
registers — which, as we'll see, are physical places in the microprocessor
where bytes and words are stored — and with memory locations, which
have specific numerical addresses and specific physical locations in the
memory chips inside the PC.
An analogy to assembly language might be a brick factory. In this
factory, clay, water, and energy to run the kilns are the raw materials. The
factory performs certain operations on these raw materials, and the
output from the factory is the bricks themselves, packaged in bundles
which can be lifted onto trucks by forklifts and delivered to building sites.
You can touch a brick, but you can’t touch a news story. Similarly, you
can (or could, if you were very small) touch the registers and memory
locations that assembly language deals with, while you can’t touch the
variable “A” in a BASIC program. (See Figure 1-1.)
The General and the Specific
If you travel to another city, you will have to buy another newspaper,
but the news will be much the same. We could say that the “program” —
the series of operations used to generate the news — is similar in most
newspapers. Higher-level languages are similar in that they can run on a
variety of different computers: the BASIC program on my IBM PC will
probably (with some minor modifications) run on someone else’s
Apple.
Figure 1-1. Assembly language and higher-level languages (assembly language: as substantial as a brick; higher-level languages: as abstract as yesterday's news)
In the brick factory, on the other hand, the operations are much more
specific. The clay must be dumped into tanks, the water must be mixed
in, and the kiln must be heated to a certain temperature. These
procedures are applicable only in one particular factory. If the foreman
says to turn up the temperature of kiln number five to 2000 degrees, this
instruction is tailored specifically to the physical equipment of one
particular factory. In a similar way, programs written in assembly
language are specific to a particular microprocessor chip, and in many
cases to the specific computer which contains the chip as well.
What Does an Assembler Do?
If you've written programs in BASIC, you're familiar with the two-
step process involved: first you write a group of BASIC program
statements which make up a program; then later, when you execute the
program, these statements are “interpreted,” or changed into machine-
language instructions which are executed by the 8088 microprocessor.
(We'll have a lot more to say about machine language in the following
chapters. Don’t worry if you don’t completely understand what we mean
by it at this point.)
You may not be very aware of this interpretation process in BASIC,
since it is made to appear “invisible” to the user; but it takes place
nevertheless. The individual program lines are interpreted one at a time,
and the resulting machine-language instructions for each line are
executed by the 8088, before the next line is interpreted. (Refer to Figure
1-2 for a simplified view of this process.)
In compiled languages such as Pascal, things are handled a little
differently. The user first creates a source file, which is a text file of the
entire program. This is then changed into machine-language instructions
by a compiler program. (Actually a linker is used too, but we'll ignore it
for the moment.) In a compiled language such as Pascal, the entire
program is transformed into machine language at once.
Assembly language resembles a compiled language more than it does
an interpreted language such as BASIC. An assembler source file
consisting of the text of the program is first created. This is then
assembled into machine-language instructions by an assembler program.
The assembler performs a process very similar to a compiler, except that —
as we'll see in the next chapter — there is a far closer correspondence
between an assembly-language instruction and a machine-language
instruction than there is between a Pascal statement and the resulting
group of machine-language instructions.
What we've described is the traditional way of transforming an
assembly-language program into machine-language instructions.
However, in the first few chapters of this book we'll use a different
approach: a feature of the DEBUG program called a “mini-assembler.”
Using DEBUG it’s almost as easy to create and run short
assembly-language programs as it is to create and run programs in an
interpreted language such as BASIC. We'll introduce you to DEBUG later in this chapter, and
show you how it can be used to assemble a program in chapter 2.
Figure 1-2. Interpreters, compilers, and assemblers (an interpreter
turns a single line of BASIC code into a small group of machine-language
instructions; a compiler turns an entire Pascal program into a large group
of machine-language instructions; an assembler turns an entire
assembly-language program into a large group of machine-language
instructions)
Microprocessors
We've mentioned microprocessors several times. A microprocessor is a
single chip of silicon which performs all the basic functions of a
computer. Because assembly language is inextricably entwined with a
particular microprocessor chip — the Intel 8088 in the case of the IBM
PC — we'll talk a bit here about the 8088 and its history. Figure 1-3 gives
a representation of the 8088's development.
Figure 1-3. 8088 family tree
The very first microprocessor was the 4004, manufactured by the
Intel Corporation. It appeared in 1971. Before the 4004, computers were
made differently. The earliest solid-state computers had thousands of
individual transistors mounted on hundreds of printed circuit boards,
which occupied enormous cabinets in air-conditioned rooms and cost
hundreds of thousands of dollars. Later, integrated circuits — which put
a dozen or more transistors in a little package — reduced the size of a
computer to somewhat smaller cabinets in rooms that weren't necessarily
air-conditioned, but the computer still cost in the six or even seven-figure
range.
The 4004, in what is surely one of the most astonishing
accomplishments of our age, squeezed all these cabinets into an object so
small it would blow away if you sneezed, and cost (in quantity) less than a
good dinner.
The 4004 was not really a very powerful microprocessor. It operated
on data which was only 4 bits wide, and had a rather rudimentary
instruction set. But it was followed soon after by the first 8-bit
microprocessor, the 8008. The 8008 evolved into the 8080 (a much easier
to use 8008), and then into the 8085, a refined 8080, which is still in use
in millions of computers.
The next major advance was to go from eight bits to sixteen bits,
since 16-bit microprocessors provide more power and the capability to
use a larger memory than do their 8-bit cousins. The microprocessor that
achieved this breakthrough was the Intel 8086. The 8086 operates on
16-bit data: it requires a 16-bit memory, 16-bit data buses (which connect
the components of the computer system together), and other 16-bit
peripheral devices.
However, because 8-bit computers have been around for so long,
many of these peripheral devices exist at a reasonable price only in 8-bit
form. So Intel created another version of the 8086, which it called the
8088. The 8088 has an internal architecture like the 8086: the same 16-
bit registers. But when it talks to the outside world, it does so with 8-bit
data: one byte at a time. Thus the memory and peripherals used with an
8088 can be the tried and true (and cheaper) 8-bit models. This reduces
the cost of the computer system, and is the approach used in the IBM
PC.
DEBUG Versus the Assembler
As we noted above, there are two major ways to write short assembly-
language programs on the IBM PC. The first way is to use the assembler
program ASM, or its more sophisticated cousin MASM. (We explained
the difference between these two programs in the Introduction.) People
usually write assembly-language programs in one or the other of these
assembler programs. (Yes, we know you may not be entirely clear at this
point what an assembler program is supposed to do. That's all right;
we'll get to it soon.)
The other way to write assembly-language programs is to use a
different kind of program called DEBUG. DEBUG is not really an
assembler program. Its primary use is for “debugging” (that is, fixing the
errors in) assembly-language programs. However, you can also write short
assembly-language programs with DEBUG.
We've chosen to write the programs in the first few chapters of this
book using DEBUG. There are several reasons for this. First, DEBUG is
a much easier program to operate than the ASM (or MASM) assembler
program. To type in and execute a program using DEBUG requires
calling up only DEBUG itself: a simple process. Using an assembler, on
the other hand, involves using a text editor, the assembler itself, a
program called LINK, and often another program called EXE2BIN.
Each of these programs requires a rather complex series of commands to
make it work. We figured you'd have enough on your mind being
introduced to a new computer language, without having to learn how to
operate all these other programs at the same time.
DEBUG's second advantage is that programs written with it require
less “overhead” than those written with the assembler. This overhead
comes in the form of program statements which must appear in the ASM
“source file,” but which are not necessary in DEBUG. (Don’t worry if you
don’t understand what we mean by “source file”; we'll explain everything
eventually.) By using DEBUG you avoid having to start your day with a
lot of incomprehensible program lines that would be necessary in the
assembler.
Third, using DEBUG puts you in closer contact with what is really
going on in your computer than using the assembler would. As we'll soon
see, DEBUG has features that make it possible to get down to the most
fundamental level of your computer's operation (short of opening up the
cover and probing about with meters and oscilloscopes). Sooner or later,
if you write programs in assembly language, you're going to have to
understand this fundamental level and learn to use DEBUG; so now
seems like a good time to start.
Of course, as we'll find later, the assembler has all sorts of powerful
features that make it indispensable for assembling long programs, but for
the moment, DEBUG will do just fine. The table below summarizes the
advantages and disadvantages of DEBUG and the assembler.
DEBUG versus Assembler
DEBUG                     Assembler
Easy to run               Hard to run
Low overhead programs     More program overhead
Close to the machine      Isolated from the machine
Not so versatile          Very versatile
Good on short programs    Good on long programs
The Window of the 8088's Soul
An old saying has it that “the eyes are the windows of the soul.” We
might say that DEBUG is the window of the 8088's soul. Besides being
useful for assembling programs, DEBUG is also used to examine and
modify memory locations; to load, store and start programs; and to
examine and modify registers (we'll learn what registers are later). In
other words, DEBUG is designed to put us in touch with various physical
features of the IBM PC.
Before we write our first 8088 assembly-language program in the
next chapter, we're going to get to know our way around DEBUG: rev it
up, so to speak, find out where the controls are, and taxi it out of the
hangar and around the runway. Then we'll be ready for takeoff in
chapter 2.
Getting DEBUG Rolling
All right, let’s leap into the cockpit, get a firm grip on the keyboard,
and get DEBUG rolling! We'll assume that you have a disk with DEBUG
on it inserted in drive A, and that the A> prompt is waiting for your next
move. (If you have a fixed disk you'll have to make sure DEBUG has
been copied to the fixed disk, and you'll also have to imagine a “C>”
whenever you see an “A>” in the text of this book.) As we noted in the
introduction, DEBUG is one of the programs provided on the “system
disk” that contains the PC-DOS.
Following the DOS prompt, enter the program name “DEBUG”.
(When we tell you to “enter” something in this book we mean to type the
“something” and then press the Enter (↵) key, the one just to the left of
the numeric keypad.)
A>debug    <-- Enter this
-          <-- DEBUG's prompt character
The single dash that appears on the screen is DEBUG's “prompt,” the
symbol it uses to tell you that it’s ready to listen to what you have to tell
it.
The “D” Command
You tell DEBUG what to do by typing in single-letter commands,
usually followed by one or more numbers. When we refer to these single-
letter commands in the text we usually use uppercase letters to make
them stand out (like “D”). However, when you type them in, you can use
lowercase. It works just as well as uppercase, and is easier to type. For
example, enter the letter “d”, followed by the digits “1”, “0”, and “0”.
-d100      <-- You enter this
08F1:0100  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0110  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0120  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0130  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0140  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0150  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0160  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0170  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
Wow — look at all those numbers! What does it all mean? Well, first
of all, you may not see all zeros on your display as we show here. What
the “D” command has done is to “dump,” or display, a portion of your
computer's memory on the screen. Each pair of numbers represents one
byte, or eight bits, of data stored in a particular memory location. If your
computer's memory happened to have other data in it before you loaded
DEBUG, it will appear here when you type “D”, so you may see all sorts
of junky-looking numbers, like this:
-d100
08F1:0100  03 EB 42 90 75 03 EB 41-90 2C 30 72 38 3C 0A 73   .kB.u.kA.,0r8<.s
08F1:0110  34 52 8B D3 9F 03 DB 03-DB 03 DA 03 DB D1 DE 9E   4R.S..[.[.Z.[Q^.
08F1:0120  D1 D6 8A D0 B6 00 9F 03-DA D1 DE 9E D1 D6 5A E8   QV.P6...ZQ^.QVZh
08F1:0130  21 00 72 0A 74 0A 2C 30-72 04 3C 0A 72 D3 41 4A   !.r.t.,0r.<.rSAJ
08F1:0140  8A C7 0A C0 C3 9F 41 4A-9E F9 C3 E8 05 00 75 01   .G.@C.AJ.yCh..u.
08F1:0150  C3 EB F8 8A C5 0A C1 75-01 C3 49 42 8B F2 AC 24   Ckx.E.Au.CIB.r,$
08F1:0160  7F 0A C0 F9 75 01 C3 3C-0C F9 75 01 C3 3C 0A F9   ..@yu.C<.yu.C<.y
08F1:0170  75 01 C3 3C 1A F9 75 01-C3 72 01 C3 F5 C3 A9 46   u.C<.yu.Cr.CuC)F
All the numbers in this display are in hexadecimal. In fact,
hexadecimal is the only numbering system that DEBUG knows about, so
if you aren’t already acquainted with this way of representing numbers,
now is the time to read appendix A in the back of this book.
(Welcome back, those of you who have been reading appendix A.)
Let’s adopt this convention: hexadecimal numbers — except those in
program listings or where the context makes clear what they are — will
be followed by a small letter “h” to distinguish them from decimal
numbers. Decimal numbers (again, unless the context makes it clear)
will be followed by a small “d”. Numbers from 0 to 9 are the same in
both systems, so they don’t really need to be followed by a distinguishing
letter, although they sometimes are for consistency. Of course, since
DEBUG only speaks hexadecimal, it doesn’t use an “h” in its printouts,
and you don’t need to put one after hexadecimal numbers you type in as
DEBUG commands.
It requires two hexadecimal digits to represent an 8-bit byte. A
two-digit hexadecimal number can range in value from
00h to FFh (which is from 0 to 255d). Thus all the two-digit numbers in
the printout above fall into this range. There are 16d of these numbers
on each line of the display. The dashes in the middle of the printout are
placed there for clarity, to separate the left-hand eight bytes on the line
from the right-hand eight bytes.
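If it helps to see this layout spelled out, here is a minimal sketch in Python. Python is not otherwise used in this book, and the function name format_dump_row is made up for this illustration. It only imitates the hex portion of one dump row as described above and is not DEBUG's actual code.

```python
def format_dump_row(segment, offset, data):
    """Format up to 16 bytes the way one row of a dump lays them out."""
    # Each byte becomes exactly two uppercase hex digits (00 to FF).
    pairs = [f"{byte:02X}" for byte in data]
    # A dash separates the left-hand eight bytes from the right-hand eight.
    left = " ".join(pairs[:8])
    right = " ".join(pairs[8:])
    body = f"{left}-{right}" if right else left
    return f"{segment:04X}:{offset:04X}  {body}"

row = format_dump_row(0x08F1, 0x0100, bytes(16))  # sixteen zero bytes
print(row)
print(int("FF", 16))  # the largest two-digit hex number, in decimal: 255
```

The segment and offset values are simply the ones shown in the sample dump; yours may differ.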
Addresses
The numbers in the column to the left (like 08F1:0120) are the
memory addresses of the bytes of data. Thus each byte shown in the dump
occupies a specific address, as shown in Figure 1-4.
The vertical column to the left in Figure 1-4 represents an actual
section of your computer’s memory. Notice how each memory location, or
byte, corresponds to a particular number in the DEBUG dump.
Each address consists of two numbers separated by a colon. What do
these two numbers mean?
Offset Address
The 0100 part of the number, to the right of the colon, is called the
offset address. For the next few chapters this will be the only part of the
address we'll be concerned with, so if you want to skip the next few
paragraphs it won't really do you any harm.
Segment Addresses
The 08F1 part of the number, to the left of the colon, is the segment
address. (Your system might have a number other than 08F1. That's fine
too.) The segment part of the address is such a complex and far-out
thing that we're going to postpone a thorough discussion of it until
chapter 8. However, we'll tell you here in very general terms what it
means, so you won't have to wonder about it for six more chapters.
To find a real address, take the segment address, shift it left
one hex place, and add the offset address.
Figure 1-4. Each byte in the dump is a byte in memory (the bytes on dump line 08F1:0100 occupy the memory locations at offsets 0100, 0101, 0102, and so on)
Briefly, the idea of the two-part address is this. The 8088 operates
mostly on numbers which are four hexadecimal digits long, such as
1234h. (We'll abbreviate the word “hexadecimal” to “hex” from now on.)
However, there are so many possible memory addresses in the 8088 that
it takes numbers with five hex digits to specify them, such as FFFFFh or
12345h. The engineers at Intel invented the following solution to this
dilemma. They used two four-digit hex numbers to represent each
memory address: the first number is the segment address and the second is
the offset address. These numbers are combined in an unusual way to
form the real or absolute address. The segment address (the number on the
left) is shifted left one digit, which is the same as multiplying it by 10h.
It is then added to the offset address (the number on the right).
For example, suppose an address shown in our DEBUG dump is
08F1:0120. What absolute address do these numbers represent? We take
the 08F1 and shift it left to get 08F10. Then we add the 0120. The
resulting five-digit sum is the hex number representing the absolute
address of this particular memory location, as shown below:
    08F1:0120              (segment address : offset address)
    Shift left -->  08F10
    Add        -->   0120
                    -----
                    09030  <-- Absolute address
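The same arithmetic can be checked with a few lines of Python. This is only a sketch for experimenting with the rule; the function name absolute_address is our own invention and is not part of DEBUG or DOS.

```python
def absolute_address(segment, offset):
    """Combine a segment address and an offset address into an absolute address."""
    # Shifting left one hex digit (four bits) is the same as multiplying by 10h.
    return (segment << 4) + offset

address = absolute_address(0x08F1, 0x0120)
print(f"{address:05X}")  # prints 09030
```

Trying other segment and offset values shows that the result always needs at most five hex digits when both inputs are four-digit hex numbers.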
From now until chapter 8 we're not going to be concerned with the
segment part of the address. How is this possible? The reason we can get
away with paying attention only to the offset part of the address is that
we're going to operate only in a certain part of memory: a part called a
segment. This part of memory is 64K bytes long, which is 65536d bytes,
or 10000h bytes. Its offsets run from 0000h to FFFFh, each a single four-digit hex number,
so all we need to specify an address in the segment is the four-digit offset
address. If this isn’t completely clear, trust us. It will all be explained in
chapter 8, on memory segmentation.
Offset Addresses and DEBUG
Notice how each offset address (we'll just call them “addresses” now,
at least until chapter 8) in the left-hand column of a DEBUG “dump”
ends with a zero. If you're familiar with hex numbers you should
understand why this is so. There are 16d, or 10h, bytes in each line, so
when you've counted from 0h to Fh, you're ready to increase the ten’s
column by 1, since 10h is the number that comes after Fh in hex. So we
display 16d (10h) bytes, and then move down one line, increment the
address by 10h, and display 10h more bytes.
The display would be easier to read and understand if it had the one’s
column values of the addresses printed across the top, like this:
            0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
08F1:0100  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0110  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0120  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0130  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0140  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0150  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0160  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
08F1:0170  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
But it doesn’t. Anyway it should be clear that the first byte on the top
row is at memory location 0100, the next is at 0101, the next at 0102,
and so on. Similarly, the first byte on the second row is at 0110, the
second at 0111, and so on.
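For readers who like to verify this kind of bookkeeping, the row-and-column rule can be sketched in Python. The helper name locate_in_dump is hypothetical, chosen only for this example.

```python
def locate_in_dump(offset):
    """Return (row address, column) for the byte at the given offset."""
    # Each row holds 10h bytes, so dividing by 10h separates the row
    # number from the position within the row (the one's column).
    row, column = divmod(offset, 0x10)
    return row * 0x10, column

row_address, column = locate_in_dump(0x0111)
print(f"{row_address:04X} {column:X}")  # prints 0110 1
```

In other words, the byte at 0111 is found on the row labeled 0110, one position to the right of the row's first byte.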
The “F” Command
Want to see this display change? An easy way to do that is to use
DEBUG's “F” or “fill” command. This command fills a part of memory
with a particular hex number. To use “fill” you enter “f” followed by
three numbers, each number separated by a space. The first of these
numbers is the address where you want to start filling, the second is the
address where you want to stop filling, and the third is the constant (from
00h to FFh) that you want to use to fill in between the first address and
the second. Notice that while the data to be filled in consist of two-digit
hex numbers (bytes), the addresses are four-digit hex numbers. Of
course, you don’t need to type leading zeros, so you can type fewer than
four digits for the addresses when appropriate, as it is here.
Enter this:
-f120 14f ff
  |   |   |
  |   |   constant to be filled in
  |   ending address
  starting address
Nothing will appear to happen. To see what’s changed, you have to
dump the same part of memory again:
26 Assembly Language Primer for the IBM PC & XT-d100
98F1: 6100
98F1: 9119
O8F1: 0126
98F1: 6130
98F1: 6149
O8F1: 9156
98F1: 0160
98F1: 6176
10 00 1 OO 10 OO OO 06-00 00 OO BO OO OO OO OD
OP 1) 1 OO WH OD 10 00-00 00 00 OO OO OO OO D0
FF FF FF FF FF FF FF EF-FF FF FF FF FF FF FF FF
FE FF FF FF FF FF FF FF-FF EF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF-FF FF FF FF FF FF FF FF
10 10 10 OO DO 00 00 00-00 06 00 HO 00 06 00 00
1 10 20 B0 10 9D 05 00-00 6 90 OO 00 00 00 00
10 00 10 OO 0 00 90 O0-00 10 00 OO O OO OO HO
Well, look at that! All the memory locations between 120 and 14F are
now filled with FF, just as you specified with the “F” command. (Of
course if you started off with other numbers instead of zeros, they'll still
be there instead of the zeros shown in this dump.)
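To make the effect of “fill” concrete, here is a small Python imitation of it working on a pretend block of memory. The function name fill and the 200h-byte array are assumptions made for this sketch. The real command operates on your computer's actual memory.

```python
def fill(memory, start, end, value):
    """Imitate the F command: set every byte from start through end to value."""
    for address in range(start, end + 1):  # the ending address is included
        memory[address] = value

memory = bytearray(0x200)         # a pretend stretch of memory, all zeros
fill(memory, 0x120, 0x14F, 0xFF)  # same numbers as the command above

# The bytes just outside the range are untouched; those inside are FFh (255d).
print(memory[0x11F], memory[0x120], memory[0x14F], memory[0x150])  # prints 0 255 255 0
```

Note that both the starting and the ending address are filled, which is why three full rows of the dump (30h bytes) change.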
ASCII Codes
You may have been wondering about all the little dots and odd
characters on the right-hand side of the dump display. These are the
characters (like “A”, “B”, and so on) that the numbers to the left
represent. The number which represents a particular character is called
its “ASCII code.” (ASCII stands for “American Standard Code for
Information Interchange.”) As you probably know, the ASCII code is the
normal way to represent characters in a computer's memory. (There is a
very nice table of these codes in the IBM Personal Computer Technical
Reference manual.)
Since neither 00 nor FF represents a printable ASCII character, the
positions in the ASCII display corresponding to these numbers are filled
with dots, which indicate “no printable character.” (If your computer's
memory was filled with junk rather than all zeros to begin with, some of
the numbers may have been printable characters.) To see the character
display change, along with the numbers, let’s try filling in parts of
memory with numbers that we know represent printable ASCII
characters.
Enter the following DEBUG commands:
£100 117 61
£178 17f 24
-d1go
O8F1:6190 61 61 61 61 61 61 61 61-61 61 61 61 61 61 61 61
OSF1: 9110 61 61 61 61 61 61 61 61-00 00 90 00 00 OO OO OO
O8F1:0120 FF FF FF FF EF FF FF FF-FF FF FF FF FF FF FF FF
O8F1:0130 FF FF FF FF FF FF FF FF-FF FF FF FF FF FF FF FF
O8F1:0149 FF FF FF FF FF FF FF FF-FF FF FF FF FF FF FF FF
aaaaaaaaaaaaaaaa
aaaaaaaa
Assembly Lenguage and Debug 27O8F1: 0150 00 0D 00 0D 00 OD 10 11-10 0 00 0 OD 0H 90 0d
G8F1: 0160 90 10 00 00 90 OO 00 20-00 0 00 OO OO 00 OO Od
O8F1:0170 00 00 00 0 00 00 0M 09-24 24 24 24 24 24 24 246
$8599595
61 hex is the ASCII code for a lowercase “a,” and 24 hex is the code for
the dollar sign ($). We can see them both as numbers, and, to the right,
as characters.
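The relationship between the numbers and the characters can be tried out in Python as well. This is a simplified sketch: the name ascii_column is our own, and we are assuming that only the codes from 20h through 7Eh count as printable. DEBUG's exact rule for the character display may differ in its details.

```python
def ascii_column(data):
    """Show each byte as its ASCII character, or a dot if it is not printable."""
    # Assumption for this sketch: 20h through 7Eh are the printable codes.
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)

print(ascii_column(bytes([0x61] * 8 + [0x00] * 8)))  # prints aaaaaaaa........
print(ascii_column(bytes([0x00] * 8 + [0x24] * 8)))  # prints ........$$$$$$$$
print(ascii_column(bytes([0xFF] * 4)))               # prints ....
```

The first two lines reproduce the character display for the rows at 0110 and 0170 after the two fill commands above.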
Summary
In this chapter we've talked about how assembly language differs
from higher-level languages, and also explained something about the
operation of the DEBUG utility program. You may find it useful at this
point to experiment a bit with DEBUG. Try filling in different constants
to see how they look when you “dump” them. Examine different parts of
memory. You'll be using DEBUG a great deal in the chapters to follow,
and you should feel comfortable about using it.
2
Instant Program
Concepts
Writing a simple program in machine or assembly language
Assembly language instructions
Debug Commands
E = Enter
A = Assemble
U = Unassemble
G=Go
8088 Instructions
MOV = Move
INT = Interrupt
IMP = Jump
DOS Functions
Display Output
Program Terminate
We're going to begin this chapter by describing the writing of a
complete, though very short, assembly-language program. Then we'll go
back and talk in more detail about the steps used in the process and what
they mean. We'll finish up with some variations on our program.
Writing Your First Program
You're going to write your first program in assembly language, but
you don’t know assembly language yet. Obviously, there will be many
aspects of the process that won't seem completely clear to you. Don't
worry! Our approach is to show you first what something looks like, and
afterwards explain why it looks that way and how it works.
By moving in this direction, from the concrete to the abstract (rather
than the other way around), we hope to avoid the sort of academic
theory-oriented descriptions that leave most readers confused, bored, and
frustrated. Instead, you'll first get the feeling of the process (the roar of
the motor and the rush of the wind in your hair, to return to our flying
analogy). Later we'll explain what happened.
The Two Versions of DOS
There's a small problem we had better deal with right away. This has to
do with which version of DOS you're using. As we noted in the
introduction, the DEBUG in DOS version 2 (that is, versions 2.00 and
later) contains a built-in mini-assembler which will help in the creation of
assembly-language programs. The DEBUG in DOS version 1 (versions
1.00 and 1.10) does not have this capability, so for those readers using
this version we need to take a slightly different tack.
We'll handle this situation in the following way. We'll first explain how
to type in a program if you're using DOS version 1. Even if you have
version 2, you should read this part, try it out, and understand it. There
are two reasons why this is a good idea. The first is that you will be
introduced to a new DEBUG command: the “E” (for “Enter”) command.
The second is that after you've typed in the program using “E”, you'll be
better able to appreciate how lucky you are to have DOS version 2, with
its advanced version of DEBUG and its mini-assembler capability.
Writing the Program with the “E” Command
In this section we'll create an assembly language program using
DEBUG's “E” command. (The term “assembly language” is actually not
quite right in this particular instance, as we'll see later in the chapter, but
that needn't concern us now.) If you have DOS version 1 this is the only
way to use DEBUG to create a program. If you have version 2, you
should, as we suggested above, follow along anyway, typing in the
commands we show.
The purpose of the “E” command is to enter a byte (or bytes) of data
into memory. It’s a little like the “F” command described in the last
chapter, except that you can enter a series of different bytes; they don't all
have to have the same value, as “F” required.
The series of bytes we're going to enter with “E” will constitute our
program. To insert this program into memory, you enter the “E”
command, followed by the address where you want the program to go. In
our case, we're going to put it at location 100h, so we enter “e” followed
by “100”. The program will respond by printing out the address, followed
by its current contents:
-e100            <-- You enter this
04B5:0100  61.
[Diagram: “Assembly” and “Disassembly” arrows pointing in opposite directions]
Take heart, however. The mnemonic instructions in the column on
the right in the listing, as shown above, are not comprehensible to the
microprocessor, clever though it may be. They form what is properly
called “assembly language,” and while you may not understand these
instructions now, you soon will even though you are merely a human
being.
It is the job of an assembler to translate assembly language, which is
comprehensible to humans, into machine language, which is
comprehensible to microprocessors.
Assembler programs translate assembly language into
machine language.
Assembly-Language Instructions
You've typed in the program, and run it, and disassembled it again,
but of course you still don’t really understand how it does what it does.
To understand the program we must understand the individual
instructions in it, and what they do. In this section we'll look at the
instructions one by one. But first we need to understand another
fundamental concept: registers. So let's digress for a moment, and return
to our program later.
Registers
A register is a place in the microprocessor where our program can
put a byte, or sometimes two bytes, of data. A register is something like a
memory location, except that it has various special properties which a
memory location doesn't. One of these properties is that the
microprocessor can do a simple kind of arithmetic on the contents of
registers, whereas it can only put bytes into, and take them out of,
memory locations. However, we won't be concerned with this arithmetic
capability in this chapter. For the moment, think of registers as places,
like memory locations, where we can put eight-bit bytes of data.
The registers in the 8088 are given two-letter names. There are a
dozen or so of these registers, but we're going to put off looking at all of
them at once until the next chapter. Our particular program concerns
itself with only two of the registers: the DL register and the AH register.
There are four instructions in the program, one on each of the four
lines in the listing above. The first two deal with registers.
The MOV Instruction
The first instruction, “MOV DL,01”, occupies memory locations 100
and 101, and consists of the bytes B2 and 01.
08F1:0100 B201          MOV     DL,01
(Take this number, 01, and MOVe it into the DL register.)
This instruction tells the 8088, “take the number O1h, and move it
(MOV for “move”) into the DL register.” This way of writing the
instruction may seem somewhat backwards to you, moving things from
right to left. It's a convention that probably had its origin in the kind of
statements used in higher level languages, like BASIC's
LET A=2
where the quantity on the right gets “assigned” or put into the variable
on the left. At any rate, after this instruction is executed, there will be a
byte with the value 1 in the DL register. Where does the 8088 get the 01?
It’s actually part of the instruction: the second byte. The 8088
microprocessor looks at the first part of the instruction, the B2, in
memory, figures out that this means “move the following 8-bit constant
into the DL register,” and then gets the 8-bit constant from the very next
memory location (0101h) and places it in the DL register. The operation
of the MOV DL,01 instruction is shown in Figure 2-1.
[Figure 2-1. Operation of the MOV DL,01 instruction. The hex number B2 at location 100 tells the 8088, “Take the constant from the following memory location and put it in the DL register.” A few microseconds later the constant 01 from location 101 is in the DL register.]
When we introduce each assembly-language instruction in this book
we're generally going to start with a box which summarizes the ways the
instruction can be used; and the MOV instruction, your first assembly
language instruction, is no exception. However, you should understand
that at this point you don’t need to understand everything that’s in the
box. For our program we're only interested in one use of the MOV
instruction: the “immediate to register” byte MOV. As you've seen, this
means taking a constant two-digit hex value which is part of the
instruction (it “immediately” follows the instruction in memory, hence the
name), and putting it in a register. The other uses of this instruction,
involving MOVes to and from memory and between registers, will be
covered later. Likewise, at a later time we'll also explain the meaning of
the word “flags” used in the bottom line.
Thus for the moment you can ignore most of the following box:
MOV Instruction
Moves byte or word from/to register/memory/immediate.
Move immediate value to Register
    MOV DL,01          ;byte
    MOV AX,1234        ;word
Move immediate value to Memory
    MOV MBYTE,12       ;byte
    MOV MWORD,1234     ;word
Move Register to Register
    MOV DL,BL          ;byte
    MOV AX,BX          ;word
Move Register to Memory
    MOV MBYTE,BL       ;byte
    MOV MWORD,DX       ;word
Move Memory to Register
    MOV CH,MBYTE2      ;byte
    MOV AX,MWORD4      ;word
Flags affected: none
The second instruction in our program is also a MOV instruction:
08F1:0102 B402          MOV     AH,02
This means, as you no doubt have figured out, “take the number 02,
and MOVe it into the AH register.” Because we're moving the constant
into a different register, AH instead of DL, the hex code for the
instruction is different: B4 instead of B2. And since it’s the constant 02h
that’s being moved into a register, the second byte of the instruction is 02.
Otherwise the operation of this instruction is just the same as the first
one.
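If you'd like to see the pattern behind B2 and B4, here is a minimal sketch in Python (a language this book does not otherwise use). It assumes the standard 8088 encoding, in which the opcode for “move an 8-bit constant into a register” is B0 hex plus a register number. The function name encode_mov_imm8 and the table name REG8 are made up for this illustration.

```python
# Register numbers the 8088 uses to pick an 8-bit register in this instruction.
REG8 = {"AL": 0, "CL": 1, "DL": 2, "BL": 3, "AH": 4, "CH": 5, "DH": 6, "BH": 7}


def encode_mov_imm8(register: str, value: int) -> bytes:
    """Return the two machine-language bytes for MOV <8-bit register>,<constant>."""
    if not 0 <= value <= 0xFF:
        raise ValueError("an 8-bit constant must lie between 00 and FF hex")
    opcode = 0xB0 + REG8[register.upper()]  # B0 through B7 select the register
    return bytes([opcode, value])           # the constant is the second byte


print(encode_mov_imm8("DL", 0x01).hex().upper())  # B201
print(encode_mov_imm8("AH", 0x02).hex().upper())  # B402
```

The two printed values match the hex codes DEBUG showed for the first two lines of our program.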
But why are we putting these constants in these registers? What does
that have to do with printing a happy face on the screen? Fear not,
things will become clearer as we describe the next two instructions.
The INT Instruction
INT is a sort of “jump to subroutine” instruction. It stands for
“INTerrupt,” and there are various reasons why it isn't a real “jump to
subroutine,” but for the time being we can think of it that way. It's a little
like a GOSUB in BASIC or a CALL in various other languages. It
transfers control from our program to another routine somewhere else in
memory. Then, when that routine is done, control is returned to the line
following the INT in our program.
So when the instruction
08F1:0104 CD21          INT     21
is executed, the program jumps to a special part of DOS, a routine whose
number is 21h, and when this routine is finished, control returns to the
next line of the program, the INT 20 at address 106. This is shown in
Figure 2-2.
(Remember, this is a simplified view of INT. Actually the INT
instruction involves a transfer to a special address called an “interrupt
vector,” which in turn transfers control to the routine. However, the effect
is much the same as we've shown.)
As with the MOV instruction, you can ignore, at least for the
moment, a lot of the material in the following box.
INT Instruction
Calls a routine pointed to by an interrupt vector.
Control is transferred with an indirect call through any of
the 256 interrupt vectors located from absolute address
00000h to 003FFh. The address of the routine, in
segment:offset form, must be in the vector.
Control is returned to the calling program from the routine
with the IRET instruction.
Flags affected: IF, TF
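For later reference, here is a small Python sketch of where a given interrupt's vector sits in memory. It assumes the standard 8088 layout: 256 vectors of four bytes each (a two-byte offset followed by a two-byte segment), starting at absolute address 0. The names vector_address, VECTOR_SIZE, and VECTOR_COUNT are invented for this example.

```python
VECTOR_SIZE = 4     # each vector holds a 2-byte offset and a 2-byte segment
VECTOR_COUNT = 256  # interrupt numbers run from 00 to FF hex


def vector_address(interrupt_number: int) -> int:
    """Return the absolute address of the vector for the given interrupt."""
    if not 0 <= interrupt_number < VECTOR_COUNT:
        raise ValueError("interrupt numbers run from 00 to FF hex")
    return interrupt_number * VECTOR_SIZE


print(format(vector_address(0x20), "05X"))              # 00080
print(format(vector_address(0x21), "05X"))              # 00084
print(format(VECTOR_COUNT * VECTOR_SIZE - 1, "05X"))    # 003FF, last byte of the table
```

So when our program executes INT 21, the 8088 looks in the four bytes starting at 00084h to find out where the routine actually lives.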
The Display Output Function
What does this special DOS routine number 21h do? That depends,
as we'll see, on the number in the AH register at the time we execute the
INT 21. Routine 21h is a sort of switchyard, which will route us to a
number of different DOS functions, depending on the number in AH. In
our case we want to display a character on the screen, so we put the
number 2 in the AH register. DOS routine number 21h will then transfer
control to the “Display Output” function, whose purpose is to write a
single character to the screen.
This Display Output routine is one of the famous “DOS Functions”
we've mentioned before. We'll be talking about these functions at length
later. For the time being, the important things to know about them are
that they are assembly-language routines built into the PC-DOS, so that
they are always available in memory when you need them, and that they
are all reached by executing an INT 21h instruction, with different
values in the AH register.
Your program must do three things to cause the “Display Output”
function to actually display a character. First it must put the numeric
value of the character to be displayed into the DL register. The numeric
value for the happy face is 1. (It's like an ASCII value, except that for
special characters like the happy face it's not really ASCII; it's a code
IBM invented.) So the first instruction in our program puts 01h into the
DL register.
The second thing the Display Output function needs is to have the
number 2 put into the AH register, as we explained above. This is the
number that tells the operating system that we want the Display Output
function, and not some other function (like Keyboard Input or Print
String, which we'll talk about in the next chapter).
The third thing our program has to do is execute the INT 21
instruction itself, so that control will be transferred to the Operating
System, which will then look in the AH register to figure out what we
want to do, namely, display a character.
[Figure 2-2. Operation of the INT 21 instruction. Routine #21 is located somewhere in memory. The CD21 instruction tells the 8088 to get its next instruction from routine #21. When routine #21 is finished, it tells the 8088 to get its next instruction from the location following the CD21.]
As we do with assembly-language instructions, we're going to
summarize each DOS function in a box. In the case of DOS functions,
however, most of the contents of the box should be familiar to you (unlike
instruction boxes, which, at this point, leave many unexplained details).
DISPLAY OUTPUT Function — Number 02h
Enter with:
Reg AH = 2
Reg DL = Numeric value of character
Execute:
INT 21
Return with: character displayed on screen
Comment: Ctrl-Break causes exit from function
The instructions that make up the Display Output routine are not in
the same place in memory as our program. In fact, we actually don’t
know where they are, and we don’t need to know. The hardware in the
8088 will take care of transferring control from the CD21 instruction in
our program, to the beginning of the Operating System, and then
getting back to our program when the Operating System has told the
Display Output Routine to print the character from the DL register.
Whew — what a lot of complexity in one little instruction. Perhaps
Figure 2-3 will help make it clearer.
The Program Terminate Interrupt
The last instruction in the program is another INT instruction, this
time to DOS routine number 20h.
08F1:0106 CD20          INT     20
This routine is much simpler than DOS routine number 21h, in that
there's only one thing it can do. Therefore, you don’t have to put
anything in any of the registers before you call it. The routine is called
the “Program Terminate Interrupt.” Its job is to ensure that, when a
“user program” (such as the one we've just written) has finished
executing, it correctly transfers control back to DOS or DEBUG,
whichever is being used to run the program (in this case DEBUG).
PROGRAM TERMINATE Interrupt
Execute:
INT 20
Return with: control returned to supervisor program —
DOS or DEBUG
Thus the INT 20 instruction is very similar to a STOP or END
instruction in higher-level languages. When the INT 20 instruction has
done its work, control goes back to DEBUG and you get the “Program
terminated normally” message, and the DEBUG prompt.
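To pull the four instructions together, here is a toy model of the whole program written in Python. It is only a sketch under loud assumptions: it understands just the handful of two-byte instructions used in this chapter, it pretends the “screen” is a list of character codes, and the names run, REG8_NAMES, and LOAD_ADDRESS are invented for the illustration. Real DOS does far more than this.

```python
REG8_NAMES = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
LOAD_ADDRESS = 0x100  # where DEBUG put our program


def run(program: bytes, max_steps: int = 50) -> list:
    """Step through a tiny program and return the character codes it displayed."""
    regs = dict.fromkeys(REG8_NAMES, 0)
    displayed = []
    ip = LOAD_ADDRESS
    for _ in range(max_steps):
        opcode = program[ip - LOAD_ADDRESS]
        operand = program[ip - LOAD_ADDRESS + 1]
        ip += 2  # every instruction in this chapter is two bytes long
        if 0xB0 <= opcode <= 0xB7:           # MOV 8-bit register, constant
            regs[REG8_NAMES[opcode - 0xB0]] = operand
        elif opcode == 0xCD and operand == 0x21:  # INT 21: the DOS switchyard
            if regs["AH"] != 0x02:
                raise NotImplementedError("only Display Output is modeled")
            displayed.append(regs["DL"])     # "display" the character in DL
        elif opcode == 0xCD and operand == 0x20:  # INT 20: Program Terminate
            return displayed
        elif opcode == 0xEB:                 # short relative jump
            ip += operand - 0x100 if operand >= 0x80 else operand
        else:
            raise NotImplementedError(f"opcode {opcode:02X} is not modeled")
    return displayed  # step limit reached (for example, an endless loop)


print(run(bytes.fromhex("B201B402CD21CD20")))  # [1]
```

The single 1 in the output is the happy face code that was moved into DL before the INT 21.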
[Figure 2-3. The operating system and functions. The value in the AH register determines which function will be executed (the Display Output function or some other function). The value in the DL register determines which character will be displayed. Control returns to our program when the function is finished.]
Variations on a Theme
Now that we have our program up and running, let's try changing it
a little here and there to see what happens. This
should give you more
understanding of how the program works.
Printing Different Characters
What modifications to this program do you think we would need to
make it display some other character, say the letter “X”, instead of the
happy face? That's right: all we need to do is change the first line, so that
instead of MOVing a 1 (the happy face code) into the DL register, we
move a 58h, which is the ASCII code for “X”.
Assuming that your program is still in memory (if it's not, you can
type it in again with “A” or “E”), use “U” to look at it again:
-u100,107
08F1:0100 B201          MOV     DL,01
08F1:0102 B402          MOV     AH,02
08F1:0104 CD21          INT     21
08F1:0106 CD20          INT     20
All we need to do is change one byte in this program to make it print
an “X”. That's the “01” at location 101. We could do this using “A”, and
simply assemble a new instruction
right over the top of the old one at location 100. But since we only need
to change one byte, let's use the “E” command instead.
We want to change the byte at location 101 from 1 to 58h, so we
enter “E” followed by that address:
-e101              <-- Enter “e” and the address
08F1:0101  01.58   <-- Type “58” and press Enter
(The 01 in front of the period is the old value.)
Since we only want to put one byte into memory, we press Enter
following the byte.
So we've changed the 01 to a 58h. If we list the program again with
“U” we'll see the change incorporated in it:
-u100,107
08F1:0100 B258          MOV     DL,58
08F1:0102 B402          MOV     AH,02
08F1:0104 CD21          INT     21
08F1:0106 CD20          INT     20
Now if we run it again with “G” we should see an “X” displayed on
the screen:
-g
X
Program terminated normally
Not bad! It worked again. By looking up the ASCII values of various
characters, you can change the program to print whichever one you want.
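If you don't have an ASCII table handy, a few lines of Python can stand in for one. This is just a convenience sketch; the function name ascii_hex is made up, and it only covers the standard ASCII characters, not IBM's special codes such as the happy face.

```python
def ascii_hex(character: str) -> str:
    """Return the two-digit hex ASCII code for a single character."""
    code = ord(character)
    if len(character) != 1 or code > 0x7F:
        raise ValueError("expected one standard ASCII character")
    return format(code, "02X")


for character in "Xa$":
    print(character, ascii_hex(character))
# X 58
# a 61
# $ 24
```

The value printed for “X” is the 58 we entered at location 101, and 61 and 24 are the codes we filled memory with in the last chapter.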
The Endless Loop
Before we go on to a more thorough discussion of assembly language,
let’s do one more variation on this program. Suppose instead of printing
one character, we wanted to print a whole series of the same character.
How would we modify the program to do that? It’s not hard: we simply
put a “jump” instruction at the end of the program, which takes us back
to the beginning so that our “Display Output” function will be repeated
over and over.
Let’s also go back to the happy face — it’s more upbeat than the “X”.
Put the number 01 back into location 101
-e101
08F1:0101  58.1
to restore the happy face.
Now we'll install a new instruction in our program. This instruction
goes at the end of the program, overlaying the INT 20 instruction at
location 106. It’s a “jump” to the beginning of the program, at location
100, so the program becomes an endless loop.
If you're using DOS version 2, enter “a106”, and when the address is
printed, enter “jmp 100”. Then on the next line hit Enter again to go back to
DEBUG.
-a106
08F1:0106 jmp 100       <-- Enter jmp 100
08F1:0108               <-- Press Enter
If you're running DOS version 1, you'll have to type in the hex code
for this instruction using “E”. The hex code is EBF8h.
-e106
04B5:0106  CD.EB   20.F8
(Old values: CD and 20. New values: EB and F8.)
Now “unassemble” the program to make sure it looks right:
-u100,106
08F1:0100 B201          MOV     DL,01
08F1:0102 B402          MOV     AH,02
08F1:0104 CD21          INT     21
08F1:0106 EBF8          JMP     0100
The JMP Instruction
The box containing the summary of the JMP instruction is largely for
later reference. You can ignore most of it at this point.
JMP Instruction
Jumps to new memory location.
Within-segment short jump: to address within -128 to +127
bytes
    JMP NEAR_LABEL
Within-segment long jump: to address in same segment
    JMP NEAR_LABEL
Intersegment jump: to address in a different segment
    JMP FAR_LABEL
The last two types can also be “indirect jumps,” that is,
jumps to the memory address contained in a memory
address, a register, or a memory address modified by a
register.
    JMP WORD_VAR
    JMP AX
    JMP ADDRPTR [BX]
Let's look at the JMP instruction a little more closely, to see how “JMP
100” gets assembled into “EBF8.” You don’t need to remember the details
of this process, but it will give you some idea of what DEBUG's “A”
command (or the assembler) has to figure out to arrive at the correct
machine language equivalent of a particular assembly-language
instruction.
The first two hex digits that make up the instruction are EB, which is
the code for a “short” jump. (We'll talk about the difference between long
and short jumps later.) What does the F8 mean? You might expect to see
the address 100 that we're jumping to, but you don’t. This is because the
jump is a relative jump. Instead of using the address of the place we're
going to jump to, JMP uses the distance to the place we're going to jump.
Even after you know this, the F8 still doesn’t make much sense. There
are two reasons for this. The first is that since the jump is backwards, the
number of bytes that need to be jumped is negative. If we were going to
jump forward eight bytes, the instruction would simply be EB08. But
since we're going to jump backward eight bytes, we form a negative
number by subtracting 8 from 00. If you had FF and added 1 to it, you'd
get 00. So if you have 00 and you subtract 1 from it, you get FF. Subtract
1 again and you get FE. Keep counting down by 1, eight times in all, and
you pass through FD, FC, FB, FA, and F9 and arrive at F8.
But what have we jumped 8 bytes from? This brings us to the second
reason the F8 is confusing: it doesn’t tell you to jump 8 bytes from the
jump instruction itself, it tells you to jump 8 bytes from the byte following
the jump instruction. That would be from location 108 to location 100,
which is in fact 8 bytes. Expressed arithmetically, this looks like
100h - 108h = -8, which is F8h as an 8-bit number
Whew. What a lot of complexity in just one little instruction.
Fortunately the “A” command (and, as we'll see later, the assembler
programs ASM or MASM) do all these tedious calculations for us, so that
we don’t even have to think about the hex representations of instructions
unless something goes wrong.
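For the curious, the calculation the “A” command performs for a short jump can be sketched in a few lines of Python. This assumes only what was described above: the first byte is EB, and the second byte is the distance measured from the byte following the two-byte jump instruction, kept to 8 bits. The function name short_jump_bytes is invented for this example.

```python
def short_jump_bytes(jump_address: int, target: int) -> bytes:
    """Return the two bytes of a short JMP located at jump_address."""
    next_instruction = jump_address + 2      # the jump itself is 2 bytes long
    distance = target - next_instruction     # negative when jumping backward
    if not -128 <= distance <= 127:
        raise ValueError("target is too far away for a short jump")
    return bytes([0xEB, distance & 0xFF])    # keep only the low 8 bits


print(short_jump_bytes(0x106, 0x100).hex().upper())  # EBF8
print(short_jump_bytes(0x100, 0x10A).hex().upper())  # EB08
```

The first line reproduces the EBF8 in our program; the second shows the forward jump of eight bytes mentioned above.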
DOS version 1 users are at a disadvantage here, since they have no
easy way to start with the assembly language mnemonics and end up with
the machine language numbers, at least not using DEBUG. Thus if
version 1 users were trying to figure out the hex equivalent of the JMP
100 instruction so they could type it in, they'd either need to go
through the calculation just described, or else try to figure it out by trial
and error, guessing a number and then using “U” to see if it was right.
Actually version 1 users don't need to do either of these things for the
programs in this book, since we supply all the hex equivalents, so that “E”
can be used immediately. You can find these hex codes by looking at the
“U” listing included with each program.
In and Out of the Endless Loop
Have you waited all this time to try the new program? Good, that
shows admirable restraint. Try it now. Enter the “G” command:
-g
Wow, look how fast the screen fills up with happy faces! All right,
now stop the program. How do you do that? Just hit Ctrl-Break (the
Ctrl and Break keys together) as you would for a BASIC program,
and presto, we're back in DEBUG. This is possible because the Display
Output function is programmed to look for Ctrl-Break at the same
time it's printing characters on the screen. We can thus tell it, “Stop!
Don't print that character. I want to get back to DEBUG!”
When you escape from your program back to DEBUG you'll get a
display like this:
AX=0201  BX=0000  CX=0000  DX=0001  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=08F1  ES=08F1  SS=08F1  CS=08F1  IP=0106   NV UP DI PL NZ NA PO NC
08F1:0106 EBF8          JMP     0100
Don’t worry about what all this means at this point — we'll look into it
later. Do note, however, that the last line of the display contains one of
the instructions from your program, in this case the JMP 0100. This
shows what instruction was being executed when you pressed the
Ctrl-Break keys.
Perhaps this is a good time to end the chapter, with the screen
(mostly) full of happy faces.
Summary
In this chapter you've learned a good bit about DEBUG, written your
first assembly-language program and some variations on it, and explored
some of the ways to see what your computer is doing on a very
fundamental level.
We hope that this chapter has given you an idea (although at this
point it will be a somewhat impressionistic idea) of what assembly
language is all about, and whetted your appetite for a more detailed
understanding.
3
What Is Assembly
Language?
Concepts
Machine language versus assembly language
Registers
Saving programs to disk from DEBUG, and loading them back
Input/Output ports
Logic instructions
Toggling a bit to beep the speaker
Debug Commands
R = Registers
N = Name
W = Write
Q = Quit
L = Load
8088 Instructions
INC = Increment
XOR = Logical “Exclusive OR”
Applications
SMASCII program: displays entire character set
SOUND program: beeps the speaker
In the last chapter we led you straight into the heart of the 8088
microprocessor. We showed you how to examine memory, write
programs, assemble them, disassemble them, and execute them. We did
this to show you that programming in assembly language doesn't have to
be all that difficult. However, there were some details that we left out
along the way.
In the first part of this chapter we're going to fill in some of these
details. Then we'll go on to write some more programs, to consolidate
what we've learned.
Filling in Details
An assembly-language programmer is primarily concerned with three
things: instructions, memory, and registers. In the process of writing our
first assembly-language programs in the last chapter we talked a little
about each of these topics. In the following sections we'll take a somewhat
more leisurely look at each of the three, and try to deepen your
understanding of what assembly language is all about.
Machine Language, Assembly Language, and Physical
Reality
Although we've talked so far about assembly-language instructions,
and the data they operate on, in terms of hexadecimal numbers, the fact
is that if we look at things in a more fundamental way we should be
talking about binary numbers, not hexadecimal numbers. Let’s look at an
example of what we mean.
Remember the happy face program you wrote in the last chapter? It
looked like this when you disassembled it with “U”:
-u100,107
08F1:0100 B201          MOV     DL,01
08F1:0102 B402          MOV     AH,02
08F1:0104 CD21          INT     21
08F1:0106 CD20          INT     20
As you learned, the symbolic statements on the right of this listing
constitute a form of assembly language. The mini-assembler in DEBUG
translates these statements into the hex numbers in the columns on the
left. These hex numbers are called machine language.
Although we glossed over this point in the last chapter, it is not
actually the hex digits themselves which are read and understood by the
8088 microprocessor, but the numbers or bit-patterns they
represent. Hex digits are merely a way to make binary digits easier for us
humans to read.
For instance, the instruction MOV DL,01 in the program above is
translated into the hex number B201. The B2 part of this instruction
goes in location 100. However, B2 is not stored in the computer's memory
as a hex number, but as a pattern of bits:
   B         2
1 0 1 1   0 0 1 0
In fact, the entire program is stored in memory as a pattern of bits:
Address   Physical reality   Machine language   Assembly language
          (bit patterns)     (hex digits)       (symbolic instructions)
0100      1 0 1 1 0 0 1 0    B2                 MOV DL,01
0101      0 0 0 0 0 0 0 1    01
0102      1 0 1 1 0 1 0 0    B4                 MOV AH,02
0103      0 0 0 0 0 0 1 0    02
0104      1 1 0 0 1 1 0 1    CD                 INT 21
0105      0 0 1 0 0 0 0 1    21
0106      1 1 0 0 1 1 0 1    CD                 INT 20
0107      0 0 1 0 0 0 0 0    20
Notice how each of the instructions in our program occupies a
specific place in the computer’s memory. In this program all the
instructions are two bytes long, but other instructions can vary in length
from one byte to five bytes, and sometimes even more.
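If you want to check the bit patterns for yourself, the following Python sketch prints the address, bit pattern, and hex value of each byte of the happy face program. The function name bit_rows is invented for the example; the eight program bytes are the ones from the “U” listing above.

```python
PROGRAM = bytes.fromhex("B201B402CD21CD20")  # the happy face program


def bit_rows(program: bytes, start: int = 0x100) -> list:
    """Return one 'address  bits  hex' line for each byte of the program."""
    return [
        f"{start + index:04X}  {byte:08b}  {byte:02X}"
        for index, byte in enumerate(program)
    ]


for row in bit_rows(PROGRAM):
    print(row)
# The first line printed is: 0100  10110010  B2
```

Each hex digit always corresponds to the same group of four bits, which is why hex is such a convenient shorthand for binary.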
The earliest microcomputers (such as the venerable IMSAI 8080) had
lights on the front panel which could be set to show the bit patterns
inside the computer. It was possible to “step through” a program and
look at the binary representations of all the program instructions, much
as we've shown you in the diagram above. This was occasionally helpful
in debugging some complex internal process (especially for hardware-
oriented users).
If we had such an old-fashioned computer with front panel switches,
we could actually look at, say, memory location 0100 and see the lights lit
where the bits were turned on, as shown in Figure 3-1.
Since the PC is so modern, it doesn’t have these lights, but DEBUG
shows us the contents of memory just as well (in fact better, because
DEBUG is far more convenient to use than a bunch of front panel
switches, and also because it's far easier to read four hex digits than a
line of sixteen lights). However, using DEBUG does deprive us of a certain
insight into what happens deep inside the computer: it's easy to forget
that the hex digits which DEBUG shows us are not really what's in the
computer's memory and registers: these hex digits merely stand for a
binary pattern of bits. (Later in this chapter we'll see how important actual
bit patterns can be: we'll write a program which must manipulate bits in
order to make sounds on the speaker.)
Bit Numbering
What is memory? From an assembly-language programmer's
viewpoint, memory is a place in your computer where you can store
something called “bytes.” A byte is simply eight bits arranged in a row. A
bit is the smallest possible unit of information: either yes or no, on or off,
1 or 0. Thus each memory location consists of eight places where bits can
be stored, like this:
bit 7   bit 6   bit 5   bit 4   bit 3   bit 2   bit 1   bit 0
                          ONE BYTE
[Figure 3-1. Old-fashioned microcomputer with front-panel lights showing an address and its contents]
The way the bits are numbered (whether from right to left or left to
right) is largely arbitrary. The byte diagram above shows how IBM likes to do
it, as do most computer manufacturers, but some manufacturers start
with 0 (or sometimes 1) on the left instead of the right. Each bit can have
a value of either 0 or 1, so if we placed the byte represented by, say, the
hex number C3 (which is 11000011 in binary, since C = 1100 and
3 = 0011) into a memory location, the location would look like this:
bit 7   bit 6   bit 5   bit 4   bit 3   bit 2   bit 1   bit 0
  1       1       0       0       0       0       1       1
In a computer's memory there are many thousands of locations like
this, into which 8-bit numbers can be placed. Here’s how a small section
of memory looks, filled in with binary numbers:
00000000    0100
00000000    0101
11000011    0102
00000000    0103
00000000    0104
There's our friend, C3, in location 102. The other locations happen to
contain zeros.
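The numbering scheme described above, with bit 0 on the right and bit 7 on the left, can be tried out with a short Python sketch. The function names bit and bits_high_to_low are made up for this illustration.

```python
def bit(value: int, number: int) -> int:
    """Return bit `number` of a byte, counting from 0 at the right-hand end."""
    return (value >> number) & 1


def bits_high_to_low(value: int) -> list:
    """List the eight bits of a byte in the order bit 7 down to bit 0."""
    return [bit(value, number) for number in range(7, -1, -1)]


print(bits_high_to_low(0xC3))  # [1, 1, 0, 0, 0, 0, 1, 1]
print(bit(0xC3, 0), bit(0xC3, 2), bit(0xC3, 7))  # 1 0 1
```

The first line of output is the same pattern shown for C3 in location 102.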
As we discussed in the last chapter, we are, for the next few chapters,
confining ourselves to a single segment of memory, or 65536 bytes.
Although this is only a fraction of the memory the IBM PC can hold, it is
nevertheless a large number of bytes. When we show figures, such as the
one above, which depict a half-dozen bytes of memory, it's important to
remember that in reality the memory locations in our segment start at
0000 and go all the way up to FFFFh or 65535d, as shown, somewhat
fancifully, in Figure 3-2.
Registers
In the last chapter you were introduced to two registers: AH and DL.
We used the AH register to hold a number that told the operating system
what DOS function we wanted to perform when we executed an INT 21
call to DOS, and we used the DL register to hold the numerical ASCII
value of a particular character to be displayed. We showed these 8-bit
registers as containing pairs of hex numbers, but of course what each
really contains is eight bits:
0 0 0 0 0 0 0 1   = 01h
DL register
0 0 0 0 0 0 1 0   = 02h
AH register
[Figure 3-2. One segment of memory]
As we mentioned, a register is a physical device built into the
computer. It's something like an address in memory, but since it's part of
the microcomputer chip itself, rather than part of a memory chip located
somewhere else on the computer, data can be moved from register to
register very quickly. The 8088 instructions can also do a far wider
variety of things to registers than they can to memory locations. For
instance, arithmetic and logical operations can be performed on data in
registers, addresses stored in registers can be used to point to locations
in memory, and registers can be used to read and write data to the
peripherals connected to the computer.
Although so far we've shown you only two registers, there are actually
eight 8-bit general-purpose registers. (There are some other more
specialized registers as well, but we'll ignore them for the time being.)
These eight registers are called:
AH and AL
BH and BL
CH and CL
DH and DL
As you can see, we've arranged the eight registers in pairs. That’s
because they're arranged in pairs in the 8088 microprocessor. Why is
this?
Some data which we want to manipulate is 8 bits wide, as we saw in
the programs in the last chapter. However, we often want to be able to
operate on 16-bit-wide data. This data might be just numbers, or it
might be addresses: as we've seen, a 16-bit number can specify any
address in our current 64K data segment.
Instead of having some 8-bit registers and other 16-bit registers, the
designers of the 8088 decided to group pairs of 8-bit registers together to
form 16-bit registers.
In 8088 assembly-language format, the 16-bit register is
differentiated from the two 8-bit registers by giving it the same first letter
as the pair from which it was made, but ending it in the letter “X”:
8-bits           16-bits
AH and AL   =    AX
BH and BL   =    BX
CH and CL   =    CX
DH and DL   =    DX
A more detailed picture, showing the positions occupied by individual
bits in the AX register, looks like this:
16-bit  15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0
8-bit    7  6  5  4  3  2  1  0    7  6  5  4  3  2  1  0
         1  1  0  0  0  0  1  1    0  0  1  1  1  1  1  1
        Contents = C3              Contents = 3F
        AH register: 8 bits        AL register: 8 bits
        Contents = C33F
        AX register: 16 bits
The upper row of bit numbers shows how the bit positions are
numbered in the 16-bit register, while the lower row shows the numbering
for the two halves of the register used as two separate 8-bit
registers. The “H” in the register name stands for “high,” and the “L”
stands for “low,” since (for instance) the AH register forms the high part
of the AX register, containing the most significant bits, and the AL
register forms the low part, with the least significant bits. For example,
the most significant digits of the 4-digit hex number C33F are C3, shown
in the AH register in the figure above, and the least significant digits are
3F, in the AL register.
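The relationship between AX and its two halves can be sketched in a few lines of Python. This is only a model of the idea, not 8088 code, and the function names split_word and join_bytes are hypothetical:

```python
# Sketch: how a 16-bit AX value divides into AH (high byte) and AL (low byte).
def split_word(ax):
    """Return (AH, AL) for a 16-bit value."""
    return (ax >> 8) & 0xFF, ax & 0xFF

def join_bytes(ah, al):
    """Rebuild the 16-bit value from its high and low bytes."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)

ah, al = split_word(0xC33F)
print(format(ah, "02X"), format(al, "02X"))
```

Going the other way, join_bytes(0xC3, 0x3F) gives back C33Fh.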
These four 16-bit registers, AX, BX, CX, and DX, are in many ways
identical to one another. However, depending on the circumstance, they
may have areas of specialization. The AX register has special circuitry
that makes it more suitable for doing arithmetic and logical operations
than the other registers. The BX register can be used to point to
memory addresses in a way that other registers can’t, and the CX register
is often used for counting. As we learn more about assembly-language
instructions we'll begin to see how these specialized features of the
different registers are used.
Manipulating Registers with DEBUG
DEBUG has a command which enables us not only to look at the
registers in the 8088 (really at the hex representations of their contents),
but to change the contents as well.
Load DEBUG as described in the last chapter, and when the prompt
appears, type “R” (for “Registers”). You'll be rewarded with a very
complicated looking display:
A>debug
-r
AX=0000 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=08F1 ES=08F1 SS=08F1 CS=08F1 IP=0100 NV UP DI PL NZ NA PO NC
08F1:0100 0000 ADD [BX+SI],AL DS:0000=CD
This is the same kind of display you saw in the last chapter when you
terminated the endless loop program.
For the time being, you can ignore most of the display. Notice,
however, the first four entries on the top row:
AX=0000 BX=0000 CX=0000 DX=0000
This tells you that the contents of all four major registers in your 8088
are set to zero. The contents of the registers will change when a program
executes instructions that put data into the registers, as your programs
did in the last chapter with MOV DL,01 and MOV AH,02. For this
reason, if you use the “R” command after running a program, you may
find that some of the registers contain values other than zero.
There is another way to change the contents of the registers: You can
do it directly from DEBUG, using a variation of the “R” command. For
instance, enter the command “R” followed by the letters “AX” (for the
AX register). DEBUG will respond by printing out the letters “AX”,
followed by the current contents of AX, which in this case is 0000. It then
prints a colon and sits there waiting for you to enter a number
representing the new contents of AX.
-rax
AX 0000
:          <-- DEBUG waits for you to enter a new value
Suppose you now enter “1234”.
-rax
AX 0000
:1234      <-- You enter 1234
-          <-- Back to the DEBUG prompt
This places the hex number 1234 into the AX register. This is the same
as putting 12 in the AH register, and 34 in the AL register. With
DEBUG, you can’t access the two halves of the register separately — you
must deal with them both together. To verify that what you put in AX is
really there, type just plain “R” again:
-r
AX=1234 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=08F1 ES=08F1 SS=08F1 CS=08F1 IP=0100 NV UP DI PL NZ NA PO NC
08F1:0100 0000 ADD [BX+SI],AL DS:0000=CD
And there it is, 1234 in the AX register.
Similarly, let’s put FFFFh into the CX register:
-rcx       <-- Enter this for "Register CX"
CX 0000    <-- Current contents is 0000
:ffff      <-- Change it to FFFFh
Checking again with “R”, we get:
-r
AX=1234 BX=0000 CX=FFFF DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=08F1 ES=08F1 SS=08F1 CS=08F1 IP=0100 NV UP DI PL NZ NA PO NC
08F1:0100 0000 ADD [BX+SI],AL DS:0000=CD
You can do this with any of the four major registers: AX, BX, CX,
and DX; and with a variety of other registers as well, although we're
going to ignore these other registers for the time being.
This ability to examine and modify the contents of registers directly
from DEBUG will become important when we learn how to follow the
operation of your program while it’s running, a topic we'll cover later
when we talk about the “T” (for “Trace”) command.
ASCII Display Program
Let's write another program. This one will display all the ASCII
characters (and all the special non-ASCII IBM characters as well) on the
screen. It will also introduce you to a new instruction. Once you've
written the program, we'll show you how to save it on your disk, so that
you can execute it directly from DOS without getting into DEBUG.
From DEBUG, use the “A” command to type in the following little
program. As you can see, we've added comments to each line. You can’t
type these in, since DEBUG doesn't accept comments, but when we
explain the program, they'll help clarify its operation.
A>debug
-a
08F1:0100 mov dl,0     <-- Put first character in DL
08F1:0102 mov ah,2     <-- Specify Display Output function
08F1:0104 int 21       <-- Call DOS to print character
08F1:0106 inc dl       <-- Change to next character
08F1:0108 jmp 102      <-- Go back to display next character
08F1:010A              <-- Press Enter to end assembly
Note that the first time you use “A” after calling DEBUG, you don’t
need to specify “al00”. DEBUG assumes you want to start at 100h unless
you tell it otherwise. Use “U” to see that everything looks all right:
-u100,108
08F1:0100 B200     MOV DL,00
08F1:0102 B402     MOV AH,02
08F1:0104 CD21     INT 21
08F1:0106 FEC2     INC DL
08F1:0108 EBF8     JMP 0102
(If you're using DOS version 1, you'll have to use “E” to type in the
hex numbers B2, 00, B4, 02, CD, 21, FE, C2, EB, F8, as we explained in
the last chapter.)
The INC Instruction
As you can see, this program uses a new instruction which you
haven't seen before: INC. The purpose of this instruction is to increment
— that is, add one to — the contents of a register.
INC Instruction
Increments the contents of a register or memory address.
To increment a register:
INC BX
INC AL
To increment a memory address:
INC WORD_VAR
INC BYTE_VAR
INC TABLE[BX]
Flags affected: AF, OF, PF, SF, ZF
Figure 3-3 shows how the INC DL instruction works.
Operation of the ASCII Program
What does this INC instruction do in our program? As you can see,
the first three instructions of the program are very similar to the
program in the last chapter which printed a happy face on the screen.
The only difference is the first instruction: the constant loaded into the
DL register to be printed is 0 instead of 1 (1 is the code for a happy
face). So we can surmise that the first thing the program is going to do is
print something: whatever the character is that corresponds to 0.
The JMP instruction at the end of the program should also be a
familiar sight. As in the last program in the last chapter, this JMP takes
us back toward the start of the program (here to location 102, so that DL
is not reset to 0), turning the program into an endless loop.
That leaves only the INC instruction unexplained. Its purpose is to
increment — add one to — the DL register, every time we cycle through
the program. Since the value in the DL register determines what
character will be displayed when we call the Display Output function, we
can see that a different character will be displayed each time. The first
time through the loop this number will be 0, then 1, and so on up to
FFh, which is 255d. This is as high a number as can be expressed in a
single byte, so the next time through the loop, DL will be back at 0. Each
of the numbers from 0 to FFh represents a different character, so the
program will show them all to us on the screen, over and over again.
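The wraparound from FFh back to 0 is worth seeing once. The following Python sketch only models the 8-bit DL register; the function name inc8 is our own, and nothing here calls DOS:

```python
# Sketch: an 8-bit register incremented past FFh wraps around to 0.
def inc8(value):
    """Model INC on an 8-bit register: add one, keep only the low 8 bits."""
    return (value + 1) & 0xFF

dl = 0
shown = []                 # the values that would be handed to Display Output
for _ in range(258):       # a little more than one full trip through the set
    shown.append(dl)
    dl = inc8(dl)

print(shown[254:258])      # the values around the wraparound point
```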
Figure 3-3. Operation of the INC DL instruction
Saving the Program to Disk
Before you run the program, save it on your disk. This is a three-step
process: first we tell DEBUG the name of the program we want to save,
then we tell it how large the program is, and finally we tell it to actually
write the program to the disk.
To specify the name, we type the command “N” (for “Name”),
followed immediately (no space) by the name of the file we want the
program saved under. The filename itself can be anything we want, but
the extension must be COM if we want to execute the program later
directly from DOS. This is because a COM file has certain attributes that
make it compatible with DEBUG. (We'll have more to say about this later,
in chapter 8, on memory segmentation.) We'll call our program ASCII,
so we enter:
-nascii.com
Next, to tell DEBUG how long a program is (that is, how many bytes
we want to save), we use both the BX and the CX registers. The CX
register holds the low-order, least significant part of this number, and the
BX register holds the high-order part. Notice that we're now talking
about 16-bit wide registers, which can hold numbers up to FFFFh, or
65535d. A program this long is a big program. Since by using only the
CX register we can save programs up to this length, the chances are we'll
never have to put anything in the BX register. However, we must be sure
it’s set to zero. Then we put the actual number of bytes occupied by our
program into the CX register.
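To make the division of labor between BX and CX concrete, here is a hedged Python sketch. The function name length_to_bx_cx is hypothetical; it simply splits a byte count into the high-order 16 bits (for BX) and the low-order 16 bits (for CX), as described above:

```python
# Sketch: split a program length into the BX (high) and CX (low) halves.
def length_to_bx_cx(length):
    """Return (BX, CX) for a length that fits in 32 bits."""
    if not 0 <= length <= 0xFFFFFFFF:
        raise ValueError("length does not fit in BX:CX")
    return (length >> 16) & 0xFFFF, length & 0xFFFF

bx, cx = length_to_bx_cx(10)        # a ten-byte program
print(format(bx, "04X"), format(cx, "04X"))
```

For any program shorter than 65536 bytes the first value comes out as zero, which is why BX only needs to be checked, not filled in.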
You need to be a little careful figuring out this number. If the
program starts at 100 and the last byte used is 109 (as in the program
above), then how many bytes must be saved? Not 9, but 10d, which is Ah.
This is because we start counting at 100, not 101. So we enter:
-rbx       <-- Enter this to see the BX register
BX 0000
prompt, and enter the program name “ascii”. If they want a copy of the
program, you can copy it to another disk just as you would any other file
or program. In other words, you've written a perfectly good assembly-
language application program! It may not be quite as useful as a word
processor or a spreadsheet program, but the concept is the same.
Reloading the Program in DEBUG
If you want to modify or examine the program again, there are two
ways to get it back into DEBUG. The first way is to enter the program
name at the same time you load DEBUG. The thing to remember here is
that you must use the full filename, including the extension, as shown here:
A>debug ascii.com   <-- COM file to be loaded with DEBUG (don't forget the extension)
You can also load DEBUG first, and then load the program. This is
done by first filling in the program name with the “N” command, and
then loading it with the “L” (for “load”) command. “L” is the opposite of
“W”: it causes the COM file to be loaded back from the disk into
DEBUG. Here’s how that looks:
A>debug
-nascii.com
-l
You'll hear the disk drive whirr, and if you use “U” you'll see that
your program is back in memory again.
-u100,108
08F1:0100 B200     MOV DL,00
08F1:0102 B402     MOV AH,02
08F1:0104 CD21     INT 21
08F1:0106 FEC2     INC DL
08F1:0108 EBF8     JMP 0102
Now if you want you can also run it directly from DEBUG by using
the “G” command, just as you have done with the other programs we've
written. The moral here is that as far as DEBUG is concerned, a program
can either be typed in with “A” or “E”, or it can be loaded in from the
disk with “L”: the result is the same. Try running the program:
-g         <-- Enter "G" to run the ASCII program from DEBUG
Again, the screen will fill up with the character set, over and over
again.
SMASCII — Making the ASCII Program Smarter
It would be somewhat more elegant if our ASCII program only
printed the character set once, and then returned to DOS (or DEBUG, if
we ran it from there) without our having to interrupt it from the
keyboard. Let's modify it to do that, and at the same time introduce
another 8088 instruction: LOOP.
From DEBUG, type in the following program (not the comments, of
course):
-a100
0905:0100 mov cx,100   <-- Set up the count for the LOOP
0905:0103 mov dl,0     <-- Put the first character in DL
0905:0105 mov ah,2     <-- Specify Display Output
0905:0107 int 21       <-- Call DOS to display character
0905:0109 inc dl       <-- Change to next character
0905:010B loop 105     <-- Loop until CX is zero
0905:010D int 20       <-- Exit to DEBUG or DOS
0905:010F
Here it is disassembled with “U”:
-u100,10d
08F1:0100 B90001   MOV CX,0100
08F1:0103 B200     MOV DL,00
08F1:0105 B402     MOV AH,02
08F1:0107 CD21     INT 21
08F1:0109 FEC2     INC DL
08F1:010B E2F8     LOOP 0105
08F1:010D CD20     INT 20
This program is very much the same as the ASCII program shown
earlier, except that it starts off with a MOV CX,100 instruction, and
finishes with LOOP 105 and INT 20. We've already learned that the
INT 20 is a sort of “exit” instruction. What about MOV CX,100 and
LOOP 105?
The LOOP Instruction
LOOP is a powerful instruction that functions somewhat like a
FOR...NEXT loop in BASIC. The idea is this: You put a number, which
is the number of times you want to do something, into the CX register.
Then, every time you execute the LOOP instruction, it decrements (that is,
subtracts 1 from) the contents of the CX register. LOOP then jumps back
to the address written as part of the LOOP instruction, in this case, 105.
More accurately, it jumps to this address unless the count in CX has
reached zero. If CX contains zero, no jump takes place, and the program
goes on to the instruction following the LOOP. In other words, when CX
goes from 1 to 0, LOOP stops looping. Figure 3-4 shows the operation of the LOOP 105
instruction.
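The decrement-then-test behavior of LOOP can be modeled in Python. This is a sketch of the rule just described, not a complete emulation of the 8088, and the names loop_instruction and count_passes are ours:

```python
# Sketch: LOOP subtracts 1 from CX, then jumps only if CX is not zero.
def loop_instruction(cx):
    """Model one execution of LOOP. Returns (new CX, whether the jump is taken)."""
    cx = (cx - 1) & 0xFFFF          # CX is a 16-bit register
    return cx, cx != 0

def count_passes(initial_cx):
    """Count how many times a loop body that ends in LOOP is executed."""
    cx, passes = initial_cx, 0
    while True:
        passes += 1                 # the body runs, then LOOP is reached
        cx, jump = loop_instruction(cx)
        if not jump:
            return passes

print(count_passes(0x100))
```

One consequence of decrementing before testing: if CX holds 0 when the loop starts, the first decrement wraps it to FFFFh, so the body runs 65536 times rather than not at all.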
LOOP Instruction
Jumps to start of loop until CX register is zero.
The number of times the loop is to be executed must be
placed in the CX register before LOOP is invoked:
MOV CX,COUNT
START:
(instructions within loop)
LOOP START
Flags affected: none
MOV CX,100
MOV DL,00
MOV AH,2      <-- location 105
INT 21
INC DL
LOOP 105      <-- subtracts one from the CX register and, as long as the
                  CX register is not zero, jumps up to location 105
INT 20        <-- when the CX register becomes zero, the program goes on
                  to this instruction, the one following "LOOP 105"
Figure 3-4. Operation of the LOOP 105 instruction
The effect is that all the instructions between the LOOP and the
address pointed to by LOOP will be executed the number of times
corresponding to the value originally placed in the CX register. In our
case we want to print out all 256d possible characters, from 0 to 255.
The hex equivalent of 256d is 100h, so we put 100h in the CX register
before we begin our loop. The program will then keep incrementing the
number in the DL register from 0 to 255d, just as it did in the last
program. However, when the count in CX reaches 0, the instruction
following the LOOP will be executed, and control will pass to the INT 20
instruction, which will terminate the program.
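Here is the whole SMASCII program traced in Python, as a rough sketch. It keeps only the register values; the call to DOS is replaced by appending DL to a list (an assumption made purely for illustration), and the function name run_smascii is hypothetical:

```python
# Sketch: trace CX and DL through the SMASCII program.
def run_smascii():
    cx = 0x100                      # MOV CX,100
    dl = 0                          # MOV DL,0
    displayed = []
    while True:
        displayed.append(dl)        # MOV AH,2 / INT 21: display character in DL
        dl = (dl + 1) & 0xFF        # INC DL
        cx = (cx - 1) & 0xFFFF      # LOOP 105: subtract one from CX...
        if cx == 0:                 # ...and fall through once it reaches zero
            break
    return displayed, dl, cx        # INT 20 would end the program here

displayed, dl, cx = run_smascii()
print(len(displayed), displayed[0], displayed[-1])
```

Each of the 256 character codes is displayed exactly once, and DL has wrapped back to 0 by the time the program exits.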
Before you try out the program, save it to your disk. You can go
ahead and execute it first with “G” if you want, but saving a newly written
program to the disk before you run it is a good habit. That way if there's
a bug in the program (or a design defect) that causes DEBUG and/or
the entire operating system to crash, you won't lose the program.
-nsmascii.com    <-- Set the program name
-rbx
BX 0000
:
-rcx             <-- Set the length to Fh
CX 0000
:f
-w               <-- Write it to the disk
Writing 000F bytes
Now you can execute the program from DEBUG with “G”, or you can
get out of DEBUG and execute the program directly from DOS:
-q
A>smascii
If you wrote it right, it'll put the entire IBM character set on the
screen, once, and then return you to the DOS prompt. This is a nice
refinement over ASCII, which went on and on until you interrupted it.
Some Sound Advice
Before we wrap up this chapter, we're going to introduce you to one
more feature of your PC: its ability to produce sound. You have no doubt
seen the little speaker grill in the front of your machine, and heard it go
“beep” on start-up, and when, for instance, you type in too many
characters for the keyboard buffer to hold. You may also have used the
built-in BEEP, SOUND, and PLAY statements in BASIC. In this section
we're going to show you how to control this sound capability with
assembly language.
Assembly language gives you far better control over the sound
function than do higher level languages. We'll explore some of the really
clever things you can do with sound in chapter 7. For now, we'll simply
introduce you to the fundamentals of the sound mechanism.
To produce sound on the speaker you need to be familiar with some
powerful assembly-language concepts. The first of these new concepts is
communicating with the outside world using IN and OUT instructions.
These instructions are the only way an assembly language program can
communicate with those peripheral devices for which there are no DOS
functions. In fact, the DOS functions themselves use IN and OUT to
communicate with all the peripherals, including the screen and keyboard.
The other new concept we'll be exploring in this section is the use of
the 8088 logic instructions. These instructions are common ones in
assembly-language programs, and permit you to do logical manipulations
on the bits in certain registers.
We'll show you our sound-producing program, and then explain how
it makes use of these ideas.
The SOUND Program
Use the “A” command in DEBUG to type in the following program
(or use “E” to type in the hex values shown below).
A>debug
-a100
08F1:0100 in al,61
08F1:0102 and al,fc
08F1:0104 xor al,2
08F1:0106 out 61,al
08F1:0108 mov cx,140
08F1:010B loop 10b
08F1:010D jmp 104
08F1:010F
Make sure it’s correct with “U”:
-u100,10e
0905:0100 E461     IN AL,61
0905:0102 24FC     AND AL,FC
0905:0104 3402     XOR AL,02
0905:0106 E661     OUT 61,AL
0905:0108 B94001   MOV CX,0140
0905:010B E2FE     LOOP 010B
0905:010D EBF5     JMP 0104
Before you run the program, save it to disk (you'll be sorry if you
omit this step — don’t say we didn't warn you):
-nsound.com
-rbx
BX 0000
:
-rcx
CX 0000
:f
-w
Writing 000F bytes
Now run it:
-g
Well, what do you know? A nice tone sounds. The only trouble is, you
can’t turn it off! Now that’s a real inconvenience. Even the Ctrl-Alt-Del
key combination (resetting the system) doesn't have any effect.
You have to actually turn the whole computer off and then on again
to get rid of the sound and recapture control of the computer. At least,
this is true if you execute the program from DEBUG. If you execute it
directly from DOS, you can interrupt it from the keyboard; but, when
you execute it from DOS, the tone has an unpleasant burbling sound to
it. In chapter 7 we'll learn why this is true, and in chapter 4 we'll learn
how to guarantee that we can break into the program. But for now, let’s
content ourselves with trying to understand what our program does.
Fiddling with the Outside World
When we access most of the peripherals on the PC we can use DOS
functions to help us out. You've already seen how this works with the
Display Output function, which puts a character on the screen. When we
use a DOS function we don’t actually control the peripheral with
instructions from our program: we let the DOS routine do that for us.
This saves us a great deal of trouble and is generally just what we want.
When we access the speaker, on the other hand, we actually use
instructions in our program to cause something to happen to a physical
device. We can’t use a DOS function, because there is no DOS function
for the speaker (the creators of PC-DOS must not have thought it was
important). This gives us the opportunity to explore just how powerful
assembly language can be, when it communicates directly with devices in
the outside world.
To make sounds, our program actually turns on and off an electronic
“gate” (which is a kind of switch). Each time we turn the gate on and
then off again, we create a pulse: a brief period when current flows in a
circuit. (See the illustration below.) These pulses are amplified and sent
to the speaker, where they make a sound. This gate is turned on and off
with the OUT instruction in the program above, as we'll explain soon.
      One pulse
      +------+      +------+      +------+
      |      |      |      |      |      |
 -----+      +------+      +------+      +-----
      ^      ^
      |      Gate turned off
      Gate turned on
The faster we send the pulses, the higher the pitch of the sound. We
can control how fast we send the pulses by putting a delay into our
program. We turn the gate on, delay, turn the gate off, delay, and so on.
The LOOP instruction in the program above is used to cause the delay,
as we'll see later.
The OUT Instruction
The instruction which turns the gate on and off is the OUT 61,AL in
location 106 of our program. To understand what this instruction is
doing you need to know about “Input/Output ports.”
Ports are somewhat like registers, in that you send 8-bit or 16-bit data
to them (from the AL or AX register). You can also read their contents
back into the AL or AX register. However, the big difference between
ports and registers is that ports are connected to physical devices in the outside
world. (See Figure 3-5.)
So when you change something in a port, you are sending a message
to some peripheral device, such as the video screen, disk drive, or in our
case, the speaker. There can be, theoretically, up to 64K ports in the IBM
PC. In reality only a small fraction of these are used, since there are
usually fewer than a dozen peripherals connected to the PC.
OUT Instruction
Sends byte or word to input/output port.
To output a byte to port number PORTNO:
OUT PORTNO,AL
To output a word to port number PORTNO:
OUT PORTNO,AX
The port number can also be placed in the DX register, prior
to executing the OUT
MOV DX, PORTNO
OUT DX, AL
Flags affected: none
The OUT instruction is something like MOV, except that it doesn’t
MOVe a byte into a register, it MOVes it (copies it actually) from a register
to a port. The register the byte (or word) is to be moved from (it must be
either AL or AX), and the number of the port to be moved to, are
specified in the instruction. Thus,
OUT 61,AL
causes the contents of the AL register to be placed in port number 61.
Figure 3-6 shows the operation of the OUT 61,AL instruction.
AL or AX register  <-->  8-bit or 16-bit I/O port  <-->  Peripheral device
                                                         (gates, oscillators,
                                                         keyboard, display, etc.)
Figure 3-5. Input/output ports
Getting Down to Bits
In the case of beeping the speaker we're really only concerned with
two of the bits in the byte we send to port number 61h. These are
number 1 and number 0 (which as you recall from the bit numbering
diagrams earlier in this chapter are the two bits on the right):
bit 7  bit 6  bit 5  bit 4  bit 3  bit 2  bit 1  bit 0
  X      X      X      X      X      X     1/0     0
PORT NUMBER 61h
bit 1: we want to toggle this bit on and off
bit 0: we want to turn this bit off
Bit 0 is connected to an oscillator which we want to turn off and leave
off, so we want to set this bit to zero. (The oscillator is used in another
method of sound generation which we'll learn about in chapter 7. For the
time being we just want to deactivate it.) Bit 1 is connected to the gate
which generates the pulses to be sent to the speaker, as we discussed. The
other bits in this port (those marked “X” in the illustration above) do
various other things, not related to the speaker (such as turning the
cassette motor on and off), and for this reason must not be changed. Our
goal is then to set bit 0 to 0, and turn bit 1 on and off just fast enough to
make a nice tone in the speaker, but not change the other bits. How do
we do that? We're going to need some more 8088 instructions.
The IN Instruction
In order to change bits 0 and 1 in port 61h without changing the
others, we need to find out how all the bits are initially set: whether to 0 or
to 1. Once we know how they're set, we can write the unchanged values
of the ones we don’t want to change back into the port, and at the same
time write the new values of the ones we do want to change.
Figure 3-6. Operation of the OUT 61,AL instruction
IN Instruction
Receives byte or word from input/output port.
To input a byte from port number PORTNO:
IN AL, PORTNO
To input a word from port number PORTNO
IN AX, PORTNO
The port number can also be placed in the DX register, prior
to executing the IN
MOV DX, PORTNO
IN AL,DX
Flags affected: none
So how do we find out the initial values of the bits in the port? We
use the IN instruction, which is the opposite of OUT. IN reads the byte
from a port into a register. The register and the port number are specified
in the instruction. Thus,
IN AL, 61
takes the byte in port 61 and reads it into the AL register. Figure 3-7
shows the operation of the IN AL,61 instruction.
Figure 3-7. Operation of the IN AL,61 instruction
The AND Instruction
Once we've read the contents of port 61 into AL, we want to change
bits 0 and 1 so we can send them back out to the port. This is a two-step
process. First we get rid of the old value of these bits using an AND
instruction, then we use a special instruction called XOR to “toggle” the
bit connected to the gate which drives the speaker.
First we'll set the two unwanted bits to zero with an AND instruction.
AND Instruction
Performs logical “AND” on two operands. Result
(conjunction) is stored in leftmost operand.
Register with register
AND AL, BL
AND BX, CX
Immediate with register or memory
AND DL, BYTE
AND MEM_WORD, WORD
Register with memory and vice versa
AND MEM_BYTE, DBYTE
AND DBYTE, MEM_BYTE
Flags affected: CF, OF, PF, SF, ZF
Flags undefined: AF
As you may recall from BASIC or some other higher-level language,
ANDing two bytes together has the effect of turning off the bits in the
result unless both of the corresponding bits in the two bytes are set to 1.
If only one, or neither one, of the corresponding bits is set to 1, the
resulting bit is set to 0.
We can summarize this in the following table:
0 AND 0 = 0
0 AND 1 = 0
1 AND 0 = 0
1 AND 1 = 1
As an example we'll AND two bytes together:
01001111   <-- This
11110001   <-- ANDed with this
01000001   <-- Gives this
If you're not familiar with the use of this operation you might want
to work out a few more examples before going on. (Notice that nothing is
carried to the adjacent column in logical operations: each column is a
separate stand-alone calculation.)
AND can be used to “get rid of” (that is, set to zero) the bits we don't
want in a byte, while at the same time keeping the bits we do want. This
is often called “masking off” the unwanted bits. In our program we want
to mask off the two least significant bits, 0 and 1, while keeping all the
others. So we AND the byte in the AL register with the hex number FC,
which is 11111100 in binary.
Suppose the number we read from the port is 4Dh, which is
01001101 binary. We mask off the two lower bits by ANDing it with FCh,
which gives 4Ch, as shown in the figure below:
01001101   <-- Number read from port
11111100   <-- ANDed with this
01001100   <-- Gives this
The operation of the AND AL,FC instruction is shown in Figure 3-8.
Figure 3-8. Operation of the AND AL,FC instruction
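The masking step can be checked with a short Python sketch. Python's & operator performs the same bit-by-bit AND; the names MASK and mask_low_two_bits are our own:

```python
# Sketch: mask off bits 0 and 1 of a byte, as AND AL,FC does.
MASK = 0xFC                          # 11111100 binary

def mask_low_two_bits(value):
    """Clear bits 0 and 1 and keep bits 2 through 7 unchanged."""
    return value & MASK

result = mask_low_two_bits(0x4D)     # 01001101 binary, the example value above
print(format(result, "02X"), format(result, "08b"))
```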
The XOR Instruction
Now that we've gotten rid of bits 1 and 0, we need a way to turn bit 1
on and off, over and over again, to generate our pulse train. What works
nicely for this is a semi-magical instruction called XOR.
XOR Instruction
Performs logical Exclusive OR on two operands. Result
(disjunction) is stored in leftmost operand.
Register with register
XOR AL, BL
XOR BX, CX
Immediate with register or memory
XOR DL, BYTE
XOR MEM_WORD, WORD
Register with memory and vice versa
XOR MEM_BYTE, DBYTE
XOR DBYTE, MEM_BYTE
Flags affected: CF, OF, PF, SF, ZF
Flags undefined: AF
We call the XOR instruction “semi-magical” because if something is
on, XOR can turn it off, and if something is off, XOR can turn it on.
How? XOR stands for “Exclusive OR,” which means “either one or the
other, but not both.” In terms of how it operates on bits, XOR looks like
this:
0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
For example,
11000011   <-- This
11110000   <-- XORed with this
00110011   <-- Gives this
Notice how a 1 XORed with a 1 is 0, while a 0 XORed with a 1 is a 1.
If we repeatedly XOR a 1 with another bit, that bit will turn on, then off,
then on, and so forth, as shown in the diagram below:
0 XOR 1 = 1
1 XOR 1 = 0
0 XOR 1 = 1
1 XOR 1 = 0
^     ^   ^
|     |   This toggles back and forth
|     This is the switch
This toggles back and forth
No such toggling action takes place when we XOR a 0 instead of a 1.
In fact, XORing a 0 to another bit leaves the other bit unchanged:
0 XOR 0 = 0
1 XOR 0 = 1
(The result on the right is the same as the bit on the left: XORing with 0
leaves the bit unchanged.)
In our particular case what we want to do is turn on and off bit 1 in
the AL register, so we XOR AL with the hex number 2, which is
00000010 binary. This leaves all the bits except bit 1 unchanged, while if
bit 1 was a 0 it now becomes 1, and if it was a 1 it becomes 0. The
operation of the XOR AL,2 instruction is shown in Figure 3-9.
So in our program all we need to do is put this XOR instruction in a
loop along with the OUT instruction, and then every time we go through
the loop we'll change bit 1 in port 61 from 1 to 0 or from 0 to 1, thus
“toggling” (switching) it repeatedly on and off.
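A few lines of Python show the toggling at work. This is only an illustration: the starting value 4Ch is borrowed from the AND example earlier, and the variable names are ours:

```python
# Sketch: XORing with 2 flips bit 1 on every pass and leaves the rest alone.
TOGGLE = 0x02                # 00000010 binary
al = 0x4C                    # 01001100 binary, bits 0 and 1 already masked off
history = []
for _ in range(4):
    al ^= TOGGLE             # the same operation as XOR AL,2
    history.append(al)

print([format(v, "08b") for v in history])
```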
The Time Delay
The only thing left to explain about this program is the time delay in
lines 108 and 10B. A delay is necessary in the loop that toggles bit 1 on
and off, because the computer executes its instructions so very rapidly
compared with the frequency of sound. The 8088 can whiz through our
little program so fast, in fact, that the tone generated in the speaker
would be far too high to be heard by human ears (or even dog ears).
So we need to slow things down. We do this by setting up a LOOP
instruction to cause a delay. This is done by putting the appropriate-sized
count in the CX register, and then simply executing the LOOP
instruction that many times. The LOOP is written to jump to itself until
the count in CX goes to zero.
    MOV  CX,140    ← Sets count of 140h into loop counter
    LOOP 10B       ← Jumps to itself 140h times
By trial and error, we can determine that the length of time taken by
the LOOP instruction, times 140h (320d), produces a delay just long
enough to make a tone with a pitch in the audible range.
When the LOOP instruction has finished jumping to itself, JMP 104
is executed and control goes back up to toggle the bit again. The effect is
bit-on, delay, bit-off, delay, and so on.
Figure 3-9. Operation of the XOR AL,2 instruction
Here's an annotated version of the program to summarize how the
various instructions work together:
-u100,10e
0905:0100 E461     IN   AL,61     ← Get old value from I/O port
0905:0102 24FC     AND  AL,FC     ← Mask off lower two bits
0905:0104 3402     XOR  AL,02     ← Toggle bit 1 (on or off)
0905:0106 E661     OUT  61,AL     ← Send result to port
0905:0108 B94001   MOV  CX,0140   ← Set up delay of 140h cycles
0905:010B E2FE     LOOP 010B      ← Repeat this instruction 140h times
0905:010D EBF5     JMP  0104      ← Go back to toggle again
That's all there is to the program. It’s a lot like someone standing next
to a wall switch, flicking it on and off as fast as they can — except, of
course, that the program is faster than the fastest human fingers, and the
switch is connected to a speaker instead of a light.
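To see how the seven instructions cooperate, it can help to trace them by hand. The following is a rough Python model of the program, not a real emulator: the function name, the starting port value 4Fh, and the idea of stopping after a fixed number of passes are all assumptions made for the sketch (the real program never stops).

```python
def run_toggle_program(port_61, passes):
    """Trace the values the program would send to port 61h."""
    al = port_61                 # IN   AL,61
    al &= 0xFC                   # AND  AL,FC   (clear bits 0 and 1)
    written = []
    for _ in range(passes):      # each pass stands for one trip through JMP 104
        al ^= 0x02               # XOR  AL,02   (toggle bit 1)
        written.append(al)       # OUT  61,AL
        cx = 0x140               # MOV  CX,140
        while cx:                # LOOP 10B: decrement CX, repeat until it is zero
            cx -= 1
    return written

trace = run_toggle_program(0x4F, 4)          # 4Fh is a made-up starting value
print([format(v, "02X") for v in trace])     # ['4E', '4C', '4E', '4C']
```

The values alternate between 4E and 4C: bit 1 goes on, off, on, off, with a delay loop between each change, while bit 0 stays cleared by the AND.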
Changing the Pitch
Want to change the pitch of the sound generated by our program?
All you have to do is change the number you load into the CX register.
This changes the delay, which changes how rapidly the gate is toggled,
which changes the frequency of the sound. Smaller numbers will cause
less delay, which will increase the frequency and generate a higher tone.
Larger numbers will lower the tone.
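For the curious, here is a back-of-the-envelope estimate of the pitch, written in Python. It rests on two assumptions that do not come from this chapter: a processor clock of about 4.77 MHz, and 17 clock cycles for each LOOP that jumps. It also ignores the handful of other instructions in the loop and any memory delays, so treat the result as a ballpark figure only.

```python
CLOCK_HZ = 4_772_727      # assumed: standard IBM PC clock, about 4.77 MHz
LOOP_CYCLES = 17          # assumed: clocks for one LOOP instruction that jumps

def estimated_pitch_hz(count):
    """Rough tone frequency when CX is loaded with `count` before the LOOP."""
    half_period = count * LOOP_CYCLES / CLOCK_HZ   # seconds the bit stays on (or off)
    return 1.0 / (2 * half_period)                 # one "on" plus one "off" is a full cycle

for count in (0x140, 0x100):
    print(f"CX = {count:03X}h -> roughly {estimated_pitch_hz(count):.0f} Hz")
```

Whatever the true cycle counts are, the relationship is the one described above: the frequency is inversely proportional to the number in CX, so a smaller count gives a higher tone.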
Let's raise the tone a bit by changing the 140h to 100h.
0905:0108 B94001   MOV  CX,0140
            ↑
            Change this from 40 to 00 to increase pitch
Load the program from DEBUG (unless the program is still in
memory, of course) and use “E” to change the 40 in location 109 to 0
(this will change the 140h to 100h).
A>debug sound.com
-e109
0905:0109  40.0
The resulting program is identical to the first one, except for the one
changed byte:
-u100,10e
0905:0100 E461     IN   AL,61
0905:0102 24FC     AND  AL,FC
0905:0104 3402     XOR  AL,02
0905:0106 E661     OUT  61,AL
0905:0108 B90001   MOV  CX,0100   ← Shortened delay raises pitch
0905:010B E2FE     LOOP 010B
0905:010D EBF5     JMP  0104
Now run the program again. You should hear the difference in pitch.
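You may wonder why changing the single byte at location 109 from 40 to 00 turns 140h into 100h. The listing shows the operand of MOV CX stored as the bytes 40 01, low byte first. Here is a small Python sketch of that patch; the variable and function names are invented for the illustration.

```python
BASE = 0x100     # the program starts at offset 100h
code = bytearray.fromhex("E461 24FC 3402 E661 B94001 E2FE EBF5")

def cx_operand(program):
    """Read the 16-bit number that follows the B9 (MOV CX) opcode at 108h."""
    low = program[0x109 - BASE]      # low byte is stored first
    high = program[0x10A - BASE]     # high byte follows it
    return low | (high << 8)

before = cx_operand(code)
code[0x109 - BASE] = 0x00            # what "E" at 109, followed by 0, does
after = cx_operand(code)
print(f"{before:04X}h -> {after:04X}h")   # 0140h -> 0100h
```

Only the low byte was touched; the 01 in location 10A still supplies the high half of the count.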
Of course, you have to restart your whole system after this
experiment, so it’s not a convenient program to experiment on very
much. Later we'll show you how to transform it into a more useful
program.
Summary
In this chapter we've talked some more about how assembly language
uses the computer's memory and registers. You've learned how to
examine and modify the 8088's main registers — AX, BX, CX, and DX
— using DEBUG’s “R” command; how to save a program on disk using
the “N” and “W” commands, and how to get it back again using “N” and
“L.” You know how to make the speaker produce a tone. And finally,
you've learned some more 8088 instructions: INC, LOOP, IN, OUT,
AND, and XOR.
4
Inside DOS—The Disk
Operating System
Concepts
The purpose of DOS
The different parts of DOS
The IP register
Memory buffers
Indirect addressing
Using the BX register as a pointer
Sending messages to the printer
Sending control codes to the printer
Debug Commands
RIP    To change the IP register
8088 Instructions
DB     Define byte to assemble strings (pseudo-op)
DOS Function Calls
Keyboard Input
Print String
Buffered Keyboard Input
Printer Output
Applications
EMPHAP program — Turns on printer's “emphasized” print
NORMALP program — Restores printer's normal print
There is a close
connection between the assembly-language programs which run in the
computer and DOS, the Disk Operating System. In this chapter we're going to talk about DOS, what it does, and how it relates to assembly
language. We'll also write some programs that will extend your
understanding of this relationship and teach you more about assembly
language.
What Is a Disk Operating System?
You're probably already aware of many of the user-level functions of
the DOS on your PC. Whenever you see the “A>” prompt it is DOS that
has printed it, and when you type in a command like DIR or COPY, it’s
DOS that carries out the command. Also, when you type a program
name like ASM or BASICA, it's DOS that finds the program and loads it
into memory, and is waiting there to resume control when your program
is finished.
So one of the primary purposes of DOS is to manage other
programs, by keeping them on the disk in such a way that they can be
called by name, loaded into memory, and executed; and by providing
functions to permit you to list, copy, and erase these programs or data
files.
These “file management” operations are an essential part of DOS,
but they are not the whole story. Beneath the file management part of
DOS is another, more sophisticated level, which can only be reached
through assembly language. What is this deeper level of DOS, and what
does it do?
The Historical View
The earliest operating systems performed only the file management
functions, and provided no further interaction or assistance to other
programs using the system, once they were loaded. Thus, if you wanted
to write an assembly language program to, say, put a character on the
video screen, you had to figure out exactly how the video circuitry
worked, and then go through a complex series of instructions to tell this
circuitry where to put the character. Similarly, if you wanted to write a
file to the disk, you had to understand the most minute details of the disk
operation, such as where every byte was located on the disk, how fast the
disk was spinning, and how long it took the stepping motor to reach
different tracks. As you can imagine, this made programs very long and
complex. Figure 4-1 shows schematically what this looked like.
These early operating systems had another disadvantage, too. If you
physically interchanged your video terminal, your disk drive, or some
other device, for one of a different kind, then you had to rewrite all the
programs that used these devices, since the instructions in your program
that worked for one kind of device would not work for another that was
even slightly different. Worst of all, your programs would only run on
other computers which were exactly the same as yours: same video
terminal, same disk drives, same everything. It was impossible to
transport a program from one brand of computer to another. All this was
very inconvenient.
DOS to the Rescue
Then someone had a very clever idea. This idea depended on the fact
that the routines to access the peripheral devices — the video terminal,
the disk drives, and so on — were already in the disk operating system. They
had to be there, because DOS needed to interact with these peripherals.
The clever idea was this: Why not make these routines accessible to other
programs? That way, if you wanted to, say, write a character to the video
screen, you wouldn’t have to know anything about the video circuitry. All
you would need to know was the entry point of the video routine and how
to tell it what character to print. Then you could let the DOS routine
worry about all the tedious hardware-dependent details. Figure 4-2 shows
a modern operating system, which lets the user program make use of its
input/output routines.
Figure 4-1. Old-fashioned operating systems
Does this remind you of anything? Have you realized that you've
already written programs that use routines in DOS? The happy face
programs use the Display Output function call to print characters on the
screen. This function call required only three instructions:
    MOV  DL,1     ← Put ASCII character in DL register
    MOV  AH,2     ← Put DOS function number in AH register
    INT  21       ← Interrupt #21 call to DOS
Putting a character on the screen would have required dozens of
assembly-language instructions if we had written the routine to do it
ourselves, as we would have had to do with the old-fashioned kind of
operating system. We would have needed to worry about such topics as
what mode the display was in, what the horizontal retrace was doing,
whether the character was a linefeed (if so, we'd need to move the cursor
down a line), whether we were on the bottom line (if so, we might need
to scroll the screen up), and so on. But by simply calling a routine in
DOS, we have changed our task from an extremely complex one,
requiring detailed understanding of the computer's hardware, to a
comparatively simple one needing only a few facts about the operating
system, and only three instructions.
Figure 4-2. Modern operating systems
Using the Speaker — No Help from DOS
Remember the routine we wrote in the last chapter to make a “beep”
sound on the speaker? This program provides a small example of the
difficulties involved in writing our own routines to access a peripheral. In
this routine we had to figure out all sorts of details, such as how long a
delay loop to make to produce a given pitch. The resulting program was
seven instructions long. If there were a DOS function call to perform this
function (which there isn’t), it would require no detailed understanding
of how the speaker works, and could get by with only two instructions:
    MOV  AH,99    ← Hypothetical number of BEEP function
    INT  21       ← Call DOS
Of course, beeping the speaker is one of the simplest I/O jobs we can
perform. The advantages to be gained by using DOS routines are much
greater for other peripherals, such as the keyboard and disk drive, as we'll
see.
Program Transportability
Besides the convenience of being able to write shorter programs and
not needing detailed knowledge of how to program peripherals, there is
another big advantage to letting DOS do our input/output. Our program
will work even if we replace some of these peripherals — like the video
terminal or the disk drives — with completely different models from
different manufacturers.
In fact, our program will even work on an entirely different computer,
provided it uses the MS-DOS operating system. Since MS-DOS is very
similar to PC-DOS (as we noted in the Introduction), you can take your
happy face program and run it on any of the so-called “IBM compatible”
computers that use MS-DOS. The DOS function calls will have the same
numbers, and be accessed in the same way, so your program will operate
just as before. On a small program like “happy face” this is hardly an
earth-shaking issue, but if you have invested thousands of hours in a
sophisticated accounting or word-processing program, it’s nice to know
that it can be used on a variety of different computers, with little
additional programming investment. It's also nice to know that what you
learn in this book is applicable to other computers besides the IBM.
Something Has to Change
Of course, something has to change when you try to run the same
program on a computer with different peripherals, or on a different
make of computer. What changes are the input/output routines, buried
somewhere in DOS, which actually communicate directly with the
physical device. Thus, if you got a different kind of disk drive, or video
terminal, or wanted to use the operating system on a different computer
altogether, then you would need to change your operating system to work
with this new device. Actually, only part of DOS needs to be changed
when these routines are changed, the part called IBMBIOS.
We're going to learn more about IBMBIOS and the other parts of
DOS in a moment. First, however, let's explore another example of a DOS
function call, so you can begin to see the variety of different things these
calls can do for your programs.
The KEYBOARD INPUT Function
KEYBOARD INPUT Function — Number 01h
Enter with:
Reg AH=1
Execute:
INT 21
Return with: keyboard character in Reg AL
Ctrl-Break causes exit from function
You might think it would be a comparatively simple task to read a
character from the keyboard into your program. Actually, it is if you use
the DOS function call we’re about to describe. If you wanted to write the
code to do it yourself, it would take ten pages of code! How do we know?
Because that's how much code IBM used in the ROM routine built into
the PC, as you can see by looking at appendix A in the IBM Personal
Computer Technical Reference manual.
What does all this ROM code do? Well, for example, it has to figure
out if the Alt or Shift or Ctrl keys are pressed, and what they mean
when combined with other keys. It has to know what to do if
Ctrl-Break or Ctrl-Alt-Del are pressed. It has to store normal key
entries in a buffer (an area of memory), so that if your program is busy
doing something else while you are typing, no keystrokes will be lost. If
this buffer gets full, the routine has to sound the beeper to let you know.
And so on, and so on. Aren't you glad you don’t have to figure all this
out every time you want to read a character from the keyboard?
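To make the idea of a keyboard buffer a little more concrete, here is a toy model in Python. It is only a sketch of the behavior just described, not IBM's ROM code: the class name, the capacity, and the decision to throw away a keystroke that arrives when the buffer is full are all assumptions made for the example.

```python
from collections import deque

class KeyboardBuffer:
    """Toy type-ahead buffer: keys wait here until the program asks for them."""

    def __init__(self, capacity=15):      # capacity is an assumption for this sketch
        self.capacity = capacity
        self.keys = deque()
        self.beeps = 0

    def key_pressed(self, ch):
        """Called when a key arrives, whether or not a program is waiting."""
        if len(self.keys) >= self.capacity:
            self.beeps += 1               # buffer full: sound the beeper, drop the key
        else:
            self.keys.append(ch)

    def read_key(self):
        """Called by the program; returns the oldest keystroke, or None if empty."""
        return self.keys.popleft() if self.keys else None

buf = KeyboardBuffer(capacity=3)
for ch in "abcd":                         # type four keys while the program is busy
    buf.key_pressed(ch)
first = buf.read_key()
print(first, buf.beeps)                   # a 1
```

The first three keystrokes are kept in the order they were typed; the fourth finds the buffer full and produces a beep instead.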
We'll be talking more about these ROM routines in chapter 9. Until
then, all you need to know about them is that there are routines built into
ROM to help with input/output, and that DOS makes use of these
routines to simplify assembly-language programming.
Here's a short program that makes use of the Keyboard Input
function. Get into DEBUG, and type the following:
A>debug
-a100
08F1:0100 mov ah,1
08F1:0102 int 21       ← Enter these instructions
08F1:0104 int 20
08F1:0106              ← Press Enter to leave the “A” command
This is an even shorter program than the happy face one! To make sure
it’s accurate, unassemble it with “U”:
-u100,105
08F1:0100 B401     MOV  AH,01
08F1:0102 CD21     INT  21
08F1:0104 CD20     INT  20
Now, to execute this program, you type “G”. Uh, oh — nothing seems
to be happening. The computer is just sitting there. Is it stuck? No
problem. Just press any key, “z” for example.
-g
z
Program terminated normally
The computer comes back to life, and you're in DEBUG again. What
was all that about? Nothing mysterious. When you started the program,
the first instruction put a 1 in the AH register to tell DOS that we wanted
to execute the Keyboard Input function. Then INT 21 called DOS (as
you know), which took us straight to the routine to read a character from
the keyboard. The function waits until something is typed before it lets
the program go on, so until we hit a key the program sits there, looping
endlessly in the DOS routine.
Once we strike a key, the function terminates, and the next
instruction in our program is executed, which is the INT 20, which
terminates the program and returns us to DEBUG. The ASCII value of
the character is also returned in the AL register, although this short
program does not make use of that fact.
Potential Trouble
There's one character which will cause things to act a little differently
if you type it in this program: the Ctrl-Break key. Let’s see what
happens:
-g                 ← Run the program
z                  ← Type a normal character
Program terminated normally
-g                 ← Run the program again
^C                 ← Type Ctrl-Break
AX=0100 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=08F1 ES=08F1 SS=08F1 CS=08F1 IP=0104 NV UP DI PL NZ NA PO NC
08F1:0104 CD20     INT  20
Wow! You get the printout of all the registers that you got before by
typing DEBUG’s “R” command. And now, try typing “g” to run the
program again:
-g                 ← Run it
                   ← It doesn't wait for you to type something!
Program terminated normally
-g                 ← Try it again
                   ← Same result
Program terminated normally
Something’s gone wrong with the program. It no longer waits for our
input from the keyboard when we type “G” to run it; it says “Program
terminated normally” immediately. Why is that?
The Instruction Pointer Register
Look closely at the register display we just saw. There’s a new part of
this display you should learn about, in order to understand where our
program went awry. In the middle of the middle row it says “IP=0104."
Why is this important? To understand its significance, you need to know
that the 8088 keeps track of where it is in a program by keeping the
address of the next instruction to be executed in the IP register. The IP
register is a 16-bit register something like AX, BX, and so on, except that
it is used only to hold the address of that instruction. Each time an
instruction is executed, the 8088 updates the IP register to point to the
next instruction.
The 8088 microprocessor keeps track of where it is with the
Instruction Pointer (IP) register.
Thus, at the beginning of our program, IP contains 100, since that’s
where all programs are supposed to start in DEBUG. In fact, DEBUG
puts this value into IP when it’s first loaded, as you can see by loading
DEBUG and typing “R” immediately. After we execute the first
instruction in our program, the IP contains 102, since that’s the address
of the next instruction. And finally, for the last instruction, it contains
106. When the program terminates with an INT 20 instruction, DEBUG
automatically sets the IP register back to 100, so that it’s ready to start
the program again.
Now, the reason our program doesn’t work the way it should is this:
when you hit Ctrl-Break, DEBUG terminated the program right in
the middle, just before the program had a chance to execute the INT 20
instruction. The instruction shown in IP in the register display is the one
about to be executed. So, when we return to DEBUG after the Ctrl-Break,
the IP contains 104, not the 100 that it should. Since the program has
not terminated with an INT 20 instruction, the IP will not be reset to
100. So when you type “G”, DEBUG will start the program at whatever
address is in the IP register. If 104 is in IP, then that’s where the program
will start. But the only thing at 104 is an INT 20, which will terminate
the program and bring us straight back to DEBUG with "Program
terminated normally.” The call to the Keyboard Input function will never
be executed.
How can you start over at the beginning of the program? It turns out
you can modify the contents of the IP register with DEBUG, just as you
can the AX and other general purpose registers. Enter “R”, followed by
“IP”:
-rip               ← You type this, to see the IP register
IP 0104            ← Contents of IP
:100               ← Type this to change it to 100
-g                 ← Now try the program again
z                  ← It waits for you to type a character!
Program terminated normally
So we've fixed it! The moral is that it’s only when your program starts
at 100 and terminates itself with an INT 20 that DEBUG will
automatically reset the IP to 100. If you start the program somewhere
else, or terminate it in the middle, then you can’t be sure what may be
left in the IP. To avoid problems, get in the habit of checking the IP by
typing “RIP”, and setting it back to 100 if necessary, before you type “G”
to run a program.
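If the business with IP still seems slippery, this little Python model of the “G” command may help. It is a sketch under stated assumptions, not a description of DEBUG's internals: the names are invented, and the rule that a normal INT 20 exit leaves IP where “G” started is simply our reading of the sessions shown above.

```python
PROGRAM = {                       # address: (instruction, length in bytes)
    0x100: ("MOV AH,1", 2),
    0x102: ("INT 21", 2),
    0x104: ("INT 20", 2),
}

def go(start_ip, break_in_int21=False):
    """Model "G": return (instructions executed, IP left behind afterwards)."""
    ip = start_ip
    executed = []
    while True:
        text, size = PROGRAM[ip]
        executed.append(text)
        ip += size                            # IP moves on to the next instruction
        if text == "INT 21" and break_in_int21:
            return executed, ip               # Ctrl-Break: stopped in mid-program
        if text == "INT 20":
            return executed, start_ip         # assumed: normal exit restores IP

ran, ip = go(0x100, break_in_int21=True)
print(ran, format(ip, "04X"))                 # ['MOV AH,1', 'INT 21'] 0104
ran, ip = go(ip)
print(ran, format(ip, "04X"))                 # ['INT 20'] 0104
ran, ip = go(0x100)                           # after setting IP back to 100 with RIP
print(ran, format(ip, "04X"))                 # ['MOV AH,1', 'INT 21', 'INT 20'] 0100
```

The second run never reaches the keyboard call, because it begins at 104 where only the INT 20 is waiting. Putting 100 back into IP restores the full three-instruction run.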
Typing in a Sentence
Suppose we wanted to use the Keyboard Input function to type in
something longer than a single character. As you might guess, we can
simply change the INT 20 to a jump back to the beginning of the
program: JMP 100. Here’s what you type in:
-a100
08F1:0100 mov ah,1
08F1:0102 int 21
08F1:0104 jmp 100
08F1:0106
And here it is disassembled with “U”:
-u100,105
08F1:0100 B401     MOV  AH,01
08F1:0102 CD21     INT  21
08F1:0104 EBFA     JMP  0100
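You may have noticed that the bytes for JMP 100 are EB FA, with no 0100 anywhere in sight. The text doesn't go into this, but the second byte of a two-byte jump like this one is a distance, counted from the instruction that follows the jump, rather than an address. The Python sketch below (the function name is our own) reproduces the second byte for the jumps we have seen so far.

```python
def short_jump_byte(instr_addr, target):
    """Displacement byte for a two-byte short jump (such as JMP or LOOP)."""
    disp = target - (instr_addr + 2)       # measured from the following instruction
    if not -128 <= disp <= 127:
        raise ValueError("target is out of range for a short jump")
    return disp & 0xFF                     # negative distances wrap around as one byte

print(f"{short_jump_byte(0x104, 0x100):02X}")   # FA  (JMP 100 at 0104)
print(f"{short_jump_byte(0x10B, 0x10B):02X}")   # FE  (LOOP 10B at 010B)
print(f"{short_jump_byte(0x10D, 0x104):02X}")   # F5  (JMP 104 at 010D)
```

These match the EBFA shown here and the E2FE and EBF5 in the speaker program of the last chapter.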
Now when we run the program we can type in a whole sentence.
While you're typing you can experiment with some of the editing
features built into this system call. For instance, if you make a mistake,
the Backspace key will back-space. If you type Ctrl-J (the “J” key pressed while the Ctrl
key is held down) you'll get a linefeed. And if you hit Enter, the
cursor will return to the start of the line, although you will still be in the
program. If Enter does this, how do we escape from our program?
Press Ctrl-Break, and without further ado you'll be back in DEBUG:
Now is the time for all good men to come to the aid of their country
^C
AX=010D BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=08F1 ES=08F1 SS=08F1 CS=08F1 IP=0104 NV UP DI PL NZ NA PO NC
08F1:0104 EBFA     JMP  0100
Actually it’s not quite as clean as this, because the ^C prints over the
first part of the phrase you typed in:
^Cw is the time for all good men to come to the aid of their country.
Notice how the registers have all been printed out again, as they were
when we typed Ctrl-Break in the single-key program above. Again, the
IP contains 104. But this time it doesn’t matter if we set it back to 100 or
not: since the program consists of an endless loop, we can get into it
anywhere without changing its operation.
You may be concerned that the programs we've used to demonstrate
these functions so far don’t seem to do anything very useful. Don’t worry.
At the end of this chapter, and in the next chapter, we'll combine the
function calls we've learned into larger programs that will actually
perform useful services and amaze your friends. Now, however, let’s go
back and talk about the various parts of DOS, and where these function
calls fit into the overall DOS organization.
The Parts of DOS
Earlier we mentioned ROM and IBMBIOS, and said that they were
parts of the Disk Operating System. Let’s stop a minute now and describe
the major parts of DOS. This will give you a rough idea of what the
various parts of DOS do, and where they fit in the computer's memory.
Be aware, however, that you really don’t need to know a great deal about
the internal workings of DOS to write programs in assembly language. So
don't worry if some of the details of its operation seem a little vague at
this point; you'll learn more about the operating system as we go along.
DOS is divided into four major parts: ROM, IBMBIOS, IBMDOS,
and COMMAND. They are loaded into memory like this:
00000    Bottom of memory
         IBMBIOS
         IBMDOS
         Resident part of COMMAND
         Space available for user programs
         Transient part of COMMAND
         ← Highest available RAM address
           (FFFF for 64K memory, 1FFFF for 128K memory, etc.)
           .
           .
         ROM routines
FFFFF    Highest part of address space (always the same)
Notice that the lowest addresses are shown at the top of the diagram.
This may seem backwards, but it’s the way program listings are written,
and it’s the way IBM does it, so for consistency we're going to follow this
format too.
To understand the roles played by the various parts of DOS, it’s
helpful to think of the entire operating system as some sort of large
industrial corporation — we could call it “DOS Incorporated.” The
different parts of the system then correspond (very roughly) to the
different management levels in the corporation.
The Workers: ROM (Read Only Memory)
ROM stands for “Read Only Memory.” It corresponds to the blue-
collar workers down on the floor of the factory, getting the actual work
done. In DOS this work might be sending characters to the display
screen, reading information from the disk drive and the keyboard, and so
forth. By getting the work done we mean that the routines in ROM send
instructions to peripheral devices such as the keyboard and disk drive
that actually do things in the outside world. This is the point where
software “interfaces” (connects with) hardware.
The “products” that the ROM routines are producing are generally
concerned with moving information from hardware to software and vice
versa: reading a character from the screen into memory, sending a group
of data from memory to the disk, and so on. In other words, ROM
contains most of the actual input/output routines that communicate with
the peripheral devices connected to the PC.
ROM is an actual physical part of the computer, a kind of memory
like the RAM (Random Access Memory) you store your program in,
except that the programs in ROM are installed by IBM at the factory and
can’t be changed. (They also don’t vanish when you turn off the
computer, the way programs stored in RAM do.) Since ROM is part of
the physical computer it is documented in the IBM Personal Computer
Technical Reference manual, which describes the physical characteristics of
the machine, rather than in the IBM Personal Computer Disk Operating
System manual.
You might not think of ROM as being part of DOS, since it exists
even in cassette-based IBM PCs that don’t have any disk drives. However,
ROM contains routines to access the disks as well as the other
peripherals, and when DOS is loaded from the diskette, the routines in
ROM become an integral part of the operating system.
The remaining parts of DOS come on the DOS diskette, and are
loaded in from the diskette when you initialize your system, either by
turning it on, if it’s off, or by hitting Ctrl-Alt-Del.
The Foreman: IBMBIOS
IBMBIOS supervises the activities of the ROM routines. If IBMDOS
or another program wants to use a routine in ROM, the request is
“passed through” IBMBIOS. That is, the request goes to IBMBIOS,
which decides what to do with it before passing it on to the appropriate
ROM routine. This has several advantages. If IBM discovers a mistake in
the ROM, or if they want to modify it for some reason, they can’t actually
change the ROM (at least in those computers that have already been
sold), since the ROM is a permanent part of the computer. But they can
change the DOS diskette, which contains IBMBIOS, so that it
incorporates the changes. This is like a human foreman who has learned
so well what mistakes his employees are likely to make that he can
compensate for them in the finished product.
Thus by issuing a new operating system with the revised IBMBIOS,
IBM can in effect change the input/output routines in ROM, even though
ROM itself is unchanged. (It’s modification of this sort that led to new
revisions of the operating system being issued, such as when DOS 1.00
became 1.10, and so on.) Also, various error situations which can occur
when an I/O routine is in use can be dealt with more flexibly if they are
not a permanent part of ROM.
Management: IBMDOS
IBMDOS concerns itself with more general, less detailed problems
than do ROM and IBMBIOS. You can think of it as the management
part of DOS, having a larger perspective than the workers or the
foreman. For instance, ROM and IBMBIOS know how to write a
particular sector (a small amount of disk information) to the disk, but
IBMDOS knows what entire file is to be written to the disk, and keeps
track of what sectors have been written so far and where they are on the
disk. (Don’t worry, we'll be talking more about sectors and files, among
other things, in the chapters on the disk system.)
IBMDOS also contains the “entry points” for the DOS function calls
discussed previously, like the Display Output and Keyboard Input
functions we've already used. (Entry points are simply addresses where
these routines begin in memory.) It’s this part of DOS that our assembly-
language programs will be communicating with when they need to
perform any input or output operations. The actual input/output routines
may be in ROM, but your program must go through IBMDOS to use
them, just as in a corporation we wouldn't place an order for 1000
widgets with the workers in the assembly line; we'd talk with some
management-level people on a higher floor.
Chief Executive Officer: COMMAND.COM
COMMAND.COM is responsible for controlling the overall activities
of the operating system. It's the part of DOS that prints the A> prompt
and then figures out what to do with what you type in. You might say it is
the intelligent part of the operating system. The other parts merely do
what they're told, either by COMMAND.COM, or by another assembly-
language program.
COMMAND.COM actually comes in two parts: a resident part, which
lies just above IBMDOS in low memory, and a transient part that sits all
the way at the top of memory, up to FFFF if you have 64K, up to 1FFFF
if you have 128K, and so on. (Notice the difference between the memory
you actually have, which might be say, 128K, and the entire addressable
memory space in the computer, which is one megabyte, or 1,024K, with a
high address of FFFFF.)
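The highest address for any memory size is easy to work out for yourself, taking 1K as 1,024 bytes. Here is the arithmetic as a short Python sketch (the function name is ours):

```python
def top_address(kilobytes):
    """Highest address in a memory of the given size, counting from address 0."""
    return kilobytes * 1024 - 1

for kb in (64, 128, 1024):
    print(f"{kb:>4}K -> {top_address(kb):X}")
# 64K gives FFFF, 128K gives 1FFFF, and the full 1,024K gives FFFFF
```

The one is subtracted because the first byte sits at address 0, not address 1.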
“Resident” means that this part of COMMAND.COM remains in
memory at all times. The resident portion of COMMAND.COM contains
basic control functions and error-handling routines. The transient
portion communicates with users via the A> prompt, and contains the
internal DOS commands like DIR, TYPE, and COPY. The transient part
of COMMAND.COM can actually be written over by user programs if
they need a lot of memory space. It is then loaded back into memory
from the diskette by the resident portion when the user program is
finished.
Acting Chief Executive Officer
When we write an applications program in assembly language (or in
a higher-level language like Pascal, which is then compiled into machine
language), and then execute it, this program takes over temporarily from
the COMMAND.COM program, and assumes command of the computer
itself. It then has access to all the facilities provided by IBMDOS,
IBMBIOS, and ROM, just as COMMAND does when it’s in charge. It
can use these resources for its own purposes, and COMMAND can only
regain control when the program is over, as when it executes the INT 20
interrupt.
Chairman of the Board
And who, you might ask, tells COMMAND.COM what to do? Why,
you do — whenever you type a command following the A> prompt. Was
it not this opportunity to exercise corporate power that convinced you to
buy a computer in the first place?
Figure 4-3 gives some idea how the various parts of DOS fit together.
DOS Functions
We learned above that the DOS functions are input/output routines
located in the ROM and IBMBIOS portions of DOS. They are accessed
by making interrupt calls in the form of INT 21 to the IBMDOS part of
the operating system, which then passes our request on to the
appropriate routine in IBMBIOS or ROM. The particular function to be
used is selected, as we've seen, by placing a particular number in the AH
register before making the INT 21 call to DOS.
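One way to picture how a single INT 21 can lead to many different routines is as a table lookup on the number in AH. The Python sketch below is only an analogy: the dictionary, the function names, and the way registers, screen, and keyboard are represented are all invented for the illustration, and real DOS is certainly organized differently inside.

```python
def keyboard_input(regs, screen, keys):
    regs["AL"] = ord(keys.pop(0))         # function 1: character comes back in AL

def display_output(regs, screen, keys):
    screen.append(chr(regs["DL"]))        # function 2: character in DL goes to the screen

FUNCTIONS = {0x01: keyboard_input, 0x02: display_output}

def int_21(regs, screen, keys):
    """Stand-in for INT 21: the number in AH chooses which routine runs."""
    FUNCTIONS[regs["AH"]](regs, screen, keys)

regs = {"AH": 0x02, "AL": 0x00, "DL": 0x01}
screen, keys = [], ["z"]
int_21(regs, screen, keys)                # MOV DL,1 / MOV AH,2 / INT 21
regs["AH"] = 0x01
int_21(regs, screen, keys)                # MOV AH,1 / INT 21, then the user types "z"
print(screen, format(regs["AL"], "02X"))  # ['\x01'] 7A
```

The calling program never needs to know where either routine lives; it only needs the function number and the registers each function expects.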
In chapter 2 we used the Display Output DOS function to write a
happy face and other characters on the screen, and in this chapter we
used the Keyboard Input DOS function to get a character from the
keyboard. What other DOS functions are there?
The most complete description of these functions is given in
appendix D of the IBM Personal Computer Disk Operating System manual,
which comes with your copy of DOS. In DOS version 1.10 the functions
start with 0 and go up to number 2Eh. We wind up with a total of 41
functions (not all the available numbers were assigned). DOS version 2.00
uses 74 functions, and there is no reason why new versions of DOS will
not contain even more. It might be educational for you to look through
this appendix, just to get a rough idea of the kinds of things these calls
do. Many of the descriptions will be mysterious to you at this point, but
by the time you finish this book you will be reading appendix D for
relaxation, like the Sunday comics.
Since there are so many functions we are not going to provide
detailed descriptions of them all in this book. Instead, we will concentrate
on the most commonly used ones, and those that most easily demonstrate
how particular parts of the operating system work. Once you know these,
you should be able to figure out how the others work, since there are
many similarities.
DOS functions can be divided roughly into two categories: those that
deal with the disk, and those that deal with other peripherals, such as the
video screen, keyboard, and printer. The non-disk functions are generally
simpler, so we will cover several more of them in this chapter. We'll
discuss the disk functions in chapters 11 and 12.
Figure 4-3. Organization of DOS (from the top down: the user, Chairman
of the Board; the user program, Acting Chief Executive Officer, and
COMMAND, Chief Executive Officer; IBMDOS, Management; IBMBIOS, the
foreman; ROM, passing information in and out to the peripheral devices)
The Print String Function
Let's start off by learning a new DOS function: one that prints a
string of characters.
PRINT STRING function — Number 09h
Enter with:
Reg AH = 9
Reg DS:DX = address of start of string
Execute:
INT 21
Comments: string must terminate with “$” (dollar sign)
You may have noticed something new in the box above: the
expression
Reg DS:DX = address of start of string
This means that the function needs both the segment address and the
offset address of the string, and that the segment address is to be placed
in the DS register and the offset address is to be placed in the DX
register. You don’t need to know about the DS register yet. DEBUG (or
DOS) takes care of making sure the correct value is in this register, so for
the time being you can ignore it. Later, in chapter 8 on memory
segmentation, we'll find out about the “Segment Registers,” of which DS
is one.
We already know how to print a single character on the screen, using
the Display Output function. That's good as far as it goes, but many
times in a program we'd like to display a whole string of characters at
once. Print String lets us do just that.
Here’s how it works. Before you can use this function you need to put
the string — consisting of the actual characters you're going to print —
somewhere in memory. (Makes sense, doesn’t it? Can't print them if
they're not there.) The string consists of ASCII characters, and it must
end with a dollar sign ($). The dollar sign is the only way the function
knows when it has come to the end of the string, so it’s important that
you don’t forget it.
Strings to be printed by the Print String function must end
with a dollar sign.
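If it helps to see the rule spelled out, here is a rough model of it in Python. This is only a sketch of the behavior described above, not the code DOS actually runs. The name `print_string` and the use of a `bytes` object to stand in for memory are assumptions made for illustration.

```python
def print_string(memory: bytes, offset: int) -> str:
    """Model of Print String: collect the characters starting at
    memory[offset], stopping at (and not including) the first '$'."""
    chars = []
    while memory[offset] != ord("$"):   # 24h is the only end-of-string mark
        chars.append(chr(memory[offset]))
        offset += 1
    return "".join(chars)


# Hypothetical memory contents: the string, then some unrelated bytes.
memory = b"Good Morning, Robert!$leftover bytes"
print(print_string(memory, 0))  # Good Morning, Robert!
```

Notice that nothing in the loop knows how long the string is. If the dollar sign were missing, this model would run off the end of `memory` and raise an `IndexError`; the real function would simply keep printing whatever bytes came next.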
To use the Print String function you first put the starting address of
the string in the DX register. Next, you put the function number 9 in the
AH register, and finally you call DOS with an INT 21. Let's write a
program that makes use of this function to print a string.
A>debug
-a100
08F1:0100 mov dx,109
08F1:0103 mov ah,9
08F1:0105 int 21
08F1:0107 int 20
08F1:0109 db 'Good Morning, Robert!$'
08F1:011F
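Where does the 109 in the first instruction come from? It is the address at which the string begins, which is the first free byte after the four instructions. The Python sketch below redoes that arithmetic. The instruction lengths (3, 2, 2, and 2 bytes) are taken from the unassembled listing later in this section; the variable names are only illustrative.

```python
# Length in bytes of each instruction, in program order.
instructions = [
    ("mov dx,109", 3),   # BA 09 01
    ("mov ah,9", 2),     # B4 09
    ("int 21", 2),       # CD 21
    ("int 20", 2),       # CD 20
]
message = "Good Morning, Robert!$"

address = 0x100                       # the "a100" command starts assembling here
for text, length in instructions:
    print(f"{address:04X}  {text}")
    address += length

string_start = address                # first byte after the instructions
next_free = string_start + len(message)
print(f"{string_start:04X}  db '{message}'")
print(f"{next_free:04X}")
```

The last two lines printed begin with 0109 and 011F, the same addresses DEBUG shows for the DB line and for the prompt that follows it.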
The “DB” Pseudo-Op
All the instructions in this program look pretty familiar except for
this one:
08F1:0109 db 'Good Morning, Robert!$'
This doesn’t look like an ordinary assembly-language instruction, and it’s
not. In fact, it’s a very strange sort of animal. Instead of being an
instruction that tells the 8088 microprocessor to do something, it’s an
instruction that tells DEBUG (or the assembler program — when we get
to that in the next chapter) what to do. In this case, it tells DEBUG to
put all the bytes represented by the characters between the single quote
marks into memory. Thus “G” is translated into its ASCII code 47h, “o”
into 6Fh, and so on. These values are then placed in memory. Note that
the “DB” itself is not placed in memory, since it is not really an
instruction and is not going to be executed by the 8088. Once it has told
DEBUG to put the characters in memory, its job is done. It’s called a
“pseudo-op” because it’s not really an “operation code” or instruction. It
goes in the same place in the program as regular instructions, but it has
a different purpose.
“DB” stands for “Define Byte,” and as you can see it’s very useful for
putting ASCII codes into memory, since we don’t have to look up the
code for each value and then type it in with the “E” command. (If you
don’t have DOS version 2, you'll have to use the “E” command anyway,
but you won’t need to look up the values, since we'll show them when we
disassemble the program with “U.”)
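To make the translation concrete, here is a small Python sketch of what DB does with a quoted string. It is a model only: `define_bytes` is a made-up name, and DEBUG of course does this work itself when you type the DB line.

```python
def define_bytes(text: str) -> bytes:
    """Model of DB with a quoted string: one ASCII code per character."""
    return text.encode("ascii")


data = define_bytes("Good Morning, Robert!$")
print(" ".join(f"{b:02X}" for b in data))
# 47 6F 6F 64 20 4D 6F 72 6E 69 6E 67 2C 20 52 6F 62 65 72 74 21 24
```

The first byte is 47h for “G,” and the last is 24h, the dollar sign that ends the string.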
You can also use “DB” to put numeric values into memory, either by
themselves, or with ASCII characters. We'll show an example of this in
the next section.
Don't forget the philosophical difference between regular assembler
instructions like MOV and JMP (which are sometimes called “operation
codes,” or “op-codes”) and pseudo-ops like DB. Instructions tell the 8088
microprocessor what to do at the time the program is executed. Pseudo-
ops, on the other hand, tell the assembler program (in this case DEBUG),
what to do when the program is being assembled.
“Instructions” are instructions to the microprocessor.
“Pseudo-ops” are instructions to the assembler.
Here's the program unassembled (or disassembled) with “U”:
-u100,108
0905:0100 BA0901 MOV DX,0109
0905:0103 B409 MOV AH,09
0905:0105 CD21 INT 21
0905:0107 CD20 INT 20
To see what bytes the db pseudo-op has placed in memory, “U” is not
much help, since these bytes are not program instructions. Instead, we'll
use “d,” which provides not only the hex values of these bytes, but the
ASCII characters they represent:
-d100,11f
08F1:0100 BA 09 01 B4 09 CD 21 CD-20 47 6F 6F 64 20 4D 6F :..4.M!M Good Mo
08F1:0110 72 6E 69 6E 67 2C 20 52-6F 62 65 72 74 21 24 3A rning, Robert!$:
(The 24 near the end of the second line is the dollar sign.)
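The layout of the “d” listing can be imitated in a few lines of Python, which may make it easier to read. This is a sketch, not DEBUG's own logic. In particular, the listing shows BAh as “:” and CDh as “M,” which is what you get if the top bit of each byte is ignored, so the sketch assumes that rule. The function name `dump_line` is invented for the example.

```python
def dump_line(segment: int, offset: int, data: bytes) -> str:
    """Format up to 16 bytes in the style of the "d" listing."""
    cells = [f"{b:02X}" for b in data]
    left, right = " ".join(cells[:8]), " ".join(cells[8:])
    # A dash separates the eighth and ninth bytes of a full line.
    hex_part = f"{left}-{right}" if right else left
    ascii_part = ""
    for b in data:
        c = b & 0x7F                      # assumption: the top bit is ignored
        ascii_part += chr(c) if 0x20 <= c < 0x7F else "."
    return f"{segment:04X}:{offset:04X}  {hex_part}  {ascii_part}"


# The nine program bytes, the string, and the stray 3Ah that follows it.
program = bytes.fromhex("BA0901B409CD21CD20") + b"Good Morning, Robert!$" + b":"
for start in range(0, len(program), 16):
    print(dump_line(0x08F1, 0x100 + start, program[start:start + 16]))
```

The program bytes at the start of the first line come out as the odd-looking “:..4.M!M” because “d” treats every byte as a possible character, whether or not it was meant as one.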