CS609 Quiz 2 Merged
CS609
37. What is the maximum length for file names used as API
function arguments?
255 characters…..confirm
38. IsReg() function processes registry keys rather than
_________ and _________
Directories, files…..confirm
50. Which file system is commonly used for floppy disks and
memory cards?
FAT…..confirm
4. The Windows operating system provides a naming scheme for resources which allows
a maximum of how many characters?
❖ 255
❖ 16
❖ 55
❖ 155
5. DOS was a _______ operating system.
❖ GUI based
❖ Command line
125. During searching files/folders, a data structure ______ is used to store the
information about a found file or directory
❖ Directory -64
❖ Attribute
❖ Directory -32
❖ WIN32_FIND_DATA
126. What will be the next code statement if the following if statement is true?
if (!WriteFile(hFile, &header, sizeof(Header), &nXfer, &ovzero))
❖ ReportError (_T(“RecordAccessError:set End of header.”),6,TRUE);
❖ ReportError (_T(“RecordAccessError:set pointer.”),4,TRUE);
❖ ReportError (_T(“RecordAccessError:write File header.”),5,TRUE);
❖ ReportError (_T(“RecordAccessError:readFile header.”),4,TRUE);
127. The number of arguments required for the FindClose() API is ____
❖ 3
❖ 1
❖ 2
❖ 0
128. The field ftLastAccessTime in a WIN32_FIND_DATA structure is used to represent
the time when a file was ______ accessed
❖ Closing
❖ Last
❖ First
❖ Second ;last
129. Using the GetFileTime() API, which time argument(s) is/are provided?
❖ Both creation and last access time
❖ Only last access time
❖ Creation, last access and last write time
Virtual University of Pakistan
Leaders in Education Technology
CS609 - System Programming VU
Table of Contents
01 - Introduction, Means of I/O
02 - Interrupt Mechanism
03 - Use of ISRs for C Library Functions
04 - TSR programs and Interrupts
05 - TSR programs and Interrupts (Keyboard interrupt)
06 - TSR programs and Interrupts (Disk interrupt, Keyboard hook)
07 - Hardware Interrupts
08 - Hardware Interrupts and TSR programs
09 - The interval Timer
10 - Peripheral Programmable Interface (PPI)
11 - Peripheral Programmable Interface (PPI) II
12 - Parallel Port Programming
13 - Serial Communication
14 - Serial Communication (Universal Asynchronous Receiver Transmitter)
15 - COM Ports
16 - COM Ports II
17 - Real Time Clock (RTC)
18 - Real Time Clock (RTC) II
19 - Real Time Clock (RTC) III
20 - Determining system information
21 - Keyboard Interface
22 - Keyboard Interface, DMA Controller
23 - Direct Memory Access (DMA)
24 - Direct Memory Access (DMA) II
25 - File Systems
26 - Hard Disk
27 - Hard Disk, Partition Table
28 - Partition Table II
29 - Reading Extended Partition
30 - File System Data Structures (LSN, BPB)
31 - File System Data Structures II (Boot block)
32 - File System Data Structures III (DPB)
33 - Root Directory, FAT12 File System
34 - FAT12 File System II, FAT16 File System
35 - FAT12 File System (Selecting a 12-bit entry within FAT12 System)
36 - File Organization
37 - FAT32 File System
38 - FAT32 File System II
39 - New Technology File System (NTFS)
40 - Disassembling the NTFS based file
41 - Disk Utilities
42 - Memory Management
43 - Non-Contiguous memory allocation
44 - Address translation in Protected mode
45 - Viruses
©CopyrightVirtualUniversityofPakistan 2
01 - Introduction, Means of I/O
What is Systems Programming?
Computer programming can be categorized into two categories, i.e. application
programming and system programming.
While designing software the programmer determines the required inputs for
the program, the desired outputs and the processing the software must
perform in order to produce those outputs. The implementation of the
processing part is associated with application programming. Application
programming facilitates the implementation of the required processing that the
software is supposed to perform; everything else is facilitated by
system programming.
Three Layered Approach
[Figure: three layers, DOS above BIOS above H/W]
In this case the BIOS programs the hardware for the required I/O operation, which is
hidden from the user. In the third case the programmer may invoke operating
system (DOS or whatever) routines in order to perform I/O operations. The
operating system in turn will use BIOS routines or may program the hardware
directly in order to perform the operation.
Methods of I/O
In the three layered approach, if we follow the first approach we need to
program the hardware. The hardware can be programmed to perform I/O in three
ways, i.e.
• Programmed I/O
• Interrupt driven I/O
• Direct Memory Access
In case of programmed I/O the CPU continuously checks the I/O device to see whether
the I/O operation can be performed. If it can, the CPU performs the computations
required to complete the I/O operation and then again starts waiting for the I/O
device to be able to perform the next I/O operation. In this way the CPU remains
tied up: it does nothing besides waiting for the slower I/O device to become idle
and performing the computations for it.
In case of interrupt driven I/O the flaws of programmed I/O are rectified. The
processor does not check the I/O device for its ability to perform an I/O
operation; rather the I/O device informs the CPU that it is idle and can
perform an I/O operation. As a result the execution of the CPU is interrupted and an
Interrupt Service Routine (ISR) is invoked which performs the computations
required for the I/O operation. After the execution of the ISR the CPU continues with
whatever it was doing before the interruption. In this way the
CPU does not remain tied up and can perform computations for other processes
while the I/O devices are busy performing I/O, which makes this method more efficient.
Usually it takes two bus cycles to transfer data from an I/O port to memory or
vice versa if this is done via a processor register. This transfer time can be
reduced by bypassing the CPU, as ports and memory devices are also interconnected
by the system bus. This is done with the support of a DMA controller. The DMA (direct
memory access) controller can control the buses, and hence the CPU can be
bypassed and a data item can be transferred from memory to ports or vice versa in a
single bus cycle.
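The saving can be put into numbers with a small sketch. It assumes only the figures stated in the text (two bus cycles per item through a CPU register, one per item with DMA); the function names are hypothetical and bus arbitration overhead on real hardware is ignored.

```c
#include <stdint.h>

/* Bus cycles needed to move `items` data items between a port and memory.
 * Assumption taken from the text: a transfer through a CPU register costs
 * two bus cycles (port -> register, register -> memory), a DMA transfer
 * costs one. Arbitration overhead is not modelled. */
uint32_t cycles_via_cpu(uint32_t items)
{
    return 2u * items;
}

uint32_t cycles_via_dma(uint32_t items)
{
    return items;
}
```

For a 512 item block this gives 1024 cycles through the CPU against 512 with DMA.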
I/O controllers
[Figure: an I/O device connected to an I/O controller, which is connected to the CPU]
No I/O device is directly connected to the CPU. To provide control signals to the
I/O device an I/O controller is required. The I/O controller is located between the CPU
and the I/O device. For example the monitor is not directly connected to the CPU;
rather the monitor is connected to a VGA card and this VGA card is in turn
connected to the CPU through buses. The keyboard is not directly connected to the
CPU; rather it is connected to a keyboard controller and the keyboard controller is
connected to the CPU. The function of this I/O controller is to provide
• I/O control signals
• Buffering
• Error Correction and Detection
We shall discuss various such I/O controllers interfaced with CPU and also the
techniques and rules by which they can be programmed to perform the required
I/O operation.
Some of such controllers are
• DMA controller
• Interrupt controller
• Programmable Peripheral Interface (PPI)
• Interval Timer
• Universal Asynchronous Receiver Transmitter
We shall discuss all of them in detail and how they can be used to perform I/O
operations.
Operating systems
Systems programming is not just the study of programmable hardware devices. To
develop effective system software one needs to know the internals of the operating
system as well. Operating systems make use of some data structures or tables for
management of computer resources. We will take up different functions of the
operating systems and discuss how they are performed and how the data
structures used for these operations can be accessed.
File Management
File management is an important function of operating systems. DOS/Windows
uses various data structures for this purpose. We will see how it performs I/O
management and how the data structures used for this purpose can be directly
accessed. The various data structures are popularly known as FAT, whose entries can be
12, 16 or 32 bits wide. Other data structures include the BPB (BIOS parameter
block), the DPB (drive parameter block) and the FCBs (file control blocks),
which collectively form the directory structure. To understand the file structure the basic
requirement is an understanding of the disk architecture, the disk
formatting process and how this process divides the disk into sectors and clusters.
Memory management
Memory management is another important aspect of operating systems.
A standard PC operates in two modes in terms of memory, which are
• Real Mode
• Protected Mode
In real mode the processor can access only the first 1 MB of memory. To control the
memory within this range the DOS operating system makes use of some data
structures called
• MCB (Memory control block)
• PSP (Program segment prefix)
We shall discuss how these data structures can be directly accessed and what the
significance of the data in them is. This information can be used to
traverse the memory occupied by the processes and also to calculate the
total amount of free memory available.
Certain operating systems operate in protected mode. In protected mode all of the
memory interfaced with the processor can be accessed. Operating systems in
this mode make use of various data structures for memory management, which are
• Local Descriptor Table
• Global Descriptor Table
• Interrupt Descriptor Table
We will discuss the significance of these data structures and the information
stored in them. Also we will see how logical addresses can be translated
into physical addresses using the information in these tables.
Viruses and Vaccines
We will see where viruses embed themselves and how they can be detected.
Moreover we will discuss techniques by which they can be removed and, most
importantly, prevented from causing any infection.
There are various types of viruses but we will discuss those which embed
themselves within a program or executable code, which are
• Executable file viruses
• Partition Table or boot sector viruses
Device Drivers
Just connecting a device to the PC will not make it work unless its device drivers
are installed. This is so important because a device driver contains
the routines which perform I/O operations on the device. Unless these routines
are provided no I/O operation on the I/O device can be performed by any
application. We will discuss the integrated environment for the development of
device drivers for DOS and Windows.
We shall begin our discussion with the means of I/O. On a well designed device it is
possible to perform I/O operations by three different methods
• Programmed I/O
• Interrupt driven I/O
• DMA driven I/O
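[Figure: programmed I/O. Output side: data lines D0-D7 with Strobe and Busy signals. Input side: data lines D0-D7 with DR (data ready) signal.]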
In case of programmed I/O the CPU is in a constant loop checking for an I/O
opportunity, and when it is available it performs the computations required
for the I/O operation. As I/O devices are generally slower than the CPU, the CPU
has to wait for the I/O operation to complete so that the next data item can be sent to the
device. The CPU sends data on the data lines. The device needs to be signaled
that the data has been sent; this is done with the help of the STROBE signal. An
electrical pulse is sent to the device by turning this signal to 0 and then 1. On
getting the strobe signal the device receives the data and starts its output. While
the device is performing the output it is busy and cannot accept any further data. On the
other hand the CPU is a lot faster and can process many more bytes during
the output of the previously sent data, so it should be synchronized with the slower
I/O device. This is usually done by another feedback signal, BUSY, which is kept active
as long as the device is busy. So the CPU is only waiting for the
device to become idle by checking the BUSY signal for as long as the device is busy.
When the device becomes idle the CPU computes the next data item and sends it to
the device for the I/O operation.
The case of input is similar: the CPU has to check the DR (data ready) signal to
see whether data is available for input, and while it is not the CPU is busy waiting for it.
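The busy waiting described above can be sketched as a small simulation. This is only a model under stated assumptions: SlowDevice, device_busy(), strobe() and programmed_output() are hypothetical names, no real port is touched, and the device is assumed to stay busy for three polls after each byte.

```c
#include <stddef.h>

/* Hypothetical model of a slow output device (not a real port API). */
typedef struct {
    int busy_ticks;              /* > 0 while the BUSY signal is active */
    unsigned char received[16];  /* bytes the device has latched        */
    size_t count;
} SlowDevice;

/* One poll of the BUSY line; simulated time advances with every poll. */
static int device_busy(SlowDevice *d)
{
    if (d->busy_ticks > 0) {
        d->busy_ticks--;
        return 1;
    }
    return 0;
}

/* STROBE pulse: the device latches the byte and stays busy for 3 polls. */
static void strobe(SlowDevice *d, unsigned char data)
{
    if (d->count < sizeof d->received)
        d->received[d->count++] = data;
    d->busy_ticks = 3;
}

/* Programmed output: returns how many polls were spent busy waiting. */
size_t programmed_output(SlowDevice *d, const unsigned char *buf, size_t n)
{
    size_t wasted = 0;
    for (size_t i = 0; i < n; i++) {
        while (device_busy(d))   /* the CPU does nothing useful here */
            wasted++;
        strobe(d, buf[i]);
    }
    return wasted;
}
```

Sending four bytes to an initially idle device costs nine wasted polls in this model, three before each byte after the first.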
Interrupt Driven I/O
[Figure: interrupt driven input/output. CPU and I/O controller shown for output and for input, with data lines D0-D7 and the signals Strobe, Busy, INT, IBF and INT ACK.]
The main disadvantage of programmed I/O, as can be noticed, is that the CPU is
busy waiting for an I/O opportunity and as a result remains tied up for that I/O
operation. This disadvantage can be overcome by means of interrupt driven I/O. In
programmed I/O the CPU itself checks for an I/O opportunity, but in case of
interrupt driven I/O the I/O controller interrupts the execution of the CPU whenever
an I/O operation is required, so that the computation for that I/O operation can be
carried out. This way the CPU can perform other computations and is interrupted to
perform an interrupt service routine only when
an I/O operation is required, which is quite an efficient technique.
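For contrast with busy waiting, the same idea can be modelled with a callback standing in for the ISR. This is a sketch under assumptions: the names isr_t, OutputJob, output_isr() and run_cpu() are hypothetical, the "interrupt" is just a function call every `period` ticks, and `period` must be non-zero.

```c
#include <stddef.h>

/* Hypothetical model: the controller "interrupts" by calling a handler. */
typedef void (*isr_t)(void *ctx);

typedef struct {
    const unsigned char *buf;  /* data still to be output */
    size_t remaining;
    size_t sent;
} OutputJob;

/* ISR: runs only when the device reports it is idle; outputs one byte. */
void output_isr(void *ctx)
{
    OutputJob *job = ctx;
    if (job->remaining > 0) {
        job->buf++;            /* the byte at *buf is considered sent */
        job->remaining--;
        job->sent++;
    }
}

/* Runs `ticks` units of CPU time. The device raises an interrupt every
 * `period` ticks (period > 0); every other tick is spent on useful work.
 * Returns the number of ticks of useful work. */
size_t run_cpu(size_t ticks, size_t period, isr_t isr, void *ctx)
{
    size_t useful = 0;
    for (size_t t = 1; t <= ticks; t++) {
        if (t % period == 0)
            isr(ctx);          /* execution is interrupted for the ISR  */
        else
            useful++;          /* the CPU continues with other work     */
    }
    return useful;
}
```

With 16 ticks and an interrupt every 4th tick, four bytes are output and the remaining 12 ticks are available for other computation.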
DMA driven I/O
We shall start our discussion with the study of interrupts and the techniques
used to program them. We will discuss other methods of I/O as required.
What are interrupts?
[Figure: ISR performing an I/O]
Literally, to interrupt means to break the continuity of some ongoing task. When
we talk of a computer interrupt we mean exactly the same in terms of the processor. When
an interrupt occurs the continuity of the processor is broken and
the execution branches to an interrupt service routine. This interrupt service routine is a
set of instructions carried out by the CPU, generally to perform or initiate an I/O
operation. When the routine is over the execution of the CPU returns to the point of
interruption and continues with the ongoing process.
Interrupts can be of two types
• Hardware interrupts
• Software interrupts
The only difference between them is the method by which they are invoked. Software
interrupts are invoked by means of some software instruction or statement, and
hardware interrupts are generally invoked by means of some hardware controller.
Interrupt Mechanism
Interrupts are quite similar to procedures or functions because they are also another form
of temporary execution transfer, but there are some differences as well. Note that
when procedures are invoked, their names, which represent their addresses, are
specified, whereas in case of interrupts their number is specified. This number can
be any 8 bit value, which certainly is not its address. So the first question is: what is
the significance of this number? It should also be noticed that
procedures are part of the program, but the interrupts invoked in the program are
declared nowhere in the program. So the next question is: where do these interrupts
reside in memory, and if they reside in memory then what is the address of
the interrupt?
Firstly let's see where interrupts reside. Interrupts certainly reside somewhere in
memory. The interrupts supported by the operating system reside in the kernel, which
you already know is the core part of the operating system. In case of DOS
the kernel is io.sys, which loads in memory at boot time, and in case of Windows the kernel is
kernel32.dll or kernel.dll. These files contain most of the I/O routines
and are loaded as required. The interrupts supported by the ROM BIOS are located in the
ROM part of the main memory, which usually starts at the address F000:0000H.
Moreover it is possible that some device drivers have been installed; these device
drivers may provide some I/O routines, so when the system boots these I/O
routines become memory resident as interrupt service routines. So these are the three
possibilities.
Secondly, a program at compile time does not know the exact address where the
interrupt service routine will reside in memory, so the loader cannot assign
addresses for interrupt invocations. When a device driver loads in memory it
places the addresses of the services it provides in the interrupt vector table.
The Interrupt Vector Table (IVT in short) is a 1024 byte table which can
hold 256 far addresses, as each far address occupies 4 bytes. So it is possible to
store the addresses of 256 interrupts, hence there is a maximum of 256 interrupts
in a standard PC. The interrupt number is used as an index into the table to get
the address of the interrupt service routine.
02 - Interrupt Mechanism
Interrupt Mechanism
Interrupts follow a certain mechanism for their invocation just like near or far
procedures. To understand this mechanism we need to understand its differences from
procedure calls.
Difference between interrupt and procedure calls
Procedures, functions or sub-routines in various languages are called by
different methods, as can be seen in the examples.
• Call MyProc
• A = Addition(4, 5);
• printf("hello world");
The general concept for a procedure call in most programming languages is that on
invocation of the procedure the parameter list and the return address (which is the value of
the IP register in case of a near procedure, or the values of the CS and IP registers in case
of a far procedure) are pushed onto the stack. Moreover, in various programming languages
whenever a procedure is called its address needs to be specified by some notation, i.e. in
C language the name of the procedure is
specified to call a procedure, which effectively can be used as its address.
However in case of interrupts a number is used to specify the interrupt in the
call
• Int 21h
• Int 10h
• Int 3
Fig 1 (Call to interrupt service routine and procedures/functions)
[Figure labels: Main, Call proc1(), Call proc1(), Int 21h, Proc1(), Int 10h, Proc2()]
Moreover, when an interrupt is invoked three registers are pushed as the return address, i.e.
the values of Flags, CS and IP, pushed in that order, which are restored on return. Also,
no parameters are pushed onto the stack on invocation; parameters can only be passed
through registers.
The interrupt vector table
The interrupt number specified in the interrupt call is used as an index into the interrupt
vector table. The interrupt vector table is a global table situated at the address 0000:0000H.
The size of the interrupt vector table is 1024 bytes or 1 KB. Each entry in the IVT is 4 bytes
in size, hence 256 interrupt vectors are possible, numbered 0-FFH. Each entry in the table
contains the far address of an interrupt handler, hence there is a maximum of 256 handlers;
however each handler can have a number of services within itself. So the number of
operations that can be performed by calling an interrupt service routine (ISR) is not fixed;
it depends upon the nature of the operating system. The address of the vector (not the
address of the interrupt handler) can easily be
calculated if the interrupt number is known. The segment address of the whole IVT is
0000H; the offset address of a particular vector can be determined by multiplying
the interrupt number by 4, e.g. the offset address of the vector of
INT 21H will be 21H * 4 = 84H, and the segment for all vectors is 0, hence its far address is
0000:0084H (this is the far address of the interrupt vector and not of the interrupt service
routine or interrupt handler). The vector in turn contains the address of the
interrupt service routine, which is an arbitrary value depending upon the location of the ISR
residing in memory.
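The calculation can be written down directly. The following is a minimal sketch; the function and macro names are hypothetical, and only the figures given in the text (4 bytes per vector, 256 vectors, segment 0000H) are assumed.

```c
#include <stdint.h>

#define IVT_ENTRY_SIZE 4u     /* each vector is a 4-byte far address */
#define IVT_ENTRIES    256u   /* interrupt numbers 00H..FFH          */

/* Offset of the vector for interrupt `n` inside segment 0000H. */
uint16_t ivt_vector_offset(uint8_t n)
{
    return (uint16_t)(n * IVT_ENTRY_SIZE);
}

/* Total size of the table in bytes (256 * 4 = 1024). */
uint16_t ivt_size(void)
{
    return (uint16_t)(IVT_ENTRIES * IVT_ENTRY_SIZE);
}
```

For INT 21H the offset comes out as 84H, and for INT FFH as 3FCH, so the last vector occupies 0000:03FCH to 0000:03FFH.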
Fig 2 (Interrupt Vector Table)
Interrupt Vector Table
INT 0    0000:0000
INT 1    0000:0004
...
INT FF   0000:03FC (table ends at 0000:03FFH)
Moreover it is important to understand the meaning of the four bytes within an interrupt
vector. Each entry within the IVT contains a far address, the first two bytes (lower word) of
which are the offset and the next two bytes (higher word) the segment address.
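A short sketch of how the four bytes of a vector decode into a far address may help here. The names FarAddress, decode_vector() and physical_address() are hypothetical; the physical address rule (segment * 16 + offset) is the standard real-mode rule and is added as background, not taken from this text.

```c
#include <stdint.h>

typedef struct {
    uint16_t offset;   /* lower word of the vector  */
    uint16_t segment;  /* higher word of the vector */
} FarAddress;

/* Decodes the 4 bytes of one IVT entry; each word is stored low byte first. */
FarAddress decode_vector(const uint8_t v[4])
{
    FarAddress a;
    a.offset  = (uint16_t)(v[0] | (v[1] << 8));
    a.segment = (uint16_t)(v[2] | (v[3] << 8));
    return a;
}

/* Real-mode physical address: segment * 16 + offset. */
uint32_t physical_address(FarAddress a)
{
    return ((uint32_t)a.segment << 4) + a.offset;
}
```

For example, the bytes 7C 10 A7 00 decode to the far address 00A7:107C, which corresponds to the physical address 01AECH.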
[Figure: the four bytes LO(0), LO(1), HI(0), HI(1) of a vector at 0000:0000 to 0000:0003; the vector of INT 1 occupies 0000:0004 to 0000:0007]
Fig 3 (Far address within Interrupt vector)
Location of ISRs (Interrupt service routines)
Generally there are three kinds of ISR within a system, depending upon the entity which
implements them
• BIOS (Basic I/O services) ISRs
• DOS ISRs
• ISRs provided by third party device drivers
When the system has booted up and applications can be run, all these kinds of ISRs
may be provided by the system. Those provided by the ROM-BIOS are typically
resident at locations after the address F000:0000H, because this is the address within
memory from where the ROM-BIOS starts; the ISRs provided by DOS are
resident in the DOS kernel (mainly IO.SYS and MSDOS.SYS loaded in memory); and the
ISRs provided by third party device drivers are resident in the memory occupied by the
device drivers.
[Figure: memory map showing IO.SYS, Device Driver, Command.COM, USER PROGRAM and ROM BIOS at F000:0000]
Fig 4 (ISRs in memory)
This fact can be practically analyzed with the DOS command mem/d, which gives the
status of the memory and also points out which memory area is occupied by which process,
as shown in the text below. The information given by this command indicates the address
where IO.SYS and other device drivers have been loaded, but the location of the ROM
BIOS is not shown by this command.
C:\>mem/d
Address Name Size Type
655360 bytes total conventional memory
655360 bytes available to MS-DOS
597952 largest executable program size
1048576 bytes total contiguous extended memory
0 bytes available contiguous extended memory
941056 bytes available XMS memory
MS-DOS resident in High Memory Area
Interrupt Invocation
Although hardware and software interrupts are invoked differently, i.e. hardware interrupts
are invoked by means of some hardware whereas software interrupts are invoked by
means of a software instruction or statement, no matter how an interrupt has been invoked
the processor follows a certain set of steps after invocation, in exactly the same way in
both cases. These steps are listed below
• Push Flags, CS, IP Registers, Clear Interrupt Flag
• Use (INT#)*4 as Offset and Zero as Segment
• This is the address of the interrupt Vector and not the ISR
• Use the lower two bytes of the interrupt Vector as offset and move them into IP
• Use the higher two bytes of the Vector as Segment Address and move them into CS,
i.e. CS = 0:[offset+2]
• Branch to the ISR and Perform the I/O Operation
• Return to the Point of Interruption by Popping the 6 bytes, i.e. Flags, CS, IP.
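These steps can be followed in a small simulation. It is a heavily simplified sketch and not how any emulator is actually written: the Cpu structure and the functions phys(), push16(), read16() and do_interrupt() are hypothetical, only the Interrupt Flag is cleared, and the return (IRET) is not modelled.

```c
#include <stdint.h>

#define FLAG_IF 0x0200u   /* interrupt-enable flag, bit 9 of FLAGS */

/* Hypothetical, heavily simplified model of the real-mode CPU state. */
typedef struct {
    uint16_t cs, ip, ss, sp, flags;
    uint8_t  mem[1u << 20];          /* 1 MB of real-mode memory */
} Cpu;

static uint32_t phys(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}

static void push16(Cpu *c, uint16_t v)
{
    c->sp = (uint16_t)(c->sp - 2);
    c->mem[phys(c->ss, c->sp)] = (uint8_t)(v & 0xFF);
    c->mem[phys(c->ss, (uint16_t)(c->sp + 1))] = (uint8_t)(v >> 8);
}

static uint16_t read16(const Cpu *c, uint16_t seg, uint16_t off)
{
    uint16_t lo = c->mem[phys(seg, off)];
    uint16_t hi = c->mem[phys(seg, (uint16_t)(off + 1))];
    return (uint16_t)(lo | (hi << 8));
}

/* Steps taken for INT n; c->ip must already point past the instruction. */
void do_interrupt(Cpu *c, uint8_t n)
{
    uint16_t vec = (uint16_t)(n * 4u);      /* (INT#)*4 as offset, segment 0 */

    push16(c, c->flags);                    /* push Flags, CS, IP            */
    push16(c, c->cs);
    push16(c, c->ip);
    c->flags &= (uint16_t)~FLAG_IF;         /* clear the Interrupt Flag      */

    c->ip = read16(c, 0x0000, vec);                  /* lower word  -> IP */
    c->cs = read16(c, 0x0000, (uint16_t)(vec + 2));  /* higher word -> CS */
}
```

Starting from CS:IP = 0AF1:0102, SP = FFEE and a vector 00A7:107C stored at 0000:0084, the call do_interrupt(&cpu, 0x21) leaves CS:IP = 00A7:107C and SP = FFE8, with the old IP, CS and Flags sitting on the stack in that order from SS:SP upwards.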
This can be analyzed practically with the debug program, used to debug assembly
language code, by assembling and tracing an INT instruction
C:\>debug
-d 0:84
0000:0080              7C 10 A7 00-4F 03 55 05 8A 03 55 05       |...O.U...U.
0000:0090  17 03 55 05 86 10 A7 00-90 10 A7 00 9A 10 A7 00   ..U.............
0000:00A0  B8 10 A7 00 54 02 70 00-F2 04 74 CC B8 10 A7 00   ....T.p...t.....
0000:00B0  B8 10 A7 00 B8 10 A7 00-40 01 21 04 50 09 AB D4   ........@.!.P...
0000:00C0  EA AE 10 A7 00 E8 00 F0-B8 10 A7 00 C4 23 02 C9   .............#..
0000:00D0  B8 10 A7 00 B8 10 A7 00-B8 10 A7 00 B8 10 A7 00   ................
0000:00E0  B8 10 A7 00 B8 10 A7 00-B8 10 A7 00 B8 10 A7 00   ................
0000:00F0  B8 10 A7 00 B8 10 A7 00-B8 10 A7 00 B8 10 A7 00   ................
0000:0100  8A 04 10 02                                       ....
-a
0AF1:0100 int 21
0AF1:0102
-r
AX=0000  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=0AF1  ES=0AF1  SS=0AF1  CS=0AF1  IP=0100   NV UP EI PL NZ NA PO NC
0AF1:0100 CD21          INT     21
-t
AX=0000  BX=0000  CX=0000  DX=0000  SP=FFE8  BP=0000  SI=0000  DI=0000
DS=0AF1  ES=0AF1  SS=0AF1  CS=00A7  IP=107C   NV UP DI PL NZ NA PO NC
00A7:107C 90            NOP
-d ss:ffe8
0AF1:FFE0                          02 01 F1 0A 02 F2 00 00
0AF1:FFF0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
The dump at the address 0000:0084H shows the value of the vector of interrupt #
21H, i.e. 21H * 4 = 84H. This address holds the value 107CH in the lower word and
00A7H in the higher word, which indicates that the segment address of the ISR of
interrupt # 21H is 00A7H and its offset address is 107CH.
Moreover the instruction INT 21H can be assembled and executed in the debug program;
on doing so the instruction is traced through and the result is monitored. It can be
seen that on execution of this instruction the value of IP is changed to 107CH and
the value of CS is changed to 00A7H, which causes the execution to branch to the ISR of
interrupt # 21H in memory. The previous values of the Flags, CS and IP registers are
temporarily saved onto the stack: the value of SP is reduced by 6, and the dump at the
location SS:SP shows these saved values as well.
Parameter passing into Software interrupts
In case of procedures or functions in various programming languages, parameters
are passed through the stack. Interrupts are also a kind of function provided by the
operating system, but they do not accept parameters through the stack; rather they need to
be passed parameters through registers.
Software interrupts invocation
Now let's see how various interrupts can be invoked by means of software
statements. First there should be a way to pass parameters into a software interrupt before
invoking the interrupt; there are several methods for doing this. One of the methods is the
use of pseudo variables. A variable can be defined as a space within memory whose value
can be changed during the execution of a program. A pseudo variable acts very much like a
variable, as its value can be changed anywhere in the program, but it is not a true variable
as it is not stored in memory. DOS C compilers such as Turbo C provide pseudo variables
to access various registers within the processor.
There are various registers like AX, BX, CX and DX within the processor; they can be
directly accessed in a program by using their respective pseudo variables, written by just
attaching a “_” (underscore) before the register’s name, e.g. _AX = 5; A = _BX.
After passing the appropriate parameters the interrupt can be directly invoked by calling
the geninterrupt() function. The interrupt number needs to be passed as a parameter
into the geninterrupt() function.
Interrupt # 21H, Service # 09 description
Now let's learn by means of an example how this can be accomplished. Before invoking
the interrupt the programmer needs to know how the interrupt behaves and what
parameters it requires. Let's take the example of interrupt # 21H and service # 09,
written as 21H/09H in short. It is used to print a string ending with a ‘$’ character; its
parameters are as below
Inputs
AH = 0x09
DS = Segment Address of string
DX = Offset Address of string
Output
The ‘$’ terminated string at the address DS:DX is displayed
It is noteworthy that the service # is placed in AH, which is common to almost
all interrupts and their services. Also, this service does not return any significant data; if
a service needs to return some data, it too is received in registers, depending upon the
particular interrupt.
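What the service does with the string can be imitated in portable C. This is only a sketch of the behaviour, not the DOS implementation: print_dollar_string() is a hypothetical name, and the `max` limit is an added safety guard that the real service does not have.

```c
#include <stddef.h>
#include <stdio.h>

/* Portable imitation of what INT 21H/09H does with the string at DS:DX:
 * characters are written until a '$' is met; the '$' itself is not shown.
 * `max` is a guard added for this sketch and must not exceed the size of
 * the buffer `s` points to. Returns the number of characters written. */
size_t print_dollar_string(const char *s, size_t max, FILE *out)
{
    size_t n = 0;
    while (n < max && s[n] != '$') {
        fputc(s[n], out);
        n++;
    }
    return n;
}
```

Called with "Hello World$ignored", the function writes the 11 characters of Hello World and stops at the terminator.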
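Example:
#include <stdio.h>
#include <BIOS.H>
#include <DOS.H>
#include <conio.h>
char st[80] = {"Hello World$"}; // global '$' terminated string (sample contents)
void main()
{
clrscr(); // to clear the screen contents
_DX = (unsigned int) st; // offset address of the string
_AH = 0x09; // service number
geninterrupt(0x21);
getch(); // waits for the user to press any key
}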
This is a simple example in which the parameters of int 21H/09H are loaded and then int
21H is invoked. The DX and AH registers are accessed through pseudo variables and then
geninterrupt() is called to invoke the ISR. Also note that _DS is not loaded. This is
because the string to be displayed is of global scope and the C language compiler
automatically loads the segment address of the global data into the DS register.
Another Method for invoking software interrupts
This method makes use of a union. This union is formed by two structures which correspond
to the general purpose registers AX, BX, CX and DX and to the half registers AH, AL, BH,
BL, CH, CL, DH, DL. These structures are combined such that through this
union the field ax can be accessed to load a value and its half components al and ah can
also be accessed individually. The declaration of this union goes as below. If this union is
to be used, a programmer need not write the following declaration; rather the declaration is
already available through the header file “dos.h”
struct full
{
unsigned int ax;
unsigned int bx;
unsigned int cx;
unsigned int dx;
};
struct half
{
unsigned char al;
unsigned char ah;
unsigned char bl;
unsigned char bh;
unsigned char cl;
unsigned char ch;
unsigned char dl;
unsigned char dh;
};
typedef union tagREGS
{
struct full x;
struct half h;
} REGS;
This union can be used to signify any of the full or half general purpose registers. If
the field ax in the x struct is loaded, the fields al and ah in h reflect the same value
and vice versa, as shown in the example below.
Example:
#include<stdio.h>
#include<DOS.H>
union REGS regs;
void main (void )
{
regs.h.al = 0x55;
regs.h.ah = 0x99;
printf ("%x",regs.x.ax);
}
output:
9955
The int86() function
The significance of this REGS union can only be understood after understanding the int86()
function. The int86() function has three parameters. The first parameter is the interrupt number to be
invoked, the second parameter is the reference to a REGS type union which contains the
values of the parameters that should be passed as inputs, and the third parameter is a reference to a
REGS union which will contain the values of the registers returned by this function. All the
required parameters for an ISR are placed in a REGS type union and its
reference is passed to the int86() function. This function puts the values in this union into the
respective registers and then invokes the interrupt. When the ISR returns it might leave
some meaningful values in the registers (the ISR's return values). These values can be retrieved
from the REGS union whose reference was passed into the function as the third parameter.
Example using interrupt # 21H service # 42H
To make it more meaningful we can elaborate it by means of an example. Here we
make use of ISR 21H/42H, which is used to move the file pointer. Its details are as follows
Int # 21H Service # 42H
Inputs
AL = Move Technique
BX = File Handle
CX-DX = No of Bytes File pointer to be moved
AH = Service # = 42H
Output
DX-AX = No of Bytes File pointer actually moved.
Fig (A file showing BOF, cp (the current position) and EOF)
This service is used to move the file pointer to a certain position relative to a certain
point. The value in AL specifies the point relative to which the pointer is moved. If
AL = 0 the file pointer is moved relative to the BOF (beginning of file), if AL = 1
it is moved relative to the current position, and if AL = 2 it is moved relative to the
EOF (end of file).
CX-DX specify the number of bytes to move. A double word is needed to specify this
value as the size of a file in DOS can be up to 2 GB.
On return of the service DX-AX will contain the number of bytes the file pointer
has actually moved. E.g. if the file pointer was at BOF before calling the service and is
moved zero bytes relative to the EOF, DX-AX on return will contain the size of the
file.
03 - Use of ISRs for C Library Functions
Example 21H/42H:
#include<stdio.h>
#include<fcntl.h>
#include<io.h>
#include<BIOS.H>
#include<DOS.H>
This program opens a file and saves its handle in the handle variable. This handle is
passed to the ISR 21H/42H along with the move technique, whose value is 2, signifying
movement relative to the EOF. The number of bytes to move is specified as zero,
indicating that the pointer should move to the EOF. As the file was just opened, the
previous location of the file pointer will be BOF. On return of this service DX-AX will
contain the size of the file. The low word of this size in ax is placed in the low word of
the size variable and the high word in dx is placed in the high word of the size variable.
Another Example:
Let's now illustrate how an ISR can be invoked by means of another example, this time of a
BIOS service. Here we choose the ISR 10h/01h. Interrupt 10h is used to perform I/O
on the monitor, and its service 01h is used to change the size of the cursor in text mode.
The description of this service is given below.
Int # 10H Service # 01H
Entry
AH = 01
CH = Beginning Scan Line
CL = Ending Scan Line
On Exit
Unchanged
The size of the cursor depends upon the number of scan lines used to display the
cursor. If the beginning scan line is greater than the ending scan line the cursor will
disappear. The following program tries to accomplish just that
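#include<stdio.h>
#include<DOS.H>
void main()
{
char st[80];
union REGS regs;
regs.h.ah = 0x01; // service number
regs.h.ch = 0x01; // beginning scan line
regs.h.cl = 0x00; // ending scan line
int86(0x10,&regs,&regs);
gets(st); // waits for a line of input
}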
The program is quite self explanatory as it sets the starting scan line to 1 and the
ending scan line to 0. Hence when the service executes the cursor will disappear.
Use of ISRs for C Library functions
There are various library functions that a programmer would typically use in a program to
perform input/output operations. These library functions perform trivial I/O
operations like character input (getch(), getc() etc) and character output
(putch()). All these functions call various ISRs to perform this I/O. In the BIOS and DOS
documentation a number of services can be found that correspond to some C
library function in terms of functionality.
Writing S/W ISRs
Let's now see how a programmer can write an ISR and what needs to be done in
order to make the service work properly. To exhibit this we will make use of an interrupt
which is not used by DOS or BIOS, so that our experiment does not interfere with the
normal functions of DOS and BIOS. One such interrupt is interrupt # 65H. The vector of
int 65H is typically filled with zeros, another indication that it is not being used.
Getting interrupt vector
As we have discussed earlier, the IVT is a table containing 4 byte entries, each of which is a far
address of an interrupt service routine. All the vectors are arranged serially such that the
interrupt number can be used as an index into the IVT.
Getting the interrupt vector refers to the operation of reading the far address stored within
the vector. The vector is a double word, the lower word of it being the offset address and the
higher word being the segment address. Moreover the address read from a vector can be
used as a function pointer. The C library function used to do exactly this is getvect().
Fig 1 (Vector being read from IVT: INT # selects the vector, whose offset and segment words form the far address of the interrupt procedure)
Function pointers
Another thing that needs to be understood is function pointers. C is a very
flexible language. Just as there are pointers for integers, characters and other data types,
there are pointers for functions as well, as illustrated by the following example
void myfunc()
{
}
void (*funcptr)();
funcptr = myfunc;
(*funcptr)();
myfunc();
There are three fragments of code in this example. The first fragment shows the
declaration of a function myfunc().
The second fragment shows the declaration of a pointer named funcptr, which is a
pointer to a function that returns void.
In the third fragment funcptr is assigned the address of myfunc, as the name of a
function can be used as its address just like in the case of arrays in C. Then the function
pointed to by funcptr is called by the statement (*funcptr)(); and then the original
myfunc() is called. In both cases the same function myfunc()
will be invoked.
Interrupt pointers and functions
Interrupt functions are special functions as compared to simple functions, for reasons
discussed earlier. An interrupt function can be declared using the keyword interrupt as shown in the
following example.
void interrupt newint()
{
...
...
}
Similarly a pointer to such an interrupt type function can also be declared as follows
void interrupt (*intptr)();
where intptr is the interrupt pointer, and it can be assigned an address using the
getvect() function
intptr = getvect(0x08);
Now interrupt number 8 can be invoked using the interrupt vector as follows
(*intptr)();
Setting Interrupt Vector
Setting the interrupt vector is just the reverse process of getting the interrupt vector. To set the
interrupt vector means to change the double word sized interrupt vector within the IVT.
This task can be accomplished using the function setvect(int #, newint), which
requires the number of the interrupt whose vector is to be changed and the new value of the
vector.
Fig (Vector being written into the IVT: the offset and segment words of a far address are stored at the entry selected by INT #)
In the following example a certain interrupt type function has been declared. The
address of this function can be placed into the vector of any interrupt using the setvect()
function. The following code places the address of the newint function at the
vector of int 8
void interrupt newint()
{
…
…
}
setvect(0x08,newint);
C program making use of Int 65H
Here is a listing of a program that makes use of int 65H to exhibit how software interrupts
need to be programmed.
The Keep function
One deficiency in the above listing is that it is not good enough for other applications, i.e.
after the termination of this program the newint65 function is de-allocated from
memory and the interrupt vector needs to be restored, otherwise it will act as a dangling
pointer. The keep() function avoids this by making the program memory resident.
keep(return code, no. of paras);
The keep() function requires the return code, which is usually zero for normal termination,
and the number of paragraphs required to be allocated. Each paragraph is 16 bytes in size.
TSR Programs
Following is a listing of a TSR (Terminate and Stay Resident) program which programs
interrupt number 65H, but in this case the new interrupt 65H function remains in
memory even after the termination of the program and hence the vector of int 65h does
not become a dangling pointer.
#include<BIOS.H>
#include<DOS.H>
The main() function gets and sets the vector of int 65H such that the address of
newint65 is placed at its vector. In this case the program is made memory resident using the keep
function and 1000 paragraphs of memory are reserved for the program (the number of
paragraphs is just a calculated guess based upon the size of the application). Now if any
application, as in the following case, invokes int 65H, the string st, which is also now memory
resident, will be displayed.
#include<BIOS.H>
#include<DOS.H>
void main()
{
geninterrupt (0x65);
geninterrupt(0x65);
}
This program invokes the interrupt 65H, which has been made resident, twice.
04 - TSR programs and Interrupts
Another Example:
#include<BIOS.H>
#include<DOS.H>
char st[80] ={"Hello World$"};
char st1[80] ={"Hello Students!$"};
void interrupt (*oldint65)( );
void interrupt newint65( );
void main()
{
oldint65 = getvect(0x65);
setvect(0x65, newint65);
keep(0, 1000);
}
void interrupt newint65()
{
if((_AH)==0)//corrected
{
_AH=0x09;
_DX = (unsigned int) st;
geninterrupt (0x21);
}
else
{
if((_AH)==1)//corrected
{
_AH=0x09;
_DX = (unsigned int) st1;
geninterrupt (0x21);
}
}
}
Various interrupts provide a number of services. The service number is usually placed in
the AH register before invoking the interrupt. The ISR should in turn check the value in
the AH register and then perform the function accordingly. The above example exemplifies
just that. In this example int 65H is assigned two services, 0 and 1. Service 0 prints the string
st and service 1 prints the string st1. These services can be invoked in the following
manner.
#include<BIOS.H>
#include<DOS.H>
void main()
{
_AH=1;
geninterrupt(0x65);
_AH=0;
geninterrupt(0x65);
}
Interrupt stealing or interrupt hooks
Previously we have discussed how a new interrupt can be written and implemented.
Interrupt stealing is a technique by which already implemented services can be altered by
the programmer.
This technique makes use of the fact that the vector is stored in the IVT and it can be read
and written. The vector of the interrupt which is to be hooked (the original routine) is first read
from the IVT and stored in an interrupt pointer type variable. After this the vector is
changed to point to one of the interrupt functions (the new routine) within the program. If the
interrupt is invoked now, it will force the new routine to be executed, provided that it is
memory resident. The original routine might be performing an
important task, so it also needs to be invoked. It can be invoked either at the start of the new routine
or at the end of the new routine using its pointer, as shown in the execution charts
below
Fig 1 (Normal execution of an ISR: execution is interrupted, the ISR performs I/O, then normal execution resumes)
Fig 2 (The original ISR being called at the end of new routine)
Fig 3 (The original ISR invoked at the start of new ISR)
Care must be taken while invoking the original interrupt. Generally, when hardware
interrupts are intercepted, invoking the original interrupt at the start of the new routine might
cause some problems, whereas in the case of software interrupts the original interrupt can be
invoked anywhere.
Sample Program for interrupt Interception
void interrupt newint();
void interrupt (*old)();
void main()
{
old=getvect(0x08);
setvect(0x08,newint);
keep(0,1000);
}
void interrupt newint()
{
…
…
(*old)();
}
The above program gets the address stored at the vector of interrupt 8 and stores it in the
pointer old. The address of the interrupt function newint is then placed at the vector
of int 8 and the program is made memory resident. From this point onwards whenever
interrupt 8 occurs the interrupt function newint is invoked. This function, after performing
its operation, calls the original interrupt 8 whose address has been stored in the old pointer.
Timer Interrupt
In the coming few examples we will intercept interrupt 8. This is the timer interrupt. The
timer interrupt has the following properties.
It is a hardware interrupt.
It is invoked by means of hardware.
It occurs approximately 18.2 times every second.
BIOS Data Area
The BIOS contains trivial I/O routines which have been programmed into a ROM type device
and is interfaced with the processor as a part of main memory. However the BIOS routines
require a few variables. These variables are stored in the BIOS data area at the location
0040:0000H in main memory.
One such byte stored in the BIOS data area is the keyboard status byte at the location
40:17H. It contains the status of various keys like alt, shift, caps lock etc. This byte can
be described by the diagram below
Fig 4 (Keyboard status byte at 40:17H)
Bit 7 = Insert key
Bit 6 = Caps Lock key
Bit 5 = Num Lock key
Bit 4 = Scroll Lock key
Bit 3 = Alt key
Bit 2 = Ctrl key
Bit 1 = Left Shift key
Bit 0 = Right Shift key
Another Example
#include<dos.h>
void interrupt (*old)();
void interrupt new();
char far *scr=(char far* ) 0x00400017;
void main()
{
old=getvect(0x08);
setvect(0x08,new);
keep(0,1000);
}
void interrupt new(){
*scr=64;
(*old)();
}
This fairly simple example intercepts the timer interrupt such that whenever the timer
interrupt occurs the function new() is invoked. Remember this is a .C program and not a
.CPP program. Save the code file with the .C extension after writing this code. On
occurrence of interrupt 8 the function new sets the caps lock bit in the keyboard status byte by
placing 64 at this position through its far pointer. So even if the user turns off the caps lock,
on the next occurrence of int 8 (almost immediately) the caps lock will be turned on again
(turning on the caps lock like this will not affect its LED on the keyboard, only the letters will
be typed in caps).
Memory Mapped I/O and Isolated I/O
A device may be interfaced with the processor to perform memory mapped or isolated I/O. Main
memory and I/O ports are both physically a kind of memory device. In the case of isolated
I/O, I/O ports are used to hold data temporarily while sending/receiving the data to/from
the I/O device. If a similar function is performed using a dedicated part of main memory
then the I/O operation is memory mapped.
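Fig 5 (Isolated I/O: the microprocessor exchanges data with the I/O device using IN and OUT instructions)
Fig 6 (Memory mapped I/O: the microprocessor exchanges data with the I/O device using MOV instructions)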
Memory Mapped I/O on Monitor
One of the devices in standard PCs that performs memory mapped I/O is the display device
(monitor). The output on the monitor is controlled by a controller called the video controller
within the PC. One of the reasons for adopting memory mapped I/O for the monitor is that
a large amount of data needs to be conveyed to the video controller in
order to describe the text or graphics that is to be displayed. Outputting such a large amount of data
through isolated I/O is not feasible as the number of ports in PCs
is limited to 65536.
The video text memory area starts from the address b800:0000H. Two bytes (a word) are reserved
in this area for a single character to be displayed. The low byte contains the ASCII code of the
character to be displayed and the high byte contains the attribute of the character to be
displayed. The address b800:0000h corresponds to the character displayed at the top
left corner of the screen, the next word b800:0002 corresponds to the next character on
the same row of the text screen and so on, as described in the diagram below.
Fig 7 (Memory mapped I/O on monitor: B800:0000 and B800:0001 hold the first character cell, B800:0002 and B800:0003 hold the next. Low byte = ASCII code, high byte = attribute byte)
The attribute byte (higher byte) describes the forecolor and the backcolor in which the
character will be displayed. The DOS screen carries black as the backcolor and white
as the forecolor by default. The lower 4 bits (lower nibble) represent the forecolor and
the higher 4 bits (higher nibble) represent the backcolor, as described by the diagram below
Fig 8 (Attribute byte)
Bit 7 = Blink
Bits 6-4 = Back color
Bit 3 = Bold
Bits 2-0 = Fore color
Color codes
000 Black
100 Red
010 Green
001 Blue
111 White
To understand all that is described above let's take a look at this example.
unsigned int far *scr = (unsigned int far *) 0xb8000000;
void main()
{
(*scr)=0x0756;
(*(scr+1))=0x7055;
}
This example will generate the output VU.
The far pointer scr is assigned the value 0xb800H in the high word, which is the segment
address, and the value 0x0000H in the low word, which is the offset address. The word at this
address is loaded with the value 0x0756H and the next word is loaded with the value
0x7055H. 0x07 is the attribute byte meaning black backcolor and white forecolor, and the byte
0x70h means white backcolor and black forecolor. 0x56 and 0x55 are the ASCII values
of “V” and “U” respectively.
05 - TSR programs and Interrupts (Keyboard interrupt)
This same task can be performed by the following program as well. In this case the video
text memory is accessed byte by byte.
The next example fills whole of the screen with spaces. This will clear the contents of the
screen.
unsigned char far *scr=(unsigned char far*)0xb8000000;
void main()
{
int i;
for (i=0;i<2000;i++)
{
*scr=0x20; // ASCII code of space
*(scr+1)=0x07; // white forecolor on black backcolor
scr=scr+2;
}
}
Usually in text mode there are 80 columns and 25 rows, making a total of 2000
characters that can be shown simultaneously on the screen. This program runs a loop
2000 times, placing 0x20, the ASCII code of the space character, in the whole of the text memory in
this text mode. Also the attribute is set to white forecolor and black backcolor.
Another Example
In the following example memory mapped I/O is used in combination with interrupt
interception to perform an interesting task.
#include<dos.h>
void interrupt (*old)();
void interrupt newfunc();
char far *scr=(char far* ) 0xb8000000;
void main()
{
old=getvect(0x08);
setvect(0x08,newfunc);
keep(0,1000);
}
void interrupt newfunc()
{
*scr=0x41; // ASCII code of 'A'
*(scr+1)=0x07; // white forecolor on black backcolor
(*old)();
}
In the above example the timer interrupt is intercepted such that whenever the timer
interrupt is invoked (by means of hardware) the memory resident newfunc() is invoked.
This function simply displays the ASCII character 0x41 or ‘A’ in the top left corner of the
text screen.
Here is another example.
#include<dos.h>
void interrupt (*old)();
void interrupt newfunc();
char far *scr=(char far* ) 0xb8000000;
int j;
void main()
{
old=getvect(0x08);
setvect(0x08,newfunc);
keep(0,1000);
}
void interrupt newfunc()
{
for ( j=0;j<4000;j+=2){ // step over the attribute bytes
if(*(scr+j)=='1'){
*(scr+j)='9';}
}
(*old)();
}
This program scans through all the character bytes of text display memory when int 8 occurs.
Once resident, it will replace every ‘1’ on the screen by ‘9’. Even if a ‘1’ is somehow
displayed on the screen, it will be converted to ‘9’ on the next occurrence of interrupt 8, which
occurs 18.2 times every second.
The keyboard Interrupt
The keyboard is a hardware device and it makes use of interrupt number 9 for its input
operations. Whenever a key is pressed interrupt # 9 occurs. The operating
system processes this interrupt in order to process the key pressed. This interrupt usually reads the
scan code from the keyboard port, converts it into the appropriate ASCII code and
places the ASCII code in the keyboard buffer in the BIOS data area, as described in the diagram
below
Fig (Keyboard interrupt: on a key press the keyboard controller interrupts the running process. INT 9 reads the scan code from port 60H, converts it to ASCII, places it in the keyboard buffer and returns)
Example
#include<dos.h>
void interrupt (*old)( );
void interrupt newfunc( );
void main()
{
old = getvect(0x09);
setvect(0x09,newfunc);
keep(0,1000);
}
void interrupt newfunc()
{
(*old)();
(*old)();
(*old)();
}
This program simply intercepts the keyboard interrupt and places the address of newfunc
in the IVT. The newfunc function simply invokes the original interrupt 9 thrice. Therefore the same
character input will be placed in the keyboard buffer thrice, i.e. three characters will be
received for each character input.
Example
#include<dos.h>
void interrupt (*old)( );
void interrupt newfunc( );
char far *scr=(char far* ) 0x00400017;
void main()
{
old = getvect(0x09);
setvect(0x09,newfunc);
keep(0,1000);
}
void interrupt newfunc()
{
*scr=64;
(*old)();
}
The above program is quite familiar. It will just set the caps lock status whenever a key is
pressed. In this case the keyboard interrupt is intercepted.
Example
This too is a familiar example. Whenever a key is pressed on the keyboard the newfunc
function runs through the whole of the text display memory and replaces every ASCII ‘1’
displayed by ASCII ‘9’.
Timer & Keyboard Interrupt Program
#include<dos.h>
#include<conio.h>
void interrupt (*oldTimer)();
void interrupt (*oldKey)();
void interrupt newTimer();
void interrupt newKey();
char far *scr=(char far* ) 0xB8000000;
int i, t = 0, m = 0;
char charscr[4000];
void main( )
{
oldTimer=getvect(8);
oldKey=getvect(9);
setvect(8,newTimer);
setvect(9,newKey);
getch();
getch();
getch();
getch();
setvect(8,oldTimer); // restore the original vectors before exit
setvect(9,oldKey);
}
void interrupt newTimer()
{
t++;
if ((t>=182)&&(m==0))
{
for (i=0;i<4000;i++) // save the screen contents
charscr[i]=*(scr+i);
for (i=0;i<4000;i+=2) // fill the screen with spaces
{
*(scr+i)=0x20;
*(scr+i+1)=0x07;
}
t=0; m=1;
}
(*oldTimer)();
}
void interrupt newKey()
{
int w;
if (m==1)
{
for (w=0;w<4000;w++) // restore the saved screen
*(scr+w)=charscr[w];
m=0;
}
(*oldKey)();
}
This program works like a screen saver. The newTimer function increments t whenever it
is invoked, so the value of t reaches 182 after ten seconds. At this moment the function
saves the contents of display text memory in a character array, fills the screen with spaces and sets a
flag m. The newKey function is invoked when a key press occurs.
The flag is checked, and if it is set the screen is restored from the values saved in that
character array.
Reentrant Procedures & Interrupt
If on return of a function the values within the registers are unchanged as compared
to the values which were stored in the registers on entry into the procedure, then the
procedure is called a reentrant procedure. Usually interrupt procedures are reentrant
procedures. In particular, interrupt procedures compiled using a C language compiler are
reentrant. This can be understood by the following example
AX = 1234H (before the call)
Proc1() (sets AX = FF55H)
AX = ? (after return)
In the above example the function Proc1() is invoked. On invocation the register AX
contains the value 1234H, and the code within the function Proc1() changes the value in
AX to FF55H. On return AX will contain the value 1234H if the function has been
implemented as a reentrant procedure, i.e. a reentrant procedure restores the
registers to their previous values (saved on the stack) before returning.
C language reentrant procedures save the registers on the stack in the order AX, BX,
CX, DX, ES, DS, SI, DI, BP on invocation and restore them in reverse order before return.
This fact about reentrant procedures can be analysed through the following example.
#include<stdio.h>
#include<dos.h>
void interrupt (*old)();
void interrupt newint();
void main ()
{
int a;
old = getvect(0x65);
setvect(0x65,newint);
_AX=0xf00f;
geninterrupt(0x65);
a = _AX;
printf("%x",a);
}
void interrupt newint()
{
_AX=0x1234;
}
Firstly, it is important to compile the above and all the rest of the examples as .C files
and not as .CPP files. If these codes are compiled using the .CPP extension then there is no
surety that the program will compile.
Again int 65H is used for this experiment. The int 65H vector is made to point at the
function newint(). Before calling interrupt 65H the value 0xF00F is placed in the AX
register. Within int 65H the value of the AX register is changed to 0x1234. But
if the value of AX is checked after return it will not be 0x1234, rather it will be 0xF00F,
indicating that the values in registers are saved on invocation and restored before return
and also that interrupt type procedures are reentrant.
06 - TSR programs and Interrupts (Disk interrupt, Keyboard hook)
The typical sequence in which registers will be pushed onto and popped from the stack on
invocation and on return can be best described by the following diagrams
On invocation
Push flags, CS, IP
Push AX, BX, CX, DX, ES, DS, SI, DI, BP
On return
Pop BP, DI, SI, DS, ES, DX, CX, BX, AX
Pop IP, CS, flags
The registers Flags, CS and IP are pushed on execution of the INT instruction and execution
branches to the interrupt procedure. The interrupt procedure pushes the registers AX, BX, CX,
DX, ES, DS, SI, DI, BP in this order. The interrupt procedure then executes, and before
returning it pops all the registers in the reverse order as BP, DI, SI, DS, ES, DX, CX, BX
and AX. IP, CS and flags are popped on execution of the IRET instruction.
The next diagram shows the status of the stack after invocation of the interrupt procedure.
Fig (Stack after invocation of the interrupt procedure, from the top of the stack downwards)
BP
DI
SI
DS
ES
DX
CX
BX
AX
IP
CS
Flags
The arguments of a simple procedure or function are saved on the stack for the scope of the
function/procedure. When an argument is accessed, stack memory is in fact accessed.
Now we will take a look at how stack memory can be accessed in the case of
interrupt procedures, for instance to modify the value of a register on the stack.
Accessing Stack Example
void interrupt newint ( unsigned int BP, unsigned int DI,
unsigned int SI, unsigned int DS, unsigned int ES,
unsigned int DX, unsigned int CX, unsigned int BX,
unsigned int AX, unsigned int IP, unsigned int CS,
unsigned int flags)
{
unsigned int a = AX;
unsigned int b = BX;
unsigned int d = ES;
}
Example:
In this example the value in AX on invocation is 0x1234. The interrupt procedure does not
change the current value of the register through pseudo variables, rather it changes the
corresponding value of AX on the stack, which will be restored into AX before return.
Disk Interrupt
The following example makes use of disk interrupt 13H and its service 3H. The details of
this service are as under.
On Entry
AH = Service # = 03
AL = No of Blocks to write
BX = Offset Address of Data
CH = Track #, CL = Sector #
DH = Head #
DL = Drive # (Starts from 0x80 for fixed disks & 0 for removable disks)
ES = Segment Address of data buffer.
On Exit
AH = return Code
Carry flag = 0 (No Error, AH = 0)
Carry flag = 1 (Error, AH = Error Code)
The boot block is a special block on disk which contains information about the
operating system to be loaded. If the data in the boot block is somehow destroyed the disk
would be rendered inaccessible. The address of the partition block on a hard disk is head # = 1,
track # = 0 and sector # = 1. Now let’s write an application that will protect the boot block
from being written by any other application.
#pragma inline
#include <dos.h>
#include <bios.h>
void interrupt (*oldtsr)();
void interrupt newtsr(unsigned int BP, …, flags);
//must provide all the arguments
void main ( )
{
oldtsr = getvect (0x13);
setvect(0x13, newtsr); //corrected
keep (0, 1000);
}
void interrupt newtsr(unsigned int BP, unsigned int DI,
unsigned int SI, unsigned int DS, unsigned int ES, unsigned
int DX, unsigned int CX, unsigned int BX, unsigned int AX,
unsigned int IP, unsigned int CS,
unsigned int flags)
{
if (_AH==0x03)
if ((_DH==1&&_CH==0&&_CL==1)&&_DL>=0x80)
{
asm clc; // pretend the write succeeded
asm pushf;
asm pop flags;
return;
}
_ES=ES;_DX=DX;
_CX=CX;_BX=BX;
_AX=AX;
(*oldtsr)(); // call the original int 13H routine
asm pushf;
asm pop flags;
AX = _AX; BX = _BX;
CX = _CX; DX = _DX;
ES = _ES;
}
The above program intercepts interrupt 13H. The new interrupt procedure first checks AH
for the service number and the other parameters for the address of the boot block. If the boot block is to
be written it simply returns, clearing the carry flag before returning to fool the calling
program into believing that the operation was successful. If the boot block is not to be written it places the
original parameters back into the registers and calls the original interrupt.
The values returned by the original routine are then stored into the corresponding register
values on the stack so that they may be updated into the registers on return.
The keyboard Hook
The service 15H/4FH is called the keyboard hook service. This service does not perform
any useful output. It is there to be intercepted by applications which need to alter the
keyboard layout. It is called by interrupt 9H after it has acquired the scan code of the input
character from the keyboard port, while the scan code is in the AL register. When this service
returns, interrupt 9H translates the scan code into an ASCII code and places it in the buffer. This
service normally does nothing and returns as it is, but a programmer can intercept it
in order to change the scan code in AL and hence alter the input or keyboard layout.
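Fig (Keyboard hook: when a key is pressed, INT 9 moves the scan code from port 60H to AL, calls Int 15H service 4FH, then converts the scan code to ASCII and places it in the keyboard buffer)
The following application shows how this can be done.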
#include <dos.h>
#include <bios.h>
#include <stdio.h>
void interrupt (*oldint15)();
void interrupt newint15(unsigned int BP, …, flags);
void main ( )
{
oldint15 = getvect (0x15);
setvect (0x15, newint15);
keep (0, 1000);
}
void interrupt newint15(unsigned int BP, unsigned int DI,
unsigned int SI, unsigned int DS, unsigned int ES, unsigned
int DX, unsigned int CX, unsigned int BX, unsigned int AX,
unsigned int IP, unsigned int CS,
unsigned int flags)
{
if(*(((char*)&AX)+1)==0x4F)
{
if(*((char*)&AX)==0x2C)
*(((char*)&AX)) = 0x1E;
else if (*((char*)&AX) == 0x1E)
*((char*)&AX)=0x2C; //corrected
}
else
(*oldint15)();
}
The application intercepts interrupt 15H. The newint15 function checks for service # 4FH
in the high byte of AX; if this value is 4FH then the value in AL is definitely the scan code.
Here a simple substitution is performed: 0x1E is the scan code of ‘A’ and 0x2C is the
scan code of ‘Z’. If the scan code in AL is that of ‘A’ it is substituted with the scan code of
‘Z’ and vice versa. If some other service of 15H is invoked, the original interrupt function is
called.
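The substitution performed by the hook can be tried out away from DOS. The following is a minimal sketch in portable C, not the TSR itself: the function name remap_scan_code is a hypothetical helper introduced only for illustration, and it assumes nothing beyond the two scan codes given above (0x1E for ‘A’, 0x2C for ‘Z’).

```c
/* Hypothetical helper: mirrors the substitution newint15 applies to AL.
   Scan codes 0x1E ('A') and 0x2C ('Z') are taken from the text above. */
unsigned char remap_scan_code(unsigned char scan_code)
{
    if (scan_code == 0x2C)      /* 'Z' pressed: report 'A' */
        return 0x1E;
    if (scan_code == 0x1E)      /* 'A' pressed: report 'Z' */
        return 0x2C;
    return scan_code;           /* every other key passes through unchanged */
}
```

Keeping the mapping in a separate function like this makes it easy to extend to a complete alternative layout before wiring it into the interrupt handler.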
07 - Hardware Interrupts
The microprocessor package has many signals for data, control and addresses. Some of
these signals are input signals and some are output signals. Hardware interrupts make use
of two such input signals, namely NMI (Non-Maskable Interrupt) and INTR (Interrupt
Request).
[Figure: microprocessor with input pins Reset, Hold, NMI and INTR]
NMI is a higher priority signal than INTR, HOLD has even higher priority and RESET has the
highest priority. If either of the NMI or INTR pins is activated, the microprocessor is
interrupted on the basis of priority, provided no higher priority signal is present. This is how
the microprocessor can be interrupted without the use of any software instruction, hence the
name hardware interrupts.
Hardware Interrupt and Arbitration
Most devices use the INTR line. The NMI signal is reserved for events of the utmost
urgency that must not be ignored; on the PC these are typically hardware faults such as a
memory parity error. The division by zero interrupt is a different case: it is raised inside the
processor by the circuitry which performs division when it receives 0 as divisor, since the
operation is not possible, and it does not arrive on the NMI pin.
INTR is used by other devices like COM ports, LPT ports, keyboard, timer etc. Since only
one signal is available for microprocessor interruption, this signal is arbitrated among the
various devices. This arbitration is performed by special hardware called the
Programmable Interrupt Controller (PIC).
Interrupt Controller
A single interrupt controller can arbitrate among 8 different devices.
[Figure: PIC with inputs IRQ0-IRQ7, data lines D0-D7 and output INT]
As can be seen from the diagram above, the PIC has 8 inputs, IRQ0-IRQ7. IRQ0
has the highest priority and IRQ7 the lowest. Each IRQ input is attached to an I/O
device; whenever the device requires service it sends a signal to the PIC. The PIC, on the
basis of priority and the presence of other requests, decides which request to serve.
Whenever a request is to be served, the PIC interrupts the processor through its INT output,
which is connected to the INTR input of the processor, and places the interrupt number on
its data lines, which are connected to the lower 8 lines of the data bus, to inform the
processor of the interrupt number. If no higher priority signal is present at the processor and
the processor is successfully interrupted, the microprocessor sends back an INTA (Interrupt
Acknowledge) signal to inform the PIC that the processor has been interrupted.
The following diagram also shows the typical connectivity of the IRQ lines with various
devices
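[Figure: typical IRQ connections: IRQ0 interval timer, IRQ1 keyboard controller, IRQ3 COM2, IRQ4 COM1, IRQ7 printer controller, remaining inputs other controllers; the PIC drives D0-D7 and INT to the INTR pin of the microprocessor, which answers with INTA]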
In standard PCs there may be more than 8 devices, so generally two PICs are used for INTR
line arbitration. These 2 PICs are cascaded such that together they are able to arbitrate
among 16 devices in all, as shown in the following diagram.
[Figure: master PIC (IRQ0-IRQ7) and slave PIC (IRQ8-IRQ15) sharing data lines D0-D7 and INTA, linked through the cascade lines cas1-cas3]
The PICs are cascaded such that a total of 16 IRQ levels, numbered IRQ0-IRQ15, can be
provided. IRQ level 2 is used to cascade the two PIC devices: the IRQ2 input of the
master PIC is connected to the INT output of the slave PIC. The data lines are
multiplexed such that the interrupt number is issued by the concerned PIC. If the slave PIC is
interrupted by a device, its request is propagated to the master PIC, and the master PIC
ultimately interrupts the processor on the INTR line according to the priorities.
In a standard PC the PICs are programmed such that the master PIC generates interrupt
numbers 08H-0FH for IRQ0-IRQ7 respectively and the slave PIC generates interrupt numbers
70H-77H for IRQ8-IRQ15.
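The mapping from IRQ level to interrupt number described above can be written down as a small function. This is only a sketch under the standard PC programming stated in the text (master at 08H-0FH, slave at 70H-77H); the name irq_to_interrupt is hypothetical and is not part of any BIOS or compiler library.

```c
#include <assert.h>

/* Hypothetical helper: interrupt number raised for a given IRQ level,
   assuming the standard PC programming described above
   (master: IRQ0-IRQ7 -> 08H-0FH, slave: IRQ8-IRQ15 -> 70H-77H). */
unsigned int irq_to_interrupt(unsigned int irq)
{
    assert(irq <= 15);              /* only 16 IRQ levels exist */
    if (irq < 8)
        return 0x08 + irq;          /* request handled by the master PIC */
    return 0x70 + (irq - 8);        /* request handled by the slave PIC */
}
```

For instance, the keyboard on IRQ1 gives interrupt 09H and the timer on IRQ0 gives interrupt 08H, which are the vectors intercepted by the TSR examples in this handout.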
Hardware Interrupts are Non-Preemptive
As described earlier, IRQ0 has the highest priority and IRQ15 has the lowest. If a
number of requests arrive at the same instant, the request with the higher priority is
sent for service first by the PIC. Now what happens if a lower priority interrupt is
being serviced and a higher priority interrupt request occurs: will the lower priority interrupt
be preempted? The answer is that the interrupt being serviced will not be preempted, no
matter what. The reason for this non-preemptive behaviour can be understood from the
example illustrated below. Suppose, for the sake of argument, that hardware interrupts were
preemptive. If a character ‘A’ is input, a H/W interrupt occurs to process it. While this
interrupt is being processed another character, say ‘B’, is input; since the interrupts are
preemptive, the previous instance is preempted and another instance of the H/W interrupt
is started. Similarly a third character ‘C’ is input and the same happens again. In this case
the first character to be fully processed and received will be ‘C’, then ‘B’ and then ‘A’. So
the sequence of input changes to CBA while the correct sequence would be ABC.
[Figure: with preemptive interrupts the keys A, B and C are received as CBA (logically incorrect); the logically correct order is ABC]
The input will be received in correct sequence only if the H/W interrupts are non-
preemptive as illustrated in the diagram below.
[Figure: with non-preemptive interrupts the keys A, B and C are received as ABC (logically correct)]
Who Notifies EOI (End of Interrupt)
The PIC has to be notified by the ISR routine about the return of the previous interrupt.
From the programmer's point of view this is the major difference between H/W and software
interrupts; a software interrupt does not require any such notification. As the
diagram below illustrates, every interrupt returns with an IRET instruction. This
instruction is executed by the microprocessor and has no linkage with the PIC, so there
has to be a different method to notify the PIC about the end of interrupt.
Pending Hardware interrupts
While a hardware interrupt is being processed, a number of other interrupts may be
pending. For the smooth working of the system it is necessary for the in-service hardware
interrupt to return early and notify the PIC on return. If this operation takes long and the
pending interrupt requests occur repeatedly, there is a chance of losing data.
Programming the PIC
To understand how the PIC is notified about the end of interrupt, let’s take a look at the
internal registers of the PIC and their significance. A PIC has a number of initialization
control words (ICW) and operation control words (OCW). The characteristics of ICWs and
OCWs are as follows:
• ICWs are programmed at the time of boot up.
• ICWs are used to program the operation mode, e.g. whether the PIC is cascaded or not;
they are also used to program the function of the PIC, i.e. whether it is to invoke
interrupts 08H-0FH or 70H-77H on an IRQ request.
• OCWs are used at run-time.
• An OCW is used to signal EOI to the PIC.
• OCWs are also used to read/write the values of the ISR (in-service register),
IMR (interrupt mask register) and IRR (interrupt request register).
To understand the ISR, IMR and IRR let’s take a look at the following diagram illustrating
an example.
Bit   7 6 5 4 3 2 1 0
ISR   0 0 0 1 0 0 0 0
IMR   0 0 0 0 0 0 1 0
IRR   1 1 0 0 0 1 0 1
The values shown in the various registers illustrate that the interrupt currently in service is
the one generated through IRQ4 of the PIC (int 0CH in case of the master PIC). The interrupt
through IRQ1 has been masked (int 9H, the keyboard interrupt, in case of the master PIC),
which means that even if a request for this interrupt is received by the PIC, the request is
ignored until this bit is cleared. The requests through IRQ7, IRQ6, IRQ2 and IRQ0 are
pending, waiting for the previously issued interrupt to return.
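The reading of the three registers given above can be reproduced with a few bit tests. The sketch below is an illustration only; the helper names are hypothetical, and it assumes just what the diagram states, namely that bit n of each 8-bit register belongs to IRQn of that PIC.

```c
/* Hypothetical helpers for interpreting the 8-bit PIC registers shown above.
   Bit n of each register corresponds to IRQn of that PIC. */

/* 1 if the given IRQ (0-7) is masked in the IMR, otherwise 0 */
int irq_is_masked(unsigned char imr, int irq)
{
    return (imr >> irq) & 1;
}

/* IRQ marked as in service in the ISR, or -1 if none is.
   The lowest set bit is reported because IRQ0 has the highest priority. */
int irq_in_service(unsigned char isr)
{
    int irq;
    for (irq = 0; irq < 8; irq++)
        if ((isr >> irq) & 1)
            return irq;
    return -1;
}

/* Number of requests waiting in the IRR */
int pending_count(unsigned char irr)
{
    int irq, count = 0;
    for (irq = 0; irq < 8; irq++)
        count += (irr >> irq) & 1;
    return count;
}
```

With the example values ISR = 0x10, IMR = 0x02 and IRR = 0xC5, these helpers report IRQ4 in service, IRQ1 masked and four pending requests, matching the description above.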
Port Addresses
A few of the operation control words can be altered after boot time. The addresses for these
OCWs are listed below.
• Master PIC has two ports
20H = OCW for EOI
21H = OCW for IMR
• Slave PIC has two ports as well
A0H = OCW for EOI code
A1H = OCW for IMR
Let’s now discuss an example that accesses these ports to control the PIC.
#include <stdio.h>
#include <dos.h>
#include <bios.h>
void main()
{
    outportb(0x21, 0x02);    /* set bit #1 of the master IMR: mask IRQ1 (keyboard) */
}
This example simply accesses bit #1 of the IMR in the master PIC. It sets bit #1
in the IMR, which masks the keyboard interrupt. As a result no input can be received from
the keyboard after running this program.
Let’s now look at another example.
#include <dos.h>
#include <stdio.h>
#include <bios.h>
void interrupt (*oldints)();
void interrupt newint8();
int t = 0; //corrected
void main()
{
    oldints = getvect(0x08);
    setvect(0x08, newint8);
    keep(0, 1000);
}
void interrupt newint8()
{
    t++;
    if (t == 182)                 /* about 10 seconds at 18.2 ticks per second */
    {
        outportb(0x21, 2);        /* mask the keyboard interrupt */
    }
    else {
        if (t == 364)             /* a further 10 seconds later */
        {
            outportb(0x21, 0);    /* unmask it again */
            t = 0;
        }
    }
    (*oldints)();
}
The example above is also an interesting one. This program intercepts the timer
interrupt. The new handler uses a variable t to keep track of how much time has
passed; t is incremented each time int 8 occurs. It reaches 182 after about 10 seconds; at
this point the keyboard interrupt is masked, and it remains masked for the subsequent 10
seconds, at which point the value of t is 364. The keyboard interrupt is then unmasked and
t is cleared to 0 for another such round.
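#include <dos.h>
void interrupt (*old)();
void interrupt newint9();
char far *scr = (char far *)0x00400017;   /* keyboard status byte in the BIOS data area */
void main()
{
    old = getvect(0x09);
    setvect(0x09, newint9);
    keep(0, 1000);
}
void interrupt newint9()
{
    /* scan code 83 is DEL; bits 2 and 3 of the status byte (mask 12) are Ctrl and Alt */
    if (inportb(0x60) == 83
        && (((*scr) & 12) == 12)) //corrected
    {
        outportb(0x20, 0x20);     /* send EOI to the master PIC */
        return;                   /* do not pass the key on to the real int 9 */
    }
    (*old)();
}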
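#include <dos.h>
void interrupt (*old)();
void interrupt newint9();
void main()
{
    old = getvect(0x09);
    setvect(0x09, newint9);
    keep(0, 1000);
}
void interrupt newint9()
{
    if (inportb(0x60) == 0x1F) //corrected
    {
        outportb(0x20, 0x20);     /* notify the PIC: end of interrupt */
        return;                   /* suppress the key by not calling the real int 9 */
    }
    (*old)();
}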
The above C language program suppresses the ‘s’ input from the keyboard. The keyboard
interrupt has been intercepted. When a key is pressed newint9 is invoked. This routine
reads the keyboard port, numbered 0x60, through the inportb statement. If the
scan code (and not the ASCII code) is 0x1F, it indicates that the ‘s’ key was pressed.
In this case newint9 simply returns, suppressing the input by not
calling the real int 9. Before returning it also notifies the PIC about the end of interrupt.
08 - Hardware Interrupts and TSR programs
The keyboard buffer
• Keyboard buffer is located in the BIOS Data Area.
• Starts at 40:1EH
• Ends at 40:3DH
• Has 32 bytes of memory, 2 bytes for each character.
• Head pointer is located at address 40:1AH to 40:1BH
• Tail pointer is located at address 40:1CH to 40:1DH
The keyboard buffer is a memory area reserved in the BIOS data area. This area stores the
ASCII or special key codes of the keys pressed on the keyboard. It works as a circular buffer,
and two bytes are used to store each character. The first byte stores the ASCII code and the
second byte stores 0 in case an ASCII key is pressed. In case an extended key like F1-F12 or
an arrow key is pressed, the first byte stores 0, indicating an extended key, and the second
byte stores its extended key code.
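Circular buffer
[Figure: circular buffer occupying 40:1EH to 40:3DH; the head is stored at 40:1AH and the tail at 40:1CH, both pointing at the same location while the buffer is empty]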
The circular keyboard buffer starts at the address 40:1EH and contains 32 bytes.
The address 40:1AH stores the head of this circular buffer while the address 40:1CH stores
the tail of this buffer. If the buffer is empty the head and tail point at the same location, as
shown in the diagram above.
Storing characters in the keyboard buffer
Offset   Contents
0x1E     ‘A’      <- Tail
0x1F     0
0x20     ‘B’
0x21     0
0x22     0
0x23     83
0x24              <- Head
The above slide shows how characters are stored in the buffer. If ‘A’ is to be stored, the
first byte in the buffer stores its ASCII code and the second stores 0; if an
extended key like DEL is to be stored, the first byte stores 0 and the second byte
stores its scan code, i.e. 83. The diagram also shows that head points to the byte where
the next input character can be stored. Also notice that head contains the offset from the
address 40:00H and not from address 40:1EH, i.e. it contains 0x24, which is the address of
the next byte to be stored relative to the start of the BIOS data area and not the
keyboard buffer.
Position of tail
Offset   Contents
0x1E     (consumed)
0x20     ‘B’      <- Tail
0x21     0
0x22     0
0x23     83
0x24              <- Head
As discussed earlier, the keyboard buffer is a circular buffer, therefore the tail needs to be
placed appropriately. In the given example the input ‘A’ stored in the buffer is consumed.
On consumption of this character the tail index is updated so that it points to the next
character in the buffer. In brief, the tail points to the next byte to be consumed in the
buffer while head points to the place where the next character can be stored.
• So the KBD buffer acts as a circular buffer.
• The tail value should be examined to get
to the start of the buffer.
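The head and tail behaviour described in this section can be modelled in ordinary C without touching the real BIOS data area. The following is a sketch only: struct kb_model and the kb_* functions are hypothetical names, head and tail follow the usage of this text (head is where the next key is stored, tail is the next key to be consumed), and both are kept as offsets from 40:00H as noted above.

```c
#include <string.h>

#define KB_START 0x1E   /* first byte of the buffer, as an offset from 40:00H */
#define KB_END   0x3E   /* one past the last byte of the buffer (40:3DH) */

/* Hypothetical model of the start of the BIOS data area */
struct kb_model {
    unsigned char bda[KB_END];
    unsigned int head;              /* where the next entry is stored */
    unsigned int tail;              /* next entry to be consumed */
};

void kb_init(struct kb_model *kb)
{
    memset(kb->bda, 0, sizeof kb->bda);
    kb->head = kb->tail = KB_START; /* empty buffer: head == tail */
}

/* Advance an offset by one 2-byte entry, wrapping at the end of the buffer */
static unsigned int kb_next(unsigned int off)
{
    off += 2;
    return off >= KB_END ? KB_START : off;
}

/* Store one 2-byte entry; returns 0 if the buffer is full */
int kb_put(struct kb_model *kb, unsigned char first, unsigned char second)
{
    unsigned int next = kb_next(kb->head);
    if (next == kb->tail)
        return 0;                   /* full: one entry is always left unused */
    kb->bda[kb->head] = first;
    kb->bda[kb->head + 1] = second;
    kb->head = next;
    return 1;
}

/* Consume one entry; returns 0 if the buffer is empty */
int kb_get(struct kb_model *kb, unsigned char *first, unsigned char *second)
{
    if (kb->head == kb->tail)
        return 0;
    *first = kb->bda[kb->tail];
    *second = kb->bda[kb->tail + 1];
    kb->tail = kb_next(kb->tail);
    return 1;
}
```

Storing ‘A’, ‘B’ and DEL (0 followed by 83) in a fresh model leaves head at 0x24, and consuming one entry moves tail to 0x20, which reproduces the two diagrams above.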
Example
#include <dos.h>
void interrupt (*old)();
void interrupt new1();
unsigned char far *scr = (unsigned char far *)0x0040001C;
void main()
{
    old = getvect(0x09);
    setvect(0x09, new1);
    keep(0, 100);
}
void interrupt new1()
{
    if (inportb(0x60) == 83)                              /* DEL pressed */
    {
        *((unsigned char far *)0x00400000 + *scr) = 25;   /* store the code of CTRL+Y */
        if ((*scr) == 60)                                 /* last entry (40:3CH): wrap around */
            *scr = 30;                                    /* back to 40:1EH */
        else
            *scr += 2;                                    /* each entry takes 2 bytes */
        outportb(0x20, 0x20);                             /* EOI to the master PIC */
        return;
    }
    (*old)();                                             /* all other keys go to the real int 9 */
}
The program listed in the slides above intercepts interrupt 9. Whenever interrupt 9
occurs it reads the keyboard port 0x60. If the port contains 83, DEL was
pressed; if so, the program places the code 25 in the buffer and then advances the insertion
pointer kept at 40:1CH in a circular manner. The code 25, placed instead of 83, represents
the combination CTRL+Y. While resident, the program causes applications to receive the
CTRL+Y combination whenever DEL is pressed by the user. For example, in the Borland C
environment the CTRL+Y combination is used to delete a line, so if this program is resident
a line will be deleted whenever DEL is pressed. The thing worth noticing is that the
interrupt function does not call the real interrupt 9 after placing 25 in the buffer; it returns
directly. But before returning, as it has intercepted a hardware interrupt, it needs to notify
the PIC. This is done by the outportb(0x20,0x20); statement. 0x20 is the address of the OCW
that receives the EOI code, which incidentally is also 0x20.
EOI Code for Slave IRQ
For Master
outportb(0x20, 0x20);
For Slave
outportb(0x20, 0x20);
outportb(0xA0, 0x20);
As discussed earlier the slave PIC is cascaded with the master PIC. If the hardware
interrupt to be processed is issued by the master PIC then the ISR needs to send the EOI
code to the master PIC but if the interrupt request is issued by the slave PIC then the ISR
needs to inform both master and slave PICs as both of them are cascaded as shown in the
slide.
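The rule for choosing which PICs receive the EOI code can be expressed as a small decision function. This is a sketch for illustration only: eoi_ports is a hypothetical helper, it performs no port I/O itself, and it assumes the port addresses 20H and A0H listed earlier.

```c
/* Hypothetical helper: lists the command ports that must receive the EOI
   code (0x20) for a given IRQ level, following the rule described above.
   Fills ports[] and returns how many entries were written. */
int eoi_ports(unsigned int irq, unsigned int ports[2])
{
    int n = 0;
    ports[n++] = 0x20;          /* the master PIC is always notified */
    if (irq >= 8)
        ports[n++] = 0xA0;      /* the request came through the slave PIC */
    return n;
}
```

An ISR for IRQ1 would then write 0x20 to port 20H only, while an ISR for IRQ12 would write it to both 20H and A0H.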
Reading OCW
OCW2 & OCW3
Bit       7 6 5 4 3 2 1 0
Bits 7-5  001 for non-specific EOI
Bits 4-3  00 if EOI is to be sent, 01 if other registers are to be accessed
Bits 2-0  X X X (don’t care)
The same port, i.e. 0x20, is used to access the OCWs. 00 is placed in bits number 4 and 3 to
indicate that an EOI is being sent, and 01 is placed to indicate that an internal register is to
be accessed.
Bit       7 6 5 4 3 2 1 0
Bits 4-3  01 to read IRR or ISR
Bits 1-0  10 = IRR, 11 = In-Service Register
The values in bits number 1 and 0 indicate which register is to be accessed: 10 is for the
IRR and 11 is for the ISR.
Accessing the ISR and IRR
Bit       7 6 5 4 3 2 1 0
Value     0 0 0 0 1 0 0 0
Bits 7-5  no EOI relevant
Bits 4-3  01: other register to be accessed
Bits 2-0  don’t care
A value is placed in port 0x20, as shown in the above slide, to indicate that a register is to
be accessed.
Bit       7 6 5 4 3 2 1 0
Value     0 0 0 0 1 0 1 0
Bits 4-3  01: PIC notified about reading operation
Bits 1-0  10: IRR accessed
Then a value is again placed in the same port to indicate which register is to be
accessed; in the above slide the IRR is to be accessed.
Bit       7 6 5 4 3 2 1 0
Value     0 0 0 0 1 0 1 1
Bits 4-3  01: PIC notified about reading operation
Bits 1-0  11: ISR accessed
And in this slide a value is formed which can be programmed in the port 0x20 to access
the ISR.
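The two command values formed in these slides follow directly from the bit fields, which a short function makes explicit. The sketch below is illustrative only; the enum and function names are hypothetical and it relies solely on the field layout described above (bits 4-3 = 01, bits 1-0 = 10 for IRR or 11 for ISR).

```c
/* Hypothetical helper: command byte written to port 0x20 to select a
   register for reading, built from the bit fields described above. */
enum pic_register { PIC_IRR = 2, PIC_ISR = 3 };   /* values of bits 1-0 */

unsigned char pic_read_command(enum pic_register which)
{
    /* bits 4-3 = 01 means "other register to be accessed" */
    return (unsigned char)((1u << 3) | (unsigned int)which);
}
```

Calling it with PIC_IRR gives 0x0A and with PIC_ISR gives 0x0B, the two values shown in the slides.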
A sample program
#include <stdio.h>
#include <dos.h>
#include <bios.h>
void main(void)
{
    unsigned char a;
    outportb(0x20, 8);        /* a register is to be accessed */
    outportb(0x20, 0x0A);     /* select the IRR */
    a = inportb(0x20);
    printf("value of IRR is %x", a);
    outportb(0x20, 0x08);
    outportb(0x20, 0x0B);     /* select the ISR */
    a = inportb(0x20);
    printf("value of ISR is %x", a);
}
The above program makes use of the technique described to access the ISR and IRR.
Firstly 0x08 is written to specify that a register is to be accessed, then 0x0A is written to
indicate that the IRR is to be accessed. Now port 0x20 can be read, and it will contain the
value in the IRR. The same is done again, writing 0x0B to port 0x20, to access the
ISR.
More about TSR Programs
• A TSR needs to be loaded only once in memory
• Multiple loading will leave redundant
copies in memory
One of the solutions to the problem can be as shown in the slide below
int flag;
flag = 1;
keep(0, 1000);
if (flag == 1)
Make TSR
else
exit Program
This will not work, as this global variable is only global for this instance of the program.
Other instances in memory will have their own memory space, so the flag has to be kept
in a location common to all instances.
int 65H is empty, so we can use its vector as a flag.
Address of vector
seg = 0
offset = 65H * 4
Example:
#include <stdio.h>
#include <BIOS.H>
#include <DOS.H>
unsigned int far *int65vec = (unsigned far *)MK_FP(0, 0x65 * 4);
void interrupt (*oldint)();
void interrupt newfunc();
void main()
{
    if ((*int65vec) != 0xF00F) //corrected
    {
        oldint = getvect(0x08);
        setvect(0x08, newfunc);
        (*int65vec) = 0xF00F;     /* leave the "already resident" mark */
        keep(0, 1000);
    }
    else
    {
        puts("Program Already Resident");
    }
}
void interrupt newfunc()
{
    /* ::::::: */
    /* ::::::: */
    (*oldint)();
}
The above template shows how the vector of int 0x65 can be used as a flag. It
maintains a far pointer which is assigned the address of the int 0x65 vector. Before
calling the keep() function, i.e. before making the program resident, a value of
0xF00F is placed at this vector (this vector can be tampered with as it is not being used by
the OS or device drivers). Now if another instance of the program attempts to run, the if
statement at the start of the program checks for the presence of 0xF00F at the int 0x65
vector; if it is found the program simply exits, otherwise it makes itself resident. In other
words, 0xF00F at the int 0x65 vector indicates in this case that the program is
already resident.
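The address used for the flag follows from the layout of the interrupt vector table: each vector occupies 4 bytes starting at 0000:0000, which is where the expression 0x65*4 in the template comes from. A minimal sketch of that arithmetic, with the hypothetical helper name vector_offset:

```c
/* Each interrupt vector is a 4-byte far pointer (offset word, then segment
   word) stored from address 0000:0000 upward, so vector n starts at
   offset n * 4 within segment 0. Hypothetical helper for illustration. */
unsigned int vector_offset(unsigned int int_no)
{
    return int_no * 4u;
}
```

For int 65H this gives offset 0194H, so the far pointer in the template refers to address 0000:0194.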
Another Method
• Service # 0xFF usually does not exist for ISRs.
• The key is to create another service, # 0xFF, for the intercepted interrupt besides
its other processing.
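#include <stdio.h>
#include <stdlib.h>
#include <BIOS.H>
#include <DOS.H>
void interrupt (*oldint)();
void interrupt newint(unsigned int BP, unsigned int DI,
    unsigned int SI, unsigned int DS, unsigned int ES,
    unsigned int DX, unsigned int CX, unsigned int BX,
    unsigned int AX, unsigned int IP, unsigned int CS,
    unsigned int flags);
void main()
{
    _DI = 0;
    _AH = 0xFF;
    geninterrupt(0x13);
    if (_DI == 0xF00F) {
        puts("Program Already Resident");
        exit(0);
    }
The program implements the service 0xFF of interrupt 0x13 such that whenever this service
is called it returns 0xF00F in DI; if this value is not returned, it means that this
program is not resident.
Example:
    else
    {
        oldint = getvect(0x13);
        setvect(0x13, newint);
        keep(0, 1000);
    }
}
void interrupt newint(unsigned int BP, unsigned int DI,
    unsigned int SI, unsigned int DS, unsigned int ES,
    unsigned int DX, unsigned int CX, unsigned int BX,
    unsigned int AX, unsigned int IP, unsigned int CS,
    unsigned int flags)
{
    if ((AX >> 8) == 0xFF)        /* AH == 0xFF: the residency check service */
    {
        DI = 0xF00F;              /* returned to the caller in DI */
        return;
    }
    else
    {
        /* ::::::: */
        /* ::::::: */
    }
    (*oldint)();
}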
09 - The Interval Timer
The interval timer
• Synchronous devices require a timing signal.
[Figure: the clock generator supplies Clk to the microprocessor and PCLK = 1.19318 MHz (for peripheral synchronous devices) to the interval timer; timer channel 0 goes to IRQ0, channel 1 to the DRAM controller and channel 2 to the PC speaker]
The interval timer is used to divide an input frequency. The input frequency used by the
interval timer is the PCLK signal generated by the clock generator. The interval timer has
three different channels, each with an individual output and memory for storing the divisor value.
Dividing Clock signal
Counter Registers:
• Counter registers can be used to divide frequency.
[Figure: 8-bit count register, bits 7-0; bit 0 gives /2, bit 1 gives /4, bit 2 gives /8, bit 3 gives /16]
A counter register can be used to divide the clock signal. As shown in the slide above, bit 0
of the counter register divides the clock frequency by 2, bit 1 divides it by 4,
and so on.
The Division mechanism
00000000
00000001
00000010
00000011
00000100
00000101
00000110
00000111
00001000
00001001
00001010
00001011
00001100
00001101
00001110
00001111
The above slide shows the sequence of values that an 8-bit counter register goes through
as it receives the clock signal. Observe bit #0: its value completes one 0-1 cycle every two
clock cycles, so it can be used to divide the basic frequency by 2.
Similarly observe bit #1: its value completes one cycle every 4 clock cycles, hence it
divides the frequency by 4, and so on.
Timing diagram
[Figure: timing diagram of the counter outputs: Bit 0 (/2), Bit 1 (/4), Bit 2 (/8), Bit 3 (/16), ...]
Here is the timing diagram for the above example. Bit #0 performs one cycle in 2
clock cycles. Similarly bit #1 performs one cycle in 4 clock cycles, and so on.
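The division effect can be checked by simulating a free-running counter and counting the cycles of one of its bits. This is an illustrative sketch, not timer programming: bit_cycles is a hypothetical helper, and bits are numbered from 0 as the least significant bit, as in the labels of the timing diagram (Bit 0 is /2, Bit 1 is /4).

```c
/* Count how many complete 0 -> 1 -> 0 cycles bit `bit` of a counter goes
   through while the counter, starting from zero, receives `clocks` input
   pulses. Hypothetical helper for illustration only. */
unsigned long bit_cycles(unsigned int bit, unsigned long clocks)
{
    unsigned long count, cycles = 0;
    unsigned int prev = 0;
    for (count = 1; count <= clocks; count++) {
        unsigned int now = (unsigned int)((count >> bit) & 1u);
        if (prev == 1 && now == 0)
            cycles++;               /* a falling edge completes one cycle */
        prev = now;
    }
    return cycles;
}
```

Over 16 input pulses bit 0 completes 8 cycles, bit 1 completes 4, bit 2 completes 2 and bit 3 completes 1, i.e. the input frequency divided by 2, 4, 8 and 16.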
Command registers within the programmable interval timer
Interval Timer Programming: Command Registers
• 8-bit command port
• Needs to be programmed before loading the divisor value for a channel.
• 3 channels, each requires a 16-bit divisor value to generate the output frequency.
Command register and the channels need to be programmed for the interval timer to
generate a wanted frequency.
Command Register
Bit       7 6 5 4 3 2 1 0
Example   1 0 1 1 0 1 0 0
Bits 7-6  channel: 00 = 0, 01 = 1, 10 = 2
Bits 5-4  01 = low byte, 10 = high byte, 11 = low byte followed by high byte
Bits 3-1  mode 0~5 = 000~101
Bit 0     0 = binary, 1 = BCD
Mode Description
[Figure: output waveforms for divisor = 4 in modes 0, 1, 2 and 3]
The interval timer can operate in six modes. Each mode produces a different output
waveform to suit the needs of the application. Some modes might be suitable to control a
motor and some might be suitable to control the speaker.
Binary counter
Binary count:      BCD count:
1 0 0 0 1 0 0 1    1 0 0 0 1 0 0 1  (89)
1 0 0 0 1 0 1 0    1 0 0 1 0 0 0 0  (90)
1 0 0 0 1 0 1 1    1 0 0 1 0 0 0 1  (91)
                   ...
                   1 0 0 1 1 0 0 1  (99)
                   0 0 0 0 0 0 0 0  (00)
The interval timer channels can be used as binary as well as BCD counters. If a channel is
used in binary mode its counter register counts in binary sequence, and if it is used as a
BCD counter its register counts in BCD sequence, as described above.
Ports and Channels
• 3 channels, each with a 16-bit wide divisor value, i.e. 0~65535
• 8-bit port for each channel, therefore the divisor word is loaded serially, byte by byte.
Port Addresses
43H = Command port
40H = 8-bit port for Channel 0
41H = 8-bit port for Channel 1
42H = 8-bit port for Channel 2
The interval timer has 3 channels, and each channel is 16 bits wide. The port 43H is an 8-bit
port used as the command register. Ports 40H, 41H and 42H are associated with
channels 0, 1 and 2 respectively. Channels are 16 bits wide whereas the ports are 8 bits
wide, so a 16-bit value is loaded serially through the port into the register.
Steps for programming the interval timer
Programming concepts for the interval timer:
• Load the command byte required to program the specific channel into the command register.
• The divisor word is then loaded serially, byte by byte.
The port 61H
Bit 0: connect to interval timer = 1
Bit 1: turn ON speaker = 1, turn OFF speaker = 0
Rest of the bits are used by other devices and should not be changed.
The port 61H is used to control the speaker; only the least significant 2 bits are important.
Bit 0 is used to connect the interval timer to the speaker and bit #1 is used to turn the
speaker on or off. The rest of the bits are used by other devices.
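Example
//Program loads divisor value of 0x21FF
//Turns ON the speaker and connects it to Interval Timer
#include <BIOS.H>
#include <DOS.H>
#include <CONIO.H>
void main()
{
    outportb(0x43, 0xB4);                    /* command byte for channel 2 */
    outportb(0x42, 0xFF);                    /* low byte of the divisor */
    outportb(0x42, 0x21);                    /* high byte of the divisor */
    outportb(0x61, inportb(0x61) | 3);       /* set bits 0 and 1 of port 61H */
    getch();
    outportb(0x61, inportb(0x61) & 0xFC);    /* clear bits 0 and 1 again */
}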
The above program programs the interval timer and then turns the speaker on. A value of
0xB4 is loaded into the command register at port 0x43. This value signifies that channel 2 is
to be programmed, both bytes of the divisor value are to be loaded, the interval timer is to
be programmed in mode 2 and it is to be used as a binary counter.
Then the divisor value, 0x21FF, is loaded serially: first the low byte 0xFF and then the
high byte 0x21. Both least significant bits of port 0x61 are set to turn on the
speaker and connect it to the interval timer.
On a key press the speaker is again disconnected and turned off.
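The command byte 0xB4 and the resulting output frequency can both be derived from the values given in this chapter. The sketch below is an illustration under those stated values only (the command register bit layout and PCLK = 1.19318 MHz); the names timer_command and timer_output_hz are hypothetical and no port is written.

```c
#define PCLK_HZ 1193180UL   /* 1.19318 MHz input clock of the interval timer */

/* Build the command byte for port 43H from its fields:
   bits 7-6 channel, bits 5-4 byte access, bits 3-1 mode, bit 0 BCD flag. */
unsigned char timer_command(unsigned int channel, unsigned int access,
                            unsigned int mode, unsigned int bcd)
{
    return (unsigned char)(((channel & 3u) << 6) | ((access & 3u) << 4) |
                           ((mode & 7u) << 1) | (bcd & 1u));
}

/* Output frequency in Hz (rounded down) for a non-zero 16-bit divisor */
unsigned long timer_output_hz(unsigned int divisor)
{
    return PCLK_HZ / divisor;
}
```

Channel 2, access 11 (low byte followed by high byte), mode 2 and binary counting combine to 0xB4, and the divisor 0x21FF gives an output of about 137 Hz.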
Producing a Delay in a Program
Timer count: 40:6CH
Incremented every 1/18.2 seconds, whenever INT 8 occurs.
#include <stdio.h>
unsigned long int far *time = (unsigned long int far *)0x0040006C;
void main()
{
    unsigned long int tx;
    tx = (*time);
    tx = tx + 18;
    puts("Before");
    while ((*time) <= tx);
    puts("After");
}
A delay can be produced using the double word variable in the BIOS data area at
location 0040:006C. This variable contains a timer count and is incremented every 1/18.2
of a second. In this program the double word is read, placed in a program variable and
increased by 18. The value at 40:6CH is then compared with this variable in a loop. The loop
iterates as long as the value at 40:6CH has not exceeded the variable. In this way the loop
keeps iterating for approximately one second.
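The tick counts used for delays in these programs (18 for a second, 4 for a quarter of a second) follow from the rate of 18.2 ticks per second. A minimal sketch of the conversion, with the hypothetical helper name ticks_for_ms and integer arithmetic that rounds down:

```c
/* Number of timer ticks (at roughly 18.2 per second) to wait for a delay
   given in milliseconds; the result is rounded down.
   Hypothetical helper for illustration only. */
unsigned long ticks_for_ms(unsigned long ms)
{
    return (ms * 182UL) / 10000UL;
}
```

The value returned would be added to the count read from 40:6CH, in the same way as the constant 18 in the program above.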
10 - Peripheral Programmable Interface (PPI)
Sample Program
#include <dos.h>
unsigned long int far *time = (unsigned long int far *)0x0040006C;
void main()
{
    unsigned long int tx;
    unsigned int divisor = 0x21FF;
    while (divisor >= 0x50) {
        outportb(0x43, 0xB4);
        outportb(0x42, *((char *)(&divisor)));          /* low byte */
        outportb(0x42, *(((char *)(&divisor)) + 1));    /* high byte */
        outportb(0x61, inportb(0x61) | 3);
        tx = *time;
        tx = tx + 4;
        while (*time <= tx);
        divisor = divisor - 30;
    }
}
The inner while loop in the program is used to induce a delay. The outer loop
simply reloads the divisor value each time it iterates, after reducing this value by 30. In
this way the output frequency of the interval timer changes approximately every quarter of
a second. The speaker turns on with a low pitch, and the
frequency increases gradually, producing a spectrum of sound pitches.
Sample Program
#include <dos.h>
#include <bios.h>
void interrupt (*oldint15)();
void interrupt newint15(unsigned int BP, unsigned int DI,
    unsigned int SI, unsigned int DS, unsigned int ES,
    unsigned int DX, unsigned int CX, unsigned int BX,
    unsigned int AX, unsigned int IP, unsigned int CS,
    unsigned int flags);
void main()
{
    oldint15 = getvect(0x15);
    setvect(0x15, newint15);
    keep(0, 1000);
}
The above program is a TSR program that can be used to turn the speaker on/off.
It intercepts int 15H. Whenever this interrupt occurs it looks for
service # 0x4F (keyboard hook). If ‘S’ (scan code 0x1F) has been pressed it toggles the
speaker.
Sample Program
#include <dos.h>
#include <bios.h>
#include <conio.h>
unsigned int divisors[4] = {0x21ff, 0x1d45, 0x1b8a, 0x1e4c};
unsigned long int far *time = (unsigned long int far *)0x0040006C;
void main()
{
    unsigned long int tx;
    int i = 0;
    while (!kbhit())
    {
        while (i < 4)
        {
            outportb(0x43, 0xB4);
            outportb(0x42, *((char *)(&divisors[i])));
            outportb(0x42, *(((char *)(&divisors[i])) + 1));
            outportb(0x61, inportb(0x61) | 3);
            tx = *time;
            tx = tx + 4;
            while (tx >= (*time));
            i++;
        }
        i = 0;
    }
    outportb(0x61, inportb(0x61) & 0xFC);
}
This program generates a tune with 4 different pitches. It is quite similar to the
one discussed earlier. The only major difference is that in the earlier program the pitch was
gradually altered from low to high, whereas in this one the pitches change periodically until
a key is pressed to terminate the outer loop. Four pitches are maintained and their divisor
values are placed in the divisors[] array. These divisor values are loaded one by one,
each after a delay of approximately a quarter of a second, and this continues until a key is pressed.
Sample Program
#include <stdio.h>
#include <dos.h>
#include <bios.h>
struct tagTones
{
    unsigned int divisor;
    unsigned int delay;
};
struct tagTones Tones[4] = {
    {0x21ff, 3}, {0x1d45, 2}, {0x1b8a, 3}, {0x1e4c, 4}};
int i, ticks, flag = 0;
char far *scr = (char far *)0x00400017;   /* keyboard status byte, tested by newint15 */
void interrupt (*oldint15)();
void interrupt (*oldint8)();
void interrupt newint15();
void interrupt newint8();
void main()
{
    oldint15 = getvect(0x15);
    setvect(0x15, newint15);
    oldint8 = getvect(0x08);
    setvect(0x08, newint8);
    keep(0, 1000);
}
This is an interrupt driven version of the previous program. This program makes use of
the timer interrupt rather than a loop to vary the divisor value. Moreover interrupt 15 is
used to turn the speaker on /off.
void interrupt newint15()
{
    if (_AH == 0x4f)
    {
        if ((_AL == 0x1f) && (((*scr) & 12) == 12))
        {
            ticks = 0;
            i = 0;
            outportb(0x43, 0xb4);
            outportb(0x42, *((char *)(&Tones[i].divisor)));
            outportb(0x42, *(((char *)(&Tones[i].divisor)) + 1));
            outportb(0x61, inportb(0x61) | 3);
            flag = 1;
        }
        else if ((_AL == 0x1E) && (((*scr) & 12) == 12))
        {
            outportb(0x61, inportb(0x61) & 0xfc);
            flag = 0;
        }
        return;
    }
    (*oldint15)();
}
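The speaker turns on whenever ‘S’ (scan code 0x1F) is pressed and turns off whenever
‘A’ (scan code 0x1E) is pressed, provided the condition ((*scr)&12)==12 also holds. With scr
pointing at the keyboard status byte at 40:17H, as in the earlier int 9 example, this means
that Ctrl and Alt are held down.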
void interrupt newint8()
{
    if (flag == 1)
    {
        ticks++;
        if (ticks == Tones[i].delay)
        {
            if (i == 3)
                i = 0;
            else
                i++;
            outportb(0x43, 0xB4);
            outportb(0x42, *((char *)(&Tones[i].divisor)));
            outportb(0x42, *(((char *)(&Tones[i].divisor)) + 1));
            outportb(0x61, inportb(0x61) | 3);
            ticks = 0;
        }
    }
    (*oldint8)();
}
The timer interrupt switches to the next divisor value stored in the Tones structure whenever
the required number of ticks (timer counts), given by the delay field of the current entry,
has passed.
More such divisor values and their delays can be initialized in the Tones structure to
generate an alluring tune.
Peripheral Programmable Interface (PPI)
Parallel Ports (PPI)
[Figure: parallel communication, output: the CPU drives data lines D0-D7 to the parallel output device, with Busy and Strobe lines]
The PPI is used to perform parallel communication. Devices like printers are generally based
on parallel communication. The principle of parallel communication is explained in the slide
above. It is called parallel because a number of bits are transferred from one point to
another in parallel, on various lines simultaneously.
[Figure: parallel communication, input: the parallel input device drives data lines D0-D7 and a DR line towards the CPU through an I/O controller]
Advantages of Parallel communication
• Faster
• Only economically feasible for small distances
11 - Peripheral Programmable Interface (PPI) II
Programmable Peripheral Interface (PPI)
[Slide: Parallel I/O. The CPU is connected to the PPI, and the PPI is connected to the device (printer).]
The PPI acts as an interface between the CPU and a parallel I/O device. An I/O
device cannot be directly connected to the buses, so it generally requires a controller to be placed
between the CPU and the I/O device. One such controller is the PPI. Here we will see how we
can program the PPI to control the device connected to it, which generally is the
printer.
Int 17H
Int 17H is used to control the printer via the BIOS. The BIOS functions that perform
printer I/O are listed in the slide above along with their other parameter, DX, which contains the
LPT number. A standard PC can have 4 PPIs named LPT1, LPT2, LPT3 and LPT4.
Status Byte
Accessing the Parallel Port Through BIOS Functions
All the functions return the current printer status in AH:
Bit 7   Printer Busy
Bit 6   Receive Mode Selected
Bit 5   Out of Paper
Bit 4   Printer Off Line
Bit 3   Transfer Error
Bit 0   Time Out
The functions listed above return a status byte in the AH register whose meaning is
described in the slide above. Various bits of the byte describe the status of the printer.
Timeout Byte
Accessing the Parallel Port Through BIOS Functions
Time Out Byte
0040:0078 LPT1
0040:0079 LPT2
0040:007A LPT3
The BIOS service, once invoked, will try to perform the requested operation on the printer
repeatedly for a certain time period. If the operation is unsuccessful for
any reason, the BIOS will not quit but will try again and again until the number of tries
specified in the timeout bytes shown above runs out.
Accessing the Parallel Port Through BIOS Functions
• Specifies the number of attempts the BIOS performs before giving a time out error
• This byte varies depending upon the speed of the PC
• Busy = 0: printer is busy
• Busy = 1: printer is not busy
Importance of the Status Byte
if (((pstate & 0x29) != 0) or
    ((pstate & 0x80) == 0) or
    ((pstate & 0x10) == 0))
    { printerok = FALSE; }
else
    { printerok = TRUE; }
The status of the printer can be used in the manner described above to check whether the printer
can perform printing or not. If there is a transfer error, the printer is out of paper or
there is a timeout, the printer cannot be accessed. Likewise, if the printer is busy or
offline it cannot be accessed. The pseudo-code just performs these checks.
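The same check can be written as a small C function. This is a minimal sketch: printer_ok() and the macro names are illustrative, while the masks 0x29, 0x80 and 0x10 are the ones used in the pseudo-code above.

```c
/* Bit masks taken from the pseudo-code above. */
#define PRN_FAULT_MASK 0x29  /* time out, transfer error, out of paper */
#define PRN_NOT_BUSY   0x80  /* bit 7: 1 = printer is not busy */
#define PRN_ONLINE     0x10  /* bit 4: 0 = printer is off line */

/* Returns 1 if the status byte says the printer can accept a character,
   0 otherwise. */
int printer_ok(unsigned char pstate)
{
    if ((pstate & PRN_FAULT_MASK) != 0)
        return 0;               /* some fault bit is set */
    if ((pstate & PRN_NOT_BUSY) == 0)
        return 0;               /* printer is busy */
    if ((pstate & PRN_ONLINE) == 0)
        return 0;               /* printer is off line */
    return 1;
}
```

A caller would pass the AH value returned by int 17H/02, for example printer_ok(regs.h.ah).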
17H/00H Write a character
    On entry: AH = 00, AL = ASCII code, DX = Interface#
    On exit:  AH = Status Byte
17H/01H Initialize Printer
    On entry: AH = 01, DX = Interface#
    On exit:  AH = Status Byte
17H/02H Get Printer Status
    On entry: AH = 02, DX = Interface#
    On exit:  AH = Status Byte
Printing Programs
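Sample Program
Printing Program
#include <stdio.h>
#include <dos.h>
union REGS regs;
FILE *fptr;
void main(void)
{
    fptr = fopen("c:\\temp\\abc.txt", "rb");
    regs.h.ah = 1;                          /* int 17H/01: initialize printer */
    regs.x.dx = 0;                          /* interface 0 */
    int86(0x17, &regs, &regs);
    while (!feof(fptr))
    {
        regs.h.ah = 2;                      /* int 17H/02: get printer status */
        regs.x.dx = 0;
        int86(0x17, &regs, &regs);
        if ((regs.h.ah & 0x80) == 0x80)     /* bit 7 set: printer is not busy */
        {
            regs.h.ah = 0;                  /* int 17H/00: write a character */
            regs.h.al = getc(fptr);
            int86(0x17, &regs, &regs);
        }
    }
}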
The above program performs programmed I/O on the printer using BIOS services. The
program first initializes the printer using int 17H/01. The while loop ends when the end of
the file is reached. Inside the loop it checks the printer status (int 17H/02) and writes the next byte
of the file if the printer is found idle, which it determines from the most significant bit of the status byte.
Sample Program
Printing Program 1
#include <dos.h>
void interrupt (*old)();
void interrupt newint();
main()
{
    old = getvect(0x17);
    setvect(0x17, newint);
    keep(0, 1000);
}
void interrupt newint()
{
    if (_AH == 0)
    {
        if ((_AL == 'A') || (_AL == 'Z'))   //corrected
            return;
    }
    (*old)();    /* every other request goes to the original int 17H handler */
}
This program intercepts int 17H. The new handler returns without printing whenever an A or a
Z is to be printed; the rest of the characters are printed normally. Only the As and the Zs in
the printed document will be omitted.
Sample Program
Printing Program 2
#include <dos.h>
void interrupt (*old)();
void interrupt newfunc();
main()
{
    old = getvect(0x17);
    setvect(0x17, newfunc);
    keep(0, 1000);
}
void interrupt newfunc()
{
    if (_AH == 0)
    {
        if (_AL == ' ')
            return;      /* drop spaces */
    }
    (*old)();            /* every other request goes to the original handler */
}
In this sample program again int 17H is intercepted. The new interrupt function will
ignore all the spaces in the print document.
Sample Program
Printing Program 3
#include <dos.h>
void interrupt (*old)();
void interrupt newfunc();
main()
{
    old = getvect(0x17);
    setvect(0x17, newfunc);
    keep(0, 1000);
}
void interrupt newfunc()
{
    if (_AH == 0)
    {
        (*old)();
        _AH = 0;
        (*old)();
        _AH = 0;
        (*old)();
    }
    (*old)();
}
In this program interrupt 17H is again intercepted. Whenever a character is to be printed, the
new function calls the old function three times inside the if block and once more after it. As a
result, a single character in the print document is printed 4 times.
Now we will see how the registers within the PPI can be accessed directly to control the
printer.
Direct Parallel Port Programming
• BIOS supports up to three parallel ports
• The addresses of these LPT ports are stored in the BIOS Data Area
40:08 word LPT1
40:0A word LPT2
40:0C word LPT3
40:0E word LPT4
The above slide lists the addresses within the BIOS data area where the base address (starting
port number) of each LPT device is stored.
Dump of BIOS data area
The dump of the BIOS data area addresses specified in the previous slide, taken on a certain computer,
shows that the base port address of LPT1 is 0x03BC, of LPT2 is 0x0378 and of LPT3 is
0x0278. These values need not be the same on all computers and can vary
from computer to computer.
Swapping LPTs
Direct Parallel Port Programming
unsigned int far *lpt = (unsigned int far *) 0x00400008;
unsigned int temp;
temp = *(lpt);
*lpt = *(lpt + 1);
*(lpt + 1) = temp;
The LPTs can be swapped, i.e. LPT1 can be made LPT2 and LPT2 can be made LPT1. This can
be accomplished simply by swapping their addresses in the BIOS data area as shown in the slide
above.
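The swap can be tried on any machine by modelling the four words at 40:08 with an ordinary array. This is only a sketch: struct lpt_table and swap_lpt() are hypothetical stand-ins for the far pointer used in the slide.

```c
/* Stand-in for the four words stored at 0040:0008 in the BIOS data area. */
struct lpt_table {
    unsigned short base[4];  /* base[0] = LPT1 ... base[3] = LPT4 */
};

/* Exchange two LPT entries; a and b are zero-based indexes (0 = LPT1). */
void swap_lpt(struct lpt_table *table, int a, int b)
{
    unsigned short temp = table->base[a];
    table->base[a] = table->base[b];
    table->base[b] = temp;
}
```

With the values from the dump discussed above (0x03BC, 0x0378, 0x0278), swap_lpt(&table, 0, 1) leaves LPT1 at 0x0378 and LPT2 at 0x03BC.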
Direct Parallel Port Programming: Port Registers
• 40:08 stores the base address for LPT1
• The parallel port interface has 3 ports internally
• If the base address is 0x378 then the three ports will be 0x378, 0x379 and 0x37A
LPT Ports
Direct Parallel Port Programming: Port Registers
Base + 0 = Data Port (bits 7 6 5 4 3 2 1 0)
Base + 1 = Printer Status (includes the Printer is Busy bit)
The first port (Base + 0) is the data port. Data to be sent or received is placed in this port. In
the case of a printer, (Base + 1) is the printer status port as described in the slide. Each bit
represents one aspect of the printer status, quite similar to the status byte returned by the BIOS
service.
Printer Control Register
Direct Parallel Port Programming: Port Registers
Printer Control Register = Base + 2
Bit:   7   6   5   4    3    2    1    0
       0   0   0   IRQ  SI   IN   ALF  ST
(Base + 2) is the printer control register. It is used to pass control information to
the printer, as described in the slide.
Direct Parallel Port Programming
#include <stdio.h>
#include <dos.h>
FILE *fptr;
unsigned int far *base = (unsigned int far *) 0x00400008;
void main(void)
{
    fptr = fopen("c:\\abc.txt", "rb");
    while (!feof(fptr))
    {
        /* bit 7 of the status port is 1 when the printer is not busy */
        if ((inport((*base) + 1) & 0x80) == 0x80)
        {
            outport(*base, getc(fptr));
            outport((*base) + 2, inport((*base) + 2) | 0x01);   /* strobe = 1 */
            outport((*base) + 2, inport((*base) + 2) & 0xFE);   /* strobe = 0 */
        }
    }
}
The above program directly accesses the registers of the PPI to print a file. The while loop
terminates when the file ends. The if statement only checks whether the printer is busy or not.
If the printer is idle, the program writes the next byte of the file to the data port and then turns
the strobe bit to 1 and then to 0 to indicate that a byte has been sent to the printer.
The loop then starts checking the busy status of the printer again and the process continues.
12 - Parallel Port Programming
Printer Interface and IRQ7
[Slide: Printer Interface. The ACK line from the printer enters the printer interface, whose INT output is connected to IRQ7 of the PIC.]
The printer interface uses IRQ 7 as shown in the slide above. Therefore, if interrupt
driven I/O is to be performed, int 0x0F needs to be programmed as a hardware interrupt.
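Interrupt Driven Printer I/O
unsigned int far *lpt = (unsigned int far *) 0x00400008;
char buf[1024]; int i = 0;
void interrupt (*oldint)();
void interrupt newint();
void main(void)
{
    /* the IN and IRQ bits live in the printer control register (base + 2) */
    outport((*lpt) + 2, inport((*lpt) + 2) | 4);
    outport((*lpt) + 2, inport((*lpt) + 2) | 0x10);
    oldint = getvect(0x0F);
    setvect(0x0F, newint);
    outport(0x21, inport(0x21) & 0x7F);    //corrected
    keep(0, 1000);
}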
void interrupt newint()
{
    outport(*lpt, buf[i]);
    outport((*lpt) + 2, inport((*lpt) + 2) & 0xFE);
    outport((*lpt) + 2, inport((*lpt) + 2) | 1);
    i++;
    if (i == 1024)
    {
        outport(0x21, inport(0x21) | 0x80);    //corrected
        setvect(0x0F, oldint);
        freemem(_psp);
    }
}
Above is a listing of a program that uses int 0x0F to perform interrupt driven I/O. To enable
interrupt 0x0F, three things need to be done. First, the interrupt should be enabled in
the printer control register. Second, it should be unmasked in the IMR of the PIC. Third, the
program should intercept interrupt 0x0F by placing the address of its
function newint() in the vector.
newint() will now be called whenever the printer can accept output. It
writes the next byte in the buffer to the data register and then sends a pulse on the
strobe signal to tell the printer that data has been sent to it. When the whole buffer has
been sent, the int 0x0F vector is restored, the interrupt is masked and the memory for the
program is de-allocated.
The above listing might not work. Not all of the printer interfaces are designed as
described above. Some modifications in the printer interface will not allow the interrupt
driven I/O to work in this manner. If this does not work the following strategy can be
adopted to send printing to the printer in background.
Printing in the background
#include <stdio.h>
#include <dos.h>
#include <bios.h>
#include <conio.h>
#include <stdlib.h>
void interrupt (*oldint)();
void interrupt newint();
unsigned int far *lpt = (unsigned int far *) 0x00400008;
char st[80] = "this is a test print string !!!!!!!!!!!";
int i;
void main()
{
    oldint = getvect(0x08);
    setvect(0x08, newint);
    keep(0, 1000);
}
void interrupt newint()
{
    if (((inport((*lpt) + 1)) & 0x80) == 0x80)
    {
        outport(*lpt, st[i++]);
        outport((*lpt) + 2, inport((*lpt) + 2) & 0xfe);
        outport((*lpt) + 2, inport((*lpt) + 2) | 1);
    }
    if (i == 32)
    {
        setvect(0x08, oldint);
        freemem(_psp);
    }
    (*oldint)();
}
This program uses the timer interrupt to send data to the printer in the background.
Whenever the timer interrupt occurs, the interrupt function checks whether the printer is idle
or not. If the printer is idle, it takes a byte from the buffer, sends it to the data port
of the printer interface and then sends a pulse through the strobe signal. When the required
number of characters has been sent (i == 32 in the listing), the program restores the int 8 vector
and relinquishes the memory occupied by the program.
Printer Cable Connectivity
1 STROB
2 D0
3 D1
4 D2
5 D3
6 D4
7 D5
8 D6
9 D7
10 ACK
11 BUSY
12 PE
13 SLCT
14 AUTOFEED
15 ERROR
16 INIT
17 SLCTIN
18-25 GND
Not all the bits of the internal registers of the PPI are available in standard PCs. In
standard PCs the PPI is connected to a DB25 connector, and only some of the bits of its
internal registers are available as pinouts, as described in the slide above.
Computer to Computer Communication
[Slide: Computer to Computer Connectivity]
It might be desirable to connect one computer to another via PPIs to transfer data. One
might want to connect them such that one port of the PPI at one end is connected to another
port of the PPI at the other end. But interconnecting all 8 bits of a PPI port is not
possible, as not all the bits of the internal ports are available as pinouts. So the
answer is to connect a nibble (4 bits) at one end to a nibble at the other. In this way two
way communication can be performed. The nibbles are connected as shown in the slide
above.
PPI Interconnection
PC1                 PC2
P0 (pin 2)   ----   Q3 (pin 15)
P1 (pin 3)   ----   Q4 (pin 13)
P2 (pin 4)   ----   Q5 (pin 12)
P3 (pin 5)   ----   Q6 (pin 10)
P4 (pin 6)   ----   Q7 (pin 11)
Q3 (pin 15)  ----   P0 (pin 2)
Q4 (pin 13)  ----   P1 (pin 3)
Q5 (pin 12)  ----   P2 (pin 4)
Q6 (pin 10)  ----   P3 (pin 5)
Q7 (pin 11)  ----   P4 (pin 6)
The pins that are interconnected are shown in the slide above. Another thing worth
noticing is that bit 4 of the data port (P4) is connected to BUSY (Q7) and vice versa.
BUSY is inverted before it can be read from the status port. So bit 4 of the data port at
PC1 will be inverted before it is read at bit 7 of the status register at PC2.
FlowControl
An algorithm should be devised to control the flow of data so the receiver and sender may
know when the data is to be received and when it is to be sent. The following slides
illustrate one such algorithm.
[Slide: The sender places 0 B3 B2 B1 B0 on D4 D3 D2 D1 D0. Sender sends the low nibble with D4 = 0, received as BUSY = 1. The receiver reads 1 B3 B2 B1 B0 on BUSY ACK PE SLC ER.]
First the low nibble of the byte is sent from the sender in bits D0 to D3 of the data port. The D4
bit is cleared to indicate that the low nibble is being sent. The receiver will know of the arrival of
the low nibble when it checks the BUSY bit, which should be set (by the interface) on arrival.
[Slide: The receiver places 0 B3 B2 B1 B0 on D4 D3 D2 D1 D0. Receiver sends back the low nibble with D4 = 0, received as BUSY = 1 by the sender, which reads 1 B3 B2 B1 B0 on BUSY ACK PE SLC ER.]
The receiver then sends back the nibble turning its D4 bit to 0 as an acknowledgement of
the receipt of the low nibble. This will turn the BUSY bit to 1 at the sender side.
[Slide: The sender places 1 B7 B6 B5 B4 on D4 D3 D2 D1 D0. Sender sends the high nibble and turns D4 = 1, received as BUSY = 0 by the receiver, which reads 0 B7 B6 B5 B4 on BUSY ACK PE SLC ER.]
The sender then sends the high nibble and turns its D4 bit to 1, indicating the
transmission of the high nibble. On the receiver side the BUSY bit will turn to 0, indicating the
receipt of the high nibble.
The receiver then sends back the high nibble to the sender as an acknowledgment.
[Slide: The receiver places 1 B7 B6 B5 B4 on D4 D3 D2 D1 D0. Receiver sends back the high nibble and turns D4 = 1, received as BUSY = 0 by the sender, which reads 0 B7 B6 B5 B4 on BUSY ACK PE SLC ER.]
13 - Serial Communication
Program implementing the described protocol
int i = 0; char ch; char Buf[1024];
while (1)
{
    if ((inport((*lpt) + 1) & 0x80) == 0)
    {
        ch = Buf[i];
        ch = ch & 0xEF;                 /* low nibble, D4 = 0 */
        outport(*lpt, ch);              /* send the low nibble */
        while ((inport((*lpt) + 1) & 0x80) == 0);
    }
    else
    {
        ch = Buf[i];
        ch = ch >> 4;
        ch = ch | 0x10;                 /* high nibble, D4 = 1 */
        outport(*lpt, ch);
        i++;
        while ((inport((*lpt) + 1) & 0x80) == 0x80);
    }
}
This is the sender program. If it finds the BUSY bit clear, it sends the low nibble
after turning the D4 bit to 0. Otherwise it shifts the byte right by 4 bits, sets
the D4 bit, sends the high nibble and waits for the acknowledgment until BUSY is
cleared.
int i;
while (1)
{
    if ((inport((*lpt) + 1) & 0x80) == 0x80)
    {
        x = inport((*lpt) + 1);
        x = x >> 3;
        x = x & 0x0F;
        outport((*lpt), x);
        while ((inport((*lpt) + 1) & 0x80) == 0x80);
    }
    else
    {
        y = inport((*lpt) + 1);
        y = y << 1;
        temp = y;
        y = y & 0xF0;    //instruction added
        y = y | x;
This is the receiver program. If the BUSY bit is set, it receives the low nibble and stores it in
x. Similarly, if the BUSY bit is 0, it receives the high nibble and concatenates both
nibbles to form a byte.
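The bit manipulation on both sides can be checked without a cable by modelling the wiring in software. The sketch below is an illustration only: all four function names are hypothetical, status_seen() assumes the interconnection described earlier (D0 to D3 arrive on status bits 3 to 6, D4 arrives inverted on bit 7), and low_nibble_frame() masks with 0x0F, whereas the sender listing uses 0xEF and leaves the unconnected upper bits in place.

```c
/* Sender side: value written to the data port for each half of the byte. */
unsigned char low_nibble_frame(unsigned char byte)
{
    return (unsigned char)(byte & 0x0F);          /* D4 = 0 marks the low nibble */
}

unsigned char high_nibble_frame(unsigned char byte)
{
    return (unsigned char)((byte >> 4) | 0x10);   /* D4 = 1 marks the high nibble */
}

/* Cable model: D0..D3 appear on status bits 3..6 of the other PC,
   D4 appears inverted on status bit 7 (BUSY). */
unsigned char status_seen(unsigned char data)
{
    unsigned char status = (unsigned char)((data & 0x0F) << 3);
    if ((data & 0x10) == 0)
        status |= 0x80;
    return status;
}

/* Receiver side: rebuild the byte from the two status port readings,
   using the same shifts and masks as the receiver listing. */
unsigned char assemble_byte(unsigned char low_status, unsigned char high_status)
{
    unsigned char low = (unsigned char)((low_status >> 3) & 0x0F);
    unsigned char high = (unsigned char)((high_status << 1) & 0xF0);
    return (unsigned char)(high | low);
}
```

For the byte 0xA7 the receiver would read 0xB8 for the low nibble and 0x50 for the high nibble, and assemble_byte(0xB8, 0x50) gives back 0xA7.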
Serial Communication
• Advantages
• Disadvantages
Types of Serial Communication
• Synchronous
• Asynchronous
In serial communication the bits travel one after the other in a serial pattern. The
advantage of this technique is that the cost is reduced, as only 1 or 2 lines
may be required to transfer data.
The major disadvantage of serial communication is that the speed of data transfer may be
reduced, as data is transferred in a serial pattern.
There are two kinds of serial communication.
Synchronous Communication
• Timing signal is used to identify start and end
of a bit.
LSB MSB
1 1 0 1 0 1 1 0
01101011
Synchronous Communication
• Sampling may be edge triggered.
• A special line may be required for the timing signal (requires another line).
• Or the timing signal may be encoded within the original signal (requires
double the bandwidth).
Asynchronous Communication
• Does not make use of a timing signal.
• Each byte (word) needs to be encapsulated in start and end bits.
UART (Universal Asynchronous Receiver Transmitter)
Serial Communication using a UART
[Slide: Frame on the line. A low start bit, 5–8 data bits, a parity bit, then 1, 1.5 or 2 high stop bits, followed by the start bit of the next byte.]
Sampling Rate
Bit rate = 9600
A bit is sampled after = 1/9600
-- But the start and end bits of a particular byte cannot be recognized.
-- So a 1.5 stop bit (high) is used to encapsulate a byte. A low start bit at
the start of a byte is used to identify the start of the byte.
Sampling Rate
-- Bit rate and other settings should be the same at both ends, i.e.
- Data bits per byte (5–8)
- Parity check
- Parity Even/Odd
- No. of stop bits
[Slide: Sampling Rate. Waveform of one frame with a bit time of 1/300 sec, showing the start bit, data bits, odd parity bit and stop bit.]
A = 41H = 01000001B
Parity = Odd
Data = 8
Stop bit = 1
Data rate = 300 bits/sec
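The frame for the example character can be worked out in code. This is a sketch under stated assumptions: parity_bit() and build_frame() are hypothetical helpers, the frame uses 8 data bits and one stop bit as in the example above, and the data bits are sent least significant bit first.

```c
/* Returns the parity bit for one byte: with odd != 0 the bit makes the
   total number of 1s odd, otherwise it makes the total even. */
int parity_bit(unsigned char data, int odd)
{
    int ones = 0;
    int bit;
    for (bit = 0; bit < 8; bit++)
        ones += (data >> bit) & 1;
    return odd ? !(ones & 1) : (ones & 1);
}

/* Fills frame[] with the start bit, 8 data bits (LSB first), the parity
   bit and one stop bit. Returns the number of bits written (11). */
int build_frame(unsigned char data, int odd, int frame[11])
{
    int n = 0;
    int bit;
    frame[n++] = 0;                       /* start bit is low */
    for (bit = 0; bit < 8; bit++)
        frame[n++] = (data >> bit) & 1;   /* data bits, least significant first */
    frame[n++] = parity_bit(data, odd);
    frame[n++] = 1;                       /* stop bit is high */
    return n;
}
```

For A = 41H there are two 1 bits, so with odd parity the parity bit is 1. At 300 bits/sec each of the 11 bits lasts 1/300 of a second.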
RS–232C Standard
• Standard for physical dimensions of the connectors.
[Slide: RS–232C cable. PC (DTE) connected via serial port to Modem (DCE).]
RS–232C Connectors and Signals
DB25 (25 pin connector on PC)
Pin   Signal
2     TXD
3     RD
4     RTS
5     CTS
6     DSR
7     GND
8     CD
20    DTR
22    RI
The pinouts of the DB25 connector used with RS232C are shown in the slide above.
14 - Serial Communication (Universal Asynchronous Receiver Transmitter)
Flow Control using RS232C
[Slide: PC and MODEM connected by the RI, CD, DSR, RTS, CTS, RxD, TxD and DTR lines.]
DTR (should remain high throughout the session)
CTS (can be used for flow control)
Data is received through the RxD line and sent through the TxD line. DTR (data
terminal ready) indicates that the data terminal is live. DSR (data set ready)
indicates that the data set is live. Whenever the sender can send data it raises the
RTS (request to send) signal. If the receiver is free and can receive data, it sends the
sender an acknowledgment through CTS (clear to send), indicating that it is clear to send now.
DB9 Connector for UART
DB9 Connector
Pin   Signal
1     CD
2     RxD
3     TxD
4     DTR
5     GND
6     DSR
7     RTS
8     CTS
9     RI
The above slide shows the pinouts of the DB9 connector.
UART Internals
[Slide: UART Internals. RxD feeds the Receiver Shift Register and the Receiver Buffer Register; the Transmitter Holding Register feeds the Transmit Shift Register and TxD. Other blocks shown: Interrupt Enable Register, Interrupt ID Register, Line Status Register, parity logic, Divisor Latch Register, Modem Control Register and Modem Status Register (1. CTS, 2. DSR, 3. CD, 4. RI).]
This slide shows the various internal registers within a UART device. The programmer
only needs to program these registers efficiently in order to perform asynchronous
communication.
Register summary
Register                        Abbreviation   Base +
Transmitter Holding Register    THR            0
Receiver Data                   RBR            0
Baud Rate Divisor (Low Byte)    DLL            0
Baud Rate Divisor (High Byte)   DLM            1
Interrupt Enable                IER            1
FIFO Control Register           FCR            2
Interrupt ID                    IIR            2
Line Control                    LCR            3
Modem Control                   MCR            4
Line Status                     LSR            5
Modem Status                    MSR            6
Scratch Pad                     SP             7
The above table lists the registers within the UART along with their abbreviations.
It also shows their offsets with respect to the base register.
Served Ports in Standard PC
BIOS supports 4 UARTs as COM ports: COM1, COM2, COM3, COM4
BIOS Data Area
Text Dump
-d 40:0
0040:0000  F8 03 F8 02 E8 03 E8 02-BC 03 78 03 78 02 C0 9F  ..........x.x...
0040:0010  23 C8 20 80 02 85 00 20-00 00 34 00 34 00 71 10  #. .... ..4.4.q.
0040:0020  0D 1C 71 10 0D 1C 64 20-20 39 34 05 30 0B 3A 27  ..q...d  94.0.:'
0040:0030  30 0B 0D 1C 00 00 00 00-00 00 00 00 00 00 00 00  0...............
0040:0040  D8 00 C3 00 00 00 00 00-00 03 50 00 00 10 00 00  ..........P.....
0040:0050  00 0A 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
0040:0060  0F 0C 00 D4 03 29 30 00-00 00 00 00 02 C9 0B 00  .....)0.........
0040:0070  00 00 00 00 00 00 08 00-14 14 14 14 01 01 01 01  ................
-q
The above dump of the BIOS data area for a certain computer shows that the address of
COM1 is 03F8, the address of COM2 is 02F8 and the address of COM3 is 03E8. These
addresses may not be the same on all computers and may vary from computer to computer.
Setting the Baud Rate
1.8432 MHz = frequency generated by UARTs internally
Baud rate = 1.8432 MHz / (16 * Divisor)
Divisor value is loaded in DLL (Base + 0) and DLM (Base + 1)
Divisor = 1, Baud Rate = 115200
Divisor = 0CH, Baud Rate = 9600
Divisor = 180H, Baud Rate = 300
The baud rate is set in accordance with the divisor value loaded within the UART internal
registers base +0 and base +1.
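The formula can be turned around to find the divisor for a wanted baud rate. The helpers below are a minimal sketch with hypothetical names; they rely only on the relation given above, since 1.8432 MHz / 16 = 115200.

```c
/* 1.8432 MHz / 16 = 115200, the baud rate obtained with divisor 1. */
#define UART_MAX_BAUD 115200UL

/* Divisor to load for a given baud rate (baud must not be zero). */
unsigned int uart_divisor(unsigned long baud)
{
    return (unsigned int)(UART_MAX_BAUD / baud);
}

/* Low byte goes to DLL (base + 0), high byte goes to DLM (base + 1). */
unsigned char divisor_low(unsigned int divisor)
{
    return (unsigned char)(divisor & 0xFF);
}

unsigned char divisor_high(unsigned int divisor)
{
    return (unsigned char)((divisor >> 8) & 0xFF);
}
```

For 300 baud this gives 384 (180H), i.e. 80H for DLL and 01H for DLM.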
Line Control Register
Bit 7      0 = Load THR, 1 = Load divisor value
Bit 6      Stop communication = 1, Resume communication = 0
Bit 5      Constant parity: 0 = No constant parity, 1 = Constant parity
           (parity bit is 1 if bit 4 = 0, and 0 if bit 4 = 1)
Bit 4      Parity: 0 = Odd, 1 = Even
Bit 3      Parity check and generation on
Bit 2      Length of stop bits: 0 = one bit, 1 = 1.5 for 5 bit word
Bits 1-0   Word length: 00 = 5 bits, 01 = 6 bits, 10 = 7 bits, 11 = 8 bits
The line control register contains important information about the behaviour of the line
through which the data will be transferred. Its bits specify the word size, the
length of the stop bits, the parity check and the parity type, and it also has a control bit for
loading the divisor value. If bit 7 is set, base + 0 and base + 1 act as the divisor
register; if it is cleared, base + 0 is the data register.
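A value for this register can be assembled from its fields as follows. This is a sketch only: make_lcr() and the macro names are hypothetical, and the bit positions follow the slide above.

```c
#define LCR_DLAB       0x80  /* bit 7: 1 = base+0 and base+1 hold the divisor */
#define LCR_STOP_COMM  0x40  /* bit 6: 1 = stop communication */
#define LCR_CONST_PAR  0x20  /* bit 5: 1 = constant parity */
#define LCR_EVEN       0x10  /* bit 4: 1 = even parity, 0 = odd */
#define LCR_PARITY_ON  0x08  /* bit 3: parity check and generation on */
#define LCR_LONG_STOP  0x04  /* bit 2: longer stop bit */

/* word_bits is 5, 6, 7 or 8; the other arguments are 0 or 1. */
unsigned char make_lcr(int word_bits, int long_stop, int parity_on, int even_parity)
{
    unsigned char lcr = (unsigned char)((word_bits - 5) & 0x03);  /* bits 1-0 */
    if (long_stop)
        lcr |= LCR_LONG_STOP;
    if (parity_on)
        lcr |= LCR_PARITY_ON;
    if (even_parity)
        lcr |= LCR_EVEN;
    return lcr;
}
```

An 8 bit word with one stop bit and even parity gives 1BH. OR-ing in LCR_DLAB (9BH) would select the divisor registers instead of the data register.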
Line Status Register
Bit 0   Data Ready = 1
Bit 1   Over Run Error = 1
Bit 3   Transfer Error (Framing Error)
Bit 4   Stop Communication Signal from Other end = 1
Bit 6   TSR is Empty = 1, TSR Contains a Byte = 0
The line status register shows the status of the line. It indicates whether data can be sent or
received. If bits 5 and 6 are both set then 2 consecutive bytes can be sent for output. This
register also indicates any error that might occur during communication.
Interrupt Enable Register
Bit 0   Trigger Interrupt On Data Ready = 1
Bit 1   Trigger Interrupt As soon as THR is empty = 1
Bit 2   Trigger Interrupt On line status change = 1
Bit 3   Trigger Interrupt On change in Modem Status = 1
If interrupt driven I/O is to be performed then this register is used to enable interrupts
for the UART. It can also be used to select the events for which an interrupt is generated, as
described in the slide.
Interrupt ID Register
Bit 0      Interrupt Triggered
Bits 2-1   00 = Change in Modem/Line Status
           01 = THR is Empty
           10 = Data is Ready
           11 = Error in Data
Once an interrupt occurs it may be necessary to identify the cause of the interrupt. This
register is used to identify that cause.
15 - COM Ports
Modem Control Register
Bit 0   DTR
Bit 1   RTS
Bit 3   0 = Polling operation, 1 = Interrupts enabled
Bit 4   1 = Self Test, 0 = Normal
If a software oriented flow control technique is used, bits 0 and 1 need to be set.
Bit 3 needs to be set to enable interrupts. Moreover, if only a single computer is
available to a developer, the UART contains a self test mode which the
programmer can use to test the software. In self test mode the output of the UART is routed to its
input, so you receive what you send.
Modem Status Register
Bit 7   CD
Bit 6   RI
Bit 5   DSR
Bit 4   CTS
Bit 3   Change in CD
Bit 2   Change in RI
Bit 1   Change in DSR
Bit 0   Change in CTS
This register indicates the status of the modem status line or any change in the status of
these lines.
FIFO Queue
UART (16550) FIFO QUEUE
Bits 7-6   Number of characters received to trigger an interrupt:
           00 = After every character
           01 = After 4 characters
           10 = After 8 characters
           11 = After 14 characters
Bit 2      Clear send buffer = 1
Bit 1      Clear receiver buffer = 1
Bit 0      FIFO buffer on = 1
This feature is available in the newer version of the UART, numbered 16550. A queue or buffer of
the input or output bytes is maintained within the UART in order to facilitate more
efficient I/O. The queue can be controlled through this register as shown in the
slide.
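A value for this register can be composed from the fields listed above. The function below is a sketch with a hypothetical name, make_fcr(); the bit positions are the ones given in the slide.

```c
/* trigger is the number of received characters (1, 4, 8 or 14) after
   which an interrupt is raised; the other arguments are 0 or 1. */
unsigned char make_fcr(int fifo_on, int clear_rx, int clear_tx, int trigger)
{
    unsigned char fcr = 0;
    if (fifo_on)
        fcr |= 0x01;            /* bit 0: FIFO buffer on */
    if (clear_rx)
        fcr |= 0x02;            /* bit 1: clear receiver buffer */
    if (clear_tx)
        fcr |= 0x04;            /* bit 2: clear send buffer */
    switch (trigger) {
    case 4:  fcr |= 0x40; break;    /* bits 7-6 = 01 */
    case 8:  fcr |= 0x80; break;    /* bits 7-6 = 10 */
    case 14: fcr |= 0xC0; break;    /* bits 7-6 = 11 */
    default: break;                 /* every character: bits 7-6 = 00 */
    }
    return fcr;
}
```

Turning the FIFO on, clearing both buffers and asking for an interrupt after 14 characters gives C7H.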
Interrupt ID Register (Revisited)
Bits 7-6   Any one of these being set indicates FIFO is ON.
Bit 3      1 = Interrupt triggered because the buffer is not full but the other side
           has stopped sending data (Time Out)
Bits 2-1   Reasons of interrupt:
           00 = Change in Modem Line Status
           01 = THR is Empty
           10 = Data is ready
           11 = Error in Data Transmit
Bit 0      Interrupt Triggered = 1
BIOS Support for COM Ports
INT # 14H
Service #0 = Set communication parameters
Service #01 = Output characters
Service #02 = Read in characters
Service #03 = Get port status
The following slide shows how int 14H service 0 can be used to set the line parameters of
the UART or COM port. It illustrates the various bits of AL that should be
set accordingly before calling this service.
Service #0
AL =
Bits 7-5   Baud Rate
           000 = 110 bauds
           001 = 150 bauds
           010 = 300 bauds
           011 = 600 bauds
           100 = 1200 bauds
           101 = 2400 bauds
           110 = 4800 bauds
           111 = 9600 bauds
Bits 4-3   Parity Check
           00 = None
           01 = Odd
           10 = Parity Disable
           11 = Even
Bit 2      # of stop bits
           0 = 1 stop bit
           1 = 1.5 or 2 stop bits
Bits 1-0   Data Length
           00 = 5 bits
           01 = 6 bits
           10 = 7 bits
           11 = 8 bits
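The AL value for service 0 can be packed from the four field codes. This is a sketch: int14_params() is a hypothetical helper, and each argument is the binary code from the table above rather than the actual baud rate or bit count.

```c
/* Packs the service 0 parameter byte:
   baud_code in bits 7-5, parity_code in bits 4-3,
   stop_code in bit 2, length_code in bits 1-0. */
unsigned char int14_params(unsigned char baud_code, unsigned char parity_code,
                           unsigned char stop_code, unsigned char length_code)
{
    return (unsigned char)(((baud_code & 0x07) << 5) |
                           ((parity_code & 0x03) << 3) |
                           ((stop_code & 0x01) << 2) |
                           (length_code & 0x03));
}
```

For example, int14_params(2, 2, 1, 3) gives 57H, the value that the initialize() function of the BIOS sample program in this chapter places in AL: 300 bauds, parity code 10, stop bit code 1 and 8 data bits.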
On return the service places the line status in the AH register, as shown in the slide below.
AH = Line Status
Bit 7   Time Out
Bit 6   TSR Empty
Bit 5   THR Empty
Bit 4   Break Detected
Bit 3   Framing error
Bit 2   Parity error
Bit 1   Overrun error
Bit 0   Data Ready
It places the modem status in the AL register, as shown in the slide below.
AL = Modem Status
Bit 7   CD
Bit 6   RI
Bit 5   Ready (DSR)
Bit 4   Ready to Receive
Bit 3   Change in CD
Bit 2   Change in RI
Bit 1   Change in DSR
Bit 0   Change in CTS
Other services of int 14H include service #1, which is used to send a byte, and service #2,
which is used to receive a byte, as shown in the slide below.
Service #01
    On entry:  AL = ASCII character to send
    On return: AH = Error Code
               If 7th bit in AH = 1 = Unsuccessful
                                  0 = Successful
Service #02
    On return: AL = ASCII character received
               AH = Error Code
Communication through Modem
[Slide: Two PCs, each connected to a modem, with the modems joined by a telephone line.]
A modem is generally used to send and receive data over an analog telephone line. Had the
telephone line been purely digital there would have been no need for a modem in this form. If data is
to be transferred from one computer to another through some medium which
can carry digital data, the modem can be eliminated and the UARTs of both computers can be
interconnected directly. Such an arrangement is called a NULL modem.
[Slide: NULL Modem. PC connected directly to PC.]
NULL Modem Configuration
Signal  Pin        Signal  Pin
CD      1          CD      1
RxD     2          RxD     2
TxD     3          TxD     3
DTR     4          DTR     4
GND     5          GND     5
DSR     6          DSR     6
RTS     7          RTS     7
CTS     8          CTS     8
RI      9          RI      9
The above slide shows the configuration used to interconnect two UARTs. In this way
full duplex communication can be performed, and flow control can also be
performed using the DSR, DTR, RTS and CTS signals.
16 - COM Ports II
Sample Program using BIOS routines
Example:
#include <BIOS.H>
#include <DOS.H>
char ch1, ch2;
void initialize(int pno)
{
    _AH = 0;
    _AL = 0x57;
    _DX = pno;
    geninterrupt(0x14);
}
char receivechar(int pno)
{
    char ch;
    _DX = pno;
    _AH = 2;
    geninterrupt(0x14);
    ch = _AL;
    return ch;
}
The initialize() function uses BIOS services to initialize the COM port whose number is passed
as parameter. The receivechar() function uses the COM port number to receive a
byte from the COM port using BIOS services.
void sendchar(char ch, int pno)
{
    _DX = pno;
    _AH = 1;
    _AL = ch;
    geninterrupt(0x14);
}
unsigned int getcomstatus(int pno)
{
    unsigned int temp;
    _DX = pno;
    _AH = 03;
    geninterrupt(0x14);
    *((char *)(&temp)) = _AL;          /* low byte: modem status */
    *(((char *)(&temp)) + 1) = _AH;    /* high byte: line status */
    return temp;
}
The sendchar() function uses a BIOS service to send a character to the COM port whose
number is passed as parameter. The getcomstatus() function retrieves the status of the
COM port whose number has been specified and returns the modem and line status in an
unsigned int.
void main()
{
    unsigned int i;
    while (1)
    {
        i = getcomstatus(0);
        if (((*(((char *)(&i)) + 1) & 0x20) == 0x20) && (kbhit()))
        {
            ch1 = getche();
            sendchar(ch1, 0);
        }
        if ((*(((char *)(&i)) + 1) & 0x01) == 0x01)
        {
            ch2 = receivechar(0);
            putch(ch2);
        }
        if ((ch1 == 27) || (ch2 == 27))
            break;
    }
}
Let's suppose two UARTs are interconnected using a NULL modem.
In the main() function there is a while loop which retrieves the status of the COM port.
Once the status has been retrieved, it checks whether a byte can be transmitted. If a key has been
pressed and it is clear to send a byte, the code within the if statement sends the input
byte to the COM port using the sendchar() function.
The second if statement checks whether a byte can be read from the COM port. If the
Data Ready bit is set then it receives a byte from the data port and displays it on the
screen. Moreover, there is another check to end the program. The program looks for an
escape character (ASCII 27) either in the input or in the output. If one is found it simply
breaks the loop.
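The pointer casts used in main() to reach the high byte of the status word can also be written with shifts and masks. The helpers below are a sketch with hypothetical names; they assume the packing done by getcomstatus(), with AL (modem status) in the low byte and AH (line status) in the high byte.

```c
/* Split the value returned by getcomstatus() into its two halves. */
unsigned char line_status(unsigned int status)
{
    return (unsigned char)((status >> 8) & 0xFF);   /* AH */
}

unsigned char modem_status(unsigned int status)
{
    return (unsigned char)(status & 0xFF);          /* AL */
}

/* Bit 5 of the line status (0x20) is the bit main() tests before sending. */
int can_send(unsigned int status)
{
    return (line_status(status) & 0x20) != 0;
}

/* Bit 0 of the line status: a received byte is waiting. */
int data_ready(unsigned int status)
{
    return (line_status(status) & 0x01) != 0;
}
```

With these, the two tests in main() would read can_send(i) && kbhit() and data_ready(i).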
Sample Program
This program does more or less the same as the previous program. The only
difference is that in this case the I/O is done directly using the ports, and the self
test facility is used to check the software.
#include <dos.h>
#include <bios.h>
void initialize(unsigned int far *com)
{
    outportb((*com) + 3, inportb((*com) + 3) | 0x80);   /* bit 7 = 1: load divisor */
    outportb((*com), 0x80);                             /* divisor low byte */
    outportb((*com) + 1, 0x01);                         /* divisor high byte */
    outportb((*com) + 3, 0x1b);                         /* line parameters, bit 7 = 0 */
}
void SelfTestOn(unsigned int far *com)
{
    outportb((*com) + 4, inportb((*com) + 4) | 0x10);
}
The initialize() function loads the divisor value 0x0180, high byte in base + 1 and low byte
in base + 0. It also programs the line control register with all the required line parameters. The
SelfTestOn() function simply enables the self test facility within the modem control
register.
void SelfTestOff(unsigned int far *com)
{
    outportb((*com) + 4, inportb((*com) + 4) & 0xEF);
}
void writechar(char ch, unsigned int far *com)
{
    while (!((inportb((*com) + 5) & 0x20) == 0x20));
    outportb(*com, ch);    /* byte write, so that base + 1 is not overwritten */
}
char readchar(unsigned int far *com)
{
    while (!((inportb((*com) + 5) & 0x01) == 0x01));
    return inportb(*com);
}
The SelfTestOff() function turns this facility off. The writechar() function writes the
byte passed to it to the data port. The readchar() function reads a byte from
the data port.
unsigned int far *com = (unsigned int far *)0x00400000;
void main()
{
    char ch = 0; int i = 1; int j = 1;
    char ch2 = 'A';
    initialize(com);
    SelfTestOn(com);
    clrscr();
    while (ch != 27)
    {
        if (i == 80)
        {
            j++;
            i = 0;
        }
The main function, after initializing and turning the self test mode on, enters a loop which
terminates on input of the escape character. This loop also controls the position of the
cursor such that the cursor goes to the next line right after a full line has been typed.
        if (j == 13)
            j = 0;
        gotoxy(i, j);
        ch = getche();
        writechar(ch, com);
        ch2 = readchar(com);
        gotoxy(i, j+14);
        putch(ch2);
        i++;
    }
    SelfTestOff(com);
}
All the input from the keyboard is directed to the output of the UART, and all the input
from the UART is directed to the lower part of the screen. As the UART is in self
test mode, the output becomes the input. Hence the user can see the output sent to the
UART in the lower part of the screen, as shown in the slide below.
hellohowru? whatsnewaboutsystemsprogramming?
hellohowru? whatsnewaboutsystemsprogramming?
Sample Program using interrupt driven I/O
#include <dos.h>
#include <bios.h>
void initialize(unsigned int far *com)
{
    outportb((*com)+3, inportb((*com)+3) | 0x80);
    outportb((*com), 0x80);
    outportb((*com)+1, 0x01);
    outportb((*com)+3, 0x1b);
}
void SelfTestOn(unsigned int far *com)
{
    outportb((*com)+4, inportb((*com)+4) | 0x18);
}
void SelfTestOff(unsigned int far *com)
{
    outportb((*com)+4, inportb((*com)+4) & 0xE7);
}
void writechar(char ch, unsigned int far *com)
{
    //while(!((inportb((*com)+5)&0x20)==0x20));
    outportb(*com, ch);
}
char readchar(unsigned int far *com)
{
    //while(!((inportb((*com)+5)&0x01)==0x01));
    return inportb(*com);
}
This program is also quite similar to the previous one. The only difference is that here
the I/O is performed in an interrupt driven pattern using Int 0x0C, as COM1 uses
IRQ4. To use it in this way, IRQ4 must be unmasked in the IMR register of the
PIC. Also, before returning from the ISR, the PIC must be sent an EOI code.
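void interrupt newint()
{
    ch = readchar(com);
    if (i == 80)
    {
        j++;
        i = 0;
    }
    if (j == 13)
        j = 0;
    k = i*2 + (j+14)*80*2;      /* offset of the character cell in video memory */
    *(scr+k) = ch;
    i++;
    outportb(0x20, 0x20);       /* EOI to the PIC; byte write so the IMR at 0x21 is not touched */
}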
C:\>DEBUG
-o 3f8 41
-o 3f8 42
-o 3f8 56
-o 3f8 55
-q
C:\>
#include <bios.h>
#include <dos.h>
void interrupt (*oldint)();
void interrupt newint();
unsigned char far *scr = (unsigned char far *)0xB8000000;
/* COM1 base address pointer, declared as in the earlier listings */
unsigned int far *com = (unsigned int far *)0x00400000;
void initialize(unsigned int far *com)
{
    outportb((*com)+3, inportb((*com)+3) | 0x80);
    outportb((*com), 0x80);
    outportb((*com)+1, 0x01);
    outportb((*com)+3, 0x1b);
}
void main(void)
{
    oldint = getvect(0x0C);
    setvect(0x0C, newint);
    initialize(com);
    outportb((*com)+4, inportb((*com)+4) | 0x08);
    outportb(0x21, inportb(0x21) & 0xEF);     /* unmask IRQ4 in the IMR */
    outportb((*com)+1, 1);                    /* enable the received data interrupt */
    keep(0, 1000);
}
void interrupt newint()
{
    *scr = inportb(*com);
    outportb(0x20, 0x20);                     /* EOI to the PIC */
}
17 - Real Time Clock (RTC)
Sample Program
#include <dos.h>
#include <bios.h>
#include <conio.h>
char ch1, ch2;
void initialize(unsigned int com)
{
    outportb(com+3, inportb(com+3) | 0x80);
    outportb(com, 0x80);
    outportb(com+1, 0x01);
    outportb(com+3, 0x1b);
}
void main()
{
    initialize(0x3f8);
    while (1)
    {
        /* send a key if the UART can transmit and a key has been pressed */
        if (((inportb(0x3fd) & 0x20) == 0x20) && (kbhit()))
        {
            ch1 = getche();
            outportb(0x3f8, ch1);
        }
        /* display a byte if one has been received */
        if (((inportb(0x3fd) & 0x01) == 1))
        {
            ch2 = inportb(0x3f8);
            putch(ch2);
        }
        if ((ch1 == 27) || (ch2 == 27))
            break;
    }
}
This program is functionally the same as one of the previous programs which used BIOS
services to get the input data and send the output data. The only difference is that in this
case it does the same by directly accessing the ports.
NULL Modem (Revisited)
CD  1        CD  1
RxD 2        RxD 2
TxD 3        TxD 3
DTR 4        DTR 4
GND 5        GND 5
DSR 6        DSR 6
RTS 7        RTS 7
CTS 8        CTS 8
RI  9        RI  9
Only two or three of the lines are used to send and receive data; the rest of the lines
are used for flow control. The cost of these lines can be reduced by reducing the
lines used for flow control and incorporating software oriented flow control rather
than hardware oriented flow control, as shown in the slide below.
NULL Modem (Revisited)
CD  1        CD  1
RxD 2        RxD 2
TxD 3        TxD 3
DTR 4        DTR 4
GND 5        GND 5
DSR 6        DSR 6
RTS 7        RTS 7
CTS 8        CTS 8
RI  9        RI  9
The DTR, DSR, RTS and CTS lines have been eliminated to reduce cost, so in this case flow
control will be performed in a software oriented manner.
Software Oriented Flow Control
Makes use of two control characters.
– XON (^Q, ASCII 17)
– XOFF (^S, ASCII 19)
XON, whenever received, indicates the start of communication, and XOFF, whenever
received, indicates a temporary pause in the communication.
Following is pseudo code which can be used to implement software oriented flow
control.
while (1)
{
    receivedchar = readchar(com);
    if (receivedchar == XON)
    {
        ReadStatus = TRUE;
        continue;
    }
    if (receivedchar == XOFF)
    {
        ReadStatus = FALSE;
        continue;
    }
    if (ReadStatus == TRUE)
        Buf[i++] = receivedchar;
}
The received character is first analysed for an XON or XOFF character. If XON is received
the status is set to TRUE, and if XOFF is received the status is set to FALSE.
The characters will only be stored if the status is TRUE; otherwise they will be
discarded.
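The pseudo code can be tried without a UART by running the same logic over an array of received bytes. The following is only a sketch: the function name, the output capacity check and the initial status of TRUE are assumptions made for illustration, and the XON/XOFF values are the standard ASCII control codes 17 and 19.

```c
#include <stddef.h>

#define XON  0x11   /* ASCII 17 (Ctrl-Q) */
#define XOFF 0x13   /* ASCII 19 (Ctrl-S) */

/* Copies data bytes from in[] to out[] while the read status is TRUE.
   XON and XOFF only switch the status and are never stored.
   Returns the number of bytes stored in out[]. */
size_t filter_xon_xoff(const unsigned char *in, size_t n,
                       unsigned char *out, size_t out_cap)
{
    int read_status = 1;   /* assumption: reception starts enabled */
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (in[i] == XON)  { read_status = 1; continue; }
        if (in[i] == XOFF) { read_status = 0; continue; }
        if (read_status && count < out_cap)
            out[count++] = in[i];
    }
    return count;
}
```

For the input a, b, XOFF, c, d, XON, e the function stores only a, b and e, because c and d arrive while the status is FALSE.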
Real Time Clock
Time Updation Through INT 8
Real Time Clock Device
• Battery powered device
• Updates time even if PC is shut down
• RTC has 64 byte battery powered RAM
• INT 1AH used to get/set time.
The real time clock is a device incorporated into the PC to update the time even if the
computer is off. It has the characteristics shown in the slide above, which enable it to
keep updating the time while the computer is off.
The BIOS interrupt 1AH can be used to configure this clock. As shown in the slides
below, it has various services for getting/setting the time, date and alarm.
AL = 1 if Midnight passed
AL = 0 if Midnight not passed
Set Clock Counter 1AH/01
ON ENTRY
AH = 01
CX = Clock count (Hi word)
DX = Clock count (Low word)
Read Time 1AH/02
ON ENTRY
AH = 02
ON EXIT
CH = Hours (BCD)
CL = Minutes (BCD)
DH = Seconds (BCD)
Disable Alarm 1AH/07
ON ENTRY
AH = 07
Read Alarm 1AH/09
ON ENTRY
AH = 09
ON EXIT
CH = Hours (BCD)
CL = Minutes (BCD)
DH = Seconds (BCD)
DL = Alarm Status (00 = Not Enable, 01 = Enable)
RTC internals
Real Time Clock
7FH
The RTC internally has an array of registers which can be used to access the 64 byte
battery powered CMOS RAM.
Internal Ports
70–7FH (16 ports)
Only 70 & 71H are important from
programming point of view
The following slide shows the function of some of the bytes in the battery powered RAM
used to store the units of time and date.
64 Byte Battery Powered RAM
00H = Current Second
01H = Alarm Second
02H = Current Minute
03H = Alarm Minute
04H = Current Hour
05H = Alarm Hour
06H = Day Of the Week
07H = Number Of Day
08H = Month
09H = Year
0AH = Clock Status Register A
0BH = Clock Status Register B
0CH = Clock Status Register C
0DH = Clock Status Register D
32H = Century
Day of the week
Week Day
01H = Sunday
02H = Monday
03H = Tuesday
04H = Wednesday
05H = Thursday
06H = Friday
07H = Saturday
The value in the day of the week byte indicates the day according to the slide shown above.
Generally BCD values are used to represent the units of time and date.
Year
No. of Century and Year are in BCD.
Accessing the Battery Powered RAM
The following slide shows fragments of code that can be used to read or write any byte
within the 64 byte battery powered RAM.
Reading the current second (byte 00H):
    outport(0x70,0);
    sec=inport(0x71);
Writing the current hour (byte 04H):
    outport(0x70,4);
    outport(0x71,hrs);
Clock Status Registers
Status Register A
Bits 3-0: Interrupt frequency
Bits 6-4: Time frequency
Bit 7: 0 = Time is not updated, 1 = Time is updated
The lower 4 bits of this register store a code indicating the frequency with which
the RTC hardware interrupt can interrupt the processor. The next field is used to specify
the time frequency, i.e. the frequency with which the time is sampled and hence updated.
The most significant bit indicates whether, after time sampling, the time has been updated
into the 64 byte RAM or not.
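The three fields of status register A can be separated with shifts and masks. The sketch below works on a byte value passed in by the caller, so it does not touch ports 70H and 71H; the structure and function names are hypothetical.

```c
/* Fields of RTC status register A as described above. */
struct status_a {
    unsigned interrupt_freq;   /* bits 3-0: interrupt frequency code */
    unsigned time_freq;        /* bits 6-4: time frequency code */
    unsigned updated;          /* bit 7: time update indication */
};

/* Splits a status register A value into its fields. */
struct status_a decode_status_a(unsigned char value)
{
    struct status_a f;
    f.interrupt_freq = value & 0x0F;
    f.time_freq      = (value >> 4) & 0x07;
    f.updated        = (value >> 7) & 0x01;
    return f;
}
```

For example, the value 0xA6 decodes to an interrupt frequency code of 6, a time frequency code of 2 and the most significant bit set.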
18 - Real Time Clock (RTC) II
Clock Status Registers
Status Register B
Bit 7: Update time
Bit 6: Call periodic interrupt
Bit 5: Call alarm interrupt
Bit 4: Call interrupt on time update
Bit 3: Block generator
Bit 2: Time & date format (0 = BCD, 1 = Binary)
Bit 1: 24/12-hour counter (0 = 12 hour format, 1 = 24 hour format)
Bit 0: Daylight saving time
Status Register C
Bit 6: 1 = Periodic interrupt call
Bit 5: 1 = Alarm time reached
Bit 4: 1 = Time update complete
Status register C is used to identify the reason of interrupt generation as described in the
slide above.
Status Register D
Bit 7: 0 = Battery Dead
Only the most significant bit in status register D is important; when it is 0 it indicates
that the battery is dead.
Sample Program.
#include <stdio.h>
#include <conio.h>
#include <dos.h>
void main()
{
    unsigned int hours, minutes, seconds;
    _AH = 2;
    geninterrupt(0x1a);
    hours = _CH;
    minutes = _CL;
    seconds = _DH;
    /* unpack each packed BCD byte into two digits, then add 0x3030 to get ASCII */
    hours = hours << 4;
    *((unsigned char *)(&hours)) =
        (*((unsigned char *)(&hours))) >> 4;
    hours = hours + 0x3030;
    seconds = seconds << 4;
    *((unsigned char *)(&seconds)) =
        (*((unsigned char *)(&seconds))) >> 4;
    seconds = seconds + 0x3030;
    minutes = minutes << 4;
    *((unsigned char *)(&minutes)) =
        (*((unsigned char *)(&minutes))) >> 4;
    minutes = minutes + 0x3030;
    clrscr();
    printf("%c%c-%c%c-%c%c",
        *(((unsigned char *)(&hours))+1),
        *((unsigned char *)(&hours)),
        *(((unsigned char *)(&minutes))+1),
        *((unsigned char *)(&minutes)),
        *(((unsigned char *)(&seconds))+1),
        *((unsigned char *)(&seconds)));
    getch();
}
The above program uses the service int 1AH/02H to read the time from the real
time clock. It reads the time and converts the packed BCD values into unpacked BCD
values. These values are then converted into ASCII and displayed using the printf()
statement.
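The same packed BCD to ASCII conversion can be written without the pointer casts. The following is a portable sketch that takes the BCD bytes as arguments rather than reading them from the BIOS; the function names are hypothetical.

```c
#include <stdio.h>

/* Unpacks one packed BCD byte into two ASCII digits plus a terminator. */
void bcd_to_ascii(unsigned char bcd, char out[3])
{
    out[0] = (char)('0' + ((bcd >> 4) & 0x0F));   /* tens digit */
    out[1] = (char)('0' + (bcd & 0x0F));          /* units digit */
    out[2] = '\0';
}

/* Formats three packed BCD values as "hh:mm:ss"; dest needs 9 bytes. */
void format_bcd_time(unsigned char hrs, unsigned char mins,
                     unsigned char secs, char dest[9])
{
    char hh[3], mm[3], ss[3];
    bcd_to_ascii(hrs, hh);
    bcd_to_ascii(mins, mm);
    bcd_to_ascii(secs, ss);
    sprintf(dest, "%s:%s:%s", hh, mm, ss);
}
```

Calling format_bcd_time(0x23, 0x59, 0x07, buf) fills buf with the text 23:59:07. The lecture's version reaches the same digits by shifting the value left by 4, shifting the low byte back right by 4, and adding 0x3030.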
Read time from RTC (Sample Program)
This sample program directly accesses the 64 byte RAM to read the units of time.
Before reading the time it checks the most significant bit of status register A for
time update completion. Once the updation is complete, the time can be read from the
respective registers in the 64 byte RAM.
#include <stdio.h>
#include <conio.h>
#include <bios.h>
#include <dos.h>
void main()
{
    int hrs, mins, secs;
    char temp;
    /* wait until status register A reports that the time has been updated */
    do {
        outportb(0x70, 0x0a);
        temp = inportb(0x71);
    } while ((temp & 0x80) == 0);
    outportb(0x70, 0);
    secs = inportb(0x71);
    outportb(0x70, 2);
    mins = inportb(0x71);
    outportb(0x70, 4);
    hrs = inportb(0x71);
    hrs = hrs << 4;
    *((unsigned char *)(&hrs)) =
        (*((unsigned char *)(&hrs))) >> 4;
    hrs = hrs + 0x3030;
    mins = mins << 4;
    *((unsigned char *)(&mins)) =
        (*((unsigned char *)(&mins))) >> 4;
    mins = mins + 0x3030;
    secs = secs << 4;
    *((unsigned char *)(&secs)) =
        (*((unsigned char *)(&secs))) >> 4;
    secs = secs + 0x3030;
    clrscr();
    printf("%c%c:%c%c:%c%c",
        *(((unsigned char *)(&hrs))+1),
        *((unsigned char *)(&hrs)),
        *(((unsigned char *)(&mins))+1),
        *((unsigned char *)(&mins)),
        *(((unsigned char *)(&secs))+1),
        *((unsigned char *)(&secs)));
    getch();
}
The time units are similarly read, converted to ASCII and displayed.
Write the Time on RTC
#include <bios.h>
#include <dos.h>
unsigned char ASCIItoBCD(char hi, char lo)
{
    hi = hi - 0x30;
    lo = lo - 0x30;
    hi = hi << 4;
    hi = hi | lo;
    return hi;
}
unsigned long int far *tm =
    (unsigned long int far *)0x0040006c;
void main()
{
    unsigned char hrs, mins, secs;
    char ch1, ch2;
    puts("\nEnter the hours to update:");
    ch1 = getche();
    ch2 = getch();
    hrs = ASCIItoBCD(ch1, ch2);
    puts("\nEnter the minutes to update:");
    ch1 = getche();
    ch2 = getch();
    mins = ASCIItoBCD(ch1, ch2);
    puts("\nEnter the seconds to update:");
    ch1 = getche();
    ch2 = getch();
    secs = ASCIItoBCD(ch1, ch2);
    *tm = 0;
    _CH = hrs;
    _CL = mins;
    _DH = secs;
    _DL = 0;
    _AH = 3;
    geninterrupt(0x1a);
    puts("Time Updated");
}
The above listing of the program inputs the time from the user in ASCII format. It
converts the ASCII into packed BCD and uses BIOS services to update the time. In
DOS or Windows this time change may not remain effective after the completion of the
program, as the DOS or Windows device drivers will revert the time to the original even if
it has been changed using this method.
Sample Program
#include <stdio.h>
#include <conio.h>
#include <bios.h>
#include <dos.h>
unsigned char ASCIItoBCD(unsigned char hi, unsigned char lo)
{
    hi = hi - 0x30;
    lo = lo - 0x30;
    hi = hi << 4;
    hi = hi | lo;
    return hi;
}
void main()
{
    unsigned int hrs, mins, secs;
    char ch1, ch2;
    int temp;
    puts("\nEnter the hours to update:");
    ch1 = getche();
    ch2 = getche();
    hrs = ASCIItoBCD(ch1, ch2);
    puts("\nEnter the minutes to update:");
    ch1 = getche();
    ch2 = getche();
    mins = ASCIItoBCD(ch1, ch2);
    puts("\nEnter the seconds to update:");
    ch1 = getche();
    ch2 = getche();
    secs = ASCIItoBCD(ch1, ch2);
    /* set bit 7 of status register B before writing the time */
    outportb(0x70, 0x0b);
    temp = inportb(0x71);
    temp = temp | 0x80;
    outportb(0x70, 0x0b);
    outportb(0x71, temp);
    /* write seconds, minutes and hours; byte writes keep the index and data ports separate */
    outportb(0x70, 0);
    outportb(0x71, secs);
    outportb(0x70, 2);
    outportb(0x71, mins);
    outportb(0x70, 4);
    outportb(0x71, hrs);
    /* clear bit 7 of status register B again */
    outportb(0x70, 0x0b);
    temp = inportb(0x71);
    temp = temp & 0x7f;
    outportb(0x70, 0x0b);
    outportb(0x71, temp);
    delay(30000);
    /* read the time back once status register A reports an update */
    do {
        outportb(0x70, 0x0a);
        temp = inportb(0x71);
    } while ((temp & 0x80) == 0);
    outportb(0x70, 0);
    secs = inportb(0x71);
    outportb(0x70, 2);
    mins = inportb(0x71);
    outportb(0x70, 4);
    hrs = inportb(0x71);
    hrs = hrs << 4;
    *((unsigned char *)(&hrs)) =
        (*((unsigned char *)(&hrs))) >> 4;
    hrs = hrs + 0x3030;
    mins = mins << 4;
    *((unsigned char *)(&mins)) =
        (*((unsigned char *)(&mins))) >> 4;
    mins = mins + 0x3030;
    secs = secs << 4;
    *((unsigned char *)(&secs)) =
        (*((unsigned char *)(&secs))) >> 4;
    secs = secs + 0x3030;
    printf("\nUpdated time is = %c%c:%c%c:%c%c",
        *(((unsigned char *)(&hrs))+1),
        *((unsigned char *)(&hrs)),
        *(((unsigned char *)(&mins))+1),
        *((unsigned char *)(&mins)),
        *(((unsigned char *)(&secs))+1),
        *((unsigned char *)(&secs)));
    getch();
}
19 - Real Time Clock (RTC) III
Reading the Date
#include <bios.h>
#include <dos.h>
void main()
{
    unsigned int cen, yrs, mons, days;
    _AH = 4;
    geninterrupt(0x1a);
    cen = _CH;
    yrs = _CL;
    mons = _DH;
    days = _DL;
    cen = cen << 4;
    *((unsigned char *)(&cen)) =
        (*((unsigned char *)(&cen))) >> 4;
    cen = cen + 0x3030;
    mons = mons << 4;
    *((unsigned char *)(&mons)) =
        (*((unsigned char *)(&mons))) >> 4;
    mons = mons + 0x3030;
    yrs = yrs << 4;
    *((unsigned char *)(&yrs)) =
        (*((unsigned char *)(&yrs))) >> 4;
    yrs = yrs + 0x3030;
    days = days << 4;
    *((unsigned char *)(&days)) =
        (*((unsigned char *)(&days))) >> 4;
    days = days + 0x3030;
    clrscr();
    printf("%c%c-%c%c-%c%c%c%c",
        *(((unsigned char *)(&days))+1),
        *((unsigned char *)(&days)),
        *(((unsigned char *)(&mons))+1),
        *((unsigned char *)(&mons)),
        *(((unsigned char *)(&cen))+1),
        *((unsigned char *)(&cen)),
        *(((unsigned char *)(&yrs))+1),
        *((unsigned char *)(&yrs)));
    getch();
}
Setting the Date
unsigned char ASCIItoBCD(char hi, char lo)
{
    hi = hi - 0x30;
    lo = lo - 0x30;
    hi = hi << 4;
    hi = hi | lo;
    return hi;
}
void main()
{
    unsigned char yrs, mons, days, cen;
    char ch1, ch2;
    puts("\nEnter the century to update:");
    ch1 = getche();
    ch2 = getche();
    cen = ASCIItoBCD(ch1, ch2);
    puts("\nEnter the yrs to update:");
    ch1 = getche();
    ch2 = getche();
    yrs = ASCIItoBCD(ch1, ch2);
    puts("\nEnter the month to update:");
    ch1 = getche();
    ch2 = getche();
    mons = ASCIItoBCD(ch1, ch2);
    puts("\nEnter the days to update:");
    ch1 = getche();
    ch2 = getche();
    days = ASCIItoBCD(ch1, ch2);
    _CH = cen;
    _CL = yrs;
    _DH = mons;
    _DL = days;
    _AH = 5;
    geninterrupt(0x1a);
    puts("Date Updated");
}
The above sample program takes ASCII input from the user for the new date. After taking
all the date units as input, the program sets the new date using the BIOS service 1AH/05H.
Setting the Alarm
void interrupt (*oldint)();
void interrupt newint();
unsigned int far *scr = (unsigned int far *)0xb8000000;
void main()
{
    oldint = getvect(0x4a);
    setvect(0x4a, newint);
    _AH = 6;
    _CH = 0x23;
    _CL = 0x50;
    _DH = 0;
    geninterrupt(0x1a);
    keep(0, 1000);
}
void interrupt newint()
{
    *scr = 0x7041;
    sound(0x21ff);
}
The alarm can be set using BIOS function 1AH/06H. Once the alarm is set, BIOS will
generate interrupt 4AH when the alarm time is reached. The above program
intercepts interrupt 4AH such that the newint() function is invoked at the time of the alarm.
The newint() function will just display a character 'A' in the upper left corner of the
screen. However, this program may not work in the presence of DOS or Windows drivers.
Another way to set Alarm
#include <stdio.h>
#include <conio.h>
#include <bios.h>
#include <dos.h>
void interrupt newint70();
void interrupt (*oldint70)();
unsigned int far *scr =
    (unsigned int far *)0xb8000000;
unsigned char ASCIItoBCD(char hi, char lo)
{
    hi = hi - 0x30;
    lo = lo - 0x30;
    hi = hi << 4;
    hi = hi | lo;
    return hi;
}
void main(void)
{
    int temp;
    unsigned char hrs, mins, secs;
    char ch1, ch2;
    puts("\nEnter the hours to update:");
    ch1 = getche();
    ch2 = getch();
    hrs = ASCIItoBCD(ch1, ch2);
    puts("\nEnter the minutes to update:");
    ch1 = getche();
    ch2 = getch();
    mins = ASCIItoBCD(ch1, ch2);
    puts("\nEnter the seconds to update:");
    ch1 = getche();
    ch2 = getch();
    secs = ASCIItoBCD(ch1, ch2);
    /* load the alarm second, minute and hour bytes */
    outportb(0x70, 1);
    outportb(0x71, secs);
    outportb(0x70, 3);
    outportb(0x71, mins);
    outportb(0x70, 5);
    outportb(0x71, hrs);
    /* enable the RTC interrupts in status register B without disturbing other bits */
    outportb(0x70, 0x0b);
    temp = inportb(0x71);
    temp = temp | 0x70;
    outportb(0x70, 0x0b);
    outportb(0x71, temp);
    oldint70 = getvect(0x70);
    setvect(0x70, newint70);
    keep(0, 1000);
}
void interrupt newint70()
{
    /* read status register C to find the reason for the interrupt */
    outportb(0x70, 0x0c);
    if ((inportb(0x71) & 0x20) == 0x20)
    {
        sound(0x21ff);
        *scr = 0x7041;
    }
    (*oldint70)();
}
This program takes the time of the alarm as ASCII input, which is first converted into BCD.
This BCD time is placed in the 64 byte RAM at the bytes which hold the alarm time.
Once the alarm time is loaded, status register B is accessed to enable the interrupts such
that the other bits are not disturbed. Whenever the RTC generates an interrupt, the reason
for the interrupt needs to be established. This can be done by checking the value of status
register C: if bit 5 of register C is set, it indicates that the interrupt was generated
because the alarm time has been reached. The reason for interrupt generation is established
in the function newint70(). If the interrupt was generated because of the alarm then the
speaker is turned on by the sound() function and a character 'A' is displayed in the upper
left corner of the screen.
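The check made in newint70() can be generalised to all three interrupt reasons reported by status register C. The sketch below operates on a byte value supplied by the caller; the function name, the macro names and the order in which the bits are tested are assumptions made for illustration.

```c
#define RTC_UPDATE_COMPLETE 0x10   /* bit 4: time update complete */
#define RTC_ALARM_REACHED   0x20   /* bit 5: alarm time reached */
#define RTC_PERIODIC        0x40   /* bit 6: periodic interrupt call */

/* Returns a short description of why the RTC raised an interrupt,
   given the value read from status register C. */
const char *rtc_interrupt_reason(unsigned char status_c)
{
    if (status_c & RTC_ALARM_REACHED)
        return "alarm";
    if (status_c & RTC_PERIODIC)
        return "periodic";
    if (status_c & RTC_UPDATE_COMPLETE)
        return "update";
    return "none";
}
```

An ISR such as newint70() would select register 0CH through port 70H, read the value from port 71H and pass it to a function of this kind.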
Other Configuration Bytes of Battery Powered RAM
Determining Systems Information
INT 11H
INT 12H
INT 11H
used to get hardware environment info.
On Entry
call 11H
On Exit
AX = System Info.
Interrupt 11H is used to determine the systems information. On return, this service places
the systems information in the AX register. The detail of the information in the AX register
is shown in the slide above.
Determining Systems Information
INT 12H
used for memory interfaced.
INT 15H/88H
Returns = No. of KB above 1MB mark.
Int 12H is used to determine the amount of conventional memory interfaced with the
processor in kilobytes. The amount of memory above conventional memory (extended
memory) can be determined using the service 15H/88H.
20 - Determining system information
Types of Processor
Determining the Processor Type
Flags register test to identify 8086
Bits 15-12: Unused in 8086
The above slides show the test that can be used to determine whether the underlying
processor is an 8086 or not. If it is not an 8086, a test for the 80286 should be performed.
Checking for 80286
If bits 14-12 are cleared on pushing the flags register then the processor is an 80286. This
can be checked as shown in the slide above.
Alignment Test (If Not 286)
Eflags bit 18: Alignment Check
Alignment Check:
mov dword ptr [12], EDX
In 32-bit processors it is more optimal in terms of speed if double words are placed at
addresses which are multiples of 4. If data items are placed at odd addresses, the access to
such data items is slower by virtue of the memory interface of such PCs. So it is more
optimal to assign such variables addresses which are multiples of 4. The 386 and 486 are
both 32 bit processors, but the 486 has an alignment check which the 386 does not have. This
property can be used to distinguish between the 386 and the 486. If the previous tests have
failed then the processor is neither an 8086 nor a 286. To eliminate the possibility of
it being a 386 we perform the alignment test. As shown in the slide above, bit 18 of the
EFLAGS register is the alignment bit; it is set if a double word is moved to an odd
address or an address which does not lie on a 4 byte boundary.
Alignment Test
pushfd
pop eax
mov ecx, eax
mov dword ptr [13], EDX
pushfd
pop eax
In the above slide a double word is moved to an odd address. If the processor is a 386 then
bit 18 of the EFLAGS register will not be set; it will be set if the processor is higher
than a 386.
Distinguishing between 486 and Pentium processors
CPUID Test
• 486 will pass the alignment test.
• To distinguish the 486 from the Pentium, the CPUID test is used.
A Pentium and a 486 will both pass the alignment test. But a 486 does not support
the CPUID instruction. We will next incorporate the CPUID instruction support test to
find whether the processor is a 486 or a Pentium, as the Pentium does support the CPUID
instruction.
CPUID Test
Eflags bit 21
• If a program can set and also clear bit 21 of Eflags, then the
processor supports the CPUID instruction.
• Set bit 21 of Eflags, read the value of Eflags and store it.
• Clear bit 21 of Eflags, read the value of Eflags.
• Compare both the values; if bit 21 has changed, the
CPUID instruction is available.
If the CPUID instruction is available the processor is a Pentium processor, otherwise it is a
486.
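The sequence of tests described in this lecture forms a simple decision chain. The sketch below captures only that chain: the outcome of each hardware test is passed in as a flag, since the tests themselves require assembly on the target machine. The enum, the function and the parameter names are hypothetical.

```c
enum cpu_class { CPU_8086, CPU_80286, CPU_80386, CPU_80486, CPU_PENTIUM };

/* Classifies the processor from the outcomes of the four tests:
   upper_bits_stay_set   - bits 15-12 of the flags read back as 1 after clearing
   bits_14_12_stay_clear - bits 14-12 read back as 0 after an attempt to set them
   alignment_bit_works   - bit 18 of EFLAGS can be changed
   id_bit_works          - bit 21 of EFLAGS can be set and cleared */
enum cpu_class classify_cpu(int upper_bits_stay_set,
                            int bits_14_12_stay_clear,
                            int alignment_bit_works,
                            int id_bit_works)
{
    if (upper_bits_stay_set)
        return CPU_8086;
    if (bits_14_12_stay_clear)
        return CPU_80286;
    if (!alignment_bit_works)
        return CPU_80386;
    if (!id_bit_works)
        return CPU_80486;
    return CPU_PENTIUM;
}
```

Each test is only meaningful once the earlier ones have ruled out the older processors, which is why the conditions are checked in this order.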
More about CPUID Instruction
CPUID Instruction
Before: EAX = 0
After the execution of instruction: EAX = 1
Vendor string order: EBX – EDX – ECX
EBX = “Genu”
EDX = “ineI”
ECX = “ntel”
The CPUID instruction, if available, returns the vendor name and information about the
model as shown in the slide above. Besides the other tests, the CPUID instruction can also
be used by the software to identify the vendor name.
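To illustrate how the three registers combine into the vendor name, the sketch below assembles the string from register values passed in as arguments, so it runs on any machine without executing CPUID. The function name is hypothetical; the byte order assumes that the lowest byte of each register holds the first character, as on x86.

```c
/* Builds the 12 character vendor string from the values returned in
   EBX, EDX and ECX (in that order) by CPUID with EAX = 0.
   out must have room for 13 bytes. */
void cpuid_vendor_string(unsigned long ebx, unsigned long edx,
                         unsigned long ecx, char out[13])
{
    unsigned long regs[3];
    int r, b;

    regs[0] = ebx;
    regs[1] = edx;
    regs[2] = ecx;
    for (r = 0; r < 3; r++)
        for (b = 0; b < 4; b++)   /* lowest byte of each register comes first */
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
    out[12] = '\0';
}
```

With EBX = 0x756E6547, EDX = 0x49656E69 and ECX = 0x6C65746E the function produces the string GenuineIntel, matching the three fragments shown in the slide.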
Testing for Coprocessor
Coprocessor Control Word
Bit 7: Interrupt enable flag
Bits 9-8: 11 after initialization signifies extended precision operation
The coprocessor control word contains some control information about the
coprocessor. Bit number 7 of the coprocessor control word is the Interrupt Enable Flag, and
bit numbers 8 and 9 should contain 11 on initialization.
Coprocessor Status Word
Bit 14 = C3, bit 10 = C2, bit 9 = C1, bit 8 = C0
C3 C2 C0
0  0  0   st > operand
0  0  1   st < operand
1  0  0   st = operand
The coprocessor status register stores the status of the coprocessor. Very much like
the flags register in the microprocessor, the coprocessor status word can be used to
determine the result of a comparison as shown in the slide.
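The comparison table can be applied to a status word value with a few shifts and masks. The sketch below works on a value supplied by the caller rather than one stored by the coprocessor; the function name and the returned strings are hypothetical.

```c
/* Interprets the condition code bits C3 (bit 14), C2 (bit 10) and
   C0 (bit 8) of a coprocessor status word after a comparison. */
const char *fpu_compare_result(unsigned int status_word)
{
    unsigned int c3 = (status_word >> 14) & 1;
    unsigned int c2 = (status_word >> 10) & 1;
    unsigned int c0 = (status_word >> 8) & 1;

    if (!c3 && !c2 && !c0)
        return "st > operand";
    if (!c3 && !c2 && c0)
        return "st < operand";
    if (c3 && !c2 && !c0)
        return "st = operand";
    return "other";   /* combinations not listed in the table above */
}
```

For instance, a status word of 0x4000 has only C3 set and therefore reports that st equals the operand.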
The following test can be performed to test the presence of a coprocessor.
To Check Coprocessor is Present
• Initialize
• Read Hi-Byte of Control register.
• If value in Hi-Byte is 3, then coprocessor is available, otherwise
it is absent.
Once it is established that the coprocessor is present, the model of the coprocessor
should be determined. In case an invalid numerical operation is requested, the 8087
coprocessor generates an interrupt, while the higher coprocessors do not use interrupts; in
fact they make use of exceptions. This feature can be used to distinguish between the 8087
and higher coprocessors as shown in the following slide. The higher coprocessors will not
respond to an attempt made to set the IEM flag, while the 8087 will respond.
Check for 8087 Coprocessor
• IEM can be set in 8087.
• IEM cannot be set in 80287, 80387 as they use exceptions to inform the
software about any invalid instruction.
• If an attempt to set this bit using FDISI fails then it implies it is not an
8087 coprocessor.
Distinguishing between 80287 and 80387
Distinguish between 80287 & 80387
• 80387 only allows to reverse the sign of infinity.
• Perform a division by zero.
• If the sign of the result can be reversed then the coprocessor is
80387.
If the sign of infinity can be reversed then the coprocessor is an 80387, otherwise it is an 80287.
Reading the Computer configuration
void PrintConfig(void)
{
    union REGS Register;
    BYTE AT;
    clrscr();
    AT = (peekb(0xF000,0xFFFE) == 0xFC);
    printf("Your PC Configuration\n");
    printf(" ------------------------------------------------- \n");
    printf("PC type             : ");
    switch (peekb(0xF000,0xFFFE))
    {
        case 0xFF: printf("PC\n"); break;
        case 0xFE: printf("XT\n"); break;
        default:   printf("AT or higher\n"); break;
    }
    printf("Conventional RAM    : ");
    int86(0x12, &Register, &Register);
    printf(" %d K\n", Register.x.ax);
    if (AT)
    {
        Register.h.ah = 0x88;
        int86(0x15, &Register, &Register);
        printf("Additional RAM      : %d K over 1 megabyte\n", Register.x.ax);
    }
    int86(0x11, &Register, &Register);
    printf("Default video mode: ");
    printf("Disk drives         : %d\n", (Register.x.ax >> 6 & 3) + 1);
    printf("Serial interfaces   : %d\n", Register.x.ax >> 9 & 0x03);
    printf("Parallel interfaces : %d\n\n", Register.x.ax >> 14);
}
void main()
{
    PrintConfig();
}
In this program the general configuration of the computer is read using interrupts
11H, 12H and 15H. First it is determined whether the computer is an AT (advanced technology,
all processors above the 8086) type computer or not. This can be done easily by checking the
signature byte placed at the location F000:FFFEH, which will contain neither 0xFF
nor 0xFE if it is an AT computer. The program shows the size of conventional RAM using
interrupt 12H. Then, if the computer is an AT computer, the program checks
the extended memory size using int 15H/88H and reports its size. Finally the
program calls int 11H to show the number and kind of I/O interfaces available.
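The bit fields that PrintConfig() extracts from AX can be checked in isolation. The sketch below reuses the same shifts and masks as the listing on a value supplied by the caller; the structure and function names are hypothetical, and the parallel field is additionally masked to two bits so that the result does not depend on the width of the argument.

```c
/* Counts taken from the value returned in AX by interrupt 11H,
   using the same shifts and masks as PrintConfig(). */
struct equipment {
    unsigned int disk_drives;
    unsigned int serial_interfaces;
    unsigned int parallel_interfaces;
};

struct equipment decode_equipment(unsigned int ax)
{
    struct equipment e;
    e.disk_drives         = ((ax >> 6) & 3) + 1;
    e.serial_interfaces   = (ax >> 9) & 0x03;
    e.parallel_interfaces = (ax >> 14) & 0x03;
    return e;
}
```

For the value 0x4440 this reports 2 disk drives, 2 serial interfaces and 1 parallel interface.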
21 - Keyboard Interface
Processor Identification
_getproc proc near
;== Determine whether model came before or after 80286 ====
    xor  ax,ax          ;Set AX to 0
    push ax             ;and push onto stack
    popf                ;Pop flag register off of stack
    pushf               ;Push back onto stack
    pop  ax             ;and pop off of AX
    and  ax,0f000h      ;Do not clear the upper four bits
    cmp  ax,0f000h      ;Are bits 12 - 15 all equal to 1?
    je   not_286_386    ;YES --> Not 80386 or 80286
In the above slide the test for the 8086 is performed by clearing all the bits of the flags
register, then reading its value by pushing the flags and popping them into AX. Bits 15-12 of
AX are checked; if they have been set then it is an 8086.
;-- Test for determining whether 80486, 80386 or 80286 ------
;-- The following test to differentiate between 80386 and ---
;-- 80486 is based on an extension of the EFlag register on
;-- the 80486 in bit position 18.
;-- The 80386 doesn't have this flag, which is why you
;-- cannot use software to change its contents.
The above slide further performs the test for the 80286 if the previous test fails. It sets
bits 14-12 of the flags and then again reads back the value of the flags through the stack.
If bits 14-12 have been cleared then it is an 80286.
    cli                     ;No interrupts now
    mov  ebx,offset array
    mov  [ebx],eax
    pushfd
    pop  eax
    mov  first,eax
    mov  [ebx+1],eax
    pushfd
    pop  eax
    shr  first,18
    shr  eax,18
    and  first,1
    and  eax,1
    cmp  first,eax
    inc  dl
    sti
    jne  pende
The above code performs the alignment test as discussed before by testing bit 18 after
addressing a double word at an odd address.
    pushfd
    pop  eax
    mov  temp,eax
    mov  eax,1
    shl  eax,21
    push eax
    popfd
    pushfd
    pop  eax
    shr  eax,21
    shr  temp,21
    cmp  temp,eax
    inc  dl
    je   pende
    jmp  pende              ;Test is ended
The above code performs a test to see whether the CPUID instruction is available or not, for
which bit number 21 of the flags is set and then read back.
pende label near            ;End testing
    pop  di                 ;Pop DI off of stack
    popf                    ;Pop flag register off of stack
    xor  dh,dh              ;Set high byte of proc. code to 0
    mov  ax,dx              ;Proc. code = return value of funct.
    ret                     ;Return to caller
_getproc endp               ;End of procedure
A CPUID Program
#include "stdafx.h"
#include <stdio.h>
#include <dos.h>
unsigned long int id[3];
unsigned char ch = '\0';
unsigned int steppingid;
unsigned int model, family, type1;
unsigned int cpcw;
int main(int argc, char* argv[])
{
    _asm xor eax,eax
    _asm cpuid
    _asm mov id[0],ebx;
    _asm mov id[4],edx;
    _asm mov id[8],ecx;
    printf(" %s\n", (char *)(id));
    _asm mov eax,1
    _asm cpuid
    _asm mov ecx,eax
    _asm AND eax,0xf;
    _asm mov steppingid,eax;
    _asm mov eax,ecx
    _asm shr eax,4
    _asm and eax,0xf;
    _asm mov model,eax
    _asm mov eax,ecx
    _asm shr eax,8
    _asm and eax,0xf
    _asm mov family,eax;
    _asm mov eax,ecx
    _asm shr eax,12
    _asm and eax,0x3;
    _asm mov type1,eax;
    printf("\nstepping is %d\nmodel is %d\nFamily is %d\nType is %d\n",
        steppingid, model, family, type1);
}
The above program places 0 in the eax register before issuing the CPUID instruction. The
string returned by the instruction is then stored and printed. Moreover, other information
about the family, model etc. is also printed.
Detecting a Co Processor
    _asm finit
    _asm mov byte ptr cpcw+1,0;
    _asm fstcw cpcw
    if (*(((char *)(&cpcw))+1) == 3)
        puts("Coprocessor found");
    else
        puts("Coprocessor not found");
After initialization the control word is read; if the higher byte contains the value 3, a
coprocessor is present.
_getco proc near
    mov  dx,co_none                  ;First assume there is no CP
    mov  byte ptr cs:wait1,NOP_CODE  ;WAIT-instruction on 8087
    mov  byte ptr cs:wait2,NOP_CODE  ;Replace by NOP
wait1: finit                         ;Initialize Cop
    mov  byte ptr cpz+1,0            ;Move high byte control word to 0
wait2: fstcw cpz                     ;Store control word
    cmp  byte ptr cpz+1,3            ;High byte control word = 3?
    jne  gcende                      ;No ---> No coprocessor
;-- Coprocessor exists. Test for 8087 -----------------------
    inc  dx
    and  cpz,0FF7Fh                  ;Mask interrupt enable mask flag
    fldcw cpz                        ;Load in the control word
    fdisi                            ;Set IEM flag
    fstcw cpz                        ;Store control word
    test cpz,80h                     ;IEM flag set?
    jne  gcende                      ;YES ---> 8087, end test
In the code above the IEM bit is set and then the value of the control word is read to analyse
the change in the control word. If the IEM bit (bit 7) is set then it is an 8087 coprocessor,
otherwise other tests must be performed.
;-- Test for 80287/80387 ------------------------------------
    inc  dx
    finit                   ;Initialize cop
    fld1                    ;Number 1 to cop stack
    fldz                    ;Number 0 to cop stack
    fdiv                    ;Divide 1 by 0, result to ST
    fld  st                 ;Move ST onto stack
    fchs                    ;Reverse sign in ST
    fcompp                  ;Compare and pop ST and ST(1)
    fstsw cpz               ;Store result from status word
    mov  ah,byte ptr cpz+1  ;in memory and move AX register
    sahf                    ;to flag register
    je   gcende             ;Zero-Flag = 1 : 80287
    inc  dx                 ;Not 80287, must be 80387 or inte-
                            ;grated coprocessor on 80486
gcende: mov ax,dx           ;Move function result to AX
    ret                     ;Return to caller
_getco endp
An operation which results in infinity (like division by zero) is performed. Then the
sign of the result is reversed. If it can be reversed then it is an 80387 coprocessor,
otherwise it is certainly an 80287.
Keyboard Interface
[Figure: the keyboard sends synchronous serial data to ports 60H and 64H; its interrupt request reaches the processor's INTR line through IRQ1 of the PIC.]
The keyboard interface, as discussed earlier, uses IRQ1 and port 60H as its data port. It
also uses port 64H as a status port. The keyboard performs synchronous serial
I/O.
Port 64H Status Register
Bit   Meaning when the bit is 1
7     Parity error
6     Time out error during input
5     Time out error during output
4     Keyboard active
1     Input buffer full
0     Output buffer full
The above slide shows the detailed meaning of the bits in port 64H.
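A status byte read from port 64H can be decoded with a few bit masks. The sketch below is plain C with no port access, so it only shows the decoding step. The mask names are hypothetical, and the bit positions are an assumption based on the usual PC keyboard controller layout (bit 0 output buffer full, bit 1 input buffer full, bit 4 keyboard active, bits 5 to 7 the error flags).

```c
#include <stdio.h>

/* Hypothetical mask names for the status byte of port 64H. */
#define KBD_OUTPUT_FULL   0x01  /* bit 0: a byte is waiting at port 60H  */
#define KBD_INPUT_FULL    0x02  /* bit 1: last written byte not yet taken */
#define KBD_ACTIVE        0x10  /* bit 4: keyboard active                */
#define KBD_TIMEOUT_OUT   0x20  /* bit 5: time out error during output   */
#define KBD_TIMEOUT_IN    0x40  /* bit 6: time out error during input    */
#define KBD_PARITY_ERROR  0x80  /* bit 7: parity error                   */

/* Returns 1 if any of the three error bits is set, otherwise 0. */
int kbd_has_error(unsigned char status)
{
    return (status & (KBD_TIMEOUT_OUT | KBD_TIMEOUT_IN | KBD_PARITY_ERROR)) != 0;
}

/* Prints one line per flag that is set in the status byte. */
void kbd_print_status(unsigned char status)
{
    if (status & KBD_OUTPUT_FULL)  puts("output buffer full");
    if (status & KBD_INPUT_FULL)   puts("input buffer full");
    if (status & KBD_ACTIVE)       puts("keyboard active");
    if (status & KBD_TIMEOUT_OUT)  puts("time out error during output");
    if (status & KBD_TIMEOUT_IN)   puts("time out error during input");
    if (status & KBD_PARITY_ERROR) puts("parity error");
}
```

On a DOS machine the argument would come from reading port 64H; here any byte value can be passed in to see how it decodes.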
Typematic Rate
Control byte layout: bit 7 = 0, bits 6-5 = Delay, bits 4-0 = Typematic Rate
Delay              Typematic Rate
00 = ¼ second      11111 = 2 char/s
01 = ½ second      11110 = 2.1 char/s
10 = ¾ second      11101 = 2.3 char/s
11 = 1 second      11010 = 3 char/s
                   ::::::::::::::::
                   00100 = 20 char/s
                   00011 = 21.8 char/s
                   00010 = 24 char/s
                   00001 = 26.7 char/s
                   00000 = 30 char/s
The typematic rate of the keyboard can be controlled by a control word as depicted in the
slide above. The delay and typematic rates need to be specified in this control word. The
delay indicates the delay between first and second character input whenever a key is
pressed. The timing of rest of the successive character inputs for the same key is
determined by the typematic rate.
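Building the control word from its two fields is a matter of shifting and masking. The helper below is a small sketch; the function name make_typematic_byte is hypothetical, and it assumes the layout of the slide (delay code in bits 6-5, rate code in bits 4-0, bit 7 left at 0).

```c
/*
 * delay_code : 0..3  (placed in bits 6-5)
 * rate_code  : 0..31 (placed in bits 4-0)
 * Bit 7 of the result is always 0.
 */
unsigned char make_typematic_byte(unsigned delay_code, unsigned rate_code)
{
    return (unsigned char)(((delay_code & 0x03u) << 5) | (rate_code & 0x1Fu));
}
```

With delay code 3 (1 second) and rate code 11111 (2 char/s) the helper yields 0x7F, the slowest setting; codes 0 and 0 yield 0x00, the fastest.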
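Receiving Bytes From the Keyboard
[Figure: input from the keyboard arrives at port 60H; a buffer-full bit in port 64H signals that a byte is waiting.]
The scan code of the input character is received at port 60H. A bit in port 64H of the
keyboard controller signals that this byte is waiting. It is bit 0, which the status register
calls output buffer full because the buffers are named from the controller's side. A device
driver can check this bit to see if a character has been received from the keyboard; the bit
turns to 1 when one arrives.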
Sending Bytes to the Keyboard
[Figure: the processor writes a byte to port 60H; a buffer-full bit in port 64H stays set until the keyboard takes the byte, and later the keyboard returns 0xFA to indicate successful transmission.]
Similarly, some data (control information) can be sent to the keyboard. The processor
writes the byte to port 60H. The device driver then checks bit 1 of port 64H (input buffer
full, again named from the controller's side), which remains set as long as the byte has not
been received by the keyboard. On receipt of the byte the keyboard writes the code 0xFA to
port 60H to indicate that the byte has been received properly.
22 - Keyboard Interface, DMA Controller
Using the described information we can design a protocol for correctly writing on the
keyboard device as described below.
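Keyboard Writing Protocol
• Wait until the input buffer is empty (bit 1 of port 64H is 0)
• Write the byte to port 60H
• Wait until the output buffer is full (bit 0 of port 64H is 1)
• Read port 60H and check the acknowledgement byte (0xFA)
• Repeat the process if it was previously
unsuccessful.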
The keyboard is typically an input device, but some data can also be sent to the keyboard
device. This data is used as control information by the keyboard. One such piece of
information is the typematic rate. The typematic rate can be conveyed to the keyboard as
described by the slide below.
Command for Writing Typematic Rate
0xF3
Means the typematic rate will be sent in the
next byte.
Other such control information is the LED status. Every keyboard has three LEDs
representing the status of Num Lock, Caps Lock and Scroll Lock. If the device driver
needs to change the status then the LED status byte should be written to the keyboard as
described below. But before writing this byte the keyboard should be told that the control
byte is about to be written. This is done by sending the code 0xED before sending the status
byte, using the above described protocol.
Keyboard LEDs
LED status byte
Bit 2 = Caps Lock
Bit 1 = Num Lock
Bit 0 = Scroll Lock
LED control byte = 0xED
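The LED status byte can be assembled from three on/off values. The sketch below assumes the usual bit assignment (bit 0 Scroll Lock, bit 1 Num Lock, bit 2 Caps Lock); the mask and function names are hypothetical.

```c
/* Hypothetical mask names for the LED status byte sent after 0xED. */
#define LED_SCROLL_LOCK 0x01  /* bit 0 */
#define LED_NUM_LOCK    0x02  /* bit 1 */
#define LED_CAPS_LOCK   0x04  /* bit 2 */

/* Each argument is treated as a boolean: nonzero switches that LED on. */
unsigned char make_led_byte(int scroll, int num, int caps)
{
    unsigned char b = 0;
    if (scroll) b |= LED_SCROLL_LOCK;
    if (num)    b |= LED_NUM_LOCK;
    if (caps)   b |= LED_CAPS_LOCK;
    return b;
}
```

Switching on all three LEDs gives the value 7, the byte that has to follow the 0xED command to light every LED.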
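Changing Typematic Rate
#include <stdio.h>
#include <dos.h>
#include <conio.h>
char st[80];
/* Sends one byte to the keyboard; returns 0 on success, 1 after maxtry failures. */
int SendKbdRate(unsigned char data, int maxtry)
{
    unsigned char ch;
    do {
        do {
            ch = inportb(0x64);        /* byte-wide read of the status port */
        } while (ch & 0x02);           /* wait while the input buffer is full */
        outportb(0x60, data);          /* byte-wide write to the data port */
        do {
            ch = inportb(0x64);
        } while (!(ch & 0x01));        /* wait until the reply is available */
        ch = inportb(0x60);            /* read the acknowledgement byte */
        if (ch == 0xfa) {
            puts("success\n");
            break;
        }
        maxtry = maxtry - 1;
    } while (maxtry != 0);
    if (maxtry == 0)
        return 1;
    else
        return 0;
}
void main()
{
    //clrscr();
    SendKbdRate(0xf3, 3);
    SendKbdRate(0x7f, 3);
    gets(st);
    SendKbdRate(0xf3, 3);
    SendKbdRate(0, 3);
    gets(st);
}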
This function is used to change the typematic rate. First 0xF3 is written to indicate
that the typematic rate is to be changed, then the typematic rate is set to 0x7F (a delay of
1 second and a rate of 2 char/s) and a string can be typed to experience the new typematic
rate. Then the rate is set to 0 (a delay of ¼ second and a rate of 30 char/s). This
program will not work if you have booted the system in Windows. First boot the system in
DOS and then run this program.
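Changing LEDs Status
#include <stdio.h>
#include <bios.h>
#include <dos.h>
char st[80];
/* Keyboard status byte in the BIOS data area at 40:17H */
unsigned char far *kbd = (unsigned char far *)0x00400017;
/* Sends one byte to the keyboard; returns 0 on success, 1 after maxtry failures. */
int SendKbdRate(unsigned char data, int maxtry)
{
    unsigned char ch;
    do {
        do {
            ch = inportb(0x64);        /* byte-wide read of the status port */
        } while (ch & 0x02);           /* wait while the input buffer is full */
        outportb(0x60, data);          /* byte-wide write to the data port */
        do {
            ch = inportb(0x64);
        } while (!(ch & 0x01));        /* wait until the reply is available */
        ch = inportb(0x60);            /* read the acknowledgement byte */
        if (ch == 0xfa) {
            puts("success\n");
            break;
        }
        maxtry = maxtry - 1;
    } while (maxtry != 0);
    if (maxtry == 0)
        return 1;
    else
        return 0;
}
void main()
{
    //clrscr();
    SendKbdRate(0xed, 3);              /* LED command */
    SendKbdRate(0x7, 3);               /* switch on all three LEDs */
    puts("Enter a string ");
    gets(st);
    *kbd = (*kbd) | 0x70;              /* also set the lock states at 40:17H */
    puts("Enter a string");
    gets(st);
}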
Again the same function is used in this program, this time to turn on the keyboard
LEDs. First 0xED is sent to indicate the operation and then 7 is written to turn on all the
LEDs. But turning on the LEDs like this will not change the keyboard status indicated by
the byte at 40:17H. If the status used by the device driver is to change as well, then the
corresponding bits at 40:17H can be set by ORing the byte with 0x70. This program will not
work if you have booted the system in Windows. First boot the system in DOS and then run
this program.
DMA Controller
[Figure: the DMA controller shares the buses that connect the processor, main memory and I/O.]
DMA is a device which can acquire complete control of the buses and hence can be
used to transfer data directly from a port to memory or vice versa. Transferring data like
this can prove faster, because a transfer consumes 2 bus cycles if it is performed using the
processor. So in this approach the processor is bypassed and its cycles are stolen and
used by the DMA controller.
23 - Direct Memory Access (DMA)
The latch B of the DMA interface is used to hold the higher 4 or 8 bits of the 20 or 24 bit
absolute address respectively. The lower 16 bits are loaded in the base address register and
the number of bytes to be transferred is placed in the count register. The DMA requests to
acquire the buses through the HOLD signal; it receives a HLDA (Hold Acknowledge) signal if
no higher priority signal is available. On acknowledgment the DMA acquires control of the
buses and can issue signals for read and write operations to memory and I/O ports
simultaneously. The DREQ signals are used by various devices to request a DMA operation.
If the DMA controller is successful in acquiring the bus it sends back the DACK signal
to signify that the request is being serviced. For the request to be serviced properly the
DMA channel must be programmed accurately before the request.
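Splitting a 20-bit absolute address into the part for latch B and the part for the base address register can be sketched as follows. The struct and function names are hypothetical; the code only does the arithmetic and does not touch any DMA port.

```c
#include <stdint.h>

/* Hypothetical container for the two values a driver would program. */
struct dma_address {
    uint8_t  page;  /* upper 4 bits of the 20-bit address (latch B) */
    uint16_t base;  /* lower 16 bits (base address register)        */
};

/* Real-mode segment:offset pair converted to an absolute address. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Splits a 20-bit absolute address into latch B and base register parts. */
struct dma_address dma_split(uint32_t phys)
{
    struct dma_address a;
    a.page = (uint8_t)((phys >> 16) & 0x0F);
    a.base = (uint16_t)(phys & 0xFFFF);
    return a;
}
```

For the buffer at 1234:0005 the absolute address is 0x12345, so latch B receives 0x1 and the base address register receives 0x2345.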
DMA Cascading
A single DMA can transfer 8 bit operands to and from memory in a single bus cycle. If
16 bit values are to be transmitted then two DMA controllers are required and should be
cascaded as shown above.
DMA Programming Model
• DMA has 4 channels
• Each channel can be programmed to transfer a
block of maximum size of 64k.
• For each channel there is a
• Base Register
• Count Register
• Higher Address Nibble/Byte is placed in Latch B.
• The Mode register is told which channel is
to be programmed and for what purpose, i.e. Read
Cycle, Write Cycle, Memory to memory transfer.
• A request to the DMA is made to start its transfer.
Internal Registers
• Number of 16 & 8 bit internal registers
• Total of 27 internal registers in DMA
Register              Number   Width
Starting Address      4        16
Counter               4        16
Current Address       4        16
Current Counter       4        16
Temporary Address     1        16
Temporary Counter     1        16
Status                1        8
Command               1        8
Intermediate Memory   1        8
Mode                  4        8
Mask                  1        8
Request               1        8
The above slides show the characteristics of each register used when a DMA channel is
programmed and also show the total number of registers in the DMA controller. Some of
the registers are common to all channels and some are individual to each channel.
DMA Modes
• Block Transfer
• Single Transfer
• Demand Transfer
The DMA can work in above listed modes. In block transfer mode the DMA is
programmed to transfer a block and does not pause or halt until the whole block is
transferred irrespective of the requests received meanwhile.
In Single transfer mode the DMA transfers a single byte on each request and updates the
counter registers on each transfer and the registers need not be programmed again. On the
next request the DMA will again transfer a single byte beginning from the location it last
ended.
Demand transfer is the same as block transfer; the only difference is that the DREQ
signal must remain active throughout the transfer. As soon as the signal deactivates the
transfer stops, and on reactivation of the DREQ signal the transfer resumes from the point
where it left off.
Programming the DMA
The above table shows the addresses of all the registers that should be programmed to
perform a transfer. These registers act as status and control registers and are common for
all the channels.
DMA Status Register
Terminal count if reached signifies that the whole of the block as requested through some
DMA channel has been transferred. The above status register maintains the status of
Terminal count (TC) and DREQ for each channel within the DMA.
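A status byte of this kind can be decoded per channel with shifts. The sketch below assumes the layout documented for the 8237 controller (bits 0-3 hold the terminal count flag of channels 0-3, bits 4-7 hold the pending DREQ of channels 0-3); since the slide itself did not survive in this copy, treat the bit positions and the function names as assumptions.

```c
/* Returns 1 if the given channel (0..3) has reached terminal count. */
int dma_tc_reached(unsigned char status, int channel)
{
    return (status >> channel) & 1;
}

/* Returns 1 if a DREQ is pending on the given channel (0..3). */
int dma_request_pending(unsigned char status, int channel)
{
    return (status >> (channel + 4)) & 1;
}
```

A status value of 0x04 would therefore mean that channel 2 has finished its block, and 0x20 that channel 1 has a request pending.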
CS609- System Programming
Solved MCQS May – 19 - 2013
From Midterm Papers
MIDTERM EXAMINATION
Spring 2012
CS609- System Programming
Question No: 5 ( Marks: 1 ) - Please choose one
DTE is ____________.
►Preemptive
►Non-Preemptive (Page 48)
►Both Preemptive and Non-Preemptive
►None of Given
Question No: 10 ( Marks: 1 ) - Please choose one
The keyboard makes use of interrupt number _______ for its input operations.
►9 (Page 34)
►10
►11
►12
Question No: 16 ( Marks: 1 ) - Please choose one
LPTs can be swapped.
►True (Page 92)
►False
►Int 16H
►Int 17H (Page 84)
►Int 18H
►Int 19H
►D1
►D2
►D3
►D4 (Page 104)
MIDTERM EXAMINATION
Fall 2010
CS609- System Programming
►15H/2FH
►15H/4FH (Page 44)
►15H/FFH
Question No: 6 ( Marks: 1 ) - Please choose one
The BIOS interrupt ________ can be used to configure RTC.
►Five
►Seven
►Four
►Six (Page 72)
Question No: 12 ( Marks: 1 ) - Please choose one
DCE stands for __________.
►7 (Page 168)
►8
►9
►6
►2 (Page 105)
►3
►4
►5
►40:00H
►40:02H
►40:08H (Page 92)
►40:1AH
Question No: 17 ( Marks: 1 ) - Please choose one
The amount of memory above conventional memory (extended memory) can be determined using the service
_______.
►0xFD
►0xED (Page 181)
►0xFF
►0xEE
MIDTERM EXAMINATION
Spring 2009
CS609- System Programming (Session - 1)
Question No: 6 ( Marks: 1 ) - Please choose one
Each paragraph in keep function is ____ bytes in size.
►4
►8
►16 (Page 24)
►32
►9H
►13H
►15H
►65H (Page 65)
►4
►8 (Page 71)
►16
►32
Question No: 12 ( Marks: 1 ) - Please choose one
The interval timer can operate in ____modes.
►Three
►Four
►Five
►Six (Page 72) rep
Question No: 18 ( Marks: 1 ) - Please choose one
Interrupt ______ is used to get or set the time.
►0AH
►1AH (Page 136)
►2AH
►3AH
►1A/02H
►1A/03H (Page 138)
►1A/04H
►1A/05H
►Asynchronous serial
►Synchronous serial (not sure)
►Parallel communication
►None of the given
MIDTERM EXAMINATION
Spring 2009
CS609- System Programming (Session - 1)
►Programmed I/O
► Interrupt driven I/O
►Hardware Based I/O (Page 4)
►None of given
Question No: 2 ( Marks: 1 ) - Please choose one
The Function of I/O controller is to provide ____________.
►I/O control signals
►Buffering
►Error Correction and Detection
►All of given (Page 5) rep
Question No: 8 ( Marks: 1 ) - Please choose one
The microprocessor package has many signals for data. Below are some in Correct priority order (Higher to
Lower).
►Three
►Four
►Five
►Six (Page 72) rep
Question No: 13 ( Marks: 1 ) - Please choose one
BIOS DO NOT support ______.
►LPT1
►LPT2
►LPT3
►LPT4 (Page 91)
►D1
►D2
►D3
►D4 (Page 101)
►1
►3
►5
►7 (Page 114)
►INT 10 H
►INT 11 H
►INT 12 H (Page 162)
►INT 13 H
Question No: 18 ( Marks: 1 ) - Please choose one
Bit number _______ of coprocessor control word is the Interrupt Enable Flag.
►7 (Page 168)
►8
►9
►10
►0xF3
►0xED (Page 181) rep
►0xE5
►0xFF
MIDTERM EXAMINATION
Spring 2009
CS609- System Programming (Session - 1)
►True
►False
Question No: 3 ( Marks: 1 ) - Please choose one
In case of synchronous communication a timing signal is required to identify the start and end of a bit.
►Service#0
►Service#1 (Page 121)
►Service#2
►None of the given option.
Question No: 9 ( Marks: 1 ) - Please choose one
The ________function initialize the COM port whose number is passed as parameter using BIOS services.
►Initializecom()
►Initialize() (Page 125)
►Recievechar()
►None of these option
►Synchronous communication
►Asynchronous communication (Page 106)
►Both
►None of given
►Clock counter
►ROM
►Clock
►Real time clock (Page 136)
Question No: 14 ( Marks: 1 ) - Please choose one
There are two type of communication synchronous and Anti Synchronous
►True
►False (Page 105)
►STOn()
►SelfTest()
►SelfTestOn() (Page 127)
►Non of these
►True
►False
CS609 System Programming
Mid Term Examination – Spring 2006
► 4 bytes
► 6 bytes
► 8 bytes Click here for detail
► 10 bytes
► 0040:0000H
► 0040:0013H
► 0040:0015H
► 0040:0017H (Page 29)
Question No. 4 Marks : 2
If we use keep (0, 1000) in a TSR program, the memory allocated to it is
► 64000 bytes
► 32000 bytes
► 16000 bytes
► 80000 bytes
► 64
► 128
► 256 (Page 10)
► 512
CS609 – Solved Quizzes (1 & 2)
Quiz No.1
Question : 1 of 10 ( Marks: 1 ) - Please choose one
Total No. of bytes that can be stored in Keyboard Buffer is____.
►16
►32 (Page 54)
►64
►128
►Service # 0
►Service # 1
► Service # 2 (Page 121)
►None of the given options
Question : 6 of 10 ( Marks: 1 ) - Please choose one
------------ is used to read date from RTC
►1A\02H
►1A\03H
►1A\04H (Page 138)
►1A\05H
►Receivebyte ();
►Receive ();
►Receivechar (); (Page 125)
►None of the given option
►1A\02H
►1A\03H (Page 138)
►1A\04H
►1A\05H
Quiz No.2
Question : 1 of 10 ( Marks: 1 ) - Please choose one
Software based flow control make use of -------- control characters
►Xon
►XOFF
►Both (Page 135)
►None
►0
►1
►2
►3 (Page 249)
Question : 7 of 10 ( Marks: 1 ) - Please choose one
PPI interconnection _______ bits is cleared to indicate low nibble is being sent.
►D1
►D2
► D3
► D4 (Page 101)
►3
►5
►8 (Page 257)
►11
►Contiguous http://www.pgallert.de/english/SysAdmin/OS/file.htm
►Chained
►Indexed
►None
Question : 2 of 10 ( Marks: 1 ) - Please choose one
To access the block within cluster using BIOS services the cluster number should be converted into _____.
►CHS
►LBA
►LSN (Page 258)
►None of the given
Question : 4 of 10 ( Marks: 1 ) - Please choose one
Control information in files is maintained using
►BPB
►DPB
►FCB (Page 256)
►FPB
►1
►2 (Page 54)
►4
►8
►01
►02
►03
►FF
► 0x0A
►0x0B
► 0x0C
►0x0F (Page 95)
►0xf ,IRQ 7
► 0xa, IRQ 6
► 0x8, IRQ 5
►0x6, IRQ 2
Question : 9 of 10 ( Marks: 1 ) - Please choose one
If we want to produce the grave voice from speaker phone then we have to load the ____ divisor values at Port
____.
►high, 0x42
►low, 0x22
►high, 0x22
►low, 0x42
►if (((*scr)&12)==12)
►if (((*scr)&8)==8)
►if (((*scr)&4)==4)
►if (((*scr)&2)==2)
Reading Contents
Windows History
To meet the demand for a graphical user interface, Microsoft developed Windows. Early
versions such as Windows 3.1 used the DOS kernel and FAT based file systems.
After that, in the 1990s, new versions named Windows 95 and 98 were
introduced, supporting the 32-bit architecture of Intel's processors.
Later on, Windows NT versions were introduced, supporting a new-technology file system
called NTFS. Their security and file system were better than in the previous versions.
Windows Server 2008 was developed for professional use to manage enterprise and server
applications. It provided support for multi-core technology and 64-bit applications.
Other Windows versions supporting 32-bit and 64-bit architectures, multi-core and
multiprocessing were also introduced, including Windows XP, Windows Vista, Windows 7, 8 and
Windows 10.
Because of the dominant role of Windows, many applications and software development tools
are available in the market that integrate easily with it and can be used to develop Windows
applications ranging from small scale to enterprise level.
One of the key features of Windows is its rich GUI, which makes it very convenient to use.
This interface can be easily customized according to the local setup. The size, color and
visibility of graphical interface objects can also be changed by the user.
Compared to other operating systems, Windows has certain modern features, and most
developers build their applications for Windows to target its huge market.
Open source software is software that is publicly available with its source code to use,
modify and distribute with original rights. It is developed by a community rather than a
single company or vendor. In contrast, proprietary software is copyrighted and only available
to use under a license.
● As Windows components are provided and updated by a single vendor, their
implementation remains uniform throughout the world. Further, extensions to Windows
components or APIs are vendor-specific, so no non-standard extension is possible
except for platform differences.
Like open systems, Windows also supports various types of hardware platforms.
● In Windows, all system resources including processes, threads, memory, pipes,
DLLs etc. are represented by objects which are identified and referenced by a handle.
These objects cannot be accessed directly. If an application tries to
access these objects directly, Windows throws an appropriate exception. The only way
to access and operate on these objects is the set of APIs provided by Windows. Several
APIs can be related to a single object to manipulate it differently.
● A long list of parameters is associated with each API where each parameter has its own
significance but only few parameters are specified for a specific operation.
● To perform the task of multitasking and multi-threading efficiently, Windows provides a
number of synchronization constructs to arbitrate among the resources.
● The names of Windows APIs are long and descriptive for its proper and convenient use.
● Some pre-defined data types required for Windows APIs are:
■ BOOL (for storing a single logical value)
■ HANDLE (a handle for object)
■ LPTSTR (a string pointer)
■ DWORD (32-bit unsigned integer)
● Windows Data types avoid the pointer operator (*).
● Some lowercase prefix letters with variable names are used to identify the type of
variable. This notation is called Hungarian notation. For example, in the variable name
lpszFilename, ‘lpsz’ is Hungarian notation representing a long pointer to zero
terminated string.
● windows.h is a header file including all the APIs prototypes and data types
Topic 6: 32-Bit and 64-Bit Source Code Portability
Latest versions of Windows support both 32-bit and 64-bit architectures by keeping two
versions of each API, one for 32-bit and the other for 64-bit.
Interoperability of 32 and 64-bit: A single source code can be built as a 32-bit as well as a
64-bit version. Whether the compiler generates 32-bit or 64-bit executable code is decided at
build time by its settings or configuration. The same configuration also decides which
version of each API is used.
A 32-bit code can run on 64-bit hardware successfully but will be unable to use some features
of 64-bit such as large disk space, large pointers or 64-bit operations.
Code built for the 64-bit architecture cannot run on a 32-bit machine. For this
purpose, the program must be re-compiled with the compiler configured to generate 32-bit
executable code.
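Whether a given build is 32-bit or 64-bit can be observed from the pointer size, which is fixed when the program is compiled. The following sketch uses only standard C; the function names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Width of a data pointer in bits, as fixed by the build configuration. */
int pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Nonzero if this build can address more than 4 GB of memory. */
int can_address_beyond_4gb(void)
{
    return pointer_bits() > 32;
}
```

Compiling the same source once with a 32-bit target and once with a 64-bit target changes what these functions return, without any change to the code itself.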
Windows provides a set of built-in APIs to perform I/O operations. A related API with specific
parameters is invoked for the concerned resource and I/O operation is performed.
Similarly, certain C/C++ standard functions are available to perform I/O operations. For example,
fopen(), fclose(), fread(), fwrite() etc. are C functions that can be used to perform I/O
operations related to files.
Standard C functions can be used inside the source code to run on Windows platform because
Windows has system calls at low level to support C/C++ functions for I/O operations.
If portability is not a concern and the advanced Windows features are required, then
it is preferable to use the Windows APIs.
if (argc != 3) {
    printf("Usage: cp file1 file2\n");
    return 1;
}
inFile = fopen(argv[1], "rb");
if (inFile == NULL) {
    perror(argv[1]);
    return 2;
}
outFile = fopen(argv[2], "wb");
if (outFile == NULL) {
    perror(argv[2]);
    return 3;
}
This program copies one file to another using C standard functions. It uses a
buffer of 256 bytes into which the chunks of the file are read one by one.
The source file is opened in read binary mode and the destination file in write binary mode
using the C fopen() function.
If both are successfully opened, then the source file is read inside a loop, chunk by chunk,
using the fread() function and each chunk is written to the destination file using the
fwrite() function. When fread() returns no more data the whole file has been written to the
destination file and both files are closed.
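The copy loop described above can be sketched as one self-contained function. This is a minimal version under a few assumptions: the function name copy_file is hypothetical, the return codes 2 and 3 follow the listing, and the code 4 for a write error is an addition for illustration.

```c
#include <stdio.h>

#define BUF_SIZE 256

/*
 * Copies src to dst in chunks of BUF_SIZE bytes.
 * Returns 0 on success, 2 if src cannot be opened, 3 if dst cannot be
 * opened, and 4 (an assumed code) if a write fails.
 */
int copy_file(const char *src, const char *dst)
{
    FILE *inFile, *outFile;
    char rec[BUF_SIZE];
    size_t bytesIn, bytesOut;
    int status = 0;

    inFile = fopen(src, "rb");
    if (inFile == NULL) {
        perror(src);
        return 2;
    }
    outFile = fopen(dst, "wb");
    if (outFile == NULL) {
        perror(dst);
        fclose(inFile);
        return 3;
    }
    /* fread() returns 0 at end of file, which ends the loop. */
    while ((bytesIn = fread(rec, 1, BUF_SIZE, inFile)) > 0) {
        bytesOut = fwrite(rec, 1, bytesIn, outFile);
        if (bytesOut != bytesIn) {
            perror("write error");
            status = 4;
            break;
        }
    }
    fclose(inFile);
    if (fclose(outFile) != 0 && status == 0)
        status = 4;
    return status;
}
```

A main() in the style of the listing would check argc and then return copy_file(argv[1], argv[2]).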
if (argc != 3) {
    fprintf(stderr, "Usage: cp file1 file2\n");
    return 1;
}
/* Room for 255 wide characters each; the count passed to
   MultiByteToWideChar() is in characters, not bytes. */
lpwszFile1 = (LPWSTR)malloc(255 * sizeof(WCHAR));
lpwszFile2 = (LPWSTR)malloc(255 * sizeof(WCHAR));
iLen1 = MultiByteToWideChar(CP_ACP, 0, argv[1], -1, lpwszFile1, 255);
iLen2 = MultiByteToWideChar(CP_ACP, 0, argv[2], -1, lpwszFile2, 255);
/* The names are wide strings, so the wide version of the API is called. */
hIn = CreateFileW(lpwszFile1, GENERIC_READ, FILE_SHARE_READ, NULL,
    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hIn == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Cannot open input file. Error: %x\n", GetLastError());
    return 2;
}
Numerous Windows functions are used to perform various tasks at low level. However,
Windows has a set of Convenience functions that combine several functions to perform a
common task. In most cases, these functions improve the performance because several tasks
are performed by a single function.
For example, CopyFile() is a convenience function that replaces the algorithms used for creating,
opening, reading and writing one file to another.
Program 1 - 3
#include<stdio.h>
#include<windows.h>
#define BUF_SIZE 256
LPWSTR lpwszFile1, lpwszFile2;
INT iLen1, iLen2;
The CreateFile() API, with a list of parameters, is used to open an existing file or create a
new one. On success it returns a HANDLE to the open file object. The parameters are:
dwDesiredAccess: A 32-bit double word which specifies GENERIC_READ and/or GENERIC_WRITE
access.
Syntax:
● If the file is not opened in concurrent mode, then ReadFile() starts reading from the
current position.
● If the current location is End of File, then no Errors occur and *lpNumberOfBytesRead is
set to zero
● The function returns FALSE if it fails in case any of the parameter is invalid
Parameters
Syntax:
To write through the current size of file, the file must be opened with
FILE_FLAG_WRITE_THROUGH option
Lecture 15: Closing a File
After opening and using a file, it is required to close and invalidate the file handle in
order to release the system resources.
The CloseHandle() API is used to close a file. A file handle is passed as a parameter to this
API, which returns TRUE or FALSE. If the operation is successful, it
returns TRUE. If the handle is already invalid, it returns FALSE.
While writing a new code or enhancing an existing one, a programmer can adapt any of the
following strategies based on requirements.
<everything.h> includes all the header files required for typical Windows program
Example:
#include "Everything.h"
VOID ReportError(LPCTSTR userMessage, DWORD exitCode, BOOL printErrorMessage)
{
    DWORD eMsgLen, errNum = GetLastError();
    LPTSTR lpvSysMsg = NULL;
    _ftprintf(stderr, _T("%s\n"), userMessage);
    if (printErrorMessage)
    {
        /* Ask the system to allocate a buffer and fill it with the message text. */
        eMsgLen = FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
            FORMAT_MESSAGE_FROM_SYSTEM, NULL, errNum, MAKELANGID(LANG_NEUTRAL,
            SUBLANG_DEFAULT), (LPTSTR) &lpvSysMsg, 0, NULL);
        if (eMsgLen > 0)
        {
            _ftprintf(stderr, _T("%s\n"), lpvSysMsg);
        }
        else
        {
            _ftprintf(stderr, _T("Last Error Number; %d\n"), errNum);
        }
        if (lpvSysMsg != NULL)
            LocalFree(lpvSysMsg);
    }
    if (exitCode > 0)
    {
        ExitProcess(exitCode);
    }
}
The function ReportError() receives three parameters. Inside this function, the result of
the GetLastError() function is stored in the variable errNum. A generic string pointer,
lpvSysMsg, is declared for storing the error message.
If the caller wants the system error message printed, the error code is first converted
into a string using the FormatMessage() function.
If the message length is greater than 0, the message is printed; otherwise the error
code is printed. In both cases the memory allocated for the message is then freed.
Topic 20:
● Environment.h
● Everything.h
#if(WIN32_WINNT>=0x600)
#else
#endif
#endif
#ifdef UNICODE
#define _UNICODE
#endif
#ifndef UNICODE
#undef _UNICODE
#endif
Everything.h includes all the header files that will typically be required for all the
subsequent Windows programs.
#include "Environment.h"
#include <stdlib.h>
#include "support.h"
#ifdef _MT
#endif
#include "Everything.h"
VOID ReportError(LPCTSTR userMessage, DWORD exitCode, BOOL printErrorMessage)
{
    DWORD eMsgLen, errNum = GetLastError();
    LPTSTR lpvSysMsg = NULL;
    _ftprintf(stderr, _T("%s\n"), userMessage);
    if (printErrorMessage) {
        eMsgLen = FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
            FORMAT_MESSAGE_FROM_SYSTEM,
            NULL, errNum, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
            (LPTSTR)&lpvSysMsg, 0, NULL);
        if (eMsgLen > 0)
            _ftprintf(stderr, _T("%s\n"), lpvSysMsg);
        else
            _ftprintf(stderr, _T("Last Error Number; %d\n"), errNum);
        if (lpvSysMsg != NULL) LocalFree(lpvSysMsg);
    }
    if (exitCode > 0)
        ExitProcess(exitCode);
    return;
}
Topic 21: Standard I/O devices
Reading Material:
● Input
● Output
● Error
1. STD_INPUT_HANDLE
2. STD_OUTPUT_HANDLE
3. STD_ERROR_HANDLE
The operating system also has the concept of redirection using the given API.
It returns TRUE if the call succeeds and FALSE if it fails.
Topic 22: Copying multiple files using windows API
Reading Content:
In this module we will see how to display files using the console APIs of Windows;
in other words, displaying a file on screen is actually copying the file to the console.
For this we make a utility function "Options()", which we will also use later. This
function takes a variable list of parameters and parses it.
Basically, we specify the parameters of a program at the command prompt, and
there can be a number of options among them. Options() is used to parse these options. It
identifies the "-" prefix, checks all the possible options, and sets the flag
for each option that is present.
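#include "Everything.h"
#include <stdarg.h>
DWORD Options (int argc, LPCTSTR argv [], LPCTSTR OptStr, ...) /* ... shows that the
parameter list is variable */
{
    va_list pFlagList;
    LPBOOL pFlag;
    int iFlag = 0, iArg;
    va_start (pFlagList, OptStr);
    /* One flag pointer per option letter; the list ends with a NULL pointer. */
    while ((pFlag = va_arg (pFlagList, LPBOOL)) != NULL
            && iFlag < (int)_tcslen (OptStr)) {
        *pFlag = FALSE;
        for (iArg = 1; !(*pFlag) && iArg < argc && argv [iArg] [0] == _T('-'); iArg++)
            *pFlag = _memtchr (argv [iArg], OptStr [iFlag], _tcslen (argv [iArg])) !=
                NULL;
        iFlag++;
    }
    va_end (pFlagList);
    /* Skip the option arguments to find the first non-option argument. */
    for (iArg = 1; iArg < argc && argv [iArg] [0] == _T('-'); iArg++);
    return iArg;
}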
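The same parsing idea can be tried outside Windows with a portable sketch. The version below is an adaptation, not the course's function: it uses char and int in place of the Windows types TCHAR and BOOL, strchr() in place of _memtchr(), and the lower-case name options is hypothetical. The caller passes one int pointer per option letter and ends the list with a NULL pointer.

```c
#include <stdarg.h>
#include <string.h>

/*
 * opt_str lists the option letters, e.g. "sq".
 * After opt_str comes one int* per letter, then (int *)NULL.
 * Returns the index of the first argument that does not start with '-'.
 */
int options(int argc, const char *argv[], const char *opt_str, ...)
{
    va_list flags;
    int *flag;
    int i_flag = 0, i_arg;

    va_start(flags, opt_str);
    while ((flag = va_arg(flags, int *)) != NULL
           && i_flag < (int)strlen(opt_str)) {
        *flag = 0;
        /* Look for this option letter in every argument that starts with '-'. */
        for (i_arg = 1; !*flag && i_arg < argc && argv[i_arg][0] == '-'; i_arg++)
            *flag = strchr(argv[i_arg], opt_str[i_flag]) != NULL;
        i_flag++;
    }
    va_end(flags);

    /* Skip all option arguments. */
    for (i_arg = 1; i_arg < argc && argv[i_arg][0] == '-'; i_arg++)
        ;
    return i_arg;
}
```

For the command line "cat -s file1 file2" and the option string "sq", the flag for s is set, the flag for q is cleared, and the function returns 2, the index of file1.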
CatFile Function:
Return;
}
Topic 23 : Encrypting files
Reading Content:
Encryption is a very old technique. The Roman empire used to encrypt secret
conversations in times of war using the Caesar cipher algorithm. In
this method a letter is substituted by another letter placed n positions
forward in a circular manner. The text produced by the encryption method is
called cipher text.
The text that we are going to encrypt is called plain text and is denoted by P;
after encryption it is denoted by C.
● C = (P + n) mod 256
This technique is not exactly the Caesar cipher but is similar to it, since it shifts byte
values rather than letters. Following is the code for encrypting a file.
#include "Everything.h"
#include <io.h>
BOOL cci_f (LPCTSTR, LPCTSTR, DWORD);
if (argc != 4)
return 0;
if (hOut == INVALID_HANDLE_VALUE) {
CloseHandle(hIn);
return FALSE;
}
while (writeOK && ReadFile (hIn, buffer, BUF_SIZE, &nIn, NULL) && nIn > 0) {
CloseHandle (hIn);
CloseHandle (hOut);
return writeOK;
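The per-byte transformation C = (P + n) mod 256 can be tried on a buffer in memory without any file handling. The function name caesar_shift is hypothetical; this sketch only covers the arithmetic applied to each chunk that is read.

```c
#include <stddef.h>

/* Replaces every byte P in buf with (P + shift) mod 256. */
void caesar_shift(unsigned char *buf, size_t len, unsigned shift)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = (unsigned char)((buf[i] + shift) % 256);
}
```

Decryption uses the same function with the shift 256 - n, because (P + n + 256 - n) mod 256 = P.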
Reading Content:
In this module we will see the types of API for file management. Windows
provides many functions for file and directory management. These functions
are straightforward and easy to use.
• Delete
• Copy
• Rename
Delete
Delete function will help to delete the file on a given path. For deleting file
the following API is used.
Returns TRUE if the file at the given valid file path is deleted
Copy
Hard Copy
Move
DWORD dwFlags );
We will discuss the functions which we can use for directory management. We
will perform different directory operations such as creating, removing and
moving a directory.
LPSECURITY_ATTRIBUTES lpSecurityAttributes
);
BOOL RemoveDirectory(
LPCSTR lpPathName
);
BOOL SetCurrentDirectory(
LPCTSTR lpPathName
);
DWORD GetCurrentDirectory(
DWORD nBufferLength,
LPTSTR lpBuffer
);
In the previous module we saw certain APIs which perform input/output
operations on the console; now we use these APIs. We create two utility
functions: one helps us display a string on the console and the other passes a
message to the user and also takes input from the user. Following are the names and
descriptions of these functions.
#include "Everything.h"
#include <stdarg.h>
LPCTSTR pMsg;
va_end (pMsgList);
return FALSE;
va_end (pMsgList);
return TRUE;
BOOL success;
if (success)
else
CloseHandle (hIn);
CloseHandle (hOut);
return success;
Topic 28:
In this module we will see a small piece of code that gets the current directory.
#include "Everything.h"
DWORD lenCurDir;
return 0;
Topic 29:
Historically, there were 12-bit file systems; after that came 32-bit file systems,
which are still used in some places. FAT
based systems allow a maximum file size of 2^32 bytes, which is 4 GB. NTFS
theoretically provides a file size limit of 2^64 bytes, which is very large.
Files of such proportions are called huge files. For most
applications a 32-bit file space is sufficient. However, because rapid technological
change keeps increasing disk space, it is useful to know how to deal with 64-bit
huge file spaces, and Windows provides some APIs that support 64-bit
file systems.
Topic 30:
Whenever a file is opened using CreateFile() the file pointer is placed at the start
of file. The file pointer changes as ReadFile() or WriteFile() operations are
performed. Every subsequent read/write operation is performed at the current
file pointer position.
SetFilePointer()
DWORD SetFilePointer(
HANDLE hFile,
LONG lDistanceToMove,
PLONG lpDistanceToMoveHigh,
DWORD dwMoveMethod
);
hFile
lDistanceToMove
The low order 32-bits of a signed value that specifies the number of bytes to
move the file pointer.
lpDistanceToMoveHigh
A pointer to the high order 32-bits of the signed 64-bit distance to move.
dwMoveMethod
FILE_BEGIN // file pointer move number of bytes w.r.t the start of file
Topic 31:
For large files, whose size may approach 2^64 bytes, we need to understand
64-bit arithmetic. To facilitate 64-bit integer arithmetic, Windows provides the
union LARGE_INTEGER. This union has a structure for dealing with the lower and
higher double words. It also has a field of type LONGLONG for dealing with the
whole quadword.
typedef union _LARGE_INTEGER {
struct {
DWORD LowPart;
LONG HighPart;
};
struct {
DWORD LowPart;
LONG HighPart;
} u;
LONGLONG QuadPart;
} LARGE_INTEGER;
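The relationship between the two 32-bit halves and the 64-bit quadword can be sketched in portable C. This is only an illustration of the arithmetic, not the Windows definition: the type split64 and the functions quad_from_parts and parts_from_quad are hypothetical names, and fixed-width types from stdint.h stand in for DWORD, LONG and LONGLONG.

```c
#include <stdint.h>

/* Hypothetical portable stand-in for the two halves of a LARGE_INTEGER.
   These names are illustrative and are not part of the Windows API. */
typedef struct {
    uint32_t low_part;   /* plays the role of LowPart  */
    int32_t  high_part;  /* plays the role of HighPart */
} split64;

/* Combine the two 32-bit halves into one signed 64-bit value (the QuadPart). */
int64_t quad_from_parts(uint32_t low_part, int32_t high_part)
{
    uint64_t bits = ((uint64_t)(uint32_t)high_part << 32) | low_part;
    return (int64_t)bits;
}

/* Split a signed 64-bit value into its low and high 32-bit halves. */
split64 parts_from_quad(int64_t quad)
{
    split64 s;
    uint64_t bits = (uint64_t)quad;
    s.low_part  = (uint32_t)(bits & 0xFFFFFFFFu);
    s.high_part = (int32_t)(uint32_t)(bits >> 32);
    return s;
}
```

For example, a high part of 1 with a low part of 0 corresponds to 2^32, the first offset that no longer fits in 32 bits.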
SetFilePointerEx() is the extended version of SetFilePointer():
BOOL SetFilePointerEx(
HANDLE hFile,
LARGE_INTEGER liDistanceToMove,
PLARGE_INTEGER lpNewFilePointer,
DWORD dwMoveMethod
);
hFile
A handle to the file. The file handle must have been created with the
GENERIC_READ or GENERIC_WRITE access right
liDistanceToMove
lpNewFilePointer
A pointer to a variable to receive the new file pointer.
dwMoveMethod
FILE_BEGIN
FILE_CURRENT
FILE_END
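How the three move methods determine the new file pointer can be sketched as plain arithmetic. The following is a simplified model under stated assumptions, not the Windows implementation: the enum values and the function new_file_pointer are hypothetical, and a return value of -1 is used here simply to signal a move to before the start of the file.

```c
#include <stdint.h>

/* Hypothetical stand-ins for FILE_BEGIN, FILE_CURRENT and FILE_END. */
enum move_method { MOVE_FROM_BEGIN, MOVE_FROM_CURRENT, MOVE_FROM_END };

/* Returns the new file pointer, or -1 if the move would place the
   pointer before the start of the file. Overflow is not checked. */
int64_t new_file_pointer(int64_t current, int64_t file_size,
                         int64_t distance, enum move_method method)
{
    int64_t base = 0;                 /* MOVE_FROM_BEGIN */
    if (method == MOVE_FROM_CURRENT)
        base = current;               /* relative to the current position */
    else if (method == MOVE_FROM_END)
        base = file_size;             /* relative to the end of the file */
    int64_t target = base + distance;
    return target < 0 ? -1 : target;
}
```

Moving by a distance of zero from the end yields the file size, which is the idea behind measuring a file by seeking to its end.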
Topic 32:
typedef struct _OVERLAPPED {
ULONG_PTR Internal;
ULONG_PTR InternalHigh;
union {
struct {
DWORD Offset;
DWORD OffsetHigh;
} DUMMYSTRUCTNAME;
PVOID Pointer;
} DUMMYUNIONNAME;
HANDLE hEvent;
} OVERLAPPED, *LPOVERLAPPED;
Implementation:
…..
Topic 33:
One way of getting the file size already exists: open the file with
CreateFile(), which places the file pointer at the first byte, and then move
the file pointer to the end of the file (EOF). The distance the file pointer
moves from the start to the end of the file gives the size of the file. Windows
also provides an API that returns the file size directly, GetFileSizeEx().
GetFileSizeEx()
BOOL GetFileSizeEx(
HANDLE hFile,
PLARGE_INTEGER lpFileSize
);
hFile
A handle to the file. The handle must have been created with the
FILE_READ_ATTRIBUTES access right
lpFileSize
Windows also gives the option to set the file size. Reducing the file size
truncates data. Increasing the file size can be useful where the file is
expected to grow. We use SetEndOfFile() to change the file size; it sets the
end of the file to the current file pointer position.
Topic 34:
In this topic we discuss an example that creates a file with a capacity for a
specified number of records. The file has a header followed by equal-size
records. The feature of this example is that the user can modify any record at
random and get the total count of records in the file.
#include "Everything.h"
SYSTEMTIME recordCreationTime;
SYSTEMTIME recordLastRefernceTime;
SYSTEMTIME recordUpdateTime;
TCHAR dataString[STRING_SIZE];
} RECORD;
DWORD numRecords;
DWORD numNonEmptyRecords;
} HEADER;
HANDLE hFile;
LARGE_INTEGER currentPtr;
RECORD record;
TCHAR string[STRING_SIZE], command, extra;
SYSTEMTIME currentTime;
if (argc < 2)
if (hFile == INVALID_HANDLE_VALUE)
header.numRecords = _ttoi(argv[2]);
currentPtr.QuadPart = (LONGLONG)sizeof(RECORD) *
_ttoi(argv[2]) + sizeof(HEADER);
if (!SetEndOfFile(hFile))
return 0;
while (TRUE) {
continue;
currentPtr.QuadPart = (LONGLONG)recNo *
sizeof(RECORD) + sizeof(HEADER);
ov.Offset = currentPtr.LowPart;
ov.OffsetHigh = currentPtr.HighPart;
record.recordLastRefernceTime = currentTime;
if (record.referenceCount == 0) {
if (prompt) _tprintf (_T("record Number %d is empty.\n"), recNo);
continue;
} else {
recNo, record.referenceCount);
record.referenceCount = 0;
header.numNonEmptyRecords--;
headerChange = TRUE;
recordChange = TRUE;
} else if (command == _T('w')) { /* Write the record, even if for the first time */
if (record.referenceCount == 0) {
record.recordCreationTime =
currentTime;
header.numNonEmptyRecords++;
headerChange = TRUE;
record.recordUpdateTime = currentTime;
record.referenceCount++;
recordChange = TRUE;
} else {
if (headerChange) {
argv[1], header.numNonEmptyRecords,
header.numRecords);
CloseHandle (hFile);
return 0;
}
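The record-addressing arithmetic used in the listing above (record number times record size, plus the header size, then split into the Offset and OffsetHigh halves) can be sketched in portable C. The function names and the explicit size parameters are assumptions made for illustration; the real program uses sizeof(RECORD) and sizeof(HEADER).

```c
#include <stdint.h>

/* Byte offset of record rec_no in a file laid out as
   [header][record 0][record 1]...  Sizes are passed in so that the
   sketch does not depend on the real RECORD and HEADER types. */
uint64_t record_offset(uint32_t rec_no, uint32_t record_size,
                       uint32_t header_size)
{
    /* Widen before multiplying so the product cannot wrap at 32 bits. */
    return (uint64_t)rec_no * record_size + header_size;
}

/* The two halves that would be stored in Offset and OffsetHigh. */
uint32_t offset_low(uint64_t offset)  { return (uint32_t)(offset & 0xFFFFFFFFu); }
uint32_t offset_high(uint64_t offset) { return (uint32_t)(offset >> 32); }
```

The cast to a 64-bit type before the multiplication mirrors the (LONGLONG) cast in the listing: without it, a large record number would overflow 32-bit arithmetic before the result is widened.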
Topic 35:
Windows provides a set of APIs for searching for files and folders within the
hierarchical structure of directories. These APIs include:
FindFirstFile() API
HANDLE FindFirstFile(LPCTSTR lpFileName, LPWIN32_FIND_DATA lpFindFileData);
Where lpFileName represents the directory or path, and the filename. The name can
include wildcard characters, for example, an asterisk (*) or a question mark (?).
typedef struct _WIN32_FIND_DATA {
DWORD dwFileAttributes;
FILETIME ftCreationTime;
FILETIME ftLastAccessTime;
FILETIME ftLastWriteTime;
DWORD nFileSizeHigh;
DWORD nFileSizeLow;
DWORD dwReserved0;
DWORD dwReserved1;
TCHAR cFileName[MAX_PATH];
TCHAR cAlternateFileName[14];
DWORD dwFileType;
DWORD dwCreatorType;
WORD wFinderFlags;
} WIN32_FIND_DATA;
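FindNextFile() API:
Where hFindFile represents the search handle returned by a previous call to the
FindFirstFile or FindFirstFileEx function, and lpFindFileData points to the
WIN32_FIND_DATA structure that receives information about the found file or directory.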
FindClose() API
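To make the meaning of the wildcard characters in a search pattern concrete, here is a minimal sketch of a matcher in portable C. It is a simplified illustration, not the algorithm Windows uses: the function wildcard_match is hypothetical, matching is case-sensitive, and only the two wildcards mentioned above are handled.

```c
/* Returns 1 if name matches pattern, 0 otherwise.
   '*' matches any run of characters (including none); '?' matches
   exactly one character. Matching is case-sensitive here, unlike
   Windows file-name matching, to keep the sketch short. */
int wildcard_match(const char *pattern, const char *name)
{
    if (*pattern == '\0')
        return *name == '\0';
    if (*pattern == '*') {
        /* Let '*' absorb zero characters first, then one more each pass. */
        for (;;) {
            if (wildcard_match(pattern + 1, name))
                return 1;
            if (*name == '\0')
                return 0;
            name++;
        }
    }
    if (*name != '\0' && (*pattern == '?' || *pattern == *name))
        return wildcard_match(pattern + 1, name + 1);
    return 0;
}
```

With this sketch, the pattern *.txt accepts notes.txt and rejects notes.doc, while file?.c accepts file1.c but not file.c.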
Topic 36:
Certain other APIs are also used for getting file attributes, but these APIs need
an open file handle rather than scanning a directory or using a filename.
GetFileTime() API
lpCreationTime is a pointer to a FILETIME structure to receive the date and time the file
or directory was created.
lpLastAccessTime is a pointer to a FILETIME structure to receive the date and time the
file or directory was last accessed.
lpLastWriteTime is a pointer to a FILETIME structure to receive the date and time the
file or directory was last written to, truncated, or overwritten.
FileTimeToSystemTime() API
SystemTimeToFileTime() API
CompareFileTime() API
It compares file times of two files. It returns -1 if less, 0 if equal and +1 if greater.
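The three-way result described above can be sketched in portable C. This is only a model of the comparison, assuming a file time is a 64-bit count stored as two 32-bit halves: the type file_time and the function compare_file_time are hypothetical stand-ins for FILETIME and CompareFileTime().

```c
#include <stdint.h>

/* Hypothetical stand-in for FILETIME: a 64-bit count split into two
   32-bit halves, mirroring dwLowDateTime and dwHighDateTime. */
typedef struct {
    uint32_t low_date_time;
    uint32_t high_date_time;
} file_time;

/* Reassemble the 64-bit count from its two halves. */
static uint64_t as_u64(file_time t)
{
    return ((uint64_t)t.high_date_time << 32) | t.low_date_time;
}

/* Returns -1 if a is earlier than b, 0 if equal, +1 if a is later. */
int compare_file_time(file_time a, file_time b)
{
    uint64_t x = as_u64(a), y = as_u64(b);
    if (x < y) return -1;
    if (x > y) return 1;
    return 0;
}
```

The high half dominates the comparison, so a time with a larger high part is later regardless of the low parts.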
SetFileTime() API
It sets the three times of a file. NULL is passed for any time that is not to be changed.
GetFileType API
GetFileAttributes() API
DWORD GetFileAttributes(LPCTSTR lpFileName);
Where lpFileName is the name of a file or directory. Its return value is:
Topic 37:
Windows provides the facility of creating temporary files for storing intermediate
results. These files are assigned unique names in a directory with the extension .tmp.
Certain APIs are used for creating temporary files. These include:
GetTempFileName API
Where lpPathName represents the directory path for the filename. The string cannot be
longer than MAX_PATH-14 characters.
Topic 38:
We can get the attributes of a file, list files, and traverse the directory
structure using certain Windows APIs.
An application called lsW is used for showing files and listing their attributes. It uses
two option switches, –l and –R, where the –l option is used to list the attributes of
files in a folder and –R is used for recursive traversal through subfolders.
This program works with a relative pathname; it does not work with an
absolute pathname.
#include "Everything.h"
DWORD FileType(LPWIN32_FIND_DATA);
int i, fileIndex;
DWORD pathLength;
/* parse the search pattern into two parts: the parent and the filename or wild card
expression. The filename is the longest suffix not containing a slash. The parent is the
remaining prefix with a slash. This is performed for all command line search pattern. If
no file is specified, use * as the search pattern */
The UNIX touch command changes a file's access and modification times to the
current system time.
SetFileTime() sets the date and time that the specified file or directory
was created, last accessed, or last modified.
BOOL LockFileEx(
HANDLE hFile,
DWORD dwFlags,
DWORD dwReserved,
DWORD nNumberOfBytesToLockLow,
DWORD nNumberOfBytesToLockHigh,
LPOVERLAPPED lpOverlapped
);
If a program does not release a lock, or holds the lock longer than
necessary, other programs will not be able to proceed and their
performance will be negatively impacted.
The registry contains two basic elements: keys and values. Registry
keys are container objects similar to folders. Registry values are non-
container objects similar to files. Keys may contain values and sub-
keys. Keys are referenced with a syntax similar to Windows' path
names, using backslashes to indicate levels of hierarchy. Keys must
have a case insensitive name without backslashes.
The hierarchy of registry keys can only be accessed from a known root
key handle (which is anonymous but whose effective value is a
constant numeric handle) that is mapped to the content of a registry
key pre-loaded by the kernel from a stored "hive", or to the content of
a sub-key within another root key, or mapped to a registered service
or DLL that provides access to its contained sub-keys and values.
HKEY_LOCAL_MACHINE or HKLM
HKEY_CURRENT_CONFIG or HKCC
HKEY_CLASSES_ROOT or HKCR
HKEY_CURRENT_USER or HKCU
HKEY_USERS or HKU
HKEY_PERFORMANCE_DATA (only in Windows NT, but invisible in
the Windows Registry Editor)
HKEY_DYN_DATA (only in Windows 9x, and visible in the Windows
Registry Editor)
Service Manager stores many settings in the registry. You seldom have
to edit the registry yourself, because most of those settings are
derived from entries that you make in day-to-day use. However, some
changes to settings might occasionally be required. Service Manager
stores most registry values in the following locations:
HKEY_CURRENT_USER\Software\Microsoft\System Center\<version>\Service Manager\Console
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\System Center\<version>
Topic 47: Listing Registry Keys
You can show all items directly within a registry key by using Get-
ChildItem. Add the optional Force parameter to display hidden or
system items. For example, this command displays the items directly
within PowerShell drive HKCU:, which corresponds to the
HKEY_CURRENT_USER registry hive:
PowerShell
Get-ChildItem -Path HKCU:\ | Select-Object Name
You can also specify this registry path by specifying the registry
provider's name, followed by ::. The registry provider's full name is
Microsoft.PowerShell.Core\Registry, but this can be shortened to just
Registry. Any of the following commands will list the contents directly
under HKCU:.
PowerShell
Get-ChildItem -Path Registry::HKEY_CURRENT_USER
Get-ChildItem -Path Microsoft.PowerShell.Core\Registry::HKEY_CURRENT_USER
Get-ChildItem -Path Registry::HKCU
Get-ChildItem -Path Microsoft.PowerShell.Core\Registry::HKCU
Get-ChildItem HKCU:
It's a good idea to use a function call in the filter expression whenever
filter needs to do anything complex. Evaluating the expression causes
execution of the function, in this case, Eval_Exception.
try:
You do your operations here;
......................
except ExceptionI:
If there is ExceptionI, then execute this block.
except ExceptionII:
If there is ExceptionII, then execute this block.
......................
else:
If there is no exception then execute this block.
IEEE defines a standard for floating-point exceptions, the IEEE Standard for Binary
Floating-Point Arithmetic (IEEE 754). This standard defines five types of floating-point
exception:
Invalid operation
Division by zero
Overflow
Underflow
Inexact calculation
Errors | Exceptions
Errors are usually raised by the environment in which the application is running. | Exceptions are caused by the code of the application itself.
It is not possible to recover from an error. | Exceptions can be handled with try-catch blocks, which let the system recover from the exception.
Errors occur at run time and are unknown to the compiler. | Exceptions may or may not be caught by the compiler.
Programmers include an explicit test to check for an error, for example whether a file read/write operation has failed. | An exception could occur nearly anywhere, and it is impractical to test explicitly for an exception.
Topic 55: Treating Errors as Exceptions
A programmer can use the ReportError function to report an error and let Windows treat it and
terminate the process. But this approach has its own limitations.
• A fatal error terminates the entire process when only a single thread should terminate.
• The programmer may need to continue program execution rather than terminate the process.
• Synchronization resources, such as events or semaphores, will not be released in most
circumstances.
Termination handlers are declared in language-specific syntax. Using the Microsoft C/C++
Optimizing Compiler, they are implemented using __try and __finally.
The guarded body of code can be a block of code, a set of nested blocks, or an entire procedure
or function.
The termination block is executed when the flow of control leaves the guarded body, regardless
of whether the guarded body terminated normally or abnormally.
Termination and exception handlers allow you to make your program more robust by both
simplifying recovery from errors and exceptions and helping to ensure that resources and file
locks are freed at critical junctures.
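The guarantee that cleanup runs on every exit path can be approximated in portable C with a single cleanup label. This is a different technique from __try and __finally, which are Microsoft compiler extensions, and it does not cover exceptions raised inside the guarded body; it only illustrates the idea that one cleanup section serves both the normal and the abnormal exit. The function guarded_work and its parameters are hypothetical.

```c
#include <stdlib.h>

/* Returns 0 on success, -1 on failure. *cleanups is incremented each time
   the cleanup section runs, so a caller can observe that it always runs. */
int guarded_work(int simulate_failure, int *cleanups)
{
    int status = -1;
    char *buffer = malloc(64);      /* resource acquired by the guarded body */

    if (buffer == NULL)
        goto cleanup;               /* allocation failed */
    if (simulate_failure)
        goto cleanup;               /* abnormal exit from the guarded body */

    buffer[0] = '\0';               /* the "work" */
    status = 0;                     /* normal exit */

cleanup:
    free(buffer);                   /* free(NULL) is a no-op */
    (*cleanups)++;
    return status;
}
```

Whether the body succeeds or fails, control passes through the cleanup label exactly once, which is the property a termination block provides automatically.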
Consider the below given code where we are checking for invalid handle and
ReportingException.
if (hIn == INVALID_HANDLE_VALUE)
ReportException(argv[iFile], 1);
if (!GetFileSizeEx(hIn, &fsize) || fsize.HighPart > 0)
ReportException(_T("File is too large"), 1);
The filter function identifies the exception type and based on the type of the exception, the
handler can treat each exception differently. In the following program, the exception handling
and termination of a program are illustrated using a filter function.
The program generates an exception based on the type of exception entered by the user.
Floating-point exceptions are enabled with the help of the _controlfp() function, and the old
status is saved in fpOld. The try block covers the different cases of exception generation with
the help of a switch statement.
Now we will see how the values in the eCategory reference variable are placed.
User generated exceptions are identified based on the masking operation generating zero as a
result.
Console control handlers are quite similar to the mechanism of exception handlers. Normal
exceptions respond to several asynchronous events like division by zero, invalid page fault etc.,
but they do not respond to console related events like Ctrl+C. Console control handlers can
detect and handle the events that are console related. The API SetConsoleCtrlHandler() is used to
add console handlers.
The API takes the address of the HandlerRoutine and Add Boolean as parameters. There can be a
number of handler routines if the Add parameter is set as TRUE. If the HandlerRoutine
parameter is set as NULL and Add is TRUE, then the Ctrl-C signal will be ignored.
The handler routine will be invoked if a console exception is detected. Handler routine runs in an
independent thread within the process. Raising an exception within the handler routine will not
interfere with the working of the original routine that created the handler routine. Signals apply
to the whole process, while exception applies to a single thread.
Usually signal handlers are used to perform cleanup tasks whenever a shutdown, close or logoff
event is detected. A signal handler returns TRUE if it takes care of the task; otherwise it
returns FALSE. In case of FALSE, the next handler in the chain is invoked. The handlers in the
chain are invoked in the reverse of the order in which they were set up, and the system signal
handler is the last one in this chain.
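The chain behaviour described above can be modelled in portable C with an array of function pointers. This is a simulation under stated assumptions, not the console subsystem itself: add_handler, dispatch and the two sample handlers are hypothetical names, a return value of 1 plays the role of TRUE, and a result of -1 from dispatch stands for the case where no registered handler claims the event and the system handler would run.

```c
#include <stddef.h>

#define MAX_HANDLERS 8

/* A handler returns 1 if it handled the event, 0 to pass it on. */
typedef int (*ctrl_handler)(unsigned event);

static ctrl_handler chain[MAX_HANDLERS];
static size_t chain_len = 0;

/* Register a handler; returns 0 if the chain is full. */
int add_handler(ctrl_handler h)
{
    if (chain_len == MAX_HANDLERS)
        return 0;
    chain[chain_len++] = h;
    return 1;
}

/* Invoke handlers from most recently added to oldest. Returns the index
   of the handler that handled the event, or -1 if none did. */
int dispatch(unsigned event)
{
    size_t i = chain_len;
    while (i > 0) {
        i--;
        if (chain[i](event))
            return (int)i;
    }
    return -1;
}

/* Two sample handlers for demonstration. */
int handles_nothing(unsigned event)    { (void)event; return 0; }
int handles_event_zero(unsigned event) { return event == 0; }
```

If handles_event_zero is added first and handles_nothing second, an event of 0 is offered to handles_nothing first, declined, and then handled by the older handler.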
We have a simple program that set up a console control handler and starts beeping in a
loop. The control handler is invoked whenever a console event occurs. The handler
handles the event likewise and clears an exitFlag to end the loop of the main function.
/* Chapter 4. CNTRLC.C */
/* Catch Cntrl-C signals. */
#include "Everything.h"
static BOOL WINAPI Handler(DWORD cntrlEvent);
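static volatile BOOL exitFlag = FALSE; /* Written by the handler thread, read by main */
int _tmain(int argc, LPTSTR argv[])
{
/* Add an event handler. */
if (!SetConsoleCtrlHandler(Handler, TRUE))
ReportError(_T("Error setting event handler"), 1, TRUE);
while (!exitFlag) { /* This flag is detected right after a beep, before a handler exits */
Sleep(4750); /* Beep every 5 seconds; allowing 250 ms of beep time. */
Beep(1000 /* Frequency */, 250 /* Duration */);
}
_tprintf(_T("Stopping the main program as requested.\n"));
return 0;
}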
BOOL WINAPI Handler(DWORD cntrlEvent)
{
switch (cntrlEvent) {
/* The signal timing will determine if you see the second handler message */
case CTRL_C_EVENT:
_tprintf(_T("Ctrl-C received by handler. Leaving in 5 seconds or less.\n"));
exitFlag = TRUE;
Sleep(4000); /* Decrease this time to get a different effect */
_tprintf(_T("Leaving handler in 1 second or less.\n"));
return TRUE; /* TRUE indicates that the signal was handled. */
case CTRL_CLOSE_EVENT:
_tprintf(_T("Close event received by handler. Leaving the handler in 5 seconds or less.\n"));
exitFlag = TRUE;
Sleep(4000); /* Decrease this time to get a different effect */
_tprintf(_T("Leaving handler in 1 second or less.\n"));
return TRUE; /* Try returning FALSE. Any difference? */
default:
_tprintf(_T("Event: %d received by handler. Leaving in 5 seconds or less.\n"), cntrlEvent);
exitFlag = TRUE;
Sleep(4000); /* Decrease this time to get a different effect */
_tprintf(_T("Leaving handler in 1 second or less.\n"));
return TRUE; /* TRUE indicates that the signal was handled. */
}
}
The program can be terminated by the user either by closing the console or with Ctrl-C. The
handler is registered with Windows using the SetConsoleCtrlHandler(Handler, TRUE) function.
The handler is activated upon the occurrence of any console event. If the registration of the
handler fails for any reason, an error message is printed.
No __try and __except keywords are required with vectored exception handling (VEH); vectored
handlers are like console control handlers. Windows provides a set of APIs for VEH management as follows.
The given API has two parameters; FirstHandler is the parameter used to specify the order in
which the handler executes. A non-zero value indicates that it will be the first one to execute and
zero specifies it to be the last. If more than one handler is set up with a zero value, then
they are invoked in the order in which they were added using AddVectoredExceptionHandler(). The
return value is NULL in case of failure; otherwise it is the handle to the vectored exception
handler.
The exception handler function should be fast and must return as quickly as possible, therefore it
should not have a lot of code. The VEH should neither perform any blocking operation like the
Sleep() function nor use any synchronization objects. Typically a VEH would access exception
information structure, do some minimal processing, and set a few flags.
Dynamic Memory
Need of dynamic memory arises whenever dynamic data structures like search tables, trees,
linked lists etc. are used. Windows provides a set of APIs for handling dynamic memory
allocation.
Windows also provides memory mapped files which allows direct movement of data to and from
user space and files without the use of file APIs. Memory Mapped files can help conveniently
handle dynamic data structure and make file handling faster because they are treated just like
memory. It also provides a mechanism for memory sharing among processes.
Windows essentially uses two API platforms i.e. Win32 and Win64.
The Win32 API uses pointers of size 32 bits, hence the virtual space is 2^32. All data types have
been optimized for 32 bit boundaries. Win64 uses a virtual space of 2^64 (16 Exabytes).
A good strategy is to design an application in such a way that it could run in both modes without
any change in code.
Win32 makes at least half of the virtual space (2 GB) accessible to a process and the rest of the
space is reserved by the system for shared data, code, drivers, etc. Overall, Windows makes
a large memory space available to user programs, which therefore requires careful management.
Further information about the parameters of Windows Memory Management can be probed
using the following API.
The API fills in a SYSTEM_INFO structure through a pointer supplied by the caller. The structure
contains various information regarding the system, such as the page size, the allocation
granularity, and the application's memory address range.
A programmer allocates memory dynamically from a heap. Windows maintains a pool of heaps
and a process can have many heaps. Traditionally, one heap is considered enough. But several
heaps may be required to make a program more efficient.
In case a single heap is sufficient, then a runtime library function for heap allocation like
malloc(), free(), calloc(), realloc() might be enough.
Heap is a windows object and hence is accessed by a handle. Whenever you require allocating
memory from heap, you need a heap handle. Every process in windows has a default heap which
can be accessed through the following API.
HANDLE GetProcessHeap(VOID)
The API returns a handle to the process heap. NULL is returned in case of failure and not
INVALID_HANDLE_VALUE.
However, due to a number of reasons it would be desirable to have more than one heap.
Sometimes it is convenient to have distinct heaps for different data structures.
Separate Heaps
1. If a distinct heap is assigned to each thread, then each thread will only be able to use the
memory allocated to its own heap.
3. Fragmentation is reduced when one fixed size data structure is allocated from a single heap.
4. Giving each thread its own heap simplifies synchronization.
5. If a single heap contains complex data structures, then they can be easily de-allocated with
a single API call by de-allocating the entire heap. We will not need complex de-allocation
algorithms in such cases.
6. Small heaps for a single data structure reduces the chances of page faults as per the
principle of locality.
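The benefit of releasing a whole data structure with a single call can be illustrated with a small region (arena) allocator in portable C. This is an analogy under stated assumptions, not how Windows heaps are implemented: the type arena and the functions arena_create, arena_alloc and arena_destroy are hypothetical, and individual blocks cannot be freed on their own in this sketch.

```c
#include <stdlib.h>
#include <stddef.h>

/* A fixed-capacity region from which blocks are carved sequentially. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} arena;

/* Reserve the region; returns 1 on success, 0 on failure. */
int arena_create(arena *a, size_t capacity)
{
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hand out the next block, rounded up to 16 bytes for alignment.
   Returns NULL if the region cannot satisfy the request. */
void *arena_alloc(arena *a, size_t size)
{
    size_t aligned = (size + 15u) & ~(size_t)15u;
    if (aligned < size || aligned > a->capacity - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* Release every block at once; no per-node traversal is needed. */
void arena_destroy(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

A tree or linked list built entirely from such a region needs no de-allocation algorithm of its own: destroying the region disposes of every node in one step.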
We can create a new heap using the HeapCreate() API, and its initial size can be set to zero. The
API rounds the heap size up to the nearest multiple of the page size. Memory is committed to the
heap initially, rather than on demand. If the memory requirements grow beyond the initial size,
more pages are automatically allocated to the heap, up to the maximum size allowed.
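The adjustment of a requested size to a multiple of the page size can be sketched as follows. The function round_up_to_page is a hypothetical helper, and the page size is passed in as a parameter because it varies by system; 4096 bytes is used in the examples only as a common value.

```c
#include <stddef.h>

/* Round size up to the next multiple of page_size.
   page_size must be non-zero. */
size_t round_up_to_page(size_t size, size_t page_size)
{
    size_t remainder = size % page_size;
    return remainder == 0 ? size : size + (page_size - remainder);
}
```

With a 4096-byte page, a request of 1 byte becomes 4096 bytes, a request of 4097 bytes becomes 8192 bytes, and a request of 0x8000 bytes is already a whole number of pages and stays unchanged.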
If the required memory is not known, then deferring memory commitment is a good practice as
heap is a limited resource. Following is the syntax of the API used to create new heaps.
dwMaximumSize if non-zero, determines the maximum limit of the heap memory set by the
user. Heap is not grow-able beyond this point. In case it’s zero, then the heap is grow-able to the
extent of the virtual memory space available for the heap.
dwInitialSize is the initial size of the heap set by the programmer. SIZE_T is used to enable
portability. Based on the win32 or win64 platforms, SIZE_T will be 32 or 64 bit wide.
BOOL HeapDestroy(
HANDLE hHeap
);
hHeap is the handle to a previously created heap. Do not use the handle obtained from
GetProcessHeap() because it may raise an exception. This is an easy way to get rid of all the
contents of the heap including complex data structures.
Once a heap is created, it does not allocate memory that is directly available to the program.
Rather, it only creates a logical structure of heap that will be used to allocate new memory
blocks. Memory blocks are allocated using heap memory allocation APIs like HeapAlloc() and
HeapReAlloc().
hHeap is the handle of the heap from which memory is to be allocated. dwFlags are quite similar
to the flags used in HeapCreate().
HEAP_GENERATE_EXCEPTIONS: This flag will raise exceptions in case there is any failure
while allocating memory from the heap. Exceptions are not generated by HeapCreate(); rather, they
may occur at the time of allocation.
dwBytes is the size of the memory block to be allocated. For a non-growable heap, the limit is
0x7FFF8 bytes, approximately 0.5 MB.
The return value of the function is LPVOID. This is the address of the allocated memory block.
Use this pointer in the normal way; there is no need to make any reference to the heap handle.
If the exception flag is not set, then NULL is returned by HeapAlloc() on failure, and
GetLastError() does not work on HeapAlloc().
BOOL HeapFree(HANDLE hHeap, DWORD dwFlags, LPVOID lpMem);
hHeap is the handle of the heap from which the memory is to be freed. dwFlags should be 0 or set to
HEAP_NO_SERIALIZE. lpMem should be the pointer previously returned by HeapAlloc() or
HeapReAlloc().
A return value of FALSE indicates a failure. GetLastError() can be used to get the error.
HEAP_ZERO_MEMORY: only the newly allocated memory is set to zero (in case dwBytes is
greater than the previous allocation).
lpMem: specifies the pointer to the block previously allocated to the same heap hHeap.
dwBytes: It refers to the block size to be allocated that can be lesser or greater than the previous
allocation. But the same restriction as HeapAlloc applies i.e. the block size cannot be greater
than 0x7FFF8.
Some programs may require to determine the size of allocated blocks in the Heap. The size of the
allocated block is determined using the API HeapSize() as follows.
SIZE_T HeapSize(
HANDLE hHeap,
DWORD dwFlags,
LPCVOID lpMem
);
The function returns the size of the block, or (SIZE_T)-1 in case of failure. The only valid
dwFlags value is HEAP_NO_SERIALIZE.
Serialization
Serialization is required when dealing with concurrent threads using some common resource.
Serialization is not required if threads are autonomous and there is no possibility of concurrent
threads disrupting each other.
b. Each thread has its own heap that is insulated from other threads.
Heap Exceptions
Heap exceptions are enabled using the flag HEAP_GENERATE_EXCEPTIONS. This allows the
program to close open handles before it terminates. There can be two scenarios with
this option:
There are some other functions that can be used while working with heaps. For example,
HeapSetInformation() can be used to enable low fragmentation mode. It can also be used to
allow termination of a thread upon heap’s corruption.
Till now we have used the Memory Management APIs for allocating, reallocating and
deallocating heaps. We can also get the size of a heap through an API.
A typical methodology of dealing with heaps should be to get a heap handle either using
HeapCreate() or GetProcessHeap(). Use the handle obtained from the above to allocate memory
blocks from the heap using HeapAlloc(). If some block needs to be deallocated, use HeapFree().
Before the program is terminated or when the heap is not required, use HeapDestroy() to dispose
of the heap.
It is convenient not to mix up windows heap API and Run Time Library functions. Anything
allocated with C library functions should also be deallocated with C library functions.
The example is formulated using two heaps. The first is a node heap, while the other is
a record (data) heap. The node heap will be used to build a tree, while the record heap will be
used to store keys.
Following are the three heaps as shown in the figure:
1. ProcHeap
2. RecHeap
3. NodeHeap
ProcHeap contains the root address, but RecHeap stores the records. NodeHeap on the other
hand stores the nodes when they are created. Sorting will be performed in the NodeHeap that
gives a reference to be searched in the RecHeap. The data structure will be maintained in the
NodeHeap. At the end, all the heaps will be destroyed except ProcHeap because it is created
using GetProcessHeap().
Topic 70: Binary Search Using Heaps
The example is formulated using two heaps. One is a node heap and the other is a data heap.
Node heap is used to build a tree, while data heap is used to store keys. Recursive functions are
used for allocating nodes and scanning nodes as tree is a recursive structure. Data in file is read
in a record and the key is used to lexically build a tree. The tree only contains the key entries.
The file is ultimately sorted by an In-order traversal of the tree.
#include "Everything.h"
#define KEY_SIZE 8
/* Structure definition for a tree node. */
typedef struct _TREENODE {
struct _TREENODE *Left, *Right;
TCHAR key[KEY_SIZE];
LPTSTR pData;
} TREENODE, *LPTNODE, **LPPTNODE;
#define NODE_SIZE sizeof (TREENODE)
#define NODE_HEAP_ISIZE 0x8000
#define DATA_HEAP_ISIZE 0x8000
#define MAX_DATA_LEN 0x1000
#define TKEY_SIZE (KEY_SIZE * sizeof (TCHAR)) /* Parenthesized so the macro is safe inside expressions */
#define STATUS_FILE_ERROR 0xE0000001 // Customer exception
LPTNODE FillTree (HANDLE, HANDLE, HANDLE);
BOOL Scan (LPTNODE);
int KeyCompare (LPCTSTR, LPCTSTR), iFile; /* for access in exception handler */
BOOL InsertTree (LPPTNODE, LPTNODE);
int _tmain (int argc, LPTSTR argv[])
{
HANDLE hIn = INVALID_HANDLE_VALUE, hNode = NULL, hData = NULL;
LPTNODE pRoot;
BOOL noPrint;
TCHAR errorMessage[256]; /* TCHAR to match _stprintf and ReportError */
int iFirstFile = Options (argc, argv, _T ("n"), &noPrint, NULL);
if (argc <= iFirstFile)
ReportError (_T ("Usage: sortBT [options] files"), 1, FALSE);
/* Process all files on the command line. */
for (iFile = iFirstFile; iFile < argc; iFile++) __try {
/* Open the input file. */
hIn = CreateFile (argv[iFile], GENERIC_READ, 0, NULL,
OPEN_EXISTING, 0, NULL);
if (hIn == INVALID_HANDLE_VALUE)
RaiseException (STATUS_FILE_ERROR, 0, 0, NULL);
__try {
/* Allocate the two growable heaps. */
hNode = HeapCreate (
HEAP_GENERATE_EXCEPTIONS | HEAP_NO_SERIALIZE, NODE_HEAP_ISIZE, 0);
hData = HeapCreate (
HEAP_GENERATE_EXCEPTIONS | HEAP_NO_SERIALIZE, DATA_HEAP_ISIZE, 0);
/* Process the input file, creating the tree. */
pRoot = FillTree (hIn, hNode, hData);
/* Display the tree in key order. */
if (!noPrint) {
_tprintf (_T ("Sorted file: %s\n"), argv[iFile]);
Scan (pRoot);
}
} __finally { /* Heaps and file handle are always closed */
/* Destroy the two heaps and data structures. */
if (hNode != NULL) HeapDestroy (hNode);
if (hData != NULL) HeapDestroy (hData);
hNode = NULL; hData = NULL;
if (hIn != INVALID_HANDLE_VALUE) CloseHandle (hIn);
hIn = INVALID_HANDLE_VALUE;
}
} /* End of main file processing loop and try block. */
/* Handle the exceptions that we can expect - Namely, file open error or out of
memory. */
__except ( (GetExceptionCode() == STATUS_FILE_ERROR ||
GetExceptionCode() == STATUS_NO_MEMORY)
? EXCEPTION_EXECUTE_HANDLER :
EXCEPTION_CONTINUE_SEARCH)
{
_stprintf (errorMessage, _T("\n%s %s"), _T("sortBT error on file:"),
argv[iFile]);
ReportError (errorMessage, 0, TRUE);
}
return 0;
}
LPTNODE FillTree (HANDLE hIn, HANDLE hNode, HANDLE hData)
/* Scan the input file, creating a binary search tree in the
hNode heap with data pointers to the hData heap. */
/* Use the calling program's exception handler. */
{
LPTNODE pRoot = NULL, pNode;
DWORD nRead, i;
BOOL atCR;
TCHAR dataHold[MAX_DATA_LEN];
LPTSTR pString;
/* Open the input file. */
while (TRUE) {
pNode = HeapAlloc (hNode, HEAP_ZERO_MEMORY, NODE_SIZE);
pNode->pData = NULL;
(pNode->Left) = pNode->Right = NULL;
/* Read the key. Return if done. */
if (!ReadFile (hIn, pNode->key, TKEY_SIZE,
&nRead, NULL) || nRead != TKEY_SIZE)
/* Assume end of file on error. All records
must be just the right size */
return pRoot;
/* Read the data until the end of line. */
atCR = FALSE; /* Last character was not a CR. */
for (i = 0; i < MAX_DATA_LEN; i++) {
ReadFile (hIn, &dataHold[i], TSIZE, &nRead, NULL);
if (atCR && dataHold[i] == LF) break;
atCR = (dataHold[i] == CR);
}
dataHold[i - 1] = _T('\0');
/* dataHold contains the data without the key.
Combine the key and the Data. */
pString = HeapAlloc (hData, HEAP_ZERO_MEMORY,
(SIZE_T)(KEY_SIZE + _tcslen (dataHold) + 1) *
TSIZE);
memcpy (pString, pNode->key, TKEY_SIZE);
pString[KEY_SIZE] = _T('\0');
_tcscat (pString, dataHold);
pNode->pData = pString;
/* Insert the new node into the search tree. */
InsertTree (&pRoot, pNode);
} /* End of while (TRUE) loop */
return NULL; /* Failure */
}
BOOL InsertTree (LPPTNODE ppRoot, LPTNODE pNode)
/* Insert the new node, pNode, into the binary search tree, pRoot. */
{
if (*ppRoot == NULL) {
*ppRoot = pNode;
return TRUE;
}
if (KeyCompare (pNode->key, (*ppRoot)->key) < 0)
InsertTree (&((*ppRoot)->Left), pNode);
else
InsertTree (&((*ppRoot)->Right), pNode);
return TRUE;
}
int KeyCompare (LPCTSTR pKey1, LPCTSTR pKey2)
/* Compare two records of generic characters.
The key position and length are global variables. */
{
return _tcsncmp (pKey1, pKey2, KEY_SIZE);
}
static BOOL Scan (LPTNODE pNode)
/* Scan and print the contents of a binary tree. */
{
if (pNode == NULL)
return TRUE;
Scan (pNode->Left);
_tprintf (_T ("%s\n"), pNode->pData);
Scan (pNode->Right);
return TRUE;
}
Memory Mapping
Dynamic memory is allocated from the paging file. The paging file is controlled by the
Operating System’s (OS) virtual memory management system, and the OS also controls the
mapping of virtual addresses onto physical memory. Memory mapped files make it possible to map
virtual memory space directly onto a normal file.
Memory mapping offers several advantages:
· There is no need to invoke direct file Input Output (IO) operations.
· Any data structure placed in the file will be available for later use as well.
· It is convenient and efficient to use in-memory algorithms for sorting, searching etc.
· Large files can be processed as if they were placed in memory.
· File processing is frequently faster than with ReadFile() and WriteFile().
· There is no need to manage buffers for repetitive operations on a file. This is done more
optimally by the OS.
· Multiple processes can share memory by mapping their virtual memory space onto the same file.
· For file operations, page file space is not needed.
Other considerations
Windows also uses memory mapping while implementing Dynamic Link Libraries (DLLs) and
loading and executing executable (EXE) files. It is strongly recommended to use Structured
Exception Handling (SEH) while dealing with memory mapped files, in order to catch
EXCEPTION_IN_PAGE_ERROR exceptions.
In order to perform memory mapped file IO operations, file mapping objects need to be created.
This object uses the file handle of an open file. The open file or part of the file is mapped onto
the address space of the process. File mapping objects are assigned names so that they are also
available to other processes. Moreover, these mapping objects also require protection and
security attributes and a size. The API used for this purpose is CreateFileMapping().
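HANDLE CreateFileMapping( HANDLE hFile,
LPSECURITY_ATTRIBUTES lpFileMappingAttributes,
DWORD flProtect,
DWORD dwMaximumSizeHigh,
DWORD dwMaximumSizeLow,
LPCTSTR lpMapName );
hFile is the open handle to a file whose access is compatible with the protection flag flProtect.
The values of flProtect are: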
PAGE_READONLY - It means that page can only be read within the mapped region. It can
neither be written nor executed. hFile must have GENERIC_READ access.
PAGE_READWRITE - Provides full access to object if the hFile has GENERIC_READ and
GENERIC_WRITE access.
PAGE_WRITECOPY - Means that when a mapped region changes, a private copy is written to
the paging file and not to the original file.
dwMaximumSizeHigh and dwMaximumSizeLow specify the size of the mapping object. If both are set to
0, then the current file size is used. Specify this size carefully in the following cases:
· If the file size is expected to grow, then use the expected file size.
· Do not map a region beyond this limit. Once the size is assigned, the mapping region
cannot grow.
· The mapping size needs to be specified in the form of two 32-bit values rather than one
64-bit value.
lpMapName is the name of the map that can also be used by other processes. Set this to
NULL if you do not mean to share the map.
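Because the size is passed as two 32-bit halves, a 64-bit length has to be split before the call. The following is a minimal sketch in portable C. The helper names size_high, size_low and size_join are hypothetical, and uint32_t stands in for DWORD. In a Windows program the same split is available through the HighPart and LowPart fields of a LARGE_INTEGER.

```c
#include <stdint.h>

/* Hypothetical helpers: split a 64-bit mapping size into the two
   32-bit values passed as dwMaximumSizeHigh and dwMaximumSizeLow. */
uint32_t size_high (uint64_t size) { return (uint32_t)(size >> 32); }
uint32_t size_low (uint64_t size) { return (uint32_t)(size & 0xFFFFFFFFu); }

/* Recombine the two halves, e.g. to check the split. */
uint64_t size_join (uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}
```

For a file smaller than 4 GB the high part is simply zero.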
Previously, we discussed that a file mapping can be assigned a shared name by using
CreateFileMapping(). This shared name can be used to open existing file maps using
OpenFileMapping(). A file map created by a certain process can be subsequently used by other
processes by referring to the object by name. The operation may fail if the name does not exist.
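HANDLE OpenFileMapping( DWORD dwDesiredAccess,
BOOL bInheritHandle,
LPCTSTR lpMapName );
Once a file mapping object exists, a view of it is mapped into the address space of the process
using MapViewOfFile().
LPVOID MapViewOfFile( HANDLE hFileMappingObject,
DWORD dwDesiredAccess,
DWORD dwFileOffsetHigh,
DWORD dwFileOffsetLow,
SIZE_T dwNumberOfBytesToMap );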
dwDesiredAccess should be compatible with the access rights of the file mapping object. The three
flags commonly used are:
FILE_MAP_WRITE
FILE_MAP_READ
FILE_MAP_ALL_ACCESS
dwFileOffsetHigh and dwFileOffsetLow give the offset within the file at which the
mapping starts. To start the mapping from the start of the file, set both to zero. The offset
should be specified in multiples of 64K.
dwNumberOfBytesToMap is the number of bytes of the file to map. Set it to zero to map the
whole file.
If the function is successful, it returns the base address of the mapped region. If the function fails,
the return value is NULL.
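Since the view offset has to be a multiple of 64K, a record at an arbitrary file position is usually reached by mapping from the preceding aligned offset and then skipping the remainder inside the view. A sketch in portable C, assuming the 64K granularity stated above (on a real system the value can be read from GetSystemInfo()). VIEW_GRANULARITY and both function names are hypothetical.

```c
#include <stdint.h>

#define VIEW_GRANULARITY 0x10000u   /* 64K, assumed */

/* Largest multiple of the granularity that is <= pos. This is the
   value to split into dwFileOffsetHigh and dwFileOffsetLow. */
uint64_t view_offset (uint64_t pos)
{
    return pos - (pos % VIEW_GRANULARITY);
}

/* Distance from the start of the view to the byte at pos. */
uint32_t view_delta (uint64_t pos)
{
    return (uint32_t)(pos % VIEW_GRANULARITY);
}
```

The number of bytes to map must then cover view_delta(pos) plus the length of the data that is actually needed.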
Just as it is necessary to release heap blocks with HeapFree(), it is also necessary to unmap file
views. File views are unmapped using UnmapViewOfFile().
BOOL UnmapViewOfFile( LPCVOID lpBaseAddress );
lpBaseAddress is the pointer to the base address of the mapped view. If the function fails, the
return value is zero.
Flushing File View
The file view can be flushed using the FlushViewOfFile() API. This forces the OS to
write the dirty pages of the view back to disk. If two processes access a file at the same time
such that one uses file mapping and the other uses ReadFile() and WriteFile(), then the two
processes may not see the same contents. Changes made through file maps might still be in memory
and may not be visible through ReadFile() or WriteFile() until they are flushed. To get a coherent
view, it is necessary that all the processes use file maps.
Topic 75: More About File Mapping
In Win32, it is not possible to map files bigger than 2-3 GB. Also, the entire 3 GB might not be
available for file mapping alone. This limitation is alleviated in Win64. A file mapping cannot
be extended, so you need to know the size of a map in advance. Customized functions would be
required to allocate memory within the mapped region.
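The following minimum steps need to be taken while working with mapped files:
· Open the file with at least GENERIC_READ access.
· If the file is new, then set its length to some non-zero value using SetFilePointerEx()
followed by SetEndOfFile().
· Create the mapping object with CreateFileMapping(), or open an existing one with
OpenFileMapping().
· Create one or more views with MapViewOfFile().
· Access the file through memory references.
· In the end, unmap the file view with UnmapViewOfFile() and use CloseHandle() to close the map
and file handles.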
Accessing a file through file mapping has visible advantages. Although setting up
file views might be programmatically complex, the advantages are far bigger. The processing
time may be reduced threefold compared to conventional file operations when dealing with
sequential files. These advantages tend to disappear only if the input and output
files are too large. The example is a simple Caesar cipher application. It sequentially processes all
the characters in the file, substituting each character by shifting it a few places in the
ASCII set.
/* Chapter 5.
cci_fMM.c function: Memory Mapped implementation of the
simple Caesar cipher function. */
#include "Everything.h"
BOOL cci_f (LPCTSTR fIn, LPCTSTR fOut, DWORD shift)
/* Caesar cipher function.
* fIn: Source file pathname.
* fOut: Destination file pathname.
* shift: Numeric shift value */
{
BOOL complete = FALSE;
HANDLE hIn = INVALID_HANDLE_VALUE, hOut =
INVALID_HANDLE_VALUE;
HANDLE hInMap = NULL, hOutMap = NULL;
LPTSTR pIn = NULL, pInFile = NULL, pOut = NULL, pOutFile = NULL;
__try {
LARGE_INTEGER fileSize;
/* Open the input file. */
hIn = CreateFile (fIn, GENERIC_READ, 0, NULL,
OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hIn == INVALID_HANDLE_VALUE)
ReportException (_T ("Failure opening input file."), 1);
/* Get the input file size. */
if (!GetFileSizeEx (hIn, &fileSize))
ReportException (_T ("Failure getting file size."), 4);
/* This is a necessary, but NOT sufficient, test for mappability on 32-bit systems.
* Also see the long comment a few lines below */
if (fileSize.HighPart > 0 && sizeof(SIZE_T) == 4)
ReportException (_T ("This file is too large to map on a Win32 system."), 4);
/* Create a file mapping object on the input file. Use the file size. */
hInMap = CreateFileMapping (hIn, NULL, PAGE_READONLY, 0, 0,
NULL);
if (hInMap == NULL)
ReportException (_T ("Failure Creating input map."), 2);
/* Map the input file */
/* Comment: This may fail for large files, especially on 32-bit systems
* where you have, at most, 3 GB to work with (of course, you have much less
* in reality, and you need to map two files).
* This program works by mapping the input and output files in their entirety.
* You could enhance this program by mapping one block at a time for each file,
* much as blocks are used in the ReadFile/WriteFile implementations. This would
* allow you to deal with very large files on 32-bit systems. I have not taken
* this step and leave it as an exercise.
*/
pInFile = MapViewOfFile (hInMap, FILE_MAP_READ, 0, 0, 0);
if (pInFile == NULL)
ReportException (_T ("Failure Mapping input file."), 3);
/* Create/Open the output file. */
/* The output file MUST have Read/Write access for the mapping to
succeed. */
hOut = CreateFile (fOut, GENERIC_READ | GENERIC_WRITE,
0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL,
NULL);
if (hOut == INVALID_HANDLE_VALUE) {
complete = TRUE; /* Do not delete an existing file. */
ReportException (_T ("Failure Opening output file."), 5);
}
/* Map the output file. CreateFileMapping will expand
the file if it is smaller than the mapping. */
hOutMap = CreateFileMapping (hOut, NULL, PAGE_READWRITE,
fileSize.HighPart, fileSize.LowPart, NULL);
if (hOutMap == NULL)
ReportException (_T ("Failure creating output map."), 7);
pOutFile = MapViewOfFile (hOutMap, FILE_MAP_WRITE, 0, 0,
(SIZE_T)fileSize.QuadPart);
if (pOutFile == NULL)
ReportException (_T ("Failure mapping output file."), 8);
/* Now move the input file to the output file, doing all the work in
memory. */
__try
{
CHAR cShift = (CHAR)shift;
pIn = pInFile;
pOut = pOutFile;
while (pIn < pInFile + fileSize.QuadPart) {
*pOut = (*pIn + cShift);
pIn++; pOut++;
}
complete = TRUE;
}
__except(GetExceptionCode() == EXCEPTION_IN_PAGE_ERROR ?
EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
{
complete = FALSE;
ReportException(_T("Fatal Error accessing mapped file."), 9);
}
/* Close all views and handles. */
UnmapViewOfFile (pOutFile); UnmapViewOfFile (pInFile);
CloseHandle (hOutMap); CloseHandle (hInMap);
CloseHandle (hIn); CloseHandle (hOut);
return complete;
}
__except (EXCEPTION_EXECUTE_HANDLER) {
if (pOutFile != NULL) UnmapViewOfFile (pOutFile);
if (pInFile != NULL) UnmapViewOfFile (pInFile);
if (hOutMap != NULL) CloseHandle (hOutMap);
if (hInMap != NULL) CloseHandle (hInMap);
if (hIn != INVALID_HANDLE_VALUE) CloseHandle (hIn);
if (hOut != INVALID_HANDLE_VALUE) CloseHandle (hOut);
/* Delete the output file if the operation did not complete successfully. */
if (!complete)
DeleteFile (fOut);
return FALSE;
}
}
Another advantage of memory mapping is the ability to use convenient memory based
algorithms to process files. Sorting data in memory, for instance, is much easier than sorting
records in a file.
The program explained in Topic 77 sorts a file with fixed-length records. This program, called sortFL,
is similar to the earlier example of sorting with a binary search tree in that it assumes an 8-byte
sort key at the start of the record, but this example is restricted to fixed-length records.
The sorting is performed by the <stdlib.h> C library function qsort. Notice that qsort requires a
programmer-defined record comparison function.
The program structure is straightforward. Simply create the file mapping on a temporary copy of
the input file, create a single view of the file, and invoke qsort. There is no file I/O. Then the
sorted file is sent to standard output using _tprintf, although a null character is first appended to
the file map.
Exception and error handling are omitted in the listing but are in the Examples solution on the
recommended book’s Website.
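The core of this approach can be sketched in portable C as follows. This is not the sortFL listing itself: RECORD_SIZE is an assumed record length, the function names are hypothetical, and memcmp is used where the course examples compare keys with _tcsncmp.

```c
#include <stdlib.h>
#include <string.h>

#define KEY_SIZE 8       /* sort key at the start of each record */
#define RECORD_SIZE 16   /* assumed fixed record length for this sketch */

/* Comparison function required by qsort: compare the leading keys. */
static int key_compare (const void *p1, const void *p2)
{
    return memcmp (p1, p2, KEY_SIZE);
}

/* Sort nRecords fixed-length records in place. In a memory mapped
   program, pRecords would be the address returned by MapViewOfFile,
   so the sorted order ends up in the file without explicit file I/O. */
void sort_records (char *pRecords, size_t nRecords)
{
    qsort (pRecords, nRecords, RECORD_SIZE, key_compare);
}
```

The number of records would be the file size divided by the record length.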
File maps are convenient, as the preceding examples demonstrate. Suppose, however, that the
program creates a data structure with pointers in a mapped file and expects to access that file in
the future. The pointers will all be relative to the virtual address returned from MapViewOfFile, and
they will be meaningless when the file is mapped the next time. The solution is to use based
pointers, which are actually offsets relative to another pointer. The Microsoft C syntax, available
in Visual C++ and some other systems, is:
type __based (base) declarator
Notice that the syntax forces use of the *, a practice that is contrary to Windows convention but
which the programmer could easily fix with a typedef.
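The same idea can be expressed in portable C by storing offsets explicitly and converting them to pointers on use. This is only a sketch of the concept and not the Microsoft based-pointer syntax. The NODE type and the helper names are hypothetical.

```c
#include <stddef.h>

/* A list node stored inside the mapped region. The link is an offset
   from the start of the region, so it stays valid whatever address the
   region is mapped at. Offset 0 means "no next node". */
typedef struct {
    int value;
    size_t nextOffset;
} NODE;

/* Convert an offset to a pointer for the current mapping. */
NODE *node_at (void *pBase, size_t offset)
{
    return offset == 0 ? NULL : (NODE *)((char *)pBase + offset);
}

/* Sum all values reachable from the node at firstOffset. */
int sum_list (void *pBase, size_t firstOffset)
{
    int total = 0;
    NODE *p;
    for (p = node_at (pBase, firstOffset); p != NULL;
         p = node_at (pBase, p->nextOffset))
        total += p->value;
    return total;
}
```

Copying the region to a different address, which is what happens in effect when a file is mapped again, leaves the list usable because no absolute address is stored.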
Using the address returned by MapViewOfFile() for maintaining indexes is meaningless, as the
address is liable to change on each call to the API. A simple methodology is to maintain an array of
records, build an index for the records, and subsequently use the index to access the records
directly.
The program uses records of varying sizes in a file. It uses the first field of each record as a key
of 8 characters. There are two file mappings: one maps the original file and the other
maps the index file. Each record in the index file contains a key and the location in the
original file of the record containing that key. Once the index file is created, it can easily be used
later. Subsequently, the index file records can be sorted for faster searching. The input file remains
unchanged. A pictorial representation of the example is attached below.
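A minimal sketch of the index-building step in portable C is given here. It is not the course listing: it assumes that records are terminated by a newline character, it works on a plain buffer in place of a mapped view, and INDEX_ENTRY and build_index are hypothetical names. The entries store offsets, not addresses, so they remain valid when the file is mapped again.

```c
#include <stddef.h>
#include <string.h>

#define KEY_SIZE 8

/* One index entry: the key and where its record starts in the data. */
typedef struct {
    char key[KEY_SIZE];
    size_t offset;
} INDEX_ENTRY;

/* Scan a buffer of newline-terminated records, each beginning with a
   KEY_SIZE-character key, and fill pIndex. Returns the number of
   entries written (at most maxEntries). */
size_t build_index (const char *pData, size_t dataLen,
                    INDEX_ENTRY *pIndex, size_t maxEntries)
{
    size_t pos = 0, n = 0;
    while (pos + KEY_SIZE <= dataLen && n < maxEntries) {
        const char *pEnd;
        memcpy (pIndex[n].key, pData + pos, KEY_SIZE);
        pIndex[n].offset = pos;
        n++;
        /* Advance to the character after the end of this record. */
        pEnd = memchr (pData + pos, '\n', dataLen - pos);
        if (pEnd == NULL) break;
        pos = (size_t)(pEnd - pData) + 1;
    }
    return n;
}
```

The resulting array can be sorted by key with qsort and written to the index file.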
We have previously seen examples of the use of memory mapped files in Windows. This is a
fundamental feature of Windows. Windows itself uses this feature while working with Dynamic
Link Libraries (DLLs). DLLs are one of the most important components of Windows, on which
many high-level technologies such as COM depend.
Static Linking
The conventional approach is to gather all the source code and library functions, link them, and
encapsulate them into a single executable file. This approach is simple but has a few
disadvantages.
· The executable image will be large as it contains all library functions. Hence it will
consume more disk space and will require more physical memory to run.
· If a library function is updated, the whole program will need to be rebuilt.
· There can be many programs that require a library function. Each program will have a
static copy of its own. Hence, resource requirements will increase.
· Portability is reduced, as a program built with certain environment settings will
run the same functions in a different environment where some other version might be required.
Using DLLs, the library functions are not linked at build time. They are linked at program load
time (implicit linking) or at run time (explicit linking).
As a result, the size of the executable package is smaller. DLLs can easily be used to create shared
libraries which can be used by multiple programs concurrently. Only a single copy of a shared
DLL is placed in memory. All the processes sharing the DLL map the DLL space onto their
program space. Each program will have its own copy of the DLL global variables.
New versions or updates can be supported by just providing a new DLL without the need
to recompile the main code. The library runs in the same process as the calling program.
Importance of DLLs
DLLs are used in almost all modern operating systems. DLLs are most important in the case of
Windows as they are used to implement OS interfaces. The entire Windows API is supported by
a set of DLLs which are invoked to call kernel services. The DLL code can be shared by multiple
processes. A DLL function, when invoked by a process, runs in that process’s space. Therefore, it can
use resources of the calling process such as file handles and the thread stack. DLLs should be written
in a thread-safe manner. A DLL can export variables as well as function entry points.
Implicit linking is the easier of the two techniques. Functions defined in a DLL are collected and
built as a DLL. The build process builds a .LIB file which is a stub for the actual code. The stub is
linked to the calling program at build time. It provides a placeholder for each function in the
DLL.
The placeholder/stub will call the original function in the DLL. The .LIB file should be placed in a
common user library directory for the project. The build process also constructs the .DLL file that
contains the binary image of the functions. This file is usually placed in the same
directory as the application.
Function interfaces defined in DLLs should be exported carefully.
Topic 82: Exporting and Importing Interfaces
For a DLL function to be usable by an application, it must be declared as exportable. This can
be done by using a .DEF file or by using the __declspec (dllexport) storage modifier.
The build process will generate a .LIB file and a .DLL file. The .LIB file will hold the stubs for function calls while
the .DLL file will hold the actual code. Similarly, the calling program should use the __declspec (dllimport)
storage modifier.
#else
#endif
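The conditional compilation lines above normally belong to a header that is shared by the DLL and its callers. One common arrangement is sketched below. The names MYLIB_EXPORTS, LIBSPEC and AddValues are hypothetical, and the non-Windows branch exists only so that the sketch compiles on other systems.

```c
/* In the DLL's own source file: define the symbol before the header
   text so that LIBSPEC expands to dllexport. Calling programs do not
   define it and therefore get dllimport. */
#define MYLIB_EXPORTS

/* --- contents of the shared header (hypothetical mylib.h) --- */
#if defined(_WIN32)
  #ifdef MYLIB_EXPORTS
    #define LIBSPEC __declspec(dllexport)   /* building the DLL */
  #else
    #define LIBSPEC __declspec(dllimport)   /* building a caller */
  #endif
#else
  #define LIBSPEC                           /* no-op outside Windows */
#endif

LIBSPEC int AddValues (int a, int b);
/* --- end of header --- */

/* Definition compiled into the DLL. */
LIBSPEC int AddValues (int a, int b)
{
    return a + b;
}
```

With this layout the same header serves both builds, and only the DLL project has to define the export symbol.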
If you are using the Visual C++ compiler, then this task is performed automatically by the compiler. When
building the calling program you need to specify the .LIB file. When executing the calling program, the
.DLL file must be placed in the same directory as the application. Following is the default safe DLL search
order for explicit and implicit linking.
Explicit linking requires the program to explicitly request that a specific DLL be loaded or freed. Once the
DLL is loaded, the program obtains the address of the specific entry point and uses that address as a
pointer in the function call.
The function is not declared in the calling program. Rather, a pointer to a function is declared. The
functions required are:
HMODULE LoadLibrary( LPCTSTR lpLibFileName );
HMODULE LoadLibraryEx( LPCTSTR lpLibFileName,
HANDLE hFile,
DWORD dwFlags );
The file name need not mention the extension. The file path must be valid. See MSDN for details of dwFlags.
Once the DLL is loaded, the programmer needs to obtain an entry point (procedure address) into the
DLL. This is done using:
FARPROC GetProcAddress( HMODULE hModule,
LPCSTR lpProcName );
hModule is the module handle obtained for the DLL. lpProcName cannot be UNICODE. It is the name of
the function whose entry point is to be obtained. The return type is FARPROC, which is the far address of
the function. A C function pointer declaration can then be used to invoke the function in the DLL and
pass its parameters.
Previously, we have studied many file conversion functions. Some used memory mapped IO and some used
file operations. Some performed file conversion and some performed file encryption.
Now, we take a look at how we can encapsulate these functions in a DLL and invoke them explicitly.
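HMODULE hDLL;
FARPROC pcci;
if (argc < 5)
ReportError (_T ("Usage: shift file1 file2 DllName"), 1, FALSE);
hDLL = LoadLibrary (argv[4]); /* DLL name assumed to be the last argument. */
if (hDLL == NULL)
ReportError (_T ("Failed loading DLL."), 4, TRUE);
pcci = GetProcAddress (hDLL, "cci_f"); /* Exported name cci_f is assumed. */
if (pcci == NULL)
ReportError (_T ("Failed to find entry point."), 5, TRUE);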
A DLL entry point can optionally be specified when you create a DLL. The code at the entry point executes
whenever a process attaches to the DLL. In the case of implicit linking, the DLL attaches at the time of
process start and detaches when the process ends. In the case of explicit linking, the process attaches when
LoadLibrary()/LoadLibraryEx() is invoked and detaches when FreeLibrary() is called.
LoadLibraryEx() can also suppress the execution of the entry point. Further, the entry point is also invoked
whenever a thread attaches to or detaches from the DLL.
BOOL DllMain(
HINSTANCE hDll,
DWORD Reason,
LPVOID lpReserved)
hDll corresponds to the handle returned by LoadLibrary(). lpReserved, if NULL, represents explicit
attachment; otherwise it represents implicit attachment. Reason will have one of the four values:
DLL_PROCESS_ATTACH
DLL_THREAD_ATTACH
DLL_THREAD_DETACH
DLL_PROCESS_DETACH
Based on the value of Reason, the programmer can decide whether to initialize or free resources. All the
calls to DllMain() are serialized by the system. Serialization is critically important as DllMain() is supposed
to perform initialization operations for each thread. There should be no blocking calls, IO calls or wait
functions in DllMain(), as they will indefinitely block other threads. Calls to other DLLs from DllMain()
should not be made, apart from a few exceptions. LoadLibrary() and LoadLibraryEx() should never be called
from DllMain() as they will create more DLL entry point calls. DisableThreadLibraryCalls() can be used to
disable thread attach/detach calls for a specified instance.
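A minimal sketch of an entry point that dispatches on Reason is given below. The two global variables are hypothetical state for illustration. The block between #else and #endif only provides stand-in definitions so that the sketch also compiles outside Windows. In a real DLL the windows.h definitions are used.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Stand-in definitions so the sketch compiles outside Windows. */
typedef int BOOL;
typedef unsigned long DWORD;
typedef void *HINSTANCE;
typedef void *LPVOID;
#define WINAPI
#define TRUE 1
#define DLL_PROCESS_DETACH 0
#define DLL_PROCESS_ATTACH 1
#define DLL_THREAD_ATTACH  2
#define DLL_THREAD_DETACH  3
#endif

long attachedThreads = 0;   /* hypothetical per-process DLL state */
int initialized = 0;

BOOL WINAPI DllMain (HINSTANCE hDll, DWORD Reason, LPVOID lpReserved)
{
    (void)hDll; (void)lpReserved;   /* unused in this sketch */
    switch (Reason) {
    case DLL_PROCESS_ATTACH:   /* one-time initialization */
        initialized = 1;
        break;
    case DLL_THREAD_ATTACH:    /* per-thread setup */
        attachedThreads++;
        break;
    case DLL_THREAD_DETACH:    /* per-thread cleanup */
        attachedThreads--;
        break;
    case DLL_PROCESS_DETACH:   /* release what PROCESS_ATTACH set up */
        initialized = 0;
        break;
    }
    return TRUE;   /* returning FALSE on process attach fails the load */
}
```

Each case does a small, non-blocking amount of work, in line with the restrictions described above.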
Complications
Many times a DLL is upgraded to provide more features and new symbols. Also, multiple processes
share a single implementation of a DLL. This strength of DLLs may also lead to some complications.
· If the new DLL has changed interfaces, this creates problems for older programs that have not
been updated for the newer version.
· Applications that require the newer, updated functionality may sometimes link with an older DLL version.
Version Management
One intuitive resolution to the problem is to use a different directory for each version. But there
are several other solutions. Use the DLL version number as part of the .DLL and .LIB file names, e.g. Utils_4_0.DLL
and Utils_4_0.LIB. Applications requiring DLLs can determine the version requirements and
subsequently access files with distinct file names using implicit or explicit linking.
Microsoft has introduced a concept of side-by-side DLLs or assemblies. This solution requires an application
to declare its DLL requirements using XML.
Microsoft DLLs support a callback function that returns version information and more regarding a DLL. This
callback function can be used dynamically to query the information regarding the DLL. It works as follows:
HRESULT CALLBACK DllGetVersion(
DLLVERSIONINFO *pdvi
)
Information regarding the DLL is placed in the DLLVERSIONINFO structure. It contains a field cbSize, which is
the size of the structure, followed by:
dwMajorVersion
dwMinorVersion
dwBuildNumber
dwPlatformID
Multitasking Systems
Multiprocessor systems
Thread as a unit of execution
Resources of Processes
From the programmer’s perspective, each process has resources such as one or more threads and a distinct
virtual address space. Although processes can share memory and files, the process itself lies in an
individual virtual memory space.
Resources of Threads
Each thread is a unit within the process. Nevertheless, it has resources of its own:
· Stack: for procedure calls, interrupts, exception handling, and auto variables.
· Thread Local Storage (TLS): an array-like collection of pointers enabling a thread to allocate
storage to create its unique data environment.
· An argument on the stack, unique to the created thread.
· A structure containing the context (internal register status) of the thread.
Topic 88: Process Creation
The most fundamental process management function in Windows is CreateProcess(). It creates a process
with a single primary thread. Windows does not have any structure that keeps account of parent-child
relationships. The process that creates a child process is considered the parent process.
CreateProcess() has 10 parameters. It does not return a HANDLE. Rather, two handles, one for the process
and one for the thread, are returned in a parameter struct. One must be careful to close both these
handles when they are no longer needed. Closing the thread handle does not terminate the thread; it only
deletes the reference to the thread within the process.
BOOL CreateProcess(
LPCTSTR lpApplicationName,
LPTSTR lpCommandLine,
LPSECURITY_ATTRIBUTES lpProcessAttributes,
LPSECURITY_ATTRIBUTES lpThreadAttributes,
BOOL bInheritHandles,
DWORD dwCreationFlags,
LPVOID lpEnvironment,
LPCTSTR lpCurrentDirectory,
LPSTARTUPINFO lpStartupInfo,
LPPROCESS_INFORMATION lpProcessInformation );
lpApplicationName and lpCommandLine specify the program name and the command line parameters.
lpProcessAttributes and lpThreadAttributes point to the process and thread security attribute
structures. bInheritHandles specifies whether the new process inherits copies of the calling process’s
inheritable handles. dwCreationFlags combines several flags:
· CREATE_SUSPENDED: indicates that the primary thread is created suspended and will run only when
ResumeThread() is called for it.
· DETACHED_PROCESS and CREATE_NEW_CONSOLE are mutually exclusive. The first flag creates a process
without a console and the second one creates a process with a new console.
· CREATE_NEW_PROCESS_GROUP: indicates that the new process is the root of a new process group. All
processes in a group receive a console control signal if they share the same console.
lpEnvironment points to an environment block for the new process. If NULL, the process uses the
parent’s environment.
lpCurrentDirectory specifies the drive and directory of the new process. If NULL, the parent’s working directory is used.
lpStartupInfo specifies the main window appearance and standard device handles for the new process.
lpProcessInformation is the pointer to the structure containing the handles and IDs for the process and thread.
IDs are unique to processes for their entire lifetime. An ID is invalidated when a process is destroyed,
although it may be reused by other newly created processes. In contrast, a process can have many
handles, each with different security attributes.
Some process management functions require handles while others require IDs. Handles are required for
general-purpose handle-based functions. Just like file handles and handles for other resources, the
process handles need to be closed when no longer required.
The process obtains its environment and other information from the CreateProcess() call. Once the process has
been created, changing this information in the parent has no effect on the child. For example, the parent
process may change its working directory, but this will not affect the child unless the child changes its
own working directory. Processes are entirely independent.
Topic 90: Specifying the Executable Image and the Command Line
Executable Image
Either lpApplicationName or lpCommandLine specifies the executable image. Usually lpCommandLine is
used, in which case lpApplicationName is set to NULL. However, if lpApplicationName is specified, there
are rules governing it.
lpApplicationName
When this parameter names the image, it should not be NULL. It should specify the full name of the
application including the path, or omit the path, in which case the current drive and directory will be used.
Include the extension, i.e. .EXE or .BAT, in the file name. For long names, quotes within the string are not required.
lpCommandLine
When this parameter names the image, lpApplicationName should be NULL. Tokens within the string are
delimited by spaces. The first token is the program image name. If the name does not contain the path of
the image, then the following search sequence is followed.
Command Line
A process can obtain its command line from the usual argv mechanism. Alternatively, in Windows it can
call GetCommandLine(). It is important to know that the command line is not a constant string. A program
can change the command line. It is advisable that the program makes changes on a copy of the command
line.
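When the command line is obtained as a single string, a program that passes the rest of the line on to another program first has to step over its own name. A minimal sketch in portable C follows. The function skip_first_token is hypothetical (it is not the SkipArg() utility used in the course examples) and it works on plain char strings instead of TCHAR.

```c
/* Return a pointer to the text that follows the first token of a
   command line. The first token may be enclosed in double quotes,
   as is usual for program paths that contain spaces. */
const char *skip_first_token (const char *pLine)
{
    const char *p = pLine;
    while (*p == ' ' || *p == '\t') p++;        /* leading blanks */
    if (*p == '"') {                            /* quoted program name */
        p++;
        while (*p != '\0' && *p != '"') p++;
        if (*p == '"') p++;
    } else {                                    /* unquoted program name */
        while (*p != '\0' && *p != ' ' && *p != '\t') p++;
    }
    while (*p == ' ' || *p == '\t') p++;        /* blanks before arguments */
    return p;
}
```

The function only returns a pointer into the original string and does not modify it, which is consistent with the advice to leave the command line itself unchanged.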
Frequently, a child process requires access to a resource referenced in the parent by a handle. If the handle is
inheritable, then the child can directly receive a copy of the parent’s open handle. For example, standard
input and output handles are shared in this way. Inheritance of handles is accomplished in the following
manner. The bInheritHandles flag in the CreateProcess() call determines whether the child process will
inherit copies of the parent’s open handles. It is also necessary to make individual handles inheritable. Use
the SECURITY_ATTRIBUTES structure to specify this. It has a flag bInheritHandle which should be set to
TRUE. Also, nLength should be set to sizeof(SECURITY_ATTRIBUTES).
So far we have only made the handles inheritable. However, we also need to pass the actual values of the
handles to the child process. Either this is accomplished through InterProcess Communication (IPC), or the
handle is passed on to the child process by setting it up in the STARTUPINFO struct. The latter is the
preferred policy as it allows IO redirection and no changes are required in the child process. Another
approach is to convert the file handles into text and pass them through the command line to the child process.
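The last approach can be sketched in portable C as follows. Here void * stands in for HANDLE, the function names are hypothetical, and the decimal text format is an assumption of this sketch. The value is only meaningful in the child if the handle was actually inherited.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parent side: format a handle value as decimal text that can be
   placed on the child's command line. Returns the number of
   characters written (as snprintf does). */
int handle_to_text (void *hValue, char *pText, size_t textLen)
{
    return snprintf (pText, textLen, "%llu",
                     (unsigned long long)(uintptr_t)hValue);
}

/* Child side: recover the handle value from its argument string. */
void *text_to_handle (const char *pText)
{
    return (void *)(uintptr_t)strtoull (pText, NULL, 10);
}
```

The parent would append the text to the command line passed to CreateProcess(), and the child would apply text_to_handle() to the corresponding argv entry.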
If the handles are already inheritable, then they will be readily accessible to the child. Inherited handles are
distinct copies. The parent and child might be accessing the same file with different file pointers. Each process
should close its own handles.
Synchronization with a child process can be attained easily by waiting for it. Windows provides
general-purpose wait functions, which can be used to wait for a single object and also for multiple objects
in a group. Windows signals the waiting process when the process being waited for terminates.
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwMilliseconds );
DWORD WaitForMultipleObjects( DWORD nCount, CONST HANDLE *lpHandles,
BOOL bWaitAll, DWORD dwMilliseconds );
nCount is the number of objects in the array. It should not exceed MAXIMUM_WAIT_OBJECTS.
dwMilliseconds is the timeout period for the wait: 0 for no wait and INFINITE for an indefinite wait.
bWaitAll specifies whether it is necessary to wait for all the objects to be signaled.
The possible return values are:
WAIT_OBJECT_0
WAIT_OBJECT_0+n
WAIT_TIMEOUT
WAIT_FAILED
WAIT_ABANDONED_0
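The return value therefore has to be decoded before it is used. A minimal sketch of such decoding follows. WAIT_KIND and classify_wait are hypothetical names, and the block between #else and #endif only supplies stand-in values, equal to those in the Windows headers, so that the sketch compiles on other systems.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Stand-in values for compiling outside Windows. */
typedef unsigned long DWORD;
#define WAIT_OBJECT_0     0x00000000UL
#define WAIT_ABANDONED_0  0x00000080UL
#define WAIT_TIMEOUT      0x00000102UL
#define WAIT_FAILED       0xFFFFFFFFUL
#endif

typedef enum { WR_SIGNALED, WR_ABANDONED, WR_TIMEOUT, WR_FAILED } WAIT_KIND;

/* Classify the value returned by a wait on nCount handles. For
   WR_SIGNALED and WR_ABANDONED, *pIndex receives the position of the
   handle in the array. */
WAIT_KIND classify_wait (DWORD result, DWORD nCount, DWORD *pIndex)
{
    if (result == WAIT_FAILED) return WR_FAILED;
    if (result == WAIT_TIMEOUT) return WR_TIMEOUT;
    if (result < WAIT_OBJECT_0 + nCount) {
        *pIndex = result - WAIT_OBJECT_0;
        return WR_SIGNALED;
    }
    if (result >= WAIT_ABANDONED_0 && result < WAIT_ABANDONED_0 + nCount) {
        *pIndex = result - WAIT_ABANDONED_0;
        return WR_ABANDONED;
    }
    return WR_FAILED;   /* unexpected value: treat as failure */
}
```

With bWaitAll set to FALSE, the index identifies which object ended the wait.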
Topic No - 97 (Environment Block )
The Environment Block (EB) contains strings describing the environment of the process, each of the form
Name=Value.
To share the parent environment with the child process, set lpEnvironment to NULL in the call to
CreateProcess().
Any process can modify the environment variables and make new ones, using:
DWORD GetEnvironmentVariable( LPCTSTR lpName, LPTSTR lpValue, DWORD cchValue );
BOOL SetEnvironmentVariable( LPCTSTR lpName, LPCTSTR lpValue );
lpName is the variable name. On setting, the value is modified if the variable already exists. If it
does not exist, then a new variable is created and assigned the value. If the value is NULL, then the
variable is deleted.
On getting, if the lpValue buffer is not as long as required, as indicated by the count, then the length
actually required to hold the string is returned.
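The layout of the block can be illustrated with a small lookup routine in portable C. This is only a sketch and not an API of Windows: find_in_env_block is a hypothetical name, plain char strings are used, and names are compared case-sensitively although Windows itself does not distinguish case in variable names.

```c
#include <stddef.h>
#include <string.h>

/* Look up a variable in an environment block laid out as
   "Name=Value\0Name=Value\0...\0" (an empty string ends the block).
   Returns a pointer to the value, or NULL if the name is absent. */
const char *find_in_env_block (const char *pBlock, const char *pName)
{
    size_t nameLen = strlen (pName);
    const char *p = pBlock;
    while (*p != '\0') {
        if (strncmp (p, pName, nameLen) == 0 && p[nameLen] == '=')
            return p + nameLen + 1;
        p += strlen (p) + 1;   /* skip to the next string */
    }
    return NULL;
}
```

In a real program, GetEnvironmentVariable() performs this lookup on the block of the calling process.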
PROCESS_QUERY_INFORMATION
CREATE_PROCESS
PROCESS_TERMINATE
PROCESS_SET_INFORMATION
DUPLICATE_HANDLE
CREATE_HANDLE
It can be useful to limit some process rights, for example by giving the PROCESS_TERMINATE right to the
parent process only.
Topic No - 98 (A Pattern Searching Example )
This example uses the power of Windows multitasking to search for a specific pattern among numerous
files.
The program takes the pattern along with the file names from the command line.
The standard output file is specified as inheritable in the new process’s start-up info structure.
As soon as the searches end, the results are displayed one at a time.
The program uses the exit code to identify whether a process has detected a match or not.
The Example
#include "Everything.h"
HANDLE hTempFile;
PROCESS_INFORMATION processInfo;
int iProc;
#ifdef UNICODE
dwCreationFlags = CREATE_UNICODE_ENVIRONMENT;
#endif
if (argc < 3)
GetStartupInfo (&startUpSearch);
GetStartupInfo (&startUp);
CreateFile (procFile[iProc].tempFile,
if (hTempFile == INVALID_HANDLE_VALUE)
startUpSearch.dwFlags = STARTF_USESTDHANDLES;
startUpSearch.hStdOutput = hTempFile;
startUpSearch.hStdError = hTempFile;
hProc[iProc] = processInfo.hProcess;
/* Processes are all running. Wait for them to complete, then output the results. */
/* List the file name if there is more than one file to search */
CloseHandle (processInfo.hProcess);
CloseHandle (processInfo.hThread);
CloseHandle (hProc[iProc]);
if (!DeleteFile (procFile[iProc].tempFile))
return 0;
}
Topic No - 99 (Working in Multiprocessor Environment)
Multiprocessor Environment
If the system is a uniprocessor, the processor time is multiplexed among multiple processes in an
interleaved manner.
If the system is a multiprocessor, then the Windows scheduler can run process threads on separate processors.
The performance gain will not be linear because of dependencies among processes (wait and signal).
From a programming point of view, it is essential to understand this potential of Windows so that
programs can be designed optimally.
Alternatively, it is possible to create independent threads within a process, which can be scheduled on
separate processors.
Process Times
The Windows API provides a very simple mechanism for determining the amount of time a process has
consumed.
BOOL GetProcessTimes( HANDLE hProcess,
LPFILETIME lpCreationTime,
LPFILETIME lpExitTime,
LPFILETIME lpKernelTime,
LPFILETIME lpUserTime );
Elapsed time can be computed by subtracting the creation time from the exit time.
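The subtraction is done on 64-bit values, because a FILETIME holds a count of 100-nanosecond intervals split into two 32-bit fields. A minimal sketch in portable C follows. FILETIME_SKETCH is a stand-in with the same two fields as the Windows FILETIME structure, and the function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in with the same two fields as the Windows FILETIME structure:
   a 64-bit count of 100-nanosecond intervals in two 32-bit halves. */
typedef struct {
    uint32_t dwLowDateTime;
    uint32_t dwHighDateTime;
} FILETIME_SKETCH;

/* Combine the two halves into one 64-bit tick count. */
static uint64_t to_ticks (FILETIME_SKETCH ft)
{
    return ((uint64_t)ft.dwHighDateTime << 32) | ft.dwLowDateTime;
}

/* Elapsed time in milliseconds between creation and exit
   (10,000 ticks of 100 ns make one millisecond). */
uint64_t elapsed_ms (FILETIME_SKETCH creationTime, FILETIME_SKETCH exitTime)
{
    return (to_ticks (exitTime) - to_ticks (creationTime)) / 10000;
}
```

Subtracting the low and high fields separately would give wrong results whenever the low field wraps around, which is why the values are combined first.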
Content:
Example
It uses the API GetCommandLine()to get the command line as a single string
It then uses the SkipArg() function the skip past the executable name.
#include "Everything.h"
int _tmain (int argc, LPTSTR argv[])
STARTUPINFO startUp;
PROCESS_INFORMATION procInfo;
LONGLONG li;
FILETIME ft;
OSVERSIONINFO windowsVersion;
HANDLE hProc;
/* Skip past the first blank-space delimited token on the command line */
if (!GetVersionEx (&windowsVersion))
if (windowsVersion.dwPlatformId != VER_PLATFORM_WIN32_NT)
windowsVersion.dwOSVersionInfoSize = sizeof(OSVERSIONINFO);
if (!GetVersionEx (&windowsVersion))
if (windowsVersion.dwPlatformId != VER_PLATFORM_WIN32_NT)
GetStartupInfo (&startUp);
/* Execute the command line and wait for the process to complete. */
hProc = procInfo.hProcess;
if (WaitForSingleObject (hProc, INFINITE) != WAIT_OBJECT_0)
elTiSys.wMilliseconds);
usTiSys.wMilliseconds);
return 0;
Content:
Terminating another process can be problematic, as the terminated process does not get a chance to
release resources.
SEH does not help either, as there is no mechanism for raising an exception in another
process.
Console control events allow one process to send a signal to another.
Usually, a handler is set up in the target process to catch such signals. The handler can then generate an
exception.
Console control events are sent to a process group. Creating a process with the
CREATE_NEW_PROCESS_GROUP flag of CreateProcess() creates a process group in which the created
process is the root and all the processes it subsequently creates are in the new group.
The target process must have the same console as the process generating the event.
More specifically, the calling process cannot have been created with its own console using the
CREATE_NEW_CONSOLE or DETACHED_PROCESS flags.
BOOL GenerateConsoleCtrlEvent (
DWORD dwCtrlEvent,
DWORD dwProcessGroupId );
Content:
Simple Job Management Shell
The shell supports three commands:
jobbg: run a job in the background
jobs: list all background jobs
kill: terminate a specified job
The job shell parses the command line and then calls the function for the given command.
The shell uses a user-specific file to keep track of process IDs and other job information.
Several shells can run concurrently and use this shared file, so concurrency issues
can arise when several shells try to access the file at the same time.
/* Chapter 6 */
control signal.
*/
#include "Everything.h"
#include "JobManagement.h"
#define MILLION 1000000
LARGE_INTEGER processTimeLimit;
DWORD i, localArgc;
TCHAR argstr[MAX_ARG][MAX_COMMAND_LINE];
LPTSTR pArgs[MAX_ARG];
hJobObject = NULL;
processTimeLimit.QuadPart = 0;
basicLimits.PerProcessUserTimeLimit.QuadPart = processTimeLimit.QuadPart *
MILLION;
hJobObject = CreateJobObject(NULL, NULL);
if (NULL == hJobObject)
pArgs[i] = argstr[i];
while (!exitFlag) {
*pc = _T('\0');
CharLower (argstr[0]);
}
else if (_tcscmp (argstr[0], _T("kill")) == 0) {
exitFlag = TRUE;
CloseHandle (hJobObject);
return 0;
Related commands (jobs, fg, kill, and suspend) can be used to manage the jobs. */
If neither is set, the background process shares the console with jobbg. */
/* - - - - - - - - - - - - - - - - - - - - - - - - - */
/* - - - - - - - - - - - - - - - - - - - - - - */
/* Execute the command line (targv) and store the job id,
DWORD fCreate;
LONG jobNumber;
BOOL flags[2];
STARTUPINFO startUp;
PROCESS_INFORMATION processInfo;
GetStartupInfo (&startUp);
/* Simplifying assumptions: There's only one of -d, -c (they are mutually exclusive.
Also, commands can't start with -. etc. You may want to fix this. */
if (argv[1][0] == _T('-'))
if (hJobObject != NULL)
if (!AssignProcessToJobObject(hJobObject, processInfo.hProcess)) {
CloseHandle (processInfo.hThread);
CloseHandle (processInfo.hProcess);
return 4;
if (jobNumber >= 0)
ResumeThread (processInfo.hThread);
else {
CloseHandle (processInfo.hThread);
CloseHandle (processInfo.hProcess);
return 5;
CloseHandle (processInfo.hThread);
CloseHandle (processInfo.hProcess);
return 0;
Related commands (jobbg and kill) can be used to manage the jobs. */
*/
return 0;
basicInfo.TotalProcesses, basicInfo.ActiveProcesses,
basicInfo.TotalTerminatedProcesses);
return 0;
}
/* kill [options] jobNumber
1. Using TerminateProcess
/* Options:
-b Generate a Ctrl-Break
-c Generate a Ctrl-C
HANDLE hProcess;
if (processId == 0) {
ReportError (_T ("Job number not found.\n"), 0, FALSE);
return 1;
if (hProcess == NULL) {
return 2;
if (cntrlB)
else if (cntrlC)
else
if (!killed) {
return 3;
CloseHandle (hProcess);
return 0;
}
Topic No - 105 (Listing Background jobs)
The jobs command looks up the job file and obtains the status of each process listed in it.
It then displays that status along with other job information.
/* Scan the job database file, reporting on the status of all jobs.
In the process remove all jobs that no longer exist in the system. */
JM_JOB jobRecord;
TCHAR jobMgtFileName[MAX_PATH];
OVERLAPPED regionStart;
if ( !GetJobMgtFileName (jobMgtFileName) )
return FALSE;
FILE_SHARE_READ | FILE_SHARE_WRITE,
return FALSE;
regionStart.Offset = 0;
regionStart.OffsetHigh = 0;
regionStart.hEvent = (HANDLE)0;
__try {
while (ReadFile (hJobData, &jobRecord, SJM_JOB, &nXfer, NULL) && (nXfer > 0)) {
hProcess = NULL;
if (jobRecord.ProcessId == 0) continue;
if (hProcess != NULL) {
CloseHandle (hProcess);
if (NULL == hProcess)
jobRecord.ProcessId = 0;
} /* End of while. */
} /* End of try. */
CloseHandle (hJobData);
return TRUE;
Content:
Finding a Process Id
The function looks up the job file by job number and reads the record at the corresponding location.
HANDLE hJobData;
JM_JOB jobRecord;
TCHAR jobMgtFileName[MAX_PATH+1];
OVERLAPPED regionStart;
LARGE_INTEGER fileSize;
FILE_SHARE_READ | FILE_SHARE_WRITE,
/* Position to the correct record, but not past the end of file */
return 0;
fileSizeLow = fileSize.LowPart;
/* SetFilePointer is more convenient here than SetFilePointerEx since the file is known to be
"short" ( < 4 GB). */
regionStart.hEvent = (HANDLE)0;
LockFileEx (hJobData, 0, 0, SJM_JOB, 0, &regionStart);
CloseHandle (hJobData);
return jobRecord.ProcessId;
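The step "position to the correct record, but not past the end of file" can be illustrated with a small portable sketch. It assumes fixed-size records and job numbers that start at 1; both are assumptions made for this illustration. RECORD_SIZE is a made-up value standing in for SJM_JOB, and job_record_offset is a hypothetical helper, not a function from the course listing.

```c
#include <stdint.h>

/* Hypothetical fixed record size standing in for SJM_JOB. */
#define RECORD_SIZE 264u

/* Return the byte offset of the record for a 1-based job number,
   or -1 if that record would lie past the end of the file. */
static int64_t job_record_offset(uint32_t job_number, uint32_t file_size)
{
    if (job_number == 0 || job_number > file_size / RECORD_SIZE)
        return -1;
    return (int64_t)(job_number - 1) * RECORD_SIZE;
}
```

With fixed-size records the lookup needs no searching: the job number alone determines where to read, and the same offset identifies the region to lock before reading.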
/* Chapter 6 */
/* JobObjectShell.c One program combining three
job management commands:
Jobbg - Run a job in the background
jobs - List all background jobs
kill - Terminate a specified job of job family
There is an option to generate a console
control signal.
This implementation enhances JobShell with a time limit on each process.
There is a time limit on each process, in seconds, in argv[1] (if present)
0 or omitted means no process time limit
*/
#include "Everything.h"
#include "JobManagement.h"
return 0;
}
/* kill [options] jobNumber
Terminate the process associated with the specified job number. */
/* The new features this program illustrates:
1. Using TerminateProcess
2. Console control events */
/* Options:
-b Generate a Ctrl-Break
-c Generate a Ctrl-C
Otherwise, terminate the process. */
Switching among processes is time-consuming and expensive. Threads allow
concurrent processing of the same function with less overhead.
Independent processes are usually not tightly coupled, so sharing resources among them is difficult.
Threads can perform asynchronous, overlapped operations with less programming effort.
Multithreading benefits further from a multiprocessor environment, where threads can be
scheduled on separate processors.
Threads share resources within a process, so one thread may inadvertently modify another thread’s data.
In some cases, concurrency can greatly degrade performance rather than improve it.
For problems with a simple single-threaded solution, multithreading can greatly complicate matters and even
result in poorer performance.
Topic No - 111 (Thread Basics)
However, individual threads may have their own unique data storage in addition to the shared data.
The programmer must ensure that a thread uses only its own data and does not interfere with
shared data.
Moreover, each thread has its own stack for function calls.
The creating thread usually passes an argument to the new thread. This argument is stored on
the thread’s stack.
Each thread can allocate its own Thread Local Storage (TLS) and set and clear values in it.
TLS also ensures that a thread will not modify data in any other thread’s TLS.
Topic No - 112 (Thread Management)
Threads can also be treated as parent and child, although the OS is unaware of that relationship.
CreateThread() requires the starting address of the thread function within the calling process.
It also requires a stack size for the thread; the stack space comes from the process address space.
lpStartAddress is the starting address of the thread function within the calling process, of the form:
DWORD WINAPI ThreadFunc (LPVOID lpThreadParam);
The function accepts a single pointer argument and returns a DWORD value, which is usually an exit
code.
lpThreadParam is the pointer passed to the thread and is usually interpreted as a pointer to a structure
containing arguments.
dwCreationFlags, if 0, means that the thread starts immediately. If it is CREATE_SUSPENDED, the
thread is created suspended and requires a call to ResumeThread() to start execution.
lpThreadId is a pointer to a DWORD that receives the thread identifier. If it is NULL, no
thread identifier is returned.
A thread can end its own execution by calling ExitThread(). An alternative is for the thread function
to return with the exit code.
When a thread exits, its stack is deallocated and the handles referring to the thread are
signaled.
If the thread is linked to a DLL, the DllMain() function is invoked with the reason
DLL_THREAD_DETACH.
If a thread is instead ended from outside with TerminateThread(), thread resources are not
deallocated, completion handlers do not execute, and no notification is sent to attached DLLs.
A thread object continues to exist even after its execution has ended, until the last handle to the
thread is closed with CloseHandle().
Thread IDs and handles can be obtained using functions quite similar to those used with processes.
GetCurrentThread()
GetCurrentThreadId()
GetThreadId()
OpenThread()
Topic No - 115
The thread management functions discussed above are enough to program any useful
threading application. However, later versions of Windows, i.e. Windows XP and Windows Server 2003,
introduced some additional functions that help in writing robust programs. These functions are
described below.
GetProcessIdOfThread()
This function was not available in earlier versions of Windows and requires Windows Server 2003 or
later. Given a thread handle, it returns the ID of the process to which the thread belongs.
It is useful for mapping threads to processes and for programs that manage or interact
with threads in another process.
GetThreadIOPendingFlag()
This function determines whether the thread specified by its handle has any outstanding I/O requests.
For example, the thread might be blocked on an I/O operation. The result is the status at the time
the function is executed; the actual status could change at any time if the target thread completes
or initiates an operation.
Sometimes we need to pause a running thread or resume a paused thread. For this
purpose, Windows maintains a separate suspend count for every thread. A thread
runs only if its suspend count is 0; otherwise the thread is paused and does not execute
until the suspend count returns to 0. One thread can increment or decrement the
suspend count of another thread using SuspendThread() and ResumeThread(). Recall that a thread can be
created in the suspended state, with a count of 1.
Both functions, if successful, return the previous suspend count. 0xFFFFFFFF indicates failure.
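The suspend count rules can be made concrete with a toy model in portable C. This is only a model of the counting behaviour described above, not Windows code: ModelThread, model_suspend, model_resume and model_can_run are hypothetical names, and the upper bound MAX_SUSPEND is an assumption made for the sketch.

```c
#include <stdint.h>

#define SUSPEND_FAILED 0xFFFFFFFFu   /* failure value named in the text */
#define MAX_SUSPEND    127u          /* assumed upper bound for this model */

/* Toy thread record: only the suspend count matters here. */
typedef struct { uint32_t suspend_count; } ModelThread;

/* Increment the count; return the previous count or the failure value. */
static uint32_t model_suspend(ModelThread *t)
{
    if (t->suspend_count >= MAX_SUSPEND) return SUSPEND_FAILED;
    return t->suspend_count++;
}

/* Decrement the count if it is above zero; return the previous count. */
static uint32_t model_resume(ModelThread *t)
{
    uint32_t prev = t->suspend_count;
    if (prev > 0) t->suspend_count--;
    return prev;
}

/* A thread is eligible to run only when its suspend count is zero. */
static int model_can_run(const ModelThread *t)
{
    return t->suspend_count == 0;
}
```

Because the count is a counter and not a flag, a thread that has been suspended twice needs two resume calls before it runs again. A return value of 1 from the resume call tells the caller that this call brought the count to 0.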
Topic No - 116
One thread can wait for another thread to complete or terminate in the same way that threads wait for
process termination. Threads should be synchronized so that an action that depends on another
thread's result is performed only after that thread has completed. A wait function can be used with a
thread handle to check whether a particular thread has completed its execution. Windows treats a
thread as an object, so the two general-purpose wait functions, WaitForSingleObject() and
WaitForMultipleObjects(), can be used with threads as well. WaitForSingleObject()
waits for a single specified object, and WaitForMultipleObjects() waits for more than one
object, passed as an array of handles.
WaitForMultipleObjects() can wait for at most MAXIMUM_WAIT_OBJECTS (64) objects in one call.
If there are more than 64 objects, we can use multiple calls; e.g. if there are 100 objects, we can
create two arrays of 64 and 36 handles and call WaitForMultipleObjects() twice.
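The splitting of a large handle set into several wait calls can be sketched as plain arithmetic. The function below only plans the batch sizes; it does not call any Windows function. MAX_WAIT mirrors the limit of 64 given in the text, and plan_wait_batches is a hypothetical helper name.

```c
#include <stddef.h>

#define MAX_WAIT 64   /* value of MAXIMUM_WAIT_OBJECTS given in the text */

/* Split `total` handles into consecutive batches of at most MAX_WAIT.
   Writes each batch size into sizes[] (capacity `cap`) and returns the
   number of batches, i.e. the number of wait calls needed. */
static size_t plan_wait_batches(size_t total, size_t sizes[], size_t cap)
{
    size_t n = 0;
    while (total > 0 && n < cap) {
        size_t batch = total < MAX_WAIT ? total : MAX_WAIT;
        sizes[n++] = batch;
        total -= batch;
    }
    return n;
}
```

For 100 handles the plan is two batches of 64 and 36, matching the example in the text. Each batch would then be passed to its own wait call, with the array pointer advanced by the sizes of the earlier batches.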
The wait function waits for the object indicated by the handle to become signaled. In the case of
threads, ExitThread() and TerminateThread() set the object to the signaled state, releasing all
threads waiting on the object, including threads that might wait in the future after the thread
terminates. Once a thread handle is signaled, it never becomes nonsignaled.
Note that multiple threads can wait on the same object. Similarly, the ExitProcess() function sets the
process state and the states of all its threads to signaled.
Topic No – 117
C Library in Threads
Suppose we use the Windows threading functions together with C library functions in concurrent
threads; a problem of thread safety can arise. For example, strtok() is a C library function
that extracts tokens from a specified string. It keeps its state in global memory between calls,
so if several threads tokenize different strings at the same time, they all use the same global
state and the results can be corrupted. Such problems are resolved by using the C library
threading functions rather than the Windows threading functions. For this purpose, Windows C
provides a thread-safe C library named LIBCMT. So far, we have used CreateThread() and
ExitThread() to create and exit threads; this library provides the equivalent functions
_beginthreadex() and _endthreadex(), respectively. The library also provides simpler but less
flexible functions: _beginthread() does not allow the caller to specify security attributes, and
_endthread() does not allow a return value and so passes no information about the status of the
thread. When _beginthreadex() is used in a Windows program, its return value must be cast to
HANDLE for further processing, since the declared return type of _beginthreadex() is
not HANDLE.
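The strtok() problem comes from hidden shared state. As an illustration, here is a minimal sketch of a tokenizer that keeps all of its state in a context pointer owned by the caller, so two tokenizations can be interleaved safely. The function next_token is a hand-written, hypothetical helper for this handout, not a C library or Windows function.

```c
#include <string.h>

/* Reentrant tokenizer sketch: all state lives in *ctx, which the caller owns,
   so two tokenizations can be interleaved without disturbing each other.
   Returns the next token in the string, or NULL when none remain. */
static char *next_token(char **ctx, const char *delims)
{
    char *start, *end;
    if (*ctx == NULL) return NULL;
    start = *ctx + strspn(*ctx, delims);      /* skip leading delimiters   */
    if (*start == '\0') { *ctx = NULL; return NULL; }
    end = start + strcspn(start, delims);     /* find the end of the token */
    if (*end == '\0') {
        *ctx = NULL;                          /* last token in the string  */
    } else {
        *end = '\0';                          /* terminate this token      */
        *ctx = end + 1;                       /* resume after it next time */
    }
    return start;
}
```

To use it, point a char * variable at the start of a writable string and pass its address on every call. Each thread, or each string, gets its own context variable, which is exactly what a single global state cannot offer.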
Topic No - 118
In this example of pattern searching with multithreading, the program manages concurrent I/O to
multiple files, and the main thread, or any other thread, can perform additional processing before
waiting for I/O completion. In this way, several files can be processed in a simple and efficient way.
The previous pattern searching examples used multiple processes; here we use
multiple threads, which lets us write a more efficient program.
As an analogy for synchronous input/output, consider a keyboard: when the user presses a
key, execution stops until the key is released. With asynchronous input/output, a
sound card plays a song while other operations are performed at the same time.
Read operations can be performed concurrently: a file can be read by several
processes at the same time, and several files can be read at the same time. Problems arise
when several processes attempt to write to a single file. For now, we limit ourselves to read operations.
The program is given several files in which a specific pattern is to be searched. For every file, a
separate thread is created that searches for the pattern in that file. When the pattern is found in
a file, the match is reported in a temporary file.
/* grepMT. */
/* Parallel grep-- multiple thread version. */
#include "Everything.h"
typedef struct { /* grep thread's data structure. */
int argc;
TCHAR targv[4][MAX_PATH];
} GREP_THREAD_ARG;
GetStartupInfo(&startUp);
DeleteFile ( gArg[iThrd].targv[3]);
/* Adjust thread and file name arrays. */
tHandle[iThrd] = tHandle[threadCount - 1];
_tcscpy(gArg[iThrd].targv[3], gArg[threadCount -1 ].targv[3]);
_tcscpy(gArg[iThrd].targv[2], gArg[threadCount -1 ].targv[2]);
threadCount--;
}
}
A thread argument structure is defined with an argument count (argc) and thread
argument values (targv) that will be passed to each thread. A prototype is also
declared for the thread function, ThGrep, which searches for the pattern in a file.
STARTUPINFO and PROCESS_INFORMATION structures are declared for the startup information and for
the created process, respectively.
Startup information is obtained with the GetStartupInfo function and stored in startUp.
The files to be searched are specified on the command line, and a separate thread runs for
every file given.
A loop runs argc-2 times: in argv, the first two parameters are the program name and the
pattern, and the remaining parameters are the input file names, so the loop runs once per
input file. In this loop, the name of each file is copied and a
temporary file is created for it.
The arguments to be passed to the thread are also prepared: one element of targv
stores the pattern to be searched, another stores the name of the
file in which the pattern is to be searched, and argc stores the count, i.e. how
many arguments are stored.
Each thread is created using _beginthreadex with ThGrep as its thread function and is passed the
arguments that were prepared (gArg). The handle of every thread is stored
in an array (tHandle).
Standard output, standard error, and flags are set in the startup information.
The total number of threads is stored in threadCount.
Another loop then calls a wait function for multiple objects,
specifying the number of threads (threadCount) and the handles of all threads (the tHandle
array). When the wait returns, an exit code is obtained that tells
us how the corresponding thread finished, and the handle of
that thread is closed with the CloseHandle function to release resources.
After the wait returns, a process is created with the
CreateProcess function, which is given the startup information (startUp) and the
process information structure (processInfo).
WaitForSingleObject is then called with
the handle of that process, hProcess.
Once the process has completed, the handles of the process and its
thread are closed.
Topic No – 119
In the multithreaded pattern searching example, one main thread created and managed the
other threads. Each file was assigned a separate thread, which searched for the pattern in that file. This
arrangement resembles the boss-worker model, in which there is one boss and
many workers. The boss assigns work to the workers, and each worker reports its result back to the boss.
There are several other models that help in writing efficient and understandable multithreaded
programs, depending on the scenario.
The work crew model is one in which the workers cooperate on a single task, each performing a small
piece. They might even divide up the work themselves without direction from the boss. Multithreaded
programs can employ nearly every management arrangement that humans use to manage concurrent
tasks.
The client-server model is very widely used: a client sends a request to the server, and the server
runs a thread for that client. A separate thread runs at the server end for every client, so the
work is done concurrently rather than sequentially. Another major model is the pipeline model, where
work moves from one thread to the next.
There are many advantages to using these models when designing a multithreaded program, including
the following.
• Most multithreaded problems can be solved with one of these models, which simplifies and
expedites programming and debugging.
• Models help you obtain the best performance and avoid common mistakes.
• Models naturally correspond to the structure of programming language constructs.
• Maintenance is simplified.
• Troubleshooting is simplified when programs are analyzed in terms of a specific model.
• Synchronization and coordination are simplified using well-defined models.
Topic No – 120
Windows supports multiprocessor systems, i.e. more than one thread can run at the same time on
separate processors. This example shows how multithreading can be used to gain performance on a
multiprocessor system, where each processor runs a different thread. The main idea
is to subdivide the task into similar subtasks and run a separate thread for each subtask: a big
array is divided into smaller parts, each part is sorted by a different thread, and the sorted parts
are merged. This allows parallelism and a better performance gain. The strategy implemented in this
example is the work crew model: the work is divided among the workers, and their results are merged at
the end. The same strategy could be implemented with multiple processes instead of multiple threads,
but the result might not be as efficient, because switching overhead is low between
threads but high between processes.
In the MergeSort example, each subarray is sorted with
qsort() and the sorted subarrays are merged as in the mergesort algorithm. The program assumes that the
number of records is divisible by the number of threads and that the number of threads is a power of
two. Performance is best when the number of processors equals the number of threads.
If a list is subdivided into 4 sublists and 4 threads are created for these
sublists, all the threads must be created in the suspended state and resumed only after all of them
have been created. Otherwise, one thread could complete and try to merge with another thread that
has not yet been created, which is a race condition. The following diagram explains this further.
A large array is divided into 4 smaller subarrays. For each subarray, a separate thread is created, i.e.
thread 0, thread 1, thread 2, and thread 3. When thread 0 has sorted its subarray, it waits for
thread 1 to finish sorting; then the two sorted subarrays are merged. The same happens for thread 2
and thread 3. The two merged subarrays are then merged to form one large sorted array.
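The sort-then-merge pattern can be sketched sequentially in portable C. This sketch deliberately uses no threads: each chunk stands for the work one sorting thread would do, and the merge loop stands for the pairwise merging that follows. It sorts plain int values, whereas the course example sorts records. The names cmp_int, merge_adjacent and chunked_merge_sort are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison function for qsort: ascending order of ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Merge two adjacent sorted runs of n ints each, starting at p, in place.
   Returns 1 on success, 0 if the temporary buffer cannot be allocated. */
static int merge_adjacent(int *p, size_t n)
{
    size_t i = 0, j = 0, k = 0;
    int *q = p + n;
    int *tmp = malloc(2 * n * sizeof *tmp);
    if (tmp == NULL) return 0;
    while (i < n && j < n)
        tmp[k++] = (p[i] <= q[j]) ? p[i++] : q[j++];
    while (i < n) tmp[k++] = p[i++];          /* copy what is left of run 1 */
    while (j < n) tmp[k++] = q[j++];          /* copy what is left of run 2 */
    memcpy(p, tmp, 2 * n * sizeof *tmp);
    free(tmp);
    return 1;
}

/* Sort `count` ints using `parts` equal chunks. parts must be a power of two
   and must divide count. Each chunk plays the role of one worker thread. */
static int chunked_merge_sort(int *data, size_t count, size_t parts)
{
    size_t chunk, i;
    if (parts == 0 || (parts & (parts - 1)) != 0 || count % parts != 0)
        return 0;
    chunk = count / parts;
    for (i = 0; i < parts; i++)               /* phase 1: sort each chunk   */
        qsort(data + i * chunk, chunk, sizeof *data, cmp_int);
    for (; chunk < count; chunk *= 2)         /* phase 2: merge run pairs   */
        for (i = 0; i < count; i += 2 * chunk)
            if (!merge_adjacent(data + i, chunk)) return 0;
    return 1;
}
```

The two restrictions checked at the top are the ones stated in the text: the record count must be divisible by the number of parts, and the number of parts must be a power of two so that runs can always be merged in pairs.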
Topic No – 121
if (NULL == pRecords)
ReportError (_T ("Failure to map input file."), 6, TRUE);
CloseHandle (mHandle);
/* Create the sorting threads. */
lowRecordNum = 0;
for (iTh = 0; iTh < numFiles; iTh++) {
threadArg[iTh].iTh = iTh;
threadArg[iTh].lowRecord = pRecords + lowRecordNum;
threadArg[iTh].highRecord = pRecords + (lowRecordNum + nRecTh);
lowRecordNum += nRecTh;
pThreadHandle[iTh] = (HANDLE)_beginthreadex (
NULL, 0, SortThread, &threadArg[iTh],
CREATE_SUSPENDED, NULL);
}
/* Resume all the initially suspended threads. */
for (iTh = 0; iTh < numFiles; iTh++)
ResumeThread (pThreadHandle[iTh]);
/* Wait for the sort-merge threads to complete. */
WaitForSingleObject (pThreadHandle[0], INFINITE);
for (iTh = 0; iTh < numFiles; iTh++)
CloseHandle (pThreadHandle[iTh]);
/* Print out the entire sorted file. Treat it as one single string. */
stringEnd = (LPTSTR) pRecords + nRec*RECSIZE;
*stringEnd =_T('\0');
if (!noPrint) {
_tprintf (_T("%s"), (LPCTSTR) pRecords);
}
UnmapViewOfFile(pRecords);
// Restore the file length
/* SetFilePointer is convenient as it's a short addition from the file end */
if (!SetFilePointer(hFile, -2, 0, FILE_END) || !SetEndOfFile(hFile))
/* Merge two adjacent arrays, each with nRecs records. p1 identifies the first */
DWORD iRec = 0, i1 = 0, i2 = 0;
LPRECORD pDest, p1Hold, pDestHold, p2 = p1 + nRecs;
pDest = pDestHold = malloc (2 * nRecs * RECSIZE);
p1Hold = p1;
while (i1 < nRecs && i2 < nRecs) {
if (KeyCompare ((LPCTSTR)p1, (LPCTSTR)p2) <= 0) {
memcpy (pDest, p1, RECSIZE);
i1++; p1++; pDest++;
}
else {
memcpy (pDest, p2, RECSIZE);
i2++; p2++; pDest++;
}
}
if (i1 >= nRecs)
memcpy (pDest, p2, RECSIZE * (nRecs - i2));
else memcpy (pDest, p1, RECSIZE * (nRecs - i1));
memcpy (p1Hold, pDestHold, 2 * nRecs * RECSIZE);
free (pDestHold);
return;
}
int KeyCompare (LPCTSTR pRec1, LPCTSTR pRec2)
{
    DWORD i;
    TCHAR b1, b2;
    LPRECORD p1, p2;
    int Result = 0;
    p1 = (LPRECORD)pRec1;
    p2 = (LPRECORD)pRec2;
    for (i = 0; i < KEYLEN && Result == 0; i++) {
        b1 = p1->key[i];
        b2 = p2->key[i];
        if (b1 < b2) Result = -1;
        if (b1 > b2) Result = +1;
    }
    return Result;
}
Introduction to Parallelism
Multiprocessing and multithreading both provide multiple flows of execution, but
multithreading usually has the lower overhead. Windows supports not only multithreading and
multiprocessing but also multiprocessor systems. On a multiprocessor system, we should learn to
program so as to use the potential of the processors, because processor clock speed has reached a
limit beyond which it cannot easily be increased. Parallelization is the key to future performance
improvement, since we can no longer depend on increased CPU clock rates and since multicore and
multiprocessor systems are increasingly common. Previously, various programs have been discussed
that exploit parallelism. The properties that enabled parallelism include the following:
The major task is divided into subtasks, and the subtasks are assigned
to worker threads. The worker subtasks run independently, without
any interaction between them.
As subtasks complete, a master program can merge the results of the subtasks into a
single result.
The programs do not require mutual exclusion of any sort. Only the master thread
synchronizes with each worker, waiting for it to complete.
Ideally, every worker runs as a separate thread on a separate processor.
Program performance scales automatically, up to some limit, as you run on systems with more
processors; the programs themselves do not, in general, determine the processor count on the
host computer. Instead, the Windows kernel assigns worker subtasks to available processors.
If you “serialize” the program, the results should be precisely the same as those of the parallel
program. The serialized program is, moreover, much easier to debug.
A thread is an execution unit. In a multithreaded program, one process can have several threads.
Every thread needs some data that it does not share with other threads and that is unique to it, i.e. it
varies from thread to thread. One technique is to have the creating thread call CreateThread (or
_beginthreadex) with lpvThreadParm pointing to a data structure that is unique for each thread. The
thread can then allocate additional data structures and access them through lpvThreadParm. Windows
also provides Thread Local Storage (TLS), which gives each thread its own array of pointers. The
following figure shows this TLS arrangement.
A process can have many threads, i.e. thread 1, thread 2, etc. Every column in the TLS arrangement
corresponds to a thread, and every row corresponds to a TLS index, i.e. 0, 1, 2, 3, etc. Initially, no
TLS indexes (rows) are allocated, but rows can be allocated and deallocated at any time. Once a
row is allocated, it is available in every column, i.e. to every thread. The primary thread is a
logical choice for TLS space management; however, every thread can access TLS.
• DWORD TlsAlloc(VOID): Allocates an index and returns it as a DWORD.
It returns -1 (TLS_OUT_OF_INDEXES) in case of failure.
• BOOL TlsFree(DWORD dwIndex): Frees the specified index.
• LPVOID TlsGetValue(DWORD dwTlsIndex) and BOOL TlsSetValue(DWORD dwTlsIndex,
LPVOID lpTlsValue): Provided valid indexes are used, the programmer can access TLS space
using these simple get/set APIs.
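The row and column arrangement can be modelled as a small table in portable C. This is a toy model for understanding only, not the Windows implementation. In the real API the calling thread is implicit, while here the thread number is passed explicitly so that the table can be inspected. TlsModel and the model_tls_* functions are hypothetical names, and the table sizes are arbitrary assumptions.

```c
#include <stddef.h>

#define MODEL_THREADS 4                   /* columns: one per thread (assumed) */
#define MODEL_SLOTS   8                   /* rows: TLS indexes (assumed)       */
#define MODEL_OUT_OF_INDEXES 0xFFFFFFFFu  /* failure value, -1 as a DWORD      */

typedef struct {
    int   in_use[MODEL_SLOTS];                 /* which rows are allocated   */
    void *value[MODEL_SLOTS][MODEL_THREADS];   /* one pointer per row/thread */
} TlsModel;

/* Allocate a free row; it becomes usable by every thread at once. */
static unsigned model_tls_alloc(TlsModel *m)
{
    unsigned i, t;
    for (i = 0; i < MODEL_SLOTS; i++) {
        if (!m->in_use[i]) {
            m->in_use[i] = 1;
            for (t = 0; t < MODEL_THREADS; t++) m->value[i][t] = NULL;
            return i;
        }
    }
    return MODEL_OUT_OF_INDEXES;
}

/* Release a row so that it can be allocated again. */
static int model_tls_free(TlsModel *m, unsigned index)
{
    if (index >= MODEL_SLOTS || !m->in_use[index]) return 0;
    m->in_use[index] = 0;
    return 1;
}

/* Store a pointer in one row for one particular thread. */
static int model_tls_set(TlsModel *m, unsigned thread, unsigned index, void *p)
{
    if (thread >= MODEL_THREADS || index >= MODEL_SLOTS || !m->in_use[index])
        return 0;
    m->value[index][thread] = p;
    return 1;
}

/* Fetch the pointer stored in one row for one particular thread. */
static void *model_tls_get(const TlsModel *m, unsigned thread, unsigned index)
{
    if (thread >= MODEL_THREADS || index >= MODEL_SLOTS || !m->in_use[index])
        return NULL;
    return m->value[index][thread];
}
```

The model shows the key property of TLS: one allocated index gives every thread its own slot, so two threads can store different pointers under the same index without seeing each other's value.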
Some Cautions
• TLS provides a convenient mechanism for accessing memory that is global within a thread but
inaccessible to other threads.
• Global storage of a program is accessible by all threads
Topic No – 124
In a multitasking or multithreaded system, a number of processes run at once, each of which
competes for resources such as processor and memory. The operating system is responsible for managing
these resources. The Windows kernel always runs the highest-priority thread that is ready for
execution. A thread is not ready if it is waiting, suspended, or blocked for some reason. Threads
belong to processes and receive priority relative to their process priority class. The process
priority class is set initially when the process is created using CreateProcess, and each class has a
base priority. The classes include:
IDLE_PRIORITY_CLASS for threads that will run only when the system is idle. This is the lowest
priority class.
NORMAL_PRIORITY_CLASS indicating no special scheduling requirements.
HIGH_PRIORITY_CLASS indicating time-critical tasks that should be executed immediately.
REALTIME_PRIORITY_CLASS, the highest possible priority.
The priority class of a process can be set and retrieved using BOOL SetPriorityClass(HANDLE hProcess,
DWORD dwPriority) and DWORD GetPriorityClass(HANDLE hProcess), respectively.
There are two additional priority classes: ABOVE_NORMAL_PRIORITY_CLASS (which is
below HIGH_PRIORITY_CLASS) and BELOW_NORMAL_PRIORITY_CLASS (which is above
IDLE_PRIORITY_CLASS).
Two further values can be passed to SetPriorityClass(). PROCESS_MODE_BACKGROUND_BEGIN lowers the
priority of the process and its threads for background work without affecting the responsiveness of
foreground processes and threads.
PROCESS_MODE_BACKGROUND_END restores the process priority to the value before it was set with
PROCESS_MODE_BACKGROUND_BEGIN.
Thread priorities are either absolute or are set relative to the process base priority. At thread creation
time, the priority is set to that of the process. The relative thread priorities are in a range of ±2 “points”
from the process’s base. The symbolic names of the resulting common thread priorities, starting with
the five relative priorities, are:
THREAD_PRIORITY_LOWEST
THREAD_PRIORITY_BELOW_NORMAL
THREAD_PRIORITY_NORMAL
THREAD_PRIORITY_ABOVE_NORMAL
THREAD_PRIORITY_HIGHEST
THREAD_PRIORITY_TIME_CRITICAL is 15, or 31 if the process class is REALTIME_PRIORITY_CLASS
THREAD_PRIORITY_IDLE is 1, or 16 for REALTIME_PRIORITY_CLASS processes
Thread priorities are dynamic. They change with the priority of the process, and Windows may also
boost a thread's priority as needed. This boosting can be enabled or disabled using
SetThreadPriorityBoost().
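How a class base and a relative thread priority combine can be sketched as a small lookup in portable C. The handout gives the ±2 range and the special values 15, 31, 1 and 16; the per-class base values used below (4, 6, 8, 10, 13, 24) are an assumption taken from Microsoft's published scheduling priority table and should be checked against it. All type and function names are hypothetical, and dynamic boosting is not modelled.

```c
/* Process priority classes, lowest to highest (hypothetical enum). */
typedef enum { CLASS_IDLE, CLASS_BELOW_NORMAL, CLASS_NORMAL,
               CLASS_ABOVE_NORMAL, CLASS_HIGH, CLASS_REALTIME } ProcClass;

/* Thread priority levels; REL_LOWEST..REL_HIGHEST are the five relative ones. */
typedef enum { REL_IDLE, REL_LOWEST, REL_BELOW_NORMAL, REL_NORMAL,
               REL_ABOVE_NORMAL, REL_HIGHEST, REL_TIME_CRITICAL } ThreadRel;

/* Assumed base priority of each class (see the lead-in above). */
static int class_base(ProcClass c)
{
    static const int base[] = { 4, 6, 8, 10, 13, 24 };
    return base[c];
}

/* Base priority of a thread before any dynamic boost is applied. */
static int thread_base_priority(ProcClass c, ThreadRel r)
{
    int realtime = (c == CLASS_REALTIME);
    switch (r) {
    case REL_IDLE:          return realtime ? 16 : 1;    /* absolute values */
    case REL_TIME_CRITICAL: return realtime ? 31 : 15;   /* absolute values */
    default:                /* relative: -2 .. +2 around the class base */
        return class_base(c) + ((int)r - (int)REL_NORMAL);
    }
}
```

Under these assumed bases, a thread at the highest relative level in a high-priority process ends up at 15, the same value as a time-critical thread in any class other than real-time.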
Topic No – 125
Thread States
The following figure shows how the executive manages threads and shows the possible thread states.
This figure also shows the effect of program actions. Such state diagrams are common to all multitasking
OSs and help clarify how a thread is scheduled for execution and how a thread moves from one state to
another.
(From Inside Windows NT, by Helen Custer. Copyright © 1993, Microsoft Press.
Reproduced by permission of Microsoft Press. All rights reserved.)
• A thread is in the running state when it is running on a processor. More than one thread can be
in the running state on a multiprocessor computer.
• The executive places a running thread in the wait state when the thread performs a wait on a
non-signaled handle, such as a thread or process handle. I/O operations will also wait for the
completion of a disk or other data transfer, and numerous other functions can cause waiting. It
is common to say that a thread is blocked, or sleeping, when in the wait state.
• A thread is ready if it could be running. The executive’s scheduler could put it in the running
state at any time. The scheduler will run the highest-priority ready thread when a processor
becomes available, and it will run the one that has been in the ready state for the longest time if
several threads have the same high priority. The thread moves through the standby state before
entering the running state.
• The executive will move a running thread to the ready state if the thread’s time slice expires
without the thread waiting. Executing Sleep(0) will also move a thread from the running state to
the ready state.
• The executive will place a waiting thread in the ready state as soon as the appropriate handles
are signaled, although the thread goes through an intermediate transition state. It is common to
say that the thread wakes up.
• A thread, regardless of its state, can be suspended, and a ready thread will not be run if it is
suspended. If a running thread is suspended, either by itself or by a thread on a different
processor, it is placed in the ready state.
• A thread is in the terminated state after it terminates and remains there as long as there are any
open handles on the thread. This arrangement allows other threads to interrogate the thread’s
state and exit code.
• Normally, the scheduler will place a ready thread on any available processor. The programmer
can specify a thread’s processor affinity, which will limit the processors that can run that specific
thread. In this way, the programmer can allocate processors to threads and prevent other threads
from using these processors, helping to assure responsiveness for some threads. The appropriate
functions are SetProcessAffinityMask and GetProcessAffinityMask. SetThreadIdealProcessor can
specify a preferred processor that the scheduler will use whenever possible; this is less
restrictive than assigning a thread to a single processor with the affinity mask.
Topic No – 126
There are several factors to keep in mind as you develop threaded programs; lack of attention to a few
basic principles can result in serious defects, and it is better to avoid the problems in the first place than
to try to find them during testing or debugging.
Here are a few guidelines. There may be a few inadvertent violations, however, which illustrates the
challenges of multithreaded programming.
• Make no assumptions about the order in which the parent and child threads execute.
It is possible for a child thread to run to completion before the parent, or, conversely, the child
thread may not run at all for a considerable period.
On a multiprocessor computer, the parent and one or more children may even run concurrently.
• Make sure all the initializations required by a child thread have been performed before calling
CreateThread(). If the thread must be created before its data is initialized, use a technique such
as creating the thread suspended and resuming it only after the data is initialized.
Failure by the parent to initialize data required by the child is a common cause of “race
conditions” wherein the parent “races” the child to initialize data before the child needs it.
• Any thread, at any time, can be preempted, and any thread, at any time, may resume execution.
• Do not confuse synchronization and priority. These are different concepts. Priorities are
assigned to threads when they are created, whereas synchronization ensures that one thread
completes a specific task before another thread proceeds.
• Even more so than with single-threaded programs, testing is necessary, but not sufficient, to
ensure program correctness. It is common for a multithreaded program to pass extensive tests
despite code defects. There is no substitute for careful design, implementation, and code
inspection.
• Threaded program behavior varies widely with processor speed, number of processors, OS
version, and more. Testing on a variety of systems can isolate numerous defects, but the
preceding precaution still applies.
• The default stack size for a thread is 1 MB. Make sure the size is sufficient for the thread’s needs.
• Threads should be used only as appropriate. Thus, if there are activities that are naturally
concurrent, each such activity can be represented by a thread. If, on the other hand, the
activities are naturally sequential, threads only add complexity and performance overhead.
• If you use a large number of threads, be careful, as the numerous stacks will consume virtual
memory space and thread context switching may become expensive. What counts as “large”
varies; in some cases it simply means more threads than the number of processors.
• Fortunately, correct programs are frequently the simplest and have the most elegant designs.
Avoid complexity wherever possible.
Topic No – 127
Timed Waits
Some functions allow a thread to wait for a period of time. The Sleep function allows a
thread to give up the processor and move from the running to the wait state for a specified period of
time. After the specified time has elapsed, the thread moves to the ready state and waits there until
it is moved to the running state. A thread can perform a task periodically by sleeping after carrying
out the task. The time period is specified in milliseconds and can even be INFINITE, in which case the
thread will never resume. A value of 0 causes the thread to relinquish the remainder of its time slice;
the kernel moves the thread from the running state to the ready state.
The function SwitchToThread() provides another way for a thread to yield its processor to another
thread, if there is one that is ready to run.
Topic No – 128
Fibers
A fiber, as the name implies, is a piece of a thread. More precisely, a fiber is a unit of execution that
can be scheduled by the application rather than by the operating system. An application can create
numerous fibers, and the fibers themselves determine which fiber will execute next. The fibers have
independent stacks but otherwise run entirely in the context of the thread on which they are scheduled,
having access, for example, to the thread’s TLS and any mutexes owned by the thread. Furthermore,
fiber management occurs entirely in user space outside the kernel. Fibers can be thought of as
lightweight threads, although there are numerous differences. A fiber can execute on any thread, but
never on two at one time. A fiber that is meant to run on different threads at different times should
not access thread-specific data from TLS.
Fiber Uses
Fiber APIs
A set of different API functions are provided that can help create and manage fibers. These functions are
ConvertThreadToFiber() or
ConvertThreadToFiberEx()
After calling this API the thread will now contain a fiber. It will provide a pointer to the fiber data
more or less like thread data. This can be used accordingly.
CreateFiber()
Each new fiber has a start address, stack size, and a parameter. Each fiber is identified by
address and not a handle.
GetFiberData()
GetCurrentFiber()
SwitchToFiber()
It uses the address of the other fiber. The context of the current fiber is saved and the context of
the other fiber is restored. Fibers must explicitly indicate the next fiber that is to run in the
application.
DeleteFiber()
Topic No – 130
Using Fibers
Previously we discussed several APIs used to manage fibers; here we develop a scheme that uses
these APIs. Fibers enable the application to control the switching between units of execution. The
scheme first converts a thread to a fiber and then uses it as the primary fiber. The primary fiber
creates other fibers, and switching among these fibers is managed by the application.
In the figure, the primary thread (top center) calls ConvertThreadToFiber to become the primary fiber
and then creates a number of fibers in a loop. SwitchToFiber transfers execution to a chosen fiber.
The primary fiber switches to Fiber 0, which gets its fiber data and then switches to Fiber 1. Fiber 1
does the same work, i.e. gets its data, and then switches back to the primary fiber. The primary fiber
resumes execution from the point where it last switched away and now switches to Fiber 2, which gets
its data and switches to Fiber 0; control then returns to Fiber 2, and at the end the thread is
closed.
• Master-Slave Scheduling: One fiber decides which fiber to run. Each fiber transfers
execution back to the primary fiber. (Fiber 1)
• Peer to Peer Scheduling: A fiber determines which fiber should execute next, based on some
policy. (Fiber 0 and 2)
Topic No – 131
Threads enable concurrent processing and, on multiple processors, parallelism. However, concurrent
processing has both pros and cons. When many threads run concurrently, they often need to
synchronize. For example, in the boss-worker model there is one boss thread and many worker threads;
the boss thread waits for all workers to complete execution and then compiles all the data. If the boss
runs before a worker completes its execution, the output may be affected, so we are required to
ensure that the boss does not access a worker’s memory until the worker has finished with it.
Similarly, the workers should not start working until the boss has created all the workers, so that a
worker does not try to access the data of another worker that has not been created yet. When
multiple threads use shared data they may require coordination, i.e. while one thread is using the
data, other threads should wait for it. Also, programmers need to ensure that two or more threads do
not modify the same data item simultaneously. If many threads are modifying elements of a queue,
the programmer needs to ensure that two or more threads do not attempt to remove an element at
the same time. Several programming flaws can leave such vulnerabilities in the code.
Example
There are two threads, Thread 1 and Thread 2, and both perform the same operations:
Thread 1        Thread 2
{               {
M=N             M=N
M=M+1           M=M+1
N=M             N=M
}               }
Suppose that the initial value of N is 4. After thread 1 executes, the final values will be
M=5, N=5. When thread 2 starts executing after thread 1 has completed, the initial value of N is 5
(because of thread 1’s execution), so the final values become M=6 and N=6.
Now suppose these same threads run concurrently, as shown in the following figure.
When thread 1 starts executing, the value of N is 4. After two instructions, the thread is switched
out. At that point M is 5 and N is unchanged, because the third instruction has not executed yet.
Thread 2 now starts executing, and N is still 4. It executes all three instructions, and the values
become M=5 and N=5. The threads are switched again, and the remaining instruction of thread 1
executes, storing its M into N, so the final value of N is 5 rather than 6.
Thus, in many situations the final output of concurrent processing may not be the same as that of
sequential processing.
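The two schedules described above can be replayed deterministically with a small simulation in portable C. No real threads are used: each "thread" is a context with its own M and a program counter, and a schedule string names which thread executes its next statement. The names ThreadCtx, Step and RunSchedule are hypothetical, and the sketch assumes M is private to each thread while N is shared.

```c
/* Each simulated thread has its own M and a program counter. */
typedef struct {
    int M;
    int pc;   /* index of the next statement to execute: 0, 1 or 2 */
} ThreadCtx;

static int N;   /* the shared variable */

/* Executes the next statement of the three-statement thread body. */
static void Step(ThreadCtx *t)
{
    switch (t->pc++) {
    case 0: t->M = N;        break;   /* M = N     */
    case 1: t->M = t->M + 1; break;   /* M = M + 1 */
    case 2: N = t->M;        break;   /* N = M     */
    default:                 break;   /* body already finished */
    }
}

/* Runs the schedule, a string of '1' and '2' characters naming the thread
 * that executes its next statement. Returns the final value of N. */
int RunSchedule(int initialN, const char *schedule)
{
    ThreadCtx t1 = { 0, 0 }, t2 = { 0, 0 };
    N = initialN;
    for (; *schedule != '\0'; schedule++)
        Step(*schedule == '1' ? &t1 : &t2);
    return N;
}
```

RunSchedule(4, "111222") models sequential execution and yields 6, whereas RunSchedule(4, "112221") models the switch after two statements of thread 1 and yields 5.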
Topic No – 132
Problems and errors that arise from parallelism or concurrent processing are instances of the critical
section problem, in which a critical resource is used by a number of processes or threads.
The critical section problem can be handled in the following ways:
This solution uses a global variable named Flag to indicate to all threads that a thread is
modifying a variable. A thread sets it to TRUE before modifying a variable and sets it to FALSE after
modifying it. Each thread checks this Flag before modifying the variable. If the Flag is TRUE,
it indicates that the variable is being used by some other thread.
Even in this case, the thread could be preempted between the time Flag is tested and the time Flag is
set to TRUE; the first two statements form a critical code region that is not properly protected from
concurrent access by two or more threads. Another attempted solution to the critical code region
synchronization problem might be to give each thread its own copy of the variable, as follows:
Topic No – 133
Volatile Storage
Volatile storage is a compiler-level facility for shared variables that are changed by operations such
as increments; it reduces the conflicts that arise because of thread switching.
Latent Defects
Even if the synchronization problem is somehow resolved, some latent problems may remain.
Execution may switch to another thread while a variable’s value has been modified in a register but
not yet written back to memory. The use of registers for intermediate operations is a compiler
optimization technique.
Turning off optimization would avoid this, but it may adversely affect the performance of the whole
program. ANSI C provides the qualifier volatile for this purpose. A variable with
a volatile qualifier will always be read from memory for operations and will always be stored to
memory after any operation. A volatile qualifier also means that the variable can change at any
time. A volatile qualifier should only be used where necessary, as it degrades performance.
Using volatile
• Use volatile even if the variable is read-only for two or more threads, when the outcome of the
threads depends on its new value.
Topic No - 134
Cache Coherency:
The volatile qualifier does not assure that changes are visible to processors in a desired order.
Processors usually hold values in cache before writing them back to memory. This may alter
the sequence in which different processors see the values.
Memory Barriers:
To assure that memory is accessed in the desired order, use memory barriers, also called memory
fences. The interlocked functions provide a memory barrier. The concept is further clarified by the
diagram showing 4 processors in a system on two dual-core chips.
The diagram depicts 4 processor cores on two chips. Each core has its own registers holding
intermediate values of variables. Each core has separate Level-1 caches for instructions and data,
and the cores on each chip share a larger L2 cache. Main memory is shared among all cores.
The volatile qualifier only assures that new data values will be updated in L1. There is no
guarantee that the values will be visible to other threads running on different processors. Memory
barriers assure that main memory is updated and that the caches of all processors are coherent.
For example, suppose core 0 updates a variable N using a memory barrier. If the L1 cache of core 3
also contains a value of N, then the value in its cache is either updated or removed so that core 3
can access the new, coherent value of N. This certainly incurs a large cost. Moving data within the
core registers costs less than a cycle, whereas moving data from one core to another via main
memory can cost hundreds of cycles.
Topic No - 135
Interlocked Functions:
Interlocked functions are best suited when a volatile shared variable only needs to be incremented,
decremented, or exchanged. They are simple, fast, and easy to use.
However, they do carry a performance cost because they generate a memory barrier.
InterlockedIncrement() and InterlockedDecrement() both take a 32-bit signed variable as a
parameter, which should be stored on a 4-byte boundary in memory. They should be used whenever
possible to improve performance.
InterlockedIncrement(&N);
N should be a volatile integer placed on an appropriate memory boundary. The function returns the
new value of N. However, some other thread may preempt the caller before the value is returned and
change the value of N. Do not call the function twice to increment the value by two, as the thread
may be preempted between the two calls. Use InterlockedExchangeAdd() instead.
Topic No - 136
Another criterion for correct thread code is that global storage should not be used for local storage
purposes. If a global variable is used to store thread-specific data, it will be used by other threads
as well. This results in incorrect behavior no matter how the rest of the code is written. The
following example shows incorrect usage of a global variable.
DWORD N;
....
...
N = 2 * pArgs->Count; ...
N is global but is used to store thread-specific information taken from the thread’s parameter
structure. For example, if many thread functions are running at the same time and each function sets
N from its own parameters, no function will be able to use N properly, because N is global and all the
threads are changing its value at the same time. As a result, the final results will be
incorrect.
It is important to know when to use local variables and when to use global variables.
If you know a variable contains thread-specific information, make it local. You can
make a variable global if you know the information in it will be used by other threads.
Adhering to such practices can become even more convoluted when single-threaded programs are
converted to multithreaded programs whose threads run in parallel. For example:
DWORD N=0;
....
...
DWORD result;
...
result=...;
return result;
The code above is for a single-threaded program. The program executes sequentially. The
function is called repeatedly in a for loop. When the function is called, the information it returns is
stored in N, and the process is repeated for the next call and so on. Because this is a
single-threaded program, only one call executes at a time. If you convert this code to a
multithreaded program, every thread will store into the same N and you will face issues.
Topic No - 137
The following guidelines for writing thread-safe code help ensure that the code runs smoothly in a
threaded environment. Code is said to be thread safe when more than one thread can run it without
introducing synchronization issues.
• Variables that are required locally should not be accessible globally. They should be placed
on the stack, in a data structure passed to the thread, or in the thread’s TLS. If a function is
used by several threads and it contains a variable that is thread specific, such as a counter, then
store it in TLS or in a data structure dedicated to the thread. Do not store it in global memory.
• Avoid race conditions. If some required variables are uninitialized, create the threads
suspended until the variables are initialized. If some condition needs to be met before a code block
is executed, make sure the condition is met by waiting on synchronization objects.
• A thread should not change the process environment. Changing the environment in one thread
will affect other threads. A thread should not change the standard input or output devices, and
it should not change environment variables. As an exception, the primary thread may change the
process environment; in that case the same environment is used by the rest of the threads.
As a principle, the primary thread ensures that no other thread changes the process
environment.
• All variables that are meant to be shared among threads are kept either global
or static. They are protected by synchronization or interlocked mechanisms using memory
barriers.
Topic No - 138
Windows supports different types of synchronization objects. The objects can be used to
enforce synchronization and mutual exclusion.
Synchronization Mechanism
There are always inherent risks, such as deadlocks, involved in the use of such objects. Care
is required while using them. In later versions of Windows, other objects such as SRW
locks and condition variables are also available. Other advanced objects are waitable
timers and I/O completion ports.
Topic No - 139
CRITICAL_SECTION Objects
Critical Section
A critical section is the part of the code that can only be executed by one thread at a time. Windows
provides CRITICAL_SECTION objects as a simple lock mechanism for solving the critical
section problem. CRITICAL_SECTION objects are initialized and deleted but do not have
handles and are not shared among processes. Only one thread at a time can be in the CS,
although many threads may enter and leave the CS at numerous instances.
When a thread wants to enter the critical section it must call EnterCriticalSection(), and when
it leaves it should call LeaveCriticalSection().
If a thread has entered a CS, it can enter again. Windows maintains a count, and the thread has to
leave as many times as it enters. This is implemented to support recursive functions and to make
shared library functions thread safe. There is no timeout for EnterCriticalSection(). A thread will
remain blocked forever if a matching Leave call is not made.
However, a thread can always poll to see whether another thread owns a CS by using the function
TryEnterCriticalSection().
If the function returns TRUE, the calling thread now owns the CS. If it returns
FALSE, the CS is owned by some other thread and it is not safe to enter the CS.
Since CRITICAL_SECTION is a user-space object, it has apparent advantages over
kernel-space objects.
Topic No - 140
In this module, we will study how a critical section can be used to protect shared
variables.
Using the critical section construct is easy and intuitive. Consider an example of a server
that maintains status-related information in shared variables, such as the number of
requests, the number of responses, and the number of requests currently being processed. Since
such count variables are shared, only one thread can be allowed to modify them at a time.
The CRITICAL_SECTION construct can easily ensure this. In this solution, an
intermediate variable is also used to emphasize the role of the CRITICAL_SECTION.
Topic No 141:
In this module, we will discuss guidelines for protecting shared variables, how to use a
synchronization object, and what kind of mapping exists between synchronization objects and
variables, such as one to one or one to many.
All the variables within the critical section must be guarded by a single object. Using different
objects within the same thread, or using different objects across numerous threads that share the
same data, would be incorrect. Shared variables must be protected by a single object across all
threads for mutual exclusion to work.
Below is given an example of incorrect use of synchronization objects. In this example, two
different objects are used to guard the same variable N. Such code may generate incorrect results;
therefore, guard each shared variable with a single object.
Topic No 142:
Producer-Consumer Problem
Producer-Consumer Threads
The producer-consumer problem is a classical problem in mutual exclusion. It has many versions. Here
we describe a simple version. This version clearly shows how to build and protect data
structures for storing objects. We also discuss how to establish invariant properties of variables,
which are always TRUE outside the CS. In addition to the primary thread there are two more threads:
a producer and a consumer. The producer periodically creates a message. The message is
contained in a table. The consumer displays the message on request of the user. The displayed
data must be the most recent, and no data should be displayed twice. Data must not be displayed
while it is being updated by the producer, and old data must not be displayed. Some of the produced
messages may never be used and may be lost. This is also like the pipeline model, in which a message
moves from one thread to another. The producer also computes a simple checksum of the message. The
consumer validates the message by checking the checksum. If the consumer accesses the table while it
is being updated, the table will be invalid. The CS ensures that this does not happen. The invariant
is that the checksum is correct for the current message contents.
Topic No 143:
Producer-Consumer
The producer-consumer problem is a classical problem in mutual exclusion. Based on the constraints
and functions described above, the following program is developed:
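/* Chapter 9. simplePC.c */
#include "Everything.h"
#include <time.h>
#define DATA_SIZE 256
typedef struct MSG_BLOCK_TAG { /* Message block */
    CRITICAL_SECTION mGuard; /* Guards the message block structure */
    DWORD fReady, fStop; /* Message-ready and stop flags */
    volatile LONG nCons, mSequence; /* Messages consumed; sequence number */
    DWORD nLost;
    time_t mTimestamp;
    DWORD mChecksum; /* Message contents checksum */
    DWORD mData[DATA_SIZE]; /* Message contents */
} MSG_BLOCK;
/* One of the following conditions holds for the message block: */
/* 1) !fReady || fStop */
/* 2) fReady && the data, mChecksum, and mTimestamp are valid */
MSG_BLOCK mBlock = { 0, 0, 0, 0, 0 };
DWORD WINAPI Produce (void *);
DWORD WINAPI Consume (void *);
void MessageFill (MSG_BLOCK *);
void MessageDisplay (MSG_BLOCK *);
int _tmain (int argc, LPTSTR argv[])
{
    DWORD status;
    HANDLE hProduce, hConsume;
    InitializeCriticalSection (&mBlock.mGuard);
    hProduce = CreateThread (NULL, 0, Produce, NULL, 0, NULL);
    if (hProduce == NULL)
        ReportError (_T("Cannot create Producer thread"), 1, TRUE);
    hConsume = CreateThread (NULL, 0, Consume, NULL, 0, NULL);
    if (hConsume == NULL)
        ReportError (_T("Cannot create Consumer thread"), 2, TRUE);
    status = WaitForSingleObject (hConsume, INFINITE);
    if (status != WAIT_OBJECT_0)
        ReportError (_T("Failed waiting for Consumer thread"), 3, TRUE);
    status = WaitForSingleObject (hProduce, INFINITE);
    if (status != WAIT_OBJECT_0)
        ReportError (_T("Failed waiting for Producer thread"), 4, TRUE);
    DeleteCriticalSection (&mBlock.mGuard);
    return 0;
}
DWORD WINAPI Produce (void *arg)
{
    srand ((DWORD)time(NULL)); /* Seed the random # generator */
    while (!mBlock.fStop) {
        /* Random Delay */
        Sleep(rand()/100);
        EnterCriticalSection (&mBlock.mGuard);
        __try {
            if (!mBlock.fStop) {
                mBlock.fReady = 0;
                MessageFill (&mBlock);
                mBlock.fReady = 1;
                InterlockedIncrement (&mBlock.mSequence);
            }
        }
        __finally { LeaveCriticalSection (&mBlock.mGuard); }
    }
    return 0;
}
DWORD WINAPI Consume (void *arg)
{
    TCHAR command, extra;
    while (!mBlock.fStop) { /* Only this thread uses stdin and stdout */
        _tprintf (_T("\n**Enter 'c' for Consume; 's' to stop: "));
        _tscanf (_T("%c%c"), &command, &extra);
        if (command == _T('s')) {
            /* No CS needed here; this is not a read/modify/write.
             * The Producer will see the new value after the Consumer returns */
            mBlock.fStop = 1;
        } else if (command == _T('c')) {
            EnterCriticalSection (&mBlock.mGuard);
            __try {
                if (mBlock.fReady == 0)
                    _tprintf (_T("No new messages. Try again later\n"));
                else {
                    MessageDisplay (&mBlock);
                    mBlock.fReady = 0; /* No new message is ready */
                    InterlockedIncrement(&mBlock.nCons);
                }
            }
            __finally { LeaveCriticalSection (&mBlock.mGuard); }
        } else {
            _tprintf (_T("Illegal command. Try again.\n"));
        }
    }
    return 0;
}
void MessageFill (MSG_BLOCK *msgBlock)
{
    /* Called by the producer while it owns mGuard */
    DWORD i;
    msgBlock->mChecksum = 0;
    for (i = 0; i < DATA_SIZE; i++) {
        msgBlock->mData[i] = rand();
        msgBlock->mChecksum ^= msgBlock->mData[i];
    }
    msgBlock->mTimestamp = time(NULL);
    return;
}
void MessageDisplay (MSG_BLOCK *msgBlock)
{
    /* Display message buffer, mTimestamp, and validate mChecksum */
    DWORD i, tcheck = 0;
    for (i = 0; i < DATA_SIZE; i++)
        tcheck ^= msgBlock->mData[i];
    _tprintf (_T("\nMessage generated at: %s"), _tctime (&msgBlock->mTimestamp));
    _tprintf (_T("First and last entries: %lx %lx\n"),
        msgBlock->mData[0], msgBlock->mData[DATA_SIZE-1]);
    if (tcheck == msgBlock->mChecksum)
        _tprintf (_T("GOOD ->mChecksum was validated.\n"));
    else
        _tprintf (_T("BAD ->mChecksum failed. Message was corrupted.\n"));
    return;
}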
Topic No 144:
Mutexes
In this module we will discuss mutexes. Mutex is a short form of mutual exclusion. Windows
provides an object called a mutex, and using this object we can enforce mutual exclusion.
Mutexes have some advantages beyond CRITICAL_SECTIONs. Mutexes can be named and have
handles. They can also be used for interprocess synchronization between threads in different
processes. For example, if two processes share files through memory maps, then mutexes can be
used for protection. Mutexes also allow timeout values. Mutexes automatically become
signaled once abandoned by a terminating thread. A thread gains ownership of a mutex by waiting
successfully on the mutex handle using WaitForSingleObject() or WaitForMultipleObjects().
Ownership is released using ReleaseMutex(). A thread should take care to release a mutex
as soon as possible. A thread can acquire a single mutex several times; it will not be blocked
if it already has ownership. The recursive property therefore holds for mutexes also.
• CreateMutex()
• ReleaseMutex()
• OpenMutex()
lpMutexName is the name of the mutex. The name is assigned as per the rules of the Windows
namespace.
OpenMutex() is used to open an existing named mutex. An open operation always follows a
create operation. The main thread would usually create a mutex, while other threads can open it.
This construct allows threads from different processes to be synchronized.
ReleaseMutex() releases the thread’s ownership of the mutex. It fails if the mutex
is not owned by the calling thread.
Topic No 145:
In this module we will discuss deadlocks. Deadlocks may arise when mutexes or
CRITICAL_SECTIONs are used.
Concurrency objects must be used carefully; otherwise their use can lead to a deadlock situation.
Deadlocks are a byproduct of concurrency control. A deadlock occurs when two or more threads each
hold a lock and wait for a lock held by another, so that none of them can proceed.
An Example
For instance, there are two lists (A and B) with the same structure, maintained by different worker
threads. In one situation, an operation may be allowed only if a certain element is
either present in both lists or in neither. An operation is invalid if an element is present in just one.
In another situation, an element in one list cannot be in the other. In both situations,
concurrency objects are required for both lists. Using a single mutex for both lists will degrade
performance by restricting concurrent updates. However, if each list has its own mutex and two
threads acquire the two mutexes in opposite orders, each thread may end up holding one mutex
while waiting for the other.
One method to avoid this situation is to maintain a hierarchy. All threads should follow the same
hierarchy. They should acquire and release mutexes in the same order. In the example, the
deadlock can be easily avoided if both threads acquire the mutexes in the same order.
Another technique is to place the mutexes in an array and use WaitForMultipleObjects()
with the fWaitAll flag. In this case, the thread will acquire either both mutexes or none. This
technique is not possible with CRITICAL_SECTIONs.
Topic No 146:
Mutexes and CS
This module compares mutexes and CRITICAL_SECTIONs. Mutexes and CSs are similar in the
sense that they are used to achieve the same goal. Both can be owned by a single thread, and other
threads trying to gain access to a resource are denied access until the resource is released.
However, mutexes offer a few advantages, at the cost of some performance drawbacks.
Advantages of Mutexes
Mutexes that are abandoned due to the abrupt termination of a thread are automatically signaled
so that waiting threads are not blocked permanently. However, abrupt termination of a thread
indicates a serious programming flaw. Mutex waits can time out, whereas CS waits cannot. Because
they can be named, mutexes are shareable among threads in different processes. Threads that
create mutexes can acquire immediate ownership (a slight convenience). However, in most
cases, CSs will work considerably faster.
Topic No 147:
Semaphores
This module is about semaphores. A semaphore is simply a data structure that is used for
concurrency control.
Semaphore Count
A semaphore maintains a count. A semaphore is in the signaled state when the count is greater than 0
and is unsignaled when the count is 0. Threads use the wait functions on a semaphore. The
count is decremented when a waiting thread is released. The following functions are used:
CreateSemaphore(), CreateSemaphoreEx(), OpenSemaphore(), and ReleaseSemaphore().
cReleaseCount gives the amount by which the count is increased by the release and must be greater
than 0. The call will fail and return FALSE if it would cause the count to exceed the maximum, and the
count will remain unchanged. Also, in this case, the previous count returned by the call will not be
valid.
Any thread can release the semaphore, not just the one that acquired it. Hence, there
is no concept of abandonment.
Topic No 148:
Using Semaphores
This module is about using semaphores. We have already discussed semaphores. Now we will
discuss how to use the structure and functions in a program to enforce mutual exclusion.
Semaphore Concepts
Classically, a semaphore count represents the number of available resources. The semaphore
maximum represents the total number of resources. In a producer-consumer scenario, the producer
would place an element in the queue and call ReleaseSemaphore(). The consumer would wait on the
semaphore, which decrements the count, and then consume the item.
Thread Creation
Previously, in many examples, numerous threads were created in a suspended state and were not
activated until all the threads were created. This problem can be handled by semaphores without
suspending threads. A newly created thread waits on a semaphore that was initialized
to 0. The boss thread then calls ReleaseSemaphore() with the count set to the number of
threads.
Topic No 149:
Semaphore Limitation
Limitations
Some limitations are encountered while using Windows semaphores. How can a thread request to
decrease the count by 2 or more? The thread will have to wait twice on the semaphore. Calling
wait twice is not atomic, and execution may switch in between.
Consider two threads, thread1 and thread2, each of which calls the wait function twice on the same
semaphore. The problem is that the two waits are separate calls and therefore are not atomic.
A switch may occur in thread1 after its first wait completes, and execution may reach
the first wait in thread2. Each thread may then hold one unit while waiting for another, resulting in
a deadlock-like situation. So this is one of the limitations of semaphores.
How can we deal with this limitation? Make the two waits atomic by guarding them with a
CRITICAL_SECTION, so that no other thread can begin its own pair of waits in between.
Other Solutions
Another solution one might suggest is to use WaitForMultipleObjects() with an array holding
multiple references to the same semaphore. However, this solution does not work: the call to
WaitForMultipleObjects() fails when it detects duplicate handles in the array. Secondly, all the
handles may be signaled even if the count is only 1.
Topic No 150:
Events
In this module, we will discuss Events. Events are synchronization objects, like Mutexes,
Semaphores and Critical Sections, that can be used for concurrency control.
Events can indicate to other threads that a specific condition now holds, such as a message
being available. Multiple threads can be released from a wait simultaneously when an event is
signaled. Events are classified as either manual-reset or auto-reset; this property is set when
the event is created with CreateEvent(). A manual-reset event can signal several waiting threads
simultaneously and must be reset explicitly. An auto-reset event signals a single waiting
thread, and the event is reset automatically. The functions used to handle events are
CreateEvent(), CreateEventEx(), OpenEvent(), SetEvent(), ResetEvent() and PulseEvent().
CreateEvent() creates an event. bManualReset is set to TRUE to create a manual-reset event. If
bInitialState is TRUE, the event is initially in the signaled state. A named event can be opened
using OpenEvent() from any process.
SetEvent(), ResetEvent() and PulseEvent() are used to control events. A thread signals the event
by calling SetEvent(). In the case of auto-reset, a single thread is released out of many, and
the event automatically returns to the non-signaled state. If no threads are waiting, the event
remains in the signaled state until a thread waits on it; that thread is then released
immediately.
If the event is manual-reset, it remains signaled until a thread explicitly calls ResetEvent().
Meanwhile, all the waiting threads are released, and any other thread that waits on the event
before the reset is also released immediately.
PulseEvent() releases all threads currently waiting on a manual-reset event, and the event is
then automatically reset. In the case of an auto-reset event, PulseEvent() releases a single
waiting thread, if any.
Topic No 151:
In this module, we will discuss Event Usage Models. Events can be used according to different
models, and the choice of model depends on the usage.
Combinations
There are four distinct ways to use events. These four models result from combining
SetEvent() or PulseEvent() with auto-reset or manual-reset events. Each combination proves useful
depending on the situation. The combinations are highlighted in the table.
Understanding Events
An auto-reset event can be conceptualized as a door that shuts automatically after one thread
has passed through, whereas a manual-reset door does not shut automatically once opened.
PulseEvent() opens the door and then shuts it again: in the case of auto-reset, a single waiting
thread passes through, while in the case of manual-reset, all the waiting threads pass through.
SetEvent(), on the other hand, simply opens the door and leaves it open until a thread passes
through (auto-reset) or until ResetEvent() is called (manual-reset).
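The four combinations can be summarized with a small state model. This is a sketch in portable C under stated assumptions, not the Windows implementation: ModelEvent, model_set_event() and model_pulse_event() are hypothetical names, and the "waiters" field simply counts threads that would currently be blocked on the event.

```c
#include <stdbool.h>

/* Portable model of an event object; this is NOT the Windows API. */
typedef struct {
    bool manual_reset;  /* true: manual-reset, false: auto-reset   */
    bool signaled;      /* is the "door" currently open?           */
    int  waiters;       /* threads currently blocked on the event  */
} ModelEvent;

/* Models SetEvent(): returns how many waiting threads are released. */
int model_set_event(ModelEvent *e) {
    int released;
    if (e->manual_reset) {
        released = e->waiters;   /* everyone passes; door stays open   */
        e->waiters = 0;
        e->signaled = true;
    } else if (e->waiters > 0) {
        released = 1;            /* one passes; door shuts again       */
        e->waiters--;
        e->signaled = false;
    } else {
        released = 0;            /* door stays open for the next waiter */
        e->signaled = true;
    }
    return released;
}

/* Models PulseEvent(): release current waiters, then always reset. */
int model_pulse_event(ModelEvent *e) {
    int released;
    if (e->manual_reset)
        released = e->waiters;               /* all current waiters */
    else
        released = (e->waiters > 0) ? 1 : 0; /* at most one waiter  */
    e->waiters -= released;
    e->signaled = false;
    return released;
}
```

With three waiters, SetEvent on an auto-reset event releases one thread, SetEvent on a manual-reset event releases all three and stays signaled, and PulseEvent releases one or three respectively and always leaves the event non-signaled.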
Topic No 152:
Producer Consumer
Here is another example that provides a solution to our Producer-Consumer problem using Events.
The solution uses Mutexes rather than CSs. A combination of auto-reset and SetEvent() is used in
the producer to ensure only one thread is released. Mutexes in the program make access to the
message data structure mutually exclusive while an event is used to signal the availability of new
messages.
#include "Everything.h"
#include <time.h>
time_t mTimestamp;
} MSG_BLOCK;
DWORD status;
if (hProduce == NULL)
if (hConsume == NULL)
if (status != WAIT_OBJECT_0)
ReportError (_T("Failed waiting for consumer thread"), 3, TRUE);
if (status != WAIT_OBJECT_0)
CloseHandle (mBlock.mGuard);
CloseHandle (mBlock.mReady);
return 0;
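DWORD WINAPI Produce (void *arg)
{
    /* Create new messages at random intervals until told to stop. */
    while (!mBlock.fStop) {
        /* Random Delay */
        Sleep (rand () / 5);
        WaitForSingleObject (mBlock.mGuard, INFINITE);
        __try {
            if (!mBlock.fStop) {
                mBlock.fReady = 0;
                MessageFill (&mBlock);
                mBlock.fReady = 1;
                InterlockedIncrement(&mBlock.mSequence);
                SetEvent (mBlock.mReady); /* Signal that a message is ready. */
            }
        }
        __finally { ReleaseMutex (mBlock.mGuard); }
    }
    return 0;
}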
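DWORD WINAPI Consume (void *arg)
{
    DWORD ShutDown = 0;
    TCHAR command[10] = _T("");
    /* Consume the next message when prompted by the user. */
    while (!ShutDown) {
        _tprintf (_T("\n**Enter 'c' for Consume; 's' to stop: "));
        _tscanf (_T("%9s"), command);
        if (command[0] == _T('s')) {
            WaitForSingleObject (mBlock.mGuard, INFINITE);
            ShutDown = mBlock.fStop = 1;
            ReleaseMutex (mBlock.mGuard);
        } else if (command[0] == _T('c')) {
            /* Wait for the event indicating that a message is ready. */
            WaitForSingleObject (mBlock.mReady, INFINITE);
            WaitForSingleObject (mBlock.mGuard, INFINITE);
            __try {
                if (!mBlock.fReady) __leave;
                MessageDisplay (&mBlock);
                InterlockedIncrement(&mBlock.nCons);
                mBlock.fReady = 0; /* No new messages are ready. */
            }
            __finally { ReleaseMutex (mBlock.mGuard); }
        } else {
            _tprintf (_T("Illegal command. Enter another command.\n"));
        }
    }
    return 0;
}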
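void MessageFill (MSG_BLOCK *msgBlock)
{
    DWORD i;
    msgBlock->mChecksum = 0;
    for (i = 0; i < DATA_SIZE; i++) {
        msgBlock->mData[i] = rand();
        msgBlock->mChecksum ^= msgBlock->mData[i];
    }
    msgBlock->mTimestamp = time(NULL);
    return;
}
void MessageDisplay (MSG_BLOCK *msgBlock)
{
    DWORD i, tcheck = 0;
    TCHAR timeValue[26];
    /* Recompute the checksum over the message contents. */
    for (i = 0; i < DATA_SIZE; i++)
        tcheck ^= msgBlock->mData[i];
    _tctime_s (timeValue, 26, &(msgBlock->mTimestamp));
    _tprintf (_T("\nMessage number %d generated at: %s"),
        msgBlock->mSequence, timeValue);
    _tprintf (_T("First and last entries: %x %x\n"),
        msgBlock->mData[0], msgBlock->mData[DATA_SIZE-1]);
    if (tcheck == msgBlock->mChecksum)
        _tprintf (_T("GOOD ->mChecksum was validated.\n"));
    else
        _tprintf (_T("BAD ->mChecksum failed. Message was corrupted.\n"));
    return;
}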
Content Development
Topic No -154
Content:
1)- Timeouts: when you wait on any object, you should specify a timeout;
otherwise the thread may be blocked for an indefinite period. The programmer
should take care of this.
2)- When a thread is in a critical section (CS) and terminates without
releasing it, the other threads waiting to enter the CS can never proceed and
remain blocked.
3)- Mutexes offer a way out of the above problem. If a thread or process that
owns a mutex terminates, the mutex is abandoned, and abandoned mutex handles
are signaled. This is a useful feature not available with CSs: the other
threads are not blocked forever as in the CS case. An abandoned mutex,
however, indicates a flaw in the program.
4)- If you call WaitForSingleObject() on a mutex with a timeout, the program
must check whether the wait ended because of the timeout. In that case the
thread does not own the mutex and must not access the guarded resource.
5)- Exactly one of the waiting threads at a time is given ownership of the
mutex. Only the OS scheduler decides which thread is chosen, according to its
scheduling policy. A program should not assume that any particular thread has
priority over another.
6)- A single mutex can be used to define several critical regions for several
threads that compete for the same resource.
9)- The mutex should be stored in the same data structure as the resource it
guards, because mutexes correspond to resources.
10)- An invariant is a property of the guarded data that must hold outside the
critical region; it tells you whether concurrency has been enforced correctly.
12)- Avoid complex conditions and decision structures for entering a critical
region. Each critical region must have one entry and one exit.
13)- Ensure that the mutex is locked on entry and unlocked on exit.
14)- Avoid premature exits from the critical region, such as break, return or
goto statements. Termination handlers are useful for protecting against such
problems.
15)- If the critical code region becomes too lengthy (longer than one page,
perhaps) but all the logic is required, consider putting the code in a function
so that the synchronization logic is easy to read and comprehend.
Topic No -155
Content:
----------------------------------
The interlocked functions provide a simple mechanism for synchronizing
access to a variable that is shared by multiple threads. They perform
their operations on the variable atomically. The threads of different
processes can also use these functions if the variable is in shared memory.
Interlocked functions are as useful as they are efficient; they are implemented
using atomic machine instructions (for this reason, they are sometimes called
“compiler intrinsic statements”).
Some Interlocked Functions and their details:
======================================
LONG InterlockedExchange(
LONG volatile *Target,
LONG Value);
It returns the previous value of *Target and sets its value to Value.
===========================================
LONG InterlockedExchangeAdd(
LONG volatile *Addend,
LONG Increment);
It adds Increment to *Addend and returns the previous value of *Addend.
========================================
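The return-value conventions of these functions are easy to confuse, so a portable analogue may help. The sketch below uses the C11 <stdatomic.h> operations rather than the Windows functions themselves; model_exchange(), model_exchange_add() and model_increment() are hypothetical names chosen to mirror InterlockedExchange(), InterlockedExchangeAdd() and InterlockedIncrement().

```c
#include <stdatomic.h>

/* Analogue of InterlockedExchange(): store value, return the OLD value. */
long model_exchange(atomic_long *target, long value) {
    return atomic_exchange(target, value);
}

/* Analogue of InterlockedExchangeAdd(): add, return the OLD value. */
long model_exchange_add(atomic_long *addend, long increment) {
    return atomic_fetch_add(addend, increment);
}

/* Analogue of InterlockedIncrement(): add 1, return the NEW value. */
long model_increment(atomic_long *addend) {
    return atomic_fetch_add(addend, 1) + 1;
}
```

Note the difference: the exchange functions return the value the variable held before the call, whereas the increment returns the value after it.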
--------------------------------------------------
-----------------------------------
a)- Each thread that performs memory management can create a handle to its
own heap using HeapCreate(). Memory allocation is then performed
using HeapAlloc() and HeapFree() rather than malloc() and free().
----------------------------------
Lecture-157-Synchronization Performance Impact
----------------------------------
a)- Locking, waiting and even interlocked operations are time consuming.
b)- Locking requires kernel operations, and waiting is expensive.
c)- Only one thread at a time can execute a critical code region, which reduces
concurrency and produces almost serialized execution.
d)- Many processes competing for memory and cache can produce
unexpected effects.
----------------------------------
Lecture-158-159-Synchronization
Performance Impact
----------------------------------
Reading Material
We arrange data in memory in such a way as to avoid cache conflicts, and
allocations are aligned on optimal cache boundaries.
Following are the main modifiers used for minimal memory contention
3)-Using Mutexes-MX
Following are the main inferences we drew by running the programs using
these scenarios.
a)-Real Time
b)-User Time
c)-System Time
Inferences
1)- NS (no synchronization) and IN (interlocked functions) exhibit almost the
same time for this example.
4)- Any type of locking (even IN) is more expensive than no locking, but we
cannot discard it because of the significance of synchronization.
Clock Rate
5)- Mutexes are very slow, and unlike the behavior with CSs, performance
degrades rapidly as the processor count increases. For instance, Table 9–1
shows the elapsed times (seconds) for 64 threads and 256,000 work
units on 1-, 2-, 4-, and 8-processor systems. CS performance, however,
improves with processor count and clock rate.
—-----------------------
While working with synchronization, we face the problem of false sharing
contention.
A cache is divided into cache lines, whose size depends on the processor. If a
large array has several elements on a single cache line and a thread accesses
any element on that line, then the whole line is locked.
If we store more than one element on a line, we have to face the
consequences of false sharing contention. If any one element on a line is
locked, the other elements on that line are locked too and can only be accessed
serially.
------------
It is better to align the variables so that each one is placed on a different
cache line. We have to use this type of alignment in order to avoid false
sharing. Of course, this type of alignment is expensive in terms of memory.
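The alignment idea can be sketched in portable C11 as follows. The 64-byte cache line size is an assumption (the real size depends on the processor), and PaddedCounter, PackedCounter and the two stride functions are hypothetical names used only for this illustration.

```c
#include <stddef.h>

#define CACHE_LINE_SIZE 64   /* assumption: 64-byte cache lines */

/* Each counter is aligned to, and therefore padded out to, a full line. */
typedef struct {
    _Alignas(CACHE_LINE_SIZE) volatile long value;
} PaddedCounter;

/* Unpadded layout for comparison: neighbours can share a cache line. */
typedef struct {
    volatile long value;
} PackedCounter;

PaddedCounter padded_counters[4];   /* one slot per worker thread */
PackedCounter packed_counters[4];

/* Distance in bytes between the slots of two neighbouring threads. */
size_t padded_stride(void) { return sizeof padded_counters[0]; }
size_t packed_stride(void) { return sizeof packed_counters[0]; }
```

The padded array keeps each thread's counter on its own line, at the price of occupying 4 x 64 bytes instead of 4 x sizeof(long) bytes, which is the memory cost mentioned above.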
A Critical Section works in user space. It does not work in kernel space and
does not require complicated system calls, whereas mutexes invoke the kernel
for locking and unlocking; ReleaseMutex(), for example, requires a system
call.
Lock bit is on
===============
When a thread enters a CS by calling EnterCriticalSection(), the lock bit is
tested. If the lock bit is off, no other thread is in the CS, so the lock bit
is set atomically and the operation proceeds without waiting. Locking and
unlocking a CS is therefore very efficient and requires just a couple of
machine-level instructions.
The ID of the owning thread is stored in the CS data structure, which also
records the recursive calls, so recursion is handled within the critical
section as well.
=========
The spin count tells how many times a tight loop repetitively tests the lock
bit without yielding the processor. When the count is exhausted, the thread
gives up and calls WaitForSingleObject().
- The spin count determines the number of times the loop repeats, and it is used on
multiprocessor systems.
LeaveCriticalSection() turns off the lock bit and, in case there are any waiting threads,
also informs the kernel with a ReleaseSemaphore() call.
We can specify the spin count when we initialize the Critical Section, using the function
InitializeCriticalSectionAndSpinCount().
According to MSDN, 4000 is a good spin count for heap-related functions, but it depends on the
application; for example, if you have a short critical section, then a small spin count will
work optimally.
The spin count should be set according to the dynamics of the application and the number of
processors.
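The lock bit and spin count can be modelled with a C11 atomic flag. This is only a sketch of the idea, not the Windows CRITICAL_SECTION implementation: ModelCS and the model_cs_* functions are hypothetical names, and where a real CS would block in the kernel the model simply returns false.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Portable model of the CS lock bit; NOT the Windows CRITICAL_SECTION. */
typedef struct {
    atomic_flag lock_bit;
    unsigned spin_count;
} ModelCS;

void model_cs_init(ModelCS *cs, unsigned spin_count) {
    atomic_flag_clear(&cs->lock_bit);   /* lock bit starts off */
    cs->spin_count = spin_count;
}

/* Test-and-set the lock bit up to spin_count + 1 times.
   Returns true if the lock was obtained while spinning; false means the
   caller would now give up the processor and block in the kernel. */
bool model_cs_try_enter(ModelCS *cs) {
    for (unsigned attempt = 0; attempt <= cs->spin_count; attempt++) {
        if (!atomic_flag_test_and_set(&cs->lock_bit))
            return true;    /* bit was off; it is now on and we own it */
    }
    return false;
}

void model_cs_leave(ModelCS *cs) {
    atomic_flag_clear(&cs->lock_bit);   /* turn the lock bit off */
}
```

Spinning only pays off when another processor can release the lock during the loop, which is why the spin count matters on multiprocessor systems.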
—------------------------------
Slim reader/writer (SRW) locks are used to optimize synchronization and minimize the
overheads due to synchronization and mutual exclusion.
- In shared mode, reading is possible, and different threads can read the data at the same
time.
SRWs can be used in either Exclusive mode or Shared mode, but a lock cannot be
upgraded or downgraded once acquired: Exclusive mode cannot be converted into Shared
mode, and Shared mode cannot be converted into Exclusive mode. A thread has to decide,
before acquiring an SRW, whether the mode is exclusive or shared.
SRWs are lightweight ("slim"): an SRW lock is only the size of a pointer, 32 or 64
bits. No kernel objects are associated with SRWs, and hence SRW locks require
minimal resources.
SRWs do not support recursion, unlike Critical Sections, so SRWs are simpler and faster.
The spin count value of an SRW is preset optimally and cannot be set manually.
========================
InitializeSRWLock()
Similarly, there are two APIs which access SRWs in shared mode
-AcquireSRWLockShared
-ReleaseSRWLockShared
There are also two APIs which are used to access SRWs in Exclusive mode
-AcquireSRWLockExclusive
-ReleaseSRWLockExclusive
Using SRWs
If a thread needs to read the shared data, it acquires the SRW in shared mode, but
a thread that writes the data uses the SRW in exclusive mode.
SRWs are used just like a CS or mutex in exclusive mode, and are used in shared
mode if the guarded variable is not changed by the thread.
- They do not allow recursion, because supporting recursion requires extra bookkeeping,
which costs execution time.
Evaluating SRW performance using threads shows that SRWs are almost twice as fast
as Critical Sections.
Just like memory contention, systems also have to face thread contention.
As the number of threads increases, it causes serious performance issues for the following
reasons:
1)- When a thread is created, almost 1MB of space is reserved for that thread's stack, and as
the number of threads increases this memory piles up.
2)- With a large number of threads, context switching can become time consuming.
4)- If you minimize the number of threads, then you can lose the benefits of parallelism and
synchronization.
Optimizations:
Semaphore Throttles
The Solution:
More variations:
• This technique works well with older versions of Windows. NT6 onwards has
optimization techniques of its own, so this technique should be used with care in
the case of NT6.
• This technique can also be applied to reduce contention for other resources, such as
files and memory.
Topic 168
Thread Pools
Thread pools are easy to use. Existing programs that use
threads can also be easily modified to use thread pools.
The application creates a work object rather than a thread. Each work object is
submitted to the thread pool.
Each work object has a callback function and is identified by a handle-like
structure.
The thread pool manages a small number of “worker threads”.
169
Thread Pools APIs:
This API is used to register a work object (i.e. a callback function and a
parameter).
The Windows thread pool decides when to invoke the callback
function of a work object, and on which thread.
pwk is the reference returned by the CreateThreadpoolWork() call.
The callback function associated with pwk will be executed once. The thread
used to run the callback function is determined by the kernel scheduler.
The programmer does not need to manage threads, but synchronization still needs to be
enforced.
The wait function does not have a timeout. It returns when all the callback
functions have returned.
Optionally, it can also cancel the work objects whose callback functions have not
yet started, using the second parameter. Callbacks that have started will run to
completion.
pwk is the thread pool work object.
Work is the work object, and Context is the pv value passed to the work object
creation call.
The Instance is the callback instance, which provides information
about the instance that enables the kernel to schedule callbacks.
CallbackMayRunLong() informs the kernel that the callback instance may run for a
longer period. Normally callback instances are expected to run for a short period.
Content Development
Content:
According to MSDN, a thread pool is a collection of worker threads that efficiently execute
asynchronous callbacks on behalf of the application. The thread pool is used to reduce the number of
application threads and provide management of the worker threads.
An application that primarily performs parallel processing, processes independent work items in the
background, or performs an exclusive wait on kernel objects can benefit from the thread pool.
Every process has a dedicated thread pool. When callback functions are registered with the thread pool,
they are executed by the process thread pool. We cannot be sure exactly how many threads
there are in a specific process thread pool. The purpose of a pool thread is to take (or map) the
callback function of a work object and execute it. When a callback function finishes executing, the
corresponding thread becomes free and is available to be mapped to another work object.
If we have a lot of work objects but limited threads within the thread pool, then those work objects
compete with each other for the threads. This is normal behavior. It is the
responsibility of the Windows kernel to resolve the competition; a system programmer has little control
over it. If the programmer thinks that the default threads in a thread pool are not sufficient, he/she may
create another thread pool using CreateThreadpool(), which might provide benefits or might further
degrade performance, depending on the situation.
Content:
Whenever we create a work object, we also associate a callback function with it.
CreateThreadpoolWork() associates a work object with a callback function, and that callback function
is mapped to a thread from the thread pool when it runs. A callback function typically performs
computations and I/O.
There can be several types of callback functions. Some of them are listed below:
Content:
We have discussed four locking mechanisms along with thread pools. We also compared the
performance of each using different programs.
Many software and hardware factors affect the performance of each mechanism. However, we can
generalize the performance in the following order, from highest to lowest:
Content:
One of the early processors made by Intel had a clock speed of 4.77 MHz. As time progressed, clock
speeds increased, which means more instructions can be executed per second and hence
performance increases. But there is a bottleneck in clock speed: it cannot be increased indefinitely.
Most systems today, whether laptops or servers, have clock rates in the 2–3GHz range. So the
question arises: how can we further enhance computational performance given the clock speed
bottleneck? Here comes the concept of multi-core processors: a single chip with multiple
computational units or cores. The chip makers are marketing multicore chips with 2, 4, or more
processors on a single chip. In turn, system vendors are installing multiple multicore chips in their
systems so that systems with 4, 8, 16, or more total processors are common.
In the conventional programming paradigm, a programmer doesn't usually write programs keeping
the multi-core architecture in mind. We have to use parallel programming constructs like threading
in our programs, which is far more complex than conventional serial
programming. Therefore, if you want to increase your application’s performance, you will need to
exploit the processor’s inherent parallelism using threads, for example with the Boss/Worker model
that we have discussed previously.
But it comes with a cost: we have to use synchronization constructs, concurrency control, mutual
exclusion etc. When we use these constructs frequently, performance might degrade. Also,
adding more and more computational cores to microprocessor chips doesn’t guarantee that
performance will also improve. After a certain limit, the performance begins to drop.
Usage of Parallelism:
Importance of Parallelism:
The first step in writing a program that uses parallelism is to identify the parallel
components within the program. So you must be proficient in writing parallel programs and able to
pinpoint which components can be executed in parallel.
Once you have identified the parallel components, the next step is to implement those components in a
parallel manner. You have to use synchronization constructs such as mutual exclusion.
● In this “do it yourself” (DIY) approach, thread management and synchronization are managed by
the programmer.
● The DIY approach is useful and effective in smaller programs with simple parallel structures.
● DIY can become complex and error prone when recursion is used.
● Thread pool enables advanced kernel scheduling and resource allocation methods to enhance
performance.
● However, these APIs are only available in NT6 and above.
Frameworks have extensions of programming languages for expressing parallelism. Certain constructs
and APIs sets are available in a specific framework that can be used in our programs to implement
parallelism.
● Loop parallelism: In loop parallelism, every loop iteration can execute concurrently. For
example: matrix multiplication. In loop level parallelism our task is to extract parallel tasks from
the loops. These parallel tasks will then be assigned to individual processor cores and hence it
will reduce computational time.
● Fork-Join parallelism: In this type of parallelism, a function call can run independently from the
calling program, which eventually must wait for the called function to complete its task. In fork-
join parallelism, the control flow divides (like the shape of the fork) into multiple flows that join
later.
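As an illustration of loop parallelism, the matrix multiplication mentioned above can be written so that each iteration of the outer loop is an independent task. The sketch below assumes an OpenMP-capable compiler; the matrix size N and the function name matrix_multiply() are chosen only for illustration. If OpenMP is not enabled, the pragma is ignored and the loop runs serially with the same result.

```c
#define N 3   /* small fixed size, chosen only for illustration */

/* c = a * b for N x N matrices. Each iteration of the outer loop writes a
   different row of c, so the iterations are independent of one another
   and can be assigned to different processor cores. */
void matrix_multiply(double a[N][N], double b[N][N], double c[N][N]) {
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            double sum = 0.0;   /* private to each (i, j) pair */
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
    }
}
```

No locking is needed here because no two iterations write the same element; a shared running total, by contrast, would need synchronization.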
Framework Features:
● OpenMP: This framework is open source, portable and scalable. Numerous compilers, such as
Visual C++, support OpenMP. It supports multi-platform shared-memory parallel programming.
Complete documentation can be found at: https://www.openmp.org/
● Intel Threading Building Blocks (TBB): It is a flexible performance library that contains a set of APIs
which can add parallelism to applications. More information can be found at:
https://www.intel.com/content/www/us/en/developer/tools/oneapi/onetbb.html#gs.rodjo
● Cilk++: It adds extensions to ordinary serial programming to perform parallel programming. It
supports C and C++. More information can be found at: https://cilk.mit.edu/
Use of threads is direct and straightforward; however, there are numerous pitfalls. To overcome these
pitfalls we have discussed many techniques. These challenges may also manifest while using parallelism
frameworks.
● Identifying independent subtasks is not simple, especially when
working with older code designed for serialized systems.
● Too many subtasks can degrade performance.
● Too much locking can degrade performance.
● Global variables cause problems. A global variable may, for example, contain a count that is
updated on every iteration; when the iterations run in parallel, access to it must be synchronized.
● Subtle performance issues arise due to memory cache architecture and multicore chips.
Up till now it has been assumed that each thread is free to use any processor of a multiprocessor
system. The kernel makes scheduling decisions and allocates processors to each thread. This is natural
and conventional and is almost always the best approach. However it is possible to assign a thread to a
specific processor by setting processor affinity. A processor affinity is the ability to direct or bind a
specific thread, or process, to use a specified core. When implementing processor affinity, our goal is to
schedule a certain thread or a process on a certain subset of CPU cores.
Processor Affinity can be used in a number of ways. You can dedicate a processor to a small set of
threads and exclude other threads from that processor. However, Windows can still schedule its own
threads on the processor.
1. One of the advantages of defining processor affinity is to minimize delay caused due to memory
barriers (concurrency control constructs etc.) For example, you can assign a collection of
threads to a processor-pair that shares L2 Cache. Processor affinity can effectively decrease
cache issues.
2. If you want to test a certain processor core(s) then processor affinity may be used for diagnostic
purposes.
3. Worker threads that contend for a single resource can be allocated to a single processor by
setting up processor affinity.
Each process has its own process affinity mask, and there is also a system affinity mask. These
masks are basically bit vectors.
Affinity masks are of type DWORD_PTR and are used as bit vectors, with one bit per processor. A set
of APIs is used to get and set these masks. Some of them are described below:
GetProcessAffinityMask() reads both the process and the system affinity masks. On a
single-processor system the value of both masks will be 1.
SetProcessAffinityMask() sets the process affinity mask, which is inherited by any child process.
The new mask must be a subset of the system mask returned by GetProcessAffinityMask(). The new
value affects all the threads in the process.
The thread masks are set by a similar function, SetThreadAffinityMask(). These functions are not
designed consistently: SetThreadAffinityMask() returns a DWORD_PTR, which is the previous mask,
while SetProcessAffinityMask() returns a BOOL.
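Since the masks are bit vectors, the subset rule can be checked with ordinary bit operations. The following is a portable sketch, not the Windows API: AffinityMask stands in for DWORD_PTR, the three function names are hypothetical, and rejecting an empty mask is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uintptr_t AffinityMask;  /* stand-in for the Windows DWORD_PTR */

/* Mask with only the bit for the given processor number set. */
AffinityMask mask_for_processor(unsigned processor) {
    return (AffinityMask)1 << processor;
}

/* A new mask is acceptable only if it is non-empty and every processor
   it names is also present in the allowed mask (a subset test). */
bool mask_is_valid_subset(AffinityMask new_mask, AffinityMask allowed) {
    return new_mask != 0 && (new_mask & ~allowed) == 0;
}

/* Number of processors named in a mask. */
unsigned mask_processor_count(AffinityMask mask) {
    unsigned count = 0;
    while (mask != 0) {
        mask &= mask - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

For example, on a four-processor system the system mask is 0xF; the mask 0x5 (processors 0 and 2) is a valid subset, while 0x10 (processor 4) is not.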
Previously, we discussed how to create processes and how to create threads. Also we discussed how to
control concurrency. Now we will discuss how to pass information among processes. This is achieved via
Inter-process communication (IPC). It is a mechanism that allows processes to communicate with each
other. The communication between these processes can be seen as a method of co-operation between
them.
Usually a file-like object called a pipe is used for IPC. There are two types:
1. Anonymous Pipes
2. Named Pipes
Anonymous Pipes:
Anonymous pipes allow one-way (half-duplex) communication. They can be used to perform byte-based
IPC. Each pipe has two handles: a read handle and a write handle.
Interprocess communication occurs between two processes. In this case one process can be the parent
process and the other can be a child. Suppose the parent process wants to write output which the child
will read. In this case the parent will have to pass the *phRead handle to the child. In other words,
the phRead handle should belong to the child process and phWrite should belong to the parent
process. This is usually done using the start-up structure as discussed previously.
Reading from a pipe read handle will block if the pipe is empty. Otherwise ReadFile() reads as many
bytes as there are in the pipe, up to the number specified in the ReadFile() call. Similarly, a write
operation will block if the pipe's buffer is full.
Anonymous pipes work one way. For two-way operation, two pipes will be required.
The given example is built in such a way that the parent process creates two child processes.
The child processes are piped together. The parent process sets up the child processes in such a way
that their standard input and output are redirected via the pipe. Child processes are designed
such that they accumulate data and ultimately process it.
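#include "Everything.h"
int _tmain (int argc, LPTSTR argv [])
/* Pipe together two programs whose names are on the command line:
Redirect command1 = command2
where the two commands are arbitrary strings.
command1 uses standard input, and command2 uses standard output.
Use = so as not to conflict with the DOS pipe. */
{
    DWORD i;
    HANDLE hReadPipe, hWritePipe;
    TCHAR command1 [MAX_PATH];
    SECURITY_ATTRIBUTES pipeSA = {sizeof (SECURITY_ATTRIBUTES), NULL, TRUE};
    PROCESS_INFORMATION procInfo1, procInfo2;
    STARTUPINFO startInfoCh1, startInfoCh2;
    LPTSTR targv, cLine = GetCommandLine ();
    GetStartupInfo (&startInfoCh1);
    GetStartupInfo (&startInfoCh2);
    if (cLine == NULL)
        ReportError (_T ("\nCannot read command line."), 1, TRUE);
    targv = SkipArg(cLine, 1, argc, argv);
    i = 0; /* Get the two commands. */
    while (*targv != _T('=') && *targv != _T('\0') && i < MAX_PATH - 1)
        command1 [i++] = *targv++;
    command1 [i] = _T('\0');
    if (*targv != _T('='))
        ReportError (_T("No command separator found."), 2, FALSE);
    /* Skip past the = and white space to the start of the second command */
    targv++;
    while ( *targv != _T('\0') && (*targv == _T(' ') || *targv == _T('\t')) )
        targv++;
    if (*targv == _T('\0'))
        ReportError (_T("Second command not found."), 2, FALSE);
    if (!CreatePipe (&hReadPipe, &hWritePipe, &pipeSA, 0))
        ReportError (_T("Anonymous pipe create failed."), 3, TRUE);
    /* The first child writes its standard output to the pipe. */
    startInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
    startInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    startInfoCh1.hStdOutput = hWritePipe;
    startInfoCh1.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, command1, NULL, NULL, TRUE,
            0, NULL, NULL, &startInfoCh1, &procInfo1))
        ReportError (_T("CreateProcess for command1 failed."), 4, TRUE);
    CloseHandle (procInfo1.hThread);
    CloseHandle (hWritePipe);
    /* The second child reads its standard input from the pipe. */
    startInfoCh2.hStdInput = hReadPipe;
    startInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    startInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
    startInfoCh2.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, targv, NULL, NULL, TRUE,
            0, NULL, NULL, &startInfoCh2, &procInfo2))
        ReportError (_T("CreateProcess for command2 failed."), 5, TRUE);
    CloseHandle (procInfo2.hThread);
    CloseHandle (hReadPipe);
    WaitForSingleObject (procInfo1.hProcess, INFINITE);
    WaitForSingleObject (procInfo2.hProcess, INFINITE);
    CloseHandle (procInfo1.hProcess);
    CloseHandle (procInfo2.hProcess);
    return 0;
}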
Named Pipes:
Named pipes are the general-purpose mechanism for implementing IPC-based applications like
networked file access and client/server systems.
The above figure shows an illustrative client/server relationship, and the pseudocode shows the scheme
for using named pipes.
In this figure, the server creates multiple instances of the same pipe, each of which can support a
client. The server also creates a thread for each named pipe instance, so that each client has a
dedicated thread and named pipe instance.
CreateNamedPipe() creates an instance of the named pipe and returns a handle. Here is the
specification of this function:
Details of Parameters:
The pipe name has the form:
\\.\pipe\pipeName
The period (.) stands for the local machine. Creating a pipe on a remote machine is not possible.
The name is case insensitive and can be up to 256 characters long. It can contain any character except
backslash.
First call to the CreateNamedPipe() function will create a named pipe and an instance. Closing the last
handle to an instance will delete the instance and the named pipe.
A client connects to a named pipe by calling CreateFile() with the pipe name. In many cases, the client
and server are on the same machine.
If the server is on a different machine, the name takes this form:
\\servername\pipe\pipename
When the server is local, using the period (.) as the server name rather than the local machine name
delivers significantly better connection performance.
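The two name forms can be built and checked with a small helper. This is a portable sketch using narrow strings; build_pipe_name() and PIPE_NAME_MAX are hypothetical names, "server1" and "demo" in the usage note are made-up values, and applying the 256-character limit to the whole string is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define PIPE_NAME_MAX 256   /* length limit stated in the text */

/* Build "\\<server>\pipe\<name>" into out; server "." means local machine.
   Returns false if name is empty, contains a backslash, or the result
   does not fit in out or exceeds PIPE_NAME_MAX characters. */
bool build_pipe_name(char *out, size_t out_size,
                     const char *server, const char *name) {
    if (name[0] == '\0' || strchr(name, '\\') != NULL)
        return false;
    int n = snprintf(out, out_size, "\\\\%s\\pipe\\%s", server, name);
    return n > 0 && (size_t)n < out_size && n <= PIPE_NAME_MAX;
}
```

Calling build_pipe_name(buf, sizeof buf, ".", "demo") yields \\.\pipe\demo for a local server, and passing "server1" as the server yields \\server1\pipe\demo for a client connecting to a remote machine.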
There are seven functions to interrogate pipe status information. Some functions are also used to set
the state information. They are mentioned briefly:
● GetNamedPipeHandleState()
Returns information whether the pipe is in blocking or non-blocking mode, message oriented or
byte oriented, number of pipe instances, and so on.
● SetNamedPipeHandleState()
Allows the program to set the same state attributes. The mode and other values are passed by
reference, so that NULL can be passed to indicate that no change is desired.
● GetNamedPipeInfo()
Determines whether the handle is for client or a server, buffer sizes, and so on.
● Some functions get information regarding client name, client and server session ID and process
ID such as GetNamedPipeClientSessionId(),
GetNamedPipeServerProcessId()
The server creates a named pipe instance. Once a pipe is created the server would wait for a client to
connect.
The ___ register of Real Time Clock is used to enable interrupt on various events
like alarm time and time-up duration
Status Register A
Status Register B