
CD Unit 4

lecture notes

Uploaded by

21wh1a05h9
Copyright
© © All Rights Reserved
RUN-TIME ENVIRONMENTS

At the end of this chapter, the reader will be able to understand:
* Storage organization.
* Heap management.
* Symbol table and its organization.

The compiler cooperates with the operating system and other systems software to support the various source-language constructs, such as operators, data types, parameters, procedures and flow-control constructs. The compiler does this by creating and managing a Run-Time Environment (RTE), and it is assumed that the target programs are executed in it. The RTE deals with the allocation of storage locations, access to variables, passing of parameters, I/O devices, etc.

6.1 STORAGE ORGANIZATION

In terms of compiler writing, the target program being executed runs in its own logical address space. This logical address space is managed and organized with the close coordination of the compiler, the operating system, and the target machine. A run-time object program in the logical address space is typically divided into code and data areas, as shown:

    Code
    Static
    Heap
    Free Memory
    Stack

FIGURE 6.1: Subdivision of run-time memory into code and data areas

The addressing type used in the target machine (such as register addressing, byte addressing, etc.) influences the storage allocation and layout for the data objects. The allocation can be one of two types:
1. Aligned - here, the variables are placed at addresses that match the alignment requirements of the target machine, which may leave unused padding between them.
2. Packed - used when sufficient space is not available; variables are placed in contiguous blocks, end-to-end. It may be necessary to execute some additional instructions at run time to position the packed data so that it can be operated on as if it were properly aligned.

The areas are used as follows:
* Code - this area is used to store the generated target code; the size of the generated code is fixed at compile time.
* Static - the size of some program data objects may also be known at compile time. These objects are placed in the Static area.
* Stack and Heap - these areas are placed at the two ends of the unused space, and grow towards each other. The Stack area is used to store the Activation Records generated during procedure calls, and is basically used for temporary or short-lived data, while the Heap area is used to allocate and deallocate arbitrary, dynamic chunks of storage.

Of these areas, the Code and Static areas are determined statically at compile time, which allows the addresses of the objects to be compiled into the target code itself. The Stack and Heap areas are dynamic, which helps to improve space utilization.

6.1.1 Static vs Dynamic Storage Allocation

Static                                      | Dynamic
Also known as compile-time allocation.      | Also known as run-time allocation.
The decision is made from the text of the   | The decision can be made only while the
program; it does not depend on the          | program is running.
program's run-time behaviour.               |
Ex: Code and Static areas                   | Ex: Stack and Heap areas

Sometimes, compilers allocate dynamic space using a combination of both the Stack and the Heap areas:
* Stack - used for local and short-term variables.
* Heap - used for non-local and long-term variables; memory is allocated when the objects are created and deallocated when they are nullified. Garbage collection enables the reuse of space allocated to useless data elements by detecting such elements, even if the programmer does not return their space explicitly.

6.2 STACK ALLOCATION OF SPACE

The Stack area is used to allocate space for the local variables whenever a procedure is called; this space is popped off the stack when the procedure terminates. This arrangement has 2 advantages:
* It allows space to be shared by procedures whose durations are non-overlapping.
* It allows us to compile code for a procedure so that the relative addresses of its non-local variables are always the same.
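The three kinds of storage described above can be seen side by side in a few lines of C. This is only a minimal sketch: the names call_count, sum_to and make_squares are invented for illustration and do not come from the text, and where each object really ends up depends on the compiler and the target machine.

```c
#include <assert.h>
#include <stdlib.h>

/* Static area: size and address are fixed before the program runs. */
static int call_count = 0;

/* Stack: `total` and `i` live in the activation of sum_to and vanish
   when it returns, so later calls can reuse the same space. */
int sum_to(int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    call_count++;
    return total;
}

/* Heap: the array outlives the call that created it; the caller must
   release it with free(). Returns NULL if n <= 0 or allocation fails. */
int *make_squares(int n) {
    if (n <= 0) return NULL;
    int *a = malloc((size_t)n * sizeof *a);
    if (a == NULL) return NULL;
    for (int i = 0; i < n; i++)
        a[i] = i * i;
    call_count++;
    return a;
}
```

A caller would use sum_to(4) like any ordinary function, and would pair every successful make_squares(n) with a later free(). That explicit free is exactly the manual deallocation burden that the heap brings and the stack avoids.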
6.2.1 Activation Trees

Activation trees are used to describe the nesting of procedure calls, which is what makes stack allocation feasible. The nesting of procedure calls can be illustrated using the following example.

Example 6.1
Consider a sorting program which reads nine integers into an array 'a' and sorts them using the recursive Quicksort algorithm.

    int a[11];

    void readArray() {
        /* Reads 9 integers into a[1], ..., a[9] */
        int i;
        ...
    }

    int partition(int m, int n) {
        /* Picks a separator value v, and partitions a[m..n] so that
           a[m..p-1] are less than v, a[p] = v, and a[p+1..n] are
           equal to or greater than v. Returns p. */
        ...
    }

    void quicksort(int m, int n) {
        int i;
        if (n > m) {
            i = partition(m, n);
            quicksort(m, i - 1);
            quicksort(i + 1, n);
        }
    }

    main() {
        readArray();
        a[0] = -9999;
        a[10] = 9999;
        quicksort(1, 9);
    }

FIGURE 6.2: A typical Quicksort program

The main function performs three tasks:
* call readArray
* set the end points (sentinels) a[0] and a[10]
* call quicksort on the input array

One possible sequence of calls for the execution of the program can be depicted as follows:

    enter main()
        enter readArray()
        leave readArray()
        enter quicksort(1, 9)
            enter partition(1, 9)
            leave partition(1, 9)
            enter quicksort(1, 3)
                ...
            leave quicksort(1, 3)
            enter quicksort(5, 9)
                ...
            leave quicksort(5, 9)
        leave quicksort(1, 9)
    leave main()

FIGURE 6.3: Possible flow of the program of fig. 6.2

Procedure activations are nested: if an activation of procedure p calls procedure q, then that activation of q must end before the activation of p can end. We have 3 cases here:
Case 1 - The activation of q terminates normally; control returns to the point in p just after the call to q was made.
Case 2 - The activation of q aborts and cannot continue; then p aborts simultaneously with q.
Case 3 - The activation of q terminates because of an exception that q cannot handle.
In this case, there are 2 sub-cases:
Case 3.1 - p can handle the exception: the activation of q terminates and the activation of p continues, though not necessarily from the calling point.
Case 3.2 - p cannot handle the exception: the activation of p terminates along with q, and the exception is passed to some other open activation which can handle it.

An activation tree is used to represent the activations of procedures during the execution of the entire program. In the tree:
* Node - represents the activation of a procedure.
* Root - the activation of the "main" procedure.
* Children nodes - the activations of procedures called by the parent procedure, shown from left to right in their order of occurrence.

6.2.2 Activation Records

An Activation Record (AR) is a memory block used for information management for a single execution of a procedure. An AR is also called a frame. ARs are used to store information about the status of the machine when a procedure call occurs. This status can include information such as the values of the program counter and the machine registers.

ARs allow control of the flow of the program: when control returns from the called procedure, the calling procedure is reactivated by restoring the values of the relevant registers and setting the program counter to the point immediately after the call. In this way ARs keep track of the flow between procedures.

A run-time stack, called the control stack, is used to manage procedure calls and returns. In this stack, the root of the activation tree is at the bottom, and the activation record of the procedure where control currently resides is at the top. From the top to the bottom, the activations are arranged in reverse order, from the currently active procedure back to the root.
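To make the relation between the activation tree and the control stack concrete, here is a hedged C sketch that instruments the quicksort example. Everything beyond the example program itself is an assumption made for illustration: the struct activation layout, the enter/leave helpers, the counters, the fixed input data and the particular partitioning scheme (which takes the last element as the separator) are not taken from the text.

```c
#include <assert.h>

#define MAX_DEPTH 64

/* A drastically simplified activation record (field names are illustrative). */
struct activation {
    const char *proc;     /* which procedure is active */
    int m, n;             /* its actual parameters */
    int control_link;     /* index of the caller's record; -1 for the root */
};

static struct activation control_stack[MAX_DEPTH];
static int top = -1;          /* index of the record of the running procedure */
static int max_depth = 0;     /* deepest nesting seen so far */
static int activations = 0;   /* number of nodes in the activation tree */

/* a[0] and a[10] are the sentinels set by main() in the example. */
static int a[11] = { -9999, 5, 2, 9, 1, 7, 3, 8, 6, 4, 9999 };

static void enter(const char *proc, int m, int n) {
    assert(top + 1 < MAX_DEPTH);
    control_stack[top + 1] = (struct activation){ proc, m, n, top };
    top++;
    activations++;
    if (top + 1 > max_depth) max_depth = top + 1;
}

static void leave(void) {
    assert(top >= 0);
    top = control_stack[top].control_link;   /* the caller is current again */
}

static int partition(int m, int n) {
    enter("partition", m, n);
    int v = a[n], p = m;                     /* separator value and its slot */
    for (int j = m; j < n; j++) {
        if (a[j] < v) { int t = a[j]; a[j] = a[p]; a[p] = t; p++; }
    }
    a[n] = a[p];
    a[p] = v;
    leave();
    return p;
}

static void quicksort(int m, int n) {
    enter("quicksort", m, n);
    if (n > m) {
        int i = partition(m, n);
        quicksort(m, i - 1);
        quicksort(i + 1, n);
    }
    leave();
}

/* Plays the role of main() in the example program. */
void sort_demo(void) {
    enter("main", 0, 0);
    quicksort(1, 9);
    leave();
}
```

At any moment the records control_stack[0..top] form the path from the root of the activation tree to the running procedure. The variable activations counts the tree's nodes and max_depth records its height. Because every enter is matched by a leave, top is back at -1 once sort_demo returns.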
By convention, the stack is drawn growing down, i.e., the bottom of the stack is shown above the top. A general activation record can be depicted as follows (read from bottom to top):

    Actual Parameters
    Returned Values
    Control Link
    Access Link
    Saved Machine Status
    Local Data
    Temporaries

FIGURE 6.4: A general Activation Record

1. Temporaries hold temporary values, such as the result of a mathematical calculation, a buffer value and so on.
2. Local Data belongs to the procedure where control is currently located.
3. Saved Machine Status provides information about the state of the machine just before the point where the procedure is called.
4. An Access Link is used to locate data held in other activation records. This field is optional.
5. The Control Link points to the activation record of the procedure which called it, i.e. the caller. This field is optional. This link is also known as the dynamic link.
6. Returned Values holds any values returned by the called procedure. These values can also be placed in a register, depending on the requirements.
7. The Actual Parameters used by the calling procedure are stored here. Their size is determined when a procedure is called.

6.2.3 Calling Sequences

A Calling Sequence is code that allocates an AR on the stack and enters information into its fields. A Return Sequence is code used to restore the state of the machine so that the calling procedure can continue its execution after the call.

The code in a calling sequence is divided between the calling procedure (the "caller") and the called procedure (the "callee"). The portion of the calling sequence assigned to the caller is generated separately each time a procedure is called; however, the portion assigned to the callee is generated only once. Hence, for efficiency, it is better to put as much of the calling sequence code in the callee as possible.

Some important principles while designing calling sequences are:
1. For ease of communication and efficient data access, the values communicated between caller and callee are usually placed at the beginning of the callee's activation record.
2. The fixed-length items, such as the control link, access link, machine status, etc., are normally placed in the middle.
3. Variable-sized data items are placed at the end of the AR. Such items include dynamic arrays and temporaries.

    Caller's activation record:
        Parameters and returned value
        Control link
        Links and saved status
        Temporaries and local data
    Callee's activation record:
        Parameters and returned value
        Control link
        Links and saved status
        Temporaries and local data

FIGURE 6.5: Division of tasks between caller and callee

A typical calling sequence can be described as follows:
* The caller evaluates the actual parameters.
* The caller stores the return address and the old value of the TOP pointer into the callee's AR.
* The callee saves the register values and other status information.
* The callee initializes its local data.

The corresponding return sequence is as follows:
* The callee places the return value next to the parameters.
* The callee restores the value of the TOP pointer and the other registers, and branches to the return address specified in the status field.
* The caller uses the returned value, which it can locate relative to the current TOP pointer (or in a register, if the value was returned there).
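The division of work between caller and callee can be imitated with an explicit array standing in for the run-time stack. This is a rough sketch under several assumptions: the four-word frame layout, the names stack_mem, frame_base and stack_top, and the toy procedure that computes x*x + 1 are all invented for illustration. The return address and register saving are left to the C compiler's own call mechanism, so only the parameter, control-link and return-value steps are modelled.

```c
#include <assert.h>

#define STACK_WORDS 256

/* Hypothetical frame layout: communicated values first, then the
   fixed-length control link, then local data. */
enum { RET_VAL = 0, PARAM = 1, CONTROL_LINK = 2, LOCAL = 3, FRAME_WORDS = 4 };

static int stack_mem[STACK_WORDS];
static int frame_base = 0;   /* base index of the current activation record */
static int stack_top = 0;    /* first free word above the current record */

/* Callee half: this code exists once, however many call sites there are. */
static void callee_square_plus_one(void) {
    int *fr = &stack_mem[frame_base];
    fr[LOCAL] = fr[PARAM] * fr[PARAM];   /* initialise and use local data */
    fr[RET_VAL] = fr[LOCAL] + 1;         /* result sits next to the parameter */
    /* Return sequence, callee half: pop the record, reinstate the caller. */
    stack_top = frame_base;
    frame_base = fr[CONTROL_LINK];
}

/* Caller half: this code would be repeated at every call site. */
int call_square_plus_one(int arg) {
    int new_base = stack_top;
    assert(new_base + FRAME_WORDS <= STACK_WORDS);
    stack_mem[new_base + PARAM] = arg;                /* evaluate the actual */
    stack_mem[new_base + CONTROL_LINK] = frame_base;  /* save caller's base */
    frame_base = new_base;
    stack_top = new_base + FRAME_WORDS;
    callee_square_plus_one();                         /* transfer control */
    /* The record has been popped, but the caller still knows where the
       result was left relative to the top of its own record. */
    return stack_mem[new_base + RET_VAL];
}
```

For instance, call_square_plus_one(3) yields 10 and leaves both stack_top and frame_base where they were before the call. Keeping the return value and the parameter at the start of the frame is what lets the caller find them without knowing anything about the callee's locals.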
6.2.4 Variable-length Data on the Stack

The run-time stack normally allots space to fixed-size variables, but it can also be used for variable-length data, such as arrays whose size is known only at run time. Using the stack for such data avoids the cost of garbage collecting their space.

Allocation for variable-length arrays is done as follows:
1. The arrays of a procedure are placed after its AR, beyond the TOP pointer; a pointer to the beginning of each array is kept in the AR.
2. If any procedure q is called by this procedure, then the AR for q is placed after the calling procedure's arrays, and the arrays of q are placed beyond that.

The stack is accessed using 2 pointers, top and top_sp: top points to the actual top of the stack, and top_sp points to the end of the machine status field and is used to find the local, fixed-length fields of the top AR.

6.3 ACCESS TO NON-LOCAL DATA ON THE STACK

Non-local data is a parameter or data value that is used or required within a procedure, but does not belong to that procedure. There are various situations in which non-local data is accessed. These include:

6.3.1 Data Access without Nested Procedures

Generally, a variable has a scope of existence or use only within the function or procedure where it is defined. If a variable that has been declared globally is declared again elsewhere, then the new declaration overrides the global one within that procedure.

For non-nested procedures, variables are accessed as follows:
1. Global variables are allocated statically, as their locations are known and fixed at compile time.
2. Other variables (locals) are allocated at the top of the stack, and are accessed using the top_sp pointer.

Static allocation allows parameter passing by reference, which is much more efficient than passing by value.

6.3.2 Issues with Nested Procedures

1. Access is complicated when nested procedures are used.
2. The nesting does not specify the (relative) positions of the ARs of the nested and the nesting procedures; this makes it difficult to determine the location of the variable being accessed.

6.3.3 A Language with Nested Procedure Declarations - ML

Properties of ML include:
* The variables declared in ML are unchangeable once they are initialized; this means that ML is a functional language.
* Variable declaration is done using the statement:
      val ⟨name⟩ = ⟨expression⟩
* The syntax for defining functions is:
      fun ⟨name⟩ ( ⟨arguments⟩ ) = ⟨body⟩
* Function bodies are defined as:
      let ⟨list of definitions⟩ in ⟨statements⟩ end

[Figure 6.7 lists the levels of the memory hierarchy with their sizes and access times. Recoverable rows: Virtual Memory (Disk), 2 GB, 3-15 ms; Physical Memory (RAM), 256 MB - 2 GB.]

FIGURE 6.7: Typical Memory Hierarchy Configuration

As seen in figure 6.7, the fastest storage elements are the registers; the storage space available in them is in units of "words" only. Next come the caches, made of Static RAM (SRAM), with sizes extending to the range of several megabytes and a proportional increase in the access time. Next is the physical or main memory, made of Dynamic RAM (DRAM). Last is the virtual memory, implemented by gigabytes of disk space.

When a memory access is required, the machine searches the hierarchy starting from the fastest level. Caches are managed in hardware because of their relatively small size and fast access, while the virtual memory is managed by the Operating System because of its large size and the large access time involved.

6.4.3 Locality in Programs

Locality describes the pattern of a program's data requirements: how often, and how close together in time and in address, it accesses the same data. Locality is of 2 types, namely:
1. Temporal Locality: this is present if the memory locations accessed by the program are likely to be accessed again soon.
2. Spatial Locality: this is present when the memory locations close to the location accessed are likely to be accessed within a short period of time.

Usually, every program spends a lot of time executing a small part of the code (almost 90% of the time is spent in executing 10% of the code). This is because:
* Programs contain many instructions that are never executed; for example, a program may have a lot of if-else-if branches, but most of the time only one of the conditions is true, and only the corresponding part of the code is executed.
* A typical run executes only a small part of the entire code; for example, the program may have a lot of error-checking code to handle exceptional cases, but this code is invoked only when the exceptional conditions arise.
* Usually, the innermost loops and recursive cycles consume most of the time.

The Memory Hierarchy helps to reduce the access time by keeping the frequently accessed data in the fast and small storage, and the rest in the slow and large storage. This also helps to exploit the locality of a program.

Optimization Using the Memory Hierarchy

1. The compiler can place basic blocks (sequences of instructions that are always executed sequentially) that are likely to follow each other in sequence in the same page of memory.
2. Changing the data layout according to the program requirements can also help to improve efficiency.

This implies that it is much more efficient to move data which is accessed regularly from the slower levels to the faster levels of the hierarchy than to access those locations in the slower levels themselves. Once these locations are moved to the faster levels, they can be accessed as many times as required.

6.4.4 Reducing Fragmentation

Before the program execution begins, the heap is a contiguous block of free space.
During the course of execution, for each allocation request, the memory manager places the requested chunk of memory into a sufficiently large hole (free memory chunk). Such allocation may require the hole to be split, creating a smaller hole. Deallocation requests add the freed space back to the holes, where adjacent free chunks can be coalesced. It is important to manage the growth and use of these holes, or else it is possible that at a later time no hole is large enough to accommodate the requested block size. This condition is known as fragmentation: the total space available is enough for the current request, but the individual blocks of free memory are not large enough.

Best-Fit and Next-Fit Object Placement

The best-fit algorithm tries to place the requested object in the smallest hole that is large enough. This strategy spares the larger holes to satisfy subsequent, larger requests. Alternatively, a first-fit strategy can be used, which places an object in the first hole in which it fits. This takes less time to place the objects, but gives lower overall performance than the best-fit strategy.

The best-fit scheme is implemented using binning, i.e. placing the free chunks into separate bins according to their size. A bin can be thought of as a collection of chunks of free space. Usually, there will be a larger number of small-sized chunks. Binning makes it easy to find the best-fit chunk:
* If there is a bin for the requested size, then any chunk from that bin can be used.
* If a bin for exactly the requested size is not available, use a bin that can accommodate the desired size; within that bin, use either best-fit or first-fit.
* If the target bin is empty, or the available chunks are not large enough, then search the bins with the next larger sizes.

A disadvantage of the best-fit strategy is that spatial locality is not taken into account while allocating the memory space.
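The difference between the two placement policies shows up clearly on a tiny free list. The sketch below is only illustrative: it keeps the holes in a plain array rather than in bins, and the names struct hole, struct free_list, first_fit, best_fit and take_from are assumptions of this example, not part of any real allocator.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOLES 16

struct hole { size_t start, size; };   /* one free chunk of the heap */

struct free_list {
    struct hole holes[MAX_HOLES];
    int count;
};

/* First-fit: index of the first hole that is large enough, or -1. */
int first_fit(const struct free_list *fl, size_t want) {
    for (int i = 0; i < fl->count; i++)
        if (fl->holes[i].size >= want) return i;
    return -1;
}

/* Best-fit: index of the smallest hole that is large enough, or -1. */
int best_fit(const struct free_list *fl, size_t want) {
    int best = -1;
    for (int i = 0; i < fl->count; i++) {
        if (fl->holes[i].size < want) continue;
        if (best < 0 || fl->holes[i].size < fl->holes[best].size) best = i;
    }
    return best;
}

/* Carve `want` bytes from the front of hole i, splitting the hole.
   Returns the start address of the allocated chunk; a hole that is
   used up completely is removed from the list. */
size_t take_from(struct free_list *fl, int i, size_t want) {
    assert(i >= 0 && i < fl->count && fl->holes[i].size >= want);
    size_t addr = fl->holes[i].start;
    fl->holes[i].start += want;
    fl->holes[i].size -= want;
    if (fl->holes[i].size == 0) {
        fl->count--;
        fl->holes[i] = fl->holes[fl->count];   /* overwrite with last hole */
    }
    return addr;
}
```

With holes of 64, 16 and 32 bytes, a 16-byte request is sent to the 64-byte hole by first-fit but to the exactly matching 16-byte hole by best-fit, which leaves the large hole intact for a later, larger request. Best-fit pays for this by scanning every hole, which is the cost that binning is meant to reduce.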
The loss of spatial locality under best-fit can be handled by using the next-fit strategy, which tends to allocate the object in the chunk that has last been split, whenever that chunk has enough space. Next-fit also tends to improve the speed of the allocation operation.

Managing and Coalescing Free Space

Coalescing refers to merging or combining the deallocated space with adjacent free chunks, to form larger chunks. This is important because:
* The larger chunks can be used even for requests for smaller sizes, but several small, separate chunks cannot together satisfy a request for a larger chunk.
* Also, these larger chunks can be split as required, to accommodate the smaller chunks.

This is not as easily possible in the binning strategy, because the deallocated spaces cannot simply be combined with the adjacent blocks: a bin contains only blocks of a particular size, and coalescing will increase the size of the block. The block will then need to be moved to a separate bin containing blocks of that size.

There are two data structures useful to support coalescing:
1. Boundary Tags: at both ends of each chunk, we keep a free/used bit that tells whether the block is currently allocated or free. A count of the number of bytes in that chunk is also maintained adjacent to that bit.
2. A Doubly Linked, Embedded Free List: the free chunks are linked in a doubly linked list. The pointers to the other free chunks are stored inside the free chunks themselves, so the only cost is a lower bound on how small a chunk can be.

Manual Deallocation Requests

This involves explicitly arranging for the deallocation of data. Ideally, the spaces that are no longer accessed are deleted, while the spaces which are still referenced are not. There are many problems associated with achieving this manually.

Problems with Manual Deallocation

1. It is error prone; the errors include the memory leak (failing to delete data that cannot be referenced) and the dangling-pointer dereference (referencing deleted data).
2. Memory leaks may slow down the execution of the program due to excessive memory utilization; hence it is especially critical that non-stop, long-running programs like operating systems or server code do not have leaks.
3. Once freed space has been reallocated, any access through a dangling pointer to it can produce seemingly random results that are difficult to debug.
4. Deallocating a memory ...

Programming Conventions and Tools

1. Object Ownership presents the idea of associating an owner with each object at all times. The owner is usually a pointer to that object, and may belong to a function invocation. The owner assumes the overall responsibility for the object, including deleting the object or passing it to another owner. Other, non-owner pointers to this object can also exist; however, they are not allowed to perform the delete function.
2. Reference Counting associates a count with each dynamically allocated object. This count is incremented and decremented, respectively, whenever a reference to the object is created or removed. When this count reaches zero, it means that there is no other reference to this object, and hence it can be safely deleted.
3. Region-based Allocation is used with objects whose lifetime is limited to specific phases in a computation. All the objects needed only in a particular phase (region) are created in that region. When all the computation in that region is completed, the entire region is deleted. This is an efficient way of deallocating memory because it deletes the objects all at the same time, instead of deleting them one by one.

6.5 GARBAGE COLLECTION

Garbage refers to the data that cannot be referenced. Such garbage can be collected either manually or automatically, depending on the implementation specified in the programming language under consideration. The garbage collector reclaims garbage and thereby frees heap space.

6.5.1 Design Goals for Garbage Collectors

Garbage collection is the reclamation of chunks of memory containing objects that can no longer be accessed or reached.
We need to assume that:
* Objects have a type that can be determined by the garbage collector at run time; this provides information about the size of the object, and also about any pointers within the object.
* References to an object always point to the address at the beginning of the object, not within the object.

The garbage collector examines the objects used by the mutator (the user program), finds the unreachable ones, and reclaims their space, returning it to the memory manager for later allocation.

Type Safety

For the garbage collector to work properly, the language must be able to tell whether a data element is, or could be used as, a pointer to a memory chunk. This is possible if the language is type safe. A type-safe language is one in which the type of any data component can be determined. Such languages are of 2 types:
1. Static: these languages determine the type of the element at compile time; for ex., ML.
2. Dynamic: these languages cannot determine the type at compile time, but determine it at run time; for ex., Java.

A language in which the type cannot be determined, either statically or dynamically, is said to be unsafe. Automatic garbage collection is not possible in unsafe languages, because in these languages memory addresses can be manipulated arbitrarily.

Performance Metrics

A garbage collector is said to be good if it satisfies the performance metrics. The various performance metrics to be considered when designing a garbage collector are:
1. Overall Execution Time: it is necessary that a garbage collector does not significantly increase the total run time of an application, as garbage collection can be a very slow task.
2. Space Usage: avoid fragmentation, and make the best, or at least near-optimal, usage of the memory space.
3. Pause Time: sometimes garbage collectors pause the actual program without warning when garbage collection begins. Thus it is important that the maximum pause time be reduced.
4. Program Locality: the garbage collector should try to improve the temporal and spatial locality of a program. This can be done by freeing up space and reusing it (temporal) or by relocating data that is used together to the same cache lines or pages (spatial).

6.5.2 Reachability

The root set refers to all the data that can be accessed directly by a program, without dereferencing any pointer. All the members of the root set are reachable by a program; in addition, any object referenced from a reachable object is itself reachable, recursively. Root sets are used to find the reachable and unreachable objects.

The mutator changes the set of reachable objects through the following operations:
1. Object Allocations: a newly allocated object is reachable through the reference returned to the program; this increases the size of the set.
2. Parameter Passing and Return Values: references are passed from the actual parameters to the formal parameters of the callee, and from the returned result back to the caller; this maintains the size of the set.
3. Reference Assignments: assignments of the form u = v, where u and v are references; now u is a reference to the object referred to by v, and the original reference in u is lost. This can increase as well as decrease the size of the set.
4. Procedure Returns: when the procedure exits, its local variables and all the references they hold are popped off the stack; objects reachable only through those references become unreachable. This decreases the size of the set.

Ways to find unreachable objects:
1. Reference counting: as already discussed, this method maintains a count of the references to an object; when this count reaches zero, the object becomes unreachable.
2. Trace-based garbage collection (or transitive tracing): works by tracing all the references transitively. The garbage collector marks all the objects in the root set as reachable; all objects reachable from them are, recursively, reachable; all the other objects are treated as unreachable.

6.5.3 Reference Counting Garbage Collectors

Reference counts are maintained as follows:
1. Object Allocations: the reference count (RC) of the new object is set to 1.
2. Parameter Passing and Return Values: the RC of each object passed into the procedure is incremented.
3. Reference Assignments: for the statement u = v, where u and v are references, the RC of the object referred to by v goes up by one, and the RC of the old object referred to by u goes down by one.
4. Procedure Returns: when the procedure exits, the RCs of the objects referenced by its local variables must be decremented; the count must be decremented once for each reference, even if several local variables hold references to the same object.
5. Transitive Loss of Reachability: when the RC of an object becomes zero, we must also decrement the count of each object pointed to by a reference within that object.

Disadvantages
1. It cannot collect unreachable, cyclic data structures; such cycles are common because data structures often point back to their parent nodes.
2. It is expensive. The overhead associated with reference counting is high, because additional operations are introduced with each reference assignment. To overcome this high overhead, the concept of deferred reference counting is introduced. In this scheme, the RCs do not include references from the root set.

Advantages
1. Garbage collection is performed in an incremental fashion.
2. The operations for modifying the RCs of each object can be deferred and performed at a later time.
3. Also, garbage is collected immediately, reducing the space usage.

6.6 SYMBOL TABLE

The symbol table (ST) is basically a data structure used by the compiler to keep track of scope and binding information about names. The ST is a storehouse of both 'compile-time' and 'run-time' information about every identifier in the source program. All accesses relating to an identifier first require finding the 'attributes' of the identifier from
the ST. The symbol table is usually organized as a hash table, which provides fast access. Compiler-generated temporaries may also be stored in the ST.

The symbol table is searched every time a name is encountered in the source text. Changes occur in the table if a new name, or new information about an existing name, is discovered. An ST exists throughout compilation time and run time. The symbol table is accessed at every stage of the compilation process.

6.6.1 Symbol Table Entries

It is useful for a compiler if the symbol table can grow dynamically as necessary. An ST mechanism must allow us to add new entries and find existing entries efficiently. The compiler uses the ST to achieve compile-time efficiency.

The ST consists of various entries. Each entry in the ST is for the declaration of a name. The information saved about a name depends on the usage of the name; hence the format of the entries need not be uniform.

Attributes stored in a symbol table for each identifier include:
* Type
* Size
* Scope / visibility information
* Base address
* Addresses of the locations of auxiliary symbol tables (in case of records, procedures, ...)
* Address of the location containing the string which actually names the identifier, and its length, in the string pool

In general, the items to be stored in the ST are:
* Variable names
* Constants
* Procedure names
* Function names
* Labels in the source language
* Literal constants and strings

The following types of information are used by the compiler from the symbol table:
* Name
* Data type
* Procedures declared
* Offset in storage
* Pointer to structure/record table
* Whether parameter passing is by value or by reference
* Base address

6.6.2 ST Operations

Major operations required of a symbol table are:
1. Insertion
2. Search
3. Deletions, which are 'purely logical', depending on scope and visibility, and 'not physical'

Keywords are often stored in the ST before the compilation process begins.
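The three operations can be sketched on a small chained hash table in C. This is one possible arrangement, not the layout used by any particular compiler: the struct symbol fields, the bucket count, the hash function and the st_* function names are all assumptions made for this example, and the records are never freed because deletion here is purely logical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 31

struct symbol {
    char name[32];
    char type[16];
    int scope;             /* nesting depth at the point of declaration */
    int active;            /* cleared when the scope closes (logical deletion) */
    struct symbol *next;   /* next record in the same hash chain */
};

static struct symbol *table[BUCKETS];
static int current_scope = 0;

static unsigned hash(const char *s) {
    unsigned h = 0;
    while (*s) h = h * 31u + (unsigned char)*s++;
    return h % BUCKETS;
}

/* Search: newest declarations sit at the head of a chain, so the first
   active match is the innermost visible one. */
struct symbol *st_lookup(const char *name) {
    for (struct symbol *p = table[hash(name)]; p != NULL; p = p->next)
        if (p->active && strcmp(p->name, name) == 0) return p;
    return NULL;
}

/* Insertion: returns NULL on a redeclaration in the same scope or when
   memory runs out. Over-long names are truncated in this sketch. */
struct symbol *st_insert(const char *name, const char *type) {
    struct symbol *hit = st_lookup(name);
    if (hit != NULL && hit->scope == current_scope) return NULL;
    struct symbol *s = calloc(1, sizeof *s);
    if (s == NULL) return NULL;
    strncpy(s->name, name, sizeof s->name - 1);
    strncpy(s->type, type, sizeof s->type - 1);
    s->scope = current_scope;
    s->active = 1;
    unsigned h = hash(name);
    s->next = table[h];
    table[h] = s;
    return s;
}

void st_enter_scope(void) { current_scope++; }

/* Deletion: the records of the closing scope are only marked inactive. */
void st_exit_scope(void) {
    assert(current_scope > 0);
    for (int b = 0; b < BUCKETS; b++)
        for (struct symbol *p = table[b]; p != NULL; p = p->next)
            if (p->active && p->scope == current_scope) p->active = 0;
    current_scope--;
}
```

A typical sequence is to insert x as an int in the outer scope, call st_enter_scope, insert x again as a float, and look it up: the inner declaration is found first. After st_exit_scope the inner record is still in memory but inactive, so st_lookup returns the outer x again.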
6.6.3 ST and Its Relation with Various Phases of Compiler

The ST is accessed at every stage of the compilation process.
1. Scanning
   Includes insertion of new identifiers.
2. Parsing
   Access to the symbol table to ensure that an operand exists (declaration before use).
3. Semantic Analysis
   It includes:
   a) Determination of the types of identifiers from declarations.
   b) Type checking, to ensure that operands are used in type-valid contexts.
   c) Checking for scope and visibility violations.
4. IR Generation
   Memory allocation and relative address calculation (i.e. relative to a base address that is known only at run time).
5. Optimization
   All memory accesses go through the ST.
6. Target code
   Translation of relative addresses to absolute addresses in terms of word length, word boundary, etc.

6.6.4 Name Storage in ST

Names can be represented in the ST in two different ways:
1. Fixed-length name
   Here, a fixed space in the symbol table is allocated for each name. The disadvantage here is that space is wasted if the name is too small.
   Example (table columns): Name | Attribute
2. Variable-length name
   Here, the name can be stored with the help of the starting index and the length of each name.
   Example (table columns): Name (Starting index | Length) | Attribute

ST management is used for quick insertion of identifiers and to search for identifiers easily.

6.6.5 ST Organization Approaches

The ST can be organized in different ways. A few of them are explained below:

1. The linear list
Linear list is the simplest and easiest mechanism used to implement an ST. It uses an array to store names and their associated information. New names are added to the table in the order in which they arrive. It uses an 'available' pointer at the end of all stored records.
The table is searched linearly or sequentially to check if a narae ig already present or notin the table, This is done whenever a new name is added to the table. If the name is not present, then a record for a new name is created and added to the ist ata specified position by the ‘available’ pointer, Minimum amount of space js occupied by list organization. this method are added end of all | name | info 1 name 2 info 2 | available—> | namen | infon l FIGURE 6.8: Linear list organization of ST The average number of comparisons, C, required for search are: ¢ - (+) » for successful search and C = n for unsuccessful search, where n = number ofecords in ST. Advantages Requires less space. Disadvantage Has higher access time. 2. Search trees A search tree is more efficient approach to ST organization. It consists ‘ide assignment) and r-value (right side assignment). The left link an are added for each record and these links point to the record in the first name is searched whenever a name is to be added. If the name di 4 record for the new name is created and added at the proper tree. Here the names can be accessed alphabetically. of Lvalue (left d the right link search tree, The toes not exist, then Position of the search Scanned with CamScanner as principles of ‘Compiler Design ae F = t= left : Tom | : r=right | y t name, info - | c | name, info | + ue | ation of ST. FIGURE 6.9: Search tree organize mes and to make ‘m’ queries is proportional to (m + ime required to enter ‘n’ na : opt ee ne Jue of ‘n’ searching becomes easier. n) log,n. Advantage: Higher the val 3. Hash tables Here ‘open hashing’ method is considered ie, There is no need to limit the number of entries that can be made in the table. This method is useful in performing ‘e’ queries for ‘n’ names to make ‘e’ queries is proportional to n(n + e)/m where m = any constant of our choice. Here ‘m’ can be as large as we like & may be equal to ‘n’ also. 
This method is more efficient than the above two methods. The only disadvantage is that the space taken by the data structure grows with m. Hence it involves a time-space trade-off.

To look up a name, compute its hash index and use the hash table to search the list of symbol table records that is built on that hash index. If the name is not present in that list, create a record for that name and insert it at the head of the list.

FIGURE 6.10: Hash table organization of ST

CODE GENERATION

At the end of this chapter, the reader will be able to:
* Design basic blocks and flow graphs.
* Optimize basic blocks.

The code generation phase is responsible for generating the target code. This chapter discusses the issues in the code generation phase and the register allocation strategies. Code generation for three address code and DAG is explained with an example.

11.1 Introduction

The code generator is the last phase in the design of a compiler. It takes three address code or a DAG representation of the source program as input and produces an equivalent target program as output. Figure 11.1 shows the position of the code generation phase.

The code generator faces many constraints: the generated code must be of high quality, accurate, and efficient. In addition, the code generator itself should run efficiently. There are many issues to be considered while designing a code generator.

11.2 Issues in the Design of a Code Generator

The generated code depends on the target language and on the operating system, since memory management, instruction selection, register allocation, and order of evaluation all affect the efficiency of the generated code. Even the input to the code generation phase is an issue, because the front end can generate many forms of intermediate code.
Figure 11.1 Code Generator
[Figure: source program -> front end -> intermediate code -> code optimizer -> intermediate code -> code generator -> target program, with the symbol table available to the phases.]

11.2.1 Input to the Code Generator

The output of the front end is the input to the code generator, along with the information in the symbol table that is used to determine the run time addresses of the data objects denoted by the names in the intermediate representation.

Intermediate code is either in linear form or in hierarchical form. Linear representations include postfix notation and three address representation (quadruples and triples); hierarchical representations include syntax trees and DAGs.

The input is an intermediate representation and is error free. The values of the names present in the intermediate code can be represented by the target machine in a directly manipulable form. Since the semantic analyzer has already performed type checking, type conversion operators have been inserted wherever necessary and obvious semantic errors have already been detected.

11.2.2 Target Programs

The output of the code generator is the program in the target language. The possible forms of output are:

Absolute machine code: this is static and is always placed in the same memory locations. It is fast in execution but requires ... This may be suitable for programs like the vi editor. Compilers like P... produce absolute code.

Relocatable machine code: this allows subprograms to be compiled separately, resulting in a set of relocatable object modules. A linking loader is used to link the modules and load the programs into memory for execution. Although this procedure is expensive, there is flexibility in being able to compile subroutines separately and call other previously compiled programs from an object module.
If relocation is not carried out automatically by the target machine, then the compiler must provide explicit relocation information to the loader. This is used to link the separately compiled object modules.

Assembly code: when assembly language is the output, the process of code generation becomes easier. We can generate mnemonic instructions and use the macro facilities of the assembler to generate code. The price paid is the assembly step that is required after code generation. This choice is reasonable particularly for a machine with a small memory, where a compiler must use several passes.

11.2.3 Memory Management

The role of the code generator, in coordination with the front end, is to map the names of variables in the source program to the addresses of the data objects in run time memory. A name in a three address statement refers to a symbol table entry for that name, which is used by the code generator.

Each label in the three address code has to be converted to the actual memory address of an instruction, and this process is called "back patching." In the case of quadruples, if labels refer to quadruple numbers, then each quadruple is read in turn and its address is computed by maintaining a counter of the words used for the instructions generated so far. An additional field in the quadruple array is used to store this count. For example, if a reference such as j: goto i is encountered, where i could be less than or greater than j:

* If i is less than j, it is a backward jump. We may simply generate a jump instruction with the target address equal to the machine location of the first instruction in the code for quadruple i.
* If i is greater than j, the jump is a forward jump. Here we store, on a list for quadruple i, the location of the first instruction generated for quadruple j. Later, quadruple i is processed.
Then, for all forward jumps to i, the proper machine locations are filled in.

11.2.4 Instruction Selection

The nature of the instruction set of the target machine is an important factor; its uniformity and completeness control how easy instruction selection is. If the target machine does not support each data type in a uniform manner, then each exception to the general rule requires special handling. Instruction speeds and machine idioms are other important factors to consider in code generation. Instruction selection is straightforward if the efficiency of the target program is not a concern, but this results in more computation time. For example, consider a simple statement a := b + c, where a, b, and c are statically allocated. It can be translated into machine code as follows:

MOV b, R0
ADD c, R0
MOV R0, a

If the code generator translates statement by statement, it often produces very poor code. Let the statements in three address code be

a := b + c
d := a + e

They would be translated into

MOV b, R0
ADD c, R0
MOV R0, a
MOV a, R0
ADD e, R0
MOV R0, d

The fourth statement in the code sequence is redundant. If the value of a is not subsequently used, then the third statement is also redundant.

Let us consider another statement, a = a + 1. If this statement is translated into machine code in the normal way, it results in three machine instructions. If the target machine has an increment instruction INC, using it is more efficient, as it performs the same task as the three instructions.

Instruction speeds are needed to design good code sequences, but accurate timing information is often difficult to obtain. Deciding which machine code sequence is best for a given three address construct may also require knowledge about the context in which that construct appears.
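The statement-by-statement translation discussed in this section, and the redundant load it produces, can be reproduced with a short sketch. The Python code below is only an illustration under stated assumptions: the function names gen_naive and drop_redundant_loads and the tuple encoding (dst, left, right) for x := y + z are hypothetical, only additions are handled, a single register R0 is used, and the code is assumed to contain no labels or jump targets.

```python
def gen_naive(statements):
    """Translate each (dst, left, right) meaning dst := left + right on its own."""
    code = []
    for dst, left, right in statements:
        code.append(f"MOV {left}, R0")
        code.append(f"ADD {right}, R0")
        code.append(f"MOV R0, {dst}")
    return code


def operands(instruction):
    """Split 'MOV x, y' into ('x', 'y')."""
    source, target = instruction[4:].split(",")
    return source.strip(), target.strip()


def drop_redundant_loads(code):
    """Remove a MOV that only copies back the value moved by the previous MOV."""
    result = []
    for instruction in code:
        if (result and instruction.startswith("MOV ")
                and result[-1].startswith("MOV ")):
            prev_source, prev_target = operands(result[-1])
            source, target = operands(instruction)
            if prev_source == target and prev_target == source:
                continue  # e.g. 'MOV a, R0' right after 'MOV R0, a'
        result.append(instruction)
    return result


naive = gen_naive([("a", "b", "c"), ("d", "a", "e")])
improved = drop_redundant_loads(naive)
```

With the two statements a := b + c and d := a + e, the naive list contains six instructions and the improved list omits the fourth one. Removing the third instruction as well would need the liveness of a, which this sketch does not track.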
11.2.5 Register Allocation

Efficient use of registers is particularly important in code generation, as instructions involving register operands are shorter and faster than those involving operands in memory. The use of registers is often subdivided into two subproblems:

* During the first phase, the register allocation phase, we select the set of names that will reside in registers at a point in the program.
* During the later phase, register assignment, we pick the specific register in which a variable will reside.

Assigning registers to variables is a difficult task, as the available registers are also to be used by the programs that are currently running on the system. Finding an optimal assignment is an NP-complete problem, and it is further complicated because the hardware and/or the operating system of the target machine may require certain conventions on register usage.

Certain machines require register pairs for some operations and to store results. For example, on IBM System/370 machines integer multiplication requires a register pair. The multiplication instruction is of the form

M x, y

where x, the multiplicand, is the even register of an even/odd register pair. The multiplicand value is taken from the odd register of the pair, and the multiplier y is a single register. The result occupies the complete even/odd register pair. Similarly, the division instruction is of the form

D x, y

where the 64-bit dividend occupies the even/odd register pair whose even register is x, and y represents the divisor. After the division operation, the remainder is stored in the even register and the quotient is stored in the odd register.

11.2.6 Choice of Evaluation Order

The efficiency of the target code depends on the evaluation order. The order of computation affects the number of registers required to hold intermediate results. Choosing the best order is another difficult task. If the input is in the form of three address code, it may require reordering of the input for efficient code generation.
If the input is in the form of a DAG, then the best code can be generated by traversing the tree in post order.

7.2 THE TARGET LANGUAGE

The output of a code generator (CG) is the target language. The output may take different forms, like absolute machine language, relocatable machine language, or assembly language.

7.2.1 A Simple Target Machine Model

The target machine is a byte-addressable machine with n general-purpose registers. Most instructions consist of an operator, followed by a target, followed by a list of source operands. A label may precede an instruction. The possible kinds of operations are listed below:

1. Load Operation (LD):
The general format is LD dst, addr. LD loads the value in location 'addr' into location 'dst', i.e. dst = addr.
Example
a) LD R3, R4 : a register-to-register copy instruction, where the contents of register R4 are copied into register R3.
b) LD R1, x : loads the value in location x into register R1.

2. Store Operation (ST):
Example
ST x, R1 : stores the value in register R1 into location x.

3. Computation Operation:
The general form is OP dst, src1, src2, where OP is an operator like ADD or SUB, and dst, src1 and src2 are locations, not necessarily distinct.
Example
ADD R1, R2, R3

4. Unconditional Jumps (BR):
The general form is BR L, where BR is the branch instruction. This causes control to branch to the machine instruction with label L.

5. Conditional Jumps:
The general form is Bcond R, L, where R is a register, L is a label, and 'cond' stands for any of the common tests on values in register R.
Example
BGTZ R2, L : this instruction causes a jump to label L if the value in register R2 is greater than zero, and allows control to pass to the next machine instruction if not.

Let us assume that the target machine has several addressing modes:

1. A location can be a variable name x, referring to the memory location that is reserved for x [the l-value of x].

2. The indexed mode of addressing is of the form a(r), where 'a' is a variable and 'r' is a register.
Example
a) LD R3, a(R4) has the effect of setting R3 = contents(a + contents(R4)), where contents(x) denotes the contents of the register or memory location represented by x.
b) LD R3, 200(R4) has the effect of setting R3 = contents(200 + contents(R4)).

3. The indirect mode of addressing is of the form *r, which means the memory location found in the location represented by the contents of register r. The form *200(r) combines indirection with indexing.
Example
LD R3, *200(R4) sets R3 = contents(contents(200 + contents(R4))).

4. Immediate constant mode of addressing: the constant is prefixed by #.
Example
LD R2, #200 : loads the integer 200 into register R2.
ADD R3, R3, #200 : adds the integer 200 into register R3.

Example
a) The three-address statement A = B + C can be implemented by the machine instructions:
LD R0, B        // R0 = B
LD R1, C        // R1 = C
ADD R0, R0, R1  // R0 = R0 + R1
ST A, R0        // A = R0

b) a[i] = b
LD R1, b        // R1 = b
LD R2, i        // R2 = i
MUL R2, R2, 8   // R2 = R2 * 8
ST a(R2), R1    // contents(a + contents(R2)) = R1

c) a = *b
LD R1, b        // R1 = b
LD R2, 0(R1)    // R2 = contents(0 + contents(R1))
ST a, R2        // a = R2

d) if a < b goto L
LD R1, a        // R1 = a
LD R2, b        // R2 = b
SUB R1, R1, R2  // R1 = R1 - R2
BLTZ R1, L      // if R1 < 0 (i.e. a < b) jump to L

7.2.2 Program and Instruction Cost

An optimization technique consists of detecting patterns that are repeated a number of times in a program and replacing these patterns by equivalent but more effective constructs. The most heavily travelled parts of a program are the target for optimization. Finding an optimal target program is, in general, an undecidable problem.

The cost of an instruction is taken to be one plus the costs associated with the addressing modes of its operands. Addressing modes involving only registers add nothing, while addressing modes involving a memory location or a constant add a cost of one.

Example
a) A register-to-register load: cost = 1 (since only registers are involved).

A good CG algorithm is one that seeks to minimize the sum of the costs of the instructions executed by the generated target program on typical input.

Example
a) LD R0, y        // cost = 2
   LD R1, z        // cost = 2
   ADD R0, R0, R1  // cost = 1
   ST x, R0        // cost = 2
Therefore, the total cost of the above instruction sequence is 7.

b) MOV R1, M           // cost here is 2
   ADD #10, R1         // cost here is 2
   SUB 2(R0), *(R1)    // cost here is 3
Therefore, the total cost is 7.

Mode              | Form  | Address                   | Added cost
Absolute          | M     | M                         | 1
Register          | R     | R                         | 0
Indexed           | c(R)  | c + contents(R)           | 1
Indirect register | *R    | contents(R)               | 0
Indirect indexed  | *c(R) | contents(c + contents(R)) | 1

7.3 ADDRESSES IN THE TARGET CODE

This section briefly explains two standard storage allocation strategies, i.e. static allocation and stack allocation. It also shows how names in the IR can be converted into addresses in target code by looking at CG for simple procedure calls and returns using static and stack allocation.

The logical address space is partitioned into four code and data areas:
1. 'Code': Holds the executable target code. The size of the target program can be determined at compile time.
2. 'Static': Holds global constants and other data generated by the compiler whose size can be determined at compile time.
3. 'Heap': Holds data objects that are allocated and freed during program execution. The size of the heap cannot be determined at compile time.
4. 'Stack': Holds activation records as they are created and destroyed during procedure calls and returns. The size of the stack cannot be determined at compile time.

'Code' and 'Static' are statically determined areas, whereas 'Heap' and 'Stack' are dynamically managed areas.

7.3.1 Static Allocation

The following three-address statements are considered to illustrate CG for simplified procedure calls and returns:
1. call callee
2. return
3. halt
4. action (a placeholder for other statements)

A call can be implemented by a store of the return address followed by a branch:

ST callee.staticArea, #here + 20
BR callee.codeArea

This can also be written as:

MOV #here + 20, callee.staticArea
GOTO callee.codeArea

Here callee.staticArea and callee.codeArea are constants referring to the address of the activation record and the address of the first instruction of the callee, respectively. '#here + 20' is the literal return address.

Example
Assume we have the following three-address code:

// Code for C
action 1
call P
action 2
halt
// Code for P
action 3
return

Then the input to the CG will be the three-address code above together with the activation record for C (64 bytes) and the activation record for P (88 bytes).

FIGURE 7.3: Input to the code generator

The target code for the input in Fig. 7.3 is:

// code for C
100: ACTION 1      // code for action 1
120: ST 364, #140  // save return address 140 in location 364
132: BR 200        // call P
140: ACTION 2
160: HALT          // return to operating system
// code for P
200: ACTION 3
220: BR *364       // return to the address saved in location 364
// 300-363 hold activation record for C
// 364-451 hold activation record for P
364:               // return address
368:               // local data for P

FIGURE 7.4: Target code for static allocation

7.3.2 Stack Allocation

Static allocation becomes stack allocation when relative addresses are used for storage in activation records. The position of an activation record for a procedure is unknown until run time. In stack allocation this position is kept in a register, so words in the activation record can be accessed as offsets from the value in this register. This can be conveniently done by using the indexed address mode of our target machine.

A register SP is used. SP is a pointer to the start of the activation record on the top of the stack (TOS). Whenever a procedure call occurs, SP is incremented by the calling procedure and control is transferred to the called procedure. SP is decremented as soon as control returns to the caller. This deallocates the activation record of the called procedure.

LD SP, #stackStart   // initialize the stack
code for the first procedure
HALT                 // terminate execution

The above code initializes the stack by setting SP to the start of the stack area in memory.

When a procedure call occurs, SP is incremented by the calling procedure and control is transferred to the called procedure:

ADD SP, SP, #caller.recordSize  // increment stack pointer
ST *SP, #here + 16              // save return address
BR callee.codeArea              // jump to the callee

Here #caller.recordSize represents the size of an activation record. The ADD instruction makes SP point to the next activation record. The operand #here + 16 in the ST instruction is the address of the instruction following BR.

The called procedure transfers control to the return address using:

BR *0(SP)   // return to caller

SP is decremented as soon as control returns to the caller:

SUB SP, SP, #caller.recordSize  // decrement stack pointer

7.3.3 Run-time Addresses for Names

The storage allocation strategy and the layout of local data in an activation record for a procedure determine how the storage for names is accessed. The name in a three-address statement is really a pointer to a symbol-table entry for the name. This approach is advantageous as it makes the compiler more portable, i.e. there is no need to change the front end when the compiler is moved to a different machine where a different run-time organization is needed.

7.4 BASIC BLOCK AND FLOW GRAPH

This section deals with basic blocks and flow graphs. A graphical representation of intermediate code is helpful for CG even if the graph is not explicitly constructed by a code-generation algorithm. The representation can be constructed as explained below:

1. Partition the intermediate code into 'basic blocks'. [A basic block is a sequence of statements that is entered at the start and ends with a branch at the end.] Basic blocks are maximal sequences of consecutive three-address instructions with the following properties:
a) The flow of control can only enter the basic block through the first instruction in the block.
There are no jumps into the middle of the block.
b) Control will leave the block without halting or branching, except possibly at the last instruction in the block.

2. The basic blocks become the nodes of a 'flow graph'. The edges indicate which block can follow which other block.

Let us see how to identify the basic blocks.

7.4.1 Basic Blocks

A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branching except at the end. Basic blocks are constructed by partitioning a sequence of three-address instructions.

ALGORITHM 7.1: Partitioning three-address instructions into basic blocks
INPUT: Sequence of three-address instructions.
OUTPUT: List of basic blocks.
METHOD:
Step 1
The first step is to determine the set of leaders, i.e. the first statements of basic blocks. The rules to obtain the leaders are:
1. The first statement is a leader.
2. Any statement which is the target of a conditional or unconditional GOTO is a leader.
3. Any statement which immediately follows a conditional or unconditional GOTO is a leader.
Step 2
For each leader, construct the basic block, which consists of the leader and all the instructions up to but not including the next leader or the end of the intermediate program.

Example
Let us construct basic blocks for the following:
1. i = 0
2. if (i > 10) goto 6
3. a[i] = 0
4. i = i + 1
5. goto 2
6. End

Let us apply step 1 and step 2 of algorithm 7.1 and identify the basic blocks.

Step 1
Identify the leaders (L = leader).
L  1. i = 0
L  2. if (i > 10) goto 6
L  3. a[i] = 0
   4. i = i + 1
   5. goto 2
L  6. End

a) 1 is a leader, because 1 is the first statement [rule 1 of step 1 in algorithm 7.1].
b) 6 and 2 are leaders by rule 2 of step 1 [any statement that is the target of a conditional or unconditional GOTO is a leader]. The 2nd statement is if (i > 10) goto 6, a conditional GOTO.
Therefore 6 is a leader. The 5th statement is goto 2, an unconditional GOTO; therefore 2 is a leader.
c) 3 is a leader by rule 3 of step 1 in algorithm 7.1 [any statement which immediately follows a conditional GOTO is a leader]. Statement 2, i.e. if (i > 10) goto 6, is a conditional GOTO. Statement 3 immediately follows statement 2. Therefore statement 3 is a leader. Statement 6 also immediately follows an unconditional GOTO, i.e. statement 5 [rule 3].

Step 2
For each leader construct the basic blocks.
B1:  1. i = 0
B2:  2. if (i > 10) goto 6
B3:  3. a[i] = 0
     4. i = i + 1
     5. goto 2
B4:  6. End
These are the basic blocks identified using algorithm 7.1.

Exercise 7.1

1. Consider
for i from 1 to 10 do
    for j from 1 to 10 do
        a[i,j] = 0.0;
for i from 1 to 10 do
    a[i,i] = 1.0;
This is a source code that turns a 10 x 10 matrix into an identity matrix. Identify the basic blocks for the above source code.

Solution
First write the intermediate code for the given source code and then identify the leaders, followed by the basic blocks.

L   1. i = 1                    B1
L   2. j = 1                    B2
L   3. t1 = 10 * i              B3
    4. t2 = t1 + j
    5. t3 = 8 * t2
    6. t4 = t3 - 88
    7. a[t4] = 0.0
    8. j = j + 1
    9. if j <= 10 goto (3)
L  10. i = i + 1                B4
   11. if i <= 10 goto (2)
L  12. i = 1                    B5
L  13. t5 = i - 1               B6
   14. t6 = 88 * t5
   15. a[t6] = 1.0
   16. i = i + 1
   17. if i <= 10 goto (13)

Note
Assume an array A of n = 5 elements starts at location 1000 (the base address) and each 'int' occupies 4 bytes, so the elements are at locations 1000, 1004, 1008, 1012 and 1016.
Loc A[i] = Base address of A + (i - 1) * c, where c = word length = 4 bytes and i is the index.
Loc A[i] = Add(A) + 4i - 4 = Add(A) - 4 + 4i
In three-address form:
T1 = 4 * i
T2 = Add(A) - 4
T3 = T2[T1]

2. Consider the following program segment:
Sum = 0
for i = 1 to n do
    Sum = Sum + a[i]
end for
Write the intermediate code and identify the basic blocks.

Step 1
The intermediate code for the given program segment is:
 1. Sum = 0
 2. i = 1               // for i = 1 to n
 3. if i > n goto 10    // goto end if i is not in 1 to n
 4. T1 = 4 * i
 5. T2 = Add(A) - 4
 6. T3 = T2[T1]         // a[i]
 7. Sum = Sum + T3
 8. i = i + 1
 9. goto 3              // end for
10. end

Step 2
The basic blocks can be obtained from step 1 as shown below:
L   1. Sum = 0                  B1
    2. i = 1
L   3. if i > n goto 10         B2
L   4. T1 = 4 * i               B3
    5. T2 = Add(A) - 4
    6. T3 = T2[T1]
    7. Sum = Sum + T3
    8. i = i + 1
    9. goto 3
L  10. end                      B4

7.4.2 Next-use Information

The 'use' of a name in a three-address statement is defined as follows. Assume three-address statement i assigns a value to x. If statement j has x as an operand, and control can flow from statement i to j along a path that has no intervening assignments of x, then j is said to 'use' the value of x computed at statement i. Here x is said to be 'live' at statement i.

ALGORITHM 7.2: Determining the liveness and the next-use information for each statement in a basic block.
INPUT: A basic block B of three-address statements.
OUTPUT: At each statement i: x = y + z in B, we attach to i the liveness and next-use information of x, y and z.
METHOD: We need to determine the next uses of x, y and z for each three-address statement x = y + z in B. Start with the last statement in B and scan backwards to the beginning of B. At each statement i: x = y + z in B do:
1. Attach to statement i the information that is currently found in the symbol table regarding the next use and liveness of x, y and z.
2. Set x to 'not live' and 'no next use' in the symbol table.
3. Set y and z to 'live' and the next uses of y and z to i in the symbol table.
Here '+' represents any operator. If three-address statement i is of the form x = + y or x = y, the steps are the same as above but ignoring z.

7.4.3 Flow Graphs

Flow graphs are used to represent the basic blocks and their relationship by a directed graph. There exists an edge from block 1 to block 2 iff it is possible for the first instruction in block 2 to immediately follow the last instruction in block 1.

The first step in writing the flow graph is to construct the basic blocks. Let us write the flow graph for the following basic blocks:
B1:  1. i = 0
B2:  2. if (i > 10) goto 6
B3:  3. a[i] = 0
     4. i = i + 1
     5. goto 2
B4:  6. End
Flow graph for these basic blocks is:

B1:  1. i = 0
B2:  2. if (i > 10) goto B4
B3:  3. a[i] = 0
     4. i = i + 1
     5. goto B2
B4:  6. End
Edges: B1 -> B2, B2 -> B3, B2 -> B4, B3 -> B2

FIGURE 7.5: Flow graph

In block B2 we have if (i > 10) goto 6, which becomes if (i > 10) goto B4. Instead of '6' it is written as B4, as statement 6 exists in block B4. Similarly, in block B3 we have goto 2, which becomes goto B2, as statement 2 exists in block B2.

Exercise 7.2

1. Obtain the flow graph for the following program construct:
Prod = 0
i = 1
do
    Prod = Prod + a[i] * b[i]
    i = i + 1
while i <= 20

Solution
To find A[i]: Loc A[i] = Add(A) - 4 + 4i, i.e. T1 = 4 * i, T2 = Add(A) - 4, T3 = T2[T1].
Similarly, to find B[i]: Loc B[i] = Add(B) - 4 + 4i.

Step 1
The three-address code (TAC) is:
 1. Prod = 0
 2. i = 1
 3. T1 = 4 * i
 4. T2 = Add(A) - 4
 5. T3 = T2[T1]
 6. T4 = 4 * i
 7. T5 = Add(B) - 4
 8. T6 = T5[T4]
 9. T7 = T3 * T6            // a[i] * b[i]
10. Prod = Prod + T7
11. i = i + 1
12. if (i <= 20) goto 3     // do-while
13. End

Step 2
Identify basic blocks.
L   1. Prod = 0                 B1
    2. i = 1
L   3. T1 = 4 * i               B2
    4. T2 = Add(A) - 4
    5. T3 = T2[T1]
    6. T4 = 4 * i
    7. T5 = Add(B) - 4
    8. T6 = T5[T4]
    9. T7 = T3 * T6
   10. Prod = Prod + T7
   11. i = i + 1
   12. if (i <= 20) goto 3
L  13. End                      B3

Here T1 = 4 * i and T4 = 4 * i can be written only once, as T1 = 4 * i. Then the 6th statement can be eliminated and the 8th statement becomes T6 = T5[T1]; T1 is used instead of T4.

Step 3
Draw the flow graph.

B1: statements 1-2
B2: statements 3-12, ending with if (i <= 20) goto B2
B3: End
Edges: B1 -> B2, B2 -> B2, B2 -> B3

FIGURE 7.6: Flow graph

Here in B2 we have if (i <= 20) goto 3. Replace 3 by B2 [statement 3 is in B2], giving if (i <= 20) goto B2.

2. Consider the following program segment:
for i = 1 to n do
    for j = 1 to n do
        C[i,j] = A[i,j] + B[i,j]
    end for
end for
a) Obtain the three-address code.
b) Obtain the basic blocks.
c) Obtain the flow graph.

Solution
Loc A[i,j] = Add(A) + ((i - 1) * n + (j - 1)) * 4
           = Add(A) + 4ni - 4n + 4j - 4
           = Add(A) - 4 + 4ni - 4n + 4j
In three-address form, the offset and the element are computed as:
T1 = 4 * n
T2 = T1 * i
T3 = T2 - T1
T4 = 4 * j
T5 = T3 + T4
T6 = Add(A) - 4
T7 = T6[T5]
Similarly, Loc B[i,j] = Add(B) - 4 + 4ni - 4n + 4j.

Step 1
Find the three-address code. (The targets of the forward jumps in statements 2 and 4 are filled in at the end, after the rest of the code has been written.)
 1. i = 1
 2. if (i > n) goto 21
 3. j = 1
 4. if (j > n) goto 19
 5. T1 = 4 * n
 6. T2 = T1 * i
 7. T3 = T2 - T1
 8. T4 = 4 * j
 9. T5 = T3 + T4
10. T6 = Add(A) - 4
11. T7 = T6[T5]         // a[i,j]
12. T8 = Add(B) - 4
13. T9 = T8[T5]         // b[i,j]
14. T10 = Add(C) - 4
15. T11 = T7 + T9       // a[i,j] + b[i,j]
16. T10[T5] = T11       // c[i,j] = a[i,j] + b[i,j]
17. j = j + 1
18. goto 4
19. i = i + 1
20. goto 2
21. End

Step 2
Obtain the basic blocks. The leaders are 1, 2, 3, 4, 5, 19, and 21.
L   1. i = 1                    B1
L   2. if (i > n) goto 21       B2
L   3. j = 1                    B3
L   4. if (j > n) goto 19       B4
L   5. T1 = 4 * n               B5
    ...
   18. goto 4
L  19. i = i + 1                B6
   20. goto 2
L  21. End                      B7

Step 3
Obtain the flow graph.
* In B2 we have if (i > n) goto 21. Replace it by if (i > n) goto B7.
* Also, replace if (j > n) goto 19 by if (j > n) goto B6.
* Also replace goto 4 and goto 2 by goto B4 and goto B2 respectively.

B1: i = 1
B2: if (i > n) goto B7
B3: j = 1
B4: if (j > n) goto B6
B5: statements 5-17, then goto B4
B6: i = i + 1, then goto B2
B7: End
Edges: B1 -> B2, B2 -> B3, B2 -> B7, B3 -> B4, B4 -> B5, B4 -> B6, B5 -> B4, B6 -> B2

FIGURE 7.7: Flow graph for matrix addition.

7.5 OPTIMIZATION OF BASIC BLOCKS

The running time of the code in basic blocks can be improved by performing 'local optimization' and 'loop optimization'. Optimizations performed exclusively within a basic block are called 'local optimization'. These are the easiest to perform, since we do not consider any control flow information; local optimization considers only the statements within the block. Many of the local optimizations have a corresponding 'global optimization' which operates on the same principle but needs additional analysis. 'Global optimization' looks at the flow of information among the basic blocks of a program.

7.5.1 DAG Representation of Basic Blocks

The majority of local optimization techniques begin with the transformation of a basic block into a directed acyclic graph (DAG). Let us see how to construct a DAG from a basic block:
1. For each initial value of the variables appearing in the basic block, there exists a node in the DAG.
2. For each statement S within the block, there exists a node N associated with S. The children of N are those nodes corresponding to the statements that are the last definitions, prior to S, of the operands used by S.
3. Node N is labelled by the operator applied at S. Also attached to N is the list of variables for which it is the last definition within the block.
4. The DAG has 'output nodes'. These are the nodes whose variables are 'live on exit' from the block ['live on exit' means that their values can be used later in another block of the flow graph]. Global flow analysis calculates these live variables.

The following code-improving transformations are performed on the code in a block, using the DAG representation of the basic block:
1. Eliminate 'local common subexpressions' [instructions which compute a value that has already been computed].
2. Eliminate 'dead code' [instructions that compute a value which is never used].
3. Reorder statements that do not depend on one another. This reordering helps to reduce the time a temporary value needs to be preserved in a register.
4. Use algebraic laws to reorder the operands of three-address instructions. This helps to simplify the computation.
7.5.2 Finding Local Common Subexpressions

Two operations are 'common' if they produce the same result. In such a case, it is better to compute the result once and reference it the second time rather than re-evaluate it. Common subexpressions can be detected by noticing, as a new node M is about to be added, whether there is an existing node N with the same children, in the same order, and with the same operator. In this case, N computes the same value as M and may be used in its place.
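The node-matching test described above can be sketched as a lookup keyed on the operator and the two child nodes. The Python code below is an illustrative sketch only: the function name build_dag, the tuple encoding (dst, op, left, right), and the returned triple are assumptions made for this example, and only binary operations on named operands are handled.

```python
def build_dag(block):
    """Build DAG nodes for a basic block, reusing a node when (op, children) match."""
    nodes = []      # node id -> (op, left id, right id), or ("leaf", name, None)
    index = {}      # (op, left id, right id) -> existing node id
    current = {}    # variable name -> node id holding its current value
    reused = []     # destinations that were attached to an already existing node

    def node_for(name):
        # A name with no definition in the block yet gets a leaf for its initial value.
        if name not in current:
            nodes.append(("leaf", name, None))
            current[name] = len(nodes) - 1
        return current[name]

    for dst, op, left, right in block:
        key = (op, node_for(left), node_for(right))
        if key in index:
            node = index[key]          # same operator, same children, same order
            reused.append(dst)
        else:
            nodes.append(key)
            node = len(nodes) - 1
            index[key] = node
        current[dst] = node            # dst now names the value of this node
    return nodes, current, reused


block = [
    ("a", "+", "b", "c"),
    ("b", "-", "a", "d"),
    ("c", "+", "b", "c"),   # not common with statement 1: b was redefined
    ("d", "-", "a", "d"),   # common with statement 2: a and d are unchanged
]
nodes, current, reused = build_dag(block)
```

In this block only the last statement is found to be common: the third statement has the same text as the first, but its operand b refers to a different node by then, so no match is made.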
