ACA Solution Manual
Problem 1.1
CPI = (45000 x 1 + 32000 x 2 + 15000 x 2 + 8000 x 2) cycles / (45000 + 32000 + 15000 + 8000) instructions = 155000/100000 = 1.55 cycles/instruction.

MIPS rate = 10^-6 x (40 x 10^6 cycles/s) / (1.55 cycles/instruction) = 25.8 MIPS.

Execution time = (45000 x 1 + 32000 x 2 + 15000 x 2 + 8000 x 2) cycles / (40 x 10^6 cycles/s) = 3.875 ms.

The execution time can also be obtained by dividing the total number of instructions by the MIPS rate:

Execution time = (45000 + 32000 + 15000 + 8000) instructions / (25.8 x 10^6 instructions/s) = 3.875 ms.

Problem 1.2
Instruction set and compiler technology affect the length of the executable code and the memory access frequency. CPU implementation and control determine the clock rate. The memory hierarchy impacts the effective memory access time. These factors together determine the effective CPI, as explained in Section 1.1.4.

Problem 1.3
(a) The effective CPI of the processor is

  CPI = (15 x 10^6 cycles/s) / (10 x 10^6 instructions/s) = 1.5 cycles/instruction.

(b) The effective CPI of the new processor is (1 + 0.3 x 2 + 0.05 x 4) = 1.8 cycles/instruction. Therefore, the MIPS rate is

  (30 x 10^6 cycles/s) / (1.8 cycles/instruction) = 16.7 MIPS.

Problem 1.4
(a) Average CPI = 1 x 0.6 + 2 x 0.18 + 4 x 0.12 + 8 x 0.1 = 2.24 cycles/instruction.
(b) MIPS rate = 40/2.24 = 17.86 MIPS.

Problem 1.5
(a) False. The fundamental idea of multiprogramming is to overlap the computations of some programs with the I/O operations of other programs.
(b) True. In an SIMD machine, all processors execute the same instruction at the same time, hence it is easy to implement synchronization in hardware. In an MIMD machine, different processors may execute different instructions at the same time, and it is difficult to support synchronization in hardware.
(c) True. Interprocessor communication is facilitated by sharing variables on a multiprocessor and by passing messages among the nodes of a multicomputer. The multicomputer approach is usually more difficult to program, since the programmer must pay attention to the actual distribution of data among the processors.
(d) False. In general, an MIMD machine executes different instruction streams on different processors.
(e) True. Contention among processors to access the shared memory may create hot spots, making multiprocessors less scalable than multicomputers.

Problem 1.6
The MIPS rates for the different machine-program combinations are shown in the following table:

              Computer A   Computer B   Computer C
  Program 1     100          10            5
  Program 2       0.1         1            5
  Program 3       0.2         0.1          2
  Program 4       1           0.125        1

Various means of these values can be used to compare the relative performance of the computers. Definitions of the means for a sequence of positive numbers a1, a2, ..., an are summarized below. (See also the discussion in Section 3.1.2.)

(a) Arithmetic mean: AM = (a1 + a2 + ... + an) / n.
(b) Geometric mean: GM = (a1 x a2 x ... x an)^(1/n).
(c) Harmonic mean: HM = n / (1/a1 + 1/a2 + ... + 1/an).

In general, AM >= GM >= HM. Based on the definitions, the following table of mean MIPS rates is obtained:

                     Computer A   Computer B   Computer C
  Arithmetic mean      25.3          2.81         3.25
  Geometric mean        1.19         0.59         2.66
  Harmonic mean         0.25         0.21         2.1

Note that the arithmetic mean of the MIPS rates is proportional to the inverse of the harmonic mean of the execution times. Likewise, the harmonic mean of the MIPS rates is proportional to the inverse of the arithmetic mean of the execution times. The two observations are consistent with Eq. 1.1.
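As a quick cross-check of the two tables above, the three means can be recomputed directly from the per-program MIPS rates. The short Python sketch below is not part of the original solution; the data layout and variable names are mine, and the values come from the first table of Problem 1.6.

from math import prod

rates = {
    "A": [100, 0.1, 0.2, 1],
    "B": [10, 1, 0.1, 0.125],
    "C": [5, 5, 2, 1],
}

for name, a in rates.items():
    n = len(a)
    am = sum(a) / n                     # arithmetic mean
    gm = prod(a) ** (1 / n)             # geometric mean
    hm = n / sum(1 / x for x in a)      # harmonic mean
    print(f"Computer {name}: AM = {am:.2f}, GM = {gm:.2f}, HM = {hm:.2f}")
    # The printed values agree with the mean table above (up to rounding).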
If we use the harmonic mean of the MIPS rates as the performance criterion (i.e., each program is executed the same number of times on each computer), computer C has the best performance. On the other hand, if the arithmetic mean of the MIPS rates is used, which is equivalent to allotting an equal amount of time to the execution of each program on each computer (i.e., fast-running programs are executed more frequently), then computer A is the best choice.

Problem 1.7
* An SIMD computer has a single control unit. The other processors are simple slave processors which accept instructions from the control unit and perform an identical operation at the same time on different data. Each processor in an MIMD computer has its own control unit and execution unit. At any moment, a processor can execute an instruction different from the other processors.
* Multiprocessors have a shared memory structure. The degree of resource sharing is high, and interprocessor communication is carried out via shared variables in the shared memory. In multicomputers, each node typically consists of a processor and local memory. The nodes are connected by communication channels which provide the mechanism for message interchange among processors. Resource sharing is light among processors.
* In the UMA architecture, each memory location in the system is equally accessible to all processors, and the access time is uniform. In the NUMA architecture, the access time to a memory location depends on the proximity of a processor to that memory location; therefore, the access time is nonuniform. In the NORMA architecture, each processor has its own private memory; no memory is shared among processors, and each processor is allowed to access its private memory only. In the COMA architecture, such as that adopted by the KSR-1, each processor has its private cache, and these caches together constitute the global address space of the system. It is like a NUMA with cache in place of memory. A page of data can be migrated to a processor upon demand or be replicated on more than one processor.

Problem 1.8
(a) The total number of cycles needed on a sequential processor is (4 + 4 + 8 + 4 + 2 + 4) x 64 = 1664 cycles.
(b) Each PE executes the same instruction on the corresponding elements of the vectors involved. There is no communication among the processors. Hence the total number of cycles on each PE is 4 + 4 + 8 + 4 + 2 + 4 = 26.
(c) The speedup is 1664/26 = 64 with a perfectly parallel execution of the code.

Problem 1.9
Because the processing power of a CRCW-PRAM and an EREW-PRAM is the same, we need only focus on memory accesses. Below, we prove that the time complexity of simulating a concurrent write or a concurrent read on an EREW-PRAM is O(log n). Before the proof, we assume it is known that an EREW-PRAM can sort n numbers or write a number to n memory locations in O(log n) time.

(a) We present the proof for simulating concurrent writes below.
1. Create an auxiliary array A of length n. When CRCW processor P_i, for i = 0, 1, ..., n-1, desires to write a datum x_i to a location l_i, the corresponding EREW processor P_i writes the ordered pair (l_i, x_i) to location A[i]. These writes are exclusive, since each processor writes to a distinct memory location.
2. Sort the array by the first coordinate of the ordered pairs in O(log n) time, which causes all data written to the same location to be brought together in the output.
3. Each EREW processor P_i, for i = 1, 2, ..., n-1, now inspects A[i] = (l_j, x_j) and A[i-1] = (l_k, x_k), where j and k are values in the range 0 <= j, k <= n-1. If l_j != l_k, processor P_i writes the datum x_j to location l_j; otherwise it does nothing. In addition, P_0 always writes the datum in A[0] to its location. Since the array is sorted by location, at most one processor writes to any given location, so these writes are exclusive, and the whole simulation runs in O(log n) time, dominated by the sort.
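In code, the three steps look like the following sequential sketch. It is not from the text: the function name and data layout are mine, an ordinary sort stands in for the O(log n) EREW-PRAM sort, and the loop body plays the role of the per-processor comparison in step 3.

def simulate_concurrent_write(memory, requests):
    """requests[i] = (l_i, x_i): CRCW processor P_i wants to write datum x_i to location l_i."""
    A = sorted(requests, key=lambda pair: pair[0])   # step 2: sort the pairs by target location
    for i, (loc, val) in enumerate(A):               # step 3: only the "leader" of each group writes
        if i == 0 or A[i - 1][0] != loc:             # exclusive: one writer per distinct location
            memory[loc] = val
    return memory

print(simulate_concurrent_write({}, [(3, "a"), (0, "b"), (3, "c"), (1, "d")]))
# {0: 'b', 1: 'd', 3: 'a'} -- exactly one value survives per contended location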
(b) If we change the positions of some switch modules in the Baseline network, it becomes the Flip network (diagram not reproduced).
(c) Since both the Omega network and the Flip network are topologically equivalent to the Baseline network, they are topologically equivalent to each other.

Problem 2.16
(a) k^n.
(b) n * floor(k/2).
(c) 2k^(n-1).
(d) 2n.
(e) [answer not legible in this copy]
* A k-ary 1-cube is a ring with k nodes. A k-ary 2-cube is a 2D k x k torus. A mesh is a torus without the end-around connections. A 2-ary n-cube is a binary n-cube.
* An Omega network is the multistage network implementation of the shuffle-exchange network. Its switch modules can be repositioned to have the same interconnection topology as a binary n-cube.
* The conventional torus has long end-around connections, but the folded torus has equal-length connections. (See Figure 2.21 in the text.)
* The relation B = 2wN/k will be shown in the solution of Problem 2.18. Therefore, if both the number of nodes N and the wire bisection width B are held constant, the channel width w is proportional to k:

  w = Bk/(2N).

  The latency of a wormhole-routed network is dominated by the term L/w (message length divided by channel width), which is inversely proportional to w and hence inversely proportional to k. This means a network with a higher k will have a lower latency. For two k-ary n-cube networks with the same number of nodes, the one with the lower dimension has a larger k, and hence a lower latency.

  It will be shown in the solution of Problem 2.18 that the hot-spot throughput is equal to the bandwidth of a single channel. Low-dimensional networks have a larger k, hence a higher hot-spot throughput.

Problem 4.1
(a) The processor design space is a coordinate space with the x and y axes representing clock rate and CPI, respectively. Each point in the space corresponds to a design choice of a processor whose performance is determined by the values of the two coordinates.
(b) The time required between issuing two consecutive instructions.
(c) The number of instructions issued per cycle.
(d) The number of cycles required for the execution of a simple instruction, such as add, move, etc.
(e) Two or more instructions attempt to use the same functional unit at the same time.
(f) A coprocessor is usually attached to a processor and performs special functions at high speed. Examples are floating-point and graphics coprocessors.
(g) Registers which are not designated for special usage, as opposed to special-purpose registers such as base registers or index registers.
(h) The addressing mode specifies how the effective address of an operand is generated so that its actual value can be fetched from the correct memory location.
(i) In the case of a unified cache, both data and instructions are kept in the same cache. In split caches, data and instructions are held in separate caches.
(j) Hardwired control: control signals for each instruction are generated by appropriate circuitry such as delay elements. Microcoded control: each instruction is implemented by a set of microinstructions which are stored in a control memory; the decoding of the microinstructions generates the signals that control the execution of an instruction.

Problem 4.2
(a) Virtual address space is the memory space required by a process during its execution to accommodate the variables, buffers, etc., used in the computations.
(b) Physical address space is the set of addresses assigned to the physically available memory words.
(c) Address mapping is the process of translating a virtual address to a physical address.
(d) The entirety of a cache is divided into fixed-size entities called blocks. A block is the unit of data transfer between main memory and cache.
(e) Multiple levels of page tables are used to translate a virtual page number into a page frame number. In this case, some tables actually store pointers to other tables, similar to an indirect addressing mode. The objective is to deal with a large memory space and to facilitate protection.
(f) The hit ratio at level i of the memory hierarchy is the probability that a data item is found in M_i.
(g) A page fault is the situation in which a demanded page cannot be found in the main memory and has to be brought in from the disk.
(h) A hash function maps an element in a large set to an index in a small set. Usually it treats the input element as a number or a sequence of numbers and performs arithmetic operations on it to generate the index. A suitable hash function should map the input set uniformly onto the output set.
(i) An inverted page table contains entries that record the virtual page number associated with each page frame that has been allocated. This is contrary to a direct-mapping page table.
(j) The strategies used to select the page or pages resident in the main memory to be replaced when such a need arises.

Problem 4.4
(a) The comparison is tabulated below:

  Item                 CISC                              RISC
  Instruction format   16-64 bits per instruction        fixed 32-bit format
  Addressing modes     12-24                             limited to 3-5, mostly register-based (memory accessed only via load/store)
  CPI                  2 to 15, about 5 on the average   under 1.5, very close to 1

(b) Advantages of separate caches:
1. The bandwidth is doubled, because two complementary requests can be serviced at the same time.
2. The logic design is simplified, because arbitration between instruction and data accesses to the cache is simplified or eliminated.
3. The access time is reduced, because data and instructions can be placed close to the functional units which will access them. For instance, the instruction cache can be placed close to the instruction fetch and decode units.

Disadvantages of separate caches:
1. The consistency problem is complicated, because data and instructions may coexist in the same cache block. This is true if self-modifying code is allowed or when data and instructions are intermixed and stored in the same cache block. To avoid this would require compiler support to ensure that instructions and data are stored in different cache blocks.
2. Separate caches may lead to inefficient use of cache memory, because the working set size of a program varies with time, and the fraction devoted to data and to instructions also varies. Hence, the sum of the data cache size and the instruction cache size is usually larger than the size of a unified cache. As a result, the utilization of the instruction cache and/or the data cache is likely to be lower.

For separate caches, dedicated data paths are required for both the instruction and the data cache. Separate MMUs and TLBs are also desirable for separate caches to shorten the address translation time, and a higher memory bandwidth should be provided to support the increased demand. In an actual implementation, there is a tradeoff between the degree of support provided and the resulting hardware complexity.

(c)
* Instruction issue: A scalar RISC processor issues one instruction per cycle; a superscalar RISC processor can usually issue more than one per cycle.
* Pipeline architecture: In an m-issue superscalar processor, up to m pipelines may be active in any base cycle. A scalar processor is equivalent to a superscalar processor with m = 1.
* Processor performance: An m-issue superscalar processor can have a performance m times that of a scalar processor, provided both are driven by the same clock rate and no dependence relations or resource conflicts exist among the instructions.

(d) Both superscalar and VLIW architectures employ multiple functional units to allow concurrent instruction execution. A superscalar processor requires more sophisticated hardware support, such as large reorder buffers and reservation stations, in order to make efficient use of the system resources; software support is needed to resolve data dependences and improve efficiency. In a VLIW machine, instructions are compacted by the compiler, which explicitly packs together instructions that can be executed concurrently, based on heuristics or run-time statistics. Because of the explicit specification of parallelism, the hardware and software support needed at run time is usually simpler; for instance, the decoding logic can be simple.

Problem 4.5
Only a single pipeline in a scalar CISC or RISC architecture is active at a time, exploiting parallelism at the microinstruction level, and the operational requirements are simple. In a superscalar RISC, multiple pipelines can be active simultaneously; this requires extensive hardware and software support to effectively exploit instruction-level parallelism. In a VLIW architecture, multiple pipelines can also be active at the same time, and sophisticated compilers are needed to compact irregular code into long instruction words for concurrent execution.

Problem 4.6
(a) The i486 is a CISC processor. The general instruction format consists of optional prefix bytes, an opcode of 1 or 2 bytes, an optional "mod r/m" byte, an optional "s-i-b" (scale-index-base) byte, an address displacement of 0, 1, 2, or 4 bytes, and an immediate field of 0, 1, 2, or 4 bytes (diagram not reproduced). A few variations also exist for some instructions.

Data formats:
* Byte (8 bits): 0-255
* Word (16 bits): 0-64K
* DWord (32 bits): 0-4G
* 8-bit integer: range on the order of 10^2
* 16-bit integer: range on the order of 10^4
* 32-bit integer: range on the order of 10^9
* 64-bit integer: range on the order of 10^18
* 8-bit unpacked BCD (1 digit): 0-9
* 8-bit packed BCD (2 digits): 0-99
* 80-bit packed BCD (18 digits): range on the order of 10^18
* Single-precision real (24 bits): approximately +-10^(+-38)
* Double-precision real (53 bits): approximately +-10^(+-308)
* Extended-precision real (64 bits): approximately +-10^(+-4932)
* Byte string, word string, DWord string, and bit string formats to support ASCII data

(b) There are 12 different modes whereby the effective address (EA) can be generated:
* register mode
* immediate mode
* direct mode: EA = displacement
* register indirect (base) mode: EA = (base register)
* based with displacement: EA = (base register) + displacement
* index with displacement: EA = (index register) + displacement
* scaled index with displacement: EA = (index register) x scale + displacement
* based index: EA = (base register) + (index register)
* based scaled index: EA = (base register) + (index register) x scale
* based index with displacement: EA = (base register) + (index register) + displacement
* based scaled index with displacement: EA = (base register) + (index register) x scale + displacement
* relative: New_PC = PC + displacement (used in conditional jumps, loops, and call instructions)
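For concreteness, the arithmetic behind a couple of these modes can be sketched as follows. The register names and contents below are made-up example values, not part of the i486 description above; the only point is how base, index, scale, and displacement combine into an effective address.

# Hypothetical register contents (example values only).
regs = {"EBX": 0x1000, "ESI": 0x0020}

def ea_based_disp(base, disp):
    """based with displacement: EA = (base register) + displacement"""
    return regs[base] + disp

def ea_based_scaled_index_disp(base, index, scale, disp):
    """based scaled index with displacement: EA = (base) + (index) * scale + displacement"""
    return regs[base] + regs[index] * scale + disp

print(hex(ea_based_disp("EBX", 0x8)))                          # 0x1008
print(hex(ea_based_scaled_index_disp("EBX", "ESI", 4, 0x10)))  # 0x1000 + 0x20*4 + 0x10 = 0x1090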
Problem 4.9
(a) Two situations may cause pipelines to be underutilized: (i) the instruction latency is longer than one base cycle, and (ii) the combined cycle time is greater than the base cycle.
(b) Dependences among instructions or resource conflicts among instructions can prevent the simultaneous execution of instructions.

Problem 4.10
(a) Vector instructions perform identical operations on vectors whose length is usually much larger than 1. Scalar instructions operate on a single number or a pair of numbers at a time.
(b) Suppose the pipeline is composed of k stages and the vector is of length N. The first output is generated in the k-th cycle; afterward, an additional output is generated in each cycle, so the last result comes out of the pipeline in cycle N + k - 1. Using a base scalar machine, it takes Nk cycles. Thus the speedup is Nk/(N + k - 1).
(c) If m-issue vector processing is employed, each vector is of length N/m, and the execution time is (N/m + k - 1) cycles. If only parallel issue is used, the execution time is (N/m)k cycles. Thus, the speed improvement is

  (N/m)k / (N/m + k - 1) = Nk / (N + m(k - 1)).

Problem 4.11
(a) The average cost is

  c = (c1 s1 + c2 s2) / (s1 + s2).

For c to approach c2, the conditions are s2 >> s1 and c2 s2 >> c1 s1.
(b) The effective access time is

  ta = sum over i of fi ti = h t1 + (1 - h) t2.

(c) With t2 = r t1, we have ta = (h + (1 - h) r) t1, and the efficiency is

  E = t1/ta = 1 / (h + (1 - h) r).

(d) The plots of E against the hit ratio h for various values of r are given in a figure (not reproduced).
(e) If r = 100, we require E = 1/(h + (1 - h) x 100) > 0.95. Solving the inequality, we obtain the condition h > 0.9995 (approximately).

Problem 4.14
(a) The inclusion property refers to the property that information present in a lower-level memory must be a subset of that in a higher-level memory.
(b) The coherence property requires that copies of an information item be identical throughout the memory hierarchy.
(c) The write-through policy requires that changes made to a data item in a lower-level memory be made in the next higher-level memory immediately.
(d) The write-back policy postpones the update at the level (i+1) memory until the item is replaced or removed from the level i memory.
(e) Paging divides virtual memory and physical memory into pages of a fixed size to simplify memory management and alleviate the fragmentation problem.
(f) Segmentation divides the virtual address space into variable-sized segments. Each segment corresponds to a logical unit. The main purpose of segmentation is to facilitate the sharing and protection of information among programs.

Problem 4.18
The comparison between symbolic and numeric processing can be summarized attribute by attribute:

* Data objects -- Symbolic: lists, relational databases, scripts, semantic nets, frames, blackboards, objects, production systems. Numeric: integers, floating-point numbers, vectors, matrices.
* Common operations -- Symbolic: search, sort, pattern matching, filtering, contexts, partitions, transitive closures, unification, text retrieval, set operations, reasoning. Numeric: add, subtract, multiply, divide, matrix multiplication, matrix-vector multiplication, reduction operations such as the dot product of vectors, etc.
* Memory requirements -- Symbolic: large memory with an intensive access pattern; addressing is often content-based; locality of reference may not hold. Numeric: great memory demand with intense access; the access pattern usually exhibits a high degree of spatial and temporal locality.
* Communication patterns -- Symbolic: message traffic varies in size and destination, as do the granularity and format of the message units. Numeric: message traffic and granularity are relatively uniform; proper mapping can restrict communication to be largely between neighboring processors.
* Algorithm properties -- Symbolic: nondeterministic, possibly parallel and distributed computations; data dependences may be global and irregular in pattern and granularity. Numeric: typically deterministic; amenable to parallel and distributed computations;
data dependence is mostly local and regular.
* Input/output requirements -- Symbolic: inputs can be graphical and audio as well as from the keyboard; access to very large on-line databases is required. Numeric: large data sets usually exceed the memory capacity; fast I/O is highly desirable.
* Architecture features -- Symbolic: parallel update of large knowledge bases, dynamic load balancing, dynamic memory allocation, hardware-supported garbage collection, stack processor architectures, symbolic processors. Numeric: can be pipelined vector processors, MIMD, or SIMD processors using various memory and interconnection structures; a systolic array is suitable for certain types of computations.

Problem 5.9
(a) Each set of the cache consists of 256/8 = 32 block frames, and the entire cache has 16 x 1024/256 = 64 sets. Similarly, the memory contains 1024 x 1024/8 = 131072 blocks. Thus, the memory address is partitioned into three fields:

  | cache address tag (11 bits) | set address (6 bits) | word address (3 bits) |

A block B of the main memory is mapped to a block frame in set F of the cache if F = B mod 64.
(b) The effective memory access time for this memory hierarchy is 50 x 0.95 + 400 x (1 - 0.95) = 47.5 + 20 = 67.5 ns.

Problem 5.10
(a) The address assignment across the four memory modules M0-M3 is shown in a diagram (not reproduced).
(b) There are 1024/16 = 64 blocks in the main memory, and 256/16 = 16 block frames in the cache.
(c) 10 bits are needed to address each word in the main memory: 2 for selecting the memory module and 8 for the offset of a word within the module. 6 bits are required to select a word in the cache: 2 bits to select the set number and 4 bits to select a word within a block. Besides, each block frame needs a 4-bit address tag to identify the block resident in it.
(d) Memory block B maps into set (B mod 4) of the cache (diagram not reproduced). After the set into which a memory block can be mapped is identified, the address tags of the block frames in that set are compared by associative search with the physical memory address to determine whether the desired block is in the cache.

Problem 5.14
(a) The effective time added to each memory access by cache misses is

  ta = f1 (1 - h1) t1 + f2 (1 - h2) t2,

where the fi are the access frequencies, the hi the hit ratios, and the ti the corresponding miss-service times. The per-instruction execution time (the "CPI", measured here in microseconds) is obtained by adding this memory-stall component to the basic execution and fault-handling components, and the effective MIPS rate of the entire system with p processors is MIPS = p/CPI.
(b) Using the data given,

  ta = 0.5 x (1 - 0.95) x 0.5 + 0.5 x (1 - 0.7) x 0.5 = 0.0875,
  CPI = 0.4 x 0.0875 + ... + 0.08 x 5 = 0.485 microseconds per instruction,

and finally p/0.485 >= 25. Hence, the number of processors needed is p = 13.

Problem 6.1
(a) Speedup = nk / (k + (n - 1)) = (15000 x 5) / (5 + 14999) = 75000/15004 = 4.9987.
(b) Efficiency = n / (k + (n - 1)) = 15000/15004 = 0.9997.
    Throughput = nf / (k + (n - 1)) = 15000 x 25 x 10^6 (instructions/s) / 15004 = 24.99 MIPS.

Problem 6.5
The lower bound of the MAL is the maximum number of checkmarks in any row of the reservation table. The upper bound of the MAL is the number of 1's in the initial collision vector plus 1. A detailed proof can be found in the paper by Shar (1972).

Problem 6.6
(a) Forbidden latencies: 1, 2, and 5. Initial collision vector: (10011).
(b) State transition diagram: (not reproduced).
(c) MAL = 3.
(d) Throughput = 1/(3 x 20 ns) = 16.67 million operations per second (MOPS).
(e) The lower bound of the MAL is 2, so the optimal latency is not achieved.
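Problems 6.6 through 6.14 all follow the same recipe: form the collision vector and then walk the state diagram. The sketch below is mine (the function and variable names are not from the text): starting from an initial collision vector written as a bit string, it follows the greedy policy of always scheduling at the smallest permissible latency and reports the latency cycle it settles into, together with the average latency that cycle achieves. Run on the collision vector (10011) of Problem 6.6 it settles into the constant cycle (3), matching the MAL of 3 in part (c); run on (11100) it reproduces the greedy cycle (1,1,6) of Problem 6.10.

def greedy_latency_cycle(cv_string):
    """Follow the greedy scheduling policy from an initial collision vector.

    cv_string is the collision vector written MSB-first, e.g. "10011" means
    latencies 5, 2 and 1 are forbidden (the i-th bit from the right stands
    for latency i).  Returns the repeating latency cycle and its average.
    """
    m = len(cv_string)
    C = int(cv_string, 2)                  # initial collision vector as an integer
    state, issued, seen = C, [], {}
    while state not in seen:
        seen[state] = len(issued)
        # smallest permissible latency = lowest zero bit; latency m+1 is always safe
        p = next((i for i in range(1, m + 1) if not (state >> (i - 1)) & 1), m + 1)
        issued.append(p)
        state = ((state >> p) | C) & ((1 << m) - 1)   # shift the state register, then OR in C
    cycle = issued[seen[state]:]           # the part of the greedy schedule that repeats
    return cycle, sum(cycle) / len(cycle)

print(greedy_latency_cycle("10011"))   # ([3], 3.0)          -> Problem 6.6, MAL = 3
print(greedy_latency_cycle("11100"))   # ([1, 1, 6], 2.66..) -> Problem 6.10, MAL = 8/3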
Problem 6.7
(a) Reservation table: (not reproduced).
(b) State transition diagram: (not reproduced).
(c) Simple cycles: (4), (5), (7), (3,1), (3,4), (3,5,4), (3,5,7), (1,7), (5,4), (5,7), (3,7), (1,3,4), (1,3,5,4), (1,3,5,7), (1,3,7), (1,4,3), (1,4,4), (1,4,7), (5,3,4), (5,3,7), and (5,3,1,7). Greedy cycle: (1,3).
(d) MAL = (1 + 3)/2 = 2.
(e) Throughput = 1/(2t), where t is the clock period.

Problem 6.9
(a) Forbidden latency: 3. Collision vector: (100).
(b) State transition diagram: (not reproduced).
(c) Simple cycles: (2), (4), (1,4), (1,1,4), and (2,4). Greedy cycles: (2) and (1,1,4).
(d) Optimal constant latency cycle: (2); MAL = 2.
(e) Throughput = 1/(2 x 20 ns) = 25 MOPS.

Problem 6.10
(a) Forbidden latencies: 3, 4, 5. Collision vector: (11100).
(b) State transition diagram: (not reproduced).
(c) Simple cycles: (1,1,6), (2,6), (6), and (1,6).
(d) Greedy cycle: (1,1,6).
(e) MAL = (1 + 1 + 6)/3 = 2.67.
(f) Minimum allowed constant cycle: (6).
(g) Maximum throughput = 1/(MAL x t) = 3/(8t).
(h) Throughput with the constant cycle: 1/(6t).

Problem 6.11
The three pipeline stages are referred to as IF, OF, and EX for instruction fetch, operand fetch, and execution, respectively. The space-time diagram of the original instruction sequence is not reproduced here; comparing the output set of each instruction with the input set of its successor gives

  O(I1) with I(I2) sharing {R0}  -> RAW hazard,
  O(I2) with I(I3) sharing {Acc} -> RAW hazard,
  O(I3) with I(I4) sharing {Acc} -> RAW hazard.

A scheduling which avoids these hazard conditions, by delaying the operand fetch of each dependent instruction until its operand has been written, is shown in the second diagram (not reproduced).

Problem 6.12
(a) For the given value ranges of m and n, we know that mn(N - 1) > N - 1 > N - m. Eq. 6.32 can be rewritten as

  S(m, n) = (mn(N - 1) + mnk) / ((N - m) + mnk).

From elementary algebra, a fraction of this form, whose constant numerator term exceeds its constant denominator term, attains its largest value when the common term mnk is smallest. As a result, the value of k should be 1 in order to maximize S(m, n).
(b) Instruction-level parallelism limits the growth of the superscalar degree.
(c) The multiphase clocking technique limits the growth of the superpipeline degree.

Problem 6.13
Solution 1
(a) Reservation table: (not reproduced).
(b) Forbidden latency: 4. Collision vector: (1000).
(c) State transition diagram: (not reproduced).
(d) Simple cycles: (1,5), (1,1,5), (1,1,1,5), (1,2,5), (1,2,3,5), (1,2,3,2,5), (1,2,3,2,1,5), (2,5), (2,1,5), (2,1,2,5), (2,1,2,3,5), (2,3,5), (3,5), (3,2,5), (3,2,1,5), (3,2,1,2,5), (5), (3,2,1,2), and (3).
(e) Greedy cycles: (1,1,1,5) and (1,2,3,2).
(f) MAL = (1 + 1 + 1 + 5)/4 = 2.
(g) Maximum throughput = 1/(2t).

Solution 2
(a) Reservation table: (not reproduced).
(b) Forbidden latencies: 2, 4. Collision vector: (1010).
(c) State transition diagram: (not reproduced).
(d) Simple cycles: (3), (5), (1,5), and (3,5).
(e) Greedy cycles: (1,5) and (3).
(f) MAL = 3.
(g) Maximum throughput = 1/(3t).

Problem 6.14
(a) The complete reservation table for the composite pipeline spans 12 columns (not reproduced).
(b) Forbidden latencies: 1, 2, 3, 7, 8, 9. Collision vector: (111000111).
(c) State transition diagram: (not reproduced).
(d) Simple cycles: (5), (6), (10), (4,6), (4,10), (5,6), and (5,10). Greedy cycles: (5) and (4,6).
(e) MAL = 5.
(f) Maximum throughput = 1/(5t).

Problem 7.12
(a) A 16 x 16 Omega network built from 2 x 2 switches has log2(16) = 4 stages of eight switches each, with a perfect-shuffle connection pattern in front of each stage (diagram not reproduced).
(b) The connections 1011 -> 0101 and 0111 -> 1001 can be traced on the diagram; as can be seen, there is no blocking between the two connections.
(c) Each switch box can implement two permutations in one pass (straight or cross), and there are (16/2) x log2(16) = 32 switch boxes.
Therefore, the total number of single-pass permutations is

  2^((16/2) x log2(16)) = 2^32,

while the total number of permutations of 16 inputs is 16!. Therefore,

  (number of single-pass permutations) / (total number of permutations) = 2^32 / 16! = 2.05 x 10^-4 (approximately).

(d) At most log2(16) = 4 passes are needed to realize all permutations.

Problem 7.13
(a) A unicast pattern is a one-to-one communication, and a multicast pattern is a one-to-many communication.
(b) A broadcast pattern is a one-to-all communication, and a conference pattern is a many-to-many communication.
(c) The channel traffic at any time instant is indicated by the number of channels used to deliver the messages involved.
(d) The communication latency is indicated by the longest packet transmission time involved.
(e) Partitioning of a physical network into several logical subnetworks. In each of the subnetworks, appropriate routing schemes can be used to avoid deadlock.
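As a small illustration of how paths such as those in Problem 7.12(b) are set up, the sketch below routes a source to a destination through a 16-input Omega network using destination-tag routing. It assumes the conventional model of one perfect-shuffle connection in front of each column of 2 x 2 switches, with the switch in stage k set by the k-th destination bit (most significant bit first); the function name is mine, and the intermediate wire labels depend on how the particular diagram is drawn. Printing the labels for the two connections of part (b) lets one check that they never compete for the same switch output.

def omega_route(src, dst, n=16):
    """Wire labels visited by a message from src to dst in an n-input Omega network."""
    stages = n.bit_length() - 1                  # log2(n) stages of 2x2 switches
    addr, path = src, [src]
    for k in reversed(range(stages)):
        addr = ((addr << 1) | (addr >> (stages - 1))) & (n - 1)   # perfect shuffle (rotate left)
        addr = (addr & ~1) | ((dst >> k) & 1)                     # switch output chosen by destination bit k
        path.append(addr)
    return path

for s, d in [(0b1011, 0b0101), (0b0111, 0b1001)]:
    print(f"{s:04b} -> {d:04b}:", [f"{x:04b}" for x in omega_route(s, d)])
# 1011 -> 0101: ['1011', '0110', '1101', '1010', '0101']
# 0111 -> 1001: ['0111', '1111', '1110', '1100', '1001']
# The two paths use distinct wires after every stage, consistent with the "no blocking"
# observation in Problem 7.12(b).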
