Microprocessor Individual Assignment
Draw the detailed architecture of the Pentium 4, Core 2, and Core i3 microprocessors:
……………………………………………………………………………………………………………………………………………………………….
PENTIUM 4
Introduction
The Pentium 4 processor is Intel’s new flagship microprocessor that was introduced at 1.5 GHz in
November of 2000. It implements the new Intel NetBurst microarchitecture that features significantly higher
clock rates and world-class performance. It includes several important new features and innovations that will
allow the Intel Pentium 4 processor to deliver industry-leading performance for the next several years.
The Pentium 4 processor is designed to deliver performance across applications where end users can truly
appreciate and experience its performance. For example, it allows a much better user experience in areas
such as Internet audio and streaming video, image processing, video content creation, speech recognition, 3D
applications and games, multi-media, and multi-tasking user environments. The Pentium 4 processor enables
real-time MPEG2 video encoding and near real-time MPEG4 encoding, allowing efficient video editing and
video conferencing. It delivers world-class performance on 3D applications and games, such as Quake 3,
enabling a new level of realism and visual quality in 3D applications.
Traditionally we have always looked for higher clock speeds and instruction-level parallelism. By
implementing these features the performance of the processor could be substantially enhanced,
but that was not enough to meet the challenges of newer applications.
This was the genesis of the Pentium 4 microprocessor, which implemented the Intel NetBurst
architecture.
The most recent versions of the Pentium Pro architecture are the Pentium 4 and, more recently, the Core 2
from Intel. The Pentium II, Pentium III, Pentium 4, and Core 2 are all versions of the
Pentium Pro architecture.
The Pentium 4 was released initially in November 2000 with a speed of 1.5 GHz. It is currently available in
speeds up to 3.8 GHz. Two packages are available for early versions of this integrated microprocessor, the
423-pin PGA and the 478-pin FC-PGA2. Both versions of the original issue of the Pentium 4 used 0.18-micron
technology for fabrication. The most recent versions use either 0.13-micron technology or 90 nm
(0.09-micron) technology. Newer versions of the Pentium 4 use an LGA (land grid array) package.
Pentium 4
Introduced: 2000
Clock speeds: 1.3–1.8 GHz
Bus width: 64 bits
Number of transistors: 42 million
Feature size: 180 nm
Addressable memory: 64 GB
Virtual memory: 64 TB
Cache: 256 KB L2
Architecture
This image was taken from “Advanced Microprocessors and Peripherals” by K. M. Bhurchandi and A. K. Ray.
1. The processor fetches instructions from memory in the order of the static program.
2. Each instruction is translated into one or more fixed-length RISC instructions, known
as micro-operations, or micro-ops.
3. The processor executes the micro-ops on a superscalar pipeline organization, so that
the micro-ops may be executed out of order.
4. The processor commits the results of each micro-op execution to the processor’s
register set in the order of the original program flow.
Pentium 4 Pipeline
In effect, the Pentium 4 architecture consists of an outer CISC shell with an inner RISC core. The inner RISC
micro-ops pass through a pipeline with at least 20 stages. In some cases, the micro-op requires multiple
execution stages, resulting in an even longer pipeline. This contrasts with the five-stage pipeline used on the
earlier Intel x86 processors and on the Pentium.
Front End
GENERATION OF MICRO-OPS The Pentium 4 organization includes an in-order front end (Figure 14.9a) that
can be considered outside the scope of the pipeline depicted in Figure 14.8. This front end feeds into an L1
instruction cache, called the trace cache, which is where the pipeline proper begins. Usually, the processor
operates from the trace cache; when a trace cache miss occurs, the in-order front end feeds new instructions
into the trace cache.
With the aid of the branch target buffer and the instruction translation lookaside buffer (BTB & I-TLB), the
fetch/decode unit fetches Pentium 4 machine instructions from the L2 cache 64 bytes at a time. As a default,
instructions are fetched sequentially, so that each L2 cache line fetch includes the next instruction to be
fetched. Branch prediction via the BTB & I-TLB unit may alter this sequential fetch operation. The I-TLB
translates the linear instruction pointer address into the physical address needed to access the L2 cache.
Static branch prediction in the front-end BTB is used to determine which instructions to fetch next.
Once instructions are fetched, the fetch/decode unit scans the bytes to determine instruction boundaries;
this is a necessary operation because of the variable length of x86 instructions. The decoder translates each
machine instruction into one to four micro-ops, each of which is a 118-bit RISC instruction. Note for
comparison that most pure RISC machines have an instruction length of just 32 bits. The longer micro-op
length is required to accommodate the more complex Pentium operations. Nevertheless, the micro-ops are
easier to manage than the original instructions from which they derive.
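The decode step described above can be sketched as follows. The instruction forms and micro-op splits in this table are illustrative assumptions, not Intel’s actual decode rules:

```python
# Hypothetical sketch: translating variable-length x86 instructions into
# short sequences of fixed-length micro-ops. The splits below are
# illustrative only.
MICRO_OP_TABLE = {
    "ADD reg, reg": ["uop_add"],                           # simple: one micro-op
    "ADD reg, mem": ["uop_load", "uop_add"],               # load + ALU op
    "ADD mem, reg": ["uop_load", "uop_add", "uop_store"],  # read-modify-write
}

def decode(instruction):
    """Return the micro-op sequence for one machine instruction.

    Instructions needing more than four micro-ops would be handed to the
    microcode ROM instead (not modeled here).
    """
    uops = MICRO_OP_TABLE[instruction]
    if len(uops) > 4:
        raise NotImplementedError("complex instruction: sequenced by microcode ROM")
    return uops
```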
Once the instruction is executed, the history portion of the appropriate entry is updated to reflect the result
of the branch instruction. If this instruction is not represented in the BTB, then the address of this instruction
is loaded into an entry in the BTB; if necessary, an older entry is deleted. The description in the preceding
two paragraphs fits, in general terms, the branch prediction strategy used on the original Pentium model, as
well as the later Pentium models, including the Pentium 4. However, in the case of the Pentium, a relatively
simple 2-bit history scheme is used. The later Pentium models have much longer pipelines (20 stages for the
Pentium 4 compared with 5 stages for the Pentium), and therefore the penalty for misprediction is greater.
Accordingly, the later Pentium models use a more elaborate branch prediction scheme with more history bits
to reduce the misprediction rate.
The Pentium 4 BTB is organized as a four-way set-associative cache with 512 lines. Each entry uses the
address of the branch as a tag. The entry also includes the branch destination address for the last time this
branch was taken and a 4-bit history field. This use of four history bits contrasts with the 2 bits used in the
original Pentium and in most superscalar processors. With 4 bits, the Pentium 4 mechanism can take
into account a longer history in predicting branches. The algorithm used is referred to as Yeh’s
algorithm [YEH91]. The developers of this algorithm have demonstrated that it provides a significant
reduction in misprediction compared to algorithms that use only 2 bits of history [EVER98].
Conditional branches that do not have a history in the BTB are predicted using a static prediction algorithm,
according to the following rules:
For branch addresses that are not IP relative, predict taken if the branch is a return and not taken
otherwise.
For IP-relative backward conditional branches, predict taken. This rule reflects the typical behavior of
loops.
For IP-relative forward conditional branches, predict not taken.
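The three static rules above can be expressed directly as a small function (a sketch; the parameter names are our own):

```python
def static_predict(ip_relative, is_return, branch_addr, target_addr):
    """Static prediction for a conditional branch with no BTB history.

    Rules: non-IP-relative branches are predicted taken only if they are
    returns; IP-relative backward branches (typical of loops) are predicted
    taken; IP-relative forward branches are predicted not taken.
    """
    if not ip_relative:
        return is_return                 # taken only for returns
    return target_addr < branch_addr     # backward -> taken, forward -> not taken
```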
TRACE CACHE FETCH The trace cache takes the already-decoded micro-ops from the instruction decoder and
assembles them into program-ordered sequences of micro-ops called traces. Micro-ops are fetched
sequentially from the trace cache, subject to the branch prediction logic.
A few instructions require more than four micro-ops. These instructions are transferred to microcode ROM,
which contains the series of micro-ops (five or more) associated with a complex machine instruction. For
example, a string instruction may translate into a very large (even hundreds), repetitive sequence of micro-
ops. Thus, the microcode ROM is a microprogrammed control unit in the sense discussed in
Part Four. After the microcode ROM finishes sequencing micro-ops for the current Pentium instruction,
fetching resumes from the trace cache.
DRIVE The fifth stage of the Pentium 4 pipeline delivers decoded instructions from the trace cache to the
rename/allocator module.
ALLOCATE The allocate stage allocates resources required for execution. It performs the following functions:
If a needed resource, such as a register, is unavailable for one of the three micro-ops arriving at the
allocator during a clock cycle, the allocator stalls the pipeline.
The allocator allocates a reorder buffer (ROB) entry, which tracks the completion status of one of the
126 micro-ops that could be in process at any time.
The allocator allocates one of the 128 integer or floating-point register entries for the result data
value of the micro-op, and possibly a load or store buffer used to track one of the 48 loads or 24
stores in the machine pipeline.
The allocator allocates an entry in one of the two micro-op queues in front of the instruction
schedulers.
The ROB is a circular buffer that can hold up to 126 micro-ops and also contains the 128 hardware registers.
Each buffer entry consists of the following fields:
State: Indicates whether this micro-op is scheduled for execution, has been dispatched for
execution, or has completed execution and is ready for retirement.
Memory Address: The address of the Pentium instruction that generated the micro-op.
Micro-op: The actual operation.
Alias Register: If the micro-op references one of the 16 architectural registers, this entry
redirects that reference to one of the 128 hardware registers.
Micro-ops enter the ROB in order. Micro-ops are then dispatched from the ROB to the Dispatch/Execute unit
out of order. The criterion for dispatch is that the appropriate execution unit and all necessary data items
required for this micro-op are available. Finally, micro-ops are retired from the ROB in order. To accomplish
in-order retirement, micro-ops are retired oldest first after each micro-op has been designated as ready for
retirement.
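The allocate/complete/retire behavior of the ROB described above can be sketched as follows (a simplified model; the entry fields and method names are illustrative):

```python
from collections import deque

class ReorderBuffer:
    """Micro-ops enter in order, complete in any order, retire in order."""

    def __init__(self, capacity=126):   # Pentium 4 ROB holds up to 126 micro-ops
        self.capacity = capacity
        self.entries = deque()          # oldest entry at the left

    def allocate(self, uop):
        """Enter a micro-op in program order; a full ROB stalls the pipeline."""
        if len(self.entries) == self.capacity:
            raise RuntimeError("ROB full: allocator stalls the pipeline")
        self.entries.append({"uop": uop, "done": False})

    def complete(self, uop):
        """Mark a micro-op finished; execution may finish out of order."""
        for e in self.entries:
            if e["uop"] == uop:
                e["done"] = True
                return

    def retire(self):
        """Retire ready micro-ops oldest-first; stop at the first unfinished one."""
        retired = []
        while self.entries and self.entries[0]["done"]:
            retired.append(self.entries.popleft()["uop"])
        return retired
```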
REGISTER RENAMING The rename stage remaps references to the 16 architectural registers (8 floating-point
registers, plus EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP) into a set of 128 physical registers. The stage
removes false dependencies caused by a limited number of architectural registers while preserving the true
data dependencies (reads after writes).
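The renaming idea can be illustrated with a minimal sketch, assuming a free list of physical registers and a per-register alias table:

```python
class Renamer:
    """Map architectural registers onto 128 physical registers, removing
    false (write-after-write and write-after-read) dependencies while
    preserving true read-after-write dependencies."""

    def __init__(self, num_physical=128):
        self.free = list(range(num_physical))   # free physical registers
        self.alias = {}                         # architectural name -> physical index

    def read(self, reg):
        """A read sees the most recent mapping: a true dependency."""
        return self.alias[reg]

    def write(self, reg):
        """Each write gets a fresh physical register, breaking false deps."""
        phys = self.free.pop(0)
        self.alias[reg] = phys
        return phys
```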
MICRO-OP QUEUING After resource allocation and register renaming, micro-ops are placed in one of two
micro-op queues, where they are held until there is room in the schedulers. One of the two queues is for
memory operations (loads and stores) and the other for micro-ops that do not involve memory references.
Each queue obeys a FIFO (first-in-first-out) discipline, but no order is maintained between queues. That is, a
micro-op may be read out of one queue out of order with respect to micro-ops in the other queue. This
provides greater flexibility to the schedulers.
MICRO-OP SCHEDULING AND DISPATCHING The schedulers are responsible for retrieving micro-ops from the
micro-op queues and dispatching them for execution. Each scheduler looks for micro-ops whose status
indicates that all operands are available. If the execution unit needed by that micro-op is available,
then the scheduler fetches the micro-op and dispatches it to the appropriate execution unit. Up to six micro-
ops can be dispatched in one cycle. If more than one micro-op is available for a given execution unit, then the
scheduler dispatches them in sequence from the queue. This is a sort of FIFO discipline that favors in-order
execution, but by this time the instruction stream has been so rearranged by dependencies and branches
that it is substantially out of order.
Four ports attach the schedulers to the execution units. Port 0 is used for both integer and floating-point
instructions, with the exception of simple integer operations and the handling of branch mispredictions,
which are allocated to Port 1. In addition, MMX execution units are allocated between these two ports. The
remaining ports are for memory loads and stores.
A subsequent pipeline stage performs branch checking. This function compares the actual branch result with
the prediction. If a branch prediction turns out to have been wrong, then there are micro-operations in
various stages of processing that must be removed from the pipeline. The proper branch destination is then
provided to the Branch Predictor during a drive stage, which restarts the whole pipeline from the new target
address.
Generally, the Architecture of Pentium 4 Processor consists of a Bus Interface Unit (BIU), Instruction Fetch
and Decoder Unit, Trace Cache (TC), Microcode ROM, Branch Target Buffer (BTB), Branch Prediction,
Instruction Translation Look-aside Buffer (ITLB), Execution Unit, and Rapid Execution Module.
As the figure shows, the architecture of the Pentium 4 processor has four modules: (i) the
memory subsystem module, (ii) the front-end module, (iii) the integer/floating-point execution unit, and
(iv) the out-of-order execution unit. The memory subsystem module contains the Bus Interface Unit (BIU)
and an optional L3 cache. The front-end module consists of the instruction decoder, Trace Cache (TC),
microcode ROM, Branch Target Buffer (BTB), and branch prediction logic. The integer/floating-point
execution unit has the L1 data cache and the execution units. The out-of-order execution unit consists of
the execution units and retirement logic. In this section, the detailed internal architecture of the Pentium 4
processor is discussed.
Memory Subsystem
This includes the L2 cache and the system bus. The L2 cache stores both instructions and data that cannot fit
in the Execution Trace Cache and the L1 data cache. The external system bus is connected to the backside of
the second-level cache and is used to access main memory when the L2 cache has a cache miss, and to access
the system I/O resources.
Bus Interface Unit (BIU) The Bus Interface Unit (BIU) communicates with the system bus, cache bus,
L2 cache, L1 data cache, and L1 code cache.
The Pentium 4 instruction cache is described subsequently. The Pentium II also includes an L2 cache that
feeds both of the L1 caches. The L2 cache is eight-way set associative with a size of 512 KB and a line size of
128 bytes. An L3 cache was added for the Pentium III and became on-chip with high-end versions of the
Pentium 4.
The figure below (taken directly from Computer Organization and Architecture: Designing for
Performance, Eighth Edition, by William Stallings) provides a simplified view of the Pentium 4
organization, highlighting the placement of the three caches. The processor core consists of four major
components:
Fetch/decode unit: Fetches program instructions in order from the L2 cache, decodes these into a
series of micro-operations, and stores the results in the L1 instruction cache.
Out-of-order execution logic: Schedules the execution of micro-operations subject to data dependencies
and resource availability; micro-operations may therefore be executed in a different order than they were fetched.
Execution units: These units execute micro-operations, fetching the required data from the L1 data
cache and temporarily storing results in registers.
Memory subsystem: This unit includes the L2 and L3 caches and the system bus, which is used to
access main memory when the L1 and L2 caches have a cache miss and to access the system I/O resources.
Unlike the organization used in all previous Pentium models, and in most other processors, the Pentium 4
instruction cache sits between the instruction decode logic and the execution core. The reasoning behind
this design decision is as follows: as discussed more fully in Chapter 14, the Pentium processor decodes, or
translates, Pentium machine instructions into simple RISC-like instructions called micro-operations. The use
of simple, fixed-length micro-operations enables the use of superscalar pipelining and scheduling techniques
that enhance performance.
However, the Pentium machine instructions are cumbersome to decode; they have a variable number of
bytes and many different options. It turns out that performance is enhanced if this decoding is done
independently of the scheduling and pipelining logic. We return to this topic in Chapter 14.
The data cache employs a write-back policy: Data are written to main memory only when they are removed
from the cache and there has been an update. The Pentium 4 processor can be dynamically configured to
support write-through caching.
The L1 data cache is controlled by two bits in one of the control registers, labeled the CD (cache disable) and
NW (not write-through) bits (Table 4.5). There are also two Pentium 4 instructions that can be used to control
the data cache:
INVD invalidates (flushes) the internal cache memory and signals the external cache (if any) to invalidate.
WBINVD writes back and invalidates internal cache and then writes back and invalidates external cache.
Both the L2 and L3 caches are eight-way set associative with a line size of 128 bytes.
Register Organization
The Pentium 4 architecture employs a combination of general-purpose registers and special-
purpose registers to facilitate efficient data processing and control flow. The key components include:
General-Purpose Registers: The Pentium 4 has 128 physical registers for integer and
floating-point values. These are used for arithmetic operations and data manipulation.
Reorder Buffer: It contains 126 entries that help in out-of-order execution, allowing the processor to
execute instructions as resources become available rather than strictly following the original order.
Memory Management Registers: These include the Page Directory Base Register (PDBR) and
Control Registers (CR0, CR3, CR4), which are essential for managing memory access and paging.
Memory Paging
Paging is a memory management scheme that eliminates the need for contiguous allocation of physical
memory. The Pentium 4 supports a 32-bit paging mechanism, which allows the operating system to
manage memory efficiently by mapping virtual addresses to physical addresses. Key features include:
Page Size: The standard page size is typically 4 KB, but the Pentium 4 also supports 4-MB
pages through Page Size Extensions (PSE), which reduces the overhead of managing multiple smaller
pages.
Virtual Memory: The paging mechanism enables the execution of programs larger than the available
physical memory by using disk space to store inactive pages. This allows for a more flexible use of RAM and
improves multitasking capabilities.
Paging Mechanism
The translation from virtual addresses to physical addresses involves several steps:
Linear Address Generation: The CPU generates a linear address based on the program's request.
Page Directory and Page Table Lookup: The linear address is divided into parts that index into
a two-level page table structure:
The first part indexes the Page Directory, whose entry points to a page table; the second part accesses the
Page Table Entry (PTE), which points to the actual physical frame in memory.
Physical Address Calculation: The PTE provides the upper bits of the physical address, while the
lower bits come from the linear address itself.
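For the standard two-level scheme with 4 KB pages, the 32-bit linear address splits into a 10-bit directory index, a 10-bit table index, and a 12-bit offset; a small sketch:

```python
def split_linear_address(addr):
    """Split a 32-bit linear address for two-level 4 KB paging:
    bits 31:22 index the page directory, bits 21:12 index the page
    table, and bits 11:0 are the offset within the 4 KB page."""
    directory = (addr >> 22) & 0x3FF   # 10-bit page-directory index
    table     = (addr >> 12) & 0x3FF   # 10-bit page-table index
    offset    = addr & 0xFFF           # 12-bit page offset
    return directory, table, offset
```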
Paging Registers
The Pentium 4 utilizes several critical registers for managing its paging system:
Page Directory Base Register (PDBR): This register holds the base address of the page
directory in memory. It is crucial for translating linear addresses to physical addresses.
CR4: Enables additional features such as PSE, allowing for larger page sizes and enhanced memory
management capabilities.
Paging Enhancements
Support for Multiple Paging Levels: This allows for efficient mapping of large address spaces
while minimizing memory fragmentation.
Core 2
Introduction
Intel Core 2 Duo is a high performance and power efficient dual core Chip-Multiprocessor (CMP). CMP
embeds multiple processor cores into a single die to exploit thread-level parallelism for achieving higher
overall chip-level Instruction-Per-Cycle (IPC). In a multi-core, multithreaded processor chip, thread-level
parallelism combined with increased clock frequency exerts a higher demand for on-chip and off-chip
memory bandwidth causing longer average memory access delays. There has been great interest shown by
researchers to understand the underlying reasons that cause these bottlenecks in processors.
The advances in circuit integration technology and the inevitability of thread-level parallelism over
instruction-level parallelism for performance efficiency have made Chip-Multiprocessor (CMP), or multi-core,
technology the mainstream in CPU designs.
Core 2 Duo employs Intel’s Advanced Smart Cache, a shared L2 cache that increases the effective on-chip
cache capacity. Upon a miss from a core’s L1 cache, the shared L2 and the L1 of the other core are
looked up in parallel before the request is sent to memory. A cache block located in the other core’s L1
cache can be fetched without off-chip traffic. Both the memory controller and the FSB are still located
off-chip. The off-chip memory controller can adapt to new DRAM technologies at the cost of longer memory
access latency. Intel Advanced Smart Cache provides a peak transfer rate of 96 GB/s (at a 3 GHz frequency).
Core 2 Duo employs aggressive memory dependence predictors for memory disambiguation: a load
instruction is allowed to execute before an earlier store instruction whose address is still unknown. It also
implements macro-fusion, which combines certain adjacent instruction pairs (such as a compare followed
by a conditional jump) into a single micro-operation.
The stride prefetcher on the L1 data cache is also known as the Instruction Pointer-based (IP) prefetcher.
The IP prefetcher builds a history for each load using the load’s instruction pointer and keeps it in an IP
history array; the address of the next load is predicted using a constant stride calculated from the entries
in the history array.
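A minimal sketch of such an IP-based stride predictor (the table organization is simplified; a real implementation would also track prediction confidence):

```python
class StridePrefetcher:
    """IP-based stride prefetcher sketch: keep, per load instruction
    pointer, the last address seen and the last stride, and predict the
    next address as last_addr + stride."""

    def __init__(self):
        self.history = {}   # load IP -> (last_addr, stride)

    def access(self, ip, addr):
        """Record one load; return the predicted next address, or None
        when this IP has no history yet."""
        if ip in self.history:
            last_addr, _ = self.history[ip]
            stride = addr - last_addr
            self.history[ip] = (addr, stride)
            return addr + stride
        self.history[ip] = (addr, 0)
        return None
```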
Other important features include support for new SIMD instructions, called Supplemental Streaming SIMD
Extensions 3, coupled with better power-saving technologies. Table 1.1 specifies the CPU specification of the
Intel Core 2 Duo machine used for carrying out the experiments. It has separate 32 KB L1 instruction and data
caches per core. A 2 MB L2 cache is shared by the two cores. Both L1 and L2 caches are 8-way set associative
and have 64-byte lines.
14-Stage Pipeline
While the NetBurst architecture relied on extremely deep pipelines (up to 31 stages),
Core 2 uses a much shorter 14-stage pipeline. This is still longer than the 12-stage pipeline
that AMD uses in the Athlon 64, but longer pipelines allow the work undertaken by
the processor to be broken down into smaller parts that can be carried out faster.
Instructions/Clock Cycle
The Core 2 is based on a four-wide architecture. This means that it is capable of fetching,
dispatching, executing and retiring four instructions every clock cycle. This beats the
three-wide architecture used in the Pentium 4/D and Athlon 64 architectures
by 33%.
L1/Shared L2 Cache
The Core 2 Duo and the Core 2 Extreme processors have two cores that each have
64 KB of L1 cache. This is split into a 32 KB instruction cache called the I-cache and a
32 KB data cache called the D-cache.
The two cores also share a larger L2 cache, which differs from the Pentium D and
Athlon 64 X2, both of which have independent L2 caches. On the E6300 and E6400,
this is 2 MB, while on the higher-end E6600, E6700 and X6800, it is doubled to 4 MB.
128-Bit SIMD Execution
Unlike the Pentium 4/D, which could only execute one 128-bit SIMD (Single Instruction,
Multiple Data) instruction every two clock cycles, the Core 2 can do the same amount of
work in a single clock cycle.
All Core 2 Duo models have a TDP of 65 W, half that of the Pentium Extreme Edition 965.
The Core 2 Extreme comes in with a TDP of 75 W. This means that the Core 2 CPUs
run cooler and are more energy efficient than their counterparts.
Core 2 Microprocessor Architecture:
Principles, Organization, and Memory Management
Abstract
The Core 2 microprocessor, introduced by Intel, marked a significant advancement in
microprocessor technology. This paper delves into its architecture, working principles, register
organization, and memory paging mechanism. By exploring the intricate design and
functionality, we aim to provide a comprehensive understanding of its contributions to modern
computing. The Core 2 represents a key milestone in the evolution of processor technology,
blending innovative features with practical engineering to achieve a balance of performance and
energy efficiency.
1. Introduction
The introduction of the Core 2 microprocessor signified a pivotal moment in the development of
computer architecture. As computing demands increased, the need for processors capable of
delivering higher performance without a proportional increase in power consumption became
critical. Intel’s Core 2 addressed these challenges by leveraging its Core microarchitecture,
which emphasized parallel execution, efficient resource utilization, and scalable performance.
This paper explores the Core 2’s architecture in detail, including its working principles, register
organization, and memory paging mechanisms. By analyzing these elements, we aim to highlight
the innovations that made the Core 2 a cornerstone in the evolution of modern microprocessors.
2. Architecture Overview
The Core 2 microprocessor’s architecture is a testament to Intel’s commitment to innovation and
efficiency. At its core lies a superscalar design that allows the processor to execute multiple
instructions concurrently. Unlike earlier designs that often relied on sequential execution, the
Core 2’s four-issue pipeline enables simultaneous instruction handling, significantly boosting
performance. This capability is complemented by out-of-order execution, a technique where
instructions are executed as soon as their operands are available, rather than strictly following
program order. This approach minimizes idle cycles and maximizes throughput.
Advanced branch prediction is another hallmark of the Core 2 architecture. By accurately
predicting the flow of program execution, the processor reduces delays caused by pipeline stalls,
ensuring smoother operation. The integrated cache hierarchy plays a crucial role in enhancing
performance. With a two-level cache system, comprising split 32 KB L1 instruction and data
caches per core and a unified L2 cache shared between the cores, the Core 2 minimizes
memory latency and accelerates data access.
The execution engine is the heart of the processor, where instructions are processed through
multiple functional units. These include Arithmetic Logic Units (ALUs) for integer operations
and Floating Point Units (FPUs) for floating-point computations. The engine features reservation
stations that hold µ-ops until the required resources are available. Once executed, results are
temporarily stored in a reorder buffer (ROB) to maintain program order during write-back. This
ensures accurate execution and result consistency.
The Core 2’s cache hierarchy further enhances efficiency. The L1 cache, with its split design,
provides rapid access to frequently used instructions and data. The larger L2 cache, shared
between cores, acts as a buffer for less frequently accessed information, reducing reliance on
slower main memory. This hierarchical structure balances speed and capacity, ensuring optimal
performance across a wide range of applications.
3. Working Principle
The Core 2 microprocessor’s working principle revolves around its pipeline architecture, which
divides instruction execution into distinct stages. The pipeline begins with the instruction fetch
stage, where instructions are retrieved from memory or the L1 cache. These instructions are then
decoded into µ-ops by the decode unit. The execution stage processes these µ-ops using the
functional units, while the memory access stage handles data retrieval or storage. Finally, the
write-back stage ensures that the results are stored in the appropriate registers or memory
locations.
Out-of-order execution and speculative execution are key techniques that enhance the pipeline’s
efficiency. Out-of-order execution allows the processor to execute instructions as soon as the
necessary resources and operands are available, bypassing dependencies that might otherwise
cause delays. Speculative execution further optimizes performance by predicting the outcomes of
conditional instructions and executing subsequent instructions based on these predictions. If the
predictions are correct, the processor avoids delays; if not, the speculative results are discarded,
and the correct path is followed.
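The keep-or-squash behavior of speculative execution can be sketched as follows (an abstract model of the policy, not the hardware mechanism; the function names are our own):

```python
def execute_branch(predicted_taken, actually_taken, taken_path, not_taken_path):
    """Run the predicted path speculatively; on a correct prediction the
    speculative results are kept, otherwise they are discarded and the
    correct path is executed."""
    speculative = (taken_path if predicted_taken else not_taken_path)()
    if predicted_taken == actually_taken:
        return speculative      # prediction correct: keep speculative work
    # misprediction: squash speculative results, follow the correct path
    return (taken_path if actually_taken else not_taken_path)()
```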
4. Register Organization
The register organization in the Core 2 microprocessor is meticulously designed to support
efficient instruction execution and resource management. General-purpose registers (GPRs)
serve as the primary storage locations for intermediate data and operands. The eight 32-bit GPRs
(EAX, EBX, ECX, EDX, ESI, EDI, ESP, and EBP) are versatile and can be used for various
operations, including arithmetic, logic, and addressing.
Control registers (CR0, CR2, CR3, and CR4) are pivotal in managing the processor’s operational
modes and memory management. CR0, for instance, enables or disables features such as paging
and protected mode. CR3 holds the base address of the page directory, a critical component in
virtual memory management. CR4 extends the processor’s capabilities by enabling advanced
features like Physical Address Extension (PAE), which allows addressing beyond the 4 GB limit
of traditional 32-bit systems.
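The control-register bits mentioned above sit at architecturally defined positions; a short sketch using those positions (the helper function is illustrative, and a real check would be done by privileged system code):

```python
# x86 control-register bit positions (architecturally defined).
CR0_PE  = 1 << 0    # Protection Enable: protected mode
CR0_PG  = 1 << 31   # Paging enable
CR4_PSE = 1 << 4    # Page Size Extensions (4 MB pages)
CR4_PAE = 1 << 5    # Physical Address Extension (>4 GB physical memory)

def paging_enabled(cr0):
    """Paging is active only when both PE and PG are set in CR0;
    setting PG without PE is architecturally invalid."""
    return (cr0 & CR0_PE) != 0 and (cr0 & CR0_PG) != 0
```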
Debug registers (DR0-DR7) are specialized for debugging purposes, providing hardware
breakpoints that assist developers in identifying and resolving issues. The floating-point and
SIMD registers are tailored for high-performance computations, supporting operations that
require significant numerical precision or parallel data processing.
5. Memory Paging
Memory paging is an essential feature of the Core 2 microprocessor, enabling efficient utilization
of physical memory through a virtual memory system. Paging divides virtual memory into fixed-
size pages, which are mapped to physical memory frames. This mechanism not only simplifies
memory management but also enhances security and reliability by isolating processes.
The paging process involves translating virtual addresses into physical addresses through a
multi-level hierarchy. A virtual address is divided into three components: the page directory, the
page table, and the page offset. The Memory Management Unit (MMU) uses the CR3 register to
locate the page directory, which contains pointers to page tables. Each page table, in turn, maps
virtual pages to physical frames. The page offset specifies the exact location within the physical
frame. This hierarchical approach ensures efficient memory translation and minimizes the
overhead associated with large memory spaces.
5.1 Advanced Paging Features
The Core 2 processor supports advanced paging features that enhance its capabilities. Physical
Address Extension (PAE) allows the processor to access more than 4 GB of physical memory by
extending the addressable range to 36 bits. This is particularly beneficial for applications
requiring large datasets or intensive computations. Additionally, the page fault handler is a
critical component of the paging system, managing exceptions caused by invalid memory
accesses. By identifying and resolving page faults, the handler ensures the stability and reliability
of the system.
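Under PAE, a 32-bit linear address splits into a 2-bit page-directory-pointer index, two 9-bit table indices, and a 12-bit offset; a small sketch:

```python
def split_pae_address(addr):
    """Split a 32-bit linear address under PAE paging: bits 31:30 index
    the page-directory-pointer table, bits 29:21 the page directory,
    bits 20:12 the page table, and bits 11:0 are the page offset."""
    pdpt   = (addr >> 30) & 0x3     # 2-bit page-directory-pointer index
    pd     = (addr >> 21) & 0x1FF   # 9-bit page-directory index
    pt     = (addr >> 12) & 0x1FF   # 9-bit page-table index
    offset = addr & 0xFFF           # 12-bit page offset
    return pdpt, pd, pt, offset
```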
6. Conclusion
The Core 2 microprocessor’s architecture, working principles, and memory management
mechanisms exemplify a sophisticated design focused on balancing performance and efficiency.
Through innovations such as out-of-order execution, advanced branch prediction, and an
integrated cache hierarchy, the Core 2 set new standards for microprocessor design. Its efficient
register organization and robust memory paging system further underscore its engineering
excellence. By understanding these elements, we gain insights into the evolution of modern
processors and their role in shaping the future of computing.
References
1. Intel Corporation. "Intel Core Microarchitecture Technical Overview."
2. Stallings, W. "Computer Organization and Architecture."
3. Tanenbaum, A. S. "Modern Operating Systems."
………………………………………………………………………………………………………………………………………………………………………..
Core i3
1. Introduction
With the advent of the Core i3 microprocessor, Intel redefined the boundaries of processing
power and energy efficiency. The Core i3 architecture built upon the successes of the Core 2,
incorporating advances in multi-threading, power management, and memory handling to meet
the demands of increasingly complex applications. This paper presents an in-depth examination
of the Core i3’s architecture, exploring how its design principles contributed to higher
performance, better resource utilization, and enhanced scalability.
2. Architectural Innovations
The Core i3 microprocessor architecture represents a convergence of advanced techniques aimed
at improving computational efficiency and versatility. Unlike its predecessors, the Core i3
architecture integrates a refined execution engine, expanded multi-core capabilities, and
improved interconnects to enhance overall performance. The adoption of an updated instruction
set, including support for new SIMD (Single Instruction, Multiple Data) operations, provides a
significant boost to workloads involving multimedia, cryptography, and scientific computing.
3. Working Principles
The working principles of the Core i3 microprocessor are grounded in its advanced pipeline
design and efficient resource management. The instruction pipeline, extended to accommodate
higher clock frequencies, consists of stages for fetching, decoding, executing, and retiring
instructions. Each stage is optimized for speed and efficiency, with a particular focus on reducing
bottlenecks through techniques like dynamic scheduling and speculative execution.
4. Register Organization
The register organization in the Core i3 microprocessor is both extensive and flexible, catering to
the diverse needs of modern applications. General-purpose registers (GPRs) provide the
foundation for data manipulation, with each core featuring its own set of registers for parallel
processing. These registers are 64-bit, enabling support for both legacy 32-bit and modern 64-bit
applications.
Control registers play a crucial role in configuring and monitoring the processor’s operating
modes. Registers such as CR0, CR3, and CR4 are integral to enabling features like paging,
protected mode, and Physical Address Extension (PAE). Debug registers, including DR0 through
DR7, facilitate sophisticated debugging by enabling hardware breakpoints.
The floating-point and SIMD registers, expanded to accommodate advanced vector instructions,
provide significant computational power for applications requiring high precision or parallel data
processing. These registers are particularly beneficial in scientific computing, multimedia
processing, and artificial intelligence workloads.
5. Conclusion
The Core i3 microprocessor exemplifies Intel’s commitment to advancing processor technology.
Its architectural innovations, efficient register organization, and sophisticated memory
management techniques position it as a cornerstone of modern computing. By building on the
strengths of its predecessors and introducing new features, the Core i3 architecture
delivers strong performance and energy efficiency. As computing demands continue to
evolve, the Core i3 stands as a testament to the ingenuity and foresight of modern processor
design.
References
1. Intel Corporation. "Intel Core Architecture Innovations and Technical Overview."
2. Hennessy, J., & Patterson, D. "Computer Architecture: A Quantitative Approach."
3. Tanenbaum, A. S. "Structured Computer Organization."