L05 Riscvi

- HW2 is due on March 3rd, Lab 2 is available this week, and the discussion will be about Venus (for RISC-V), Memory Management, and debugging.
- A program's memory address space contains four regions: stack, heap, static data, and code. The stack grows downward and is used for local variables and function calls. The heap grows upward and is used for dynamic memory allocation. Static data and code are loaded at program start.
- Managing dynamic memory on the heap is tricky and can cause bugs like memory leaks or using memory after it has been freed. Functions like malloc(), free(), and realloc() are used to allocate, free, and resize memory blocks.

Course Info

• HW2 due Mar. 3rd
• Lab 2 is available and in this week’s Lab session
• Discussion this week: Venus (for RISC-V), Memory Management & debug
CS 110
Computer Architecture
C Memory Management
Instructors:
Siting Liu & Chundong Wang
Course website: https://toast-lab.sist.shanghaitech.edu.cn/courses/CS110@ShanghaiTech/Spring-2023/index.html
School of Information Science and Technology (SIST)
ShanghaiTech University

2023/2/6
C Memory Management
• To simplify, assume one program runs at a time
• A program’s address space contains 4 regions (32-bit memory addresses assumed here):
• stack: local variables inside functions, grows downward
• heap: space requested for dynamic data via malloc(); resizes dynamically, grows upward
• static data: variables declared outside functions, does not grow or shrink. Loaded when program starts, can be modified.
• code (a.k.a. text): loaded when program starts, does not change
• 0x0 unwritable/unreadable (NULL pointer)

[Figure: memory layout from high address (~ FFFF FFFFhex) to low address (~ 0000 0000hex): stack, heap, static data, code]
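As a rough illustration of the regions listed above, the following sketch prints one sample address from the stack, the heap, and static data. The names `global_counter` and `show_regions` are made up for this example, and the actual addresses (and even their relative order) depend on the platform and OS, so treat the output as indicative only.

```c
#include <stdio.h>
#include <stdlib.h>

int global_counter = 7;   /* static data: declared outside any function */

/* Prints one sample address per region.
 * Returns 0 on success, 1 if the heap allocation fails. */
int show_regions(void) {
    int local = 1;                          /* stack: local variable */
    int *dynamic = malloc(sizeof *dynamic); /* heap: requested via malloc() */
    if (dynamic == NULL) {
        return 1;
    }
    *dynamic = 2;
    printf("stack  (local variable) : %p\n", (void *)&local);
    printf("heap   (malloc'd block) : %p\n", (void *)dynamic);
    printf("static (global variable): %p\n", (void *)&global_counter);
    free(dynamic);
    return 0;
}
```

Calling `show_regions()` from `main` on a typical desktop system usually shows the stack address well above the heap and static addresses, but nothing in C guarantees this.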
The Stack
• Every time a function is called, a new “stack frame” is allocated on the stack
• Stack frame includes:
• Return address (who called me?)
• Arguments
• Space for local variables
• Stack frames are contiguous blocks of memory; stack pointer indicates start of stack frame
• When function ends, stack frame is tossed off the stack; frees memory for future stack frames
• Details covered later (RISC-V processor)

    funcA() { funcB(); }
    funcB() { funcC(); }
    funcC() { funcD(); }

[Figure: stack holding, from top to bottom, funcA frame, funcB frame, funcC frame, funcD frame; the Stack Pointer marks the funcD frame]
Passing Pointers into the Stack
• It is fine to pass a pointer to stack space further down.

    #define BUFLEN 256
    int main() {
        …
        char buf[BUFLEN];
        load_buf(buf, BUFLEN);
        …
    }

  The buf char array is persistent through load_buf’s execution.

• However, it is bad to return a pointer to something in the stack!

    char *make_buf() {
        char buf[50];
        return buf;
    }
    int main(){
        …
        char *stackAddr = make_buf();
        foo();
        …
    }

• Memory will be overwritten when other functions are called! stackAddr then points to overwritten memory. So your data would no longer exist, and writes can overwrite key pointers, causing crashes!
• Like carving on the moving boat to look for the sword
• Solve with slides to come …
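The two usual ways around the returned-stack-pointer problem can be sketched as follows: either let the caller own the buffer, or allocate the buffer on the heap so it outlives the function. `fill_buf` and `make_buf_heap` are hypothetical helpers written for this illustration, not functions from the slides.

```c
#include <stdlib.h>
#include <string.h>

/* Option 1: the caller owns the buffer (it may live in the caller's stack frame). */
void fill_buf(char *buf, size_t len) {
    if (len == 0) {
        return;
    }
    strncpy(buf, "hello", len - 1);
    buf[len - 1] = '\0';              /* make sure the string is terminated */
}

/* Option 2: allocate on the heap so the block outlives this function.
 * Returns NULL on allocation failure; the caller must free() the result. */
char *make_buf_heap(size_t len) {
    char *buf = malloc(len);
    if (buf != NULL) {
        fill_buf(buf, len);
    }
    return buf;
}
```

With option 1 the lifetime is tied to the caller's frame; with option 2 the caller takes on the duty of calling `free()`.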
Managing the Heap
• The heap is dynamic memory – memory that can be allocated, resized, and
freed during program runtime.
• Useful for persistent memory across function calls
• But biggest source of pointer bugs, memory leaks, …
• Large pool of memory, not allocated in contiguous order
• Back-to-back requests for heap memory could result in blocks very far apart
• C supports four functions for heap management:
• malloc() allocate a block of uninitialized memory
• calloc() allocate a block of zeroed memory
• free() free previously allocated block of memory
• realloc() change size of previously allocated block (might move)
• Read-more: http://web.archive.org/web/20030222051144/http://home.earthlink.net/~bobbitts/c89.txt section 4.10.3 memory management functions
Managing the Heap
• void *malloc(size_t n):
– Allocate a block of uninitialized memory
– n is an integer, indicating size of allocated memory block in bytes
– size_t is an unsigned integer type big enough to “count” memory bytes
– sizeof returns size of given type in bytes, produces more portable code
– Returns void* pointer to block; NULL return indicates no more memory; always check for a NULL return (if (ip))
– Think of pointer as a handle that describes the allocated block of memory; additional control information is stored in the heap around the allocated block (including size, etc.)
• Examples:
    int *ip1, *ip2;
    ip1 = (int *) malloc(sizeof(int));
    ip2 = (int *) malloc(20*sizeof(int)); // allocate an array of 20 ints
    typedef struct { … } TreeNode;
    TreeNode *tp = (TreeNode *) malloc(sizeof(TreeNode));
• (int *) is a “cast” operation, which changes the type of a value; here it changes (void *) to (int *)
• Assuming size of objects can lead to misleading, unportable code. Use sizeof()!
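A minimal sketch of the malloc() pattern described above, combining sizeof with the NULL check. The function name `make_squares` is invented for this example. The cast used on the slides is omitted here because C converts void* implicitly; either style works.

```c
#include <stdlib.h>

/* Returns a heap array holding 0*0, 1*1, ..., (n-1)*(n-1),
 * or NULL if the allocation fails. The caller must free() it. */
int *make_squares(size_t n) {
    int *squares = malloc(n * sizeof *squares);  /* sizeof keeps this portable */
    if (squares == NULL) {
        return NULL;                             /* always check malloc's result */
    }
    for (size_t i = 0; i < n; i++) {
        squares[i] = (int)(i * i);
    }
    return squares;
}
```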
Managing the Heap
• void free(void *p):
– Releases memory allocated by malloc()
– p is pointer containing the address originally returned by malloc()
int *ip;
ip = (int *) malloc(sizeof(int));
... .. ..
free((void*) ip); /* Can you free(ip) after ip++ ? */

typedef struct {… } TreeNode;


TreeNode *tp = (TreeNode *) malloc(sizeof(TreeNode));
... .. ..
free((void *) tp);
– When you free memory, you must be sure that you pass the original address returned from malloc() to free(); otherwise the behavior is undefined: system exception (or worse)!
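One way to answer the "free(ip) after ip++" question is to never move the original pointer at all. The sketch below walks the block with a second pointer and passes the untouched address to free(). `sum_first_n` is a hypothetical helper, and n > 0 is assumed.

```c
#include <stdlib.h>

/* Fills a fresh heap block with 1..n, sums it by walking a second pointer,
 * then frees the block. Returns -1 if the allocation fails (n > 0 assumed). */
long sum_first_n(size_t n) {
    int *block = malloc(n * sizeof *block);
    if (block == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        block[i] = (int)(i + 1);
    }

    long total = 0;
    int *cursor = block;              /* advance the copy, not the original */
    for (size_t i = 0; i < n; i++) {
        total += *cursor++;
    }
    free(block);                      /* same address malloc() returned */
    return total;
}
```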
Managing the Heap
• void *realloc(void *p, size_t size):
– Returns new address of the memory block.
• In doing so, it may need to copy all data to a new location.
    realloc(NULL, size); // behaves like malloc
    realloc(ptr, 0); // behaves like free in C89, deallocates heap block; implementation-defined in later C standards, so prefer free(ptr)

– Always check for return NULL

    int *ip;
    ip = (int *) malloc(10*sizeof(int));
    … … … /* check for NULL */
    ip = (int *) realloc(ip, 20*sizeof(int)); /* keep track of ip, since it might change */
    /* contents of first 10 elements retained */
    … … … /* check for NULL */
    realloc(ip,0); /* equivalent to free(ip) in C89; free(ip) is the portable form */
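Assigning realloc()'s result straight back to the same pointer loses the old block if the call fails. A common pattern, sketched here under the assumption that new_count is greater than zero, is to go through a temporary. `resize_ints` is a name chosen for this example.

```c
#include <stdlib.h>

/* Resizes *arr to hold new_count ints (new_count > 0 assumed).
 * Returns 1 on success; on failure returns 0 and leaves *arr valid. */
int resize_ints(int **arr, size_t new_count) {
    int *tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL) {
        return 0;        /* old block is still allocated and still in *arr */
    }
    *arr = tmp;          /* the block may have moved */
    return 1;
}
```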
Summary
• Code, static storage are easy: they never grow or shrink
• Stack space is relatively easy: stack frames are created and destroyed in last-in, first-out (LIFO) order; avoid “dangling references”
• Managing the heap is tricky:
• Memory can be allocated/deallocated at any time
• “Memory leak”: If you forget to deallocate memory
• “Use after free”: If you use data after calling free
• “Double free”: If you call free 2x on same memory
• Free stack: useless
Using Dynamic Memory—Linked List
    typedef struct Node
    {
        int val;
        struct Node *next;
    } node;

    /* Create the first node */
    node * head = NULL;
    head = (node *) malloc(sizeof(node));
    if(head == NULL){
        return 1;
    }
    head -> val = 1;
    head -> next = NULL;

[Figure: "Ptr to head" points to the first node; each node holds Data and a "Ptr to next" to the following node; the last node's pointer is NULL]
Using Dynamic Memory—Iterate
    typedef struct Node
    {
        int val;
        struct Node *next;
    } node;

    void print_list(node *head){
        node * current = head;
        while (current != NULL){
            printf("%d\t", current -> val);
            current = current -> next;
        }
        printf("\n");
    }

[Figure: head points to the first node (val, Ptr to next); "Ptr to current node" advances node by node until it reaches NULL]
Using Dynamic Memory—Push
    typedef struct Node
    {
        int val;
        struct Node *next;
    } node;

    void push_node(node ** head, int val){
        node * new_node;
        new_node = (node *) malloc (sizeof(node));
        if (new_node == NULL) {   /* allocation failed: leave the list unchanged */
            return;
        }
        new_node -> val = val;
        new_node -> next = *head;
        *head = new_node;
        printf("Node %d push succeeds!\n", (*head) -> val);
    }

[Figure: new_node is linked in front of the first node, and "Ptr to head node" is updated to point to new_node]
Using Dynamic Memory—Remove Last
    typedef struct Node
    {
        int val;
        struct Node *next;
    } node;

    /* Assumes head != NULL. */
    int remove_last(node * head) {
        int retval = 0;

        /* Only one node: free it. The caller's own head pointer is now
           dangling and must be reset to NULL by the caller. */
        if (head->next == NULL) {
            retval = head->val;
            free(head);
            return retval;
        }

        /* Walk to the second-to-last node. */
        node * current = head;
        while (current->next->next != NULL) {
            current = current->next;
        }
        retval = current->next->val;
        free(current->next);
        current->next = NULL;
        printf("%d is removed.\n",retval);
        return retval;
    }

[Figure: "Ptr to cur. node" stops at the second-to-last node; the last node is freed and the preceding "Ptr to next" becomes NULL]
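The remove_last() shown above cannot update the caller's head pointer when the list has a single node. One possible variant, sketched here with hypothetical names (`push_front`, `remove_last_safe`, `free_list`), takes a node ** so that the empty and one-node cases are handled in the same way as push_node does.

```c
#include <stdlib.h>

typedef struct Node {
    int val;
    struct Node *next;
} node;

/* Pushes val at the front. Returns 1 on success, 0 if malloc fails. */
int push_front(node **head, int val) {
    node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL) {
        return 0;
    }
    new_node->val = val;
    new_node->next = *head;
    *head = new_node;
    return 1;
}

/* Removes the last node and stores its value in *out.
 * Returns 1 on success, 0 if the list is empty. */
int remove_last_safe(node **head, int *out) {
    if (*head == NULL) {
        return 0;
    }
    node **link = head;               /* points at the pointer to the current node */
    while ((*link)->next != NULL) {
        link = &(*link)->next;
    }
    *out = (*link)->val;
    free(*link);
    *link = NULL;                     /* also resets *head for a one-node list */
    return 1;
}

/* Frees every node and leaves *head == NULL, so no memory leaks. */
void free_list(node **head) {
    while (*head != NULL) {
        node *next = (*head)->next;
        free(*head);
        *head = next;
    }
}
```

Walking with a pointer-to-pointer avoids a special case for the head: whichever pointer led to the last node is the one set to NULL.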
How are Malloc/Free implemented?
• Underlying operating system allows malloc library to ask for
large blocks of memory to use in heap (e.g., using Unix sbrk()
call)
• C standard malloc library creates data structure inside unused
portions to track free space

15
Simple Slow Malloc Implementation

• Initial empty heap space from operating system: [ Free Space ]
• Malloc library creates linked list of empty blocks (one block initially)
• First allocation chews up space from start of free space: [ Object 1 | Free ]
• After many mallocs and frees, have potentially long linked list of odd-sized blocks
• Frees link block back onto linked list – might merge with neighboring free space
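The idea above can be sketched as a toy first-fit allocator over a fixed static pool. This is an illustrative assumption-laden sketch, not how any real malloc library is written: `toy_malloc`, `toy_free`, `header`, and `POOL_UNITS` are invented names, blocks are found by walking headers laid out back to back rather than through an explicit linked list, and the pool never grows (a real library would ask the OS for more memory).

```c
#include <stddef.h>

#define POOL_UNITS 64

/* Block header; the union pads it so the payload after it stays aligned. */
typedef union header {
    struct { size_t units; int is_free; } s;   /* units includes the header */
    max_align_t align;
} header;

static header pool[POOL_UNITS];
static int pool_ready = 0;

void *toy_malloc(size_t nbytes) {
    if (!pool_ready) {                 /* one big free block initially */
        pool[0].s.units = POOL_UNITS;
        pool[0].s.is_free = 1;
        pool_ready = 1;
    }
    if (nbytes > POOL_UNITS * sizeof(header)) {
        return NULL;
    }
    size_t need = (nbytes + sizeof(header) - 1) / sizeof(header) + 1;
    for (size_t i = 0; i < POOL_UNITS; i += pool[i].s.units) {
        if (pool[i].s.is_free && pool[i].s.units >= need) {
            size_t rest = pool[i].s.units - need;
            if (rest > 0) {            /* split: the tail stays free */
                pool[i + need].s.units = rest;
                pool[i + need].s.is_free = 1;
                pool[i].s.units = need;
            }
            pool[i].s.is_free = 0;
            return (void *)(&pool[i] + 1);
        }
    }
    return NULL;                       /* no block is large enough */
}

void toy_free(void *p) {
    if (p == NULL) {
        return;
    }
    header *h = (header *)p - 1;       /* header sits just before the payload */
    h->s.is_free = 1;
    size_t i = 0;
    while (i < POOL_UNITS) {           /* merge neighboring free blocks */
        size_t next = i + pool[i].s.units;
        if (pool[i].s.is_free && next < POOL_UNITS && pool[next].s.is_free) {
            pool[i].s.units += pool[next].s.units;
        } else {
            i = next;
        }
    }
}
```

The header stored in front of each block is also why free() must receive exactly the pointer that the allocator returned.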
Faster malloc implementations
• Keep separate pools of blocks for different sized objects
• “Buddy allocators” always round up to power-of-2 sized chunks to simplify finding correct size and merging neighboring blocks

Power-of-2 “Buddy Allocator”
[Figure: memory split into power-of-2 sized chunks, marked free or used]
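The rounding step of a buddy allocator can be sketched in a few lines. `round_up_pow2` is a name chosen for this illustration; a real allocator would typically also enforce a minimum chunk size.

```c
#include <stddef.h>

/* Smallest power of two >= n (returns 1 for n == 0,
 * and 0 if the result would not fit in size_t). */
size_t round_up_pow2(size_t n) {
    size_t p = 1;
    while (p < n) {
        if (p > (size_t)-1 / 2) {
            return 0;                 /* next doubling would overflow */
        }
        p *= 2;
    }
    return p;
}
```

A request for 100 bytes would thus be served from the 128-byte pool, trading some wasted space for simple bookkeeping.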
Malloc Implementations

• All provide the same library interface, but can have radically different implementations
• Use headers at start of allocated blocks and space in unallocated memory to hold malloc’s internal data structures
• Rely on programmer remembering to free with same pointer returned by malloc
• Rely on programmer not messing with internal data structures accidentally!
Agenda

• C Memory Management
• C Bugs: covered in discussion this week

20
Summary

• C has several main memory segments in which to allocate data:
– Static Data: Variables outside functions/code
– Stack: Variables local to function
– Heap: Objects explicitly malloc-ed/free-d.
• Heap data is biggest source of bugs in C code
CS 110
Computer Architecture
Intro to RISC-V I
Instructors:
Siting Liu & Chundong Wang
Course website: https://toast-lab.sist.shanghaitech.edu.cn/courses/CS110@ShanghaiTech/Spring-2023/index.html
School of Information Science and Technology (SIST)
ShanghaiTech University

2023/2/6
Review
• Number representations (Unsigned/Signed)
• How C compiler works
• C code is analyzed and broken into basic operations
• C usage
• Pointers & Memory Management
• Overview of Von Neumann Architecture
• CPU (CA/CC/Registers, etc.) & Memory
• Next: introduce how basic operations are implemented
• RISC-V Assembly (basic operations can be performed by hardware)
• Micro-architecture (hardware, basics on digital circuit)
• Other number representations (floating-point, IEEE standard 754)
History
53 years ago: Apollo Guidance Computer
• Programmed in Assembly
• 30x30x30cm, 32 kg.
• 10,000 lines of machine code manually entered – tons of easter eggs!
  abcnews.go.com/Technology/apollo-11s-source-code-tons-easter-eggs-including/story?id=40515222
[Figure: Margaret Hamilton with the code she wrote.]

Levels of representation:

High Level Language Program (e.g., C)
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  Compiler
Assembly Language Program (e.g., RISC-V)
    lw xt0, 0(x2)
    lw xt1, 4(x2)
    sw xt1, 0(x2)
    sw xt0, 4(x2)
  Assembler
Machine Language Program (RISC-V)
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  Machine Interpretation (ISA)
Hardware Architecture Description (e.g., block diagrams)
  Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)

Anything can be represented as a number, i.e., data or instructions
Intro to ISA
• Part of the abstract model of a computer that defines how the CPU is controlled by the software; interface between the hardware and the software
• Programmers’ manual, because it is the portion of the machine that is visible to the assembly language programmers, the compiler writers, and the application programmers
• Defines the supported data types, the registers, how the hardware manages main memory, key features, instructions that can be executed (instruction set), and the input/output model of multiple ISA implementations
• ISA can be extended by adding instructions or other capabilities
– by ARM
Instruction Set Architecture
• Early trend was to add more and more instructions to new CPUs to do
elaborate operations

• VAX architecture had an instruction to multiply polynomials!


• RISC philosophy (John Cocke IBM, John Hennessy Stanford, David
Patterson Berkeley, 1980s)

• Hennessy & Patterson won ACM A.M. Turing Award


Reduced Instruction Set Computing (RISC)

• Keep the instruction set small and simple, makes it easier to build fast
hardware.

• Let software do complicated operations by composing simpler ones.


26
Mainstream ISAs
X86/AMD64 | ARM | RISC-V
CISC | RISC | RISC
Fees for ISA (Limited) | Fees for ISA | No fees for ISA
Fees for micro-architecture (Limited) | Fees for micro-architecture | Depending on usage (commercial vs. open-source)
A lot of historical burden | Relatively simple | Simple & can DIY, expandable
Intel/AMD | ARM | non-profit RISC-V foundation
RISC vs. CISC

[Figure: two assembly listings side by side, one compiled on a Mac machine using an ARM CPU, the other compiled on a Windows machine using an Intel CPU]
More than 3,100 RISC-V Members
• Alibaba Cloud: T-Head XuanTie (玄铁) C series, E series, and R series
• Huawei: Hi3861V100 SoC for IoT/smart home
• Tencent: recently became a premier member
• Intel, Google, Meta, SiFive, AMD/Xilinx, etc.
• ShanghaiTech has hosted RISC-V Summit China several times in recent years!
• Can Linux OS work on RISC-V CPU?
More than 3,100 RISC-V Members
• The total market for RISC-V IP and Software is expected to grow to $1.07 billion by 2025 at a CAGR of 54.1% (Source: Tractica)
• Semico Research predicts the market will consume 62.4 billion RISC-V CPU cores by 2025, a 146.2% CAGR 2018-2025. The industrial sector to lead with 16.7 billion cores. (Source: Semico Research Corp)
From riscv.org

Assembly Language
• Basic job of a CPU: execute lots of instructions.
• Instructions are the primitive operations that the CPU may execute.
• Other examples: MIPS, IBM/Motorola PowerPC (quite old Mac),
Intel IA64, ...

31
Why RISC-V in CS110?
• Why RISC-V instead of Intel x86?
• RISC-V is simple, elegant. Don’t want to get bogged down in gritty
details.

• It is a very very clean RISC


• No real additional "optimizations"
• Generally only one way to do any particular thing
• RISC-V Green Card: https://toast-lab.sist.shanghaitech.edu.cn/courses/CS110@ShanghaiTech/Spring-2023/lecture_notes/riscvcard.pdf
Assembly Registers (hardware/variable)
• Unlike C or Java, assembly cannot use variables
• Keeps the assembly/computer hardware abstraction simple
• Assembly operands are registers
• Limited number of special locations/memory built directly into the CPU
• Operations can only be performed on these registers in RISC-V
• Benefit: Since registers are directly in hardware (CPU), they are very fast
Registers, inside the Processor
[Figure: Processor (Control; Datapath containing PC, Registers, and Arithmetic & Logic Unit (ALU)) connected to Memory (Program and Data, stored as Bytes) through the Processor-Memory Interface (Enable?, Read/Write, Address, Write Data, Read Data); Input and Output connect through the I/O-Memory Interfaces]
Registers, inside the Processor
[Figure: Processor with Control and Datapath (PC, Registers, Arithmetic & Logic Unit (ALU)); the register file holds x0/zero, x1, x2, ……, x31]
• Registers: similar to memory, use “address” to refer to specific location
• PC register: holds address of the current instruction
• 32 registers in RISC-V (in RV32 variant)
– Why 32? Smaller is faster, but too small is bad.
• Each RISC-V register is 32 bits wide (in RV32 variant)
– Groups of 32 bits called a word in RV32; P&H textbook uses 64-bit variant RV64 (doubleword)
RISC-V Manual, RTFM
• https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf
• https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md
ISA naming (from riscv.org):
• Number indicates address/pointer/register width
• I: Integer (integer arith., load, store and control-flow instructions)
• M: Integer multiplication & division extension
• A: Atomic instruction (read-modify-write)
• F: single-precision floating-point (FP) extension (FP registers/arith./load/store)
• D: double-precision … (similar to F, with more bits)
RV32 + IMAFD extension = RV32G
RV64 + IMAFD extension = RV64G
C, Java variables vs. registers
• In C (and most High Level Languages) variables are declared first and given a type
– Example: int fahr, celsius;
           char a, b, c, d, e;
• Each variable can ONLY represent a value of the type it was declared as (cannot mix and match int and char variables).
• In Assembly Language, registers have no type, they simply store 0s and 1s; the operation determines how register contents are treated (think about the hardware)
Assembly Instructions
• In assembly language, each statement (called an instruction), executes
exactly one of a short list of simple commands
• Unlike in C (and most other High Level Languages), each line of assembly
code contains at most 1 instruction
• Another way to make your code more readable: comments!
• Hash (#) is used for RISC-V comments
– anything from hash mark to end of line is a comment and will be
ignored

38
Assembly Instructions
• Different types of instructions (4 core types + B/J based on the handling
of immediate)

• Different types have different format but “rs1”, “rs2” and “rd” are at the
same position (hardware friendly)
• As an ID number, the machine code of the instructions has different fields; format depends on their operands/type
Assembly Instructions
• Different types of instructions (4 core types + B/J based on the handling
of immediate)

• R-type
• Register-register operation, mainly for arithmetic & logic
• Has two operands (accessed from the source registers, rs1 & rs2) and one
output (saved to the destination register, rd)
• Cannot access main memory (instruction executed by CPU alone, no data
exchange with main memory)

40
RV32I R-type Arithmetic
• Syntax of instructions: assembly language, two register operands
• Addition: add rd, rs1, rs2 (operation rd,rs1,rs2)
Adds the value stored in register rs1 to that of rs2 and stores the sum into
register rd, similar to a = b+c, a ⇔ rd, b ⇔ rs1, c ⇔ rs2
• Example:
    add x5, x2, x1
    add x6, x0, x5
    add x4, x1, x3
• Registers before the example:
    x0/zero  0
    x1       0x12340000
    x2       0x00006789
    x3       0xFFFFFFFF
    x4 to x7 (blank)
RV32I R-type Arithmetic
• Syntax of instructions: assembly language, two register operands
• Subtraction: sub rd, rs1, rs2
Subtract the value stored in register rs2 from that of rs1 and stores the
difference into register rd, equivalent to a = b-c, a ⇔ rd, b ⇔ rs1, c ⇔
rs2
• Example:
    sub x5, x2, x1
    sub x6, x0, x5
• Registers before the example:
    x0/zero  0
    x1       0x12340000
    x2       0x00006789
    x3 to x7 (blank)
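To check the add and sub examples by hand, the two instructions can be modeled in C with fixed-width unsigned arithmetic, which wraps modulo 2^32 just like the 32-bit registers do. `rv_add` and `rv_sub` are helper names invented for this sketch; the register values are the ones listed on the slides.

```c
#include <stdint.h>

/* add rd, rs1, rs2: 32-bit addition, wrapping modulo 2^32 */
uint32_t rv_add(uint32_t rs1, uint32_t rs2) {
    return rs1 + rs2;
}

/* sub rd, rs1, rs2: 32-bit subtraction (rs1 - rs2), wrapping modulo 2^32 */
uint32_t rv_sub(uint32_t rs1, uint32_t rs2) {
    return rs1 - rs2;
}
```

With x1 = 0x12340000, x2 = 0x00006789 and x3 = 0xFFFFFFFF, `add x5, x2, x1` gives 0x12346789, `add x4, x1, x3` wraps around to 0x1233FFFF, and `sub x5, x2, x1` gives 0xEDCC6789, a negative number when read as two's complement.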
RV32I R-type Logic Operation
• Syntax of instructions: assembly language, two register operands
• AND/OR/XOR: and/or/xor rd, rs1, rs2
Logically bit-wise and/or/xor the value stored in register rs1 and that of
rs2 and stores the result into register rd, equivalent to a = b (&/|/^) c, a ⇔
rd, b ⇔ rs1, c ⇔ rs2
• Example:
    and x5, x2, x1
    xor x6, x1, x5
    and x4, x1, x3
• Registers before the example:
    x0/zero  0
    x1       0x12340000
    x2       0x00006789
    x3       0xFFFFFFFF
    x4 to x7 (blank)
RV32I R-type Logic Operation
• Syntax of instructions: assembly language, two register operands
• AND/OR/XOR: and/or/xor rd, rs1, rs2
Logically bit-wise and/or/xor the value stored in register rs1 and that of
rs2 and stores the result into register rd, equivalent to a = b (&/|/^) c, a ⇔
rd, b ⇔ rs1, c ⇔ rs2

• Used for bit-mask:
    and x5, x7, x4
    or x6, x7, x4
• Registers before the example:
    x0/zero  0
    x1       0x12340000
    x2       0x00006789
    x3       0xFFFFFFFF
    x4       0xFFFF0000
    x5, x6   (blank)
    x7       0x12345678
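The bit-mask use can be worked through with C's bitwise operators, which act on each bit position independently, like the RISC-V and/or/xor instructions. The helper names `rv_and`, `rv_or`, and `rv_xor` are assumptions made for this sketch.

```c
#include <stdint.h>

uint32_t rv_and(uint32_t rs1, uint32_t rs2) { return rs1 & rs2; }  /* and rd, rs1, rs2 */
uint32_t rv_or (uint32_t rs1, uint32_t rs2) { return rs1 | rs2; }  /* or  rd, rs1, rs2 */
uint32_t rv_xor(uint32_t rs1, uint32_t rs2) { return rs1 ^ rs2; }  /* xor rd, rs1, rs2 */
```

With x7 = 0x12345678 and the mask x4 = 0xFFFF0000, `and x5, x7, x4` keeps only the upper half (0x12340000), while `or x6, x7, x4` forces the upper half to ones (0xFFFF5678).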
RV32I R-type Compare
• Syntax of instructions: assembly language, two register operands
• SLT/SLTU: slt/sltu rd, rs1, rs2
Compare the value stored in register rs1 and that of rs2, sets rd=1, if
rs1<rs2 otherwise rd=0, equivalent to a = b < c ? 1 : 0, a ⇔ rd, b ⇔ rs1,
c ⇔ rs2. Treat the numbers as signed/unsigned with slt/sltu.
• Example:
    slt x5, x2, x1
    slt x4, x3, x1
    sltu x5, x3, x1
• Overflow detection (unsigned):
    add x5, x3, x3
    sltu x6, x5, x3
• Overflow detection (signed)?
• Try yourself/RTFM
• Registers before the example:
    x0/zero  0
    x1       0x12340000
    x2       0x00006789
    x3       0xFFFFFFFF
    x4 to x7 (blank)
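The difference between slt and sltu, and the overflow checks, can be sketched in C. All four function names are invented for this example. The signed check shown is one common sequence (compare "is the addend negative?" against "did the sum get smaller?"); it is offered as a possible answer to the exercise, not necessarily the one the course intends.

```c
#include <stdint.h>

/* slt: compare the bit patterns as signed two's-complement values.
 * Flipping the sign bit maps signed order onto unsigned order. */
uint32_t rv_slt(uint32_t rs1, uint32_t rs2) {
    return (rs1 ^ 0x80000000u) < (rs2 ^ 0x80000000u);
}

/* sltu: compare the bit patterns as unsigned values. */
uint32_t rv_sltu(uint32_t rs1, uint32_t rs2) {
    return rs1 < rs2;
}

/* Unsigned overflow of a + b:  add t, a, b ; sltu ov, t, a */
uint32_t add_overflows_unsigned(uint32_t a, uint32_t b) {
    uint32_t sum = a + b;
    return rv_sltu(sum, a);           /* wrapped sum is smaller than an operand */
}

/* Signed overflow of a + b: overflow iff (b < 0) differs from (sum < a). */
uint32_t add_overflows_signed(uint32_t a, uint32_t b) {
    uint32_t sum = a + b;
    return rv_slt(b, 0u) != rv_slt(sum, a);
}
```

With x3 = 0xFFFFFFFF and x1 = 0x12340000, `slt x4, x3, x1` yields 1 (x3 is -1 when signed) but `sltu x5, x3, x1` yields 0 (x3 is the largest unsigned value).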
RV32I R-type Shift
• Syntax of instructions: assembly language, two register operands
• Shift left/right (arithmetic): sll/srl/sra rd, rs1, rs2
Left/Right shifts the value stored in register rs1 by that of rs2, equivalent
to a = b <</>> />>>c, a ⇔ rd, b ⇔ rs1, c ⇔ rs2. Arithmetic: sign
extended.
• Example:
    sll x5, x2, x4
    srl x6, x3, x4
    sra x7, x3, x4
• Registers before the example:
    x0/zero  0
    x1       0x12340000
    x2       0x00006789
    x3       0xFFFFFFFF
    x4       0x4
    x5 to x7 (blank)
RV32I R-type Shift
• Syntax of instructions: assembly language, two register operands
• Shift left/right (arithmetic): sll/srl/sra rd, rs1, rs2
Left/Right shifts the value stored in register rs1 by that of rs2, equivalent
to a = b <</>> />>>c, a ⇔ rd, b ⇔ rs1, c ⇔ rs2.
arithmetic: sign extended.
• Example:
    sll x5, x2, x4
    srl x6, x1, x4
    sra x7, x3, x4
• What is the arithmetic effect by shifting?
• Registers before the example:
    x0/zero  0
    x1       0x12340000
    x2       0x00006789
    x3       0xFFFFFFFF
    x4       0x4
    x5 to x7 (blank)
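The three shifts can be modeled as below. In RV32I only the low 5 bits of rs2 are used as the shift amount, which the sketch mirrors with `& 31u`. Right-shifting a negative signed value is implementation-defined in C, so `rv_sra` builds the sign extension by hand. The function names are assumptions for this illustration.

```c
#include <stdint.h>

/* sll: shift left, zeros enter from the right */
uint32_t rv_sll(uint32_t rs1, uint32_t rs2) {
    return rs1 << (rs2 & 31u);
}

/* srl: logical shift right, zeros enter from the left */
uint32_t rv_srl(uint32_t rs1, uint32_t rs2) {
    return rs1 >> (rs2 & 31u);
}

/* sra: arithmetic shift right, copies of the sign bit enter from the left */
uint32_t rv_sra(uint32_t rs1, uint32_t rs2) {
    uint32_t sh = rs2 & 31u;
    uint32_t result = rs1 >> sh;
    if ((rs1 & 0x80000000u) && sh != 0) {
        result |= ~(0xFFFFFFFFu >> sh);   /* fill vacated bits with the sign bit */
    }
    return result;
}
```

As for the arithmetic effect: shifting left by k multiplies by 2^k (as long as nothing overflows), srl divides an unsigned value by 2^k, and sra divides a signed value by 2^k rounding toward negative infinity, e.g. -8 shifted by 1 gives -4.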
Assembly Instructions
• Different types of instructions

• I-type
• Register-Immediate type
• Has two operands (one accessed from source register, another a constant/
immediate, sign-extended) and one output (saved to destination register)
• Can do arithmetic/logic/load from main memory/jump (covered later)

48
RV32I I-type Arithmetic
• Syntax of instructions: assembly language
• Addition: addi rd, rs1, imm
Adds imm to rs1, stores the result to rd, and imm is a signed number.
• Example:
    addi x5, x4, 10
    addi x6, x4, -10
• Similarly, andi/ori/xori/slti/sltiu
• All the imm’s are sign-extended
• slli/srli/srai are special: the immediate is a 5-bit shift amount in RV32I, extended to 6 bits in RV64I (RTFM)
• Registers before the example:
    x0/zero  0
    x1       0x12340000
    x2       0x00006789
    x3       0xFFFFFFFF
    x4       0x3
    x5 to x7 (blank)
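Sign extension of the immediate can be sketched as follows. The slide does not state the width here; the sketch assumes the 12-bit immediate field of RV32I I-type instructions. `sign_extend12` and `rv_addi` are hypothetical helper names.

```c
#include <stdint.h>

/* Sign-extends a 12-bit immediate field to 32 bits. */
uint32_t sign_extend12(uint32_t imm12) {
    imm12 &= 0xFFFu;                          /* keep only the 12-bit field */
    return (imm12 & 0x800u) ? (imm12 | 0xFFFFF000u) : imm12;
}

/* addi rd, rs1, imm: add the sign-extended immediate to rs1 */
uint32_t rv_addi(uint32_t rs1, uint32_t imm12) {
    return rs1 + sign_extend12(imm12);
}
```

The immediate -10 is stored in the instruction as the 12-bit pattern 0xFF6; after sign extension it becomes 0xFFFFFFF6, so with x4 = 3 the instruction `addi x6, x4, -10` produces 0xFFFFFFF9, which is -7.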
RV32I Arithmetic/Logic Test
• addi x1, x0, -1
• or x2, x2, x1
• add x3, x1, x2
• slt x4, x3, x1
• sra x5, x3, x4
• sub x0, x5, x4
• Registers before the test: x0/zero to x7 all hold 0
• Register zero (x0) is ‘hard-wired’ to 0;
• By convention RISC-V has a specific no-op instruction...
– addi x0, x0, 0
– You may need to replace code later: No-ops can fill space, align data, and perform other options
– Practical use in jump-and-link operations (covered later)
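One way to check an answer to the test above is a tiny register-file model in C. Everything here (`regs`, `write_reg`, `slt32`, `sra32`, `run_test_sequence`) is invented for this sketch; the only facts taken from the slides are the six instructions, the all-zero starting state, and x0 being hard-wired to 0.

```c
#include <stdint.h>

uint32_t regs[32];                       /* regs[0] models x0 */

/* Writes to x0 are discarded because x0 is hard-wired to 0. */
void write_reg(int rd, uint32_t value) {
    if (rd != 0) {
        regs[rd] = value;
    }
}

/* Signed less-than on 32-bit patterns (sign-bit flip trick). */
uint32_t slt32(uint32_t a, uint32_t b) {
    return (a ^ 0x80000000u) < (b ^ 0x80000000u);
}

/* Arithmetic right shift by the low 5 bits of sh. */
uint32_t sra32(uint32_t a, uint32_t sh) {
    sh &= 31u;
    uint32_t result = a >> sh;
    if ((a & 0x80000000u) && sh != 0) {
        result |= ~(0xFFFFFFFFu >> sh);
    }
    return result;
}

void run_test_sequence(void) {
    for (int i = 0; i < 32; i++) {
        regs[i] = 0;                             /* all registers start at 0 */
    }
    write_reg(1, regs[0] + 0xFFFFFFFFu);         /* addi x1, x0, -1 */
    write_reg(2, regs[2] | regs[1]);             /* or   x2, x2, x1 */
    write_reg(3, regs[1] + regs[2]);             /* add  x3, x1, x2 */
    write_reg(4, slt32(regs[3], regs[1]));       /* slt  x4, x3, x1 */
    write_reg(5, sra32(regs[3], regs[4]));       /* sra  x5, x3, x4 */
    write_reg(0, regs[5] - regs[4]);             /* sub  x0, x5, x4 */
}
```

After the sequence, x1 and x2 hold 0xFFFFFFFF (-1), x3 holds 0xFFFFFFFE (-2), x4 holds 1 because -2 < -1, x5 holds 0xFFFFFFFF (-2 shifted right arithmetically by 1), and x0 is still 0 because the final write is discarded.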