
Program Analysis

1. Introduction

Kihong Heo

About Me

• Instructor: Kihong Heo ([email protected])

• KAIST CS / GSIS / Programming Systems Lab.

• Homepage: https://kihongheo.kaist.ac.kr / https://prosys.kaist.ac.kr

• Office: N5 2321

• Office Hours: after each class (by appointment)

1. Introduction CS524 / KAIST Kihong Heo 2 / 46


My Research
• Goal: solid PL theories ⟺ powerful programming systems

• Keywords: programming language, program analysis, SW engineering, SW security

• Good memories:

• Best Artifact Award, Artifact Evaluation Track of the 44th International Conference on Software Engineering (ICSE 2022; May 8-20 virtual, May 22-27 in-person, Pittsburgh, PA, USA)

• Presented to Hyunsu Kim (KAIST), Mukund Raghothaman (University of Southern California), and Kihong Heo (KAIST) for the artifact "Learning Probabilistic Models for Static Analysis Alarms"

• Artifact Evaluation Co-Chairs: Professor Andreas Vogelsang (University of Cologne, Germany) and Professor Yan Cai (Institute of Software at Chinese Academy of Sciences, China)

* https://research.fb.com/blog/2017/02/inferbo-infer-based-buffer-overrun-analyzer/

My Research

[Figure slides; surviving labels: PL, AI]

Course Information
• Course Website: https://github.com/prosyslab-classroom/cs524-program-analysis

• Q&A Board: https://github.com/prosyslab-classroom/cs524-program-analysis/issues

• TAs (mailing list: [email protected])

• Tae Eun Kim

• Geon Park

• Textbook:

• Lecture slides will be provided

• Xavier Rival and Kwangkeun Yi, Introduction to Static Analysis: an Abstract Interpretation Perspective, MIT Press, 2020


Grading

• Homework: 50%

• Final exam: 40%

• Participation: 10%

• Active participation including questions or discussions (online or offline)

• P/NR grading is NOT allowed

• NOTE: nonnegotiable!

• DO NOT email to negotiate grades (treated as cheating)

Important Notice (1): Academic Integrity

• DO NOT share the course contents (e.g., assignments or exams) with others

• Especially via public Github repositories, chegg.com, etc.

• DO NOT discuss the details of solutions with others

• DO NOT plagiarize

• We keep all the submissions from previous years!

• Any integrity violation: an F at the LEAST

• If you have questions, ask in this order: Q&A board, then TAs, then instructor


Important Notice (2): In-class
• Language: English (default), Korean (supplementary)

• Attendance: always (default), absence (if necessary)

• No quantified attendance score

• “What is essential is invisible to the eye” - The Little Prince

• I expect you to be here, as you expect me to be here!

• Questions & discussion (either in Korean or in English): highly encouraged

Important Notice (3): Out-of-class

• All Q&A and public notices: Github issue board

• “Watch” all notifications

• Private notices (grading, etc): KLMS

• Questions are always welcome, except for

• Overly detailed ones (TAs are not debuggers!)

• Ones directly related to the solutions

• Actively discuss with your classmates

A long time ago
in a galaxy far, far away….



Software
BUGS



Software Bugs: A Persistent Problem
• A long time ago, far far away

The Patriot Missile (1991): floating-point roundoff, 28 soldiers died
The Ariane-5 Rocket (1996): integer overflow, $100M
NASA’s Mars Climate Orbiter (1999): metric-imperial unit mismatch, $125M

• Unfortunately, bugs have now become your own problem

Software Bugs: A Persistent Problem
COST OF A SOFTWARE BUG

$100 if found in the Gathering Requirements phase
$1,500 if found in the QA testing phase
$10,000 if found in Production

- IBM Systems Sciences Institute, 2015

Why Does Software Still Fail?

• Size of Linux Kernel: from 10 KLOC to 28 MLOC (chart: million lines of code by kernel version)

• Avg. Size of Android Apps: from 1x to 6x (chart: average APK size in MBs, Jan 2013 to Jan 2017)

• 20M+ new developers, 85M+ new repositories, 227M+ new pull requests in 2022


Software Complexity
less-382 (23,822 LOC)



The Era of AI



Artificial Unintelligence

Why?

SW Bug by Human Developer (1)

• Example: gimp-2.6.7 (CVE-2009-1570)

long ToL (char *puffer) { return (puffer[0] | puffer[1]<<8 | puffer[2]<<16 | puffer[3]<<24); }

short ToS (char *puffer) { return ((short)(puffer[0] | puffer[1]<<8)); }

gint32 ReadBMP (gchar *name, GError **error) {                   /* (1) */
  if (fread(buffer, Bitmap_File_Head.biSize - 4, 1, fd) != 1)
    FATALP ("BMP: Error reading BMP file header #3");
  Bitmap_Head.biWidth = ToL (&buffer[0x00]);                     /* (2) */
  Bitmap_Head.biBitCnt = ToS (&buffer[0x0A]);
  rowbytes = ((Bitmap_Head.biWidth * Bitmap_Head.biBitCnt - 1) / 32) * 4 + 4;   /* (4) integer overflow */
  image_ID = ReadImage (rowbytes);                               /* (5) */
  ...
}

gint32 ReadImage (int rowbytes) {
  buffer = malloc(rowbytes);                                     /* (6) malloc with overflowed size */
  ...
}
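The overflow in the rowbytes computation can be guarded against along the following lines. This is only a sketch, not the actual GIMP fix: `checked_rowbytes` is a hypothetical helper name, and it assumes that a 64-bit intermediate is wide enough for a 32-bit width times a 16-bit bit count.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper (not from GIMP): computes the row size in 64-bit
   arithmetic and reports failure when the inputs are non-positive or the
   result does not fit in an int. Returns 1 on success, 0 on rejection. */
static int checked_rowbytes(int32_t width, int16_t bitcnt, int *out) {
    if (width <= 0 || bitcnt <= 0)
        return 0;                               /* nonsensical header */
    int64_t bits = (int64_t)width * (int64_t)bitcnt;   /* cannot wrap */
    int64_t rowbytes = ((bits - 1) / 32) * 4 + 4;
    if (rowbytes > INT_MAX)
        return 0;                               /* would overflow an int */
    *out = (int)rowbytes;
    return 1;
}
```

For a benign header (width 100, 24 bits per pixel) the helper yields 300 bytes per row; for a width of INT32_MAX with 32 bits per pixel it rejects the header instead of passing a wrapped size to the allocator.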



SW Bug by Human Developer (2)

• Example: sam2p-0.49.4 (CVE-2017-16663)

long ToL (char *puffer) { return (puffer[0] | puffer[1]<<8 | puffer[2]<<16 | puffer[3]<<24); }   /* (3) */

short ToS (char *puffer) { return ((short)(puffer[0] | puffer[1]<<8)); }

bitmap_type bmp_load_image (FILE* filename) {                    /* (1) */
  if (fread(buffer, Bitmap_File_Head.biSize - 4, 1, fd) != 1)
    FATALP ("BMP: Error reading BMP file header #3");
  Bitmap_Head.biWidth = ToL (&buffer[0x00]);                     /* (2) */
  Bitmap_Head.biBitCnt = ToS (&buffer[0x0A]);
  rowbytes = ((Bitmap_Head.biWidth * Bitmap_Head.biBitCnt - 1) / 32) * 4 + 4;   /* (4) integer overflow */
  image.bitmap = ReadImage (rowbytes);                           /* (5) */
  ...
}

unsigned char* ReadImage (int rowbytes) {                        /* (6) */
  unsigned char *buffer = (unsigned char*) new char[rowbytes];   // allocation with overflowed size
  ...
}


SW Bug by AI Developer

• Example (feat. Copilot):

int toLong(char *buffer) {
  return (buffer[0]) | (buffer[1] << 8) | (buffer[2] << 16) | (buffer[3] << 24);
}

int f(char *name) {
  int width, height, area;
  char buffer[10];
  FILE *fd = fopen(name, "rb");
  fread(buffer, 10, 1, fd);
  fclose(fd);

  // Copilot, fill in the blank!
  width = toLong(buffer + 18);    // reads past the end of the 10-byte buffer
  height = toLong(buffer + 22);   // reads past the end of the 10-byte buffer
  area = width * height;          // may overflow
  ...
}
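One way to avoid the out-of-bounds reads in the generated code is to check every offset against the number of bytes actually read. The sketch below is an illustration under assumptions, not what Copilot or any library provides: `read_le32` is a hypothetical helper, and it uses unsigned char so that shifting a byte cannot sign-extend.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decodes a little-endian 32-bit value at offset
   `off`, but only if all four bytes lie inside the `len` bytes that were
   actually read. Returns 1 on success, 0 if the read would be out of
   bounds. */
static int read_le32(const unsigned char *buf, size_t len, size_t off,
                     uint32_t *out) {
    if (off > len || len - off < 4)
        return 0;
    *out = (uint32_t)buf[off]
         | ((uint32_t)buf[off + 1] << 8)
         | ((uint32_t)buf[off + 2] << 16)
         | ((uint32_t)buf[off + 3] << 24);
    return 1;
}
```

With a 10-byte buffer, a request at offset 18 or 22 is rejected rather than silently reading stack memory, so the caller is forced to read a large enough header first.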



Aftermath?
[Chart: number of CVEs per year, 2017 to 2023 (y-axis: 0 to 30000), annotated with “Copilot Released 2021.10” and “ChatGPT Released 2022.11”]


Course Objectives: Principles

Q: How to formally estimate software behavior automatically before its execution?


Course Objectives: Principles
Artifact Subject Principle

$\vec{F} = m\vec{a}$

$\nabla \cdot E = \rho / \varepsilon_0$
$\nabla \cdot B = 0$
$\nabla \times E = -\partial_t B$
$\nabla \times B = \mu_0 (J + \varepsilon_0 \partial_t E)$

Our Goal


Static Program Analysis

• General methodology to predict software behavior (“SW MRI”)

• static: before execution

• automatic: software is analyzed by software (program analyzer)

• systematic: foundational theory (Abstract Interpretation)

• Applications:

• bug-finding, verification, code optimization, etc
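To give a flavor of what "static, automatic, systematic" means in practice, here is a toy interval analysis in the spirit of abstract interpretation. It is a sketch under assumptions, not the course's analyzer: the names `interval`, `itv_add`, `itv_join`, and `analyze` are made up for illustration, and the bounds are assumed small enough that the additions cannot overflow.

```c
/* A toy interval abstract domain: [lo, hi] over-approximates the set of
   values a variable may hold. Assumption: bounds stay far from INT_MAX,
   so the additions below cannot overflow. */
typedef struct { int lo, hi; } interval;

static interval itv(int lo, int hi) { interval r = { lo, hi }; return r; }

/* Abstract addition: add the bounds pointwise. */
static interval itv_add(interval a, interval b) {
    return itv(a.lo + b.lo, a.hi + b.hi);
}

/* Join: the smallest interval covering both branches of a conditional. */
static interval itv_join(interval a, interval b) {
    return itv(a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi);
}

/* Abstractly "runs"  y = x + 5;  z = cond ? y : 0;
   without knowing the concrete value of x or cond. */
static interval analyze(interval x) {
    interval y = itv_add(x, itv(5, 5));
    return itv_join(y, itv(0, 0));
}
```

Given only that x lies in [0, 10], the analysis concludes that z lies in [0, 15] on every execution, so an access such as `table[z]` into a 16-element array is shown safe without running the program.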

Success Stories
Domain-specific Verification:

• Windows Device Driver (Microsoft)

• Airbus Controller (Astrée, ENS / AbsInt)

General-purpose Bug-finding:

• Stanford / Synopsys

• Facebook

• SNU / Fasoo.com

• Mathworks

• GrammaTech

• Semmle / Github

• JuliaSoft

• GCC

• LLVM/Clang

Course Objectives: Practice & Challenge
• Homework: design & implement program analyzers

• 7 (main) + 2 (dummy) assignments

• Programming assignments in OCaml using LLVM & Z3

• You will write your analyzers in OCaml

• Your analyzer will analyze LLVM IR code

• You will utilize Z3 in your analyzers

• Why LLVM? (https://llvm.org)

• Why OCaml? (https://ocaml.org)

• Why Z3? (https://github.com/Z3Prover/z3)

The LLVM Compiler Infrastructure

• A de facto standard, well-structured compiler toolchain

• parser, code optimizer, linker, loader, debugger, etc

• A wide variety of frontends: C/C++, Obj-C, Swift, Fortran, etc

• translated to the LLVM IR (intermediate representation)



The OCaml Language

• Simple, safe, and realistic programming language

• Strong type system, higher-order functions, etc

• Official OCaml bindings to the LLVM and Z3 APIs are supported

• Growing demand from academia and industry

• See the materials on the course webpage

The Z3 Theorem Prover

• State-of-the-art automated theorem prover by Microsoft Research

• Solving satisfiability modulo theories (SMT) problems

• first-order logic with background theories (e.g., arithmetic, bit-vectors, arrays, datatypes, uninterpreted functions, etc)

Boolean Satisfiability Problem (SAT):
$(\neg A \lor B) \land (\neg B \lor C) \land (A \lor \neg C \lor B)$
Satisfiable when A = false, B = true, C = true

Satisfiability modulo theory (SMT):
$x + 2 = y \implies f(\mathrm{read}(\mathrm{write}(a, x, 3), y - 2)) = f(y - x + 1)$
Theories involved: Arithmetic, Array, Uninterpreted Functions
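The SAT example is small enough to check by exhaustive enumeration. The sketch below only illustrates what "satisfiable" means; it is not how Z3 works internally, and the function names `formula` and `count_models` are made up here.

```c
#include <stdbool.h>

/* The SAT formula from the slide:
   (not A or B) and (not B or C) and (A or not C or B). */
static bool formula(bool a, bool b, bool c) {
    return (!a || b) && (!b || c) && (a || !c || b);
}

/* Counts satisfying assignments by trying all 2^3 combinations. */
static int count_models(void) {
    int n = 0;
    for (int bits = 0; bits < 8; bits++) {
        bool a = bits & 4, b = bits & 2, c = bits & 1;
        if (formula(a, b, c))
            n++;
    }
    return n;
}
```

The assignment A = false, B = true, C = true makes `formula` return true, and it is one of three satisfying assignments. Enumeration doubles in cost with every added variable, which is why practical solvers rely on much smarter search.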

Course Objectives: Active Learning

• The era of AI: there is NO free lunch, and it requires more creative thinking

• Naively understanding & memorizing knowledge becomes obsolete

• Creativity comes from “pain”

• Training GPT-3: 355 yrs on a single GPU (2020)

• Three levels of knowledge

• Understandable knowledge: by lecture, exam, programming assignment

• Explainable knowledge: by various activities

• Artifiable knowledge

“The greatest scientists are artists as well” - A. Einstein

Activity (1): Writing
• Science is communication

• Writing: main communication tool

• Roles of KAIST folks

• World-leading researcher (in English)

• Inspiring communicator to the community (in Korean)

• For newcomers, public officials, etc

• Idiot’s translation algorithm

  (* translate_word : string -> string is assumed to be defined elsewhere *)
  let translate sentence =
    String.split_on_char ' ' sentence
    |> List.map translate_word
    |> String.concat " "

Activity (2): Art Competition

• What does X mean to you?

• Pick a concept X you have learned in this class such as cryptography, program analysis

• Draw a picture that succinctly represents the concept using DALL-E (or similar tools)

• Detailed instructions will be announced later

CS492: Program Reasoning, 2022 CS524: Program Analysis, 2022



Homework
• All submissions will be managed using Github / Github Classroom

1. For each HW, a unique invitation URL will be posted on the issue board

2. Once you accept, a private repo for your assignment will be created

3. You can push as many commits as you want before the deadline

4. The final commit of your main branch will be graded

• Gradescope will be used for written assignments and exams

• Late policy: 80% credit if 1 day late, 50% credit if 2 days late, NO credit after that

Homework 0.1: Hello, World!

• Goal: set up and get familiar with the OCaml and Git environments

• Implement your “hello-world” program in OCaml

• Test on your machine

• Push to your Github repository

• See the result in Github Actions

• The invitation URL will be posted on the course webpage

• Will not be graded


Homework 0.1: Hello, World!

1. Accept the invitation and have your repository

2. Commit your code

3. See your result


Homework 0.2: OCaml Programming

• Goal: get familiar with basic OCaml programming

• Solve a few programming problems in OCaml

• Test on your machine

• Push to your Github repository

• See the result in Github Actions

• The invitation URL will be posted on the course webpage

• Will not be graded, but highly recommended if you are not familiar with OCaml


Requirement: Software Engineering Practices
• All programming assignment submissions must follow basic SE practices

• Remove all compile errors and warnings

  File "src/semantics.ml", line 39, characters 6-11:
  39 |   let count = 1 in
             ^^^^^
  Warning 26 [unused-var]: unused variable count.

• Remember: warning is error

• Clean code via formatting

  Before:
  let rec
  fact n =
  match n with
  | 0 | 1 -> 1
  | _ -> n *
  fact (n - 1)

  After:
  let rec fact n =
    match n with
    | 0 | 1 -> 1
    | _ -> n * fact (n - 1)

• Acceptable code coverage

• Otherwise, NO points will be given

• Detailed instructions will be provided


Homework 1: Essay
• Read the stories of Google, Facebook, and Apple (available from the course webpage)

• Write a critique essay

• Syntactic requirements

• Typeset your document in LaTeX using a provided template (maximum 2 pages)

• Korean (if you are a native Korean speaker), English (otherwise)

• Semantic requirements: Top-down

• For each paragraph, write the topic sentence first followed by the details.

• See the README for the details

• The invitation URL will be posted on the course webpage

Misc (1): Github, Gradescope

• Submit your Github account via the Google form (see the Github issue board)

• Join Gradescope (see the Github issue board)

• For written assignments and the final exam

• Rules for programming assignments

• Preserve the structures (directories, files, types, etc)

• Do not install additional Github Apps

Misc (2): Email Communication

• Primary communication tool for business

• NOT for instant messages like Kakao Talk

• But try to respond as soon as possible (within 24hrs)

• Include all relevant persons by using “CC” and “Reply All”

• International communication: you can send emails anytime

• Professionals do not use notifications for emails

• Do you send it to the US at 3 am?

• Other details: Google “email etiquette”