Lesson 2 Programming
ASSEMBLY LANGUAGES &
THE CONCEPT OF BYTECODE
THE BASICS
Every computer, no matter how simple or complex, has at its heart exactly two things:
a CPU and some memory.
Together, these two things are what make it possible for your computer to run programs.
On the most basic level, a computer program
is nothing more than a collection of numbers
stored in memory.
Different numbers tell the CPU to do different
things.
The CPU reads the numbers one at a time,
decodes them, and does what the numbers
say.
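The read-decode-execute cycle described above can be sketched in a few lines of Python. This is only an illustration: the instruction numbers (1 for LOAD, 2 for ADD, 3 for PRINT, 0 for HALT) and the single accumulator register are made up for this example and do not belong to any real CPU.

```python
# Hypothetical instruction numbers for a made-up CPU (not a real instruction set).
LOAD, ADD, PRINT, HALT = 1, 2, 3, 0

def run(memory):
    """Fetch, decode, and execute the numbers in memory until HALT is reached."""
    pc = 0        # program counter: address of the next number to read
    acc = 0       # accumulator: the CPU's single working register
    output = []   # values "printed" by the program
    while True:
        opcode = memory[pc]              # fetch the next number
        if opcode == LOAD:               # decode it, then do what it says
            acc = memory[pc + 1]
            pc += 2
        elif opcode == ADD:
            acc += memory[pc + 1]
            pc += 2
        elif opcode == PRINT:
            output.append(acc)
            pc += 1
        elif opcode == HALT:
            return output
        else:
            raise ValueError(f"unknown instruction {opcode} at address {pc}")

program = [1, 40, 2, 2, 3, 0]   # LOAD 40, ADD 2, PRINT, HALT
print(run(program))             # prints [42]
```

The program itself is just the list of numbers; the same list with different numbers would make the CPU do something else.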
EXAMPLE 1
MACHINE LANGUAGE
The lowest-level programming languages.
Machine languages are the only languages understood by computers.
While easily understood by computers, machine languages are almost impossible
for humans to use because they consist entirely of numbers.
WHAT IS A MACHINE LANGUAGE?
The set of symbolic instruction codes, usually in binary form, that is used to
represent operations and data in a machine such as a computer.
OR
A set of instructions for a specific central processing unit, designed to be usable
by a computer without being translated.
Machine language is also called machine code.
MACHINE LANGUAGE…
Machine code, also known as machine language, is the elemental language
of computers, comprising a long sequence of binary zeros and ones.
The program above (in Example 1), written in assembly language, looks like this:
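MOV AX, 47104     ; 47104 = B800 hex (on a DOS PC, the colour text-mode video segment)
MOV DS, AX        ; a segment register is loaded through AX
MOV [3998], 36    ; 36 is the ASCII code for '$'; 3998 is the last cell of an 80x25 screen
INT 32            ; interrupt 32 (20 hex) ends the program under DOS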
THE MAJOR DISADVANTAGE OF
ASSEMBLY LANGUAGE
Assembly languages generally lack high-level
conveniences such as variables and functions, and
they are not portable between various families of
processors.
Assembly language has the same structure and set of
commands as machine language, but it allows a
programmer to use names instead of numbers. It is still
useful for programmers when speed is necessary or
when they need to carry out an operation that is not
possible in high-level languages.
ASSEMBLER
An assembler is a program that
takes basic computer instructions in
an assembly language and converts
them into a pattern of bits that the
computer's processor can use to
perform its basic operations.
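A minimal sketch of what an assembler does, written in Python. The mnemonics and their numeric codes below are hypothetical (they belong to a made-up CPU, not to any real processor), and a real assembler also handles labels, registers, and addressing modes that are left out here.

```python
# Hypothetical opcode table for a made-up CPU; real processors define their own.
OPCODES = {"HALT": 0, "LOAD": 1, "ADD": 2, "PRINT": 3}

def assemble(source):
    """Translate assembly text into a flat list of numbers (machine code)."""
    machine_code = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()   # drop comments and surrounding spaces
        if not line:
            continue                        # skip blank lines
        mnemonic, *operands = line.split()
        machine_code.append(OPCODES[mnemonic.upper()])    # name -> number
        machine_code.extend(int(op) for op in operands)   # operands stay numeric
    return machine_code

source = """
LOAD 40   ; put 40 in the accumulator
ADD 2
PRINT
HALT
"""
print(assemble(source))   # prints [1, 40, 2, 2, 3, 0]
```

The assembler only replaces names with numbers; it does not change the structure of the program.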
CODE
Code: a general term to refer to the
program text.
When it isn't preceded by a word like
'object' or 'byte' it usually refers to the
human-readable code, also called
'source code.'
SOURCE CODE
Source Code: The human-readable stuff you type in.
Well, sort of human-readable if you ignore all the odd
punctuation that makes files look like something
written in one of those languages that includes clicks,
grunts, and squeals.
OBJECT CODE:
Object Code: Normally this refers to a program in a form that the host
system can run directly using the processor (CPU).
For example, C source code is fed into a C compiler/linker, which produces
an object code file that can be run on the host system directly (no virtual
machine such as Java's JVM is needed).
In the case of Java, however, the term tends to be used to refer to
bytecode.
This use of the word "object" has nothing to do with Objects in Java or
object-oriented programming.
THE CONCEPT OF VIRTUAL
MACHINES
A virtual machine is a program that acts as a virtual
computer or a virtual CPU.
OR
A self-contained operating environment that behaves
as if it is a separate computer.
-------------------------------------------------------------------
It runs on your current operating system – the “host”
operating system – and provides virtual hardware to
“guest” operating systems.
EXAMPLES - THE CONCEPT
OF VIRTUAL MACHINES
E.g. 1: Java applets
Java applets run in a Java virtual machine (VM) that has no
access to the host operating system.
SUMMARY
THE COMPILATION PROCESS
The compilation process is divided into three main
pieces.
First and simplest is tokenization. The compiler initially sees the
source program as a string of individual characters,
including spaces and line separators.
The first step in compilation
is to turn these characters into symbols, so that the later stages of
compilation can deal with each word as a unit.
The second piece of the compiler is the parser, the part
that recognizes certain patterns of symbols as representing meaningful
units. “Oh,” says
the parser, “I’ve just seen the word that starts a procedure, so what comes
next must be a procedure
header and then a block for the body of the procedure.”
Finally, there is the
process of code generation, in which each unit that was recognized by the
parser is actually
translated into the equivalent machine language instructions.
PARSER
The part of the compiler that
determines the syntactic structure
of the source program
is called the PARSER.
A parser does two things while processing its
input:
1. Splits the input into tokens.
2. Finds the hierarchical structure of the input.
First we have a lexical analyzer (scanner) that
splits the input into tokens (point 1).
The syntax analyzer takes these tokens as input
and generates a syntax tree (point 2).
EXAMPLE:
Consider the following programming statement in Java:
int x = 1;
A lexer or tokeniser will split that up into tokens as follows:
1. 'int',
2. 'x',
3. '=',
4. '1',
5. ';'.
A parser will take those tokens and use them to work out the structure of the statement:
• We have a statement.
• It is a declaration of an integer variable.
• The integer is called “x”.
• “x” should be initialized with the value 1.
• The ';' marks the end of the statement.
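The two steps in this example can be sketched in Python. The function names lex and parse_declaration are invented for this illustration, and the parser below only recognizes this one statement shape; a real Java parser handles a full grammar.

```python
import re

def lex(text):
    """Split the statement into tokens: names, numbers, and the symbols = and ;"""
    return re.findall(r"[A-Za-z_]\w*|\d+|[=;]", text)

def parse_declaration(tokens):
    """Recognize the pattern: int <name> = <number> ;"""
    if len(tokens) != 5:
        raise SyntaxError("expected: int <name> = <number> ;")
    type_name, name, equals, value, end = tokens
    if type_name != "int" or not name.isidentifier():
        raise SyntaxError(f"not an int declaration: {tokens}")
    if equals != "=" or not value.isdigit() or end != ";":
        raise SyntaxError(f"malformed initializer: {tokens}")
    # The result records what the parser has understood about the statement.
    return {"kind": "declaration", "type": "int", "name": name, "value": int(value)}

tokens = lex("int x = 1;")
print(tokens)                      # prints ['int', 'x', '=', '1', ';']
print(parse_declaration(tokens))
```

The dictionary returned by parse_declaration plays the role of the parser's understanding: a declaration, of type int, named x, initialized to 1.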
LEXER, TOKENIZER, PARSER
A lexer and a tokenizer are basically the
same thing: they smash the text
up into its component parts (the 'tokens').
The parser then interprets the tokens
using a grammar.
SUMMARY
A tokenizer breaks a stream of text into tokens,
usually by looking for whitespace (tabs, spaces, new
lines).
A lexer is basically a tokenizer, but it usually attaches
extra context to the tokens -- this token is a number,
that token is a string literal, this other token is an
equality operator.
A parser takes the stream of tokens from the lexer
and turns it into an abstract syntax tree representing
the structure of the original text (usually a
program).
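The difference between a tokenizer and a lexer can be shown with a short Python sketch. The token kinds and patterns below are assumptions chosen for illustration, not a standard set.

```python
import re

def tokenize(text):
    """Whitespace-only tokenizer: split on tabs, spaces, and new lines."""
    return text.split()

# Hypothetical token kinds; order matters because the first matching kind wins.
TOKEN_KINDS = [
    ("NUMBER", r"\d+"),
    ("STRING", r'"[^"]*"'),
    ("NAME", r"[A-Za-z_]\w*"),
    ("EQUALITY", r"=="),
    ("SYMBOL", r"[=;+\-*/()]"),
]
PATTERN = re.compile("|".join(f"(?P<{kind}>{regex})" for kind, regex in TOKEN_KINDS))

def lex(text):
    """Return (kind, text) pairs; characters matching no pattern are skipped silently."""
    return [(m.lastgroup, m.group()) for m in PATTERN.finditer(text)]

print(tokenize("count==42"))   # prints ['count==42']
print(lex("count==42"))        # prints [('NAME', 'count'), ('EQUALITY', '=='), ('NUMBER', '42')]
```

Without whitespace the simple tokenizer sees one token, while the lexer still finds three and labels each with its kind.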
COMPILATION PHASES
A compiler is a complex program, which should be divided into smaller
components.
These components typically address different compilation phases -
parts of a pipeline, which transform the code from one format to
another.
The following diagram shows the main compiler phases and how a
piece of source code travels through them.
EXAMPLE
• The lexer reads a string of characters and chops it into tokens.
• The parser reads a string of tokens and groups it into a syntax
tree.
• The type checker finds out the type of each part of the syntax
tree and returns an annotated syntax tree.
• The code generator converts the annotated syntax tree into a list
of target code instructions.
The difference between compilers and interpreters is just in the
last phase: interpreters don’t generate new code, but execute the old
code.
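Of these phases, the type checker is the least obvious, so here is a minimal Python sketch of it. The tree layout (dictionaries with a "node" field) and the rule that both operands of an addition must have the same type are assumptions made for this example only.

```python
def annotate(tree):
    """Return a copy of the syntax tree in which every node carries a 'type' field."""
    if tree["node"] == "literal":
        type_name = "double" if isinstance(tree["value"], float) else "int"
        return {**tree, "type": type_name}
    if tree["node"] == "add":
        left = annotate(tree["left"])
        right = annotate(tree["right"])
        if left["type"] != right["type"]:
            raise TypeError(f"cannot add {left['type']} and {right['type']}")
        # The type of the whole addition is the type of its operands.
        return {"node": "add", "left": left, "right": right, "type": left["type"]}
    raise ValueError(f"unknown node {tree['node']}")

tree = {
    "node": "add",
    "left": {"node": "literal", "value": 1},
    "right": {"node": "literal", "value": 2},
}
print(annotate(tree)["type"])   # prints int
```

A code generator or an interpreter would then work from the annotated tree rather than from the raw one.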
GRAMMAR IMPLEMENTATION
OF PROGRAMMING
LANGUAGES
A grammar is a system of rules
for a language.
Grammars are used in compilers to define
languages.
Grammars are complete by
definition.
FROM LANGUAGE TO
BINARY
Machines manipulate bits: 0’s and 1’s.
Bit sequences are used in binary encoding.
Information = bit sequences
Binary encoding of integers:
0 = 0
1 = 1
2 = 10
3 = 11
4 = 100
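The table above can be reproduced with a short Python sketch that uses repeated division by 2. The function name to_binary is made up for this example; Python's built-in format(n, "b") gives the same digits.

```python
def to_binary(n):
    """Binary digits of a non-negative integer, found by repeated division by 2."""
    if n == 0:
        return "0"
    digits = ""
    while n > 0:
        digits = str(n % 2) + digits   # the remainder is the next digit, right to left
        n //= 2
    return digits

for n in range(5):
    print(n, "=", to_binary(n))
```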
BINARY ENCODING OF
LETTERS, VIA ASCII
ENCODING
A = 65 = 1000001
B = 66 = 1000010
C = 67 = 1000011
Thus all data manipulated by computers can be expressed by 0’s and
1’s.
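As a quick check of the letter table, the Python sketch below looks up each letter's ASCII code with ord and prints its bits. The helper names ascii_bits and encode are invented for this illustration.

```python
def ascii_bits(letter):
    """Binary digits of a single character's ASCII code."""
    return format(ord(letter), "b")

def encode(text):
    """Encode ASCII text as bytes, written out as groups of eight bits."""
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))

for letter in "ABC":
    print(letter, "=", ord(letter), "=", ascii_bits(letter))

print(encode("AB"))   # prints 01000001 01000010
```

Stored as a full byte, each 7-bit ASCII code gets a leading 0.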
But what about programs?
BINARY ENCODING OF
INSTRUCTIONS
Programs are sequences of bytes: groups of eight 0’s or 1’s (there
are 256 possible byte values).
A byte can encode a numeric value, but also an instruction.
Examples: addition and multiplication (of integers):
96 = 0110 0000
104 = 0110 1000
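The two byte values above match the integer add and multiply opcodes of the Java Virtual Machine (iadd = 96, imul = 104). Treating them as JVM opcodes is an assumption here, since the text does not name the instruction set. Under that assumption, a small Python sketch can show the same byte read as a number, as bits, and as an instruction:

```python
# Assumption: the names follow the JVM opcodes iadd (96) and imul (104).
INSTRUCTIONS = {96: "iadd", 104: "imul"}

def describe(byte):
    """Show one byte as a number, as two groups of four bits, and as an instruction name."""
    bits = format(byte, "08b")
    return f"{byte} = {bits[:4]} {bits[4:]} = {INSTRUCTIONS.get(byte, 'unknown')}"

print(describe(96))    # prints 96 = 0110 0000 = iadd
print(describe(104))   # prints 104 = 0110 1000 = imul
print(2 ** 8)          # prints 256, the number of distinct byte values
```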
HOW COMPILERS WORK
1. Syntactic analysis: Analyse the expression into an operator F and
its operands X and Y.
2. Syntax-directed translation: Compile the code for X, followed by
the code for Y, followed by the code for F.
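The two steps can be sketched in Python as follows. The expression format (nested tuples), the instruction names push, add, and mul, and the small stack machine used to check the result are all assumptions made for this illustration.

```python
def compile_expr(expr):
    """Emit code for X, then code for Y, then code for the operator F."""
    if isinstance(expr, int):
        return [("push", expr)]          # a plain number compiles to one push
    op, x, y = expr                      # syntactic analysis: operator and operands
    return compile_expr(x) + compile_expr(y) + [(op,)]

def run(code):
    """Execute the generated code on a simple stack machine."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        elif instr[0] in ("add", "mul"):
            y = stack.pop()
            x = stack.pop()
            stack.append(x + y if instr[0] == "add" else x * y)
        else:
            raise ValueError(f"unknown instruction {instr[0]}")
    return stack.pop()

expr = ("add", 5, ("mul", 6, 7))         # 5 + 6 * 7
code = compile_expr(expr)
print(code)        # prints [('push', 5), ('push', 6), ('push', 7), ('mul',), ('add',)]
print(run(code))   # prints 47
```

Because the operands are always emitted before their operator, the operator finds its two values on top of the stack when it runs.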
COMPILATION VS.
INTERPRETATION
A compiler is a program that translates code to some other
code.
It does not execute the program; an interpreter does.