Lesson 2 Programming

The document provides an overview of machine languages, assembly languages, and bytecode, explaining their roles in programming and how they interact with CPUs and virtual machines. Machine languages consist entirely of binary numbers, while assembly languages use symbolic instructions for easier human readability. Bytecode is a form of object code that is executed by a virtual machine, allowing for platform independence in programming, particularly with languages like Java.

Uploaded by

masachibwatoh
Copyright
© All Rights Reserved

MACHINE, ASSEMBLY LANGUAGES & THE CONCEPT OF BYTECODE
THE BASICS

Every computer, no matter how simple or complex, has at its heart exactly two things: a CPU and some memory.
Together, these two things are what make it possible for your computer to run programs.
THE BASICS
• On the most basic level, a computer program is nothing more than a collection of numbers stored in memory.
• Different numbers tell the CPU to do different things.
• The CPU reads the numbers one at a time, decodes them, and does what the numbers say.
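The read, decode, and execute cycle described above can be sketched as a toy machine. This is only an illustration: the opcode numbers (1 = LOAD, 2 = ADD, 3 = PRINT, 0 = HALT) and the single-register design are invented here and do not correspond to any real CPU.

```python
# Toy machine: the opcode numbers below are invented for illustration only.
LOAD, ADD, PRINT, HALT = 1, 2, 3, 0

def run(memory):
    """Fetch, decode, and execute numbers from memory until HALT."""
    output = []
    acc = 0   # a single register (the accumulator)
    pc = 0    # program counter: index of the next number to read
    while True:
        opcode = memory[pc]              # fetch
        if opcode == LOAD:               # decode, then execute
            acc = memory[pc + 1]
            pc += 2
        elif opcode == ADD:
            acc += memory[pc + 1]
            pc += 2
        elif opcode == PRINT:
            output.append(acc)
            pc += 1
        elif opcode == HALT:
            return output
        else:
            raise ValueError(f"unknown opcode {opcode} at address {pc}")

program = [1, 40, 2, 2, 3, 0]   # LOAD 40; ADD 2; PRINT; HALT
print(run(program))              # prints [42]
```

The program is nothing but a list of numbers; the meaning comes entirely from how the loop decodes them.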
EXAMPLE 1
MACHINE LANGUAGE
• The lowest-level programming languages.
• Machine languages are the only languages understood directly by computers.
• While easily understood by computers, machine languages are almost impossible for humans to use because they consist entirely of numbers.
• WHAT IS A MACHINE LANGUAGE?
• The set of symbolic instruction codes, usually in binary form, that is used to represent operations and data in a machine (such as a computer); also called machine code.
• OR
• A set of instructions for a specific central processing unit, designed to be usable by a computer without being translated.
• Also called machine code.
MACHINE LANGUAGE…
• Machine code, also known as machine language, is the elemental language of computers, comprising a long sequence of binary digits (zeros and ones).
• Machine languages are programming languages that a computer can understand and obey directly, without translation.
• Machine languages are different for each type of CPU.
• Machine language is the native binary language of the computer (composed of only two characters, 0 and 1) and is difficult for humans to read and understand.
ASSEMBLY LANGUAGES
• Although the numbers of the program in Example 1 (slide 3) make perfect sense to a computer, they are about as clear as mud to a human.
• Who would have guessed that they put a dollar sign on the screen?
• Clearly, entering numbers by hand is a lousy way to write a program.
• It doesn't have to be this way, though.
• A long time ago, someone came up with the idea that computer programs could be written using words instead of numbers.
• A special program called an assembler would then take the programmer's words and convert them to numbers that the computer could understand.
• This new method, called writing a program in assembly language, saved programmers thousands of hours, since they no longer had to look up hard-to-remember numbers in the backs of programming books, but could use simple words instead.
THE HALLMARK OF ASSEMBLY LANGUAGES
• The hallmark of an assembly language is that each assembly language instruction is translated into one machine language instruction.
• It is an intermediate-level language: higher than machine language (easier to use) and lower than a high-level language such as BASIC or Java (more difficult to use, but hand-written assembly can run faster).
• Programs written in assembly language are converted into machine language by specialized programs called assemblers so that the machine (computer) can execute them.
EXAMPLE 2 - ASSEMBLY LANGUAGES

The program above (in Example 1), written in assembly language, looks like this:
MOV AX, 47104
MOV DS, AX
MOV [3998], 36
INT 32
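A minimal sketch of what an assembler does with the four instructions above. It assumes they target a 16-bit x86 processor in real mode and that `MOV [3998], 36` stores a single byte; the listing does not state either point explicitly. The helper function names are invented for this illustration, and only the four instruction forms used here are handled.

```python
import struct

def mov_ax_imm16(value):
    # Opcode B8 followed by a 16-bit immediate, low byte first: MOV AX, imm16
    return bytes([0xB8]) + struct.pack("<H", value)

def mov_ds_ax():
    # Opcode 8E with ModRM byte D8: MOV DS, AX
    return bytes([0x8E, 0xD8])

def mov_byte_mem_imm8(address, value):
    # Opcode C6 with ModRM byte 06 (direct 16-bit address): MOV BYTE [address], imm8
    return bytes([0xC6, 0x06]) + struct.pack("<H", address) + bytes([value])

def int_imm8(number):
    # Opcode CD followed by the interrupt number: INT imm8
    return bytes([0xCD, number])

machine_code = (
    mov_ax_imm16(47104)
    + mov_ds_ax()
    + mov_byte_mem_imm8(3998, 36)
    + int_imm8(32)
)
print(list(machine_code))
# [184, 0, 184, 142, 216, 198, 6, 158, 15, 36, 205, 32]
```

Each assembly line becomes exactly one machine instruction, a few bytes long. The value 36 is the ASCII code of the dollar sign, which fits the earlier remark about a dollar sign appearing on the screen.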
THE MAJOR DISADVANTAGE OF ASSEMBLY LANGUAGES
• Assembly languages generally lack high-level conveniences such as variables and functions, and they are not portable between different families of processors.
• An assembly language has the same structures and set of commands as machine language, but it allows a programmer to use names instead of numbers. It is still useful when speed is essential or when programmers need to carry out an operation that is not possible in a high-level language.
ASSEMBLER
An assembler is a program that
takes basic computer instructions in
an assembly language and converts
them into a pattern of bits that the
computer's processor can use to
perform its basic operations.
CODE
Code: a general term to refer to the
program text.
When it isn't preceded by a word like
'object' or 'byte' it usually refers to the
human-readable code, also called
'source code.'
SOURCE CODE
• Source Code: The human-readable stuff you type in.
• Well, sort of human-readable if you ignore all the odd punctuation that makes files look like something written in one of those languages that includes clicks, grunts, and squeals.
OBJECT CODE
• Object Code: Normally this refers to a program in a form that the host system can run directly using the processor (CPU).
• For example, C source code is fed into a C compiler/linker, which produces an object code file that can be run on the host system directly (no virtual machine like Java's JVM is needed).
• In the case of Java, however, the term tends to be used to refer to bytecode.
• This use of the word "object" has nothing to do with objects in Java or object-oriented programming.
THE CONCEPT OF VIRTUAL MACHINES
• A virtual machine is a program that acts as a virtual computer or a virtual CPU.
OR
• A self-contained operating environment that behaves as if it is a separate computer.
• It runs on your current operating system (the "host" operating system) and provides virtual hardware to "guest" operating systems.
EXAMPLES - THE CONCEPT OF VIRTUAL MACHINES
• E.g. 1: Java applets
• For example, Java applets run in a Java virtual machine (VM) that has no access to the host operating system.
• E.g. 2: The case of operating systems
• The guest operating systems run in windows on the host operating system, just like any other program on your computer.
• The guest operating system runs normally, as if it were running on a physical computer. From the guest operating system's perspective, the virtual machine appears to be a real, physical computer.
WHAT VIRTUAL MACHINES OFFER…
• Virtual machines provide their own virtual hardware, including:
1. a virtual CPU,
2. virtual memory,
3. a virtual hard drive,
4. a virtual network interface,
5. and other devices.
• The virtual hardware devices provided by the virtual machine are mapped to real hardware on your physical machine.
• For example, a virtual machine's virtual hard disk is stored in a file located on your hard drive.
ADVANTAGES OF VIRTUAL MACHINES
• This design has two advantages:
• System independence: A Java application will run the same in any Java VM, regardless of the hardware and software underlying the system.
• Security: Because a program running in the VM has no direct contact with the operating system, there is little possibility of a Java program damaging other files or applications.
• The second advantage, however, has a downside. Because programs running in a VM are separate from the operating system, they cannot take advantage of special operating system features.
BYTE CODE
• Bytecode is computer object code that is processed by a program, usually referred to as a virtual machine, rather than by the "real" computer machine, the hardware processor.
• The virtual machine converts each generalized machine instruction into a specific machine instruction or instructions that this computer's processor will understand.
BYTE CODE…
Bytecode is the result of compiling source code written in a language that supports this approach.
BYTE CODE…
Using a language that comes with a virtual machine for each platform, your source language statements need to be compiled only once and will then run on any platform.
The best-known language today that uses the bytecode and virtual machine approach is Java.
BYTECODE
• A bytecode program is interpreted by a bytecode interpreter.
• The advantage of this technique compared with outputting machine code for some particular processor is that the same bytecode can be executed on any processor on which the bytecode interpreter runs.
• The bytecode may be compiled to machine code ("native code") for speed of execution, but this usually requires significantly greater effort for each new target architecture than simply porting the interpreter.
• For example, Java is compiled to bytecode, which runs on the Java virtual machine.
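Bytecode can be inspected directly in some languages. The sketch below uses Python rather than Java, purely because CPython exposes its own bytecode through the standard `dis` module. The idea is analogous to Java's compiler and JVM, but the instruction set is different, and the exact opcode names printed depend on the Python version. The function `total` is a made-up example.

```python
import dis

def total(a, b, c):
    return a + b * c

# CPython has already compiled the function body to bytecode.
raw = total.__code__.co_code    # the bytecode as a bytes object
names = [ins.opname for ins in dis.get_instructions(total)]

print(len(raw), "bytes of bytecode")
print(names)                    # opcode names vary between Python versions
print(total(5, 6, 7))           # prints 47
```

One line of source becomes several bytecode instructions, and the same bytecode runs wherever a matching interpreter is available.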
SUMMARY OF WHAT BYTECODE IS…
• Bytecode is program code, often from an object-oriented programming (OOP) language such as Java, compiled to run on a virtual machine (VM) instead of directly on a central processing unit (CPU).
• The VM translates the bytecode into machine language that the host CPU can execute, because different platforms use different instruction sets.
QUESTIONS AND ANSWERS
Read question
Close thy eyes
Think with thy brain
Answer with thy mouth
MAY JESUS CHRIST HELP YOU
DISTINGUISH BETWEEN MACHINE AND ASSEMBLY LANGUAGE
• Machine languages consist entirely of numbers and are almost impossible for humans to read and write.
• Assembly languages have the same structure and set of commands as machine languages, but they enable a programmer to use names instead of numbers.
• Assembly language closely approximates binary machine code and uses equivalent symbols to communicate with the computer in the language of a specific machine.
• It is one step removed from machine language.
DISTINGUISH BETWEEN AN ASSEMBLER AND A COMPILER.
• An assembler translates assembly-language instructions into machine code.
• A compiler translates high-level language instructions into machine code (or assembly code).
• The translation of an assembler is one to one: One statement in assembly language is translated into one statement in machine code.
• The translation of a compiler is one to many: One high-level language instruction is translated into many machine language instructions.
DISTINGUISH BETWEEN A COMPILER AND AN INTERPRETER
• The output from a compiler is a machine-language program.
• That program may be stored for later use or immediately executed, but the execution is a distinct process from the translation.
• An interpreter translates and executes together.
• The output from an interpreter is a solution to the original problem, not a program that when executed gives you the solution.
COMPARE AND CONTRAST AN ASSEMBLER, A COMPILER, AND AN INTERPRETER
• All three are translators.
• They differ in the complexity of the languages they translate and in the output from the translator.
• Assemblers and compilers produce machine-language programs, which when run solve the original problem once and for all.
• Interpreters produce temporary solutions to the original problem.
• Assemblers translate very simple languages; compilers and interpreters can translate very complex languages.
DESCRIBE THE PORTABILITY PROVIDED BY A COMPILER.
A program written in a high-level language that is compiled can be translated and run on any machine that has a compiler for the language.
DESCRIBE THE PORTABILITY PROVIDED BY THE USE OF BYTECODE.
• Bytecode is the output from (usually) a Java compiler.
• There is a virtual machine for which Bytecode is the machine language.
• A program compiled into Bytecode can be executed on any system that has a simulator for the virtual machine. The Java Virtual Machine (JVM) executes Bytecode.
DESCRIBE THE PROCESS OF
COMPILING AND RUNNING A
JAVA PROGRAM
A Java program is compiled into
Bytecode, which can be executed
on any system with a JVM.
SUMMARY
--------------------------------- human
• human language
• High-level languages
  • ML, Haskell
  • Lisp, Prolog
  • C++, Java
  • C
• Low-level languages
  • Assembly language
  • machine language
--------------------------------- machine
SUMMARY: THE COMPILATION PROCESS
• The compilation process is divided into three main pieces.
• First and simplest is tokenization. The compiler initially sees the source program as a string of individual characters, including spaces and line separators.
• The first step in compilation is to turn these characters into symbols, so that the later stages of compilation can deal with each word as a unit.
• The second piece of the compiler is the parser, the part that recognizes certain patterns of symbols as representing meaningful units. "Oh," says the parser, "I've just seen the word that begins a procedure, so what comes next must be a procedure header and then a block for the body of the procedure."
• Finally, there is the process of code generation, in which each unit that was recognized by the parser is actually translated into the equivalent machine language instructions.
PARSER
That part of the compiler—the
part that determines the syntactic
structure of the source program—
is called the PARSER
PARSER
• A parser does two things while processing its input:
• 1. Splits the input into tokens.
• 2. Finds the hierarchical structure of the input.
• First we have a lexical analyzer (scanner) that splits the input into tokens (point 1).
• The syntax analyzer takes these tokens as input and generates a syntax tree (point 2).
EXAMPLE:
Consider the following programming statement in Java:

int x = 1;
A lexer or tokeniser will split that up into tokens as follows:
1. 'int',
2. 'x',
3. '=',
4. '1',
5. ';'.
A parser will take those tokens and use them to understand the statement in some way:
• We have a statement.
• It's a definition of an integer.
• The integer is called "x".
• "x" should be initialized with the value 1.
• We reach the end.
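The two steps in this example can be sketched in a few lines. This is a deliberately tiny illustration, not how a real Java compiler works: the regular expression and the function names `tokenize` and `parse_declaration` are assumptions made here, and the "parser" recognises only this one statement shape.

```python
import re

# One alternative for names/numbers, one for any other non-space character.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")

def tokenize(text):
    """Split source text into a list of token strings, skipping whitespace."""
    return TOKEN_PATTERN.findall(text)

def parse_declaration(tokens):
    """Recognise the shape: int <name> = <integer> ;"""
    if len(tokens) != 5 or tokens[2] != "=" or tokens[4] != ";":
        raise SyntaxError(f"not a simple declaration: {tokens}")
    type_name, name, _, value, _ = tokens
    if type_name != "int" or not name.isidentifier() or not value.isdigit():
        raise SyntaxError(f"not an integer declaration: {tokens}")
    return {"kind": "declaration", "type": type_name,
            "name": name, "value": int(value)}

tokens = tokenize("int x = 1;")
print(tokens)                      # ['int', 'x', '=', '1', ';']
print(parse_declaration(tokens))
```

The tokenizer knows nothing about meaning; only the parsing step decides that the five tokens form an integer definition.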
LEXER, TOKENIZER, PARSER
• A lexer and a tokenizer are basically the same thing: they smash the text up into its component parts (the 'tokens').
• The parser then interprets the tokens using a grammar.
SUMMARY
• A tokenizer breaks a stream of text into tokens, usually by looking for whitespace (tabs, spaces, new lines).
• A lexer is basically a tokenizer, but it usually attaches extra context to the tokens: this token is a number, that token is a string literal, this other token is an equality operator.
• A parser takes the stream of tokens from the lexer and turns it into an abstract syntax tree representing the program (usually) contained in the original text.
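The difference between a plain tokenizer and a lexer that attaches context can be sketched as follows. The token kinds (`NUMBER`, `STRING`, `NAME`, and so on) are illustrative choices made for this example; a real lexer for Java would have many more, including separate kinds for keywords such as `int`.

```python
import re

# Token kinds are illustrative; order matters ("==" must be tried before "=").
TOKEN_SPEC = [
    ("NUMBER",   r"\d+"),
    ("STRING",   r'"[^"\n]*"'),
    ("NAME",     r"[A-Za-z_]\w*"),
    ("EQUALITY", r"=="),
    ("ASSIGN",   r"="),
    ("SEMI",     r";"),
    ("SKIP",     r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})"
                             for kind, pattern in TOKEN_SPEC))

def lex(text):
    """Return a list of (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(lex('name == "x";'))
# [('NAME', 'name'), ('EQUALITY', '=='), ('STRING', '"x"'), ('SEMI', ';')]
```

A parser would consume these labelled pairs instead of raw strings, so it never has to re-inspect individual characters.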
COMPILATION PHASES
• A compiler is a complex program, which should be divided into smaller components.
• These components typically address different compilation phases: parts of a pipeline that transform the code from one format to another.
• The following diagram shows the main compiler phases and how a piece of source code travels through them.
COMPILATION PHASES EXAMPLE
• The lexer reads a string of characters and chops it into tokens.
• The parser reads a string of tokens and groups it into a syntax tree.
• The type checker finds out the type of each part of the syntax tree and returns an annotated syntax tree.
• The code generator converts the annotated syntax tree into a list of target code instructions.
• The difference between compilers and interpreters is just in the last phase: interpreters don't generate new code, but execute the old code.
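GRAMMAR IMPLEMENTATION OF PROGRAMMING LANGUAGES
• A grammar is a system of rules for a language.
• Grammars are used in compilers to define languages.
• Grammars of programming languages are complete by definition: the grammar itself defines exactly what belongs to the language.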
FROM LANGUAGE TO BINARY
• Machines manipulate bits: 0's and 1's.
• Bit sequences are used in binary encoding.
• Information = bit sequences
• Binary encoding of integers:
• 0 = 0
• 1 = 1
• 2 = 10
• 3 = 11
• 4 = 100
BINARY ENCODING OF LETTERS, VIA ASCII ENCODING
• A = 65 = 1000001
• B = 66 = 1000010
• C = 67 = 1000011
• Thus all data manipulated by computers can be expressed by 0's and 1's.
• But what about programs?
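The integer and letter encodings listed above can be reproduced with a short sketch. The helper name `to_binary` is made up for this example; `format` and `ord` are standard Python built-ins.

```python
def to_binary(n):
    """Return the binary digits of a non-negative integer as a string."""
    return format(n, "b")

# Integers 0 to 4 in binary
for n in range(5):
    print(n, "=", to_binary(n))

# Letters: first the ASCII code, then its binary form
for letter in "ABC":
    code = ord(letter)
    print(letter, "=", code, "=", to_binary(code))
```

Running it prints the same rows as the two slides, for example `4 = 100` and `A = 65 = 1000001`.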
BINARY ENCODING OF INSTRUCTIONS
• Programs are sequences of bytes: groups of eight 0's or 1's (there are 256 possible bytes).
• A byte can encode a numeric value, but also an instruction.
• Examples: addition and multiplication (of integers)
• 96 = 0110 0000
• 104 = 0110 1000
HOW COMPILERS WORK
• 1. Syntactic analysis: Analyse the expression into an operator F and its operands X and Y.
• 2. Syntax-directed translation: Compile the code for X, followed by the code for Y, followed by the code for F.
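Step 2 can be sketched as a small recursive function. The sketch assumes step 1 has already been done, so the expression arrives as a nested tuple `(operator, X, Y)`; that representation and the function name `compile_expr` are choices made for this illustration. The byte values 96 and 104 are the Java Virtual Machine opcodes `iadd` and `imul`, and 16 is `bipush`, which pushes a one-byte constant.

```python
# Opcode numbers from the JVM instruction set: bipush = 16, iadd = 96, imul = 104.
BIPUSH, IADD, IMUL = 16, 96, 104
OPCODES = {"+": IADD, "*": IMUL}

def compile_expr(expr):
    """Compile a nested tuple (op, x, y) or a small int into a list of byte values."""
    if isinstance(expr, int):
        if not -128 <= expr <= 127:
            raise ValueError("bipush only takes a one-byte operand")
        return [BIPUSH, expr & 0xFF]
    op, x, y = expr
    # Code for X, then code for Y, then code for the operator F.
    return compile_expr(x) + compile_expr(y) + [OPCODES[op]]

# 5 + 6 * 7, already analysed into operator and operands
tree = ("+", 5, ("*", 6, 7))
print(compile_expr(tree))   # [16, 5, 16, 6, 16, 7, 104, 96]
```

The multiplication opcode appears before the addition opcode because the code for the right operand (6 * 7) is emitted in full before the code for the outer operator.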
COMPILATION VS. INTERPRETATION
• A compiler is a program that translates code to some other code.
• It does not execute the program.
• An interpreter does not translate, but it executes the program.
• A source language expression,
5 + 6 * 7
• is turned by an interpreter into its value,
• 47
TRADE-OFFS
• Advantages of interpretation:
  • faster to get going
  • easier to implement
  • portable to different machines
• Advantages of compilation:
  • if to machine code: the resulting code is faster to execute
  • if to machine-independent target code: the resulting code is easier to interpret than the source code
• Byte Code:
Bytecode usually targets something like a virtual CPU. A program (the virtual machine), itself written and compiled to the machine language of each real CPU, converts the bytecode to that CPU's instruction set as it runs. Sometimes it incorporates a JIT compiler, which can optimize for the particular CPU at run time: e.g. on a CISC processor it might combine some instructions into one more complex instruction, while on a RISC processor it might have to split a specific "action" into a sequence of instructions.
