8-computer language
INTRODUCTION
Computers must be directed to perform any task, and these directions are called instructions. A set of instructions to
carry out a task is called a program. The instructions are written in a notation called a language. Three types
of languages have been developed: machine language, assembly language, and high-level language.
Assembly language employs mnemonics such as READ, LOAD, etc. in place of numeric operation codes, but programming
in it is still difficult. High-level languages were later developed to make programming easier. The
instructions in high-level languages resemble ordinary English statements and are easy to learn and use. High-level
languages (HLL) require a translator to convert high-level language instructions into machine language instructions.
This translation program is called a compiler, and each HLL requires its own compiler.
Machine Language
A computer directly understands only machine language, which is the natural dialect of the machine. Machine
instructions are binary codes made up of 0s and 1s. Writing machine language instructions is laborious, as it requires
a thorough understanding of both the machine configuration and programming. A machine instruction has two
parts: an operation code and an operand address. The operation code specifies the operation to be carried out, and
the operand address specifies the address in memory where the operand or instruction is stored or is to be stored.
A set of machine instructions is called machine code or machine language. The program design is determined by
features of the machine such as the type of registers and the word length of registers and memory. As this language
depends on the computer architecture, it is called a machine dependent language.
Machine language is recognized by the CPU without the help of a translator program. The program instructions in
machine language are thus directly converted into electrical signals to execute them.
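The two-part structure of a machine instruction can be illustrated with a short Python sketch. The 16-bit layout used
here (4 bits of operation code followed by 12 bits of operand address) is an assumption made only for illustration;
real machines use many different instruction formats.

```python
# Hypothetical 16-bit instruction layout (an assumption for illustration):
# the top 4 bits hold the operation code, the low 12 bits the operand address.
ADDRESS_BITS = 12

def decode(instruction):
    """Split a 16-bit instruction word into (operation code, operand address)."""
    opcode = instruction >> ADDRESS_BITS                 # drop the address bits
    address = instruction & ((1 << ADDRESS_BITS) - 1)    # keep only the address bits
    return opcode, address

word = 0b0011_000000101010    # operation code 3, operand address 42
print(decode(word))           # (3, 42)
```

Reading a program as a sequence of such words is what makes machine language laborious: the programmer has to know
the layout and remember every numeric operation code.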
Advantages
i. Machine language requires less memory space than other languages.
ii. Programs in machine language can be executed directly.
iii. It does not require any translating program.
Disadvantages
i. The programs in machine language are not portable as the language is machine dependent.
ii. Programming in machine language is laborious and tedious as it requires keeping track of
memory locations, operation codes, state of execution of commands, intermediate results,
etc.
iii. Programming in machine language is error prone.
iv. The machine language programmer has to remember all the operation codes, what each code
does and how it affects various registers of the processor.
v. One has to keep track of all the operands and know exactly where they are stored in
memory.
vi. It requires deep knowledge of the internal structure of the computer.
Assembly Language
To overcome the problems of programming in machine code, mnemonics were introduced. These mnemonics are
easy to remember. Thus the first step in the evolution of programming languages was the development of assembly
language. In this language, mnemonics, which are usually two to four letter words, are used in place of operation
codes, and strings of characters are used to indicate addresses of locations. The language is designed to replace each
machine instruction with a human understandable mnemonic (such as MUL for multiply, DIV for divide and SUB
for subtract) and each address with an alphanumeric string. Assembly language is also machine dependent and
thus suffers from some of the problems of machine language. The mnemonic operation codes must be translated into
absolute numeric operation codes, and the symbolic addresses must be translated into absolute numeric addresses.
This translation requires a special program called an assembler, usually supplied by the manufacturer along with the
machine. It is called an assembler because it assembles the codes after translation into machine codes in the memory
ready for execution. The source code is developed first; the assembler then converts the source code into object code,
and a linker program links the object code into a directly executable program.
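The two translations an assembler performs (mnemonic to numeric operation code, symbolic address to numeric address)
can be sketched in a few lines of Python. The opcode numbers, the starting address 100 and the function name
`assemble` are hypothetical; a real assembler follows the instruction set of its processor.

```python
# Hypothetical mnemonic-to-opcode table; real opcodes depend on the processor.
OPCODES = {"LOAD": 1, "ADD": 2, "SUB": 3, "MUL": 4, "DIV": 5, "STORE": 6}

def assemble(source_lines, first_data_address=100):
    """Translate 'MNEMONIC SYMBOL' lines into (opcode, numeric address) pairs."""
    symbol_table = {}     # symbolic address -> numeric address
    object_code = []
    for line in source_lines:
        mnemonic, symbol = line.split()
        if symbol not in symbol_table:
            # Assign the next free numeric address to a symbol seen for the first time.
            symbol_table[symbol] = first_data_address + len(symbol_table)
        object_code.append((OPCODES[mnemonic], symbol_table[symbol]))
    return object_code, symbol_table

program = ["LOAD Y", "ADD Z", "STORE X"]
code, symbols = assemble(program)
print(code)       # [(1, 100), (2, 101), (6, 102)]
print(symbols)    # {'Y': 100, 'Z': 101, 'X': 102}
```

Each source line produces exactly one machine instruction, which is why assembly language is still tied to the machine.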
Advantages of Assembly Language
i. Assembly language is easier to understand and use than machine language.
ii. It is easier to locate and correct errors in assembly language.
iii. The program in assembly language can be more easily modified.
iv. It offers greater flexibility in writing programs.
Disadvantages
i. It is machine dependent and thus not portable.
ii. Programming in assembly language is difficult. It requires expert knowledge of the internal
structure of the processor and the assembly language programming techniques.
iii. Assembly language programming is time consuming.
High Level Language
It was realized that the enormous potential of computers could be achieved only if a non-expert user could effectively
use the computer for problem solving. This necessitated the development of high-level languages. The focus shifted
from machine-oriented languages to problem-oriented languages. Such problem-oriented languages enable a
programmer to write appropriate algorithms to solve problems in a notation close to a natural language like English.
To represent algorithms adequately, a high level language must have features such as:
1. Facility to describe the data items and data structures.
2. Operators which are appropriate to the data items and data structures in the language.
3. A set of characters from which symbols are constructed, and a definition of the precise meaning of the symbols or
strings of such symbols (e.g. * for multiplication).
4. It should be precise and unambiguous.
5. A set of syntax rules which specify the permissible combinations of the words and operators.
6. A set of semantic rules which assign meaning to each valid syntactic structure.
7. The syntax and semantic rules of the language, besides being precise, should aid in
understanding of the program.
8. Control structures to sequence the operations to be performed.
A high level language is not machine dependent and is thus portable, that is, a program written in it can be executed on
any machine for which a translator is available. The program requires translation into machine language before
execution. This requires a specially written program, which may be either a compiler or an interpreter. These programs
are pre-stored in the machine.
Compilers are written by professional programmers. The language they translate is machine independent, although each
compiler produces code for a particular machine. The rules for writing programs in an HLL resemble the grammar of a
natural language and are called the syntax rules of the language. If these rules are violated, the compiler detects the
violations and lists them while translating the instructions. Examples of HLLs are COBOL, BASIC, FORTRAN, etc.
Advantages of High Level Languages
1. It is easy to learn and use as it is close to a familiar language like English.
2. Better documentation.
3. Portability as the language is machine-independent.
4. Increases Programmer productivity.
5. Error detection and correction is easier in high-level language. Syntax errors are detected and
displayed by the compiler for correction.
6. Fewer errors.
7. Programs in high-level languages are easier to maintain than those in the low-level languages.
Disadvantages
1. A program written in a high-level language generally takes more time and main memory to run than an
equivalent machine language program.
2. The high level language is less flexible than machine language, as the features provided automatically by the
translator are not under the control of the programmer.
3. A translator program (interpreter or compiler) is required in translating source code into object
code.
HIGHER LEVEL PROGRAMMING LANGUAGES
Until about 1955, computers were slow and had small memories. Thus the efficiency of programs was very important
and assembly language was dominant. The use of computers was also limited to a small group of scientists. With
improvements in technology, the tremendous potential of computer applications in diverse areas was foreseen. It
was evident that this potential could be realized only if a non-expert user could effectively use the computer to solve
problems. It was thus clear that a user should be concerned primarily with the development of appropriate
algorithms to solve problems of interest and not with the details of the internal logical structure of a
computer. Consequently a good notation to express algorithms became an essential requirement. It would be ideal if
an algorithm written in a natural (spoken) language such as English were translated to machine language
automatically by the computer and executed. But this is not possible because natural languages are not precise or
unambiguous.
The interpretation of the meaning of a natural language sentence also depends on the context. For example,
the sentence "give me a ring" may mean either give me a ring to wear or a ring on the telephone, depending on the
context. Thus if algorithms are to be executed by computers, it is necessary to develop a simple, concise, precise
and unambiguous notation to express them. The notation should also match the type of algorithm. For example,
algorithms to solve science and engineering problems would have complex arithmetic operations and would use
mathematical functions such as tan x, cosh x, etc. Thus a notation to express such algorithms should include complex
arithmetic operations and mathematical functions. On the other hand, algorithms for processing business data would
have operations to be performed on massive amounts of organized data known as files. The notation in this case
must facilitate describing and processing files. Such notations used to express algorithms are known as high
level procedure oriented programming languages.
In general, a high level language must have the following features to express algorithms:
1. Facility to describe the nature of the data to be processed, for example specifications of integers, reals, complex
numbers, characters, etc. Besides individual data items, a collection of similar types of data making up a
composite, known as a data structure, is very important in developing algorithms. Examples of data structures
are arrays, matrices, sets, and strings of characters. Each high level language has the facility to describe some of
these structures depending on the area of application of the language. Rigid specification of variable type is one
of the most important features of high level procedure oriented programming languages.
2. Operators which are appropriate to the data items and data structures in the language. For example, if we have a
facility to represent complex numbers, then complex addition, subtraction, multiplication and division
operations would be useful.
3. A set of characters using which symbols in the language are constructed. These symbols have a precise meaning
in the context of the language. For example, the symbol ** is used to represent the exponentiation operation in
FORTRAN. Thus A**B would mean raising A to the power B.
4. Control structures to sequence the operations to be performed are important. Thus a high level language should
provide control structures appropriate to express algorithms. For example, a common control structure found in
a high level language is "if A > B then X := Y + Z else X := P + Q;", which means "compare the numbers stored in A
and B. If the number stored in A is larger than that stored in B then add the number stored in Y to that stored in Z
and place it in X, otherwise add the number stored in P to that stored in Q and place it in X".
5. A set of words each with a precise and unambiguous meaning and a role to play in creating the program.
6. A set of syntax rules which precisely specify the combinations of words and operators permissible in the language.
For example, a language may specify that A * B is a legal combination whereas A*/B is illegal.
The syntax rules are rules of grammar valid for the language. These rules are derived systematically and their
number is kept small to enable users to memorize them.
7. A set of semantic rules which assign a single precise and unambiguous meaning to each legal syntactic structure
in the language. For example, the statement C = B/D would have the meaning "divide the number stored in B by
the one stored in D and store the result in C" in a particular high level language.
8. A syntactically correct statement is not necessarily semantically meaningful. In natural language (English), for
instance, the sentences "Ram plays football" and "Football plays Ram" are both syntactically correct. The second
sentence, however, is semantically meaningless. In high level languages for computers, there should be no
semantic ambiguity. Each syntactically correct structure should have one and only one semantic interpretation.
This is in contrast with natural languages.
9. The syntax and semantic rules of the language, besides being concise and precise, should aid in understanding
the program. An understandable program is self-documenting and thus easily maintainable.
Besides this, the facility to intersperse the program with comments (which are not part of the program) should be
provided to aid program understanding.
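Several of the features listed above can be seen together in a short program. The sketch below uses Python purely as a
stand-in high level language (it is not one of the languages discussed in this chapter), and the variable and function
names are made up for illustration.

```python
# Describing data items and data structures.
count = 3                     # integer
rate = 7.5                    # real
z = complex(2, 3)             # complex number 2 + 3i
marks = [40, 55, 70]          # an array-like data structure

# Operators appropriate to the data: complex multiplication.
w = z * complex(1, -1)        # (2 + 3i)(1 - i) = 5 + i

# A symbol with a precise meaning: ** is exponentiation, as in FORTRAN.
power = 2 ** 10               # 1024

# A control structure: "if A > B then X := Y + Z else X := P + Q".
def choose(a, b, y, z_value, p, q):
    if a > b:
        return y + z_value
    return p + q

# A semantic rule: C = B / D divides B by D and stores the result in C.
b, d = 9, 3
c = b / d

print(w, power, choose(5, 3, 1, 2, 10, 20), c)
```

The comments interspersed with the statements are not part of the program; they only aid understanding.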
High level languages are designed independent of the structure of a specific computer. This facilitates
executing a program written in such a language on different computers. Associated with each high level language is
an elaborate computer program which translates it into the machine language of the computer on which it is to be
executed. The translator program is normally written in the assembly language of that computer. The
figure below explains various terms used in high level language translation. It illustrates how machine
independence is achieved by using different translators to translate a high level language program to
the machine languages of different computers.
Observe that one high level language statement is translated into many machine language statements. This is
one-to-many translation. The terminology "high level language" arises from this. An assembly language is a low level
language as its translation to machine language is one-to-one. It is possible to translate a high level language to one
at a lower level, but the reverse is not always possible.
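One-to-many translation can be made concrete with a small Python sketch that turns a single assignment statement into
three instructions. The mnemonics LOAD, ADD, STORE, etc. and the one-address instruction style are assumptions for
illustration, not the instruction set of any particular computer.

```python
def translate_assignment(statement):
    """Translate a statement of the form 'X = Y + Z' into hypothetical one-address instructions."""
    target, expression = [part.strip() for part in statement.split("=")]
    left, operator, right = expression.split()
    mnemonic = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}[operator]
    # One high level statement becomes three low level instructions.
    return [f"LOAD {left}", f"{mnemonic} {right}", f"STORE {target}"]

print(translate_assignment("X = Y + Z"))    # ['LOAD Y', 'ADD Z', 'STORE X']
```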
There are two approaches to writing language translators. One method is to take one statement of a high
level language at a time and translate it into machine instructions which are immediately executed. A translator of
this kind is called an interpreter.
Interpreters are easy to write and they do not require large memory space in the computer. The main
disadvantage of interpreters is that interpreted programs take more time to execute.
The other approach to translation is to store the high level language program, scan it and translate the whole
program into an equivalent machine language program. Such a translator is known as a compiler. A compiler is a
complex program compared to an interpreter. It takes more time to compile than to interpret. However, a compiled
machine language program runs much faster than an interpreted program.
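The difference between the two approaches can be sketched in Python for a toy language whose statements all have the
form `target = left op right`. Everything here (the toy language and the names `translate`, `interpret` and
`compile_and_run`) is hypothetical and greatly simplified; in particular the "compiled" form is a list of Python tuples
rather than real machine code.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def translate(statement):
    """Translate 'target = left op right' into a (target, left, function, right) tuple."""
    target, _, left, op, right = statement.split()
    return target, left, OPS[op], right

def value(token, memory):
    """A token is either a variable already in memory or a numeric constant."""
    return memory[token] if token in memory else float(token)

def execute(instruction, memory):
    target, left, function, right = instruction
    memory[target] = function(value(left, memory), value(right, memory))

def interpret(program, repeat=1):
    """Translate each statement just before executing it, every time it is met."""
    memory, translations = {}, 0
    for _ in range(repeat):
        for statement in program:
            execute(translate(statement), memory)
            translations += 1
    return memory, translations

def compile_and_run(program, repeat=1):
    """Translate the whole program once, then execute the translated form."""
    compiled = [translate(statement) for statement in program]
    memory = {}
    for _ in range(repeat):
        for instruction in compiled:
            execute(instruction, memory)
    return memory, len(compiled)

program = ["x = 2 + 3", "y = x * 4"]
print(interpret(program, repeat=3))          # ({'x': 5.0, 'y': 20.0}, 6)
print(compile_and_run(program, repeat=3))    # ({'x': 5.0, 'y': 20.0}, 2)
```

Both give the same result, but when the statements are executed three times the interpreter translates six times while
the compiler translates only twice. This repeated translation is the reason interpreted programs run more slowly.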
The difference between an interpreter and a compiler may be understood with the help of the following
analogy. Suppose we want to translate a speech from Russian to English. There are two approaches one can use. The
translator can listen to a sentence in Russian and immediately translate it to English. Alternatively, the translator can
listen to the whole passage in Russian and then give the equivalent English passage. If the speaker repeats the same
or a similar sentence, then, in the first case, the equivalent English sentence will also be repeated. In the second case,
the translation will be more concise as the English equivalent of a whole passage will be given and there will be no
repetitions. A person who can translate a whole passage has to be a better translator and must remember more
information than one who translates sentence-by-sentence. An interpreter is similar to sentence-by-sentence
translation whereas a compiler is similar to translation of the whole passage.
High level languages which have the power to express a general class of algorithms are known as procedure
oriented languages. These languages express in detail the procedure used to solve a problem. In other words, a
programmer gives details of how to solve a problem. Another class of high level languages is called problem oriented
languages. These languages are designed to solve a narrower class of problems. A user of such a language need not
express in detail the procedure used to solve the problem. Ready-made procedures are preprogrammed. The user
merely presents the input data to the program in a flexible language. An example is the problem oriented language
STRESS (STRuctural Engineering System Solver), used for structural analysis.
The user need not specify how to solve the problem. He merely has to state what problem is to be solved
using the appropriate language. Recent popular problem oriented languages are MATLAB and MATHEMATICA.
MATLAB is popular among scientists and engineers to solve a wide class of problems modeled by differential
equations and matrices. MATHEMATICA is used to simplify complex algebraic expressions, find expressions
[Figure: Source program (high level language) → Analysis phase → Synthesis phase → Object program (machine language of specified machine)]
Lexical rules specify the valid syntactic elements or words of the language. Syntax rules specify how the valid
syntactic elements are combined to form statements of the language. Semantic rules assign meanings to valid
statements of the language.
For example, consider the following statement in a high level language:
principal := principal * (1 + rate/100);
The syntactic elements of the statement are principal, :=, *, (, 1, +, rate, /, 100, ) and ;.
The syntactic elements principal and rate are called identifiers. The symbols :=, *, + and / are operators. The
numbers 1 and 100 are integer constants, and the symbols (, ) and ; are called delimiters. Each syntactic element is
defined using the syntax rules of the language. The syntax rules are given using a notation called Backus Naur Form,
abbreviated BNF, in honour of Backus and Naur who invented this notation to describe computer languages. Each
syntactic unit is given a name and shown as <name>. For example, the syntactic unit digit is defined as
<digit> → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The arrow → represents "defined as" and the vertical bar | is used to represent "or".
The above definition is thus read as: <digit> is defined as 0 or 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9. We define
letter as
<letter> → a | b | c | ... | x | y | z
In other words, a <letter> is any one of the lower case English letters. These
characters are combined to form a syntactic unit called <identifier>, which is defined as
<identifier> → <letter> | <identifier><letter> | <identifier><digit>
Observe that the above definition is given in terms of itself. This is called
a recursive definition. Using this rule the following are valid identifiers: p, pr, pr2, principal.
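The recursive rule for identifiers can be turned almost word for word into a recursive check. The following Python
sketch is one possible reading of the rule; the function name `is_identifier` is made up for illustration.

```python
import string

LETTERS = set(string.ascii_lowercase)    # <letter> → a | b | ... | z
DIGITS = set(string.digits)              # <digit>  → 0 | 1 | ... | 9

def is_identifier(text):
    """Check text against <identifier> → <letter> | <identifier><letter> | <identifier><digit>."""
    if len(text) == 0:
        return False
    if len(text) == 1:
        return text in LETTERS                       # first alternative: a single letter
    # Second and third alternatives: a shorter identifier followed by a letter or a digit.
    return is_identifier(text[:-1]) and (text[-1] in LETTERS or text[-1] in DIGITS)

for candidate in ["p", "pr", "pr2", "principal", "2pr"]:
    print(candidate, is_identifier(candidate))
```

The first four candidates are accepted; 2pr is rejected because, however the rule is applied, an identifier must begin
with a letter.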
Some other rules are
<a.o> → + | - | * | /    where <a.o> is an arithmetic operator.
<delimiters> → ) | ( | ;
<assignment operator> → :=
Thus, after defining the "words" of the language, we next define the sentences of the language using syntax rules.
<a.e> → <a.e><a.o><a.e>    where <a.e> is an arithmetic expression.
Next we have to assign meanings to syntactically correct units. The steps used in the process of translating a high
level language source program to executable code are given in fig. 9.7. The first block is a lexical analyzer (or
scanner). It reads successive lines of a program and breaks them into individual lexical items, namely
identifiers, operators, delimiters, etc., and attaches a type tag to each of these. Besides this, it constructs a symbol
table entry for each identifier and finds the internal representation of each constant. The symbol table is used later to
allocate memory to each variable.
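A lexical analyzer of this kind can be sketched in Python with a regular expression. The token classes follow the
lexical items named above; the function name `scan`, the class names and the symbol table layout (identifier mapped to
a slot number) are assumptions for illustration.

```python
import re

# One named group per kind of lexical item; 'error' catches any other character.
TOKEN_PATTERN = re.compile(r"""
    (?P<identifier>[a-z][a-z0-9]*)   |
    (?P<constant>[0-9]+)             |
    (?P<operator>:=|[+\-*/])         |
    (?P<delimiter>[();])             |
    (?P<space>\s+)                   |
    (?P<error>.)
""", re.VERBOSE)

def scan(line):
    """Break a line into (type tag, value) pairs and build a symbol table."""
    tokens, symbol_table = [], {}
    for match in TOKEN_PATTERN.finditer(line):
        kind, text = match.lastgroup, match.group()
        if kind == "space":
            continue
        if kind == "error":
            raise ValueError(f"illegal character {text!r}")
        if kind == "identifier":
            # Each new identifier gets the next slot number in the symbol table.
            symbol_table.setdefault(text, len(symbol_table))
        # Constants are converted to their internal (integer) representation.
        tokens.append((kind, int(text) if kind == "constant" else text))
    return tokens, symbol_table

tokens, symbols = scan("principal := principal * (1 + rate/100);")
print(tokens)
print(symbols)    # {'principal': 0, 'rate': 1}
```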
The second stage of translation is called syntax analysis or parsing. In this phase expressions, statements,
declarations, etc. are identified by using the results of lexical analysis. Syntax analysis is aided by techniques
based on the formal grammar of the programming language.
In the semantic analysis phase the syntactic units recognized by the syntax analyzer are processed. An
intermediate representation of the final machine language code is produced. This phase bridges the analysis and
synthesis phases of the translation.
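Parsing and the production of intermediate code can be sketched together in Python. The sketch below is a small
recursive descent parser for one assignment statement. It assumes the usual precedence of * and / over + and -, which
the rule <a.e> → <a.e><a.o><a.e> does not by itself specify, and it emits a simple three-address intermediate code with
temporaries t1, t2, ... All class, method and temporary names are hypothetical.

```python
import re

IDENTIFIER = re.compile(r"[a-z][a-z0-9]*")
OPERAND = re.compile(r"[a-z][a-z0-9]*|[0-9]+")

def tokenize(text):
    # Simplified scanner: characters that match nothing are silently skipped.
    return re.findall(r"[a-z][a-z0-9]*|[0-9]+|:=|[-+*/();]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens, self.position = tokens, 0
        self.code, self.temporaries = [], 0

    def peek(self):
        return self.tokens[self.position] if self.position < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        self.position += 1
        return token

    def emit(self, left, operator, right):
        self.temporaries += 1
        name = f"t{self.temporaries}"
        self.code.append(f"{name} := {left} {operator} {right}")
        return name

    def statement(self):        # statement: identifier := expression ;
        target = self.take()
        if not IDENTIFIER.fullmatch(target):
            raise SyntaxError(f"{target!r} is not an identifier")
        self.take(":=")
        result = self.expression()
        self.take(";")
        self.code.append(f"{target} := {result}")
        return self.code

    def expression(self):       # expression: term { (+ | -) term }
        left = self.term()
        while self.peek() in ("+", "-"):
            left = self.emit(left, self.take(), self.term())
        return left

    def term(self):             # term: factor { (* | /) factor }
        left = self.factor()
        while self.peek() in ("*", "/"):
            left = self.emit(left, self.take(), self.factor())
        return left

    def factor(self):           # factor: identifier | constant | ( expression )
        if self.peek() == "(":
            self.take("(")
            inner = self.expression()
            self.take(")")
            return inner
        token = self.take()
        if not OPERAND.fullmatch(token):
            raise SyntaxError(f"unexpected {token!r}")
        return token

for line in Parser(tokenize("principal := principal * (1 + rate/100);")).statement():
    print(line)
```

For the statement shown, the intermediate code is t1 := rate / 100, t2 := 1 + t1, t3 := principal * t2 and
principal := t3. A code generator would then turn each of these lines into machine instructions.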
[Fig. 9.7: Steps in translating a high level language source program to executable code. Labels: lexical rules, syntax rules, semantic rules; intermediate code; object code from other compilations; executable code]
The last phase of the translation is code generation. A number of optimizations to reduce the length of the
machine language program are carried out during this phase. The output of the code generator is the machine language
program for the specified computer. If a subprogram library is used, or if some subroutines are separately
translated and compiled, a final linking and loading step is needed to produce the complete machine language program
ready for execution. If subroutines are separately compiled, the addresses of the resulting machine language
instructions will not be their final addresses when all the routines are placed together in main memory. The linker's
job is to find the correct locations in the final executable program. The loader will then place them in memory at
their right addresses.
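The address adjustment done during linking can be sketched in Python. The representation used here is an assumption
for illustration: each separately compiled routine is a list of (mnemonic, operand) pairs, an integer operand is an
address relative to the start of its own routine, and a string operand names another routine. Real object file formats
are far more elaborate.

```python
def link(modules):
    """Place separately translated routines one after another and fix their addresses."""
    # First pass: decide where each routine will start in the final program.
    base_address, next_free = {}, 0
    for name, instructions in modules:
        base_address[name] = next_free
        next_free += len(instructions)

    # Second pass: replace routine names and relative addresses by final addresses.
    executable = []
    for name, instructions in modules:
        for mnemonic, operand in instructions:
            if isinstance(operand, str):
                operand = base_address[operand]           # reference to another routine
            elif operand is not None:
                operand = operand + base_address[name]    # address inside this routine
            executable.append((mnemonic, operand))
    return executable

main = ("main", [("CALL", "square"), ("HALT", None)])
square = ("square", [("JUMPZ", 1), ("RETURN", None)])
print(link([main, square]))
# [('CALL', 2), ('HALT', None), ('JUMPZ', 3), ('RETURN', None)]
```

The routine square was compiled as if it started at address 0, so its jump to relative address 1 becomes a jump to
address 3 once main occupies addresses 0 and 1.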
Tools to Build Compilers
As the methodology of analysis is well understood and applicable to a variety of language processors, tools have been
developed to automatically generate programs for scanning the source code to identify the syntactic units, parsing the
programs and generating the intermediate code. Two popular tools of this kind are lex (a lexical analyser generator)
and yacc (yet another compiler compiler).
Types of High Level Language
High-level languages are divided into two classes: procedural languages and non-procedural languages.
In a procedural language the program control is with the programmer, who decides in what sequence the
instructions are to be carried out. In a non-procedural language the program control is left to the
system: the programmer instructs the system to produce certain results and does not specify the small steps
required to achieve them. COBOL is a procedural language. Some non-procedural languages
fall in the fourth generation language (4GL) category.
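The contrast can be sketched by computing the same totals in two ways. SQL is used here as a familiar example of a
non-procedural notation (an assumption of this sketch, since the text does not name it), run through Python's built-in
sqlite3 module; the sales data and function names are made up.

```python
import sqlite3

sales = [("north", 120), ("south", 80), ("north", 45), ("south", 30)]

# Procedural style: the programmer spells out each step and the order of the steps.
def totals_procedural(rows):
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals

# Non-procedural style: the programmer states the result wanted; the system chooses the steps.
def totals_non_procedural(rows):
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    connection.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = dict(connection.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"))
    connection.close()
    return result

print(totals_procedural(sales))         # {'north': 165, 'south': 110}
print(totals_non_procedural(sales) == totals_procedural(sales))    # True
```

In the second function the query says only what is wanted (a sum per region), not how to loop over the rows.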
SOME POPULAR HIGH LEVEL LANGUAGES
Some of the popular high-level languages are briefly introduced in the following section.
BASIC (Beginner's All-purpose Symbolic Instruction Code)
Prof. John G. Kemeny and Thomas E. Kurtz developed the BASIC language in 1964 at Dartmouth College in
the USA. Their purpose was to develop a language that would be very easy to learn and program. A person with
little or no knowledge of computers and programming can learn BASIC programming in a short period of time.
As a BASIC program is entered, its statements are checked for syntax errors, which can be
immediately corrected. This feature makes BASIC one of the most popular languages. Though simple and easy to learn,
it is quite flexible and reasonably powerful. It can be used for both business and scientific applications. Probably
the greatest drawback of this language is that implementations do not follow a single standard. The language varies
significantly from one computer system to another. Thus a BASIC program written on one computer may not work on
another unless modified.
FORTRAN
FORTRAN (FORmula TRANslation) is one of the oldest and most popular high-level languages. It was originally
developed by IBM for its 704 computer in 1957 to solve scientific and engineering problems. As the language was
designed as an algebra-based programming language, any mathematical relationship can be easily expressed as a
FORTRAN instruction. A FORTRAN program consists of a series of statements. These statements supply input/output,
calculation, logic operation and other basic instructions to the computer.
PASCAL
This language is named after the French mathematician Blaise Pascal and was first introduced in 1971 by Prof.
Niklaus Wirth of the Federal Institute of Technology, Switzerland. It is the first language to fully embody in an
organized way the concepts of structured programming. The language is relatively easy to learn and it allows the
programmer to structure the programming problem. The program is designed as modules and a main module,
which controls the program, calls the other modules. This language can be used for both scientific and commercial
applications.
ADA
This language is named after Lady Ada Augusta Lovelace, daughter of Lord Byron and regarded as the first computer
programmer. She was also an associate of Charles Babbage. ADA is a general-purpose language developed in 1980 at the
Honeywell Computer Company by a group of scientists headed by Ichbiah at the request of the Department of
Defense of the US government for military applications. ADA is an extremely complicated language with a very
large number of features and capability to use normal packages. Another feature of ADA is the use of tasks. Tasks
are used to allow concurrent programming, which is very useful for military applications.
C
C is a relatively new language and was designed at Bell Telephone Laboratories, USA. C is fast becoming the most
popular language. Like PASCAL, C is a block structured language and has several features that allow the use of
various concepts of structured programming. A special feature of C is that it allows low-level operations, such as
manipulating bits and memory addresses and asking for variables to be held in processor registers. Thus the language
also enjoys the advantage of having some of the powers of assembly language. Because of this feature, C is now being
extensively used for systems programming, such as the design of compilers and operating systems. Most computer
vendors of today supply this language along with their computer systems.
COBOL
COBOL is a very structured language and it has very powerful data organization and file handling capabilities. The
programmer can define convenient data structures, design input and output formats and perform operations on
these data structures using COBOL statements. COBOL is a programming language developed during the 1960's and
later standardized by ANSI. It is a procedural language and, compared to fourth generation languages (4GLs),
programming in COBOL is rather cumbersome. But its closeness to the English language and its superior and powerful
file handling facility make it most suitable for business data processing.
JAVA
Java is an object oriented programming language. The Java executable code is machine independent and
will run on any system such as Macintosh, x86, Pentium, Silicon Graphics or Sun SPARC. Recently all major operating
system vendors have announced plans to incorporate Java in their operating systems.
Other Languages
There are a large number of other high level languages, some of which are very popular, such as Visual Basic, C++
and JavaScript. Similarly there are a number of programmer productivity tools that are called fourth generation
languages (4GL).
Fourth Generation Languages
These languages are considered to be superior to high-level languages. They are programming packages
with built-in database management facilities that help in defining data, validating data, designing input and
output forms, handling queries, etc. Compared to high-level languages, these languages require much less coding.
ORACLE, INGRES, SYBASE and INFORMIX fall in this category.
Characteristics of a Good Language
QUESTIONS
1. What is computer language?
2. What are the types of computer language?
3. What is machine language? What are the advantages of machine language?
4. What is High Level Language? What are the advantages of high-level language?
5. Briefly explain any two high level languages.
6. What are the characteristics of a high level language?
7. What is procedural language?
8. What are the characteristics of a good programming language?
9. What is fourth generation language?