Survey of Programming Languages
FACULTY OF SCIENCES
Lagos Office
14/16 Ahmadu Bello Way
Victoria Island, Lagos
Departmental email: [email protected]
NOUN e-mail: [email protected]
URL: www.nou.edu.ng
First Printed 2022
ISBN: 978-058-557-5
January 2022
CIT 332 COURSE GUIDE
COURSE GUIDE
Course Team:
Dr. Kehinde Sotonwa - (Developer/Writer)
Dr. Greg Onwodi - Content Editor
CONTENTS
Introduction
Course Aim
Course Objectives
Course Justification
Assessment
Introduction
CIT 332: Survey of Programming Languages is a second-semester course. It is a four-credit
degree course available to 300-level students of Computer Science and related disciplines
studying towards a Bachelor of Science degree.
The course is divided into five (5) modules and 12 study units. It gives an overview of
programming languages and a brief survey of them, and introduces language description,
declarations and types, abstraction mechanisms, and modules in programming languages.
What you will Learn in this Course
The overall aim of this course is to give you a general overview of programming languages:
their evolution, generations, characteristics, the taxonomy of the different levels of
programming languages, and the differences between these levels. The course is designed to
describe the fundamental concepts of programming languages by discussing the design issues of
the various language constructs, examining the design choices for these constructs in some of
the most common languages, and critically comparing design alternatives. It covers the
fundamental syntactic and semantic concepts underlying modern programming languages; different
modern programming languages such as C/C++, Java, LISP and PROLOG; their syntax, bindings and
scope, data types and type checking, functional scheme, assignment expressions, control
structures, program statements and program units; it also analyzes modules in programming
languages and, finally, language comparison.
Course Aim
This course aims to go a step further in teaching you the basic and best approaches to
surveying programming languages. It is hoped that this knowledge will enhance the expertise of
programmers and developers, and familiarize students with popular programming languages and the
advantages each has over the others.
Course Objectives
It is important to note that each unit has specific objectives. Students should study them carefully
before proceeding to subsequent units. Therefore, it may be useful to refer to these objectives in
the course of your study of the unit to assess your progress. You should always look at the unit
objectives after completing a unit. In this way, you can be sure that you have done what is required
of you by the end of the unit.
The general objective of the course, as an integral part of the Bachelor's Degree for Computer
Science students at the National Open University, Abeokuta, is to:
• Demonstrate understanding of the evolution of programming languages and relate how this
history has led to the paradigms available today.
• Identify at least one outstanding and distinguishing characteristic for each of the
programming paradigms covered in this unit.
• Evaluate the tradeoffs between the different paradigms, considering such issues as space
efficiency, time efficiency (of both the computer and the programmer), safety, and power
of expression.
• Describe the importance and power of abstraction in the context of virtual machines.
• Explain the benefits of intermediate languages in the compilation process.
• Evaluate the tradeoffs in reliability vs. writability.
• Compare and contrast compiled and interpreted execution models, outlining the relative
merits of each.
• Describe the phases of program translation from source code to executable code and the
files produced by these phases.
• Explain the differences between machine-dependent and machine-independent translation
and where these differences are evident in the translation process.
• Explain formal methods of describing syntax (Backus-Naur Form, context-free grammars,
and parse trees).
• Describe the meanings of programs (dynamic semantics, weakest precondition).
• Identify and describe the properties of a variable such as its associated address, value,
scope, persistence, and size.
• Explain data types: primitive types, character string types, user-defined ordinal types,
array types, associative arrays, pointer and reference types.
• Demonstrate different forms of binding, visibility, scoping, and lifetime management.
• Demonstrate the difference between overridden and overloaded subprograms
• Explain functional side effects.
• Demonstrate the difference between pass-by-value, pass-by-result, pass-by-value-result,
pass-by-reference, and pass-by-name parameter passing.
• Explain the difference between static binding and dynamic binding.
• Discuss the evolution, history, program structure and features of some commonly used
programming language paradigms, as exemplified by C/C++, Java and PROLOG.
• Examine, evaluate and compare these languages.
To complete this course, you are required to study all the units, the recommended text books, and
other relevant materials. Each unit contains some self-assessment exercises and tutor-marked
assignments, and at some points in this course you are required to submit the tutor-marked
assignments. There is also a final examination at the end of this course. Stated below are the
components of this course and what you have to do.
Course Justification
Any serious study of programming languages requires an examination of some related topics,
among which are formal methods of describing the syntax and semantics of programming languages
and their implementation techniques. The need to use programming languages to solve our
day-to-day problems grows every year. Students should become familiar with popular programming
languages and the advantages they have over each other, and should know which programming
language solves a particular problem best. The theoretical and practical knowledge acquired
from this course will give students a foundation from which they can appreciate the relevance
and interrelationships of different programming languages.
Course Materials
1. Course Guide
2. Study Units
3. Text Books
4. Assignment Files
5. Presentation Schedule
Course Requirements
This is a compulsory course for all computer science students in the University. In view of this,
students are expected to participate in all the course activities and have minimum of 75%
attendance to be able to write the final examination.
Study Units
There are 5 modules and 12 study units in this course.
Textbooks and References

Bjørner, Dines and Cliff B. Jones (1978). The Vienna Development Method: The Meta-Language.
Lecture Notes in Computer Science 61. Berlin, Heidelberg, New York: Springer.
Gabbrielli, Maurizio and Simone Martini (2010). Programming Languages: Principles and
Paradigms. Springer-Verlag London.
Tucker, Allen B. and Robert Noonan (2006). Programming Languages: Principles and Paradigms.
McGraw-Hill Higher Education.
Harle, Robert (2009). Object Oriented Programming. IA NST CS and CST, Lent 2009/10.
Aho, A. V., J. E. Hopcroft, and J. D. Ullman (2007). The Design and Analysis of Computer
Algorithms. Boston: Addison-Wesley.
Kasabov, N. K. (1998). Foundations of Neural Networks, Fuzzy Systems and Knowledge
Engineering. Cambridge: The MIT Press.
Bramer, M. (2005). Logic Programming with Prolog. Springer.
Prolog Development Center (2001). Visual Prolog Version 5.x: Language Tutorial. Copenhagen,
Denmark.
Bergin, Thomas J. and Richard G. Gibson, eds. (1996). History of Programming Languages II.
New York: ACM Press.
Christiansen, Tom and Nathan Torkington (1997-1999). Perlfaq1 Unix Manpage. Perl 5 Porters.
Wexelblat, Richard L., ed. (1981). History of Programming Languages. New York: Academic Press.
Roberts, Eric S. (1995). The Art and Science of C. Reading: Addison-Wesley.
Tucker, A. Programming Languages: Principles and Paradigms. Tata McGraw-Hill.
Horowitz, E. Programming Languages, 2nd Edition. Addison-Wesley.
Questions on Principles of Programming Language (oureducation.in),
https://blog/oureducation.in/wp-content/uploads/2013/Questions-on-principle-of-programming-language.pdf
Pyster, A. (1988). Compiler Design and Construction. New York, NY: Van Nostrand Reinhold.
Tremblay, J. and P. Sorenson (1985). The Theory and Practice of Compiler Writing. New York,
NY: McGraw-Hill.
Sewell, Peter (2009). Semantics of Programming Languages. Computer Science Tripos, Part 1B,
Computer Laboratory, University of Cambridge.
Hennessy, M. (1990). The Semantics of Programming Languages. Wiley. Out of print, but
available on the web at http://www.cogs.susx.ac.uk/users/matthewh/semnotes.ps.gz
Morris, C. W. (1938). Foundations of the Theory of Signs. In Writings on the Theory of Signs,
pages 17-74. The Hague: Mouton.
Gabriel, Richard P. and Guy L. Steele. The Evolution of Lisp. ACM History of Programming
Languages (FTP).
Root, Dan. A Brief History of the Python Programming Language. Python in Plain English.
Assignment File
The assignment file will be given to you in due course. In this file, you will find all the
details of the work you must submit to your tutor for marking. The marks you obtain for these
assignments will count towards the final mark for the course. Altogether, there are 12
tutor-marked assignments for this course.
Presentation Schedule
The presentation schedule included in this course guide provides you with important dates for
the completion of each tutor-marked assignment. You should therefore endeavor to meet the
deadlines.
Assessment
There are two aspects to the assessment of this course: first, the tutor-marked assignments;
and second, the written examination. You are therefore expected to take note of the facts,
information and problem-solving skills gathered during the course. The tutor-marked
assignments must be submitted to your tutor for formal assessment in accordance with the given
deadlines. The work you submit will count for 40% of your total course mark. At the end of the
course, you will need to sit a final written examination, which will account for 60% of your
total score. There are 12 TMAs in this course and you need to submit all of them; the best 5
will be counted, and their total will make up the 40% of your total course mark.
Assignment questions for the units in this course are contained in the Assignment File. You should
be able to complete your assignments from the information and materials contained in your set
textbooks, reading and study units. However, you may wish to use other references to broaden
your viewpoint and provide a deeper understanding of the subject.
When you have completed each assignment, send it to your tutor as soon as possible and make
certain that it gets to your tutor on or before the stipulated deadline. If for any reason you cannot
complete your assignment on time, contact your tutor before the assignment is due to discuss
the possibility of an extension. Extensions will not be granted after the deadline except in
extraordinary cases.
The final examination for the course will carry 60% of the total marks available for this
course. The examination will cover every aspect of the course, so you are advised to revise
all your corrected assignments before the examination.
This course endows you with the status of both a teacher and a learner: you teach yourself,
and you learn as your learning capabilities allow. It also means that you are in a better
position to determine and ascertain the what, the how, and the when of your learning. No
teacher imposes any method of learning on you.
The course units are similarly designed, with the introduction following the contents, then a
set of objectives, then the dialogue, and so on. The objectives guide you as you go through
the units to ascertain your knowledge of the required terms and expressions.
Assessment           Marks
Assignments 1-12     12 assignments; the best 5 count, at 8% each = 40%
Final Examination    60% of overall course marks
Total                100% of course marks
Course Overview
This table indicates the units, the number of weeks required to complete them and the assignments.
In distance learning the study units replace the university lecturer. This is one of the great
advantages of distance learning; you can read and work through specially designed study materials
at your own pace, and at a time and place that suit you best. Think of it as reading the lecture
instead of listening to a lecturer. In the same way that a lecturer might set you some reading to do,
the study units tell you when to read your set books or other material. Just as a lecturer might give
you an in-class exercise, your study units provide exercises for you to do at appropriate points.
Each of the study units follows a common format. The first item is an introduction to the subject
matter of the unit and how a particular unit is integrated with the other units and the course as a
whole. Next is a set of learning objectives. These objectives enable you to know what you should be
able to do by the time you have completed the unit. You should use these objectives to guide your
study. When you have finished the units you must go back and check whether you have achieved
the objectives. If you make a habit of doing this, you will significantly improve your chances of
passing the course.
Remember that your tutor’s job is to assist you. When you need help, don’t hesitate to call and ask
your tutor to provide it.
2. Organize a study schedule. Refer to the 'Course Overview' for more details. Note the time
you are expected to spend on each unit and how the assignments relate to the units.
Whatever method you choose to use, decide on it and write in your own dates for working
on each unit.
3. Once you have created your own study schedule, do everything you can to stick to it. The
major reason that students fail is that they lag behind in their course work.
4. Turn to Unit 1 and read the introduction and the objectives for the unit.
5. Assemble the study materials. Information about what you need for a unit is given in the
Overview at the beginning of each unit. You will almost always need both the study unit
you are working on and one of your set books on your desk at the same time.
6. Work through the unit. The content of the unit itself has been arranged to provide a
sequence for you to follow. As you work through the unit you will be instructed to read
sections from your set books or other articles. Use the unit to guide your reading.
7. Review the objectives for each study unit to confirm that you have achieved them. If you
feel unsure about any of the objectives, review the study material or consult your tutor.
8. When you are confident that you have achieved a unit’s objectives, you can then start on
the next unit. Proceed unit by unit through the course and try to pace your study so that you
keep yourself on schedule.
9. When you have submitted an assignment to your tutor for marking, do not wait for its return
before starting on the next unit. Keep to your schedule. When the assignment is returned,
pay particular attention to your tutor’s comments, both on the tutor-marked assignment
form and also written on the assignment. Consult your tutor as soon as possible if you have
any questions or problems.
10. After completing the last unit, review the course and prepare yourself for the final
examination. Check that you have achieved the unit objectives (listed at the beginning of
each unit) and the course objectives (listed in this Course Guide).
Contact your tutor if:
• You do not understand any part of the study units or the assigned readings.
• You have a question or problem with an assignment, with your tutor’s comments on an
assignment or with the grading of an assignment.
You should try your best to attend the tutorials. This is the only chance to have face-to-face
contact with your tutor and to ask questions which are answered instantly. You can raise any
problem encountered in the course of your study. To gain the maximum benefit from course
tutorials, prepare a question list before attending them. You will learn a lot from
participating actively in discussions.
CIT 332 SURVEY OF PROGRAMMING LANGUAGES
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 What is a Programming Language?
3.2 Classification of Programming Languages
3.2.1 First Generation
3.2.2 Second Generation
3.2.3 Third Generation
3.2.4 Fourth Generation
3.2.5 Fifth Generation
3.3 Characteristics of Each Generation
3.3.1 Computer Characteristics and Capabilities for 1GL
3.3.2 Computer Characteristics and Capabilities for 2GL
3.3.3 Computer Characteristics and Capabilities for 3GL
3.3.4 Computer Characteristics and Capabilities for 4GL
3.3.5 Computer Characteristics and Capabilities for 5GL
3.4 Taxonomy of Programming Language
3.4.1 Low Level Language
3.4.2 Middle Level Language
3.4.3 High Level Language
3.4.4 Higher Level Language
3.4.5 Translation used by Programming Languages
3.4.6 Some General Comments on High Level Languages
3.4.7 Difference between Low Level Language and High Level Language
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
Ever since the invention of Charles Babbage’s difference engine in 1822, computers have required
a means of instructing them to perform a specific task. This means is known as a programming
language. Computer languages were first composed of a series of steps to wire a particular
program; these morphed into a series of steps keyed into the computer and then executed; later
these languages acquired advanced features such as logical branching and object orientation. The
computer languages of the last fifty years have come in two stages, the first major languages and
the second major languages, which are in use today.
Most computer programming languages were inspired by or built upon concepts from previous
computer programming languages. Today, while older languages still serve as a strong foundation
for new ones, newer computer programming languages make programmers’ work simpler.
Businesses rely heavily on programs to meet all of their data, transaction, and customer service
needs. Science and medicine need accurate and complex programs for their research. Mobile
applications must be updated to meet consumer demands. And all of these new and growing needs
ensure that computer programming languages, both old and new, will remain an important part of
modern life.
2.0 OBJECTIVES
Before getting into computer programming, let us consider what computer programs are and what
they do. A computer program is a sequence of instructions, written using a computer
programming language, to perform a specified task on the computer. The two important terms
used in this definition are:
• Sequence of instructions
• Computer Programming Language
To understand these terms, consider a situation in which someone asks you for directions to a
particular place, say the nearest hospital. You would direct the person using a human
language, with something like the following:
First go straight; after half a kilometer, take a left at the red light; then drive around one
kilometer, and you will find the hospital or clinic on the right. Here, you have used the
English language to give several steps to be taken to reach the hospital. If they are followed
in the following sequence, you will reach the hospital:
Step 1: Go straight.
Step 2: Drive half a kilometer.
Step 3: Take a left.
Now, try to map this situation onto a computer program. The above sequence of instructions is
actually a human program, written in the English language, which instructs on how to reach a
nearby hospital from a given starting point. The same sequence could have been given in
Spanish, Hindi, Arabic, or any other human language, provided the person seeking directions
knows that language.
A simple computer program might, for example, instruct the computer to print "Hello, World!"
on the screen. A computer program is also called computer software, and can range from two
lines to millions of lines of instructions.
• Computer program instructions are also called program source code and computer
programming is also called program coding.
• A computer without a computer program is just a dumb box; it is programs that make
computers active.
Computer programming is the process that professionals use to write code that instructs how a
computer, application or software program performs. At its most basic, computer programming is
a set of instructions to facilitate specific actions. A programming language is a
computer language that is used by programmers (developers) to communicate with computers. It
is a set of instructions written in any specific language to perform a specific task. It is made up of
a series of symbols that serve as a bridge, allowing humans to translate our thoughts into
instructions computers can understand. Humans and machines process information differently,
and programming languages are the key to bridging the gap between people and computers; this
is why the first computer programming language was created.
Generations of computers have seen changes based on evolving technologies. With each new
generation, computer circuitry, size, and parts have been miniaturized, the processing and speed
doubled, memory got larger, and usability and reliability improved. Note that the timeline specified
for each generation is tentative and not definite. The generations are actually based on evolving
chip technology rather than any particular time frame.
The five generations of computers are characterized by the core technology of their processing
mechanisms:
• Vacuum tubes
• Transistors
• Integrated circuits
• Microprocessors
• Artificial intelligence
Assembly languages: the obvious drawbacks of binary programming were reduced by the
introduction of second-generation languages (2GLs). These languages allowed mnemonic
abbreviations as symbolic names, and the concepts of commands and operands were introduced. A
programmer's work became much easier, since symbolic notation and addressing of instructions
and data were now possible. Compilation systems, called assemblers, were developed to
translate assembly-language (symbolic) programs into machine code. Assembly languages still
reflect the hardware structure of the target machine, not at the flip-flop level but at the
register level; that is, the abstraction has moved from flip-flops to registers. The
instruction set of the target computer directly determines the scope of an assembly language.
With the introduction of linking mechanisms, separate assembly of code became possible and the
first steps towards program structuring were recognizable, although the term structured
programming cannot be applied to assembly code. The major disadvantage of assembly languages
is that programs are still machine dependent and, in general, only readable by their authors.
Third-generation languages / high-level languages / problem-oriented languages: these
languages provide control structures that operate on logical data objects: variables of a
specific type. They provide a level of abstraction allowing machine-independent specification
of data, functions or processes, and their control. A programmer can now focus on the problem
to be solved without being concerned with the internal hardware structure of the target
computer. Among high-level programming languages, there are four groups: imperative,
declarative, object-oriented and functional languages.
Fourth-generation languages / non-procedural languages deal with the following two fields,
which have become more and more important: database and query languages, and program or
application generators. The steadily increasing use of software packages such as database
systems, spreadsheets, statistical packages, and other (special-purpose) packages makes it
necessary to have a
medium of control available which can easily be used by non-specialists. In fourth-generation
languages the user describes what is to be solved, instead of how to solve it, as is done in
procedural languages. In general, fourth-generation languages are not only languages but
interactive programming environments. An example is SQL (Structured Query Language): a query
language for relational databases based on Codd's requirements for non-procedural query
languages. Another example is NATURAL, which emphasizes a structured programming style.
Program or application generators are often based on a certain specification method and
produce an output (e.g. a high-level program) corresponding to a given specification. A great
number of fourth-generation languages already exist.
Fifth-generation languages (5GLs) are based on solving problems using constraints given to the
program rather than an algorithm written by a programmer. They rest on the concept of
artificial intelligence: rather than solving a problem algorithmically, an application can be
built to solve it based on some constraints; that is, we make computers learn to solve
problems. 5GLs allow computers to work out their own inferences using the programmed
information in large databases, and they gave birth to the dream of robots with AI and fuzzy
logic. Parallel processing and superconductors are used with this type of language in the
pursuit of real artificial intelligence. The advantages of this generation are that machines
can make decisions, programmer effort to solve a problem is reduced, and these languages are
much easier to learn and use than 3GLs or 4GLs. Examples are PROLOG, LISP, etc.
• Size – Smaller than second-generation computers; minicomputer-sized.
• Speed – Relatively fast compared to the second generation; millions of instructions per
second (MIPS).
• Cost – Lower than the second generation.
• Language – High-level languages like PASCAL, COBOL, BASIC, C, etc.
• Reliability – Circuit failures measured in weeks.
• Power – Low power consumption.
• Main Component – Large-scale integrated (LSI) semiconductor circuits, called
MICROPROCESSORS or chips, and VLSI (Very Large-Scale Integration) circuits.
• Main Memory – Semiconductor memory like RAM, ROM and cache memory is used as
a primary memory.
• Secondary Memory – Magnetic disk, Floppy disk, and Optical disk (CD, DVD).
• Input Media – keyboard.
• Output Media – Video displays, Audio responses and printed reports.
• Example – CRAY 2, IBM 3090/600 Series, IBM AS/400/B60 etc.
• Main Component – Based on ULSI (Ultra Large-Scale Integration) circuits, which also enable
the parallel processing method.
• Memory – Optical disk and magnetic disk.
• Input Media – Speech input, Tactile input.
• Output Media – Graphics displays, Voice responses.
• Example – Laptops, Palmtops, Notebooks, PDAs (Personal Digital Assistants), etc.
To date, thousands of programming languages have come into existence, each with its own
specific purpose. These languages vary in the level of abstraction they provide from the
hardware: a few provide little or no abstraction at all, while others provide very high
abstraction.
On the basis of their level of abstraction, there are two broad types of programming
languages: low-level languages and high-level languages. The first two generations are called
low-level languages; the next three generations are called middle-level, very high-level and
higher-level languages. Both types provide a set of instructions to a system for performing
certain tasks, but they differ in various ways.
The primary difference between low- and high-level languages is that any programmer can
understand, compile, and interpret a high-level language far more easily than a machine can. The
machines, on the other hand, are capable of understanding low-level languages more easily than
human beings can. Below is the hierarchy of programming language levels.
Low-level languages are considered to be closer to computers. In other words, their prime function
is to operate, manage and manipulate the computing hardware and components. Programs and
applications written in a low-level language are directly executable on the computing hardware
without any interpretation or translation.
Machine language is a hardware-dependent language. Each processor has its own instruction set,
and these instructions are patterns of bits. A class of processors may use the same structure,
which is specified as an instruction set. Any instruction can be divided into two parts: the
operator (opcode) and the operand. The starting bits are known as the operator or opcode,
whose role is to identify the kind of operation to be performed; the remaining bits are called
the operand, whose purpose is to indicate the location the operation acts on. Programs are
written in various programming languages like C, C++, Java, Python, etc. The computer is not capable of
understanding these programming languages directly; therefore, programs are compiled by a
compiler that converts them into machine language. Below is a sample of how machine language
is executed.
• Fast processing: All the computer instructions are directly recognized. Hence, programs
run very quickly.
• No need of Translator: The machine code is the natural language directly understood by
the computer. Therefore, translators are not required. Assembly language on the other hand
requires assemblers for conversion from mnemonics to equivalent machine code.
• Error prone: The instructions are written using 0's and 1's, which makes writing them a
herculean task; hence there are more chances of errors in these languages. Errors are reduced
in assembly language compared to machine language.
• Difficult to use: The binary digits (0 and 1) are used to represent all data and
instructions, so it is a tough task for human beings to memorize all the machine codes.
Assembly language is easier to use when compared to machine language.
• Difficulty in debugging: When there is a mistake within the logic of the program then it
is difficult to find out the error (bug) and debug the machine language program.
• Difficult to understand: It is very difficult to understand because it requires a substantial
knowledge of machine code for different system architectures.
• Lack of portability: They are computer dependent such that a program which is written
for one computer cannot be run on different computer because every computer has unique
architecture.
• In-depth knowledge of computer architecture: To program in assembly language, the programmer needs to understand the computer architecture, just as for machine language.
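The opcode/operand split described at the start of this unit can be sketched in Python (one of the high-level languages surveyed later in this course). The 16-bit instruction width, the 4/12-bit split and the "LOAD" mnemonic are invented purely for illustration; real machines use a variety of formats.

```python
# A sketch of the opcode/operand idea: a 16-bit machine instruction whose
# top 4 bits are the opcode (the kind of operation) and whose remaining
# 12 bits are the operand (the location the operation acts on).
# This instruction format is invented for illustration only.
instruction = 0b0001_0000_0010_1010  # hypothetical "LOAD address 42"

opcode = instruction >> 12      # top 4 bits: which operation to perform
operand = instruction & 0x0FFF  # low 12 bits: where the operation acts

print(bin(instruction), "-> opcode:", opcode, "operand:", operand)
```

Decoding the bit fields like this is exactly what a processor's instruction decoder does in hardware.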
The middle-level language is an output of any programming language, known as source code. The source code is written in a high-level language. This kind of middle-level language is designed to improve the translated code before the processor executes it. The improvement schedule helps to adjust the source code according to the computational framework of the target machine. The CPU does not directly execute the source code of the middle-level language; it needs interpretation into binary code for execution. See the diagram of the overlapping of middle-level language below:
A high-level language is a programming language that allows a programmer to write programs that are independent of a particular type of computer. High-level languages are considered high-level because they are closer to human languages than machine-level language. When writing a program in a high-level language, the programmer's whole attention can be paid to the logic of the problem. A compiler is required to translate a high-level language into a low-level language, as seen in the diagram below.
A program in a high-level language must be translated before execution. High-level languages deal with variables, arrays, objects, complex arithmetic or Boolean expressions, subroutines and functions, loops, threads, locks, etc. High-level languages are close to human languages and far from machine language; the machine is not able to understand them directly.
#include <stdio.h>
int main()
{
    printf("hello");
    return 0;
}
This is an example of the C language, which is a middle-level language because it has features of both low- and high-level languages. A human can understand this example easily, but the machine cannot understand it without a translator. Every high-level language uses a different type of syntax. Some languages are designed for writing desktop software programs, and other languages are used for web development. The diagram below shows how a high-level language is translated to a low-level language for execution.
• Third-generation language: High-level languages are 3rd-generation languages that are major refinements of the 2nd-generation languages.
• Understandability: Programs written in a high-level language are easily understood.
• Debugging: Errors can be easily located in these languages because of the presence of standard error-handling features.
• Portability: Programs written on one computer architecture can run on another. This also means that they are machine independent.
• Easy to use: Since they are written in English-like statements, they are easy to use.
• Problem-oriented languages: High-level languages are problem oriented, such that they can be used for specific problems to be solved.
• Easy maintenance: Programs written in these languages are easy to maintain.
• Translator required: There is always a need to translate a high-level language program to machine language.
These are fifth-generation languages (5GLs). Fifth-generation systems (5GSs) are characterized by large-scale parallel processing (many instructions being executed simultaneously), different memory organizations, and novel hardware operations designed predominantly for symbol manipulation. A popular example of a 5GL is Prolog. These types of languages require the design of an interface between human beings and the computer to permit effective use of natural language and images.
Programming languages are artificial languages, and they use translators for the computer system to convert them into usable forms. There are three levels of programming languages: Machine Language (ML), Low-Level Language (also known as Assembly Language) and High-Level Language. The translators used by these languages differ and have been classified into: compiler, interpreter or assembler. Machine language does not use a translator.
Low-level language uses a translator called an assembler. An assembler converts a program written in low-level language (assembly language) to machine code. High-level language uses a compiler, an interpreter or both. Our emphasis in this course is on some selected high-level languages. Some examples of HLLs include C, C++, Delphi, Pascal, FORTRAN, Scala, Python, Perl, QBASIC and so on. The languages that will be considered mostly are: C, Java, Python and VBScript.
All these languages are considered high-level languages because they must be processed with the help of a compiler or interpreter before the code executes. Source code written in scripting languages like Perl and PHP can be run by an interpreter. These translators convert the high-level code into binary code so that the machine can understand it, as seen in the diagram below.
The compiler is translator software: it translates a high-level language program into its equivalent machine language program. The compiler compiles a set of machine language instructions for every program in a high-level language. Below is a sample of compilation from high-level language to machine language.
A linker is used for large programs in which we create separate modules for different tasks; when we call a module, the whole job is to link to that module, and the program is processed. We can use a linker for huge software instead of storing all the lines of program code in a single source file. The interpreter is a high-level language translator. It takes one statement of the high-level language program at a time and translates it into a machine-level language instruction, as seen in the diagram below. The interpreter immediately executes the resulting machine language instruction. The compiler translates the entire source program into an object program, but the interpreter translates line by line.
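The compile-then-execute idea can be glimpsed in Python (covered later in this course) using the built-in compile() function and the standard dis module. Note that CPython produces bytecode for a virtual machine rather than real machine code, so this is a sketch of the idea, not a true compiler to machine language:

```python
import dis

# A high-level statement is first translated (compiled) into lower-level
# instructions; dis shows the bytecode the CPython compiler produced.
code = compile("x = 2 + 3", "<example>", "exec")
dis.dis(code)  # prints the lower-level instruction listing

# The translated form, not the source text, is what actually executes.
namespace = {}
exec(code, namespace)
print(namespace["x"])  # prints 5
```

Running this shows the two phases the unit describes: translation of the whole statement first, then execution of the translated instructions.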
A high-level language is easy to read, write, and maintain, as it is written in English-like words. These languages are designed to overcome the main limitation of low-level languages, i.e., their lack of portability. High-level languages are portable; i.e., they are machine-independent.
• Speed of execution: High-level languages take more time for execution as compared to low-level languages because they require a translation program. The translation speed of low-level languages is very high.
• Abstraction: High-level languages allow a higher abstraction. Low-level languages allow very little abstraction or no abstraction at all.
• Need of hardware knowledge: One does not require knowledge of hardware for writing high-level programs. Having knowledge of hardware is a prerequisite to writing low-level programs.
• Facilities provided: High-level languages do not provide various facilities at the hardware level. Low-level languages are very close to the hardware; they help in writing various programs at the hardware level.
• Ease of modification: The process of modifying programs is very difficult with high-level programs, because every single statement in them may execute a bunch of instructions. The process of modifying programs is very easy in low-level programs, where statements map directly to processor instructions.
• Examples: Some examples of high-level languages include Perl, BASIC, COBOL, Pascal, Ruby, etc. Some examples of low-level languages include machine language and assembly language.
5.0 CONCLUSION
The study of programming languages is valuable for some important reasons: it gives insight into the generations of programming languages, provides background on the computer as a whole, and shows the components attached to each generation. In terms of speed, programs written in low-level languages are faster than those written in middle- and high-level languages, because such programs do not need to be interpreted or compiled; they interact directly with the registers and memory. On the other hand, programs written in a high-level language are relatively slower. All these were discussed in this unit.
6.0 SUMMARY
This unit has explained what a programming language is, the classification and explanation of the different programming language generations, and the basic components of each generation. You also saw the different characteristics of each generation in terms of computer characteristics, capabilities, and the trends and developments in computer hardware across the generations.
There are clear differences between high-level, mid-level, and low-level programming languages. We can also point out that each type of programming language is designed to serve its specific purpose. For this reason, the purpose each programming language serves is what makes one type of programming language better and preferred over another.
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Language paradigm
3.1.1 Procedural languages
3.1.2 Object Oriented languages
3.1.3 Functional languages
3.1.4 Declarative-Non Algorithmic languages
3.1.5 Scripting languages
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
When programs are developed to solve real-life problems like inventory management, payroll
processing, student admissions, examination result processing, etc., they tend to be huge and
complex. The approach to analyzing such complex problems, planning for software development
and controlling the development process is called programming methodology.
New software development methodologies (e.g. Object-Oriented Software Development) led to new paradigms in programming and, by extension, to new programming languages. A programming paradigm is a pattern of problem-solving thought that underlies a particular genre of programs and languages; it is the concept to which the methodology of a programming language adheres.
2.0 OBJECTIVES
A paradigm is a model or world view. Paradigms are important because they define a programming language and how it works. A great way to think about a paradigm is as the set of ideas that a programming language uses to perform tasks, in terms of machine code, at a much higher level.
These different approaches can be better in some cases, and worse in others. A great rule of thumb
when exploring paradigms is to understand what they are good at. While it is true that most modern
programming languages are general-purpose and can do just about anything, it might be more
difficult to develop a game, for example, in a functional language than an object-oriented language.
Many people classify languages into these main paradigms:
These languages are mostly influenced by the von Neumann computer architecture. The problem is broken down into procedures, or blocks of code that each perform one task. All procedures taken together form the whole program. This approach is suitable only for small programs that have a low level of complexity. Typical elements of such languages are assignment statements, data structures and type bindings, as well as control mechanisms; active procedures manipulate passive data objects. Example: for a calculator program that does addition, subtraction, multiplication, division, square root and comparison, each of these operations can be developed as a separate procedure. The key concepts of imperative programming languages are:
• Variables
• Commands
• Procedures
• Data abstraction.
In the main program, each procedure would be invoked on the basis of the user's choice. E.g. FORTRAN, ALGOL, Pascal, C/C++, C#, Java, Perl, JavaScript, Visual BASIC.NET.
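The calculator example above can be sketched in Python (one of this course's languages). The procedure names and the dispatch on the user's choice are illustrative, not a prescribed design:

```python
import math

# Each operation is developed as a separate procedure.
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def multiply(a, b):
    return a * b

def divide(a, b):
    return a / b

def square_root(a):
    return math.sqrt(a)

def compare(a, b):
    return a == b

# The main program invokes a procedure on the basis of the user's choice.
def calculate(choice, a, b=None):
    operations = {
        "add": lambda: add(a, b),
        "subtract": lambda: subtract(a, b),
        "multiply": lambda: multiply(a, b),
        "divide": lambda: divide(a, b),
        "sqrt": lambda: square_root(a),
        "compare": lambda: compare(a, b),
    }
    return operations[choice]()

print(calculate("add", 2, 3))  # 5
print(calculate("sqrt", 16))   # 4.0
```

All the procedures taken together form the whole program, exactly as the paradigm prescribes.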
Object-oriented languages are built around objects that represent data as well as the procedures that manipulate it. Data structures and their appropriate manipulation processes are packed together to form a syntactical unit. Here the solution revolves around entities or objects that are part of the problem. The solution deals with how to store data related to the entities, how the entities behave and how they interact with each other to give a cohesive solution. Example − if we have to develop a payroll management system, we will have entities like employees, salary structure, leave rules, etc., around which the solution must be built. The key concepts of OOP are:
• Objects
• Classes
• Subclasses
• Inheritance
• Inclusion polymorphism.
Object-oriented programming has been widely accepted in the computing world because objects give a very natural way to model both real-world and cyber-world entities. Classes and class hierarchies lead to highly suitable as well as reusable units for constructing large programs. Object-oriented programming fits well with object-oriented analysis and design and hence supports the development of large software systems. E.g. SIMULA 67, Smalltalk, C++, Java, Python, C#, Perl, Lisp or EIFFEL.
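A minimal Python sketch of the payroll example above, showing objects, classes, a subclass, inheritance and inclusion polymorphism. The entity names and the allowance rates are invented for illustration:

```python
# Entities become classes; data and behaviour are packed together.
# The 20% and 50% allowance rates are invented for illustration.
class Employee:
    def __init__(self, name, basic_salary):
        self.name = name
        self.basic_salary = basic_salary

    def gross_salary(self):
        # Behaviour lives with the data it manipulates.
        return self.basic_salary * 1.2

class Manager(Employee):
    # Subclass and inheritance; gross_salary is overridden, so the call
    # person.gross_salary() is resolved polymorphically at run time.
    def gross_salary(self):
        return self.basic_salary * 1.5

staff = [Employee("Ada", 100_000), Manager("Bayo", 100_000)]
for person in staff:
    print(person.name, person.gross_salary())
```

The loop at the end treats every staff member uniformly while each object supplies its own behaviour, which is the inclusion polymorphism listed among the key concepts.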
These types of languages have no assignment statements. Their syntax is closely related to the formulation of mathematical functions; thus, functions are central to functional programming languages. Here the problem, or the desired solution, is broken down into functional units. Each unit performs its own task and is self-sufficient. These units are then stitched together to form the complete solution. Example − payroll processing can have functional units like employee data maintenance, basic salary calculation, gross salary calculation, leave processing, loan repayment processing, etc. The key concepts of functional programming are:
• Functions are naturally polymorphic, and parametric polymorphism greatly magnifies the power and expressiveness of a functional language.
• Data abstraction is a key concept in the more modern functional languages such as ML and HASKELL. Data abstraction supports separation of important issues, which is essential for the design and implementation of large programs.
• Lazy evaluation is based on the simple notion that an expression whose value is never used need never be evaluated.
E.g. LISP, Scala, Haskell, Python, Clojure, Erlang. Functional languages may also include OO (object-oriented) concepts.
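The functional style above can be sketched in Python, which supports functional features. The payroll figures, record layout and the 20% allowance rate are invented for illustration:

```python
from functools import reduce

# Pure functional units: each function's output depends only on its
# input; no assignment statements mutate shared state.
def basic_salary(employee):
    return employee["basic"]

def gross(salary):
    return salary * 1.2  # assumed allowance rate

employees = [{"name": "Ada", "basic": 100_000},
             {"name": "Bayo", "basic": 80_000}]

# The units are stitched together with map and reduce.
total_gross = reduce(lambda acc, s: acc + s,
                     map(gross, map(basic_salary, employees)), 0)
print(total_gross)

# Lazy evaluation: a generator expression computes a value only when it
# is actually needed.
lazy = (gross(basic_salary(e)) for e in employees)
first = next(lazy)  # only the first employee's gross is computed here
print(first)
```

The generator at the end illustrates the lazy-evaluation bullet: the second employee's gross salary is never computed unless someone asks for it.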
Facts and rules (the logic) are used to represent information (or knowledge), and a logical inference process is used to produce results. In contrast, control structures are not explicitly defined in a program; they are part of the programming language (the inference mechanism). Here the problem is broken down into logical units rather than functional units. Example: in a school management system, users have well-defined roles like class teacher, subject teacher, lab assistant, coordinator, academic in-charge, etc., so the software can be divided into units depending on user roles, and each user can have a different interface, permissions, etc. E.g. PROLOG and Perl; this may also include OO concepts. The key concepts of logic programming are therefore:
• Assertions
• Horn clauses
• Relations
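The facts-and-rules idea can be sketched even in Python: Horn-clause-style rules and a small forward-chaining inference loop derive new facts, rather than the program following explicit control flow. The school-role facts and rule set are invented for illustration:

```python
# Assertions (facts) are stated as relations over named individuals.
facts = {("class_teacher", "musa"), ("lab_assistant", "ngozi")}

# Each rule is a Horn-clause-style pair (premise, head): if the premise
# holds among the facts, conclude the head.
rules = [
    (("class_teacher", "musa"), ("staff", "musa")),
    (("lab_assistant", "ngozi"), ("staff", "ngozi")),
    (("staff", "musa"), ("can_login", "musa")),
]

# The inference mechanism, not the programmer, supplies the control flow:
# keep applying rules until no new fact can be derived.
changed = True
while changed:
    changed = False
    for premise, head in rules:
        if premise in facts and head not in facts:
            facts.add(head)
            changed = True

print(("can_login", "musa") in facts)  # inferred via two rules, not stated
```

A real logic language such as Prolog generalises this with variables and unification; the sketch only shows the flavour of deriving results from facts and rules.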
A scripting language or script language is a programming language for a runtime system that
automates the execution of tasks that would otherwise be performed individually by a human
operator. Scripting languages are usually interpreted at runtime rather than compiled. Scripting
languages are a popular family of programming languages that allow frequent tasks to be
performed quickly. Early scripting languages were generally used for niche applications – and as
“glue languages” for combining existing systems. With the rise of the World Wide Web, a range
of scripting languages emerged for use on web servers. Since scripting languages simplify the
processing of text, they are ideally suited to the dynamic generation of HTML pages.
Users can learn to code in scripting languages quickly; not much knowledge of web technology is required. A scripting language is highly efficient, with a limited number of data structures and variables to use, and it helps in adding visualization interfaces and combinations to web pages. There are different libraries that are part of different scripting languages. They help in creating new applications in web browsers and are different from normal programming languages. Examples are Node.js, JavaScript, Ruby, Python, Perl, Bash, PHP, etc. The key concepts include:
High-level string processing: All scripting languages provide very high-level support for string processing. The ubiquitous nature of textual data such as e-mail messages, database queries and results, and HTML documents necessitated high-level string processing.
High-level graphical user interface support: High-level support for building graphical user interfaces (GUIs) is vital to ensure loose coupling between the GUI and the application code. This is imperative because a GUI can evolve rapidly as usability problems are exposed.
Dynamic typing: Many scripting languages are dynamically typed. When used as glue, scripts must be able to pass data to and from subsystems written in different languages. Scripts often process heterogeneous data, whether in forms, databases, spreadsheets, or Web pages.
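A short Python sketch of the high-level string processing and dynamic HTML generation described above. The addresses and the regular expression are illustrative; this is not a complete e-mail parser:

```python
import re

# High-level string processing, typical scripting-language work: pull
# e-mail addresses out of free text, then generate an HTML list from them.
text = "Contact ada@example.com or bayo@example.com for enquiries."
emails = re.findall(r"[\w.]+@[\w.]+", text)

# Dynamic generation of an HTML fragment from the extracted data.
html = "<ul>\n" + "\n".join(f"  <li>{e}</li>" for e in emails) + "\n</ul>"
print(html)
```

A few lines of script do the extraction and page generation; the same task in a low-level language would require far more code, which is exactly why scripting languages dominate this niche.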
5.0 CONCLUSION
This unit has explained programming paradigms: a paradigm is the pattern of problem-solving thought underlying a genre of programs and languages. You saw the procedural, object-oriented, functional, logic (declarative) and scripting paradigms, the key concepts of each, and example languages belonging to each paradigm.
6.0 SUMMARY
All programming paradigms have their benefits, both for education and for capability. Functional languages have historically been very notable in the world of scientific computing; of course, looking at a list of the most popular languages for scientific computing today, it is obvious that they are all multi-paradigm. Object-oriented languages also have their fair share of great applications. Software development, game development, and graphics programming are all great examples of where object-oriented programming is a great approach to take.
The biggest note one can take from all of this information is that the future of software and programming languages is multi-paradigm. It is unlikely that anyone will be creating a purely functional or object-oriented programming language anytime soon. If you ask me, this isn't such a bad thing, as there are weaknesses and strengths to every programming approach you take, and a lot of true optimization is performing tests to see which methodology is more efficient or better than the other overall. This also reinforces the idea that everyone should know multiple languages from multiple paradigms. With the paradigms merging using the power of generics, you never know when you might run into a programming concept from an entirely different programming language.
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 What are Major Reasons for Studying Concepts of Programming Languages
3.2 Roles of Programming Language
3.3 Language Evaluation Criteria
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
It is natural for students to wonder how they will benefit from the study of programming language
concepts. After all, many other topics in computer science are worthy of serious study. The
following is what we believe to be a compelling list of potential benefits of studying concepts of
programming languages.
2.0 OBJECTIVES
• To increase the student’s capacity to use different constructs and develop effective
algorithms.
• To improve the use of existing programming languages and enable students to choose languages more intelligently.
• To make learning new languages easier and design your own language.
• To encounter fascinating ways of programming you might never have imagined before.
• Better understanding of the significance of implementation: Studying language concepts leads to an understanding of the choices among programming language constructs and the consequences of those choices. Certain kinds of program bugs can be found and fixed only by a programmer who knows some related implementation details. It allows visualization of how a computer executes various language constructs, and it provides hints about the relative efficiency of alternative constructs that may be chosen for a program. For example, programmers who know little about the complexity of the implementation of subprogram calls often do not realize that a small subprogram that is frequently called can be a highly inefficient design choice.
• Better use of languages that are already known: Many contemporary programming languages are large and complex. It is uncommon for a programmer to be familiar with and use all of the features of a language they use. By studying the concepts of programming languages, programmers can learn about previously unknown and unused parts of the languages they already use and begin to use those features.
• The overall advancement of computing: If those who choose languages were better informed by the study of programming language concepts, better languages would eventually squeeze out poorer ones.
Programming languages evolve and eventually pass out of use. ALGOL from 1960 is no longer used; it was replaced by Pascal, which in turn was replaced by C++ and Java. In addition, the older languages still in use have undergone periodic revisions to reflect changing influences from other areas of computing. As a result, newer languages reflect a composite of the experience gained in the design and use of older languages. Having stated that, the roles of programming language evolution are:
• To improve computer capability: Computers have evolved from the small, old and costly vacuum-tube machines of the 1950s to the supercomputers and microcomputers of today. At the same time, layers of operating system software have been inserted between the programming language and the underlying computer hardware. These factors have influenced both the structure and the cost of using the features of high-level languages.
• Improved applications: Computer use has moved rapidly from the original concentration on military, scientific, business and industrial applications, where the cost could be justified, to computer games, artificial intelligence, robotics, machine learning and applications in almost every area of human activity. The requirements of these new application areas influence the designs of new languages and the revision and extension of older ones.
• Improved programming methods: Language designs have evolved to reflect our changing understanding of good methods for writing large and complex programs. They have also reflected the changing environment in which programming is done.
• Improved implementation methods: The development of better implementation methods has affected the choice of features in new languages.
• Standardization: The development of languages that can be implemented easily on a variety of computer systems has made it easy for programs to be transported from one computer to another. This has provided a strong conservative influence on the evolution of language designs.
• Expressivity: the ability of a language to clearly reflect the meaning intended by the algorithm designer (the programmer). An “expressive” language permits an utterance to be compactly stated and encourages the use of statement forms associated with structured programming (usually “while” loops and “if-then-else” statements).
• Well-definedness: By “well-definedness” we mean that the language's syntax and semantics are free of ambiguity, internally consistent and complete. Thus the implementer of a well-defined language should have, within its definition, a complete specification of all the language's expressive forms and their meanings. The programmer, by the same virtue, should be able to predict exactly the behaviour of each expression before it is actually executed.
• Data types and structures: the ability of a language to support a variety of data values (integers, reals, strings, pointers, etc.) and non-elementary collections of these.
• Readability: One of the most important criteria for judging a programming language is the ease with which programs can be read and understood. Maintenance is recognized as a major part of the software life cycle, particularly in terms of cost, and since the ease of maintenance is determined in large part by the readability of programs, readability became an important measure of the quality of programs and programming languages.
• Overall Simplicity: The overall simplicity of a programming language strongly affects its
readability. A language with a large number of basic constructs is more difficult to learn
than one with a smaller number.
• Modularity: Modularity has two aspects: the language's support for subprogramming and the language's extensibility, in the sense of allowing programmer-defined operators and data types. By subprogramming, we mean the ability to define independent procedures and functions (subprograms) that communicate via parameters or global variables with the invoking program.
• Input-Output facilities: In evaluating a language’s “Input-Output facilities” we are
looking at its support for sequential, indexed, and random access files, as well as its support
for database and information retrieval functions.
• Portability: A language which has “portability” is one which is implemented on a variety of computers. That is, its design is relatively machine-independent. Languages which are well-defined tend to be more portable than others.
• Efficiency: An “efficient” language is one which permits fast compilation and execution
on the machines where it is implemented. Traditionally, FORTRAN and COBOL have
been relatively efficient languages in their respective application areas.
• Orthogonality: Orthogonality in a programming language means that a relatively small set of primitive constructs can be combined in a relatively small number of ways to build the control and data structures of the language. Furthermore, every possible combination of primitives is legal and meaningful.
• Pedagogy: Some languages have better “pedagogy” than others. That is, they are intrinsically easier to teach and to learn, they have better textbooks, they are implemented in better program development environments, and they are widely known and used by the best programmers in an application area.
5.0 CONCLUSION
This unit has explained the different reasons for studying concepts of programming languages: doing so enables students to choose languages more intelligently, makes programming languages easier to learn, enhances problem-solving skills, and helps in choosing the language appropriate for a particular project. It has also created opportunities for invention and innovation, which make learning new languages easier.
Among the most important criteria for evaluating languages are readability, writability, reliability, and overall cost. These will be the basis on which we examine and judge the various language features discussed.
6.0 SUMMARY
Like any language, spoken or written, a programming language is easier to learn earlier in life. Computer languages teach us logical skills in thinking, processing and communicating. Combined with creative vision, some of the most influential products were designed around a programming language. The best
inventions are born where skills and creativity meet. After all, who is better than the young generations of the world to imagine the future of our technology? One of the benefits of learning how to code at a young age is enhanced academic performance. Learning how this technology works will prepare children for a quickly advancing world in which computers and smartphones are used for almost every function of our daily lives.
Evaluation results are likely to suggest that your program has strengths as well as limitations, and they should not be a simple declaration of program success or failure. Evidence that your program is not achieving all of its ambitious objectives can be hard to swallow, but it can also help you learn where best to put limited resources. A good evaluation is one that is likely to be replicable, meaning that the same evaluation could be conducted again and produce the same results. The higher the quality of the evaluation design, its data collection methods and its data analysis, the more accurate its conclusions and the more confident others will be in its findings.
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Syntactic Structure
3.1.1 Language Recognizer
3.1.2 Language Generator
3.1.3 Syntactic Ambiguity
3.2 Abstract Syntax Tree
3.2.1 Parsing
3.2.2 Parse Trees
3.2.3 Parser
3.2.4 Types of Parser
3.3 Expression Notations
3.3.1 Overloaded Operators
3.3.2 Short Circuit Evaluation
3.3.3 Categories of Expression
3.3.4 Operator Precedence
3.4 Lexical Syntax
3.5 Grammar for Expressions and Variants Grammars
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
Several groups of people must be able to understand the description of a programming language. Among these are initial evaluators, implementers, and users. Most new programming languages are subjected to a period of scrutiny by potential users, often people within the organization that employs the language's designer, before their designs are completed. These are the initial evaluators. The success of this feedback cycle depends heavily on the clarity of the description. Programming language implementers obviously must be able to determine how the expressions, statements, and program units of a language are formed, and also their intended effect
when executed. The difficulty of the implementers’ job is, in part, determined by the completeness
and precision of the language description.
Finally, language users must be able to determine how to encode software solutions by referring
to a language reference manual. Textbooks and courses enter into this process, but language
manuals are usually the only authoritative printed information source about a language. The study
of programming languages, like the study of natural languages, can be divided into examinations
of syntax and semantics. The syntax of a programming language is the form of its expressions,
statements, and program units while its semantics is the meaning of those expressions, statements,
and program units.
Consider, for example, the syntax of a while statement: while (boolean_expr) statement. The semantics of this statement form is that when the current value of the Boolean expression is true, the embedded statement is executed; control then implicitly returns to the Boolean expression to repeat the process. When the Boolean expression is false, control continues after the while construct. Although they are often separated for discussion purposes, syntax and semantics are closely related. In a well-designed programming language, semantics should follow directly from syntax; that is, the appearance of a statement should strongly suggest what the statement is meant to accomplish.
Describing syntax is easier than describing semantics, partly because a concise and universally
accepted notation is available for syntax description, but none has yet been developed for
semantics.
2.0 OBJECTIVES
A language, whether natural such as English or artificial such as Java, is a set of strings of
characters from some alphabet. The strings of a language are called sentences or statements. The
syntax rules of a language specify which strings of characters from the language’s alphabet are in
the language. English, for example, has a large and complex collection of rules for specifying the
syntax of its sentences. By comparison, even the largest and most complex programming
languages are syntactically very simple.
Syntax is the set of rules that define which combinations of symbols are well formed. This
tells the computer how to read the code. Syntax refers to a concept in writing code dealing
with a very specific set of words and a very specific order to those words when we give
the computer instructions. This order and this strict structure is what enables us to
communicate effectively with a computer. Syntax is to code what grammar is to English
or any other language. A big difference, though, is that computers are exacting in
how we structure that grammar, or syntax.
This strictness is why we call programming "coding". Even amongst all the different languages
that are out there, each programming language uses different words, in a different structure,
to give the computer the information it needs to follow our instructions. Syntax
analysis is a task performed by a compiler, which examines whether or not the program has a
properly associated derivation tree. The syntax of a programming language can be described using
the following formal and informal techniques:
• Lexical syntax for defining the rules for basic symbols involving identifiers, literals,
punctuators and operators.
• Concrete syntax specifies the real representation of the programs with the help of lexical
symbols like its alphabet.
• Abstract syntax conveys only the vital program information.
The syntax of a programming language is used to signify the structure of programs without
considering their meaning. It basically emphasizes the structure and layout of a program and its
appearance. It involves a collection of rules which validate the sequence of symbols and
instructions used in a program. In general, languages can be formally defined in two distinct ways:
by recognition and by generation.
The syntax analysis part of a compiler is a recognizer for the language the compiler translates. In
this role, the recognizer need not test all possible strings of characters from some set to determine
whether each is in the language. Rather, it need only determine whether given programs are in the
language. In effect then, the syntax analyzer determines whether the given programs are
syntactically correct. Syntax analyzers, as discussed before, are also known as parsers. A
language recognizer is thus like a filter, separating legal sentences from those that are incorrectly
formed.
A language generator is a device that can be used to generate the sentences of a language. A
generator may seem to be a device of limited usefulness as a language descriptor. However, people
prefer certain forms of generators over recognizers because they can more easily read and
understand them. By contrast, the syntax-checking portion of a compiler (a language recognizer)
is not as useful a language description for a programmer because it can be used only in trial-and-
error mode. For example, to determine the correct syntax of a particular statement using a compiler,
the programmer can only submit a speculated version and note whether the compiler accepts it.
On the other hand, it is often possible to determine whether the syntax of a particular statement is
correct by comparing it with the structure of the generator. There is a close connection between
formal generation and recognition devices for the same language which led to formal languages.
Syntactic ambiguity is a property of sentences which may be reasonably interpreted in more than
one way, or reasonably interpreted to mean more than one thing. Ambiguity may or may not
involve one word having two parts of speech or homonyms. Syntactic ambiguity arises not from
the range of meanings of single words, but from the relationship between the words and clauses of
a sentence, and the sentence structure implied thereby. When a reader can reasonably interpret the
same sentence as having more than one possible structure, the text is equivocal and meets the
definition of syntactic ambiguity.
3.2.1 Parsing
Parsing is the process of analyzing a text, made of a sequence of tokens for example, words, to
determine its grammatical structure with respect to a given (more or less) formal grammar. Parsing
can also be used as a linguistic term, especially in reference to how phrases are divided up in
garden path sentences. [Figure omitted: overview of the parsing process]
One of the most attractive features of grammars is that they naturally describe the hierarchical
syntactic structure of the sentences of the languages they define. These hierarchical structures are
called parse trees. For example, a parse tree shows the structure of a derived assignment
statement such as the one below. Every internal node of a parse tree is labeled with a nonterminal
symbol; every leaf is labeled with a terminal symbol. Every subtree of a parse tree describes one
instance of an abstraction in the sentence.
A = B * (A + C)
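To make the idea concrete, here is a minimal sketch in Python of a recursive-descent parser that builds such a parse tree as nested tuples. The toy grammar and all function names are invented for illustration; they are not taken from any particular compiler:

```python
# A minimal recursive-descent parser for a toy assignment grammar:
#   <assign> -> <id> = <expr>
#   <expr>   -> <term> { + <term> }
#   <term>   -> <factor> { * <factor> }
#   <factor> -> <id> | ( <expr> )
# The parse tree is returned as nested tuples.

def tokenize(s):
    return s.replace("(", " ( ").replace(")", " ) ").split()

def parse_assign(tokens):
    name = tokens.pop(0)
    assert tokens.pop(0) == "="   # the assignment operator must follow the id
    return ("assign", name, parse_expr(tokens))

def parse_expr(tokens):
    node = parse_term(tokens)
    while tokens and tokens[0] == "+":
        tokens.pop(0)
        node = ("+", node, parse_term(tokens))
    return node

def parse_term(tokens):
    node = parse_factor(tokens)
    while tokens and tokens[0] == "*":
        tokens.pop(0)
        node = ("*", node, parse_factor(tokens))
    return node

def parse_factor(tokens):
    if tokens[0] == "(":
        tokens.pop(0)
        node = parse_expr(tokens)
        tokens.pop(0)             # consume the closing ")"
        return node
    return tokens.pop(0)          # an identifier leaf

tree = parse_assign(tokenize("A = B * (A + C)"))
```

Because * is parsed at a lower level than +, the subtree for A + C ends up nested inside the subtree for the multiplication, mirroring the hierarchical structure a parse tree expresses.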
3.2.3 Parser
In computing, a parser is one of the components of an interpreter or compiler, which checks for
correct syntax and builds a data structure, often some kind of parse tree, abstract syntax tree or
other hierarchical structure implicit in the input tokens. The parser often uses a separate lexical
analyzer to create tokens from the sequence of input characters. Parsers may be programmed by
hand or may be (semi-)automatically generated (in some programming languages) by a tool.
The task of the parser is essentially to determine if and how the input can be derived from the start
symbol of the grammar. This can be done in essentially two ways: top-down, working from the
start symbol toward the input string, or bottom-up, working from the input string back toward the
start symbol.
This is the process of recognizing an utterance (a string in natural languages) by breaking it down
to a set of symbols and analyzing each one against the grammar of the language. Most languages
have the meanings of their utterances structured according to their syntax—a practice known as
compositional semantics. As a result, the first step to describing the meaning of an utterance in
language is to break it down part by part and look at its analyzed form (known as its parse tree in
computer science, and as its deep structure in generative grammar) as discussed earlier.
Programming languages generally support a set of operators that are similar to operations in
mathematics. A language may contain a fixed number of built-in operators (e.g. + - * = in C and
C++), or it may allow the creation of programmer-defined operators (e.g. Haskell). Some
programming languages restrict operator symbols to special characters like + or := while others
also allow names like div. [Figure omitted: a sample expression]
In some programming languages an operator may be ad-hoc polymorphic, that is, have definitions
for more than one kind of data, (such as in Java where the + operator is used both for the addition
of numbers and for the concatenation of strings). Such an operator is said to be overloaded. In
languages that support operator overloading by the programmer but have a limited set of operators,
operator overloading is often used to define customized uses for operators.
Short-circuit evaluation and minimal evaluation, denotes the semantics of some Boolean operators
in some programming languages in which the second argument is only executed or evaluated if
the first argument does not suffice to determine the value of the expression: when the first argument
of the AND function evaluates to false, the overall value must be false; and when the first argument
of the OR function evaluates to true, the overall value must be true. In some programming
languages like Lisp, the usual Boolean operators are short-circuit. In others, like Java and Ada,
both short-circuit and standard Boolean operators are available.
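This behaviour can be observed directly. The sketch below uses Python, whose and/or operators are short-circuit; the helper check() is an invented name used only to record which operands actually get evaluated:

```python
# Demonstrating short-circuit evaluation: the second operand is only
# evaluated when the first does not already decide the result.

calls = []

def check(name, value):
    calls.append(name)   # record that this operand was evaluated
    return value

# 'and' stops at the first false operand: check("b", ...) never runs
result1 = check("a", False) and check("b", True)

# 'or' stops at the first true operand: check("d", ...) never runs
result2 = check("c", True) or check("d", False)
```

After running this, only "a" and "c" appear in the call log, showing that the second operand of each expression was skipped.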
• Boolean Expression - an expression that returns a Boolean value, either true or false. This
kind of expression can be used only in the IF-THEN-ELSE control structure and the
parameter of the display condition command. A relational expression has two operands and
one relational operator. The value of a relational expression is Boolean. In programming
languages that include a distinct Boolean data type in their type system, like Java, these
operators return true or false, depending on whether the conditional relationship between
the two operands holds or not.
• Numeric Expression - an expression that returns a number. This kind of expression can
be used in numeric data fields. It can also be used in functions and commands that require
numeric parameters.
• Character Expression - an expression that returns an alphanumeric string. This kind of
expression can be used in string data fields (format type Alpha). It can also be used in
functions and commands that require string parameters.
• Relational Expression: Relational operator is a programming language construct or
operator that test or define some kind of relation between two entities. These include
numerical equality (e.g., 5 = 5) and inequalities (e.g., 4 ≥ 3). An expression created using
a relational operator forms what is known as a relational expression or a condition.
Relational operators are also used in technical literature instead of words. Relational
operators are usually written in infix notation, if supported by the programming language,
which means that they appear between their operands (the two expressions being related).
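As a small illustration of a relational expression producing a Boolean value that drives an if-then-else structure, here is a Python sketch; the variable names are invented for the example:

```python
# A relational expression has two operands and one relational operator,
# and its value is Boolean.

count = 5
limit = 10

is_under = count < limit      # relational expression, evaluates to True

# The Boolean result selects a branch of the if-then-else structure.
if is_under:
    status = "within limit"
else:
    status = "over limit"
```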
When several operations occur in an expression, each part is evaluated and resolved in a
predetermined order called operator precedence. Parentheses can be used to override the order of
precedence and force some parts of an expression to be evaluated before other parts. Operations
within parentheses are always performed before those outside. Within parentheses, however,
normal operator precedence is maintained. When expressions contain operators from more than
one category, arithmetic operators are evaluated first, comparison operators are evaluated next,
and logical operators are evaluated last. Comparison operators all have equal precedence; that is,
they are evaluated in the left-to-right order in which they appear. Arithmetic and logical operators
are evaluated in the following order of precedence:
When multiplication and division occur together in an expression, each operation is evaluated as
it occurs from left to right. Likewise, when addition and subtraction occur together in an
expression, each operation is evaluated in order of appearance from left to right. The string
concatenation operator (&) is not an arithmetic operator, but in precedence it does fall after all
arithmetic operators and before all comparison operators. The Is operator is an object reference
comparison operator. It does not compare objects or their values; it checks only to determine if
two object references refer to the same object.
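These rules can be illustrated in Python, which follows the same general ordering (arithmetic before comparison before logical) and whose is operator performs an object-reference comparison much like the Is operator described above:

```python
# Arithmetic binds tighter than comparison, which binds tighter than
# logical operators; parentheses override the default order.

a = 2 + 3 * 4          # multiplication first: 2 + 12 = 14
b = (2 + 3) * 4        # parentheses first: 5 * 4 = 20
c = 1 + 2 < 2 * 2      # arithmetic first, then comparison: 3 < 4 is True
d = 1 < 2 and 3 > 4    # comparisons first, then the logical 'and': False

# Object-reference comparison: same object vs. equal value.
x = [1, 2]
y = [1, 2]
same_value = (x == y)    # True: the two lists have equal contents
same_object = (x is y)   # False: they are two distinct objects
```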
Operators are the foundation of any programming language. Thus, the functionality of the
C/C++/Java programming languages is incomplete without the use of operators. We can define
operators as symbols that helps us to perform specific mathematical and logical computations on
operands. In other words, we can say that an operator operates on the operands. For example,
consider the statement below:
c = a + b;
Here, ‘+’ is the operator known as addition operator and ‘a’ and ‘b’ are operands. The
addition operator tells the compiler to add both of the operands ‘a’ and ‘b’. C/C++ has many
built-in operator types and they can be classified as:
"=": The simple assignment operator assigns the value on the right to the variable on the
left. Example:
a = 10;
b = 20;
ch = 'y';
"+=": This operator is a combination of the '+' and '=' operators. It first adds the
current value of the variable on the left to the value on the right and then assigns the
result to the variable on the left.
"-=": This operator is a combination of the '-' and '=' operators. It first subtracts
the value on the right from the current value of the variable on the left and then assigns
the result to the variable on the left.
"*=": This operator is a combination of the '*' and '=' operators. It first multiplies
the current value of the variable on the left by the value on the right and then assigns
the result to the variable on the left.
"/=": This operator is a combination of the '/' and '=' operators. It first divides the
current value of the variable on the left by the value on the right and then assigns the
result to the variable on the left.
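The compound assignment operators behave the same way in Python; a quick sketch (note that /= performs true division in Python, unlike integer division on int operands in C):

```python
# Compound assignment combines an arithmetic operation with assignment,
# updating the variable on the left in place.

a = 10
b = 20

a += b   # a = a + b  -> 30
a -= 5   # a = a - 5  -> 25
a *= 2   # a = a * 2  -> 50
a /= 4   # a = a / 4  -> 12.5 (true division in Python)
```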
A lexical analyzer is essentially a pattern matcher. A pattern matcher attempts to find a substring
of a given string of characters that matches a given character pattern. Pattern matching is a
traditional part of computing. One of the earliest uses of pattern matching was with text editors,
such as the ed line editor, which was introduced in an early version of UNIX. Since then, pattern
matching has found its way into some programming languages—for example, Perl and JavaScript.
It is also available through the standard class libraries of Java, C++, and C#. A lexical analyzer
serves as the front end of a syntax analyzer. Technically, lexical analysis is a part of syntax
analysis. A lexical analyzer performs syntax analysis at the lowest level of program structure.
There are three reasons why lexical analysis is separated from syntax analysis:
• Simplicity: Techniques for lexical analysis are less complex than those required for syntax
analysis, so the lexical-analysis process can be simpler if it is separate. Also, removing the
low-level details of lexical analysis from the syntax analyzer makes the syntax analyzer
both smaller and less complex.
• Efficiency: Although it pays to optimize the lexical analyzer, because lexical analysis
requires a significant portion of total compilation time, it is not fruitful to optimize the
syntax analyzer. Separation facilitates this selective optimization.
• Portability: Because the lexical analyzer reads input program files and often includes
buffering of that input, it is somewhat platform dependent. However, the syntax analyzer
can be platform independent. It is always good to isolate machine-dependent parts of any
software system.
An input program appears to a compiler as a single string of characters. The lexical analyzer
collects characters into logical groupings and assigns internal codes to the groupings according to
their structure. These logical groupings are named lexemes, and the internal codes for categories
of these groupings are named tokens. Lexemes are recognized by matching the input character
string against character string patterns. Although tokens are usually represented as integer values,
for the sake of readability of lexical and syntax analyzers, they are often referenced through named
constants.
Consider, for example, the statement result = oldsum - value / 100; whose tokens and
lexemes are shown below:
Token        Lexeme
IDENT result
ASSIGN_OP =
IDENT oldsum
SUB_OP -
IDENT value
DIV_OP /
INT_LIT 100
SEMICOLON ;
Lexical analyzers extract lexemes from a given input string and produce the corresponding tokens.
In the early days of compilers, lexical analyzers often processed an entire source program file and
produced a file of tokens and lexemes. Now, however, most lexical analyzers are subprograms that
locate the next lexeme in the input, determine its associated token code, and return them to the
caller, which is the syntax analyzer. So, each call to the lexical analyzer returns a single lexeme
and its token. The only view of the input program seen by the syntax analyzer is the output of the
lexical analyzer, one token at a time.
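As a rough sketch of this process, the following Python function (an invented example, not production compiler code) extracts (token, lexeme) pairs from an input string, using the token names from the table above:

```python
import re

# A minimal lexical analyzer: lex() collects characters into lexemes and
# assigns each one a token code, skipping white space between lexemes.

TOKEN_PATTERNS = [
    ("INT_LIT",   r"\d+"),
    ("IDENT",     r"[A-Za-z][A-Za-z0-9]*"),
    ("ASSIGN_OP", r"="),
    ("SUB_OP",    r"-"),
    ("DIV_OP",    r"/"),
    ("SEMICOLON", r";"),
]

def lex(source):
    pos = 0
    tokens = []
    while pos < len(source):
        if source[pos].isspace():        # white space is not part of any lexeme
            pos += 1
            continue
        for name, pattern in TOKEN_PATTERNS:
            m = re.match(pattern, source[pos:])
            if m:
                tokens.append((name, m.group()))
                pos += len(m.group())
                break
        else:
            raise SyntaxError(f"unrecognized character {source[pos]!r}")
    return tokens

pairs = lex("result = oldsum - value / 100;")
```

A real lexical analyzer would return one (token, lexeme) pair per call from the syntax analyzer rather than building the whole list at once, but the pattern-matching idea is the same.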
The lexical-analysis process includes skipping comments and white space outside lexemes, as they
are not relevant to the meaning of the program. Also, the lexical analyzer inserts lexemes for user-
defined names into the symbol table, which is used by later phases of the compiler. Finally, lexical
analyzers detect syntactic errors in tokens, such as ill-formed floating-point literals, and report
such errors to the user.
Lexical units are considered the building blocks of programming languages. The lexical structures
of all programming languages are similar and normally include the following kinds of units:
• Identifiers: Names that can be chosen by programmers to represent objects like variables,
labels, procedures and functions. Most programming languages require that an identifier
start with an alphabetic letter, optionally followed by letters, digits and some
special characters.
• Keywords: Names reserved by the language designer are used to form the syntactic
structure of the languages.
• Operators: Symbols used to represent operations. All general-purpose programming
languages provide certain minimum operators, such as mathematical operators like
+, -, *, /, relational operators like <, ≤, ==, >, ≥, and logic operators like
AND, OR, NOT, etc.
• Separators: Symbols used to separate lexical or syntactic units of the language. Space,
comma, colon, semicolon and parentheses are used as separators.
• Literals: Values that can be assigned to variables of different types. For example, integer-
type literals are integer numbers, character-type literals are any character from the character
set of the language and string-type literals are any string of characters.
• Comments: Any explanatory text embedded in the program. Comments start with a
specific keyword or separator. When the compiler translates a program into machine code,
all comments will be ignored.
There are three approaches to building a lexical analyzer:
• Write a formal description of the token patterns of the language using a descriptive
language related to regular expressions. These descriptions are used as input to a software
tool that automatically generates a lexical analyzer. There are many such tools available
for this. The oldest of these, named lex, is commonly included as part of UNIX systems.
• Design a state transition diagram that describes the token patterns of the language and write
a program that implements the diagram.
• Design a state transition diagram that describes the token patterns of the language and
hand-construct a table-driven implementation of the state diagram.
A state transition diagram, or just state diagram, is a directed graph. The nodes of a state diagram
are labeled with state names. The arcs are labeled with the input characters that cause the
transitions among the states. An arc may also include actions the lexical analyzer must perform
when the transition is taken. State diagrams of the form used for lexical analyzers are
representations of a class of mathematical machines called finite automata. Finite automata can be
designed to recognize members of a class of languages called regular languages. Regular
grammars are generative devices for regular languages. The tokens of a programming language
are a regular language, and a lexical analyzer is a finite automaton.
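A table-driven lexical analyzer of this kind can be sketched as follows; this hypothetical Python example implements a two-state finite automaton that recognizes identifiers (a letter followed by letters or digits):

```python
# A table-driven finite automaton recognizing identifiers, as a state
# diagram would. Transitions are labeled with character classes rather
# than individual characters, which keeps the table small.

def char_class(c):
    if c.isalpha():
        return "letter"
    if c.isdigit():
        return "digit"
    return "other"

# transition table: state -> {character class -> next state}
TRANSITIONS = {
    "start": {"letter": "in_id"},
    "in_id": {"letter": "in_id", "digit": "in_id"},
}

def is_identifier(s):
    state = "start"
    for c in s:
        state = TRANSITIONS.get(state, {}).get(char_class(c))
        if state is None:      # no transition: the input is rejected
            return False
    return state == "in_id"    # accept only in the final state
```

Adding more rows and accepting states to the table would let the same driver loop recognize integer literals, operators, and so on, which is exactly how generated lexical analyzers work.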
The formal language-generation mechanisms, usually called grammars, are commonly used to
describe the syntax of programming languages. A formal grammar is a set of rules of a specific
kind, for forming strings in a formal language. The rules describe how to form strings from the
language's alphabet that are valid according to the language's syntax.
A grammar describes only the form of the strings and not the meaning or what can be done with
them in any context. A formal grammar is a set of rules for rewriting strings, along with a "start
symbol" from which rewriting must start. Therefore, a grammar is usually thought of as a language
generator. However, it can also sometimes be used as the basis for a recognizer.
The grammar of a language establishes the alphabet and lexicon. Then by means of a syntax, it
defines those sequences of symbols corresponding to well-formed phrases and sentences. A
grammar mainly consists of a set of rules for transforming strings. To generate a string in the
language, one begins with a string consisting of a single start symbol. The production rules are
then applied in any order, until a string that contains neither the start symbol nor designated
nonterminal symbols is produced. The language formed by the grammar consists of all distinct
strings that can be generated in this manner. Any particular sequence of production rules on the
start symbol yields a distinct string in the language. If there are multiple ways of generating the
same single string, the grammar is said to be ambiguous.
A grammar is usually thought of as a language generator. However, it can also sometimes be used
as the basis for a recognizer: a function in computing that determines whether a given string
belongs to the language or is grammatically incorrect. To describe such recognizers, formal
language theory uses separate formalisms, known as automata theory. One of the interesting results
of automata theory is that it is not possible to design a recognizer for certain formal languages.
A grammar's production rules are rules for rewriting strings. That is, each production rule maps
from one string of symbols to another, where the first string (the "head") contains at least one
non-terminal symbol. In the case that the second string (the "body") consists solely of the empty
string, i.e. it contains no symbols at all, it may be denoted with a special notation (often Λ, e
or ε) in order to avoid confusion.
A grammar also has a distinguished symbol, the start symbol. A grammar is thus formally
defined as the four-tuple G = (N, Σ, S, P), where N is the set of nonterminal symbols, Σ the set
of terminal symbols, S the start symbol and P the set of production rules. Such a formal grammar
is often called a rewriting system or a phrase structure grammar in the literature.
Example 1: Assuming the alphabet consists of a and b, the start symbol is S and we have the
following production rules:
1. S→aSb
2. S→ ba
then we start with S, and can choose a rule to apply to it. If we choose rule 1, we obtain the string
aSb. If we choose rule 1 again, we replace S with aSb and obtain the string aaSbb. If we now
choose rule 2, we replace S with ba and obtain the string aababb, and are done.
We can write this series of choices more briefly, using symbols: S→aSb →aaSbb →aababb.
The language of the grammar is then the infinite set {a^n b a b^n | n ≥ 0} = {ba, abab,
aababb, aaababbb, …}, where a^k means a repeated k times (and n in particular
represents the number of times production rule 1 has been applied).
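This generation process is easy to mechanize; the following Python sketch applies rule 1 n times and then rule 2 once, producing the sentences of the language:

```python
# Generating sentences of the grammar S -> aSb | ba by applying
# rule 1 (S -> aSb) n times and then rule 2 (S -> ba) once,
# which yields the string a^n b a b^n.

def generate(n):
    s = "S"
    for _ in range(n):
        s = s.replace("S", "aSb")   # rule 1
    return s.replace("S", "ba")     # rule 2

sentences = [generate(n) for n in range(4)]
```

Running this for n = 0, 1, 2, 3 yields ba, abab, aababb and aaababbb, matching the hand derivation above.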
Example 2: Consider the grammar G where N = {S, B}, Σ = {a, b, c}, S is the start
symbol and P consists of the following production rules:
1. S→ aBSc
2. S→ abc
3. Ba → aB
4. Bb → bb
This grammar defines the language L(G) = {a^n b^n c^n | n ≥ 1}, where a^n denotes a string of
n consecutive a’s. Thus the language is the set of strings that consist of one or more a’s, followed
by the same number of b’s and then by the same number of c’s.
Solution (the number after each arrow indicates which production rule was applied):
S →2 abc
S →1 aBSc →2 aBabcc →3 aaBbcc →4 aabbcc
S →1 aBSc →1 aBaBScc →2 aBaBabccc →3 aaBBabccc →3 aaBaBbccc →3 aaaBBbccc
→4 aaaBbbccc →4 aaabbbccc
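Membership in this language can be checked directly; the Python sketch below tests whether a string consists of equal runs of a's, b's and c's in that order:

```python
# Checking membership in {a^n b^n c^n | n >= 1}: equal-length runs of
# a's, b's and c's, in that order. This language cannot be described by
# a context-free grammar, which is why grammar G needs rules whose heads
# contain more than one symbol (e.g. Ba -> aB).

def in_language(s):
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n

members = [s for s in ["abc", "aabbcc", "aabbc", "cba", "aaabbbccc"]
           if in_language(s)]
```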
Example 3: Consider the grammar with start symbol S and production rules 1. S → AB,
2. A → Aa, 3. B → Bb, 4. A → a, 5. B → b. Derive the string aabbbb.
Solution:
S →1 AB →2 AaB →3 AaBb →3 AaBbb →3 AaBbbb →4 aaBbbb →5 aabbbb
Example 4: Analyze the structure of this grammar G = {V, T, S, P}, where V is used in place
of N and T is used in place of Σ, and produce 4 sample strings from the grammar.
P:
1. S → ASB
2. A → a
3. B → b
4. S → λ
Solution:
S →4 λ
S →1 ASB →2 aSB →3 aSb →4 ab
S →1 ASB →1 AASBB →4 AABB →2 aABB →2 aaBB →3 aaBb →3 aabb
Continuing in the same way yields aaabbb. In general the grammar generates the language
{a^n b^n | n ≥ 0}, so λ, ab, aabb and aaabbb are four sample strings.
In the middle to late 1950s, two men, Noam Chomsky and John Backus, developed the same syntax
description formalism, which subsequently became the most widely used method for programming
language syntax. Four classes of generative devices or grammars that define four classes of
languages were described.
Two of these grammar classes, named context-free and regular, turned out to be useful for
describing the syntax of programming languages. The forms of the tokens of programming
languages can be described by regular grammars. The syntax of whole programming languages,
with minor exceptions, can be described by context-free grammars.
It is remarkable that BNF is nearly identical to Chomsky’s generative devices for context-free
languages, called context-free grammars. In what follows, context-free grammars are referred to
simply as grammars, and the terms BNF and grammar are used interchangeably.
The difference between these types is that they have increasingly strict production rules and can
express fewer formal languages. Two important types are context-free grammars (Type 2) and
regular grammars (Type 3). The languages that can be described with such a grammar are called
context-free languages and regular languages, respectively.
Although much less powerful than unrestricted grammars (Type 0), which can in fact express any
language that can be accepted by a Turing machine, these two restricted types of grammars are
most often used because parsers for them can be efficiently implemented.
The following summarizes each of Chomsky's four types of grammars, the class of language it
generates, the type of automaton that recognizes it, and the form its rules must have:
• Type 0 (unrestricted): recursively enumerable languages; recognized by a Turing machine;
rules of the form α → β.
• Type 1 (context-sensitive): context-sensitive languages; recognized by a linear-bounded
automaton; rules of the form αAβ → αγβ.
• Type 2 (context-free): context-free languages; recognized by a pushdown automaton;
rules of the form A → γ.
• Type 3 (regular): regular languages; recognized by a finite automaton; rules of the form
A → a or A → aB.
5.0 CONCLUSION
A formal grammar is a set of rules of a specific kind, for forming strings in a formal language. It
has four components that form its syntax and a set of operations that can be performed on it, which
form its semantics. An attribute grammar is a descriptive formalism that can describe both the
syntax and static semantics of a language. Finally, we discussed different types of grammars and
derived strings from some example grammars in the exercise questions.
6.0 SUMMARY
Syntax analyzers are either top-down, meaning they construct leftmost derivations and a parse tree
in top-down order, or bottom-up, in which case they construct the reverse of a rightmost derivation
and a parse tree in bottom-up order. Parsers that work for all unambiguous grammars have
complexity O(n^3). However, parsers used for implementing syntax analyzers for programming
languages work on subclasses of unambiguous grammars and have complexity O(n).
A lexical analyzer is a pattern matcher that isolates the small-scale parts of a program, which are
called lexemes. Lexemes occur in categories, such as integer literals and names. These categories
are called tokens. Each token is assigned a numeric code, which along with the lexeme is what the
lexical analyzer produces. There are three distinct approaches to constructing a lexical analyzer:
using a software tool to generate a table for a table-driven analyzer, building such a table by hand,
and writing code to implement a state diagram description of the tokens of the language being
implemented.
The state diagram for tokens can be reasonably small if character classes are used for transitions,
rather than having transitions for every possible character from every state node. Also, the state
diagram can be simplified by using a table lookup to recognize reserved words.
Backus-Naur Form and context-free grammars are equivalent metalanguages that are well suited
for the task of describing the syntax of programming languages. Not only are they concise
descriptive tools, but also the parse trees that can be associated with their generative actions give
graphical evidence of the underlying syntactic structures. Furthermore, they are naturally related
to recognition devices for the languages they generate, which leads to the relatively easy
construction of syntax analyzers for compilers for these languages.
An attribute grammar is a descriptive formalism that can describe both the syntax and static
semantics of a language. Attribute grammars are extensions to context-free grammars. An attribute
grammar consists of a grammar, a set of attributes, a set of attribute computation functions, and a
set of predicates, which together describe static semantics rules.
14. Learning Programming Methodologies: Absolute Beginners. Tutorials Point, Simply Easy
Learning.
15. Michael L. Scott. Programming Language Pragmatics, Third Edition. Morgan Kaufmann
Publishers, Elsevier.
16. Robert W. Sebesta. Concepts of Programming Languages. Addison-Wesley, 2011.
17. Aho, A. V., J. E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer
Algorithms. Boston: Addison-Wesley, 2007.
18. Composing Programs by John DeNero, based on the textbook Structure and Interpretation
of Computer Programs by Harold Abelson and Gerald Jay Sussman; licensed under
a Creative Commons Attribution-ShareAlike 3.0 Unported License.
19. UTD: Describing Syntax and Semantics. Dr. Chris Davis, [email protected].
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Informal Semantics
3.1.1 Operational Semantics
3.1.2 Denotational Semantics
3.1.3 Axiomatic or Logical Semantics
3.2 Types of Semantic Analysis
3.2.1 Static Semantics
3.2.2 Dynamic Semantics
3.3 Semantic Analyzer
3.4 Semantic Errors
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
Parsing only verifies that the program consists of tokens arranged in a syntactically valid
combination. Now we will move forward to semantic analysis, where we delve even deeper to
check whether those tokens form a sensible set of instructions in the programming language.
Whereas any old noun phrase followed by some verb phrase makes a syntactically correct English
sentence, a semantically correct one has subject-verb agreement, proper use of gender, and the
components go together to express an idea that makes sense. For a program to be semantically
valid, all variables, functions, classes, etc. must be properly defined, expressions and variables
must be used in ways that respect the type system, access control must be respected, and so forth.
Semantic analysis is the front end’s penultimate phase and the compiler’s last chance to weed out
incorrect programs. We need to ensure the program is sound enough to carry on to code generation.
2.0 OBJECTIVES
• Be familiar with rule-based presentations of the operational semantics and type systems for
some simple programming languages, and be able to prove properties of an operational
semantics using various forms of induction
• Understand the fundamental semantic issues of variables, nature of names and special
words in programming languages
• Know how semantic analysis judges whether the syntax structure constructed from the
source program derives any meaning or not.
• Be familiar with some operationally-based notions of semantic equivalence of program
phrases and their basic properties
The term semantics in a programming language is used to capture the relationship between the
syntax and the model of computation. It emphasizes the interpretation of a program so that the
programmer can understand it in an easy way and predict the outcome of program execution.
Operational semantics determines the meaning of a program by means of the calculation steps
of an idealized execution. Some definitions use structural operational semantics, in which the
intermediate states are described on the basis of the language itself; others use an abstract
machine and make use of more ad-hoc mathematical constructions.
By an operational semantics of a programming language, one usually understands a set of rules
by which its expressions, statements, programs, etc., are evaluated or executed.
Axiomatic semantics defines the meaning of a program indirectly, by providing axioms of logic that characterize the program; compare with specification and verification. Axiomatic semantics is also based on mathematical logic. The approach provides rules of inference (the axioms) which show the change of data after the execution of a certain sentence of the language. The typical situation is that the execution (i.e. the transformation of data) takes place provided that the data satisfies certain conditions, and the inference rules are used to deduce results that also satisfy some conditions. Example:
{ P } S { Q }
where S is a program (or a part of a program) written in a certain programming language, and P and Q are logical expressions (assertions, using standard mathematical notations together with logical operators) describing conditions on the program variables used in S. The meaning is: "If assertion P is true when control is at the beginning of program S, then assertion Q will be true when control is at the end of program S." The language of the assertions is predicate logic, and the logical expressions P and Q are called predicates. Considering the transformation of predicates leads to axiomatic semantics.
A simple example can be given in the following way, where we assume a and b to be integers:
{ b > 10 } a := 10 + 4 * b { a > 20 }
meaning that if b > 10 before executing the assignment a := 10 + 4 * b, then the value of a will be greater than 20 after the assignment is executed.
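Using the standard assignment axiom of Hoare logic, such a triple can be checked by substitution. The following derivation is a sketch added for illustration; it is not part of the original text:

```latex
% Assignment axiom: \{P[a := E]\}\ a := E\ \{P\}
% Postcondition: a > 20. Substituting a := 10 + 4b gives the precondition:
10 + 4b > 20 \;\Longleftrightarrow\; 4b > 10 \;\Longleftrightarrow\; b > 2.5
% Since b > 10 implies b > 2.5, the triple
\{\, b > 10 \,\}\ a := 10 + 4b\ \{\, a > 20 \,\}
% is valid: strengthening the precondition preserves validity.
```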
Semantic analysis involves two types: static and dynamic semantics.
The static semantics defines restrictions on the structure of valid texts that are hard or impossible
to express in standard syntactic formalisms. For compiled languages, static semantics essentially
include those semantic rules that can be checked at compile time. Examples include checking that
every identifier is declared before it is used (in languages that require such declarations) or that
the labels on the arms of a case statement are distinct.
Many important restrictions of this type, like checking that identifiers are used in the appropriate
context (e.g. not adding an integer to a function name), or that subroutine calls have the appropriate
number and type of arguments, can be enforced by defining them as rules in a logic called a type
system. Other forms of static analyses like data flow analysis may also be part of static semantics.
Newer programming languages like Java and C# have definite assignment analysis, a form of data
flow analysis, as part of their static semantics.
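As an illustrative sketch (the class name and values are assumptions, not from the original text), the following Java program shows definite assignment analysis at work: the use of x compiles because every path through the if/else assigns it, while the commented-out use of y would be rejected at compile time.

```java
public class DefiniteAssignment {
    public static void main(String[] args) {
        int x;                          // declared but not yet assigned
        boolean flag = args.length > 0;
        if (flag) {
            x = 1;
        } else {
            x = 2;                      // every path assigns x, so the use below is legal
        }
        System.out.println(x);

        // int y;
        // if (flag) { y = 1; }
        // System.out.println(y);       // rejected: "y might not have been initialized"
    }
}
```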
Once data has been specified, the machine must be instructed to perform operations on the data.
For example, the semantics may define the strategy by which expressions are evaluated to values,
or the manner in which control structures conditionally execute statements. The dynamic semantics
(also known as execution semantics) of a language defines how and when the various constructs
of a language should produce a program behavior. There are many ways of defining execution
semantics. Natural language is often used to specify the execution semantics of languages
commonly used in practice. A significant amount of academic research went into formal semantics
of programming languages, which allow execution semantics to be specified in a formal manner.
Results from this field of research have seen limited application to programming language design
and implementation outside academia.
Semantic analysis uses the syntax tree and symbol table to check whether the given program is semantically consistent with the language definition. It gathers type information and stores it in either the syntax tree or the symbol table. This type information is subsequently used by the compiler during intermediate-code generation. Common semantic errors include:
• Type mismatch
• Undeclared variable
• Reserved identifier misuse
• Multiple declaration of a variable in a scope
• Accessing an out-of-scope variable
• Actual and formal parameter mismatch
5.0 Conclusion
6.0 Summary
Static and dynamic semantics were discussed with different types of semantic errors; binding and scope were discussed with their types as well; and finally, fundamental semantic issues of variables and the nature of names and special words in programming languages were analyzed.
1. Some programming languages are typeless. What are the obvious advantages and disadvantages of having no types in a language?
2. What determines whether a language rule is a matter of syntax or of static semantics?
3. Why is it impossible to detect certain program errors at compile time, even though they
can be detected at run time?
4. In what ways are reserved words better than keywords?
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Primitives Data Types
3.1.1 Common Data Types
3.2 Data Types in Programming Languages
3.3 Type Conversion in Languages
3.3.1 Type Conversion in C
3.3.2 Type Conversion in Java
3.4 Declaration Model
3.4.1 Variable
3.4.2 Naming Conventions
3.5 Variable Declaration in Languages
3.5.1 Implicit Variable Declaration
3.5.2 Explicit Variable Declaration
3.5.3 Variable Declaration in C Programming Language
3.5.4 Variable Declaration in C++ Programming Language
3.5.5 Variable Declaration in Java Programming Language
3.5.6 The Basic Data Type Associated with Variables in Languages
3.5.7 Data Structures in Java
3.7 Binding in Programming Languages
3.7.1 Dynamic Binding
3.7.2 Static Binding
3.7.3 Demonstration of Binding in C and Java
3.7.4 Visibility
3.8 Scope
3.8.1 Static Scope
3.8.2 Dynamic Scope
3.8.3 Referencing
3.9 Flow Control Structure in Programming Languages
3.9.1 Sequence logic or Sequential flow
3.9.2 Selection logic or conditional flow
3.9.3 Iteration logic or repetitive flow
3.10 Runtime Consideration in Programming Language
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
Various possibilities exist to define structured data types to express real world problems. But we
know that complex real world problems do not require only an abstraction in terms of data
structures but also in terms of operations on data objects of such structured types. This means that programming languages should provide constructs for the definition of abstract data types. The fundamental ideas are that data and the appropriate operations on it belong together, and that implementation details are hidden from those who use the abstract data types; data and the appropriate operations on it thus form a syntactic unit (the concept of abstract data types).
Every programming language uses different word sets in different orders, which means
that each programming language uses its own syntax. But, no matter the programming
language, computers are really exacting in how we structure our syntax. Programming
language is more than just a means for instructing a computer to perform tasks. The language also
serves as a framework within which we organize our ideas about computational processes.
Programs serve to communicate those ideas among the members of a programming community.
Thus, programs must be written for people to read, and only incidentally for machines to execute.
When we describe a language, we should pay particular attention to the means that the language
provides for combining simple ideas to form more complex ideas. Every powerful language has
three such mechanisms:
• primitive expressions and statements, which represent the simplest building blocks that
the language provides,
• means of combination, by which compound elements are built from simpler ones, and
• means of abstraction, by which compound elements can be named and manipulated as
units.
In programming, we deal with two kinds of elements: functions and data. Informally, data is stuff
that we want to manipulate, and functions describe the rules for manipulating the data. Thus, any
powerful programming language should be able to describe primitive data and primitive functions,
as well as have some methods for combining and abstracting both functions and data. To develop any instruction, there are some elements that are needed and essentially present in all languages.
2.0 OBJECTIVES
3.1 Primitive Data Types
A data type is a classification identifying one of various types of data, such as floating point, integer, or Boolean, that determines the possible values for that type, the operations that can be done on values of that type, and the way values of that type can be stored. There are several classifications of data types. A primitive data type is a basic data type which is provided by a programming language as a basic building block. Most languages allow more complicated composite types to be recursively constructed starting from basic types. It is also a built-in data type for which the programming language provides built-in support.
• Integer: a whole number that can have a positive, negative, or zero value. It cannot be a
fraction, nor can it include decimal places. It is commonly used in programming,
especially for increasing values. Addition, subtraction, and multiplication of two integers
results in an integer. However, division of two integers may result in either an integer or a
decimal. The resulting decimal can be rounded off or truncated in order to produce an
integer.
• Character: any number, letter, space, or symbol that can be entered in a computer. Every
character occupies one byte of space.
• String: used to represent text. It is composed of a set of characters that can include spaces
and numbers. Strings are enclosed in quotation marks to identify the data as strings, and
not as variable names, nor as numbers.
• Floating Point Number: a number that contains decimals. Numbers that contain fractions
are also considered floating-point numbers.
• Varchar: as the name implies, a varchar is a variable-length character type, because its memory storage has variable length. Each character occupies one byte of space, plus 2 additional bytes for length information.
• Array: a kind of a list that contains a group of elements which can be of the same data type
as an integer or string. It is used to organize data for easier sorting and searching of related
sets of values. An array is a collection of items stored at contiguous memory locations.
The idea is to store multiple items of the same type together. This makes it easier to
calculate the position of each element by simply adding an offset to a base value, i.e., the
memory location of the first element of the array (generally denoted by the name of the
array). The base value is index 0 and the difference between the two indexes is the offset.
Each variable, called an element, is accessed using a subscript (or index). All arrays consist
of contiguous memory locations. The lowest address corresponds to the first element and
the highest address to the last element.
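The offset arithmetic described above can be sketched in Java. Since Java does not expose real memory addresses, the base address and element size below are hypothetical values used purely for illustration:

```java
public class ArrayOffset {
    public static void main(String[] args) {
        int base = 1000;   // hypothetical base address of the array (index 0)
        int size = 4;      // bytes per int element
        // address of a[i] = base + i * size
        for (int i = 0; i < 4; i++) {
            System.out.println("a[" + i + "] at address " + (base + i * size));
        }
    }
}
```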
Programming languages have support for different data types. This is why they are able to handle
input values to be supplied by the users of programs developed using such languages. Some
programming languages categorize the kinds of data they handle into: Simple and Composite data.
The simple data types are generally termed primitives. The data types supported by programming
languages differ in different forms.
For instance, C programming language has the following examples of data as being supported:
Simple: integer, long integer, short integer, float, double, character, Boolean and so on. While the
composite data types include: Enumeration, Structure, Array, and String. The various data types
are used to specify and handle the kind of data to be used in programming problem. These data
types are declared in a programming problem through a process called variable declaration.
Different programming languages support different kinds of data types. At times in programming, there may be a need to convert data from one type to another. The term used for describing this process is Type Conversion. Java and C++ (and some other languages) are good examples of programming languages that support type conversion.
The type conversion process in C is basically converting one data type to another to perform some operation. The conversion is done only between those data types where the conversion is possible, e.g. char to int and vice versa.
This type of conversion is usually performed by the compiler when necessary without any
commands by the user. Thus it is also called "Automatic Type Conversion".
The compiler usually performs this type of conversion when a particular expression contains more
than one data type. In such cases either type promotion or demotion takes place. Example:
int a = 20;
double b = 20.5;
/* in a + b, a is promoted to double, so the result is 40.5 */

char ch = 'a';
int c = 13;
/* in c + ch, ch is promoted to int, so the addition is done on integers */
Explicit type conversion rules out the use of compiler for converting one data type to another
instead the user explicitly defines within the program the datatype of the operands in the
expression. The example below illustrates how explicit conversion is done by the user. Example:
double da = 4.5;
double db = 4.6;
double dc = 4.9;
int result = (int)da + (int)db + (int)dc;  /* result = 4 + 4 + 4 = 12 */

Thus, in the above example we find that the output result is 12 because in the result expression the user has explicitly defined the operands (variables) as integer data type. Hence, there is no implicit conversion of data type by the compiler.
When you assign value of one data type to another, the two types might not be compatible with
each other. If the data types are compatible, then Java will perform the conversion automatically
known as Automatic Type Conversion, and if not then they need to be cast or converted
explicitly. For example, assigning an int value to a long variable.
Widening conversion takes place when two data types are automatically converted. This happens
when:
• The two data types are compatible.
• When we assign value of a smaller data type to a bigger data type.
For example, in Java the numeric data types are compatible with each other, but no automatic conversion is supported from a numeric type to char or boolean. Also, char and boolean are not
compatible with each other.
class Test
{
    public static void main(String[] args)
    {
        int i = 100;
        // automatic type conversion
        long l = i;
        // automatic type conversion
        float f = l;
        System.out.println("Int value " + i);
        System.out.println("Long value " + l);
        System.out.println("Float value " + f);
    }
}
If we want to assign a value of larger data type to a smaller data type we perform explicit type
casting or narrowing.
• This is useful for incompatible data types where automatic conversion cannot be done.
• Here, target-type specifies the desired type to convert the specified value to.
• char and number are not compatible with each other. Let us see what happens when we try to convert one into the other.
Example:
class Test
{
    public static void main(String[] args)
    {
        double d = 100.04;   // illustrative value; the original listing is incomplete
        // explicit type casting
        long l = (long)d;    // fractional part lost
        // explicit type casting
        int i = (int)l;
        System.out.println("Double value " + d);
        System.out.println("Long value " + l);
        System.out.println("Int value " + i);
    }
}
3.4.1 Variables
Variables in programming tell how data is represented, which can range from a very simple value to a complex one. The value they contain can change depending on conditions. When creating a variable, we also need to declare the data type it contains, because the program will use different types of data in different ways. Programming languages define data types differently. Data can hold a very simple value, like the age of a person, or something very complex, like a student's track record of his performance for a whole year.
It is a symbolic name given to some known or unknown quantity or information, for the purpose
of allowing the name to be used independently of the information it represents. Compilers have to
replace variables' symbolic names with the actual locations of the data. While the variable name,
type, and location generally remain fixed, the data stored in the location may get altered during
program execution.
For example, almost all languages differentiate between ‘integers’ (or whole numbers, eg 12),
‘non-integers’ (numbers with decimals, eg 0.24), and ‘characters’ (letters of the alphabet or words).
In programming languages, we can distinguish between different type levels which from the user's
point of view form a hierarchy of complexity, i.e. each level allows new data types or operations
of greater complexity.
• Elementary level: Elementary (sometimes also called basic or simple) types, such as integers, reals, booleans, and characters, are supported by nearly every programming language. Data objects of these types can be manipulated by well-known operators, like +, -, *, or /, on the programming level. It is the task of the compiler to translate the operators onto the correct machine instructions, e.g. fixed-point and floating-point operations. See diagram below:
• Structured level: Most high level programming languages allow the definition of
structured types which are based on simple types. We distinguish between static and
dynamic structures. Static structures are arrays, records, and sets, while dynamic structures
are a bit more complicated, since they are recursively defined and may vary in size and
shape during the execution of a program. Lists and trees are dynamic structures.
• Abstract level: Programmer defined abstract data types are a set of data objects with
declared operations on these data objects. The implementation or internal representation of
abstract data types is hidden from the users of these types to avoid uncontrolled manipulation of the data objects (i.e. the concept of encapsulation).
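The abstract level can be sketched in Java as follows. The class name and operations are illustrative, not from the original text: the internal array is private (hidden), so users can manipulate the data objects only through the declared operations.

```java
// A programmer-defined abstract data type: the internal representation
// (a growable array) is hidden; users see only push, pop, and size.
public class IntStack {
    private int[] items = new int[16];
    private int count = 0;

    public void push(int v) {
        if (count == items.length)
            items = java.util.Arrays.copyOf(items, count * 2);
        items[count++] = v;
    }

    public int pop() { return items[--count]; }

    public int size() { return count; }
}
```

Because items and count are private, callers cannot corrupt the structure directly; this is exactly the encapsulation the text describes.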
A global variable is the kind of variable that is accessible from other classes outside the program or class in which it is declared. Different programming languages have various ways in which global variables are declared, when the need arises.
A local variable is the kind of variable that is not accessible from other classes outside the program or class in which it is declared. These are variables that are used within the current program unit (or function); in a later section we will be looking at global variables - variables that are available to all the program's functions.
Unlike their mathematical counterparts, programming variables and constants commonly take
multiple-character names, e.g. COST or total. Single-character names are most commonly used
only for auxiliary variables; for instance, i, j, k for array index variables. Some naming conventions
are enforced at the language level as part of the language syntax and involve the format of valid
identifiers. In almost all languages, variable names cannot start with a digit (0-9) and cannot
contain whitespace characters.
Whether, which, and when punctuation marks are permitted in variable names varies from
language to language; many languages only permit the underscore (_) in variable names and forbid
all other punctuation. In some programming languages, specific (often punctuation) characters
(known as sigils) are prefixed or appended to variable identifiers to indicate the variable's type.
Case-sensitivity of variable names also varies between languages and some languages require the
use of a certain case in naming certain entities; most modern languages are case-sensitive; some
older languages are not.
Some languages reserve certain forms of variable names for their own internal use; in many
languages, names beginning with 2 underscores ("__") often fall under this category.
There are various kinds of programming languages. Some languages do not require that variables
are declared in a program before being used while some require that variables are declared. We
can therefore have implicit and explicit variable declaration. Variables can also be classified as
global or local, depending on the level of access from within the program.
Programming languages in which the variables to be used in a program need not be declared are said to support implicit variable declaration. A good example of a programming language that supports this variable declaration type is Python. That is, in Python, a programmer may declare or choose not to declare the variables that he intends to use in a program.
Programming languages in which the variables to be used in a program should be declared are said
to support Explicit variable declaration. Examples of such languages that support explicit variable
declaration are: Java, C, C++, PASCAL, FORTRAN, and many others. It is important for a
programmer to always declare variable before using them in languages like C, C++, VBscript and
Java. Depending on the data type to be used in a program, examples of variable declaration in some selected languages are as below:
3.5.3 Variable Declaration in C Programming Language
int num1,num2,result;
float score1,score2,score3;
double x1,x2,x3,x4,total;
bool x,y;
3.5.4 Variable Declaration in C++ Programming Language
int num1,num2,result;
float score1,score2,score3;
double x1,x2,x3,x4,total;
bool x,y;
3.5.5 Variable Declaration in Java Programming Language
int num1,num2,result;
float score1,score2,score3;
double x1,x2,x3,x4,total;
boolean x,y;
3.5.6 The Basic data types associated with variables in languages such as C and Java
Note: In C++ and Java, the modifiers or specifiers are used to indicate whether a variable is global
or local.
It has earlier been pointed out that Java has a wide range of data structures. The data structures
provided by the Java utility package are very powerful and perform a wide range of functions.
These data structures consist of the following interface and classes:
• Enumeration
• BitSet
• Vector
• Stack
• Dictionary
• Hashtable
• Properties
We may have to start this part by asking the question “What is binding in programming
languages?” Recall that a variable is the storage location for the storing of data in a program.
Binding simply means the association of attributes with program entities. Binding can be static or dynamic. C and Java are examples of programming languages that support binding. Dynamic binding allows greater flexibility, but at the expense of readability, efficiency and reliability. Binding describes how a variable is created and used (or "bound") by and within the given program and, possibly, by other programs as well. There are two types of binding: dynamic and static binding.
3.7.1 Dynamic Binding
Dynamic binding (also known as dynamic dispatch) is the process of mapping a message to a specific sequence of code (method) at runtime. This is done to support the cases where the appropriate method cannot be determined at compile time. It occurs first during execution, or can change during execution of the program.
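A minimal Java sketch of dynamic dispatch follows; the class and method names are illustrative. Which speak() runs is decided at run time from the object's actual class, not from the declared type of the variable:

```java
class Animal {
    String speak() { return "..."; }
}

class Dog extends Animal {
    @Override String speak() { return "woof"; }
}

class Cat extends Animal {
    @Override String speak() { return "meow"; }
}

public class DispatchDemo {
    public static void main(String[] args) {
        Animal a = new Dog();          // static type Animal, dynamic type Dog
        System.out.println(a.speak()); // prints "woof": resolved at runtime
        a = new Cat();                 // same variable, new dynamic type
        System.out.println(a.speak()); // prints "meow"
    }
}
```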
3.7.2 Static Binding
Static binding occurs first before run time and remains unchanged throughout program execution.
3.7.3 Demonstration of Binding in C and Java
As an example, let us take a look at the program below, implemented in C. In line 1, we have
defined three names:
int i, x = 0;
void main() {
    for (i = 1; i <= 50; i++)
        x += do_something(x);
}
The same implementation can also be done in Java, which is as
follows:
public class Example {
    static int i, x = 0;   // static, so the static main method can access them
    public static void main(String[] args) {
        for (i = 1; i <= 50; i++) {
            x += do_something(x);
        }
    }
}
int, i and x. One of them represents a type while the others represent the declaration of two
variables. The specification of the C language defines the meaning of the keyword int. The
properties related to this specification are bound when the language is defined. There are other
properties that are left out of the language definition. An example of this is the range of values for
the int type. In this way, the implementation of a compiler can choose a particular range for
the int type that is the most natural for a specific machine. The type of variables i and x in the
first line is bound at compilation time. In line 4, the program calls the function do_something
whose definition can be in another source file. This reference is solved at link time.
The linker tries to find the function definition for generating the executable file. At loading time,
just before a program starts running, the memory location for main, do_something, i and
x are bound. Some bindings occur when the program is running, i.e., at runtime. An example is
the possible values attributed to i and x during the execution of the program.
Initialization is the binding of a variable to a value at the time the variable is being bound to storage. For instance, in Java, C and C++, a variable can be declared and initialized at the same time as follows:
float x1 = 0.5f, x2 = 1.0f;
int num1 = 10, num2 = 20, total = 0, averagevalue = 0;
bool p = true, r = false, q = true;
3.8 Scope
The scope of a variable describes where in a program's text, the variable may be used, while the
extent (or lifetime) describes when in a program's execution a variable has a (meaningful) value.
Scope is a lexical aspect of a variable. Most languages define a specific scope for each variable (as
well as any other named entity), which may differ within a given program. The scope of a variable
is the portion of the program code for which the variable's name has meaning and for which the
variable is said to be "visible". Scope is of two types: static and dynamic scope.
The static scope of a variable is the most immediately enclosing block, excluding any enclosed
blocks where the variable has been re-declared. The static scope of a variable in a program can be
determined by simply studying the text of the program. Static scope is not affected by the order in
which procedures are called during the execution of the program.
The dynamic scope of a variable extends to all the procedures called thereafter during program
execution, until the first procedure to be called that re-declares the variable.
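Static scoping can be sketched in Java as follows (class and method names are illustrative). The class-level x is visible throughout the class except inside inner(), whose local re-declaration of x excludes the outer one, just as the definition above states:

```java
public class ScopeDemo {
    static int x = 10;               // outer x: its static scope is the whole class...

    static int inner() {
        int x = 99;                  // ...excluding this block, which re-declares x
        return x;                    // refers to the local x, so returns 99
    }

    public static void main(String[] args) {
        System.out.println(x);       // 10: the class-level x
        System.out.println(inner()); // 99: the re-declared x
    }
}
```

Note that both references can be resolved just by reading the program text, with no knowledge of the order in which methods are called.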
3.8.3 Referencing
The referencing environment is the collection of variables which can be used. In a statically scoped language, one can only reference the variables in the static reference environment. A function in a statically scoped language does have dynamic ancestors (i.e. its callers), but cannot reference any variables declared in those ancestors.
Different programming languages have support for flow control. Flow control constructs such as if-then-else, while-do, and do-while take as arguments a Boolean expression and one or more action statements. The actions are either flow control constructs or assignment operations, which can be constructed recursively too. A Java or C++ class written to provide an if-then-else construct takes three arguments in its constructor: a Boolean result object and two action objects, each constructed using one or more primitive constructs.
The evaluate() method of the if-then-else class performs the if-then-else logic upon invocation
with appropriate arguments. Most flow control constructs can be provided in a similar manner. In
programming languages, a complex program logic can be constructed as a hierarchy of objects
corresponding to the primitive constructs. Such a conglomerate of objects, each individually
performing a simple task, and cooperating together to achieve a complex one is the key design
principle in this technology and ideally suited for distributed processing. Other programming
languages like C, PASCAL, DELPHI, Python, FORTRAN, Scala among others have their
syntaxes for handling flow control. The flow control structures are generally used when there is a
need to reach some conclusion based on some set of conditions.
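The if-then-else class described above can be sketched in Java as follows. The class name and the use of Runnable for the action objects are assumptions for illustration; the text does not fix a particular interface:

```java
// A flow-control construct built as a class: the constructor takes a Boolean
// result and two action objects; evaluate() performs the if-then-else logic.
class IfThenElse {
    private final boolean condition;
    private final Runnable thenAction;
    private final Runnable elseAction;

    IfThenElse(boolean condition, Runnable thenAction, Runnable elseAction) {
        this.condition = condition;
        this.thenAction = thenAction;
        this.elseAction = elseAction;
    }

    void evaluate() {
        if (condition) thenAction.run();
        else elseAction.run();
    }
}

public class FlowDemo {
    public static void main(String[] args) {
        // each action is itself an object, so constructs can be nested
        new IfThenElse(2 > 1,
                () -> System.out.println("then branch"),
                () -> System.out.println("else branch")).evaluate();
    }
}
```

Because the actions are objects, a complex program logic can be assembled as a hierarchy of such objects, which is the design principle the text describes.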
Control structures allow the programmer to define the order of execution of statements. The
availability of mechanisms that allow such a control makes programming languages powerful in
their usage for the solution of complex problems. Usually problems cannot be solved just by
sequencing some expressions or statements, rather they require in certain situations decisions -
depending on some conditions - about which of some alternatives has to be executed, and/or to
repeat or iterate parts of a program an arbitrary time - probably also depending on some conditions.
Control structures are just a way to specify the flow of control in programs. Any algorithm or program can be made clearer and better understood if it uses self-contained modules called logic or control structures. A control structure basically analyzes and chooses in which direction a program flows based on certain parameters or conditions. There are three basic types of logic, or flow of control: sequential logic, selection logic, and iteration logic.
Sequential logic, as the name suggests, follows a serial or sequential flow in which the flow depends on the series of instructions given to the computer. Unless new instructions are given, the modules are executed in the obvious sequence. The sequence may be given explicitly, by means of numbered steps, or implicitly, by following the order in which modules are written. Most of the processing, even for some complex problems, will generally follow this elementary flow pattern.
Selection logic simply involves a number of conditions or parameters which decide one out of several written modules. The structures which use this type of logic are known as conditional structures. These structures can be of three types: single alternative (if-then), double alternative (if-then-else), and multiple alternatives (case or switch).
In this way, the flow of the program depends on the set of conditions that are written. This can be
more understood by the following flow charts:
The iteration logic employs a loop, which involves a repeat statement followed by a module known as the body of the loop. The two types of these structures are the repeat-for and repeat-while structures:
Repeat for i = A to N by I:
[Module]
[End of loop]
Here, A is the initial value, N is the end value and I is the increment. The loop ends when i > N. The control variable i increases or decreases according to the positive or negative value of I respectively.
Repeat-For Flow
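The Repeat-For structure above maps directly onto a counted for loop in Java. The values A = 1, N = 10, I = 1 below are illustrative:

```java
public class RepeatFor {
    public static void main(String[] args) {
        int sum = 0;
        for (int i = 1; i <= 10; i += 1) {  // Repeat for i = 1 to 10 by 1:
            sum += i;                        //   [Module]
        }                                    // [End of loop]: ends when i > 10
        System.out.println(sum);             // sum of 1..10, i.e. 55
    }
}
```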
Repeat-While Structure: It also uses a condition to control the loop. This structure has the form:
Repeat while condition:
[Module]
[End of loop]
This requires a statement that initializes the condition controlling the loop, and there must also be a statement inside the module that will change this condition, leading to the end of the loop.
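The Repeat-While structure corresponds to a Java while loop. In this sketch (values illustrative), the initialization before the loop sets up the controlling condition, and the i++ inside the body changes it until the loop ends:

```java
public class RepeatWhile {
    public static void main(String[] args) {
        int i = 1;            // initializes the condition controlling the loop
        int sum = 0;
        while (i <= 10) {     // Repeat while i <= 10:
            sum += i;         //   [Module]
            i++;              //   changes the condition, leading to loop exit
        }                     // [End of loop]
        System.out.println(sum);
    }
}
```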
Run time is also called execution time. It is the time during which a program is running (executing),
in contrast to other program lifecycle phases such as compile time, link time and load time. When
a program is to be executed, a loader first performs the necessary memory setup and links the
program with any dynamically linked libraries it needs, and then the execution begins starting from
the program's entry point. Some program debugging can only be performed (or is more efficient
or accurate when performed) at runtime. Logic errors and array bounds violations are examples of such errors in a programming language.
Runtime programming is about being able to specify program logic during application execution,
without going through the code-compile-execute cycle. This article describes key elements of the
infrastructure required to support runtime programming in Java, and presents a fairly detailed
analysis of the design. Programmers are expected to consider the run time in their choice of implementation in the chosen programming language.
1. Write a program in the language of your choice that behaves differently if the language uses name equivalence than if it uses structural equivalence.
5.0 CONCLUSION
At the end of this unit, various primitive data types and variable declarations were discussed. Examples of variable declaration were shown in some programs. Type conversion, control flow structures, binding and scope were discussed with their types as well.
6.0 SUMMARY
The data types of a language are a large part of what determines that language's style and
usefulness. The primitive data types of most imperative languages include numeric (integer and
floating-point), Boolean and character types, on top of which strings and arrays are built. Case
sensitivity and the relationship of names to special words represent design issues of names.
Variables are characterized by the sextuple: name, address, value, type, lifetime, scope. Binding
is the association of attributes with program entities. Expressions fall into categories such as
Boolean, character, numeric and relational, and control structures are of three types: sequence,
selection and iteration. All of these were discussed in this unit.
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Type Checking
3.1.1 Strongly Typed and Weakly Typed
3.1.2 Type Compatibility
3.2 Garbage Collection
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/Further Reading
1.0 INTRODUCTION
For the discussion of type checking, the concept of operands and operators is generalized to include
subprograms and assignment statements. Subprograms will be thought of as operators whose
operands are their parameters. The assignment symbol will be thought of as a binary operator, with
its target variable and its expression being the operands.
2.0 OBJECTIVES
3.0 MAIN CONTENT
3.1 Type Checking
Type checking is the activity of ensuring that the operands of an operator are of compatible types.
A compatible type is one that either is legal for the operator or is allowed under language rules to
be implicitly converted, by compiler-generated code (or the interpreter), to a legal type. This
automatic conversion is called a coercion. For example:
if an int variable and a float variable are added in Java, the value of the int variable is coerced to
float and a floating-point add is done.
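A minimal Java sketch of this coercion (the class and method names are my own):

```java
public class CoercionDemo {
    static float addMixed(int i, float f) {
        // i is implicitly coerced (widened) to float,
        // then a floating-point add is performed
        return i + f;
    }

    public static void main(String[] args) {
        System.out.println(addMixed(3, 0.5f)); // prints 3.5
    }
}
```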
A type error is the application of an operator to an operand of an inappropriate type. For example,
in the original version of C, if an int value was passed to a function that expected a float value, a
type error would occur (because compilers for that language did not check the types of parameters).
If all bindings of variables to types are static in a language, then type checking can nearly always
be done statically. Dynamic type binding requires type checking at run time, which is called
dynamic type checking.
Some languages, because of their dynamic type binding, allow only dynamic type checking. It is
better to detect errors at compile time than at run time, because the earlier correction is usually less
costly. The penalty for static checking is reduced programmer flexibility. Fewer shortcuts and
tricks are possible. Such techniques, though, are now generally recognized to be error prone and
detrimental to readability.
Type checking is complicated when a language allows a memory cell to store values of different
types at different times during execution. In these cases, type checking, if done, must be dynamic
and requires the run-time system to maintain the type of the current value of such memory cells.
So, even though all variables are statically bound to types in some languages, not all type errors
can be detected by static type checking.
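Java illustrates this point: a downcast is accepted by the static type checker, but the actual type error only surfaces at run time (the class below is an illustrative sketch):

```java
public class DynamicCheckDemo {
    // Returns true if casting o to Integer fails at run time.
    static boolean castFails(Object o) {
        try {
            Integer n = (Integer) o; // accepted by the static type checker
            return n == null;        // not reached when o is not an Integer
        } catch (ClassCastException e) {
            return true;             // the type error is detected dynamically
        }
    }

    public static void main(String[] args) {
        Object o = "hello";               // static type Object, dynamic type String
        System.out.println(castFails(o)); // prints true
    }
}
```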
3.1.1 Strongly Typed and Weakly Typed
A programming language is strongly typed if type errors are always detected. This requires that
the types of all operands can be determined, either at compile time or at run time. The importance
of strong typing lies in its ability to detect all misuses of variables that result in type errors. A
strongly typed language also allows the detection, at run time, of uses of the incorrect type values
in variables that can store values of more than one type.
A good example of a strongly typed language is LISP. Weakly typed languages, by contrast, have
weak type checking: they allow expressions that mix various different types, have looser typing
rules, and may perform implicit type conversion at run time or even produce unpredictable or
erroneous results. C and C++ are weakly typed languages because both include union types, which
are not type checked. However, the program designer can choose to implement as strict type
checking as desired, by checking for type compatibility during construction of each of the
primitive constructs.
3.1.2 Type Compatibility
In most languages a value’s type must be compatible with that of the context in which it appears.
In an assignment statement, the type of the right-hand side must be compatible with that of the
left-hand side. The types of the operands of + must both be compatible with some common type
that supports addition (integers, real numbers, or perhaps strings or sets). In a subroutine call, the
types of any arguments passed into the subroutine must be compatible with the types of the
corresponding formal parameters, and the types of any formal parameters passed back to the caller
must be compatible with the types of the corresponding arguments.
Type checking is the practice of ensuring that data objects which are somehow related are of
compatible types. Two objects are related by forming the left and right side of an operator, by
forming the left and right side of an assignment statement and by being actual and formal
parameters. Consistency checks which are made before the execution of a source program (i.e. by
the compiler) are said to be static checks, while those checks performed during the execution of a
source program are called dynamic checks (or run-time checks). Checking the syntax is an example
of a static check, while type checks are an example of checks which can often be done statically
and which sometimes must be done dynamically.
3.2 Garbage Collection
Explicit reclamation of heap objects is a serious burden on the programmer and a major source of
bugs (memory leaks and dangling references). The code required to keep track of object lifetimes
makes programs more difficult to design, implement, and maintain. An attractive alternative is to
have the language implementation notice when objects are no longer useful and reclaim them
automatically. Automatic reclamation (otherwise known as garbage collection) is more-or-less
essential for functional languages: delete is a very imperative sort of operation, and the ability to
construct and return arbitrary objects from functions means that many objects that would be
allocated on the stack in an imperative language must be allocated from the heap in a functional
language, to give them unlimited extent.
Over time, automatic garbage collection has become popular for imperative languages as well. It
can be found in, among others, Clu, Cedar, Modula-3, Java, C#, and all the major scripting
languages. Automatic collection is difficult to implement, but the difficulty pales in comparison to
the convenience enjoyed by programmers once the implementation exists. Automatic collection
also tends to be slower than manual reclamation, though it eliminates any need to check for
dangling references.
Garbage collection presents a classic tradeoff between convenience and safety on the one hand
and performance on the other. Manual storage reclamation, implemented correctly by the
application program, is almost invariably faster than any automatic garbage collector. It is also
more predictable: automatic collection is notorious for its tendency to introduce intermittent
“hiccups” in the execution of real-time or interactive programs.
4.0 CONCLUSION
This unit discussed how type checking serves two principal purposes: it provides implicit context
for many operations, freeing the programmer from the need to specify that context explicitly, and
it allows the compiler to catch a wide variety of common programming errors. A type system
consists of a set of built-in types and a mechanism to define new types. Type compatibility determines
when a value of one type may be used in a context that “expects” another type. A language is said
to be strongly typed if it never allows an operation to be applied to an object that does not support
it; a language is said to be statically typed if it enforces strong typing at compile time.
5.0 SUMMARY
Strong typing is the concept of requiring that all type errors be detected. The value of strong typing
is increased reliability. Type theories have been developed in many areas. In computer science, the
practical branch of type theory defines the types and type rules of programming languages. Set
theory can be used to model most of the structured data types in programming languages.
Techniques for garbage collection, that is, the automatic recovery of memory, were briefly
presented: collectors based on reference counts, mark-and-sweep, mark-and-compact, and copying.
7.0 REFERENCES/FURTHER READING
8. Sewell, P. (2009). Semantics of Programming Languages, Computer Science Tripos, Part 1B.
Computer Laboratory, University of Cambridge.
9. Specifying Syntax.
10. Davis, C. UTD: Describing Syntax and Semantics. [email protected]
11. Hennessy, M. (1990). The Semantics of Programming Languages. Wiley. Out of print, but
available on the web at http://www.cogs.susx.ac.uk/users/matthewh/semnotes.ps.gz
12. Boston: Addison-Wesley, 2007.
13. Pierce, B. C. (2002). Types and Programming Languages. MIT Press.
14. Composing Programs by John DeNero, based on the textbook Structure and Interpretation
of Computer Programs by Harold Abelson and Gerald Jay Sussman, is licensed under
a Creative Commons Attribution-ShareAlike 3.0 Unported License.
15. Pierce, B. C. (ed.) (2005). Advanced Topics in Types and Programming Languages. MIT
Press.
16. Winskel, G. (1993). The Formal Semantics of Programming Languages. MIT Press. An
introduction to both operational and denotational semantics; recommended for the Part II
Denotational Semantics course.
17. Plotkin, G. D. (1981). A Structural Approach to Operational Semantics. Technical Report
DAIMI FN-19, Aarhus University.
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Procedures Mechanisms
3.1.1 Characteristics
3.1.2 Procedure Implementations
3.2 Functions and Subroutine Mechanisms
3.2.1 Advantages of Subroutine
3.2.2 Disadvantages
3.3 Iteration Mechanisms
3.3.1 Strategies to Consider Iteration
3.3.2 Iteration Relationship with Recursion
3.3.3 Characteristics for Iteration
4.0 Self-Assessment Exercise
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
Data abstraction may also be defined as the process of identifying only the required
characteristics of an object while ignoring the irrelevant details. The properties and behaviours
of an object differentiate it from other objects of similar type and also help in classifying and
grouping objects. In programming we apply the same meaning of abstraction by writing classes
that are not associated with any specific instance. Abstraction is used when we need only to
inherit from a certain class, but do not need to instantiate objects of that class. In such a case
the base class can be regarded as "incomplete"; such classes are known as abstract base classes.
2.0 OBJECTIVES
An Abstract Data Type is a user-defined data type that satisfies the following two conditions:
• The representation of, and operations on, objects of the type are defined in a single syntactic
unit.
• The representation of objects of the type is hidden from the program units that use these
objects, so the only operations possible are those provided in the type's definition.
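A Java sketch of an abstract data type satisfying both conditions (a stack of ints; all names here are illustrative): the representation and the operations live in a single syntactic unit, and the representation is hidden, so clients can only push, pop, and test for emptiness.

```java
public class IntStack {
    // Hidden representation: clients cannot touch the array or the index.
    private int[] elems = new int[10];
    private int top = 0;

    public void push(int x) {
        if (top == elems.length)
            elems = java.util.Arrays.copyOf(elems, 2 * elems.length);
        elems[top++] = x;
    }

    public int pop() {
        if (top == 0) throw new IllegalStateException("empty stack");
        return elems[--top];
    }

    public boolean isEmpty() {
        return top == 0;
    }
}
```

Because the representation is hidden, replacing the array with, say, a linked list would not affect any client, which is exactly the point of the second condition.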
Advantages of Abstraction
Helps to increase the security of an application or program as only important details are provided
to the user.
3.1 Procedures Mechanisms
Today, procedures are still used to write multiply occurring code only once in a program, but the
concept of abstraction in the usage of procedures has become more important, since procedures
are now used to modularize complex problems. Procedures are a mechanism to control and reduce
the complexity of programming systems by grouping certain activities together into syntactically
separated program units. Therefore, multiple usage of program code is no longer the only criterion
for separating code; increasingly, procedures represent source code which is used only once within
a program but which performs some basic operation, yielding both code reduction and complexity
reduction.
3.1.1 Characteristics
Although procedures might be slightly different in their usage and implementation in different
programming languages, there exist some common characteristics. Among them are:
• Procedures are referred to by a name. They usually have local variable declarations, as
well as some parameters forming the communication interface to other procedures and
the main program. In addition to this, functions are bound to a particular type, i.e. the
type of the result they produce.
• A procedure has only one entry point (FORTRAN makes an exception, since it allows the
execution of a subprogram to begin at any desired executable statement using the ENTRY
statement; but the overall concept is the same as for single entry procedures).
• The procedure call mechanism allocates storage for local data structures.
• The procedure call mechanism transfers control to the called instance and suspends the
calling instance during the execution of the procedure (i.e. no form of parallelism or
concurrency is allowed).
• The procedure return mechanism deallocates storage for local data structures.
• The procedure return mechanism transfers control back to the calling instance when the
procedure execution terminates.
The principal actions associated with a procedure call/return mechanism are those just listed:
allocating storage and transferring control on call, and deallocating storage and returning control
on return.
3.2 Functions and Subroutine Mechanisms
This element of programming allows a programmer to put a snippet of code in one location and
use it over and over again. The primary purpose of a function is to take some number of argument
values, perform a calculation on them, and return a single result. Functions are required where
you need to do complicated calculations whose result may or may not be used subsequently in an
expression. Subroutines, by contrast, may return several results. Calls to subroutines cannot be
placed in an expression; in the main program a subroutine is activated by a CALL statement that
includes the list of inputs and outputs, enclosed in parentheses, which are called the arguments of
the subroutine. In early languages such as FORTRAN, both functions and subroutines follow
naming rules (for example, a name of six or fewer characters starting with a letter), and a
subprogram's name should differ from the names used for variables.
A subroutine (also called a procedure, function, routine, method, or subprogram) is a portion of code
within a larger program that performs a specific task and is relatively independent of the remaining
code. A subroutine is often coded so that it can be started ("called") several times and/or from
several places during a single execution of the program, including from other subroutines, and then
branch back (return) to the next instruction after the "call" once the subroutine's task is done.
A subroutine may be written so that it expects to obtain one or more data values from the calling
program (its parameters or arguments). It may also return a computed value to its caller (its return
value), or provide various result values or out(put) parameters. Indeed, a common use of
subroutines is to implement mathematical functions, in which the purpose of the subroutine is
purely to compute one or more results whose values are entirely determined by the parameters
passed to the subroutine. (Examples might include computing the logarithm of a number or the
determinant of a matrix.)
3.2.1 Advantages of Subroutine
• Decomposition of a complex programming task into simpler steps: this is one of the two
main tools of structured programming, along with data structures.
• Reducing the duplication of code within a program.
• Enabling the reuse of code across multiple programs.
• Hiding implementation details from users of the subroutine.
3.2.2 Disadvantages
• The invocation of a subroutine (rather than using in-line code) imposes some
computational overhead in the call mechanism itself.
• The subroutine typically requires standard housekeeping code—both at entry to, and exit
from, the function (function prologue and epilogue—usually saving general purpose
registers and return address as a minimum).
3.3 Iteration Mechanisms
Iteration is the technique of marking out a block of statements within a computer program for a
defined number of repetitions; that block of statements is said to be iterated. An iterator is a
mechanism for control abstraction that allows a procedure to be applied iteratively to all members
of a collection in some definite order. The elements of the collection are provided by the iterator
one at a time, but the policy that selects the next element from the collection is hidden from the
user of the iterator and implemented by the iterator. Below is an example of iteration; the line of
code between the brackets of the for loop will iterate three times:
a = 0
for i from 1 to 3 // loop three times
{
a = a + i // add the current value of i to a
}
print a // the number 6 is printed (0 + 1 = 1; 1 + 2 = 3; 3 + 3 = 6)
It is permissible, and often necessary, to use values from other parts of the program outside the
bracketed block of statements, to perform the desired function. In the example above, the line of
code is using the value of i as it increments.
An iteration abstraction is an operation that gives the client an arbitrarily long sequence of values.
Ideally, it computes the elements in the sequence lazily, so it does not waste work computing
elements that might never be used if the client does not look that far into the sequence. There are
four strategies to consider for iteration.
Return an array:
A straightforward approach is to just return a big array containing all the values that might be
wanted. This is not a great approach, because the entire array needs to be computed up front even
if only a few elements are ever used, and if the number of elements in the sequence is very large,
the array will just be too big. Also, returning an array invites implementations that expose the rep
when the underlying implementation stores the elements in an array. To avoid committing the
implementer to using arrays, we can alternatively add observer methods for doing iteration, but
with an array-like interface:
class Collection<T> {
...
int numElements();
T getElement(int n);
...
}
The downside of this is that it is often hard to implement random access to the nth element without
simply computing the whole array of results.
Another idea is to have the collection know about the state of the iteration, with an interface like
this:
class Collection<T> {
...
/** Set the state of the iteration back to the first
element. */
void resetIteration();
/** Return the next element in the sequence of
iterated values. */
T next();
...
}
This approach is tempting, but has serious problems. It makes it hard to share the object across
different pieces of code, because there can be only one client iterating over the object at a time;
otherwise, they will interfere with each other.
Iterator pattern
The standard solution to the problem of supporting iteration abstraction is what is known as
the iterator pattern. It is also known as cursor objects or generators.
The idea is that since we cannot have the collection keep track of the iteration state directly, we
instead put that iteration state into a separate iterator object. Then different clients can each have
their own iteration proceeding on the same object, because each will have its own separate
iterator object. This is what the interface to that object looks like:
interface Iterator<T> {
    boolean hasNext();
    T next();
    void remove();
}
class Collection<T> {
    Iterator<T> iterator();
}
Collection<String> c;
for (Iterator<String> i = c.iterator(); i.hasNext(); ) {
String x = i.next();
// use x here
}
Java even has syntactic sugar for writing a loop like this:
Collection<String> c;
for (String x : c) {
// use x here
}
Under the covers, exactly the same thing is happening when this loop runs, even though you do
not have to declare an iterator object i.
Notice there is another operation in the interface, remove, which changes the underlying
collection to remove the last element returned by next. Not every iterator supports this operation,
because it does not make sense for every iteration abstraction to remove elements.
Because all Java collections provide an iterator method, we can iterate over the elements of a
collection without needing to know what kind of collection it is, or how iteration is implemented
for that collection, or even that the iterator comes from a collection object in the first place. That
is the advantage of having an iteration abstraction.
Specifications for iteration abstractions are similar to ordinary function specifications, but there
are a couple of issues specific to iteration. One is the order in which elements are produced by the
iterator: it is useful to specify whether or not there is an ordering. The second issue is what
operations can be performed during iteration. Usually these consist of the observers; however,
sometimes observers have hidden side effects that conflict with iterators.
Implementing Iterator
Java Iterators are a nice interface for providing iteration abstraction, but they do have a downside:
they are not that easy to implement correctly. There are several problems for the implementer to
confront:
• The iterator must remember where it is in the iteration and be able to resume at that
point. Part of the solution is that the iterator has to have some state.
• The methods next and hasNext do similar work and it can be tricky to avoid
duplicating work across the two methods. The solution is to have hasNext save the work
it does so that next can take advantage of it.
• The underlying collection may be changed either by invoking its methods or by invoking
the remove method on this or another iterator. Mutations except through the current iterator
invalidate the current iterator. The built-in iterators throw
a ConcurrentModificationException in this case.
Example: A list iterator. Suppose we have a linked list with a header object of
class LinkedList (In this example, it is a parameterized class with a type parameter T).
class LinkedList<T> {
ListNode<T> first;
Iterator<T> iterator() {
return new LLIter<T>(first);
}
}
class ListNode<T> {
T elem;
ListNode<T> next;
}
class LLIter<T> implements Iterator<T> {
    ListNode<T> curr;

    LLIter(ListNode<T> first) {
        curr = first;
    }
    public boolean hasNext() {
        return (curr != null);
    }
    public T next() {
        if (curr == null) throw new NoSuchElementException();
        T ret = curr.elem; // return the element stored in the current node
        curr = curr.next;
        return ret;
    }
    public void remove() {
        if (curr != null) {
            // oops! can't implement without a pointer to
            // the previous node.
        }
    }
}
Notice that we can have only one iterator object if any iterator object is mutating the collection.
But if no iterators are mutating the collection, and there are no other mutations to the collection
from other clients, then there can be any number of iterators.
It is actually possible to implement iterators that work in the presence of mutations, but this
requires that the collection object keep track of all the iterators that are attached to it, and update
their state appropriately whenever a mutation occurs.
This example showed how to implement iterators for a collection class, but we can implement
iteration abstractions for other problems. For example, suppose we wanted to do some computation
on all the prime numbers. We could define an iterator object i that iterates over all primes and
use it in a loop:
while (i.hasNext()) {
    int p = i.next();
    // use p
}
Coroutine iterators
Some languages support an easier way to implement iterators, as coroutines. You can think of
these as methods that run on the side of the client and send back values whenever they want,
without returning. Instead of return, a coroutine uses a yield statement to send values to the client.
This makes writing iterator code simple. For example, a coroutine iterator might traverse a tree
of nodes such as:
class TreeNode {
    TreeNode left, right;
    int val;
    ...
}
A final way to support iteration abstraction is to wrap up the body of the loop that you want to do
on each iteration as a function object. Here is how that approach would look as an interface:
class Collection<T> {
void iterate(Function<T> body);
}
interface Function<T> {
/** Perform some operation on elem. Return true if the loop
should
* continue. */
boolean call(T elem);
}
The idea is that iterate calls body.call(v) for every value v to be sent to the client, at exactly
the same places that a coroutine iterator would use yield. The client provides an implementation
of Function<T>.call that does whatever it wants to do with the element v. So instead of writing:
for (int x : c) {
print(x);
}
...
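The continuation elided above can be sketched from the Function<T> interface: the client wraps the loop body in a call method. In the sketch below, IntRange is a stand-in collection of my own, and Function<T> is restated so the example is self-contained.

```java
interface Function<T> {
    boolean call(T elem); // return true if the loop should continue
}

class IntRange {
    // iterate plays the role of Collection.iterate for the values 1..n
    private final int n;
    IntRange(int n) { this.n = n; }

    void iterate(Function<Integer> body) {
        for (int i = 1; i <= n; i++)
            if (!body.call(i)) break; // body.call(v) stands where yield would
    }
}

public class FunctionObjectDemo {
    static String collect(int n) {
        StringBuilder out = new StringBuilder();
        // the "loop body" is supplied as a function object
        new IntRange(n).iterate(x -> { out.append(x); return true; });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collect(3)); // prints 123
    }
}
```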
3.3.2 Iteration Relationship with Recursion
Iteration and recursion both repeatedly execute a set of instructions. Recursion is when a
statement in a function calls that function itself repeatedly, while iteration is when a loop
repeatedly executes until the controlling condition becomes false. The primary difference between
recursion and iteration is that recursion is a process always applied to a function, while iteration
is applied to the set of instructions which we want to be repeatedly executed.
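The contrast can be sketched in Java with factorial computed both ways (the names are illustrative):

```java
public class FactDemo {
    // Recursion: the function calls itself on a simpler instance.
    static long factRec(int n) {
        return (n <= 1) ? 1 : n * factRec(n - 1);
    }

    // Iteration: a loop executes until the controlling condition becomes false.
    static long factIter(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++)
            result *= i;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factRec(5));  // prints 120
        System.out.println(factIter(5)); // prints 120
    }
}
```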
5.0 CONCLUSION
This unit discussed the abstract classes and methods, advantages and disadvantages of an
abstraction. We later discussed procedure mechanism as an abstraction, its characteristics and
procedure implementation as the principal actions associated with a procedure call/return
mechanism. The unit then took the students through the functions and subroutines mechanism,
the advantages and disadvantages of subroutines and, finally, iteration mechanisms.
6.0 SUMMARY
A subprogram definition describes the actions represented by the subprogram. Subprograms can
be either functions or procedures. We have seen a couple of good options for declaring and
implementing iteration abstractions. Recursion defines an operation in terms of simpler instances
of itself; it depends on procedural abstraction. Iteration repeats an operation for its side effect(s).
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Parameter Passing Method
3.1.1 Function Parameters
3.1.2 Value of using Parameter
3.2 Parameter Techniques
3.2.1 Parameter-by-value
3.2.2 Parameter-by-reference
3.3 Activation Records
3.4 Memory/Storage Management
3.4.1 Static Allocation
3.4.2 Dynamic/Stack Allocation
3.4.3 Heap Allocation
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
2.0 OBJECTIVES
3.0 MAIN CONTENT
3.1 Parameter Passing Method
There are some issues to consider when using local variables instead of global variables:
• How can the value of a variable in one sub-program be made accessible to another sub-program?
• What happens if the value of a variable needed in more than one subprogram is to change
within one subprogram but not in the main program?
The solution to these issues is parameter passing. Parameter passing allows the values of
local variables within a main program to be accessed, updated and used within multiple sub-
programs without the need to create or use global variables.
A parameter is a special kind of variable used in a subroutine to refer to one of the pieces of data
provided as input to the subroutine. These pieces of data are called arguments. An ordered list of
parameters is usually included in the definition of a subroutine so that, each time the subroutine
is called, its arguments for that call can be assigned to the corresponding parameters.
Parameters identify values that are passed into a function. For example, a function to add three
numbers might have three parameters. A function has a name, and it can be called from other
points of a program. When that happens, the information passed is called an argument.
Each function parameter has a type followed by an identifier, and each parameter is separated from
the next by a comma. The parameters pass arguments to the function. When a program
calls a function, all the parameters are variables, and the value of each of the resulting arguments
is copied into its matching parameter in a process called pass-by-value. The program uses
parameters and returned values to create functions that take data as input, make a calculation with
it and return the value to the caller.
• Parameters allow a function to perform tasks without knowing the specific input values
ahead of time.
• Parameters are indispensable components of functions; which programmers use to divide
their code into logical blocks.
The terms parameter and argument are sometimes used interchangeably. However, parameter
refers to the type and identifier, and arguments are the values passed to the function. In the
following C++ example, int a and int b are parameters, while 5 and 3 are the arguments passed
to the function.
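The listing referred to appears to be missing from this printing; an equivalent sketch (rendered here in Java rather than C++, with an assumed function name add) keeps the same parameter names and arguments:

```java
public class ParamDemo {
    // int a and int b are the parameters
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(5, 3)); // 5 and 3 are the arguments; prints 8
    }
}
```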
There are a number of different ways a programming language can pass parameters:
• Pass-by-value
• Pass-by-reference
• Pass-by-value-result
• Pass-by-name
3.2.1 Pass-by-value
This method uses in-mode semantics: changes made to the formal parameter do not get
transmitted back to the caller. Any modifications to the formal parameter variable inside the
called function or method affect only a separate storage location and will not be reflected in
the actual parameter in the calling environment. Example of C++ code:
#include <iostream>
#include <cstdlib>
In a pass-by-value system, the statement AssignTo(x) creates a copy of the argument x. This
copy has the same value as the original argument (hence the name "pass-by-value"), but it does
not have the same identity. Thus the assignment within AssignTo modifies a variable that is
distinct from the supplied parameter. Thus, the output in this passing style is 3 and not 5.
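The C++ listing itself is not reproduced above; the behaviour described can be sketched in Java, where primitives are always passed by value (AssignTo is the function named in the text, rendered here as assignTo):

```java
public class PassByValueDemo {
    static void assignTo(int param) {
        param = 5; // modifies only the local copy
    }

    static int demo() {
        int x = 3;
        assignTo(x); // a copy of x is passed, not x itself
        return x;    // x is unchanged
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 3, not 5
    }
}
```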
3.2.2 Pass-by-reference
This technique uses in/out-mode semantics: changes made to the formal parameter do get
transmitted back to the caller through parameter passing. Any changes to the formal parameter
are reflected in the actual parameter in the calling environment, since the formal parameter
receives a reference (or pointer) to the actual data. Example:
#include <iostream>
#include <cstdlib>
In a pass-by-reference system, the function invocation does not create a separate copy of the
argument; rather, a reference to the argument (hence the name "pass-by-reference") is supplied
into the AssignTo function. Thus, the assignment in AssignTo modifies not just the variable
named param, but also the variable x in the caller, causing the output to be 5 and not 3.
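Java has no reference parameters for primitives, so the C++ behaviour described above can only be emulated, for instance with a mutable holder object; a sketch under that assumption:

```java
public class PassByRefDemo {
    static class IntHolder { int value; }

    static void assignTo(IntHolder param) {
        param.value = 5; // mutates the caller's object through the reference
    }

    static int demo() {
        IntHolder x = new IntHolder();
        x.value = 3;
        assignTo(x);    // the reference is copied, but the holder is shared
        return x.value; // now 5
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 5, not 3
    }
}
```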
#include <stdio.h>
void doit(int z);   /* definition elided in the original */
int main()
{
    int z = 27;
    doit( z );
    printf("z is now %d\n", z);
    return 0;
}
To aid the understanding of the learner, we explain the parameter passing mechanisms in Java as
follows:
In most cases, parameter passing in Java is always pass-by-value. However, the observed
behaviour changes depending upon whether we are dealing with primitives or objects: the two
behaviours commonly described as pass-by-value and pass-by-reference occur under the
following conditions.
In Java, all objects are dynamically stored in heap space under the hood. These objects are referred
to by reference variables. A Java object, in contrast to a primitive, is stored in two stages: the
reference variable is stored in stack memory, and the object that it refers to is stored in heap
memory.
Whenever an object is passed as an argument, an exact copy of the reference variable is created
which points to the same location of the object in heap memory as the original reference variable.
As a result of this, whenever we make any change in the same object in the method, that change is
reflected in the original object. However, if we allocate a new object to the passed reference
variable, then it won't be reflected in the original object.
Pass-by-Reference in Java
The fundamental concepts in any programming language are “values” and “references”. In Java,
primitive variables store the actual values, whereas non-primitives store the reference variables
which point to the addresses of the objects they're referring to. Both values and references are
stored in the stack memory. That is, arguments in Java are always passed-by-value.
The control stack is a runtime stack used to keep track of live procedure activations, i.e. to find
the procedures whose execution has not yet completed. When a procedure is called (its activation
begins), its record is pushed onto the stack; when it returns (its activation ends), the record is
popped. An activation record holds the information needed by a single execution of a procedure:
it is pushed onto the stack when the procedure is called and popped when control returns to the
caller.
A general activation record contains the following fields:
• Return value
• Actual parameters
• Control link
• Access link
• Saved machine status
• Local data
• Temporaries
Memory management is one of the functions of the interpreter associated with an abstract machine.
This functionality manages the allocation of memory for programs and for data: it determines
how they must be arranged in memory, how long they may remain there, and which auxiliary
structures are required to retrieve information from memory. The main ways to allocate memory
are:
Static memory management is performed by the compiler before execution starts. Statically
allocated memory objects reside in a fixed zone of memory (determined by the compiler)
and remain there for the entire duration of the program's execution, because the compiler can
decide the amount of storage needed by each data object. It is therefore easy for the compiler
to identify the address of each such object. In static allocation, names are bound to storage
locations at compile time.
Typical elements for which it is possible to allocate memory statically are global variables: they
are visible throughout the program, so they can be stored in a memory area that is fixed before
execution begins. The object code instructions produced by the compiler can be considered
another kind of static object, given that they normally do not change during the execution of the
program, so in this case also memory is allocated by the compiler. Constants are other
elements that can be handled statically (in the case in which their values do not depend on other
values which are unknown at compile time). Finally, various compiler-generated tables, necessary
for the runtime support of the language (for example, for handling names, for type checking and
for garbage collection), are stored in reserved areas allocated by the compiler.
If memory is allocated at compile time, it is allocated in the static area exactly once. Static
allocation does not support dynamic data structures: memory is reserved at compile time and
released only after program completion. The drawbacks of static storage allocation are that the
size and position of every data object must be known at compile time, and that recursive
procedures cannot be supported, since all activations of a procedure would have to share the
same statically allocated storage.
Most modern programming languages allow block structuring of programs. Blocks, whether in-line
or associated with procedures, are entered and left according to a LIFO scheme: if a block A is
entered and then a block B is entered before leaving A, B must be left before A. It is therefore
natural to manage the memory space required to store the information local to each block using a
stack.
Example:
A: { int a = 1;
     int b = 0;
     B: { int c = 3;
          int b = 3;
        }
     b = a + 1;
   }
At runtime, when block A is entered, a push operation allocates a space large enough to hold the
variables a and b, as shown in the diagram below.
When block B is entered, we have to allocate a new space on the stack for the variables c and b
(recall that the inner variable b is different from the outer one) and therefore the situation, after
this second allocation, is that shown in diagram below.
When block B exits, on the other hand, it is necessary to perform a pop operation to deallocate
the space that had been reserved for the block from the stack. The situation after such a deallocation
and after the assignment is shown in the diagram.
Analogously, when block A exits, it will be necessary to perform another pop to deallocate the
space for A as well. The case of procedures is analogous. The memory space, allocated on the
stack, dedicated to an in-line block or to an activation of a procedure is called the activation record,
or frame. Note that an activation record is associated with a specific activation of a procedure (one
is created when the procedure is called) and not with the declaration of a procedure. The values
that must be stored in an activation record (local variables, temporary variables, etc.) are indeed
different for the different calls on the same procedure. The stack on which activation records are
stored is called the runtime (or system) stack.
It should finally be noted that, to improve the use of runtime memory, stack-based dynamic
memory management is sometimes also used to implement languages that do not support
recursion. If the average number of simultaneously active calls to the same procedure is less
than the number of procedures declared in the program, using a stack will save space, for there
will be no need to allocate a memory area for each declared procedure, as must be done in the
case of entirely static management.
For example, the execution of a block such as
{ int a = 3;
  b = (a + x) / (x + y); }
could take the form shown in the table below, where the intermediate results (a+x) and (x+y)
are explicitly stored before the division is performed.
The need to store intermediate results on the stack depends on the compiler being used and on the
architecture to which one is compiling. On many architectures they can be stored in registers.
Local variables: Local variables, which are declared inside blocks, must be stored in a memory
space whose size depends on the number and types of the variables. This information is in general
known to the compiler, which can therefore determine the size of this part of the activation
record. In some cases, however, there are declarations which depend on values known only at
runtime (this is, for example, the case for dynamic arrays, present in some languages, whose
dimensions depend on variables which are instantiated only at execution time). In these cases, the
activation record also contains a variable-length part which is defined at runtime.
Dynamic chain pointer: This field stores a pointer to the previous activation record on the stack
(or to the last activation record created). This information is necessary because, in general,
activation records have different sizes. It is also called dynamic link or control link. The set of
links implemented by these pointers is called the dynamic chain.
The case of procedures and functions is analogous to that of in-line blocks but with some additional
complications due to the fact that, when a procedure is activated, it is necessary to store a greater
amount of information to manage correctly the control flow. The structure of a generic activation
record for a procedure is shown in table below.
A function, unlike a procedure, returns a value to the caller when it terminates its execution.
Activation records for the two cases are therefore identical, with the exception that, for functions,
the activation record must also keep track of the memory location in which the function stores its
return value. Let us examine the various fields of an activation record:
• Intermediate results, local variables, dynamic chain pointer: The same as for in-line blocks.
• Static chain pointer: This stores the information needed to implement the static scope rules.
• Return address: Contains the address of the first instruction to execute after the call to the
current procedure/function has terminated execution.
• Returned result: Present only in functions. Contains the address of the memory location
where the subprogram stores the value to be returned by the function when it terminates.
This memory location is inside the caller’s activation record.
• Parameters: The values of the actual parameters used to call the procedure or function are
stored here.
The organization of the different fields of the activation record varies from implementation to
implementation. The dynamic chain pointer and, in general, every pointer to an activation record,
points to a fixed (usually central) area of the activation record. The addresses of the different fields
are obtained, therefore, by adding a negative or positive offset to the value of the pointer.
Variable names are not normally stored in activation records and the compiler substitutes
references to local variables for addresses relative to a fixed position in (i.e., an offset into) the
activation record for the block in which the variables are declared. This is possible because the
position of a declaration inside a block is fixed statically and the compiler can therefore associate
every local variable with an exact position inside the activation record.
Also in the case of references to non-local variables, it is possible to use mechanisms that avoid
storing names, and therefore avoid having to perform a runtime name-based search through the
activation record stack in order to resolve a reference. Finally, modern compilers often optimize
the code they produce and save some information in registers instead of in the activation record.
In any case, for greater clarity, in the examples we assume that variable names are stored in
activation records. To conclude, note that all the observations made about variable names, their
accessibility and their storage in activation records can be extended to other kinds of denotable
object.
Heap allocation is the most flexible allocation scheme. Allocation and deallocation of memory can
be done at any time and any place, depending upon the user's requirements. Heap allocation is used
to allocate memory to variables dynamically and to reclaim that memory when the variables are
no longer used.
• Heap management relies on specialized data structures, and there is generally some time and
space overhead associated with the heap manager. For efficiency reasons, it may be useful to
handle small activation records of particular sizes as a special case, as follows: for each
size of interest, keep a linked list of free blocks of that size.
• If possible, fill a request for size s with a block of size s', where s' is the smallest available
size greater than or equal to s. When the block is deallocated, return it to the corresponding
linked list.
• For larger blocks of storage, use the general heap manager.
Since memory deallocation operations are not necessarily performed in the reverse order of
allocations (a block p allocated before a block q may nevertheless be freed first), heap memory
cannot be managed in LIFO order. Heap management methods fall into two main categories,
according to whether the memory blocks are considered to be of fixed or variable length.
Fixed-Length Blocks
The heap is divided into a certain number of elements, or blocks, of fairly small fixed length, linked
into a list structure called the free list, as seen in the diagram below:
At runtime, when an operation requires the allocation of a memory block from the heap (for
example using the malloc command), the first element of the free list is removed from the list,
the pointer to this element is returned to the operation that requested the memory, and the pointer
to the free list is updated so that it points to the next element.
When memory is, on the other hand, freed or deallocated (for example using free), the freed block
is linked again to the head of the free list. The situation after some memory allocations is seen in
the diagram below.
Free list for heap of fixed-size blocks after allocation of some memory. Grey blocks are allocated
(in use)
Conceptually, therefore, management of a heap with fixed-size blocks is simple, provided it is
easy to identify and reclaim the memory that must be returned to the free list.
Variable-Length Blocks
In the case in which the language allows the runtime allocation of variable-length memory spaces,
for example to store an array of variable dimension, fixed-length blocks are no longer adequate.
In fact; the memory to be allocated can have a size greater than the fixed block size, and the storage
of an array requires a contiguous region of memory that cannot be allocated as a series of blocks.
In such cases, a heap-based management scheme using variable-length blocks is used.
This type of management uses different techniques, defined mainly with the aim of improving
memory utilisation and the execution speed of heap-management operations. These operations
are performed at runtime and therefore affect the execution time of the program. The two goals
are difficult to reconcile, and good implementations tend towards a reasonable compromise. As
far as memory utilisation is concerned, in particular, it is a goal to avoid the phenomenon of
memory fragmentation.
So-called internal fragmentation occurs when a block of size strictly larger than that requested by
the program is allocated. The portion of unused memory internal to the block is clearly wasted
until the block is returned to the free list. But this is not the most serious problem, because
external fragmentation is worse. This occurs when the free list is composed of blocks of relatively
small size, so that even if the sum of the total available free memory is sufficient, the free memory
cannot be effectively used.
The diagram below shows an example of this problem. If we have blocks of size x and y on the
free list and we request the allocation of a block of greater size, our request cannot be satisfied
despite the fact that the total amount of free memory is greater than the amount of memory that
has been requested.
External Fragmentation
The memory allocation techniques tend therefore to compact free memory, merging contiguous
free blocks in such a way as to avoid external fragmentation. To achieve this objective, merging
operations can be called which increase the load imposed by the management methods and
therefore reduce efficiency.
Techniques used for heap management include the following:
• Garbage Collection Method: When every access path to a data object has been destroyed but
the object itself continues to exist, the object is said to be garbage. Garbage collection
is a technique that is used to reuse such object space: first, all the active (reachable) objects
are marked, and then all remaining objects are treated as garbage and returned to the free-space
list.
The desirable properties of a heap memory manager include:
• Space Efficiency − A memory manager should minimize the total heap space needed by a
program.
• Program Efficiency − A memory manager should make good use of the memory
subsystem to allow programs to run faster, since the time taken to execute an instruction can
vary widely depending on where objects are placed in memory.
• Low Overhead − Memory allocation and deallocation are frequent operations in many
programs, so they must be as efficient as possible; that is, the fraction of execution time spent
performing allocation and deallocation must be minimized.
4.0 SELF-ASSESSMENT EXERCISE(S)
1. There are a number of different ways a programming language can pass parameters;
enumerate them and discuss the most common one.
2. What is the function of an activation record in a program?
3. List the three (3) types of memory allocation.
5.0 CONCLUSION
We examined parameter passing and discussed the most common mechanisms with some code
samples. We also discussed the activation record and memory management: the main techniques
for both static and dynamic memory management, illustrating the cases that can be handled with
a stack and those that require the use of a heap. We illustrated in detail stack-based management
and the format of activation records for procedures and in-line blocks.
6.0 SUMMARY
Formal parameters are the names that subprograms use to refer to the actual parameters given in
subprogram calls. Actual parameters can be associated with formal parameters by position or by
keyword. Parameters can have default values. Subprograms can be either functions, which model
mathematical functions and are used to define new operations, or procedures, which define new
statements. Local variables in subprograms can be stack dynamic, providing support for recursion,
or static, providing efficiency and history-sensitive local variables. In the case of heap-based
management, we saw some of the more common techniques for handling both fixed- and
variable-sized blocks, together with the fragmentation problem and some methods that can be
used to limit it.
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Object Oriented Programming
3.2 Basic Concepts of Object Oriented Programming
3.2.1 Sample Object Oriented Programming Program in C/C++
3.2.2 C/C++ Program Structure
3.3 Features of Object Oriented Programming
3.4 Benefits of Object Oriented Programming Language
3.5 Limitations of Object Oriented Programming Language
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
Modular programming is the process of subdividing a computer program into separate sub-
programs. A module is a separate software component. It can often be used in a variety of
applications and functions with other components of the system.
• Some programs might have thousands or millions of lines, and managing such programs
becomes quite difficult, as there might be many syntax errors or logical errors present in
the program; the concept of modular programming was introduced to manage programs of
this type.
• Each sub-module contains everything necessary to execute only one aspect of the desired
functionality.
• Modular programming emphasises breaking large programs into small problems to
increase the maintainability and readability of the code and to make the program easy to
change in future or to correct errors.
2.0 OBJECTIVES
The main necessity behind inventing the object-oriented approach was to remove the drawbacks
encountered in the procedural approach. The object-oriented paradigm treats data as a critical
element in program development and holds it tightly, rather than allowing it to move freely
around the system. It ties data to the functions that operate on it, and hides and protects it from
accidental modification by external functions. The object-oriented programming paradigm
decomposes a system into a number of entities called objects and then ties properties and
functions to these objects. An object’s properties can be accessed only by the functions associated
with that object, but functions of one object can access the functions of other objects, in some
cases using access specifiers.
Object-oriented methods enable us to create sets of objects that work together to produce
software that is easier to understand and that models its problem domain better than software
produced using traditional techniques. Software produced using the object-oriented paradigm is
easier to adapt to changing requirements, easier to maintain, organised into modules of
functionality, better designed, more robust, and performs the desired work efficiently.
Object orientation techniques work more efficiently than traditional techniques due to the
following reasons:
• A higher level of abstraction: The top-down approach supports abstraction at the function
level, while the object-oriented approach supports abstraction at the object level.
• A seamless transition among different software development phases: The same language is
used for all phases, which reduces complexity and redundancy and makes software
development clearer and more robust.
• Good programming practice: The subroutines and attributes of a class are held together
tightly.
• Improved reusability: Object orientation supports inheritance, so classes can be built from
each other; only the differences and enhancements between classes need to be designed and
coded, while all the previous functionality remains as it is and can be used without change.
• Objects: Objects are real or abstract items that contain data defining the object, together
with methods that can manipulate that data. An object is thus a combination of data and
methods.
• Classes: A class is a group of objects that have the same properties and behaviour and the
same kinds of relationships and semantics. Once a class has been defined, we can create any
number of objects belonging to that class; objects are variables of the class. Each object is
associated with data of the class with which it was created: a class is the collection of
objects of a similar type.
The C++ programming language was created, designed and developed by the Danish computer
scientist Bjarne Stroustrup at Bell Telephone Laboratories in Murray Hill, New Jersey,
in 1979. He wanted a flexible and dynamic language that was similar to C with all its
features, but with the addition of stronger type checking, basic inheritance, default function
arguments, classes, inlining, and so on.
The C++ language is both a procedural and an object-oriented programming language, and is
a combination of low-level and high-level features, so it is regarded as an intermediate (middle-
level) language. C++ is sometimes called a hybrid language because it is possible to write object-
oriented or procedural code in the same C++ program. This has caused some concern that some
C++ programmers write procedural code but are under the impression that it is object-oriented
simply because they are using C++. Often a program is an amalgamation of the two, which
usually causes most problems when the code is revisited or the task is taken over by another coder.
C/C++ continues to be used and is one of the preferred programming languages for developing
professional applications. C/C++ has dynamic memory allocation but does not have garbage
collection, which allows programs to misuse or leak memory. It also supports raw memory
pointers and pointer arithmetic, which are powerful but error-prone. C++ has no single official
implementation: early compilers followed the old tradition of translating C++ into C, while
others are native compilers. C/C++ is often called the mother of modern high-level languages.
An example of a C++ compiler is the GNU C/C++ compiler (GCC).
• The C++ language defines several headers, which contain information that is either
necessary or useful to your program. For this program, the header <iostream> is
needed.
• The line using namespace std; tells the compiler to use the std namespace.
Namespaces are a relatively recent addition to C++.
• The next line '// main() is where program execution begins.' is a
single-line comment available in C++. Single-line comments begin with // and stop at the
end of the line.
• The line int main() is the main function where program execution begins.
• The next line cout << "Hello World"; causes the message "Hello World" to be
displayed on the screen.
• The next line return 0; terminates the main() function and causes it to return the
value 0 to the calling process.
• Semicolons and Blocks in C++: In C++, the semicolon is a statement terminator.
That is, each individual statement must be ended with a semicolon. It indicates the end of
one logical entity. For example, following are three different statements:
x = y;
y = y + 1;
add(x, y);
A block is a set of logically connected statements that are surrounded by opening and closing
braces. For example:
• Programs developed using an OOP language tend to be much larger than those written with
the procedural approach. Since the program becomes larger in size, it requires more time to
execute, which leads to slower execution of the program.
• We cannot apply OOP everywhere, as it is not a universal approach. It is applied only
when it is required and is not suitable for all types of problems.
• Programmers need good design skill and programming skill, along with proper planning,
because using OOP can be a little tricky.
• OOP takes time to get used to; the thought process involved in object-oriented
programming may not be natural for some people.
• Everything is treated as an object in OOP, so before applying it we need to be able to think
effectively in terms of objects.
5.0 CONCLUSION
Object orientation is so called because this method sees the things that are part of the real world
as objects. In this unit, we have explained the concept of object-oriented programming, given a
sample program in C/C++ (an intermediate, middle-level language) and explained its structure.
The features, benefits and limitations of object-oriented programming were also examined for
proper understanding.
6.0 SUMMARY
There has been a general but deep examination of the object-oriented paradigm, introduced as a
way of obtaining abstractions in the most flexible and extensible way. The paradigm is
characterised by: encapsulation of data; a compatibility relation between types, which we call
subtyping; a way to reuse code, which we refer to as inheritance and which can be divided into
single and multiple inheritance; and techniques for the dynamic dispatch of methods. These four
concepts were viewed principally in the context of class-based languages. We then discussed the
implementation of single and multiple inheritance and of dynamic method selection. The unit
concluded with a study of some aspects of the type systems of object-oriented languages:
subtype polymorphism; the form of parametric polymorphism which is possible when the
language admits generics; and the problem of overriding with co- and contravariant methods.
The object-oriented paradigm, in addition to describing a specific class of languages, is also a
general software development method, codified by semiformal rules, based on the organisation
of concepts using objects and classes.
8.0 REFERENCES/FURTHER READING
1. K. Arnold, J. Gosling, and D. Holmes. The Java Programming Language, 4th edition.
Addison-Wesley Longman, Boston, 2005.
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Functional Programming Languages
3.1.1 Categories of Functional Programming Language
3.1.2 Concepts of Functional Programming Language
3.1.3 Characteristics of Functional Programming Language
3.1.4 Benefits of Functional Programming Language
3.1.5 Advantages of Functional Programming Language
3.1.6 Limitations of Functional Programming Language
4.0 Self-Assessment Exercise
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
2.0 OBJECTIVES
Lambda calculus can be called the smallest programming language in the world. It gives a
definition of what is computable: anything that can be computed by the lambda calculus is
computable. It is equivalent to the Turing machine in its ability to compute, and it provides a
theoretical framework for describing functions and their evaluation. It forms the basis of almost
all current functional programming languages. Functional programming (also called FP) is a way
of thinking about software construction based on creating pure functions. It avoids the shared
state and mutable data found in OOP.
The objective of any FP language is to mimic mathematical functions; the basic process of
computation, however, is different from that of imperative programming. Some of the most
prominent functional programming languages are: Lisp, Python, Haskell, SML, Clojure, Scala,
Erlang, Clean, F#, Scheme, XSLT, SQL and Mathematica.
A pure function is a function whose inputs are all declared as inputs (none of them hidden) and
whose outputs are all declared as outputs. A pure function acts only on its parameters, and it is of
no use if it does not return anything; moreover, it always offers the same output for the same
parameters. Example:
Function Pure(a, b)
{
    return a + b;
}
Impure Functional Languages: These functional languages support the functional paradigms as
well as imperative-style programming; examples are printf(), rand(), time(), etc.
Functions that have hidden inputs or outputs are called impure. Impure functions cannot be used
or tested in isolation, as they have dependencies.
Example:
int z;
function notPure() {
    z = z + 10;   // modifies the hidden variable z
}
• Pure functions: These functions have two main properties. First, they always produce
the same output for the same arguments, irrespective of anything else. Secondly, they have
no side effects, i.e. they do not modify any arguments, local/global variables or input/output
streams, as said earlier; the latter property is called immutability. A pure function's only
result is the value it returns, and pure functions are deterministic. Programs written in the
functional style are easy to debug because pure functions have no side effects or hidden I/O.
Pure functions also make it easier to write parallel/concurrent applications. When code is
written in this style, a smart compiler can do many things: it can parallelize the instructions,
delay the evaluation of results until they are needed, and memoize results, since a result
never changes as long as the input does not change. Example of a pure function:
sum(x, y)            // sum is a function taking x and y as arguments
    return x + y     // returns the sum of x and y without changing them
• Recursion: There are no “for” or “while” loops in functional languages. Iteration in
functional languages is implemented through recursion: recursive functions repeatedly
call themselves until the base case is reached. Example of a recursive function:
fib(n)
    if (n <= 1)
        return 1;
    else
        return fib(n - 1) + fib(n - 2);
• Referential transparency: In functional programs, variables, once defined, do not change
their value throughout the program. Functional programs do not have assignment
statements; if we have to store a value, we define a new variable instead. This
eliminates any chance of side effects, because any variable can be replaced with its actual
value at any point of execution: the state of any variable is constant at any instant. Example:
• Bug-Free Code: Functional programming does not support mutable state, so there are no side
effects, which makes it easier to write error-free code.
• Efficient Parallel Programming: Functional programming languages have no mutable
state, so there are no state-change issues. Functions can be programmed to work in parallel
as "instructions". Such code supports easy reusability and testability.
• Efficiency: Functional programs consist of independent units that can run concurrently. As
a result, such programs are more efficient.
• Supports Nested Functions: Functional programming supports Nested Functions.
• Lazy Evaluation: Functional programming supports Lazy Functional Constructs like Lazy
Lists, Lazy Maps, etc.
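Lazy evaluation can be sketched in Python with iterators (an illustrative analogy to lazy lists, not from the course text): nothing is computed until values are actually demanded.

```python
from itertools import count, islice

# A lazy "list": map over an infinite stream of numbers. No square is
# computed until islice demands it, so the infinite stream is safe.
squares = map(lambda n: n * n, count())
print(list(islice(squares, 5)))  # [0, 1, 4, 9, 16]
```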
As a downside, functional programming requires a large memory space: since it does not have state,
new objects must be created every time an action is performed. Functional programming is used in
situations where we have to perform lots of different operations on the same set of data.
• Lisp is used for artificial intelligence applications like Machine learning, language
processing, Modeling of speech and vision, etc.
• Embedded Lisp interpreters add programmability to some systems like Emacs.
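The idea of performing many different operations on the same set of data can be sketched in Python with the functional primitives map, filter and reduce (the names and data are illustrative):

```python
from functools import reduce

# One data set, several independent operations, no mutation of data.
data = [3, 9, 5, 8, 2, 4, 7]

evens = list(filter(lambda x: x % 2 == 0, data))    # selection
squares = list(map(lambda x: x * x, data))          # transformation
total = reduce(lambda a, b: a + b, data)            # combination

print(evens)    # [8, 2, 4]
print(squares)  # [9, 81, 25, 64, 4, 16, 49]
print(total)    # 38
```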
5.0 CONCLUSION
In this unit, we have explained functional programming languages in detail. We
discussed the concepts of functional languages, gave some other examples of functional languages,
and showed how the structure of a functional program should look. The characteristics,
features, advantages and disadvantages of functional languages were also discussed.
6.0 SUMMARY
In its pure form, the functional programming paradigm is a computational model that does
not contain the concept of a modifiable variable. Computation proceeds by the rewriting of terms
that denote functions. The main concepts are: abstraction, a mechanism for passing from
an expression denoting a value to one denoting a function; application, a mechanism dual to
abstraction, by which a function is applied to an argument; and computation by rewriting, or
reduction, in which an expression is repeatedly simplified until a form that cannot be further
reduced is encountered.
7.0 TUTOR-MARKED ASSIGNMENT
1. Is the define primitive of Scheme an imperative language feature? Why or why not?
2. Discuss, with examples, the categories of functional programming languages and show
how they can be differentiated in a program.
3. Write purely functional Scheme functions to:
(a) return all rotations of a given list. For example, (rotate '(a b c d e)) should
return ((a b c d e) (b c d e a) (c d e a b) (d e a b c)
(e a b c d)) (in some order).
(b) return a list containing all elements of a given list that satisfy a given predicate. For
example, (filter (lambda (x) (< x 5)) '(3 9 5 8 2 4 7)) should
return (3 2 4).
4. What are the differences between functional languages and object-oriented languages?
8.0 REFERENCES/FURTHER READING
1. J. Backus. Can programming be liberated from the von Neumann style? A functional style
and its algebra of programs. Commun. ACM, 21(8):613–641, 1978.
doi:10.1145/359576.359579.
2. H. Barendregt. The Lambda Calculus: Its Syntax and Semantics. Elsevier, Amsterdam,
1984.
3. A. Church. The Calculi of Lambda Conversion. Princeton Univ. Press, Princeton, 1941.
4. G. Cousineau and M. Mauny. The Functional Approach to Programming. Cambridge Univ.
Press, Cambridge, 1998.
5. R. Hindley and P. Seldin. Introduction to Combinators and Lambda-Calculus. Cambridge
Univ. Press, Cambridge, 1986.
6. P. J. Landin. The mechanical evaluation of expressions. The Computer Journal, 6(4):308–
320, 1964.
7. J. McCarthy. Recursive functions of symbolic expressions and their computation by
machine, part I. Commun. ACM, 3(4):184–195, 1960. doi:10.1145/367177.367199.
8. R. Milner and M. Tofte. Commentary on Standard ML. MIT Press, Cambridge, 1991.
9. R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of Standard ML—
Revised. MIT Press, Cambridge, 1997.
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Logic Programming Languages
3.1.1 Logic Programming Terms
3.1.2 Logic Programming Division
3.1.3 Sample Logic Program in Prolog
3.1.4 Structure of Prolog Program
3.2 Characteristics of Logic Programming Language
3.3 Features of Logic Programming Language
3.4 Advantages/Benefits of Logic Programming Language
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
The logic programming paradigm includes both theoretical and fully implemented languages, of
which the best known is Prolog. Even if there are big differences of a pragmatic and, for some, of a
theoretical nature between these languages, they all share the idea of interpreting computation as
logical deduction.
2.0 OBJECTIVES
• Adopt the approach that has characterized the rest of the text while examining this
paradigm.
• Provide enough basis for understanding and, in a short time, mastering this and other logic
programming languages.
• Examine these concepts while trying to limit the theoretical part.
3.1 Logic Programming Languages
Prolog (programming in logic) is a declarative programming language based on the ideas of logic
programming. The idea behind Prolog was to make logic look like a programming language, and also
to allow logic to be controlled by a programmer in order to advance research in theorem proving.
Computer scientists are using logic programming methods to try to allow machines to reason
because this method is useful for knowledge representation. The logic paradigm is based on first-
order predicate logic.
3.1.1 Logic Programming Terms
In logic programming, the logic used to represent knowledge is the clausal form, a subset of
first-order predicate logic. First-order logic is used because of its understandable nature and
because it can represent all computational problems. The resolution inference system is used to
manipulate knowledge; it is required for proving theorems in clausal-form logic. The essence of
logic programming is shown in the diagram below.
In the logic paradigm, a program consists of a set of predicates and rules of inference. Predicates
are fact-based statements, e.g. "water is wet". Rules of inference, on the other hand, are
statements such as "if X is human then X is mortal".
There are multiple logic programming languages. The most common is Prolog, or programming
in logic, which can also interface with other programming languages such as Java and
C. On top of being the most popular logic programming language, Prolog was also one of the
first such languages, with the first Prolog programs created in the 1970s. Prolog was
developed using first-order logic, also called predicate logic, which allows for the use of
variables rather than propositions. Prolog utilizes artificial intelligence (AI) to help form its
conclusions and can quickly process large amounts of data. Prolog can be run with or without
manual inputs, meaning it can be programmed to run automatically as part of data processing.
Logic programming, and especially Prolog, can help businesses and organizations through:
• Natural language processing: Natural language processing (NLP) allows for better
interactions between humans and computers. NLP can listen to human language in real
time, then process and translate it for computers, allowing technology to understand
natural language. NLP is not limited to spoken language, however: it can also be used to
read and understand documentation, whether in physical print or from word-processing
programs. NLP is used by technologies such as Amazon Alexa and Google Home to process
and understand spoken instructions, as well as by email applications to filter spam
emails and warn of phishing attempts.
• Database management: Logic programming can be used for the creation, maintenance,
and querying of NoSQL databases, and can create databases out of big data. The program
can identify which information has been marked as relevant and store it in the appropriate
area. Users can then query these databases with specific questions, and logic languages
can quickly sift through all of the data, run analyses, and return the relevant result with
no additional work required by the user.
• Predictive analysis: With large data sets, logic languages can search for inconsistencies
or areas of differentiation in order to make predictions. This can be useful in identifying
potentially dangerous activities or in predicting failures of industrial machines. It can
also be used to analyze photos and make predictions about the images, such as identifying
objects in satellite photos or recognizing the patterns that differentiate craters from
regular land masses.
For instance: Socrates is a man, all men are mortal, and therefore Socrates is mortal.
The following simple Prolog program expresses this instance:
man(socrates).
mortal(X) :- man(X).
The first line can be read: "socrates is a man". It is a base clause, which represents a simple fact.
The second line can be read: "X is mortal if X is a man"; in other words, "All
men are mortal". This is a clause, or rule, for determining when its input X is
mortal. (The symbol ":-", sometimes called a turnstile, is pronounced "if".) We can test the
program by asking the question:
?- mortal(socrates).
that is, "Is socrates mortal?" (The "?-" is the computer's prompt for a question.) Prolog will
respond "yes". Another question we may ask is:
?- mortal(X).
That is, "Who (X) is mortal?" Prolog will respond "X = socrates".
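For readers more familiar with conventional languages, the deduction above can be sketched in Python (the fact set and function name are illustrative; this is not a real Prolog engine):

```python
# A tiny sketch of the deduction performed by the Prolog program above:
# one fact and one rule over a set of facts.
facts = {("man", "socrates")}

def mortal(x):
    # Rule: mortal(X) :- man(X).
    return ("man", x) in facts

print(mortal("socrates"))  # True
print(mortal("zeus"))      # False -- not known to be a man
```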
To give you an idea: Paul is Bhoujou's and Lizy's father, and Sarahyi is Bhoujou's and Lizy's mother.
Now, if someone asks a question like "who is the father of Bhoujou and Lizy?" or "who is the
mother of Bhoujou and Lizy?", we can teach the computer to answer these questions using logic
programming.
Example in Prolog:
father(paul, bhoujou).
father(paul, lizy).
mother(sarahyi, bhoujou).
mother(sarahyi, lizy).
?- mother(X, bhoujou).
X = sarahyi
Explanation:
We are asking Prolog what value of X makes this statement true. X must be sarahyi to make
the statement true, so Prolog responds X = sarahyi.
Example:
domains
    being = symbol
predicates
    animal(being)           % all animals are beings
    dog(being)              % all dogs are beings
    die(being)              % all beings die
clauses
    animal(X) :- dog(X).    % all dogs are animals
    dog(fido).              % fido is a dog
    die(X) :- animal(X).    % all animals die
In logic programming the main emphasis is on the knowledge base and the problem. The
execution of the program is very much like the proof of a mathematical statement, e.g.
the sum of the first N natural numbers in Prolog:
predicates
    sum(integer, integer)
clauses
    sum(0, 0).
    sum(N, R) :-
        N1 is N - 1,
        sum(N1, R1),
        R is R1 + N.
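The same computation can be sketched in Python (an illustrative translation): the base clause and the recursive clause map directly onto an if statement and a recursive call.

```python
def sum_to(n):
    # Mirrors the Prolog clauses: sum(0, 0) is the base case, and
    # sum(N, R) derives R from the recursive result for N - 1.
    if n == 0:
        return 0
    return sum_to(n - 1) + n

print(sum_to(5))  # 15
```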
3.1.4 Structure of Prolog Program
The general structure of Prolog programs can be summarized by the following four statement types:
• Logic programming can be used to express knowledge in a way that does not depend on
the implementation, making programs more flexible, compact and understandable.
• It enables knowledge to be separated from use, i.e. the machine architecture can be changed
without changing programs or their underlying code.
• It can be altered and extended in natural ways to support special forms of knowledge, such
as meta-level or higher-order knowledge.
• It can be used in non-computational disciplines relying on reasoning and precise means of
expression.
• The system solves the problem, so the programming steps themselves are kept to a
minimum;
• Proving the validity of a given program is simple.
• Easy to implement the code.
• Debugging is easy.
• Since it is structured using true/false statements, programs can be developed quickly
using logic programming.
• As it is based on thinking, expression and implementation, it can also be applied in non-
computational disciplines.
• It supports special forms of knowledge such as meta-level or higher-order knowledge as
it can be altered.
5.0 CONCLUSION
In this unit you have been introduced to how programmers focus not on developing an
algorithm to achieve a goal, but only on the goals of the computation, which makes logic
programming a road map to machine learning languages. The primary characteristics of
the logic programming paradigm were discussed, a sample logic program was presented, and some
features and benefits of logic programming were explained.
6.0 SUMMARY
The logic programming method, also known as predicate programming, is based on mathematical
logic. Instead of a sequence of instructions, software programmed using this method contains a set
of principles, which can be understood as a collection of facts and assumptions. All inquiries to
the program are processed, with the interpreter applying these principles and previously defined
rules to them in order to obtain the desired result.
7.0 TUTOR-MARKED ASSIGNMENT
1. Write in Prolog a program that computes the length (understood as the number of elements)
of a list and returns this value in numeric form. (Hint: consider an inductive definition of
length and use the is operator to increment the value in the inductive case.)
2. List the principal differences between a logic program and a Prolog program.
3. Discuss, with the aid of a diagram, how the three major divisions of a logic program uphold
the essence of logic programming.
4. Given the following logic program:
member(X, [X | Xs]).
member(X, [Y | Xs]) :- member(X, Xs).
State the result of evaluating the goal: member(f(X), [1, f(2), 3]).
5. Given the following PROLOG program (recall that X and Y are variables, while a and b
are constants):
p(b):- p(b).
p(X):- r(b).
p(a):- p(a).
r(Y).
State whether the goal p(a) terminates or not; justify your answer.
6. Explain in detail how logic programming has helped businesses and organizations in their
routine operations.
8.0 REFERENCES/FURTHER READING
1. K. Apt. From Logic Programming to Prolog. Prentice Hall, New York, 1997.
2. S. Ceri, G. Gottlob, and L. Tanca. Logic Programming and Databases. Springer, Berlin,
1989.
3. K. L. Clark. Predicate logic as a computational formalism. Technical Report Res. Rep.
DOC 79/59, Imperial College, Dpt. of Computing, London, 1979.
4. H. C. Coelho and J. C. Cotta. Prolog by Example. Springer, Berlin, 1988.
5. E. Eder. Properties of substitutions and unifications. Journal of Symbolic Computation,
1:31–46, 1985.