Comparative Programming

Uploaded by

nonosniper
Copyright
© All Rights Reserved
Introduction

1.1 The diversity of languages
1.2 The software development process
1.3 Language design
1.4 Languages or systems?
1.5 The lexical elements

This chapter looks at the different stages involved in the development of software and concludes that the main purpose of a programming language is to help in the construction of reliable software. It also discusses how designers have tried to include expressive power, simplicity and orthogonality in their languages whilst noting that pragmatic matters such as implementation and error detection have a significant influence. We also consider the distinction between a language and its development environment.

The basic low-level building blocks used in the construction of a language are considered; that is, the character set, the rules for identifiers and special symbols, and how comments, blanks and layout are handled.

1.1 The diversity of languages

Although over a thousand different programming languages have been designed by various research groups, international committees and computer companies, most of these languages have never been used outside the group which designed them while others, once popular, have been replaced by newer languages. Nevertheless, a large number of languages remain in current use and new languages continue to emerge.

This situation can appear very confusing to students who have mastered one language, often Pascal, Delphi, C++ or Java, and perhaps have a reading knowledge of a couple of others. They might well ask: "Does a lifetime of learning new languages await?"

Fortunately, the situation is not as bleak as it appears because, although two languages may seem to be superficially very different, they often have many more similarities than differences. Individual languages are not usually built on separate principles; in fact, their differences are often due to quite minor variations in the same principle.
The aim of this book is to consider the principal programming language concepts and to show how they have been dealt with in various languages. We will see that by studying these features and principles we can better understand why languages have been designed in the way they have. Furthermore, when faced with a new language, we can identify where the language differs from those we already know and where it provides the same facilities disguised in a different syntax.

1.2 The software development process

A computer is a tool that solves problems by means of programs (or software) written in a programming language. The development of software is a multi-stage process. First, it is necessary to determine what needs to be done. Unfortunately, initial informal user requirements are usually vague, inconsistent, ambiguous and incomplete. The purpose of requirements analysis is to understand and clarify the requirements and often involves resolving the conflicting views of different users.

The next stage is concerned with the production of a document, the specification, which defines as accurately as possible the problem to be solved; in other words, it determines what the system is to do. Requirements analysis and specification are the most difficult tasks in software development.

Having defined what the system is to achieve, we then design a solution and implement the design on a computer. It is only at the implementation stage that a programming language becomes directly involved. The aim of validation and verification is to show that the implemented solution does what the users expect and satisfies the original specification. Although there has been a lot of theoretical work on verification, or program proving, it is usually still necessary to run the program with carefully chosen test data.
But the problem with program testing is that it can only show the presence of errors; it can never prove their absence.

The final stage of software development, usually termed maintenance, covers two quite distinct activities:

1. The correction of errors that were missed at an earlier stage but have been detected after the program has been in active service.
2. Modification of the program to take account of additions or changes in the users' requirements.

Although a programming language is only explicitly introduced during the implementation stage, it has traditionally influenced the earlier stages of the process. Designers are, for example, often aware of the implementation language to be used and bias their designs to take account of the language's strong points.

Figure 1.1. The waterfall model of software development. (Stages legible in the figure: requirements analysis, specification, implementation, maintenance.)

Development models

It is important to realise that the software development process is iterative, not sequential. Therefore, knowledge gained at any one of the stages outlined can (and should) be used to give feedback to earlier stages. The traditional approach is to treat the different stages in the development process as being self-contained and this has led to the waterfall model of software development shown in Figure 1.1.

However, there has been increasing acceptance of the idea that an incremental and iterative approach is much more realistic. Central to this approach is the idea of risk management. Every time we make a decision, there is the possibility that we get it wrong. We therefore want to have continual feedback to show up possible errors because the longer an error remains undetected, the more expensive it will be to put right. This led to the spiral model shown in Figure 1.2 (Boehm, 1988).
We start at the centre of the spiral and go repeatedly through the different stages as our system is built incrementally.

Many modern languages are object-oriented and this has led to the creation of object-oriented development methods. In other development methods, there is a clear distinction in the techniques used in specification, design and implementation. However, in object-oriented development, a problem can be understood and a solution designed and then implemented using the same framework of a set of communicating objects. The object-oriented development process is therefore well suited to an incremental and iterative approach. At any given stage, different objects can be described at different levels of abstraction. As the iterative development process continues, we incrementally add more detail to the object descriptions.

Figure 1.2. Simplified Boehm spiral model. (Quadrants: determine objectives, alternatives, constraints; evaluate alternatives, identify and resolve risks; develop and verify; plan next phase.)

The need for a notation in which a specification or design can be written down has led to the development of both specification and program design languages. Such languages are at a higher level of abstraction and give fewer details than implementation languages. Many specification languages are mathematical in form and are amenable to proof techniques. However, languages of this type are outside the scope of this book and so we only look at what are conventionally considered to be implementation languages, although some functional languages have been used as executable specification languages.

Another approach is to use graphical notations to capture the requirements and represent designs. Examples of diagrams that occur in many different development methods are dataflow diagrams, entity relationship diagrams, state transition diagrams and message sequence charts.
A problem is that each development method can use its own set of diagram notations which, although they are representing much the same thing, can differ in detail. This situation is far from satisfactory as it can suggest differences in the development process that do not really exist. In object-oriented development, a standard notation called the Unified Modeling Language (UML) has been adopted so that different methods now use a common notation (Booch et al., 1998). There are several different kinds of diagram in UML, but the single most important is the class diagram, which shows the classes involved in an object system and their associations. An example class diagram is shown in Chapter 8.

The use of a systematic software development process has greatly influenced both language design and how languages are used. For example, Pascal was designed to support the ideas of structured programming. The problems of constructing large systems and of program maintenance led to the introduction of language features that allow large systems to be broken down into self-contained modules. Packages in Ada and classes in object-oriented languages satisfy that need. It is clear, therefore, that programming languages do not exist in a vacuum; rather, the design of modern languages is a direct response to the needs and problems of the software development process.

1.3 Language design

Most widely used programming languages are imperative; examples are Fortran, COBOL, C, C++, Pascal, Ada and Java. A program written in an imperative language achieves its effect by changing the value of variables or the attributes of objects by means of assignment statements. Until quite recently, most widely used imperative languages were procedural, that is, their organisation was centred around the definition of procedures.
Many procedural languages have now been extended to include object-oriented features (C by C++, Pascal by Delphi, Ada by Ada 95, COBOL by OOCOBOL, Basic by Visual Basic) while other new purely object-oriented languages such as Eiffel and Java have been designed. Object-oriented programs are organised as a set of objects which communicate with one another through small strictly defined interfaces.

Other approaches to language design include functional languages (such as pure Lisp and ML) and logic languages (such as Prolog). These alternative approaches are dealt with in Chapters 9 and 10 respectively.

The primary purpose of a programming language is to support the construction of reliable software. Hence, in most modern languages, type checking takes place at compile time, which is a considerable help in catching logical errors before the program is run. It is also important that a language is user friendly so that it is straightforward to design, write, read, test, run, document and modify programs written in that language.

To understand how these objectives may be achieved, the issues of language design can be divided into several broad categories:

• expressive power,
• simplicity and orthogonality,
• implementation,
• error detection and correction,
• correctness and standards.

Expressive power

A programming language with high expressive power enables solutions to be expressed in terms of the problem being solved rather than in terms of the computer on which the solution is to be implemented. Hence, the programmer can concentrate on problem solving. Such a language should provide a convenient notation to describe both algorithms and data structures in addition to supporting the ideas of structured programming and modularisation.

Another aspect of expressive power is the number of types provided together with their associated operations.
Instead of providing a large number of built-in types, most modern languages provide facilities, such as the Ada package or the C++ and Java class, for defining new types, called abstract data types. Such languages can then provide a wide range of predefined types by means of standard libraries which the programmer can use to build new types for the problem in hand. When a language, together with its standard libraries, does not include a suitable range of types and operations, then the programmer generally has to provide these by declarations, thereby distracting the programmer's attention to the lower level aspects of solving the problem. Often, languages may have high expressive power in some areas, but not in others; for example, Ada has a range of numerical operations that give it expressive power for numerical work, but it is less effective in data processing applications.

Also included under the heading of expressive power is readability; that is, the ease with which someone familiar with the language can read and understand programs written by other people. Readability is considerably enhanced by a well-designed comment facility, and good layout and naming conventions. In practice, it should be possible to write programs that can act, to an extent, as their own documentation, thereby making maintenance and extension of the program much easier.

Simplicity and orthogonality

Simplicity implies that a language allows programs to be expressed concisely in a manner that is easily written, understood and read. This objective is often underrated by computer scientists, but is a high priority for non-professional programmers. The success, first of Basic and then of Visual Basic, is an eloquent commentary on the importance that users place on simplicity.

A simple language either avoids complexity or handles it well. Inherent in most simple languages is the avoidance of features that most human programmers find difficult.
Simple languages should not allow alternative ways of implementing constructs nor should they produce surprising results from standard applications of their rules.

An orthogonal language is one in which any combination of the basic language constructs is allowed and so there are few, if any, restrictions or special cases. Examples of orthogonal languages are Algol 68 and Smalltalk, which were both designed with the aim of keeping the number of basic concepts as small as possible. The idea was that the resulting language would be simple as it would only consist of combinations of features from a small set of basic concepts.

There can, however, be a clash between the ideas of orthogonality and simplicity. For example, Pascal, which is not orthogonal, is simpler to learn and use than Algol 68. Where a new special construct is introduced in Pascal, the same effect is achieved in Algol 68 by the combination of simpler existing constructs. As an example, Pascal separates the notion of the type of a parameter from whether it is a value or a variable parameter. (Details are given in Chapter 6.) Algol 68, on the other hand, combines both pieces of information within the parameter type. Although the Algol 68 approach is elegant and powerful, the more pragmatic approach (Wirth, 1975) taken in the design of Pascal has led to a more understandable language.

What is generally agreed is that the use of constructs should be consistent; that is, they should have a similar effect wherever they appear. This is an important design principle for any language although it is obviously of great importance in an orthogonal language which gets its expressive power from a large number of combinations of basic concepts. Whether simplicity or orthogonality is the goal, once the basic constructs are known, their combination should be predictable. This is sometimes called the law of minimum surprise. However, again the importance of simplicity should not be underestimated.
In Java, the declaration int x; defines x to be an integer variable while the declaration SomeClass x; defines x to be a reference to a SomeClass object. We therefore have the same syntax meaning different things. Although this is inconsistent, it can be argued that inventing new syntax to make the distinction clear would have just complicated matters.

Implementation

Execution of a program written in an imperative language, such as Pascal, Ada or C++, normally takes place by translating (compiling) the source program into an equivalent machine code program. This machine code program is then executed. The ease with which a language can be translated and the efficiency of the resulting code can be major factors in a language's success. Large languages, for example, have an inherent disadvantage in this respect because the compiler will, almost inevitably, be large, slow and expensive.

An alternative to compiling a source program is to use an interpreter. An interpreter can directly execute a source program, but what is more common is for a source program to be translated into some intermediate form which is then executed by the interpreter. The interpreter can be said to implement a virtual machine. Executing a program under the control of an interpreter is much slower than running the equivalent machine code program, but does give much more flexibility at run time. The added flexibility is important in languages whose main purpose is symbolic manipulation rather than numerical calculation. Examples of such languages are the string processing language SNOBOL4, the object-oriented language Smalltalk, the functional language Lisp, the logic language Prolog and the scripting language Perl.

The use of an interpreter also supports an interactive programming environment in which programs may be developed incrementally. When developing a
Lisp program, for example, a programmer can interact directly with the Lisp interpreter and type in the definition of functions followed by expressions which call these functions. The expressions are immediately executed and the results made available. This allows the early detection, and easy correction, of logical errors. Once the complete program has been developed, it can be compiled so that it will run faster.

Java is an imperative language and so we would expect that it would normally be compiled into machine code. However, that is not the case; Java programs are interpreted. An exciting use of Java is to animate web pages. A person can download a web page which contains a Java applet (a small application) and, using a Java-enabled web browser such as Netscape or Internet Explorer, can run the applet. To achieve this, it must be possible for a Java program to be translated on one computer and to run on a different kind of computer and the easiest way of doing this is to translate Java source programs into code for a Java virtual machine. Java-enabled browsers provide interpreters for the Java virtual machine.

Some language designers, notably Wirth, the designer of Pascal and Modula-2, have made many of their design decisions on the basis of the ease with which a feature can be compiled and executed efficiently. One of the many advantages of having a close working relationship between the language design and language implementation teams is that the designers can obtain early feedback on constructs that are causing trouble. Often, features that are difficult to translate are also difficult for human programmers to understand. Algol 68 is a prime example of a language that had a lack of success due to the fact that it was designed by a committee who largely ignored implementation considerations, as they felt that such considerations would restrict the ability to produce a powerful language.
In contrast, the implementation of C, C++, Pascal and Java went hand in hand with their design and the Ada design team was dominated by language implementers. However, it is necessary to achieve a proper balance between the introduction of powerful new features and their ease of implementation. ISO Standard Pascal, for example, has features, such as procedures being able to accept array parameters of differing lengths, which were omitted from the original version of the language on the grounds that they were too expensive to implement.

Error detection and correction

It is important that programs are correct and satisfy their original specification. However, demonstrating that this is indeed the case is no easy matter. As most programmers still rely on program testing as a means of showing that a program is error free, a good language should assist in this task. It is therefore sensible for language designers to include features that help in error detection and to omit features that are difficult to check. Ideally, errors should be found at compile time when they are easier to pinpoint and correct. The later an error is detected in the software development process, the more difficult it is to find and correct without destroying the program structure.

As an example of the importance of language design on error detection, consider the original Fortran method of type declarations where the initial letter of a variable name implicitly determines the type of the variable. Although this method is convenient and greatly reduces the number of declarations required, it is in fact inherently unsound since any misspelling of variable names is not detected at compile time and leads to logical errors.

Conversely, explicit type declarations have the following advantages. Firstly, they provide extra information that enables more checking to be carried out at compile time and, secondly, they act as part of the program documentation.
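
The hazard described for implicit declarations is not confined to early Fortran. As a minimal sketch, the Python fragment below relies on the fact that Python also brings a variable into existence on first assignment rather than through a declaration. The function names and the misspelling (totl for total) are invented for illustration and do not come from the text. The first function runs without complaint and quietly returns the wrong answer.

```python
def sum_with_typo(values):
    """Intended to sum the values, but one assignment misspells the variable."""
    total = 0
    for v in values:
        # Misspelling: this creates a new variable 'totl' instead of
        # updating 'total'. No declaration is required, so nothing flags
        # the mistake before the program is run.
        totl = total + v
    return total


def sum_correct(values):
    """The same loop with the name spelled consistently."""
    total = 0
    for v in values:
        total = total + v
    return total


if __name__ == "__main__":
    data = [1, 2, 3]
    print(sum_with_typo(data))  # 'total' is never updated
    print(sum_correct(data))
```

In a language that insists on explicit declarations, the assignment to the undeclared name would be rejected at compile time, which is the first of the two advantages noted above.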
Correctness and standards

The most exacting requirement of correctness is proving that a program satisfies its original specification. With the major exception of purely functional languages, which are amenable to mathematical reasoning, such proofs of correctness have not, as yet, had a major influence on language design. However, the basic ideas of structured programming do support the notion of proving the correctness of a program, as it is clearly easier to reason about a program with high-level control structures than about one with unrestricted goto statements.

To prove that a program is correct, or to reason about the meaning of a program, it is necessary to have a rigorous definition of the meaning of each language construct. (Methods for defining the syntax and semantics of a language are discussed in Chapter 12.) However, although it is not difficult to provide a precise definition of the syntax of a language, it is very difficult, if not impossible, to produce a full semantic definition, and as far as most programmers are concerned it is unreadable anyway. It is therefore vital in the early stages of a language's development to have an informal description that is understandable by programmers. As in many aspects of computer science, there needs to be a compromise between exactness and informality.

A programming language should also have an official standard definition to which all implementers adhere. Unfortunately, this seldom happens as implementers often omit features that are difficult to implement and add features that they feel will improve the language. As a result, program portability suffers. The exception to this is Ada. An Ada compiler must be validated using a specially constructed suite of test programs before it can be called an Ada compiler. It is interesting that one of the aims of these tests is to rule out supersets as well as subsets of the language.
This is an excellent idea and it is hoped that it will become the norm.

1.4 Languages or systems?

An important feature of many modern languages is that they support network programming and the creation of graphical user interfaces (GUIs). These features are often provided through libraries and so we have the question of whether they are part of a language or part of its support environment. The problem is compounded by the fact that GUIs and networking are often highly dependent on the facilities provided by the operating system.

A major advantage of Java is that it provides support for GUIs and network programming in a way that is independent of any particular operating system. Java has an extensive set of standard libraries where the necessary facilities are defined in terms of the Java Virtual Machine. As all Java programs make heavy use of these libraries, they are regarded by Java programmers as an integral part of the language.

Languages such as Visual Basic and Delphi also provide these facilities through an extensive set of libraries, but differ from Java in that they are closely tied to a particular operating system, namely Microsoft Windows. This allows a close and efficient integration between the language and operating system facilities. However, it does do away with one of the major advantages of high-level languages, which is that they are machine independent. It also raises the question of when we are talking about a new language and when we are talking about a new implementation of an existing language. There are many different implementations of C++, each of which provides its own set of libraries for GUIs. It is therefore clear when we are talking about the C++ language and when we are talking about a particular implementation.
However, the GUI and networking facilities of Visual Basic and Delphi form such a large part of the system used by their programmers that they can claim to be new languages although they do have Basic and Object Pascal respectively as their core. Moreover, Visual Basic and Delphi both provide extensive visual development environments. One view is therefore that they are not languages, but are system development environments. This lack of a clear distinction between a language and its development environment will continue to increase as support facilities become ever more sophisticated.

An important feature of programs that use graphical user interfaces is that they are event driven. They wait for some user event such as the click of a mouse over the representation of a button on the screen, handle that event and then wait for the next. This leads to a very different program structure from that provided by traditional programming languages. Writing event driven programs is difficult, but is dealt with in languages such as Java, Visual Basic and Delphi by most of the work being done behind the scenes. This allows the programmer to work at a very high level of abstraction and not worry about implementation details. With earlier languages, event handling had to be explicitly programmed. This is therefore another example of where the distinction between a language and its supporting environment has become blurred.

1.5 The lexical elements

The basic building blocks used in writing programs in a particular language are often known as the lexical elements. This covers such items as the character set, the rules for identifiers and operators, the use of keywords or reserved words, how comments are written, and the manner in which blanks and layout are handled.
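
To make the items in that list concrete, the following is a minimal sketch of a scanner for a small, invented Pascal-like notation. Everything specific in it is an assumption made for illustration rather than the rule of any real language: the reserved-word set, the comment convention of braces, the treatment of words as case-insensitive, and the names RESERVED, TOKEN_SPEC and tokenize.

```python
import re

# Hypothetical reserved words for a tiny Pascal-like notation (illustrative only).
RESERVED = {"begin", "end", "for", "if", "then"}

# Order matters inside SYMBOL: composite symbols such as ':=' and '<=' are
# tried before the single-character symbols they begin with.
TOKEN_SPEC = [
    ("COMMENT", r"\{[^}]*\}"),             # comments are recognised, then discarded
    ("BLANK", r"\s+"),                     # blanks and layout only separate tokens
    ("NUMBER", r"\d+"),
    ("WORD", r"[A-Za-z][A-Za-z0-9]*"),     # identifier rule: a letter, then letters or digits
    ("SYMBOL", r":=|<=|>=|[-+*/=<>;()]"),
]
SCANNER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) pairs, skipping comments and blanks."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = SCANNER.match(source, pos)
        if match is None:
            # The character is outside the character set this sketch accepts.
            raise ValueError(f"illegal character {source[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("COMMENT", "BLANK"):
            continue
        if kind == "WORD":
            # Reserved words look like identifiers; only a table lookup tells them apart.
            kind = "RESERVED" if text.lower() in RESERVED else "IDENTIFIER"
        tokens.append((kind, text))
    return tokens


if __name__ == "__main__":
    for token in tokenize("begin x := x + 1 { increment } end"):
        print(token)
```

The sketch treats every listed word as reserved in all contexts, so a scanner of this kind never has to consult the surrounding program to classify a word.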
Character set

The character set can be thought of as containing the basic building blocks of a programming language: letters, digits and special characters such as arithmetic operators and punctuation symbols. Two different approaches were taken when deciding the character set to be used in early languages. One is to choose all the characters deemed necessary. This is the approach taken with APL and Algol 60, but it has the drawback that either special input/output equipment has to be used or changes have to be made to the published language when it is used on a computer. The other approach is to use only the characters commonly available with current input and output devices. Hence, the character set of early versions of Fortran was restricted by the 64 characters available with punched cards while Pascal initially was constrained by the character set available with the CDC 6000 series computer on which it was first implemented.

Since the early 1970s, most input and output devices have supported internationally accepted character sets such as ASCII (American Standard Code for Information Interchange) and this has been reflected in the character sets of languages. The ASCII character set has 128 characters of which 95 are printable; the remaining characters are special control characters. The printable characters are the upper and lower case letters, digits, punctuation characters, arithmetic operators and three different sets of brackets ( ), [ ] and { }. Composite symbols are used to extend the range of symbols available. Commonly used examples are the relational operators <= and >= and the assignment operator := used in the Algol family of languages.

More recently, the Unicode character set has been created to give a much larger range of characters. Each Unicode character occupies 16 bits rather than the 8 used with ASCII characters. Java uses the Unicode character set.
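
The figures quoted for ASCII can be checked directly. The short Python sketch below counts the printable characters among the 128 ASCII codes, relying on Python's own definition of printable (str.isprintable, which counts the space as printable), and then compares the storage for a single letter in ASCII with its storage in the 16-bit UTF-16 encoding. The variable names are illustrative.

```python
# The 128 ASCII codes, as one-character strings.
ascii_chars = [chr(code) for code in range(128)]

# Split them into printable characters and control characters.
printable = [c for c in ascii_chars if c.isprintable()]
control = [c for c in ascii_chars if not c.isprintable()]

# Storage for the letter 'A': one byte in ASCII, one 16-bit unit in UTF-16.
ascii_bytes = "A".encode("ascii")
utf16_bytes = "A".encode("utf-16-be")

if __name__ == "__main__":
    print(len(printable), "printable characters")
    print(len(control), "control characters")
    print(len(ascii_bytes) * 8, "bits for 'A' in ASCII")
    print(len(utf16_bytes) * 8, "bits for 'A' in UTF-16")
```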
Identifiers and reserved words

The character set is the collection from which the symbols making up the vocabulary of a programming language are formed. Clearly, a language needs conventions for grouping characters into words so that names (usually known as identifiers in computing) can be given to entities such as variables, constants, etc. (Naming conventions are discussed in Chapter 3.)

Some of the words in a programming language are given a special meaning. Examples of this are DO and GOTO in Fortran and begin, end and for in Pascal. Two methods are used for including such words in a language. The method adopted by Fortran is to allow such words to have their special well-defined meaning in certain contexts. The words are then called keywords. This method was also adopted by the designers of PL/I since it limited the number of special words that the programmer had to remember: the scientific programmer using PL/I is unlikely to know all the business-oriented keywords, while the business programmer is unlikely to know all the scientific keywords. However, the drawback of this method is that the reader of a program written in a language with keywords has the task of deciding whether a keyword is being used for its special meaning or is an occurrence of an ordinary identifier. Furthermore, when an error occurs due to the inadvertent use of an unknown keyword, it is not always clear when a word has its special meaning, without consulting all the declarations.

The alternative method, used initially by COBOL and adopted by most modern languages, is to restrict the use of such words to their special meaning. The words are then called reserved words. The advantage of the reserved word method is best seen in languages like Pascal and C++ where the number of such words is quite small.
In COBOL, however, the number of reserved words is much larger (over 300) and so the programmer has the task of remembering a large number of words that must not be used for such things as variables.

As well as reserved words, languages often have predefined identifiers. These are ordinary identifiers that have been given an initial definition by the system, but which may be redefined by the programmer. Examples, in Pascal, are the predefined type Integer and the input and output procedures read and write. In languages such as Ada and Java, a large number of identifiers are defined in the standard libraries. Such identifiers can be redefined in programs.

In Algol 60 programs, reserved words are written in a different typeface, either underlined or bold face, depending on the situation. The drawback of this is that many input devices cannot cope with underlined words, so less attractive alternatives, such as writing reserved words in quotes, had to be used. In handwritten versions of programs in Pascal and Ada, for example, reserved words are often underlined so that they stand out while in books they are often printed in boldface. In the version presented to a compiler, however, they are typed in the same way as ordinary identifiers.

Comments

Almost all languages allow comments, thereby making the program more readily understood by the human reader. Such comments are, however, ignored by the compiler. In early languages such as Fortran, which has a fixed format of one statement per line,
