Background of Study
:: History of Pascal ::
The language Component Pascal was the culmination of several decades of research. It was the youngest member of the Algol family of languages. Algol, defined in 1960, was the first high-level language with a readable, structured, and systematically defined syntax.
In the late sixties, several proposals for an evolutionary successor to Algol were developed. The most successful one was Pascal, defined in 1970 by Prof. Niklaus Wirth at ETH Zurich, the Swiss Federal Institute of Technology. Wirth's principal objectives for Pascal were for the language to be efficient to implement and run, to allow the development of well-structured and well-organized programs, and to serve as a vehicle for teaching the important concepts of computer programming. Pascal, named after the mathematician Blaise Pascal, is a direct descendant of ALGOL 60, which Wirth helped develop. Pascal also draws programming components from ALGOL 68 and ALGOL-W. The original published definition of the Pascal language appeared in 1971, with later revisions published in 1973. It was designed to teach programming techniques and topics to college students and was the language of choice for doing so from the late 1960s to the late 1980s.
Pascal received a big boost when ETH released a Pascal compiler that produced a simple intermediate code for a virtual machine (P-code), instead of true native code for a particular machine. This simplified porting Pascal to other processor architectures considerably, because only a new P-code interpreter needed to be written, not a whole new compiler. One of these projects was undertaken at the University of California, San Diego. Remarkably, this implementation (UCSD Pascal) did not require a large and expensive mainframe computer; it ran on the then-new Apple II personal computers. This gave Pascal a second important boost. The third one came when Borland released Turbo Pascal, a fast and inexpensive compiler and integrated development environment for the IBM PC. Later, Borland revived its version of Pascal when it introduced the rapid application development environment Delphi.
Pascal has greatly influenced the design and evolution of many other languages, from Ada to Visual Basic.
Significance of Study
:: Application Areas ::
Pascal contains some significant language features that allow it to be used as a powerful learning tool for introducing structured programming techniques (a short sketch follows the list below):
- Built-in Data Types – Pascal contains its own built-in data types of Integer, Real, Character, and Boolean.
- User-defined Data Types – has the ability to define scalar (enumerated) types as well as subranges of those data types.
- Provides a defined set of Data Structures – these data structures include Arrays, Records, Files, and Sets.
- Strong Data Typing – Pascal compilers can diagnose the assignment of a value of one type to a variable of an incompatible type.
- Supports Structured Programming – this is accomplished through the use of subprograms called procedures and functions.
- Simplicity and Expressivity – because the language is simple and expressive in nature, it allows for effective teaching of computer programming techniques.
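To make these points concrete, here is a minimal sketch that pulls the listed features together; the program FeaturesDemo and every identifier in it are illustrative inventions, not taken from any particular source:

program FeaturesDemo;

type
  Day = (Mon, Tue, Wed, Thu, Fri, Sat, Sun);   { user-defined scalar type }
  WorkDay = Mon .. Fri;                        { subrange of a scalar type }
  Score = 0 .. 100;                            { subrange of the built-in Integer }
  Pupil = record                               { record data structure }
            Initial: Char;
            Mark: Score;
            Passed: Boolean
          end;
  ClassList = array [1 .. 30] of Pupil;        { array data structure }
  DaySet = set of Day;                         { set data structure }

var
  Pupils: ClassList;
  Teaching: DaySet;
  Average: Real;                               { built-in Real type }

{ Structured programming: a function and a procedure as subprograms. }
function HasPassed(Mark: Score): Boolean;
begin
  HasPassed := Mark >= 50
end;

procedure MarkPupil(var P: Pupil; Mark: Score);
begin
  P.Mark := Mark;
  P.Passed := HasPassed(Mark)
end;

begin
  Teaching := [Mon, Wed, Fri];
  MarkPupil(Pupils[1], 62);
  Average := Pupils[1].Mark;
  { Strong typing: an assignment such as Pupils[1].Mark := true
    would be rejected by the compiler as an incompatible type. }
end.

Even this small program exercises scalar and subrange types, records, arrays and sets, the built-in Integer, Real, Char and Boolean types, and procedures and functions as the units of structured programming.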
Pascal was originally designed by Niklaus Wirth with two primary goals: to teach programming as a systematic discipline, and to implement programs in a reliable and efficient way. But there are other reasons for the widespread interest in Pascal as a general-purpose programming language and as a system program implementation language:
- Pascal, a relatively modern language, has benefited from many earlier languages (such as ALGOL) and is forming the basis of several new ones.
- The key emphasis of Pascal is its role as a higher-level language; that is, a language providing useful abstract tools for specifying data structures and algorithms, independent of the underlying implementation.
- The user need not be concerned with the representation of data (the number of bytes per variable, organization of arrays, address size, and so on).
- The programmer can more accurately specify the characteristics of variables, such as the range of values allowed (VAR I: 0 .. 99), or decide whether to trade space for time in accessing variable components (the PACKED keyword).
IBM has added several goals to this list:
1. IBM Personal Computer Pascal is designed to be a systems implementation language, especially suitable for writing compilers, interpreters, operating systems, and so on.
2. Generating efficient code is paramount. The various language extensions and the global optimizer phase (pass two of the compiler) all work toward minimizing the time and space needed by compiled programs.
3. Operations that are done easily in assembly language should be easy to do in IBM Personal Computer Pascal.
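To illustrate the two abstraction tools mentioned above, the subrange declaration (VAR I: 0 .. 99) and the PACKED keyword, here is a brief hedged sketch; the program name and variables are hypothetical:

program RangeAndPacked;
var
  I: 0 .. 99;                               { only the values 0 through 99 are allowed }
  Flags: packed array [0 .. 99] of Boolean; { PACKED trades access time for storage space }
begin
  I := 42;
  Flags[I] := true
  { An assignment such as I := 100 would be rejected by the compiler
    or trapped at run time as out of range. }
end.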
Each line defines a computer instruction consisting of an operation code, possibly followed by some arguments. The code becomes a bit more readable if we replace the operation codes by readable names and enclose the arguments in parentheses:
Program (1, 2, 5, 1)
LocalVar(3)
Constant(1)
SimpleAssign
EndProgram
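The excerpt refers to a "previous program example" that is not reproduced here. Judging from the symbol sequence listed later in this section (program, name, semicolon, var, name x, colon, name integer, and so on), that source program was a tiny one along the following lines; this is a reconstruction, and the program name Example is chosen arbitrarily:

program Example;
var x: integer;
begin
  x := 1
end.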
A system program, known as a compiler, performs the translation from Pascal to machine code. Compilers play a crucial role in software development: They enable you to ignore the complicated instruction sets of computers and write programs in a readable notation. Compilers also detect numerous programming errors even before you begin testing your programs. However, you can ignore the code generated by a compiler only if you know that the compiler never makes a mistake! If a compiler does not always work, programming becomes extremely complicated. In that case, you will discover that the Pascal report no longer defines exactly what your program does. This is obviously unacceptable. So the following design rule is essential:
Rule 1.1: A compiler must be error-free. This requirement is quite a challenge to the compiler designer when you consider that a compiler is a program of several thousand lines. Compiler writing forces you to apply very systematic methods of program design, testing, and documentation.
It is one of the best educational experiences for a software engineer. The input to a compiler is a program text.
The first task of the compiler is to read the program text character by character and recognize the symbols of the language. The compiler will, for example, read the characters p r o g r a m
and recognize them as the single word program. At this point, the compiler views the previous program example as the following sequence of symbols:

program  name  semicolon  var  name x  colon  name integer  semicolon  begin  name x  becomes  numeral 1  end  period

This phase of compilation is called lexical analysis. (The word lexical means "pertaining to the words or vocabulary of a language.") The next task of the compiler is to check that the symbols occur in the right order. For example, the compiler recognizes the sentence x := 1 as an assignment statement.
But if you write x = 1 instead, the compiler will indicate that this is not a valid statement. This phase of compilation is called syntax analysis. (The word syntax means "the structure of the word order in a sentence.") The syntax analysis is concerned only with the sequence of symbols in sentences. As long as a sentence consists of a name followed by the :=
symbol and an expression, it will be recognized as an assignment statement. But even though the syntax of the statement is correct, it may still be meaningless, as in the following example: y := 1, which refers to an undefined variable y. The assignment statement x := true is also meaningless because the variable x and the value true are of different types (integer and Boolean). The phase of compilation that detects meaningless sentences such as these is called semantic analysis. (The word semantics means "the study of meaning.") As these examples show, the compiler must perform two kinds of semantic checks: First, the compiler must make sure that the names used in a program refer to known objects: either predefined standard objects, such as the type integer, or objects that are defined in the program, such as the variable x. The problem is to recognize a definition such as var x: integer;
and determine in which part of the program the object x can be used. This task is called scope analysis. (The word scope means "the extent of application.") During this part of the compilation, the compiler will indicate if the program uses undefined names, such as y, or introduces ambiguous names in definitions, such as var x: integer; x: integer;
Second, the compiler must check that the operands are of compatible types. This task is called type analysis. When you compile a new program for the first time, the compiler nearly always finds some errors in it. It will often require several cycles of editing and recompilation before the compiler accepts the program as correct.
So, in designing a compiler, you must keep in mind that most of the time it will be used to compile incorrect programs! If there are many errors in a program, it is convenient to output the error messages in a file which can be inspected or printed after the compilation. However, the compiler will be able to complete this file and close it properly only if the compilation itself terminates properly. If the compilation of an incorrect program causes a run-time failure, such as an arithmetic overflow, the error messages will be lost and you will have to guess what happened. To avoid this situation, we must impose the following design requirement:
Rule 1.2: A compilation must always terminate, no matter what the input looks like. The easiest way to satisfy this requirement is to terminate the compilation when the first error has been detected. The user must then remove this error and recompile the program to find the next error, and so on. Since a compilation may take several minutes, this method is just too slow. To speed up the program development process, we must add another design requirement:
Rule 1.3: A compiler should attempt to find as many errors as possible during a single compilation. As you will see later, this goal is not easy to achieve. There is one exception to the rule that a compilation must always terminate: a program may be so big that the compiler exceeds its own limits (this case is taken up in the Conclusion below).
The part of a compiler that performs lexical analysis is called the scanner. This section describes the Pascal- scanner (pass 1) and explains how it recognizes symbols in a program text and replaces them by a more convenient representation. Much of the discussion concerns the problem of searching for word symbols and names in a symbol table.
SOURCE TEXT
The scanner reads a program text character by character. It uses a control character to indicate that it has reached the end of the source text:

const ETX = 38C;

This is the Pascal- notation for ASCII character number three, known as ETX (End of Text). When the scanner finds a character that cannot occur in any symbol, it reports the problem. A line number in the error message helps you locate the error in the program text and correct it. However, if an invisible character causes an error message, it is very confusing since you cannot see where the
problem is by looking at the printed program text. To avoid this situation we must make sure that "what you see is what the compiler gets." In other words, the compiler must ignore all control characters (except NL and ETX). The set of invisible characters is stored in a variable of type CharSet:

type CharSet = set of char;
var Invisible: CharSet;

The scanner uses Algorithm 4.1 to input the next character and store it in a global variable named ch. This procedure illustrates the use of set types in a compiler. Later, we will see many other examples of this.

procedure NextChar;
begin
  if EOF then ch := ETX
  else
  begin
    Read(ch);
    if ch in Invisible then NextChar
  end
end;
INTERMEDIATE CODE
In the source text, a word symbol such as const consists of a sequence of letters, in this case c o n s t. This representation of symbols is very bulky and awkward to work with in Pascal. If you store a symbol as a string of characters

var Symbol: String;

you can easily test, for example, if the symbol is the word const:

if Symbol = 'const' then …

But you cannot easily form sets of strings and ask whether the current symbol is either const, type, var, or procedure. The use of symbol sets plays an important role in syntax analysis. To be able to define symbol sets, we must represent the symbols by simple values. These values should, however, have readable names to make the compiler understandable. What we need, then, is an enumerated type that introduces a name for each symbol in the language:
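The excerpt breaks off before the declaration itself appears. A hedged sketch of what such an enumerated symbol type, and a symbol set built from it, might look like follows; the symbol names are illustrative, not the compiler's own:

type
  SymbolType =
    (AndSy, BeginSy, ConstSy, DivSy, DoSy, ElseSy, EndSy, IfSy,
     ProcedureSy, ProgramSy, RecordSy, ThenSy, TypeSy, VarSy, WhileSy,
     NameSy, NumeralSy,
     BecomesSy, ColonSy, SemicolonSy, PeriodSy, UnknownSy);

  SymbolSet = set of SymbolType;

With values of this kind, a test such as if Symbol in [ConstSy, TypeSy, VarSy, ProcedureSy] then ... becomes straightforward, which is exactly the sort of symbol-set test the passage is leading up to.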
Linear Searching
To illustrate how important efficient searching is in lexical analysis, consider first a very inefficient method called linear searching. The idea behind it is the simplest possible: the symbol table is organized as a linked list of records, each describing one word. The list is unordered and is always searched from the beginning. New words can be inserted anywhere in the list.
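A minimal sketch of such a symbol table follows; the type and routine names (WordRecord, InsertWord, Found) are hypothetical, and the String type is assumed to be the same one used earlier in the text:

type
  WordPointer = ^WordRecord;
  WordRecord =
    record
      Name: String;        { the spelling of the word }
      Next: WordPointer
    end;

var
  WordList: WordPointer;   { head of the unordered list; nil when empty }

{ Insert a new word at the front of the list. }
procedure InsertWord(Spelling: String);
var P: WordPointer;
begin
  new(P);
  P^.Name := Spelling;
  P^.Next := WordList;
  WordList := P
end;

{ Search the whole list from the beginning; in the worst case every
  lookup visits every record, which is what makes this method slow. }
function Found(Spelling: String; var Where: WordPointer): Boolean;
var
  P: WordPointer;
  Done: Boolean;
begin
  P := WordList;
  Done := false;
  while not Done do
    if P = nil then Done := true
    else if P^.Name = Spelling then Done := true
    else P := P^.Next;
  Where := P;
  Found := P <> nil
end;

Because the scanner must look up every name and word symbol it reads, a search that always starts at the head of one long list slows the whole compilation down, which is what the next improvement addresses.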
Letter Indexing
Well then, let's improve things a bit by using a "thumb index" of the words, as in a dictionary. Instead of a single linked list, we will use a separate list for each letter of the alphabet. The first list will hold all words beginning with the letter "A", the second list is for the letter "B", and so on. The first letter of each word is used as an index in an array to locate the start of the corresponding list. Each list is unordered.
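A hedged sketch of the same table with a letter index follows, again with hypothetical names and assuming lower-case spellings (as Pascal word symbols are written):

type
  WordPointer = ^WordRecord;
  WordRecord =
    record
      Name: String;
      Next: WordPointer
    end;

var
  { One unordered list per letter of the alphabet, located by the
    first letter of a word; assumes lower-case spellings. }
  Index: array ['a' .. 'z'] of WordPointer;

procedure ClearIndex;
var C: char;
begin
  for C := 'a' to 'z' do Index[C] := nil
end;

procedure InsertIndexed(Spelling: String);
var P: WordPointer;
begin
  new(P);
  P^.Name := Spelling;
  { The first letter selects the list, so a later search never has to
    look at words that begin with a different letter. }
  P^.Next := Index[Spelling[1]];
  Index[Spelling[1]] := P
end;

Searching works exactly as in the linear version, but only over the list selected by the word's first letter, so on average each lookup examines far fewer records.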
Conclusion
When a program is so big that the compiler exceeds its own limits, the only reasonable thing to do is to report this and stop the compilation. This is called a compilation failure. If the compiler finds no formal errors in a program, it proceeds to the last phase of compilation, code generation. In this phase, the compiler determines the amount of storage needed for the code and variables of the program and emits the final instructions. The main difficulty is that most computers have very unsystematic instruction sets that are ill-suited for automatic code generation. These, then, are the major tasks of a compiler:
Lexical analysis
Syntax analysis
Scope analysis
Type analysis
Code generation
References:
http://pascal.hansotten.com/uploads/pbh/brinch%20hansen%20on%20pascal%20compilers%20OCR.pdf