Literature Review:
Chapter 1
1.1 Introduction
Functional programming is an approach to programming based on function calls as the primary programming construct. It provides practical approaches to problem solving in general and insights into many aspects of computing. In particular, with its roots in the theory of computing, it forms a bridge between formal methods in computing and their application. In this chapter we are going to look at how functional programming differs from traditional imperative programming. We will then consider functional programming's origins in the theory of computing and survey its relevance to contemporary computing theory and practice. Finally, we will discuss the role of the λ (lambda) calculus as a basis for functional programming.
1.2 Names and values in imperative and functional languages
Traditional programming languages are based around the idea of a variable as a changeable association between a name and values. These languages are said to be imperative because they consist of sequences of commands:

<command 1>; <command 2>; <command 3>; ...
Typically, each command consists of an assignment which changes a variable's value. This involves working out the value of an expression and associating the result with a name:

<name> := <expression>

In a program, each command's expression may refer to other variables whose values may have been changed by preceding commands. This enables values to be passed from command to command. Functional languages are based on structured function calls.
A functional program is an expression consisting of a function call which calls other functions in turn:

<function 1>(<function 2>( ... <function n>( ... ) ... ) ... )
Thus, each function receives values from and passes new values back to the calling function. This is known as function composition or nesting. In imperative languages, commands may change the value associated with a name by a previous command, so each name may be, and usually will be, associated with different values while a program is running. In functional languages, names are only introduced as the formal parameters of functions and are given values by function calls with actual parameters. Once a formal parameter is associated with an actual parameter value, there is no way for it to be associated with a new value. There is no concept of a command which changes the value associated with a name through assignment, and hence no concept of a command sequence or command repetition to enable successive changes to values associated with names.
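As a small sketch of this idea in Clojure (the functional language discussed in Chapter 2; the function add-bonus is invented purely for illustration), formal parameters are bound once per call, and successive "changes" are expressed as nested calls rather than as a sequence of assignments:

(defn add-bonus [salary bonus]   ;; salary and bonus are formal parameters,
  (+ salary bonus))              ;; bound once per call and never reassigned

;; "changing" a value means composing calls, not assigning to a name:
(add-bonus (add-bonus 1000 50) 25)
;; => 1075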
Data structures in functional languages
In imperative languages, array elements and record fields are changed by successive assignments. In functional languages, because there is no assignment, sub-structures in data structures cannot be changed one at a time. Instead, it is necessary to write down a whole structure with explicit changes to the appropriate sub-structure. Functional languages provide explicit representations for data structures. Functional languages do not provide arrays because without assignment there is no easy way to access an arbitrary element.
Certainly, writing out an entire array with a change to one element would be ludicrously unwieldy. Instead, nested data structures like lists are provided. These are based on recursive notations where operations on a whole structure are described in terms of recursive operations on sub-structures. The representations for nested data structures are often very similar to the nested function call notation. Indeed, in LISP the same representation is used for functions and data structures. This ability to represent entire data structures has a number of advantages. It provides a standard format for displaying structures, which greatly simplifies program debugging and final output, as there is no need to write special printing sub-programs for each distinct type of structure. It also provides a standard format for storing data structures, which can remove the need to write special file I/O sub-programs for distinct types of structure.

A related difference lies in the absence of global structures in functional languages. In imperative languages, if a program manipulates single distinct data structures then it is usual to declare them as globals at the top level of a program. Their sub-structures may then be accessed and modified directly through assignment within sub-programs without passing them as parameters. In functional languages, because there is no assignment, it is not possible to change sub-structures of global structures independently. Instead, entire data structures are passed explicitly as actual parameters to functions for sub-structure changes, and the entire changed structure is then passed back again to the calling function. This means that function calls in a functional program are larger than their equivalents in an imperative program because of these additional parameters. However, it has the advantage of ensuring that structure manipulation by functions is always explicit in the function definitions and calls. This makes it easier to see the flow of data in programs.
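To make this concrete, here is a small Clojure sketch (the map and the function have-birthday are invented for illustration): the function receives the whole structure and returns a whole new structure with the changed sub-structure; nothing is updated in place.

(def person {:name "Ada" :age 36})

(defn have-birthday [p]
  (assoc p :age (inc (:age p))))   ;; builds a new map with :age changed

(have-birthday person)
;; => {:name "Ada", :age 37}
person
;; => {:name "Ada", :age 36}       ;; the original structure is unchanged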
1.3 Origins of functional languages
Functional programming
has its roots in mathematical logic. Informal logical systems have been in use for over 2000 years, but the first modern formalisations were by Hamilton, De Morgan and Boole in the mid-19th century. Within their work we now distinguish the propositional and the predicate calculus. The propositional calculus is a system with true and false as basic values and with and, or, not and so on as basic operations. Names are used to stand for arbitrary truth values. Within propositional calculus it is possible to prove whether or not an arbitrary expression is a theorem, always true, by starting with axioms, elementary expressions which are always true, and applying rules of inference to construct new theorems from axioms and existing theorems. Propositional calculus provides a foundation for more powerful logical systems. It is also used to describe digital electronics, where on and off signals are represented as true and false, and electronic circuits are represented as logical expressions.
The predicate calculus extends propositional calculus to enable expressions involving non-logical values like numbers or sets or strings. This is through the introduction of predicates to generalise logical expressions to describe properties of non-logical values, and functions to generalise operations on non-logical values. Predicate calculus may be applied to different problem areas through the development of appropriate predicates, functions, axioms and rules of inference. For example, number theoretic predicate calculus is used to reason about numbers.
Functions are provided for arithmetic and predicates are provided for comparing numbers. Predicate calculus also forms the basis of logic programming in languages like Prolog and of expert systems based on logical inference. Note that within the propositional and predicate calculi, associations between names and values are unchanging and expressions have no necessary evaluation order. The late 19th century also saw Peano's development of a formal system for number theory. This introduced numbers in terms of 0 and the successor function, so any number is that number of successors of 0. Proofs in the system were based on a form of induction which is akin to recursion. At the turn of the century, Russell and Whitehead attempted to derive mathematical truth directly from logical truth in their "Principia Mathematica." They were, in effect, trying to construct a logical description for mathematics. Subsequently, the German mathematician Hilbert proposed a 'Program' to demonstrate that 'Principia' really did describe mathematics completely.
He required proof that the 'Principia' description of mathematics was consistent (did not allow any contradictions to be proved) and complete (allowed every true statement of number theory to be proved). Alas, in 1931, Gödel showed that any system powerful enough to describe arithmetic was necessarily incomplete. However, Hilbert's 'Program' had promoted vigorous investigation into the theory of computability, to try to develop formal systems to describe computations in general. In 1936, three distinct formal approaches to computability were proposed: Turing's Turing machines, Kleene's recursive function theory (based on work of Hilbert's from 1925) and Church's λ calculus.
Each is well defined in terms of a simple set of primitive operations and a simple set of rules for structuring operations. Most important, each has a proof theory. All the above approaches have been shown formally to be equivalent to each other and also to generalised von Neumann machines – digital computers. This implies that a result from one system will have equivalent results in equivalent systems and that any system may be used to model any other system. In particular, any results will apply to digital computer languages and any may be used to describe computer languages. Contrariwise, computer languages may be used to describe and hence implement any of these systems. Church hypothesized that all descriptions of computability are equivalent.
While Church's thesis cannot be proved formally, every subsequent description of computability has been proved to be equivalent to existing descriptions. An important difference between Turing machines on the one hand, and recursive functions and the λ calculus on the other, is that the Turing machine approach concentrated on computation as mechanized symbol manipulation based on assignment and time-ordered evaluation. Recursive function theory and the λ calculus emphasized computation as structured function application; both are evaluation order independent.
Turing demonstrated that it is impossible to tell whether or not an arbitrary Turing machine will halt – the halting problem is unsolvable. This also applies to the λ calculus and recursive function theory, so there is no way of telling whether evaluation of an arbitrary λ expression or recursive function call will ever terminate. However, Church and Rosser showed for the λ calculus that if different evaluation orders do terminate then the results will be the same.
They also showed that one particular evaluation order is more likely to lead to termination than any other. This has important implications for computing because it may be more efficient to carry out some parts of a program in one order and other parts in another. In particular, if a language is evaluation order independent then it may be possible to carry out program parts in parallel. Today, λ calculus and recursive function theory are the backbones of functional programming, but they have wider applications throughout computing.
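As a present-day illustration of that last point, consider the following Clojure sketch (the function slow-square is invented for illustration): because the function is pure, its calls can be evaluated sequentially or in parallel without changing the result.

(defn slow-square [n]
  (Thread/sleep 100)             ;; simulate an expensive but pure computation
  (* n n))

(map  slow-square (range 8))     ;; sequential evaluation
(pmap slow-square (range 8))     ;; parallel evaluation, same results
;; both => (0 1 4 9 16 25 36 49)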
Chapter 2: The Clojure Programming Language
Clojure focuses on the functional programming paradigm, in which functions are treated as "first-class citizens". In functional programming, functions are not only defined and used but are treated like any other data: they can be composed, used as parameters, or returned as results. Programs are constructed by applying and composing functions. In this context, "first-class citizen" or "first-class function" means that, because functions are treated as data, they can be bound to names, passed as arguments, or returned from other functions. Instead of using a sequence of instructions, functional programming builds up composite functions, which often results in programs that are easier to understand, e.g. for calculations.[11]
(defn fac [n]
(if (zero? n) 1
(* n (fac (dec n)))))
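The fac definition above illustrates recursion; the following sketch (with the invented helpers adder and add10) shows functions being used as first-class values, passed to map and returned as results:

(map fac [1 2 3 4 5])    ;; fac passed as an argument to map
;; => (1 2 6 24 120)

(defn adder [n]          ;; adder returns a newly constructed function
  (fn [x] (+ x n)))

(def add10 (adder 10))
(add10 5)
;; => 15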
Functional Programming Concepts
Functional programming leads to code that is easier to write, read, test, and reuse. Here's how it works.

Pure Functions
Programs are built out of pure functions. A pure function has no side effects; that is, it does not depend on anything but its arguments, and its only influence on the outside world is through its return value. Mathematical functions are pure functions. Two plus two is four, no matter where and when you ask. Also, asking doesn't do anything other than return the answer. Program output is decidedly impure. For example, when you println, you change the outside world by pushing data onto an output stream. Also, the results of println depend on state outside the function: the standard output stream might be redirected, closed, or broken. If you start writing pure functions, you will quickly realize that pure functions and immutable data go hand in hand. Consider the following mystery function:
(defn mystery [input]
(if input data-1 data-2))
If mystery is a pure function, then regardless of what it does, data-1 and data-2 have to be immutable! Otherwise, changes to the data would cause the function to return different values for the same input. A single piece of mutable data can ruin the game, rendering an entire call chain of functions impure. So, once you make a commitment to writing pure functions, you end up using immutable data in large sections of your application.

Persistent Data Structures
Immutable data is critical to Clojure's approach to both FP and concurrency. On the FP side, pure functions cannot have side effects, such as updating the state of a mutable object. On the concurrency side, Clojure's concurrency primitives require immutable data structures to implement their thread-safety guarantees. The fly in the ointment is performance. When all data is immutable, "update" translates into "create a copy of the original data, plus my changes." This will use up memory quickly! Imagine that you have an address book that takes up 5MB of memory. Then, you make five small updates. With a mutable address book, you are still consuming about 5MB of memory. But if you have to copy the whole address book for each update, then an immutable version would balloon to 25MB! Clojure's data structures do not take this naive "copy everything" approach. Instead, all Clojure data structures are persistent. In this context, persistent means that the data structures preserve old copies of themselves by efficiently sharing structure between older and newer versions.
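As a small sketch of this behaviour (the address-book data is invented for illustration), "updating" a persistent map simply yields a new version; the old version remains usable and the unchanged entries are shared rather than copied:

(def address-book {"Ada"  "12 Analytical Row"
                   "Alan" "3 Bletchley Park"})

(def updated (assoc address-book "Ada" "1 Lovelace Lane"))

address-book   ;; => {"Ada" "12 Analytical Row", "Alan" "3 Bletchley Park"}
updated        ;; => {"Ada" "1 Lovelace Lane", "Alan" "3 Bletchley Park"}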
Structural sharing is easiest to visualize with a list. Consider list a with two elements:
(def a '(1 2))
⇒ #'user/a
Then, from a you can create a second list b with an additional element added:

(def b (cons 0 a))
⇒ #'user/b

b is able to reuse all of a's structure, rather than having its own private copy:

[Diagram: b is a new cell holding 0 whose tail is the existing list a (the cells 1 and 2), so those cells are shared rather than copied.]
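Evaluating the two names at the REPL shows that a is untouched and that, in this case, b's tail is the very same structure as a (the identical? check below is added for illustration):

a                         ;; => (1 2)
b                         ;; => (0 1 2)
(identical? a (rest b))
;; => true, the tail of b is a itself, not a copy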
All of Clojure’s data structures share structure where possible. For structures other than simple lists, the mechanics are more complex, of course. If you are interested in the details, check out the following articles:
• “Ideal Hash Trees” by Phil Bagwell
• “Understanding Clojure’s PersistentVector Implementation” by Karl Krukow

Laziness and Recursion
Functional programs make heavy use of recursion and laziness. Recursion occurs when a function calls itself, either directly or indirectly. Laziness is when an expression's evaluation is postponed until it is
actually needed. Evaluating a lazy expression is called realizing the expression. In Clojure, functions and expressions are not lazy. However, sequences are generally lazy. Because so much Clojure programming is sequence manipulation, you get many of the benefits of a fully lazy language. In particular, you can build complex expressions using lazy sequences and then pay only for the elements you actually need. Lazy techniques imply pure functions. You never have to worry about when to call a pure function, since it always returns the same thing. Impure functions, on the other hand, do not play well with lazy techniques. As a programmer you must explicitly control when an impure function is called, because if you call it at some other time, it may behave differently!
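A brief sketch of laziness in Clojure (the name squares is invented for illustration): range with no argument describes an unbounded sequence, and map over it is also lazy, so only the elements actually demanded are computed.

(def squares (map (fn [n] (* n n)) (range)))   ;; an "infinite" lazy sequence

(take 5 squares)   ;; only the requested elements are realized
;; => (0 1 4 9 16)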
The merits of functional programming
At first sight, a language without variables or sequencing might seem completely impractical. This impression cannot be dispelled simply by a few words here. But we hope that by studying the material that follows, readers will gain an appreciation of how it is possible to do a lot of interesting programming in the functional manner.

There is nothing sacred about the imperative style, familiar though it is. Many features of imperative languages have arisen by a process of abstraction from typical computer hardware, from machine code to assemblers, to macro assemblers, and then to FORTRAN and beyond. There is no reason to suppose that such languages represent the most palatable way for humans to communicate programs to a machine. After all, existing hardware designs are not sacred either, and computers are supposed to do our bidding rather than conversely. Perhaps the right approach is not to start from the hardware and work upwards, but to start with programming languages as an abstract notation for specifying algorithms, and then work down to the hardware (Dijkstra 1976).

Actually, this tendency can be detected in traditional languages too. Even FORTRAN allows arithmetical expressions to be written in the usual way. The programmer is not burdened with the task of linearizing the evaluation of subexpressions and finding temporary storage for intermediate results. This suggests that the idea of developing programming languages quite different from the traditional imperative ones is certainly defensible.

However, to emphasize that we are not merely proposing change for change's sake, we should say a few words about why we might prefer functional programs to imperative ones. Perhaps the main reason is that functional programs correspond more directly to mathematical objects, and it is therefore easier to reason about them. In order to get a firm grip on exactly what programs mean, we might wish to assign an abstract mathematical meaning to a program or command — this is the aim of denotational semantics (semantics = meaning).