






























The importance of computation as a science and the need for a mathematical science of computation. It explores three established directions of mathematical research relevant to a science of computation and their limitations. The paper aims to create a basis for a mathematical theory of computation and lists goals such as developing a universal programming language and defining a theory of the equivalence of computation processes. 34 pages long and was presented at the Western Joint Computer Conference in May 1961.































JOHN McCARTHY
Computation is sure to become one of the most important of the sciences. This is because it is the science of how machines can be made to carry out intellectual processes. We know that any intellectual process that can be carried out mechanically can be performed by a general purpose digital computer. Moreover, the limitations on what we have been able to make computers do so far clearly come far more from our weakness as programmers than from the intrinsic limitations of the machines. We hope that these limitations can be greatly reduced by developing a mathematical science of computation.
There are three established directions of mathematical research relevant to a science of computation. The first and oldest of these is numerical analysis. Unfortunately, its subject matter is too narrow to be of much help in forming a general theory, and it has only recently begun to be affected by the existence of automatic computation.
The second relevant direction of research is the theory of computability as a branch of recursive function theory. The results of the basic work in this theory, including the existence of universal machines and the existence of unsolvable problems, have established a framework in which any theory of computation must fit. Unfortunately, the general trend of research in this field has been to establish more and better unsolvability theorems, and there has been very little attention paid to positive results and none to establishing the properties of the kinds of algorithms that are actually used. Perhaps for this reason the formalisms for describing algorithms are too cumbersome to be used to describe actual algorithms.
The third direction of mathematical research is the theory of finite automata. Results which use the finiteness of the number of states tend not to be very useful in dealing with present computers which have so
1) This paper is a corrected version of the paper of the same title given at the Western Joint Computer Conference, May 1961. A tenth section discussing the relations between mathematical logic and computation has been added.
many states that it is impossible for them to go through a substantial fraction of them in a reasonable time. The present paper is an attempt to create a basis for a mathematical theory of computation. Before mentioning what is in the paper, we shall discuss briefly what practical results can be hoped for from a suitable mathematical theory. This paper contains direct contributions towards only a few of the goals to be mentioned, but we list additional goals in order to encourage a gold rush.
English which is not well suited to the formal description of procedures, and that UNCOL is an exercise in group wishful thinking. The formalism for describing computations in this paper is not presented as a candidate for a universal programming language because it lacks a number of features, mainly syntactic, which are necessary for convenient use.
AND RELATED ENTITIES
In this part we describe a number of new formalisms for expressing computable functions and related entities. The most important section is 1, the subject matter of which is fairly well understood. The other sections give formalisms which we hope will be useful in constructing computable functions and in proving theorems about them.
1. Functions Computable in Terms of Given Base Functions. Suppose
having certain domains and ranges. In the case of the non-negative integers, we may have the successor function and the predicate of equality, and in the case of the S-expressions discussed in reference 7, we have the five basic operations. Our object is to define a class of functions $C\{\mathcal{F}\}$ which we shall call the class of functions computable in
Before developing $C\{\mathcal{F}\}$ formally, we wish to give an example, and in order to give the example, we first need the concept of conditional expression. In our notation a conditional expression has the form
$$(p_1 \to e_1,\ p_2 \to e_2,\ \ldots,\ p_n \to e_n)$$
which corresponds to the ALGOL 60 reference language (12) expression
if $p_1$ then $e_1$ else if $p_2$ then $e_2$ ... else if $p_n$ then $e_n$.
Here $p_1, \ldots, p_n$ are propositional expressions taking the values $T$ or $F$ standing for truth and falsity respectively. The value of $(p_1 \to e_1,\ p_2 \to e_2,\ \ldots,\ p_n \to e_n)$ is the value of the $e$ corresponding to the first $p$ that has value $T$. Thus
$$(4 < 3 \to 7,\ 2 > 3 \to 8,\ 2 < 3 \to 9,\ 4 < 5 \to 7) = 9.$$
Some examples of the conditional expressions for well known functions are
$$|x| = (x < 0 \to -x,\ x \ge 0 \to x)$$
$$\delta_{ij} = (i = j \to 1,\ i \ne j \to 0)$$
and the triangular function whose graph is given in figure 1 is represented by the conditional expression
$$\mathrm{tri}(x) = (x \le -1 \to 0,\ x \le 0 \to x + 1,\ x \le 1 \to 1 - x,\ x > 1 \to 0).$$
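The conditional expressions above can be sketched in Python. This is only an illustration under stated assumptions: the helper name `cond` and the use of zero-argument lambdas (so that only the selected `e` is evaluated) are choices made for this sketch, not part of the paper's formalism.

```python
from typing import Any, Callable, Tuple

Thunk = Callable[[], Any]  # a delayed expression, evaluated only on demand


def cond(*clauses: Tuple[Thunk, Thunk]) -> Any:
    """Evaluate (p1 -> e1, ..., pn -> en): the e paired with the first true p."""
    for p, e in clauses:
        if p():         # predicates are tried left to right
            return e()  # only the selected expression is evaluated
    raise ValueError("no p is true: the conditional expression is undefined")


def absolute(x):
    return cond((lambda: x < 0, lambda: -x), (lambda: x >= 0, lambda: x))


def delta(i, j):
    return cond((lambda: i == j, lambda: 1), (lambda: i != j, lambda: 0))


def tri(x):
    return cond(
        (lambda: x <= -1, lambda: 0),
        (lambda: x <= 0, lambda: x + 1),
        (lambda: x <= 1, lambda: 1 - x),
        (lambda: x > 1, lambda: 0),
    )


# The four-clause example from the text: the third clause is the first true one.
example = cond(
    (lambda: 4 < 3, lambda: 7),
    (lambda: 2 > 3, lambda: 8),
    (lambda: 2 < 3, lambda: 9),
    (lambda: 4 < 5, lambda: 7),
)
```

Here `example` evaluates to 9, and `tri(0)` gives 1 while `tri(1)` gives 0, which matches the peak and the right-hand foot of the triangular function.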
Fig. 1
Now we are ready to use conditional expressions to define functions recursively. For example, we have
$$n! = (n = 0 \to 1,\ n \ne 0 \to n \cdot (n - 1)!)$$
Let us evaluate 2! according to this definition. We have
$$\begin{aligned}
2! &= (2 = 0 \to 1,\ 2 \ne 0 \to 2 \cdot (2 - 1)!) \\
   &= 2 \cdot 1! \\
   &= 2 \cdot (1 = 0 \to 1,\ 1 \ne 0 \to 1 \cdot (1 - 1)!) \\
   &= 2 \cdot 1 \cdot 0! \\
   &= 2 \cdot 1 \cdot (0 = 0 \to 1,\ 0 \ne 0 \to 0 \cdot (0 - 1)!) \\
   &= 2 \cdot 1 \cdot 1 \\
   &= 2.
\end{aligned}$$
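The evaluation of 2! can be replayed with a short Python sketch. Python's own conditional expression `a if p else b` plays the role of $(p \to a, T \to b)$ here; the helper `trace` is a hypothetical addition that merely records the factors produced by the unfolding.

```python
def factorial(n: int) -> int:
    # n! = (n = 0 -> 1, n != 0 -> n * (n - 1)!)
    return 1 if n == 0 else n * factorial(n - 1)


def trace(n: int) -> str:
    """Return the unfolded product for n!, one factor per recursive step."""
    return "1" if n == 0 else f"{n}*" + trace(n - 1)
```

For n = 2 the trace is the string `2*1*1`, the same product reached in the last steps of the hand evaluation.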
The reader who has followed these simple examples is ready for the construction of $C\{\mathcal{F}\}$ which is a straightforward generalization of the above together with a tying up of a few loose ends.
give) of functions whose domains and ranges are certain sets. $C\{\mathcal{F}\}$ will be a class of functions involving the same sets which we shall call
Suppose $f$ is a function of $n$ variables, and suppose that if we write $y = f(x_1, \ldots, x_n)$, each $x_i$ takes values in the set $U_i$ and $y$ takes its value in the set $V$. It is customary to describe this situation by writing
$$f : U_1 \times U_2 \times \cdots \times U_n \to V.$$
The set $U_1 \times \cdots \times U_n$ of n-tuples $(x_1, \ldots, x_n)$ is called the domain of $f$, and the set $V$ is called the range of $f$.
The set of n-tuples for which a function described by composition is defined is determined in an obvious way from the sets of n-tuples for which the functions entering the composition are defined. If all the functions occurring in a composition are total functions, the new function is also a total function, but the other processes for defining functions are not so kind to totality. When the word “function” is used from here on, we shall mean partial function.
Having to introduce partial functions is a nuisance, but an unavoidable one. The rules for defining computable functions sometimes give computation processes that never terminate, and when the computation process fails to terminate, the result is undefined. It is well known that there is no effective general way of deciding whether a process will terminate.
Predicates and propositional forms. The space $\Pi$ of truth values whose only elements are $T$ (for truth) and $F$ (for falsity) has a special role in our theory. A function whose range is $\Pi$ is called a predicate. Examples of predicates on the integers are prime defined by
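$$\mathrm{prime}(x) = \begin{cases} T & \text{if } x \text{ is prime} \\ F & \text{otherwise} \end{cases}$$
and less defined by
$$\mathrm{less}(x, y) = \begin{cases} T & \text{if } x < y \\ F & \text{otherwise} \end{cases}$$
We shall, of course, write $x < y$ instead of less(x, y). For any space $U$ there is a predicate equ of two arguments defined by
$$\mathrm{equ}(x, y) = \begin{cases} T & \text{if } x = y \\ F & \text{otherwise} \end{cases}$$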
We shall write $x = y$ instead of equ(x, y), but some of the remarks about functions might not hold if we tried to consider equality a single predicate defined on all spaces at once.
A form with values in $\Pi$ such as $x < y$, $x = y$, or prime(x) is called a propositional form. Propositional forms constructed directly from predicates such as prime(x) or $x < y$ may be called simple. Compound propositional forms can be constructed from the simple ones by means of the propositional connectives $\wedge$, $\vee$, and $\sim$. We shall assume that the reader is familiar with the use of these connectives.
Conditional forms or conditional expressions. Conditional forms require a little more careful treatment than was given above in connection with the example. The value of the conditional form
$$(p_1 \to e_1,\ \ldots,\ p_n \to e_n)$$
is the value of the $e$ corresponding to the first $p$ that has value $T$; if all $p$'s have value $F$, then the value of the conditional form is not defined. This rule is complete provided all the $p$'s and $e$'s have defined values, but we need to make provision for the possibility that some of the $p$'s or $e$'s are undefined. The rule is as follows:
If an undefined p occurs before a true p or if all p’s are false or if the e corresponding to the first true p is undefined, then the form is undefined. Otherwise, the value of the form is the value of the e corresponding to the first true p.
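The rule above can be sketched in Python under some loud assumptions: an undefined value is modelled here by a raised exception (`Undefined` is a hypothetical name), and division by zero stands in for the undefined quantity 0/0. Non-terminating computations, which are also undefined in the paper's sense, are not modelled.

```python
class Undefined(Exception):
    """Signals that a form has no value (name assumed for this sketch)."""


def cond(*clauses):
    """Apply the rule for (p1 -> e1, ..., pn -> en) with possibly undefined parts."""
    for p, e in clauses:
        try:
            holds = p()
        except ZeroDivisionError as err:
            # an undefined p occurs before any true p
            raise Undefined("undefined p before a true p") from err
        if holds:
            try:
                return e()
            except ZeroDivisionError as err:
                # the e of the first true p is undefined
                raise Undefined("e of the first true p is undefined") from err
    raise Undefined("all p's are false")


def value_or_none(*clauses):
    """Return the value of the form, or None when it is undefined."""
    try:
        return cond(*clauses)
    except Undefined:
        return None


defined = value_or_none((lambda: 1 < 2, lambda: 2), (lambda: 1 < 3, lambda: 0 / 0))
undefined_p = value_or_none((lambda: 0 / 0 < 1, lambda: 1), (lambda: 1 < 2, lambda: 3))
```

In this sketch `defined` is 2, because the undefined expression sits behind a clause that is never reached, while `undefined_p` is `None`, because the undefined `p` is met before any true one.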
We shall illustrate this definition by additional examples:
$$\begin{aligned}
&(2 < 1 \to 1,\ 2 > 1 \to 3) = 3 \\
&(1 < 2 \to 4,\ 1 < 2 \to 3) = 4 \\
&(2 < 1 \to 1,\ 3 < 1 \to 3) \text{ is undefined} \\
&(0/0 < 1 \to 1,\ 1 < 2 \to 3) \text{ is undefined} \\
&(1 < 2 \to 0/0,\ 1 < 2 \to 1) \text{ is undefined} \\
&(1 < 2 \to 2,\ 1 < 3 \to 0/0) = 2
\end{aligned}$$
The truth value $T$ can be used to simplify certain conditional forms. Thus, instead of
$$|x| = (x < 0 \to -x,\ x \ge 0 \to x),$$
we shall write
$$|x| = (x < 0 \to -x,\ T \to x).$$
The propositional connectives can be expressed in terms of conditional forms as follows:
$$\begin{aligned}
p \wedge q &= (p \to q,\ T \to F) \\
p \vee q &= (p \to T,\ T \to q) \\
p \supset q &= (p \to q,\ T \to T) \\
\sim p &= (p \to F,\ T \to T)
\end{aligned}$$
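The four formulae for the connectives can be checked mechanically. The sketch below is an illustration only: the function names are assumptions, arguments are passed as zero-argument callables so that an unneeded operand is never evaluated, and an undefined operand is modelled by a callable that raises.

```python
def and_(p, q):
    # p AND q = (p -> q, T -> F)
    return q() if p() else False


def or_(p, q):
    # p OR q = (p -> T, T -> q)
    return True if p() else q()


def implies(p, q):
    # p IMPLIES q = (p -> q, T -> T)
    return q() if p() else True


def not_(p):
    # NOT p = (p -> F, T -> T)
    return False if p() else True


def undefined():
    """Stands for an operand without a value (modelling choice of this sketch)."""
    raise RuntimeError("undefined operand")


# With p false and q undefined, the conjunction taken in this order is still F.
false_and_undefined = and_(lambda: False, undefined)
```

Because the first operand alone decides the outcome in that case, `false_and_undefined` is `False`, whereas `and_(undefined, lambda: False)` raises: read as conditional forms, the connectives are sensitive to the order of their operands.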
Considerations of truth tables show that these formulae give the same results as the usual definitions. However, in order to treat partial functions we must consider the possibility that $p$ or $q$ may be undefined. Suppose that $p$ is false and $q$ is undefined; then according to the
Note also that if we consider a wider domain than the non-negative integers, $n!$ as defined above becomes a partial function, since unless $n$ is a non-negative integer, the recursion process does not terminate.
In general, we can either define single functions by recursion or define several functions together by simultaneous recursion, the former being a particular case of the latter. To define simultaneously functions $f_1, \ldots, f_k$, we write equations
$$\begin{aligned}
f_1(x_1, \ldots, x_n) &= e_1 \\
&\ \ \vdots \\
f_k(x_1, \ldots, x_n) &= e_k
\end{aligned}$$
The expressions $e_1, \ldots, e_k$ must contain only known functions and the functions $f_1, \ldots, f_k$. Suppose that the ranges of the functions are to be $V_1, \ldots, V_k$ respectively; then we further require that the expressions $e_1, \ldots, e_k$ be associated with these spaces respectively, given that within $e_1, \ldots, e_k$ the $f$'s are taken as having the corresponding $V$'s as ranges. This is a consistency condition. $f_i(x_1, \ldots, x_n)$ is to be evaluated for given values of the $x$'s as follows.
1. If $e_i$ is a conditional form then the $p$'s are to be evaluated in the prescribed order stopping when a true $p$ and the corresponding $e$ have been evaluated.
2. If $e_i$ has the form $g(e_1, \ldots, e_m)$, then $e_1, \ldots, e_m$ are to be evaluated and then the function $g$ applied.
by repeated applications of the above recursive definition process.
A BASIS FOR A MATHEMATICAL THEORY OF COMPUTATION
2. Recursive Functions of the Integers. In Reference 7 we develop the recursive functions of a class of symbolic expressions in terms of the conditional expression and recursive function formalism. As an example of the use of recursive function definitions, we shall give recursive definitions of a number of functions over the integers. We do this for three reasons: to help the reader familiarize himself with recursive definition, to show how much simpler in practice our methods of recursive definition are than either Turing machines or Kleene’s formalism, and to prove that any partial recursive function
only the successor function and the predicate equality. Let $Z$ be the set of non-negative integers $\{0, 1, 2, \ldots\}$ and denote the successor of an integer $n$ by $n'$ and denote the equality of integers $n_1$ and $n_2$ by $n_1 = n_2$. If we define functions succ and eq by
$$\mathrm{succ}(n) = n'$$
functions in $C\{\mathcal{F}\}$ will have either integers or truth values as values. First we define the predecessor function pred (not defined for $n = 0$) by
pred(n) = pred2(n,0)
We shall denote pred(n) by $n^-$. Now we define the sum
$$m + n = (n = 0 \to m,\ T \to m' + n^-),$$
the product
$$mn = (n = 0 \to 0,\ T \to m + mn^-),$$
the difference
$$m - n = (n = 0 \to m,\ T \to m^- - n^-)$$
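The recursive definitions of the predecessor, sum, product and difference can be tried out in Python using only a successor function and an equality test. Two assumptions are made loudly here: the body of `pred2` is a guess at one plausible helper (it counts upward until the successor of the counter equals `n`), since its definition is not shown above, and `pred(0)` raises an error explicitly instead of looping forever.

```python
def succ(n: int) -> int:
    return n + 1


def eq(a: int, b: int) -> bool:
    return a == b


def pred2(n: int, m: int) -> int:
    # Assumed helper: search upward from m for the number whose successor is n.
    return m if eq(succ(m), n) else pred2(n, succ(m))


def pred(n: int) -> int:
    if eq(n, 0):
        # In the formalism this case simply has no value; the guard makes that visible.
        raise ValueError("pred is not defined for n = 0")
    return pred2(n, 0)


def add(m: int, n: int) -> int:
    # m + n = (n = 0 -> m, T -> m' + n-)
    return m if eq(n, 0) else add(succ(m), pred(n))


def mul(m: int, n: int) -> int:
    # mn = (n = 0 -> 0, T -> m + m n-)
    return 0 if eq(n, 0) else add(m, mul(m, pred(n)))


def sub(m: int, n: int) -> int:
    # m - n = (n = 0 -> m, T -> m- - n-)
    return m if eq(n, 0) else sub(pred(m), pred(n))
```

Calling `sub(3, 5)` ends in `pred(0)` and therefore raises, which reflects the fact that the difference is only a partial function on the non-negative integers. The recursion is deep, so this sketch is suitable for small arguments only.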
defined by
{ut}. If we regard the sequence as a function f we can write the recursive definition
or in terms of the conventional notation
t = m t=m+l
Functions with functions as arguments are called functionals.
Another example is the functional least(p) which gives the least integer $n$ such that $p(n)$ for a predicate $p$. We have
$$\mathrm{least}(p) = \mathrm{least2}(p, 0)$$
where
$$\mathrm{least2}(p, n) = (p(n) \to n,\ T \to \mathrm{least2}(p, n + 1)).$$
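The functional least translates almost literally into Python, where functions can be passed as arguments. This is a minimal sketch; the example predicates are chosen here for illustration and are not taken from the text above.

```python
from typing import Callable


def least2(p: Callable[[int], bool], n: int) -> int:
    # least2(p, n) = (p(n) -> n, T -> least2(p, n + 1))
    return n if p(n) else least2(p, n + 1)


def least(p: Callable[[int], bool]) -> int:
    # least(p) = least2(p, 0)
    return least2(p, 0)


smallest_square_above_50 = least(lambda n: n * n > 50)
```

Since 7 * 7 = 49 and 8 * 8 = 64, `smallest_square_above_50` is 8. If no integer satisfies `p`, the search never ends in the formalism; in Python it stops with a `RecursionError` instead.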
In order to use functionals it is convenient to have a notation for naming functions. We use Church's [1] lambda notation. Suppose we
For example, the name of the function $f$ defined by $f(x, y) = x^2 + y$ is $\lambda((x, y), x^2 + y)$.
Thus we have
$$\lambda((x, y), x^2 + y)(3, 4) = 13,$$
but
The variables occurring in a $\lambda$ definition are dummy or bound variables and can be replaced by others without changing the function provided the replacement is done consistently. For example, the expressions
and
all represent the same function.
In the notation, $\sum_{i=1}^{n} i^2$ is represented by $\mathrm{sum}(1, n, \lambda((i), i^2))$ and the least
When the functions with which we are dealing are defined recursively, a difficulty arises. For example, consider factorial defined by
$$\mathrm{factorial}(n) = (n = 0 \to 1,\ T \to n \cdot \mathrm{factorial}(n - 1)).$$
The expression
$$\lambda((n), (n = 0 \to 1,\ T \to n \cdot \mathrm{factorial}(n - 1)))$$
cannot serve as a name for this function because it is not clear that the occurrence of “factorial” in the expression refers to the function defined by the expression as a whole. Therefore, for recursive functions we adopt an additional convention. Namely,
$$\mathrm{label}(f, \lambda((x_1, \ldots, x_n), e))$$
stands for the function $f$ defined by the equation
$$f(x_1, \ldots, x_n) = e$$
where any occurrences of the function letter $f$ within $e$ stand for the function being defined. The letter $f$ is a dummy variable. The factorial function then has the name
$$\mathrm{label}(\mathrm{factorial}, \lambda((n), (n = 0 \to 1,\ T \to n \cdot \mathrm{factorial}(n - 1)))),$$
and since factorial and $n$ are dummy variables the expression
$$\mathrm{label}(g, \lambda((r), (r = 0 \to 1,\ T \to r \cdot g(r - 1))))$$
represents the same function.
If we start with a base domain for our variables, it is possible to consider a hierarchy of functionals. At level 1 we have functions whose arguments are in the base domain. At level 2 we have functionals taking functions of level 1 as arguments. At level 3 are functionals taking functionals of level 2 as arguments, etc. Actually functionals of several variables can be of mixed type.
However, this hierarchy does not exhaust the possibilities, and if we allow functions which can take themselves as arguments we can eliminate the use of label in naming recursive functions. Suppose that
$$f(x) = \mathcal{E}(x, f)$$
where $\mathcal{E}(x, f)$ is some expression in $x$ and the function variable $f$. This function can be named $\mathrm{label}(f, \lambda((x), \mathcal{E}(x, f)))$.
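Both ideas, naming a recursive function with label and doing without label by letting a function receive itself as an argument, can be imitated in Python. The encoding is an assumption of this sketch: `label` takes a one-argument function whose parameter plays the role of the dummy function letter, and `phi` is a hypothetical name for a function that is handed a copy of itself.

```python
def label(body):
    """Return the function f satisfying f = body(f), mimicking label(f, lambda(...))."""
    def f(*args):
        # The function letter inside `body` is bound to f itself.
        return body(f)(*args)
    return f


# label(factorial, lambda((n), (n = 0 -> 1, T -> n * factorial(n - 1))))
factorial = label(lambda fact: lambda n: 1 if n == 0 else n * fact(n - 1))

# Renaming the dummy variables gives the same function.
same_function = label(lambda g: lambda r: 1 if r == 0 else r * g(r - 1))

# Without label: phi takes itself as its second argument.
phi = lambda n, self_: 1 if n == 0 else n * self_(n - 1, self_)
factorial_by_self_application = lambda n: phi(n, phi)
```

All three definitions agree on every non-negative integer that fits within Python's recursion limit, for instance each returns 120 for the argument 5.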
Our next step is to allow the description operator $\iota$. $\iota((x), p(x))$ stands for the unique $x$ such that $p(x)$ is true. Unless there is such an $x$ and it is unique, $\iota((x), p(x))$ is undefined. In the case of the integers $\iota((x), p(x))$ can be defined in terms of the universal quantifier using conditional expressions, but this does not seem to be the case in domains which are not effectively enumerable, and one may not wish to do so in domains where enumeration is unnatural.
The next step is to allow quantification over functions. This gets us to Kleene's [5] analytic hierarchy and presumably allows the functions used in analysis. Two facts are worth noting. First $\forall((f), \varphi(f))$ refers to all functions on the domain and not just the computable ones. If we restrict quantification to computable functions, we get different results. Secondly, if we allow functions which can take themselves as arguments, it is difficult to assign a meaning to the quantification. In fact, we are apparently confronted with the paradoxes of naive set theory.
5. Ambiguous Functions. Ambiguous functions are not really functions. For each prescription of values to the arguments the ambiguous function has a collection of possible values. An example of an ambiguous function is less(n) defined for all positive integer values of $n$. Every non-negative integer less than $n$ is a possible value of less(n). First we define a basic ambiguity operator amb(x, y) whose possible values are $x$ and $y$ when both are defined: otherwise, whichever is defined. Now we can define less(n) by
$$\mathrm{less}(n) = \mathrm{amb}(n - 1, \mathrm{less}(n - 1)).$$
less(n) has the property that if we define
$$\mathrm{ult}(n) = (n = 0 \to 0,\ T \to \mathrm{ult}(\mathrm{less}(n)))$$
then
$$\forall((n), \mathrm{ult}(n) = 0) = T.$$
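One way to experiment with amb, less and ult is to represent an ambiguous value by the set of its possible values. That representation is an assumption of this sketch, as is the convention that an undefined argument contributes the empty set; the paper itself does not prescribe any implementation.

```python
def amb(xs, ys):
    """Possible values of amb(x, y): those of x together with those of y."""
    return frozenset(xs) | frozenset(ys)


def less(n: int) -> frozenset:
    # less(n) = amb(n - 1, less(n - 1)); for n = 0 there is no possible value.
    if n <= 0:
        return frozenset()
    return amb({n - 1}, less(n - 1))


def ult(n: int) -> frozenset:
    # ult(n) = (n = 0 -> 0, T -> ult(less(n))), collected over every possible choice.
    if n == 0:
        return frozenset({0})
    return frozenset(v for k in less(n) for v in ult(k))
```

Under this reading `less(4)` is the set {0, 1, 2, 3}, and `ult(n)` is the one-element set {0} for every n tried, which is the set-valued counterpart of the statement that ult(n) = 0 for all n. The enumeration grows exponentially with n, so it is meant for small arguments.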
There are a number of important kinds of mathematical arguments whose convenient formalization may involve ambiguous functions. In order to give an example, we need two definitions. If $f$ and $g$ are two ambiguous functions, we shall say that $f$ is a descendant of $g$ if for each $x$ every possible value of $f(x)$ is also a possible value of $g(x)$.
Secondly, we shall say that a property of ambiguous functions is
hereditary if whenever it is possessed by a function $g$ it is also possessed by all descendants of $g$. The property that iteration of an integer valued function eventually gives 0 is hereditary, and the function less has this property. So, therefore, do all its descendants. Therefore any integer-valued function $g$ satisfying $g(0) = 0$ and $n > 0 \supset g(n) < n$ has the property that
$$g^*(n) = (n = 0 \to 0,\ T \to g^*(g(n)))$$
is identically 0 since $g$ is a descendant of less. Thus any function, however complicated, which always reduces a number will if iterated sufficiently always give 0. This example is one of our reasons for hoping that ambiguous functions will turn out to be useful.
With just the operation amb defined above adjoined to those used to generate $C\{\mathcal{F}\}$, we can extend $\mathcal{F}$ to the class $C^*\{\mathcal{F}\}$ which may be called the computably ambiguous functions. A wider class of ambiguous functions is formed using the operator $\mathrm{Am}(x, \pi(x))$ whose values are all $x$'s satisfying $\pi(x)$.
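The claim that every function which always reduces a number iterates down to 0 can be spot-checked numerically. The sketch below is illustrative only: `reduces` is a hypothetical helper that tests the hypothesis on a finite range, and the two sample functions are arbitrary choices, so the check is evidence for particular cases and not a proof.

```python
def g_star(g, n: int) -> int:
    # g*(n) = (n = 0 -> 0, T -> g*(g(n)))
    return 0 if n == 0 else g_star(g, g(n))


def reduces(g, limit: int) -> bool:
    """Finite check of the hypothesis: g(0) = 0 and 0 <= g(n) < n for 0 < n < limit."""
    return g(0) == 0 and all(0 <= g(n) < n for n in range(1, limit))


def halve(n: int) -> int:
    return n // 2


def step_down(n: int) -> int:
    return n - 1 if n > 0 else 0
```

Both `halve` and `step_down` pass `reduces`, and `g_star` returns 0 for each of them on every argument tried; `halve` reaches 0 in about log2(n) steps, `step_down` in exactly n steps.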
6. Recursive Definitions of Sets. In the previous sections on recursive definition of functions the domains and ranges of the basic functions were prescribed and the defined functions had the same domains and ranges. In this section we shall consider the definition of new sets and the basic functions on them. First we shall consider some operations whereby new sets can be defined.
1. The Cartesian product $A \times B$ of two sets $A$ and $B$ is the set of all ordered pairs $(a \cdot b)$ with $a \in A$ and $b \in B$. If $A$ and $B$ are finite sets and $n(A)$ and $n(B)$ denote the numbers of members of $A$ and $B$ respectively then $n(A \times B) = n(A)n(B)$. Associated with the pair of sets $(A, B)$ are two canonical mappings:
$\pi_{A,B} : A \times B \to A$ defined by $\pi_{A,B}((a \cdot b)) = a$,
$\rho_{A,B} : A \times B \to B$ defined by $\rho_{A,B}((a \cdot b)) = b$.
The word “canonical” refers to the fact that $\pi_{A,B}$ and $\rho_{A,B}$ are defined by the sets $A$ and $B$ and do not depend on knowing anything about the members of $A$ and $B$. The next canonical function $\gamma$ is a function of two variables $\gamma_{A,B} : A, B \to A \times B$ defined by
$$\gamma_{A,B}(a, b) = (a \cdot b).$$
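The canonical mappings of the Cartesian product correspond to the familiar pair operations. In this sketch ordered pairs are Python 2-tuples and the names `pi`, `rho` and `gamma` are assumed spellings for the two projections and the pairing function; the sample sets are arbitrary.

```python
from itertools import product


def pi(pair):
    """First projection: sends the pair (a . b) to a."""
    return pair[0]


def rho(pair):
    """Second projection: sends the pair (a . b) to b."""
    return pair[1]


def gamma(a, b):
    """Pairing: sends a and b to the pair (a . b)."""
    return (a, b)


A = {"x", "y", "z"}
B = {0, 1}
A_times_B = set(product(A, B))  # all ordered pairs with first member in A, second in B
```

With these sets `len(A_times_B)` equals `len(A) * len(B)`, and `pi(gamma(a, b))` returns `a` while `rho(gamma(a, b))` returns `b` for any members, without inspecting what the members are.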
For some purposes functions of two variables, x from A and y from B,
consisting of the integers from 1 to $n$ by $n$. We have
$$\begin{aligned}
A \oplus 0 &\simeq A \\
A \times 0 &\simeq 0 \\
A \times 1 &\simeq A \\
A \times 2 &\simeq A \oplus A \\
A \times n &\simeq A \oplus \cdots \oplus A \quad \text{($n$ terms, associate to left by convention)} \\
A^0 &\simeq 1 \quad \text{(by convention)} \\
A^1 &\simeq A \\
A^n &\simeq A \times \cdots \times A \quad \text{($n$ terms, associate to left by convention)}
\end{aligned}$$
Suppose we write the recursive equation
$$S = \{\Lambda\} \oplus A \times S.$$
We can interpret this as defining the set of sequences of elements of $A$ as follows:
$(a_1 \cdot (a_2 \cdot \Lambda))$ and $(a_1 \cdot (a_2 \cdot (a_3 \cdot \Lambda)))$ etc.
sequence. Suppose we substitute $\{\Lambda\} \oplus A \times S$ for $S$ in the right side of $S = \{\Lambda\} \oplus A \times S$. We get
$$S = \{\Lambda\} \oplus A \times (\{\Lambda\} \oplus A \times S).$$
If we again substitute for $S$ and expand by the distributive law expressed in equation (2) above we get
$$S = \{\Lambda\} \oplus A \times \{\Lambda\} \oplus A \times A \times \{\Lambda\} \oplus \cdots$$
which, if we now denote the set $\{\Lambda\}$ by 1, becomes
$$S = 1 \oplus A \oplus A^2 \oplus A^3 \oplus \cdots$$
which is another way of writing the set of sequences. We shall denote the set of sequences of elements of $A$ by seq(A). We can also derive this relation by writing $S = 1 \oplus A \times S$ and solving formally for $S$, getting $S = 1/(1 - A)$ which we expand in geometric series to get $S = 1 \oplus A \oplus A^2 \oplus \cdots$ just as before.
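The series form of the set of sequences suggests enumerating its elements term by term, one term for each length. The sketch below assumes a concrete representation: the null sequence is the empty tuple, and a sequence is a nested pair whose first member is its head and whose second member is the rest.

```python
from itertools import product

NULL = ()  # stands for the null sequence (representation assumed for this sketch)


def sequences_of_length(alphabet, n: int) -> list:
    """Elements of the n-th term of the series: sequences of exactly n members."""
    result = []
    for items in product(sorted(alphabet), repeat=n):
        seq = NULL
        for a in reversed(items):
            seq = (a, seq)  # build the nested pair from the innermost end outward
        result.append(seq)
    return result


def count_up_to(alphabet, max_len: int) -> int:
    """Number of sequences of length at most max_len: a partial sum of the series."""
    return sum(len(sequences_of_length(alphabet, k)) for k in range(max_len + 1))
```

For a two-letter alphabet the n-th term has 2 ** n elements, so `count_up_to({"a", "b"}, 3)` is 1 + 2 + 4 + 8 = 15, which matches the partial sum of the geometric series with the set replaced by its size.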
Another useful recursive construction is
$$S = A \oplus S \times S.$$
Its elements have the forms $a$ or $(a_1 \cdot a_2)$ or $((a_1 \cdot a_2) \cdot a_3)$ or $(a_1 \cdot (a_2 \cdot a_3))$ etc. Thus we have the set of S-expressions on the alphabet $A$ which we may denote by sexp(A). This set is the subject matter of Reference 7, and the following paragraph refers to this paper.
When sets are formed by this kind of recursive definition, the canonical mappings associated with the direct sum and Cartesian product operations have significance. Consider, for example, sexp(A). We can define the basic operations of Lisp, i.e. atom, eq, car, cdr and cons by the equations
$$\mathrm{atom}(x) = p_{A, S \times S}(x)$$
$$\mathrm{eq}(x, y) = (i_{A, S \times S}(x) = i_{A, S \times S}(y))$$
assuming that equality is defined on the space A.
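The five basic operations can be sketched directly on a concrete representation of S-expressions. The representation is an assumption made here: atoms are Python strings and a pair is a 2-tuple, so the test for atomhood is simply a test for not being a tuple. The sketch does not reproduce the canonical mappings themselves.

```python
def atom(x) -> bool:
    """True exactly when x lies in the alphabet summand and not in the pair summand."""
    return not isinstance(x, tuple)


def eq(x, y) -> bool:
    """Equality of atoms; left undefined (an error here) when either argument is a pair."""
    if not (atom(x) and atom(y)):
        raise ValueError("eq is defined only for atoms")
    return x == y


def cons(x, y):
    """Form the pair (x . y)."""
    return (x, y)


def car(x):
    """First member of a pair; undefined for atoms."""
    if atom(x):
        raise ValueError("car is not defined for atoms")
    return x[0]


def cdr(x):
    """Second member of a pair; undefined for atoms."""
    if atom(x):
        raise ValueError("cdr is not defined for atoms")
    return x[1]


nested = cons(cons("a", "b"), "c")  # the S-expression ((a . b) . c)
```

For `nested`, `car` gives the pair `("a", "b")` and `cdr` gives the atom `"c"`, so that `car(car(nested))` is `"a"`.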
Definition of the set of integers. Let 0 denote the null set as before. We can define the set of integers I by
Its elements are then $0$, $(0 \cdot 0)$, $(0 \cdot (0 \cdot 0))$, etc. which we shall denote by 0, 1, 2, 3 etc. The successor and predecessor functions are then definable in terms of the canonical operations of the defining equation. We have
The first part of this paper was solely concerned with presenting descriptive formalisms. In this part we shall establish a few of the proper-