The UCLA Lectures
(April 29 – May 2, 2019)
Noam Chomsky
with an introduction by Robert Freidin


Noam Chomsky’s UCLA Lectures: an introductory sketch
Robert Freidin
Princeton University

The revolution in the study of language that began in the 1950s and continues today (what has come to be called the generative enterprise, the topic of Noam Chomsky’s UCLA Lectures of April 29 – May 2, 2019) is encapsulated in a single sentence published in 1957, a sentence that is with no exaggeration whatsoever the heart and soul of a research program which, as these lectures demonstrate, remains vibrant and promising well into its third quarter century. The first sentence of the first chapter of Syntactic Structures states simply and succinctly: “Syntax is the study of the principles and processes by which sentences are constructed in particular languages.” The radical nature of this definition is easily demonstrated by comparing it to the definitions of syntax that came before, all of which focus on linguistic phenomena (see Freidin (2020) for a detailed discussion; see also Freidin (2013)). The comparison highlights a conceptual shift in focus from mere description of data from languages to the principles and processes that underlie the syntactic properties of human languages, a shift that resulted from Chomsky’s abandonment of the structuralist goal of stating regularities in linguistic data and his adoption instead of what his teacher Zellig Harris characterized as “synthesizing utterances”. Harris’s Methods in Structural Linguistics (1951) mentions both, but opts for the first without any further discussion of the second. (See Freidin (2013) for discussion.) This conceptual shift to principles and processes aligns linguistics with the natural sciences, which since the Scientific Revolution in the 17th century have been concerned with understanding the underlying principles and processes that animate the natural world.
(For additional discussion of the conceptual shifts that occurred to initiate the generative enterprise as well as those that have occurred within it, see Chomsky (1983) and Freidin (1994).) Chomsky’s focus on principles and processes, a constant in his work for more than six decades, is demonstrated once again in these lectures, which address fundamental questions for the study of human language, including the perennial “what is language?” and “what is the nature of the generative enterprise?” as well as a new one about what would constitute genuine explanation in the study of language. As stated at the beginning of the second lecture, the generative enterprise is concerned with the basic property of language, which Chomsky characterizes as “namely that each language constructs in the mind an infinite array of structured expressions each of which has a semantic interpretation that expresses a thought, each of which can be externalized in one or another motor system, typically sound, but not necessarily.” As

not present to consciousness, in effect, everything that we can conceive and the most diverse movements of our soul. Here we shift from the alphabet to the sounds it represents and from which “an infinite variety of words” can be created, a formulation which links language with infinite creation. What’s missing in these formulations is any reference to the infinite array of structured expressions that make the expression of thoughts and feelings possible. Nonetheless, in recognition that Galileo appears to have been the first to recognize what is essentially the basic property of language and to be amazed and puzzled by it, Chomsky has characterized the goal of the generative enterprise, to provide a principled explanation for this property, as the Galilean challenge. What might constitute a principled explanation for some aspects of the basic property is also a focus in these lectures, what Chomsky is now calling genuine explanation. In these lectures, a genuine explanation for properties of language, as opposed to descriptions of them, must meet two austere requirements: show that the property is acquired by individuals (learnability) and show that the property could have been acquired by the species (evolvability). And for any property under investigation there are three factors to be considered: one, what is genetically determined (innate) and thus universal across the species (Universal Grammar, UG); two, the data available for language acquisition (generally impoverished because limited to externalizations of the internal structured expressions); and three, third factor principles not specific to language that may in part help shape the property investigated.
A genuine explanation of properties of language instantiated in linguistic phenomena (for example, structure dependence (discussed in some detail in lecture #3)) must start with the principles and processes with which sentences are constructed, what has come to be called the computational system for human language. A computational system consists of the operations for constructing the internal structured expressions of the language including whatever general principles govern the derivations and representations these operations produce in conjunction with a lexicon for a particular language. To the extent that these operations and principles apply to lexicons of all languages, they are part of UG and therefore meet the condition of learnability because they are an intrinsic part of human biology, not something that is learned on the basis of external evidence. In these lectures Chomsky discusses the state of the generative enterprise: what has been accomplished, what problems remain, and what it might be possible to accomplish in the immediate future. The entire discussion is informed by the Strong Minimalist Thesis (SMT), essentially the hypothesis that the internal computational system for human language, which

generates structured expressions that interface with the Conceptual-Intensional systems of the brain, is a “perfect system” (Chomsky (1995, p. 1)). On the same page, Chomsky characterizes what might be considered a perfect system as follows: “This work is motivated by two related questions: (1) what are the general conditions that the human language faculty should be expected to satisfy? and (2) to what extent is the language faculty determined by these conditions, without special structure that lies beyond them? The first question in turn has two aspects: what conditions are imposed on the language faculty by virtue of (A) its place within the array of cognitive systems of the mind/brain, and (B) general considerations of conceptual naturalness that have some independent plausibility, namely, simplicity, economy, symmetry, nonredundancy, and the like?” He conjectures that “to the extent that the answer to question (2) is positive, language is something like a ‘perfect system,’ meeting external constraints as well as can be done, in one of the reasonable ways.” The first part of the discussion of what has been accomplished in the generative enterprise reviews how the problematic early proposals of the 1950s were eventually replaced with a computational system that could serve as a basis for genuine explanations of linguistic properties and phenomena. The earliest proposals of the 1950s employed two distinct kinds of operation (called rules): phrase structure rules to capture composition (how the constituent parts of sentences were hierarchically organized) and transformations to capture what later came to be called dislocation (or elsewhere, including in these lectures, displacement), where a single syntactic object is interpreted as if it occupies multiple syntactic contexts although in phonetic form it only occurs in one.
As Chomsky notes, both operations were “much too complex to meet the conditions of learnability or evolvability”, and therefore cannot provide a basis for genuine explanation. The specific problems with phrase structure rules are one, they allow too many possible rules, including those which would never be proposed for obvious reasons—e.g. S➔P+VP, cited in lecture #1, and two, they conflate three distinct aspects of language which, it has become clear in recent years, ought to be handled separately: hierarchical structure, linear order, and projection (which determines what kind of element each syntactic element is).
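As a rough illustration of the two problems just listed, the following Python sketch models a phrase structure rule as a bare pairing of a left-hand symbol with an ordered list of right-hand symbols. The `Rule` type, the `expand` helper, and the category names are hypothetical and are not taken from the lectures. Under these assumptions, nothing in the rule format distinguishes S ➔ P+VP from a plausible rule, and every rule fixes a label, a hierarchy, and a left-to-right order all at once.

```python
from typing import Dict, NamedTuple, Tuple


class Rule(NamedTuple):
    """A phrase structure rule (hypothetical encoding for illustration)."""
    lhs: str               # label of the mother node (projection)
    rhs: Tuple[str, ...]   # daughters in left-to-right order (linear order)


# The rule format accepts any pairing of symbols.
plausible = Rule("S", ("NP", "VP"))
implausible = Rule("S", ("P", "VP"))   # the S -> P + VP case cited in lecture #1


def expand(symbol: str, rules: Dict[str, Rule]):
    """Rewrite `symbol` top-down into a labeled, ordered tree."""
    if symbol not in rules:
        return symbol                  # category with no rule: leave unexpanded
    rule = rules[symbol]
    # One rule application fixes label, hierarchy, and order simultaneously.
    return (rule.lhs, [expand(daughter, rules) for daughter in rule.rhs])


tree_plausible = expand("S", {"S": plausible})
tree_implausible = expand("S", {"S": implausible})
```

Both calls succeed in exactly the same way, which is the point: the formalism itself provides no grounds for excluding the second rule.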


(5) S ➔ N'' V''

Thus clauses like “John proved the theorem” had exocentric structure in contrast to a corresponding derived nominal “John’s proof of the theorem”, which was endocentric. The problems that remain in X-bar theory are resolved with the formulation of Merge in the early 1990s. Instead of too many possible rules to account for compositionality, there is only one operation, Merge. And this creates only hierarchical structure, not linear order or projection. And because it doesn’t determine projection, it can create both endocentric and exocentric hierarchical structures depending on how projection is determined. Moreover, the operation Merge accounts as well for dislocation, thereby uniting composition and dislocation, which eliminates the need for the complex transformational rules of the 1950s or even an additional transformational operation Move α of the 1980s. When Merge applies to two unconnected syntactic elements, we get composition via what’s called external Merge. And when it applies to two syntactic elements, where one is contained within the other, we get dislocation via internal Merge: the same operation, two different cases for its application. Note further that internal Merge is also a compositional operation that creates new hierarchical structure by merging the element contained with the syntactic object that contains it, creating a new syntactic object. The third lecture turns to some problems with the formulation of Merge that are revealed by considering how the operation interacts with the workspace (WS) that contains the objects constructed by Merge and the atoms from which those objects are constructed (mentioned but not discussed in Chomsky (2013)). By focusing attention on this essential aspect of derivations under Merge, what was basically assumed but not considered, Chomsky is able to formulate a more restrictive concept of Merge, called MERGE, where the operation becomes a mapping between workspaces in which new syntactic objects are constructed.
For example, in the case where two lexical atoms a and b are merged, MERGE takes the WS = [a, b] (using square brackets to designate the workspace) and changes WS into WS' = [{a, b}, …]. How MERGE is a more restrictive formulation than previous versions of Merge becomes clear when we consider what … contains. If MERGE is like normal general recursion (e.g. proof theory), then WS' will contain a and b as well as {a, b}. But as Chomsky points out, this makes it possible to generate illegitimate structures that violate well-established linguistic constraints. Thus {a, b} could be expanded to any island construction X and then, if the WS containing X contains either a or b as a separate element, either could be merged with X, creating a chain with a or b contained in X, violating the island constraint. Therefore the formulation of MERGE must exclude this possibility, which demonstrates that recursion for language differs from recursion in general: it’s more restrictive.
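The contrast between the two ways of filling in “…” can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the workspace is modeled as a list, syntactic objects as frozensets of strings, and the function names `merge`, `MERGE`, and `merge_general_recursion` are invented here rather than taken from the lectures.

```python
def merge(x, y):
    """Form the unordered set {x, y}: hierarchy only, no order, no label."""
    return frozenset({x, y})


def MERGE(ws, x, y):
    """Restrictive mapping between workspaces: {x, y} replaces x and y."""
    remaining = [item for item in ws if item != x and item != y]
    return [merge(x, y)] + remaining


def merge_general_recursion(ws, x, y):
    """Unrestricted variant: x and y also remain as separate members."""
    return [merge(x, y)] + list(ws)


ws = ["a", "b"]
ws_restrictive = MERGE(ws, "a", "b")
ws_unrestricted = merge_general_recursion(ws, "a", "b")

# In the unrestricted workspace the atoms are still available on their own,
# so after {a, b} has been built up further they could be merged with it again.
atoms_still_separate = [item for item in ws_unrestricted if item in ("a", "b")]
```

In the restrictive version the workspace ends up holding only {a, b}; in the unrestricted version a and b persist alongside it, which is the configuration the text identifies as opening the door to island violations.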

This difference between recursion for language and recursion in general is ascribed to a restriction on resources available for computation by MERGE, what Chomsky postulates as a condition on resource restriction prohibiting each application of MERGE from generating more than one additional accessible element in a workspace. If WS' = [{a, b}], then there are three accessible elements ({a, b} plus a and b), where in contrast WS = [a, b] contained two. But if WS' = [{a, b}, a, b], then the number of accessible elements would increase to five, in violation of the resource restriction. This raises an apparent problem for internal MERGE, where WS = [{a, b}] is mapped onto WS' = [{b, {a, b}}]. WS contains three accessible elements, where WS' would contain five if each element b is counted separately. However, as Chomsky points out, given minimal search, b in {a, b} would not be accessible; so in this way internal MERGE satisfies the resource restriction, which Chomsky conjectures “probably reduces to a general third factor property of the nature of the brain”. Another potentially problematic issue with internal MERGE also concerns the creation of copies, considered unique to this operation. The existence of copies raises two questions: a) how does the computational system distinguish copies from non-copies (sometimes referred to as “repetitions”, especially when two or more apparently identical syntactic objects show up in phonetic form)? and b) how does the computational system determine what is a copy of what? In answer to the first question, the lectures turn to the concept of the duality of semantics: argument structure vs. discourse-oriented/information-related and scopal properties. External MERGE generates only argument structure whereas internal MERGE generates only the other properties.
Given the duality of semantics, syntactic objects that are interpreted as having an argument function (θ-role) but occur in non-argument positions are copies of identical syntactic objects that occur in argument positions. This assumes that copies are governed by a principle of Stability by which they must be syntactically and interpretatively identical (“a general property of computations”). And conversely, two syntactic objects in argument positions, even when they appear to be identical, are never copies (thus in “John saw John”, neither John can be interpreted as a copy of the other). Chomsky suggests that the answer to the second question comes from “some internal conspiracy about the nature of language” (“there are answers, but it’s not trivial”) and leaves it as “something to think through”. In these lectures Chomsky proposes an alternative conception for the operation MERGE in which both external and internal MERGE create copies. Thus when external MERGE generates {a, b} from [a, b], it copies a and b and merges them as {a, b}. The copies of a and b are eliminated by the resource restriction, so in effect the element {a, b} replaces a and b in WS'. In
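The accessible-element counts used in the discussion of the resource restriction above can be checked with a short Python sketch. The encoding is an assumption made for illustration: workspaces are lists, syntactic objects are frozensets, and minimal search is approximated by counting each distinct term only once within a single syntactic object. The function names are hypothetical.

```python
def occurrences(obj):
    """List every term of obj, once per occurrence (obj itself included)."""
    if isinstance(obj, frozenset):
        found = [obj]
        for member in obj:
            found.extend(occurrences(member))
        return found
    return [obj]


def accessible_count(ws, minimal_search=True):
    """Count accessible elements across all members of the workspace."""
    total = 0
    for obj in ws:
        terms = occurrences(obj)
        # Assumption: under minimal search a lower occurrence that repeats a
        # higher term of the same object is not separately accessible.
        total += len(set(terms)) if minimal_search else len(terms)
    return total


def respects_resource_restriction(ws_before, ws_after, minimal_search=True):
    """At most one additional accessible element per application."""
    before = accessible_count(ws_before, minimal_search)
    after = accessible_count(ws_after, minimal_search)
    return after <= before + 1


a, b = "a", "b"
ab = frozenset({a, b})               # {a, b}
b_ab = frozenset({b, ab})            # {b, {a, b}}, the output of internal MERGE

external_ok = respects_resource_restriction([a, b], [ab])
general_recursion_ok = respects_resource_restriction([a, b], [ab, a, b])
internal_ok = respects_resource_restriction([ab], [b_ab])
internal_ok_without_search = respects_resource_restriction(
    [ab], [b_ab], minimal_search=False
)
```

Under these assumptions the counts match the text: two elements in [a, b], three in [{a, b}], five in [{a, b}, a, b], and five in [{b, {a, b}}] when each b is counted separately but four once the lower b is treated as inaccessible.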

parameter. Chomsky suggests trying to demonstrate that all parameters have two properties: a) they are exclusively part of externalization, with no role to play in the interface between the internal language and C-I systems, and b) they are simply options left open for externalization that have to be decided one way or another and therefore don’t evolve. The contrast between the internal language and its externalization leads to Chomsky’s conjecture that the communicative inefficiency that shows up in the externalized language in the form of parsing, perception, and communication problems (specifically, with filler-gap constructions, structural ambiguities, and garden-path constructions) is strong evidence that language, based on the simplest computational operation MERGE, is designed to be computationally efficient (even when this results in communicative inefficiency). The computational efficiency of language will of course be determined on the basis of what other operations aside from MERGE the computational system utilizes. The fourth lecture examines a long-standing problem in generative grammar concerning “unbounded unstructured coordination”, illustrated in (6).

(6) I met someone young, happy, eager to go to college, tired of wasting time, …

The problem with these constructions, first discussed in Chomsky & Miller (1963), was characterized as follows: “In order to generate such strings, a constituent-structure grammar must either impose some arbitrary structure (e.g., using a right-recursive rule), in which case an incorrect structural description is generated, or it must contain an infinite number of rules. Clearly, in the case of true coordination, by the very meaning of this term, no internal structure should be assigned at all within the sequence of coordinate items.” The fourth lecture proposes a solution to the problem that introduces a second computational operation Pair-Merge, distinct from MERGE (which forms sets, hence Set-MERGE).
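The structural point in the Chomsky and Miller passage can be made concrete with a small Python sketch. It contrasts the nesting that a right-recursive rule would impose on the coordinate items in (6) with a flat sequence that assigns them no internal structure. The tuple encoding and the helper names are assumptions made for illustration; the sketch does not implement Pair-Merge, whose details the lectures leave open.

```python
def right_recursive(items):
    """Binary right-branching structure, as a right-recursive rule assigns."""
    if len(items) == 1:
        return items[0]
    return (items[0], right_recursive(items[1:]))


def depths(node, level=0):
    """Map each coordinate item to its depth of embedding in the structure."""
    if isinstance(node, tuple):
        result = {}
        for child in node:
            result.update(depths(child, level + 1))
        return result
    return {node: level}


conjuncts = ["young", "happy", "eager to go to college", "tired of wasting time"]

nested = right_recursive(conjuncts)   # arbitrary structure imposed on the items
flat = tuple(conjuncts)               # a sequence with no internal structure

nested_depths = depths(nested)
flat_depths = depths(flat)
```

In the nested version the items sit at different depths even though nothing in the meaning of (6) ranks them; in the flat sequence they are all at the same level, which is what “no internal structure” amounts to in this encoding.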
Chomsky considers Pair-Merge to be the next simplest operation after MERGE. The main general idea is that Pair-Merge is required to handle sequences. Exactly how this is done is left open with the proviso: “There are a number of possible ways to implement these general ideas, but to go into them here would carry us too far afield. They are a topic of current research.” However, the discussion in lecture #4 provides a few details that are suggestive. The notation for Pair-Merged elements employs angle brackets, the standard notation for sequences, not the curly brackets of Set-MERGE that are the standard notation for sets. So for a sentence like “I met someone young”, the noun and the adjective would be Pair-Merged as < someone, young > on the assumption that adjuncts are Pair-Merged, not Set-MERGEd. Furthermore, neither element that is Pair-Merged is accessible to further operations (e.g. internal MERGE), which might therefore provide a principled account of adjunct islands (though, as Chomsky cautions, the facts are complicated). But the syntactic object < someone, young > will be merged with the verb met to form the set { met, < someone, young > }. Another extremely intriguing suggestion is that because internal structures need not be limited to two dimensions (like a computer screen), adjuncts can occupy other dimensions, as many dimensions as there are adjuncts in an expression, so that each adjunct can directly associate with the element it modifies. Chomsky suggests that this is also true for constructions in which there is only one adjunct (e.g. “someone young”). Lecture #4 extends the discussion to coordinate structure with conjunctions. The fourth lecture goes on to investigate the possible scope of Pair-Merge beyond coordination that might include puzzles about perception verbs and quasi-causative verbs (make and let) in conjunction with bare verbs (e.g. “they saw the man walk down the street” vs. *“the man was seen walk down the street”, and “they let the man walk down the street” vs. *“the man was let walk down the street”) and persistent problems with head movement. The fourth lecture continues with some comments about what the atomic elements used in computations are, a topic that moves the discussion “towards the general domain of semantics … and how this domain relates to the [generative - RF] enterprise”. Chomsky points out that classical semantics as developed in the 20th century (from Frege to Quine and others) is based on notions of truth, reference and denotation which relate to the mind-independent world.
In contrast, Chomsky characterizes formal semantics in linguistics and philosophy, which he calls “some of the richest and most exciting work going on in the field in the last couple of decades,” as “pure syntax: symbolic manipulations of postulated entities that are not part of the mind-independent world, whatever their real-world motivation.” (The same can be said of generative phonology, as the lecture discusses.) The question that remains is how the mind-independent notions of truth, reference and denotation might connect to language, which would involve getting beyond syntax. The answer would have to involve words in the lexicon, but Chomsky demonstrates that garden-variety words like house, river, and London do not refer. So for the case of house, he says: … a house is something that we construct in our minds, which has a material element, but a crucial part of it is what Aristotle called the form. That’s something that’s part of our mental operations. When we use the word house, we’re referring to a mind-independent object which we interpret as a house. Referring is an

discussion in the lectures of unbounded unstructured coordination shows, the study of underlying processes must be informed by an attention to the particular phenomena that are found in languages. The situation is reminiscent of the one in mathematics noted in Courant (1937).

“The point of view of school mathematics tempts one to linger over details and to lose one’s grasp of general relationships and systematic methods. On the other hand, in the ‘higher’ point of view there lurks the opposite danger of getting out of touch with concrete details, so that one is left helpless when faced with the simplest cases of individual difficulty, because in the world of general ideas one has forgotten how to come to grips with the concrete. The reader must find his own way of meeting this dilemma. In this he can only succeed by repeatedly thinking out particular cases for himself and acquiring a firm grasp of the application of general principles in particular cases; herein lies the chief task of anyone who wishes to pursue the study of Science.”

And for the science of language as discussed in these lectures, Chomsky demonstrates how this continues to be a process of discovery that results from rethinking over and over the conceptual foundations and empirical basis of the field.

References:
Chomsky, Noam. 1966. Cartesian Linguistics: a chapter in the history of rationalist thought. Harper and Row.
Chomsky, Noam. 1970. Remarks on Nominalizations. In R. Jacobs and P. Rosenbaum (eds.), Readings in English Transformational Grammar, 184–221. Ginn & Co.
Chomsky, Noam. 1976. Conditions on Rules of Grammar. Linguistic Analysis 2, 303–351.
Chomsky, Noam. 1983. Some Conceptual Shifts in the Study of Language. In L. S. Cauman, I. Levi, C. D. Parsons, and R. Schwartz (eds.), How Many Questions?: Essays in honor of Sidney Morgenbesser, 154–169. Hackett Publishing Company, Inc.
Chomsky, Noam. 1986. Barriers. M.I.T. Press.
Chomsky, Noam. 1995. The Minimalist Program. M.I.T. Press.
Chomsky, Noam. 2006. Language and Mind. 3rd edition. Cambridge University Press.
Chomsky, Noam. 2013. Problems of Projection. Lingua 130, 33–49.
Chomsky, Noam. 2017a. The Galilean Challenge. Inference: International Review of Science, vol. 3, no. 1 (https://inference-review.com/article/the-galilean-challenge).
Chomsky, Noam. 2017b. The Galilean Challenge: Architecture and Evolution of Language. Journal of Physics: Conf. Series 880 (2017) 012015.
Chomsky, Noam. 2019. Some Puzzling Foundational Issues: The Reading Program. Catalan Journal of Linguistics (Special Issue), 263–285.
Chomsky, Noam and George Miller. 1963. Introduction to the Formal Analysis of Natural Languages. In R. D. Luce, R. Bush, and E. Galanter (eds.), Handbook of Mathematical Psychology, vol. 2, 269–322. Wiley.
Courant, Richard. 1937. Differential and Integral Calculus. Vol. 1. Blackie and Son.
Freidin, Robert. 1994. Conceptual shifts in the science of grammar 1951–1992. In C. P. Otero (ed.), Noam Chomsky: Critical Assessments, vol. 1, tome 2, 653–90. Routledge. Reprinted in Robert Freidin. 2007. Generative Grammar: Theory and its History. 266–

  1. Routledge.
Freidin, Robert. 2013. Chomsky’s contribution to linguistics: A sketch. In Keith Allan (ed.), The Oxford Handbook of the History of Linguistics, 439–67. Oxford University Press.
Freidin, Robert. 2020. Syntactic Structures: a radical appreciation. LingBuzz (https://ling.auf.net/lingbuzz/004996).
Harris, Zellig. 1951. Methods in Structural Linguistics. University of Chicago Press.

system that provides “a recursive specification of a denumerable set of sentences,” ultimately in phonetic form,^3 later sharpened to yielding what is called “the Basic Property” in these lectures and elsewhere. More generally, the theory of computation enabled the first clear formulation of Aristotle’s crucial distinction between “the possession of knowledge and the actual exercise of knowledge” (de Anima), between competence and performance, in modern terminology, the former being the essential element. The “Galilean challenge” discussed in the lectures kept to the “exercise of knowledge,” production and perception, which access the knowledge that is possessed and of course involve other faculties as well.^4 The same limits hold for the rich tradition of “general and rational grammar” that developed from the Galilean challenge: possession of knowledge was ignored. Humboldt’s by now famous aphorism that language involves “infinite use of finite means” keeps to use of knowledge, not the more fundamental notion of possession of knowledge, which can be accessed and used in various ways. The same holds more generally into the modern period, and in many ways still does. With these concepts available it becomes possible to move from the “closed world” of taxonomic science to the “infinite universe” of search for explanation, to paraphrase the title of a famous work on history of science. In the closed world of taxonomic science, there remains little to do beyond applying existing tools to more data. In the “infinite universe” of search for explanation, even the rudiments are little understood and each new discovery raises new and exciting challenges. The lectures below discuss some of the ways these challenges have been explored in the generative enterprise.
Lecture #1 (April 29, 2019)

What I would like to do in these lectures, which is actually just one continuous talk broken up into parts, is take them as far as I can up to contemporary work and problems, if we make it. I would like to discuss the state of the generative enterprise as it’s been called by some of its leading practitioners: what’s been accomplished, what the problems are, and what we can hope to see in the future. From the origins of this initiative, which incidentally revived a tradition that had long been forgotten and that was unknown at that time. But from the origins, the holy grail was genuine explanations of fundamental properties of human language, of the faculty of language. And that’s not such a simple matter to capture properly and to the extent you can, it’s been an elusive goal. And I think the present moment is unusual in the long history of the field, twenty-five hundred years. And that goal, I think, seems perhaps within reach. And if that’s the case, it would be a matter of no slight significance not just for linguistics, but beyond. These are the questions I’d like to explore in this extended lecture.

(^3) Chomsky (1949).
(^4) The study of production, furthermore, is crucially limited by the inability, which persists, to deal with the Cartesian property of “creative use of language.”

So to begin with, we have to clarify some basic questions, but highly contested questions about what the field is about. What’s the nature of the enterprise? I’ve personally always found it helpful to rethink these matters over and over. I hope you will too. So let’s begin with what sounds like the simplest question, namely, “what is language?”. Well, that question is plainly consequential. The answer to it will determine what we focus on, what kind of work we do, how we proceed, what counts as a result, and critically what counts as an actual explanation, a genuine explanation. There’ve been many proposed answers over the years. They differ in interesting ways, and, if we think about it a little, the question turns out to be not so simple. So suppose, for example, we asked the question in some other discipline, let’s say physics. We ask: what is the physical world? What is energy, what is mass, what is work? Any such question. The answer that we’ll get is some technical definition internal to explanatory theory. So we won’t get an account of what people intuitively think of as the physical world or think about energy and so on. That’s not to the point. We’ll find answers within a particular explanatory theory. Suppose we ask biologists “what is life?”. There it will be a little bit more ambiguous because the theoretical understanding has not reached the point where it’s obvious what the essential conceptual notions are. So it’s exploratory. Suppose we ask “what is thinking?”. Well, here it gets a little more complicated. Actually as you know, the question was posed by Alan Turing in a famous paper in 1950 which initiated the field of artificial intelligence and papers about whether machines think. And he starts off by saying that the question is too meaningless to deserve discussion.^5 So he’s not going to discuss it because the notion thinking is so vague and amorphous that you can’t give a response in the manner in which you might in say physics or even biology.
He’s asked what thinking is, he says it’s some kind of buzzing in the head, but nothing much more to say than that. So what he does is something quite different. He proposes a notion which he says might be somewhere within the range of what people call thinking and maybe it’s a useful notion. He suggests that it is, and in particular it might stimulate the development of new software, new machines. That’s the famous imitation game, the so-called Turing Test. Let’s move on. Notice that when you ask the question “what is thinking?”, “what is language?”, “what is meaning?”, “what is belief?”, and so on, the answers that you get are really what philosopher Charles Stevenson once called persuasive definitions, saying here’s what I think is interesting in the general domain of this loose notion. Here’s something I think it’s worth looking at. Well, you go back to the Turing test, notice that it’s not an attempt to explain and understand anything about thinking. It’s about an attempt to simulate some of the aspects of thinking. That’s a quite crucial difference. It didn’t seem so crucial in

(^5) Turing, Alan. 1950. Computing Machinery and Intelligence. Mind 59(236), 433–460. Cf. Chomsky, Noam. 2009. Turing on the “Imitation Game.” In Epstein, Robert, Gary Roberts and Grace Beber (eds.), Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer, 103–106. Springer.

The second approach is illustrated by the structuralist, behavioralist approaches to language of the first half of the twentieth century, which are still, of course, continuing. That took language, the object of study, to be, say, a corpus of materials that a field worker would elicit from an informant, or perhaps a set of sentences, or some other entity that’s external to people. So if you look at the actual formulations, for de Saussure, the founder of structural linguistics, a language is a kind of a social contract in a community, some collection of word images in the minds of the people of the community. Go to the leading American linguist of the early half of the twentieth century, Leonard Bloomfield, one of whose answers to the question “what is language?” is that “language is the totality of utterances that can be made in a speech community”, so something out there. Go to philosophy of language. For W.V.O. Quine, perhaps the leading and most influential philosopher of language in the mid twentieth century, a language is a set of “significant sequences, [which,] being subject to no length limit, are infinite in variety” (“The Problem of Meaning in Linguistics”). David Lewis, another influential philosopher, took the same view in his influential article “Languages and Language”: language is an infinite set of sentences used by a population. Both Quine and Lewis concluded that while it makes sense to say that a population uses this infinite set, it doesn’t make any sense to say that there’s a particular way of characterizing the set. To look for that, Quine said, would be “folly.” Lewis wrote that he could make no sense of the notion. If that’s what language is, what’s linguistics? Well, linguistics would, naturally, be a way of taking data, however you get it, typically from an informant, and applying various procedures and methods to get an organized form of that data.
The most sophisticated version of this was (as Tim Stowell mentioned) Zellig Harris’s Methods in Structural Linguistics. In Europe, Trubetzkoy’s Principles of Phonology was constructed on similar grounds. Well, this characterizes almost completely the structuralist-behavioralist approach to language. There is something kind of paradoxical about it. So what are these entities? What is the set of sentences spoken in a speech community? How can members of a population use an infinite set unless they have some way of determining what’s in the set or out of the set? In fact, how can we even coherently talk about an infinite set unless we have a method for characterizing it? So it seems to me at least that the approach of leading philosophers and logicians was kind of confused. It’s really the opposite. If you want to talk about infinite sets, you first have to discuss what the internal mechanism for characterizing that set is, what’s been called an I-language (I for internal). Well, whatever these ideas from the structuralist, behavioralist period are supposed to mean, which I think is not easy to answer, whatever it is, it’s something external to people, which people have some relation to. Now that by no means has ended; it continues right up to the present. There are strong currents that take very similar views, and I think one can ask the same questions about them, including within, roughly speaking, the generative enterprise.
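The point about infinite sets can be made concrete with a toy case. The sketch below is only an illustration, not anything from the lectures: the "language" is the artificial set of strings a^n b^n (n ≥ 1), and the names `generate` and `is_member` are hypothetical. What it shows is that the infinite set is never given as a list. All we ever have is a finite procedure that enumerates it or decides what is in it and what is out of it.

```python
from itertools import count, islice


def generate():
    """Enumerate the infinite set {a^n b^n : n >= 1}, one string at a time."""
    for n in count(1):
        yield "a" * n + "b" * n


def is_member(s):
    """Decide membership with a finite rule, without ever listing the set."""
    half, remainder = divmod(len(s), 2)
    return remainder == 0 and half >= 1 and s == "a" * half + "b" * half


# Only a finite prefix of the enumeration can ever be inspected.
first_three = list(islice(generate(), 3))
print(first_three)
print(is_member("aaabbb"), is_member("aab"))
```

Both functions are finite objects, and either one can serve as the characterization of the set. Without some such procedure there is nothing that says which strings belong.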

Suppose instead we adopt Jespersen’s view. Then the linguist is studying something that’s in the mind of the speaker: namely, the mature state that has been attained, that has, in Jespersen’s term, “come into existence”, and also the innate endowment of the speaker, the faculty of language, which first of all determines the principles that underlie the grammars of all languages and also makes possible the transition from finite data to the state attained, to the I-language in modern terms. As I said, the mature state attained is called the I-language (internal language in technical terms), and the innate principles are nowadays called universal grammar (UG), taking a traditional term and adapting it to a new context. The letter “I” in I-language is convenient. It refers to the fact that the internal language is first of all internal, secondly it’s individual, and thirdly it’s intensional (with an “s”). We’re interested in the actual procedure, the actual algorithm, not the set of things that it enumerates. So for example, if you’re studying, say, a person’s knowledge of arithmetic, you want to know exactly how that person carries out addition. You’re not talking about sets of triples X, Y, Z, such that Z is the sum of X and Y. Here too, we want to understand the generative system in intension. Well, I should say that with regard to universal grammar, there’s a good deal of confusion which exists right up to the present and which is worth dissolving. It’s very common to hear that UG has been refuted or that it doesn’t exist. What people presumably mean by that is that generalizations about language have exceptions, which is, of course, true. That’s true of generalizations, but that’s not what UG is about. UG in the contemporary sense is about the innate endowment that enables this transition that Jespersen talked about from finite data to the concept of structure in the mind. The concept of structure is what we call the I-language.
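The arithmetic example, the difference between the procedure and the set of triples, can be sketched in code. This is a minimal illustration under stated assumptions: the two procedures and the names `add_by_counting`, `add_by_columns` and `extension` are hypothetical, they handle non-negative integers only, and nothing here is a claim about how any person actually adds. The point is that two quite different algorithms yield exactly the same set of triples, so the set alone cannot tell you which procedure is being used.

```python
def add_by_counting(x, y):
    """Add by counting up from x, one step at a time, y times."""
    total = x
    for _ in range(y):
        total += 1
    return total


def add_by_columns(x, y):
    """Add digit by digit from the right, carrying, as in schoolbook addition."""
    xs, ys = str(x)[::-1], str(y)[::-1]  # least significant digit first
    digits, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        dx = int(xs[i]) if i < len(xs) else 0
        dy = int(ys[i]) if i < len(ys) else 0
        carry, digit = divmod(dx + dy + carry, 10)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return int("".join(reversed(digits)))


def extension(add, limit):
    """The triples (x, y, z) with z = add(x, y), for 0 <= x, y < limit."""
    return {(x, y, add(x, y)) for x in range(limit) for y in range(limit)}


# Same extension over the sampled range, different intension.
same_extension = extension(add_by_counting, 20) == extension(add_by_columns, 20)
print(same_extension)
```

Studying the system in intension means asking which of these procedures (or some other one) is actually in use, a question the set of triples leaves open.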
So it should be clear that the existence of this is not debatable. To deny it would be senseless. If it doesn’t exist, language acquisition is magic. There is a kind of coherent version of this common claim; see Tomasello and many others.^6 A coherent version would be to claim that there is some general learning mechanism, which has nothing specific to do with language. Or maybe some collection of cognitive capacities which integrates somehow to make it possible to achieve the properties of language, what the faculty of language did. There are a couple of problems with these proposals. One problem is simply that they reduce to hand waving. Or if they’re made at all explicit, they’re very quickly refuted. A second problem is that you can expect in advance that they’re not going to work, for reasons that were discussed by Eric Lenneberg in his classic book on the biology of language fifty years ago, in which he pointed out that there are double dissociations between language and other cognitive processes.^7 This work has since been greatly extended. Susan Curtiss is the person who’s done the most extensive work on this, and in fact there are
^6 Cf. Tomasello, “Universal Grammar is Dead,” Behavioral and Brain Sciences (2009).
^7 Lenneberg, Eric. 1967. Biological Foundations of Language. New York: Wiley.