Tracking the Origins of Transformational Generative Grammar



To appear in Journal of Linguistics 43 (2007).

Tracking the origins of transformational generative grammar

University of Edinburgh

Marcus Tomalin, Linguistics and the formal sciences: The origins of generative grammar (Cambridge
Studies in Linguistics 110). Cambridge: Cambridge University Press, 2006.
Pp. xiv + 233.

Tracking the main influences of 19th- and 20th-century mathematics, logic and philosophy on pre-1958
American linguistics and especially on early Transformational Generative Grammar (TGG) is an
ambitious cross-disciplinary endeavour. Ideally it would call for expertise in the methods of intellectual
historiography, the history and content of 20th-century American linguistics, the history and philosophy
of science (including logic and mathematics), the tools and results of mathematical logic, and the theory
of computable functions. Scholars fully versed in all of these fields are rare indeed. If Marcus Tomalin
makes some mistakes in his book (henceforth LFS), that should not be surprising. What is surprising is
how much progress he makes in furthering intellectually serious work on the history of modern
linguistics, and how wide his reading in the relevant technical literature has been.
LFS locates the intellectual roots of TGG in the methods developed by 19th- and 20th-century
mathematics and logic for exhibiting the conceptual structure of theories and constructing rigorous
proofs of theorems. Tomalin discusses the methods developed by Augustin-Louis Cauchy for the
rigorisation of the calculus in the 1820s; Whitehead & Russell's use of the axiomatic method in
Principia Mathematica (1910–1913); the Hilbert program (in the 1920s) to prove all of mathematics
consistent; Bloomfield's early axiomatisation of a general linguistic theory (1926); Carnap's logical
empiricist proposals for the logical reconstruction of science on an experiential basis in the 1920s and
1930s; and Goodman's (1951) adaptation and revision of Carnap (1928). Earlier histories of TGG have
not investigated the linguistic relevance of this literature. Tomalin argues persuasively that the two-level
approach to theorising now associated with early TGG (grammars as theories of languages, plus a
metatheory about the form of grammars) was a specialised adaptation to linguistics of techniques
developed for doing metatheoretical work on mathematics and logic, which were also adopted and
applied in the philosophy of science. And he points out that the approach can be found in linguistics
from Bloomfield (1926) on.
In this article we concentrate mainly on places where we think there are further questions that
should be asked, or where we disagree with Tomalin, or where we think he made mistakes. However,
we want to emphasise from the start that Tomalin deserves much credit: the scope of his reading of the
primary literature is broad and deep, and his book provides a valuable scaffolding for future work in the
area, even if it goes wrong in some of the details.

The introductory chapter of LFS presents a rough characterisation of what the formal sciences are. For
Tomalin, the phrase 'the formal sciences'

will be understood to include various branches of pure mathematics and symbolic logic, but, in
addition, it will also be stretched to include various branches of applied mathematics and logic
. . . Consequently, the phrase will come to denote a rather heterogeneous collection of related
theories drawn mainly from mathematics and philosophy . . . However, it is important to
recognise that the theories grouped together beneath this umbrella term all utilise some form of
the axiomatic-deductive method and that, therefore, despite their many differences, they all
involve the deduction of consequences (i.e., theorems) from a small set of intuitively obvious
axioms or assumptions, and, as a result, they can be viewed as being unified by the same basic
scientific method. (LFS: 2f.)

This crucially assumes that logical consequences in the formal sciences are DEDUCTIVE consequences,
not semantic consequences. Thus Tomalin immediately signals what will become a central theme of the
book, that the kind of proofs that are characteristic of formal sciences are DERIVATIONS. This excludes
some tools of logic and mathematics that are typically taken to fall within axiomatic deductive methods.
As Tomalin tells the story, what was really central to the birth of TGG was the idea of a formal theory,
which he identifies with Hilbert's proof theory (Beweistheorie). So model theory appears to be excluded
from the methods of the formal sciences. This is a very radical restriction, excluding from the formal
sciences a huge amount of pure and applied mathematics, logic, and large areas of formal linguistics. (As we shall
see, the exclusion of semantics was NOT in fact characteristic of the methods used to rigorise the
Tomalin's view of what 19th-century mathematicians would have understood by 'rigorisation' is
the one expressed by Grabiner (1981: 5, quoted in LFS: 26f.):

First, every concept of the subject had to be explicitly defined in terms of concepts whose nature
was held to be already known . . . Second, theorems had to be proved, with every step in the
proof justified by a previously proved theorem, by a definition, or by an explicitly stated axiom
. . . Third, the definitions chosen, and theorems proved, had to be sufficiently broad to support
the entire structure of valid results belonging to the subject.

Newton, Leibniz and others had accepted certain results of the calculus on the grounds that they were
predictively accurate. Cauchy wanted to prove these already accepted results from explicitly stated
axioms and, crucially, explicit definitions of concepts such as limit, convergence, continuity,
derivative, etc. The idea was not only to put the calculus on an epistemologically secure foundation
but also to see the connection between the defined concepts and the result proved. Thus the proofs were
not mere symbol manipulations; as Grabiner says, the derivation of a result by manipulating symbols
was not a proof of the result. Both the definitions and the axioms were supplied with meaning through
a reduction to algebra, which was regarded as more secure than the calculus was in the 18th century.
Tomalin does not separate out the various components making up Cauchy's axiomatic-deductive
methods, but they could be stated as practical maxims. Our proposed phrasing of the maxims
attributable to Cauchy (we will suggest other maxims later) would be as follows:

(A) State the fundamental assumptions of the pre-formal theory as axioms.
(B) Define all non-primitive concepts explicitly.
(C) In proofs of theorems, use only definitions, axioms and previously established theorems.
(D) Reduce the axioms, definitions, and theorems to a better-understood and less controversial theory.

The use of (A) and (B) in mathematics dates back at least to Euclid's Elements, so they are
certainly not novel in 19th-century mathematics. Even Newton called his laws of motion 'axioms', so
(A) does not distinguish Cauchy's rigorous methods from Newton's empirically justified,
prediction-oriented calculus. But in (B) we see a dramatic difference between Euclid and Cauchy, on the
one hand, and Newton and Leibniz, on the other: Cauchy wanted to replace Newton's inchoate intuitive
notion of fluxion and Leibniz's use of momentary differences by explicitly defined concepts.
And even more importantly, (C) and (D) are key components of Cauchy's contribution to the
development of axiomatic-deductive methods. The use of reductive methods like (D) is typical when
theories are axiomatised for epistemological reasons. For Cauchy, (D) was adopted in part to exhibit the
relationships between concepts, fundamental assumptions, and theorems. But there are also other
epistemological reasons for adopting (D). One is EPISTEMOLOGICAL FOUNDATIONALISM: the view that
theories must be made secure against rational doubt. The idea that mathematical and logical theories can
achieve foundationalist goals by means of (D) came under attack in the last half of the 20th century, and
the need for foundationalist projects in mathematics has been rejected in recent decades (see Shapiro
1991 for an excellent discussion). Tomalin, however, appears to assume that epistemological
foundationalism is not just a motivation for using the axiomatic method, but part of it: he fails to
distinguish between foundations and foundationalism (see, for example, pp. 28, 38–45).
Tomalin does not compare the goals of different mathematical or logical programmes that
developed and advocated the use of axiomatic methods. Indeed, he writes as if there is a unique,
univocal axiomatic-deductive method regardless of goals. This, we think, is a mistake: not every maxim
is adopted by every theorist, because different maxims are useful in achieving different goals.
The set-theoretic paradoxes that were discovered in the second half of the 19th century (LFS:
29–32) provoked at least three philosophical movements which Tomalin surveys: Logicism, Formalism,
and Intuitionism. It is the first two that loom largest in LFS.
Logicism is the programme in the philosophy of mathematics that was pursued by Frege, and by
Whitehead and Russell; Tomalin describes it as the project of using logic as a basis for arithmetic (p.
33), though in fact there are various different formulations (see Rayo 2005 for a discussion).
Hilbert's Formalism is a different philosophical programme. Most philosophers of mathematics
would characterise it as the programme of showing that the whole of mathematics is consistent and
complete. Hilbert's proof theory was not the whole programme; rather, it was the most famous strategy
for working on the programme.
Tomalin claims that perhaps the main achievement of this book has been to associate TGG with
both Formalism and Logicism (p. 186). The trouble is, Tomalin equivocates on Formalism, correctly
pointing out a common misconception, but then apparently falling into it himself. To make this clear we
need to explain a bit about Hilbert's programme.
A crucial step in Hilbert's strategy for showing the consistency of mathematics (see Hilbert
1927) was to convert ordinary arithmetical propositions into formulas by a fully automatic procedure
referred to by Tomalin as FORMALISATION (p. 40). Hilbert's insight here is that the propositions of
mathematics can be mechanically paired with syntactically defined formulas. This conversion step is a
formalisation in the sense that it mechanically produces formulas that serve as meaningless
representatives of the propositions of ordinary arithmetic. Hilbert calls them copies (Abbilder) of those
A formula is then said to be PROVABLE if it is the last line of a DERIVATION in a sense that should
be very familiar to those who know generative grammar: a derivation is a sequence of formulas each of
which either is an axiom, or is a substitution instance of an axiom, or is obtained from the axioms and
substitution instances by an inference rule. Consistency was then to be proved by showing that some
mathematical contradiction like 0 ≠ 0 is NOT provable. (That shows that the whole system is consistent,
because in an inconsistent system EVERYTHING is provable. Incidentally, the linguist acquainted with
basic results on generative capacity will see here the seeds of Gödel's celebrated later result: to assume
we have a mechanical method for showing unprovability of a formula is to assume that the set of all
provable formulas is a decidable set; the problem is that it might not be.)
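The notion of derivation just described can be made concrete with a minimal sketch in Python. It is an illustration under stated assumptions, not a reconstruction of Hilbert's system or of anything in LFS: formulas are toy objects (atoms or implications), the only inference rule is modus ponens, substitution instances of axioms are left out for brevity, and all the names (`implies`, `is_derivation`, `is_proof_of`) are hypothetical.

```python
# Toy formulas (an assumption of this sketch): an atom is a string,
# an implication is the tuple ("->", antecedent, consequent).

def implies(a, b):
    """Build the implication 'a -> b'."""
    return ("->", a, b)

def follows_by_modus_ponens(formula, earlier):
    """True if some earlier line X and the earlier line 'X -> formula' both occur."""
    return any(implies(x, formula) in earlier for x in earlier)

def is_derivation(lines, axioms):
    """Check purely by form that every line is an axiom or follows from earlier lines."""
    earlier = []
    for formula in lines:
        if formula not in axioms and not follows_by_modus_ponens(formula, earlier):
            return False
        earlier.append(formula)
    return True

def is_proof_of(formula, lines, axioms):
    """A formula is provable by 'lines' if it is the last line of a derivation."""
    return bool(lines) and lines[-1] == formula and is_derivation(lines, axioms)

# Hypothetical axioms and candidate derivations, chosen only for illustration.
axioms = {"p", implies("p", "q"), implies("q", "r")}
good = ["p", implies("p", "q"), "q", implies("q", "r"), "r"]
bad = ["p", "r"]  # "r" is neither an axiom nor licensed by an earlier line

print(is_proof_of("r", good, axioms))
print(is_derivation(bad, axioms))
```

Checking a given sequence of lines is a mechanical matter of inspecting form, which is the sense in which the rules are formal. Showing that NO derivation ends in a given formula is a different and harder task, and nothing in the sketch addresses it.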
Inference rules function rather like rewriting rules in a generative grammar (see section 5
below), and are formal in the sense that they rely purely on form: they do not depend on any
interpretation of the formulas. Hilbert thus employs string manipulation techniques on meaningless
copies of the meaningful propositions of arithmetic, as a way of building consistency proofs. But IT
DOES NOT FOLLOW from this that mathematics itself is a game played with meaningless symbols: the
propositions of mathematics are not meaningless (Hilbert's strategy merely employs meaningless
Abbilder as surrogates for them).
Tomalin seems to understand all this; he spends a page (44) warning that it is a common
misconception that Hilbert thought mathematics was a game played with meaningless symbols, a
misconception that was partly due to people ignoring the distinction between the formalisation process
and the metamathematical process.
Note carefully that Tomalin uses 'formalisation' for the step of converting mathematical
propositions into formulas; 'formal' for systems and proofs that make no reference to meaning; and
'Formalism' for the larger enterprise of proving mathematics consistent.
But only five pages later, in the context of pointing ahead to his subsequent discussion of
Hilbert's influence on Carnap and Quine, Tomalin (49) changes his terminological policy, and conflates
Formalism with being formal. He notes correctly the interest in the syntax of formal languages
displayed in Quine's Mathematical Logic (1940), but incorrectly claims his presentation is
'conspicuously influenced by Formalism'. The example he cites involves a passage where Quine

introduces an alphabet of primitive symbols and adds, with reference to strings formed from this
basic symbol set, 'all these characterisations are formal systems, in that they speak only of the
typographical constitution of the expressions in question and do not refer to the meanings of
those expressions' (Quine 1940: 283), and it is this emphasis on form, rather than the content of
symbolic expressions, that reveals the influence of Hilbert's proof theory (as mediated by

In Tomalin's view, it is not the programme of showing all of mathematics to be consistent that has
conspicuously influenced Quine in electing to work with strings over a fixed finite alphabet; nor is it
what Tomalin earlier called the FORMALISATION step; it is merely being formal.
This is not a mere quibble. Which of the three (the formalisation step, the project of Formalism,
or being formal) does Tomalin claim to have linked to the origins of TGG?
Tomalin mentions some other axiomatisations of mathematical and logical theories that all
satisfy at least maxims (A)–(C): John Young's Fundamental Concepts of Algebra and Geometry (1911),
W. V. O. Quine's Mathematical Logic (1940), and Alonzo Church's Introduction to Mathematical
Logic (1944), for example.
In the context of discussing Church, he hints at what he understands the
axiomatic-deductive method amounted to (49):

Church presented the same basic topics which, by the mid-1950s, were rapidly becoming
an incantational mantra (i.e., primitive symbols, variables, quantifiers, propositional
calculus, first-order predicate calculus, second-order predicate calculus, and so on) . . .

The first three topics that Tomalin mentions are not topics at all, but just standard categories employed
in the syntax and semantics of typical formal languages. What Tomalin might have in mind is that the
following maxim is often adopted in the most formal presentations of mathematical and logical theories:

(E) Use a well-defined invented formal language to state the axioms.

And the last three topics are simply distinct types of logic, some of which had properties that were
well known by the mid-1940s. Possibly Tomalin means to suggest another maxim, (F):

(F) Identify the logic that is used in deriving the consequences of the (non-logical) theory.

(This would be a special case of Suppes' observation (1956: 156) that in mathematics a statement of
what theories are assumed is needed.)
The motivation for adopting (E) is often taken to be to rid axiomatisations of the vagueness and
ambiguity of natural language. The motivation for adopting (F) is that without it there is no explicit
objective standard of what follows from the axioms of the theory.
LFS began by claiming that the formal sciences can be viewed as being unified by the same
basic scientific method (3). It should be clear that (A)–(F) do not characterise any single method. They
are a family of maxims, from which different researchers adopt different subsets for particular purposes.
There is no unified axiomatic-deductive method: which of the maxims a theorist chooses to respect
will depend on the goals of the axiomatisation project, and goals are diverse.

Tomalin's long chapter headed 'Mathematical linguistics' is not about mathematical linguistics. It is a
survey, in 53 pages, of historical material rich enough to make a book on its own: the extent of the
influence of mathematical logic, recursive function theory, and logical empiricism on the
mathematicisation of linguistics in the period 1926 to 1953. It is necessarily somewhat superficial, but
could serve as a useful guide to sources for future work on the use of formal methods in linguistics
and how they influenced linguists' conceptions of good linguistic theorising.
The chapter argues that the axiomatic method as developed in mathematics and logic was
applied to presenting linguistic theories during the quarter of a century before Chomsky (1955 [1975]).
So part of what makes the chapter important given the goals of LFS is its role in showing that there was
tacit agreement among a group of American linguists (in particular those most influenced by
Bloomfield) about the general form linguistic theories should take.
Tomalin begins by discussing Bloomfield's paper 'A set of postulates for the science of
language' (1926), which attempted to set out an entire approach to linguistic theorising in axiomatic
form. He quotes one or two examples of Bloomfield's axioms and definitions, but does not discuss the
content of the paper in detail. This is an interesting signal that LFS itself takes the curious view that the
form of linguistic theories is worth discussing quite independently of their content. This is curious
because evaluation of the appropriateness of axiomatising must surely depend in part on what is to be
We do not agree at all with Tomalin's comparison between Bloomfield (1926) and Cauchy's
rigorisation of the calculus. Tomalin writes (55):

The comparison with Cauchy's rigorisation programme is not vacuous since the emphasis . . . is
upon stating assumptions explicitly, and determining which aspects of a theory are
interdependent and which can be treated independently.

But the mere fact that both Cauchy and Bloomfield (like Euclid) respected maxims (A) and (B) does not
make it reasonable to draw a favourable comparison between Bloomfield's short paper and Cauchy's
major project. The two projects are utterly different in important ways. Maxim (C) is irrelevant to
Bloomfield (1926), because he did not prove, or try to prove, anything. Nor did Bloomfield follow
maxim (D): he did not reduce any linguistic theory or grammar to a different theory that he favoured at
the time, such as Weiss's (1925) axiomatisation of behaviourist psychology (which, as Tomalin notes,
was a model for Bloomfield 1926).
Bloomfield (1926) laid out the assumptions of his general approach to linguistic theorising, and
urged that other linguists should do so as well. He wanted technical vocabulary to be explicitly defined
in order to avoid certain kinds of error, and to show which specialised linguistic terms were undefinable
and primitive and which could be defined in terms of the primitives (1926: 26). He warned that
otherwise linguists were in danger of making obscure speculations about languages, as
did Humboldt, or engaging in fruitless psychological disputes like the ones that had gone on between
Paul, Wundt, and Delbrück.
Tomalin performs a very useful service in reminding linguists (or perhaps in many cases
informing them for the first time) of the almost forgotten work of Harwood (1955), which suggested
stating grammars for generating finite sets of strings in two parts: (i) a set of strings taken to be axioms,
and (ii) a set of inference rules for deriving further strings from them. Chomsky (1957) does cite this
work, but only to reject it as inadequate for not containing any provision for recursion. Harwood's
motivation was apparently to consider methods for measuring economy of description of finite corpora.
That may explain why Harwood seems content with a description that entails a finite bound on sentence
Tomalin's consideration of recursive devices (66) attempts to link Dedekind's use of recursive
functions, Peano's inductive definition of the successor function, and Bar-Hillel's definition of
'sentence' that uses 'sentence' in the definiens (Bar-Hillel calls the latter 'recursive in disguise'). He
then explores the influence on non-TGG linguistics of the Lvov-Warsaw school of logic founded by
The logicians who emerged from this extraordinarily important school included
Łukasiewicz, Leśniewski, Ajdukiewicz, and Tarski. Tomalin is especially (and rightly) interested in the
influence of Ajdukiewicz's definitions of logical calculi on Bar-Hillel's conception of syntax, which
gave rise to modern categorial grammar. In our view, however, Bar-Hillel's work was more important
to the history of linguistics than Tomalin explicitly recognises: it is not just all the work in categorial or
type-logical syntax and semantics that largely sprang from it (Bach, Carpenter, Dowty, Jacobson,
Lambek, Montague, Moortgat, Partee, Steedman, Szabolcsi, . . .), but also to some significant degree
early HPSG, and Chomskyan minimalism, and the work of Keenan & Stabler (2003).
As Tomalin sees it, Bar-Hillel attempted to combine two things. One was the methods Harris
(1951) developed to attempt to read off the structure of languages from corpora. The other was
Ajdukiewicz's notation and approach to defining logical calculi. Tomalin holds that Bar-Hillel's
adaptation of Ajdukiewicz-style formation rules implies that the type of formal languages used in
symbolic logic (particularly the techniques developed by Ajdukiewicz) are closely related to natural
languages and that therefore methods developed to analyse the former can readily be adapted in order to
provide analyses of the latter (73). Tomalin regards it as significant that categories can be combined to
form well-formed strings by a purely mechanical process (72). So Bar-Hillel's mechanical
combinatory procedure could be regarded as analogous to Hilbert's formalisation, but it is not formal in
the sense of being unassociated with a semantics.
Tomalin fails to note that, although Bar-Hillel never uses the word 'generate', he was
unquestionably proposing a generative grammar, several years before Chomsky. Bar-Hillel's proposal
was for a BOTTOM-UP generative grammar, in the sense that well-formed expressions are built up from
smaller categorised parts (as in the post-1990 minimalist program) rather than being derived top-down
from a start symbol as in pre-1958 TGG. This is important for Tomalin's project. As he notes later in
the book (164), the particular concepts and techniques that eventually came to be associated with TGG
were already spreading throughout the linguistics community, albeit in an imprecisely articulated form,
by the mid-1950s. Bar-Hillel's work, rooted in the work of the Lvov-Warsaw school, should have been
mentioned in the same context.
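The bottom-up character of a categorial grammar can be illustrated with a short Python sketch. It is a toy under explicit assumptions, not Bar-Hillel's own formulation: categories are either atoms ('s', 'n') or functors that cancel against an adjacent argument, the three-word lexicon is hypothetical, and the helper names (`right`, `left`, `combine`, `categories`) are our own.

```python
from itertools import product

def right(result, arg):
    """Functor category that yields `result` when `arg` stands immediately to its right."""
    return ("/", result, arg)

def left(result, arg):
    """Functor category that yields `result` when `arg` stands immediately to its left."""
    return ("\\", result, arg)

def combine(a, b):
    """Categories obtained by cancelling adjacent categories a and b (a precedes b)."""
    out = set()
    if isinstance(a, tuple) and a[0] == "/" and a[2] == b:
        out.add(a[1])
    if isinstance(b, tuple) and b[0] == "\\" and b[2] == a:
        out.add(b[1])
    return out

def categories(words, lexicon):
    """Bottom-up chart: every category derivable for the whole word sequence."""
    n = len(words)
    chart = {}
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(lexicon.get(word, ()))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):
                for a, b in product(chart[(i, k)], chart[(k, j)]):
                    cell |= combine(a, b)
            chart[(i, j)] = cell
    return chart[(0, n)] if n else set()

def is_sentence(words, lexicon):
    """A string is well formed if the category 's' can be built for all of it."""
    return "s" in categories(words, lexicon)

# Hypothetical toy lexicon, for illustration only.
lexicon = {
    "poor": [right("n", "n")],
    "john": ["n"],
    "sleeps": [left("s", "n")],
}

print(is_sentence(["poor", "john", "sleeps"], lexicon))
print(is_sentence(["sleeps", "john"], lexicon))
```

No start symbol is rewritten anywhere in the sketch: larger expressions receive a category only because their categorised parts cancel, which is the sense of 'bottom-up' intended above.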

Tomalin is the first historian of linguistics to seriously examine the relevance to TGG of Carnap's
constructional system theory, and Nelson Goodman's revision of it. Carnap's project encapsulated the
logical empiricist research programme. It was set out in Carnap's Der logische Aufbau der Welt:
Versuch einer Konstitutionstheorie der Begriffe (1928) (hereafter Aufbau). In The Structure of
Appearance (1951; henceforth TSA), Goodman significantly modified Carnap's project.
Tomalin aims to articulate the direct line of intellectual influence from Carnap to Goodman and
Quine, and from Goodman and Quine to Chomsky. He claims that the Aufbau exerted a lasting
influence over syntactic theory in the 1950s (LFS: 73). This is certainly right, in the sense that Aufbau
influenced conceptions of what a scientific theory is, quite generally, from the 1930s to at least the
It is well known that Chomsky took courses from Goodman at the University of Pennsylvania;
but LFS, to its credit, actually discusses the content of Goodman's TSA (a revision of his dissertation),
and Carnap's Aufbau. What Tomalin aims to show is that Goodman's revision of Carnap's logical
empiricist project for rationally reconstructing all of science was adopted in Chomsky (1951) and
However, LFS does not discuss in any detail the aims of the logical empiricist program as
articulated in the Aufbau. What matters to Tomalin is that the Aufbau endorsed maxims (A)–(D) in its
proposal for constructing (or really, reconstructing) all of science. Hilbert had earlier proposed
extending his program of Formalism to physics this way (see Corry 2004). But what the logical
positivists wanted to do was to reconstruct all of science, psychological and sociological theories
included; it was not part of their project to offer an explicit proof of consistency.
The Aufbau aimed to provide a metatheoretical framework that would (i) unify all scientific
theories, (ii) secure the foundation of science by showing that it can be reconstructed on a purely
experiential base, and (iii) show that all philosophical problems are pseudo-problems. Importantly, the
Aufbau did not rationally reconstruct any particular scientific theory. It merely set out a programme, and
addressed metatheoretical problems that might impede it. This is the way Carnap describes the project
of reconstructing science:

Unlike other conceptual systems, a constructional system undertakes more than the division of
concepts into various kinds and the investigation of the differences and mutual relations between
these kinds. In addition, it attempts a step-by-step derivation or construction of all concepts
from certain fundamental concepts, so that a genealogy of concepts results in which each one
has its definite place. It is the main thesis of construction theory that all concepts can in this way
be derived from a few fundamental concepts, and it is in this respect that it differs from most
other ontologies. (Carnap 1967: 5, quoted in LFS: 74f.)

The fundamental experiential basis was an autopsychological basis that consists only of conscious
experiences (1967: 102f.). These conscious phenomenal experiences are the given: a theory-neutral
indubitable foundation in terms of which all concepts and objects of science were to be explicitly
defined. The empiricism of logical empiricism is located in this indubitable autopsychological base.
As Tomalin notes, this reductive empiricist base was intended to be the secure epistemological
foundation of science, and the idea that empirical knowledge needs such a base is the foundationalist
dogma of empiricism that Quine (1953) inveighs against.
Tomalin seems to think that the Aufbau aimed to consider questions of knowledge acquisition
(74). This is something of a misunderstanding. The goal of the project described in the Aufbau was to
JUSTIFY the claims that scientific knowledge is unified, based only on experience, epistemologically
secure, and free of metaphysics. The ACQUISITION of scientific knowledge can be described historically
or psychologically, but it is hard to understand how a programme for reconstructing all of science on an
experiential basis could answer any question about the way people acquire scientific knowledge. For
surely, even working scientists acquire much of their scientific knowledge from the testimony and
publications of other scientists, not merely from their own phenomenal experiences.
Tomalin tells us that Quine's familiarity with the Aufbau began in the early 1930s: it motivated
him to visit the Vienna Circle in 1932 and 1933, and ultimately to visit Carnap, who was lecturing in
Prague at the time. (We would add that Quine also visited Warsaw during this year; Dresner (1999)
argues that through the influence of Carnap, Ajdukiewicz had a lasting influence on Quine's philosophy
of language.) He reports that Quine first came into regular contact (78) with Goodman when Carnap
was lecturing at Harvard on logical syntax in 1935. And Tomalin emphasises that Quine and Goodman
shared an interest in the Aufbau.
What is more important, because of its later direct influence on Chomsky (1955 [1975]), is that
Tomalin discusses (perhaps for the first time for an audience of linguists) the content of TSA,
Goodman's revision of Carnap's programme in the Aufbau. Moreover, Tomalin emphasises that by
1943 Goodman had developed a deep interest in economy and simplicity as criteria of adequacy for
constructional systems that would influence early TGG.
The discussion of economy in LFS includes the following striking passage that is very closely
paraphrased in Chomsky (1955 [1975]: 114, n. 2, quoted in LFS: 116). Goodman writes:

The motives for seeking economy in the basis of a system are much the same as the motives for
constructing the system itself. A given idea A need be left as primitive in a system only so long
as we have discovered between A and the other primitives no relationship intimate enough to
permit defining A in terms of them; hence the more the set of primitives can be reduced without
becoming inadequate, the more comprehensively will the system exhibit the network of
interrelationships that comprise its subject matter. (Goodman 1943: 107, quoted in LFS: 83)

This idea of economy was, as Tomalin shows, adopted in early TGG as an adequacy condition on being
one of the genuine primitives of an axiomatic deductive system, the primitives being all those concepts
(or terms denoting them) that cannot be explicitly defined in the system without circularity. Highly
formalised mathematical and logical theories aim for this kind of economy, i.e., they require that the
primitives be independent of each other. If the goal of using the axiomatic-deductive method is to make
the STRUCTURE of the subject matter perspicuous, then reflecting on the relations between the primitive
concepts of the pre-formal theory (and their relations to each other as stated in the axioms) might be
thought to display a structure of the subject matter of the theory. But, we should note, there will be other
structures that are attributable to the subject matter under reaxiomatisations of the theory.
And we also note that Goodman speaks of THE network of interrelationships: he seems to be
claiming in the above passage that basal economy might serve to constrain the theory up to the UNIQUE
structure of a scientific subject matter!
Although Tomalin does not mention this point, the above passage from Goodman might be
understood to express Hilbert's idea that the axioms of a mathematical theory give an implicit definition
of the primitives of the theory. For example, in his correspondence with Frege, Hilbert wrote that in
geometry terms like 'point' and 'line' should not be explicitly defined in intuitively spatial terms, as
Frege thought, but rather that the set of axioms should be taken to implicitly define them, thus
identifying the relations that exhibit the structure of the subject matter. In philosophy, this kind of
definition is called an implicit functional definition, and it has been used by philosophers to state
functionalist definitions of mind as an alternative to behaviourism. (The connections to syntax should be
clear: a phrase structure grammar gives an implicit functional definition of the categories it employs.)
This suggests that it might be worth investigating the genealogy and uses of implicit definitions from
Hilbert through Carnap and Goodman to Chomsky. Instead, what Tomalin himself takes from the above
quotation is this:

In essence, a simpler system is a better system, so long as it does not become inadequate; and
'better' in this context means a more economical system, since such systems are understood to
provide more profound insights into the phenomena analyzed. (LFS: 84)

This is a seriously uncharitable reading of Goodman. It is more plausible that Goodman is introducing a
further standard part of the axiomatic-deductive method (see Suppes 1956: 156). It could be expressed
in a further maxim:

(G) Ensure that all primitive terms are independent of each other.

Tomalin's aim in the following section, entitled 'Formal linguistic theory' (88–106), is to
establish a stream of influence flowing from Hilbert through Carnap's logical syntax to Bloomfieldian
linguistics. The first step is to show Hilbert influencing Carnap. But there is a major flaw in Tomalin's
exposition. In order to illustrate the indebtedness to Formalism in Carnap's book he quotes Carnap's
The Logical Syntax of Language saying that a system is formal when

no reference is made to the meaning of the symbols (for example words) or to the sense of
expressions (e.g. sentences), but simply and solely to the kinds and order of the symbols from
which the expressions are constructed. (Carnap 1934 [1937]: 1, quoted in LFS: 91)

But once again, Tomalin is merely referring to being formal, though he calls it 'Formalism'. It is true
that Hilbert has been mistaken for a formalist (in the 'game-playing' sense, the common
misconception referred to earlier), and also that Carnap was one of those who misread Hilbert in this
way; but Corry (2004) argues convincingly that it is indeed a misreading (as Tomalin seemed to accept
on page 44). Tomalin is thus trying to trace Hilbertian influence through the work of someone who
misunderstood Hilbert. If Tomalin is going to catalogue the genuine influence of Hilbert's actual views
on Carnap and early TGG, then these threads of misunderstanding need to be sorted out.
The next step in Tomalin's argument is to establish the influence of Carnap on Bloomfield.
Tomalin regards Bloomfield as a Carnapian formalist. But why? Solely because Bloomfield
repeatedly expressed skepticism about the role of meaning in linguistic theory (95). It is true that
Bloomfield located the study of meaning in psychology, not linguistics; and, notoriously, by the time he
wrote Language (1933) he regarded behaviourism as the most promising research program in
psychology. But Bloomfield's view that linguistics is about form in natural languages and not about
meaning is present in the postulates that he published two years before the first (German) edition of the
Aufbau. Bloomfield (1926: 27) states that the language of a speech community is the totality of
utterances that can be made in that speech community, and that an utterance is made up wholly of
forms. If utterances are constituted by their non-semantic properties, and linguistics studies utterances,
then it is no part of linguistics to investigate meanings. Bloomfield's exclusion of meaning from the
subject matter of linguistics is thus entirely independent of Carnap (1928).
Bloomfield's view of linguistics has nothing to do with either formalisation, or Formalism, or
being formal, as these were earlier characterised. Bloomfield proposed axioms and definitions, but they
were fully interpreted: when he said 'utterance' he meant to refer to utterances. He was not interested in
proofs of consistency; he proposed no mechanical procedures for obtaining such proofs; and he did not
take either grammatical descriptions or general linguistic theories to be uninterpreted symbol systems.

The fourth chapter opens with a brief personal and intellectual biography of Noam Chomsky up to the
mid-1950s. After that it attempts to identify the presence of a particular influence on Chomsky, and to
trace its development as his research gradually matured during the 1950s (108). (This sounds like
development over a whole academic career, but keep in mind that Tomalin is talking about largely
unpublished work by a young man between the ages of 22 and 28!) The influences discussed include
reminders of what was established in chapter 3, which makes the chapter rather repetitious.
Tomalin understands Chomsky's MA thesis, The Morphophonemics of Modern Hebrew (1951,
henceforth MMH), as aiming to combine Harris's goal of developing a discovery procedure for
grammars from a corpus with Goodman's constructional system theory and conception of simplicity.
There is a long discussion of Chomsky's use of the concept of simplicity in MMH, and of his more
substantive view of simplicity in his massive 1955–1956 work The Logical Structure of Linguistic
Theory (Chomsky 1955 [1975], LSLT). In effect, Tomalin thinks that Chomsky took Goodman's
program in TSA and applied it to Hebrew morphophonemics. (This would mean that Goodman's
influence on Chomsky predated the publication of TSA, perhaps through material presented in classes.)
In section 4.4, 'Constructive nominalist syntax' (121–125), Tomalin discusses Chomsky's first
published paper, 'Systems of syntactic analysis' (1953), which definitely does show some influence of
Goodman's technical work. But we see again the equivocation about Formalism: The Formalist
emphasis of Chomsky's paper is clear, he asserts, because he aims to keep syntactic analysis purely
formal (122), following the early 1950s trend in syntactic theory to develop methods of analysis that
do not require access to semantic information (121). This remark embodies a confusion (the same one
that we just saw with reference to Bloomfield). The semantic information that the analysis needs no
access to is the semantics of the natural language being analysed, not the semantics of the language in
which the analysis is couched. Chomsky's interest in a mechanical procedure for going from a corpus
to a description is perhaps reminiscent of what Tomalin calls formalisation; but it is not the project of
proving mathematics consistent that is influencing linguistics here.
We finally arrive at a discussion of early TGG in Tomalin's fifth chapter, which revisits (i)
Chomsky's much-discussed rejection of discovery procedures and advocacy of evaluation measures; (ii)
the idea that the constructional levels of the Aufbau are hierarchical and so is the general theory of
linguistic structure in LSLT; and (iii) the differences between transformational rules, recursive rules, and
formal syntax. (The frequent returns to previously discussed topics for the purpose of examining their
influence on early TGG reveal again that LFS is organised in a less than optimal way.)
Tomalin misses an important opportunity in his treatment of Hockett (1955) on pp. 144f., in a
section where he rehearses Chomsky's already much-repeated arguments against finite-state automata
as an adequate model for English. To set the stage for this, Tomalin notes that in the introduction to
A Manual of Phonology Hockett sketches a 'Grammatical Headquarters' (GHQ), which is responsible
for generating the sentences that are spoken (LFS: 144). This takes the form of a stochastic finite-state
generator, not just another structuralist analytical method. What Hockett proposes is that a probabilistic
generative grammar is an actual component of the human language user. The sketch given is crude; the
illustrative example, which generates a tiny finite language, is just an expository toy, and many
important differences separate Hockett's GHQ from later conceptions of the language faculty.
Nonetheless, Tomalin completely misses this evidence that Hockett (whose book Chomsky actually
reviewed in International Journal of American Linguistics in 1955) was an advocate of both generative
grammars and their neuropsychological reality (see also Hockett 1948, also missed by Tomalin, where
the latter point is extremely clear).
Another opportunity is missed in the discussion of Chomsky's arguments against stochastic
models of languages, the only new topic introduced in this chapter. LFS cites the true claim from
Chomsky (1957) that 'Colorless green ideas sleep furiously' and 'Furiously sleep ideas green colorless'
had the same frequency in English before 1957 (namely, zero). But Tomalin adds the false claim that
'Therefore, [Chomsky] is obliged to conclude that frequency reveals nothing about grammaticality'
(148). This is not true. Chomsky does assert that in any statistical model for grammaticalness, these
sentences will be ruled out on identical grounds as equally remote from English, but he is wrong. He is
assuming that the probability of a type of event must be regarded as zero if it has not occurred so far.
That is the result that one gets from using the technique now known as maximum likelihood estimation
(MLE). Chomsky was not obliged to adopt MLE. A more suitable technique had been developed during
the Second World War by A. M. Turing and I. J. Good (see Sampson 2001, chapter 7, for an elementary
exposition), and although it took a while to become known, Good had published on it by 1953.
Chomsky was simply not very interested in applying statistical methods to linguistic material,
and knew little about them. In his disdain for such work he was followed by most linguists for the next
forty years. But when Pereira (2000) finally applied Good-Turing estimation (smoothing) to the
question of how different the probabilities of the two famous word sequences are from normal English
text, he found that the first (the grammatical one) had a probability 200,000 times that of the second.
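The contrast between the two estimation techniques can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reconstruction of Pereira's model: the toy corpus and the function names mle_probability and good_turing_unseen_mass are invented here. Maximum likelihood estimation gives any unseen word pair probability zero, whereas the basic Good-Turing estimate reserves a total probability of N1/N for unseen events, where N1 is the number of event types observed exactly once and N is the number of observations.

```python
from collections import Counter


def mle_probability(counts, event):
    """Maximum likelihood estimate: relative frequency, hence zero if unseen."""
    total = sum(counts.values())
    return counts.get(event, 0) / total


def good_turing_unseen_mass(counts):
    """Basic Good-Turing estimate of the total probability of all unseen events.

    It equals N1 / N: the number of event types seen exactly once,
    divided by the total number of observed events.
    """
    total = sum(counts.values())
    seen_once = sum(1 for c in counts.values() if c == 1)
    return seen_once / total


# Hypothetical toy corpus (an assumption for illustration only).
corpus = "the dog sleeps the dog barks the cat sleeps".split()
bigrams = Counter(zip(corpus, corpus[1:]))

unseen = ("ideas", "sleep")
print(mle_probability(bigrams, unseen))   # MLE treats the unseen pair as impossible
print(good_turing_unseen_mass(bigrams))   # Good-Turing reserves mass for unseen pairs
```

Telling the two famous sentences apart requires a richer model than this toy. The sketch shows only that an event that has not occurred so far need not be assigned probability zero.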
In a section devoted to the topic of discovery procedures and grammar evaluation (149–155),
Tomalin addresses Chomsky's early bias toward, and later bias against, logical empiricism and the
influence of Goodman's thinking. Here we encounter a significant mistake in philosophical
interpretation: Tomalin erroneously takes Goodman to be a logical empiricist. Tomalin refers to the
kind of logical empiricism advocated by Goodman (150) and writes that in the mid-1950s Chomsky
abandoned the hard-line logical empiricism espoused by Goodman, and (seemingly) championed by
Chomsky (151). One can certainly read parts of LSLT as following the path of Carnapian (reductive)
logical empiricism, and Syntactic Structures as rejecting it. But it is a mistake to charge Goodman with
being a reductive empiricist. He was an outspoken critic of reductive empiricism, and of
foundationalism, just as Quine was. As Geoffrey Hellman put it (1977: xxiii):

Goodman has consistently been an original and leading opponent of the traditional empiricist
dogma that all knowledge can be built up from some perceptual stratum free of
conceptualization, for it is denied that such a stratum exists.

Tomalin counts his extended discussion of the influence Goodman and Quine had on Chomsky as one
of the significant contributions that LFS makes to the history of linguistics. He is certainly right that the
influence of Carnap, Goodman, and Quine on early TGG had previously been given no serious
attention. But perhaps because Tomalin is venturing into literature that is unfamiliar to him, he gets
Goodman wrong.
The question of whether Chomsky followed Quine and Goodman in rejecting foundationalism is
worth considering on at least two levels: with reference to the epistemology of linguistics, and with
reference to the epistemological question of how we can know a language. There is some support for
thinking that Chomsky retained his earlier foundationalism. However, to show this we first have to
point out a further mistake of Tomalin's in interpreting Goodman.
Tomalin speculates that Chomsky's mistrust of inductive generalisation was based on
Goodman's arguments in Fact, Fiction and Forecast (1954, henceforth FFF). He thinks that Goodman
sees induction as a research method, and that the standard Humean problem of induction as discussed in
FFF was taken as an alarming condemnation of empiricism as a practical philosophy (153). But in
fact the point of Goodman's discussion of Hume's (old) problem of induction in FFF is to argue that
principles of induction (e.g., the principle that the future will resemble the past) are justified in just the
same way that deductive rules of inference are justified. Goodman sees inductive and deductive rules of
inference as equally justified. His discussion concerns the justification of rules of inference generally,
not research methods, or empiricism (reductive or not), and there is little reason to think it contains the
seeds of Chomsky's doubts. If it did, it would follow that justification of deductive inference was no
better off. One may speculate that Chomsky's mistrust of conclusions reached by inductive inference
stemmed from his foundationalism, since even good inductive arguments do not establish their
conclusions with the same certainty as good deductive arguments. But his doubts concerning induction
are not based on Goodman's arguments.

Tomalin moves on to discuss TGG specifically, first describing some aspects of the theory developed in
LSLT, and then discussing transformations and recursion. It is regrettable that he does not make use of
or cite the very important and thought-provoking assessment of LSLT in Sampson (1979); it would have
been useful to see Tomalin engage with the contrarian argument given there.
In the remainder of the discussion of TGG we find the most significant citational failure in LFS:
its failure to note the importance of the work of the mathematical logician Emil Leon Post for the
mathematical foundations of TGG. Post is cited very briefly seven times in LFS, always in order to
mention (in a rather repetitive way) his work on the mathematics of mechanically enumerable sets of
positive integers as summarised informally for a mathematical audience in Post (1944). There Post uses
the device of finitely describable idealized procedures (e.g., write down 2, 4, 8, 16, 32, . . . , and
continue thus forever) as a way of specifying infinite sets whose membership could be computationally
enumerated. In the terminology Post uses, these procedures GENERATE the sets. And to read Tomalin,
one might think that the only relevant thing about Post is that Chomsky (1959: 137n.) credits him with
having used the technical term 'generates' to denote the relation between an enumeration procedure and
the enumerated set (see LFS: 64, 169–170).
That citation of Post (1944) as the source for 'generates' appears to be the only actual
bibliographical citation of Post that Chomsky ever made. The passing mentions of Post in Chomsky
(1962: 539) and (1965: 9) are not accompanied by citations. Chomsky nowhere mentions Post or
references his work in LSLT, though he does cite Rosenbloom (1950), which was strongly influenced by
Post (see especially Rosenbloom's chapter IV). This may mean that Chomsky got his understanding of
Post mainly via Rosenbloom. The fact is that Chomsky never cites the papers in which Post makes the
contributions that are most relevant to TGG: the ones applying his formalisation of deduction to general
combinatorial problems about strings.
Post's 1920 doctoral dissertation, cut by a third to produce the published version in Post (1921),
was devoted to providing a fully mathematicised treatment of the deductive system of propositional
logic employed in Whitehead & Russell (1910–1913). Part of this involved devising a kind of rule
system for manipulating uninterpreted symbol strings, one which would be general enough to embrace
any imaginable deductive consequence relation (see 1921: 276). This involved the formalisation step in
that it mechanically mapped logical propositions to meaningless strings of symbols; it was formal in the
sense that no semantic properties of propositions were relevant to what the system did with their
corresponding strings; and it was a kind of Formalism in that it was aimed at establishing consistency
proofs, not for mathematics but for Whitehead & Russell's propositional calculus. It would thus have
been essential material for Tomalin, given his aim of linking Formalism to the origins of TGG, but he
overlooks the whole of Post's main body of work.
Taking up his work twenty years later (after a long period of illness), Post (1943) defined a
canonical production system over a finite symbol alphabet A as a set of initially given strings defined
over A (the primitive assertions) and a set of rules (he called them productions) for deriving further
strings. A rule, very much like a (generalised) transformation in early TGG, consists of a structural
description and a structural change. The structural description is a set of patterns of the form
g_0 P_1 g_1 P_2 g_2 . . . P_k g_k against which symbol strings are to be matched. The g_i are given
strings that must be matched by copies of themselves, and the P_i are free variables over strings that
can be matched by any substring.
A set of such patterns produces a further string, defined by an analog of the structural change of a
transformation: another sequence of strings and variables in which all of the free variables are copied
from those in the structural description. A canonical production system is said to generate the set of all
and only those strings that can be produced through iterative use of its rules.
What these papers reveal is that it was Post who invented rewriting systems. Linguists appear to
have overlooked this point, but theoretical computer scientists have not; see the very clear discussions in
Brainerd & Landweber (1974: 159ff.) and Kozen (1997: 256ff.).
Post also did the first work on generative power. The extremely general canonical systems of
Post (1943) were equivalent to Church's lambda calculus and to Turing machines. Post defined a
NORMAL production system as a very restricted kind of canonical system, with only one initial string,
only one pattern in a structural description, only one variable in a pattern, and rules limited to the very
spare form 'g_0 P_1 produces P_1 g_1' (more simply, xZ → Zy), where x and y are specific strings and Z is a
variable over strings. Each such rule removes the designated x from the beginning of the string and adds
y at the end. Post proved by a succession of reductions that normal systems can define any set of strings
over an alphabet A that canonical systems can define, provided only that certain extra symbols not in A
are permitted to appear in the strings manipulated by the rules. These extra symbols are what linguists
now call non-terminals, or syntactic category symbols. (The free string variables are different: they
appear only in rules. The extra non-terminal symbols appear in intermediate strings in derivations,
though not in the strings ultimately counted as belonging to the language.)
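The behaviour of a normal system can be made concrete with a minimal Python sketch. It assumes that a rule xZ → Zy is represented as a pair (x, y) of plain strings; the function names and the two example rules are hypothetical and are not taken from Post (1943). Each rule strips x from the front of the current string and appends y at the end, and derivation is cut off after a fixed number of steps because the generated set may be infinite.

```python
def apply_normal_rules(string, rules):
    """Return the strings reachable in one step by rules of the form xZ -> Zy.

    Each rule is a pair (x, y): if the string begins with x, remove that
    prefix and append y at the end.
    """
    results = []
    for x, y in rules:
        if string.startswith(x):
            results.append(string[len(x):] + y)
    return results


def derive(initial, rules, max_steps):
    """Collect every string derivable from `initial` in at most `max_steps` steps."""
    seen = {initial}
    frontier = [initial]
    for _ in range(max_steps):
        next_frontier = []
        for current in frontier:
            for produced in apply_normal_rules(current, rules):
                if produced not in seen:
                    seen.add(produced)
                    next_frontier.append(produced)
        frontier = next_frontier
    return seen


# Hypothetical rules, chosen only for illustration: aZ -> Zbb and bZ -> Za.
example_rules = [("a", "bb"), ("b", "a")]
print(apply_normal_rules("ab", example_rules))
print(sorted(derive("ab", example_rules, 2)))
```

The step bound is a design choice of the sketch, not part of Post's definition, which places no limit on the length of a derivation.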
The differences between Post's systems and the TGGs of Chomsky (1957) lie mainly in the
additional devices that Chomsky assumed, like rule ordering and obligatory application; but these turn
out neither to restrict nor to enhance generative power. Post had, in effect, already proved that
restricting transformations to the form xZ → Zy did not reduce the generative power of a grammar
(and thus that 'Is string w generated by grammar G?' is not in general decidable, a partial anticipation
of the later result of Peters & Ritchie concerning the 1965 version of TGG).
There is another simplified form of rule which Post also showed did not reduce generative
power: a system in which all the rules are of the form ZxW → ZyW. Post (1947) called this a rule of
'semi-Thue type'. The Norwegian mathematician Axel Thue (1914) had introduced bidirectional rules
of the form ZxW ↔ ZyW, meaning that ZxW and ZyW may be substituted for each other (equivalent to
a pair of rules containing ZxW → ZyW and ZyW → ZxW), and had posed the question of finding a
general method for solving problems like 'Can the rules convert string v into string w?' for arbitrary v
and w. Post (1947) settled the question by proving that there was no such method. In fact he proved a
stronger claim: the problem is undecidable even for unidirectional replacement rules (like ZxW →
ZyW).
The semi-Thue type of rule is of particular importance, because Chomsky (1962: 539) directly
acknowledges that '[a] rewriting rule is a special case of a production in the sense of Post; a rule of the
form ZXW → ZYW'. The type-0 rules in the classification of Chomsky (1959) are simply semi-Thue
rules.
Thus, more than ten years before Syntactic Structures, Post had defined rewriting systems of a
very general sort (canonical systems) and had shown that neither limiting them to normal systems (xZ
→ Zy) nor limiting them to type-0 rewriting rules (ZxW → ZyW) altered their Turing-equivalence.
In short, Post's role in developing the methods used in early TGG has been underestimated. This
was an important topic for Tomalin's project, but because of his reliance on Chomsky for
bibliographical references, he never came upon the relevant papers.
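The semi-Thue rule format ZxW → ZyW, and the question Thue posed about it, can likewise be sketched in Python. The representation of a rule as a pair (x, y), the function names, and the single example rule are assumptions made for illustration. Because Post (1947) showed that the general conversion problem is undecidable, the search below is bounded: a result of False means only that no conversion was found within the given number of steps.

```python
from collections import deque


def rewrite_once(string, rules):
    """Return all strings obtained by one application of a rule ZxW -> ZyW.

    Each rule is a pair (x, y); any single occurrence of x, in any
    context Z...W, may be replaced by y.
    """
    results = set()
    for x, y in rules:
        start = string.find(x)
        while start != -1:
            results.add(string[:start] + y + string[start + len(x):])
            start = string.find(x, start + 1)
    return results


def reachable_within(source, target, rules, max_steps):
    """Breadth-first search: can `source` be rewritten to `target` in at most
    `max_steps` rule applications? False is inconclusive beyond that bound."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        current, steps = queue.popleft()
        if current == target:
            return True
        if steps == max_steps:
            continue
        for produced in rewrite_once(current, rules):
            if produced not in seen:
                seen.add(produced)
                queue.append((produced, steps + 1))
    return False


# Hypothetical rule, for illustration only: ZabW -> ZbaW.
swap_rules = [("ab", "ba")]
print(rewrite_once("aab", swap_rules))
print(reachable_within("aab", "baa", swap_rules, 2))
```

A bidirectional Thue rule ZxW ↔ ZyW would be represented in this sketch by supplying both (x, y) and (y, x).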

As mentioned above, Tomalin sees his accomplishment in the following terms (186):

. . . perhaps the main achievement of this book has been to associate TGG with both Formalism
and Logicism, two intellectual movements that profoundly influenced scientific methodology in
the early twentieth century. Indeed, this general issue seems to have been the single destination
towards which the various paths of enquiry have led. With its focus on syntax instead of
semantics, with its use of a logic-based notational system, with its identification of the analogy
between a proof and the generation of a grammatical sentence, and with its use of such
procedures as recursive definitions and axiomatic deduction, TGG unambiguously reveals its
associations with the formal sciences.

But Tomalin has not associated TGG with the substance of the programmes of Formalism and
Logicism. One cannot base a substantive association between early TGG and these long-discarded
movements in the philosophy of mathematics on nothing more than some shared methods.
All of these methods were used in pre-Chomskyan linguistics. Euclid used postulates, but that does not
associate geometry with Bloomfield (1926). What Chomsky drew on that was new to linguistics was the
tool of Post-style production systems; and that is a development Tomalin says nothing about.
The key point about the programmes of Formalism and Logicism is that they were crucially
bound up with the task of proving logical consistency and completeness. This task simply does not
arise in TGG. Chomsky was not trying to rescue linguistics from paradoxes, or to prove consistency of
linguistic theories. The notion of an inconsistent TGG does not make sense. In logic there is a
distinction between well-formed formulas and theorems, and in linguistics there is not. In a consistent
and complete logic, each formula p has an anti-formula ∼p, its negation, which must be provable if and
only if p is not. Intuitively, a logic is inconsistent if and only if some provable assertion has a provable
negation. There is no analog in natural languages. The strings of formatives generated by grammars for
natural languages do not have negations in the relevant sense. (Within English, 'Jesus wept' is regarded
as having 'Jesus didn't weep' as its negation, but the whole point is that the grammar SHOULD generate
both of those, not that this should be avoided. In logic each formula has a negation which must NOT be
derived, i.e., proved; the analogy between generating and proving breaks down at this point, and to use
'deriving' for both is just a pun.) What Gödel showed concerning the Hilbert Programme was that it
could not succeed, because completeness and consistency could never be attained together within a
given system that was powerful enough to express arithmetic. Nothing comparable could ever be done
to TGG, because notions of completeness and consistency do not arise there.
Tomalin concludes with an odd leap from 1957 to the present day. He discusses (pp. 198f.) two
conflicting claims about whether TGG is a science. One is negative: Paul Postal's view that the
principles and accomplishments touted in recent years are almost embarrassing in their inadequacy and
shoddiness (quoted by Huck & Goldsmith 1995: 141f.). The other is positive: Massimo Piattelli-
Palmarini's claim that generative grammar is well on its way to becoming a full-blown natural science
whose idealisations, abstractions, and deductions will eventually match in depth and subtlety those of
the most advanced domains of modern science (from the introduction to Uriagereka 1998: xxv).
Tomalin thinks that this implicit disagreement can be resolved by answering a historical question, and
that LFS answers it. His idea is that any theory presented in terms of the axiomatic method is a formal
science, hence a science; and Piattelli-Palmarini stresses idealizations, abstractions, and deductions as
if these are what science strives for. But it is more than a little naive to think that a history of theory
FORM in linguistics will be able to settle the disagreement between Postal and Piattelli-Palmarini. What
matters for Postal is the CONTENT of linguistic theories, not just their form. Postal's point is that the
referees for linguistic journals should be like employers in the private sector as characterised in Dan
Aykroyd's immortal line from Ghostbusters: 'They expect results.'
Another general problem that afflicts LFS has to do with the fact that it discusses the work of
Chomsky, which illumines the linguistics of the second half of the 20th century with such brilliance that
those who attempt to write the history of the period often seem to be blinded by the light. Like others
before him, Tomalin has followed Chomsky too closely and uncritically at many points in his research.

• Because Chomsky does not cite Carnap, Tomalin does not appreciate the extent to which
  Chomsky follows Carnap in his foundationalism.
• Because Chomsky mentions recollecting a critique of induction by Goodman, Tomalin thinks
  Goodman must have been sceptical about induction, which he was not.
• Because Chomsky regards Bar-Hillel (1953) as merely an advocate of applying logic to natural
  language, and Hockett (1955) as simply an example of a model of grammar limited to finite-state
  expressive power, rather than seeing both as early advocates for types of generative grammars,
  Tomalin does too.
• Because Chomsky asserts that no probabilistic account could distinguish stark ungrammaticality
  from mere incoherence, and cites no work on probabilistic methods, Tomalin follows suit.
• Because Chomsky never cites Post's key technical papers, Tomalin likewise overlooks them,
  missing the crucial fact that Post invented the machinery of generative grammars and proved the
  first theorems relating rule form to weak generative capacity.

In general, though, linguists and philosophers interested in the history of generative linguistics will
find that there is much to be learned from Tomalin's book, despite the fact that it does not complete its
historiographical task, it surveys its sources too superficially, it assigns too much importance to the
form of theories (as opposed to the claims that are put into that form), and its expository scope is
unintentionally limited by following the bibliographical materials and opinions of its leading figure much
too closely. It is imperfect, but we nonetheless think that everyone interested in the history of 20th-
century linguistics should read it. We are certainly glad that we did. We learned things we didn't know,
confronted issues we had not been thinking about, and were led to literature that we had long neglected.
