Chapter 09
"He was a grammarian, and could doubtless see further into the future than others."
J. R. R. Tolkien, Farmer Giles of Ham

I have chosen in this chapter to talk a bit about what i'm calling the Generative Enterprise, or the generative approach to the study of grammar. At least in its modern form, the generative approach to grammar has been under development for the past half-century, and it has contributed greatly to our modern understanding of what human language is like and how it works. The generative approach can be, and has been, fruitfully applied to all modules of grammar: phonology, morphology, and syntax. But its earliest successes were in the field of syntax; it was only after it had been thoroughly established as a useful way of investigating syntax that it began to be applied to phonology and morphology as well. That's one reason why i've put off discussing it until now: now that we've looked at all three modules of grammar, most recently syntax, we can to some extent use what we've already talked about to illustrate the generative approach that is relevant to them all. Another reason i've put off this discussion until now, i'll admit, is more personal: my own primary interests in grammatical research are in the field of syntax, so it's in connection with syntax that i feel most comfortable talking about how grammatical research is done.
Creativity
One of the most important, most essential aspects of human linguistic activity is its CREATIVITY. I don't know what it's like for you people; i don't even know what Chinese word best conveys what i mean. But in the West, if i use the words creative or creativity, people are liable to think of some inflated, 19th-century notion of the creative artist: a heroic, titanic figure who by virtue of extraordinary artistic genius has access to some intimate commerce with God or the Universe that is denied to us ordinary mortals. That's not what i mean when i speak of human linguistic activity as being creative. I mean it very literally. Every one of us, every day, hears and says, reads and writes linguistic expressions we have never before experienced, in many cases expressions that no one has ever experienced. Any attempt to discuss the phenomenon of human language that overlooks this fact is at best disastrously inadequate and at worst a lie.
Automatic Productivity
Let's think back a bit to the notion of productivity in morphology. Back in Chapter 2, i said that some derivational affixes are more productive than others, meaning that they can be added, relatively freely, to a relatively wide range of words, of stems. Morphological productivity means, among other things, that once certain words exist, other words can be freely created from them. Once you know that a certain word exists and that it belongs to a certain class, it's possible to create other words based on it, words that hadn't existed previously, just by using productive derivational affixes. Let's look at the examples in (1). Most of these i gave you back in Chapter 2, but now i want particularly to focus on the coining of new words. The first two examples under (1) involve the addition of productive derivational affixes, -er and -ize, to pre-existing words to create new words; the fluent speaker of English just knows that -er can be added to a verb to form an agent noun, and -ize can be added to an adjective to form a verb. (1c-d) involve the deletion from pre-existing words of what at least sounds like a productive affix, the suffix -er. (1e) involves the addition to a compound adjective of a productive derivational suffix that converts it into a noun, a noun that didn't exist before; (1f) also represents the formation of a new noun, this time involving a suffix that is not very productive but is suggested by the antonymous word warmth. (1g) is the conversion of a whole phrase into a verb by the addition of a pair of productive derivational affixes. And (1h) involves the coining of an adjective from a word that doesn't exist outside of the exercise, purely on the basis of the assertion that this word fleeb, whatever else it may be, is a transitive verb; the fluent speaker of English, by definition, knows that any transitive verb can be converted into an adjective by adding the suffix -able.

(1)     Pre-existing word or expression     New word
a.      Xerox                               Xeroxer
b.      Cyrillic                            Cyrillicize
c.      burglar                             burgle
d.      laser                               lase
e.      waterproof                          waterproofness
f.      cool                                coolth
g.      dead branch                         de-dead-branchify
h.      fleeb                               fleebable
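To make the pattern concrete, here is a minimal sketch, in Python, of what word-class-conditioned derivation amounts to; the toy lexicon and the rule functions are invented for illustration and are not from the linguistic literature. The point the sketch tries to capture is that each rule consults only the class label of the base, so a word coined five minutes ago qualifies just as well as one that has been in the language for centuries.

```python
# A toy model of productive derivational morphology.
# The lexicon records only each base's word-class; the rules below
# are hypothetical simplifications, for illustration only.

LEXICON = {
    "fleeb": "transitive verb",   # nonsense word: class asserted, nothing else known
    "cool": "adjective",
    "Cyrillic": "adjective",
}

def derive_able(base: str) -> str:
    """V(trans) + -able -> adjective (e.g., fleeb -> fleebable)."""
    if LEXICON.get(base) != "transitive verb":
        raise ValueError(f"-able attaches only to transitive verbs: {base!r}")
    return base + "able"

def derive_ize(base: str) -> str:
    """Adj + -ize -> verb (e.g., Cyrillic -> Cyrillicize)."""
    if LEXICON.get(base) != "adjective":
        raise ValueError(f"-ize attaches only to adjectives: {base!r}")
    return base + "ize"

print(derive_able("fleeb"))    # fleebable: never seen before, but well-formed
print(derive_ize("Cyrillic"))  # Cyrillicize
```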
Having mentioned the case of fleeb and fleebable, i should tell you about what is known among linguists as the wug experiment. This story concerns inflexional morphology rather than derivational morphology, but the point is the same: the creative power of natural language. In this experiment, children were shown a picture of an imaginary animal and told that it was a picture of a wug. They were then shown another picture of two of these creatures, and the experimenter said, "Now here's two of them; here are two ...", and the children all, without hesitation, agreed that the plural of wug is wugs. They'd never come across the word wugs before (they'd never come across the word wug before, since outside the context of this experiment the word doesn't exist in English), but, being fluent speakers of English, they knew very well that, unless and until someone told them otherwise, the plural of a noun is formed by adding -s to it. Once they'd been told that the first picture was a picture of a wug, they concluded that wug was a noun, and therefore qualified for this treatment. The fact that they'd never heard the word before, never before encountered it in any context, didn't make any difference at all. There was absolutely nothing preventing them from conjuring up an acceptable plural form of this nonsense word, once they understood it to be a noun. That's the kind of thing i mean by linguistic creativity: being fluent in a language involves being able to apply the rules of that language's grammar freely to create expressions that are completely new.
Fig. 9.1 The Wug Experiment: "This is a wug. Now there is another one. There are two of them. There are two _____."
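The children's behaviour can be modelled, very crudely, as an exception lexicon consulted before a productive default rule; the sketch below is illustrative only (real English plural formation, with its voicing alternations, is more intricate than this).

```python
# Default -s plural with an exception lexicon: a crude model of what
# the wug experiment demonstrates.

IRREGULAR_PLURALS = {"child": "children", "foot": "feet", "mouse": "mice"}

def pluralize(noun: str) -> str:
    # 1. Check the memorized exceptions first ("unless and until
    #    someone tells you otherwise").
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    # 2. Otherwise apply the productive default rule. A novel noun
    #    like "wug" has no entry, so it gets the default.
    if noun.endswith(("s", "sh", "ch", "x", "z")):
        return noun + "es"
    return noun + "s"

print(pluralize("wug"))    # wugs: never encountered, still handled
print(pluralize("child"))  # children
```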
A variant of this is found in the Peanuts cartoon in Fig. 9.2. Linus needs a plural for the noun igloo. Since the word itself appears to be alien, formed in a way that is different from ordinary English words (it's actually a loan from Inuit), Linus rather automatically opts for an alien, non-English way of forming plurals, based on the models in (2). Now, it may be noted that the word igloo doesn't quite fit the patterns in (2), since it ends neither in -us nor in a single -o. It may further be noted that the patterns in (2) are derived from Latin or Italian, which are no more closely related to Inuit than English is. But these facts are irrelevant. What's important is that Linus knows these patterns, recognizes them as foreign (i.e., not the standard pattern in English), and as a result applies them freely to a foreign word to which they have not previously been applied.1 In short, he is creatively generalizing a morphological pattern that actually has rather limited productivity in English.
Fig. 9.2 Creative Generalization of Minor Morphological Pattern

(2)     singular      plural        singular      plural
        cactus        cacti         tempo         tempi
        stimulus      stimuli       libretto      libretti
        alumnus       alumni        graffito      graffiti
        fungus        fungi         mafioso       mafiosi
Let us consider the skills involved in riding a bicycle. Riding a bicycle is extremely complex when you consider all the physical problems of balance and so on; in fact, the mechanical challenges that a bicycle-rider needs to solve merely in order to keep the bicycle moving forward without falling over can't be described without the use of mathematical concepts most of us don't have time to learn. Yet, once we've learned the skill itself, we can use it virtually without thinking. When we're riding a bicycle, all we need to think about is the current traffic situation and how we're going to get to where we want to go. We don't need to worry about how we're going to keep our balance; our trained brains and muscles do that automatically. Likewise, once we've learned a language, we no longer need to think about it (about the grammar and vocabulary we're using) most of the time; instead, we can devote our mental energies to what we're actually saying, what the other person is saying, what we want to say and how best to say it. We don't need to worry about getting the tones right or the agreement right or whether to put the object before or after the verb; the language centers of our brains handle all this pretty much automatically, without our having to think about it.
1 Which is the reason for Charlie Brown's perplexed response: since these patterns have not been applied to igloo before, and are in general not productive in English, he fails to recognize the resulting form igli.
know and i suspect both statements may be true for different species of birds. But we humans aren't imitating anybody when we speak, nor are we relying entirely on a built-in repertoire. A lot of our linguistic activity consists of composing sentences we have never heard before and delivering them to the world around us, or interpreting novel sentences which we are hearing or reading for the first time. This is part of the point made by the famous linguist Noam Chomsky in 1959 in his devastating criticism of classical Behaviorist psychological theory. You've probably heard of Behaviorism; at least, you've probably heard of B. F. Skinner and Pavlov's dogs. Back in the early 20th century, Pavlov discovered that he could teach dogs to salivate at the sound of a bell, whether food was present or not; he would routinely ring a bell at the same time the dogs were being fed, and eventually the dogs got so used to the association between food and the sound of the bell that, in anticipation (as it were) of being fed, they would start salivating as soon as they heard the bell, even if food wasn't actually present. Subsequent experiments demonstrated that dogs, rats, pigeons, etc. could be induced to perform all sorts of activities in response to artificial stimuli that had no direct connection with the behaviour they were being induced to perform. Repeated successes in such experiments encouraged a lot of psychologists, especially in the United States, to believe that human behaviour might be equally amenable to manipulation by external stimuli and, furthermore, that undesirable human behaviour might be treated and, ultimately, eliminated by means of this sort of operant conditioning, as it's called. This approach to human psychology ultimately reached the point of suggesting that the sort of talk therapy associated with Sigmund Freud, Carl Jung, depth psychology, analytic psychology, etc. was a waste of time; that there was no need to assume the existence of anything corresponding to the word mind at all; that human behaviour could be modified in any direction to an unlimited extent; and that all learning was merely a very sophisticated form of imitation and automatic response to external stimuli. This ultimate development of classical behaviorism is mostly associated with the name of B. F. Skinner, who was for a long time, around the middle of the 20th century, the leading figure in American psychology. We don't need to go into the reasons why this approach to human psychology was so popular, especially given that there was, in fact, spectacularly little in the way of experimental evidence to support it.2 The important thing for us in this class is that in 1957 Skinner published a book, Verbal Behavior, in which he attempted to extend his behaviorist theories to account for human linguistic activity. In doing so, he claimed that children learn language merely by imitating their parents, having certain linguistic behaviours reinforced at the expense of others by the expression of their parents' approval or disapproval, etc.
Noam Chomsky, then a young linguist who had just recently completed his doctorate, wrote a long review of Skinner's book for the journal Language in which he essentially demonstrated that Skinner didn't have the slightest idea what he was talking about.3 He demonstrated, with plenty of data to back up his argument, that if you listen carefully to what children between the ages of about 2 and 4 years actually say, they do not, as a rule, repeat things their parents say, but rather come up with expressions that they have never heard from their parents or any other grownups, and couldn't possibly have, such as those in (3).4
2 What positive evidence i have heard of suggests that humans can be conditioned in the same way as dogs, rats, and pigeons, but only when they're asleep, i.e., when their higher cognitive faculties aren't functioning.

3 The original review can be found in Language 35: 26-57. It was reprinted by Prentice-Hall in 1964 in The Structure of Language: Readings in the Philosophy of Language (Jerry A. Fodor & Jerrold Katz, eds.), pp. 547-78, and again in 1967 in Readings in the Psychology of Language (Leon Jakobovits & Murray Miron, eds.), pp. 142-71.
(3) a. Mommy no play.
    b. What dolly do?
    c. He goed. / He wented.
    d. Nobody don't like(s) me.
Not only do young children readily produce strings that they have never heard, and could not possibly have heard, their elders use; their parents rarely correct their grammar. A paper by Brown & Hanlon5 indicates that the way parents react to a child's attempts at language rarely has anything to do with grammar; parents are usually more concerned with the content than with the form of the child's utterances: Is the statement true? Is it appropriate (consistent with cultural standards of politeness, etc.)? I mentioned back in Chapter 7 that most of us are quite capable, within certain limits, of interpreting strings in our native languages even when those strings are ungrammatical; parents routinely exploit this ability in trying to understand what their children are saying. Furthermore, on the relatively rare occasions when their elders do try to correct their grammar, children often don't pay attention, or pay attention to the wrong details. I'll tell you a couple of stories, recorded in the scholarly literature on language acquisition, to show you what i mean.6 In (4), the mother is trying to correct two mistakes in the child's initial statement: (1) the double negative (nobody ... n't) and (2) the lack of 3rd-person agreement (nobody do/like). But the child only notices the second correction, the difference between like and likes. In (5), the child apparently decides that the whole procedure is just one of those idiotic things that grownups occasionally make her do, and does her best to cooperate with the silly game without in any way letting it affect her real language use.

(4) Child: Nobody don't like me.
    Mother: No, say "Nobody likes me."
    Child: Nobody don't like me.
    (This is repeated 8 times.)
    Mother: No, now listen carefully: say "Nobody likes me."
    Child: Oh! (finally getting it) Nobody don't likes me!

(5) Child: Want other one spoon, Daddy.
    Father: You mean, you want the other spoon?
    Child: Yes, I want the other one spoon, please, Daddy.
    Father: Can you say "the other spoon"?
    Child: Other ... one ... spoon.
    Father: Say ... "other."
    Child: Other.
    Father: "Spoon."
    Child: Spoon.
    Father: "Other ... spoon."
    Child: Other ... spoon. (exasperated) Now give me the other one spoon?
4 You'll probably have to take my word for it that children growing up in English-speaking households really do say things like this. Some of you might have fun investigating the things Taiwanese children of about the same age typically say; my point is, i am quite confident that Taiwanese children between the ages of 2 and 4 come up with sentences that are just as outrageous as those in (3) are in English, whether they're learning Mandarin or Taiwanese.

5 R. Brown & C. Hanlon, "Derivational Complexity and Order of Acquisition in Child Speech," published in 1970 in J. R. Hayes' Cognition and the Development of Language (Wiley).
6 The first of these stories is attributed to David McNeill, but i haven't been able to find an exact citation. The second is reported by Martin D. S. Braine, the father in the story, in his paper "On Two Types of Models of the Internalization of Grammars," published in 1971 in D. I. Slobin's The Ontogenesis of Grammar: A Theoretical Symposium (Academic Press).
What evidence we have about child language acquisition suggests that, rather than imitating their parents directly, children basically use the input they get from their parents' speech as raw material which they use to build their own speech. The resulting speech patterns match those of their parents only approximately but, as time goes on, the fit improves and the children's speech patterns become more and more similar to those of their parents and other grownups they associate with, until finally they become indistinguishable and the children are effectively speaking the same language as everybody else around them. This is a very oversimplified summary of what is involved in child language acquisition, but the point is that human linguistic behaviour is a creative process, not an imitative one. In demonstrating this conclusively, Chomsky and other linguists in the late 50's and early 60's effectively demolished the whole classical behaviorist school of psychology, since they forced psychologists to realize that there was at least one human activity, namely language, that couldn't possibly be accounted for on the basis of the behaviorist hypotheses.
(7) a. a challenge [PP to [NP the theory [PP of [NP the overconfident young linguist [PP from [NP Harvard University]]]]]]
    b. a man [PP in [NP the green truck [PP under [NP the maple tree [PP near [NP the house [PP of [NP the Governor's sister]]]]]]]]
    c. a book [PP on [NP the table [PP under [NP the window [PP in [NP the bedroom [PP on [NP the third floor [PP of [NP the house [PP at [NP the end [PP of [NP the lane]]]]]]]]]]]]]]
The rule in (8) is a somewhat more expanded PS-rule for NPs; i didn't talk about this version in Chapter 8, but there is good reason to believe it's a legitimate rule of English and of a lot of other languages. Notice that it includes an optional S complement. But S's almost always include NPs, which according to the rule in (8) can include S complements, each of which will probably include NPs ... and so on; you get the idea. This is what makes it possible to generate sentences like those in (9).

(8) NP → (DET) (AP) N (PP) (S)

(9) Ted denied [NP the claim that [S Bill had spread [NP the rumour that [S Mary disliked [NP the fact that [S Sam had tortured her pet anteater]]]]]].

(10) a. This is the house that Jack built.
     b. This is the malt that lay in the house that Jack built.
     c. This is the priest all shaved and shorn that married the man all tattered and torn that kissed the maiden all forlorn that milked the cow with the crumpled horn that tossed the dog that worried the cat that killed the rat that ate the malt that lay in the house that Jack built.
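The unbounded depth of (9) and (10) comes from recursion: by rule (8), an NP may contain a PP or an S, and PPs and S's in turn contain NPs. As a minimal sketch of how a finite rule set yields an infinite set of phrases, here is a toy generator in Python; the miniature grammar and vocabulary are invented for illustration, not offered as a serious grammar of English.

```python
import random

# A toy recursive phrase-structure grammar, loosely in the spirit of
# rule (8): NP -> (DET) N (PP), PP -> P NP. The loop NP -> PP -> NP
# is what gives the finite rule set its infinite potential.
GRAMMAR = {
    "NP": [["DET", "N"], ["DET", "N", "PP"]],
    "PP": [["P", "NP"]],
}
WORDS = {
    "DET": ["the", "a"],
    "N": ["house", "window", "table", "lane"],
    "P": ["in", "on", "under", "at"],
}

def generate(symbol: str) -> str:
    if symbol in WORDS:                          # terminal: pick a word
        return random.choice(WORDS[symbol])
    expansion = random.choice(GRAMMAR[symbol])   # nonterminal: pick a rule
    return " ".join(generate(s) for s in expansion)

print(generate("NP"))
# e.g. "the table under the window in the lane"; nothing in the rules
# themselves imposes a maximum depth of embedding
```

Any particular run of course halts after finitely many steps; it is the rule set, not any single output, that has no upper bound.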
When PPs are generated inside NPs, as in (7), or S's inside NPs, as in (9), we say they are embedded inside these NPs. It's to a great extent by means of such embedding that we are able to generate phrases and sentences of unlimited length. In (10) i've given an extremely extensive example of embedding from a traditional English-language children's game; perhaps you have similar games in your own culture. This game starts with the simple sentence in (10a). It then goes on to embed that sentence inside an NP that is itself part of another sentence, as in (10b). And it goes on like that. Part of my point in giving you this example is that, although this is, as i said, an extremely extensive example of embedding (involving at least 9 layers in (10c), which is one, but only one, possible ultimate product of the game), it's a children's game, not a major challenge to human cognitive ability. Children under the age of 10 do this kind of thing for fun! Please note that the sequence in (10c) is only one of many possible outcomes of the game. At least in the United States, as far as i know, the sequence house-malt-rat-cat-dog is pretty traditional, but beyond that point the game can go in a number of different directions. And even if it does get to the priest all shaved and shorn, there's no reason why it has to stop there; it can go on indefinitely. The patterns represented by PS-rules like those in (6) and (8) may be finite themselves, but they have infinite potential built into them. A generative grammar is a finite set of patterns that makes possible the description of an infinite set of sentences. The word generate must not be understood to mean that a grammar generates sentences in anything like the way a turbine generates electricity; this is an unfortunate misunderstanding that often arises from the use of the word generate in its formal mathematical sense. For instance, in analytical geometry we can define a circle by either the statement in (11a) or the equation in (11b).

(11) a. A circle is the set of all coplanar points equidistant from some other point, which is called the center.
     b. (x - a)² + (y - b)² = c²
The equation in (11b) is said, in mathematical jargon, to generate a circle on an xy plane with center at the point (a, b) and radius c. But this equation does not in any way magically produce circles; there is no way that i can take a sheet of paper, mark off a pair of perpendicular axes x and y on it, and then, merely by invoking this equation, conjure any number of circles to appear on that sheet of paper. The equation merely says that any set of points on that sheet of paper that happens to satisfy the conditions described in (11a) and spelled out in the equation (11b) is ipso facto, as a matter of definition, a circle. That is what generate means in mathematics. In the generative approach to grammar, likewise, a grammar is regarded as a set of rules or definitions like those in (11); any sequence of words that is consistent with this set is grammatical, and any sequence that is not consistent with it is not grammatical. Such a grammar, providing a
complete formal description of all grammatical strings in the language, is said to be a generative grammar.
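The same point can be put computationally: a "generator" in this sense is nothing but a membership test. The sketch below (a hypothetical illustration) treats the circle equation in (11b) exactly the way a generative grammar treats a string: as a yes/no definition, not as a machine that produces anything.

```python
# "Generate" as definition, not production: the equation does not draw
# circles; it merely decides, for any given point, whether that point
# belongs to the circle. A generative grammar likewise decides whether
# a string is grammatical; it does not manufacture sentences.

def on_circle(x: float, y: float, a: float, b: float, c: float,
              tol: float = 1e-9) -> bool:
    """True iff (x, y) satisfies (x - a)**2 + (y - b)**2 == c**2."""
    return abs((x - a) ** 2 + (y - b) ** 2 - c ** 2) <= tol

print(on_circle(3.0, 4.0, a=0.0, b=0.0, c=5.0))  # True: 9 + 16 == 25
print(on_circle(3.0, 3.0, a=0.0, b=0.0, c=5.0))  # False: 9 + 9 != 25
```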
Competence/Performance
Earlier we noted that the set of sentences that constitutes a language is infinite. But in another sense, it can be argued that this set of sentences cannot be infinite, because it is possible to posit a number, say 10⁸, and then confidently assert that no one will ever utter, write, or compose a sentence of 10⁸ words. This is certainly true, but is it a fact about human linguistic ability? Or is it merely a fact about the human lifespan? Granted that no one will ever actually use such a large sentence, it is not impossible in principle. We can say that the set of grammatical English sentences is infinite, even though the set of usable English sentences is finite (though very large). This brings us to one of the many manifestations of what is called in generative studies the distinction between competence and performance. The claim is that human beings are inherently competent to generate and to process an infinite number of different sentences, but because of performance restrictions (including, ultimately, not only the limits on an average human's attention-span but the limits on hanns lifespan) nobody ever actually exploits this competence to its full extent. The competence/performance distinction is important in generative theory, but it is rather complicated. Part of it has to do with speech errors. We all of us make occasional mistakes (mispronunciations, slips of the tongue, etc.), even in languages in which we are quite fluent. Partly for their entertainment value, i offer you in (12) some examples of Spoonerisms. In terms of the competence/performance distinction, errors such as these are regarded as matters of performance. The assumption is that, even when you or i mispronounce or misuse a word in our native language, we know very well how that word should be pronounced or used; that's competence. Our occasional failure to pronounce or use it correctly is a matter of performance.

(12) (Supposedly) authentic Spoonerisms (attributed to Rev. William A. Spooner, 1844-1930, English priest & professor at Oxford University)

     what he said                                what he (presumably) meant to say

(a)  Addressing his congregation in church:
     Our next hymn will be "Kinkering Kongs      "Conquering Kings their Titles Take"
     their Titles Take"

(b)  Addressing a lady in church:
     Mardon me, Padam, but you are               Pardon me, Madam, but you are
     occupewing the wrong pie; allow me          occupying the wrong pew; allow me
     to sew you to another sheet.                to show you to another seat.

(c)  To a delinquent student:
     You have hissed my mystery lecture;         You have missed my history lecture;
     you have tasted the whole worm.             you have wasted the whole term.

(d)  In a pub, celebrating the 60th anniversary of Queen Victoria's reign:
     Three cheers for our queer old Dean!        Three cheers for our dear old Queen!
But performance in this sense is not only a source of error. It also has to do with pragmatic considerations such as politeness. There are things we do not say, not because grammar prevents us
from saying them, not because we are physically incapable of saying them, but because we don't wish to offend the person we're speaking to. At least from the common generative point of view, knowledge of what might or might not offend another person is not considered part of grammar, or even part of language at all; it's part of one's general knowledge of the world, just as much as the knowledge that one person and another person are two people. It is important to understand that, to the general linguist, both competence and performance are of interest. There are branches of linguistic scholarship that are specifically interested in matters of linguistic performance, and rightly so; we shall be talking about some of them next semester. But generative theory is primarily interested in competence: what is the human mind capable of, with regard to language? The grammarian as grammarian is not interested in any fact about language that can be adequately explained on the basis of non-linguistic facts (e.g., politeness, ignorance, or distortions and errors due to the influence of alcohol). Hence the theoretical distinction between competence and performance.
non, but as a historical and comparative linguist i regard it as primarily a social phenomenon. I do not believe the two points of view are utterly incompatible or mutually exclusive.7 Be that as it may, there is no question that human language is an important component of human psychological behaviour, and it is a fundamental assumption of the Generative Enterprise that we can study human psychology by studying human language. Since human language is something that is done by human minds, it must somehow reflect the structure of human minds; if human language is the way it is, has the characteristics it does and not different ones, that must mean that the human mind is organized in such a way as to give human language those characteristics. Note that i say human language, not human languages. The Generative Enterprise assumes that all the thousands of human languages have certain characteristics in common. The fact that both Chinese and English normally put the verb in the middle of the clause, while Japanese and Hindi normally put it at the end, is presumably a peculiar fact about the grammars of Chinese and English, on the one hand, and Japanese and Hindi, on the other. The fact that in English and Finnish all inflexional morphemes are suffixes must be a peculiar fact about the grammars of those languages, since we know of other languages that have inflexional prefixes or infixes. But the fact that derivational morphemes tend to be closer to the root than inflexional morphemes is a fact about all languages, and the fact that NPs always allow PP modifiers is apparently likewise a fact about all languages; these facts therefore presumably represent something very basic about how the human mind works. We can imagine languages that are structured in ways that are radically different from the ways in which all known human languages are structured; the fact that no human language (so far as we know) is structured that way presumably says something not only about how human language works but about how the human mind works. What that something may be, in most cases, we don't know yet; but we do have some preliminary findings that are quite interesting in their implications for general psychology. And findings like these are part of the ultimate goal of the Generative Enterprise.
7 However, i have to admit that i am also a Christian, and manage not to regard the doctrine of the Trinity as a contradiction in terms, either; it may be that my religion has trained me to maintain inherently paradoxical philosophical opinions fruitfully in my mind.
idually or collectively, with a lot of different strings in your language (we call such a large collection of data a corpus), and you told me whether your internalized grammar approved or rejected each one, bearing in mind what was said in Chapter 7 about the difference between acceptability and grammaticality; in such an endeavour, you would be my informants. Now, in order to be of interest to a grammarian like me, i'm afraid this list of strings would have to be pretty long; one wants a lot of data for an enterprise like this. But let's suppose that we have done this, and in the end i have a long list of strings, some of which you have told me are grammatical and others of which you have told me are not, on the basis of the internalized grammar which you have by virtue of being fluent speakers of the language. For ease of exposition, let's break this list up into two lists as in Fig. 9.3: over on the left we have a list of all the strings you approved of, and over on the right the list of all the strings you starred, that is, rejected by marking with an asterisk. Now i set about the task of devising an explicit grammar of the language. My goal is to devise a grammar that will agree with you: like your own internalized grammar, it will accept all the strings in the list on the left of Fig. 9.3 but none of the ones on the right. The more closely my formal, explicit grammar's judgments agree with those of your implicit, internalized grammar, the better i have done my job and the happier i am. The ultimate goal in this stage of the generative enterprise is a formal, explicit grammar that is exactly equivalent to the native speaker's internalized grammar in terms of predicting which strings are grammatical and which are not. Once i've achieved that (if i ever do), then i have a reasonably good model of the grammar inside your head(s).8

Fig. 9.3 Corpus of Grammatical and Ungrammatical Strings (two columns: "Grammatical" and "*Ungrammatical")
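To give that evaluation criterion a concrete form: if the formal grammar is a predicate over strings, its quality can be scored by how often its verdicts agree with the informants' judgments. Here is a hypothetical sketch in Python; the toy "grammar" and the judgment data are, of course, invented for illustration.

```python
# Scoring a formal grammar against informant judgments: the closer the
# agreement, the better the model of the internalized grammar.
# Both the toy grammar and the corpus below are invented examples.

corpus = {                       # string -> informant verdict
    "the dog sleeps": True,
    "dog the sleeps": False,
    "a cat purrs": True,
    "purrs cat a": False,
}

def toy_grammar(string: str) -> bool:
    # Hypothetical stand-in for a real formal grammar: here, just
    # "exactly three words, the first of which is a determiner".
    words = string.split()
    return len(words) == 3 and words[0] in {"the", "a"}

agree = sum(toy_grammar(s) == verdict for s, verdict in corpus.items())
print(f"agreement: {agree}/{len(corpus)}")   # 4/4 for this tiny corpus
```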
Now, this procedure i've just outlined is not only highly generalized, it's also oversimplified. Please note that it is not necessary to collect all the data before you start building a theory about it. There's a popular fiction about the scientific method that says that a proper scientist goes out and collects all the relevant facts, then sits down and thinks about them until hann develops a theory to explain them all. This is almost never true. Usually what happens is that a scientist starts out with a few facts that hann may have found by whatever means. An idea occurs to hann that seems to make sense out of these facts. It doesn't matter where the idea comes from. It may come as a result of rational cogitation, but it often doesn't; it could just as well come as a result of some mystical or religious experience; the 19th-century chemist Kekulé found the solution to the puzzle of the structure of the benzene molecule through a dream.
8 Some of you may be wondering about the extent to which the details of my grammar correspond to anything inside your brains. That, as it happens, is a different question entirely, and one well worth discussing; we will touch on it somewhat in Chapter 15.
As far as chemistry is concerned, the important thing is not that he got the idea from a dream but that he was intelligent enough to realize that the image he saw in his dream, although it was actually the image of a snake, suggested a very good solution to the problem of molecular structure he had been working on. Once the scientist gets this idea, whatever it is, hann works with it until hann has developed it into a form that not only offers an explanation of the facts the scientist already has but asks interesting questions about other facts that the scientist doesn't know about yet; in considering such an idea, the scientist says, "If this is true, then X, Y, and Z must also be true; i must go and find out if they are." At this point, the idea in question is called a hypothesis.9 The process of checking up on X, Y, and Z (which the hypothesis predicts are true, but which the scientist doesn't yet know to be true or not) is called testing the hypothesis.10 Likewise, the linguistic scientist may start out with a few facts that strike hann as interesting. Hann somehow or other devises a hypothesis to explain them, and that hypothesis forces hann to ask new questions that call for more data than hann has. In principle, it's a simple matter for hann to go out looking for that data. So, if i look at these two lists of strings and try to come up with some explanation as to why all the strings on the left side are on the left side but all the others are on the right side, and i come up with a hypothesis that suggests something about some string that isn't in either list, i can simply return to you, my informants, and say, "What about this? Is this OK in your language?" And you would tell me it's OK or not OK, and i would know whether my hypothesis was correct in its prediction.11
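This predict-and-query cycle is simple enough to write down. The sketch below is a hypothetical illustration (the function names and the toy data are invented): a hypothesis is a predicate over strings, and testing it means looking for a string where its prediction and the informant's judgment disagree.

```python
# A sketch of hypothesis testing as a loop over informant queries.

def test_hypothesis(predict, ask_informant, new_strings):
    """Return the first counterexample, or None if the hypothesis survives."""
    for s in new_strings:
        if predict(s) != ask_informant(s):
            return s        # prediction failed: revise or discard the hypothesis
    return None             # predictions held up; the hypothesis passed this test

# Toy demo: hypothesis "every string of exactly three words is grammatical".
predict = lambda s: len(s.split()) == 3
informant = lambda s: s in {"the dog sleeps", "a cat purrs"}   # pretend judgments
print(test_hypothesis(predict, informant, ["the dog sleeps", "sleeps dog the"]))
# -> "sleeps dog the": the hypothesis overgenerates and must be refined
```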
9 I'm using the words hypothesis and theory as though they were roughly equivalent. Strictly speaking, they're not. A hypothesis is basically a kind of guess about a body of data. It's an educated guess, in that it has to be both relevant to the body of data you're currently dealing with and compatible with other, related hypotheses that are already running around. And it's a productive guess, in that it makes suggestions or predictions about other data that you haven't looked into yet. But it's basically just a guess. A theory is a cognitive structure that groups together related and compatible hypotheses; most hypotheses only have meaning within the context of some theory.
10 I once met someone who argued that Darwin's Theory of Evolution wasn't a good theory because so much of the data needed to support it wasn't available to Darwin at the time he developed and published it. This is ridiculous. Darwin was a very intelligent and well-trained scientist who understood that what makes a theory worthwhile is not whether it's true or not but whether it prompts you to ask interesting questions about the world and suggests what sort of facts ought to be out there, waiting to be found, if it were true. The fact that most of the data we have now supporting Darwin's theory weren't available to him in the 1860's is irrelevant; his theory not only explained very well the facts he had at hand but suggested where we should go to look for more relevant facts and what those facts ought to be. And we have looked in the places the theory recommends and, lo, we have found the facts it predicted we would find. Note, it would still be a good theory if the facts had turned out to be other than it predicted; it would then merely be an invalid theory. But it would have prompted us to look for and find those facts, and that's what makes a good theory.

11 When i was working on my doctoral thesis, my situation was a little different. I was working on a dead language, Vedic Sanskrit, and there wasn't anybody i could totally trust to tell me what was and was not grammatical in that language, though some of my teachers had a lot more experience with Vedic than i did and could make some very helpful suggestions. What i had was a huge corpus of literature written in Vedic. Throughout my research, i assumed that any string i found in that corpus must be grammatical, or it wouldn't have been written down. If i could imagine some plausible pattern of words and/or grammatical structures in Vedic but couldn't find any examples of that pattern in the corpus, i had to assume it was because that pattern was in fact ungrammatical. So instead of going out and talking to native speakers when i came up with hypotheses that suggested questions and made predictions i couldn't answer with the data i had, i just kept reading the corpus. I went through several hundred pages of Vedic literature that way, and some of my preliminary hypotheses turned out to be wrong. But they had forced me to look for relevant data, and that was good, and my research and the book that came out of it were the better for it.
Something else needs to be borne in mind in this connection. It is impossible to study language, or anything else, without some theory. Whatever study you're doing involves some beginning assumptions, and those assumptions constitute some kind of theory about what you're studying. And in the course of your study you have to ask questions in order to acquire more data, and the questions you ask are heavily influenced by the theory you've got at the back of your mind. You can't get the data first and form the theory later; that's not a reasonable requirement, and it's not what the scientific method requires. What the scientific method requires is that you be open-minded, that you be willing to change or even throw out your theory if you discover you have to; it requires that at all times you have some idea what sort of data would force you to change or throw out your theory. If you have such an idea and diligently search for such data and they're not there, that's fine: your theory has stood the test. If you diligently search for the contrary data and find them, that's fine too; you may have to change or throw out your theory, but you've learned something, and in the end you'll have a better theory, in the sense of one that fits the facts better. But if your theory can't be tested, if there is no conceivable way it could be proved wrong, then it may be OK as a religion but it's certainly no good as a scientific theory. And if it does suggest ways in which it could be tested and you refuse to test it, then you're being dishonest with yourself. That, in a nutshell, is the scientific ethic.

An 18th-century scholar, John Fell, said, "It is certainly the business of a grammarian to find out, and not to make, the laws of a language." Note that this attitude parallels the attitude of a researcher in the physical sciences. A physicist is not concerned with how matter and energy ought to behave; hann is concerned to find out how they actually do behave. A chemist is concerned to find out the patterns in which chemical elements and compounds actually interact with each other, not to figure out how hann wishes they interacted and then compel them to conform to hanns wishes. Scientists, like everybody else, have preferences and hopes and prejudices and fears. But a person engaged in scientific research has to acknowledge that those preferences, hopes, prejudices, and fears are part of hanns own character, not necessarily reflections of objective truth; and objective truth, insofar as it is ascertainable, is what science is all about.12

A person inventing a language is at liberty to make up whatever grammatical rules hann pleases. And some people have played around at inventing languages; some of these people have even been qualified, professional linguists; language-invention can be a fun game, for those who like that sort of thing; i know, i've done it myself.13 But at least to the linguist, a game is all it is or can be. The linguist, as linguist, is concerned not with devising hypothetical, imaginary languages, but with studying real ones. And very often the real ones involve patterns that are quite different from anything the researcher expects or has ever come across before. This is part of the fun of linguistics, but it's also part of what keeps us humble. The true scientist is always humble before the subject of hanns research. Because for the scientist, the truth is out there, not in here.
12 There are certain people who deny the possibility of objective truth. I'm afraid we don't have time to go into the question of whether there really is such a thing. As a scientist, i firmly believe there is, and i'm more concerned with telling you how we go about finding out what it is than with establishing whether it exists or not. Cf. further discussion of this topic at the end of Chapter 13.