
Formal Methods
in
Philosophy of Language

[Cover illustration: typed semantic trees for "some linguist admires every philosopher", with nodes labeled by semantic types such as ((e, t), t), ((e, t), ((e, t), t)), (e, (e, t)), (e, t), e, and t, with indexed quantifier phrases (some1 linguist, every2 philosopher), lambda binders (λ1, λ2), and variables (x1, x2).]
Contents
A Constituents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

B Constituency Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

C Constituency Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 7

D Ambiguity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

E Brackets and Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

F Mathematics of Constituents and Trees . . . . . . . . . . . . . . . . 12

G Grammatical Categories . . . . . . . . . . . . . . . . . . . . . . . . . 13

H Phrase Structure Grammar . . . . . . . . . . . . . . . . . . . . . . . . 15

I Recursive Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

J X-Bar Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

K Movement and Transformation . . . . . . . . . . . . . . . . . . . . . 24

L Meanings as Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 27

M Sentential Meanings . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

N Some Background Mathematics of Functions . . . . . . . . . . . . . 31

O Transitive Verbs and Schönfinkelization . . . . . . . . . . . . . . . 35

P The Winner and . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

Q . . . The Bald Man . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

R Functions and Type Theory . . . . . . . . . . . . . . . . . . . . . . . . 52

S Typing Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

T Naming Functions and Lambda Notation . . . . . . . . . . . . . . . 65

U From Mathematical Functions to Semantic Functions . . . . . . . 67

V Lambda Notation and Higher-Typed Functions . . . . . . . . . . . . 69

W More Complicated Semantic Values in Lambda Notation . . . . . . 72

X Calculating Semantic Values With Lambda Terms . . . . . . . . . . 80

Y Three More Complicated Linguistic Examples . . . . . . . . . . . . . 85

Z Beyond Binary Branching . . . . . . . . . . . . . . . . . . . . . . . . . 97

AA Functions of More Than One Argument . . . . . . . . . . . . . . . . 100

AB Truth-Functional Connectives . . . . . . . . . . . . . . . . . . . . . 104

AC Treating a First Ambiguity . . . . . . . . . . . . . . . . . . . . . . . . 108

AD A King and No King . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

AE Beginning Exploration of ((e, t), t) . . . . . . . . . . . . . . . . . . . . 122

AF Upward Closure and ((e, t), t) . . . . . . . . . . . . . . . . . . . . . . . 126

AG Sources of Upward Closure . . . . . . . . . . . . . . . . . . . . . . . . 129

AH Downward Closure and ((e, t), t) . . . . . . . . . . . . . . . . . . . . . 134

AI Strong and Weak Determiners . . . . . . . . . . . . . . . . . . . . . . 138

AJ Definite and Indefinite Descriptions are ((e, t), t) . . . . . . . . . . . 138

AK Names are ((e, t), t) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

AL Type Lifting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

AM Context Sensitive Language . . . . . . . . . . . . . . . . . . . . . . . 150

AN Context Sensitivity, Character, and Avoiding Relativization . . . 157

AO A Problem About Every Linguist . . . . . . . . . . . . . . . . . . . . 164

AP More Admiration Problems . . . . . . . . . . . . . . . . . . . . . . . . 171

AQQuantifiers and Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

ARHow to Produce Multiple Scope Readings . . . . . . . . . . . . . . . 176

AS Assignment-Relativized Semantic Values . . . . . . . . . . . . . . . 177

AT Variable Binding: The Single-Variable Case . . . . . . . . . . . . . . 181

AU Variable Binding: The Multiple-Variable Case . . . . . . . . . . . . 184

AV At Last, Some Linguist Admires Every Philosopher . . . . . . . . 190

AW Resting On Our Laurels and Reflecting on How the Pieces Fit
Together . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

AX A Dark Secret Revealed . . . . . . . . . . . . . . . . . . . . . . . . . . 203

AY Setwise Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206

AZ Another Use of ((e, t), t): Possessives . . . . . . . . . . . . . . . . . . 207

BA Another Use of ((e, t), t): Relative Clauses . . . . . . . . . . . . . . 208

BB Another Use of ((e, t), t): Pronouns . . . . . . . . . . . . . . . . . . . 215

BC Possible Worlds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215

BD Worlds and Connectives . . . . . . . . . . . . . . . . . . . . . . . . . 220

BE Modals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

BF Existential Modals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230

BG Modal Flavors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233

BHMeanings for Modal Flavors . . . . . . . . . . . . . . . . . . . . . . . 235

BI Modal Modals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239

BJ Accessibility Relations and Iterated Modals . . . . . . . . . . . . . 244

BK The Logic of Iterated Modals . . . . . . . . . . . . . . . . . . . . . . 251

BL A Family of Modals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258

BMS5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263

BNProblems With Conditionals . . . . . . . . . . . . . . . . . . . . . . . 269

BO Modal Conditionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273

BP Subsentential Modality . . . . . . . . . . . . . . . . . . . . . . . . . . 275

BQ Modals and Quantified Noun Phrases . . . . . . . . . . . . . . . . . 282

BR Modals and Scope Options . . . . . . . . . . . . . . . . . . . . . . . . 288

BS The Syntax of Modals . . . . . . . . . . . . . . . . . . . . . . . . . . . 293

BT Things That Might Not Have Been . . . . . . . . . . . . . . . . . . . . 305

A Constituents

Consider the following sentence:


(1) From boyhood up this old boy had put the fear of death into me.
This is a string of 14 words. Within it are a lot of substrings of consecutive
words, such as boyhood up this and the fear of death.
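What counts as a "substring of consecutive words" can be made concrete with a minimal Python sketch (illustrative only, not part of the text's formal apparatus) that enumerates all of them for sentence (1):

```python
# Enumerate every substring of consecutive words in sentence (1).
sentence = "From boyhood up this old boy had put the fear of death into me"
words = sentence.split()

substrings = [" ".join(words[i:j])
              for i in range(len(words))
              for j in range(i + 1, len(words) + 1)]

print(len(words))                          # 14
print("the fear of death" in substrings)   # True
print("boyhood up this" in substrings)     # True
```

Running `len(substrings)` here mechanically checks your answer to Problem 1 below.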

Problem 1: How many different strings of consecutive words are there in the sentence?

There is an interesting distinction among these substrings. Some of them, such as the fear of death, seem like they belong together. That string of four words is a real part of the sentence, and makes some kind of sense on its own. Other substrings, such as boyhood up this, don’t belong together. Boyhood up this isn’t a real part of the sentence (it seems to bridge across different real parts), and it doesn’t make sense on its own.

The substrings that seem to belong together we call constituents of the sentence.

Problem 2: Find five substrings of sentence (1) that are constituents and five substrings of sentence (1) that are not constituents. (Don’t use trivial substrings consisting of a single word.)

B Constituency Tests

We can give tests for constituents that provide more precision than just intuitive
“is this a real part” judgments.

Test 1: Coordination

Sometimes a substring can be combined with another similar substring using “and”:

(2) From boyhood up this old boy had put the fear of death and the
hope of victory into me.

In other cases we can’t grammatically combine a substring with another similar substring using “and”:

(3) #From boyhood up this and adolescence down that old boy had put
the fear of death into me.

Substrings that can be combined – coordinated – using “and” pass this constituency test.

Test 2: Topicalization

Sometimes a substring of a sentence can be moved to the front and made the
topic of a sentence (topicalized):

(4) The fear of death, from boyhood up this old boy had put into
me.

In other cases a substring can’t be topicalized:

(5) #Boyhood up this, from old boy had put the fear of death into me.

Substrings that can be topicalized pass this constituency test.
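The mechanical half of the topicalization test (moving a substring to the front) can be sketched as a small function; judging whether the resulting string is grammatical is, of course, still left to a human. The function below is our own illustration, not part of the text:

```python
def topicalize(sentence, substring):
    """Move `substring` to the front of `sentence`, comma-separated.
    This only builds the candidate string; grammaticality of the
    result is a separate (human) judgment."""
    assert substring in sentence
    remainder = " ".join(sentence.replace(substring, "", 1).split())
    return substring.capitalize() + ", " + remainder

s = "from boyhood up this old boy had put the fear of death into me"
print(topicalize(s, "the fear of death"))  # the grammatical case, as in (4)
print(topicalize(s, "boyhood up this"))    # the candidate judged # in (5)
```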

Test 3: Proform Substitution

Sometimes a string of words can be replaced with a pronoun:

(6) From boyhood up this old boy had put it into me.

Pronouns work (unsurprisingly) for nouns, but there are similar words for
other parts of speech:

(7) a. John ran slow, while Susan did fast.
    b. Prices are stable, but they may not remain so.
    c. John snores loudly, while Susan laughs thus.

The general category of such words is called proforms. If a substring can be replaced by a proform, it passes this constituency test. Some substrings can’t be replaced by a proform:

(8) #From it/do/so/thus old boy had put the fear of death into me.

Problem 3: Is there a suitable proform that shows that blue car is a constituent of:

(9) I recently bought the blue car.

Should blue car be a constituent of this sentence? What about:

(10) I recently bought a blue car.

Test 4: Parenthetical Insertion

Parenthetical insertions can be made in some places and not in others. Compare:

(11) From boyhood up this old boy had put the fear of death (luckily) into me.
(12) #From boyhood up this (luckily) old boy had put the fear of death into me.
According to this test, spots of permissible parenthetical insertion mark ends
of constituents.
Problem 4: What are the permissible locations of parenthetical insertion in:
(13) The quick brown fox jumped over the lazy dog.
Assuming each permissible location marks the end of a constituent,
is this enough to tell us what the constituents of the sentence are? If
so, how? If not, what additional information or assumptions do we
need?

C Constituency Analysis

A constituency analysis of a sentence finds all of the constituents of that sentence. For example, given the sentence:
(14) The tall giraffe ate every leaf.
we might identify as constituents (ignoring the single-word constituents) all of
the following:
• tall giraffe
• the tall giraffe
• every leaf
• ate every leaf
• the tall giraffe ate every leaf
We could then display this information more compactly as follows:
• [[the [tall giraffe]] [ate [every leaf]]]
where square brackets mark off each constituent.
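Bracket notation is easy to manipulate mechanically. As an illustrative sketch (the parser and its nested-list representation are our own, not part of the text), here is one way to read a bracketed string into a nested structure:

```python
def parse_brackets(s):
    """Parse a bracketed constituency string such as
    '[[the [tall giraffe]] [ate [every leaf]]]' into nested lists,
    with words as strings and constituents as lists."""
    tokens = s.replace("[", " [ ").replace("]", " ] ").split()

    def helper(i):
        assert tokens[i] == "["
        node, i = [], i + 1
        while tokens[i] != "]":
            if tokens[i] == "[":
                child, i = helper(i)
                node.append(child)
            else:
                node.append(tokens[i])
                i += 1
        return node, i + 1          # skip the closing bracket

    tree, _ = helper(0)
    return tree

print(parse_brackets("[[the [tall giraffe]] [ate [every leaf]]]"))
# → [['the', ['tall', 'giraffe']], ['ate', ['every', 'leaf']]]
```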
Problem 5: For each of the following sentences, use the four con-
stituency tests to give your best constituency analysis of the sen-
tence. Note places where the constituency tests disagree, and be
clear on your reasons for your final constituency analysis when
there is disagreement among the tests.

• This student will speak very slowly to that professor.
• Fred knows John knows Joe.
• John rang up his mother.
• He may have been writing a letter.

Problem 6: Consider the following two sentences:


• The student sat in the chair.
• The student brought in the chair.
Plausibly, one of these sentences has a constituency structure with
the form:

[[][[[]]]]
And the other has a constituency structure with the form:

[[][[][]]]
Use the various constituency tests to determine which sentence
should be matched with which form.

Problem 7: Attempt a constituency analysis of a sentence “in the wild”. Pick a book at random, open it at random, and pick a sentence at random from the opened page. Then try to work out the constituency structure of the sentence using the constituency tests. (You may find it surprisingly difficult to get a clear constituency analysis. Note points of particular difficulty with the analysis.)

Problem 8: You’ve probably already encountered cases in which different constituency tests yield different answers. With four tests, there are six pairs of tests. See if, for each pair of tests, you can find a case in which the two tests in the pair give different answers. Can you construct a single sentence that gets a different overall constituency analysis for each of the four tests?

D Ambiguity

Some sentences are ambiguous, allowing for more than one reading/interpretation.
Consider:

(15) Very old men and women are happy.

This sentence allows two interpretations:

(15) a. Among the happy are both (i) very old men, and (ii) very old
women.
b. Among the happy are both (i) very old men, and (ii) women
(of any age).

The sentence also has two plausible different constituency analyses:

(15) i. [[[very old] [men and women]] [are happy]]
     ii. [[[[very old] [men]] and [women]] [are happy]]

(Try some constituency tests to see if you agree with this.) In (15i), very old is
a constituent modifying the constituent men and women:

Definition: Constituent A modifies constituent B if [A B] is itself a constituent.

Also, it looks like in reading (15b) we take very old to describe just the men,
while in reading (15a) we take very old to describe the men and the women.
It’s thus tempting to think that the two different constituency analyses explain
the ambiguity of (15), with analysis (15i) corresponding to reading (15a) and
analysis (15ii) corresponding to reading (15b). Constituency structure thus
might help us give a general theory of ambiguity.
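One way to see how the two bracketings pull apart is to model them set-theoretically. In this toy model (the individuals and the set assignments are invented purely for illustration), very old picks out a set, and the two constituency analyses correspond to different orders of set operations:

```python
# Toy extensions (hypothetical individuals, for illustration only).
men, women = {"al", "bo"}, {"cy", "di"}
very_old = {"al", "cy"}           # al is a very old man, cy a very old woman

# (15i): [very old [men and women]] -- intersect with the whole union.
reading_a = very_old & (men | women)

# (15ii): [[very old men] and women] -- intersect with men only, then union.
reading_b = (very_old & men) | women

print(sorted(reading_a))  # ['al', 'cy']
print(sorted(reading_b))  # ['al', 'cy', 'di'] -- di, a woman of any age
```

The two readings come apart exactly when some woman is not very old, mirroring the intuitive difference between (15a) and (15b).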

Problem 9: Each of the following sentences is ambiguous. For


each, show the different ways it can be interpreted. Then see if
the sentence can be given multiple constituency analyses, and if
so, if the different analyses help shed light on the ambiguity of the
sentence.
• They found the boy in the library.
• Our shoes are guaranteed to give you a fit.
• You could not go to the party.
• Time flies like an arrow.
• I couldn’t see myself spending a month in a house without
mirrors.
• I cannot recommend the candidate too highly.
• The restaurant never served veal, which we all hated.
• Someone is mugged every 15 minutes in St. Louis.

Problem 10: The definition given above of modifying produces


some strange results. Give some examples of the strange results.
Are there easy ways to modify the definition to make the results
more plausible?

E Brackets and Trees

We have been displaying constituency analyses using a bracket notation:

• [[the [tall giraffe]] [ate [every leaf]]]

But sometimes it is more useful to display the same information in a tree diagram:

•
├── •
│   ├── the
│   └── •
│       ├── tall
│       └── giraffe
└── •
    ├── ate
    └── •
        ├── every
        └── leaf
Some terminology for tree diagrams:

Tree: A tree is a collection of nodes joined by edges.

Node: A node is an ending point of an edge. Each individual word in the sentence is a node, and branching points in the tree are also nodes. The above tree thus has 11 nodes. (6 nodes for the six words in the sentence, and 5 branching point nodes, including the root of the tree at the top. (Our trees are upside down.))

Edge: An edge is a line connecting two nodes of the tree. Our tree above has 10 edges. (Make sure you can correctly locate and count the ten edges.)

Parent and Child: Node A is the parent of node B if there is an edge directly joining A and B, and A is above B in the tree. (Thus B is a substring of A.) B is then a child of node A.

Sibling: Nodes A and B are siblings if they have the same parent.

Dominate: Node A dominates node B if there is a continuous path of edges leading down from node A to node B.

C-Command: Node A c-commands node B if the parent of A dominates B.

Tree diagrams and bracket diagrams contain exactly the same information, and
it is straightforward to translate between the two methods of representation.
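The node and edge counts above can be checked mechanically. In this sketch, a tree is a nested list whose leaves are words and whose lists are branching nodes (a representation chosen here for illustration):

```python
# The tree for [[the [tall giraffe]] [ate [every leaf]]] as nested lists.
tree = [["the", ["tall", "giraffe"]], ["ate", ["every", "leaf"]]]

def count_nodes(t):
    """Count every node: each leaf word, plus each branching point."""
    if isinstance(t, str):
        return 1
    return 1 + sum(count_nodes(child) for child in t)

def count_edges(t):
    """Every node except the root hangs from exactly one edge."""
    return count_nodes(t) - 1

print(count_nodes(tree))  # 11
print(count_edges(tree))  # 10
```

The edge count falling out as "nodes minus one" anticipates the first part of Problem 16 below.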

Problem 11: For each of the following bracket diagrams, give a cor-
responding tree diagram. For each of the following tree diagrams,
give a corresponding bracket diagram.

1. [ [ John ] [ [ gave ] [ [ the ] [ firefighter ] ] [ [ an ] [ axe ] ] ] ]

2. [ [ [ A ] [ [ big ] [ [ black ] [ dog ] ] ] ] [ [ quickly ] [ [ chased off ] [ [ the ] [ burglar ] ] ] ] ]

Note: chased off is treated as a minimal constituent in this diagram. Is there a way to improve on that?

3.

•
├── •
│   ├── colorless
│   └── •
│       ├── green
│       └── ideas
└── •
    ├── sleep
    └── furiously

4.

•
├── •
│   ├── the
│   └── man
└── •
    ├── saw
    └── •
        ├── •
        │   ├── a
        │   └── dog
        └── •
            ├── in
            └── •
                ├── the
                └── park

5.

•
├── •
│   ├── the
│   └── man
└── •
    ├── saw
    └── •
        ├── •
        │   ├── a
        │   └── dog
        ├── and
        └── •
            ├── a
            └── cat

Problem 12: From the trees in the previous problem, give examples
of pairs of nodes that stand in the parent, child, sibling, dominating,
and c-commanding relations.

Problem 13: Almost all of the trees considered above are binary-branching trees. A tree is binary branching if no parent has more than two children. Find the one place where a tree is not binary branching. Is there a way we could plausibly have done the constituency analysis so that the tree would have been binary branching?

F Mathematics of Constituents and Trees

Our discussion of constituents has tacitly assumed two facts about constituents:

1. Contiguity: Each constituent is a contiguous part of a sentence. There are never constituents that consist in words broken up within the sentence. Thus we didn’t even consider the possibility that boy ...the fear was a constituent of sentence (1).

2. Nesting: We have only considered cases in which, given any constituents A and B, either (i) A and B are completely disjoint, or (ii) A is a substring of B, or (iii) B is a substring of A. We haven’t considered the possibility of strings that partially but not fully overlap. Thus we haven’t considered constituency analyses on which (for example) both the fear and also fear of death are constituents of (1).

Problem 14: Give examples of sentences for which there is some reason to want constituent analyses that contain discontiguous and non-nested constituents.

Given Nesting, the constituent relation (the relation that holds between A and B when A is a constituent of B) is a strict partial order. That means the constituent relation has two features:

1. Transitivity: If A is a constituent of B, and B is a constituent of C, then A is a constituent of C.

2. Asymmetry: If A is a constituent of B, then B is not a constituent of A.

Problem 15: Explain what features nodes and edges in a tree have that guarantee that a tree models a constituent relation that is transitive and asymmetric.

We can then think of a tree as a collection of nodes with (a) some partial ordering
on them, and (b) some node acting as the root node, dominating all of the other
nodes in the tree.
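We can spot-check Transitivity and Asymmetry for a particular analysis. Here the constituents of the tall giraffe ate every leaf are encoded as word-index spans (an encoding of our own, for illustration), and “A is a constituent of B” becomes strict nesting of spans:

```python
from itertools import product

# Constituents of "the tall giraffe ate every leaf" as (start, end) word
# spans, end-exclusive, single words included.
spans = [(0, 6), (0, 3), (1, 3), (3, 6), (4, 6),
         (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]

def constituent_of(a, b):
    """a is a (proper) constituent of b: a's span nests strictly inside b's."""
    return a != b and b[0] <= a[0] and a[1] <= b[1]

transitive = all(constituent_of(a, c)
                 for a, b, c in product(spans, repeat=3)
                 if constituent_of(a, b) and constituent_of(b, c))
asymmetric = all(not constituent_of(b, a)
                 for a, b in product(spans, repeat=2)
                 if constituent_of(a, b))
print(transitive, asymmetric)  # True True
```

Of course a finite check is only a spot-check; Problem 15 asks for the general argument.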

Problem 16: Consider each of the following questions about formal properties of trees:
• Find a general numerical relation between the number of nodes
in a tree and the number of edges in a tree. Sketch a proof that
this relation holds.
• True or false: if any edge is removed from a tree, then the
tree will become disconnected. (There will be nodes between
which there is no path.) If the claim is true, give a proof of it.
If it is false, give a counterexample to it.

• True or false: if any additional edge is added to a tree, a cycle
is formed. (There will be a sequence of edges that allow you
to travel in a circle through part of the graph.) If the claim is
true, give a proof of it. If it is false, give a counterexample to it.

G Grammatical Categories

Some strings of English words form grammatical sentences:

(16) Hold the newsreader’s nose squarely, waiter, or friendly milk will countermand my trousers.

Other strings of English words don’t form grammatical sentences:

(17) #Underground splendid alternative having quickly and.

It would be helpful if constituent analysis would help us distinguish the grammatical from the ungrammatical strings.

First Proposal: Take the constituent tree for any grammatical sen-
tence:

•
├── •
│   ├── the
│   └── •
│       ├── tall
│       └── giraffe
└── •
    ├── ate
    └── •
        ├── every
        └── leaf
Then remove the particular words on the leaves of the tree:

•
├── •
│   ├── •
│   └── •
│       ├── •
│       └── •
└── •
    ├── •
    └── •
        ├── •
        └── •
Then place any words you want on the now-empty leaves:

•
├── •
│   ├── almost
│   └── •
│       ├── sunset
│       └── perfection
└── •
    ├── unless
    └── •
        ├── hybrid
        └── penultimately

That didn’t work so well. Maybe the problem is that we didn’t respect the
internal constituent structure. Every leaf was a constituent of the original
sentence, and marked as a constituent of the sentence in the tree, but we replaced
it with hybrid penultimately, which is not a constituent.

Second Proposal: Again strip bare the constituent tree of a given sentence. Then replace each constituent of that tree with a group of words that forms a constituent:

•
├── •
│   ├── quickly
│   └── •
│       ├── drink
│       └── lemonade
└── •
    ├── quite
    └── •
        ├── distinctly
        └── ambitious

That didn’t work either.

Problem 17: The counterexample just given to the Second Proposal has a problem. Figure out what the problem is, and then explain why noticing that problem doesn’t actually do anything to make the Second Proposal more attractive.

The basic problem is clear: we are replacing (for example) the adjective tall
in the original sentence with the noun sunset or the verb drink. We need to
restrict ourselves to substitution variants of the constituent tree that keep the
grammatical categories the same.

But to restrict in this way, we need to know what grammatical category each
word belongs to. But defining the various grammatical categories is very
difficult. The canonical source Schoolhouse Rock tells us:
• A noun is a person, place, or thing.
• Adjectives are words you use to really describe things.
• Verb! That’s what’s happening.
None of these “definitions” are very helpful. (Is “fortitude” a thing? Or
“asymptote”? Does “tiger” describe something?)

Problem 18: Give attempts at definitions for the grammatical categories preposition and adverb. Then give problem cases for your definition. (Either words that should be in the category, but don’t meet your definition, or words that shouldn’t be in the category, but do meet your definition.) Then consider how your definition might be improved to deal with those problem cases.

Instead of trying to give substantive definitions of the grammatical categories, we can give a structural definition:

Definition: Words A and B belong to the same grammatical category if, given any grammatical sentence containing word A, replacing word A with word B also produces a grammatical sentence, and vice versa.

If we replace giraffe in The tall giraffe ate every leaf with ambition,
we get The tall ambition ate every leaf. Is that grammatical? (We need
to carefully distinguish grammaticality from truth or sensibility.) If not, giraffe
and ambition belong to different grammatical categories. If it is, we have some
evidence that giraffe and ambition belong to the same grammatical category.

We can then hope that all of the familiar English nouns end up in the same
category, which we will then label “noun”. Similarly for all the verbs, all the
adjectives, and so on.
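The structural definition can be dramatized with a toy "grammaticality oracle". Here the oracle is just a hand-listed set of sentences (wholly invented for illustration; a real test would need speakers' judgments over indefinitely many sentences):

```python
# A toy "grammaticality oracle": the grammatical sentences, listed by hand.
GRAMMATICAL = {
    "the tall giraffe ate every leaf",
    "the tall ambition ate every leaf",   # grammatical, even if absurd
}

def same_category(a, b):
    """Substitution test against the toy oracle: swapping a and b must
    preserve grammaticality in every listed sentence, in both directions."""
    for s in GRAMMATICAL:
        if a in s.split() and s.replace(a, b) not in GRAMMATICAL:
            return False
        if b in s.split() and s.replace(b, a) not in GRAMMATICAL:
            return False
    return True

print(same_category("giraffe", "ambition"))  # True
print(same_category("giraffe", "ate"))       # False
```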

Problem 19: For each of the following pairs of words, give a sentence
that shows, using the substitution test, that the two words do not
belong to the same grammatical category.
• And, but
• Kill, laugh
• All, each
• Think, want
• In, out
• College, university

H Phrase Structure Grammar

We introduce some grammatical categories by simple enumeration:


• NAME: Socrates, Plato, Aristotle
• N: giraffe, cat, dog, virtue
• A: tall, fast, red, angry
• DET: a, the, each
• IV: laugh, snore, die
• TV: kill, admire, rescue
We then introduce three higher-level grammatical categories:

• S (sentence)
• NP (noun phrase)
• VP (verb phrase)
Finally, we have phrase structure rules, which specify how the higher-level
phrase categories can be built out of lower-level phrase categories:
(R1) S → NP VP

(R2) NP → NAME

(R3) NP → DET N

(R4) NP → DET A N

(R5) VP → IV

(R6) VP → TV NP
Here’s a simple application. We start with a whole sentence, or S. Using (R1), S
decomposes into NP and VP. That gives us the beginning of a constituent tree
– or, as we’ll now call it, a phrase structure tree:

S
├── NP
└── VP
We can then apply rule (R2) to the NP node. (We could also apply (R3) or (R4)
instead.) That gives us:

S
├── NP
│   └── NAME
└── VP
Then we can apply rule (R6) to the VP node:

S
├── NP
│   └── NAME
└── VP
    ├── TV
    └── NP
Then we can apply rule (R4) to the new NP node:

S
├── NP
│   └── NAME
└── VP
    ├── TV
    └── NP
        ├── DET
        ├── A
        └── N
Finally, we can pick words in each bottom-level category from our starting
“dictionary”:

S
├── NP
│   └── NAME
│       └── Aristotle
└── VP
    ├── TV
    │   └── rescued
    └── NP
        ├── DET
        │   └── each
        ├── A
        │   └── angry
        └── N
            └── giraffe


The goal is then to produce a collection of phrase structure rules that generate
all the grammatical sentences and only grammatical sentences.
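Since rules (R1) through (R6) and the starting dictionary are finite and non-recursive, everything they generate can be enumerated by brute force. A minimal sketch (the encoding of the rules as a Python dict is ours):

```python
import itertools

# Rules (R1)-(R6) plus the starting dictionary, encoded as a mapping
# from each category to its possible expansions.
RULES = {
    "S":    [["NP", "VP"]],
    "NP":   [["NAME"], ["DET", "N"], ["DET", "A", "N"]],
    "VP":   [["IV"], ["TV", "NP"]],
    "NAME": [["Socrates"], ["Plato"], ["Aristotle"]],
    "N":    [["giraffe"], ["cat"], ["dog"], ["virtue"]],
    "A":    [["tall"], ["fast"], ["red"], ["angry"]],
    "DET":  [["a"], ["the"], ["each"]],
    "IV":   [["laugh"], ["snore"], ["die"]],
    "TV":   [["kill"], ["admire"], ["rescue"]],
}

def expand(symbol):
    """Yield every word string derivable from `symbol`."""
    if symbol not in RULES:                      # a terminal word
        yield symbol
        return
    for rhs in RULES[symbol]:
        for parts in itertools.product(*[list(expand(s)) for s in rhs]):
            yield " ".join(parts)

sentences = set(expand("S"))
print("Aristotle rescue each angry giraffe" in sentences)  # True
print(len(sentences))   # the count asked about in Problem 20 below
```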

Problem 20: How many sentences are generated using rules (R1)
through (R6) above, given the starting “dictionary”? How many
grammatical sentences are there in English? Explain why there’s a
problem revealed by those two answers, and think about how that
problem might be fixed.

Problem 21: Are all of the sentences generated using rules (R1)
through (R6) and our starting “dictionary” in fact grammatical sen-
tences? If not, give a counterexample. Are there any grammatical
sentences built using only the words in the starting “dictionary”
that can’t be produced using rules (R1) through (R6)? If so, give an
example, and suggest an improvement or addition to the rule set
that will allow that sentence also to be generated. Does your change
to the rules allow any new ungrammatical strings to be generated?

Problem 22: Consider the first stanza of Lewis Carroll’s “Jabberwocky”:

‘Twas brillig, and the slithy toves
Did gyre and gimble in the wabe.
All mimsy were the borogoves,
And the mome raths outgrabe.

Give a collection of phrase structure rules and a starting “dictionary” that allows these two sentences to be generated. Into which grammatical categories did you place Carroll’s various nonsense words (brillig, slithy, toves, gyre, gimble, wabe, mimsy, borogoves, mome, raths, outgrabe)? What evidence led to putting those words in those categories?

I Recursive Rules

English and other natural languages have a finite number of words but an
infinite number of sentences. As we saw in Problem 20 above, that leads to a
difficulty for phrase structure grammars of the sort we’ve been considering –
the formation rules, given a finite starting vocabulary, will produce only a finite
number of sentences.

We can fix this difficulty by using recursive rules. In a recursive rule, one of
the output categories of the rule is the same as the input category. Previously
we allowed adjectival modification using the rule:

(R4) NP → DET A N

This rule allows only a single adjective, but that’s clearly not good enough
(“the tall menacing stranger”). We could add another rule for double adjective
modification:

(R4’) NP → DET A A N

But we’ll end up needing a lot of rules that way. Instead, we use a recursive rule:

(R7) N → A N

Problem 23: Use rule (R7) together with rule (R3) to produce trees
for the tall menacing stranger and the old dilapidated red
barn.

Problem 24: It’s a consequence of rule (R7) that strings of modifying adjectives can occur in any order. Consider the five other variant orders for the adjectives in the old dilapidated red barn. Are they all grammatical constructions in English? If not, what distinguishes the grammatical from the ungrammatical forms? How could we change our rules to rule out the ungrammatical forms?

Rule (R7) lets us produce an infinite number of adjectivally modified nouns,
because we can re-apply the rule as many times as we want. Every time we
apply (R7) to analyze an N node, we produce another N node, and then we can
apply (R7) to that N node as well.
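This re-application can be sketched directly; the depth bound and word list below are ours (the grammar itself imposes no bound):

```python
# Apply the recursive rule (R7) N -> A N up to `depth` times,
# collecting every modified noun produced along the way.
ADJS = ["old", "dilapidated", "red"]

def modified_nouns(noun, depth):
    results, frontier = [noun], [noun]
    for _ in range(depth):
        frontier = [a + " " + n for a in ADJS for n in frontier]
        results += frontier
    return results

nps = modified_nouns("barn", 2)
print(len(nps))                 # 1 + 3 + 9 = 13 strings by depth 2
print("old red barn" in nps)    # True
```

Each extra unit of depth multiplies the frontier by the number of adjectives, so the outputs grow without bound as depth grows.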

Problem 25: We can also modify nouns after their occurrence using
prepositional phrases, such as man in the house. Let’s add new
categories of P (preposition) and PP (prepositional phrase) with the
rule:

(R8) PP → P NP

Prepositional phrase modification, like adjectival modification, can be repeated, as in tiger with sharp claws behind the door. So perhaps we want another recursive rule:
perhaps we want another recursive rule:

(R9) N → N PP

Show that with these rules, there is more than one phrase structure
tree that can be given for every vicious angry tiger with sharp
claws behind the door. Is it a problem that we can produce more
than one tree?

We can also use recursive rules to handle sentential connectives like and and
or. We introduce a new grammatical category CONN, and add the rule:

(R10) S → S CONN S

Since S is both the input and part of the output of the rule, the rule is recursive,
and can be applied repeatedly to make arbitrarily large sentences.

Problem 26: Use (R10) (together with the other rules) to produce
two trees for the sentence:

• Aristotle laughs and Plato cries or Socrates snores.

Explain how the two trees correspond to two readings of the sen-
tence.

Problem 27: Show that if we put if in the CONN category, we can also produce a tree for:

• Aristotle laughs if Plato cries if Socrates snores.

Is that a problem? Can we produce:

• If Aristotle laughs, Plato cries.

If not, how might we fix the rules to allow for producing that sen-
tence? Are there other words that plausibly belong in the CONN
category that create similar problems?
Recursive rules thus give us a powerful tool for generating an infinite number
of sentences using a finite number of rules. Linguists often take recursive rule
structure to be one of the central defining features of the human linguistic
capacity.
Problem 28: Another source of the infinite stock of English sen-
tences is propositional attitude verbs:

• It is raining.
• John believes it is raining.
• John believes John believes it is raining.
• John believes John believes John believes it is raining.
• ...

Give recursive rules that allow us to produce sentences of this sort. Consider a range of different psychological attitude verbs and their differing methods of grammatical expression, and see if that range produces any problems for your rules. If so, are there ways to modify the rules to avoid those problems?
Problem 29: We have considered a particularly simple version of recursive rules here, in which the input grammatical category is also one of the output grammatical categories. But the same “circular” recursive structure can happen in more complicated ways. Consider the two rules:

(S1) X → Y Z
(S2) Y → Z X

Prove that an infinite number of expressions can be generated using these two rules. Prove also that any expression generated using the two rules has an even number of Z-category expressions in it.

Are there any grammatical constructions in English that call for recursive rules of this more complicated form?

J X-Bar Syntax

As we try to build a phrase structure grammar for a larger and larger fragment of
the language, the collection of rules threatens to become gigantic, idiosyncratic,
and ad hoc.

Problem 30: To bring this out, try to construct a collection of phrase structure rules that allow generation of the string:

(18) Motty, the son, was about twenty-three, tall and thin
and meek-looking.

Then see if your phrase structure rules allow any ungrammatical strings to be formed. If they do, attempt to correct them to block the ungrammatical string while still allowing (18).

Linguists have thus been interested in trying to find a general format for phrase
structure rules that brings out a simpler underlying picture of how the language
works. One popular approach is X-Bar Theory.

In X-bar theory, we begin with the idea of the head of a phrase. In, for ex-
ample, a noun phrase (NP) like the tiger behind the door, the noun tiger
is the head. The thought then is that all heads produce surrounding phrases
in the same way. For each head, we have the possibility of specifiers and
complements. Roughly:

• Specifiers come before the head, do something to specify the particu-


lar instance of the head, and are unique (and hence not governed by a
recursive rule).
• Complements come after the head, provide further detail about the head,
and can be repeated (and hence are governed by a recursive rule).

X-bar theory thus allows us to produce trees like this:

[XP SPEC [X̄ [X̄ [X̄ X] COMP] COMP]]

(Rendered here in labeled-bracket notation: the XP node dominates SPEC and an X̄; each X̄ dominates either another X̄ plus a COMP, or the bare head X.)
Here “X” marks the category of the head, so “X” is “N” if we are producing a
noun phrase, “V” if we are producing a verb phrase, and so on. Notice that
we’ve introduced a new kind of node labeled X̄ (“X-bar”). We use this as an intermediate
step for producing the recursive introduction of COMPs. Sometimes “XP” is
instead written X̿, or “X double bar”.

We thus have the following rules, which can be used with any head:

(X1) XP → SPEC X̄

(X2) X̄ → X̄ COMP

(X3) X̄ → X
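To see how the three rules interact, here is a small illustrative sketch in Python (the bracketed-string representation and the function name are our own invention, with X' standing in for X̄): (X3) supplies the bare head, (X2) can apply any number of times because it is recursive, and (X1) caps the phrase off with its specifier.

```python
# Expand the X-bar schema: XP -> SPEC X', X' -> X' COMP (recursive), X' -> X.
# The result is a flat bracketed string; the number of COMPs is a free
# choice precisely because rule (X2) is recursive.

def xbar(head, spec, comps):
    """Build a bracketed XP for the given head, specifier, and complements."""
    node = "[X' " + head + "]"                    # rule (X3): X' -> X
    for comp in comps:                            # rule (X2), once per COMP
        node = "[X' " + node + " " + comp + "]"
    return "[XP " + spec + " " + node + "]"       # rule (X1): XP -> SPEC X'

print(xbar("tiger", "the", ["[PP behind the door]"]))
# -> [XP the [X' [X' tiger] [PP behind the door]]]
```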

Problem 31: Use the basic X-bar framework to give trees for the
following phrases:
1. The tiger behind the door.
[Hints: let “tiger” head the main phrase. COMP is then a
prepositional phrase headed by behind. That prepositional
phrase then contains another noun phrase. Note that in some
cases either SPEC or COMP might need to be “empty”.]
2. Slowly run around the building.
3. Almost underneath the large threatening bulldozer.

Problem 32: What grammatical category should the SPEC of a


prepositional phrase be? Should we also have an X-bar structure
for that category? If so, give an example (and a tree with that
example) for a prepositional phrase that contains a SPEC that is
itself a full X-bar phrase.

Problem 33: Our rules always put SPEC to the left of the head and
COMP to the right of the head. Does that produce the right analysis?
If not, can we modify the rules to allow for SPEC and COMP to
appear in different positions? Are there interesting constraints on
when SPEC and COMP appear in which order?

Some linguists then propose an analysis in which there is a head type I, for
“inflection”, which carries tense and modal information about the sentence.
I then has the subject of the sentence as its SPEC and the verb phrase of the
sentence for its COMP. Thus we get examples like:

[IP [SPEC=NP [SPEC=D the] [N̄ [N tiger]]] [Ī [I may] [COMP=VP [SPEC=AdvP [Adv̄ [Adv ruthlessly]]] [V̄ [V̄ [V attack]] [COMP=NP [SPEC=D the] [N̄ [N students]]]]]]]

(Rendered in labeled-bracket notation.)
Notice that here “IP” plays the role that “S” previously played.

Problem 34: Give trees for the following sentences, assuming (i) in
each case the full sentence is an IP phrase, (ii) in various cases, vari-
ous of SPEC and COMP might be empty, and (iii) every basic lexical
item is in some grammatical category to which the X-bar structure
can be applied.

1. The cat will chase the mouse.


2. The specialist in fiberoptics from Paris said that the
doctor was wrong.
3. The big book of poems with the dark blue cover fell dramatically
on the floor.
4. I don’t think any student will enjoy having yet another
sentence to analyze. [Problem: what to do with pronouns?]

Problem 35: Should there be a difference between the X-bar analy-


ses of the following two phrases:

• The student with long hair.


• The student of ancient languages.

Consider the following minimal pairs in thinking about this:

1. The student with long hair of ancient languages // the
student of ancient languages with long hair.
2. I like this student with long hair better than that one
with short hair. // I like this student of ancient languages
better than that one of physics.
3. I’d prefer a student with long hair to one with short
hair // I’d prefer a student of ancient languages to one
of physics.
Problem 36: Above we treated D (determiner) as the SPEC of N in
an NP. However, some linguists prefer to treat D as the head of a
phrase like the tall man, with NP then serving as the COMP of D.
Give a tree for the tall man that treats it in this way. If this is the
right analysis, we need two questions answered:

1. What now is the SPEC of N (given that it is not D)?


2. What is the SPEC of D (given that NP is the COMP of D)?

K Movement and Transformation

Compare the following two sentences:


(19) a. The student read the book.
b. Which book did the student read?
We have a perfectly good tree for (19a):

[IP [SPEC=NP [SPEC=D the] [N̄ [N student]]] [Ī [I PAST] [COMP=VP [V̄ [V̄ [V read]] [COMP=NP [SPEC=D the] [N̄ [N book]]]]]]]
But this tree then makes (19b) puzzling. read should take a COMP (the book),
but in (19b) read is the end of the sentence, and doesn’t appear to take a COMP.
The student is the SPEC of I and should thus be at the front of the sentence,
but in (19b) it’s buried within the sentence. And a mysterious new verb did
has appeared. How can we make sense of all this?

Linguists have taken constructions like (19b) as evidence that the full grammar
of English, and other natural languages, includes both a phrase structure tree
component, building initial trees, and a movement component, re-arranging
parts of the trees after they are built. In (19b), we want some kind of move-
ment rule that moves “the book” from the end to the front of the sentence (and
changes it to “which book”).

But if movement is an operation on trees, we need more details about how it


occurs – what the pre-movement and post-movement trees look like. Here is a
suggestion for the pre- and post-movement trees for (19b):

Pre-movement:

[IP [SPEC=NP [SPEC=D the] [N̄ [N student]]] [Ī [I PAST] [COMP=VP [V̄ [V̄ [V read]] [COMP=NP [SPEC=D which] [N̄ [N book]]]]]]]
Post-movement:

[IP [SPEC1=NP [SPEC=D which] [N̄ [N book]]] [IP [SPEC2=I PAST] [IP [SPEC=NP [SPEC=D the] [N̄ [N student]]] [Ī [I t2] [COMP=VP [V̄ [V̄ [V read]] [COMP=NP t1]]]]]]]
The movement rule thus allows (i) a new IP node to be added to the top of the
tree, (ii) a component of the tree to be moved to the top of the tree and
inserted as the SPEC of the new IP node, and (iii) a trace to be left behind in
place of the moved item. A trace is a covert, silent element of the sentence.

Problem 37: Give pre- and post-movement trees for the following
two sentences:

(20) a. Which linguist do you want to invite to the party?


b. Which linguist do you want to invite the philosophers
to the party?

(Make sure you see how (20b) is grammatical and what it means.)
Now note that in (20a), want to can be contracted to wanna, but in
(20b) want to cannot be contracted to wanna. Can we explain the
difference in availability of contraction by considering the differ-
ences in trees and movement?

Problem 38: The following sentence has a scope ambiguity:

(21) Some student read every book.

(21) can be given a universal-existential reading, on which each book
is read by some student, but not necessarily the same student for
different books, and also an existential-universal reading, on which
one single student read every book.

Can we use movement rules to explain the ambiguity of (21)? We


might think that there is a hidden level of grammatical representa-
tion (usually called logical form), and that two sentences with the
same surface form can have different grammatical forms.

L Meanings as Functions

As a convenient shorthand, we will put double-brackets around an expression
to indicate the meaning of that expression. So ⟦Bill⟧ is the meaning of the
word Bill (whatever that meaning is), and ⟦laughed loudly⟧ is the meaning
of the expression laughed loudly (whatever that is).

Now consider ⟦the father of Annette⟧. Here are two tempting theoretical
starting points:

1. Compositionality: The meanings of complex expressions are determined
(in some way yet to be determined) by the meanings of their component
parts.

2. Names Name: The meaning of a name is just the object to which the name
refers. Thus:

• ⟦Annette⟧ = Annette

(To be clear, this says that the meaning of the word is the person, not that
the meaning of the word is the word.)

We know that Annette’s father is Christopher, so it’s tempting to think that we
should have ⟦the father of Annette⟧ = Christopher. The question then is
what we should use as ⟦the father of⟧ in order to get a compositional story.
(Ultimately we’d like to break things down further and have separate accounts
of ⟦the⟧, ⟦father⟧, and ⟦of⟧ from which we derive ⟦the father of⟧, but we’ll
proceed in small steps.)

Whatever ⟦the father of⟧ is, it needs to take us from Annette to Christopher.
Similarly it needs to take us from Elizabeth II to George VI, and from
George VI to George V. A convenient mathematical object for capturing this
idea is that of a function.

We can think of a function as a mapping from inputs to outputs. Thus:

• f := [ Annette → Christopher
         Elizabeth II → George VI
         George VI → George V ]

Here we have specified the function f by using rows of the table to indicate
which things are mapped to which other things. If we then let ⟦the father
of⟧ = f , then we’ve assigned a meaning to the father of that will be useful
in calculating ⟦the father of Annette⟧.
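A finite function like f can be modeled directly as a lookup table. A quick sketch in Python (the entries are just the rows from the chart above; applying the function is just looking up a row):

```python
# The father-of function f as a finite lookup table (one row per entry).
the_father_of = {
    "Annette": "Christopher",
    "Elizabeth II": "George VI",
    "George VI": "George V",
}

# Applying f to an input means reading off the row for that input.
print(the_father_of["Annette"])   # -> Christopher
```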

Problem 39: Of course, the input-output table given above for f is


only a partial specification of the full function that we would want
for ⟦the father of⟧. Extend the given chart by adding five more
rows.

In the British science fiction show Red Dwarf, the character Dave
Lister is revealed in the episode Ourobouros to be his own father.
Add an appropriate row to the table for f to incorporate this fact.
Do we need yet another row to capture the fact that Dave Lister’s
father’s father is Dave Lister?

Suppose, then, that:

1. ⟦Annette⟧ = Annette

2. ⟦the father of⟧ = f

To get the meaning of the father of Annette, we want to apply the function
f to Annette. That is, we want f (Annette). We can put the point like this:

• ⟦the father of Annette⟧ = ⟦the father of⟧(⟦Annette⟧)

We can then reason as follows:

1. ⟦the father of Annette⟧ = ⟦the father of⟧(⟦Annette⟧)

2. ⟦Annette⟧ = Annette

3. ⟦the father of⟧ = f

4. So ⟦the father of Annette⟧ = ⟦the father of⟧(Annette) [substitution,
1 and 2]

5. So ⟦the father of Annette⟧ = f (Annette) [substitution, 3 and 4]

6. So ⟦the father of Annette⟧ = Christopher [from the table for f ]

Problem 40: Give an explicit step-by-step calculation, in the same
way as above, of ⟦the father of the father of Elizabeth II⟧.
We have used a strategy of calculating semantic values of complex expressions
via functional application. The general hope is that whenever we have a
complex expression A B, we can calculate ⟦A B⟧ by discovering that one of ⟦A⟧
and ⟦B⟧ is a function, and then applying that function to the other of ⟦A⟧ and
⟦B⟧. That is:

Functional Application: ⟦A B⟧ is either (i) ⟦A⟧(⟦B⟧) or (ii) ⟦B⟧(⟦A⟧).

We may sometimes make explicit the tree structure of the expressions we are
evaluating, in order to keep track of that aspect of syntax. So we might say:

• ⟦[A B]⟧ is either (i) ⟦A⟧(⟦B⟧) or (ii) ⟦B⟧(⟦A⟧).
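As a sketch of how the disjunctive rule might be mechanized, here is a Python version (the helper name apply_fa is our own; Python’s callable test stands in for “is a function”): it tries option (i), and falls back to option (ii).

```python
def apply_fa(a, b):
    """Functional Application: return a(b) if a is a function, else b(a)."""
    if callable(a):
        return a(b)          # option (i)
    if callable(b):
        return b(a)          # option (ii)
    raise TypeError("neither meaning is a function")

# A toy meaning for "the father of" (a function) and for "Annette" (an object).
the_father_of = {"Annette": "Christopher", "Elizabeth II": "George VI"}.get
annette = "Annette"

print(apply_fa(the_father_of, annette))   # -> Christopher (option (i))
print(apply_fa(annette, the_father_of))   # -> Christopher (option (ii))
```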
Problem 41: The rule we’ve given of Functional Application al-
lows for two choices for what the meaning of a complex expres-
sion A B is. How do we know which one to use? (For example,
how do we know that ⟦the father of Annette⟧ is ⟦the father
of⟧(⟦Annette⟧) rather than ⟦Annette⟧(⟦the father of⟧)?) Can
you think of English constructions that might show that we want
both options in the disjunctive definition of Functional Appli-
cation? (That’s a rather speculative question, given that we’re just
getting started on building meaning theories.)
Problem 42: Give an input-output chart for another function g that
is a reasonable meaning for the mother of. Then use all of the
pieces so far assembled to show carefully that ⟦the father of the
mother of Annette⟧ and ⟦the mother of the father of Annette⟧
can be different from one another. Are we getting reasonable mean-
ings associated with those two phrases?

M Sentential Meanings

Let’s now consider the full sentence:


• The father of Annette laughs
Or in tree structure:

• [N3 [N2 [N1 the father of] Annette] laughs]
Given what we’ve already said, we are committed to:

• ⟦N1⟧ = f
• ⟦N2⟧ = Christopher

We then know that we want to use Functional Application to combine what-
ever ⟦laughs⟧ is with ⟦N2⟧ to produce whatever ⟦N3⟧ (that is, ⟦the father of
Annette laughs⟧) is.

Let’s make a theoretical decision to have the meanings of entire sentences be
truth values: either the true (⊤) or the false (⊥). Suppose that the father of
Annette does indeed laugh. Then we need one of the following:

1. ⟦N2⟧(⟦laughs⟧) = ⊤

2. ⟦laughs⟧(⟦N2⟧) = ⊤

But ⟦N2⟧ = Christopher, and Christopher isn’t a function (he’s a person). So
the first option can’t be right. Thus we need ⟦laughs⟧ to be a function.

However, we need a very particular kind of function here. Our previous
function f mapped people to people. It was thus just a slight generalization of
standard mathematical functions mapping numbers to numbers. In this case,
we need a function mapping people to truth values. We might have a function
something like:

• h := [ Christopher → ⊤
         Harry the Hyena → ⊤
         Eeyore → ⊥ ]
We can now get a detailed semantic analysis of The father of Annette laughs:

1. ⟦the father of Annette laughs⟧ = ⟦laughs⟧(⟦the father of Annette⟧)

2. ⟦laughs⟧ = h

3. ⟦the father of Annette laughs⟧ = h(⟦the father of Annette⟧) [sub-
stitution, 1 and 2]

4. ⟦the father of Annette⟧ = ⟦the father of⟧(⟦Annette⟧)

5. ⟦the father of⟧ = f

6. ⟦Annette⟧ = Annette

7. ⟦the father of Annette⟧ = f (⟦Annette⟧) [substitution, 4 and 5]

8. ⟦the father of Annette⟧ = f (Annette) [substitution, 6 and 7]

9. ⟦the father of Annette⟧ = Christopher [from the table for f ]

10. ⟦the father of Annette laughs⟧ = h(Christopher) [substitution, 3 and
9]

11. ⟦the father of Annette laughs⟧ = ⊤ [from the table for h]
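The derivation above can be replayed mechanically. A minimal sketch in Python, with the same toy tables f and h and with True/False standing in for ⊤ and ⊥:

```python
# Toy semantic values: f interprets "the father of", h interprets "laughs".
f = {"Annette": "Christopher", "Elizabeth II": "George VI"}
h = {"Christopher": True, "Harry the Hyena": True, "Eeyore": False}

# [[the father of Annette]] = f(Annette)
father_of_annette = f["Annette"]

# [[the father of Annette laughs]] = h([[the father of Annette]])
print(h[father_of_annette])   # -> True
```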

Problem 43: Use the above derivation to give a full labelling of the
syntactic tree for The father of Annette laughs, showing what
the meaning of each node in the tree is, and indicating how that
meaning is derived via functional application from the meanings of
its child nodes.

Problem 44: Provide a full semantic analysis of the sentence:


• The king of Belgium is taller than the king of Bhutan.
In doing so, give semantic values for each of the following expres-
sions:
1. ⟦the king of⟧
2. ⟦Belgium⟧
3. ⟦Bhutan⟧
4. ⟦is taller than⟧
Then show how those semantic values combine using Functional
Application to produce a final semantic value for the whole sen-
tence. (Do the necessary empirical research! It’s good to be clear
on how much of our meaning theory is meant to be established just
by our linguistic competence, and how much builds in substantive
information about the world.)

N Some Background Mathematics of Functions

We’ve been using functions as a central concept in our approach to a theory of


meaning. It’s time, then, to take a closer look at exactly what functions are and
how they work.

Suppose A is some collection of five objects, such as:


• {Paris, London, Beijing, Kuala Lumpur, Dakar}
and B is some other collection of six objects, such as:
• {Kenya, New Zealand, Uzbekistan, Paraguay, Canada, Jordan}
A function f from A to B (which we’ll sometimes write as f : A → B) is a pairing
of inputs from A with outputs in B, such that everything in A appears as an
input (but we don’t require that everything in B appear as an output). So one
sample function f : A → B is:

• f := [ Paris → Canada
         London → New Zealand
         Beijing → Jordan
         Kuala Lumpur → Jordan
         Dakar → Canada ]

Problem 45: Specify two more functions from A to B. How many


functions from A to B are there in total? How many functions are
there from B to A? Why are the two answers not the same?

These tables we’ve been giving for functions are one way of displaying inputs
and outputs of the function. Another way we could present the same informa-
tion is via a list of input-output pairs. Thus the function f from above could
also be given as:

f = {⟨Paris, Canada⟩, ⟨London, New Zealand⟩, ⟨Beijing, Jordan⟩,
⟨Kuala Lumpur, Jordan⟩, ⟨Dakar, Canada⟩}


The table presentation and the ordered pair presentation of a function contain
exactly the same information, and it should be easy to move between the two.
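In code the two presentations really are interchangeable. A quick Python sketch (reusing the sample f above; a dict plays the table role, a set of tuples plays the ordered-pair role):

```python
# A function given as a set of input-output pairs...
pairs = {("Paris", "Canada"), ("London", "New Zealand"), ("Beijing", "Jordan"),
         ("Kuala Lumpur", "Jordan"), ("Dakar", "Canada")}

# ...becomes a table (dict), and vice versa.
table = dict(pairs)
back_to_pairs = set(table.items())

print(table["Beijing"])          # -> Jordan
print(back_to_pairs == pairs)    # -> True
```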

Problem 46: For each of the following functions, if it is in table form,


give it in ordered pair form, and if it is in ordered pair form, give it
in table form.
1. f = [ Jupiter → Chicago
         Saturn → San Diego
         Neptune → Seattle ]

2. f = {⟨Shakespeare, France⟩, ⟨Marlowe, France⟩, ⟨Webster, Bulgaria⟩}

3. f = {⟨1, 3⟩, ⟨2, 5⟩, ⟨3, 7⟩, ⟨4, 9⟩}

Problem 47: What is wrong with the following collection of ordered


pairs as the specification of a function?

f = {⟨Paris, Canada⟩, ⟨London, New Zealand⟩, ⟨Beijing, Jordan⟩,
⟨London, Jordan⟩, ⟨Dakar, Canada⟩}


Give a general rule for what a collection of ordered pairs needs to
look like in order for it to be a function. Describe the asymmetry
between first and second positions of the ordered pairs in your
general rule. Why is there that asymmetry?

Problem 48: Can a function contain an ordered pair of the form ⟨x, x⟩
for any object x? If so, give an example of an English expression
whose semantic value might plausibly be a function containing such
an ordered pair.

Problem 49: The functions we have been giving have small do-
mains, and aren’t very plausible as real semantic values for English
expressions. The function we gave earlier for the father of, for
example, had only a few people in its domain. As a result, ⟦the
father of Elizabeth I⟧ is undefined – the function that we’ve
given won’t accept Elizabeth I as an input.

We’ve been using small functions just for convenience of presen-
tation. Presumably the full semantic value for the father of is a
function with a larger domain. But we still need to decide whether
⟦the father of⟧ is a total function that is defined for any input at
all, or whether it is a partial function that is defined only for some
inputs.

Consider phrases such as the father of London, the father of
the square root of 9, and the father of the function ⟦the
father of⟧. How do we want these expressions to operate in En-
glish? What should we say about the ⟦the father of⟧ function in
order to make our formal machinery match that answer about what
we want?

Special Case: Sets and Characteristic Functions: Recall the function we used
for ⟦laughs⟧:

• h := [ Christopher → ⊤
         Harry the Hyena → ⊤
         Eeyore → ⊥ ]

Another natural way to think about the semantic value of laughs is by giving
a set: namely, the set of all and only things that laugh. Had we gone that route,
we would have said:

• ⟦laughs⟧ = {Christopher, Harry the Hyena}

But if we do things this way, we have to change our rule for combining expres-
sions to make complex expressions. We can’t say:

• ⟦the father of Annette laughs⟧ = ⟦laughs⟧(⟦the father of Annette⟧)

because after filling in the values we’re committed to, this becomes:

• ⟦the father of Annette laughs⟧ = {Christopher, Harry the Hyena}(Christopher)

But the set {Christopher, Harry the Hyena} isn’t a function, and can’t be applied
to anything (including Christopher).

Instead, we would need the special rule:

• Set Membership: When neither ⟦A⟧ nor ⟦B⟧ is a function, then
⟦[A B]⟧ = ⊤ if either ⟦A⟧ ∈ ⟦B⟧ or ⟦B⟧ ∈ ⟦A⟧; otherwise ⟦[A B]⟧ = ⊥.

Problem 50: This is a bit of a fussy point, but the qualifying clause
“when neither ⟦A⟧ nor ⟦B⟧ is a function” in the previous rule doesn’t
get things quite right. Give a case in which this clause leads to the
wrong result. Is there an easy way to fix things?
We could make everything work proceeding in this way. Sometimes we would
use the rule of Functional Application and sometimes we would use the rule of
Set Membership. We’d then have to be careful to be clear at each point about
which rule should be used. However, there is an easier solution.

Sets can always be transformed into functions. We do this by considering the
characteristic function of the set. The characteristic function is the function
that maps everything in the set to the true and everything not in the set to the
false. Thus given any set A, any function f : A → {⊤, ⊥} is a characteristic
function on A, and that characteristic function picks out some subset of A. And
given any subset of A, there is a unique characteristic function for that subset.

Some useful notation for these transformations:


1. Given a set S, χS is the characteristic function of S.
2. Given a function f that maps inputs to truth values, f↓ is the set whose
characteristic function is f . (This set is sometimes called the extension or
course of values of the function f .)
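Both directions of the transformation are easy to sketch in Python (True/False stand in for ⊤ and ⊥; the particular domain and subset here are just for illustration, and the function names are our own):

```python
def char_func(subset, domain):
    """chi_S: map each element of the domain to True iff it is in the subset."""
    return {x: (x in subset) for x in domain}

def extension(func):
    """f-down: recover the subset whose characteristic function is func."""
    return {x for x, value in func.items() if value}

domain = set(range(1, 11))
evens = {2, 4, 6, 8, 10}

chi = char_func(evens, domain)
print(chi[4], chi[5])              # -> True False
print(extension(chi) == evens)     # -> True
```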
Problem 51: Consider the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} of the integers
from 1 to 10. There is then a subset of that set containing all of the
prime numbers between 1 and 10. Give the characteristic function
of that subset.

What subset is picked out by the characteristic function:

h := [ 1 → ⊤
       2 → ⊥
       3 → ⊥
       4 → ⊤
       5 → ⊥
       6 → ⊥
       7 → ⊥
       8 → ⊥
       9 → ⊤
       10 → ⊥ ]
Problem 52: If A is a set with 10 members, how many characteristic
functions are there on A?

If it’s ever easier to work with set extensions than with functions, we can
do so – and we can then use characteristic functions to transform that work on
set extensions back into work with functions.

O Transitive Verbs and Schönfinkelization

Suppose we want to give a meaning analysis of Jones killed Smith along


the lines that we’ve been developing. Killed, unlike laughs, is a transitive
verb. As a result, giving its meaning using a function that maps each object to
a truth value – ⊤ if the object satisfies the condition given by the verb and ⊥ if
the object doesn’t satisfy the condition given by the verb – isn’t going to work.
Whatever is going on with the transitive verb killed, it’s not giving us a way
to correlate Jones with a truth value.

The natural thing to say is that killed, unlike laughs, lets us correlate two
objects, rather than a single object, with a truth value. Killed correlates the pair
of Jones and Smith with ⊤ in the same way that laughs correlates Christopher
with ⊤. We can capture this idea using two-place functions, functions that take
two inputs and produce a single output.

Problem 53: Which of the following mathematical functions are


two-place functions?
1. f (x, y) = 3x − 2y
2. f (x) = 3x − 2y
3. f (x, y, z) = 3x − 2y
4. f (x, y) = x

We can specify two-place functions using input-output tables the same way we
did with one-place functions. Thus we might say:

⟦killed⟧ = [ Jones, Smith → ⊤
             Brutus, Caesar → ⊤
             Caesar, Napoleon → ⊥
             Brutus, Brutus → ⊤ ]

This function captures the facts that Jones killed Smith, Brutus killed both
Caesar and himself, and Caesar didn’t kill Napoleon. We could also specify the
same function using ordered triples:

⟦killed⟧ = {⟨Jones, Smith, ⊤⟩, ⟨Brutus, Caesar, ⊤⟩, ⟨Caesar, Napoleon, ⊥⟩, ⟨Brutus, Brutus, ⊤⟩}

Or we could specify the same collection of ordered triples descriptively rather
than by enumeration:

⟦killed⟧ = {⟨x, y, z⟩ : either x killed y and z = ⊤, or x did not kill y
and z = ⊥}
Problem 54: Give partial specifications for reasonable functions for
the following transitive verbs:
1. admires
2. is taller than
3. became
4. recognized
5. weighs
(Try doing some of the partial specifications in input-output list
format and some in ordered triple format.)
Problem 55: Using the semantic value given above for killed,
give an appropriate function to use for was killed by. Describe
the general form of the relation between the function for killed
and the function for was killed by. Will this general form give
us a universal procedure for deriving semantic values of passive
constructions? If so, how can we explain that resemble has no
passive form?
Problem 56: Can you find an example of a transitive verb in English
that ought to have the ordered triple ⟨⊤, ⊤, ⊤⟩ in it? What about
⟨⊥, ⊥, ⊥⟩?
Problem 57: If we use two-place functions for giving meanings of
transitive verbs like killed, what should we do with ditransitive
verbs like gives that take both a direct object and an indirect object?
Give a partial specification of a potential semantic value for gives.
How should we handle the fact that ditransitive verbs can often
accept direct and indirect objects in either order:
1. Jones gave Smith the poison.
2. Jones gave the poison to Smith.
Two-place functions are tempting as semantic values for transitive verbs, but un-
fortunately the tempting idea won’t quite work. Consider again Jones killed
Smith. Plausibly this sentence has the following syntactic structure:

• [Jones [killed Smith]]

Suppose we then make the following assumptions:
• ⟦Jones⟧ = Jones

• ⟦Smith⟧ = Smith

• ⟦killed⟧ = [ Jones, Smith → ⊤
               Brutus, Caesar → ⊤
               Caesar, Napoleon → ⊥
               Brutus, Brutus → ⊤ ]

We can then start calculating ⟦Jones killed Smith⟧:

1. ⟦Jones killed Smith⟧ = ⟦killed Smith⟧(⟦Jones⟧)

2. ⟦killed Smith⟧ = ⟦killed⟧(⟦Smith⟧)

3. ⟦killed Smith⟧ = [ Jones, Smith → ⊤
                      Brutus, Caesar → ⊤
                      Caesar, Napoleon → ⊥
                      Brutus, Brutus → ⊤ ](Smith)

But now we hit a snag. We’re trying to apply the two-place function ⟦killed⟧
to the single argument Smith. And that can’t be done. Two-place functions
require two arguments, and can’t sensibly be applied to a single argument. We
can apply addition to a pair of objects, as in 3 + 5. But we can’t apply addition
to a single object, as in the absurd +3.

Problem 58: We could fix this problem by using a different syn-
tactic structure for Jones killed Smith. Sketch how the meaning
analysis would go if we assumed that the sentence had the structure:

• [Jones killed Smith] – a single node with all three expressions as sisters.


How would we need to adjust the principle of Functional Applica-
tion to make this work out?

We need a new idea. There is a general method for extracting a one-place
function from a two-place function. Consider a portion of the multiplication
function:

× = [ 1, 1 → 1
      1, 2 → 2
      1, 3 → 3
      2, 1 → 2
      2, 2 → 4
      2, 3 → 6
      3, 1 → 3
      3, 2 → 6
      3, 3 → 9 ]
From this starting point, we will build a new function that maps a single input
number to an output function that then gives the product of that number with
a further input:

new-× = [ 1 → [ 1 → 1
                2 → 2
                3 → 3 ]
          2 → [ 1 → 2
                2 → 4
                3 → 6 ]
          3 → [ 1 → 3
                2 → 6
                3 → 9 ] ]

The new-× function, for example, takes 3 as an input and produces as output
the function new-×(3) = [ 1 → 3; 2 → 6; 3 → 9 ]. That function then takes 2 as input and
produces as output 6. So whereas before we simply calculated:

• ×(3, 2) = 6

using the two-place function ×, now we calculate:

• new-×(3)(2) = 6

using a one-place function that maps the first multiplicand to another one-place
function that maps the second multiplicand to the product.

This process of converting a two-place function into a one-place function – one
mapping a single input to an output that is itself a one-place function mapping another
input to a final output – is called Schönfinkelization.

Historical Note: The name Schönfinkelization comes from the math-
ematician Moses Schönfinkel, who made extensive use of this idea
in his work. The process is also sometimes called Currying (espe-
cially in computer science), after the mathematician Haskell Curry,
who also made use of the idea. Schönfinkel precedes Curry his-
torically, although Gottlob Frege introduced the idea before either
Schönfinkel or Curry.
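The conversion can be written once and for all as a higher-order function. A sketch in Python (schonfinkel is our own name for what programming languages usually call curry):

```python
def schonfinkel(two_place):
    """Convert a two-place function into a one-place, function-returning one."""
    return lambda x: (lambda y: two_place(x, y))

def times(x, y):
    return x * y

new_times = schonfinkel(times)

print(times(3, 2))        # -> 6, the two-place way
print(new_times(3)(2))    # -> 6, the Schönfinkeled way
```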
Let’s now take a slightly expanded version of our earlier two-place killed
function:

[ Jones, Smith → ⊤
  Brutus, Caesar → ⊤
  Caesar, Napoleon → ⊥
  Brutus, Brutus → ⊤
  Brutus, Smith → ⊥ ]
We can now Schönfinkel this function to produce:

[ Smith → [ Jones → ⊤
            Brutus → ⊥ ]
  Caesar → [ Brutus → ⊤ ]
  Brutus → [ Brutus → ⊤ ]
  Napoleon → [ Caesar → ⊥ ] ]

Problem 59: In the above example, we Schönfinkeled the two-place


killed function on its second position, producing a function that
takes a killee as input and produces as output a function that takes
a killer as input. We call this right Schönfinkelization. But we can
Schönfinkel on either side. Give the left Schönfinkelization of the
killed function.

Problem 60: Give both the left and right Schönfinkelization of the
following function:

f = [ 2, 3 → 10
      2, 4 → ⊤
      2, 5 → 101
      3, 3 → ⊥
      3, 5 → Paris
      5, 3 → ⊥
      Paris, 4 → 18 ]

Problem 61: Give a partial specification of a plausible three-place


function for gives, giving at least ten input-output pairs. We can
then double Schönfinkel this function to convert it into a function
from an input to a function from an input to a function from an
input to an output. Give an input-output table specification of the
resulting doubly Schönfinkeled function. Can we derive the double
Schönfinkelization by using two steps of single Schönfinkelization?

Now we are ready to put all the pieces together and give a full semantic analysis
of Jones killed Smith. We start with:
1. ⟦Jones⟧ = Jones

2. ⟦Smith⟧ = Smith

3. ⟦killed⟧ = [ Smith → [ Jones → ⊤
                          Brutus → ⊥ ]
                Caesar → [ Brutus → ⊤ ]
                Brutus → [ Brutus → ⊤ ]
                Napoleon → [ Caesar → ⊥ ] ]

(Thus we are taking the semantic value of killed to be the right Schönfinkelization
of the natural two-place killed function.)

We then reason as follows:

1. ⟦Jones killed Smith⟧ = ⟦killed Smith⟧(⟦Jones⟧)

2. ⟦Jones⟧ = Jones

3. So ⟦Jones killed Smith⟧ = ⟦killed Smith⟧(Jones)

4. ⟦killed Smith⟧ = ⟦killed⟧(⟦Smith⟧)

5. ⟦Smith⟧ = Smith

6. So ⟦killed Smith⟧ = ⟦killed⟧(Smith)

7. ⟦killed⟧ = [ Smith → [ Jones → ⊤; Brutus → ⊥ ]
                Caesar → [ Brutus → ⊤ ]
                Brutus → [ Brutus → ⊤ ]
                Napoleon → [ Caesar → ⊥ ] ]

8. So ⟦killed Smith⟧ = [ Smith → [ Jones → ⊤; Brutus → ⊥ ]
                         Caesar → [ Brutus → ⊤ ]
                         Brutus → [ Brutus → ⊤ ]
                         Napoleon → [ Caesar → ⊥ ] ](Smith)

9. [ Smith → [ Jones → ⊤; Brutus → ⊥ ]
     Caesar → [ Brutus → ⊤ ]
     Brutus → [ Brutus → ⊤ ]
     Napoleon → [ Caesar → ⊥ ] ](Smith) = [ Jones → ⊤; Brutus → ⊥ ]

10. So ⟦killed Smith⟧ = [ Jones → ⊤; Brutus → ⊥ ]

11. So ⟦Jones killed Smith⟧ = [ Jones → ⊤; Brutus → ⊥ ](Jones)

12. [ Jones → ⊤; Brutus → ⊥ ](Jones) = ⊤

13. So ⟦Jones killed Smith⟧ = ⊤

And we obtain the correct conclusion that the sentence Jones killed
Smith (given the starting semantic values we have assumed) is true.
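The whole calculation can be replayed with nested lookup tables. A sketch in Python (True/False for ⊤/⊥; the table is the Schönfinkeled killed function from the text, with the killee as first input):

```python
# Schönfinkeled "killed": killee in first, yielding a table from killers
# to truth values.
killed = {
    "Smith":    {"Jones": True, "Brutus": False},
    "Caesar":   {"Brutus": True},
    "Brutus":   {"Brutus": True},
    "Napoleon": {"Caesar": False},
}

# [[killed Smith]] = [[killed]](Smith)
killed_smith = killed["Smith"]

# [[Jones killed Smith]] = [[killed Smith]](Jones)
print(killed_smith["Jones"])   # -> True
```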

Problem 62: Using the Schönfinkelization ideas, specify semantic


values for all of the parts of the sentence:
• Shakespeare wrote Hamlet
Then use those starting semantic values to calculate the full semantic
analysis of the sentence.

Problem 63: Give starting semantic values for the sentence:


• The King of Belgium is taller than the King of Bhutan
and then use those semantic values to calculate the final truth value
of the sentence.

Problem 64: What would happen, in our analysis of the meaning


of Jones killed Smith, if we Schönfinkeled on the left rather than
on the right? Is there any general story we can tell about when we
should Schönfinkel on the left and when we should Schönfinkel on
the right?

P The Winner and . . .

Consider the sentence:

• The winner celebrated.

with the tree:

[S [NP the winner ] [IV celebrated ] ]

We could treat this exactly like earlier cases of intransitive verb sentences if we
could assign an object to ~the winner. And it seems like ~the winner ought
to be some object or other – that's what we use definite descriptions of the form
the F for: to pick out some object.

But to have a full story, we need to figure out how to build up ~the winner
from ~the and ~winner. Let's start with winner. A starting thought is that the
role of the word winner is to split the world into two categories – the winners
and the losers (‘second place is first place loser’). If that’s right, we can then
associate winner with a set: the set of all the things that make the cut. Usain Bolt
goes in the set; Beck goes out of the set. Furthermore, this looks like a plausible
story about the semantic function of common nouns in general. Giraffe sepa-
rates the world into two categories – the giraffes and the non-giraffes. Thus we
can associate giraffe with a set: the set of all the giraffes. And so on.

Using tools developed earlier, we can then move from these sets to functions by
taking the characteristic functions. Suppose we are discussing the 100m finals
in the London olympics. Then we would have:

~winner = [ Ryan Bailey → ⊥
            Yohan Blake → ⊥
            Usain Bolt → >
            Justin Gatlin → ⊥
            Tyson Gay → ⊥
            Churandy Martina → ⊥
            Asafa Powell → ⊥
            Richard Thompson → ⊥ ]
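A set and its characteristic function are interchangeable, and a table like the one above can be generated mechanically from the set. A sketch (the Python dict below plays the role of the input-output table):

```python
# Build ~winner as the characteristic function of the winner set,
# over the domain of the eight 100m finalists.
finalists = ["Ryan Bailey", "Yohan Blake", "Usain Bolt", "Justin Gatlin",
             "Tyson Gay", "Churandy Martina", "Asafa Powell",
             "Richard Thompson"]
winners = {"Usain Bolt"}

# The characteristic function maps x to True exactly when x is in the set.
winner = {x: (x in winners) for x in finalists}

print(winner["Usain Bolt"])  # → True
print(winner["Tyson Gay"])   # → False
```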
Problem 65: Give functional semantic values for the common nouns
country, prime number, and noun. (You can either give partial
specifications using input-output tables, or full specifications by
descriptive identification of a set of ordered pairs.)
Now we need a story about ~the. We have two constraints:
1. We want ~the winner to be an object.
2. We’ve decided to make ~winner be a function from objects to truth
values.
Given these constraints, we know what kind of thing ~the needs to be:
• ~the is a function that (i) takes as input a function from objects to truth
values, and (ii) produces as output an object. That is, ~the is a function
from functions from objects to truth values to objects.
Problem 66: Which of the following is a function from functions
from objects to truth values to objects?

1. [ 1 → >
     2 → ⊥ ]

2. [ 1 → [ > → 2, ⊥ → 1 ]
     2 → [ > → 1, ⊥ → 2 ] ]

3. [ [ > → 2, ⊥ → 1 ] → >
     [ > → 1, ⊥ → 2 ] → ⊥ ]

4. [ [ 1 → ⊥, 2 → ⊥ ] → 2
     [ 1 → >, 2 → > ] → 2 ]

Give another example of a function from functions from objects to
truth values to objects.

Problem 67: Suppose we have five objects. How many functions
from objects to truth values are there? Given that, how many func-
tions from functions from objects to truth values to objects are there?
How many functions from functions from truth values to objects to
functions from objects to truth values are there?

Knowing that ~the is a function from functions from objects to truth values
to objects is only half the battle, though. We also need to know which function
from functions from objects to truth values to objects it is. As we’ve seen in the
previous problem, there are a lot of such functions, so we need to make sure
we pick the right one.

We know that ~the winner (in our London olympics scenario) is Usain Bolt.
And we also know that ~the winner = ~the(~winner). Thus we know that
whatever function from functions from objects to truth values to objects ~the
is, it must be a function such that:

• ~the( [ Ryan Bailey → ⊥
          Yohan Blake → ⊥
          Usain Bolt → >
          Justin Gatlin → ⊥
          Tyson Gay → ⊥
          Churandy Martina → ⊥
          Asafa Powell → ⊥
          Richard Thompson → ⊥ ] ) = Usain Bolt

It’s then not hard to make a specific proposal. Usain Bolt is the winner because
he is the only one who wins – he, and he alone, is mapped to > by ~winner.
Thus we have:

• Given any function f from objects to truth values, ~the is the function
that maps f to the unique object o such that f (o) = >.

Problem 68: Give a function f such that ~the(f) = Barack Obama.
What is:

• ~the( [ Immanuel Kant → ⊥
          Baruch Spinoza → ⊥
          Rene Descartes → ⊥
          Bertrand Russell → > ] )

Given this definition of ~the, there are two problem cases we need to deal
with:
1. Consider the 1969 Best Actress Academy awards. There were five nom-
inees for the award – Katherine Hepburn, Patricia Neal, Vanessa Red-
grave, Barbara Streisand, and Joanne Woodward. The award was then
given jointly to Katherine Hepburn and Barbara Streisand. So for this
situation, it looks like we should have:

• ~winner = [ Katherine Hepburn → >
              Patricia Neal → ⊥
              Vanessa Redgrave → ⊥
              Barbara Streisand → >
              Joanne Woodward → ⊥ ]

But what happens when we apply ~the to ~winner? There is no unique
object which is mapped by ~winner to >.
2. In 1904, the New York Giants were scheduled to play the Boston Amer-
icans in the World Series, but the series was never played, because the
Giants refused to participate. In this situation, we have:

• ~winner = [ New York Giants → ⊥
              Boston Americans → ⊥ ]

What then happens when we apply ~the to ~winner? No object is
mapped to > by ~winner, so in particular no unique object is mapped to
> by ~winner.
In both of these cases, we will say that ~the(~winner) is undefined. The
~the function only produces an output when applied to a function f
from objects to truth values which maps one and only one object to >. If f maps
two or more objects to >, or maps no objects at all to >, then ~the doesn't
produce any output when applied to f .

Thus in these circumstances, ~the winner doesn’t get a semantic value, and
is meaningless. We’ll return later to some further examination of what to say
about the role of this sort of meaningless expression in our theory of meaning.
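The partiality of ~the can be modeled directly. In this sketch (an illustration, not the text's official machinery), characteristic functions are dicts and undefinedness is represented by returning None:

```python
def the(f):
    # ~the: takes the characteristic function of a set (a dict from
    # objects to truth values) and returns the unique object mapped
    # to True; if zero or several objects are mapped to True, ~the
    # is undefined, modeled here by returning None.
    hits = [x for x, v in f.items() if v]
    return hits[0] if len(hits) == 1 else None

# 100m finals: exactly one object mapped to True, so ~the is defined.
print(the({"Usain Bolt": True, "Tyson Gay": False}))

# 1969 Best Actress: two objects mapped to True — undefined.
print(the({"Katherine Hepburn": True, "Barbara Streisand": True}))  # → None

# 1904 World Series: no object mapped to True — again undefined.
print(the({"New York Giants": False, "Boston Americans": False}))   # → None
```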

Problem 69: Given this choice for ~the, many expressions using
the will be meaningless. There are many books, so ~book maps
many objects to >. Thus ~the book is undefined. How could we
fix this problem? Is there a better choice of function for ~the?

Problem 70: There ought to be some semantic connection between
winner and win. Give a sample semantic value for win treating it in
the usual way as a transitive verb. Suppose we then treat the -er
nominalizing ending as a separate expression, so that:

• winner = [ win -er ]

Given that ~win is a function from objects to functions from objects
to truth values and ~winner is a function from objects to truth
values, what kind of function should ~-er be? Can you suggest a
particular function of that type that is suitable for ~-er?

Problem 71: Picking up where the previous problem left off, sup-
pose we think that the winner isn't meant to occur by itself, but
only in conjunction with a subsequent of phrase, as in:

• the winner of the 2016 Olympic 100 meter dash
• the winner of the 1969 Best Actress Academy award

On this view, the string:

• The winner celebrated

isn't really a complete sentence of English, and is used as elliptical
for some completion such as:

• The winner of the 2016 Olympic 100 meter dash celebrated.
• The winner of the 1969 Best Actress Academy award celebrated.

Now consider two possible syntactic structures for the interaction
between -er and of X in the phrase the winner of X:

1. Option 1: [ the [ [ win [ of X ] ] -er ] ]

2. Option 2: [ the [ [ win -er ] [ of X ] ] ]

Pick one of these two syntactic structures and, using it, make pro-
posals for both ~-er and ~of that will produce reasonable results
for expressions like the winner of the 2016 Olympic 100 meter
dash.

Q . . . The Bald Man

Given the theory we adopted in the previous section for the, a definite descrip-
tion of the form the F fails to have a semantic value if there is more than one
F (or if there are no Fs). Because almost any common noun will apply to more
than one object, that theory has problems with simple definite descriptions of
the form:

the N
for a common noun N.

But we can get better results if we consider definite descriptions built out of
adjectivally modified nouns, such as the bald man. Suppose that the bald
man has the following syntactic structure:

[NP [D the ] [N′ [A bald ] [N′ [N man ] ] ] ]

Small Details: Notice that here we have a non-branching node in
our tree: the lower N′ has the N node as its only child. Let's as-
sume that at non-branching nodes, semantic values are just directly
passed up. Thus the semantic value of the N′ node is exactly the
same as the semantic value of the N node: namely, ~man.

In fact, we’ve really been needing, and implicitly making use of, that
principle all along. Strictly speaking, our semantic value function
~. assigns semantic values to nodes in a syntactic tree. So when we
talk about ~Jones killed Smith, we really more properly mean:

~[S [NP [NAME Jones ] ] [VP [V killed ] [NP [NAME Smith ] ] ] ]
So it’s the S node that really gets the semantic value. But that’s
annoying to write out, so we usually just discuss the semantic value
of a particular node by listing out the words dominated by that node.

But that convention obscures the difference among:

1. ~Smith

2. ~[NAME Smith ]

3. ~[NP [NAME Smith ] ]

because all three of these nodes dominate the same word Smith.
(Note that the first item on the list is just the Smith node itself.)

We’ve been just as happy to obscure that difference, because (a)
it’s annoying to keep track of, and (b) it doesn’t actually matter for
anything we’re doing – we want all three of those nodes to have the
same semantic value. But to make that happen, we need the rule
that any non-branching node simply inherits the semantic value of
its child.
To calculate the semantic value of the bald man using the tree:

[NP [D the ] [N′ [A bald ] [N′ [N man ] ] ] ]
we need a semantic value for bald. There are two important constraints on
~bald:
1. ~bald needs to combine with ~man by functional application. Since
~man is a function from objects to truth values, ~bald needs to be one
of:
(a) An object, so that it can be an input to ~man.
(b) A function that takes as input a function from objects to truth values,
so that ~man can be the input to ~bald.
2. When ~bald and ~man combine, they need to produce as output some-
thing that can combine with ~the by functional application. Since ~the
is a function from functions from objects to truth values to objects, the
combination of ~bald and ~man needs to be one of:
(a) A function from objects to truth values, so that it can be an input to
~the.
(b) A function that takes as input a function from functions from objects
to truth values to objects, so that ~the can be the input to the
combination of ~bald and ~man.
Looking over these choices, if we go with option (1a) and make ~bald an
object, then ~bald man will be a truth value, but a truth value doesn't meet
either of the conditions (2a) or (2b). So we can't use (1a). That means we need
to go with (1b), so ~bald must be a function that takes as input a function
from objects to truth values.

That leaves the question of what the output of the ~bald function should be.
Given (2a) and (2b), we have two choices:

1. The output of ~bald is a function from objects to truth values. Then
~bald man will be a function from objects to truth values. That function
can then serve as the input to ~the, producing an object as output. Thus
~the bald man will be an object, which will combine nicely with an
intransitive verb like laughs, since ~laughs is a function from objects to
truth values, so ~the bald man laughs will be a truth value, as desired.
So on this option, ~bald is a function from functions from objects to truth
values to functions from objects to truth values.
2. The output of ~bald is a function from functions from functions from
objects to truth values to objects to something. In this case, ~bald man will
be a function from functions from functions from objects to truth values
to objects to something. That function will take as input ~the’s function
from functions from objects to truth values to objects, and produce as
output something. Whatever that something is, it will be the semantic
value of the bald man. Given that ~the bald man needs to combine
with ~laughs, something needs to be either an object or a function from
functions from objects to truth values to truth values. The result is two
choices for ~bald:
(a) A function from functions from objects to truth values to functions
from functions from functions from objects to truth values to objects
to objects.
(b) A function from functions from objects to truth values to functions
from functions from functions from objects to truth values to objects
to functions from functions from objects to truth values to truth
values.

Problem 72: Give an example of each of the following, in input-
output table form:
1. A function from functions from functions from objects to truth
values to objects to objects.
2. A function from functions from objects to truth values to func-
tions from functions from functions from objects to truth values
to objects to objects.
3. A function from functions from objects to truth values to func-
tions from functions from functions from objects to truth values
to objects to functions from functions from objects to truth val-
ues to truth values.

We’re almost there. A function from functions from objects to truth values to functions
from functions from functions from objects to truth values to objects to functions from
functions from objects to truth values to truth values sounds like pretty desperate
territory – it would be nice to avoid quite that much complexity if we can. So
let’s see if we can make the first option work, and have ~bald be a function
from functions from objects to truth values to functions from objects to truth
values.

Problem 73: Assuming that ~bald is a function from functions
from objects to truth values to functions from objects to truth values,
label each node in the following tree with the kind of thing that is
the semantic value of that node:

[S [NP [D the ] [N′ [A bald ] [N′ [N king ] ] ] ]
   [VP [TV met ] [NP [D the ] [N′ [A bald ] [N′ [N prince ] ] ] ] ] ]

The remaining question is: which function from functions from objects to truth
values to functions from objects to truth values is ~bald?

Problem 74: Suppose there are ten objects. How many functions
from functions from objects to truth values to functions from objects
to truth values are there?

We can make this question easier by recalling that a function from objects to
truth values can be thought of as another way to present a set of objects. (The
function from objects to truth values is the characteristic function of a set.) So if
~bald is a function from functions from objects to truth values to functions
from objects to truth values, ~bald can also be thought of as a function from
one set to another set.

Put in that way, the problem is much easier. Consider ~bald man. ~bald needs
to map one set to another set. The input set, of course, is the set of men. (~man
is the characteristic function of that set.) And ~bald man should be the set of
bald men. So how does ~bald take us from the set of men to the set of bald men?

Here’s an obvious suggestion. bald, like man, is associated with a set of objects.
man is associated with the set of men, and bald is associated with the set of bald
things. Let’s use small capitals to name these associated sets, so that bald is
the set of bald things. Then ~bald is, roughly, the function that maps an input
set (such as the set man) to the intersection of that set with bald.

More precisely (now we translate back from the talk of sets to talk of character-
istic functions of sets), we have:
• ~bald is the unique function f such that for any function g that is the
characteristic function of some set G, f (g) is the characteristic function of
the set G ∩ bald.
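That definition translates directly into code. In the sketch below the particular sets are invented for illustration; ~bald maps the characteristic function of G to the characteristic function of G ∩ bald:

```python
BALD = {"Benedict", "Egbert"}  # hypothetical extension of BALD

def bald(g):
    # ~bald: maps the characteristic function g of a set G to the
    # characteristic function of the intersection of G with BALD.
    return {x: (v and x in BALD) for x, v in g.items()}

# ~man as the characteristic function of a (hypothetical) set of men.
man = {"Albert": True, "Benedict": True, "Charles": True}

# ~bald man: True exactly for the bald men.
print(bald(man))  # → {'Albert': False, 'Benedict': True, 'Charles': False}
```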

Problem 75: Suppose we have the following sets:

1. man = {Albert, Benedict, Charles, David, Egbert, Frank, George,
Herbert, Iago, Jacob}
2. bald = {Benedict, eagle}
3. bird = {arctic tern, bufflehead, chicken, dodo, eagle}

Use these sets to specify each of the following semantic values:

1. ~man
2. ~bird
3. ~bald man
4. ~bald bird
5. ~the bald man
6. ~the bald bird
7. ~bald

Problem 76: Give semantic values for the words angry, hungry, and
tiger such that:

1. the angry hungry tiger has a defined semantic value (thus,
picks out a specific object)
2. the angry tiger does not have a defined semantic value.
3. the hungry tiger does not have a defined semantic value.

Give the details of calculating ~the angry hungry tiger. Does
~the angry hungry tiger = ~the hungry angry tiger? Is your
answer to that question dependent on the particular semantic values
you have chosen?

Problem 77: Suppose we treat large the same way we treated bald
above. We identify a set large, and then we define ~large to be
the function that maps the characteristic function of any set G to the
characteristic function of G ∩ large.

Explain why it would be a consequence of that treatment that
if ~large mouse maps Mickey to >, then ~large mammal maps
Mickey to >. Is that a desirable or undesirable result?

Give a plausible alternative treatment of both ~large and ~small
such that ~large mouse can map Mickey to > while ~small mammal
also maps Mickey to >.

Problem 78: Explain why the method we used for giving the se-
mantic value of bald won’t work for the adjective alleged. Give a
new proposal for specifying ~alleged and show that your proposal
produces a reasonable interpretation of the alleged murderer.

R Functions and Type Theory

The functions we have been considering as semantic values for adjectives have
been getting rather complicated. "Functions from functions from objects to
truth values to functions from objects to truth values" takes a while to say, and
actually seeing clearly what kind of function is meant by that phrase can require
drawing a careful diagram. And things get even worse if we need to consider
“functions from functions from objects to truth values to functions from func-
tions from functions from objects to truth values to objects to functions from
functions from objects to truth values to truth values.”

We’ll now introduce a more convenient and compact method for describing
functions. We start with things that are not functions. The non-functions we’ve
been making use of fall into two categories:
1. Truth values: > and ⊥ are not functions, but do get used as semantic
values (in particular, as semantic values of whole sentences).
2. Ordinary objects: Paris and London, Socrates and Aristotle, and the
many other things that make up the world are not functions, but do get
used as semantic values (in particular, as semantic values of names and
of definite descriptions).
We will use e as a name for the collection of ordinary objects, and t for the
collection of truth values. We call e and t, as well as the other collections we’ll
go on to define, types.

Types e and t can then be used to build names for various types of functions.
For example:
1. (e, t) names the collection of functions from ordinary objects (members of
e) to truth values (members of t).

2. (e, e) names the collection of functions from ordinary objects to ordinary
objects.
3. (t, t) names the collection of functions from truth values to truth values.
4. ((e, t), e) names the collection of functions from functions from ordinary
objects to truth values (members of (e, t)) to ordinary objects.
There is a general pattern that lies behind these names for different types.
Type Formation: Given any types α and β, we use (α, β) as a name
for the collection of functions that take as input things in α and
produce as output things in β.

(α, β) is thus the type of functions from α to β.

In the previous section we settled on treating ~bald as a function from func-
tions from objects to truth values to functions from objects to truth values.
Functions from objects to truth values are type (e, t), so functions from func-
tions from objects to truth values to functions from objects to truth values are
type ((e, t), (e, t)).
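The type-formation rule suggests a compact encoding: let a type be either a basic label ('e' or 't') or a pair (α, β). Functional application then reduces to a one-line check. (A sketch for illustration, not notation from the text.)

```python
def apply_type(fun, arg):
    # If fun has type (α, β) and arg has type α, applying fun to arg
    # yields something of type β; otherwise the types don't combine.
    if isinstance(fun, tuple) and fun[0] == arg:
        return fun[1]
    return None

e, t = "e", "t"
print(apply_type((e, t), e))            # → t
print(apply_type(((e, t), e), (e, t)))  # → e
print(apply_type((e, t), t))            # → None
```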

Problem 79: In the previous section we briefly considered two more
complicated kinds of semantic value for bald:

1. A function from functions from objects to truth values to func-
tions from functions from functions from objects to truth values
to objects to objects.
2. A function from functions from objects to truth values to func-
tions from functions from functions from objects to truth values
to objects to functions from functions from objects to truth val-
ues to truth values.

Express each of these in the type theory notation.

If two nodes in a phrase structure tree are going to combine successfully using
functional application, one of them needs to be of some type α and the other
needs to be of type (α, β) for that same α and some β. Thus consider the following
tree:

•
├─ •
│  ├─ t
│  └─ •
│     ├─ e
│     └─ (e, (t, t))
└─ •
   ├─ •
   │  ├─ e
   │  └─ (e, ((e, t), (e, t)))
   └─ •
      ├─ (t, e)
      └─ •
         ├─ e
         └─ (e, ((t, e), (e, t)))

We can start working our way up from the bottom of the tree using functional
application. At the bottom level, we note that:
1. (e, (t, t)) applies to e to produce (t, t).
2. (e, ((e, t), (e, t))) applies to e to produce ((e, t), (e, t)).
3. (e, ((t, e), (e, t))) applies to e to produce ((t, e), (e, t)).
Adding those derived node types to the tree, we have:

•
├─ •
│  ├─ t
│  └─ (t, t)
│     ├─ e
│     └─ (e, (t, t))
└─ •
   ├─ ((e, t), (e, t))
   │  ├─ e
   │  └─ (e, ((e, t), (e, t)))
   └─ •
      ├─ (t, e)
      └─ ((t, e), (e, t))
         ├─ e
         └─ (e, ((t, e), (e, t)))

Continuing up the tree, we see that (t, t) can combine with t to produce t, and
((t, e), (e, t)) can combine with (t, e) to produce (e, t):

•
├─ t
│  ├─ t
│  └─ (t, t)
│     ├─ e
│     └─ (e, (t, t))
└─ •
   ├─ ((e, t), (e, t))
   │  ├─ e
   │  └─ (e, ((e, t), (e, t)))
   └─ (e, t)
      ├─ (t, e)
      └─ ((t, e), (e, t))
         ├─ e
         └─ (e, ((t, e), (e, t)))

Next, ((e, t), (e, t)) can combine with (e, t) to produce (e, t):

•
├─ t
│  ├─ t
│  └─ (t, t)
│     ├─ e
│     └─ (e, (t, t))
└─ (e, t)
   ├─ ((e, t), (e, t))
   │  ├─ e
   │  └─ (e, ((e, t), (e, t)))
   └─ (e, t)
      ├─ (t, e)
      └─ ((t, e), (e, t))
         ├─ e
         └─ (e, ((t, e), (e, t)))

But now we encounter a problem. The two children of the root node, we
have discovered, are of types t and (e, t). But those two types won't combine by
functional application, so we can't calculate a value for the root node. Therefore,
no meaningful sentence can be built with that tree structure out of words whose
semantic values are of the types we started with on the leaves of the tree.
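The bottom-up typing procedure just illustrated can be automated. In this sketch (an illustration under the encoding of types as strings and pairs), a leaf is a type and an internal node is a two-element list; the checker tries functional application in both directions and reports None where, as at the root above, no type can be calculated:

```python
def combine(a, b):
    # Try functional application in either direction.
    for fun, arg in ((a, b), (b, a)):
        if isinstance(fun, tuple) and fun[0] == arg:
            return fun[1]
    return None

def type_of(tree):
    # Leaves are types (strings or tuples); internal nodes are
    # two-element lists of subtrees.
    if not isinstance(tree, list):
        return tree
    left, right = type_of(tree[0]), type_of(tree[1])
    if left is None or right is None:
        return None
    return combine(left, right)

e, t = "e", "t"
# The tree from the text: the left subtree works out to t, the right
# to (e, t), and those cannot combine, so the root gets no type.
tree = [[t, [e, (e, (t, t))]],
        [[e, (e, ((e, t), (e, t)))],
         [(t, e), [e, (e, ((t, e), (e, t)))]]]]
print(type_of(tree[0]))  # → t
print(type_of(tree))     # → None
```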

Problem 80: Consider the following tree:

• • • •

Find ways to assign types to the leaves of this tree to satisfy each of
the following conditions:

1. The type of the root node is t.
2. The type of the root node is e.
3. The root node, and only the root node, cannot have a type
calculated using functional application.

Problem 81: Suppose there are three objects a, b, and c in type e. For
each of the following types, determine how many members there
are of the type, and give an example of a member of the type.
1. (e, t)
2. (t, t)
3. (e, e)
4. (e, (e, t))
5. ((e, t), t)
6. ((e, t), (e, t))
7. ((e, t), ((e, t), t))

Problem 82: In each of the following trees, fill in the types on all
nodes that don’t already have types specified. In some cases you
will need to reason both up and down the tree to determine all of
the nodes.

1.

((e, t), e) (e, (e, t))


((e, t), (e, t)) (e, t) ((e, t), e) (e, t)

2. t

e
e e

3.

t (t, t)

(e, t)
e
(t, ((t, t), (t, t))) (e, (t, t))
(e, t)
[Note: There is more than one way to complete the type la-
belling in this tree. Find the simplest completion.]

Three observations about the interaction between the type theory and the syn-
tax:
1. Certain grammatical categories look like they are systematically linked to
particular types. For example:
(a) Expressions of grammatical category NAME are of type e.
(b) Expressions of grammatical category S are of type t.
(c) Expressions of grammatical category N are of type (e, t).
(d) Expressions of grammatical category IV are of type (e, t).
(e) Expressions of grammatical category TV are of type (e, (e, t)).
(f) Expressions of grammatical category A are of type ((e, t), (e, t)).

So far it’s just a speculative generalization that these grammatical category
- type associations will hold up robustly. We might at any time encounter
tricky cases that require us to assign (for example) specific common nouns
a type other than (e, t).
2. The correlation between types and grammatical categories then suggests
that we might be able to explain the grammatical rules in terms of the se-
mantic values. Why does the grammar allow a name and an intransitive
verb to be combined:

• [S [NP [NAME Aristotle ] ] [IV laughed ] ]

but not allow a name and another name to be combined:

• [S [NAME Aristotle ] [NAME Socrates ] ]

Attempting to join Aristotle and Socrates as the two children of a
node requires calculating the semantic value of the parent node by func-
tional application of one of ~Aristotle and ~Socrates to the other. Since
~Aristotle and ~Socrates are both of type e, neither is a function.
Thus functional application is impossible in either direction, and we get
a semantic explanation for why Aristotle Socrates isn't a grammati-
cally permitted combination. On the other hand, if we attempt to join
Aristotle and laughed as the two children of a node, everything works
out. ~laughed is of type (e, t), and an item in type (e, t) can take an item
in type e, such as ~Aristotle, as input.
We get a similar explanation of the grammatical difference between tran-
sitive and intransitive verbs. Why can we not grammatically form:

• [S [NP [NAME Aristotle ] ] [TV admired ] ]

The transitive verb admired is of type (e, (e, t)). In this case, that semantic
value can combine with the type e of ~Aristotle, but the result is of
type (e, t). Since we are trying to use Aristotle admired as an entire
sentence, and since entire sentences are of type t, again the typing does not
work out, and we get an explanation for the inability to make Aristotle
admired a sentence.
3. Nevertheless, there are substantial obstacles to deriving the grammatical
properties of the language directly from the types of the semantic values
of expressions. Two difficulties:
(a) Intransitive verbs like laughs and common nouns like man are both
being treated as of type (e, t). But of course intransitive verbs
and common nouns aren’t of the same grammatical category – they
clearly are not intersubstitutable:

i. Aristotle laughs//#Aristotle man.
ii. The man laughs//#The snores laughs.

We can’t use the type of the semantic values to determine the gram-
matical structure if we ever allow there to be two expressions of the
same semantic type but in different grammatical categories.
(b) Some words in English can have semantic values of more than one
type. For example, many transitive verbs can also be used intransi-
tively:
i. Jones ate breakfast earlier today.//Jones ate earlier today.
ii. Smith spilled the wine.//The wine spilled.
But since transitive verbs are of type (e, (e, t)) and intransitive verbs
are of type (e, t), we won't be able to assign a single type to these
verbs that fully explains their range of grammatical uses.
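The type-clash explanations in point 2 can be replayed mechanically, using the same encoding of types as strings and pairs (a sketch for illustration; combine tries functional application both ways):

```python
def combine(a, b):
    # Functional application in either direction, or None on failure.
    for fun, arg in ((a, b), (b, a)):
        if isinstance(fun, tuple) and fun[0] == arg:
            return fun[1]
    return None

e, t = "e", "t"
# Aristotle laughed: e and (e, t) combine to give t — a sentence.
print(combine(e, (e, t)))       # → t
# Aristotle Socrates: two type-e values, neither is a function.
print(combine(e, e))            # → None
# Aristotle admired: the types combine, but the result is (e, t),
# not the sentence type t.
print(combine(e, (e, (e, t))))  # → ('e', 't')
```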

S Typing Trees

Once we have worked out the right type for the semantic values of some words,
we can often use the type theory to work out the type of the semantic values of
other words. Consider some examples:

1. Consider an adverb such as loudly. We then take the sentence Aristotle
laughed loudly, or:

•
├─ Aristotle
└─ •
   ├─ laughed
   └─ loudly

We're already committed to:

• ~Aristotle is of type e.
• ~laughed is of type (e, t).

Incorporating that type information in the tree, we have:

•
├─ e: Aristotle
└─ •
   ├─ (e, t): laughed
   └─ loudly

Also, as usual, we want the semantic value of the entire sentence to be t:

t
├─ e: Aristotle
└─ •
   ├─ (e, t): laughed
   └─ loudly

But if ~Aristotle is of type e, the only way to make the whole sentence
be of type t is to have the semantic value of laughed loudly be of type
(e, t):

t
├─ e: Aristotle
└─ (e, t)
   ├─ (e, t): laughed
   └─ loudly

We can then see that ~loudly must be of type ((e, t), (e, t)), so that it can
take as input the type (e, t) semantic value of laughed and produce as
output the desired type (e, t) for laughed loudly.

Problem 83: Give a syntactic tree for the sentence Aristotle
insulted Plato loudly. Then label the leaves of the tree with
the appropriate types for the semantic values of the words,
including the assumption just reached that ~loudly is of type
((e, t), (e, t)). Then check to see if the types all combine properly
in this case in which loudly is used together with a transitive
verb, rather than with an intransitive verb.

Problem 84: Propose a particular function of type ((e, t), (e,
t)) to use as the semantic value of loudly. Does your choice
of function have the consequence that if Aristotle laughed
loudly is true and Aristotle snored is true, then Aristotle
snored loudly must also be true?

2. Consider the preposition beneath, as used in the sentence Aristotle
kicked the man beneath the house. Assume that sentence has the
tree:

•
├─ Aristotle
└─ •
   ├─ kicked
   └─ •
      ├─ the
      └─ •
         ├─ man
         └─ •
            ├─ beneath
            └─ •
               ├─ the
               └─ house

We've already built into our theory commitments about the type of each
of these words other than beneath:

• ~Aristotle is of type e.
• ~kicked is of type (e, (e, t)).
• ~the is of type ((e, t), e).
• ~man is of type (e, t).
• ~house is of type (e, t).

In addition, of course, the entire sentence is of type t. Incorporating all of
that information, we have:

t
├─ e: Aristotle
└─ •
   ├─ (e, (e, t)): kicked
   └─ •
      ├─ ((e, t), e): the
      └─ •
         ├─ (e, t): man
         └─ •
            ├─ beneath
            └─ •
               ├─ ((e, t), e): the
               └─ (e, t): house

Now we can reason as follows:


(a) ~kicked the man beneath the house needs to be of type (e, t) in
order to combine with type e ~Aristotle to produce t for the whole
sentence.
(b) Therefore ~the man beneath the house needs to be of type e in
order to combine with type (e, (e, t)) ~kicked in order to produce
that type (e, t).

Problem 85: Actually, there is a second more complicated
choice for the type of ~the man beneath the house. What
is the other choice?

(c) Therefore ~man beneath the house needs to be of type (e, t) in
order to combine with type ((e, t), e) ~the to produce that type e.

Problem 86: Again there is another choice for the type of
~man beneath the house. Give the other type that would
work.

Suppose that we had assigned ~the man beneath the house
the more complicated type we had considered in Problem
85. What would then be our options for the type of ~man
beneath the house?
(d) ~beneath the house needs to combine with the type (e, t) ~man to
produce something of type (e, t). That means we can’t use ~beneath
the house as the input to ~man, but rather must use ~man as the
input to ~beneath the house. Thus ~beneath the house must
be of type ((e, t), (e, t)).
(e) That brings us to ~beneath. We can see that ~the house is of
type e, so ~beneath needs to take something of type e as input and
produce something of type ((e, t), (e, t)) as output. Thus ~beneath
is of type (e, ((e, t), (e, t))).

We can now finish labelling the tree:

t
├─ e: Aristotle
└─ (e, t)
   ├─ (e, (e, t)): kicked
   └─ e
      ├─ ((e, t), e): the
      └─ (e, t)
         ├─ (e, t): man
         └─ ((e, t), (e, t))
            ├─ (e, ((e, t), (e, t))): beneath
            └─ e
               ├─ ((e, t), e): the
               └─ (e, t): house

and all of the typing works out.
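The finished labelling can be double-checked with the same sort of type calculator sketched earlier (self-contained here, with types encoded as strings and pairs; the leaves carry the types just assigned):

```python
def combine(a, b):
    # Functional application in either direction, or None on failure.
    for fun, arg in ((a, b), (b, a)):
        if isinstance(fun, tuple) and fun[0] == arg:
            return fun[1]
    return None

def type_of(tree):
    # Leaves are types; internal nodes are two-element lists.
    if not isinstance(tree, list):
        return tree
    left, right = type_of(tree[0]), type_of(tree[1])
    if left is None or right is None:
        return None
    return combine(left, right)

e, t = "e", "t"
the_, man, house = ((e, t), e), (e, t), (e, t)
kicked = (e, (e, t))
beneath = (e, ((e, t), (e, t)))

# Aristotle kicked the man beneath the house, with beneath the house
# modifying man inside the definite description.
sentence = [e, [kicked, [the_, [man, [beneath, [the_, house]]]]]]
print(type_of(sentence))  # → t
```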

Problem 87: Along the way we considered more complicated


typing options for ~the man beneath the house and ~man
beneath the house. Pick one of those options and finish fill-
ing out the typing for the whole tree given that choice. What
ends up being the type of ~beneath? How many choices are
there for typing everything given our starting commitments?

Problem 88: Suppose we had started with a different syntactic


analysis of Aristotle kicked the man beneath the house:

[ Aristotle [ [ kicked [ the man ] ] [ beneath [ the house ] ] ] ]
What effect would this different syntactic tree have on the final
type assignment to ~beneath? Is there any semantic reason to
prefer one syntactic tree to the other?

The general strategy, then, is to work out semantic value types for new words
by seeing how those semantic values are constrained by the types of other
semantic values we’ve already built into the theory. As we’ve seen, in some
cases this won’t be enough to fix semantic types uniquely, but it’s often enough
to narrow things down to a few choices.

Problem 89: Suppose we give the syntax for conjunctive sentences


in the following way:

[S [S Aristotle laughed] [ and [S Plato cried] ] ]
We give this tree rather than the more obvious:

[S [S Aristotle laughed] and [S Plato cried] ]

in order to keep our trees binary branching.

What type should ~and have to make the typing work out for
the first tree? Describe the particular function from that type that
should be used for ~and. It then looks like ~or should be of the
same semantic type as ~and. What function should be used for
~or?

Problem 90: And can be used to join expressions other than entire
sentences:

• Aristotle and Plato laughed.


• Aristotle laughed and cried.

Give binary branching syntactic trees for each of those two sentences
along the lines used in the previous problem. What should then be
the type of ~and in Aristotle and Plato laughed? What should
be the type of ~and in Aristotle laughed and cried? Is there
any prospect of giving a single type to ~and that will work for all
uses of and?

Once you have determined types for these two uses of and, what
particular functions within those types should be used for ~and?

Problem 91:
Negation would be easy to incorporate into our semantic theory if
it always occurred in a position c-commanding an entire sentence:

[S not [S Aristotle laughed] ]
Unfortunately, this isn’t how negation typically works in English.
(Except, perhaps, for the rather artificial it is not the case that.)
Suppose we instead have:

[ Aristotle [ did [ not laugh ] ] ]
How can we assign types to ~not and ~did to make the typing
of this sentence work out? (You might consider making ~did an
identity function that simply passes the input value of its sibling
node up to the parent.) What particular function should be assigned
to ~not?

Problem 92: Consider the sentence:

[ [ the [ [ very tall ] man ] ] laughed ]
How should we assign semantic type to ~very to make the typing
of that sentence work out?

Does your choice of type for ~very work for:


• The very very tall man laughed.
Does it matter whether we give that sentence the tree structure:

[ [ the [ [ [ very very ] tall ] man ] ] laughed ]
or:

[ [ the [ [ very [ very tall ] ] man ] ] laughed ]

T Naming Functions and Lambda Notation

We’ve seen reason earlier to think that adjectives are typically of type ((e, t), (e,
t)). But different adjectives will, of course, pick out different functions within
that type. For example:
1. ~bald is the function that maps any (e, t) function f to the characteristic
function of the intersection of the set bald with the set of objects that f
maps to >.
2. ~large is the function that maps any (e, t) function f to the function that
(i) maps any object o to > if the size of o is substantially greater than the
average size of objects that f maps to > and (ii) otherwise maps o to ⊥.
Complex descriptions of functions such as these bring out that we could use a
compact and clear terminology for naming specific functions, just as the type
theory gives us a compact and clear terminology for naming categories of func-
tions.

The notation of the lambda calculus provides a standard way for naming
functions. Consider first how functions are named in mathematics. In the
simplest cases, we write things like:
• f (x) = x2

• g(x) = sin(x + 1)

• h(x) = 3x + 1 if x ≤ 0, and x3 + 5 if x > 0

• j(x) = 1
• k(x, y) = xy + x + y + 1
In each case we specify a function by doing two things:
1. We indicate what the input to the function is. That’s the role of the
variables that accompany the function name. Thus f – or as we sometimes
say, f (x) – is a function that takes x as an input, and k(x, y) is a function
that takes both x and y as inputs.
2. We specify the output of the function in terms of the input to the function.
That’s the role of the expression to the right of the ‘=’ sign. Thus f is a
function that for any input x produces x2 – that is, the square of the input
– as output. And j(x) is a function that, given any input, produces the
number 1 as output.
This common mathematical practice provides an easy way to name simple
functions, but it becomes more cumbersome with more complicated functions.

Consider, for example, what happens if we start talking about derivatives of
functions. d/dx is a function that takes a function as input and produces a
function as output. Thus if f (x) = x2 and g(x) = 2x, then d/dx( f ) = g, since
the derivative of x2 (with respect to x) is 2x. But finding the right way to write
down the behavior of the derivative function is tricky. Compare:
• d/dx(x2 ) = 2x
• m(x) = 2x
The occurrences of 2x on the right of the identity in these two claims are playing
different roles. When we say that m(x) = 2x, we are specifying a linear function
m, and the role of 2x is then to pick out a number that is the output of m for any
given input number x. But when we say that d/dx(x2 ) = 2x, the role of 2x isn’t
to pick out a number, but rather to pick out a function (namely, the function
m(x) = 2x) that is the output of the differentiation.

Our simple way of naming functions doesn’t give us any good tools for distin-
guishing between these two roles that 2x can be playing. The lambda notation
will correct that.

Let’s start with a simple case. In the lambda notation, we name the function
f (x) = x2 by writing:
• λx.x2

The expression λx.x2 has two parts, one before the period and one after the
period.
1. The part before the period (sometimes called the lambda abstract) indi-
cates that the expression names a function, and also specifies what the
input variable to the function is. In λx.x2 , then, the input variable is x.
But in λy.y2 , the input variable is y.
2. The part after the period specifies the output of the function. For any
particular value of the input variable, the output of the function is the
semantic value of the part of the lambda term after the period given that
value of the variable. Thus the output of λx.x2 for the input x = 3 is 9,
because x2 is 9 when x = 3.
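Python’s lambda expressions follow this notation almost exactly, which makes them a convenient scratchpad for checking examples (the names below are ours, for illustration only):

```python
# Lambda notation in Python: 'lambda x:' plays the role of the lambda
# abstract, and the body after ':' specifies the output.
square = lambda x: x ** 2   # the function named by λx.x²
print(square(3))  # 9

# Renaming the bound variable, as in λy.y², names the same function.
square_y = lambda y: y ** 2
print(square_y(3))  # 9
```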

Problem 93: Give lambda term names for each of the following
functions:
1. f (x) = x2 + 2x + 1
2. g(x) = 0
3. h(x) = e2x+1

Problem 94: Give standard mathematical names for each of the


following functions named using lambda terms:
1. λx.x
2. λx.(x + 1)/(x − 1)
3. λz.z2z
4. λy.log_√y (y + 1)

U From Mathematical Functions to Semantic Functions

We have been focusing on functions specified in mathematical terms, but we


don’t have to limit ourselves to this. Aristotle is a name for an object, so:

• λx.Aristotle
is a function specified with a lambda term. λx.Aristotle is the constant function
that maps every input to Aristotle.

Problem 95: What is the type of λx.Aristotle? Is λx.Aristotle a suit-


able semantic value for Aristotle? Why or why not? If not, is
there another expression for which λx.Aristotle is a suitable seman-
tic value?

Or consider the lambda term:

• λx.(the father of x)
This is the function that takes an object x as input and outputs the father of x.
Thus we can have:
• ~the father of = λx.(the father of x)
Departing a bit more from standard mathematical practice, we can have:
• λx.(x laughs)
For any given object o, this function maps o to the semantic value of x laughs
when x is assigned to o. Thus when the input to the function is Aristotle, the
output is ~Aristotle laughs, which is then > if Aristotle does indeed laugh.

Because λx.(the father of x) maps an input object x to an output object, the father
of x, it is a function of type (e, e). On the other hand, λx.(x laughs) maps an
input object x to the truth value of the claim that x laughs. It is thus a function
of type (e, t). To make it easier to see what types our lambda terms have, it is
sometimes convenient to mark types of inputs and outputs. Thus we write:
• λxe .(the father of x)e
• λxe .(x laughs)t
Because:
• In both cases, the input variable is of type e, since the input is an arbitrary
object.
• the father of x is of type e.
• x laughs is of type t.
In general, a lambda term of the form:
• λxα .Eβ
names a function of type (α, β) for any types α and β.

One special case of this is that inputs for lambda terms don’t have to be of type
e. For example, we can have:
• λxt .(¬x)t
This is the function that takes a truth value (something from type t) as input
and produces an output also of type t. It’s thus a function of type (t, t). In
particular, we plausibly have:
• ~not = λxt .(¬x)t
Problem 96: How many functions of type (t, t) are there? Give
lambda terms that pick out each of those functions.

V Lambda Notation and Higher-Typed Functions

The real strength of the lambda notation is in its ability to name more compli-
cated functions of higher-order types – functions that take functions as inputs
or produce functions as outputs (or both).

Consider addition. There is a function of adding 1, which we can write in


standard mathematical notation:

• f (x) = x + 1

or in lambda notation:
• λx.x + 1

There is also the function of adding 2, which we can write in the lambda notation
as:
• λx.x + 2
In fact, for any number n, there is the function of adding n:

• λx.x + n
That means that there is a function that maps each number n to the function
λx.x + n. We can name that function with a lambda term as follows:
• λy.λx.x + y

To see more clearly what is going on with this lambda term, let’s add type
specifications. Both x and y are numbers to be added. They are thus both of
type e:
• λye .λxe .xe + ye

The inner lambda expression λxe .xe + ye is thus a function taking a type e as
input (namely, the variable xe ) and producing a type e as output (namely, the
sum xe + ye ). That means that it is a type (e, e) function. We can also make that
explicit in our full term:
• λye .(λxe .xe + ye )(e, e)

Finally, we see that the full lambda term names a function that takes a type e
as input (namely, the variable ye ) and produces a type (e, e) as output (namely,
the function (λxe .xe + ye )(e, e) ). That means it is a type (e, (e, e)) function:

• (λye .(λxe .xe + ye )(e, e) )(e, (e, e))

(Of course, we will rarely want to mark types as thoroughly as we have done
here. But if we get confused about how things are working, we can always fall

back on thorough type marking.)

λy.λx.x + y thus names a function from objects to functions from objects to


objects. (It is, in particular, the Schönfinkelization of the two-place addition
function.) Because the lambda calculus gives us straightforward names for
functions, it makes it easy to produce names for complicated higher-order
functions – we just use lambda terms for the lower-level output functions in
constructing the lambda term for the higher-order function.
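The Schönfinkelized addition function can be sketched as a curried Python function (`add` is our illustrative name for λy.λx.x + y):

```python
# λy.λx.x + y: a function from numbers to functions from numbers to numbers.
add = lambda y: lambda x: x + y
add3 = add(3)      # the output for input 3, i.e. λx.x + 3
print(add3(4))     # 7
print(add(10)(5))  # 15
```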

First Example: Consider the function that takes as input any function f from
numbers to numbers and produces as output a function g from numbers to
numbers whose value for any input is twice f ’s value. (So our function maps
f (x) = x to g(x) = 2x, and maps f (x) = x2 − 3 to g(x) = 2x2 − 6, and so on).

Since this function takes an (e, e) function as input, its lambda term needs to begin:
• λx(e, e) .
Following the period, we need another expression of type (e, e) to name the
output function. To create an expression of type (e, e), we make another lambda
term of the form:
• λye .Ee
using some expression E of type e.

The expression E of type e then needs to give the output of the output “doubled”
function for the input y. So for any given value of y, the value of E for that
value of y needs to be twice the value of the input function for y. The input
function is given by the variable x(e, e) . The value of that function for the input
y is then given by x(y) – the functional application of the x function to the y
input. We want twice that value, so we want 2x(y). Thus the lambda term for
the desired function is:
• λx.λy.2x(y)
Or, to make all of the typing explicit:
• (λx(e, e) .(λye .(2x(e, e) (ye ))e )(e, e) )((e, e), (e, e))
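Assuming Python functions as stand-ins for (e, e) functions, the doubling function can be checked concretely (names ours):

```python
# λx.λy.2·x(y): maps an input function to the function whose outputs
# are twice the input function's outputs.
double = lambda f: lambda y: 2 * f(y)
g = double(lambda x: x ** 2 - 3)   # maps x**2 - 3 to 2*x**2 - 6
print(g(2))   # 2 * (4 - 3) = 2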

Second Example: Consider the function that takes as input a number n and
produces as output a function that maps any function f to the function f + n
(that is, the function each of whose outputs is n greater than f ’s output for the
same input). This function has an input of type e and produces an output of
type ((e, e), (e, e)).

A lambda term for this function thus begins with λxe . The output of the
function needs to be of type ((e, e), (e, e)), so we need a lambda term of the
form:

• λxe .E((e, e), (e, e))

Now we just need to build the appropriate term E((e, e), (e, e)) . To build that
term, we want to start with an input of type (e, e) and then produce an output
that is also of type (e, e). Thus we start with λy(e, e) , and follow it with a term
of type (e, e):
• λy(e, e) .F(e, e)

That leaves us with the task of building the right term F(e, e) . That term will
have the form:
• λze .Ge
To get clear on what Ge should be, let’s think carefully about the various
functions involved.
1. There is the final function we are trying to build, the one that maps an
input number to a function that maps functions to functions shifted by
that number. This function will be named by the overall lambda term
λxe .E((e, e), (e, e)) .

2. There is the function that is the output of this function. This is the function
that shifts input functions by some specific amount. This function will be
named by the lambda term λy(e, e) .F(e, e) (which is then identical to the
expression E in the larger lambda term).
3. There is the input to that specific shifter function. This function is picked
out by the variable y(e, e) .

4. And there is the output shifted function. This function is named by the
lambda term λze .Ge (which is then identical to the expression F in the
previous lambda term).
G, then, gives the output of the shifted function for some specific input z. To
get the output of the shifted function, we need two things:
1. The output of the pre-shifted function for the input z. The pre-shifted
function is given by the variable y, so the output of that function for the
input z is y(z).
2. The shifting amount. That amount is given by the variable x.
Thus G needs to be y(z) + x. Assembling the pieces, we have:
• λx.λy.λz.(y(z) + x)
or, with explicit typing:
• (λxe .(λy(e, e) .((λze .((y(e, e) (ze ))e +xe )e )(e, e) ))((e, e), (e, e)) )(e, ((e, e), (e, e)))
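The assembled term can be checked concretely; `shift` below is our name for λx.λy.λz.(y(z) + x):

```python
# λx.λy.λz.(y(z) + x): maps a number x to the "shift by x" operator
# on functions from numbers to numbers.
shift = lambda x: lambda y: lambda z: y(z) + x
f_plus_5 = shift(5)(lambda z: z * z)   # the function z*z + 5
print(f_plus_5(3))  # 14
```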

Problem 97: Give lambda terms for each of the following functions:
1. The function that takes as input a function f from numbers to
numbers and outputs a function g from numbers to numbers
whose value for any input n is the same as f ’s output for 2n.
2. The function that takes as input a function f from truth val-
ues to truth values and outputs the function that results from
applying the function f twice to any input truth value.
3. The function that takes as inputs two functions f and g and
produces as output the function f − g. (Hint: because this
function is described as taking two inputs, you’ll need to apply
Schönfinkelization to specify it with a lambda expression.)

Problem 98: Determine the type of the variable x in each of the


following lambda terms. (In some cases there may be more than
one possible type for x.)
1. λx.λye .(x(y))e
2. λx.λy(e,t) .(x(y))t
3. λx.λy((e, t), e) .y(x)
4. λx.x(λy.y + 1)
5. λy((t, t), t) .y((λz.z)(λx.λwe .x(w)))

Problem 99: Explain what is wrong with a lambda term of the


form λx.x(x). Is there any natural language expression for which
we might want to use a lambda term like this?

W More Complicated Semantic Values in Lambda Notation

We’ve already seen how to use lambda terms to specify semantic values such
as:
1. ~laughs = λx.x laughs
2. ~the father of = λx.the father of x
Using higher-order functions, we can give semantic values for more expres-
sions.

Consider ~kicks. Kick is a transitive verb, so we want its semantic value to


be the Schönfinkelization of the two-place function that maps (x, y) to > just in
case x kicks y. Thus we want ~kicks to be of type (e, (e, t)). Here are two
obvious candidates:

1. λx.λy.x kicks y
2. λx.λy.y kicks x

Problem 100: Give explicit type marking for each of those two
lambda terms to confirm that they both name functions of type (e,
(e, t)). What type should the variables x and y be?

But which candidate is correct? Let’s consider the sentence Socrates kicked
Aristotle and work through the details with both candidates. We start by
assuming:

• ~Socrates = Socrates
• ~Aristotle = Aristotle
• {⟨x, y⟩ : x kicked y} = {⟨Socrates, Aristotle⟩, ⟨Socrates, Plato⟩, ⟨Plato, Aristotle⟩}
Now consider the function picked out by each of our lambda term options:

1. λx.λy.x kicks y: This is the function that, given any input x, produces as
output the function that maps any input y to > if x kicked y and to ⊥ if x
didn’t kick y. Given our starting collection of kicking ordered pairs, that’s
the following function:
•  [ Socrates  → [ Aristotle → >,  Plato → > ]
     Aristotle → [ Socrates → ⊥,  Plato → ⊥ ]
     Plato     → [ Aristotle → >,  Socrates → ⊥ ] ]

2. λx.λy.y kicks x: This is the function that, given any input x, produces as
output the function that maps any input y to > if y kicked x and to ⊥ if y
didn’t kick x. Given our starting collection of kicking ordered pairs, that’s
the following function:
•  [ Socrates  → [ Aristotle → ⊥,  Plato → ⊥ ]
     Aristotle → [ Socrates → >,  Plato → > ]
     Plato     → [ Aristotle → ⊥,  Socrates → > ] ]

Problem 101: Notice that the two functions just given are perfect
opposites of each other: the second function has ⊥ as an output
everywhere the first function has >, and > as an output everywhere
the first function has ⊥. Is that an inevitable feature of the functions
picked out by λx.λy.x kicks y and λx.λy.y kicks x? If so, why? If
not, give an alternative collection of kicking ordered pairs for which
the two functions are not perfect opposites.

We can now work through the semantic analysis twice, once with each candidate, for:

[ Socrates [ kicked Aristotle ] ]

1. Using λx.λy.x kicks y: First:


• ~kicked Aristotle = ~kicked(~Aristotle)

= (λx.λy.x kicks y)(Aristotle)

= [ Socrates  → [ Aristotle → >,  Plato → > ]
    Aristotle → [ Socrates → ⊥,  Plato → ⊥ ]
    Plato     → [ Aristotle → >,  Socrates → ⊥ ] ] (Aristotle)

= [ Socrates → ⊥,  Plato → ⊥ ]

And then:
• ~Socrates kicked Aristotle = ~kicked Aristotle(~Socrates)
= [ Socrates → ⊥,  Plato → ⊥ ] (Socrates)
= ⊥
So we end up concluding incorrectly that Socrates kicked Aristotle
is false.
2. Using λx.λy.y kicks x: First:

• ~kicked Aristotle = ~kicked(~Aristotle)

= (λx.λy.y kicks x)(Aristotle)

= [ Socrates  → [ Aristotle → ⊥,  Plato → ⊥ ]
    Aristotle → [ Socrates → >,  Plato → > ]
    Plato     → [ Aristotle → ⊥,  Socrates → > ] ] (Aristotle)

= [ Socrates → >,  Plato → > ]

And then:
• ~Socrates kicked Aristotle = ~kicked Aristotle(~Socrates)
= [ Socrates → >,  Plato → > ] (Socrates)
= >

So we end up concluding correctly that Socrates kicked Aristotle is


true.

Thus λx.λy.y kicks x is the right semantic value to use for ~kicked. This makes
sense if we think through the steps of functional application. λx.λy.y kicks x
takes x as its first input and y as its second input. Because it uses y kicks x,
it thus puts the first input in the kicked position and the second input in the
kicker position. But the structure of the tree for Socrates kicked Aristotle
guarantees that kicked will functionally combine with the kicked first (Aristotle)
and then with the kicker second (Socrates). So that’s the order we want the
variables in the lambda term to have.
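The whole comparison can be replayed in Python, representing the kicking facts as a set of (kicker, kicked) pairs (a toy model with our own names):

```python
# Toy kicking facts, as (kicker, kicked) pairs.
kicked_pairs = {("Socrates", "Aristotle"), ("Socrates", "Plato"),
                ("Plato", "Aristotle")}

kicks_wrong = lambda x: lambda y: (x, y) in kicked_pairs  # λx.λy.x kicks y
kicks_right = lambda x: lambda y: (y, x) in kicked_pairs  # λx.λy.y kicks x

# The tree feeds kicked the object first (Aristotle), then the subject:
print(kicks_wrong("Aristotle")("Socrates"))  # False -- the wrong verdict
print(kicks_right("Aristotle")("Socrates"))  # True -- Socrates kicked Aristotle
```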

Problem 102: Give appropriate lambda terms for the semantic val-
ues of each of the following:
1. kills
2. was killed by
3. resembles
4. resembles Superman
5. gives
6. gives the book
7. gives to Socrates

A few more examples drawn from cases we considered earlier:

1. The: ~the is of type ((e, t), e). Its semantic value thus needs to be a
lambda term of the form:
• λx(e, t) .Ee

so that we have a function that takes as input something in type (e, t) –


which will then be the value of the variable x – and produces as output
the value of E for that value of x, which will be of type e.
All that remains is to give the right choice of E. Consider:
• λx(e, t) .(the unique object y such that x(y) = >)

Now suppose, from before, that:

~winner = [ Ryan Bailey → ⊥
            Yohan Blake → ⊥
            Usain Bolt → >
            Justin Gatlin → ⊥
            Tyson Gay → ⊥
            Churandy Martina → ⊥
            Asafa Powell → ⊥
            Richard Thompson → ⊥ ]

Then ~the winner is ~the(~winner), which is:

(λx(e, t) .(the unique object y such that x(y) = >))(~winner)

The function ~winner serves as input to the lambda term, setting the value of
x. The only value for y for which x(y) is > is thus Usain Bolt, so ~the winner
is Usain Bolt, as desired.
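A Python sketch of this semantic value (names ours; since the real function quantifies over all of type e, we pass a finite domain explicitly):

```python
def the(f, domain):
    """~the: return the unique object in `domain` that f maps to True.
    Raising on non-uniqueness is a stand-in for presupposition failure."""
    candidates = [y for y in domain if f(y)]
    if len(candidates) != 1:
        raise ValueError("no unique object satisfying the description")
    return candidates[0]

runners = ["Yohan Blake", "Usain Bolt", "Justin Gatlin"]
winner = lambda y: y == "Usain Bolt"   # the characteristic function ~winner
print(the(winner, runners))  # Usain Bolt
```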

2. Bald: ~bald is of type ((e, t), (e, t)). Recall that the intent is that bald,
when combined with some common noun N, produce as semantic value
the intersection of bald with the set of things satisfying N. (More care-
fully, ~bald N is the characteristic function of the intersection of bald
with the set that ~N is the characteristic function of.) The lambda term
for ~bald should thus be of the general form:

• λx(e, t) .E(e, t)

The variable x will then be filled by the input (e, t) semantic value, which
will be provided by the common noun that bald modifies. To make E an
expression of type (e, t), we further assume that it has the form:

• λye .Ft

Since Ft is of type t, it should take the form of a sentence – a sentence that


is true if y is a bald x and false if y is not a bald x. Thus we have:

• Ft = x(y) ∧ y ∈ bald

The first conjunct, x(y), requires that y satisfy the common noun (because
the semantic value x of the common noun maps y to >). The second
conjunct requires that y also be in the set of bald things.
Putting the pieces together, we have:

• ~bald = λx(e, t) .λye .(x(y) ∧ y ∈ bald)
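A quick check of this semantic value with hypothetical facts (the names and sets below are ours, not the text’s):

```python
# Hypothetical facts: who is a man, who is bald.
man = lambda y: y in {"Socrates", "Plato"}         # ~man, type (e, t)
bald_set = {"Socrates"}                            # the set bald
bald = lambda f: lambda y: f(y) and y in bald_set  # λx(e,t).λy.(x(y) ∧ y ∈ bald)

bald_man = bald(man)   # ~bald man, type (e, t)
print(bald_man("Socrates"), bald_man("Plato"))  # True False
```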

Problem 103: Test the semantic value just given for bald by
building a case with four people: Albert, Beatrice, Charles, and
Dorothy. Give a function to use as the semantic value of man and
a set to use as bald. Then calculate ~bald. Use that function to
derive ~the bald man, and see if the final result is plausible.
Problem 104: Suppose we want to treat ~bald as the character-
istic function of bald (and hence of type (e, t)). We thus suggest
that bald man in fact has the following structure:

[ [ bald INTERSECT ] man ]

where INTERSECT is a covert term that transforms the (e, t)
bald into a semantic value that can appropriately interact with
~man by functional application. What type is ~INTERSECT?
Give an appropriate lambda term for ~INTERSECT.
Problem 105: Red is sometimes used to describe the color of the
exterior of an object, as in:

• red apple (interior is white)


• red car (interior is black)

In other cases, red is used to describe the color of the interior


of an object, as in:

• red grapefruit (exterior is orange)


• red marker (exterior is white)

Suppose, then, that we have two different adjectives in English:


redExt and redInt . Give lambda terms for suitable semantic
values for each of these two adjectives.

Alternatively, suppose (along the lines of the previous problem)


that there is a single adjective red, but that there are also two
covert terms Ext and Int that can combine with red to produce
phrases equivalent to redExt and redInt . Give lambda terms
for ~Ext and ~Int.
Problem 106: We’ve noted earlier that some adjectives like
large, tall, and good have non-intersective effects: to be a
large mouse is not just to be large and a mouse. Rather, we
want something like:

• To be a large mouse is to be a mouse and large for a mouse.


• To be a tall building is to be a building and tall for a building.
• To be a good assassin is to be an assassin and good for an
assassin.

Give suitable lambda terms along these lines for ~large, ~tall,
and ~good. One of these three terms seems to function seman-
tically a bit differently from the other two. Which one? Does
this difference call for a difference in the lambda term?

We might also, along the lines of the covert INTERSECT term


discussed above, suggest that large mouse really has the struc-
ture:

[ [ large RELATIVE ] mouse ]
Propose a suitable semantic value for RELATIVE.

Can we use RELATIVE to make sense of adjectival modifications


such as:

• former president
• fake diamond
• alleged criminal

If not, suggest a suitable lambda term for at least one of these


adjectives.

3. Not: ~not is simple if we make the simplifying assumption that not


modifies an entire sentence. In that case, ~not = λx.¬x. But what if we
want a more realistic syntax on which not occurs in positions such as
Aristotle does not laugh, or:

[ Aristotle [ does [ not laugh ] ] ]

First, here is a suggested typing for this structure:

t
├── e: Aristotle
└── (e, t)
    ├── ((e, t), (e, t)): does
    └── (e, t)
        ├── ((e, t), (e, t)): not
        └── (e, t): laugh

For simplicity, let’s assume that ~does is an identity function that will
simply pass the (e, t) value of ~not laugh up to the next node. Thus
~does is λx(e, t) .x. Then ~not needs to be a function that will map
~laugh to a new (e, t) function – in particular, a function that maps an

object to > if ~laugh maps it to ⊥, and that maps an object to ⊥ if ~laugh
maps it to >.

We thus need:

• ~not = λx(e, t) .λy.¬x(y)
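The does/not pipeline can be verified with a toy model (names ours; the facts are hypothetical):

```python
# Toy facts: only Plato laughs.
laugh = lambda y: y in {"Plato"}       # ~laugh, type (e, t)
does = lambda f: f                     # ~does, the identity λx(e,t).x
not_ = lambda f: lambda y: not f(y)    # ~not = λx(e,t).λy.¬x(y)

# Evaluating [Aristotle [does [not laugh]]]:
value = does(not_(laugh))("Aristotle")
print(value)  # True: Aristotle does not laugh
```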

Problem 107: Suppose adverbs like quickly attach directly to verbs,


as in:

[ Bolt [ ran quickly ] ]
Then ~quickly needs to be of type ((e, t), (e, t)). What is wrong
with the following lambda term for ~quickly:

• λx(e, t) .x quickly

What about:

• λx(e, t) .λye .x(y) quickly

(In both cases there are precise technical problems for the proposed
lambda terms.) Try to give a lambda term for ~quickly that does
work properly. (This isn’t an easy task.) Does your proposal for
~quickly also work when quickly is combined with a transitive
verb, as in:

• Aristotle wrote the book quickly

X Calculating Semantic Values With Lambda Terms

Lambda terms are useful for giving clear and succinct specifications of com-
plicated functions. They are also useful as a tool for calculating the values of
functions. Consider some simple examples:

1. (λx.x2 )(3) = 32 = 9
2. (λy.3y − 5)(7) = 3 · 7 − 5 = 21 − 5 = 16

3. (λz.z3 − 4z + 1)(0) = 03 − 4 · 0 + 1 = 1

In each case we use the following procedure:

1. Remove the initial λ+variable portion of the lambda term (the lambda
abstract).
2. Replace the variable with the value that is being used as input to the
function.
3. Simplify the resulting expression to determine what number it names.

Now consider a slightly more complicated examples:

• (λx.λy.x + y)(3)(4) = (λy.3 + y)(4) = 3 + 4 = 7

This calculation uses the procedure for simple lambda terms, but uses it twice.
We start with the complex lambda term λx.λy.x + y. We apply this function to
the input 3. We thus remove the outermost lambda abstract – the λx portion of
the term – and replace the corresponding variable with the input 3. The result
is λy.3 + y. Note that we don’t replace the variable y with 3, because we are
evaluating the λx portion of the term, and so are only replacing the variable x
with the input.

We’ve thus learned that (λx.λy.x + y)(3) = λy.3 + y. Put into words, that is:
the function that maps any first number to the function that maps any second
number to the sum of the first and second numbers, when applied to the input
3, produces as output the function that maps any number to the sum of 3 and
that number.

We then apply that function to the input 4. That is, we evaluate (λy.3 + y)(4).
Again we follow our procedure. We remove the lambda abstract – this time, the
y abstract λy. We replace the corresponding variable – namely, y – with the
input 4. That gives us 3 + 4. Then we use some arithmetic to simplify 3+4 to 7.
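Both the one-step and the curried evaluations can be mirrored directly in Python (names ours):

```python
# Simple applications: (λx.x²)(3) and (λy.3y − 5)(7).
print((lambda x: x ** 2)(3))     # 9
print((lambda y: 3 * y - 5)(7))  # 16

# The curried case (λx.λy.x + y)(3)(4): the outer application fixes x only.
f = lambda x: lambda y: x + y
g = f(3)       # λy.3 + y
print(g(4))    # 7
```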

Problem 108: Give careful step-by-step calculations of each of the


following:

1. (λx.2x )(5)
2. (λy.y3 − y)(10)
3. (λx.λy.x − y)(3)(5)
4. (λx.λy.x − y)(5)(3)

5. (λx.3x + 5)((λy.y2 )(4))
6. (λx.λy.y/x)((λz.z3 − 6)(2))((λw.4w+1 )(2))
7. (λx.x2 )((λy.5y + 1)((λz.z/2)(−2)))

We can use the same calculation method for lambda terms that aren’t just
straightforward mathematical functions. Consider:
1. (λx.x laughs)(Socrates). Following our procedure, we remove the lambda
abstract and replace the variable x with the input to the function, which is
Socrates. The result is:

• Socrates laughs
But what does it mean for this to be the output? λx.x laughs should
name an (e, t) function, so it should take an entity (Socrates) as input and
produce a truth value as output. When we say:
• (λx.x laughs)(Socrates) = Socrates laughs

we thus mean that the output of λx.x laughs for the input Socrates is the
truth value named by the sentence Socrates laughs. If Socrates does in
fact laugh, then that truth value is >, so (λx.x laughs)(Socrates) = >.
2. Now let’s work in detail through a full sentence. Consider the sentence
Aristotle admires Socrates:

[ Aristotle [ admires Socrates ] ]

Suppose we have the following semantic values for the individual words:
(a) ~Aristotle = Aristotle
(b) ~admires = λx.λy.y admires x
(c) ~Socrates = Socrates
We can place these semantic values on the leaves of the tree:

•
├── Aristotle: Aristotle
└── •
    ├── admires: λx.λy.y admires x
    └── Socrates: Socrates

We know that the higher nodes are determined by functional application,
so we can do an initial completion of the tree:

(λx.λy.y admires x)(Socrates)(Aristotle)
├── Aristotle: Aristotle
└── (λx.λy.y admires x)(Socrates)
    ├── admires: λx.λy.y admires x
    └── Socrates: Socrates

Now we just need to calculate the values of those higher node lambda
terms:
(a) (λx.λy.y admires x)(Socrates) = λy.y admires Socrates. (As usual,
we remove the initial lambda abstract λx and replace the variable x
with the input to the function, which is Socrates.)
(b) (λx.λy.y admires x)(Socrates)(Aristotle), by the previous, is equal to
(λy.y admires Socrates)(Aristotle). And (λy.y admires Socrates)(Aristotle)
= Aristotle admires Socrates. (Again, we remove the lambda ab-
stract, in this case λy, and replace the variable y with the input to
the function, which is Aristotle.)
Thus the tree for Aristotle admires Socrates, fully evaluated, is:

Aristotle admires Socrates
├── Aristotle: Aristotle
└── λy.y admires Socrates
    ├── admires: λx.λy.y admires x
    └── Socrates: Socrates

As usual, this can look uninformative, since it seems to tell us that the final
semantic value of the sentence Aristotle admires Socrates is Aristotle
admires Socrates. But the labelling of the top node with “Aristotle admires
Socrates” is in fact a labelling of the top node with the truth value named
by “Aristotle admires Socrates”, and so is either > or ⊥ depending on
Aristotle’s particular pattern of admiration. (Presumably >.)
Applying Lambda Terms to Lambda Terms: We have been considering simple
cases so far, in which the inputs to lambda terms are themselves simple objects.
Thus we’ve been considering only lambda terms of type (e, α) for some type

α. But of course this doesn’t exhaust the range of lambda terms. We can in the
same way calculate the values of lambda terms of type (t, t), for example:
• (λxt .¬x)(⊥) = ¬⊥ = >
The more complicated cases occur when a lambda term is used as an input to
another lambda term, as in:
• (λx(e, e) .λye .x(y) + 1)(λze .2z − 3)

In this lambda term, the (e, e) function λz.2z − 3 serves as input to the ((e, e),
(e, e)) function λx.λy.x(y) + 1.

But in fact we can use the same procedure in these more complicated cases. We
remove the lambda abstract, and then replace occurrences of the corresponding
variable with the input to the function. The only difference is that the input to
the function is itself a function, named by a lambda term of its own, rather than
just an object.

So to evaluate (λx(e, e) .λye .x(y) + 1)(λze .2z − 3), we:

1. Remove the lambda abstract, which in this case is λx. That leaves us with
λy.x(y) + 1.
2. Replace the corresponding variable – in this case x – with the input to the
function, which in this case is λz.2z − 3:
λy.(λz.2z − 3)(y) + 1

λy.(λz.2z − 3)(y) + 1 can then be further simplified by calculating out the interior
piece (λz.2z − 3)(y). This simplifies to 2y − 3 in the usual way. Thus λy.(λz.2z −
3)(y) + 1 simplifies to λy.2y − 3 + 1 or λy.2y − 2. The final upshot, then, is
that our starting (λx.λy.x(y) + 1)(λz.2z − 3) simplifies to a function that maps
any input to twice that input minus 2. That’s because λx.λy.x(y) + 1 is itself a
function that maps an input function to an output function whose values are
always one more than the values of the input function, and that higher-order
function is then applied to the (e, e) function that maps each input to twice the
input minus 3.
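The simplification can also be checked mechanically. Here is a minimal Python rendering of the example (illustrative only), confirming that the composed function behaves like λy.2y − 2:

```python
# (λx.λy.x(y) + 1)(λz.2z − 3), rendered with Python lambdas.
add_one_after = lambda x: lambda y: x(y) + 1   # the ((e, e), (e, e)) function
double_minus_three = lambda z: 2 * z - 3       # the (e, e) input function

h = add_one_after(double_minus_three)

# h should agree with the simplified λy.2y − 2 on every test input:
for y in range(-5, 6):
    assert h(y) == 2 * y - 2

print(h(5))  # → 8
```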

Problem 109: Evaluate each of the following lambda terms:
1. (λx.x(y))(λz.z²)
2. (λx.x(x(y)))(λz.z²)(3)
3. (λx.λy.y(x(λz.z²)))(λw.dw/dx)

4. (λx.λy.λz.(x(y) − x(z)))(λw.w³ + 4)(3)(5)
5. λy.((λx.(x(y)/x(x(y))))(λz.z + 1))((λu.u((λv.v²)(3)))(λt.2t))

Y Three More Complicated Linguistic Examples

We now work through a few more linguistic examples that also involve apply-
ing lambda terms to lambda terms.
1. Consider the sentence:
• The bald man killed Belicoff.
We start with semantic values for the individual words:
(a) ~the = λx.(the unique object y such that x(y) = >)
(b) ~bald = λx.λy.(x(y) ∧ y ∈ bald)
(c) ~man =λx.x is a man
(d) ~killed = λx.λy.y killed x
(e) ~Belicoff = Belicoff
The tree for the sentence with the initial lexical meanings is:

•
├── •
│   ├── the: λx.(the unique object y such that x(y) = >)
│   └── •
│       ├── bald: λx.λy.(x(y) ∧ y ∈ bald)
│       └── man: λx.x is a man
└── •
    ├── killed: λx.λy.y killed x
    └── Belicoff: Belicoff
Now we start calculating our way up the tree by using functional appli-
cation.
(a) (λx.λy.y killed x)(Belicoff) = λy.y killed Belicoff
(b) (λx.λy.(x(y) ∧ y ∈ bald))(λx.x is a man) = λy.((λx.x is a man)(y) ∧
y ∈ bald)
i. And then: (λx.x is a man)(y) = y is a man
ii. So the above simplifies to λy.(y is a man ∧ y ∈ bald)

Adding these results to the tree we have:

(The tree as before, with λy.(y is a man ∧ y ∈ bald) now at the bald man
node and λy.y killed Belicoff at the killed Belicoff node.)
Next we calculate:
• (λx.(the unique object y such that x(y) = >))(λy.(y is a man ∧ y ∈ bald))
• = (λx.(the unique object y such that x(y) = >))(λz.(z is a man ∧ z ∈ bald))
[to avoid variable clash]
• = the unique object y such that (λz.(z is a man ∧ z ∈ bald))(y) = >
• = the unique object y such that (y is a man and y ∈ bald) = >
We add this semantic value to the tree:

(The tree as before, with the value just computed, the unique object y such
that (y is a man and y ∈ bald) = >, now at the node for the bald man.)

Finally, we calculate the top node:


• (λy.y killed Belicoff)(the unique object y such that (y is a man and
y ∈ bald) = >)
• = the unique object y such that (y is a man and y ∈ bald) = > killed
Belicoff
We now have a fully decorated tree:

the unique object y such that (y is a man and y ∈ bald) = > killed Belicoff
├── the unique object y such that (y is a man and y ∈ bald) = >
│   ├── the: λx.(the unique object y such that x(y) = >)
│   └── λy.(y is a man ∧ y ∈ bald)
│       ├── bald: λx.λy.(x(y) ∧ y ∈ bald)
│       └── man: λx.x is a man
└── λy.y killed Belicoff
    ├── killed: λx.λy.y killed x
    └── Belicoff: Belicoff

The truth value of the whole sentence The bald man killed Belicoff
is thus the same as the truth value of the claim that there is a unique
object which is both a man and a member of the set bald, and which
killed Belicoff. That truth value will be > if there is a unique bald man
and he killed Belicoff, ⊥ if there is a unique bald man and he did not kill
Belicoff, and undefined if there is no unique bald man.
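The whole analysis can be prototyped in Python over a toy model. The domain, the facts, and the names below are invented for illustration; the compositional structure follows the tree above.

```python
# A toy model: the domain and all facts are assumptions for illustration.
domain = ["Agent47", "Diana", "Belicoff"]
bald = {"Agent47"}
man = {"Agent47", "Belicoff"}
killed = {("Agent47", "Belicoff")}   # (killer, victim) pairs

def the(p):
    """[[the]]: the unique y in the domain with p(y) = True; a
    presupposition failure (here: an exception) otherwise."""
    matches = [y for y in domain if p(y)]
    if len(matches) != 1:
        raise ValueError("presupposition failure: no unique satisfier")
    return matches[0]

bald_adj = lambda p: lambda y: p(y) and y in bald   # [[bald]]
man_n = lambda y: y in man                          # [[man]]
killed_v = lambda x: lambda y: (y, x) in killed     # [[killed]]

subject = the(bald_adj(man_n))            # the unique bald man
sentence = killed_v("Belicoff")(subject)  # the sentence's truth value
print(subject, sentence)  # → Agent47 True
```

If the domain contained no unique bald man, `the` would raise, mirroring the undefinedness clause above.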
2. Consider the sentence:
• The large mouse squeaked and the small elephant trumpeted.

As before, let’s assume (perhaps unrealistically) that the conjunction is
syntactically realized in a binary branching structure:

[[[the [large mouse]] squeaked] [and [[the [small elephant]] trumpeted]]]

Assume also we have the following lexical meanings:
(a) ~the = λx.(the unique object y such that x(y) = >)
(b) ~large = λx.λy.(x(y) ∧ size(y) > Σz:x(z)=> size(z) / |{z : x(z) = >}|)
(c) ~mouse = λx.x is a mouse
(d) ~squeaked = λx.x squeaked
(e) ~and = λx.λy.y ∧ x
(f) ~small = λx.λy.(x(y) ∧ size(y) < Σz:x(z)=> size(z) / |{z : x(z) = >}|)
(g) ~elephant = λx.x is an elephant
(h) ~trumpeted = λx.x trumpeted

Problem 110: Why should we use the semantic values:
• ~large = λx.λy.(x(y) ∧ size(y) > Σz:x(z)=> size(z) / |{z : x(z) = >}|)
• ~small = λx.λy.(x(y) ∧ size(y) < Σz:x(z)=> size(z) / |{z : x(z) = >}|)
rather than the simpler:
• ~large = λx.λy.size(y) > Σz:x(z)=> size(z) / |{z : x(z) = >}|
• ~small = λx.λy.size(y) < Σz:x(z)=> size(z) / |{z : x(z) = >}|

We add these semantic values to the leaves of the tree:

(The syntax tree as before, with each leaf now labelled by its lexical se-
mantic value from the list above.)

First combine the two adjectives each with the corresponding common
noun:
(a) ~large mouse:
• ~large(~mouse)
• = (λx.λy.(x(y) ∧ size(y) > Σz:x(z)=> size(z) / |{z : x(z) = >}|))(λx.x is
a mouse)
• = λy.((λx.x is a mouse)(y) ∧ size(y) > Σz:(λx.x is a mouse)(z)=> size(z) /
|{z : (λx.x is a mouse)(z) = >}|)
• (λx.x is a mouse)(z) = z is a mouse; (λx.x is a mouse)(y) = y is a
mouse
• So we simplify to: λy.(y is a mouse ∧ size(y) > Σz:z is a mouse=> size(z) /
|{z : z is a mouse = >}|)
• But the sentence:
– z is a mouse = >
is equivalent to the simpler:
– z is a mouse
• So we can replace the former with the latter to obtain:
– λy.(y is a mouse ∧ size(y) > Σz:z is a mouse size(z) / |{z : z is a mouse}|)
(b) ~small elephant:
• ~small(~elephant)
• = (λx.λy.(x(y) ∧ size(y) < Σz:x(z)=> size(z) / |{z : x(z) = >}|))(λx.x is
an elephant)
• = λy.((λx.x is an elephant)(y) ∧ size(y) < Σz:(λx.x is an elephant)(z)=> size(z) /
|{z : (λx.x is an elephant)(z) = >}|)
• (λx.x is an elephant)(z) = z is an elephant; (λx.x is an elephant)(y) =
y is an elephant
• So we simplify to: λy.(y is an elephant ∧ size(y) < Σz:z is an elephant=> size(z) /
|{z : z is an elephant = >}|)
• But the sentence:
– z is an elephant = >
is equivalent to the simpler:
– z is an elephant
• So we can replace the former with the latter to obtain:
– λy.(y is an elephant ∧ size(y) < Σz:z is an elephant size(z) / |{z : z is an elephant}|)
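These average-relative adjective meanings can be prototyped in Python over a toy domain; the individuals and their sizes below are invented for illustration.

```python
# Toy domain with invented sizes.
domain = ["m1", "m2", "e1", "e2"]
size = {"m1": 2, "m2": 4, "e1": 500, "e2": 900}
mouse = lambda y: y in {"m1", "m2"}
elephant = lambda y: y in {"e1", "e2"}

def average_size(p):
    """The Σ size(z) / |{z : p(z)}| part of the adjective meanings."""
    satisfiers = [z for z in domain if p(z)]
    return sum(size[z] for z in satisfiers) / len(satisfiers)

# [[large]] and [[small]]: keep the noun restriction as a conjunct,
# mirroring the x(y) conjunct in the entries above.
large = lambda p: lambda y: p(y) and size[y] > average_size(p)
small = lambda p: lambda y: p(y) and size[y] < average_size(p)

large_mouse = large(mouse)       # λy.(y is a mouse ∧ size(y) > avg mouse size)
small_elephant = small(elephant)

print(large_mouse("m2"), large_mouse("e1"))  # → True False
```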

We add both of these semantic values to the tree:

(The tree as before, with these two values now at the large mouse and
small elephant nodes.)

Next the two phrases large mouse and small elephant can each be
combined with the:
(a) ~the large mouse
• ~the(~large mouse)
• = (λx.(the unique object y such that x(y) = >))(λy.(y is a mouse ∧
size(y) > Σz:z is a mouse size(z) / |{z : z is a mouse}|))
• = (λx.(the unique object y such that x(y) = >))(λw.(w is a mouse ∧
size(w) > Σz:z is a mouse size(z) / |{z : z is a mouse}|)) [to avoid variable
clash]
• = the unique object y such that (λw.(w is a mouse ∧ size(w) >
Σz:z is a mouse size(z) / |{z : z is a mouse}|))(y) = >
• = the unique object y such that (y is a mouse ∧ size(y) >
Σz:z is a mouse size(z) / |{z : z is a mouse}|) = >
• = the unique object y such that y is a mouse ∧ size(y) >
Σz:z is a mouse size(z) / |{z : z is a mouse}|
(b) ~the small elephant
• ~the(~small elephant)
• = (λx.(the unique object y such that x(y) = >))(λy.(y is an elephant ∧
size(y) < Σz:z is an elephant size(z) / |{z : z is an elephant}|))
• = (λx.(the unique object y such that x(y) = >))(λw.(w is an elephant ∧
size(w) < Σz:z is an elephant size(z) / |{z : z is an elephant}|)) [to avoid
variable clash]
• = the unique object y such that (λw.(w is an elephant ∧ size(w) <
Σz:z is an elephant size(z) / |{z : z is an elephant}|))(y) = >
• = the unique object y such that (y is an elephant ∧ size(y) <
Σz:z is an elephant size(z) / |{z : z is an elephant}|) = >
• = the unique object y such that y is an elephant ∧ size(y) <
Σz:z is an elephant size(z) / |{z : z is an elephant}|

We add these two semantic values to the tree:

(The tree as before, with these two definite-description values now at the
the large mouse and the small elephant nodes.)

Next we combine the two noun phrases the large mouse and the small
elephant with their respective intransitive verbs squeaked and trumpeted:
(a) ~the large mouse squeaked
• = ~squeaked(~the large mouse)
• = (λx.x squeaked)(the unique object y such that y is a mouse ∧
size(y) > Σz:z is a mouse size(z) / |{z : z is a mouse}|)
• = the unique object y such that y is a mouse ∧ size(y) >
Σz:z is a mouse size(z) / |{z : z is a mouse}| squeaked
(b) ~the small elephant trumpeted
• = ~trumpeted(~the small elephant)
• = (λx.x trumpeted)(the unique object y such that y is an elephant ∧
size(y) < Σz:z is an elephant size(z) / |{z : z is an elephant}|)
• = the unique object y such that y is an elephant ∧ size(y) <
Σz:z is an elephant size(z) / |{z : z is an elephant}| trumpeted
And we add these two semantic values to the tree:

(The tree as before, with these two truth values now at the two sentence
nodes the large mouse squeaked and the small elephant trumpeted.)

The last two steps are straightforward. First we combine ~the small
elephant trumpeted with ~and:
• ~and the small elephant trumpeted
• = ~and(~the small elephant trumpeted)
• = (λx.λy.y ∧ x)(the unique object y such that y is an elephant ∧
size(y) < Σz:z is an elephant size(z) / |{z : z is an elephant}| trumpeted)
• = (λx.λw.w ∧ x)(the unique object y such that y is an elephant ∧
size(y) < Σz:z is an elephant size(z) / |{z : z is an elephant}| trumpeted)
[to avoid variable capture]
• = λw.(w ∧ the unique object y such that y is an elephant ∧ size(y) <
Σz:z is an elephant size(z) / |{z : z is an elephant}| trumpeted)

Then we combine ~the large mouse squeaked with ~and the small
elephant trumpeted:
• ~the large mouse squeaked and the small elephant trumpeted
• = ~and the small elephant trumpeted(~the large mouse squeaked)
• = (λw.(w ∧ the unique object y such that y is an elephant ∧ size(y) <
Σz:z is an elephant size(z) / |{z : z is an elephant}| trumpeted))(the unique
object y such that y is a mouse ∧ size(y) > Σz:z is a mouse size(z) /
|{z : z is a mouse}| squeaked)
• = the unique object y such that y is a mouse ∧ size(y) >
Σz:z is a mouse size(z) / |{z : z is a mouse}| squeaked ∧ the unique object
y such that y is an elephant ∧ size(y) < Σz:z is an elephant size(z) /
|{z : z is an elephant}| trumpeted

Adding these semantic values to our tree we get our full analysis:

(The fully decorated tree: the syntax tree from above with each node la-
belled by the semantic value just computed. The root carries the truth value
of the whole conjunction.)

3. Let’s do one more example. Consider the sentence:
• The tiger behind the tree roared.
Suppose we have the following syntactic structure:

[[the [tiger [behind [the tree]]]] roared]

The preposition behind should be of type (e, ((e, t), (e, t))) to make the
typing work:

t
├── e  (the tiger behind the tree)
│   ├── ((e, t), e)  the
│   └── (e, t)  (tiger behind the tree)
│       ├── (e, t)  tiger
│       └── ((e, t), (e, t))  (behind the tree)
│           ├── (e, ((e, t), (e, t)))  behind
│           └── e  (the tree)
│               ├── ((e, t), e)  the
│               └── (e, t)  tree
└── (e, t)  roared

We’ll thus start with the following list of semantic values:
(a) ~the = λx.(the unique object y such that x(y) = >)
(b) ~tiger = λx.x is a tiger
(c) ~behind = λxe .λy(e, t) .λze .y(z) ∧ z is behind x
(d) ~tree = λx.x is a tree
(e) ~roared = λx.x roared

We then calculate ~the tree in the usual way:
• ~the tree
• = ~the(~tree)
• = (λx.(the unique object y such that x(y) = >))(λx.x is a tree)
• = the unique object y such that (λx.x is a tree)(y) = >
• = the unique object y such that y is a tree = >
• = the unique object y such that y is a tree
Next we calculate ~behind the tree:

• ~behind the tree
• = ~behind(~the tree)
• = (λx.λy.λz.y(z) ∧ z is behind x)(the unique object y such that y is a
tree)
• = (λx.λw.λz.w(z) ∧ z is behind x)(the unique object y such that y is a
tree) [to avoid variable capture]
• = λw.λz.w(z) ∧ z is behind the unique object y such that y is a tree
And then ~tiger behind the tree:
• ~tiger behind the tree
• = ~behind the tree(~tiger)
• = (λw.λz.w(z) ∧ z is behind the unique object y such that y is a
tree)(λx.x is a tiger)
• = λz.(λx.x is a tiger)(z) ∧ z is behind the unique object y such that y
is a tree
• = λz.z is a tiger ∧ z is behind the unique object y such that y is a tree
From there we calculate ~the tiger behind the tree:
• ~the tiger behind the tree
• = ~the(~tiger behind the tree)
• = λx.(the unique object y such that x(y) = >)(λz.z is a tiger ∧ z is
behind the unique object y such that y is a tree)
• = λx.(the unique object y such that x(y) = >)(λz.z is a tiger ∧ z is
behind the unique object w such that w is a tree) [to avoid variable
capture]
• = the unique object y such that (λz.z is a tiger ∧ z is behind the unique
object w such that w is a tree)(y) = >
• = the unique object y such that y is a tiger ∧ y is behind the unique
object w such that w is a tree = >

• = the unique object y such that y is a tiger ∧ y is behind the unique
object w such that w is a tree
And finally we calculate ~the tiger behind the tree roared:
• ~the tiger behind the tree roared
• = ~roared(~the tiger behind the tree)
• = (λx.x roared)(the unique object y such that y is a tiger ∧ y is behind
the unique object w such that w is a tree)
• = the unique object y such that y is a tiger ∧ y is behind the unique
object w such that w is a tree roared

A tree for the sentence with all semantic values included:

the unique object y such that y is a tiger ∧ y is behind the unique object w such that w is a tree roared
├── the unique object y such that y is a tiger ∧ y is behind the unique object w such that w is a tree
│   ├── the: λx.(the unique object y such that x(y) = >)
│   └── λz.z is a tiger ∧ z is behind the unique object y such that y is a tree
│       ├── tiger: λx.x is a tiger
│       └── λw.λz.w(z) ∧ z is behind the unique object y such that y is a tree
│           ├── behind: λx.λy.λz.y(z) ∧ z is behind x
│           └── the unique object y such that y is a tree
│               ├── the: λx.(the unique object y such that x(y) = >)
│               └── tree: λx.x is a tree
└── roared: λx.x roared
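The chain of applications just computed can be mirrored in Python over a toy model; the domain and the spatial facts are invented for illustration.

```python
# Toy domain and facts, invented for illustration.
domain = ["t1", "tree1", "r1"]
tiger = lambda y: y in {"t1"}
tree = lambda y: y in {"tree1"}
roared_facts = {"t1"}
behind_facts = {("t1", "tree1")}   # (z, x) pairs: z is behind x

def the(p):
    matches = [y for y in domain if p(y)]
    assert len(matches) == 1, "presupposition failure"
    return matches[0]

# [[behind]] = λx.λy.λz.y(z) ∧ z is behind x, type (e, ((e, t), (e, t)))
behind = lambda x: lambda p: lambda z: p(z) and (z, x) in behind_facts
roared = lambda y: y in roared_facts

# Composition follows the tree: roared(the(behind(the(tree))(tiger)))
sentence = roared(the(behind(the(tree))(tiger)))
print(sentence)  # → True
```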

Problem 111: Give a full semantic analysis for the sentence:
• The very tall man laughed.
To do so:
1. Give a phrase structure tree for the sentence.
2. Propose semantic types for each node in the tree.
3. Give lambda terms specifying the semantic values of the in-
dividual words in the sentence. (Some of this work can be
harvested from the handouts.)

4. Calculate the semantic values up the tree by combining lambda
terms.
Then test your theory by seeing what it predicts for:
• The very very tall man laughed.
Does it matter which of the following tree structures is given to very
very tall:
1. [very [very tall]]
2. [[very very] tall]
Are your predicted truth conditions for The very very tall man
laughed reasonable?

Problem 112: Combine your proposed value for ~very from the
previous problem with ~bald to calculate ~very bald. Does the com-
bination make sense? Should it make sense?

Z Beyond Binary Branching

We’ve been working throughout on the assumption that syntactic trees are al-
ways binary branching: each parent node has exactly two child nodes. Binary
branching is a syntactic assumption, but we haven’t given any direct syntactic
argument in favor of it. (There has been important syntactic work arguing in
favor of binary branching, though.) Instead, our reliance on binary branching
has been driven by the fact that binary branching trees let us run a semantic
theory on which the semantic values of complex expressions are always deter-
mined by Functional Application. When all the branches in our trees have the
form:

[α β γ]
we can have the general principle ~α = ~β(~γ) or ~γ(~β). But if we have a
trinary branching node:

[α β γ δ]

we can’t give a straightforward functional application story. With two child
nodes, we can use one as function and one as argument, but with three child
nodes, we have more nodes than we have roles to distribute.

However, there are expressions that at least look like they ought to have a non-
binary structure. Two examples we’ve already encountered are ditransitive
verbs and sentential connectives like conjunction. With ditransitive verbs, it’s
tempting to think that direct and indirect object both link to the verb at a single
triple-branching node:

[[The rat] [gave [the villagers] [the plague]]]


Among other things, this ternary structure avoids the need to choose between
two ways of setting up a binary branching structure for the same sentence:

1. [[The rat] [gave [[the villagers] [the plague]]]]

2. [[The rat] [[gave [the villagers]] [the plague]]]
Similarly, it’s tempting to think that when two sentences are joined by and, they
join with and at a single triple-branching node:

[[Aristotle laughed] and [Socrates smiled]]

This ternary branching structure seems to capture the symmetry of conjunctions
in a way that’s lost if we have to choose between the two available binary
branching constructions:
1. [[Aristotle laughed] [and [Socrates smiled]]]

2. [[[Aristotle laughed] and] [Socrates smiled]]

One new example: consider resultative constructions such as:
1. George Foreman knocked Michael Moorer unconscious.
2. The cold snap froze the lake solid.
In these constructions, the verb is followed both by a direct object and by an
adjectival phrase indicating a resulting state for the direct object. Again, it’s
tempting to think that these resultative constructions should have a ternary
structure:

[[The cold snap] [froze [the lake] [solid]]]
And again the symmetry of the ternary structure avoids the need to impose an
unsatisfactory asymmetry by associating either the verb with the direct object
or the direct object with the resulting state specification:
1. [[The cold snap] [[froze [the lake]] [solid]]]

2. [[The cold snap] [froze [[the lake] [solid]]]]

AA Functions of More Than One Argument

While the rule of Functional Application won’t allow semantic interpretation
of non-binary-branching trees, it is not hard to give a variant form of functional
application that will allow nodes to branch to any number of children. The key
observation is that the rule of Functional Application assumes that all functions
are functions of one argument – but we can adjust our framework to allow
functions of more than one argument.

Consider the addition function f (x, y) = x + y. This is a function of two argu-
ments – an instance of functional application requires the function f together
with two numbers serving as inputs (to be added together). If we wanted to
represent the internal structure of addition using a tree, a ternary branching
tree would be the natural choice:

1. [5 2 + 3]

2. [12 [5 2 + 3] + [7 3 + 4]]

Problem 113: For each of the following mathematical functions, say
how many argument places the function has.
• ÷, ×, d/dx, √

If we want to start using functions of more than one argument, we need to adjust
our type theory. In our current notation, a type (α, β) contains functions from
members of type α to members of type β. The notation thus presupposes that
we use only functions of one argument. For functions of multiple arguments,
we will use the notation:

• (⟨α1 , . . . , αn ⟩, β)
to name the type of n-place functions that take as inputs members of the types
α1 through αn , and produce as output a member of type β.

For example, (⟨e, e⟩, t) is the type of two-place functions that take two objects
(two members of type e) as input and produce a truth value (a member of type
t) as output. If type e contains two objects a and b, then one member of type
(⟨e, e⟩, t) is the two-place function:

•   a, a → >
    a, b → ⊥
    b, a → >
    b, b → >
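In Python, for instance, such a two-place function can be written directly as a dict keyed by ordered pairs (an illustrative sketch, with strings for the two members of type e):

```python
T, F = True, False

# The (⟨e, e⟩, t) member displayed above, as a table from pairs of
# type-e members to truth values:
f = {
    ("a", "a"): T,
    ("a", "b"): F,
    ("b", "a"): T,
    ("b", "b"): T,
}

# Applying the function to the pair of arguments (a, b):
value = f[("a", "b")]
print(value)  # → False
```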

Problem 114: Determine whether each of the following expressions
is a valid name of a type. If it is a valid name, indicate how many
argument places functions in that type have.
1. (⟨e, t, e, t⟩, t)
2. (e, e, e)
3. ⟨t, t, t⟩
4. ((⟨e, t⟩, t), (e, e))

Problem 115: Suppose type e has three members a, b, and c. For
each of the following types, say how many members the type has,
and give an example member of the type.
1. (⟨t, t⟩, t)
2. (e, (⟨e, t⟩, e))
3. (⟨e, e, e⟩, e)
4. (⟨t, (⟨e, e⟩, e), t⟩, t)

In addition to a system for naming types containing functions of more than
one argument, we also need a system for naming specific functions within
those types. We thus generalize slightly the lambda notation. For example, we
specify the addition function by:
• ~+ = λ⟨x, y⟩.x + y
More generally, a lambda term of the form:
• λ⟨x1 , . . . , xn ⟩.E(x1 , . . . , xn )
names an n-place function that takes inputs x1 through xn and produces as out-
put the value of E(x1 , . . . , xn ) given those inputs. Including type specifications,
we have:
• λ⟨x1τ1 , . . . , xnτn ⟩.E(x1 , . . . , xn )σ

naming an n-place function of type (⟨τ1 , . . . , τn ⟩, σ).

Problem 116: Suppose that f and g are both functions from the real
numbers to the real numbers, such that:
1. f = λx.F(x)
2. g = λx.G(x)
We can then define a two-place function h(x, y) that maps two real
numbers to a real number by setting h(x, y) = f (x) + g(y). Write a
lambda term for the two-place h(x, y) function.

Problem 117: Write a lambda term for a function that takes as input
a pair of integers and produces as output a new function that itself
takes as input a pair of integers and produces as output > if the two
input integers are both strictly between the earlier pair of integers
and ⊥ if the two input integers are not both strictly between the
earlier pair of integers. What is the type of this lambda term?

Problem 118: Give examples of three-place, four-place, and five-
place functions. (You can use mathematical examples, or build suit-
able non-mathematical examples.) For each example function, give
an appropriate lambda term naming that function.

Once we write out lambda terms (in this new notation) for functions of multiple
arguments, it should be clear how multiple-argument functions are connected
to single-argument functions via Schönfinkelization. Consider two ways to
think about addition:
1. We can treat addition as a two-place function of type (⟨e, e⟩, e). In this
case, we have:
• ~+ = λ⟨x, y⟩.x + y
2. We can treat addition as a one-place function from a number to a one-
place function from a number to the sum of those two numbers. We thus
treat addition as being of type (e, (e, e)), and have:
• ~+ = λx.λy.x + y
The second approach is just the Schönfinkelization of the first approach. And
there is a simple algorithm for Schönfinkelizing a lambda term involving the
⟨·⟩ bracket notation for multiple-argument functions: we simply remove the
brackets and add lambdas for each variable. The general form is:
• The Schönfinkelization of λ⟨x, y⟩.E is λx.λy.E.
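The bracket-removing algorithm corresponds to what programmers call currying, and can be sketched in Python (the helper names are illustrative, not from the text):

```python
def schonfinkel(f):
    """From a two-place λ⟨x, y⟩.E to the curried λx.λy.E."""
    return lambda x: lambda y: f(x, y)

def unschonfinkel(g):
    """From a curried λx.λy.E back to the two-place λ⟨x, y⟩.E."""
    return lambda x, y: g(x)(y)

add_pair = lambda x, y: x + y          # λ⟨x, y⟩.x + y
add_curried = schonfinkel(add_pair)    # λx.λy.x + y

print(add_pair(2, 3), add_curried(2)(3))  # → 5 5
```

The two helpers are inverses: round-tripping a function through both leaves its input-output behavior unchanged.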

Problem 119: When we move from λ⟨x, y⟩.E to λx.λy.E, do we get
the left Schönfinkelization or the right Schönfinkelization of the
original two-place function? Give a general rule for getting the
other-handed Schönfinkelization.

Problem 120: Explain how we can use Schönfinkelization twice to
move between the three-place function:
• λ⟨x, y, z⟩.E(x, y, z)
and the single-argument (type (e, (e, (e, e)))) function:
• λx.λy.λz.E(x, y, z)
Consider both directions of transition (from three-place function
to single-argument function and from single-argument function to
three-place function). If you are careful about the details, you will
notice that we need some assumptions about the relation between
⟨a, ⟨b, c⟩⟩ and ⟨a, b, c⟩. Try to state the needed assumptions as clearly
as possible. Is there a way to think about what ordered pairs and
ordered triples are that will help justify those assumptions?

Problem 121: For each of the following lambda terms using the
multi-argument function notation, give an equivalent Schönfinkeled
expression using only functions of a single argument.
1. λ⟨x, y⟩.x^y
2. λ⟨x, y, z⟩.(x + y)/z
3. λ⟨x, y, z, w⟩.x admires y more than w admires z
4. λ⟨x, y⟩.λz.(x + y) − z
5. λ⟨xe , ye ⟩.λz(⟨e, e⟩,e) .z(x, y)

Problem 122: For each of the following lambda terms built from
functions of a single argument, give an equivalent un-Schönfinkeled
expression using functions of multiple arguments.
1. λx.λy.y − x
2. λx.λy.λz.z > y^x
3. λx.λy.y admires x
4. λx.λy.λz.z(y(x))

Problem 123: Consider the lambda term:
• λx.λy.λz.λw.λu.λv.(x + y^v)/((z − u)w)
We can give multiple Schönfinkelizations of this function, such as:
1. λ⟨x, y, z, w, u, v⟩.(x + y^v)/((z − u)w)
2. λx.λ⟨y, z, w⟩.λ⟨u, v⟩.(x + y^v)/((z − u)w)
How many different Schönfinkelizations of the starting function are
possible?

AB Truth-Functional Connectives

We noted earlier that the natural syntactic structure for a connective like and
makes it part of a trinary branching structure:

[[Aristotle laughed] and [Socrates smiled]]
Now that we can incorporate multiple-argument functions in our semantics,
we can respect that natural trinary structure for conjunction. We want ~and to
be a function that takes two truth values as input (in this case, the truth values of
Aristotle laughed and Socrates smiled) and produces a single truth value
as output:

• ~and = λ⟨x, y⟩.x and y

The semantic value ~and = λ⟨x, y⟩.x and y characterizes the meaning of the
object language word and using the metalanguage word ‘and’. But we can also
characterize the two-place and function directly using a table of input-output
values:
 >, > → > 
 
 >, ⊥ → ⊥ 
• ~and =  
 ⊥, > → ⊥ 

⊥, ⊥ → ⊥
When discussing two-place sentential connectives, of type (⟨t, t⟩, t), we often
present functions in truth table format:

A B A and B
> > >
> ⊥ ⊥
⊥ > ⊥
⊥ ⊥ ⊥
Or in an alternative truth table format:

and > ⊥
> > ⊥
⊥ ⊥ ⊥
In any of these formats, we are specifying the function that takes two truth
values as input, and produces > as output when the two inputs are both >, but
produces ⊥ as output if either input is ⊥.

There are 16 different functions in type (⟨t, t⟩, t). (A function in type (⟨t, t⟩, t)
takes a pair of truth values as input. There are four pairs of truth values. Each
input pair can be mapped to one of two truth values as output. Thus there are
2⁴ = 16 functions available.) ~and is thus one of 16 members of its type. We
can easily make a list of all 16 members:

A B 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
> > > > > > > > > > ⊥ ⊥ ⊥ ⊥ ⊥ ⊥ ⊥ ⊥
> ⊥ > > > > ⊥ ⊥ ⊥ ⊥ > > > > ⊥ ⊥ ⊥ ⊥
⊥ > > > ⊥ ⊥ > > ⊥ ⊥ > > ⊥ ⊥ > > ⊥ ⊥
⊥ ⊥ > ⊥ > ⊥ > ⊥ > ⊥ > ⊥ > ⊥ > ⊥ > ⊥
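The counting argument can be checked by direct enumeration; here is an illustrative Python sketch that also locates the and table in the list:

```python
from itertools import product

T, F = True, False
inputs = [(T, T), (T, F), (F, T), (F, F)]

# Each member of (⟨t, t⟩, t) is fixed by its four outputs, listed in the
# same row order as the table above.
members = [dict(zip(inputs, outputs))
           for outputs in product([T, F], repeat=4)]

AND = {(T, T): T, (T, F): F, (F, T): F, (F, F): F}

# Counting columns from 1, AND sits at position 8:
print(len(members), members.index(AND) + 1)  # → 16 8
```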
~And is item 8 on this list. Other English connectives can also be modelled
using items from the list. Consider or. There are two plausible semantic values
for or among the 16 members of (⟨t, t⟩, t):
1. Inclusive Or: Item 2 on the list provides one plausible semantic value for
or:
A B A or B
> > >
> ⊥ >
⊥ > >
⊥ ⊥ ⊥

2. Exclusive Or: Item 10 on the list provides another plausible semantic
value for or:
A B A or B
> > ⊥
> ⊥ >
⊥ > >
⊥ ⊥ ⊥

Both Inclusive Or and Exclusive Or agree that the truth of one disjunct is
sufficient for the truth of a disjunction. They disagree on the question of
whether the truth of the disjunction requires the truth of exactly one disjunct:
Exclusive Or imposes the exactness requirement, while Inclusive Or allows
a disjunction to be true when both disjuncts are true. There is longstanding
disagreement over whether the English word or expresses Inclusive Or or
Exclusive Or. Perhaps there are two homophonic and homographic words in
English, one for Inclusive Or and one for Exclusive Or. We could write them,
for disambiguation purposes, as ior and xor. But we will, as our default
assumption, treat English or as inclusive.
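The two candidate meanings can be put side by side in Python (an illustrative sketch):

```python
T, F = True, False

ior = lambda a, b: a or b    # Inclusive Or (item 2 on the list)
xor = lambda a, b: a != b    # Exclusive Or (item 10 on the list)

# The two meanings agree except when both disjuncts are true:
for a, b in [(T, F), (F, T), (F, F)]:
    assert ior(a, b) == xor(a, b)

print(ior(T, T), xor(T, T))  # → True False
```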

Problem 124: For each of the following sentences, consider whether
or has an inclusive or an exclusive interpretation. (Notice that in
many of these sentences, or is not being used as a connective joining
full sentences, but instead joins noun phrases, verb phrases, or other
subsentential constituents. We’ll return to this feature of connectives
later.)
1. None of the students did the Kant reading or the Hegel
reading.
2. You may have soup or salad with your dinner.
3. If you can run a mile in under five minutes or run a 5k
race in under seventeen minutes, you can run a marathon
in under three hours.
4. I doubt you’ll enjoy the apple pie or the peach cobbler.
5. The department administrator or the librarian can help
you order that book.
Consider the hypothesis that the either ...or construction is the
marker of exclusive disjunction. Does this seem right? Present some
data either for or against the hypothesis.

Problem 125: Consider a construction with three disjuncts joined using two occurrences of or, such as:
• Aristotle laughed or Plato smiled or Socrates cried.
If the occurrences of or are interpreted as Exclusive Or, under what
conditions is this sentence true? (Is it true or false when Aristotle
laughs, Plato smiles, and Socrates cries?) Explain what is unsatisfactory about the resulting truth conditions, and propose another
semantic value for or that produces better results in an Exclusive
Or spirit.

Another English connective that can be modelled using the members of type ((t, t), t) is if. Here we can use item 5 on our chart of options, so that:
• ~If is given by the truth table:
A B if A, B
> > >
> ⊥ ⊥
⊥ > >
⊥ ⊥ >
Or equivalently:
if > ⊥
> > ⊥
⊥ > >
(Note that if, unlike and and or, is non-commutative. A and B is equivalent to B and A, and A or B is equivalent to B or A, but if A, B is not equivalent to if B, A. So when we give the truth table in the alternative rectangular form, we need to be clear about how the rows and columns correspond to the two positions in the if construction. We have put the antecedent (the A position in if A, B) down the left column, and the consequent (the B position in if A, B) across the top row.)
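The commutativity contrast is easy to verify mechanically. A minimal sketch (not from the text), again with True for > and False for ⊥:

```python
from itertools import product

def conj(a, b): return a and b             # ~and
def disj(a, b): return a or b              # ~or
def cond(a, b): return (not a) or b        # the if table above: antecedent a, consequent b

rows = list(product([True, False], repeat=2))

assert all(conj(a, b) == conj(b, a) for a, b in rows)      # and is commutative
assert all(disj(a, b) == disj(b, a) for a, b in rows)      # or is commutative
assert not all(cond(a, b) == cond(b, a) for a, b in rows)  # if is not: cond(T, F) != cond(F, T)
```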

Problem 126: Using the truth table above for ~if, which of the
following claims about the logic of if sentences is correct?
1. A; if A, B ⊨ B
2. B; if A, B ⊨ A
3. if A, B; if B, C ⊨ if A, C
4. if A, B; not B ⊨ not A
5. B ⊨ if A, B
6. A ⊨ if A, B
7. not A ⊨ if A, B
8. if A, if B, C ⊨ if A and B, C
9. if A, C ⊨ if A, B and if A, C
10. if A, if A, B ⊨ if A, B
11. ⊨ if A, B
12. if A, B ⊨ if A, (B and C)
13. if A, B ⊨ if A, (B or C)
14. if A, B ⊨ if (A or C), B
15. ⊨ if A, B or if B, A

Problem 127: Consider two constructions using if:
1. If the bomb goes off, Jones dies.
2. Jones dies if the bomb goes off.
Give trees for both sentences, and then calculate truth conditions using:

    if │ >  ⊥
    >  │ >  ⊥
    ⊥  │ >  >

Explain why the result is unsatisfactory. What should we change to fix the problem? Do we need a different semantic value for if? Or more than one word if in the language? Or a different syntactic analysis of sentences using if?

Problem 128: For each of the following words or phrases, consider
whether it can be treated as a truth-functional connective of type
((t, t), t). If it can, give a suitable truth table for it. If it cannot, say
something about why a truth-functional treatment doesn't work.
1. unless
2. but
3. only if
4. even if
5. whenever
6. because
7. nor

AC Treating a First Ambiguity

We noted earlier that ambiguous sentences often admit of more than one syn-
tactic analysis. We gave as an example the sentence Very old men and women
are happy, which has the two trees:

1. [[[very old] men] and women] are happy

2. [[very old] [men and women]] are happy

(The two trees are given here as bracketings.)

But it remains to be seen whether the different trees that we assign to an am-
biguous sentence can then produce different meanings for that sentence that
correspond to the disambiguated readings of the sentence.

Unfortunately, we can’t give a semantic analysis to any version of Very old men
and women are happy yet. We don’t have an analysis of very, we don’t know
what to do with plural nouns like men and women, we don’t have a workable
treatment of predicative adjectives like happy in the verb phrase are happy,
and we don’t have a treatment of and that allows it to join things other than
sentences. (So, more or less, there are no parts of this sentence that our current
theory can handle.) But we can deal with a simpler case.

Consider the sentence:

• Socrates smiled and Plato pouted or Aristotle arrived.
Assume that both and and or give rise to trinary branching trees using the
phrase structure rule:
• S → S CONN S

Then we get two possible trees for the sentence:


1. Tree 1:

[S [S [S [NP [NAME Socrates]] [VP [IV smiled]]] [CONN and] [S [NP [NAME Plato]] [VP [IV pouted]]]] [CONN or] [S [NP [NAME Aristotle]] [VP [IV arrived]]]]

2. Tree 2:

[S [S [NP [NAME Socrates]] [VP [IV smiled]]] [CONN and] [S [S [NP [NAME Plato]] [VP [IV pouted]]] [CONN or] [S [NP [NAME Aristotle]] [VP [IV arrived]]]]]

(The trees are given here as labeled bracketings. In Tree 1, or is the topmost connective; in Tree 2, and is.)
We’ve had a syntactic theory that assigned these two trees for a while now,
but now we have a semantic theory that can interpret both trees. We’ll use the
following lexicon:
• ~Socrates = Socrates

• ~Plato = Plato
• ~Aristotle = Aristotle
• ~smiled = λx.x smiled
• ~pouted = λx.x pouted

• ~arrived = λx.x arrived

• ~and = [>, > → >; >, ⊥ → ⊥; ⊥, > → ⊥; ⊥, ⊥ → ⊥]

• ~or = [>, > → >; >, ⊥ → >; ⊥, > → >; ⊥, ⊥ → ⊥]

We can then add these semantic values to the tree and add semantic types for
each node. (We’ll remove the syntactic categories for simplicity, and also strip
off the non-branching nodes):

1. Tree 1:

t
├─ t
│  ├─ t
│  │  ├─ e: ~Socrates = Socrates
│  │  └─ (e, t): ~smiled = λx.x smiled
│  ├─ ((t, t), t): ~and
│  └─ t
│     ├─ e: ~Plato = Plato
│     └─ (e, t): ~pouted = λx.x pouted
├─ ((t, t), t): ~or
└─ t
   ├─ e: ~Aristotle = Aristotle
   └─ (e, t): ~arrived = λx.x arrived
2. Tree 2:

t
├─ t
│  ├─ e: ~Socrates = Socrates
│  └─ (e, t): ~smiled = λx.x smiled
├─ ((t, t), t): ~and
└─ t
   ├─ t
   │  ├─ e: ~Plato = Plato
   │  └─ (e, t): ~pouted = λx.x pouted
   ├─ ((t, t), t): ~or
   └─ t
      ├─ e: ~Aristotle = Aristotle
      └─ (e, t): ~arrived = λx.x arrived

Let’s then assume that Socrates doesn’t smile, Plato does not pout, and Aristotle
does arrive. We then have:
1. ~Socrates smiled = ~smiled(~Socrates) = (λx.x smiled)(Socrates) =

2. ~Plato pouted = ~pouter(~Plato) = (λx.x pouted)(Plato) = ⊥


3. ~Aristotle arrived = ~arrived(~Aristotle) = (λx.x arrived)(Aristotle)
=>
Inserting these truth values into the trees in the appropriate places, we get:

1. Tree 1:

t
├─ t
│  ├─ ⊥ (~Socrates smiled)
│  ├─ ((t, t), t): ~and
│  └─ ⊥ (~Plato pouted)
├─ ((t, t), t): ~or
└─ > (~Aristotle arrived)
2. Tree 2:

t
├─ ⊥ (~Socrates smiled)
├─ ((t, t), t): ~and
└─ t
   ├─ ⊥ (~Plato pouted)
   ├─ ((t, t), t): ~or
   └─ > (~Aristotle arrived)
Now we can apply a first truth function in each tree:


1. Tree 1: The lowest connective node applies ~and to the pair (⊥, ⊥). Since ~and(⊥, ⊥) = ⊥, the tree reduces to:

t
├─ ⊥
├─ ((t, t), t): ~or
└─ >
2. Tree 2: The lowest connective node applies ~or to the pair (⊥, >). Since ~or(⊥, >) = >, the tree reduces to:

t
├─ ⊥
├─ ((t, t), t): ~and
└─ >

Calculating the final truth value of Tree 1, we get ~or(⊥, >) = >. And calculating the final truth value of Tree 2, we get ~and(⊥, >) = ⊥. Thus Tree 1 is true and Tree 2 is false. Because the two trees have different truth values, they must capture two truth-conditionally different readings of the starting sentence Socrates smiled and Plato pouted or Aristotle arrived.

Tree 1 is true, more generally, whenever either (i) both Socrates smiled and
Plato pouted, or (ii) Aristotle arrived. And Tree 2 is true whenever both
(i) Socrates smiled, and (ii) either Plato pouted or Aristotle arrived. That

matches the two natural readings of the ambiguous sentence Socrates smiled
and Plato pouted or Aristotle arrived, so our semantic theory provides
a good analysis.
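The two derivations can be replayed mechanically. Here is a sketch (illustrative, not from the text) assuming the situation above: Socrates doesn't smile, Plato doesn't pout, and Aristotle arrives. Each tree is written as a nested tuple, with True for > and False for ⊥.

```python
# Semantic values for the leaves and connectives.
lexicon = {
    "Socrates smiled": False,    # Socrates doesn't smile
    "Plato pouted": False,       # Plato doesn't pout
    "Aristotle arrived": True,   # Aristotle does arrive
    "and": lambda p, q: p and q,
    "or":  lambda p, q: p or q,
}

def value(tree):
    """Evaluate a tree of the form (left, CONN, right) bottom-up."""
    if isinstance(tree, str):
        return lexicon[tree]
    left, conn, right = tree
    return lexicon[conn](value(left), value(right))

tree1 = (("Socrates smiled", "and", "Plato pouted"), "or", "Aristotle arrived")
tree2 = ("Socrates smiled", "and", ("Plato pouted", "or", "Aristotle arrived"))

print(value(tree1))  # True: (⊥ and ⊥) or > = >
print(value(tree2))  # False: ⊥ and (⊥ or >) = ⊥
```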

Problem 129: Identify two readings of the sentence:


• If Aristotle arrived Plato pouted and Socrates smiled.
(The use of commas has been avoided in that sentence to preserve
the ambiguity between the two readings.) Then give a full deriva-
tion of the two readings in the same form as the derivation in the
text above: give syntactic trees for the two readings, assign seman-
tic values to the leaves of the tree, and calculate semantic values up
the tree. Do the resulting truth conditions seem appropriate for the
original two readings of the sentence?

AD A King and No King

Earlier we treated definite descriptions such as the king of Belgium as being of type e. This analysis had three advantages:

1. It made intuitive sense, given that there is a specific individual picked out by the king of Belgium.
2. It made definite descriptions of the same semantic type as proper names,
which are also type e, thus explaining why names and definite descrip-
tions can appear in the same syntactic positions.

Problem 130: Is it true that names and definite descriptions appear in the same syntactic positions? Try to give an example of a grammatical sentence containing a name that becomes
ungrammatical when the name is replaced by a definite descrip-
tion, and an example of a grammatical sentence containing a
definite description that becomes ungrammatical when the def-
inite description is replaced by a name.

3. It explained why a definite description can be combined with an intransitive verb of type (e, t) to form a sentence of type t.

However, there are other reasons to be unhappy with typing the king of
Belgium as e. Consider a range of syntactically similar expressions:
• the king
• a king

• some king

• every king
• each king
• no king
• most kings

• few kings
• many kings
• all but one king

• finitely many kings


It would be nice if all of these expressions were of the same type. (They all have roughly the same grammatical distribution, for example.) But type e looks quite implausible for many of them.

Two sources of trouble:


1. Intuitive Trouble: Expressions of type e are supposed to pick out some
particular object. But it doesn’t seem like some king is meant to pick out
any specific object. If Belgium has a king and Bhutan has a king, then
some king picks out neither the king of Belgium nor the king of Bhutan
– its role is precisely to be uncommitted between the two of them.
Similarly, it doesn’t seem like every king is meant to pick out some
specific king. In some sense every king should be about both the king of
Belgium and the king of Bhutan, rather than picking out one of the two.
And no king even more clearly isn’t meant to pick out some specific king.
its whole purpose is not to provide a specific king. And it wouldn’t be
any better for it to pick out a non-king. Which non-king would it pick
out?
2. Formal Trouble: Suppose some king is of type e. Then there is some
specific entity that it picks out – call that entity Karl. Let’s see what we
can learn about Karl.

(a) The sentence:


• Some king laughs
is true. But then ~laughs is an (e, t) function that maps Karl to >.
(b) The sentence:
• Some king doesn't laugh
is true. But then ~doesn't laugh is an (e, t) function that maps Karl to >.

At best, then, Karl both laughs and doesn’t laugh – a peculiar kind of king
(or any other object). At worst, we’ve got an outright contradiction in our
theory, since the ~laughs function needs to map Karl both to > and to ⊥.
So if phrases of the form some N are of type e and pick out objects, those
must be strange objects. They are what are sometimes called glutty
objects – objects that have too many properties. Whatever object some
tiger picks out, it needs to be an object that is hungry and not hungry, in
Africa and in Asia, and also not in Africa and not in Asia.
Suppose that every king is of type e. Then again there is some specific
entity that it picks out – call that entity Keegan. We can similarly learn
things about Keegan.

(a) The sentence:


• Every king laughs
is false. But then ~laughs is an (e, t) function that maps Keegan to
⊥.
(b) The sentence:
• Every king doesn’t laugh.
is false. But then ~doesn’t laugh is an (e, t) function that maps
Keegan to ⊥.
At best, then, Keegan neither laughs nor doesn’t laugh – again, a peculiar
kind of king. At worst, ~laughs needs to produce neither > nor ⊥ as
output given Keegan as input – which is impossible if ~laughs is of type
(e, t).

Problem 131: Consider the suggestion that ~every king is of


type e, but the entity that it picks out is the set of all kings:
• ~every king={x : x is a king}
Give an example of a sentence that should be true but is made
false by this semantic value, and an example of a sentence that
should be false but is made true by this semantic value.
Consider how we might fix the problems you’ve just noted
by changing the rule of functional application. Give a modified
rule that allows us to say ~every king={x : x is a king} and get
the right truth value for sentences combining every king with
intransitive verbs. Does the altered rule cause problems with
any other parts of the general theory of meaning that we’ve
already established? (Consider what happens when names are
combined with intransitive verbs using the altered rule.)

Suppose no king is of type e. Then once more there is some specific entity that it picks out – call that entity Kieran. Consider some truths involving the expression no king:

• No king lives on the moon
• No king is a giraffe
• No king is a prime number
Then ~lives on the moon, ~is a giraffe, and ~is a prime number
must all map Kieran to >. If no king picks out an entity, it’s a moon-
dwelling entity that’s simultaneously a long-necked mammal and an ab-
stract mathematical object. Furthermore, notice that the following claim
is false:
• No king exists

So ~exists needs to map Kieran to ⊥. So in addition to being a prime lunar giraffe, Kieran needs to not exist. That's a difficult job description for any entity.

Problem 132: Suppose there were no kings at all. What sen-


tences of the form No king Fs would then be true? If ~no
king were of type e picking out Kieran, what would we con-
clude about what Kieran was like?

We need an alternative to the default view that some king, every king, and so
on are of type e. Consider the basic typing puzzle:

?
├─ ?
│  ├─ ?: some
│  └─ (e, t): king
└─ (e, t): laughs
If ~laughs is going to take ~some king as argument, then ~some king needs
to be of type e, which we want to avoid. But there is another option: ~some
king could take ~laughs as argument. In that case, ~some king would need
to be of type ((e, t), t), so that it could take the type (e, t) ~laughs as input and
produce type t for the whole sentence as output:

t
├─ ((e, t), t)
│  ├─ ?: some
│  └─ (e, t): king
└─ (e, t): laughs
(~some would then be of type ((e, t), ((e, t), t)), but we'll come back to that later.)

What does type ((e, t), t) look like? These are functions from functions from
objects to truth values, to truth values.

Problem 133: Suppose there are five objects a, b, c, d, and e in type


e. How many things are there in type ((e, t), t)? Give an example of
a member of type ((e, t), t).

Recall that functions of type (e, t) can be thought of as characteristic functions


of sets, mapping objects to > if they are in the set and to ⊥ if they are not in
the set. So we can roughly think of members of (e, t) as being sets rather than
functions.

But we can also think of sets as being basically equivalent to properties. The set
of all red things – {x : x is red} – plays more or less the same role as the property
of being red. So we can also think of members of (e, t) as being properties.

In that case, ((e, t), t) contains functions from properties to truth values. But
once again we can think of those functions as being characteristic functions of
sets, so that ((e, t), t) can be thought of as containing sets of properties. And
once again sets and properties are roughly equivalent, so we can think of ((e, t),
t) as containing properties of properties. Properties of properties are sometimes
called second-order properties. (As opposed to properties of objects, which
are called first-order properties.)

Equivalently, we can also think of things in type ((e, t), t) as being sets of sets
of things in type e. Suppose f is a function of type ((e, t), t). Then f↓ is the
corresponding set – the extension of f , or the set whose characteristic function
is f . f↓ is thus a set of things in (e, t). But since the members of f↓ are themselves
characteristic functions, we can also talk about f↓↓ , which will be the set of the
sets that are the extensions of the characteristic functions in f↓ .
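The ↓ notation can be made concrete. Here is a sketch (illustrative, not from the text) assuming a four-element domain, representing an (e, t) function as a dict from entities to truth values, and an ((e, t), t) function f as a Python function on such dicts; f↓↓ is then the set of extensions of the (e, t) functions that f maps to True.

```python
from itertools import combinations

E = {"a", "b", "c", "d"}  # the domain: type e

def down(g):
    """g-down: the extension of an (e, t) function g."""
    return frozenset(x for x in g if g[x])

def all_et_functions():
    """Enumerate all 2^4 = 16 members of type (e, t) as dicts."""
    for k in range(len(E) + 1):
        for subset in combinations(sorted(E), k):
            yield {x: (x in subset) for x in sorted(E)}

def double_down(f):
    """f-double-down: the set of extensions that f maps to True."""
    return {down(g) for g in all_et_functions() if f(g)}

# Example: the ((e, t), t) function that maps g to True iff g's extension
# has at least two members (a second-order property of first-order properties).
f = lambda g: sum(g.values()) >= 2
print(len(double_down(f)))  # 11: the subsets of E with at least two members
```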

Problem 134: For each of the following second-order properties,


give a first-order property that has that second-order property.
1. Is a property that every object has
2. Is a property that no object has
3. Is a property that is had by France and also by Aristotle
4. Is a property had by an infinite number of things and not had
by an infinite number of things
For each of the following first-order properties, give a second-order
property that the first-order property has.
1. Is a tiger
2. Is divisible by 2 but not by 3
3. Has more legs than eyes
4. Exists

For each of the following second-order properties, give an object that
has a first-order property that has the given second-order property.
1. Is a property.
2. Is a property had by more people in the northern hemisphere
than in the southern hemisphere.
3. Is a property that the Eiffel Tower doesn’t have.

Problem 135: Consider the second-order property:


• Being a property that Aristotle has.
Write a lambda term for the ((e, t), t) function that corresponds to
that second-order property. Suppose that we then set ~Aristotle
equal to that lambda term (so that we treat Aristotle as type ((e,
t), t), rather than as type e.) Calculate the resulting semantic value
for Aristotle laughs. Is the result reasonable?

Problem 136: Suppose that type e contains four objects a, b, c, and


d. Consider the ((e, t), t) function f defined by:
• f = λx.∃y(x(y) = >) ∧ ∃y(x(y) = ⊥)
What is f↓↓ ?

What second-order property would be appropriate as the semantic value of some king? We need a second-order property P such that:
1. Whenever a sentence of the form Some king Fs is true, the first-order
property expressed by F has the second-order property P.
2. Whenever a sentence of the form Some king Fs is false, the first-order
property expressed by F does not have the second-order property P.
The following second-order property will do the job:
• Being a property that is had by some king.
Or alternatively (and equivalently):
• Being a property such that the set of things that have it overlaps with the
set of things that have the property of being a king.
We can then find a lambda term for an ((e, t), t) function that is equivalent to
that second-order property:
• λx.{y : x(y) = >} ∩ {y : y is a king} ≠ ∅
Given this semantic value, we have:
• ~Some king laughs
• = ~some king(~laughs)

• = (λx.{y : x(y) = >} ∩ {y : y is a king} ≠ ∅)(λx.x laughs)
• = {y : (λx.x laughs)(y) = >} ∩ {y : y is a king} ≠ ∅
• = {y : y laughs = >} ∩ {y : y is a king} ≠ ∅
• = {y : y laughs} ∩ {y : y is a king} ≠ ∅

Notice that we don’t end up with the kind of very simple truth conditions we’re
used to from earlier cases:
• ~Some king laughs = some king laughs

but instead get something more complicated, involving set-theoretic terminology that doesn't appear in the original sentence. But if you think through what it takes for {y : y laughs} ∩ {y : y is a king} ≠ ∅ to hold, you should see that that condition on sets holds if and only if some king laughs.
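The intersection condition can be checked against a toy situation. A sketch (the cast of kings is invented for illustration), representing an (e, t) function by its extension:

```python
# ~some king as a function from predicate extensions to truth values:
# it maps a set to True iff that set overlaps the set of kings, i.e.
# {y : y laughs} ∩ {y : y is a king} ≠ ∅.
kings = {"Charles", "George", "Louis"}
some_king = lambda prop: len(prop & kings) > 0

print(some_king({"George", "Harry"}))   # True: George is a laughing king
print(some_king({"Harry", "Meghan"}))   # False: the sets don't overlap
```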

Problem 137: Construct a situation with four kings in which Some
king laughs is true. Specify the set {y : y is a king} in this situation,
and also the set {y : y laughs}. Then verify that the truth conditions
{y : y laughs} ∩ {y : y is a king} ≠ ∅ are met in your situation.

Then construct a second situation with four kings in which Some
king laughs is not true. Again specify the sets {y : y is a king} and
{y : y laughs} for your situation, and verify that the truth conditions
{y : y laughs} ∩ {y : y is a king} ≠ ∅ are not met in this second
situation.

We can analyze every king using the same tools. Again we ask what second-
order property every king should express such that Every king Fs is true if
the F property has that second-order property and false if the F property lacks
that second-order property. What we want is:
• Being a property that is had by every king
Or equivalently:

• Being a property such that the set of things having the property of being
a king is a subset of the set of things having that property.
We can then give a lambda term for an ((e, t), t) function corresponding to that
second-order property:
• λx.{y : y is a king} ⊆ {y : x(y) = >}

We can now calculate full sentential semantic values:


• ~Every king laughs
• = ~every king(~laughs)

• = (λx.{y : y is a king} ⊆ {y : x(y) = >})(λx.x laughs)
• = {y : y is a king} ⊆ {y : (λx.x laughs)(y) = >}
• = {y : y is a king} ⊆ {y : y laughs = >}
• = {y : y is a king} ⊆ {y : y laughs}

Again a little thought will show that these are appropriate truth conditions for
Every king laughs.
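The subset condition can be checked the same way as the intersection condition for some king. A sketch (again with an invented cast):

```python
# ~every king maps a predicate extension to True iff the set of kings is
# a subset of it: {y : y is a king} ⊆ {y : y laughs}.
kings = {"Charles", "George", "Louis"}
every_king = lambda prop: kings <= prop  # <= is subset test on Python sets

print(every_king({"Charles", "George", "Louis", "Harry"}))  # True
print(every_king({"Charles", "George"}))                    # False: Louis is missing
```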

Problem 138: Give a suitable lambda term for no king. Then verify
that it produces appropriate truth conditions by calculating ~No
king laughs.

Problem 139: Give a suitable lambda term for most kings. Then
verify that it produces appropriate truth conditions by calculating
~Most kings laugh. (Most kings is trickier than no king, and
you may need to make some somewhat arbitrary decisions about
exactly what is required for the truth of Most kings laugh.)

Problem 140: Should the truth conditions for:


• Some kings laugh
be different from the truth conditions for:
• Some king laughs
If so, give a suitable lambda term for ~some kings that is different
from the one given above for ~some king, and which accounts for
the truth-conditional difference between the two sentences.

Problem 141: Suppose we introduce a new word yreve into English


via the semantic stipulation:
• ~yreve = λx.{y : x(y) = >} ⊆ {y : y is a king}
Determine the truth value of the sentence:
• Yreve king laughs
in each of the following two situations:
1. Situation 1:
• Set of kings = {Charles, George, Louis}
• Set of laughing things = {Charles, George}
2. Situation 2:
• Set of kings = {Charles, George, Louis}
• Set of laughing things = {Charles, George, Harry, Louis}
Is there any expression in English whose semantic behavior
matches that of yreve?

Problem 142: To combine with the type (e, t) king to produce a type
((e, t), t) some king, some needs to be of type ((e, t), ((e, t), t)). Give
an appropriate lambda term for a type ((e, t), ((e, t), t)) function to
use as ~some.

AE Beginning Exploration of ((e,t), t)

In the previous section we considered a few English expressions that we sug-


gested could usefully be assigned semantic values in type ((e, t), ((e, t), t)):

• ~some king = λx.{y : x(y) = >} ∩ {y : y is a king} ≠ ∅


• ~every king = λx.{y : y is a king} ⊆ {y : x(y) = >}
But ((e, t), ((e, t), t)) is a large category, which means there are a lot of functions
in it that we haven’t yet considered using.

Suppose that e contains exactly four objects. We'll call them a, b, c, and d. Then:
1. (e, t) contains 2^4 = 16 items. To specify a function from e to t, we need
for each input from e to specify an output in t. There are two things in t
(> and ⊥), so for each object in e we have two choices about where to
map it. There are four objects in e, so we have a total of 2 · 2 · 2 · 2 = 2^4
choices for how to specify a function.
2. ((e, t), t) contains 2^16 = 65536 items. Since (e, t) contains 16 items, a
function in ((e, t), t) needs for each of those 16 inputs to pick one of two
possible outputs in t. Thus we make 16 independent choices from two
options, which gives a total of 2^16 functions.
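The counting argument can be checked by brute force for a four-element domain; a quick sketch (not from the text):

```python
from itertools import product

E = ["a", "b", "c", "d"]

# Type (e, t): one truth-value choice per object, so 2^4 functions.
et = list(product([True, False], repeat=len(E)))
assert len(et) == 2 ** 4 == 16

# Type ((e, t), t): one truth-value choice per (e, t) function, so 2^16.
ett_count = 2 ** len(et)
print(ett_count)  # 65536
```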
To help get a better sense of what the contents of ((e, t), t) are like, we’ll start
with a picture of (e, t). The sixteen members of (e, t) can, as usual, be thought
of as subsets of e = {a, b, c, d}. Here’s a diagram showing those sixteen subsets:

{a,b,c,d}

{b,c,d}  {a,c,d}  {a,b,d}  {a,b,c}

{c,d}  {b,d}  {b,c}  {a,d}  {a,c}  {a,b}

{d}  {c}  {b}  {a}

∅

Members of ((e, t), t) can then be thought of as characteristic functions of subsets
of those 16 subsets of e. The following diagram, for example, picks out three
members of ((e, t), t):

[Diagram: the lattice of the sixteen subsets of e, with colored outlines marking members of ((e, t), t); the blue outline encloses {a, c, d}, {a, b, d}, and {b, c}.]

Inside the blue line is the subset {{a, c, d}, {a, b, d}, {b, c}}, which corresponds to
the ((e, t), t) function that maps the following three (e, t) functions to >:

1. [a → >, b → ⊥, c → >, d → >]

2. [a → ⊥, b → >, c → >, d → ⊥]

3. [a → >, b → >, c → ⊥, d → >]

Each of the 65536 members of the ((e, t), t) category corresponds to some subset
of the sixteen members of (e, t) shown in the diagram above. Suppose that of
the four members a, b, c, and d of e, a, b, and c are kings, but d is not. How
should we mark ~some king on the diagram?

[Diagram: the subset lattice with ~some king marked: every subset containing at least one of a, b, or c is enclosed; only {d} and ∅ fall outside.]

~Some king maps an (e, t) function to > just in case that (e, t) function picks
out a set with a non-empty intersection with the set of kings. Given that a, b,
and c are all kings, any subset of e that contains at least one of a, b, and c has a
non-empty intersection with the set of kings. Thus only {d} and ∅ fail to overlap
the set of kings, and hence fail to be mapped to > by ~some king.

Next consider ~every king. ~Every king maps an (e, t) function to > just in case that function picks out a set that contains the set of kings as a subset. That gives us:

[Diagram: the subset lattice with ~every king marked: just the supersets of {a, b, c}, namely {a, b, c} and {a, b, c, d}.]

Finally, consider ~no king. ~No king is given by the following lambda term:
• ~no king = λx.{y : x(y) = >} ∩ {y : y is a king} = ∅

Given that a, b, and c are kings, ~no king maps to > only subsets of e that contain none of a, b, and c. Thus:
[Diagram: the subset lattice with ~no king marked: just {d} and ∅.]
Problem 143: Suppose again that type e has four objects a, b, c, and
d, and suppose that c and d are linguists. In a diagram similar to the
ones used above, mark appropriate subsets for:
1. ~some linguist
2. ~every linguist
3. ~no linguist

Problem 144: Mark appropriate subsets for each of:


1. ~most kings
2. ~few kings
Note any difficulties you encounter in finding appropriate subsets
to use as semantic values of these two expressions.

AF Upward Closure and ((e,t), t)

Suppose that of our four objects a, b, c, and d:

• a, b, and c are kings.


• c and d are linguists.
• a is a philosopher.
• a, b, c, and d are people.

Then we can mark ~some king (in blue), ~some linguist (in red), ~some
philosopher (in green), and ~some person (in purple) on our diagram:

[Diagram: the subset lattice with ~some king (blue), ~some linguist (red), ~some philosopher (green), and ~some person (purple) each marked as a region of subsets.]


All four of these semantic values are upward closed:
• Let X be of type ((e, t), t). X is upward closed if given any Y and Z of type
(e, t) such that Y↓ is a subset of Z↓ , if Y↓ ∈ X↓↓ , then Z↓ ∈ X↓↓ .

Roughly: an ((e, t), t) expression is upward closed just in case whenever it maps a subset of e to >, it also maps every larger subset of e to >. Thus for example:
1. ~Some king is upward closed. The blue outline encloses the subsets:

• {a}, {b}, {c}, {a, b}, {a, c}, {a, d}, {b, c}, {b, d}, {c, d}, {a, b, c}, {a, b, d}, {a, c, d}, {b, c, d}, {a, b, c, d}
Notice that the small set {a} is in ~some king, and so are all of its expansions – {a, b}, {a, c}, {a, d}, {a, b, c}, {a, b, d}, {a, c, d}, and {a, b, c, d}. That's what upward closure requires. Similarly the small set {b} is in ~some king, and so is every expansion of it. And the small set {c} is in ~some king, as is every expansion of it.

Problem 145: Prove that, once we’ve decided what objects are
in type e, there is only one upward closed member of ((e, t), t)
that contains ∅.

Problem 146: Suppose f is some function in ((e, t), t), and f↓↓
is the corresponding set of sets of entities. Call a set X in f↓↓
minimal if there is no Y in f↓↓ such that Y ⊂ X.

Suppose that for each minimal set X in f↓↓ , for every Y such that
X ⊆ Y, Y is also in f↓↓ . Prove that f is then upward closed.

2. ~Some philosopher is upward closed. Using the results of the previous problem: since a is the only philosopher, {a} is the only minimal set in ~some philosopher, and every expansion of {a} is also in ~some philosopher – all of the expansions fall within the green region.
It’s easy to find members of ((e, t), t) that are not upward closed. Thus consider:

[Diagram: the subset lattice with four families of subsets marked in blue, green, red, and purple; their contents are listed below.]

1. The blue set {{c}, {d}, {b, c}, {b, d}} is not upward closed. For example, it
contains {c}, and {c} is a subset of {a, b, c, d}, but it does not contain {a, b, c, d}.
2. The green set {{b}, {a, c}} is not upward closed. For example, it contains
{b}, and {b} is a subset of {a, b, c, d}, but it does not contain {a, b, c, d}.

Problem 147: If e contains the four objects a, b, c, and d, are there
any upward closed members of ((e, t), t) that do not contain
{a, b, c, d}? Are there any upward closed members of ((e, t), t)
that do contain the empty set?

3. The red set {{a, d}, {a, b, d}, {a, b, c, d}} is not upward closed. It contains {a, d},
and {a, d} is a subset of {a, c, d}, but it does not contain {a, c, d}.
4. The purple set {∅, {a}, {b, d}, {c, d}, {a, b, c}, {a, b, d}} (which is scattered in mul-
tiple bubbles in the diagram) is not upward closed. For example, it con-
tains ∅, and ∅ is (trivially) a subset of {c}, but it does not contain {c}.

Problem 148: If e contains the four objects a, b, c, and d, how many


upward closed members of ((e, t), t) are there?

AG Sources of Upward Closure

We’ve now seen (i) that all of ~some king, ~some linguist, ~some philosopher,
~some people are upward closed (in our little model with four people), and
that (ii) not all members of ((e, t), t) are upward closed. (Indeed, if you’ve solved
the previous problem, you’ve seen that upward closed members of ((e, t), t) are
quite rare. It would be nice to have an explanation for why all of these some
phrases are upward closed, and perhaps some insight into whether phrases of
the form some N are always upward closed.

As a first step, we need a semantic value for some. The typing of some can be settled easily. The basic constraints are given in a simple tree:

((e, t), t)
├─ some
└─ (e, t): N

Some thus needs to take an input of type (e, t) and produce an output of type ((e, t), t), so it must be of type ((e, t), ((e, t), t)).

Next we need an appropriate lambda term for the specific semantic value of some in type ((e, t), ((e, t), t)). Consider first a range of some X semantic values:

• ~some king = λx.{y : x(y) = >} ∩ {y : y is a king} ≠ ∅
• ~some linguist = λx.{y : x(y) = >} ∩ {y : y is a linguist} ≠ ∅
• ~some philosopher = λx.{y : x(y) = >} ∩ {y : y is a philosopher} ≠ ∅
• ~some people = λx.{y : x(y) = >} ∩ {y : y is a person} ≠ ∅

Generalizing from these examples, we want:

• ~some N = λx.{y : x(y) = >} ∩ {y : y is an N} ≠ ∅

We can thus use the following for ~some:

• ~some = λz.λx.{y : x(y) = >} ∩ {y : z(y) = >} ≠ ∅


Using this entry for some, we derive:
• ~some king
• = ~some(~king)
• = (λz.λx.{y : x(y) = >} ∩ {y : z(y) = >} ≠ ∅)(λx.x is a king)
• = (λz.λx.{y : x(y) = >} ∩ {y : z(y) = >} ≠ ∅)(λw.w is a king) [renaming the bound variable to avoid a clash]
• = λx.{y : x(y) = >} ∩ {y : (λw.w is a king)(y) = >} ≠ ∅
• = λx.{y : x(y) = >} ∩ {y : y is a king = >} ≠ ∅
• = λx.{y : x(y) = >} ∩ {y : y is a king} ≠ ∅

Problem 149: Give lambda terms (of type ((e, t), ((e, t), t))) for each
of the following determiners:
• every
• no
• most
• only

Using ~some = λz.λx.{y : x(y) = >} ∩ {y : z(y) = >} ≠ ∅, we can then prove that
some N is upward closed for any noun N:
Proof: Take some noun N, and let N be the set ~N↓ (that is, the set
of objects of which N is true). Then ~some N is the set of sets of
entities that have a non-empty intersection with N. (More carefully,
~some N↓↓ is that set of sets, but we’ll speak loosely.) Now suppose
X is in ~some N and Y is some set of entities such that X ⊆ Y.
Since X ∈~some N, X has a non-empty intersection with N. But
since X ⊆ Y, everything that is in X is also in Y, so Y also has a
non-empty intersection with N. But then Y ∈~some N. Thus ~some
N is upward closed.
Some phrases are not the only ones that are upward closed. Recall our earlier
setup:
• a, b, and c are kings.

• c and d are linguists.


• a is a philosopher.
• a, b, c, and d are people.

Here are ~every king (in blue), ~every linguist (in red), ~every philosopher
(in green), and ~every person (in purple):

[Diagram: the subsets of e, arranged in rows abcd / bcd acd abd abc / cd bd bc ad ac ab / d c b a / ∅, with ~every king marked in blue, ~every linguist in red, ~every philosopher in green, and ~every person in purple.]


Examining the diagram shows that each one of these collections is upward
closed. We can see that every N phrases are always upward closed by giving
an appropriate semantic value for every:
• ~every = λz.λx.{y : z(y) = >} ⊆ {y : x(y) = >}
Using this semantic value, we can then prove that every N is upward closed
for any noun N:
Proof: Take some noun N, and let N be the set ~N↓ . Then ~every N
is the set of all supersets of N – that is, the set of all sets of which N is a
subset. Suppose X is in ~every N and Y is some set of entities such
that X ⊆ Y. Because X ∈~every N, we know N ⊆ X. Combining
that with X ⊆ Y, we conclude N ⊆ Y. But then Y ∈~every N. Thus
~every N is upward closed.
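The same kind of brute-force check confirms the proof (a Python sketch; the extension-set encoding of properties and of ((e, t), t) values is our illustrative convention, not the text's official machinery):

```python
from itertools import chain, combinations

E = {'a', 'b', 'c', 'd'}

def subsets(s):
    """All subsets of s, as frozensets."""
    xs = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

def every(noun):
    """~every N: the sets of entities of which N's extension is a subset."""
    return {X for X in subsets(E) if noun <= X}

def is_upward_closed(gq):
    """gq contains every superset (within E) of each of its members."""
    return all(Y in gq for X in gq for Y in subsets(E) if X <= Y)

kings = frozenset({'a', 'b', 'c'})
print(every(kings) == {frozenset({'a', 'b', 'c'}), frozenset(E)})  # True
print(is_upward_closed(every(kings)))                              # True
```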

Problem 150: Suppose that e contains the four objects a, b, c, and d,
and that a, b, and c are kings. Using a diagram of ((e, t), t) of the sort
we’ve been using, draw the sets for:
1. no king
2. all but one king
3. most kings
4. many kings
5. only kings
6. exactly two kings
Determine which of these is upward closed.

The upward closed nature of some king and every king has a helpful linguistic
consequence. Consider the following two verb phrases:
• owns a car
• owns a red car

Anyone who owns a red car owns a car. Thus the set of red car owners is a
subset of the set of car owners. That is, ~owns a red car↓ ⊆ ~owns a car↓ .
(We don’t actually have the tools yet to calculate ~owns a car from its compo-
nent parts, but just by understanding the language we can see that this subset
relation must be correct.)

Because some king is upward closed, if ~owns a red car↓ is in ~some king↓↓ ,
then ~owns a car↓ is in ~some king↓↓ . Therefore:
• Some king owns a red car logically implies Some king owns a car.
If Some king owns a red car is true, then Some king owns a car is
also true.
The same holds for every:
• Every king owns a red car logically implies Every king owns a car.
But this doesn’t work for all noun phrases. Consider:

• No king owns a red car.


• No king owns a car.
The first of these sentences does not imply the second. If there is only one
king and he owns only a blue car, then No king owns a red car is true, but
No king owns a car is false.

Problem 151: Find a different logical relation that holds between
No king owns a red car and No king owns a car. Is it just an
accident of this particular example that this relation holds?

So we have a useful diagnostic:


Diagnostic: A noun phrase NP of type ((e, t), t) is upward closed
if and only if given any two verb phrases V1 and V2 such that
everything that V1s also V2s, the sentence:
• NP V1
logically implies the sentence:
• NP V2
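Over a small domain, the diagnostic can itself be sanity-checked by enumeration. In this Python sketch (our illustration; verb phrases are modeled simply as their extension sets), the diagnostic and the definition of upward closure pick out exactly the same ((e, t), t) values:

```python
from itertools import chain, combinations

E = {'a', 'b', 'c'}

def subsets(s):
    """All subsets of s, as frozensets."""
    xs = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

ALL = subsets(E)

def is_upward_closed(gq):
    return all(Y in gq for X in gq for Y in ALL if X <= Y)

def passes_diagnostic(gq):
    """Whenever everything that V1s also V2s (V1 <= V2),
    'NP V1' implies 'NP V2' (bool <= bool encodes implication)."""
    return all((V1 in gq) <= (V2 in gq)
               for V1 in ALL for V2 in ALL if V1 <= V2)

# Every set of subsets of E (that is, every ((e, t), t) value over this
# domain) agrees on the two tests:
gqs = [set(c) for c in chain.from_iterable(
    combinations(ALL, r) for r in range(len(ALL) + 1))]
print(all(is_upward_closed(g) == passes_diagnostic(g) for g in gqs))  # True
```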

Problem 152: Apply the car//red car diagnostic to each of the
following noun phrases:

1. no king
2. all but one king
3. most kings
4. many kings
5. few kings
6. only kings
7. exactly two kings
8. finitely many kings
9. a prime number of kings
10. usually kings

Problem 153: We can also design a version of the car//red car
diagnostic that gets applied to the noun phrase combined with the
determiner rather than to the verb phrase. Compare:
1. Someone who owns a red car laughed.
2. Someone who owns a car laughed.
The first of these sentences implies the second. We introduce some
terminology:
1. A determiner D (such as some and every) is right monotone up
if:
• D king owns a red car.
implies:
• D king owns a car.
Or more generally, if whenever verb phrase V1 is more restric-
tive than verb phrase V2, we have:
• D king V1s.
implies:
• D king V2s.
2. A determiner D is left monotone up if:
• D king who owns a red car laughed.
implies:
• D king who owns a car laughed.
Or more generally, if whenever noun phrase N1 is more re-
strictive than noun phrase N2, we have:
• D N1 laughed.
implies:
• D N2 laughed.

Which of the following determiners are left monotone up?
1. no
2. all but one
3. most
4. many
5. few
6. only
7. exactly two
8. finitely many
9. a prime number of
10. usually

AH Downward Closure and ((e,t),t)

Earlier we marked ~no king on our diagram. Looking back at the diagram,
we can immediately see that ~no king is not upward closed. ~No king
contains the sets ∅ and {d}, but fails to contain lots of larger sets of which
these two sets are subsets. And this isn’t an accidental feature of ~no king.
Consider a diagram marking all of ~no king (in blue), ~no linguist (in red),
~no philosopher (in green), and ~no person (in purple):

[Diagram: the subsets of e, arranged in rows abcd / bcd acd abd abc / cd bd bc ad ac ab / d c b a / ∅, with ~no king marked in blue, ~no linguist in red, ~no philosopher in green, and ~no person in purple.]

Every one of these sets fails to be upward closed. No phrases are systematically
not upward closed. That’s not surprising given our diagnostic. The inference
from:
• No N owns a red car
to:

• No N owns a car.
won’t be valid for any choice of N. There is always the possibility that some N
owns a car that is not red.

But there is another interesting pattern among the no phrases. All of them are
downward closed:
• Let X be of type ((e, t), t). X is downward closed if given any Y and Z of
type (e, t) such that Y↓ is a subset of Z↓ , if Z↓ ∈ X↓↓ , then Y↓ ∈ X↓↓ .
Roughly: an ((e, t), t) expression is downward closed just in case whenever it
maps a subset of e to >, it also maps every subset of that subset to >.

Problem 154: Assume that e contains the four objects a, b, c, and d.
Give examples of members of ((e, t), t) that are:
1. Upward closed but not downward closed.
2. Downward closed but not upward closed.
3. Both upward closed and downward closed.
4. Neither upward closed nor downward closed.

We can prove that every phrase of the form no N is downward closed:

Proof: Take some noun N, and let N be the set ~N↓ . Then ~no N
is the set of sets that are disjoint from N – whose intersection with
N is empty. Suppose X is in ~no N and Y is a subset of X. Then
X ∩ N = ∅. But since Y ⊆ X, we also have Y ∩ N = ∅. Thus Y is in
~no N, and ~no N is downward closed.
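This too can be confirmed by enumeration (a Python sketch; the extension-set encoding of properties and of ((e, t), t) values is our illustrative convention):

```python
from itertools import chain, combinations

E = {'a', 'b', 'c', 'd'}

def subsets(s):
    """All subsets of s, as frozensets."""
    xs = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

def no(noun):
    """~no N: the sets of entities disjoint from N's extension."""
    return {X for X in subsets(E) if not (X & noun)}

def is_downward_closed(gq):
    """gq contains every subset of each of its members."""
    return all(Y in gq for X in gq for Y in subsets(E) if Y <= X)

def is_upward_closed(gq):
    return all(Y in gq for X in gq for Y in subsets(E) if X <= Y)

kings = frozenset({'a', 'b', 'c'})
print(no(kings) == {frozenset(), frozenset({'d'})})  # True
print(is_downward_closed(no(kings)))                 # True
print(is_upward_closed(no(kings)))                   # False
```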

We can also apply the car//red car diagnostic to downward closed expressions,
but in a slightly different way. We’ve already noted that:
• No king owns a red car.
does not imply:

• No king owns a car.


But we can say more: we do get implication in the other direction. If no king
owns a car, it follows that no king owns a red car. That gives us another
diagnostic:

Diagnostic: A noun phrase NP of type ((e, t), t) is downward closed
if and only if given any two verb phrases V1 and V2 such that
everything that V1s also V2s, the sentence:
• NP V2
logically implies the sentence:
• NP V1
Problem 155: Use the diagnostic to determine whether each of the
following is downward closed:
1. Few kings
2. Both kings
3. At most seven kings
4. At least seven kings
5. Only kings
Problem 156: We can extend our diagnostic to impose a four-fold
distinction on determiners. We’ll run the diagnostic with car//red
car, although we could generalize it to any two expressions, one of
which is more restrictive than the other.

D is left monotone up and right monotone up (or ↑ mon↑ ) just in case:

• D kings own a red car implies D kings own a car
• D kings who own a red car laugh implies D kings who own a car laugh

D is left monotone up and right monotone down (or ↑ mon↓ ) just in case:

• D kings own a car implies D kings own a red car
• D kings who own a red car laugh implies D kings who own a car laugh

D is left monotone down and right monotone up (or ↓ mon↑ ) just in case:

• D kings own a red car implies D kings own a car
• D kings who own a car laugh implies D kings who own a red car laugh

D is left monotone down and right monotone down (or ↓ mon↓ ) just in case:

• D kings own a car implies D kings own a red car
• D kings who own a car laugh implies D kings who own a red car laugh

Give two examples of ↑ mon↑ determiners, two examples of ↑ mon↓
determiners, two examples of ↓ mon↑ determiners, and two examples
of ↓ mon↓ determiners.
Problem 157: Sketch a proof that an ((e, t), t) expression is upward
closed if and only if it is right monotone up, and is downward closed
if and only if it is right monotone down.

Finally, consider exactly two kings. Again on the assumption that a, b, and c
are kings, we can mark off ~exactly two kings:
[Diagram: the subsets of e, arranged in rows abcd / bcd acd abd abc / cd bd bc ad ac ab / d c b a / ∅, with ~exactly two kings marked.]


~Exactly two kings is not upward closed. It contains {a, b}, and {a, b} is a
subset of {a, b, c}, but it doesn’t contain {a, b, c}. Similarly, ~exactly two kings
is not downward closed. It contains {a, b}, and {a} is a subset of {a, b}, but it
doesn’t contain {a}. Exactly two kings is thus non-monotonic. It’s neither
right monotone up nor right monotone down.

Problem 158: Is exactly two either left monotone up or left monotone
down? Work through examples carefully to justify your answer.

However, although exactly two kings is non-monotonic, it’s still closely
related to monotonic expressions. In particular, ~exactly two kings is the
intersection of an upward closed set and a downward closed set. Note that:

• ~exactly two kings = {{a, b}, {a, c}, {b, c}, {a, b, d}, {a, c, d}, {b, c, d}} (the sets
whose intersection with the kings has exactly two members).
• {{a, b}, {a, c}, {b, c}, {a, b, c}, {a, b, d}, {a, c, d}, {b, c, d}, {a, b, c, d}} is upward closed.
• {∅, {a}, {b}, {c}, {d}, {a, b}, {a, c}, {b, c}, {a, d}, {b, d}, {c, d}, {a, b, d}, {a, c, d}, {b, c, d}}
is downward closed.
• {{a, b}, {a, c}, {b, c}, {a, b, d}, {a, c, d}, {b, c, d}} is the intersection of those two sets.

We can see this clearly with a diagram:

[Diagram: the subsets of e, with the ~exactly two kings region marked in blue as the overlap of the upward closed and downward closed regions above.]

The blue exactly two kings region contains all and only the subsets of e that
are in both of:
1. The upward closed {{a, b}, {a, c}, {b, c}, {a, b, c}, {a, b, d}, {a, c, d}, {b, c, d}, {a, b, c, d}}
2. The downward closed {∅, {a}, {b}, {c}, {d}, {a, b}, {a, c}, {b, c}, {a, d}, {b, d}, {c, d},
{a, b, d}, {a, c, d}, {b, c, d}}

Problem 159: Give English expressions E1 and E2 such that when e
contains the four objects a, b, c, and d, and the kings are a, b, and c,
then:
1. ~E1 = {{a, b}, {a, c}, {b, c}, {a, b, c}, {a, b, d}, {a, c, d}, {b, c, d}, {a, b, c, d}}
2. ~E2 = {∅, {a}, {b}, {c}, {d}, {a, b}, {a, c}, {b, c}, {a, d}, {b, d}, {c, d}, {a, b, d}, {a, c, d}, {b, c, d}}

AI Strong and Weak Determiners

[Coming soon.]

AJ Definite and Indefinite Descriptions are ((e,t),t)

Earlier we analyzed definite descriptions like the winner of the race as type
e, with the definite article the of type ((e, t), e) so that it can combine with a
type (e, t) noun to pick out an object.

But this analysis has the disadvantage of treating definite descriptions quite
differently from other syntactically similar noun phrases. Compare:
• Every linguist: analyzed as ((e, t), t)
• Some linguist: analyzed as ((e, t), t)

• No linguist: analyzed as ((e, t), t)


• The linguist: analyzed as ((e, t), e)
Fortunately, we can also treat definite descriptions as type ((e, t), t), and unify
the category. To do this, we need to figure out, for example, what second-order
property to use as the semantic value of the winner of the race.

Suppose that Usain Bolt is the winner of the race. What kind of property (that
is, what kind of (e, t) value) does laughed need to pick out in order for The
winner of the race laughed to be true? What’s needed is that laughed pick
out a property that Usain Bolt has. More generally, for:

• The F is G
to be true, we need is G to pick out a property had by whatever object is the
unique F object.

So we want the winner of the race to pick out the second order property:
• Being a property that Usain Bolt has.
and more generally, we want the F to pick out the second order property:
• Being a property that the unique object having the property F has.

That leads to the following proposal:


• ~the N = λx(e, t) .(the unique object y such that ~N(y) = > is such that
x(y) = >)

Or, if we want to avoid using the word the in giving the semantic value of the,
we can rewrite this as:
• ~the N = λx(e, t) .(∃y(~N(y) = > ∧ ∀z(~N(z) = > → z = y) ∧ x(y) = >))

Problem 160: Should we care about whether we use the word the
in giving the semantic value of the word the? This isn’t a feature
we’ve avoided in other cases, as can be seen in values like:
• ~laughed = λx.x laughed
• ~linguist = λx.x is a linguist

Is there something helpful gained in the second clause above for
~the N that uses the logical quantifiers ∃ and ∀ rather than the
English the? If so, should we be trying to gain similar helpful
things in other cases by, for example, not using the word laughed
in giving the semantic value of laughed?

It’s then easy to extract a semantic value specifically for the:

• ~the = λw(e, t) .λx(e, t) .(the unique object y such that w(y) = > is such
that x(y) = >), or equivalently:
• ~the = λw(e, t) .λx(e, t) .(∃y(w(y) = > ∧ ∀z(w(z) = > → z = y) ∧ x(y) = >))
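The existence-and-uniqueness entry can be sketched directly in code (Python; the extension-set encoding and the example names are our illustrative choices, not the text's official machinery):

```python
E = {'a', 'b', 'c'}

def the(noun):
    """~the as ((e, t), ((e, t), t)): maps a noun extension N to the
    second-order property true of X iff there is exactly one N,
    and that unique N is in X."""
    return lambda X: len(noun) == 1 and next(iter(noun)) in X

winner = frozenset({'a'})         # a unique winner: a
laughed = frozenset({'a', 'c'})
cried = frozenset({'b'})

print(the(winner)(laughed))  # True: the unique winner is in ~laughed
print(the(winner)(cried))    # False
```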

Problem 161: Return to the example of The bald man killed Belicoff
from section Y. Give another full computation of the semantic value
of this sentence and its parts using the new treatment of the as type
((e, t), t).

We’ve now considered two analyses of the winner of the race: one on
which it is of type e and picks out a specific object, and one on which it is
of type ((e, t), t), and picks out a second-order property (roughly, the second-
order property of being a property had by the object that was the semantic value
on the type e analysis). In many cases these two analyses end up producing
all of the same results when the winner of the race is used in a sentence.
However, there are cases in which the two analyses diverge in their larger
predictions:
1. Suppose no one wins the race. (Perhaps no one even finishes the race, as
in the 2019 Barkley Marathons.) Then when the is treated as type ((e,
t), e), and thus the winner of the race is of type e, there is no object
for the winner of the race to pick out. Thus ~the winner of the
race is undefined, and sentences containing that phrase are semantically
defective. However, when the winner of the race is treated as type
((e, t), t), it is still defined.
2. Suppose more than one person wins the race. (There is a tie, as between
Allyson Felix and Jeneba Tarmoh in the 100 meter dash at the 2012 U.S.
Olympic trials.) Then when the is treated as type ((e, t), e), and thus the
winner of the race is of type e, there is no unique object for the winner
of the race to pick out. Thus ~the winner of the race is undefined,
and sentences containing that phrase are semantically defective. However,
when the winner of the race is treated as type ((e, t), t), it is still defined.
The type e treatment of definite descriptions made them semantically defective
when the restricting noun wasn’t satisfied by exactly one object. What happens
with the type ((e, t), t) treatment in these cases?

Let’s work through a case carefully. Suppose type e contains three objects a, b,
and c, and that noun N is true of a and b, but not of c. What is ~the N? We
have:
• ~N = [a → >, b → >, c → ⊥]

Therefore:

• ~the N
• = ~the(~N)
• = λw(e, t) .λx(e, t) .(∃y(w(y) = > ∧ ∀z(w(z) = > → z = y) ∧ x(y) = >))([a → >, b → >, c → ⊥])
• = λx(e, t) .(∃y([a → >, b → >, c → ⊥](y) = > ∧ ∀z([a → >, b → >, c → ⊥](z) = > → z = y) ∧ x(y) = >))

But notice that ∀z([a → >, b → >, c → ⊥](z) = > → z = y) is equivalent to
∀z((z = a ∨ z = b) → z = y). The truth of that sentence for any value of y
requires both that y = a and that y = b. We can’t have y identical both to a
and to b, so ∀z([a → >, b → >, c → ⊥](z) = > → z = y) is false for every value
of y.

Since ∀z([a → >, b → >, c → ⊥](z) = > → z = y) is false for every value of y,
there is no suitable value for y, so the existential quantification
∃y([a → >, b → >, c → ⊥](y) = > ∧ ∀z([a → >, b → >, c → ⊥](z) = > → z = y) ∧ x(y) = >)
is false, no matter what the input variable x is.

As a result, ~the N maps any input (e, t) value to ⊥. That means ~the N is
the empty second-order property: the second-order property that is not had by
any first-order property. And as a result of that, we conclude that:
• The N VPs

is false for any verb phrase.
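The collapse to falsehood can be confirmed by enumerating every possible verb-phrase extension over a three-entity domain (a Python sketch; properties are modeled as their extension sets, which is our illustrative convention):

```python
from itertools import chain, combinations

E = {'a', 'b', 'c'}

def subsets(s):
    """All subsets of s, as frozensets."""
    xs = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

def the(noun):
    """~the N on the ((e, t), t) analysis: true of X iff noun's
    extension has exactly one member and that member is in X."""
    return lambda X: len(noun) == 1 and next(iter(noun)) in X

N = frozenset({'a', 'b'})   # N is true of both a and b: uniqueness fails

# ~the N maps every (e, t) input to ⊥:
print(all(the(N)(X) is False for X in subsets(E)))  # True
```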

Notice the difference between the two accounts. Suppose there is more than
one winner of the race. Then:
1. When ~the is type ((e, t), e): The winner of the race laughed is se-
mantically defective: it does not have a semantic value and is neither true
nor false.
2. When ~the is type ((e, t), t): The winner of the race laughed is false.

Problem 162: What happens when there is no winner of the race?


What is the semantic value of the winner of the race and of The
winner of the race laughed on the ((e, t), t) analysis?

Problem 163: So far we have seen the ((e, t), e) and the ((e, t),
t) analyses diverge in cases in which the first analysis makes a
sentence semantically defective while the second analysis makes
that sentence false. Are there cases in which the first analysis makes
a sentence semantically defective while the second analysis makes
that sentence true? If so, give such a case. How plausible is the
result?

Problem 164: We’ve seen that the ((e, t), e) analysis of the builds in
assumptions of existence and uniqueness – if there is no N, or more than
one N, the phrase the N lacks a semantic value and is semantically
defective. The ((e, t), e) and the ((e, t), t) cases thus come apart when
existence or uniqueness fail. But there are other cases in which the
two analyses differ. Consider sentences such as:
1. Every philosopher admires the linguist he learned semantics
from.
2. No linguist agreed with the paper she read.
Explain why these definite descriptions cause problems for the ((e,
t), e) approach. We don’t yet have tools in place for giving adequate
analyses of such sentences using the ((e, t), t) approach either, but
see if you can say anything about how that approach might provide
profitable routes for development.

Treating definite descriptions as ((e, t), t) rather than ((e, t), e) thus allows us
to trade out semantic defects for simple falsehood. Whether that’s good or bad
depends on what we think about sentences with definite descriptions when the
description isn’t satisfied by a unique object.

Problem 165: For each of the following sentences, decide whether it
is best characterized as true, false, or semantically defective.
1. The king of France is bald.

2. The first man on the moon was the king of France.
3. The king of France read Anna Karenina.
4. The king of France wrote Anna Karenina.
5. If France has a king, then the king of France probably
lives in Paris.

In addition to definite descriptions, there are also indefinite descriptions.
Compare:
• The linguist // A linguist
• The winner of the race // A winner of the race // A winner of a
race
Indefinite descriptions can also be put under the ((e, t), t) umbrella. We treat
indefinite descriptions as, in essence, definite descriptions with the uniqueness
requirement removed. Thus we get:

• ~a N = λx(e, t) .(an object y such that ~N(y) = > is such that x(y) = >)

• ~a N = λx(e, t) .(∃y(~N(y) = > ∧ x(y) = >))

• ~a = λw(e, t) .λx(e, t) .(a y such that w(y) = > is such that x(y) = >), or
equivalently:
• ~a = λw(e, t) .λx(e, t) .(∃y(w(y) = > ∧ x(y) = >))

A linguist is thus semantically the same as some linguist – the two expres-
sions will be interchangeable in all contexts.
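That interchangeability claim can be checked exhaustively over a small model (a Python sketch; the extension-set encoding is our illustrative convention):

```python
from itertools import chain, combinations

E = {'a', 'b', 'c', 'd'}

def subsets(s):
    """All subsets of s, as frozensets."""
    xs = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

def some(noun):
    """~some N: the sets with non-empty intersection with N."""
    return {X for X in subsets(E) if X & noun}

def a(noun):
    """~a N: ∃y(N(y) ∧ X(y)), existence with no uniqueness clause."""
    return {X for X in subsets(E) if any(y in X for y in noun)}

# ~a N and ~some N coincide for every possible noun extension:
print(all(a(N) == some(N) for N in subsets(E)))  # True
```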

Problem 166: Consider the sentence:


• If you give a mouse a cookie, he’s going to ask for a
glass of milk.
This sentence contains three indefinite descriptions. But only one
of them works well with the ((e, t), t) analysis of indefinites we have
been describing. Which one works, and why don’t the other two
work?

Suppose we replace each indefinite description with a corresponding
some phrase:
• If you give some mouse some cookie, he’s going to ask
for some glass of milk.
How (if at all) does this change the meaning of the sentence? Be as
precise as you can in describing any meaning changes.

Problem 167: Consider the problems created by certain kinds of an
N and the N sentences, such as:

1. Aristotle became a philosopher.
2. Obama became the president.
Explain carefully why the ((e, t), t) semantic values for ~a and
~the create difficulties here. (You may need to say something
about the semantic value for ~became in order to tell the story in
detail.) Do the ((e, t), e) semantic values work any better? Is there
another semantic story for ~a philosopher and ~the president
that handles this data well?

Problem 168: What is the difference in meaning between singular


and plural definite descriptions? What, for example, makes the
linguists semantically different from the linguist? What hap-
pens if we apply the ((e, t), t) semantic value of the to ~linguists
rather than to ~linguist?

In trying to answer that question, you’ll probably realize that, given
everything we’ve said so far, there’s no difference between ~linguists
and ~linguist. How could we make these two semantic values
different? Once they are made different, what happens when we
use them as input to ~the?

AK Names are ((e,t), t)

In the previous section we showed that we could stop treating definite descrip-
tions as type e, and instead unify them with their grammatical relatives such
as every linguist and no philosopher under a general treatment as type ((e,
t), t). Once we do this, we’re left with only proper names in category e.

Proper names don’t share the same syntactic similarity that every linguist, no
philosopher, and the king share – proper names aren’t formed by attaching
a determiner such as every, some, or no to a noun phrase. But proper names
do have the same syntactic distribution that quantified noun phrases have –
anywhere a proper name occurs, a quantified noun phrase could be put in its
place, and everywhere that a quantified noun phrase occurs, a proper name
could be put in its place.

Problem 169: There are some counterexamples to the claim that


proper names and quantified noun phrases have the same syntactic
distribution. Consider three possible counterexamples:
1. Proper names can be combined with determiners, as in:
• A thief stole three Picassos from the museum.
• There are few Davids in my class this year.

But quantified noun phrases can’t be combined with (addi-
tional) determiners:
• #A thief stole three many paintings from the museum.
• #There are few the tallest boy in my class this year.
2. Some quantified noun phrases can be used in the restrictor of
another quantified noun phrase when put in the genitive/possessive:
• All of the linguists attended the lecture.
• Few of the students did well on the exam.
But proper names can’t be used in these same positions:
• #All of Chomsky attended the lecture.
• #Few of Aristotle did well on the exam.
3. Some quantified noun phrases can be combined with collective
verbs such as met and surrounded:
• Several students met in the park.
• The protestors surrounded the house.
But proper names can’t be used in these same positions:
• #Several Plato met in the park.
• #Socrates surrounded the house.
Not all of these data points are equally convincing. Which one do
you think presents the strongest challenge to the claim that proper
names and quantified noun phrases have the same syntactic distri-
bution? Are there any lessons to be learned about the semantics
either of proper names or of quantified noun phrases from that data
point?

Given the sameness of syntactic distribution, it would be nice if proper names
and quantified noun phrases had the same type. One reason for this is the
possibility of conjunctions of the two. Right now we can account for the
conjunction of two proper names as follows:

e (e, e)

Socrates (e, (e, e)) e

and Plato
And we can account for the conjunction of two quantified noun phrases as
follows:

((e, t), t)

((e, t), t)

((e, t),((e, t), t)) (e, t)


(((e, t), t), (((e, t), t), ((e, t), t))) ((e, t), t)
every linguist
and
((e, t),((e, t), t)) (e, t)

some philosopher
But what do we do with the conjunction of a proper name and a noun phrase:

Aristotle ? ((e, t), t)

and
((e, t),((e, t), t)) (e, t)

some linguist
There are ways to make the typing work here, but they’re not theoretically
pretty. We have to give up the attractive idea that and is always of type (α, (α,
α)) for some type α.

Problem 170: Give two typings for and that allow the typing of
Aristotle and some linguist to work. What happens if we consider
some linguist and Aristotle instead?

Fortunately, it is possible to give proper names semantic type ((e, t), t) as well.
Suppose Aristotle laughed is true. Then we want to use as ~Aristotle a
second-order property that is possessed by the first-order property of laugh-
ing. Suppose Aristotle cried is false. Then we want to use as ~Aristotle
a second-order property that is not possessed by the first-order property of
crying.

More generally, we want to use as ~Aristotle some second order property


that:
1. Is possessed by every first-order property that Aristotle has.
2. Is not possessed by any first-order property that Aristotle does not have.

What we want, then, is the second-order property being a property that Aristotle
has. In the lambda notation, we want:
• ~Aristotle = λx(e, t) .x(Aristotle)

Given this semantic value, we can then calculate ~Aristotle laughed:


• ~Aristotle laughed
• = ~Aristotle(~laughed)
[Note that the functional application here is the reverse of the form
~laughed(~Aristotle) that we see when ~Aristotle is type e.]

• = λx.x(Aristotle)(λy.y laughed)
• = λy.y laughed(Aristotle)
• = Aristotle laughed

And everything works out perfectly.

This idea generalizes. Take any proper name N. We want to distinguish the old
e type semantic value for N and the new ((e, t), t) type semantic value for N. Just
as a notational convenience, we’ll use ~Ne and ~N((e,t),t) to pick out these two
semantic values. Then we can give the following general rule for ~N((e,t),t) :

• ~N((e,t),t) = λx.x(~Ne )
Each proper name is thus semantically associated with the second-order prop-
erty of being a property had by the (ordinary, e-type) referent of the name.
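Here is the rule in executable form (Python; the tiny model and the names in it are our illustration):

```python
def lift(o):
    """~N((e,t),t) = λx.x(~Ne): the second-order property of being a
    (first-order) property that o has."""
    return lambda X: o in X

laughed = frozenset({'aristotle'})   # ~laughed as an extension set

aristotle_e = 'aristotle'            # the type e semantic value
aristotle_gq = lift(aristotle_e)     # the type ((e, t), t) semantic value

# The two directions of application agree:
print(aristotle_e in laughed)   # True: ~laughed(~Aristotle), type e style
print(aristotle_gq(laughed))    # True: ~Aristotle(~laughed), lifted style
```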

Problem 171: When proper names are treated as ((e, t), t) quantifiers,
are they upward closed, downward closed, or neither? Are they
positive strong, negative strong, or neither?

Problem 172: Suppose type e contains some n objects o1 through on .


Think of the diagram of subsets of e of the sort we’ve been using to
think about ((e, t), t) values in the last few sections. Give a general
characterization of the subsets of that diagram that are suitable for
being ((e, t), t) semantic values for proper names. How many such
subsets are there of the diagram?

When names are of type ((e, t), t), we have an easy story about how names and
quantified noun phrases can link together via conjunction:

((e, t), t)

((e, t), t) (((e, t), t), ((e, t), t))

Aristotle

((e, t), t), (((e, t), t),((e, t), t)) ((e, t), t)

and
((e, t),((e, t), t)) (e, t)

some linguist
What exactly this typing analysis predicts for the semantic value of these
conjunctions, though, depends on what we use as ~and – which particular
member of the rather complicated type (((e, t), t), (((e, t), t), ((e, t), t))) we assign
to it. We’ll defer that question until later when we take a more careful look at
sentential connectives.

Going forward, we’ll usually for simplicity continue to treat proper names as
type e, but we’ll keep available the tool of moving to quantified noun phrase
values of type ((e, t), t) as something to try when the type e approach is creating
problems.

AL Type Lifting

In the previous section we set out a general method for taking a semantic value
of type e and creating via it a new semantic value of type ((e, t), t). That general
method can be applied to starting points other than e. Suppose, for example,
that we have some expression E of type t. Type (t, t) can then be thought of as
the type of properties of truth values.

Problem 173: Type (t, t) has four members. Say what those four
members are. So when we think of (t, t) as being the type of proper-
ties of truth values, we end up with four properties of truth values.
What are the resulting four properties? Should there be more than
four properties of truth values?

Given that (t, t) is the type of properties of truth values, ((t, t), t) is the type of
properties of properties of truth values, or of second-order properties of truth values.

Problem 174: How many members of ((t, t), t) are there?

Just as we set ~Aristotle((e,t),t) to be the second-order property of being a
property had by ~Aristotlee , we can assign E a higher-typed property of
being a property had by the t-type value of E. Thus we have:
• ~E((t,t),t) = λx(t, t) .x(~Et )

If, for example, ~E = >, the ((t, t), t) value associated to E by this rule is the set:

• { [> → >, ⊥ → >] , [> → >, ⊥ → ⊥] }

Problem 175: If ~Et is ⊥, what is ~E((t,t),t) ?

This idea can be generalized to any expression E of any type α. The type (α,
t) is then the type of properties of α’s, and the type ((α, t), t) is the type of
second-order properties of α’s. If ~Eα is the original type-α semantic value of
E, we then define a new semantic value of E:
• ~E((α, t), t) = λx(α, t) .x(~Eα )

We’ll refer to this new semantic value for E as the type-lifted semantic value,
and will use ~E+ to refer to type-lifted values. Any expression can be given a
type-lifted value using this procedure.
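The general rule is a single line of code (a Python sketch; the truth-value example below is our own):

```python
def lift(v):
    """~E+ = λx.x(~E): the second-order property of being a
    property that v has, for a value v of any type."""
    return lambda x: x(v)

# A type t example: take ~E = > (modeled as True).
lifted = lift(True)

identity = lambda p: p        # the (t, t) value [> → >, ⊥ → ⊥]
negation = lambda p: not p    # the (t, t) value [> → ⊥, ⊥ → >]

print(lifted(identity))   # True: identity maps > to >
print(lifted(negation))   # False: negation maps > to ⊥
```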
Problem 176: Suppose that e contains the three objects a, b, and c
and that a and b are philosophers but c is not. Determine each of
the following:
1. ~philosopher+
2. ~some philosopher+
3. ~no philosopher+

The availability of type-lifting suggests a general semantic processing story of
crash and lift. On this story, we start with lower-level semantic values for
expressions, and then if those lower-level semantic values don’t allow things
to functionally combine at some higher-level, we apply type-lifting to obtain
new higher-typed semantic values that might be able to combine successfully.

Crash and lift allows the following story for Aristotle and some linguist.
We start with ~Aristotle of type e. That gives us:

e (((e, t), t), ((e, t), t))

Aristotle

((e, t), t), (((e, t), t),((e, t), t)) ((e, t), t)

and
((e, t),((e, t), t)) (e, t)

some linguist
The type e ~Aristotle then won’t functionally combine with the type (((e, t),
t), ((e, t), t)) ~and some linguist, so we type-lift ~Aristotle:

((e, t), t)

e+ =((e, t), t) (((e, t), t), ((e, t), t))

Aristotle

((e, t), t), (((e, t), t),((e, t), t)) ((e, t), t)

and
((e, t),((e, t), t)) (e, t)

some linguist
The type-lifted value ~Aristotle+ will combine with ~and some linguist,
so now we get a successful computation of ~Aristotle and some linguist.

Crash and lift lets us combine the simplicity of the lower-typed semantic values,
such as simple objects as semantic values of proper names, with the increased
functional flexibility of higher-typed values like ((e, t), t) by bringing the higher-
typed values into play only as a repair strategy when the lower-typed values
won’t work out.

AM Context-Sensitive Language

The semantic machinery we have been developing is centered around the idea
that sentences are of semantic type t, and thus have truth values as their se-
mantic values. There is then a puzzle about how to fit into this framework a

sentence such as:
• I am Greek.
This sentence is true when uttered by Aristotle, but false when uttered by
Abraham Lincoln. So what should we use as ~I am Greek? Neither > nor ⊥
seems right – > isn’t faithful to the use of the sentence by Lincoln, and ⊥ isn’t
faithful to the use of the sentence by Aristotle.

Problem 177: What is wrong with the following proposal for the
semantic value of I am Greek:
We should have:
• ~I am Greek = I am Greek
just as we have:
• ~Aristotle is Greek = Aristotle is Greek.
• ~Lincoln is Greek = Lincoln is Greek.
And it's easy enough to see how we'll get this result.
We'll just have ~I = I, in the same way that we have
~Aristotle = Aristotle and ~Lincoln = Lincoln.
There are various problems that can be raised for the proposal.
Consider, for example, the question of who is making the proposal.

The basic problem is clear: the sentence I am Greek isn’t the kind of sentence
that gets a truth value. That’s because the sentence contains the word I, and
the word I picks out different people depending on who is speaking. When
Aristotle is speaking, I picks out Aristotle. Aristotle’s utterance of I am Greek
thus says the same thing as an utterance of Aristotle is Greek, and hence is
true. But when Lincoln is speaking, I picks out Lincoln. Lincoln’s utterance
of I am Greek thus says the same thing as an utterance of Lincoln is Greek,
and hence is false.

So there’s no such thing as what is said by the sentence I am Greek. That’s why
we can’t assign a truth value to the sentence – it’s the kind of sentence that says
different things as used by different speakers.

I isn’t the only word that creates this effect of varying in semantic value from
use to use. Consider other similar words such as:

• you, we, now, here, today

Problem 178: Give five more examples of words whose semantic value varies from use to use. Include at least one questionable example: one word that might be best understood as having different semantic values on different uses, but might not. For any questionable examples, comment briefly on the factors for and against viewing it as having use-variable semantic values.

Now picks out different times with different uses. That’s why we can’t hope
to have a semantic theory simply assign a truth value to a sentence such as
Chomsky is laughing now – that sentence can be true as used at one time and
false as used at another time. In this way now is like I. But there’s also an
important difference between now and I:
1. To determine the semantic value of a use of now, we need to know what
time the use occurred.
2. To determine the semantic value of a use of I, we need to know who the
speaker was for that use.
To handle all of these different use-variable words, let’s introduce the idea of
a context. Informally, a context is a situation in which a sentence is used.
But formally it will be easier to treat contexts as ordered sequences of bits of
information about the situation of use that are then useful in interpreting use-
variable words. In particular, we will treat a context as an ordered quadruple:
• ⟨Speaker, Audience, Time, Place⟩
When Aristotle speaks to Plato in 350 B.C. in Athens, he speaks in the context ⟨Aristotle, Plato, 350 B.C., Athens⟩. When Lincoln speaks to Hannibal Hamlin in 1862 in the White House, he speaks in the context ⟨Lincoln, Hamlin, 1862, White House⟩. The words I, you, now, and here can then have different semantic values in these different contexts.
To make use of contexts, we will relativize semantic values to a context.
So instead of introducing a single semantic value ~I into our theory, we will
introduce many semantic values for I – one for each context. For any context
c, ~Ic is the semantic value of I in, or relative to, the context c. Thus:

• ~I⟨Aristotle, Plato, 350 B.C., Athens⟩ = Aristotle

• ~I⟨Lincoln, Hamlin, 1862, White House⟩ = Lincoln


The superscripted context in ~Ic thus indicates which context we are using in
determining the context-relativized semantic value of I.

It’s not just the semantic value of I that we want relativized to a context.
To capture our starting observation that I am Greek is sometimes true (for
example, when spoken by Aristotle) and sometimes false (when spoken by
Lincoln), we want the semantic value of the entire sentence to be relativized to
context, so that we can have:

• ~I am Greek⟨Aristotle, Plato, 350 B.C., Athens⟩ = >

• ~I am Greek⟨Lincoln, Hamlin, 1862, White House⟩ = ⊥


But to get context-relativized semantic values for entire sentences, we need to
take another look at how semantic values of complex expressions are deter-
mined. Our earlier principle of functional application tells us:

• ~I am Greek = ~am Greek(~I)
But this won’t work any more. It doesn’t give us a context-relativized semantic
value for I am Greek, and it appeals to an unrelativized semantic value for I
that we’re not trying to provide.

So instead we’ll make use of a principle of bf relativized functional application.


We will say:
• ~I am Greekc = ~am Greekc (~Ic )
Relativized functional application thus mirrors simple functional application,
but relativizes all of the semantic values to a particular context. The context c
can be any context we want – we’re thus committed to:

• ~I am Greek⟨Aristotle, Plato, 350 B.C., Athens⟩ =
~am Greek⟨Aristotle, Plato, 350 B.C., Athens⟩ (~I⟨Aristotle, Plato, 350 B.C., Athens⟩)

• ~I am Greek⟨Lincoln, Hamlin, 1862, White House⟩ =
~am Greek⟨Lincoln, Hamlin, 1862, White House⟩ (~I⟨Lincoln, Hamlin, 1862, White House⟩)
More generally, we have:
• Relativized Functional Application: Given any complex expression α β
and any context c, ~α βc = ~αc (~βc ) or ~α βc = ~βc (~αc ).

We’ve already seen that ~IhAristotle, Plato, 350 B.C., Athensi =Aristotle, but
what should we make of ~am GreekhAristotle, Plato, 350 B.C., Athensi ? For
simplicity, let’s assume that the semantic value ~am Greek, that we would
have used (before turning our eye to matters
" of context-sensitivity)
# in analyzing
Aristotle → >
~Aristotle is Greek, is the function . What function
Lincoln → ⊥
should we then use for the semantic value of am Greek when relativized to a
particular context, such as the context of Aristotle speaking to Plato in Athens
in 350 B.C.?

Am Greek doesn’t contain any use-variable expressions, so it doesn’t matter


what context the phrase is spoken in – it will always pick out the same context.
(This is probably too simplistic. The time of a context probably interacts in some
way with the present tense of the verb in am Greek. We’ll continue our policy
of ignoring tense for now, though.) As a result, we have:

• ~am Greek⟨Aristotle, Plato, 350 B.C., Athens⟩ = ~am Greek⟨Lincoln, Hamlin, 1862, White House⟩ = ~am Greekc = [Aristotle → >, Lincoln → ⊥], for any context c.

We can thus distinguish between context sensitive and context insensitive expressions. An expression E is context sensitive if there are any two contexts c1 and c2 such that ~Ec1 ≠ ~Ec2 . Otherwise, E is context insensitive. Context
insensitive expressions thus have the same semantic value relative to every
context. In effect, the relativization of semantic values to contexts makes no
difference for context insensitive expressions, and is done just to give a uni-
form presentation to the semantic machinery that allows us to use relativized
functional application throughout.
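Over a finite stock of contexts, this definition can be checked mechanically. A small Python sketch with illustrative contexts and values:

```python
contexts = [
    ('Aristotle', 'Plato', '350 B.C.', 'Athens'),
    ('Lincoln', 'Hamlin', '1862', 'White House'),
]

def context_sensitive(sem):
    """E is context sensitive iff its value differs between some pair of contexts."""
    values = [sem(c) for c in contexts]
    return any(v != values[0] for v in values)

sem_I = lambda c: c[0]            # ~I relative to c is the speaker, c(1)
greek = {'Aristotle': True, 'Lincoln': False}
sem_am_greek = lambda c: greek    # the same function relative to every context

print(context_sensitive(sem_I))         # True
print(context_sensitive(sem_am_greek))  # False
```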

Problem 179: Calculate in detail ~Aristotle is Greek⟨Aristotle, Plato, 350 B.C., Athens⟩. Then calculate in detail ~Aristotle is Greek⟨Lincoln, Hamlin, 1862, White House⟩, and compare the results. Will we have ~Aristotle is Greekc1 = ~Aristotle is Greekc2 for any two contexts c1 and c2 ?

Now we can give general rules for some context-sensitive terms. I, for example,
always picks out the speaker in the context. Given the way we have defined
contexts as ordered quadruples, that means I always picks out the first member
of the quadruple that is the context. That is:
• For any context c, ~Ic = c(1)
(where c(j) for any j picks out the jth element of c, if it has at least j elements).
Similarly we can say:
• ~youc = c(2)
• ~nowc = c(3)
• ~herec = c(4)
Let’s consider a test case for these semantic values. Consider an utterance of I
admire you, made by Aristotle to Plato in 350 B.C. in Athens. We are thus inter-
ested in determining ~I admire youhAristotle, Plato, 350 B.C., Athensi . For
convenience, let’s use c as a name for the context hAristotle, Plato, 350 B.C., Athensi.
Then we have:
• ~I admire youc
• = ~admire youc (~Ic )
• = ~admire youc (c(1))
• = ~admire youc (Aristotle)
• = (~admirec (~youc ))(Aristotle)
• = (~admirec (c(2)))(Aristotle)
• = (~admirec (Plato))(Aristotle)
• = ((λx.λy.y admires x)(Plato))(Aristotle)
• = (λy.y admires Plato)(Aristotle)
• = Aristotle admires Plato
We thus get the (desirable) result that an utterance of I admire you in context c,
spoken by Aristotle addressing Plato, is equivalent to an utterance of Aristotle
admires Plato.
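The derivation above can be replayed computationally. A Python sketch of relativized functional application, with ~admire rendered (as our own simplification) as a function building a sentence string:

```python
c = ('Aristotle', 'Plato', '350 B.C.', 'Athens')

sem = {
    'I':      lambda ctx: ctx[0],
    'you':    lambda ctx: ctx[1],
    # ~admire takes x then y, yielding the claim that y admires x:
    'admire': lambda ctx: lambda x: lambda y: f'{y} admires {x}',
}

# Relativized functional application, working up the tree:
admire_you = lambda ctx: sem['admire'](ctx)(sem['you'](ctx))  # ~admire you relative to ctx
i_admire_you = lambda ctx: admire_you(ctx)(sem['I'](ctx))     # ~I admire you relative to ctx

print(i_admire_you(c))  # Aristotle admires Plato
```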

Problem 180: Consider two contexts c1 = ⟨α, β, γ, δ⟩ and c2 = ⟨β, α, γ, δ⟩. Show that ~I admire youc1 = ~You admire mec2 . What do we learn about communication and the way that contexts are connected to specific communicative acts from this equivalence?

Problem 181: Our semantic rules for I, you, now, and here tell
us how the meanings of these words are related to the context in
which they are uttered. But we haven’t yet said anything about how
the context, when considered as an ordered quadruple ⟨α, β, γ, δ⟩, is related to the speech situation in which an utterance is produced.
It’s tempting to think that contexts are connected to utterances in
the following way:
Utterances to Contexts: Utterance u is made in (and
should be evaluated relative to) context c = ⟨α, β, γ, δ⟩,
where:
1. α is the speaker in u
2. β is the audience in u
3. γ is the time of production of u
4. δ is the place of production of u
Consider each of the following problem cases for the Utterances to
Contexts thesis. In each case, explain why it is a problem (be specific
about which component of Utterances to Contexts is challenged by
the case), and consider what might be done in response to that
problem. (Should Utterances to Contexts be modified in some way
to handle the problem case? Does the problem case show that the
underlying idea of Utterances to Contexts needs to be given up
altogether?)
1. Aristotle, speaking to Plato and Socrates, says I admire you.
2. Professor X, leaving his office, puts a note on the door saying
I am not here now -- be back soon.
3. The waiter arrives with everyone’s orders, and James says I am
the ham sandwich. Later, leaving the restaurant James says I
am parked down the alley.
4. Questioned about how unusual it is to have a South American
pope, Francis says Usually I’m Italian.

5. Members of Jefferson Smith’s re-election campaign design and
arrange for billboards with a picture of Smith and a cap-
tion below reading I promise to fight against graft in
Congress.
6. Socrates says Only I know myself to be ignorant.
7. Sarah, answering the phone, says Oh, I thought you were
my mother calling.
As one possible (but not the only) test, you might consider whether
there is any change of meaning when the pronoun is replaced by an
appropriate proper name (for example, the name of the speaker).

Can you provide any additional problem cases for Utterances to Contexts?
Problem 182: We’ve been considering contexts such as hAristotle,
Plato, 350 B.C., Athensi. But of course the conversation between
Aristotle and Plato occurs in a location more precise than just Athens.
It occurs, for example, in the agora in Athens. or at the northwest corner
of the Altar of the Twelve Gods in the agora in Athens. Similarly, the
conversation between Aristotle and Plato happens at a time more
precise than just 350 B.C.. It happens, for example, at 11:30 in the
morning on the third day of Hekatombaion in 350 B.C..

As a result, there is a question about how to think about contexts in which utterances are made. Are Aristotle's utterances made in the context ⟨Aristotle, Plato, 350 B.C., Athens⟩, or in the context ⟨Aristotle, Plato, 11:30 in the morning on the third day of Hekatombaion in 350 B.C., northwest corner of the Altar of the Twelve Gods in the agora in Athens⟩? Discuss how sentences like the following could be relevant to answering this question:
the following could be relevant to answering this question:
1. You’re finally here!
2. We have groves of olive trees here.
3. I am writing a book on metaphysics now.
4. It’s time to go meet Alexander now.
Do any similar issues arise in thinking about the first and second positions in the context quadruple and their relation to the pronouns I and you?
Problem 183: We haven’t yet given a semantic value for the context-
sensitive word today. Consider the following proposals:
1. In a context c = hα, β, γ, δi, ~todayc = the 24 hour time period
centered around time γ.
2. In a context c = hα, β, γ, δi, ~todayc = the day in which γ is
located.

3. In a context c = ⟨α, β, γ, δ⟩, ~todayc = the period from the midnight, according to the time zone of δ, prior to γ to the midnight, according to the time zone of δ, after γ.
4. Add a fifth element ε to contexts, so that a context c takes the form ⟨α, β, γ, δ, ε⟩, and then have ~todayc = c(5) = ε.
How might we decide among these proposed semantic values?
Some cases that might be useful in thinking about the decision
process:
1. If we were on Venus, today would last another 116 days.
2. If we were on Venus, the Superbowl would be played today.
(uttered on January 1)
3. If the earth rotated faster, today would already be over.
4. I’ll get the proposal to you later today (spoken by some-
one on one side of the International Date Line to someone on
the other side of the International Date Line.)
(None of these cases is meant to be uncontroversial in its interpre-
tation and evaluation.)

Another approach to today is to treat it as a phrase made up of the two words to- and day. This approach might allow a unified treatment of today and tomorrow (as well as of the archaic toyear).
Attempt to sketch an approach along these lines. What should we
say about ~to-c ? How do ~to-c and ~dayc combine? (Is there
any good explanation for why we don’t have the expressions toweek
and tomonth?)

Today and this day don’t behave exactly the same. To see this,
compare:
• This day is Christmas (said pointing to December 25 on a
calendar)
• Today is Christmas (said pointing to December 25 on a cal-
endar)
What lessons for the possible semantic value of to- should be drawn
from this contrast?

AN Context Sensitivity, Character, and Avoiding Relativization

In the previous section we made an important fundamental shift in our semantic machinery, by moving from a framework in which we had a simple function ~· that assigns a semantic value (once and for all) to each expression

to a relativized framework in which we had a function ~·c that assigns each
expression a semantic value relative to each context. But it’s not inevitable that
we make this shift and start relativizing semantic values.

Instead, we could introduce more complicated unrelativized semantic values.


One way to do this is to add a third basic semantic type – the type c of contexts.
Then I, for example, can be treated as type (c, e), rather than simply as type
e. The more complicated semantic value for I can easily be extracted from the
relativized semantic value given above. We have:
• ~I=λc.c(1)

Similarly, you is type (c, e), and now and here map contexts to times and places, with:
• ~you=λc.c(2)
• ~now=λc.c(3)
• ~here=λc.c(4)

But of course it’s not just single words such as I and you that need to be
context-sensitive. In the relativizing approach, a sentence such as I laughed
gets a truth value only relative to a context. So if we are going to treat I as hav-
ing an unrelativized semantic value of type (c, e), we need to treat I laughed
as having an unrelativized semantic value of type (c, t).

We could then complicate semantic values in the same way throughout our entire system. Intransitive verbs, rather than being simply (e, t), would be (c, (e, t)). Transitive verbs would shift from (e, (e, t)) to (c, (e, (e, t))). Quantified noun phrases would shift from ((e, t), t) to (c, ((e, t), t)). And so on.

But there is another complication that comes with this approach. The compli-
cated version of ~laughed, for example, is going to be:
• ~laughed=λc.λx.x laughed
But now consider the calculation of ~I laughed:

• ~I laughed
• = ~laughed(~I)
• = (λc.λx.x laughed)(λc.c(1))
But here the computation crashes. λc.λx.x laughed requires a context (a member of type c) as input, but what it's getting as an input is λc.c(1), which is not a context. The source of the crash is clear in the typing. Laughed is type (c, (e, t)) and I is type (c, e). But these two types won't functionally combine – each of them takes type c as input, and neither is of type c, so neither can take the other as input.

We can fix this by changing to a fancier version of functional application. The
key thought is this: before we started adding the new semantic type c to our
system, we would have two expressions of types α and (α, β), and we would
combine them using functional application. After we add type c to the system,
these two expressions become types (c, α) and (c, (α, β)). So we now need to
be able to combine two expressions of type (c, α) and (c, (α, β)). To do this, we
want to take an arbitrary context c, calculate the α value of the first expression
applied to c, calculate the (α, β) value of the second expression applied to c, and
then apply the resulting (α, β) value to the resulting α value. Then we want to
generalize that whole procedure for all choices of c.

It’s not hard to write down a rule for doing this:


• C-Functional Application: Suppose E1 is an expression of type (c, (α, β))
and E2 is an expression of type (c, α), such that:
1. ~E1 = λc.λxα .y(x, c)β
2. ~E2 = λc.y(c)α
Then the c-functional application of ~E1 to ~E2, which we write ~E1[~E2]
(with the square brackets marking the difference between standard func-
tional application and c-functional application) is defined as:
– ~E1[~E2] = λdc .((λc.λxα .y(x, c)β )(d))((λc.y(c)α )(d))
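The rule is easier to see in code than in the lambda term. A Python sketch of c-functional application, with illustrative values for I and laughed:

```python
def c_apply(F, X):
    """~E1[~E2]: distribute an arbitrary context d to both characters, then apply."""
    return lambda d: F(d)(X(d))

char_I = lambda c: c[0]                             # type (c, e)
laughers = {'Aristotle'}                            # illustrative extension
char_laughed = lambda c: (lambda x: x in laughers)  # type (c, (e, t)), context-insensitive

char_I_laughed = c_apply(char_laughed, char_I)      # type (c, t)

print(char_I_laughed(('Aristotle', 'Plato', '350 B.C.', 'Athens')))  # True
```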

Problem 184: Suppose we have two contexts c1 and c2 , and ~I = [c1 → Aristotle, c2 → Plato]. Let f be the function [Aristotle → >, Plato → ⊥], and let ~laughed = [c1 → f, c2 → f]. (We thus represent laughed as context-insensitive.)

Calculate ~I laughed = ~laughed[~I]. Verify that the resulting value is type (c, t) and thus suitable as a semantic value for a sentence. Is the specific (c, t) semantic value we get a sensible one for ~I laughed, given our starting values for ~I and ~laughed?

Semantic values of this complicated sort – values that are functions from con-
texts to our earlier simpler semantic values – are called characters. The character
of I is thus a function from contexts to the ordinary referent of I in contexts. The
ordinary referent of I varies from context to context – in one context, I picks
out Aristotle, and in another context, I picks out Plato. Thus if we want ordi-
nary referents as semantic values for I, semantic values need to be relativized
to contexts. But the character of I doesn’t vary from context to context. No
matter what context we are in, the character of I is the function from contexts
to speakers in (that is, first elements of) those contexts.

What we have seen, then, is that we can avoid relativization of semantic values
to contexts by running our entire system at the level of characters. There are
two prices that we pay when we do this. First, as we’ve already seen, we have
to shift from simple functional application to the somewhat more complicated
c-functional application. Second, because everything is done at the level of
characters, we never actually assign truth values to any sentences. When we
work with relativized semantic values, we do end up saying:

• ~I admire you⟨Aristotle, Plato, 350 B.C., Athens⟩ = >

so our semantic tools directly acknowledge Aristotle’s utterance as true. But


when we work with characters, the only thing our semantic tools say about
Aristotle’s utterance of I admire you is what we say about any utterance
of I admire you – namely, that it has a character that maps any context in
which the first (speaker) element admires the second (audience) element to the
true. Our semantic tools don’t then make any distinction between Aristotle’s
true utterance of I admire you and Carnap’s false utterance to Heidegger of
I admire you. Even a context-insensitive sentence like Aristotle admires
Plato doesn’t get a truth value. What we get is:
• ~Aristotle admires Plato = λc.Aristotle admires Plato = λc.>

That is, Aristotle admires Plato is assigned the character that maps every
context to the true – but isn’t directly assigned a truth value.

Here is one more option for incorporating context sensitivity into our semantic
machinery. We can build context sensitivity into our starting semantic types.
So:
• Instead of type e being the type of objects, type e – or what we might
call type e∗ to avoid confusion – is the type of functions from contexts to
objects. Type e∗ is thus the type of object characters.
• Instead of type t being the type of truth values, type t – or what we might
call type t∗ to avoid confusion – is the type of functions from contexts to
truth values. Type t∗ is thus the type of truth value characters.
So far this is just a notational variant on our previous approach. On the pre-
vious approach, expressions that we originally had put in category e were in
category (c, e), and expressions that we had originally put in category t were
in category (c, t). But e∗ just is (c, e) – both are the collection of functions from
contexts to objects. And t∗ just is (c, t) – both are the collection of functions from
contexts to truth values.

But from here, the two approaches diverge. Once we set e∗ and t∗ as the basic categories, we can (as in our original system) treat intransitive verbs as type (e∗ , t∗ ). That's unlike the previous option, which treated intransitive verbs as type (c, (e, t)). The previous option, that is, assigned to intransitive verbs (e,

t) characters. But that’s not what we’re now doing – (e∗ , t∗ ) isn’t a character,
because it’s not a function from a context to anything.

So what is, for example, ~laughed on this approach? We want ~laughed to be type (e∗ , t∗ ). That is, ~laughed should be a function from functions from contexts to objects to functions from contexts to truth values. In particular, we have:
• ~laughed = λxe∗ .λc.x(c) laughed
The advantage of this is that we can calculate ~I laughed simply by using
standard functional application. The typing works out straightforwardly. Be-
cause ~laughed is type (e∗ , t∗ ) and ~I is type e∗ , ~laughed is a function that
can take ~I as an input. Working through the details, we then have:
• ~I laughed
• = ~laughed(~I)
• = (λxe∗ .λc.x(c) laughed)(λd.d(1))
• = λc.(λd.d(1))(c) laughed
• = λc.c(1) laughed
This is the same semantic value for I laughed that we would have gotten
on the previous character-based approach. It’s a truth value character – a
function that maps each context to the truth value of I laughed as uttered
in that context. This approach thus shares with the character-based approach
the disadvantage of not assigning a truth value to the sentence I laughed,
but only a character. But this time we at least calculate the character without
using a novel procedure of c-functional application – we are able to get by with
standard functional application.
Problem 185: Suppose there are two contexts c1 and c2 , and three
objects a, b, and c. Determine how many things there are in each of
the following types:
1. e∗
2. t∗
3. (e∗ , t∗ )
4. ((e∗ , t∗ ), t∗ )
Give a specific example of a member of type (e∗ , (e∗ , t∗ )). (You
can specify the member either by explicitly setting out the relevant
function or by naming it with an appropriate lambda term.)
Problem 186: Give a plausible (e∗ , (e∗ , t∗ )) value for ~admires.
Using this semantic value, calculate ~Aristotle admires Plato.
(What should ~Aristotle and ~Plato be, if both expressions are
type e∗ ?)

We’ve now seen three different ways to add context-sensitivity to our semantic
theory. All three start with the use of contexts, represented as ordered quadru-
ples of speaker, audience, time, and location. But the three then make use of
contexts in different ways:
1. Relativized Semantic Values: Relativize all semantic values to contexts,
so that instead of using ~I or ~You laughed, we use ~Ic or ~You
laughedc .
2. Character Semantic Values: Use unrelativized semantic values, but change the type of all expressions to add a context argument, so that where we formerly used type (e, t) we now use type (c, (e, t)) and where we formerly used type (t, t) we now use type (c, (t, t)).
3. Setwise Semantic Values: Use unrelativized semantic values, but change the basic types from e (objects) and t (truth values) to e∗ (functions from contexts to objects) and t∗ (functions from contexts to truth values). Then build up semantic values of other expressions in the familiar way from this different starting point, so that where we formerly used type (e, t) we now use type (e∗ , t∗ ) and where we formerly used type (t, t) we now use type (t∗ , t∗ ).
An overview comparison of the features of the three approaches:

                          Relativized Semantic      Character Semantic        Setwise Semantic
                          Values                    Values                    Values
Method of Combining       Relativized Functional    C-Functional              Functional
Semantic Values           Application               Application               Application
Semantic Value of         Truth value (relative     Function from contexts    Function from contexts
Sentences                 to a context)             to truth values           to truth values
Sample Semantic Type      (e, (e, t)) (relative     (c, (e, (e, t)))          (e∗, (e∗, t∗))
(of Transitive Verb)      to a context)
We’re about to set context-sensitivity aside for a while, although there is much
more to say about it and we’ll return to it later. So until further notice, we’ll be
working with a context-insensitive fragment of the language and won’t make
any of these three changes. However, as we’ll see shortly, the relativization
idea has many useful applications outside of context-sensitivity, so we’ll soon
see more relativization enter the system.

When we do return to context-sensitivity, the approach of relativized semantic values will be our default approach, but we'll also occasionally look at how things work out with the other two approaches.

Problem 187: The central difference between character semantic values and setwise semantic values is that using character semantic values replaces every former typing α with a new typing (c, α), thus allowing a different old-style semantic value for each context,

whereas using setwise semantic values replaces only the basic types
e and t with functions from contexts to old-style semantic values,
and then builds up complex types from there. We might then ex-
pect a difference in how well the two approaches deal with context-
sensitivity in expressions that are not (on the old approach) e type
or t type.

To test this, let’s introduce a potential case of a context-sensitive


verb. Suppose that the intransitive verb smile is vague in the fol-
lowing way. Any smile-like curvature of the mouth can be assigned
a number from 0 to 100, where 0 represents a perfectly straight
mouth and 100 represents a maximally exaggeratedly upturned
mouth. There is then no determinate cutoff point for what score
marks a mouth as smiling. Rather, in different contexts, different
amounts of curvature are needed to achieve a smile. We thus add a fifth element to a context, so that a context c is an ordered quintuple ⟨α, β, γ, δ, ε⟩, where ε is a number from 0 to 100 representing the smiling threshold. Relative to c, anyone whose mouth curvature is at or above ε counts as smiling; anyone whose curvature is below ε does not count as smiling.

Suppose, then, that Sam’s mouth curvature is 60, and that c1 (5) = 50
and c2 (5) = 80. We then have:
• ~Sam smilesc1 = >
• ~Sam smilesc2 = ⊥
That’s the result when we implement context-sensitivity by rela-
tivizing semantic values to contexts. Let’s now see how things
work out on the other two approaches.
1. Suppose we implement context-sensitivity by using character
semantic values. Then smiles is type (c, (e, t)). Give an ap-
propriate lambda expression of this type for ~smiles, and
calculate the resulting ~Sam smiles. Is the result plausible?
2. Suppose we implement context-sensitivity by using setwise se-
mantic values. Then smiles is type (e∗ , t∗ ). Give an appropriate
lambda expression of this type for ~smiles, and calculate the
resulting ~Sam smiles. Is the result plausible?
What do we learn from all of this about the comparative abilities
of the character semantic value approach and the setwise semantic
value approach to deal with context-sensitive expressions across a
wider range of the language?

AO A Problem About Every Linguist

We now have an account that gives us a satisfactory typing for Every linguist
admires Aristotle:

every linguist admires Aristotle
    every linguist : ((e, t), t)
        every : ((e, t), ((e, t), t))
        linguist : (e, t)
    admires Aristotle : (e, t)
        admires : (e, (e, t))
        Aristotle : e


But there is new trouble right next to this sentence. Consider Aristotle
admires every linguist:

Aristotle admires every linguist
    Aristotle : e
    admires every linguist : ?
        admires : (e, (e, t))
        every linguist : ((e, t), t)
            every : ((e, t), ((e, t), t))
            linguist : (e, t)
If we type each of Aristotle, admires, every, and linguist as before, we
encounter an immediate problem. The type (e, (e, t)) admires cannot combine
with the type ((e, t), t) every linguist. Admires takes type e as input, and
every linguist is not of type e, so every linguist can’t serve as input to
admires. And every linguist takes type (e, t) as input, and admires is not of
type (e, t), so admires can’t serve as input to every linguist. But those are
the only options for functional application, so the semantic composition crashes.

Let’s try to fix this problem.

First Attempt: We look for a workable typing of Aristotle admires every


linguist. To keep revisions minimal, let’s assume changes are restricted to
every linguist, so that we keep Aristotle as type e and admires as type (e,
(e, t)). Then our partially typed tree is:

Aristotle admires every linguist : t
    Aristotle : e
    admires every linguist : ?
        admires : (e, (e, t))
        every linguist : ?
Admires every linguist then needs to be type (e, t) in order to combine with type e Aristotle. That gives us two options:
1. Every linguist is the input to admires, and thus is type e.
2. Admires is the input to every linguist, so every linguist takes an (e, (e, t)) input and produces an (e, t) output. Thus every linguist is type ((e, (e, t)), (e, t)).
But we’ve already tried and abandoned the first approach. So instead we’ll
make every linguist be of type ((e, (e, t)), (e, t)). Given that linguist is type
(e, t), we then need every to be type ((e, t), ((e, (e, t)), (e, t))):

Aristotle admires every linguist
    Aristotle : e
    admires every linguist : (e, t)
        admires : (e, (e, t))
        every linguist : ((e, (e, t)), (e, t))
            every : ((e, t), ((e, (e, t)), (e, t)))
            linguist : (e, t)
That makes the typing work out. The only typing change that we have made
is for every, so we now need a suitable specific semantic value for every. We
first note that we need ~every linguist to be:
• ~every linguist = λx.λy.{z : z is a linguist} ⊆ {z : x(z)(y) = >}
To achieve this effect, we need ~every to be:
• ~every = λw.λx.λy.{z : w(z) = >} ⊆ {z : x(z)(y) = >}
This semantic value for every gives us a workable theory that produces the
right result for Aristotle admires every linguist. But we’ve had to pay a
significant price for the theory – we now have two words every in the language,
one of type ((e, t), ((e, t), t)) and one of type ((e, t), ((e, (e, t)), (e, t))). And in fact
two versions of every won’t be enough:

1. If ditransitive verbs like give are of type (e, (e, (e, t))), then neither of
the above typings for every will allow every linguist to combine with
give to form give every linguist.
2. If prepositions like under are of type (e, ((e, t), (e, t))), then neither of the
above typings for every will allow every tree to combine with under to
form under every tree.
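The object-position entry for every given a bit earlier can be tried out in a toy model. The domain, the linguists, and the admiration facts below are invented purely for illustration:

```python
# Hypothetical model: Chomsky and Partee are the linguists, and Aristotle
# admires both of them.
domain = ['Aristotle', 'Chomsky', 'Partee']
linguist = lambda z: z in {'Chomsky', 'Partee'}
admires_rel = {('Aristotle', 'Chomsky'), ('Aristotle', 'Partee')}

# admires : (e, (e, t)), curried with the object first, then the subject.
admires = lambda x: lambda y: (y, x) in admires_rel

# every : ((e, t), ((e, (e, t)), (e, t)))
# ~every = λw.λx.λy. {z : w(z) = >} ⊆ {z : x(z)(y) = >}
every = lambda w: lambda x: lambda y: all(x(z)(y) for z in domain if w(z))

# every(linguist)(admires) is an (e, t) verb phrase:
vp = every(linguist)(admires)
print(vp('Aristotle'))  # True: Aristotle admires every linguist
print(vp('Chomsky'))    # False: Chomsky admires no one
```

The subset condition in the lambda term becomes a universally quantified check over the domain, restricted to the linguists.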

Problem 188: Give semantic types for two more versions of every
– one that will combine with ditransitive verbs and one that will
combine with prepositions. Then give a lambda expression for an
appropriate semantic value for each of those two versions of every.

Problem 189: Give another example of a linguistic context in which
every linguist can appear that can’t be handled by any of the four
versions of every discussed above and in the previous problem.
How should a fifth version of every be typed to work with that new
linguistic context?

Second Attempt: Perhaps the difficulty here is that we haven’t fully integrated
an idea we discussed earlier. Let’s put together two thoughts:

1. Transitive verbs are type (e, (e, t)) in order to explain how a transitive
verb can combine with two names (as in Aristotle admires Plato),
given that the names are type e.
2. Once we’ve introduced the generalized quantifier framework, we can
treat names as special cases of generalized quantifiers, and thus give
names semantic values of type ((e, t), t).
If we change our treatment of names to make them ((e, t), t), perhaps we should
then change our treatment of verbs so that they take ((e, t), t) as input. We
would then have:

1. Intransitive verbs are type (((e, t), t), t).


2. Transitive verbs are type (((e, t), t), (((e, t), t), t)).
If we make this shift, the typing of Aristotle admires every linguist falls
into place easily:

t
├── ((e, t), t): Aristotle
└── (((e, t), t), t)
    ├── (((e, t), t), (((e, t), t), t)): admires
    └── ((e, t), t)
        ├── ((e, t), ((e, t), t)): every
        └── (e, t): linguist
The basic idea is straightforward. We typed generalized quantifiers as ((e, t), t)
because we noticed that names like Aristotle and quantified noun phrases like
no linguist interacted differently with verbs. We could capture that difference
by sometimes using the verb as a function taking the subject as input (as with
Aristotle) and sometimes using the subject as a function taking the verb as
input (as with no linguist). But once we type-lift names to ((e, t), t), we’re no
longer trying to combine verbs sometimes with e and sometimes with ((e, t),
t), so we no longer need both modes of combination. Once we no longer need
both modes, we could set things up so that the verb is always taking the subject
as input, by changing verb inputs from e to ((e, t), t).

Problem 190: Suppose type e contains three objects a, b, and c.


1. When intransitive verbs are typed (e, t), how many possible
intransitive verb semantic values are there? Give an example
of one possible value.
2. When intransitive verbs are typed (((e, t), t), t), how many
possible intransitive verb semantic values are there? Give an
example of one possible value.
As you’ll see, there are a lot more (((e, t), t), t) values for intransitive
verbs than there are (e, t) values for intransitive verbs. Speculate
briefly on what is going on with the large increase in the range of
possible values.

Saying that intransitive verbs are type (((e, t), t), t) and transitive verbs are type
(((e, t), t), (((e, t), t), t)) settles the typing, but it doesn’t address the question
of what the particular semantic values of verbs should be. Consider first the
intransitive case. When laughs is (e, t), we have:

• ~laughs = λx.x laughs


But what about when laughs is type (((e, t), t), t)?

1. First Option: One possibility is to stick as close as possible to our previous
strategy. We can’t use exactly the same semantic value as before, because
that value was (e, t). But maybe we can just change the typing of the
variable:
• ~laughs = λx((e, t), t) .x laughs

It’s a subtle question whether this semantic value is even sensible. Here’s
a worry. Suppose for simplicity that there is only one object a in type e.
Then (e, t) contains the two functions [ a → > ] and [ a → ⊥ ]. A
sample member of ((e, t), t) is then:
• [ [ a → > ] → ⊥, [ a → ⊥ ] → > ]
But if that’s a member of type ((e, t), t), we should be able to provide it as
input to our proposed type (((e, t), t), t) semantic value for laughs:
• (λx((e, t), t) .x laughs)([ [ a → > ] → ⊥, [ a → ⊥ ] → > ])
• = [ [ a → > ] → ⊥, [ a → ⊥ ] → > ] laughs
But that’s a curious final result. The function [ [ a → > ] → ⊥, [ a → ⊥ ] → > ]
surely doesn’t laugh. Since x((e, t), t) is always going to be a function,
the proposed semantic value is always going to make sentences about
laughing depend on whether some function laughs. All such sentences
will be false, which isn’t right.

Problem 191: To test this worry, suppose that Aristotle and
Plato are the only members of type e, and that Aristotle laughs
and Plato doesn’t. Give a semantic value for ~Aristotle of
type ((e, t), t). Then use that semantic value together with
the proposed semantic value λx((e, t), t) .x laughs for laughs to
calculate ~Aristotle laughs. How acceptable is the result?

It’s not clear that this worry is decisive. After all, the original motivation
for the ((e, t), t) treatment of quantified noun phrases was that those noun
phrases had meanings that could take the meaning of intransitive verbs
like laugh as an input. So maybe we should just insist on this view in
the use of laugh in the lambda term specification of the type (((e, t), t), t)
semantic value for laugh. In that case, we can just interpret “x laughs”
in that term as the application of x to the (e, t) value of “laugh”. That’s
not entirely happy, since it involves thinking of laugh as type (e, t) within

the lambda term, and then using that typing of laugh in order to specify
a more complex semantic value to type laugh outside the lambda term
as (((e, t), t), t). But it’s perhaps workable. Fortunately, there’s another
alternative that avoids the whole issue.
2. Second Option: We can avoid the typing worries we were just consider-
ing by continuing to assume that “laughs”, as used in the specification of
lambda terms, allows us to make proper sense of λx.x laughs as an (e, t)
term, and then making use of that term to specify a (((e, t), t), t) term. We
thus have:
• ~laughs = λx((e, t), t) .x(λy.y laughs)

Here there is no typing worry. The variable x is, by stipulation, type ((e,
t), t), and we already know that λy.y laughs is type (e, t). Thus x can take
λy.y laughs as input, and it will produce type t as output. Thus the entire
lambda term for ~laughs is a function that takes type ((e, t), t) as input
and produces type t as output. That’s a function of type (((e, t), t), t), as
desired.
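The Second Option transcribes almost directly into code. Here is a sketch, with toy laughter facts assumed for illustration (Aristotle laughs, Plato doesn’t):

```python
# Unlifted (e, t) value: λy. y laughs
laughs_et = lambda y: y == 'Aristotle'

# Names lifted to ((e, t), t): λf. f(Aristotle), etc.
aristotle = lambda f: f('Aristotle')
plato = lambda f: f('Plato')

# ~laughs = λx_((e, t), t). x(laughs_et), of type (((e, t), t), t)
laughs = lambda x: x(laughs_et)

print(laughs(aristotle))  # True
print(laughs(plato))      # False
```

As in the text, the ((e, t), t) argument is applied to the (e, t) value, so no function is ever asked to "laugh."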

Problem 192: Suppose again that Aristotle and Plato are the only
members of e, and that Aristotle laughs and Plato does not laugh.
Using the specification ~laughs = λx((e, t), t) .x(λy.y laughs), work
out in full detail what ~laughs is. Then specify ~Aristotle, with
Aristotle treated as type ((e, t), t). What do these two semantic
values then predict as the truth value of Aristotle laughs?

Then specify ~Plato (again treated as ((e, t), t)), and determine
the predicted truth value of Plato laughs. Finally, assume both
Aristotle and Plato are philosophers. Specify ~some philosopher.
(If you want, you can (i) specify ~philosopher as type (((e, t), t), t),
rather than as type (e, t), and (ii) give an appropriate semantic value
for some treating it as type ((((e, t), t), t), ((e, t), t)), and then use these
two semantic values to calculate ~some philosopher. But you can
also just directly specify ~some philosopher if you prefer.) What
truth value is then predicted for Some philosopher laughs?

Problem 193: When we treat intransitive verbs as type (e, t), we
make a commitment about the maximum number of possible intransitive
verb meanings in a language. If there are 100 objects in type e,
for example, there are 2^100 = 1,267,650,600,228,229,401,496,703,205,376
members of (e, t), and thus there are a maximum of 1,267,650,600,228,229,401,496,703,205,376
possible intransitive verb meanings.

Of course, a given language doesn’t have to (and probably won’t)
have words with each of those roughly one nonillion meanings. But
we might be tempted by a principle of Fullness:

• Fullness: Every (e, t) function is a possible semantic value of
an intransitive verb.
If we shift to treating intransitive verbs as type (((e, t), t), t), there is
an obvious analog Fullness∗ of Fullness:
• Fullness∗ : Every (((e, t), t), t) function is a possible semantic
value of an intransitive verb.
Consider the implications of Fullness∗ . Show that Fullness∗ com-
mits us to the possibility of adding to English some intransitive verb
(let’s call it gimble) such that both of the following sentences are true:
1. Some badger gimbles.
2. No badger gimbles.
(Be as thorough as possible in explaining this, detailing exactly what
semantic value gimble will have to produce these truth values.) To
what extent does this consequence generalize? Should we on this
basis reject Fullness∗ ? If we do reject Fullness∗ , what (if anything)
does this tell us about the plausibility of the (((e, t), t), t) treatment
of intransitive verbs?

Problem 194: Suppose we want to reject Fullness∗ , but reject it in
a principled way that lets us keep the idea that lay behind simple
Fullness. Then we might suggest that we start with an (e, t) seman-
tics for intransitive verbs, and combine (e, t) semantic values with
a special LIFT operator that transforms those values into (((e, t), t),
t) semantic values.

First consider a simple special case. Suppose type e contains only a
single object a. Then (e, t) contains two objects:
1. [ a → > ]
2. [ a → ⊥ ]
Type ((e, t), t) then contains four objects (here listed using characteristic sets rather than functions):
1. { [ a → > ] }
2. { [ a → ⊥ ] }
3. { [ a → > ], [ a → ⊥ ] }
4. ∅
And the lifted type (((e, t), t), t) for intransitive verbs will then con-
tain 16 items. Which two of those 16 items should serve as the lifted
versions of the two members of (e, t)?

Next let’s attempt to generalize. What type should the proposed
operator LIFT be? Propose a lambda expression as the specific
semantic value for ~LIFT. Then test your value by giving a simple
model with two objects Chomsky and Russell (where Chomsky is
a linguist and Russell is not), giving an (e, t) semantic value for
~laughs, and working out the truth value of both of:
1. Some linguist laughs
2. Russell laughs
in your model, on the assumption that these sentences have the
structures:
1. [ [Some linguist] [LIFT laughs] ]
2. [ Russell [LIFT laughs] ]

Problem 195: Now let’s put all of the pieces together. First, give a
semantic value for admires. You can either:
1. Directly give a suitable (((e, t), t), (((e, t), t), t)) value for admires.
2. Give a traditional (e, (e, t)) value for admires, and then give
a generalized version of the LIFT operator from the previous
problem that will transform that (e, (e, t)) value into a suitable
(((e, t), t), (((e, t), t), t)) value.
Then give ((e, t), t) values for Aristotle and every linguist in the
usual way. Finally, calculate the full semantic value for Aristotle
admires every linguist via functional application, and check to
see that the resulting truth conditions are plausible.

AP More Admiration Problems

By treating names and quantified noun phrases both as type ((e, t), t) and
by lifting verbs (and other associated expressions like adverbs, although we
haven’t focused on that aspect of things) to higher types such as (((e, t), t), t)
and (((e, t), t), (((e, t), t), t)) (replacing e with ((e, t), t) throughout), we can get a
workable typing for a sentence like Aristotle admires every linguist with
a quantified noun phrase in object position. And we’ve made some preliminary
stabs at picking out appropriate semantic values within those verb types. For
an intransitive verb like laughs, we’ve considered two approaches:

1. Directly specify ~laughs via the condition ~laughs = λx((e, t), t) .x(λy.y laughs)

2. Start with the unlifted (e, t) semantic value for laughs, and then apply a
LIFT operator of type ((e, t), (((e, t), t), t)) to produce a type (((e, t), t), t)
value for LIFT laughs.

But it’s not hard to see that the first approach goes wrong when we turn to
transitive verbs. The obvious generalization of the first approach would be
something like:
• ~admires = λx((e, t), t) .λy((e, t), t) .y(x(λu.λv.v admires u))

But now the typing does not work out. λu.λv.v admires u is type (e, (e, t)), and
x is type ((e, t), t). But x then cannot take λu.λv.v admires u as input. We can try
rearranging what is used as input to what, but it won’t help. A little inspection
shows that we’re back at our original problem – we can’t successfully combine
types ((e, t), t) and (e, (e, t)) in any order.

The typing will work once we get admires up to type (((e, t), t), (((e, t), t), t)) –
the puzzle is just how to get this higher type. Perhaps, then, a suitable LIFT
operator will do the job.

Consider first the details of LIFT for intransitive verbs. Suppose type e contains three objects a, b, and c, and suppose ~laughs = [ a → >, b → >, c → ⊥ ]. Then
LIFT laughs should pick out a collection of ((e, t), t) values appropriately
corresponding to ~laughs. That is, ~LIFT laughs should pick out a set of
subsets of the diagram:

{a, b, c}

{b, c}   {a, c}   {a, b}

{c}   {b}   {a}

But which subsets of the diagram should be included in ~LIFT laughs?

So that we can consider specific sentences, let’s assume that Albert is a name
for a, Beatrice is a name for b, and Clarissa is a name for c, and that the
common noun linguist is true of all of a, b, and c. Then some observations:
1. Albert laughs should be true, so ~Albert = {{a}, {a, b}, {a, c}, {a, b, c}}
should be in ~LIFT laughs.
2. Beatrice laughs should be true, so ~Beatrice = {{b}, {a, b}, {b, c}, {a, b, c}}
should be in ~LIFT laughs.
3. Clarissa laughs should be false, so ~Clarissa = {{c}, {a, c}, {b, c}, {a, b, c}}
should not be in ~LIFT laughs.
4. Some linguist laughs should be true, so ~some linguist = {{a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}
should be in ~LIFT laughs.
5. No linguist laughs should be false, so ~no linguist = {∅} should not
be in ~LIFT laughs.
6. Every linguist laughs should be false, so ~every linguist = {{a, b, c}}
should not be in ~LIFT laughs.
7. Most linguists laugh should be true, so ~most linguists = {{a, b}, {a, c}, {b, c}, {a, b, c}}
should be in ~LIFT laughs.
8. At most one linguist laughs should be false, so ~at most one linguist
= {∅, {a}, {b}, {c}} should not be in ~LIFT laughs.
Looking for patterns, one striking observation here is that:
• The set of laughers – {a, b}, or ~laughs↓ – is a member of the semantic
value of every quantified noun phrase that should be in ~LIFT laughs
and is not a member of the semantic value of any quantified noun phrase
that should not be in ~LIFT laughs.
This observation suggests a hypothesis:
• ~LIFT laughs↓ = {X : X is a set of subsets of {a, b, c} and {a, b} ∈ X}.
Or abstracting from the particulars of what individuals there are and which
individuals laugh:
• ~LIFT laughs↓ = {X : X is a set of subsets of e and ~laughs↓ ∈ X}
And from this we can extract a rule for ~LIFT laughs itself, rather than its
corresponding set:
• ~LIFT laughs = λx((e, t), t) .x(λy.y laughs)
Finally, from there we can give the semantic value of LIFT itself:
• ~LIFT = λy(e, t) .λx((e, t), t) .x(y)
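The value ~LIFT = λy.λx.x(y) is easy to check against the observations above. A sketch using the a/b/c model from the text, with ((e, t), t) values encoded as functions (the encoding is mine):

```python
# Model from the text: e = {a, b, c}; a and b laugh, c does not.
E = ['a', 'b', 'c']
laughs_et = lambda y: y in {'a', 'b'}      # unlifted (e, t) value

# ~LIFT = λy_(e, t). λx_((e, t), t). x(y)
LIFT = lambda y: lambda x: x(y)

# ((e, t), t) values for some names and quantified noun phrases:
albert = lambda f: f('a')
clarissa = lambda f: f('c')
some_linguist = lambda f: any(f(z) for z in E)
every_linguist = lambda f: all(f(z) for z in E)
no_linguist = lambda f: not any(f(z) for z in E)

lifted = LIFT(laughs_et)                   # type (((e, t), t), t)

print(lifted(albert))          # True:  Albert laughs
print(lifted(clarissa))        # False: Clarissa laughs
print(lifted(some_linguist))   # True:  Some linguist laughs
print(lifted(every_linguist))  # False: Every linguist laughs
print(lifted(no_linguist))     # False: No linguist laughs
```

Each quantified noun phrase is handed the set of laughers, matching the membership pattern observed in the list above.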

AQ Quantifiers and Scope

We’ve now seen that by treating names and quantified noun phrases both as
type ((e, t), t) and by lifting verbs (and other associated expressions like adverbs,
although we haven’t focused on that aspect of things) to higher types such as
(((e, t), t), t) and (((e, t), t), (((e, t), t), t)) (replacing e with ((e, t), t) throughout),
we can get a workable typing for a sentence like Aristotle admires every
linguist with a quantified noun phrase in object position, and with a bit more
work we can even get plausible truth conditions for such a sentence.

Unfortunately, problems remain. The problems appear when we look at sentences with more than one quantified noun phrase, such as:
• Every philosopher admires some linguist.

Every philosopher admires some linguist is ambiguous. On one reading,
it requires that for each philosopher, there is some linguist that that philosopher
admires (but allows different philosophers to admire different linguists). Call
this the universal-existential, or ∀∃, reading. On another reading, it requires
that there be a single linguist that is admired by every philosopher. Call this
the existential-universal, or ∃∀, reading.

We can bring out the difference between these two readings using diagrams.
Consider the following situation S1 :

Aristotle → Partee
Plato → Chomsky
Socrates → Kratzer

This situation has three philosophers (Aristotle, Plato, and Socrates) and three
linguists (Chomsky, Partee, and Kratzer). The arrows indicate the admiring
relation, so Aristotle admires Partee, Plato admires Chomsky, and Socrates
admires Kratzer. This situation then makes true the ∀∃
reading of Every philosopher admires some linguist, since each of Aristo-
tle, Plato, and Socrates has a linguist they admire. But it does not make true
the ∃∀ reading of Every philosopher admires some linguist, since there is
no one linguist that is admired by every philosopher.
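The two readings can be computed directly from a situation. Here is a sketch of S1, with the admiring arrows encoded as ordered pairs (taken from the description above):

```python
philosophers = ['Aristotle', 'Plato', 'Socrates']
linguists = ['Chomsky', 'Partee', 'Kratzer']
admires = {('Aristotle', 'Partee'), ('Plato', 'Chomsky'), ('Socrates', 'Kratzer')}

# ∀∃ reading: every philosopher admires some linguist or other.
forall_exists = all(any((p, l) in admires for l in linguists)
                    for p in philosophers)

# ∃∀ reading: one particular linguist is admired by every philosopher.
exists_forall = any(all((p, l) in admires for p in philosophers)
                    for l in linguists)

print(forall_exists)  # True in S1
print(exists_forall)  # False in S1
```

The order of nesting of `all` and `any` is exactly the difference between the two scopings.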

Now consider the situation S2 :

Aristotle → Partee
Plato → Partee
Socrates → Partee

In this situation there is a single linguist who is admired by every philosopher
– namely, Partee. This situation thus makes true the ∃∀ reading of Every
philosopher admires some linguist.

Problem 196: What is the truth value of the ∀∃ reading of Every
philosopher admires some linguist in S2 ?

Problem 197: Consider the sentence:


• Some philosopher admires every linguist.
Distinguish the two scope readings of this sentence. Give two situa-
tions that show that the two readings have different truth conditions.

Problem 198: It might be tempting to think that the supposed ambiguity of Every philosopher admires some linguist is an illu-
sion. On the illusion view, the sentence has only the ∀∃ read-
ing, and the appearance of a ∃∀ reading is an illusion created by
the fact that the similar sentence Some linguist is admired by
every philosopher does have the ∃∀ reading (and not the ∀∃ read-
ing).

This view would let us say that the scope reading of a sentence is
fully determined by the linear order of the quantified noun phrases
in the sentence. Thus:
1. In Every philosopher admires some linguist, the every noun
phrase precedes the some noun phrase, so the sentence has ∀∃
truth conditions.
2. In Some linguist is admired by every philosopher, the some
noun phrase precedes the every noun phrase, so the sentence
has ∃∀ truth conditions.
But consider the sentence:
• Someone is killed in St Louis every 48 hours.
This sentence naturally receives a ∀∃ interpretation. (Does it also
allow a ∃∀ interpretation? If so, why is the ∀∃ interpretation fa-
vored?) But the linear order of its quantified noun phrases has some
before every, so the linear order of the quantifiers can’t always de-
termine the scoping.

Give three more examples of sentences whose natural reading has a
scoping that does not match the linear order of the quantified noun
phrases in the sentence. (Try to make your examples as different as
possible from the St Louis example.) Can you give an example of
a sentence with three quantified noun phrases that is at least three
ways ambiguous due to multiple scope options?

Problem 199: Show that the ∃∀ reading of Every philosopher
admires some linguist implies the ∀∃ reading.

The fact that the ∃∀ reading implies the ∀∃ reading can create a
worry about whether there are really two distinct readings of the
sentence. Perhaps there is only the ∀∃ reading, and what we have
been identifying as the ∃∀ reading is just a specific way of mak-
ing that ∀∃ reading true. (Compare the sentence The linguists
fought the the philosophers. One way this sentence can be
made true is via a one-to-one pairing that has each linguist fight-
ing a unique philosopher and each philosopher fought by a unique
linguist. But it’s not obvious that we want to say that there is a
special one-to-one reading of the sentence, rather than just one kind
of scenario among many that satisfies the general requirement that
there be a lot of fighting between linguists and philosophers.)

To address this worry, we’ll show that with quantifiers other than
some and every, we can get two different scope readings such that
neither is equivalent to the other. For each of the following sen-
tences, give two scenarios, one of which makes one scoping true
and the other false and one of which makes the other scoping true
and the first scoping false.
1. Most philosophers admire most linguists.
2. At least three linguists admire at least two philosophers.
3. Few philosophers admire no philosophers.
4. All but one linguist admires all but one philosopher.

AR How to Produce Multiple Scope Readings

We need to produce two different semantic analyses of Every philosopher
admires some linguist, capturing the ∀∃ and the ∃∀ scope readings of that
sentence. But there’s an immediate obstacle to producing two readings. Given
the machinery we’ve developed so far, once we settle on a tree for the sentence:

[ [every philosopher] [admires [some linguist]] ]
and then assign semantic values to each of the leaves of the tree, it’s completely
settled by the rule of Functional Application what semantic values will be
assigned to all of the higher nodes of the tree, including the root node for the
entire sentence. There’s thus no opportunity to produce more than one reading
of the sentence.
[Skipped Stuff. Moving on to Variable-Relativized Semantics]

AS Assignment-Relativized Semantic Values

Step 1: Just for convenience, we’ll use as our variables:


• x1 , x2 , x3 , . . .
This gives us a nice easy way to refer to specific variables. This gives us infinitely many variables – one for each positive integer n. That ensures that we
are never in danger of running out of available variables.

Step 2: We define variable assignments. A variable assignment assigns a value
(a specific object; a member of type e) to each variable. Because the variables are
numbered, we can just use an ordered list of objects as a variable assignment.
So:
• σ = ⟨Aristotle, Socrates, Plato, . . . ⟩
is a variable assignment. We’ll use σ(1) to pick out the first object in the ordered
list σ, and σ(2) to pick out the second object in the ordered list σ, and so on.
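Since variables are numbered, an assignment is just a positional lookup. A small sketch (the particular assignment is made up; real assignments are infinite, so here a repeating cycle stands in):

```python
# A made-up assignment σ that cycles through three objects forever.
people = ['Aristotle', 'Socrates', 'Plato']
sigma = lambda i: people[(i - 1) % len(people)]  # σ(1), σ(2), σ(3), σ(4), ...

print(sigma(1))  # Aristotle
print(sigma(2))  # Socrates
print(sigma(4))  # Aristotle -- the cycle repeats
```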

Problem 200: Let σ be the variable assignment that begins by listing
all of the states of the United States in alphabetical order. What is
σ(10)?

Problem 201: Give variable assignments satisfying each of the following conditions. (Variable assignments, of course, are infinite
lists. So you don’t need to list out the assignment in full. Either
list enough of the assignment to show that it satisfies the condition,
or give a general description of how the assignment proceeds that
shows why it satisfies the condition.)
1. σ(1) = Paris and σ(2) = London

2. σ(1) ≠ Aristotle
3. σ(1) = σ(3), σ(2) = σ(4), but σ(1) ≠ σ(2)
4. σ(i) = σ(j) for all i, j.
5. σ(i) > i + 1 for all i

Problem 202: If there is only one object in type e, how many different
variable assignments are there? What if there are two objects in type
e?

Step 3: We relativize semantic values to variable assignments. Previously,
expressions had semantic values absolutely. We write:

• ~laugh = λx.x laughs


and thereby say what the semantic value of laugh is. But now we start giving
expressions semantic values only relative to a variable assignment. So a single
expression can have one semantic value relative to one variable assignment
and a different semantic value relative to a different variable assignment.

An Analogy: We can’t give the word I a referent in an absolute way.


Any claim of the form:
• I refers to Aristotle//I refers to Plato//etc
won’t be right, because I sometimes refers to Aristotle and some-
times refers to Plato.

Instead, we need to say what I refers to in a context. In a context in


which Aristotle is speaking, I refers to Aristotle. But in a context in
which Plato is speaking, I refers to Plato. So we relativize the notion
of reference to a context.

(This is only a rough analogy right now, because we haven’t yet


introduced formal tools for dealing with context sensitivity in lan-
guage. When we do introduce those tools, we can then see how
close the analogy ends up being.)

So we need to rework our basic tools to make semantic values relative to variable
assignments. We’ll use two central ideas to do the reworking:
1. Assignment Insensitivity: Expressions that don’t contain variables won’t
be sensitive to the choice of variable assignment. That is, they will have
the same semantic value relative to every variable assignment. Given that
we previously said:
• ~laugh = λx.x laughs
and given that laugh does not contain any variables, we’ll now say:

• For any variable assignment σ, ~laughσ = λx.x laughs
Because σ isn’t mentioned anywhere in the expression λx.x laughs, it
doesn’t matter which variable assignment we pick. So suppose:
(a) σ1 = ⟨Aristotle, Socrates, Plato, . . . ⟩
(b) σ2 = ⟨Paris, London, Berlin, . . . ⟩
Then ~laughσ1 =~laughσ2 = λx.x laughs.
Now that we are relativizing semantic values to variable assignments,
there really isn’t any such thing as simple ~laugh any more. But since
~laughσ is the same no matter what σ is, we’ll nevertheless still some-
times just refer to ~laugh, where this is just shorthand for ‘~laughσ , but
we don’t care what σ is.’
2. Assignment Sensitivity: Variables are sensitive to the choice of assign-
ment function, in a very simple way. Given assignment function σ and
variable xi , we have:

• ~xi σ = σ(i)
So with σ1 and σ2 as before, we have:
• ~x1 σ1 = σ1 (1) = Aristotle
• ~x1 σ2 = σ2 (1) = Paris
• ~x2 σ1 = σ1 (2) = Socrates
• ~x2 σ2 = σ2 (2) = London
From this starting point, we can proceed to use functional application as before.
So consider x1 laughs. We have:

• ~x1 laughsσ
• = ~laughsσ (~x1 σ )
• = λx.x laughs(σ(1))
• = σ(1) laughs

Suppose that e contains only Aristotle, Plato, and Socrates, and that ~laughs
is the function:

• ~laughs = [ Aristotle → >, Plato → ⊥, Socrates → > ]

Suppose we have assignment functions:


1. σ1 = ⟨Aristotle, Plato, Socrates, . . . ⟩
2. σ2 = ⟨Plato, Socrates, Aristotle, . . . ⟩
3. σ3 = ⟨Socrates, Plato, Aristotle, . . . ⟩
Then we obtain:
1. ~x1 laughsσ1 =σ1 (1) laughs = Aristotle laughs = >

2. ~x1 laughsσ2 =σ2 (1) laughs = Plato laughs = ⊥


3. ~x1 laughsσ3 =σ3 (1) laughs = Socrates laughs = >
So x1 laughs is true relative to assignment functions that assign x1 to Aristotle
or Socrates, but false relative to assignment functions that assign x1 to Plato.
That’s the right result, given that Aristotle and Socrates laugh but Plato doesn’t.
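The computation just shown can be mirrored step by step. A sketch using the same laughter facts and assignments, with assignments modeled as tuples:

```python
# Facts from the text: Aristotle and Socrates laugh, Plato does not.
laughs = lambda x: x in {'Aristotle', 'Socrates'}

sigma1 = ('Aristotle', 'Plato', 'Socrates')
sigma2 = ('Plato', 'Socrates', 'Aristotle')
sigma3 = ('Socrates', 'Plato', 'Aristotle')

def x(i, sigma):
    """~x_i relative to σ is σ(i)."""
    return sigma[i - 1]

def x1_laughs(sigma):
    """~x1 laughs relative to σ: apply ~laughs to ~x1."""
    return laughs(x(1, sigma))

print(x1_laughs(sigma1))  # True:  σ1(1) = Aristotle
print(x1_laughs(sigma2))  # False: σ2(1) = Plato
print(x1_laughs(sigma3))  # True:  σ3(1) = Socrates
```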

Another example. Consider x1 admires x2 . As before, admires is assignment
insensitive, so we have:
• ~admiresσ = λx.λy.y admires x

So we have:
• ~x1 admires x2 σ
• = (~admiresσ (~x2 σ ))(~x1 σ )

• = (λx.λy.y admires x(σ(2)))(σ(1))


• = σ(1) admires σ(2)
Suppose Aristotle admires Plato, but Plato does not admire Aristotle. Take:
1. σ1 = ⟨Aristotle, Plato, Socrates, . . . ⟩
2. σ2 = ⟨Plato, Aristotle, Socrates, . . . ⟩


Then:
1. ~x1 admires x2 σ1 = Aristotle admires Plato = >

2. ~x1 admires x2 σ2 = Plato admires Aristotle = ⊥

Problem 203: Explain how the semantic value of Aristotle admires
Plato depends on the choice of assignment function σ. Is that a rea-
sonable result?

AT Variable Binding: The Single-Variable Case

With semantic values relativized to assignment functions, expressions with
variables all end up being of the type that we’d normally expect. Relative to
a choice σ of assignment function, a sentence x1 laughs gets a value of type t,
just like the sentence Aristotle laughs. Relative to a choice σ of assignment
function, the verb phrase admires x2 gets a value of type (e, t), just like the
expression admires Plato.

Unfortunately, this creates a problem for our plan to deal with quantifier noun
phrases by using movement-altered trees. Consider a simple example such
as Some linguist laughs. We’re now considering the following movement-
altered tree for that sentence:

[ [some1 linguist] [x1 laughs] ]

But consider how our standard typing works out (recalling that all of these
types now determine values relative to a variable assignment):

?
├── ((e, t), t)
│   ├── ((e, t), ((e, t), t)): some1
│   └── (e, t): linguist
└── t
    ├── e: x1
    └── (e, t): laughs
Unfortunately, the ((e, t), t) value provided by some linguist won’t func-
tionally combine with the t value provided by x1 laughs. Our second-order
property treatment of quantified noun phrases was built on the assumption
that these noun phrases would then combine with verb phrases of type (e, t),
but we’ve now given up that assumption by restructuring the syntactic trees.

The restructuring is good – it helps us avoid having quantified noun phrases
sometimes combine with (e, t) (when the quantified noun phrase is in subject
position) and sometimes combine with (e, (e, t)) (when the quantified noun
phrase is in object position). Now the quantified noun phrase is always combin-
ing with type t. That simplifies things, but at the price of giving us the wrong
type to combine with.

We’re going to fix this by adding one more hidden bit of syntax. This last bit of
hidden syntax will convert the type t value of x1 laughs back into a type (e, t)
value suitable for combining with the type ((e, t), t) some linguist:

t
├── ((e, t), t)
│   ├── ((e, t), ((e, t), t)): some1
│   └── (e, t): linguist
└── (e, t)
    ├── λ1
    └── t
        ├── e: x1
        └── (e, t): laughs
Now we need to explain how λ1 works. Here’s the basic idea. x1 laughs is of
type t, relative to a variable assignment σ. The job of the variable assignment
is, in this case, to provide a value to the variable x1 . Once that value has been
provided, we can work out the simple truth value of x1 laughs. So we have
the makings of a function from objects to truth values. Given any particular
object, we report back the truth value of x1 laughs when x1 is assigned to that
object. A bit more carefully, we have:

• λ1 applied to ~x1 laughsσ yields the function that maps any given object
o to the truth value of x1 laughs relative to the variable assignment σ′
which is just like σ except that it contains o in the first position instead of
whatever object σ contains in its first position.
Let’s go through this carefully. Suppose that the laughter facts are as we
considered earlier:
• ~laughs = [ Aristotle → >, Plato → ⊥, Socrates → > ]


Suppose we have:
• σ = ⟨Aristotle, Plato, Socrates, . . . ⟩
We now apply λ1 to ~x1 laughsσ . We get a function that maps each object to
some truth value. To build that function, let’s go through the objects:

1. First, Aristotle. We make a new variable assignment σ′ that is just like
σ except that it puts Aristotle in the first position. Of course, σ already
had Aristotle in the first position, so this doesn’t require any change in σ.
Thus:
• σ′ = ⟨Aristotle, Plato, Socrates, . . . ⟩
We then check ~x1 laughsσ′ . This is:
• ~laughsσ′ (~x1 σ′ )
• = λx.x laughs(σ′(1))
• = λx.x laughs(Aristotle)
• = Aristotle laughs
• = >
Thus our new function maps Aristotle to >.
2. Second, Plato. We make a new variable assignment σ′ that is just like σ
except that it puts Plato in the first position. Thus:
• σ′ = ⟨Plato, Plato, Socrates, . . . ⟩
We then check ~x1 laughsσ′ . This is:
• ~laughsσ′ (~x1 σ′ )
• = λx.x laughs(σ′(1))
• = λx.x laughs(Plato)
• = Plato laughs
• = ⊥
Thus our new function maps Plato to ⊥.
3. Third, Socrates. We make a new variable assignment σ′ that is just like σ
except that it puts Socrates in the first position. Thus:

   • σ′ = ⟨Socrates, Plato, Socrates, ...⟩

   We then check ~x1 laughs^σ′. This is:

   • ~laughs^σ′(~x1^σ′)
   • = λx.x laughs(σ′(1))
   • = λx.x laughs(Socrates)
   • = Socrates laughs
   • = >

   Thus our new function maps Socrates to >.
Thus applying λ1 to ~x1 laughs^σ produces a function that maps Aristotle and
Socrates to > and Plato to ⊥. That is, it produces the function:

• [Aristotle → >, Plato → ⊥, Socrates → >]

That function is, of course, exactly the function we started with as ~laughs.
Our very complicated bit of formal machinery has just enabled us to re-extract
that function from the variable-assignment-relativized t value assigned higher
in the tree. That's a lot of work, but it does make things function properly. The
application of λ1 transforms the t value of x1 laughs into an (e, t) value, and
now that (e, t) value can be the input to the ((e, t), t) some linguist, and we'll
get the right result.
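The derivation above can be replayed mechanically. Here is a hedged Python sketch, encoding ~laughs as a dict from objects to truth values, an assignment as a tuple, and the λ1 operator as abstraction over the first position (all encodings are ours, chosen purely for illustration):

```python
DOMAIN = ["Aristotle", "Plato", "Socrates"]
laughs = {"Aristotle": True, "Plato": False, "Socrates": True}

def x1_laughs(sigma):
    """Truth value of 'x1 laughs' relative to assignment sigma."""
    return laughs[sigma[0]]

def lam(n, clause, sigma):
    """Apply the lambda-n operator to a clause: abstract over position n."""
    def at(o):
        new_sigma = sigma[:n - 1] + (o,) + sigma[n:]
        return clause(new_sigma)
    return {o: at(o) for o in DOMAIN}  # the resulting (e, t) value, as a table

sigma = ("Aristotle", "Plato", "Socrates")
# Applying lambda-1 re-extracts exactly the laughs function:
assert lam(1, x1_laughs, sigma) == laughs
```

Because `lam(1, ...)` overwrites the first position before `x1_laughs` ever reads it, the starting assignment makes no difference to the output here, which is the point the next section picks up.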
AU Variable Binding: The Multiple-Variable Case

We’ve just shown that applying λ1 to ~x1 laughsσ produces the function:

 Aristotle → > 
 
• ~laughs =  Plato → ⊥ 

Socrates → >
 

But a curious feature of this derivation is that it doesn’t depend on which
variable assignment σ is. We used a specific σ (one that starts: Aristotle, Plato,
Socrates, . . . ), but that specific choice didn’t matter at any point. That’s because,
in constructing our new function by applying λ1 , the first thing we did was
always to overwrite the first position in σ with our new object. Then x1 would
inherit the new object in that position, and nothing else about σ would ever be
used.

Problem 204: To confirm this, consider the variable assignment:

• ρ = ⟨Paris, London, Berlin, ...⟩

Go through a derivation similar to the one given above to calculate
the result of applying λ1 to ~x1 laughs^ρ. Do you get the same
result we got before?

However, when we have multiple variables involved, the variable assignment
does matter. To see this, let's go carefully through part of the process for:

[some1 linguist [λ1 [every2 philosopher [λ2 [x1 admires x2]]]]]

In particular, we'll focus here just on the subtree:

[λ2 [x1 admires x2]]

Question: Why are we applying λ2 rather than λ1 in this case?

Answer: Good question. Roughly, the answer is that we're preparing
to join x1 admires x2 with every philosopher, and (as the '2'
subscript on every indicates) every philosopher is meant to connect
to the variable x2 rather than the variable x1. But to get a more
complete answer, we'll need to see how the details work out.

To do this carefully, we'll need a specific admires function to work with. Continuing
with e containing Aristotle, Plato, and Socrates, we'll use:

• ~admires = [ Aristotle → [Aristotle → ⊥, Plato → ⊥, Socrates → >],
               Plato → [Aristotle → >, Plato → >, Socrates → ⊥],
               Socrates → [Aristotle → >, Plato → ⊥, Socrates → >] ]

(Remember that ~admires = λx.λy.y admires x, so the outer input is the object
of admiration: ~admires(Aristotle) is the function λy.y admires Aristotle.)
Then ~admires x2^σ = ~admires^σ(~x2^σ). We'll consider three different variable
assignments:

1. σ1 is a variable assignment such that σ1(2) = Aristotle.

2. σ2 is a variable assignment such that σ2(2) = Plato.

3. σ3 is a variable assignment such that σ3(2) = Socrates.
We then have:

1. ~admires x2^σ1 = ~admires^σ1(~x2^σ1) = (λx.λy.y admires x)(σ1(2)) = λy.y
admires Aristotle. λy.y admires Aristotle is then the function
[Aristotle → ⊥, Plato → ⊥, Socrates → >].

2. ~admires x2^σ2 = ~admires^σ2(~x2^σ2) = (λx.λy.y admires x)(σ2(2)) = λy.y
admires Plato. λy.y admires Plato is then the function
[Aristotle → >, Plato → >, Socrates → ⊥].

3. ~admires x2^σ3 = ~admires^σ3(~x2^σ3) = (λx.λy.y admires x)(σ3(2)) = λy.y
admires Socrates. λy.y admires Socrates is then the function
[Aristotle → >, Plato → ⊥, Socrates → >].

Next we want to consider ~x1 admires x2^σ, for various choices of relativizing
variable assignment σ. We have ~x1 admires x2^σ = ~admires x2^σ(~x1^σ) =
~admires x2^σ(σ(1)).
So we need to know what our variable assignment has in its first position. We
didn't specify earlier, so now we'll split each of our three variable assignments
into three subcases:

σ11(1) = Aristotle    σ21(1) = Plato    σ31(1) = Socrates
σ11(2) = Aristotle    σ21(2) = Aristotle    σ31(2) = Aristotle

σ12(1) = Aristotle    σ22(1) = Plato    σ32(1) = Socrates
σ12(2) = Plato    σ22(2) = Plato    σ32(2) = Plato

σ13(1) = Aristotle    σ23(1) = Plato    σ33(1) = Socrates
σ13(2) = Socrates    σ23(2) = Socrates    σ33(2) = Socrates

(So the superscript – written as the first index here, so that σ21 is σ with
superscript 2 and subscript 1 – determines what object is in the first position
of the variable assignment, and the subscript determines what object is in the
second position of the variable assignment.) We now have nine different variable
assignments we're considering. We can calculate ~x1 admires x2 relative to each
of them:

1. ~x1 admires x2^σ11 = ~admires x2^σ11(~x1^σ11) = (λy.y admires Aristotle)(σ11(1))
= (λy.y admires Aristotle)(Aristotle) = Aristotle admires Aristotle = ⊥.

2. ~x1 admires x2^σ21 = ~admires x2^σ21(~x1^σ21) = (λy.y admires Aristotle)(σ21(1))
= (λy.y admires Aristotle)(Plato) = Plato admires Aristotle = ⊥.

3. ~x1 admires x2^σ31 = ~admires x2^σ31(~x1^σ31) = (λy.y admires Aristotle)(σ31(1))
= (λy.y admires Aristotle)(Socrates) = Socrates admires Aristotle = >.

4. ~x1 admires x2^σ12 = ~admires x2^σ12(~x1^σ12) = (λy.y admires Plato)(σ12(1))
= (λy.y admires Plato)(Aristotle) = Aristotle admires Plato = >.

5. ~x1 admires x2^σ22 = ~admires x2^σ22(~x1^σ22) = (λy.y admires Plato)(σ22(1))
= (λy.y admires Plato)(Plato) = Plato admires Plato = >.

6. ~x1 admires x2^σ32 = ~admires x2^σ32(~x1^σ32) = (λy.y admires Plato)(σ32(1))
= (λy.y admires Plato)(Socrates) = Socrates admires Plato = ⊥.

7. ~x1 admires x2^σ13 = ~admires x2^σ13(~x1^σ13) = (λy.y admires Socrates)(σ13(1))
= (λy.y admires Socrates)(Aristotle) = Aristotle admires Socrates = >.

8. ~x1 admires x2^σ23 = ~admires x2^σ23(~x1^σ23) = (λy.y admires Socrates)(σ23(1))
= (λy.y admires Socrates)(Plato) = Plato admires Socrates = ⊥.

9. ~x1 admires x2^σ33 = ~admires x2^σ33(~x1^σ33) = (λy.y admires Socrates)(σ33(1))
= (λy.y admires Socrates)(Socrates) = Socrates admires Socrates = >.
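The nine hand calculations can be checked all at once. A sketch in Python, encoding ~admires as a dict of dicts whose outer input is the object of admiration (mirroring ~admires = λx.λy.y admires x; the encoding is ours):

```python
admires = {  # admires[obj][subj] == (subj admires obj)
    "Aristotle": {"Aristotle": False, "Plato": False, "Socrates": True},
    "Plato":     {"Aristotle": True,  "Plato": True,  "Socrates": False},
    "Socrates":  {"Aristotle": True,  "Plato": False, "Socrates": True},
}
people = ["Aristotle", "Plato", "Socrates"]

def x1_admires_x2(sigma):
    """Truth value of 'x1 admires x2' relative to assignment sigma."""
    return admires[sigma[1]][sigma[0]]

# One entry per assignment (first position, second position):
results = {(first, second): x1_admires_x2((first, second))
           for first in people for second in people}
# Spot-check three of the nine cases above:
assert results[("Socrates", "Aristotle")] is True   # case 3
assert results[("Plato", "Plato")] is True          # case 5
assert results[("Plato", "Socrates")] is False      # case 8
```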
Finally, we can see what happens when we apply λ2 as our choice of relativizing
variable assignment changes. We won't work through all nine cases, but let's
look in detail at a few.

1. When we apply λ2 to ~x1 admires x2^σ11, we get the function that assigns
to each object o the truth value of x1 admires x2 relative to the variable
assignment (σ11)′ that is just like σ11 except that it contains o in its second
position (that is, (σ11)′(2) = o).

   There are three choices for o: Aristotle, Plato, and Socrates. Consider each
in turn:

   (a) When o is Aristotle, (σ11)′ is the variable assignment that is just like
σ11 except that (σ11)′(2) = Aristotle. In fact, σ11(2) is already Aristotle, so
(σ11)′ = σ11. From above, we see that ~x1 admires x2^σ11 = ⊥, so our
new function maps Aristotle to ⊥.

   (b) When o is Plato, (σ11)′ is the variable assignment that is just like σ11
except that (σ11)′(2) = Plato. Thus (σ11)′ = σ12. From above, we see that
~x1 admires x2^σ12 = >, so our new function maps Plato to >.

   (c) When o is Socrates, (σ11)′ is the variable assignment that is just like σ11
except that (σ11)′(2) = Socrates. Thus (σ11)′ = σ13. From above, we see
that ~x1 admires x2^σ13 = >, so our new function maps Socrates to >.

   Applying λ2 to ~x1 admires x2^σ11 thus produces a function that maps
Aristotle to ⊥ and Plato and Socrates to >, which is the function
[Aristotle → ⊥, Plato → >, Socrates → >].

   Notice that this function is (e, t), so once again the application of λ has
transformed a t value into an (e, t) value, which is what we want to make
the typing work out. Looking back at our original function assigned as
~admires, we see that this new (e, t) function we've obtained is just
λx.Aristotle admires x.

This function is not any of the three subfunctions that are produced as
output by applying ~admires to one of Aristotle, Plato, or Socrates.
Rather, it's the function that can be found 'hiding inside' ~admires by
taking the first row of each of the three subfunctions.

2. Now let’s try applying λ2 to ~x1 admires x2 σ2 . When we apply λ2 to ~x1
1

admires x2 σ2 , we get the function that assigns to each object o the truth
1

value of x1 admires x2 relative to the variable assignment (σ12 )0 that is just


like σ12 except that it contains o in its second position (that is, (σ12 )0 )(2) = o.
Once again we have three choices for o: Aristotle, Plato, and Socrates.
Considering each in turn:

(a) When o is Aristotle, (σ12 )0 is the variable assignment that is just like
σ12 except that (σ12 )0 (2)=Aristotle. Thus (σ12 )0 = σ11 . From above, we
see that ~x1 admires x2 σ1 = ⊥, so our new function maps Aristotle
1

to ⊥.

187
(b) When o is Plato, (σ12 )0 is the variable assignment that is just like σ12
except that (σ12 )0 (2)=Plato. Thus (σ12 )0 is just σ12 . From above, we see
that ~x1 admires x2 σ2 = >, so our new function maps Plato to >.
1

(c) When o is Socrates, (σ12 )0 is the variable assignment that is just like σ12
except that (σ12 )0 (2)=Socrates. Thus (σ12 )0 = σ13 . From above, we see
that ~x1 admires x2 σ3 = >, so our new function maps Socrates to
1

>.
Applying λ2 to ~x1 admires x2 σ2 thus produces a function that maps
1

 Aristotle → ⊥ 
 
Aristotle to ⊥ and Plato and Socrates to >, which is the function  Plato
 → . > .
Socrates → >
 

That’s the same function that we got when we applied λ2 to ~x1 admires
x2 σ1 . Once again, the function we’ve obtained is just λx. Aristotle admires
1

x.
3. Let’s do one more, this time applying λ2 to ~x1 admires x2 σ3 . When
1

we apply λ2 to ~x1 admires x2 σ3 , we get the function that assigns to


1

each object o the truth value of x1 admires x2 relative to the variable


assignment (σ13 )0 that is just like σ13 except that it contains o in its second
position (that is, (σ13 )0 )(2) = o.
Once again we have three choices for o: Aristotle, Plato, and Socrates.
Considering each in turn:
(a) When o is Aristotle, (σ13 )0 is the variable assignment that is just like
σ13 except that (σ13 )0 (2)=Aristotle. Thus (σ13 )0 = σ11 . From above, we
see that ~x1 admires x2 σ1 = ⊥, so our new function maps Aristotle
1

to ⊥.
(b) When o is Plato, (σ13 )0 is the variable assignment that is just like σ13
except that (σ13 )0 (2)=Plato. Thus (σ13 )0 = σ12 . From above, we see that
~x1 admires x2 σ2 = >, so our new function maps Plato to >.
1

(c) When o is Socrates, (σ13 )0 is the variable assignment that is just like σ13
except that (σ13 )0 (2)=Socrates. Thus (σ31 )0 is just σ13 . From above, we
see that ~x1 admires x2 σ3 = >, so our new function maps Socrates
1

to >.
Applying λ2 to ~x1 admires x2 σ3 thus produces a function that maps
1

 Aristotle → ⊥ 
 
Aristotle to ⊥ and Plato and Socrates to >, which is the function 
 Plato → . > .
Socrates → >
 

We’ve gotten the same function again – this is the function that we got
when we applied λ2 to ~x1 admires x2 σ1 and to ~x1 admires x2 σ2 .
1 1

Once again, the function we’ve obtained is just λx.Aristotle admires x.
So λ2 produces the same result when applied to any of ~x1 admires x2^σ11, ~x1
admires x2^σ12, or ~x1 admires x2^σ13. That's because σ11, σ12, and σ13 differ only
in what object they contain in their second position. But what λ2 does is allow
us to vary the object in the second position, so that it no longer matters what
object the original sequence put in that position.

When we apply λ2 to any of ~x1 admires x2^σ11, ~x1 admires x2^σ12, or ~x1
admires x2^σ13, what we do is:

1. Hold fixed the interpretation of x1 as determined by the first element of
σ11, σ12, or σ13 (where that first object is the same – namely, Aristotle – for
all three variable assignments).

2. Then see what truth value we get as we try different interpretations of x2
by varying the second object in the variable assignment.

The result, as we've seen, is λx.Aristotle admires x.

So when we apply λ2 to ~x1 admires x2, it doesn't matter which of the σ1j
variants (σ11, σ12, σ13) we are relativizing to. But what if we relativize to one
of the σ2j variants, such as σ21?

Problem 205: Before we work through the formal details carefully,
see if you can anticipate what should happen just based on an intuitive
picture of how the machinery is working. What (e, t) function
do you expect to be produced by applying λ2 to ~x1 admires x2^σ21?

Let’s now calculate the result of applying λ2 to ~x1 admires x2 σ1 . When we
2

apply λ2 to ~x1 admires x2 σ1 , we get the function that assigns to each object
2

o the truth value of x1 admires x2 relative to the variable assignment (σ21 )0 that
is just like σ21 except that it contains o in its second position (that is, (σ21 )0 )(2) = o.

Once again we have three choices for o: Aristotle, Plato, and Socrates. Consid-
ering each in turn:
1. When o is Aristotle, (σ21 )0 is the variable assignment that is just like σ21
except that (σ21 )0 (2)=Aristotle. Thus (σ21 )0 is just σ21 again. From above, we
see that ~x1 admires x2 σ1 = ⊥, so our new function maps Aristotle to ⊥.
2

2. When o is Plato, (σ21 )0 is the variable assignment that is just like σ21 except
that (σ21 )0 (2)=Plato. Thus (σ21 )0 = σ22 . From above, we see that ~x1 admires
x2 σ2 = >, so our new function maps Plato to >.
2

3. When o is Socrates, (σ21 )0 is the variable assignment that is just like σ21
except that (σ21 )0 (2)=Socrates. Thus (σ21 )0 is just σ23 . From above, we see
that ~x1 admires x2 σ1 = ⊥, so our new function maps Socrates to ⊥.
2
What we end up with, then, is a function that maps Aristotle and Socrates to ⊥
and Plato to >, or [Aristotle → ⊥, Plato → >, Socrates → ⊥]. With a little
inspecting of our original input-output chart for ~admires, we can recognize
this as λx.Plato admires x.

Problem 206: Go through the details of calculating the effect of
applying λ2 to ~x1 admires x2^σ22. You should once again get the
function λx.Plato admires x as the result.

Summary: We won’t go through all the details for the remaining 5 variable
assignments. Here’s the final result:
1. When we apply λ2 to ~x1 admires x2  relative to any of σ11 , σ12 , or σ13 –
that is, relative to any variable assignment that has Aristotle in the first
position – we get the (e, t) function λx.Aristotle admires x.
2. When we apply λ2 to ~x1 admires x2  relative to any of σ21 , σ22 , or σ23 – that
is, relative to any variable assignment that has Plato in the first position –
we get the (e, t) function λx.Plato admires x.
3. When we apply λ2 to ~x1 admires x2  relative to any of σ31 , σ32 , or σ33 –
that is, relative to any variable assignment that has Socrates in the first
position – we get the (e, t) function λx.Socrates admires x.

So unlike applying λ1 to ~x1 laughs, the result of the λ application does de-
pend on what the relativizing variable assignment is. It just doesn’t depend too
much on what the relativizing variable assignment is. We get different outputs
from the λ application depending on what the relativizing variable assignment
has in its first position and assigns to x1 . Other than that, nothing about the
variable assignment matters. (In particular, what the variable assignment has
in its second position and assigns to x2 doesn’t matter to the outcome of the λ
application.)
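This dependence pattern is easy to confirm programmatically, continuing the illustrative encoding from before (tuples for assignments, dict-of-dicts for ~admires; all names are ours):

```python
admires = {  # admires[obj][subj] == (subj admires obj)
    "Aristotle": {"Aristotle": False, "Plato": False, "Socrates": True},
    "Plato":     {"Aristotle": True,  "Plato": True,  "Socrates": False},
    "Socrates":  {"Aristotle": True,  "Plato": False, "Socrates": True},
}
people = ["Aristotle", "Plato", "Socrates"]

def x1_admires_x2(sigma):
    return admires[sigma[1]][sigma[0]]

def lam2(sigma):
    """Apply lambda-2: abstract 'x1 admires x2' over the second position."""
    return {o: x1_admires_x2((sigma[0], o)) for o in people}

# Varying sigma's second position never changes the result...
assert lam2(("Aristotle", "Aristotle")) == lam2(("Aristotle", "Socrates"))
# ...but varying its first position does:
assert lam2(("Aristotle", "Plato")) != lam2(("Plato", "Plato"))
# With Aristotle in the first position we recover lambda x. Aristotle admires x:
assert lam2(("Aristotle", "Plato")) == {
    "Aristotle": False, "Plato": True, "Socrates": True}
```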
AV At Last, Some Linguist Admires Every Philosopher

At long last, we have all the pieces in place to work carefully through:

• some linguist admires every philosopher.
In fact, we’ll work through it twice, once for each of two post-movement trees
we can assign to the sentence. We have:
1. Tree ∃∀:

   [t [((e, t), t) [((e, t), ((e, t), t)) some1] [(e, t) linguist]]
      [(e, t) λ1
         [t [((e, t), t) [((e, t), ((e, t), t)) every2] [(e, t) philosopher]]
            [(e, t) λ2
               [t [e x1]
                  [(e, t) [(e, (e, t)) admires] [e x2]]]]]]]
2. Tree ∀∃:

   [t [((e, t), t) [((e, t), ((e, t), t)) every2] [(e, t) philosopher]]
      [(e, t) λ2
         [t [((e, t), t) [((e, t), ((e, t), t)) some1] [(e, t) linguist]]
            [(e, t) λ1
               [t [e x1]
                  [(e, t) [(e, (e, t)) admires] [e x2]]]]]]]
Next we want a scenario in which to evaluate these two sentences. To make
things easy, we'll consider a scenario with just two linguists (Chomsky and
Partee) and just two philosophers (Aristotle and Plato). We can then draw a
picture of the scenario:

[Diagram: two columns, Linguists (Chomsky, Partee) and Philosophers
(Aristotle, Plato), with an arrow from Chomsky to Plato and an arrow from
Partee to Aristotle.]

The arrows represent the admiring relation, so Chomsky admires Plato (but
not Aristotle) and Partee admires Aristotle (but not Plato). We can thus give
the input-output table for the ~admires function:
• ~admires = [ Chomsky → [Chomsky → ⊥, Partee → ⊥, Aristotle → ⊥, Plato → ⊥],
               Partee → [Chomsky → ⊥, Partee → ⊥, Aristotle → ⊥, Plato → ⊥],
               Aristotle → [Chomsky → ⊥, Partee → >, Aristotle → ⊥, Plato → ⊥],
               Plato → [Chomsky → >, Partee → ⊥, Aristotle → ⊥, Plato → ⊥] ]

(As before, the outer input is the object of admiration, since ~admires =
λx.λy.y admires x: for example, ~admires(Plato) = λy.y admires Plato, which
maps Chomsky to > and everyone else to ⊥.)
Finally, because we have just the four individuals Chomsky, Partee, Aristotle,
and Plato involved, the variable assignments we need to consider only involve
those four individuals. Because our sentence involves only the variables x1 and
x2 , we only care about the first two positions of the variable assignment. That
means that there are 16 variable assignments that matter. For convenience, we’ll
just refer to these as σ(X, Y), where X and Y are the two objects in the first and
second positions of the variable assignment. Thus, for example, σ(Chomsky,
Partee) is the variable assignment with Chomsky in the first position and Partee
in the second position.

Now we have all the pieces in place and we can start calculating final semantic
values for Tree ∃∀ and Tree ∀∃. The first step is the same for both of them:
we need to work out the semantic value of x1 admires x2 relative to different
variable assignments. Looking at the scenario we’ve set up, we can quickly see:
• ~x1 admires x2 is > relative to σ(Chomsky, Plato) and σ(Partee, Aristotle).

• ~x1 admires x2 is ⊥ relative to the 14 other variable assignments.
After this, things diverge between Tree ∃∀ and Tree ∀∃. So we’ll work through
the remainder of each tree separately.

Tree ∃∀:
1. The next step is to apply λ2 to ~x1 admires x2 . As we’ve seen above,
this produces an (e, t) function that depends only on the first object in the
variable assignment sequence. We get:
(a) Whenever σ has Chomsky in the first position (that is, we have
σ(Chomsky, ·)), the application of λ2 to ~x1 admires x2^σ produces
the (e, t) function λx.Chomsky admires x.

(b) Whenever σ has Partee in the first position (that is, we have σ(Partee, ·)),
the application of λ2 to ~x1 admires x2^σ produces the (e, t) function
λx.Partee admires x.

(c) Whenever σ has Aristotle in the first position (that is, we have
σ(Aristotle, ·)), the application of λ2 to ~x1 admires x2^σ produces
the (e, t) function λx.Aristotle admires x.

(d) Whenever σ has Plato in the first position (that is, we have σ(Plato, ·)),
the application of λ2 to ~x1 admires x2^σ produces the (e, t) function
λx.Plato admires x.
2. In each case, we then need to provide that (e, t) function as input to
~every philosopher. Every philosopher is assignment-insensitive, so
no matter what σ is, ~every philosopher^σ is the ((e, t), t) function:

   • λx.{z : z is a philosopher} ⊆ {z : x(z) = >}

   What happens when this functional application occurs depends on which
variable assignment we're considering:
(a) For σ(Chomsky,·), we get:
• (λx.{z : z is a philosopher} ⊆ {z : x(z) = >})(λx.Chomsky ad-
mires x)
• = {z : z is a philosopher} ⊆ {z :Chomsky admires z}
• = ⊥, because Plato is a philosopher and Chomsky doesn’t admire
Plato.
(b) For σ(Partee,·), we get:
• (λx.{z : z is a philosopher} ⊆ {z : x(z) = >})(λx.Partee admires x)
• = {z : z is a philosopher} ⊆ {z :Partee admires z}
• = ⊥, because Aristotle is a philosopher and Partee doesn’t ad-
mire Aristotle.
(c) For σ(Aristotle,·), we get:
• (λx.{z : z is a philosopher} ⊆ {z : x(z) = >})(λx.Aristotle admires
x)
• = {z : z is a philosopher} ⊆ {z :Aristotle admires z}
• = ⊥, because Plato is a philosopher and Aristotle doesn’t admire
Plato.
(d) For σ(Plato,·), we get:
• (λx.{z : z is a philosopher} ⊆ {z : x(z) = >})(λx.Plato admires x)
• = {z : z is a philosopher} ⊆ {z :Plato admires z}
• = ⊥, because Aristotle is a philosopher and Plato doesn’t admire
Aristotle.
3. We’ve thus learned that every2 philosopher λ2 x1 admires x2 is false
relative to every assignment function.
4. Next we need to apply λ1 to the variable-assignment-relative t values
we’ve just calculated for every2 philosopher λ2 x1 admires x2 . Doing
this will create an (e, t) function that maps an object o to the truth value of
every2 philosopher λ2 x1 admires x2 relative to a variable assignment
that puts o in the first position.
But since every2 philosopher λ2 x1 admires x2 is false relative to every
variable assignment, the resulting truth value will always be ⊥. Thus
applying λ1 produces a function that maps each object to ⊥:
• [Chomsky → ⊥, Partee → ⊥, Aristotle → ⊥, Plato → ⊥]
5. That function is the semantic value that results from applying λ1 to every2
philosopher λ2 x1 admires x2. (Notice that the function we get, at this
point, no longer depends on what variable assignment we are relativizing
to. So we've achieved an assignment-insensitive semantic value.) Finally,
that function serves as input to ~some linguist. We have:

   • ~some linguist = λx.{z : z is a linguist} ∩ {z : x(z) = >} ≠ ∅

   So applying ~some linguist to our function
[Chomsky → ⊥, Partee → ⊥, Aristotle → ⊥, Plato → ⊥] yields:

   • (λx.{z : z is a linguist} ∩ {z : x(z) = >} ≠ ∅)([Chomsky → ⊥, Partee → ⊥, Aristotle → ⊥, Plato → ⊥])
   • = {z : z is a linguist} ∩ {z : [Chomsky → ⊥, Partee → ⊥, Aristotle → ⊥, Plato → ⊥](z) = >} ≠ ∅
   • = {z : z is a linguist} ∩ ∅ ≠ ∅
   • = ⊥
So we conclude that, in our scenario, some linguist admires every philosopher
is false when given the existential-universal reading. This is a triumph, because
the sentence should be false in that scenario – we don’t have any one linguist
who admires all of (both of) the philosophers.
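The two quantifier meanings used in this calculation translate directly into code. A sketch, with Python set comprehensions standing in for the {z : ...} notation (the function names and the dict encoding of (e, t) functions are ours):

```python
linguists = {"Chomsky", "Partee"}
philosophers = {"Aristotle", "Plato"}

def some_linguist(p):
    """((e, t), t): > iff {z : z is a linguist} ∩ {z : p(z) = >} ≠ ∅."""
    return len(linguists & {z for z in p if p[z]}) > 0

def every_philosopher(p):
    """((e, t), t): > iff {z : z is a philosopher} ⊆ {z : p(z) = >}."""
    return philosophers <= {z for z in p if p[z]}

# The all-⊥ (e, t) function computed for Tree ∃∀ above yields ⊥:
all_false = {x: False for x in linguists | philosophers}
assert some_linguist(all_false) is False
```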

(Keep in mind that strictly speaking, some linguist admires every philosopher
in its Tree ∃∀ reading gets a truth value relative to a variable assignment. But the
final sentence is assignment-insensitive, so it gets the same truth value (namely,
⊥, in our scenario) relative to every variable assignment. Notice that there are
three t nodes in Tree ∃∀. The lowest t node gets a truth value in a way that is
sensitive to both the first and the second objects in the variable assignment. The
middle t node, which occurs after application of the λ2 operator, gets a truth
value in a way that is sensitive to the first object in the variable assignment
but no longer to the second object in the variable assignment. And the top
t node, which occurs after application of the λ1 operator, isn't sensitive to the
first object in the variable assignment, either.)

Tree ∀∃: Now we need to see what happens when we process things using the
other tree. As before, we know that ~x1 admires x2  is > relative to σ(Chomsky,
Plato) and σ(Partee, Aristotle), but ⊥ relative to all other variable assignments.

But this time the next step is to apply λ1 rather than λ2 . When we apply λ1 , we
make a new (e, t) function by seeing what truth value is produced by setting
a given input object to be the first value in our variable assignment. That
means it won’t matter what object our starting variable assignment has in its
first position (since that object will just be ‘overwritten’ as we change the first
position), but it will matter what object our starting variable assignment has
in its second position. So there are four cases to distinguish for the result of
applying λ1 to ~x1 admires x2 :

1. Variable assignments of the form σ(·, Chomsky): As we vary the value of
the first position of the variable assignment, we will check whether that
object admires Chomsky. So the result of applying λ1 to ~x1 admires
x2 relative to a variable assignment of the form σ(·, Chomsky) is λx.x
admires Chomsky, or [Chomsky → ⊥, Partee → ⊥, Aristotle → ⊥, Plato → ⊥].
2. Variable assignments of the form σ(·, Partee): In these cases we check
whether the chosen object admires Partee, so the result of applying λ1 is λx.x
admires Partee, or [Chomsky → ⊥, Partee → ⊥, Aristotle → ⊥, Plato → ⊥].
3. Variable assignments of the form σ(·, Aristotle): Here application of λ1
produces λx.x admires Aristotle, or
[Chomsky → ⊥, Partee → >, Aristotle → ⊥, Plato → ⊥].
4. Variable assignments of the form σ(·, Plato): Here application of λ1
produces λx.x admires Plato, or
[Chomsky → >, Partee → ⊥, Aristotle → ⊥, Plato → ⊥].
We’ve now created an (e, t) value for λ1 x1 admires x2 , with the specific (e, t)
function depending on which variable assignment we’re relativizing to.

The next step is to apply the ((e, t), t) function ~some linguist to the (e, t)
functions we’ve just derived. We have:
• ~some linguist = λx.{z : z is a linguist} ∩ {z : x(z) = >} , ∅
Now we go through our four cases:

1. When we have σ(·, Chomsky), we apply λx.{z : z is a linguist} ∩ {z :
x(z) = >} ≠ ∅ to λx.x admires Chomsky:

   • (λx.{z : z is a linguist} ∩ {z : x(z) = >} ≠ ∅)(λx.x admires Chomsky)
   • = {z : z is a linguist} ∩ {z : z admires Chomsky} ≠ ∅
   • = {z : z is a linguist} ∩ ∅ ≠ ∅
   • = ∅ ≠ ∅
   • = ⊥
2. When we have σ(·, Partee), we apply λx.{z : z is a linguist} ∩ {z : x(z) =
>} ≠ ∅ to λx.x admires Partee:

   • (λx.{z : z is a linguist} ∩ {z : x(z) = >} ≠ ∅)(λx.x admires Partee)
   • = {z : z is a linguist} ∩ {z : z admires Partee} ≠ ∅
   • = {z : z is a linguist} ∩ ∅ ≠ ∅
   • = ∅ ≠ ∅
   • = ⊥
3. When we have σ(·, Aristotle), we apply λx.{z : z is a linguist} ∩ {z :
x(z) = >} ≠ ∅ to λx.x admires Aristotle:

   • (λx.{z : z is a linguist} ∩ {z : x(z) = >} ≠ ∅)(λx.x admires Aristotle)
   • = {z : z is a linguist} ∩ {z : z admires Aristotle} ≠ ∅
   • = {Chomsky, Partee} ∩ {Partee} ≠ ∅
   • = {Partee} ≠ ∅
   • = >
4. When we have σ(·, Plato), we apply λx.{z : z is a linguist} ∩ {z : x(z) =
>} ≠ ∅ to λx.x admires Plato:

   • (λx.{z : z is a linguist} ∩ {z : x(z) = >} ≠ ∅)(λx.x admires Plato)
   • = {z : z is a linguist} ∩ {z : z admires Plato} ≠ ∅
   • = {Chomsky, Partee} ∩ {Chomsky} ≠ ∅
   • = {Chomsky} ≠ ∅
   • = >
The upshot of all of that is that some linguist λ1 x1 admires x2 is true relative
to variable assignments that have Aristotle or Plato in their second position,
but false relative to variable assignments that have Chomsky or Partee in their
second position:

• ~some linguist λ1 x1 admires x2^σ(·, Chomsky) = ⊥

• ~some linguist λ1 x1 admires x2^σ(·, Partee) = ⊥

• ~some linguist λ1 x1 admires x2^σ(·, Aristotle) = >

• ~some linguist λ1 x1 admires x2^σ(·, Plato) = >


The next step is to apply λ2 to those variable-assignment-relativized truth
values. Applying λ2 produces a function that maps each object to the truth
value that results from a variable assignment containing that object in the
second position. So we can read the resulting (e, t) function off of the above
collection of truth values immediately:

• Applying λ2 to ~some linguist λ1 x1 admires x2 produces the function
[Chomsky → ⊥, Partee → ⊥, Aristotle → >, Plato → >].

The final step is then to use that (e, t) function as input to the ((e, t), t) typed
~every philosopher. We have:

• ~every philosopher = λx.{z : z is a philosopher} ⊆ {z : x(z) = >}

So we do a last bit of calculation:

• ~some linguist admires every philosopher = ~every philosopher
λ2 some linguist λ1 x1 admires x2

• = ~every philosopher(~λ2 some linguist λ1 x1 admires x2)

• = (λx.{z : z is a philosopher} ⊆ {z : x(z) = >})([Chomsky → ⊥, Partee → ⊥, Aristotle → >, Plato → >])

• = {z : z is a philosopher} ⊆ {z : [Chomsky → ⊥, Partee → ⊥, Aristotle → >, Plato → >](z) = >}

• = {Aristotle, Plato} ⊆ {Aristotle, Plato}

• = >
So the universal-existential reading Tree ∀∃ comes out true in our scenario.
That’s the right verdict, so it’s a complete triumph for our test run of the theory
of variable binding.
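As a final sanity check, the whole two-reading computation for this scenario compresses into a few lines of Python, with nested any/all playing the role of the moved quantifiers and λ operators (our own encoding, not the official machinery of the chapter):

```python
linguists = {"Chomsky", "Partee"}
philosophers = {"Aristotle", "Plato"}
# The true (subject, object) admiring pairs in the scenario:
admires_pairs = {("Chomsky", "Plato"), ("Partee", "Aristotle")}

def adm(x, y):
    """Truth value of 'x admires y'."""
    return (x, y) in admires_pairs

# Tree ∃∀: some1 linguist λ1 [every2 philosopher λ2 [x1 admires x2]]
exists_forall = any(all(adm(l, p) for p in philosophers) for l in linguists)
# Tree ∀∃: every2 philosopher λ2 [some1 linguist λ1 [x1 admires x2]]
forall_exists = all(any(adm(l, p) for l in linguists) for p in philosophers)

assert exists_forall is False  # the Tree ∃∀ verdict: ⊥
assert forall_exists is True   # the Tree ∀∃ verdict: >
```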

Problem 207: Consider the sentence:

• Socrates walked every dog in the park.

Show that this sentence is ambiguous, specifying two readings of
it. Then propose two post-movement trees for the two readings.
Finally, attempt a detailed calculation of the semantic values for both
trees. One of the two trees should be reasonably straightforward,
but the other may create some complications. Consider whether
you can use the same value for ~in in both cases.

Problem 208: Suppose we add one more lambda term to the top of one
of our trees for Some linguist admires every philosopher:

[λk [t [((e, t), t) [((e, t), ((e, t), t)) every2] [(e, t) philosopher]]
       [(e, t) λ2
          [t [((e, t), t) [((e, t), ((e, t), t)) some1] [(e, t) linguist]]
             [(e, t) λ1
                [t [e x1]
                   [(e, t) [(e, (e, t)) admires] [e x2]]]]]]]]

What happens at the new top node resulting from applying λk to
the result we calculated above? What type is the output semantic
value, and what specific semantic value within that type do we get?
How does the result vary as we vary the particular k that we use for
λk?

AW Resting On Our Laurels and Reflecting on How the Pieces Fit Together

With a lot of work, we've produced a very general theory of quantified noun
phrases. The three central ideas of this theory are:
1. Quantified noun phrases are always of type ((e, t), t).

2. Using syntactic movement rules, quantified noun phrases are systematically
moved to the tops of trees and adjoined to S nodes of type t.

3. λ terms convert the assignment-relativized behavior of type-t nodes into
new type (e, t) values.
By putting all quantified noun phrases in type ((e, t), t), we get a rich standard
sandbox for characterizing quantified noun phrases. We can work out general
structural features of this category, such as the monotonicity and strength/weakness
features we've explored above, and think about how different
quantified noun phrases are characterized by these structural features.

The price that’s paid for putting all of the quantified noun phrases in the
same typed sandbox is trouble in getting quantified noun phrases to combine
via functional application in a wide range of syntactic settings. Quantified
noun phrases can appear as subjects, as objects of transitive verbs, as objects
of prepositions, as indirect objects of ditransitive verbs, and so on. But when
those various syntactic contexts are built around an underlying core semantic
type e, they won’t all successfully interact with ((e, t), t).

The second central idea then pays that price. By using syntactic movement to
relocate quantified noun phrases to the tops of trees, adjoined to S nodes, we
put all quantified noun phrases in the same syntactic position, so that quanti-
fied noun phrases all need to interact with semantic values of whole sentences.
Thus we’re able to use the same semantic type for all quantified noun phrases.

The remaining difficulty is that the syntactic environment that we’ve put all
quantified noun phrases in doesn’t quite match the type-theoretic demands of
((e, t), t). By adjoining quantified noun phrases to S nodes, we’ve set them
up to receive type t inputs, which is not the right input for a ((e, t), t) function.
So the final central concept is the use of λ terms. These terms, together with
variable-assignment-relativized semantic values throughout, allow us to shift
the type t values of the S nodes back to (e, t) values that are suitable to serve as
inputs to type ((e, t), t).

So the resulting picture lets us freely deploy members of the nicely-explored ((e,
t), t) category in lots of different places in sentences, and to get a variety of scop-
ing interactions among those different members by using syntactic movement
to rearrange them in different orders at the tops of trees. It’s a very powerful
and elegant picture that gives us a rich theory of a useful fragment of natural
language.

Let’s consider a sketch of how this resulting picture can let us handle a compli-
cated example. Consider the sentence:
• Most linguists read few books in most libraries.

This sentence contains three quantified noun phrases: most linguists, few
books, and most libraries. We can situate all three of these quantified noun
phrases in our general theory by giving appropriate ((e, t), ((e, t), t)) semantic
values for most and few:
1. ~most = λx.λy.|x↓ ∩ y↓ | > |x↓ − y↓ |

2. ~few = λx.λy.x↓ ∩ y↓ has few members
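These set-theoretic determiner meanings are easy to prototype. Here is a minimal sketch in Python, modelling (e, t) values by the sets they pick out; the particular domain, and the threshold used to cash out "has few members", are illustrative assumptions:

```python
# ((e, t), ((e, t), t)) determiner meanings as operations on finite sets.
# An (e, t) value is modelled by the set it picks out (its "down"), so
# most(A)(B) checks |A ∩ B| > |A − B|. The "fewer than 3" threshold for
# few is an assumed stand-in for the text's vague "has few members".

def most(A):
    return lambda B: len(A & B) > len(A - B)

def few(A):
    return lambda B: len(A & B) < 3  # assumed threshold for "few"

linguists = {"Ada", "Bo", "Cy", "Di"}
readers = {"Ada", "Bo", "Cy"}

print(most(linguists)(readers))  # True: 3 linguist-readers vs. 1 non-reader
print(few(linguists)(readers))   # False: 3 is not fewer than 3
```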


We can then note that:
1. Most is right monotone up, left non-monotonic, and positive strong.

2. Few is right monotone down, left monotone down, and weak.


We have quantified noun phrases appearing in three contexts:
1. As subject of the sentence, positioned to combine with read few books
in most libraries, which should be type (e, t).

2. As object of the transitive verb read, which should be of type (e, (e, t)).
3. As object of the preposition in, which should be of type (e, ((e, t), (e, t))).
But our movement rules will pull all three quantified noun phrases to the top of
the tree, leaving type e variables behind that combine unproblematically with
all of those contexts. We get trees like:

most libraries
λ3

few books
λ2
in x3

most linguists λ1
x1 read
x2
Because there are three quantified noun phrases, there are in principle six dif-
ferent orders in which those three phrases can be moved to the top of the
tree. However, there’s an extra constraint at work here. The phrase most
libraries leaves behind the variable x3 (given the way we’re numbering the
variables). So we want most libraries to appear higher in the tree than λ3 ,
which in turn appears higher in the tree than x3 . But since x3 is the result of

the movement of most libraries out of few books in most libraries, we
therefore need most libraries to move higher in the tree than few books in
most libraries (which will then have become few books in x3 ).

That means there are really only three orders available for the three phrases (in
order from top to bottom):
1. most libraries, few books in most libraries, most linguists
2. most libraries, most linguists, few books in most libraries
3. most linguists, most libraries, few books in most libraries
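This counting claim can be checked mechanically. The sketch below filters the 3! = 6 orders down to three, using the constraint that most libraries must sit higher in the tree than few books (in most libraries):

```python
from itertools import permutations

# Of the 3! = 6 top-to-bottom orders for the three quantified noun phrases,
# only those in which "most libraries" sits higher than "few books (in most
# libraries)" are licensed, since the latter contains the trace x3 left by
# the former's movement.

phrases = ["most linguists", "few books", "most libraries"]

licensed = [
    order for order in permutations(phrases)
    if order.index("most libraries") < order.index("few books")
]

for order in licensed:
    print(order)
print(len(licensed))  # 3
```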

Problem 209: What happens if we use one of the three orderings that don’t obey our additional constraint, such as:

most linguists
λ1

few books
λ2
in x3

most libraries λ3
x1 read
x2
What final semantic value do we get for the entire sentence with
this tree if we apply our rules?

Problem 210: On another way of looking at things, we actually have six pieces to put in order in the tree for Most linguists read few
books in most libraries: the three quantified noun phrases most
linguists, few books in most libraries, and most libraries;
and also the three lambda terms λ1 , λ2 , and λ3 . With six things to
put in order, there are 720 possible orders for them, which means
we get a lot of additional orderings beyond the three standard ones
mentioned above. These other orderings include options such as:

λ2

λ1

most libraries

few books λ3
x1
in x3 read x2
What sort of semantic value results when we use one of these other
orders as the starting tree?

AX A Dark Secret Revealed

We’ve been a little cagey about how exactly these λ terms we have been using
work. In particular, we’ve carefully avoided putting a semantic typing on them.
It’s time to think more carefully about that issue. Consider a simple example
of the use of a lambda term:

((e, t), t) (e, t)

t
((e, t), ((e, t), t)) (e, t) λ
e (e, t)
some linguist
x1 laughed
If everything is going to proceed via functional application as normal, this tree
requires λ to be type (t, (e, t)), so that it can map the type t input it receives
from x1 laughed to the type (e, t) output that the type ((e, t), t) expression some
linguist requires.

But making λ of type (t, (e, t)) creates serious problems. Here’s a first-draft
statement of the problem:
• There are only two truth values in t: ⊤ and ⊥. That means there are only
two outputs that λ can produce, which means that there are only two (e,

t) values that can be provided as input to the quantified noun phrase. But
two different (e, t) values isn’t enough. Consider the following collection
of data:
1. First triad:
– Some cat is a reptile: ⊥
– Every cat is a reptile: ⊥
– No cat is a reptile: ⊤
2. Second triad:
– Some cat is hungry: ⊤
– Every cat is hungry: ⊥
– No cat is hungry: ⊥
3. Third triad:
– Some cat is a mammal: ⊤
– Every cat is a mammal: ⊤
– No cat is a mammal: ⊥
In each of these three triads, we use some expression to provide an (e, t)
input to three different quantified noun phrases: some cat, every cat,
and no cat. The inputs are provided by λ x1 is a reptile, λ x1 is
hungry, and λ x1 is a mammal.
But we get different patterns of outputs for each of the three inputs. When
we use λ x1 is a reptile to provide the input, some cat and every
cat produce ⊥ as output, but no cat produces ⊤ as output. However,
when we use λ x1 is hungry to provide the input, some cat produces
⊤ as output while every cat and no cat produce ⊥ as output. That
means that λ x1 is a reptile and λ x1 is hungry can’t be producing
the same output – if they did, they’d produce the same result when
combined with any ((e, t), t) expression.
And then λ x1 is a mammal produces yet another pattern when com-
bined with the various quantified noun phrases. For the input provided
by λ x1 is a mammal, some cat and every cat produce ⊤ as output
while no cat produces ⊥ as output. But now we need λ to be producing
three different outputs. But that’s impossible. Since λ takes a t value as
input, and since there are only two different t values, λ can produce only
two different outputs.
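The cardinality argument can be made concrete with a small model. In the sketch below, (e, t) values are again modelled as sets, and the cat and property facts are illustrative assumptions; the point is that the three inputs induce three distinct output patterns across the quantifiers, while any function out of the two-element type t has at most two distinct outputs:

```python
# The pigeonhole point made concrete. A type (t, (e, t)) function has at
# most two distinct outputs (one per truth value), but the three triads
# demand three pairwise-distinct (e, t) inputs to the quantified noun
# phrases. Domain and property assignments are illustrative.

cats = {"Tom", "Felix"}
reptiles = set()
hungry = {"Tom"}
mammals = {"Tom", "Felix"}

# ((e, t), t) values for the three quantified noun phrases, with (e, t)
# values again modelled as sets.
some_cat = lambda P: len(cats & P) > 0
every_cat = lambda P: cats <= P
no_cat = lambda P: len(cats & P) == 0

# Each input yields a different truth-value pattern across the three
# quantifiers, so the three inputs must be pairwise distinct...
patterns = {
    name: (some_cat(P), every_cat(P), no_cat(P))
    for name, P in [("reptile", reptiles), ("hungry", hungry), ("mammal", mammals)]
}
assert len(set(patterns.values())) == 3

# ...but any function out of the two-element type t has at most 2 outputs:
assert len({True, False}) == 2
```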
The first-draft statement of the problem is on the right track, but it doesn’t
get things exactly right. It’s true that there are only two members of type t,
so that if x1 laughed is type t, there are only two possible inputs to a type (t,
(e, t)) λ term. But this overlooks the fact that x1 laughed can have different t
values relative to different variable assignments. That was the whole point of
introducing variable assignment relativized semantic values – it wouldn’t be a
surprise if ignoring the relativization got us into trouble in understanding the

typing of the λ terms.

So let’s be careful about relativization of semantic values. We thus want to consider:
• ~λ x1 laughedσ

By functional application, this is:


• ~λσ (~x1 laughedσ)
Because λ doesn’t contain any variables, we will assume that its semantic value
is assignment-insensitive. So we can simplify this to:

• ~λ(~x1 laughedσ)
~x1 laughedσ can be different truth values for different choices of σ, so we can
get different outputs from the application of λ. That’s helpful, but it doesn’t fix
the underlying problem.

We’ll bring out the underlying problem in two different ways:


1. Suppose we are trying to make some cat is hungry work out. We have
two cats C1 and C2. C1 is hungry and C2 is not hungry. We’ll then
consider two assignment functions:

(a) σ1 assigns x1 to C1
(b) σ2 assigns x1 to C2
Then we will have:
(a) ~x1 is hungryσ1 = ⊤
(b) ~x1 is hungryσ2 = ⊥
What then happens when we apply λ to these assignment-relative truth
values? The whole point of paying attention to the assignment-relativity
was to get different inputs to λ (relative to different variable assignments),
so if we’re going to get anything useful out of this, λ had better produce
different outputs from these different inputs. So let’s assume:
(a) ~λ(~x1 is hungryσ1 ) = ~λ(⊤) = some (e, t) value V1 picking out
a set that has a non-empty overlap with the set of cats.
(b) ~λ(~x1 is hungryσ2 ) = ~λ(⊥) = some (e, t) value V2 picking out
a set that has an empty overlap with the set of cats.

We then have:
(a) ~some cat is hungryσ1 :
• ~some cat is hungryσ1

• = ~some cat λ x1 is hungryσ1
• = ~some catσ1 (~λ x1 is hungryσ1 )
• = ~some cat(~λσ1 (~x1 is hungryσ1 ))
• = ~some cat(V1)
• = ⊤
(b) ~some cat is hungryσ2 :
• ~some cat is hungryσ2
• = ~some cat λ x1 is hungryσ2
• = ~some catσ2 (~λ x1 is hungryσ2 )
• = ~some cat(~λσ2 (~x1 is hungryσ2 ))
• = ~some cat(V2)
• =⊥
But this is an undesirable result. We get the sentence some cat is hungry
to be true relative to one variable assignment and false relative to an-
other. However, some cat is hungry shouldn’t be assignment-sensitive.
Rather, it should be simply true (since there is, in fact, a cat C1 that is hun-
gry).
The central problem here is that if we are going to use the relativization of
semantic values to variable assignments to get a greater diversity of inputs
to λ, we have to accept relativization of semantic values throughout the
entire system. And that means we need to relativize to σ both above and
below the λ term in the tree – we get a diversity of relativized inputs to λ,
but we thereby also get a diversity of relativized outputs from λ, and thus
a diversity of relativized final truth values for the sentence. But that’s
not what we wanted. We wanted a single, unrelativized, truth value for
the final sentence, which means we wanted λ to somehow seal off the
relativization to variable assignment that x1 is hungry correctly shows.
But nothing in our current type-theoretic tools allows λ to perform that
sealing off.
2. Inheriting relativization of semantic values all the way up to the level of
complete sentences isn’t the only problem we face when we pay careful
attention to the effect of relativization on treating λ terms as type (t, (e,
t)). In addition, we’ll still find that we don’t get enough diversity in the
outputs of λ terms to get reasonable results.

AY Setwise Types

Instead of relativizing semantic values, we can write variable assignment functions into the basic machinery of our semantic type system. One way to do this
would be to introduce a new type s of variable assignments. Then variables
could be of type (s, e). In particular, we would have:

• ~x1  = λxs .x(1)
Proper names could also be of type (s, e), but would be constant functions that
didn’t depend on the input variable assignment, as in:
• ~Aristotle = λxs .Aristotle
In the same way, whole sentences could be of type (s, t), as in:
• ~x1 laughed = λxs .x(1) laughed
And sentences without free variables could also be of type (s, t), but again using
constant functions:
• ~Aristotle laughed = λxs .Aristotle laughed
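These setwise values can be sketched directly, modelling variable assignments as dicts from indices to entities; the set of laughers is an illustrative assumption:

```python
# A sketch of the setwise typing, with assignments modelled as dicts from
# variable indices to entities. Type (s, e) values are functions from
# assignments to entities; type (s, t) values are functions from
# assignments to truth values. The laughers set is illustrative.

laughers = {"Aristotle"}

# ~x1 = λx_s . x(1): depends on the assignment
x1 = lambda g: g[1]

# ~Aristotle = λx_s . Aristotle: a constant function of the assignment
aristotle = lambda g: "Aristotle"

# ~x1 laughed and ~Aristotle laughed, both of type (s, t)
x1_laughed = lambda g: x1(g) in laughers
aristotle_laughed = lambda g: aristotle(g) in laughers

g1, g2 = {1: "Aristotle"}, {1: "Chomsky"}
print(x1_laughed(g1), x1_laughed(g2))                # True False
print(aristotle_laughed(g1), aristotle_laughed(g2))  # True True
```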

AZ Another Use of ((e, t), t): Possessives

We’ve been considering a type ((e, t), t) semantic framework in which we can
handle a wide range of quantified noun phrases such as:
• some king
• no linguist
• most philosophers
Another construction that looks like it has the same syntactic form as these
quantified noun phrases is the possessive noun phrase:
• Chomsky read my book.
• Your car hit a deer.
• Aristotle’s objection stymied Plato.
• The unhappy linguist’s party depressed everyone.
Let’s see if we can use some of the same semantic tools to model the possessive
noun phrases.

We’ll start with Aristotle’s objection stymied Plato. First we need a syn-
tactic structure for the sentence. A first draft is:

stymied Plato
Aristotle’s objection

In this structure, Aristotle’s objection is a quantified noun phrase with
determiner Aristotle’s and noun objection.

In the usual second-order property ((e, t), t) way, we want:


• ~Aristotle’s objection = λx(e, t) .x(Aristotle’s objection)

That is, there is some specific object – Aristotle’s objection – and the semantic
value of Aristotle’s objection is the second-order property of being a prop-
erty that that object has.

It would then be nice to build up the semantic value of Aristotle’s objection from the semantic values of Aristotle’s and objection. Objection, of course,
is a common noun of type (e, t) (such that ~objection = λx.x is an objection),
so we want Aristotle’s to be acting as the type ((e, t), ((e, t), t)) determiner
combining with that common noun. The rough idea is that Aristotle’s should
take as input some property and then (i) find some object or objects having that
property that (ii) bear some relation to Aristotle and (iii) take the second-order
property of being a property that that object or those objects have. There are
two important choice points we confront when working out the details:
1. Uniqueness?: Does the use of Aristotle’s objection require that Aris-
totle have only one objection? Some uses of the possessive look like they
impose a uniqueness requirement, while others don’t. So compare:
(a) Socrates’s brother found him in the agora.
(b)

BA Another Use of ((e, t), t): Relative Clauses

Consider the sentence:


• The man who sold the world laughed.
This sentence contains the relative clause who sold the world. This relative
clause serves to restrict the quantification over men created by the use of the.
Relative clauses share this restricting role with adjectives and prepositional
phrases:
• The man who sold the world
• The tall dark man
• The man in the corner

In each case, the restrictor answers the question: which man? (The one who
sold the world, the tall dark one, the one in the corner.) We have a model
for how adjectives and prepositional phrases restrict the quantification. These
expressions are type ((e, t), (e, t)), so they map the type (e, t) man to a restricted
(e, t) which then provides the domain of quantification for the. Can we tell a
similar story for relative clauses?

Let’s start by considering our initial typing constraints. We have the tree:

((e, t), t) (e, t)

laughed
((e, t), ((e, t), t)) (e, t)

the
(e, t) ((e, t),(e, t))

man
(e, t)
who
(e, (e, t)) e

sold the world


So we need the relative pronoun who to transition from the type (e, t) value of sold the
world to a type ((e, t), (e, t)) restrictor value for the entire relative clause. That
means who needs to be of type ((e, t), ((e, t), (e, t))). And, in fact, it looks like
what we want is a familiar member of that type. Man who sold the world
should pick out the intersection of things that are men with things that sold
the world, so we want who to have the semantic value of INTERSECT, which
imposes intersective application of adjectives. Thus:
• ~who = λx(e, t) .λy(e, t) .λze .(x(z) ∧ y(z))
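This intersective value is straightforward to model, again treating (e, t) values as Python predicates; the domain and the membership facts are illustrative:

```python
# The intersective value for who, with (e, t) values modelled as Python
# predicates. Who first takes the relative clause property, then the noun
# property, and intersects them. The membership facts are illustrative.

who = lambda x: lambda y: lambda z: x(z) and y(z)

sold_the_world = lambda z: z in {"Bowie"}
man = lambda z: z in {"Bowie", "Plato"}

# man who sold the world: who applies to the clause, then to man
man_who_sold_the_world = who(sold_the_world)(man)

print(man_who_sold_the_world("Bowie"))  # True
print(man_who_sold_the_world("Plato"))  # False: a man, but not a world-seller
```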

Problem 211: Using this semantic value for who as well as plausible
semantic values for the other words in the sentence, calculate the
final semantic value of The man who sold the world laughed. Is
the result plausible?

Problem 212: For simplicity, we treated the world as type e rather than type ((e, t), t) in our tree above. Give another analysis of the
sentence treating the world as a quantified noun phrase of type
((e, t), t). To do this, you’ll need to reconsider both the syntax and

the semantics of the sentence. Does the more complicated semantic
treatment of the world have an effect on the appropriate semantic
value for who?

However, other cases of relative clauses don’t work out so nicely. Consider:
• The man who I admire laughed
Everything looks fine when we first consider the typing of the tree:

((e, t), t) (e, t)

laughed
((e, t), ((e, t), t)) (e, t)

the
(e, t) ((e, t),(e, t))

man
((e, t), ((e, t), (e, t))) (e, t)

who e (e, (e, t))

I admire
But trouble is lurking. A first sign of the lurking trouble is that this tree makes
I admire a constituent, but it shouldn’t be. Admire is a transitive verb, and
should combine with its object, not with the subject I, to form a constituent.
And the full trouble comes out when we consider the semantic consequences
of those constituency facts. Because the semantic value of admire is λx.λy.y
admires x, admire can combine with I, but the combination produces the (e,
t) property admiring me. And as a result, the entire sentence ends up meaning
that the man who admires me laughed, which is the wrong meaning.
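The wrong result can be exhibited directly. In this sketch the admiration facts are an illustrative assumption; the point is that feeding admire its subject rather than its object yields the property of admiring me, not of being admired by me:

```python
# The trouble made concrete: admire's (e, (e, t)) value takes its *object*
# first, so feeding it the subject I produces the property "admires me"
# rather than "admired by me". The admiration facts are illustrative.

admires_pairs = {("I", "Bowie")}  # I admire Bowie; Bowie does not admire me

# ~admire = λx.λy. y admires x
admire = lambda x: lambda y: (y, x) in admires_pairs

# Combining admire directly with the subject I, as the bad tree forces:
wrong = admire("I")               # the property of admiring me
right = lambda x: admire(x)("I")  # the property of being admired by me

print(wrong("Bowie"))  # False: Bowie does not admire me
print(right("Bowie"))  # True: I admire Bowie
```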

Problem 213: Using the above tree for The man who I admire
laughed, calculate the final truth value for the sentence, and confirm
that it has the (incorrect) truth conditions that the man who admires
me laughed.

We can start to see what went wrong by noticing that, in a less colloquial register
of English, the who in The man who I admire laughed should have been whom.
That’s because the relative pronoun who/whom is, in some sense, playing the
role of direct object for admire. What we want is a movement-based syntactic

analysis something along the lines of:

laughed
the
man
whom
I
admire whom

If we then assume that when whom moves, it leaves a variable of type e, we can
begin typing the tree to figure out how the moved whom functions:

((e, t), t) (e, t)

laughed
((e, t), ((e, t), t)) (e, t)

the
(e, t)
t
man whom
e (e, t)

I (e, (e, t)) e

admire x1
We need some method of transitioning from a t value (for I admire x1 ) to
something (e, t)-like. But this is a familiar problem, especially given the pres-
ence of a variable. We need to insert a lambda abstractor:

t

((e, t), t) (e, t)

laughed
((e, t), ((e, t), t)) (e, t)

the
(e, t) ((e, t), (e, t))

man
((e, t), ((e, t), (e, t))) (e, t)

whom t
λ1
e (e, t)

I (e, (e, t)) e

admire x1
And now all the typing works out. Furthermore, we can use the intersective
semantic value for whom that we used above for who. Sketching the derivation,
we have:
• ~I admire x1 g = ⊤ if and only if I admire g(1).
• So ~λ1 I admire x1  is the (e, t) function that maps an object to ⊤ if I
admire that object.

• So ~whom λ1 I admire x1  is a function that maps an (e, t) property to
the intersective conjunction of that property with the property of being
admired by me.
• So ~man whom λ1 I admire x1  is the property of being both a man and
something I admire.

• So The man whom I admire laughed is true if and only if the unique
thing that is both a man and something I admire laughed.
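The derivation just sketched can be modelled by treating assignment-relative values as functions from assignment dicts, with λ1 implemented as abstraction over what the assignment sends 1 to; the admiration and man facts are illustrative:

```python
# A sketch of the derivation, with assignment-relative values modelled as
# functions from assignment dicts. λ1 turns the assignment-sensitive t
# value of "I admire x1" into an (e, t) value by shifting what the
# assignment sends 1 to. The facts below are illustrative.

i_admire = {"Russell"}  # the set of things I admire

# ~I admire x1 relative to an assignment g
i_admire_x1 = lambda g: g[1] in i_admire

# ~λ1 S: the (e, t) function λd . ~S relative to g[1 ↦ d]
def lam1(S, g):
    return lambda d: S({**g, 1: d})

# ~whom: the same intersective value used for who
whom = lambda x: lambda y: lambda z: x(z) and y(z)

man = lambda z: z in {"Russell", "Plato"}

g = {1: "Plato"}  # the starting assignment is immaterial
man_whom_i_admire = whom(lam1(i_admire_x1, g))(man)

print(man_whom_i_admire("Russell"))  # True: a man I admire
print(man_whom_i_admire("Plato"))    # False: a man, but not one I admire
```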

Problem 214: Relative pronouns can also occur as objects of prepositions, as in:
• The man for whom the bell tolled died.

Determine whether the intersective ((e, t), ((e, t), (e, t))) treatment of
who given above will also work for these uses of relative pronouns.
In doing so, consider what the right syntactic analysis of The man
for whom the bell tolled died should be, give a speculative se-
mantic value for for, and determine what truth conditions for the
sentence are produced.

Problem 215: Relative clauses can be both restrictive and non-restrictive. Non-restrictive relative clauses, typically offset by commas, don’t restrict the scope of the quantified noun phrases in which
they occur. Thus compare:
• The tall man who bought the car left.
• The tall man, who bought the car, left.
What effect on truth conditions do non-restrictive relative clauses
have? How can we adapt the semantics for relative clauses given
in this section to deal with non-restrictive relative clauses? Do we
need a difference in the syntactic analysis of the relative clause, in
the semantics of the relative clause, or both?

Problem 216: Consider a sentence with a relative clause that itself contains a quantified noun phrase, such as:
• Some linguist whom every philosopher admires laughed.
• Every philosopher whom some linguist admires laughed.
Different parts of our overall machinery require (i) movement of
the quantified noun phrase out of the subject position in the rela-
tive clause, and (ii) movement of the relative pronoun out of the
object position in the relative clause. Because two movements are
happening, there are at least two different trees that could result:

1.

λ3
some x3 laugh

linguist

whom

λ1

every philosopher λ2
x2 .
admires x1
2.

every philosopher

λ2

λ3
some x3 laugh
linguist
whom
λ1

x2 .
admires x1
Calculate final semantic values for both of these trees. Are both
truth conditions available as readings of the English sentence Some
linguist whom every philosopher admires laughed? Consider

other quantified noun phrases embedded in relative clauses and
their scope interactions with other quantified noun phrases in the
same sentence. Are there any good generalizations on which scope
readings are and are not available?

Problem 217: Relative clauses can be nested inside relative clauses, as in:
• The linguist who admires every philosopher who admires
some linguist laughed.
Draw a tree for this sentence and use the intersective semantic value
for who to calculate its truth conditions. Is the result plausible, given
your understanding of the sentence?

When relative clauses use relative pronouns moved out of object position (whom), the use of the relative pronoun is optional in English.
Compare:
• The linguist whom the philosopher admires
• The linguist the philosopher admires
When the relative pronoun is omitted, these relative clause con-
structions can be used to create difficult-to-parse center embedding
constructions. For example:
• The mouse the cat the dog bit chased squeaked.
Give a tree for this sentence and calculate full semantic values for
that tree. The mouse the cat the dog bit chased squeaked is a
depth-three center embedding. Give another example with a depth-
four center embedding, and draw a tree for that example.

BB Another Use of ((e, t), t): Pronouns

BC Possible Worlds

The framework we’ve been developing is centered around the thesis that the
semantic type of a full sentence is t. The assumption that the root node of a
sentence’s tree is type t helps drive the typing of the other nodes. But there is
a disadvantage to typing sentences t.

The following sentences are both true:


• Aristotle is a philosopher.

• Chomsky is a linguist.
But since sentences have semantic value of type t, we have:
• ~Aristotle is a philosopher = > = ~Chomsky is a linguist.
Thus Aristotle is a philosopher and Chomsky is a linguist have the
same semantic value. But that means that the two sentences are synonymous,
and that’s obviously an unacceptable conclusion.

Problem 218: Consider the following response to the synonymy problem just raised:
But that can’t be right. Suppose we have:
1. ~Aristotle = Aristotle
2. ~Chomsky = Chomsky
3. ~is a philosopher = λx.x is a philosopher
4. ~is a linguist = λx.x is a linguist
Then we’ll straightforwardly get:
• ~Aristotle is a philosopher = Aristotle is a philoso-
pher
• ~Chomsky is a linguist = Chomsky is a linguist
So the two sentences don’t have the same semantic value
after all, and are not synonymous.
What is wrong with that response?

More generally, the way things are currently set up, (i) every sentence has a
semantic value of type t, and (ii) there are only two truth values in t. Thus
each sentence has one of only two possible semantic values. All of the true
sentences are synonymous with one another, and all of the false sentences are
synonymous with one another.

We don’t want Aristotle is a philosopher and Chomsky is a linguist to be synonymous. Here’s one reason for thinking the two sentences shouldn’t be
synonymous:

Possible Difference: Although Aristotle is a philosopher and Chomsky is a linguist are in fact both true, they could have had
different truth values. It could have been that Aristotle was a
philosopher but Chomsky wasn’t a linguist. In that case Aristotle
is a philosopher would be true and Chomsky is a linguist would
be false. And it could have been that Chomsky was a linguist
but Aristotle wasn’t a philosopher. In that case Aristotle is a
philosopher would be false and Chomsky is a linguist would
be true. But if the two sentences really meant the very same thing,
it wouldn’t be possible for them to have different truth values.

To account for the semantic difference between Aristotle is a philosopher
and Chomsky is a linguist we’ll thus add tools for talking about the possible
truth values of sentences under various possible circumstances.

The central tools we need are possible worlds. A possible world is a maximally
specific state of affairs – a state of affairs that fully specifies everything about
the world, leaving nothing undetermined. For our purposes, we can think of
worlds as being given in two steps:
1. First we give a modelling language. The modelling language contains a
limited number of sentences that specify the basic features of a possible
world – roughly, enough features that we’ve fixed exactly how everything
is in the world, but not so many features that there are redundancies in
how we describe the world.
2. Second, we give a particular possible world by setting the truth value of
each sentence in the modelling language.

For our current example, the choice of modelling language is particularly straightforward. Our modelling language contains two sentences:
1. Aristotle is a philosopher.
2. Chomsky is a linguist.

Given this modelling language, there are then four possible worlds we can
consider:
1. World w1 :
• Aristotle is a philosopher: ⊤
• Chomsky is a linguist: ⊤
2. World w2 :
• Aristotle is a philosopher: ⊤
• Chomsky is a linguist: ⊥
3. World w3 :
• Aristotle is a philosopher: ⊥
• Chomsky is a linguist: ⊤
4. World w4 :
• Aristotle is a philosopher: ⊥
• Chomsky is a linguist: ⊥

Notice that Aristotle is a philosopher is true only in w1 and w2 , while
Chomsky is a linguist is true only in w1 and w3 . That’s promising, because
it marks a difference between these two sentences that we were struggling to
differentiate using our previous semantic tools.

To make good on that promise, we’ll return to the idea of relativizing semantic
values that we investigated earlier. Previously we relativized semantic values
to variable assignments, but now we will relativize semantic values to possible
worlds. Instead of saying:
• ~Aristotle is a philosopher = ⊤

we now say all of:
1. ~Aristotle is a philosopherw1 = ⊤
2. ~Aristotle is a philosopherw2 = ⊤
3. ~Aristotle is a philosopherw3 = ⊥
4. ~Aristotle is a philosopherw4 = ⊥
So we can introduce a new kind of semantic value for sentences. We’ll use
||Aristotle is a philosopher|| to pick out the set of worlds in which Aristotle
is a philosopher is true:

• ||Aristotle is a philosopher|| = {w : ~Aristotle is a philosopherw = ⊤}
So given what we’ve already said, we can see that ||Aristotle is a philosopher||
= {w1 , w2 }.

Similarly, we can see that ||Chomsky is a linguist|| = {w1 , w3 }. This gives us a kind of semantic value that differentiates Aristotle is a philosopher and
Chomsky is a linguist. We can observe simultaneously:
1. If the actual world is w1 , the ~· values of Aristotle is a philosopher
and Chomsky is a linguist are the same relative to the actual world:

• ~Aristotle is a philosopherw1 = ~Chomsky is a linguistw1
2. The worldly ||·|| semantic values of Aristotle is a philosopher and
Chomsky is a linguist are different:
• ||Aristotle is a philosopher|| ≠ ||Chomsky is a linguist||
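Both observations can be verified computationally over the four-world space, modelling worlds as dicts from modelling-language sentences to truth values:

```python
# The four-world space and the worldly ||·|| values, computed directly.
# Worlds are modelled as dicts assigning truth values to the two
# sentences of the modelling language.

AP, CL = "Aristotle is a philosopher", "Chomsky is a linguist"

worlds = {
    "w1": {AP: True, CL: True},
    "w2": {AP: True, CL: False},
    "w3": {AP: False, CL: True},
    "w4": {AP: False, CL: False},
}

def worldly(sentence):
    """||sentence||: the set of worlds in which the sentence is true."""
    return {name for name, w in worlds.items() if w[sentence]}

print(sorted(worldly(AP)))  # ['w1', 'w2']
print(sorted(worldly(CL)))  # ['w1', 'w3']

# Same truth value relative to w1, yet different worldly values:
assert worlds["w1"][AP] == worlds["w1"][CL]
assert worldly(AP) != worldly(CL)
```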

Problem 219: Given the space of four worlds we describe above, how many different worldly semantic values are possible? For
each possible worldly semantic value, give a sentence that has that
worldly semantic value.

Two important notes about worldly semantic values:
1. Worldly semantic values, at least as we’ve designed them so far, are
features only of entire sentences. ||Aristotle||, for example, is just unde-
fined.

Problem 220: It isn’t actually true, given what we’ve said so far,
that ||Aristotle|| is undefined. If we just apply the definition,
what do we get for ||Aristotle||? Explain why, however, that’s
not a reasonable or helpful result to get, and why we should just
treat subsentential expressions like Aristotle as being outside
the domain of the ||·|| function.

So in particular, ||Aristotle is a philosopher|| is not built up out of some component pieces ||Aristotle||, ||is||, ||a||, and ||philosopher||. That’s
just because there are no such things as ||Aristotle||, ||is||, and so on –
so of course there’s no way that ||Aristotle is a philosopher|| can be
built up out of those non-existent pieces.
2. When one sentence is a component part of another sentence – as, for ex-
ample, Aristotle is a philosopher is a component part of Aristotle
is not a philosopher – both sentences have worldly truth values. But
the worldly truth value of the larger sentence is not built up from the
worldly truth value of the smaller sentence in the usual manner of func-
tional application that we’ve been considering. Thus:
• ||Aristotle is a philosopher|| = {w1 , w2 }
• ||Aristotle is not a philosopher|| = {w3 , w4 }
We aren’t getting ||Aristotle is not a philosopher|| by functionally
applying ||Aristotle is a philosopher|| to something, because ||Aristotle
is a philosopher|| isn’t a function. And we aren’t getting ||Aristotle
is not a philosopher|| by functionally applying ||not|| to ||Aristotle
is a philosopher||, because there’s no such thing as ||not||. And we
aren’t getting ||Aristotle is not a philosopher|| by functionally ap-
plying ~not to ||Aristotle is a philosopher||, because ~not is type (t,
t) and needs a truth value as an input, but ||Aristotle is a philosopher||
is a set of worlds, not a truth value.
Both of these features of worldly semantic values are results of the fact that
worldly semantic values are derivative features of sentences. The core feature
of the theory is the range of ~· semantic values. It’s these values that are built
up from values of smaller expressions to values of larger expressions using
functional application. When we add relativization to worlds, we can then
define worldly ||·|| values out of ~· values.

Problem 221: If we use a modelling language with n sentences, how many possible worlds do we get? Call two sentences S1 and S2

worldly equivalent if ||S1 || = ||S2 ||. Given our modelling language
with n sentences, what is the maximum number of non-worldly-
equivalent sentences?

Problem 222: Pick a modelling language and sketch the space of worlds produced by that modelling language. Then give three
sentences S1 , S2 , and S3 such that ||S1 || ⊆ ||S2 || ⊆ ||S3 ||. Is there any
special way that two sentences must be related so that the worldly
value of one is a subset of the worldly value of the other?

BD Worlds and Connectives

We saw above that ||Aristotle is a philosopher|| = {w1 , w2 }. What about ||Aristotle is not a philosopher||?

The two sentences Aristotle is a philosopher and Aristotle is not a philosopher always have opposite truth values. Any time Aristotle is a
philosopher is true, Aristotle is not a philosopher is false. And any time
Aristotle is a philosopher is false, Aristotle is not a philosopher is
true. This pattern persists when we consider world-relativized truth values:
• For any world w, if ~Aristotle is a philosopherw = >, then ~Aristotle
is not a philosopherw = ⊥.

• For any world w, if ~Aristotle is a philosopherw = ⊥, then ~Aristotle


is not a philosopherw = >.
This lets us make two observations:
• Because Aristotle is a philosopher is true in w1 and w2 , Aristotle
is not a philosopher is false in w1 and w2 .

• Because Aristotle is a philosopher is false in w3 and w4 , Aristotle


is not a philosopher is true in w3 and w4 .
Combining these observations shows that ||Aristotle is not a philosopher||
= {w3 , w4 }.

Notice two things about ||Aristotle is not a philosopher||:


1. ||Aristotle is a philosopher|| and ||Aristotle is not a philosopher||
are disjoint. There are no worlds that are in both sets.
2. ||Aristotle is a philosopher|| and ||Aristotle is not a philosopher||
are exhaustive. Between the two sets, they cover all of the worlds. Each
of w1 through w4 is contained in one of the two sets.

Call the set of all of the possible worlds W. Then we can say:
• ||Aristotle is not a philosopher|| = W - ||Aristotle is a philosopher||
This feature generalizes to the interaction of every sentence with negation:
Claim: Given any sentence S, ||not S|| = W - ||S||.

Proof: We show two things: (i) any world in ||S|| is not in ||not S||,
and (ii) any world not in ||S|| is in ||not S||.
1. Suppose w ∈ ||S||. Then ~Sw = >. Since negation inverts truth
values, we have ~not Sw = ⊥. So w ∉ ||not S||.
2. Suppose w ∉ ||S||. Then ~Sw = ⊥. Since negation inverts truth
values, we have ~not Sw = >. So w ∈ ||not S||.
From this, we see that ||S||∩||not S|| = ∅ and ||S||∪||not S|| = W. Thus
||not S|| = W - ||S||.
Thus at the level of worldly values, the effect of negation is set-theoretic comple-
mentation – the worldly value of the negation of any sentence is the complement
of the worldly value of the original sentence.
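The complementation rule is easy to check concretely. Here is a small sketch in Python that models worldly values as sets of worlds; the four-world space and the variable names are illustrative assumptions, not part of the formal theory:

```python
# A hypothetical four-world space standing in for W (matching w1..w4 above).
W = {"w1", "w2", "w3", "w4"}

# A worldly value ||S|| is just the set of worlds in which S is true.
aristotle_is_a_philosopher = {"w1", "w2"}

def negate(s):
    """||not S|| = W - ||S||: negation is set-theoretic complementation."""
    return W - s

assert negate(aristotle_is_a_philosopher) == {"w3", "w4"}
```

As the assertion shows, the complement of {w1, w2} in the four-world space is exactly the {w3, w4} computed above for Aristotle is not a philosopher.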

Problem 223: What is the relation between ||S|| and ||not not S||?

We can find similar rules connecting other connectives to worldly truth values:

1. And: Suppose we have two sentences A and B, and we know ||A|| and
||B||. Then we can from that information determine ||A and B||:
Claim: ||A and B|| = ||A|| ∩ ||B||

Proof: We show two things: (i) any world in both ||A|| and ||B||
is also in ||A and B||, and (ii) any world in ||A and B|| is in both
||A|| and ||B||.
(a) Suppose w ∈ ||A|| and w ∈ ||B||. Then ~Aw = > and ~Bw =
>. But from this it follows that ~A and Bw = >. Thus w ∈
||A and B||.
(b) Suppose w ∈ ||A and B||. Then ~A and Bw = >. But from
this it follows that ~Aw = > and ~Bw = >. Thus w ∈ ||A||
and w ∈ ||B||.
Therefore ||A and B|| = ||A|| ∩ ||B||.
Thus at the level of worldly values, the effect of conjunction is intersection
– the worldly value of the conjunction of two sentences is the intersection
of the worldly values of the two original sentences.

Problem 224: Let A be an arbitrary sentence. What can we say


about the following:

(a) ||A and not A||
(b) ||not(A and not A)||

Problem 225: Prove that for any sentence S, ||S|| = ||S and S||.

2. Or: Suppose we have two sentences A and B, and we know ||A|| and ||B||.
Then we can from that information determine ||A or B||:

Claim: ||A or B|| = ||A|| ∪ ||B||.

Proof: We show two things: (i) any world that is in either of


||A|| or ||B|| is also in ||A or B||, and (ii) any world in ||A or B||
is in at least one of ||A|| and ||B||.
(a) Suppose w ∈ ||A||. Then ~Aw = >. From this it follows that
~A or Bw = >, and thus that w ∈ ||A or B||. Similarly,
suppose w ∈ ||B||. Then ~Bw = >. From this it follows that
~A or Bw = >, and thus that w ∈ ||A or B||.
(b) Suppose w ∈ ||A or B||. Then ~A or Bw = >. But if that
disjunction is true at w, then at least one disjunct is true
at w. Thus either ~Aw = > or ~Bw = > (or both). But if
~Aw = > then w ∈ ||A||, and if ~Bw = > then w ∈ ||B||. Thus
w is in either ||A|| or ||B||.
Therefore ||A or B|| = ||A|| ∪ ||B||.

Thus at the level of worldly values, the effect of disjunction is union –


the worldly value of the disjunction of two sentences is the union of the
worldly values of the two original sentences.
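The intersection and union rules can likewise be checked concretely. In this sketch the four-world space and the particular choices of ||A|| and ||B|| are assumptions made for illustration:

```python
# Hypothetical worldly values over a four-world space.
W = {"w1", "w2", "w3", "w4"}
A = {"w1", "w2", "w3"}   # ||A||, chosen for illustration
B = {"w2", "w3", "w4"}   # ||B||, chosen for illustration

def conj(a, b):
    """||A and B|| = ||A|| ∩ ||B||: conjunction is intersection."""
    return a & b

def disj(a, b):
    """||A or B|| = ||A|| ∪ ||B||: disjunction is union."""
    return a | b

assert conj(A, B) == {"w2", "w3"}
assert disj(A, B) == {"w1", "w2", "w3", "w4"}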

Problem 226: Show that for any sentences A, B, and C we have


the following:
(a) ||A or B|| = ||not (not A and not B)||.
(b) ||A and (B or C)|| = ||(A and B) or (A and C)||.
Given that one of the reasons for introducing worldly val-
ues was to provide different semantic values for sentences like
Aristotle is a philosopher and Chomsky is a linguist that
have the same (actual) truth value, how worried should we be
about the sameness of worldly values for such pairs of sen-
tences?

3. If: Let’s suppose we have a simple-minded theory of if that just uses


the truth table for the material conditional, so that If A, B is true if A is
false or B is true, and If A, B is false if A is true and B is false.
• Unlike and and or, the syntax of if is pretty friendly to a binary-
branching picture. We can have:

[ [if A] B ]
and then define ~if = λxt .λyt .(x = ⊥ ∨ y = >)
Now suppose we have two sentences A and B and we know ||A|| and ||B||.
Then we can from that information determine ||if A, B||:

Claim: ||If A, B|| = (W - ||A||) ∪ ||B||.

Proof: We show two things: (i) any world that is in ||if A, B||
is either in ||B|| or not in ||A||, and (ii) any world that is either in
||B|| or not in ||A|| (or both) is in ||if A, B||.
(a) Suppose w ∈ ||if A, B||. Then ~if A, Bw = >. Therefore
either ~Aw = ⊥ or ~Bw = >. If ~Aw = ⊥ then w ∉ ||A||,
so w ∈ W - ||A||, so w ∈ (W - ||A||) ∪ ||B||. And if ~Bw = >, then
w ∈ ||B||, so w ∈ (W - ||A||) ∪ ||B||. So in either case, w ∈ (W -
||A||) ∪ ||B||.
(b) Suppose w ∈ W - ||A||. Then w ∉ ||A||. Thus ~Aw = ⊥, so
~if A, Bw = >. Thus w ∈ ||if A, B||. Suppose instead
that w ∈ ||B||. Then ~Bw = >, so ~if A, Bw = >. Thus
again w ∈ ||if A, B||. So in either case, w ∈ ||if A, B||.
Therefore ||if A, B|| = (W - ||A||) ∪ ||B||.
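The conditional rule can be checked in the same set-based sketch; the world space and the worldly values of A and B are again illustrative assumptions:

```python
W = {"w1", "w2", "w3", "w4"}   # hypothetical world space
A = {"w1", "w2"}               # ||A||, chosen for illustration
B = {"w2", "w3"}               # ||B||, chosen for illustration

def cond(a, b):
    """||if A, B|| = (W - ||A||) ∪ ||B||, on the material-conditional reading."""
    return (W - a) | b

# True everywhere except the worlds where A holds but B fails (here, just w1).
assert cond(A, B) == {"w2", "w3", "w4"}
```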

Problem 227: Show that ||A|| ⊆ ||B|| if and only if ||if A, B|| =
W.

Problem 228: Show that each of the following is true:


(a) ||A and (if A, B)|| = ||A and B||.
(b) ||(if A, B) and not B|| = ||not A and not B||.

The truth-functional connectives thus create distinctive set-theoretic effects on


worldly values:
1. Negation is set-theoretic complementation.
2. Conjunction is intersection.
3. Disjunction is union.
4. Other connectives can have their set-theoretic effects built up by writing
them in terms of negation, conjunction, and disjunction.
Problem 229: Write the truth-functional biconditional connective
↔ in terms of negation, conjunction, and disjunction, and use the
resulting expression to give a rule for determining ||A if and only
if B|| in terms of ||A|| and ||B||.

BE Modals

So far the relativization of truth values to worlds is an idle wheel in our for-
malism. Having world-relativized truth values lets us distinguish between
sentences that have the same unrelativized truth value, and lets us make some
useful observations about the connection between truth-functional connectives
and operations on sets. But we don't do anything with world-relativized truth
values – they don't serve as input to any further mechanism in our formal
machinery.

But there are parts of language that do specifically require world-relativized


truth values. We’ll start by considering modals. Modals are a large category of
expressions that are used to say not just what did happen, but what could have,
would have, or must have happened under other possible circumstances. Thus
consider sentences such as:
1. Necessarily, tigers are mammals.
2. Tigers must be mammals.
3. Tigers have to be mammals.
4. Possibly, some tigers are purple.
5. Tigers might have been purple.
6. Tigers can be purple.
There is a lot of syntactic variation in how modals appear in English, but for
now let’s simplify things by focusing on modals that act as sentence modifiers –
modals that serve as sibling nodes to S nodes, having another S node as parent.
For example:
[S Necessarily [S tigers are mammals]]
And now we can quickly see why we need world-relativized truth values to
make sense of modals. We start with four observations:
1. Tigers are mammals is true.
2. Tigers are striped is true.
3. Necessarily tigers are mammals is true.
4. Necessarily tigers are striped is false.

If necessarily combined with non-relativized simple t values, then necessarily
would receive the same input when combined with either of Tigers are
mammals or Tigers are striped (namely, >). But if necessarily receives
the same input, it will produce the same output, so we would be forced to give
the same truth value to Necessarily tigers are mammals and Necessarily
tigers are striped. Since we want those two sentences to have different
truth values, rather than the same truth value, we need necessarily to take
something other than simple truth values as input.

World-relativized truth values will do the trick. Tigers are mammals and
Tigers are striped don’t have the same truth value relative to every world.
Precisely because Necessarily tigers are striped is false, we know there is
some possibility of non-striped tigers. Let w be a world in which tigers are not
striped. Since Necessarily tigers are mammals is true, tigers are mammals
in w. Thus we have:
• ~Tigers are mammalsw = >

• ~Tigers are stripedw = ⊥


and thus Tigers are mammals and Tigers are striped don’t have the same
truth values relative to every world. Using world-relativized truth values as the
input to necessarily thus lets us distinguish between the two necessity claims.

However, as with the earlier relativization to variable assignments, we need


some extra machinery to let us make use of the world-relativized truth values.
And as with variable assignments, we can do this in two ways. Whichever way
we do it, we need to add a new basic type to our type theory. We’ll let w be
the type of possible worlds. Then we can proceed along either of the following
routes:

1. First Approach: We use world-relativized semantic values, and add λ


terms to convert world-relativized values into properties of worlds. We
thus add a new kind of λ term to the syntax, so that Necessarily, tigers
are mammals gets the structure:

[ Necessarily [ λw [ tigers are mammals ] ] ]

The two full sentence nodes for tigers are mammals and necessarily
tigers are mammals are of type t, but relativized to a choice of world:

[t Necessarily [ λw [t tigers are mammals] ] ]
We aren’t worrying yet about how to incorporate world-relativity into
semantic values at the sub-sentential level, so we won’t worry here about
the typing of tigers, are, or mammals. λw will then convert the world-
relativized t of tigers are mammals into a property of worlds, and hence
a value of type (w, t). Finally, necessarily provides a second-order prop-
erty of worlds (just as quantified noun phrases provide a second-order
property of objects), and is of type ((w, t), t):

[t [((w, t), t) Necessarily] [(w, t) λw [t tigers are mammals]]]
Now we need to characterize λw and give a specific semantic value for
necessarily:
(a) λw takes as input a node with world-relativized truth-values and
outputs a (w, t) function. In particular, if N is the input node, then
(λw N)(w) = > if and only if ~Nw = >.
(b) Necessarily takes a (w, t) function as input, and produces > as
output if and only if the (w, t) input maps every world to >. Thus
~Necessarilyu = λx(w, t) .∀w x(w) = >. (For any world u – note
that while necessarily gets a world-relativized semantic value, it
is insensitive to the choice of world.)
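These two rules can be sketched concretely. In the following Python sketch, the dictionary encoding of world-relativized truth values and all of the names are illustrative assumptions; the point is just that λw wraps a world-relativized value into a (w, t) function, and necessarily tests that function for universality:

```python
# A hypothetical four-world space; a sentence's world-relativized value is
# modeled as a dict from world labels to truth values.
W = ["w1", "w2", "w3", "w4"]

tigers_are_mammals = {"w1": True, "w2": True, "w3": True, "w4": True}
tigers_are_striped = {"w1": True, "w2": True, "w3": True, "w4": False}

def lambda_w(relativized):
    """The λw step: turn a world-relativized value into a (w, t) function."""
    return lambda w: relativized[w]

def necessarily(p):
    """Type ((w, t), t): true iff the (w, t) input maps every world to true."""
    return all(p(w) for w in W)

assert necessarily(lambda_w(tigers_are_mammals)) is True
assert necessarily(lambda_w(tigers_are_striped)) is False
```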
To test this, suppose we have four worlds w1 , w2 , w3 , and w4 , and that
Tigers are mammals is true in each of those worlds, but Tigers are
striped is true only in w1 , w2 , and w3 . That is:
• ~Tigers are mammalsw1 = ~Tigers are mammalsw2 = ~Tigers are mammalsw3 = ~Tigers are mammalsw4 = ~Tigers are stripedw1 = ~Tigers are stripedw2 = ~Tigers are stripedw3 = >.

• ~Tigers are stripedw4 = ⊥.
Then we have:
• ~λw Tigers are mammals = [w1 → >, w2 → >, w3 → >, w4 → >].

• ~λw Tigers are striped = [w1 → >, w2 → >, w3 → >, w4 → ⊥].

We can then provide these two functions as inputs to ~necessarily:


• ~Necessarily tigers are mammalsu
• = ~Necessarilyu (~λw tigers are mammalsu )
• = (λx(w, t) .∀w x(w) = >)([w1 → >, w2 → >, w3 → >, w4 → >])
• = ∀w [w1 → >, w2 → >, w3 → >, w4 → >](w) = >
• = >
So Necessarily tigers are mammals comes out true, as desired. (No-
tice that the final semantic value of Necessarily tigers are mammals is
> relative to every possible world, so that the sentence is world-insensitive
in its semantic value.) And then:
• ~Necessarily tigers are stripedu
• = ~Necessarilyu (~λw tigers are stripedu )
• = (λx(w, t) .∀w x(w) = >)([w1 → >, w2 → >, w3 → >, w4 → ⊥])
• = ∀w [w1 → >, w2 → >, w3 → >, w4 → ⊥](w) = >
• = ⊥

So Necessarily tigers are striped comes out false, also as desired.
(Notice that again the final semantic value of Necessarily tigers are
striped is ⊥ relative to every possible world, so that the sentence is world-
insensitive in its semantic value. Tigers are striped, on the other hand,
is world-sensitive in its semantic value, since it is true relative to some
worlds (w1 , w2 , and w3 ) and false relative to other worlds (w4 ).)

Problem 230: Notice that both Tigers are mammals and Necessarily
tigers are striped are world-insensitive. But they are world-
insensitive for different reasons. Necessarily tigers are striped
is world-insensitive de jure – it’s a consequence of the rules of
the system (in particular, the rule for ~necessarily) that the
sentence comes out world-insensitive. Tigers are mammals,
on the other hand, is world-insensitive de facto – it’s a conse-
quence of substantive, non-semantic features of tigers that the
sentence comes out world-insensitive.

Give two additional examples of sentences that are de jure


world-insensitive and two additional examples of sentences
that are de facto world-insensitive. See if you can find one
de jure world-insensitive sentence that does not use the word
necessarily.

Do any of your examples call into question how clear the dis-
tinction is between de jure and de facto world-insensitive sen-
tences?

2. Second Approach: We write the world-relativization into our basic type


theory. Thus t∗ abbreviates the type (w, t), and we use t∗ everywhere we
previously used t. Tigers are mammals and Tigers are striped thus
both get unrelativized values of type t∗ . In particular, we have:
(a) ~Tigers are mammals = λw.tigers are mammals in w, or ~Tigers are mammals = [w1 → >, w2 → >, w3 → >, w4 → >].
(b) ~Tigers are striped = λw.tigers are striped in w, or ~Tigers are striped = [w1 → >, w2 → >, w3 → >, w4 → ⊥].
~necessarily is then of type (t∗ , t∗ ). In particular, we have:

• ~necessarily = λxt∗ .λvw .∀u ∈ w x(u) = >

With these pieces, we can calculate ~Necessarily tigers are mammals
and ~Necessarily tigers are striped. First we have:
• ~Necessarily tigers are mammals
• = ~Necessarily(~Tigers are mammals)
• = (λxt∗ .λvw .∀u ∈ w x(u) = >)([w1 → >, w2 → >, w3 → >, w4 → >])
• = λvw .∀u ∈ w [w1 → >, w2 → >, w3 → >, w4 → >](u) = >
• = λvw .>
~Necessarily tigers are mammals is thus the member of t∗ that maps
every world to the true.
Second we have:
• ~Necessarily tigers are striped
• = ~Necessarily(~Tigers are striped)
• = (λxt∗ .λvw .∀u ∈ w x(u) = >)([w1 → >, w2 → >, w3 → >, w4 → ⊥])
• = λvw .∀u ∈ w [w1 → >, w2 → >, w3 → >, w4 → ⊥](u) = >
• = λvw .⊥
~Necessarily tigers are striped is thus the member of t∗ that maps
every world to the false. Notice that both necessity claims pick out one of
the constant members of t∗ , mapping each world to the same truth value.

Both of these semantic approaches agree that necessarily is, in effect, a uni-
versal quantifier over worlds, requiring that the matrix sentence be true with
respect to every possible world. The resulting complex sentence is no longer
world-sensitive – in the first framework, it has the same truth value relative
to every world, and in the second framework, its t∗ value is one of the two
constant functions that maps each world to the same truth value.
Summarizing, we have two possible semantic values for necessarily:
1. World-Relativized: ~Necessarilyu = λx(w, t) .∀w x(w) = >

2. Set-Wise: ~necessarily = λxt∗ .λvw .∀u ∈ w x(u) = >
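The set-wise value can be sketched directly: a t∗ value is modeled as a function from worlds to truth values, and necessarily maps one such function to another (a constant one). The world space and sentence values below are illustrative assumptions:

```python
W = ["w1", "w2", "w3", "w4"]   # hypothetical world space

# On the set-wise approach, a sentence's value is itself of type t* = (w, t):
tigers_are_mammals = lambda w: True          # true at every world
tigers_are_striped = lambda w: w != "w4"     # false only at w4

def necessarily(x):
    """Type (t*, t*): the result is a (w, t) function, constant across worlds."""
    return lambda v: all(x(u) for u in W)

# The output maps every world to the same truth value:
assert all(necessarily(tigers_are_mammals)(w) is True for w in W)
assert all(necessarily(tigers_are_striped)(w) is False for w in W)
```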

Problem 231: Combine one of the semantic accounts of necessarily


with reasonable semantic values for and and or in order to produce
truth conditions that let you check the validity of the following
arguments (using a truth-preservation conception of logical conse-
quence):
1. Necessarily (A and B)  Necessarily A and necessarily
B
2. Necessarily A and necessarily B  Necessarily (A and B)
3. Necessarily (A or B)  Necessarily A or necessarily B
4. Necessarily A or necessarily B  Necessarily (A or B)

BF Existential Modals

In addition to the universal modals such as necessarily and must, there are
various modals that have the effect of existential quantification over worlds.
Existential modals include possibly, may, might, and can. For simplicity, we’ll
focus on possibly. Let’s again work through two sentences:
1. Possibly tigers are striped.
2. Possibly tigers are reptiles.

As before, Tigers are striped is true in w1 , w2 , and w3 , and false in w4 .


Since Tigers are reptiles is false whenever Tigers are mammals is true,
and Tigers are mammals is true in all four worlds w1 through w4 , Tigers are
reptiles is false in w1 through w4 . We can then calculate the semantic value of
the modalized sentences in either of two ways:

1. First Approach: We use world-relativized semantic values and add a λ


term to convert world-relativized truth values into properties of worlds:

[t [((w, t), t) Possibly] [(w, t) λw [t tigers are striped]]]

As before, ~λw tigers are striped = [w1 → >, w2 → >, w3 → >, w4 → ⊥]. We also have ~λw tigers are reptiles = [w1 → ⊥, w2 → ⊥, w3 → ⊥, w4 → ⊥].

We also need a semantic value for possibly, which implements it as an


existential quantifier over worlds:
• ~possibly = λx(w, t) .∃w x(w) = >.

With these pieces, we can calculate semantic values for the modalized
sentences. First we have:

• ~Possibly tigers are stripedu
• = ~Possiblyu (~λw tigers are stripedu )
• = (λx(w, t) .∃w x(w) = >)([w1 → >, w2 → >, w3 → >, w4 → ⊥])
• = ∃w [w1 → >, w2 → >, w3 → >, w4 → ⊥](w) = >
• = >
And then we have:

• ~Possibly tigers are reptilesu
• = ~Possiblyu (~λw tigers are reptilesu )
• = (λx(w, t) .∃w x(w) = >)([w1 → ⊥, w2 → ⊥, w3 → ⊥, w4 → ⊥])
• = ∃w [w1 → ⊥, w2 → ⊥, w3 → ⊥, w4 → ⊥](w) = >
• = ⊥
2. Second Approach: We write world-relativization into our basic type
theory as before. t∗ is then the type of functions from worlds to truth

values, and ~tigers are striped = [w1 → >, w2 → >, w3 → >, w4 → ⊥], while ~tigers are reptiles = [w1 → ⊥, w2 → ⊥, w3 → ⊥, w4 → ⊥].
We then have:
• ~Possibly = λxt∗ .λww .∃u ∈ w x(u) = >
We can then calculate again both ~Possibly tigers are striped and
~Possibly tigers are reptiles. First we have:
• ~Possibly tigers are striped
• = ~Possibly(~Tigers are striped)
• = (λxt∗ .λww .∃u ∈ w x(u) = >)([w1 → >, w2 → >, w3 → >, w4 → ⊥])
• = λww .∃u ∈ w [w1 → >, w2 → >, w3 → >, w4 → ⊥](u) = >
• = λww .>
~Possibly tigers are striped is thus the member of t∗ that maps
every world to the true.
Second we have:
• ~Possibly tigers are reptiles
• = ~Possibly(~Tigers are reptiles)
• = (λxt∗ .λww .∃u ∈ w x(u) = >)([w1 → ⊥, w2 → ⊥, w3 → ⊥, w4 → ⊥])
• = λww .∃u ∈ w [w1 → ⊥, w2 → ⊥, w3 → ⊥, w4 → ⊥](u) = >
• = λww .⊥
~Possibly tigers are reptiles is thus the member of t∗ that maps
every world to the false.

Each of these two approaches in its own way makes possibly into an existential
quantifier over worlds. (Note that both semantic values for possibly feature an
existential quantifier ∃ somewhere.)
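The existential character of possibly comes out clearly in a set-wise sketch, parallel to the one for necessarily; the world space and sentence values are again illustrative assumptions:

```python
W = ["w1", "w2", "w3", "w4"]   # hypothetical world space

tigers_are_striped  = lambda w: w != "w4"   # true in w1, w2, w3; false in w4
tigers_are_reptiles = lambda w: False       # false in every world

def possibly(x):
    """Type (t*, t*): existential quantification over worlds."""
    return lambda v: any(x(u) for u in W)

# One striped-tiger world suffices; no reptile-tiger world exists.
assert possibly(tigers_are_striped)("w1") is True
assert possibly(tigers_are_reptiles)("w1") is False
```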

Problem 232: We’ve treated modals in a way quite similar to our


earlier treatment of quantified noun phrases:
1. Quantified noun phrases are either type ((e, t), t) or of type t∗ ,
understood as the set of functions from variable assignments
to truth values.
2. Modals are either type ((w, t), t) or type t∗ , understood as the
set of functions from worlds to truth values.
Modals are thus, in effect, quantifiers over worlds rather than ob-
jects. It's thus not entirely surprising that two significant forms of
quantifiers – universal and existential quantifiers – also show up as
forms of modals (necessarily and possibly).

Do any of the other quantified noun phrases from ((e, t), t) have
analogs among modal expressions? Are there modal terms that are,
in effect, most or few or many quantifiers over worlds?

Problem 233: Combine semantic values for necessarily and possibly


with reasonable semantic values for and, or, and not in order to
produce truth conditions that let you check the validity of the fol-
lowing arguments (using a truth-preservation conception of logical
consequence):
1. Possibly (A and B)  Possibly A and possibly B
2. Possibly A and possibly B  Possibly (A and B)
3. Possibly (A or B)  Possibly A or possibly B
4. Possibly A or possibly B  Possibly (A or B)
5. Not possibly A  Necessarily not A
6. Not necessarily A  Possibly not A

BG Modal Flavors

The simple account of necessarily in the previous section gives an initial


glimpse of what we can do by integrating possible worlds into our semantic
technology. But the account is too simple to deal with the full complexity of
natural language data. Let’s switch to a different modal term, must. Consider
the following two must claims:
1. Bachelors must be unmarried.

2. Everyone must pay their taxes by April 15.
The two claims seem to be attributing different kinds of necessity. The first claim
fits well with the model of the previous section. The distinctive thing about the
claim Bachelors are unmarried is that it is inevitably true. No matter how the
world is, bachelors are unmarried – there’s just no way for things to go that
results in unmarried bachelors.

But the second claim isn’t like that. Everyone pays their taxes by April
15 isn’t an inevitable truth. In fact, it’s unlikely even to be a truth – typi-
cally some people pay their taxes late. So in saying Everyone must pay their
taxes by April 15, we aren’t saying that timely tax-paying is something that
happens no matter how the world is. Rather, we are saying that timely tax-
paying is required by law.

So we have two different kinds of necessity in English. There is unqualified


necessity, or perhaps it should be called analytic necessity – the sort attributed
in Bachelors must be unmarried. And there is legal necessity – the sort at-
tributed in Everyone must pay their taxes by April 15. Once we notice
that there is more than one kind of necessity, we can easily produce examples
of additional kinds:
1. Epistemic Necessity: If those two angles of the triangle are 45◦
angles, then the third angle must be a right angle.

2. Circumstantial Necessity: Sorry, I must/have to sneeze.


3. Deontic Necessity: Everyone must/ought to help those in need.
4. Teleological Necessity: To get to the Isle of Man, you must take
a ferry.

5. Bouletic Necessity: I’ve had all I can take of your rudeness. You
must/need to leave immediately.
6. Capability Necessity: I can’t run a five minute mile/I must take
more than five minutes to run a mile.

7. Nomological Necessity: Objects must experience force to undergo


a change in direction.
8. Chess Necessity: Bishops must move diagonally, not horizontally
or vertically.

That’s not meant to be an exhaustive list – just some indication of the range of
meanings (what are sometimes called modal flavors) that is available for words
like must.

Problem 234: Find three more modal flavors. To identify a modal
flavor, give a sentence using some modal word like must, might,
necessarily, or possibly in which the meaning of the modal word
doesn’t fit any of the categories we’ve considered so far. If it’s
helpful, you can give some surrounding context for the sentence
to help isolate the right meaning. Comment briefly on how your
modal flavor differs from the flavors considered above, and on the
right way to describe the new modal flavor.

Problem 235: The list we’ve given of modal flavors might encourage
the thought that, while the word must can express many flavors of
modality, once we’ve picked a particular sentence containing the
word must, we’ve done enough to fix the modal flavor. To show
that this isn’t true, give a sentence containing must that can be read
with more than one modal flavor. Give a collection of surrounding
contexts that bring out different modal readings of the sentence. See
how many modal flavors you can get from a single sentence.

Problem 236: We’ve been giving lists of modal flavors as if the


various uses of modals break down nicely into some number of
discrete flavor types. But as the list of modal flavors expands, it can
be tempting to think that there aren’t a number of discrete modal
flavors, but instead just a continuous spread of gradually changing
modal uses. To support this alternative hypothesis, pick two modal
flavors from the list given above, and then construct three to five
examples that use must in ways that gradually transition from the
first modal flavor to the second.

BH Meanings for Modal Flavors

Tigers must be mammals is true just in case Tigers are mammals is true in
every world. But You must pay your taxes by April 15, understood as an
expression of legal necessity, can’t require for its truth that You pay your taxes
by April 15 is true in every world. It’s true that you must pay your taxes by
April 15. But there are many ways the world can be on which you don’t pay
your taxes by April 15. (Perhaps the actual world is even a world in which you
don’t pay your taxes by April 15.) So the truth of this legal necessity claim can’t
require the truth of You pay your taxes by April 15 in every world.

However, we can retain the core idea that modals like must and necessarily
are universal quantifiers over possible worlds. The truth of a legal necessity
claim of the form Necessarily S doesn’t require the truth of S in every pos-
sible world, but it does require the truth of S in every legally possible world.

A world is legally possible if all of the (actual) laws are obeyed in that world.
Legally possible worlds can be quite different from the actual world – a world
inhabited entirely by talking purple kangaroos, but in which the kangaroos all
drive under the speed limit and pay their taxes on time, is legally possible. One
of the laws requires that taxes be paid by April 15. That law, then, must be
obeyed in a world for that world to count as legally possible. Thus taxes are
paid by April 15 in every legally possible world, which means that it is legally
necessary that taxes be paid by April 15.

w is the type of possible worlds. We can then let wL be the collection of legally
possible worlds. The legally possible worlds are a subset of all of the possible
worlds, so wL ⊆w. We can then give semantic values for both universal and
existential legal modals:
1. Legally necessarily:
(a) Using relativized semantic values:
~legally necessarilyu = λx(w, t) .∀w ∈ wL x(w) = >
(b) Using setwise semantic values:
~legally necessarily = λxt∗ .λvw .∀u ∈ wL x(u) = >
2. Legally possibly:

(a) Using relativized semantic values:


~legally possiblyu = λx(w, t) .∃w ∈ wL x(w) = >
(b) Using setwise semantic values:
~legally possibly = λxt∗ .λvw .∃u ∈ wL x(u) = >
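The only change from the unrestricted modals is the domain of quantification: a distinguished subset of worlds replaces the full space. A sketch, with hypothetical world sets chosen for illustration:

```python
W = {"w1", "w2", "w3", "w4"}   # all possible worlds (hypothetical space)
W_L = {"w1", "w2"}             # the legally possible worlds, W_L ⊆ W

def legally_necessarily(x):
    """Universal quantification restricted to the legally possible worlds."""
    return lambda v: all(x(u) for u in W_L)

def legally_possibly(x):
    """Existential quantification restricted to the legally possible worlds."""
    return lambda v: any(x(u) for u in W_L)

# True in every legal world (and in one lawbreaking world besides):
taxes_paid = lambda w: w in {"w1", "w2", "w3"}

assert legally_necessarily(taxes_paid)("w1") is True
assert legally_possibly(lambda w: False)("w1") is False
```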

The obvious extension of this approach is to have a number of distinguished


sets of possible worlds, with distinct modals for each distinguished set:
1. wep is the set of epistemically possible worlds (the set of worlds in which
everything that is actually known is true). mustep is the necessity modal
quantifying universally over wep , and mayep is the possibility modal quan-
tifying existentially over wep .
2. wcirc is the set of circumstantially possible worlds (the set of worlds in
which all of some relevant set of current circumstances are true). mustcirc
is the necessity modal quantifying universally over wcirc , and maycirc is the
possibility modal quantifying existentially over wcirc .

3. wd is the set of deontically possible worlds (the set of worlds in which


all of the things that should be true are true). mustd is the necessity
modal quantifying universally over wd , and mayd is the possibility modal
quantifying existentially over wd .

4. wt is the set of teleologically possible worlds (the set of worlds in which
some relevant goal is achieved). mustt is the necessity modal quantify-
ing universally over wt , and mayt is the possibility modal quantifying
existentially over wt .
5. wb is the set of bouletically possible worlds (the set of worlds in which all
the desires of some relevant person are satisfied). mustb is the necessity
modal quantifying universally over wb , and mayb is the possibility modal
quantifying existentially over wb .
6. wcap is the set of capacity possible worlds (the set of worlds in which every
action performed by some relevant person lies within the capacities of that
person). mustcap is the necessity modal quantifying universally over wcap ,
and maycap is the possibility modal quantifying existentially over wcap .
7. wn is the set of nomologically possible worlds (the set of worlds in which
all of the actual laws of physics (or other sciences) are true). mustn is
the necessity modal quantifying universally over wn , and mayn is the
possibility modal quantifying existentially over wn .
8. wchess is the set of chess possible worlds (the set of worlds in which all of the
rules of chess are followed). mustchess is the necessity modal quantifying
universally over wchess , and maychess is the possibility modal quantifying
existentially over wchess .

On this approach, modal words like must and might are multiply ambiguous.
That is, there are really many words in English that look like must – the words
we’ve identified above as mustep , mustcirc , mustb , and so on. (We also need a
word must for the unrestricted modal that requires truth in all possible worlds
(what we called analytic necessity).)

Proliferating words must in English doesn’t make for a very elegant theory.
Another possibility is to treat must as a context-sensitive expression. Previously
we have modeled contexts as ordered quadruples of the form:
• hspeaker, audience, time, placei

Now we will expand our contexts so that they also include a set of worlds:
• hspeaker, audience, time, place,worldsi
We can then give the following context-sensitive semantic values for must and
may:

1. ~mustc = λx.∀w ∈ c(5) x(w) = >


2. ~mayc = λx.∃w ∈ c(5) x(w) = >
The modal flavor is then controlled by the context. When must is used in
a context c such that c(5) is the set of worlds consistent with the speaker’s

knowledge, must expresses an epistemic flavor of modality. When must is used
in a context c such that c(5) is the set of worlds in which all the laws of physics
are true, must expresses a nomological flavor of modality. Because the context
controls the modal flavor, we only need a single word must in the language,
rather than separate modals for each modal flavor.
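The context-sensitive entries lend themselves to a quick computational sketch. In the Python below, a proposition is a dictionary from worlds to truth values (playing the role of a (w, t) function), and a context is a tuple whose fifth coordinate is its set of worlds; the three worlds and the sample proposition are invented for illustration:

```python
# Propositions as dicts from worlds to truth values (type (w, t)).
# A context is a 5-tuple whose last coordinate, c[4] here (the text's c(5)),
# is the set of worlds the modal quantifies over.

worlds = {"w1", "w2", "w3"}
raining = {"w1": True, "w2": True, "w3": False}  # a sample (w, t) value

def must(c, x):
    """[[must]]^c applied to x: x is true at every world in c's world set."""
    return all(x[w] for w in c[4])

def may(c, x):
    """[[may]]^c applied to x: x is true at some world in c's world set."""
    return any(x[w] for w in c[4])

# An 'epistemic' context whose world set excludes w3, and a wider context:
c_ep = ("speaker", "audience", "time", "place", {"w1", "w2"})
c_all = ("speaker", "audience", "time", "place", worlds)

print(must(c_ep, raining))   # True: raining holds throughout c_ep's worlds
print(must(c_all, raining))  # False: raining fails at w3
```

A single must entry serves every flavor: swapping in a context with a different world set changes which worlds get quantified over.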

Problem 237: We’ve given two competing accounts of modal flavors:
1. Modal flavors via multiple lexical items
2. Modal flavors via context sensitivity
These two accounts produce different predictions in some cases.
Consider a single sentence that contains more than one modal, such
as:
• Aristotle must be a philosopher and Chomsky must be a
linguist.
Under a context-sensitive approach, this sentence gets evaluated
with respect to some context c which provides some set c(5) of
worlds. Thus both occurrences of must in the sentence quantify
universally over c(5), and both express the same modal flavor. Un-
der a multiple lexical item approach, the two occurrences of must
in the sentence can be different lexical items. (For example, the first
could be mustep and the second could be mustcirc ). Thus the two
modals in the sentence can express different modal flavors.

Use this difference in prediction to compare the multiple lexical item
account to the context-sensitive account. Can there be a sentence
with multiple occurrences of must having different modal flavors?
If there can, is there any way we can adjust the context sensitivity
account to allow for this? If there cannot, is there any way we can
adjust the multiple lexical items account to explain the unavailabil-
ity of diverse modal flavors in a single sentence?

Problem 238: Consider two ways of specifying the set of worlds
wchess for the chess necessity modal. Let S be a long sentence that
fully states all of the rules of chess. Thus S is something like:
• Bishops only move diagonally, and rooks only move horizontally
or vertically, and pawns move forward exactly one square
unless capturing or making their first move, and a player’s
king is never in check after their move, and ...
We can then say either:
1. wchess is the set of worlds in which S is true. That is, wchess =
{w :~Sw = >}.

2. wchess is the set of worlds in which S states the rules of chess.
That is, wchess = {w :~The rules of chess are that Sw =
>}
Consider how these two methods of specifying wchess differ. What
kind of worlds will be in one version of wchess but not the other?
What effect does that difference in worlds have on the truth value of
claims of chess necessity? Which method of specifying wchess gives
a better account of the truth values of chess necessity sentences?

BI Modal Modals

Here is a modal truth about chess:


• Pawns may move two squares on their first move.

But chess wasn’t always played with rules allowing pawns to move two squares
– this was a change in the rules in the late medieval period. And since chess
wasn’t always played with a two-square pawn movement rule, there are pos-
sible worlds in which chess is not played with that rule. Let @ be the actual
world, the world we actually live in. And let w be a world in which the rules of
chess are different from the actual rules, and permit only single-square moves
by pawns at any point. Then we should have:
1. ~Pawns may move two squares on their first move@ = >
2. ~Pawns may move two squares on their first movew = ⊥
Unfortunately, we can’t get this result using the machinery we’ve developed
so far. We’ll use a simplified tree for Pawns may move two squares on their
first move:
[t [((w,t),t) May] [(w,t) λw [t Pawns move two squares on their first move]]]


Now suppose we have three worlds:

1. The actual world @, in which some pawns are moved one square on their
first move and some pawns are moved two squares on their first move.
2. World w, which doesn’t permit two-square moves by pawns, and in which
all pawns are moved one square on their first move.

3. World u, in which some pawns are moved two squares on their first move
and some pawns are moved three squares on their first move.
One thing we can extract from these three worlds is that ||Pawns move two
squares on their first move||={@, u}. (We aren’t worrying here about the
internal details of calculating the truth conditions of Pawns move two squares
on their first move, but we are assuming that those truth conditions are
existential, so that the claim is true relative to a world as long as at least some
pawns in that world move two squares. If we don’t like those existential truth
conditions, we could add a fourth world v in which pawns always move two
squares on their first move.)

We can also use these worlds to say exactly what wchess is. In world u, the (ac-
tual) rules of chess aren’t obeyed, since the (actual) rules of chess don’t allow
pawns ever to move three squares. Thus u ∉ wchess . But in the actual world @, the
rules of chess are obeyed (we idealize slightly here), so @∈wchess . And in world
w, the (actual) rules of chess are also obeyed – the rules of chess say that both
single-square and double-square initial pawn moves are legal, so there’s no vio-
lation of the rules if, in fact, pawns always move just one square. Thus w ∈wchess .

That gives us wchess = {@, w}. We can now use that specification of wchess to
calculate both ~Pawns may move two squares on their first move@ and
~Pawns may move two squares on their first movew .

1. ~Pawns may move two squares on their first move@
• = ~may@ (~λw pawns move two squares on their first move@ )
• ~λw pawns move two squares on their first move@ = [@ → >, w → ⊥, u → >]
• So ~may@ (~λw pawns move two squares on their first move@ ) = ~may@ ([@ → >, w → ⊥, u → >])
• = (λx(w, t) .∃v ∈ wchess x(v) = >)([@ → >, w → ⊥, u → >])
• = (λx(w, t) .∃v ∈ {@, w} x(v) = >)([@ → >, w → ⊥, u → >])
• = ∃v ∈ {@, w} [@ → >, w → ⊥, u → >](v) = >
• = >, since [@ → >, w → ⊥, u → >](@) = >

2. ~Pawns may move two squares on their first movew
• = ~mayw (~λw pawns move two squares on their first movew )
• ~λw pawns move two squares on their first movew = [@ → >, w → ⊥, u → >]
• So ~mayw (~λw pawns move two squares on their first movew ) = ~mayw ([@ → >, w → ⊥, u → >])
• = (λx(w, t) .∃v ∈ wchess x(v) = >)([@ → >, w → ⊥, u → >])
• = (λx(w, t) .∃v ∈ {@, w} x(v) = >)([@ → >, w → ⊥, u → >])
• = ∃v ∈ {@, w} [@ → >, w → ⊥, u → >](v) = >
• = >, since [@ → >, w → ⊥, u → >](@) = >
We thus get Pawns may move two squares on their first move coming out
true in both @ and w. But that’s the wrong result – the sentence should be true
in @ and false in w.

It’s not hard to see what has gone wrong. When we calculate both ~Pawns
may move two squares on their first move@ and ~Pawns may move two
squares on their first movew , the world relativization quickly disappears.
That’s because:
1. ~May@ = ~Mayw

2. ~λw pawns move two squares on their first move@ = ~λw pawns move
two squares on their first movew
So after we pass the first step of calculating by noting that:

• ~Pawns may move two squares on their first move@/w = ~May@/w (~λw
pawns move two squares on their first move@/w )
the relativization of semantic value to worlds effectively drops out, and we’re
guaranteed to get the same final truth value relative to @ or relative to w.

We should have ~λw pawns move two squares on their first move@ = ~λw
pawns move two squares on their first movew . The whole point of the
λw expression is to convert a sentence, with its distribution of truth values
relative to worlds, into a (w, t) expression mapping each world to the truth
value of the sentence at that world. ~λw pawns move two squares on their
first move thus depends on the total behavior of pawns move two squares
on their first move at all worlds – it doesn’t matter what world we are rel-
ativizing to, if we’re reporting a global feature of the sentence’s truth values at
all worlds.

So if we’re going to get different truth values for Pawns may move two squares
on their first move relative to @ and w, we need to have ~May@ ≠ ~Mayw –
we need may to have a genuinely world-sensitive semantic value. The question
then is where to incorporate world-sensitivity. Right now we have:
• ~May = λx(w, t) .∃w ∈ wchess x(w) = >

There aren’t many options in that semantic value for adding some world-
sensitivity.

But there is one good option. The set wchess of chess-possible worlds should
vary depending on the world of evaluation. Recall:
1. In @, the rules of chess allow pawns to move one or two squares on their
first move.
2. In w, the rules of chess require pawns to move exactly one square on their
first move.
@ and w, but not u, obey all the rules of chess according to @. That’s why we
set wchess = {@, w}. But @ doesn’t obey the rules of chess according to w. In w,
the rules prohibit pawns from moving two squares, so @ isn’t a chess-possible
world according to w. From the point of view of w, wchess should be {w}, not
{@, w}. (World u isn’t chess-possible according to either @ or w.)

World-sensitivity, then, enters into the selection of wchess , the relevant set of
worlds for the modal may to quantify over. To build this into our machinery,
we replace the set of worlds wchess with a function gchess that maps worlds into
sets of worlds. Given any input world v, gchess (v) is the set of worlds that are
chess-possible according to v – that is, the set of worlds in which all of the rules
that are rules of chess as it is played in v are obeyed. Thus we have:
1. gchess (@) = {@, w}

2. gchess (w) = {w}
We then change the semantic value of may to make use of this function:
• ~Mayv = λx(w, t) .∃w ∈ gchess (v) x(w) = >

Let’s re-calculate the truth value of Pawns may move two squares on their
first move relative to both @ and w using this new semantic value for may:

1. ~Pawns may move two squares on their first move@
• = ~may@ (~λw pawns move two squares on their first move@ )
• = ~may@ ([@ → >, w → ⊥, u → >])
• = (λx(w, t) .∃v ∈ gchess (@) x(v) = >)([@ → >, w → ⊥, u → >])
• = (λx(w, t) .∃v ∈ {@, w} x(v) = >)([@ → >, w → ⊥, u → >])
• = ∃v ∈ {@, w} [@ → >, w → ⊥, u → >](v) = >
• = >, since [@ → >, w → ⊥, u → >](@) = >
 

2. ~Pawns may move two squares on their first movew
• = ~mayw (~λw pawns move two squares on their first movew )
• = ~mayw ([@ → >, w → ⊥, u → >])
• = (λx(w, t) .∃v ∈ gchess (w) x(v) = >)([@ → >, w → ⊥, u → >])
• = (λx(w, t) .∃v ∈ {w} x(v) = >)([@ → >, w → ⊥, u → >])
• = ∃v ∈ {w} [@ → >, w → ⊥, u → >](v) = >
• = ⊥, since the only candidate is v = w, and [@ → >, w → ⊥, u → >](w) = ⊥

And now we have the claim true relative to @ and false relative to w, as desired.
Generalizing, we can associate with each modal flavor F a function gF that maps
each world to a set of worlds – the set of worlds that characterize the possibilities
of that flavor according to the input world. Thus we have functions gep , gb , gcirc ,
and so on. We then have a collection of modals in the language, with both
universal and existential modals of each flavor, using the rules:
1. ~mustF v =λx(w, t) .∀w ∈ gF (v) x(w) = >

2. ~mayF v =λx(w, t) .∃w ∈ gF (v) x(w) = >
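These world-relative entries can be sanity-checked by encoding the chess example directly. The Python below hard-codes the proposition and the gchess function from the discussion above; the encoding itself (dictionaries for propositions and for gchess) is my own illustration:

```python
# Worlds: "@" (actual), "w" (one-square rule), "u" (three-square moves occur).
# The (w, t) value of λw pawns move two squares on their first move:
two_square = {"@": True, "w": False, "u": True}

# g_chess maps each evaluation world to the set of worlds that are
# chess-possible according to it (u is chess-possible according to neither).
g_chess = {"@": {"@", "w"}, "w": {"w"}}

def must(g, v, x):
    """[[must_F]]^v (x): x is true at every world in g(v)."""
    return all(x[u] for u in g[v])

def may(g, v, x):
    """[[may_F]]^v (x): x is true at some world in g(v)."""
    return any(x[u] for u in g[v])

print(may(g_chess, "@", two_square))  # True: @ is in g_chess("@"), and x(@) = T
print(may(g_chess, "w", two_square))  # False: g_chess("w") = {w}, and x(w) = F
```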

BJ Accessibility Relations and Iterated Modals

Pick some modal flavor F, and let must express that modal flavor. Suppose Must
S is true, for some unspecified sentence S. What can we say about Must must S?

Before we made the shift, in the previous section, from modeling modal flavors
using a set wF of worlds to modeling modal flavors using a function gF that
assigns each world its own set of worlds for the flavor, the answer would have
been simple. Pre-shift, we can reason as follows. Suppose Must S is true. As
we observed earlier, when we use wF , Must S isn’t world-sensitive, so ~Must
Sw = ~Must Su for any worlds w and u. Since Must S is true at the actual
world, it thus must be true at every world.

But if Must S is true at every world, Must must S will also be true. We can
check the details. We have:

[t [((w,t),t) Must] [(w,t) λw [t [((w,t),t) must] [(w,t) λw [t S]]]]]
Thus:
• ~Must λw must λw Sw = ~Mustw (~λw must λw Sw )
• Because Must S is true relative to every world, ~λw must λw Sw is the function
that maps every world to >. For simplicity, assume we have just three worlds
w1 , w2 , and w3 . Then ~λw must λw Sw = [w1 → >, w2 → >, w3 → >].
• So ~Must λw must λw Sw = ~Mustw ([w1 → >, w2 → >, w3 → >])
• = (λx(w, t) .∀w ∈ wF x(w) = >)([w1 → >, w2 → >, w3 → >])
• = ∀w ∈ wF [w1 → >, w2 → >, w3 → >](w) = >
• = >
Thus the truth of Must S guarantees the truth of Must must S.

Problem 239: Show that on our simpler semantics (not using a g
function), the truth of Must must S guarantees the truth of Must S.
Conclude that Must S and Must must S are logically equivalent on
this treatment.

However, when we shift from using a set wF of flavored worlds to using a
flavored function gF mapping each world to a set of flavored worlds, it’s easy
to get cases in which Must S is true but Must must S is false.

Suppose we have three worlds, w1 , w2 , and w3 . S is true at w1 and w2 but false at
w3 , so ||S|| = {w1 , w2 } and ~λw Su = [w1 → >, w2 → >, w3 → ⊥] (for any world u).
Suppose that:

1. gF (w1 ) = {w1 , w2 }
2. gF (w2 ) = {w2 , w3 }
(It won’t matter to us what gF (w3 ) is.)

Then we make the following observations:


1. Must S is true at w1 . The truth of Must S at w1 requires the truth of S at
every world in gF (w1 ). But gF (w1 ) = {w1 , w2 }, and S is true in both w1 and
w2 .

2. Must S is false at w2 . The truth of Must S at w2 requires the truth of S at
every world in gF (w2 ). But gF (w2 ) = {w2 , w3 }, and S is false in w3 .
3. Must must S is false at w1 . The truth of Must must S at w1 requires the
truth of Must S at every world in gF (w1 ). Since gF (w1 ) = {w1 , w2 }, and
Must S is (as we’ve just seen) true at w1 but false at w2 , we don’t have
Must S true at every world in gF (w1 ). Thus Must must S is false at w1 .
The switch from flavored worlds to flavored functions thus gives us a seman-
tics in which Must S does not imply Must must S. That means that the two
sentences Must S and Must must S don’t mean the same thing, so in our mod-
ified semantics, iterated modals aren’t redundant in meaning – adding another
modal can change the meaning of a sentence.
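The counterexample above can also be verified mechanically. The following Python sketch encodes the gF and S of the example (gF (w3 ) is set to the empty set, since, as noted, it doesn't matter):

```python
# g_F from the example; g_F(w3) won't matter, so make it empty.
g = {"w1": {"w1", "w2"}, "w2": {"w2", "w3"}, "w3": set()}
S = {"w1": True, "w2": True, "w3": False}

def must(v, x):
    """[[must_F]]^v (x): x is true at every world in g(v)."""
    return all(x[u] for u in g[v])

must_S = {v: must(v, S) for v in g}            # Must S at each world
must_must_S = {v: must(v, must_S) for v in g}  # Must must S at each world

print(must_S["w1"])       # True: S holds at both w1 and w2
print(must_S["w2"])       # False: S fails at w3, and w3 is in g(w2)
print(must_must_S["w1"])  # False: Must S fails at w2, and w2 is in g(w1)
```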

Problem 240: Give an example of a function gF which shows that
Must must S can be true without Must S being true.

To get a clearer picture of the semantic effect of using flavored functions, and
how those functions impact the interpretation of iterated modals, let’s consider
another method of implementing the functions idea. Our starting picture as-
sociated with each modal flavor a distinguished set of worlds – the worlds
relevant to that modal flavor. So if type w contains some nine worlds:

w1 w2 w3

w4 w5 w6

w7 w8 w9

we can represent wF diagrammatically via a designated subset of w:

[Figure: the nine worlds w1 –w9 again, with the worlds belonging to wF enclosed in a bubble.]

When we switch to a flavor function gF , the attempt to represent things
diagrammatically gets rather more complicated. We could try marking the
distinguished subsets and then adding arrows from each world to the
distinguished subset that’s associated with it:

[Figure: the nine worlds with bubbles around distinguished subsets and a colored arrow from each world to the bubble gF assigns it; a green arrow, for instance, runs from w3 to the bubble around w4 , w5 , w7 , and w8 .]

All of the necessary information is in this diagram. For example, gF (w3 ) =
{w4 , w5 , w7 , w8 } (following the green arrow from w3 to the green bubble around
w4 , w5 , w7 , and w8 ). But the diagram isn’t terribly readable (and would be even
worse if we hadn’t used collections of worlds near each other in the diagram
as the outputs of gF ).

Problem 241: Using the previous diagram, specify exactly what set
of worlds gF (w) is for each world w in {w1 , . . . , w9 }.

Fortunately, there is a (somewhat) more convenient way to express the same
information. An accessibility relation is a binary relation between worlds. In
a diagram, we can represent an accessibility relation as a collection of arrows
from one world to another. For example:

[Figure: the nine worlds w1 –w9 with accessibility arrows between them; among them, w1 has arrows to w2 and w4 , and w5 has arrows to w1 , w6 , and w7 .]

In this diagram, there is an accessibility arrow from world w1 to world w2 . We’ll
thus say that w1 has access to w2 .

Notice that for any world w, we can find the set of all worlds that w has access
to. As a result, an accessibility relation associates each world with a set of
worlds. That means we can reproduce the effects of a modal flavor function
gF by using an accessibility relation. We get a simple translation procedure in
both directions:
1. Suppose we are given an accessibility relation RF . We can then define a
modal flavor function gF using RF :

• For any world w, gF (w) = {u : w bears RF to u}. (That is, gF (w) is the
set of worlds to which w has access.) For example, in our previous
diagram, gF (w1 ) = {w2 , w4 }, and gF (w5 ) = {w1 , w6 , w7 }.

2. Suppose we are given a modal flavor function gF . We can then define an
accessibility relation RF using gF :
• For any worlds w and u, w bears RF to u just in case u ∈gF (w). For
example, in our earlier diagram of gF , we have w1 with access to w5
and w5 with access to w8 , but w8 does not have access to w1 .

As a result, we can easily move back and forth between characterizing modals
with flavor functions and characterizing modals with accessibility relations.
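The two translation procedures are short enough to write out. In the Python sketch below, a relation is a set of ordered pairs and a flavor function is a dictionary; the sample relation is a hypothetical one of my own:

```python
def relation_to_function(R, worlds):
    """Turn an accessibility relation (a set of ordered pairs) into a
    modal flavor function g: g(w) is the set of worlds w has access to."""
    return {w: {u for (v, u) in R if v == w} for w in worlds}

def function_to_relation(g):
    """Turn a modal flavor function back into a set of ordered pairs."""
    return {(w, u) for w in g for u in g[w]}

worlds = {"w1", "w2", "w3"}
R = {("w1", "w2"), ("w1", "w3"), ("w2", "w2")}  # hypothetical sample

g = relation_to_function(R, worlds)
print(sorted(g["w1"]))               # ['w2', 'w3']
print(g["w3"])                       # set(): w3 has access to no world
print(function_to_relation(g) == R)  # True: the translations are inverses
```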

Problem 242: Using the accessibility relation in the previous diagram,
determine gF (w4 ), gF (w7 ), and gF (w9 ).

Problem 243: Give an accessibility relation that corresponds to the
full modal flavor function given in the earlier diagram.

Problem 244: For each of the following accessibility relations, give
the corresponding modal flavor function.

1. [Figure: nine worlds w1 –w9 with accessibility arrows (arrows not reproducible here).]

2. [Figure: four worlds w1 –w4 with accessibility arrows (arrows not reproducible here).]

3. [Figure: four worlds w1 –w4 with accessibility arrows (arrows not reproducible here).]

4. [Figure: nine worlds w1 –w9 with accessibility arrows (arrows not reproducible here).]

Problem 245: For each of the following modal flavor functions, give
a corresponding accessibility relation.
1. • gF (w1 ) = {w1 , w3 }
• gF (w2 ) = {w2 , w4 }
• gF (w3 ) = {w1 , w3 }
• gF (w4 ) = {w2 , w4 }
2. • gF (w1 ) = {w2 }
• gF (w2 ) = {w1 , w2 , w3 , w4 , w5 }
• gF (w3 ) = {w1 , w3 , w5 }
• gF (w4 ) = ∅
• gF (w5 ) = {w2 , w4 }
3. • gF (w1 ) = {w2 , w3 , w4 , w5 , w6 }
• gF (w2 ) = {w1 , w3 , w4 , w5 , w6 }
• gF (w3 ) = {w1 , w2 , w4 , w5 , w6 }
• gF (w4 ) = {w1 , w2 , w3 , w5 , w6 }
• gF (w5 ) = {w1 , w2 , w3 , w4 , w6 }
• gF (w6 ) = {w1 , w2 , w3 , w4 , w5 }
4. • gF (w1 ) = ∅
• gF (w2 ) = {w2 , w9 }
• gF (w3 ) = ∅
• gF (w4 ) = {w1 , w8 , w9 }
• gF (w5 ) = {w5 , w7 , w9 }
• gF (w6 ) = ∅
• gF (w7 ) = {w2 , w4 , w6 , w8 , w9 }
• gF (w8 ) = {w1 , w3 , w5 , w9 }
• gF (w9 ) = {w9 }

Problem 246: Suppose w contains ten worlds. How many different
modal flavor functions are there on w? (How many different inputs
are there to a modal flavor function gF ? How many possible outputs
are there for each input? How can these two numbers be used to
determine the total number of modal flavor functions?)

Problem 247: Suppose again that w contains ten worlds. How
many different accessibility relations are there on w? (An accessi-
bility relation is a binary relation on w, which means that it is a set of
ordered pairs of members of w. How many ordered pairs of mem-
bers of w are there? How can this number be used to determine the
total number of accessibility relations?)

We can now give revised semantic values for (flavored) modals using an acces-
sibility relation rather than a modal flavor function:
• ~mustF v =λx(w, t) .∀w(vRF w → x(w) = >)

• ~mayF v =λx(w, t) .∃w(vRF w ∧ x(w) = >)

Problem 248: Suppose we have a collection of eight possible worlds
produced by assigning all possible combinations of truth values to
three sentences A, B, and C. We’ll represent each world using a string
of three letters, with an upper-case letter indicating that the corre-
sponding sentence is true and a lower-case letter indicating that the
corresponding sentence is false. Thus the world AbC is the world in
which A and C are true and B is false. The eight worlds are then:

ABC ABc AbC Abc

aBC aBc abC abc

We now add an accessibility relation to this set of worlds:

[Figure: the eight worlds above, connected by accessibility arrows (arrows not reproducible here).]

Using this accessibility relation, determine the truth values of the
following sentences at the indicated worlds.
1. ~Must AABC
2. ~Might BaBc
3. ~Must (A and C)AbC
4. ~Might (A or B)abc
5. ~Must must BAbc

6. ~Might might Cabc
7. ~Must (might A or might B)ABC
8. ~Must not AaBC
9. ~Not must AaBC
10. ~Must not might not must not AABC

BK The Logic of Iterated Modals

Consider the following arguments:


• Bachelors must be unmarried. Therefore, bachelors are unmarried.
• Physical objects must travel under the speed of light. Therefore, physical
objects travel under the speed of light.
• It must be raining. (Everyone is bringing in umbrellas.) Therefore, it is
raining.
• You must pay your taxes by April 15. Therefore, you pay your taxes by
April 15.
Some of these arguments look valid and some do not. (Not everyone will agree
on which arguments look which way.)

The semantic value we’ve developed for must:


• ~mustF v =λx(w, t) .∀w(vRF w → x(w) = >)

doesn’t settle the question of whether arguments of the general form:


• Must A; therefore A
are valid. It turns out, on examination, that validity depends on the accessibility
relation we use for the modals. Consider two cases.

1. Suppose we have two worlds A and a, with the following accessibility
relation:

[Figure: worlds A and a, with a single arrow from a to A.]

The world in which A is false has access to the world in which A is true,
but not vice versa, and neither world has access to itself. Then we can
observe:

• ~Aa = ⊥, because world a is, by definition, a world making A false.
• ~Must Aa = >, because world a has access only to world A, and
~AA = >.

But if Must A is true and A is false, Must A does not imply A. So the
argument:
• Must A; therefore A
is invalid.

2. Suppose we have two worlds A and a with the following accessibility
relation:

[Figure: worlds A and a, with an arrow from a to A and a loop from each world to itself.]

The world in which A is false has access to the world in which A is true,
and not vice versa. In addition, each world has access to itself. Then we
observe:

• ~AA = > and ~Aa = ⊥, by the definitions of worlds A and a.


• ~Must AA = >, because world A has access only to itself, and A is true
at A.
• ~Must Aa = ⊥, because world a has access to both A and a, and A is
false at a.
Since A is the only world that makes Must A true, and it also makes A
true, Must A implies A, and the argument:
• Must A; therefore A

is valid.

One immediate lesson is that the argument:

• Must A; therefore A
is valid for some accessibility relations and invalid for other accessibility rela-
tions. But we can do better than this – we can say what it is about an accessibility
relation that makes the argument valid.

An accessibility relation is reflexive if every world bears the relation to itself:


• RF is reflexive: for all w ∈w, wRF w.

Suppose that RF is reflexive, and that Must A is true at some world v. Then A
is true at every world accessible from v. But since RF is reflexive, v is acces-
sible from v. Thus A is true at v. As a result, every world that makes Must A
true also makes A true. This shows that (when RF is reflexive), Must A implies A.

Now suppose that RF is not reflexive. Then there is some world v that does
not have access to itself. Now let A be false at v and true at every world other
than v. Must A is then true at v, because A is true at every world accessible
from v. (The worlds accessible from v do not include v, and A is true at all other
worlds.) Thus v is a world making Must A true and A false. Because there is
such a world, Must A does not imply A.

There is thus a perfect correspondence between:


1. A modal mustF having a semantic value using a reflexive accessibility
relation.
2. That same modal making valid the inference from MustF A to A.
For modal flavors that allow the inference from Must A to A (metaphysical and
nomological modals, for example), we can then use a reflexive accessibility
relation in giving their semantic values. For modal flavors that do not allow
the inference from Must A to A (deontic modals, for example), we can use a
nonreflexive accessibility relation in giving their semantic values.
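The reflexivity correspondence can be spot-checked by brute force on a small model: search every valuation of A for a world making Must A true and A false. The toy relations in this Python sketch are my own choices:

```python
from itertools import product

worlds = ["w1", "w2", "w3"]

def must(R, v, A):
    """[[must_F]]^v: A is true at every world u with v R_F u."""
    return all(A[u] for (x, u) in R if x == v)

def countermodel_exists(R):
    """Is there a valuation A and a world v making Must A true but A false?"""
    for values in product([True, False], repeat=len(worlds)):
        A = dict(zip(worlds, values))
        for v in worlds:
            if must(R, v, A) and not A[v]:
                return True
    return False

reflexive = {(w, w) for w in worlds} | {("w1", "w2")}
non_reflexive = {("w1", "w2"), ("w2", "w3")}  # w3 has no access to itself

print(countermodel_exists(reflexive))      # False: Must A implies A
print(countermodel_exists(non_reflexive))  # True: e.g. A false only at w3
```

The same search strategy works for the other correspondences below (seriality, transitivity, symmetry), swapping in the relevant inference pattern.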

Problem 249: Show that the inference:
• A; therefore Might A
is valid if the accessibility relation is reflexive and invalid if the
accessibility relation is not reflexive.

There are many interesting connections between structural features of
accessibility relations and inferential features of modals with those
accessibility relations. Some connections of particular interest for natural
language semantics:
1. Seriality: An accessibility relation RF is serial if every world has access
to at least one world:
• RF is serial: for all w ∈w, there is v ∈w such that wRF v.
The following accessibility relation is serial:

[Figure: four worlds with accessibility arrows; the pattern is described below.]

Each world has access to another – three of the worlds form a cycle of
accessibility, and the fourth world has access to itself.

But this accessibility relation is not serial:

[Figure: four worlds with accessibility arrows; the lower right world has no outgoing arrow.]

The lower right world does not have access to any world.

A modal mustF makes valid the inference from MustF A to MightF A if
and only if its accessibility relation RF is serial.

Suppose RF is serial, and suppose w is a world at which MustF A is true.
Then there is some world u such that wRF u. Because MustF A is true at w,
A is true at every world accessible from w, so A is true at u. But because
w has access to a world at which A is true, MightF A is true at w. Hence
every world that makes MustF A true also makes MightF A true. This
shows that MustF A implies MightF A.

Now suppose RF is not serial. Then there is some world w that does not
have access to any world (including itself). MustF A is trivially true at w
– the truth of MustF A at a world requires the truth of A at all accessible
worlds, and if there are no accessible worlds, then trivially they all make
A true. But MightF A is false at w. The truth of MightF A at a world
requires the truth of A at some accessible world, so if a world has access to
no worlds, MightF A cannot be true at that world. So if RF is not serial,
there is a world at which MustF A is true and MightF A is false, showing
that MustF A does not imply MightF A.

Problem 250: Call a world w a dead-end world if it has access to
no worlds (including itself). What can we say about what Must
claims and what Might claims are true at a dead-end world?

Suppose A is a tautology (a sentence true at every world) and
B is a contradiction (a sentence true at no world). What do we
learn about the accessibility relation at w when we learn that
each of the following sentences is true at w:
(a) Must A
(b) Must B

(c) Might A
(d) Might B
Problem 251: A modal conflict is a situation in which two
sentences of the following form:
• MustF A
• Not mustF A
are both true. Show that if the accessibility relation RF for the
modal mustF is serial, then there can be no modal conflicts. Con-
sider five different modal flavors and give your best judgment
for each about whether modal conflicts are in fact possible for
those flavors.
Problem 252: Prove that if an accessibility relation RF is reflex-
ive, it is also serial. Can there be an accessibility relation that is
serial but not reflexive?

Recall that:
• If RF is reflexive, then the inference from MustF A to A is
valid.
• If RF is serial, then the inference from MustF A to MightF A
is valid.
Show that the validity of the inference from MustF A to A implies
the validity of the inference from MustF A to MightF A.
2. Transitivity: An accessibility relation RF is transitive if, whenever a first
world has access to a second and a second has access to the third, then
the first has access to the third:
• RF is transitive: for all worlds u, v, and w, if uRF v and vRF w, then
uRF w.
The following accessibility relation is transitive:

[Figure: four worlds in a row, linked by accessibility arrows forming a transitive relation.]

But this accessibility relation is not transitive:

[Figure: four worlds in a row, linked by accessibility arrows; the arrangement is described below.]

The second accessibility relation is not transitive because the second world
from the left has access to the third world from the left, and the third world
from the left has access to the fourth world from the left, but the second
world from the left does not have access to the fourth world from the left.

A modal mustF makes valid the inference from MustF A to MustF mustF
A if and only if its accessibility relation RF is transitive.

Suppose RF is transitive, and suppose w is a world that makes MustF A
true. Now let u be any world accessible from w. We claim that MustF A
is also true at u. To see this, let v be any world accessible from u. Then
by transitivity, since wRF u and uRF v, we have wRF v. And since MustF A
is true at w, A is true at every world accessible from w, so A is true at v.
But that means that A is true at any world accessible from u, so MustF A is
true at u. But u was an arbitrary world accessible from w, so MustF A is true
at every world accessible from w. That is sufficient to show that MustF
mustF A is true at w. Thus any world that makes true MustF A also makes
true MustF mustF A, so MustF A implies MustF mustF A.

Now suppose that RF is not transitive. Then there are three worlds u, v,
and w, such that uRF v, vRF w, but not uRF w. Suppose A is false at w, and
true at every other world. Because u does not have access to w, A is then
true at every world that u has access to. Therefore MustF A is true at u. But
because v does have access to w, and A is false at w, MustF A is false at v.
And because u has access to v, MustF mustF A is false at u. Therefore u is
a world at which MustF A is true and MustF mustF A is false. This shows
that when RF is not transitive, MustF A does not imply MustF mustF A.

3. Symmetry: An accessibility relation RF is symmetric if, whenever one
world has access to a second, the second also has access to it:
• RF is symmetric: for all worlds u and v, if uRF v, then vRF u.
The following accessibility relation is symmetric:

[Figure: two worlds joined by a two-way accessibility arrow.]

But this accessibility relation is not symmetric:


[Figure: two worlds joined by one-way accessibility arrows only.]

The second accessibility relation is not symmetric because its accessibility
arrows all point in only one direction, rather than in both directions.
(Note that a single one-way arrow is enough to make an accessibility re-
lation not symmetric.)

A modal mustF makes valid the inference from A to MustF mightF A if
and only if its accessibility relation RF is symmetric.

Suppose RF is symmetric, and suppose w is a world at which A is true.
Let u be an arbitrary world accessible from w. Because RF is symmetric, w
is then accessible from u. Thus u has access to a world at which A is true.
That means that MightF A is true at u. Since u was arbitrary, every world
accessible from w makes MightF A true. Therefore, MustF mightF A is
true at w. Thus any world making true A also makes true MustF mightF
A, which shows that A implies MustF mightF A.

Now suppose RF is not symmetric. Then there are worlds u and v such
that uRF v, but not vRF u. Let A be true at u, but false at every other world.
Because v does not have access to u, v does not have access to any world
at which A is true. Thus MightF A is false at v. But since u has access to v, u
has access to a world at which MightF A is false. Therefore MustF mightF
A is false at u. So there is a world making A true and MustF mightF A
false, showing that A does not imply MustF mightF A.

Problem 253: An accessibility relation RF is convergent if, whenever
a world has access to two worlds, there is some world that those
two worlds both have access to:
• RF is convergent: for any worlds u, v, and w, if uRF v and uRF w,
then there is a world z such that vRF z and wRF z.
Show that modals mustF and mightF make valid the inference from
MightF mustF A to MustF mightF A if and only if the associated
accessibility relation RF is convergent.

Problem 254: Find a constraint on the accessibility relation RF such


that the inference from MightF A to MustF A is valid if and only if
that constraint applies.

Problem 255: Suppose modals mustF and mightF have semantic


values using an accessibility relation RF that is dense:

• RF is dense: for any worlds u and v, if uRF v, then there is a
world w such that uRF w and wRF v.
Find an argument whose validity is tied to the density of the ac-
cessibility relation (so that the argument is valid if the accessibility
relation is dense and invalid if the accessibility relation is not dense).

BL A Family of Modals

In the previous section we saw the following correlations between structural


features of an accessibility relation and logical features of the modals governed
by that accessibility relation:

Accessibility Relation    Inference

Reflexive      Must A ⊨ A
Serial         Must A ⊨ Might A
Transitive     Must A ⊨ Must must A
Symmetric      A ⊨ Must might A
It is common to give names to the logical systems that result from imposing
various constraints on the accessibility relations. The systems that we have
considered thus far have the following standard names:

Accessibility Relation Logical System Name


(No constraint) K
Reflexive T
Serial D
Symmetric KB
Transitive K4
(The names are terrible, and derive from various historical sources that don’t
matter to us.)

K is the ‘default’ system that we get if we allow any accessibility relation at


all. Notice that the other systems T, D, KB, and K4 are all strengthenings of K.
Any argument that is valid in K is also valid in the other four systems. That’s
because K is more permissive than the other systems – any accessibility relation
that is permitted in, say, T is also permitted in K (but not vice versa). So if a
possible inference has a counterexample in one of the other systems, that same
counterexample exists in K.

T is a logical strengthening of D. We can see this in two ways:


1. T is characterized by a reflexive accessibility relation, and D is character-
ized by a serial accessibility relation. But any accessibility relation that

is reflexive is also serial. If every world has access to itself, then every
world certainly has access to some world.

Problem 256: Give an example of an accessibility relation that


is serial but not reflexive.

Suppose an argument is valid in D. Then every model with a serial acces-


sibility relation that makes the premises of that argument true also makes
the conclusion of the argument true. But every model with a reflexive
accessibility relation is also a model with a serial accessibility relation.
Therefore, every model with a reflexive accessibility relation that makes
the premises of the argument true also makes the conclusion of the argu-
ment true. So the argument is also valid in T. Since T makes valid every
argument that D makes valid, plus the argument:
• Must A  A
which is not valid in D, T is logically stronger than D.
2. T is characterized by the validity of the inference:
• Must A ` A
while D is characterized by the validity of the inference:
• Must A  Might A
But we can use the inference from Must A to A to derive the inference from
Must A to Might A.
We first contrapose the inference from Must A to A. That gives us:
• Not A  Not must A
We then note that the following two sentences are equivalent:
• Not must A
• Might not A

Problem 257: Using the semantic values:


• ~mustF v =λx(w, t) .∀w(vRF w → x(w) = >)
• ~mightF v =λx(w, t) .∃w(vRF w ∧ x(w) = >)
• ~notv =λxt .x = ⊥
verify that Not must A and Might not A always have the same
truth value, and that Not might A and Must not A always have
the same truth value. Conclude that Must A is equivalent to Not
might not A and that Might A is equivalent to Not must not
A.

Therefore we have:

• Not A  Might not A
But A here is an arbitrary sentence, so we can replace A with Not A to get:
• Not not A  Might not not A
Since Not not A is equivalent to A, we can simplify this to:

• A  Might A
Thus T gives us both of:
• Must A  A
• A  Might A
Combining these, we get:
• Must A  Might A
This is the characteristic logical feature of D, so T proves everything that
D does. Since T also allows the inference from Must A to A, which D does
not, T is logically stronger than D.
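The step that every reflexive relation is serial can itself be verified exhaustively over all relations on a small set. A minimal sketch (our own illustration; relations are again sets of ordered pairs):

```python
from itertools import product

WORLDS = [0, 1]
PAIRS = list(product(WORLDS, repeat=2))

def reflexive(R):
    return all((w, w) in R for w in WORLDS)

def serial(R):
    return all(any((w, v) in R for v in WORLDS) for w in WORLDS)

# Check every one of the 2^4 relations on a two-element set:
# whenever a relation is reflexive, it is also serial.
for bits in product([0, 1], repeat=len(PAIRS)):
    R = {p for p, b in zip(PAIRS, bits) if b}
    if reflexive(R):
        assert serial(R)
```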
KB, on the other hand, is not a logical strengthening of D. (Nor is D a logical
strengthening of KB – the two systems are logically incomparable.) There are
accessibility relations that are serial but not symmetric, such as:

• → • ⟲

And there are accessibility relations that are symmetric but not serial, such as:

• •

Problem 258: Show that D and K4 are logically incomparable by


giving a pair of accessibility relations, one of which is serial but not
transitive and one of which is transitive but not serial. Then show
that KB and K4 are logically incomparable by giving another pair of
accessibility relations, one of which is symmetric but not transitive
and one of which is transitive but not symmetric.

We can give a graph of the logical relations among the modal sys-
tems we’ve been considering:

T

D KB K4

K

We can also create additional modal systems by imposing more than one con-
straint on the accessibility relation. Suppose, for example, we require the
accessibility relation to be both reflexive and symmetric. Then we get a modal
that validates both of the inferences:
• Must A  A
• A  Must might A
(The resulting modal system is in fact exactly characterized by adding these
two validities, but showing this is non-trivial, and we won’t pursue that here.)
The modal system created by an accessibility relation that is both reflexive and
symmetric is called B. It is a strengthening of both T and KB, because it imposes
the accessibility constraint of both of those system. So we can add to our graph
of systems:

B

T

D KB K4

K
We have considered four different structural features of the accessibility rela-
tion: seriality, reflexivity, symmetry, and transitivity. That means there are a
total of sixteen possible combinations of those features. However, we've al-
ready seen that any accessibility relation that is reflexive is also serial, so we
can’t have combinations of features that are +reflexive and -serial. That leaves
twelve combinations:
1. -serial, -reflexive, -symmetric, -transitive: With no constraints on the
accessibility relation, we get the modal system K. (Note that -serial means
that we do not require the accessibility relation to be serial, rather than that
the accessibility relation is not serial.)
2. -serial, -reflexive, -symmetric, +transitive: Requiring only transitivity, we
get the modal system K4.

3. -serial, -reflexive, +symmetric, -transitive: Requiring only symmetry, we
get the modal system KB.
4. -serial, -reflexive, +symmetric, +transitive: Requiring both symmetry and
transitivity, we get the modal system called KBE. (We’ll say more about
this system and its name soon.)
5. +serial, -reflexive, -symmetric, -transitive: Requiring only seriality, we
get the modal system D.
6. +serial, -reflexive, -symmetric, +transitive: Requiring both seriality and
transitivity, we get the modal system KD4.
7. +serial, -reflexive, +symmetric, -transitive: Requiring both seriality and
symmetry, we get the modal system KDB.
8. +serial, -reflexive, +symmetric, +transitive: This combination turns out
to be logically impossible.

Problem 259: Show that an accessibility relation that is serial,


symmetric, and transitive must also be reflexive.

9. +serial, +reflexive, -symmetric, -transitive: Requiring reflexivity (and


hence seriality), we get the modal system T.
10. +serial, +reflexive, -symmetric, +transitive: Requiring reflexivity (and
hence seriality) and transitivity, we get the modal system S4.
11. +serial, +reflexive, +symmetric, -transitive: Requiring reflexivity (and
hence seriality) and symmetry, we get the modal system B.
12. +serial, +reflexive, +symmetric, +transitive: Requiring all of seriality,
reflexivity, symmetry, and transitivity, we get the modal system S5.
We can then give an expanded graph of the logical relations among these modal
systems:

S5

S4 B KBE

T KDB KD4

D KB K4

BM S5

The strongest of the modal systems in our graph is the system S5, characterized
by an accessibility relation that is reflexive, symmetric, and transitive. A relation
that is reflexive, symmetric, and transitive is an equivalence relation. Consider
some examples of equivalence relations:

• • •

• • •

• • •
1.

• • •

• • •

• • •
2.

• • •

• • •

• • •
3.

Notice that each of the above equivalence relations splits up the collection of
worlds into distinct clusters, so that within each cluster every world has access
to every other world. In the first example, each world is in its own cluster,
having access only to itself. In the second example, there are four clusters of
worlds – two clusters of three worlds each, one of two worlds, and one with a
single world. And in the third example, all of the worlds form a single cluster,
with every world having access to every other world.

Problem 260: For each of the following accessibility relations, add


the minimum number of accessibility arrows needed to make the
accessibility relation an equivalence relation.

• • •

• • •

• • •
1.

• • •

• • •

• • •
2.

• • •

• • •

• • •
3.

A partition of a set S is a collection of subsets of S that (i) are pairwise disjoint,


(ii) have S itself as their union, and (iii) are all non-empty. For example, given
the set {a, b, c, d, e, f }:
• {a, f }, {b, d, e}, {c} is a partition.
• {a, c, f }, {b, c, d, e} is not a partition, because the two sets overlap at c.
• {a, e, f }, {b}, {c} is not a partition, because d is not included in any partition
element.

Problem 261: How many partitions of a nine-element set are there?

Equivalence relations and partitions fit naturally together: every equivalence


relation determines a partition, and every partition determines an equivalence
relation:
1. Given an equivalence relation R on some set S, we can produce a partition
Π consisting of sets of items all mutually inter-related by R. Thus Π =
{{b ∈ S : aRb} : a ∈ S}.

2. Given a partition Π on some set S, we can produce an equivalence relation
R that relates two items just in case they are in the same partition element.
Thus aRb if and only if ∃π ∈ Π, a, b ∈ π.
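These two constructions are easy to state as code. A minimal sketch (our own illustration), with an equivalence relation represented as a set of ordered pairs and a partition as a set of frozensets:

```python
def partition_from(R, S):
    # Construction 1: Π = {{b in S : aRb} : a in S}.
    return {frozenset(b for b in S if (a, b) in R) for a in S}

def relation_from(Pi):
    # Construction 2: aRb iff a and b share a partition element.
    return {(a, b) for block in Pi for a in block for b in block}

S = {'a', 'b', 'c'}
R = {('a', 'a'), ('b', 'b'), ('c', 'c'), ('a', 'b'), ('b', 'a')}
Pi = partition_from(R, S)
assert Pi == {frozenset({'a', 'b'}), frozenset({'c'})}
assert relation_from(Pi) == R  # round-tripping recovers R
```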

Problem 262: Prove that if R is an equivalence relation, and Π is


created from R as above, then Π is a partition. (Be clear on how you
use the reflexivity, symmetry, and transitivity of R in your proof.)
Then prove that if Π is a partition, and R is created from Π as above,
then R is an equivalence relation. (Be clear on how you use the
disjointness and exhaustivity of Π in your proof.)

Problem 263: Let S be the set {a, b, c, d, e, f }. For each of the following
partitions of S, find the equivalence relation determined by that
partition.
1. {a, e, f }, {b}, {c, d}
2. {a, b, c, d, e, f }
3. {a}, {b}, {c}, {d}, {e}, { f }
4. {a, b, c, f }, {d, e}
For each of the following equivalence relations, find the partition
determined by that equivalence relation:

a b c

d e f
1.

a b c

d e f
2.

a b c

d e f
3.

The S5 modal logic is particularly simple, because worlds all have access to
each other (within a given cluster of worlds). As a result, in S5 there are no
interesting effects from iterated modals. In S5, all of the following are equivalent:

• Must A
• Must must A
• Might must A
• Must might must A
• Might might must must might must A
If Must A is true at a world w, then A is true at every world in its cluster. But
then every world in w’s cluster has access only to worlds in that cluster, so Must
A is true at every world in the cluster, so both Might must A and Must must
A are true at w. Since all the worlds in the cluster are symmetrically arranged
in accessibility, if Might must A and Must must A are true at w, they are true at
every world in the cluster. Therefore Must might must A, Must must must A,
Might might must A, and Might must must A are all true at w. And so on –
if Must A is true at w, then any sequence of must and might, ending in must,
applied to A is true at w. So there’s no real effect of adding more modals once
an initial must has been added. Similarly with might – if Might A is true at w,
then any sequence of must and might, ending in might, applied to A is true at
w.
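The collapse of iterated modals in S5 can be confirmed directly on a clustered model. A minimal sketch (our own illustration, using the obvious set-based evaluator for must and might):

```python
def must(R, worlds, prop):
    # MustF A is true at w iff every world accessible from w is in prop.
    return {w for w in worlds if all(v in prop for (u, v) in R if u == w)}

def might(R, worlds, prop):
    # MightF A is true at w iff some world accessible from w is in prop.
    return {w for w in worlds if any(v in prop for (u, v) in R if u == w)}

# Two clusters: {0, 1} and {2}; every world sees every world in its cluster,
# so R is reflexive, symmetric, and transitive (an equivalence relation).
worlds = {0, 1, 2}
R = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)}
A = {0, 1}  # A true throughout the first cluster, false at 2

m = must(R, worlds, A)
assert m == must(R, worlds, m)                    # Must A ≡ Must must A
assert m == might(R, worlds, m)                   # Must A ≡ Might must A
assert m == must(R, worlds, might(R, worlds, m))  # Must A ≡ Must might must A
```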

One other constraint on the accessibility relation that is useful to consider when
thinking about S5 is euclideanness. An accessibility relation RF is euclidean if,
whenever a world has access to two worlds, those two worlds have access to
one another:
• RF is euclidean: for all worlds w, u, and v, if wRF u and wRF v, then uRF v
and vRF u.
The following accessibility relation is euclidean:

• → • ⟲

But this accessibility relation is not euclidean:

• → •

If RF is symmetric, then RF ’s being euclidean is equivalent to RF ’s being transi-


tive:

1. Suppose RF is symmetric and euclidean. We want to show that RF then
must be transitive. So consider some three worlds u, v, and w such that
uRF v and vRF w:

u → v → w

By symmetry, we also have vRF u and wRF v:

u ⇄ v ⇄ w

But then v has access to both u and w, so by euclideanness, u and w have


access to each other:

u ⇄ v ⇄ w (and u ⇄ w)

So, in particular, when u has access to v and v has access to w, u has access
to w. Thus RF is transitive.

2. Suppose that RF is symmetric and transitive. We want to show that RF


then must be euclidean. So suppose world w has access to two worlds u
and v:

u ← w → v

By symmetry, u and v both have access back to w:

u ⇄ w ⇄ v

Since u has access to w and w has access to v, by transitivity, u has access
to v. Similarly, since v has access to w and w has access to u, v has access
to u:

u ⇄ w ⇄ v (and u ⇄ v)

Thus u and v have access to each other, showing that RF is euclidean.


This is why the modal logic that requires symmetry and transitivity is called
KBE – symmetry and transitivity is equivalent to symmetry and euclideanness,
and the E in KBE represents the euclidean requirement.
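This equivalence, given symmetry, can be checked by brute force over every relation on a three-element set. A minimal sketch (our own illustration, with relations as sets of ordered pairs):

```python
from itertools import product

WORLDS = [0, 1, 2]
PAIRS = list(product(WORLDS, repeat=2))

def symmetric(R):
    return all((v, u) in R for (u, v) in R)

def transitive(R):
    # uRv and vRw imply uRw.
    return all((u, w) in R for (u, v) in R for (x, w) in R if v == x)

def euclidean(R):
    # uRv and uRw imply vRw and wRv.
    return all((v, w) in R and (w, v) in R
               for (u, v) in R for (x, w) in R if u == x)

# All 2^9 relations on a three-element set: whenever a relation is
# symmetric, it is euclidean exactly when it is transitive.
for bits in product([0, 1], repeat=len(PAIRS)):
    R = {p for p, b in zip(PAIRS, bits) if b}
    if symmetric(R):
        assert euclidean(R) == transitive(R)
```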

Also, an accessibility relation that is serial, symmetric, and transitive is also


reflexive. Consider an arbitrary world w. By seriality, it has access to some
world u:

w → u

By symmetry, u has access back to w:

w ⇄ u

Since w has access to u and u has access to w, by transitivity, w has access to w:

w ⇄ u (and w → w)

So our arbitrary world has access to itself, showing that the accessibility must
be reflexive. (By the same reasoning, we could show that u must also have
access to itself.)

As a result, requiring an accessibility relation to be serial, symmetric, and


euclidean also forces it to be reflexive, symmetric, and transitive, and thus an
S5 accessibility relation. This gives us an alternative characterization of that
modal system.

Problem 264: Show that any equivalence relation must be serial,
symmetric, and euclidean (thus showing that the requirements (i)
reflexive, symmetric, and transitive, and (ii) serial, symmetric, and
euclidean are equivalent to each other).

We can slightly expand our map of modal systems:

S5

S4 B KBE

T KDB KD4

D KB K4 KE

We have added the modal system KE, which requires only that the accessibility
relation be euclidean. Notice that the first modal system that is above all of D
(serial), KB (symmetry), and KE (euclidean) is S5.

Our default assumption will be that modals are S5 modals, but we’ll also
consider cases where S5 doesn’t look like the right system.

BN Problems With Conditionals

Consider a conditional if ...then sentence such as:


• If Socrates cries, Plato laughs.

We’ve given tools already for a simple treatment of this sentence. We start with
a syntactic analysis, either binary branching:

(t, t) t

Plato laughs
(t, (t, t)) t

if Socrates cries

or trinary branching:

(⟨t, t⟩, t) t t

if Socrates cries Plato laughs


We then use a suitable semantic value for if as a truth function, either:

1. For the binary-branching structure, ~if = λxt .λyt .x = ⊥ or y = >


2. For the trinary-branching structure, ~if = λ⟨xt , yt ⟩.x = ⊥ or y = >
Using either approach, we conclude that If Socrates cries, Plato laughs
is true whenever either Socrates doesn’t cry or Plato laughs.

That analysis has some nice features. Most noticeably, it explains the validity
of the rule of modus ponens:
• A; if A, B  B

Problem 265: Use a truth-table presentation of the semantic value


of if to show that modus ponens is valid given that semantic value.

However, the truth-functional analysis of if also has a number of less desirable


features:
1. We have:

• not A  if A, B
or equivalently, any conditional with a false antecedent is true. But con-
ditionals with false antecedents don’t always appear true. Consider some
examples:

• If granite is a liquid, then Julius Caesar was killed by Brutus.


• If Shakespeare wrote Doctor Faustus, then Mars has five moons.
• If Berlin is in Australia, it is in the northern hemisphere.
• If 15 is prime, it is divisible by 2.
• If the moon is made of green cheese, it is shaped like a doughnut.

Judgments on sentences like these – conditionals with false antecedents


and consequents that are either unrelated to the antecedent or related to
the antecedent in the ‘wrong way’ – vary from person to person. Some-
times such conditionals strike people as false, and sometimes they strike

people as bizarre and inappropriate. But they rarely strike people as
straightforwardly true, as would be predicted by the truth-functional se-
mantics for if. A better semantics for if would allow conditionals with
false antecedents to be something other than true.
2. We have:

• B  if A, B
or equivalently, any conditional with a true consequent is true. But con-
ditionals with true consequents don’t always appear true. Consider some
examples:

• If granite is a liquid, then Julius Caesar was killed by Brutus.


• If New York City is in Florida, then it is north of Washington
D.C.
• If 15 is prime, it is divisible by 3.
• If elephants are larger than rhinoceroses, then Jupiter is
larger than Saturn.
• If Berlin is in France, then Berlin is in Germany.
Again, judgments on sentences like these vary from person to person,
striking people sometimes as false and sometimes as bizarre or pointless.
But again they rarely strike people as straightforwardly true, as would be
predicted by the truth-conditional semantics for if. A better semantics
for if would allow conditionals with true consequents to be something
other than true.
3. We have:

• not(if A, B)  A and not B


To reject a conditional, then, requires (according to the truth-functional
semantics for if) endorsing the antecedent and rejecting the consequent.
But this doesn’t seem right. Suppose Alex is trying to convince Beth that
she has the bubonic plague, and says to her:

• If you have the bubonic plague, your left big toe glows purple.
So let’s check your toe.
Beth, understandably, says:
• What? No, that’s not true.

Alex then replies:


• Ah, so you agree that you have the bubonic plague.

Alex’s reply looks absurd, but if not(if A, B) does imply A and not
B, then Beth’s denial of If you have the bubonic plague, your left
big toe glows purple implies You have the bubonic plague and your
left big toe does not glow purple, and hence implies You have the
bubonic plague. Since that looks wrong, we would prefer a semantics
for the conditional that didn’t have the result that rejecting a conditional
requires affirming the antecedent of the conditional.
4. We have:
•  (If A, B) or (if B, C)

That is, given any three sentences A, B, and C, if there isn't a conditional
connection from the first to the second, then there is a conditional con-
nection from the second to the third.

Problem 266: Show that it is a consequence of  (If A, B) or


(if B, C) that, given any two sentences A and B, at least one
of the following two conditionals is true:
• If A, B
• If B, A
Give an example in which neither of those two conditionals
seems true. Is it possible for both If A, B and If B, A to be
true?

But again, there are examples that don’t seem to conform to this logical
feature.

(a) Consider the sentences It is raining, The streets are dry, and
Cars skid easily. Then we get the disjunction:
• Either (if it is raining then the streets are dry) or (if
the streets are dry, then cars skid easily).
But neither disjunct of this disjunction looks true.
(b) Consider the sentences Number N is prime, N is divisible by
4, and N is odd. Then we get the disjunction:
• Either (if N is prime, then N is divisible by 4) or (if
N is divisible by 4, then N is odd.)
But again, neither disjunct looks true.
So (If A, B) or (if B, C) does not appear to be a logical truth, and
it would be nice to have a semantics for if that didn’t make it a logical
truth.
5. We have:

• if A and B, C  (if A, C) or (if B, C)

But many examples seem to contradict this inference pattern. Smith is
attempting to defuse a bomb, and is told:

• If you cut the red wire and the green wire, the detonator
will be disconnected.

It doesn’t seem to follow from this instruction that:


• Either if you cut the red wire the detonator will be disconnected
or if you cut the green wire the detonator will be disconnected.
Both wires need to be cut to make the bomb safe – neither one of the two
is sufficient by itself. The truth-functional semantics for if guarantees
that no consequent can be genuinely dependent on a pair of antecedent
conditions.
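That this inference really is valid on the truth-functional reading, despite the bomb example, can be confirmed by brute force over all truth-value assignments. A minimal sketch (our own illustration; if_tf is our name for the truth-functional if):

```python
from itertools import product

def if_tf(x, y):
    # The truth-functional if: true unless x is true and y is false.
    return (not x) or y

# On the truth-functional reading, If (A and B), C really does imply
# (If A, C) or (If B, C): check all eight truth-value assignments.
for a, b, c in product([True, False], repeat=3):
    if if_tf(a and b, c):
        assert if_tf(a, c) or if_tf(b, c)
```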
In short, then, the truth-functional semantics for if creates a logic which has a
number of mismatches with the way that conditionals are used in English. It’s
time to see if we can build a better conditional semantics.

BO Modal Conditionals

We can improve our semantics for conditionals by using possible worlds re-
sources. The truth-functional semantic value for if makes ~if A, B depend
only on the truth values of A and B at the actual world, not on their truth values
at other possible world. But we can instead give a modal implementation for
if, by requiring a conditional to have the truth values of antecedent and con-
sequent properly coordinated at all worlds.

This modal implementation gives us what is called a strict conditional. Using


world-relativized semantic values, we have:

• ~ifw = λx(w, t) .λy(w, t) .∀u ∈ w(x(u) = ⊥ ∨ y(u) = >)

Implemented this way, if takes inputs of type (w, t). So, as with the world-
relativized implementation of modals, we’ll need intervening λ operators to
convert the world-relativized t values of sentences into (w, t) values. Thus
we’ll think of If A, B as having the full form if (λw A), (λw B).
Alternatively, we can use set-wise semantic values in giving a modal analysis
of if. Done this way, we have:
• ~if = λxw .λy(w, t) .λz(w, t) .∀u ∈ w(y(u) = ⊥ ∨ z(u) = >)

On either version of the modal analysis, the basic idea is that If A, B is true
(at a world w) if every world that makes A true also makes B true. This is then
equivalent to the requirement that ||A||⊆||B||.

Problem 267: Let’s use the symbol ⊃ for the truth-functional version
of if, often called the material conditional, and reserve if for the
modal strict conditional analysis. Show that the following two
constructions are equivalent:
• Must(⊃ A, B)
• if A, B
(In order to get the equivalence to work out, we need to say some-
thing about the accessibility relation used for must. What accessi-
bility relation is needed?)

Problem 268: Explain carefully why the truth of If A, B is equiv-


alent to the truth of ||A||⊆||B||. Then use this equivalence, combined
with some basic mathematical features of the subset relation, to
show that the strict conditional has the following inferential fea-
tures:
1. If A, B; If B, C  If A, C
2.  If A, A
3. If A, B  If (not B), (not A)
4. If A, B  If (A and C), B
5. If A, B  If A, (B or C)

The strict conditional avoids many of the problems we encountered with the
truth-functional version of the conditional. For example, we do not have:
• B  If A, B
To see this, suppose we have two worlds u and v. At world u, A and B are both
true. At world v, A is true and B is false. But then ||A|| ⊈ ||B||. Not every world
that makes A true makes B true. Thus If A, B is false. So at world u, B is true
and If A, B is false. This shows that B does not imply If A, B.
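Since If A, B amounts to ||A|| ⊆ ||B||, the countermodel just described is a one-line subset check. A minimal sketch (our own illustration; strict_if is our name):

```python
def strict_if(A, B):
    # If A, B is true just in case ||A|| is a subset of ||B||.
    return A <= B

# Worlds u = 0 and v = 1, as in the countermodel above.
A = {0, 1}  # A true at both worlds
B = {0}     # B true at u only
assert 0 in B               # B is true at world u,
assert not strict_if(A, B)  # but If A, B is false: B does not imply If A, B.
```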

Problem 269: Give possible worlds constructions, similar to the


example just given showing that B ⊭ If A, B, showing each of the
following invalidities:
1. Not A ⊭ If A, B
2. Not(if A, B) ⊭ A and not B
3. ⊭ (If A, B) or (if B, C)
4. If (A and B), C ⊭ (If A, C) or (if B, C)

The simple modal implementation we’ve just given doesn’t make any use of
an accessibility relation. If we want to add a role for the accessibility relation,
we can say:
• ~ifw = λx(w, t) .λy(w, t) .∀u ∈ w(wRu → x(u) = ⊥ ∨ y(u) = >)

This semantic value for if has the effect that If A, B is true at a world w just
in case every world accessible from w that makes A true also makes B true.

Once we add a role for the accessibility relation, we can ask what effect the
choice of accessibility relation, and thus the choice of modal logic, has on the
behavior of the conditional. As it turns out, the answer is: not very much. The
most important question is whether the accessibility relation is reflexive.

Fact: The argument A; If A, B  B is valid if and only if the acces-


sibility relation for the modal conditional is reflexive.

Proof: Suppose the accessibility relation R is reflexive, and suppose


both A and If A, B are true at some world w. The truth of If A,
B at w tells us that every accessible world that makes A true also
makes B true. Since R is reflexive, w has access to itself. Since w
makes A true and w has access to w, w must also make B true. Thus
A together with If A, B implies B.

Now suppose R is not reflexive. Let w be a world that does not have
access to itself. Let A be true at w and false at every other world,
and let B be false at w. Then If A, B is true at w, because A is false
at every world accessible from w. So at w, A is true and If A, B is
true, but B is false. This shows that A together with If A, B does
not imply B.
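The countermodel in the second half of the proof can be written out directly. A minimal sketch (our own illustration; modal_if is our name for the accessibility-relative conditional):

```python
def modal_if(R, worlds, A, B, w):
    # If A, B is true at w iff every R-accessible A-world is a B-world.
    return all(u in B for (x, u) in R if x == w and u in A)

worlds = {0, 1}
R = {(0, 1)}   # world 0 does not see itself: R is not reflexive
A = {0}        # A true at 0 only
B = set()      # B false everywhere

# At world 0: A is true, If A, B is true (vacuously, since no accessible
# world makes A true), but B is false. Modus ponens fails.
assert 0 in A
assert modal_if(R, worlds, A, B, 0)
assert 0 not in B
```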

Problem 270: Show that if the modal if uses an S4 accessibility


relation, then the conditional validates the inference:
• If A, B  Must(If A, B)
Show that if if uses an S5 accessibility relation, then the conditional
validates the inference:
• Might(if A, B)  Must(if A, B)
Can you find an interesting inferential feature of the conditional that
follows from the assumption that if uses a D accessibility relation?

BP Subsentential Modality

We’ve been simplifying the addition of possible worlds to our semantic machin-
ery thus far by focusing only on what happens at the level of entire sentences.
It’s time to work out q possible worlds version of the semantic values for sub-
sentential expressions. Consider the sentence Aristotle laughs. Suppose
that we have four possible worlds w1 , w2 , w3 , and w4 , and that ||Aristotle
laughs|| = {w1 , w2 }. That is, Aristotle laughs is true relative to w1 and w2 , and

false relative to w3 and w4 . How can we give semantic values for Aristotle
and laughs that produce this result?

The obvious approach is simply to world-relativize the semantic values of


Aristotle and laughs. Suppose our four worlds are characterized in the
following way:
1. In w1 , Aristotle laughs and Plato laughs.
2. In w2 , Aristotle laughs and Plato doesn’t laugh.
3. In w3 , Aristotle doesn’t laugh and Plato laughs.

4. In w4 , Aristotle doesn’t laugh and Plato doesn’t laugh.


Then we’d like to have the following world-relativized semantic values for
laughs:
1. ~laughsw1 = [Aristotle → >, Plato → >]
2. ~laughsw2 = [Aristotle → >, Plato → ⊥]
3. ~laughsw3 = [Aristotle → ⊥, Plato → >]
4. ~laughsw4 = [Aristotle → ⊥, Plato → ⊥]

But it looks like there is no need to world-relativize the semantic value of


Aristotle. We can simply say:

• ~Aristotlew1 = ~Aristotlew2 = ~Aristotlew3 = ~Aristotlew4 =


Aristotle
We can then calculate complex semantic values in the usual way using rela-
tivized functional application:

• ~Aristotle laughsw = ~laughsw (~Aristotlew )


Thus:
1. ~Aristotle laughsw1 = ~laughsw1 (~Aristotlew1 ) = [Aristotle → >, Plato → >](Aristotle) = >
2. ~Aristotle laughsw2 = ~laughsw2 (~Aristotlew2 ) = [Aristotle → >, Plato → ⊥](Aristotle) = >
3. ~Aristotle laughsw3 = ~laughsw3 (~Aristotlew3 ) = [Aristotle → ⊥, Plato → >](Aristotle) = ⊥
4. ~Aristotle laughsw4 = ~laughsw4 (~Aristotlew4 ) = [Aristotle → ⊥, Plato → ⊥](Aristotle) = ⊥
So we get, as desired, that Aristotle laughs is true in w1 and w2 but false in
w3 and w4 , and hence that ||Aristotle laughs|| = {w1 , w2 }.
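This world-relativized setup translates straightforwardly into code. A minimal sketch (our own illustration), with each world-relative ~laughs as a dictionary and relativized functional application as ordinary application:

```python
# World-relative semantic values for laughs, one dictionary per world.
laughs = {
    'w1': {'Aristotle': True,  'Plato': True},
    'w2': {'Aristotle': True,  'Plato': False},
    'w3': {'Aristotle': False, 'Plato': True},
    'w4': {'Aristotle': False, 'Plato': False},
}

def aristotle(w):
    # Names pick out their bearers at every world.
    return 'Aristotle'

def sentence_value(w):
    # ~Aristotle laughs~w = ~laughs~w(~Aristotle~w)
    return laughs[w][aristotle(w)]

# ||Aristotle laughs|| = {w1, w2}
assert {w for w in laughs if sentence_value(w)} == {'w1', 'w2'}
```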

We specified the world-relative values of ~laughs above by giving the desired


function in explicit list format. But we’d also like to be able to specify ~laughs,
relative to a given world, by giving an appropriate lambda expression using
our metalinguistic term laughs. Before incorporating possible worlds, we did
this by specifying:

• ~laughs = λx.x laughs


We can add world-relativization to this kind of semantic value in various ways:
1. We can add world-relativization directly to the metalanguage predicates
we use in the lambda expressions, by giving relativized semantic values
such as:

• ~laughsw = λx.x laughs in w


Using semantic values of this sort presupposes that, when we speak as
theorists in our language of theorizing, we understand what it means to
laugh in w for a given world w. People seem willing to accept phrases
of this sort, although it isn’t always clear that we can give a coherent
explanation of what they mean. One obvious candidate explanation is
that we are increasing the argument places of verbs. We change from
thinking of laughing as a monadic property of an individual to thinking
of laughing as a binary relation between an individual and a world. In
the same way we add one argument place to every verb. Intransitive
verbs all become relations between members of e and worlds. Transi-
tive verbs, rather than expressing binary relations between individuals,
express three-place relations among two individuals and a world.
2. We can use conditionals whose antecedent specifies what world is actual:
• ~laughsw = λx.if w is the actual world, then x laughs

Using such semantic values doesn’t require us to add argument places


to verbs or to make sense of, for example, laughing in a world w. But it
does require us to be able to make sense of hypothetically supposing that
a given world w is the actual world, and to consider what would be the
case (for example, who would laugh) under that supposition.

3. We can convert a world into a canonical description, and then use con-
ditions that have that canonical description as an antecedent. Suppose
again we have our four worlds:
(a) In w1 , Aristotle laughs and Plato laughs.
(b) In w2 , Aristotle laughs and Plato doesn’t laugh.
(c) In w3 , Aristotle doesn’t laugh and Plato laughs.
(d) In w4 , Aristotle doesn’t laugh and Plato doesn’t laugh.
And suppose for simplicity that these features of the worlds fully describe
them. Then instead of saying:

• If w1 is the actual world, then S.


we can instead say:
• If Aristotle laughs and Plato laughs, then S.

Now let D be a function that maps each world to a canonical description


of that world. (D is thus a function of type (w, t).) We can then write:
• ~laughsw = λx.If D(w), then x laughs
Thus:

(a) ~laughsw1 = λx.If D(w1 ), then x laughs = λx.If Aristotle laughs and
Plato laughs, then x laughs.
(b) ~laughsw2 = λx.If D(w2 ), then x laughs = λx.If Aristotle laughs and
Plato doesn’t laugh, then x laughs.
Giving the world-relative semantic value of laughs in this way doesn’t
require us to understand any esoteric world-specific terminology (like
laughing in a world or world w1 being actual), but it does require us to have
a canonical method for associating each world with a description of how
things are in that world.
We’ll primarily use the first of these options, adding argument places for worlds
to ordinary English verbs. Thus we can calculate:
• ~Aristotle laughsw = ~laughsw (~Aristotlew )
• =~laughsw (Aristotle)
• = (λx.x laughs in w)(Aristotle)

• = Aristotle laughs in w
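The first option can be sketched computationally. In the toy model below (an illustrative encoding of our own, not part of the text's formal system), a world is represented simply by the set of individuals who laugh in it, and the truth values > and ⊥ become Python's True and False:

```python
# Toy model: a world is represented by the set of individuals who laugh
# there (an illustrative simplification; the encoding is ours).
w1 = frozenset({"Aristotle", "Plato"})   # both laugh
w2 = frozenset({"Aristotle"})            # only Aristotle laughs
w3 = frozenset({"Plato"})                # only Plato laughs
w4 = frozenset()                         # nobody laughs

# Option 1: add a world argument place to the verb.
# [[laughs]]^w = lambda x: x laughs in w
def laughs(w):
    return lambda x: x in w

# [[Aristotle laughs]]^w = [[laughs]]^w([[Aristotle]]^w)
def aristotle_laughs(w):
    return laughs(w)("Aristotle")

print(aristotle_laughs(w1))  # True
print(aristotle_laughs(w3))  # False
```

The same sentence thus receives different truth values relative to different worlds, which is exactly the world-relativity the text is after.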
Now let’s try two more complicated examples:
1. Consider a sentence with a transitive verb:

• Plato admires Socrates
World-relativized semantic values can be given for Plato and Socrates
in the same way that we handled Aristotle above:
• ~Platow = Plato
• ~Socratesw = Socrates
This is part of a general strategy of treating names as picking out their
bearers with respect to all worlds. For admires, we world-relativize along
the same lines as we did laughs:

• ~admiresw = λx.λy.y admires x in w


We can now work our semantic analysis up a tree for the sentence:

Plato

Plato λx.λy.y admires x in w Socrates

admires Socrates

Applying ~admiresw to Socrates, we get:

Plato λx.x admires Socrates in w

Plato
λx.λy.y admires x in w Socrates

admires Socrates

And then applying ~admires Socratesw to Plato we get:

Plato admires Socrates in w

Plato λx.x admires Socrates in w

Plato
λx.λy.y admires x in w Socrates

admires Socrates

We thus discover that the semantic value of Plato admires Socrates is
> relative to worlds in which Plato admires Socrates, and ⊥ relative to
worlds in which Plato does not admire Socrates.
2. Now let’s add a modal operator to the mix. Consider:
• Might(Aristotle admires Plato)
We start with a tree for the sentence (notice that we must add a λw -abstract
to the tree to allow for the modal to interact properly):

λx(w, t) .∃u ∈ w(wRu ∧ x(u) = >)

Might λw

Aristotle

Aristotle λx.λy.y admires x in w Plato

admires Plato

The Aristotle admires Plato subtree combines as in the previous ex-


ample, so we get:

λx(w, t) .∃u ∈ w(wRu ∧ x(u) = >)

Might λw Aristotle admires Plato in w

Aristotle λy.y admires Plato in w

Aristotle
λx.λy.y admires x in w Plato

admires Plato

The λw abstractor then converts the range of world-relativized truth val-


ues of Aristotle admires Plato in w to a function from worlds to truth values.
That function can simply be given by:
• λw.Aristotle admires Plato in w

Adding this to the tree, we have:

λx(w, t) .∃u ∈ w(wRu ∧ x(u) = >) λw.Aristotle admires Plato in w

Might
λw Aristotle admires Plato in w

Aristotle λy.y admires Plato in w

Aristotle
λx.λy.y admires x in w Plato

admires Plato

Finally, Might combines with the rest of the tree to give:


• ~Might λw Aristotle admires Platow = ~Mightw (~λw Aristotle admires Platow )
• = (λx(w, t) .∃u ∈ w(wRu ∧ x(u) = >))(λw.Aristotle admires Plato in
w)
• = ∃u ∈ w(wRu ∧ (λw.Aristotle admires Plato in w)(u) = >)
• = ∃u ∈ w(wRu∧ (Aristotle admires Plato in u) = >)
• =∃u ∈ w(wRu∧ Aristotle admires Plato in u)
So our final tree is:

∃u ∈ w(wRu∧ Aristotle admires Plato in u)

λx(w, t) .∃u ∈ w(wRu ∧ x(u) = >) λw.Aristotle admires Plato in w

Might
λw Aristotle admires Plato in w

Aristotle λy.y admires Plato in w

Aristotle
λx.λy.y admires x in w Plato

admires Plato

And we conclude that Might(Aristotle admires Plato) is true in a


world w just in case there is some world accessible from w in which
Aristotle admires Plato.
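The analysis just derived can be mirrored in code. The sketch below is a toy model of our own (two stipulated worlds and a stipulated accessibility relation R); it implements ~Mightw as existential quantification over accessible worlds, applied to a (w, t) function:

```python
# Toy model: worlds are labels, facts[w] records who admires whom in w,
# and R is the accessibility relation (all values here are stipulated).
facts = {"w1": {("Aristotle", "Plato")},   # in w1, Aristotle admires Plato
         "w2": set()}                      # in w2, nobody admires anybody
W = list(facts)
R = {("w1", "w1"), ("w2", "w1")}           # w1 is accessible from both worlds

# [[admires]]^w = lambda x: lambda y: y admires x in w
def admires(w):
    return lambda x: lambda y: (y, x) in facts[w]

# The lambda-w abstract: a function of type (w, t)
prop = lambda u: admires(u)("Plato")("Aristotle")

# [[Might]]^w = lambda x_(w,t): some accessible u has x(u) = True
def might(w):
    return lambda p: any((w, u) in R and p(u) for u in W)

print(might("w2")(prop))  # True: w1 is accessible from w2,
                          # and the admiration holds there
```

Even though the admiration fails in w2 itself, Might(Aristotle admires Plato) comes out true at w2 because an accessible world verifies it.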

Problem 271: Give a suitable world-relative semantic value for


killed and use that semantic value to give a full semantic analysis
of the sentence:
• Booth killed Lincoln

Problem 272: Give a full semantic analysis of the sentence:


• Might(Socrates laughs) and Might not(Socrates laughs)
Give a collection of worlds and an accessibility relation for might
that make this sentence true.

BQ Modals and Quantified Noun Phrases

Consider the sentence:


• The king might be dead.
Let’s treat that might as expressing epistemic possibility, and then consider two scenarios:
1. First Scenario: It is early 1793. You know that the king is Louis XVI, and
you also know that the sans-culottes have plans to execute him. You’re
thus uncertain whether Louis XVI is alive or dead, and you thus endorse
The king might be dead.

2. Second Scenario: It is 1471. The War of the Roses has been raging for
years, and Henry VI and Edward IV have been alternating periods as
king. You’ve just seen Edward kill Henry, so you know that Edward
is alive and Henry is dead. However, you haven’t been tracking all the
political turmoil of the War of the Roses, so you aren’t sure which of Henry
and Edward was king at the time of the killing. You’re thus uncertain
whether the king is alive or dead. (Let’s assume that if Henry was king,
Edward doesn’t become king until some future coronation, so that Henry
remains the (dead) king.) You thus endorse The king might be dead.
The king might be dead is true in both of these scenarios, but it is true in
different ways in each scenario. In the first scenario, there is certainty about
who is king but doubt about whether that person is alive or dead – it’s doubt
about vitality that makes the might claim true. But in the second scenario, there
is certainty about who is alive and who is dead, but doubt about who is king –
it’s doubt about kingship that makes the might claim true.

A good theory of subsentential modality needs to allow The king might be


dead to be true in both scenarios, then. But this turns out to be tricky. Let’s start
with the obvious approach. We have a tree:

Might
λw

is dead
the king
(For simplicity we assume that is dead is a single intransitive verb, rather than
forming it from an adjective dead and an is of predication.) We already have
world-relativized semantic values available for most of the pieces:
• ~mightw = λx(w, t) .∃u ∈ w(wRu ∧ x(u) = >)

• ~kingw = λx.x is king in w

• ~is deadw = λx.x is dead in w


We can then try assuming that the is not world-sensitive in its semantic value,
so that:
• ~thew = ~the = λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > ∧ ∀u(x(u) = > ↔ u =
z) ∧ y(z) = >)
Working our way up the tree, we start with:

λx(w, t) .∃u ∈ w
(wRu ∧ x(u) = >)

Might

λw

λx.x is dead in w

is dead
λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w
∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >)
king
the
Composing the king, we get:

• ~the kingw = ~thew (~kingw )


• = (λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = >∧∀u(x(u) = > ↔ u = z)∧y(z) = >))(λx.x
is king in w)

• = λy(e, t) .∃z ∈ e((λx.x is a king in w)(z) = >∧∀u((λx.x is a king in w)(u) =


> ↔ u = z) ∧ y(z) = >)
• = λy(e, t) .∃z ∈ e((z is a king in w) = > ∧ ∀u((u is a king in w) = > ↔ u =
z) ∧ y(z) = >)
• = λy(e, t) .∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ y(z) = >)

So we have:

λx(w, t) .∃u ∈ w
(wRu ∧ x(u) = >)

Might

λw

λy(e, t) .∃z ∈ e(z is a king in w∧ λx.x is dead in w


∀u(u is a king in w ↔ u = z) ∧ y(z) = >)
is dead

λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w


∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >)
king
the
Next we compose the king with is dead:
• ~the king is deadw = ~the kingw (~is deadw )

• = (λy(e, t) .∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ y(z) =


>))(λx.x is dead in w)
• = ∃z ∈ e(z is a king in w∧∀u(u is a king in w ↔ u = z)∧(λx.x is dead in w)(z) =
>)

• = ∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ (z is dead in w) =


>)
• = ∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ z is dead in w)
So we now have:

λx(w, t) .∃u ∈ w
(wRu ∧ x(u) = >)

Might

λw ∃z ∈ e(z is a king in w
∧∀u(u is a king in w ↔ u = z) ∧ z is dead in w)

λy(e, t) .∃z ∈ e(z is a king in w∧ λx.x is dead in w


∀u(u is a king in w ↔ u = z) ∧ y(z) = >)
is dead

λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w


∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >)
king
the
The lambda abstractor λw then converts the collection of world-relativized semantic values of the form ∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ z is dead in w) into the function from worlds to truth values given by λw.∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ z is dead in w):

λx(w, t) .∃u ∈ w λw.∃z ∈ e(z is a king in w∧
(wRu ∧ x(u) = >) ∀u(u is a king in w ↔ u = z) ∧ z is dead in w)

Might

λw ∃z ∈ e(z is a king in w
∧∀u(u is a king in w ↔ u = z) ∧ z is dead in w)

λy(e, t) .∃z ∈ e(z is a king in w∧ λx.x is dead in w


∀u(u is a king in w ↔ u = z) ∧ y(z) = >)
is dead

λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w


∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >)
king
the
Finally, this (w, t) value combines with the semantic value of might (relative to
w) to give a final semantic value for the entire sentence (relative to w):

• ~Might λw the king is deadw = ~Mightw (~λw the king is deadw )


• = (λx(w, t) .∃u ∈ w(wRu∧x(u) = >))(λw.∃z ∈ e(z is a king in w∧∀u(u is a king in w ↔
u = z) ∧ z is dead in w))

• = ∃u ∈ w(wRu ∧ (λw.∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u =


z) ∧ z is dead in w))(u) = >)
• = ∃u ∈ w(wRu ∧ ∃z ∈ e(z is a king in u ∧ ∀y(y is a king in u ↔ y = z) ∧
z is dead in u) = >)

• = ∃u ∈ w(wRu ∧ ∃z ∈ e(z is a king in u ∧ ∀y(y is a king in u ↔ y = z) ∧


z is dead in u))
So our final and fully analyzed tree is:

∃u ∈ w(wRu ∧ ∃z ∈ e(z is a king in u∧
∀y(y is a king in u ↔ y = z) ∧ z is dead in u))

λx(w, t) .∃u ∈ w λw.∃z ∈ e(z is a king in w∧


(wRu ∧ x(u) = >) ∀u(u is a king in w ↔ u = z) ∧ z is dead in w)

Might

λw ∃z ∈ e(z is a king in w
∧∀u(u is a king in w ↔ u = z) ∧ z is dead in w)

λy(e, t) .∃z ∈ e(z is a king in w∧ λx.x is dead in w


∀u(u is a king in w ↔ u = z) ∧ y(z) = >)
is dead

λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w


∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >)
king
the
The final result is thus that Might(the king is dead) is true in a world w just
in case w has access to some world u such that whoever is the unique king in
u is dead in u. Notice that this makes the truth of Might(the king is dead)
depend on who is king and who is dead in the epistemically possible world
w – who is king and who is dead in the actual world doesn’t determine the
truth of the sentence, and in particular the person who is actually king can
be alive in every epistemically possible world even while Might(the king is
dead) is true. We are thus capturing the reading of Might(the king is dead)
that matches Scenario 2 above. We’ve given truth conditions that allow for
ignorance both about who is king and about whether that person, whoever it
is, is dead.

BR Modals and Scope Options

Scenario 1 and Scenario 2 above correspond to what look like two different
scope readings of Might(the king is dead). On one reading (the Scenario 2
reading), the modal might has scope over the definite description the king.

On this reading we first pick a world w and then second pick out whoever is
king in that world w, and then check whether that person is dead or alive in w.
On another reading (the Scenario 1 reading), the definite description the king
has scope over the modal might. On this reading, we first pick out whoever is
king (in the actual world), and then pick out a world w and see whether that
person is dead or alive in w.

The tree we considered above gives might scope over the king. (We can see this
from the structure of the tree, since might c-commands the king, but not vice
versa.) And that tree, as we’ve seen, produces truth conditions for Might(the
king is dead) that are suitable for Scenario 2. To capture the other scope op-
tion, we need a different tree. To get this other tree, we assume that the king
is able to move above might to the top of the tree, producing:

the king
λ1
might λw
x1 is dead

Problem 273: We might also want a tree for the other scope reading
that uses movement, so that the king moves out of the initial sen-
tence position, leaving a variable x1 , and the remaining variable is
then bound by the king:

might
λw

λ1
the king
x1 is dead
Give a full derivation of the semantic values for all nodes in this
tree, and show that the final result for the semantic value of the
entire sentence is the same as the semantic value we calculated in
the previous section.

We’ll now derive semantic values for this tree giving the king scope over
might, to see if the resulting truth conditions are suitable for Scenario 1. Be-
cause we are using variable binding, we need to relativize semantic values
both to worlds and to assignment functions. So we will calculate ~The king λ1
might λw x1 is deadw,g , for an arbitrary world w and assignment function g.

We start by adding lexical semantic values to the tree:

λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w λ1


∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >) λx(w, t) .∃u ∈ w
king
(wRu ∧ x(u) = >) λw
the
g(1) λx.x is dead in w
might
x1 is dead
The and king then compose as before, and x1 composes with is dead straightforwardly:

λy(e, t) .∃z ∈ e(z is a king in w∧


∀u(u is a king in w ↔ u = z) ∧ y(z) = >)
λ1
λx(w, t) .∃u ∈ w
λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w (wRu ∧ x(u) = >) λw g(1) is dead in w
∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >)
king might g(1) λx.x is dead in w
the
x1 is dead
λw then converts the world-relativized truth values of x1 is dead into a func-
tion from worlds to truth values:

λy(e, t) .∃z ∈ e(z is a king in w∧
∀u(u is a king in w ↔ u = z) ∧ y(z) = >)

λ1
λx(w, t) .∃u ∈ w λu.g(1) is dead in u
λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w (wRu ∧ x(u) = >)
∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >) λw g(1) is dead in w
king might
the
g(1) λx.x is dead

x1 is dead
Might then combines with this (w, t) function:

λy(e, t) .∃z ∈ e(z is a king in w∧


∀u(u is a king in w ↔ u = z) ∧ y(z) = >)
∃u ∈ w
λ1 (wRu ∧ g(1) is dead in u)

λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w


∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >) λx(w, t) .∃u ∈ w λu.g(1) is dead in u
king
(wRu ∧ x(u) = >)
the
λw g(1) is dead in w
might
g(1) λx.x is dead in w

x1 is dead
We then λ-abstract ∃u ∈ w(wRu ∧ g(1) is dead in u) in the x1 position to get
λx.∃u ∈ w(wRu ∧ g[x/1](1) is dead in u):

λy(e, t) .∃z ∈ e(z is a king in w∧ λx.∃u ∈ w(wRu∧
∀u(u is a king in w ↔ u = z) ∧ y(z) = >) g[x/1](1) is dead in u)

∃u ∈ w
λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w λ1 (wRu ∧ g(1) is dead in u)
∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >)
king
the λx(w, t) .∃u ∈ w λu.g(1) is dead in u
(wRu ∧ x(u) = >)
λw g(1) is dead in w
might
g(1) λx.x is dead in w

x1 is dead
This (e, t) function is then combined with the ((e, t), t) value of the king:
• ~the kingw,g (~λ1 might λw x1 is deadw,g )

• = (λy(e, t) .∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ y(z) =


>))(λx.∃u ∈ w(wRu ∧ g[x/1](1) is dead in u))
• = ∃z ∈ e(z is a king in w∧∀u(u is a king in w ↔ u = z)∧(λx.∃u ∈ w(wRu∧
g[x/1](1) is dead in u))(z) = >)

• = ∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ ∃u ∈ w(wRu ∧


g[z/1](1) is dead in u) = >)
• = ∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ ∃u ∈ w(wRu ∧
g[z/1](1) is dead in u))

• = ∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ ∃u ∈ w(wRu ∧


z is dead in u))
So our final tree is:

∃z ∈ e(z is a king in w ∧ ∀u(u is a king in w ↔ u = z) ∧ ∃u ∈ w(wRu ∧ z is dead in u))

λy(e, t) .∃z ∈ e(z is a king in w∧ λx.∃u ∈ w(wRu∧


∀u(u is a king in w ↔ u = z) ∧ y(z) = >) g[x/1](1) is dead in u)

∃u ∈ w
λx(e, t) .λy(e, t) .∃z ∈ e(x(z) = > λx.x is king in w λ1 (wRu ∧ g(1) is dead in u)
∧∀u(x(u) = > ↔ u = z) ∧ y(z) = >)
king
the λx(w, t) .∃u ∈ w λu.g(1) is dead in u
(wRu ∧ x(u) = >)
λw g(1) is dead in w
might
g(1) λx.x is dead in w

x1 is dead
Compare the results we got from the two trees:
1. When might scopes over the king, the final truth conditions are ∃u ∈
w(wRu∧∃z ∈ e(z is a king in u∧∀y(y is a king in u ↔ y = z)∧z is dead in u)).
2. When the king scopes over might, the final truth conditions are ∃z ∈
e(z is a king in w∧∀u(u is a king in w ↔ u = z)∧∃u ∈ w(wRu∧z is dead in u)).
The crucial difference between the two truth conditions is:
1. When might scopes over the king, we find an individual who is both king
and dead in some merely possible world that we reach via accessibility
from the (actual) world of assessment.
2. When the king scopes over might, we find an individual who is king in
the (actual) world of assessment, and then check whether that individual
is dead in some merely possible world that is accessible from the actual
world.
So when the king scopes over might, the truth conditions amount to the requirement that the person who is in fact king is such that he is dead in some possible world. This matches the reading that we get in Scenario 1.
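The contrast between the two scope readings can be checked directly in a toy model of Scenario 2 (an encoding of our own: Edward is actually king, Henry is dead in every epistemically possible world, and every world is epistemically accessible):

```python
# Toy model of the 1471 scenario (the encoding is ours). Each epistemically
# possible world settles who is king; who is dead is known for certain.
E = ["Henry", "Edward"]
u1 = {"king": {"Henry"},  "dead": {"Henry"}}   # Henry was king
u2 = {"king": {"Edward"}, "dead": {"Henry"}}   # Edward was king
W = [u1, u2]
w0 = u2                       # in fact, Edward was king at the killing
def R(w, u): return True      # epistemic accessibility: all worlds open

def is_king(w): return lambda x: x in w["king"]
def is_dead(w): return lambda x: x in w["dead"]

# [[the]]: a unique satisfier of f also satisfies g
def the(f):
    return lambda g: any(f(z) and all(f(y) == (y == z) for y in E) and g(z)
                         for z in E)

# Reading 1: might scopes over the king -- pick the king inside each world.
might_over_the = any(R(w0, u) and the(is_king(u))(is_dead(u)) for u in W)

# Reading 2: the king scopes over might -- pick the actual king first.
the_over_might = the(is_king(w0))(
    lambda z: any(R(w0, u) and is_dead(u)(z) for u in W))

print(might_over_the)  # True: in u1 the king (Henry) is dead in u1
print(the_over_might)  # False: the actual king (Edward) is alive everywhere
```

As the scenario demands, the narrow-scope reading of the king comes out true while the wide-scope reading comes out false, since the person who is in fact king is alive in every epistemically possible world.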

BS The Syntax of Modals

It’s time to address the fact that we’ve been cheating throughout this long
discussion of modals. We’ve been treating modals such as might and must as
if they acted on entire sentences, and thus have been making use of artificial
constructions such as:

• Might(the king is dead)
• Must(Socrates admires Plato)
But that’s not really how modals work in English. The real English sentences
are:
• The king might be dead
• Socrates must admire Plato
The king might be dead looks like it has a tree of the form:

the king might


be dead
Might can’t be of type ((w, t), t) in this tree, since it has scope only over be dead, which looks like a predicate of type (e, t).

We could try to build a fancier semantics for modals that made them some
kind of predicate modifier rather than sentential operator, and gave them some
type other than ((w, t), t). But instead we’ll suggest that English (and other natural
languages) has a more complicated syntax than we’ve been assuming. We
begin by separating inflection from the verb. To see the distinction, consider the
general availability of do-variants of sentences:
• The king rules the land // The king does rule the land
• Socrates taught Plato // Socrates did teach Plato
• Plato thought Aristotle refuted him // Plato did think Aristotle
refuted him
Notice that when do is inserted into the sentence, two things happen:
1. Any tense or person marking on the original verb disappears from that
verb.
2. The tense or person marking appears instead on do.
Thus in Socrates did teach Plato, the verb teach is not marked for third
person (it is not teaches) and is not marked for past tense (it is not taught). Do,
on the other hand, is marked for third person and past tense, and thus appears
as did.

This suggests that the verb itself can be separated from the inflection (tense and
person, at least) of the verb. Returning to our X-bar syntactic framework, we’ll
make four suggestions:

1. There is a category head I (for inflection)
2. The complement of I is VP.
3. The specifier of I is DP (the subject of the sentence)
4. The maximal projection IP of I is the entire sentence.
The tree for Socrates did teach Plato is thus:

IP

DP I

Socrates
I VP

did V DP

teach Plato
Next we notice that there are other constructions that force verbs to appear
without the normal tense and person inflection information:
• Socrates can read the book. [not reads]
• Plato will teach Aristotle. [not teaches]
• Aristotle wanted to open the door. [not opened]
• Alexander saw the barbarian die. [not dies or died]
We won’t try to deal here with all of the complications that arise in these cases.
(For example, we won’t touch the question of when the uninflected verb shows up simply without tense and number (teach) and when it shows up as an infinitive without tense and number (to open).) But we can at least suggest that
modals such as may, must, might, and can are in category I, so that we get trees
of the form:

IP

DP I

Socrates
I VP

can V DP

read D NP

the N

book

IP

DP I
D NP
I VP
the N V AP
might
king be A

dead
Notice that on this approach, the future tense will gets grouped with modals
like may and must. But past tense can’t be handled in quite the same way, or we
get unacceptable trees of the form:

IP

DP I

D NP
I VP
the N
-ed
V PP
linguist
walk P DP

across D NP

the N

street
That’s unfortunate. As a clever fix, we’ll propose that the supposedly unac-
ceptable tree is in fact acceptable, but that the past tense marker -ed moves to
a different position in the tree:

IP

DP I

D NP
I VP
the N

linguist V PP

walk-ed P DP

across D NP

the N

street
We then owe a story about why the past tense inflection marker -ed moves
downward but the future tense will, as well as modals like must, do not move
downward. But we won’t worry about those details here.

We still haven’t solved the central problem – we still have modals scoping over
something other than an entire sentence. To get modals in a position in which
they can be of type ((w, t), t), we need to consider the VP-Internal Subject Hypothesis. The VP-Internal Subject Hypothesis says that subjects of verbs are
produced as specifiers of the VP, rather than (as we’ve been doing) as specifiers
of the IP. Again, we won’t worry about syntactic evidence for the VP-Internal
Subject Hypothesis (although see the next problem). So we get the following
tree for Plato must teach Aristotle:

IP

I VP

must
DP V

Plato V DP

teach Aristotle
Notice that if we accept the VP-Internal Subject Hypothesis, the semantic type
for VP is t, which we can see if we add types to the tree:

IP

I VP

must t

DP V

e (e, t)

Plato
V DP

(e, (e, t)) e

teach Aristotle
But now we’re back where we started – we’ve got the modal scoping over the
entire sentence, but that structure doesn’t match normal English word order.
So we add another bit of movement. Here we require the subject to move to
the specifier of the IP phrase:

IP

DP I

e
I VP
Plato
must t

DP V

e (e, t)

?
V DP

(e, (e, t)) e

teach Aristotle
Now we’ve got the word order right. But some final adjustment is needed to
get all of the syntactic typing to work out. First, we need to say what is left
behind when Plato moves to the specifier of IP. The obvious suggestion is that
a (type e) variable remains. But then we’ll need that variable to be bound,
which means we’ll need to treat Plato as a variable binder. Fortunately, we
know how to do this, by treating names as generalized quantifiers of type ((e,

t), t). So we have:

IP

DP I

((e, t), t)
I VP
Plato
must t

DP V

e (e, t)

x1
V DP

(e, (e, t)) e

teach Aristotle
Second, we need some lambda-abstractors to prepare for (i) the variable bind-
ing by Plato and (ii) the world-binding by must. Adding these, we can give
full typing for the sentence:

IP

DP (e, t)

((e, t), t)
λ1 I
Plato
t

I (w, t)

((w, t), t) λw VP
must t

DP V

e (e, t)

x1
V DP

(e, (e, t)) e

teach Aristotle
(If we are treating Plato as a type ((e, t), t) generalized quantifier, we probably
ought to do the same with Aristotle. But the extra complication won’t help
with anything we want to do here (although it also wouldn’t mess anything
up), so we’ll leave Aristotle as type e.)

Let’s quickly check that everything works out now. Adding semantic values,
we have:

IP

DP (e, t)

((e, t), t)

λx(e, t) .x(Plato)
λ1 I
Plato
t

I (w, t)

((w, t), t)
λw VP
λx(w, t) .∀u ∈ w(wRu → x(u) = >)
t
must

DP V

e (e, t)

g(1)
V DP
x1
(e, (e, t)) e

λx.λy.y teaches x in w Aristotle

teach Aristotle
We then start composing semantic values:
• ~teaches Aristotlew,g = ~teachesw,g (~Aristotlew,g )
• = (λx.λy.y teaches x in w)(Aristotle)
• = λy.y teaches Aristotle in w

• ~x1 teaches Aristotlew,g = ~teaches Aristotlew,g (~x1 w,g )


• = (λy.y teaches Aristotle in w)(g(1))

• = g(1) teaches Aristotle in w
• ~λw x1 teaches Aristotlew,g = λu.g(1) teaches Aristotle in u
• ~must λw x1 teaches Aristotlew,g = ~mustw,g (~λw x1 teaches Aristotlew,g )
• = (λx(w, t) .∀u ∈ w(wRu → x(u) = >))(λu.g(1) teaches Aristotle in u)

• = ∀u ∈ w(wRu → (λu.g(1) teaches Aristotle in u)(u) = >)


• = ∀u ∈ w(wRu → g(1) teaches Aristotle in u)
• ~λ1 must λw x1 teaches Aristotlew,g = λx.∀u ∈ w(wRu → g[x/1](1) teaches Aristotle in u)
• ~Plato λ1 must λw x1 teaches Aristotlew,g = ~Platow,g (~λ1 must λw x1 teaches Aristotlew,g )
• = (λx(e, t) .x(Plato))(λx.∀u ∈ w(wRu → g[x/1](1) teaches Aristotle in u))

• = (λx.∀u ∈ w(wRu → g[x/1](1) teaches Aristotle in u))(Plato)


• = ∀u ∈ w(wRu → Plato teaches Aristotle in u)
So we get, as desired, that Plato must teach Aristotle is true relative to a
world w just in case Plato teaches Aristotle in every world accessible from w.
(Notice that, as usual, the relativization to the assignment function g washes
out in the end, so that the final semantic value isn’t sensitive to the choice of g.)
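The whole pipeline just verified can likewise be sketched in code, with names lifted to generalized quantifiers of type ((e, t), t) as in the text (the model itself is a stipulation of our own):

```python
# Toy model: a world is the set of (teacher, student) pairs that hold in it.
w1 = frozenset({("Plato", "Aristotle")})
w2 = frozenset({("Plato", "Aristotle"), ("Socrates", "Plato")})
W = [w1, w2]
def R(w, u): return True      # every world accessible from every world

# [[teach]]^w = lambda x: lambda y: y teaches x in w
def teach(w):
    return lambda x: lambda y: (y, x) in w

# Lift a name to a generalized quantifier: [[Plato]] = lambda f: f(Plato)
def lift(name):
    return lambda f: f(name)

# [[must]]^w = universal quantification over accessible worlds
def must(w):
    return lambda p: all(p(u) for u in W if R(w, u))

# Plato lambda-1 must lambda-w (x1 teach Aristotle):
plato = lift("Plato")
result = plato(lambda x: must(w1)(lambda u: teach(u)("Aristotle")(x)))
print(result)  # True: Plato teaches Aristotle in every accessible world
```

The lifted name binds the subject variable from outside the modal, while the λw abstract hands must a function from worlds to truth values, just as in the tree.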

So far, so good. But, while we’ve given a syntactic story that allows modals to
function as sentential operators, we don’t yet have everything we need to deal
with the scope ambiguity of The king might be dead. Right now we generate
only one tree for this sentence:

IP

DP
λ1 I
D NP

the N I
λw VP
king might
DP V

x1 V AP

be A

dead

This tree results from producing the subject the king as the specifier of the VP
(following the VP-Internal Subject Hypothesis), and then moving it, as before,
to the specifier of IP. But (without going through all the details) we then have
the king scoped above might, so we will pick out whoever is king in the actual
world, and then check, in some accessible world u, whether that person is dead
in u. That gives us only the Scenario 1 reading of our sentence.

Once we have added the distinction between IP and VP phrases and the VP-
Internal Subject Hypothesis, how can we get a tree producing the reading on
which the king takes scope under might? It’s tempting to think we could just
leave the subject the king in its original position as the specifier of VP, so that
it will be scoped under might in the I position. But if we leave the king as
specifier of the VP, we’re again left without an explanation of the word order
of the English sentence. So instead we suggest that there is a second phase of
movement. First the king moves from the specifier of VP to the specifier of IP.
And then might moves from its original I position to a higher I position:

IP

I (w, t)

((w, t), t)

might λw IP

DP (e, t)

((e, t), t) λ1 I

D NP t

((e, t), ((e, t), t)) N I VP


the (e, t) ∅ t
king
DP V

e (e, t)

x1 V AP

((e, t),(e, t)) A

be (e, t)

dead
The full picture of the derivation of the sentence is:
1. Initially, the subject the king is the specifier of the VP, and the modal
might is in the I position above the subject.
2. The subject then moves above might to the specifier of IP. The tree that
results from this movement is the tree that produces the visible form of
the sentence.
3. The modal might then moves above the (moved) subject to a higher I
position. The resulting final tree is the tree on which semantic processing

occurs.
(We owe a story about why the upward movement of the king is visible in
the final sentence on the page, but the upward movement of might above the
(moved) the king is not visible. But we’ll set aside that issue for now.)

Problem 274: Give a full semantic derivation for this tree, and confirm that the resulting truth conditions match those of Might(the king is dead), with the modal might having scope over the quantified noun phrase the king. What should we say about the semantic value of the empty node ∅ left by the movement of might to the higher I position?

Problem 275: Consider the sentence:


• Every king might kill some noble.
How many different trees for this sentence can you derive using the
general framework we’ve been developing? Pick one of the trees you
produce and do a full semantic derivation for that tree. Consider
the resulting truth conditions, and informally state a reading of the
original sentence that corresponds to those truth conditions.

BT Things That Might Not Have Been

Alexander the Great, flush with his victory over the Persian empire, announces:
• No army can defeat me.

Wanting to warn Alexander of the dangers of hubris, his friend Hephaestion


warns him:
• Some army could defeat you.
Alexander, offended, begins to run through armies and explain his superiority
to them. The army of the Aspasioi isn’t a threat; he can easily defeat the army
of the Guraeans; he has no fear of the army of the Assakenoi. Hephaestion
explains that he agrees with all of that, but wants to emphasize that Alexan-
der isn’t invincible. Even if Alexander can easily defeat all of the armies there
actually are, there surely could have been an army capable of defeating Alexander.

We can now give two different syntactic trees for Some army could defeat
Alexander, depending on whether the modal could moves above the (moved)
subject:

1. IP

DP
λ1 I
D NP

some N I
λw VP
army could
DP V

x1
V DP

defeat Alexander

2. IP

could λw IP

DP
λ1 IP
D NP
I VP
some N

army DP V

x1
V DP

defeat Alexander
We want a reading of Hephaestion’s claim on which the army that (possibly)
defeats Alexander is a non-actual army. So we don’t want the first tree, in
which some army scopes over could, since in that tree we pick an army in the
actual world, and then assess whether that army defeats Alexander in some
other world.

The second tree is better. It has some army scope under could, which allows us
to pick something that is an army only in the possible world under considera-
tion. The second tree produces the truth conditions:
• ∃u ∈ w(wRu ∧ ∃x ∈ e(x is an army in u and x defeats Alexander in u))

But there’s an important difference between the Scenario 2 reading we wanted
of The king might be dead and the reading we want of Some army could
defeat Alexander. In Scenario 2, we had two actual individuals – Edward IV
and Henry VI – but we wanted to consider both worlds in which Edward IV
was king and worlds in which Henry VI was king. It was important that we
pick out the possibly dead individual not by looking at who was actually king,
but rather by looking at who was king in various possible circumstances. But the
people who were kings in those possible circumstances were real people.

The case of Alexander and Hephaestion looks different. Perhaps, for example,
Hephaestion’s thought is this: the Etruscans don’t have an army, being a peace-
ful people. But they’re also a noble and determined people – had they formed
an army, it would have been a formidable one, capable of defeating Alexander.
What, in this scenario, is the member of type e which is, in some possible world,
an army that defeats Alexander? Not the army of the Etruscans – there is no
army of the Etruscans (although there could have been one), so that can’t be the
thing that is possibly an Alexander-defeating army. Perhaps it is the Etruscan
people? They are not an army, but perhaps they – the collection of them, that
very actually existing thing – could have been an army. But that might not
give us what we want. Perhaps the Etruscans are few in number, and they,
no matter how organized and trained, could never be an Alexander-defeating
army. But the Etruscans are long-range planners – had they chosen to confront
Alexander in the field of battle, they would first have had larger families for
many generations, producing many new Etruscans to form their mighty army.
That is indeed a situation in which there could have been an army capable of
defeating Alexander. But the actual thing that could have been that army isn’t
the Etruscan people, because that army isn’t composed of the actual Etruscans,
but rather of merely possible Etruscans.

In fact, there just isn’t anything suitable in type e, if we’re thinking of that as
the collection of all actual entities – nothing that could have been (but isn’t) an
army capable of defeating Alexander. The core issue here is that there could
have been things other than the things that actually are, so talking about what
might have been requires looking beyond the contents of type e.

