%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%	Notes from lecture 3 of Professor Gerald Penn's
%	CSC2519F: Natural Language Semantics course on September 26, 2007
%	as transcribed by jonathan lung, Scribe-for-a-Day.
%
%	These notes are provided as-is and there are no guarantees of correctness
%	or completeness.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentclass{article}
\usepackage{stmaryrd}
\newcommand{\interp}[1]{$\llbracket$\texttt{#1}$\rrbracket$}
\begin{document}
We describe higher-order semantics via higher-order syntax, and we describe higher-order syntax via higher-order logic [Henkin '50].

BasType = \{Bool, Ind\}.  Bool is usually written as `t' for truth, and \texttt{true}, \texttt{false} $\in Con_t$.  Ind is usually written as `e' for entity.  Things that take entities to truth values are called \textit{properties}.  For example, \texttt{happy} takes an entity and assigns it a truth value.  Adjectives, nouns, and intransitive verbs are all properties; properties belong to $Con_{e \mapsto t}$.  It may seem strange that these kinds of words belong to the same group.  There are many entities, such as \texttt{john}, \texttt{WTC}, and \texttt{the president of the United States} (ignoring changes over time).  When we do not ignore such changes, as with \texttt{the president of the United States}, we normally write \interp{president}$_{w,t}$, i.e., an interpretation relative to a context $(w,t)$.  We have a special name for expressions whose referent never changes across contexts: \textit{rigid designators}.  Rigid designators are part of $Con_e$.

Things of type $e \mapsto e$ (entities to entities) are called functions.  For example, consider the English phrase ``John's arm''.  We can write this in one of two ways:  \texttt{arm($x$) \& own($x$)(john)} or \texttt{arm(john)}.  The second form is rarely used.

Boolean connectives such as logical and ($\wedge$) and logical or ($\vee$) are members of $Con_{t \mapsto t \mapsto t}$, while the unary operator not ($\neg$) belongs to $Con_{t \mapsto t}$.  Equality, $eq_\tau$, is part of $Con_{\tau \mapsto \tau \mapsto t}$.  The description operator, $\iota_\tau \in Con_{(\tau \mapsto t) \mapsto \tau}$, takes a property of type $\tau \mapsto t$ and returns the single, unique individual that satisfies it.  There is disagreement about what happens when no individual, or more than one individual, satisfies the property.

\texttt{every}$_\tau \in Con_{(\tau \mapsto t) \mapsto t}$.  When we say $\forall x.\phi$, we basically mean \texttt{every}$_e(\lambda x . \widetilde \phi)$.  That is, the property $\lambda x . \widetilde \phi$ must hold of every value that $x$ can take.  Sometimes in class, we will write $\forall x.\phi$, but we really mean an application of \texttt{every}$_\tau$.  To be more explicit about \texttt{every}, \interp{every$_\tau$}$(P) = \left\{ \begin{array}{l}
	\top \textrm{\ if\ } P(a)=\top \textrm{\ for all\ } a \in Dom_\tau ,\\
	\bot \textrm{\ otherwise}.
\end{array}
\right .$

Similarly, \\
\interp{eq$_\tau$}$(a)(b) = \left\{ \begin{array}{l}
	\top \textrm{\ if\ } a = b,\\
	\bot \textrm{\ if\ } a \ne b.
\end{array}
\right .$ \\
and, e.g., \\
\interp{run}$_{w,t} = \left\{ \begin{array}{l}
	\mathtt{john} \mapsto \top. \\
	\mathtt{mary} \mapsto \bot. \\
	\mathtt{bill} \mapsto \bot.
\end{array}
\right .$; note that in this case, $t$ denotes time, though this is usually included in $w$.
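
As a small worked example, assume (purely for illustration) that $Dom_e = \{\mathtt{john}, \mathtt{mary}, \mathtt{bill}\}$, that distinct names denote distinct individuals, and that \interp{run}$_{w,t}$ is the function given above.  Reading equality as true exactly when its two arguments are the same individual, the interpretations can then be evaluated directly:
\[
\begin{array}{rcll}
\llbracket \mathtt{every}_e \rrbracket (\llbracket \mathtt{run} \rrbracket_{w,t}) & = & \bot & \textrm{since\ } \llbracket \mathtt{run} \rrbracket_{w,t}(\mathtt{mary}) = \bot, \\
\llbracket \mathtt{eq}_e \rrbracket (\mathtt{john})(\mathtt{john}) & = & \top & \textrm{since the two arguments are the same individual}, \\
\llbracket \mathtt{eq}_e \rrbracket (\mathtt{john})(\mathtt{mary}) & = & \bot & \textrm{since the two arguments are different individuals}.
\end{array}
\]
A single counterexample such as \texttt{mary} is enough to make \interp{every$_e$} return $\bot$.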

Everything in $Con_{(\ldots) \mapsto t}$ whose argument is itself a property is called a generalized quantifier (a property of properties) and acts as a noun phrase [Mostowski '57].  All the aforementioned things are called logical constants.


We can create scoping effects in our formulae:  \texttt{every}$(\lambda x.$\texttt{some}$(\lambda y.$\texttt{love}$(y)(x)))$.  This means ``everybody loves somebody''.  But that English sentence can also be written as \texttt{some}$(\lambda y.$\texttt{every}$(\lambda x.$\texttt{love}$(y)(x)))$.  This has a different meaning from the first: it says that there is one particular person whom everybody loves.  An example of this kind of reading comes from \textit{Saturday Night Live}.  Paraphrased, ``A man is mugged every minute in New York City.  We are interviewing him tonight.''  Some languages use case (e.g., nominative and accusative) to resolve ambiguity.
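
To see that the two formulae really differ, consider a small hypothetical model (an assumption made only for this illustration) with $Dom_e = \{a, b\}$, in which $a$ loves only $b$ and $b$ loves only $a$.  That is, \texttt{love}$(b)(a) = \top$, \texttt{love}$(a)(b) = \top$, and \texttt{love}$(a)(a) = $ \texttt{love}$(b)(b) = \bot$.  Reading \texttt{some}$(P)$ as true exactly when $P(c) = \top$ for at least one $c \in Dom_e$, we get:
\[
\begin{array}{rcl}
\mathtt{every}(\lambda x.\mathtt{some}(\lambda y.\mathtt{love}(y)(x))) & = & \top, \textrm{\ since\ } \mathtt{love}(b)(a) = \top \textrm{\ and\ } \mathtt{love}(a)(b) = \top; \\
\mathtt{some}(\lambda y.\mathtt{every}(\lambda x.\mathtt{love}(y)(x))) & = & \bot, \textrm{\ since\ } \mathtt{love}(a)(a) = \bot \textrm{\ and\ } \mathtt{love}(b)(b) = \bot.
\end{array}
\]
In this model everybody loves somebody, but there is no single person whom everybody loves, so the two scopings cannot be equivalent.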

There is another, related, ambiguity: the collective / distributive / iterative ambiguity problem.  Consider the sentence ``three men carried a piano.''  The preferred reading is that there are three men and one piano (the \textit{collective} reading).  However, it could also be read as saying that there are three men, each of whom carried a piano.  On the other hand, if we substituted ``light bulb'' for ``piano'', the preferred reading would be three men, three light bulbs (the \textit{distributive} reading).  If the sentence used the plural, ``light bulbs'', then we would also have \textit{iterative} ambiguity, as we would be uncertain whether each man carried more than one light bulb, one after another.  Resolving these types of ambiguity is hard, as it requires knowledge about the real world.  Quantifier ambiguity, on the other hand, does not, so it is easier to resolve.  In fact, we may not need inference: we can use information from surrounding sentences to resolve underspecified semantics.  Such sentences may contain words such as ``every'' or ``some''; we can also look for any asymmetries that exist.

Let us now look at the sentence ``John ran.''  We can denote this as \texttt{run}$($\texttt{john}$)$.  The sentence ``some boy ran'' can be written as $\exists x.boy(x) \& run(x)$ or as\linebreak \texttt{some}$(\lambda x.$\texttt{boy}$(x)\&$\texttt{run}$(x))$ or \texttt{some$'_e($boy$)($run$)$}.  What is the last \texttt{some}?\linebreak \texttt{some$'_\tau$} $\in Con_{(\tau \mapsto t) \mapsto (\tau \mapsto t) \mapsto t}$ takes two properties and combines them with logical and:  \texttt{some$'_\tau$}$(P)(Q)$ abbreviates \texttt{some}$_\tau(\lambda x.P(x) \& Q(x))$.  Similarly, \texttt{every$'_\tau$} combines its two arguments with an implication rather than a logical and.  For example, ``every boy ran'' is $\forall x . boy(x) \rightarrow run(x)$.  In this case, we do not want to say anything about \emph{all} $x$s, but only about those that are boys.  Note that these primed ($'$ed) versions do not contain $\lambda$.  This is because we have already applied $\eta$-reduction: $\lambda x.$\texttt{boy}$(x)$ reduces to \texttt{boy}.
\texttt{some$'_e$} and \texttt{every$'_\tau$} are called \textit{generalized determiners}.  These are not adjectives, even though they appear to describe the noun.  They are in fact determiners.  You can only have one determiner per noun phrase.  In all natural human languages, determiners are either at the beginning of sentences or are affixes of words.
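
The relationship between the primed and unprimed constants can be spelled out step by step.  The following is a sketch that assumes the definitions \texttt{some$'_e$}$(P)(Q) = $ \texttt{some}$_e(\lambda x.P(x) \& Q(x))$ and \texttt{every$'_e$}$(P)(Q) = $ \texttt{every}$_e(\lambda x.P(x) \rightarrow Q(x))$.  These two equations are our reading of the description above, not something stated verbatim in the lecture:
\[
\begin{array}{rcl}
\mathtt{some}'_e(\lambda x.\mathtt{boy}(x))(\lambda x.\mathtt{run}(x)) & = & \mathtt{some}'_e(\mathtt{boy})(\mathtt{run}) \quad \textrm{(by\ } \eta \textrm{-reduction)} \\
 & = & \mathtt{some}_e(\lambda x.\mathtt{boy}(x) \ \&\ \mathtt{run}(x)) \\
 & = & \exists x.\mathtt{boy}(x) \ \&\ \mathtt{run}(x), \\
\mathtt{every}'_e(\mathtt{boy})(\mathtt{run}) & = & \mathtt{every}_e(\lambda x.\mathtt{boy}(x) \rightarrow \mathtt{run}(x)) \\
 & = & \forall x.\mathtt{boy}(x) \rightarrow \mathtt{run}(x).
\end{array}
\]
The first argument (\texttt{boy}) restricts the quantifier and the second (\texttt{run}) is what is asserted of the restricted individuals.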

``\underline{The} boy ran'' conveys both existence and uniqueness.  This is written as \texttt{run$(\iota_e($boy$))$}.  ``\underline{No} boy ran'' is written using the negative determiner:\linebreak \texttt{$\forall x.$boy$(x) \rightarrow \neg $run$(x)$} or as \texttt{every$'_e($boy$)(\lambda x.\neg$run$(x))$}.  This view of generalized quantifiers is due to the work of [Russell 1905].  In this view, ``ran'' is the operator of the sentence.  Each verb can be thought of as a function that requires an actor and a recipient, even if they are not explicitly stated.  We cannot have sentences without verbs.  Sentences are fully satisfied verbal phrases; that is, all the arguments of the verb are satisfied.

This leads to the question, ``Why are $\exists$ and $\forall$ sometimes in front but sometimes not present?''  This question was answered by [Montague '73] with the \textit{proper treatment of quantification} (PTQ).  We want \texttt{$Q($run$)$} to be equal to \texttt{run$($john$)$}.  We have a solution if we let $Q$ be \texttt{$\lambda P.P($john$)$}.  This term picks out the properties that are true of \texttt{john}.  Its type is $(e \mapsto t) \mapsto t$.  In philosophy, this leads to the ``identity of indiscernibles'' (Leibniz).  The problem is distinguishing things of which the same set of properties is true.  More generally, $[\forall P. P(x) = P(y)] \rightarrow x = y$:  we assume that if two objects satisfy exactly the same properties, they are the same object.  \texttt{john} used to be an entity, of type $e$.  Using this idea, we can now treat it as having type $(e \mapsto t) \mapsto t$.  This is called type-raising.  However, we normally use rigid designators as a shortcut.  So when we write \texttt{run$($john$)$}, we mean \texttt{$(\lambda P.P($john$))($run$)$}; the two are related by $\beta$-reduction.  We can look at generalized quantifiers set-theoretically:

\interp{every$'$}$(P)(Q)$ is $P \subseteq Q$.

\interp{some$'$}$(P)(Q)$ is $P \cap Q \neq \emptyset$.

\interp{no$'$}$(P)(Q)$ is $P \cap Q = \emptyset$.

\interp{three$'$}$(P)(Q)$ is $|P \cap Q| = 3$.

\interp{the$'$}$(P)(Q)$ is $P \cap Q \neq \emptyset$ and $|P| = 1$.
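
These conditions can be checked on a concrete case.  As a sketch, assume hypothetical extensions $P = $ \interp{boy} $ = \{\mathtt{john}, \mathtt{bill}\}$ and $Q = $ \interp{run} $ = \{\mathtt{john}\}$ (the extension of \texttt{boy} is invented for this example), so that $P \cap Q = \{\mathtt{john}\}$ and $|P| = 2$:
\[
\begin{array}{lcll}
\llbracket \mathtt{every}' \rrbracket (P)(Q) & = & \bot & \textrm{since\ } \mathtt{bill} \in P \textrm{\ but\ } \mathtt{bill} \notin Q, \\
\llbracket \mathtt{some}' \rrbracket (P)(Q) & = & \top & \textrm{since\ } P \cap Q = \{\mathtt{john}\} \neq \emptyset, \\
\llbracket \mathtt{no}' \rrbracket (P)(Q) & = & \bot & \textrm{since\ } P \cap Q \neq \emptyset, \\
\llbracket \mathtt{three}' \rrbracket (P)(Q) & = & \bot & \textrm{since\ } |P \cap Q| = 1 \neq 3, \\
\llbracket \mathtt{the}' \rrbracket (P)(Q) & = & \bot & \textrm{since\ } |P| = 2 \neq 1.
\end{array}
\]
So in this model ``some boy ran'' is true, while ``every boy ran'', ``no boy ran'', ``three boys ran'', and ``the boy ran'' are not.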

There has been discussion on how to turn other generalized quantifiers like ``some'', ``few'', and ``most'' into set-theoretic representations.
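
The type-raising step for \texttt{john} can also be written out as an explicit reduction.  The sketch below only uses the term $\lambda P.P(\mathtt{john})$ introduced earlier; the arrow label $\rightarrow_\beta$ is our own notation:
\[
(\lambda P.P(\mathtt{john}))(\mathtt{run}) \;\rightarrow_\beta\; \mathtt{run}(\mathtt{john}).
\]
The type-raised \texttt{john} and a quantified noun phrase such as \texttt{some$'_e($boy$)$} then both have type $(e \mapsto t) \mapsto t$, so ``John ran'' and ``some boy ran'' are built in the same way:  the noun phrase is applied to \texttt{run}.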

An important consideration in natural language is determining what applies to what.  Because of that, word order is very important.  E.g., when looking at the sentence ``Mary sees John'', we need to make sure that we know who is looking at whom.  In this example, we can write \texttt{sees$($john$)($mary$)$}.  \texttt{john} is the \textit{internal argument} while \texttt{mary} is the \textit{external argument}.

Classical logic is insensitive to order and multiplicity:  in $\frac{P\ P\ P \rightarrow Q}{Q}$, for example, it does not matter that $P$ appears twice, or where it appears.  In natural language syntax, multiplicity and order \emph{do} matter.  ``John sees John'' usually implies that there are two Johns.  When multiplicity and order (the latter also known as exchange or permutation) matter, we call the logic \textit{resource order logic} or \textit{linear order logic}.  These logics cannot change multiplicity or order.  In these logics, we only need and ($\wedge$) and implication ($\rightarrow$);  or ($\vee$) can be created using these two.  However, we need two different kinds of implication: \textit{forward looking} (/) and \textit{backward looking} ($\backslash$).  In $P/Q$, we say ``$P$ is forward looking for $Q$'' and a similar version exists for $P\backslash Q$.

A \textit{categorial grammar} is a tuple $G = \langle \Sigma, \tau, BasCat, s \rangle$.  $\Sigma$ is the lexicon, $s \in BasCat$, and $\tau$ (the type assignment) is a function $\tau: \Sigma \mapsto \mathcal{P}(Cat)$.  $Cat$ is the smallest set such that $BasCat \subseteq Cat$ and $(p, q \in Cat) \rightarrow (q/p, p\backslash q \in Cat)$.  The rules of a categorial grammar are given by the internal structure of its categories, i.e., by $q/p$ and $p\backslash q$.  The language of $G$ is $\mathcal{L}(G) = \{w_1 \ldots w_n \in \Sigma ^ * \mid  a_1 \ldots a_n \vdash s, a_i \in \tau(w_i), 1 \le i \le n\}$.

We have three systems of categorial grammars.  System 1:  $Q$ seeking $P$ to the right ($Q/P$), with $P$ to the right:  $\frac{Q/P\ P}{Q}/e$.  $Q$ seeking $P$ to the left ($P\backslash Q$), with $P$ to the left:  $\frac{P\ P\backslash Q}{Q}\backslash e$.  When we say $a_1 \ldots a_n$ \textit{derives} ($\vdash$) $s$, we mean that \linebreak $\begin{array}{cccl}
\bar{a_1} & \ldots & \bar{a_n} & \mathrm{(in\ that\ order)}\\
 & \vdots &  & \\
 & \bar{s} & &
\end{array}$.  While $a_1$ through $a_n$ must be applied in order, that does not mean we cannot reuse them.  This system goes by three names:  \textit{applicative categorial grammar}, \textit{Ajdukiewicz  categorial grammar}, and \textit{BGS categorial grammar}.  BGS stands for Bar-Hillel, Gaifman, and Shamir.  We shall deal with the other two systems of categorial grammars next week.
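
As an illustration of the two elimination rules, here is a sketch of a derivation for ``Mary sees John''.  The lexicon is hypothetical and was not given in the lecture:  we assume $BasCat = \{np, s\}$, $\tau(\mathtt{mary}) = \tau(\mathtt{john}) = \{np\}$, and $\tau(\mathtt{sees}) = \{(np\backslash s)/np\}$, writing $P\backslash Q$ for a category that seeks a $P$ to its left and yields a $Q$.  The categories of the three words, in order, are $np$, $(np\backslash s)/np$, and $np$:
\[
\frac{\displaystyle np \qquad \frac{(np\backslash s)/np \qquad np}{np\backslash s}/e}{s}\backslash e
\]
The verb first combines with \texttt{john} on its right by $/e$, giving $np\backslash s$, and the result then combines with \texttt{mary} on its left by $\backslash e$, giving $s$.  Hence $np\ \ (np\backslash s)/np\ \ np \vdash s$, and the string is in $\mathcal{L}(G)$ under these assumptions.  The order of combination matches \texttt{sees$($john$)($mary$)$}, where the internal argument is supplied before the external one.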
\end{document}