Department of Computer Science
University of Toronto
CSC485 / 2501 Introduction to Computational Linguistics,
2000
Tutorial notes number 4:
Project Stage 3 --- Semantics;
Building semantic structures compositionally
One of the better ways to approach this stage of the project is to use
a compositional semantics. As it says in your lecture notes, using a compositional
semantics allows you to build your semantic representation in tandem with
the syntactic representation.
So, for each grammar rule you will build a small part of the overall
semantic structure, the same way that you build a small part of the overall
parse tree.
What you should use
Gulp feature structures are ideal for building your semantic representation.
There are two main reasons for this:
- Order doesn't matter. If your frame representations have the same gross
  structure (i.e., things are generally at the same level), then they can
  still be unified.
- You don't need to specify everything. If there is some part of a frame
  structure that is not specified in your representation, unification will
  take care of filling things in.
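To make these two properties concrete, here is a minimal sketch in plain Prolog that does not use Gulp at all. It models a feature structure as a list of Feature-Value pairs; the predicate names unify_fs/3 and unify_value/3, the pair-list encoding, and the fs(...) wrapper for nested structures are all assumptions made for illustration. With Gulp loaded you would simply write = between two feature:value terms.

```prolog
% A hypothetical stand-in for Gulp unification (not Gulp's own code).
% A feature structure is a list of Feature-Value pairs; a nested
% structure is wrapped as fs(List).

% unify_fs(+A, +B, -C): C holds every feature of A and of B. A feature
% that occurs in both must have unifiable values.
unify_fs([], B, B).
unify_fs([F-V1|RestA], B, [F-V|RestC]) :-
    (   selectchk(F-V2, B, RestB)      % feature also occurs in B
    ->  unify_value(V1, V2, V)
    ;   V = V1, RestB = B              % feature only in A: copy it over
    ),
    unify_fs(RestA, RestB, RestC).

% Nested feature structures are unified recursively; anything else
% falls back on ordinary Prolog unification.
unify_value(V1, V2, V) :-
    (   nonvar(V1), nonvar(V2), V1 = fs(L1), V2 = fs(L2)
    ->  unify_fs(L1, L2, L), V = fs(L)
    ;   V1 = V2, V = V1
    ).
```

For example, the query unify_fs([frametype-person, name-steve], [gender-male, frametype-person], FS) succeeds even though the features are listed in a different order and neither input is complete, while unify_fs([frametype-person], [frametype-book], FS) fails because the two values clash.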
A frame representation
When we use Gulp structures, we are essentially using a frame-based
representation something like the following:
frametype: action
type: give
agent: frametype: person
       gender: male
       name: steve
direct obj: frametype: book
            colour: blue
            subject: mathematics
recipient: frametype: person
           gender: female
           name: lisa
This semantic structure could represent the sentence:
Steve gave Lisa the blue math book.
As a bonus, if you build your parser correctly (using the information
from the verb to build the slots), the same semantic structure will come
from the sentences:
Steve gave the blue math book to Lisa.
Lisa was given the blue math book by Steve.
The blue math book was given to Lisa by Steve.
The blue math book was given by Steve to Lisa.
In fact, your semantic representation should have this nice property,
as it will reduce the complexity of your retrieval engine.
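As a rough illustration of this property, the following toy grammar (plain Prolog DCG, no Gulp) maps four surface forms onto one and the same term. Everything in it is hypothetical: the sem/4 term stands in for the frame above, and the rule and word lists are far smaller than anything your project parser will have.

```prolog
% Toy grammar, not the course parser: active and passive forms of
% "give" all produce sem(give, Agent, Recipient, Object).

sentence(sem(give, Agent, Recipient, Object)) -->   % Steve gave Lisa the book
    person(Agent), [gave], person(Recipient), thing(Object).
sentence(sem(give, Agent, Recipient, Object)) -->   % Steve gave the book to Lisa
    person(Agent), [gave], thing(Object), [to], person(Recipient).
sentence(sem(give, Agent, Recipient, Object)) -->   % Lisa was given the book by Steve
    person(Recipient), [was, given], thing(Object), [by], person(Agent).
sentence(sem(give, Agent, Recipient, Object)) -->   % The book was given to Lisa by Steve
    thing(Object), [was, given, to], person(Recipient), [by], person(Agent).

person(steve) --> [steve].
person(lisa)  --> [lisa].
thing(book)   --> [the, book].
```

Calling phrase(sentence(S), [steve, gave, lisa, the, book]) and phrase(sentence(S), [lisa, was, given, the, book, by, steve]) both bind S to sem(give, steve, lisa, book), so a retrieval engine only ever has to deal with one shape of query.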
Where to start
The base frame representations will come out of the lexicon. The type
of each frame is determined by the lexical entry for the "head" of the
clause. For example, you may have the following lexical entry for dog:
lex_noun(dog, dogs, count).
You can extend this with one more argument for the semantic structure:
lex_noun(dog, dogs, count,
frametype:animal..type:dog).
or, if you don't mind having a lot of frame types:
lex_noun(dog, dogs, count,
frametype:dog).
For proper names, you could have something like:
lex_pnoun(steve, masculine,
frametype:person..name:steve).
For the verbs you probably need something like:
lex_verb(buy, buys, bought, buying,
bought, [none, np, np_np], frametype:action..type:buy).
How to divide your lexicon into frametypes depends entirely on your
domain and what kinds of things you want to do with it. Your frametypes
need to do two complementary things: 1) provide distinctions between concepts
that need to be distinct, and 2) hide distinctions between concepts that
your domain doesn't distinguish.
Other kinds of words won't define the basic frames, but just slot-filler
pairs. For example, an adjective would look something like:
lex_adj(blue, bluer, bluest,
colour:blue).
Adding detail
Once you have the basic frame from the lexicon, you can fill it in during
parsing. For example, some of your grammar rules may look like:
s( s_assert(NP,VP), Features, Gap, Semantics) -->
    {Features = mood:assertion..whphrase:no},
    np( NP, Features, nogap, NPSem),
    vp( VP, Features, Gap, VPSem),
    % The sentence semantics is the VP semantics with the subject as agent.
    {Semantics = agent:NPSem,
     Semantics = VPSem}.
np(np(NPTree, ClauseTree), NPFeatures, nogap, NPSem) -->
    np_Determiner(NPTree, NPFeatures, NPSem),
    np_Clause(ClauseTree, ClauseFeatures, ClauseSem,
              gap(np, NPTree, NPFeatures, NPSem)),
    {NPFeatures = restriction:(ClauseFeatures),
     NPSem = restriction:(ClauseSem)}.
% Each adjective unifies its slot-filler pair into the shared NPSem.
np_Adjectives([NewAdjective | MoreAdjectives], Noun, NPFeatures, NPSem) -->
    adjective(NewAdjective, NPFeatures, NPSem),
    np_Adjectives(MoreAdjectives, Noun, NPFeatures, NPSem).
np_Adjectives([], Noun, NPFeatures, NPSemantics) -->
    np_Noun(Noun, NPFeatures, NPSemantics).
np_Noun(Noun, NPFeatures, NPSemantics) -->
    noun(Noun, NPFeatures, NPSemantics).
What we're doing here is building up a collection of semantic features
at the same time as we build our syntactic tree and syntactic features.
By nesting feature lists as properties of other features, we get a treelike
structure for our semantics just as we do for our syntax.
To parse an assertion, we parse a noun phrase that has a semantic representation
NPSem, and a verb phrase with a semantic representation VPSem.
NPSem is built by the np rule, which just passes the feature structure to the
various rules to fill things in. For most NPs, the semantic information will be
built solely out of the lexicon.
VPSem is built by the vp rule. This is a little more complicated, as we want to
be able to assign roles to the various NPs and PPs in the verb phrase. Already in
the syntax phase of the project, we've paid some attention to roles; most
implementations identify nps and pps with features like dobj_complement, and
values like agent, source, or recipient. Notice that the feature structure
VPSemantics is passed into the complementlist rule. This feature will be passed
to the vpcomplement rule, which will build our embedded structure as shown above.
Once NPSem and VPSem have been built, we complete the semantic structure of the
sentence by making NPSem the semantic structure for the agent.
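The way a single semantic structure is shared between the adjective and noun rules can be tried out without Gulp. The sketch below is an assumption-laden stand-in: np_sem(Frametype, Colour, Size) is a hypothetical fixed-slot term playing the part of a Gulp feature structure, and the tiny lexicon is invented for the example.

```prolog
% np_sem(Frametype, Colour, Size) is a hypothetical fixed-slot stand-in
% for a Gulp feature structure. Every rule receives the same Sem, and
% each word fills in only the slot it knows about.

np(Sem) --> [the], adjectives(Sem).

adjectives(Sem) --> adjective(Sem), adjectives(Sem).
adjectives(Sem) --> noun(Sem).

adjective(np_sem(_, blue, _)) --> [blue].
adjective(np_sem(_, red, _))  --> [red].
adjective(np_sem(_, _, big))  --> [big].

noun(np_sem(book, _, _)) --> [book].
```

With these rules, phrase(np(S), [the, big, blue, book]) binds S to np_sem(book, blue, big), whereas phrase(np(S), [the, red, blue, book]) fails because two adjectives try to fill the colour slot with different values.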
Things to watch out for
The above is a relatively simple example, although it actually will
work, if you want to use it. In fact, for the most part, the semantics
is just as easy as this. The hardest part will be adding the semantics
to your lexical entries. There are, however, a few harder cases:
- Passive sentences. Don't forget that the agent and patient roles
  are reversed in passive sentences. If you are using a lexicon based on
  the one that we provided, the easiest way to handle this is to pass the
  subject role out of the verb phrase up to the sentence level where it can
  be used to tag the subject NP.
- Relative clauses. NPs can have embedded sentences. These sentences
  have their own semantic structures. You need to decide what sort of features
  these things should fill and how to use them later on.
- Prepositional phrases. The parser you started with and its lexicon
  contain roles for most kinds of prepositional phrases, so you can use
  these to tag prepositional phrases that modify verbs or nouns.
The roles of the various nps and pps in the semantic structure can be very
difficult to determine - that's one reason we do the project with a limited
domain. This way, we diminish problems from ambiguity. For instance, "with
x" can mean that x is instrumental (e.g. I hit the nail with a hammer),
or accompanying (e.g. never take Aspirin with Coke), or even have a more
domain-specific meaning (e.g. Wayne Gretzky is with the New York Rangers).
Within your narrow domain, this type of problem should not arise too often.
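The passive-sentence advice above (pass the subject role out of the verb phrase and use it at the sentence level) might be sketched as follows. This is plain Prolog rather than Gulp, and the names s/3, vp/4, role_slot/3, verb/3 and the sem(Type, Agent, Patient) term are all hypothetical, chosen only to keep the example short.

```prolog
% The vp rule reports which role its subject plays (SubjRole); the s
% rule then files the subject NP under that role.

s(Sem) --> np(Subj), vp(SubjRole, Sem), { role_slot(SubjRole, Sem, Subj) }.

vp(agent, sem(Type, _Agent, Patient)) -->       % active: subject is the agent
    [V], { verb(V, _, Type) }, np(Patient).
vp(patient, sem(Type, Agent, _Patient)) -->     % passive: subject is the patient
    [was, V], { verb(_, V, Type) }, [by], np(Agent).

% role_slot(Role, Sem, Filler): Filler occupies slot Role of Sem.
role_slot(agent,   sem(_, Agent, _),   Agent).
role_slot(patient, sem(_, _, Patient), Patient).

% verb(PastTense, PastParticiple, Type)
verb(wrote, written, write).

np(lisa) --> [lisa].
np(book) --> [the, book].
```

Both phrase(s(S), [lisa, wrote, the, book]) and phrase(s(S), [the, book, was, written, by, lisa]) bind S to sem(write, lisa, book); the only difference is the role that the vp rule hands back for the subject.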
Keep in mind
Don't forget that the reason that you're building these semantic structures
is so that you can do retrieval from a database of facts! Try to keep the
representation simple, and try to keep the number of possible representations
for the same semantic content as low as possible (1 is, of course, the
ideal number).
What you need to write
Part of the third stage of the project is a database of facts about
your domain and the code to retrieve these facts from the database (i.e.,
your semantic interpreter). This is relatively straightforward in Prolog
using Gulp feature structures.
How big should your database be? Big enough so that we can ask a few
different kinds of questions and get different objects back, but not so
big that you spend all of your time for this stage entering facts.
If you have to decide between adding five new things to a database and
allowing a different kind of question, add the question.
These notes will show one way that such a database and interpreter can
be written. This is not by any means the only way.
The Database
Your database should consist of objects in your domain. For example,
if your domain is a bookstore, then you probably want to have objects for
(at least) books and authors:
frametype: object
type: book
title: [all, about, figs]
author: frametype: person
        name: [joe, foo]
subject: type: figs
genre: non-fiction
Inside of a Prolog file, you can collect all of your objects together
as database facts:
database(frametype:object..type:book..title:[all,about,figs]..
         author:(frametype:person..name:[joe,foo])..
         subject:(type:figs)..genre:non-fiction).
database(frametype:object..type:book..title:[how,the,east,was,won]..
         author:(frametype:person..name:[bob,bar])..
         subject:(type:history)..genre:fiction).
database(frametype:person..profession:author..
         name:[joe,foo]..gender:male..age:37).
database(frametype:person..profession:author..
         name:[bob,bar]..gender:male..age:22).
database(frametype:person..profession:author..
         name:[lisa,baz]..gender:female..age:46).
database(frametype:person..profession:editor..
         name:[larry,quux]..gender:male..age:50).
When should I build the database?
You should be able to decide pretty early on (i.e., do it soon!) what
kinds of objects you want to have in your database. The question of exactly
what feature structures to use to represent these objects should probably
be delayed until you've had some experience adding the semantics to your
parser.
It's a good idea to have as much commonality as possible between the
feature structures that you're storing as your database and the feature
structures that your parser is generating for the semantics. By doing this,
you limit the amount of work that you will have to put into building the
semantic interpretation rules.
So, you should try to use the same feature names for the same things
in both your database and your semantic representation. Furthermore, things
that are named by the same feature name should have the same kinds of values
in your database.
Interpreting semantic representations
If we decide that our database will contain objects in our domain, then
(essentially) what we're doing is representing the objects as "super-NPs".
That is, their structure will be similar to (although probably more detailed
than) the structures for NPs produced by our parser.
You will recall, however, that the semantic representations that we
discussed were centered around the verb phrase. For example, in our book
domain, we may have the following representation for a sentence:
frametype: action
type: write
agent: frametype: person
       name: N
patient: frametype: object
         type: book
This feature structure would represent a question such as Which people
wrote books?
The process of semantic interpretation then becomes mapping the components
of this feature structure to a partially specified feature structure that
we can use to query our database.
The interp rule
We can build the mapping between the components of the semantic representation
and the database objects using Prolog rules. Essentially, you'll be building
a rule for each configuration of the verb that you want to be able to handle.
This does imply that you'll be writing a lot of rules, but once you
get the first few under your belt, the rest should take less time. Also,
if you're clever about it, you can make one rule cover more than one case.
Let's look at a couple of interpretation rules:
% Handles "author X writes book Y".
interp(frametype:action..type:write, SemFeat, Frame) :-
    SemFeat = agent:A..patient:P,
    P = frametype:object..type:book..title:T,
    A = frametype:person..name:N,
    Frame = frametype:object..type:book..author:A..title:T,
    database(Frame).
% Handles "list authors with properties X".
interp(frametype:action..type:list, SemFeat, Frame) :-
    SemFeat = patient:P,
    P = frametype:person..profession:author..gender:G,
    Frame = frametype:person..profession:author..gender:G..name:N,
    database(Frame).
The first thing to notice is that we pass the semantic representation
as both the first and second arguments. We do this to take advantage
of unification so that we don't have to actually run rules whose verbs
don't match.
Inside of these rules we are basically breaking apart the semantic representation
and taking the pieces of it that we need to build something resembling
a database feature structure.
A walk-through
Let's look at the first rule. The first thing we do is make sure that
this is a rule for the verb write, by using unification in the head
(the left-hand side) of the rule.
We then use = to extract the agent and patient from
the Gulp structure. The idea here is that, for the verb write, the
agent is the thing (person) that's doing the writing (i.e. the author)
and the patient is the thing that has been written.
In the second call to =, we ensure that the thing that was written was
a book. We also bind T to the title of the book mentioned in the sentence.
The third unification prepares the name for extraction.
The fourth call to = builds the frame that will be used to query the
database.
Note that not all of the variables that we use (A, P, T, etc.) will be bound
(most of them probably won't be), but that won't matter for the feature
structure unification. If they can possibly be there, we should pull them out.
So, let's look at our particular example above:
frametype: action
type: write
agent: frametype: person
patient: frametype: object
         type: book
The initial call to the first interp rule binds A to the structure:
frametype: person
name: N
and P to the structure:
frametype: object
type: book
N and T are unbound, so that the Frame structure looks like:
frametype: object
type: book
author: frametype: person
        name: N
title: T
When this is used as the parameter in the call to database, it will
unify with the first two entries:
T = [all, about, figs]
N = [joe, foo];
T = [how, the, east, was, won]
N = [bob, bar]
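The same walk-through can be reproduced in plain Prolog if you want to experiment before Gulp is wired into your parser. The sketch below is only an analogue of the rules above: book/4, sem/3, wrote/2 and answers/2 are hypothetical names, each object is a fixed-slot fact instead of a Gulp structure, and interp here takes two arguments rather than three.

```prolog
% Plain-Prolog stand-in for the Gulp database: one fixed-slot fact per
% object. book(Title, AuthorName, Subject, Genre)
book([all, about, figs],         [joe, foo], figs,    non_fiction).
book([how, the, east, was, won], [bob, bar], history, fiction).

% interp(+Sem, -Answer): the verb in the first slot of sem/3 selects the
% rule; unbound slots in Sem act as wildcards in the database lookup.
interp(sem(write, person(Name), book(Title)), wrote(Name, Title)) :-
    book(Title, Name, _Subject, _Genre).

% answers(+Sem, -Answers): collect every database match for a query.
answers(Sem, Answers) :-
    findall(Answer, interp(Sem, Answer), Answers).
```

The query answers(sem(write, person(_), book(_)), As) returns one wrote/2 term per book, in database order. Binding the name first, as in answers(sem(write, person([bob, bar]), book(_)), As), narrows the result to that author's book, and a query with a different verb, such as sem(list, _, _), returns the empty list because no rule head matches it.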
Putting it all together
Once you've got your interpretation rules working, you need to build
some sort of interface to the parsing routines that will take a semantic
representation and return all the objects in the database that match it.
Here's something to get you started:
%
% The top-level rule to be called by your
% parse-loop. Given a semantic representation it
% prints all matching objects.
semantics(Sem) :-
    findall(Frame, interp(Sem,Sem,Frame), Frames),
    show_structures(Frames,1),
    !.
%
% show_structures will print the feature
% structures returned from semantics. It does
% this by calling the Gulp feature structure
% printer.
%
% If the list is empty, and this is the first one,
% then let the user know that there were no
% matching entries in the database.
show_structures([], 1) :-
    write('Sorry, unable to find an answer!'), nl.
%
% If this is the last one and we had some in the
% list, then quit.
show_structures([], N) :-
    N > 1.
%
% Print a separator and then this element of the
% list, along with the object. You can make this
% more sophisticated if you like.
show_structures([A|R], N) :-
    write('--------------------------------'),nl,
    write('Object '), write(N), write(':'),nl,nl,
    display_feature_structure(A),
    N1 is N+1,
    show_structures(R,N1).
What to hand in
Code
- Your parser code, complete with additions to support building a semantic
  feature representation
- Your interpreter code that interfaces with the database to answer questions
- The code for your database
Output
- Your semantic representations of the sentences in test.txt (the same ones
  you've grown to love while doing the syntax phase). If there are some syntactic
  phenomena here your parser can't handle, it's not so important at this
  stage.
- The semantic representations and interpreted responses of sample sentences
  from your domain. This is what it's all been working towards - an actual
  conversation with your domain. The answers you give can be canned responses
  with data from the database filled in, or can simply be a list of bound
  responses.
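If you go the canned-response route, one possible shape for it is sketched below. It assumes SWI-Prolog (atomic_list_concat/3 is not part of every Prolog), and both respond/1 and the answer(AuthorName, Title) term are hypothetical; in your project the values would be pulled out of the Gulp frame returned by your interpreter.

```prolog
% respond(+Answer): print a canned sentence with database values filled
% in. Answer is a hypothetical answer(AuthorName, Title) term whose
% fields are lists of words, as in the database examples.
respond(answer(Name, Title)) :-
    atomic_list_concat(Name, ' ', NameText),     % [joe, foo] -> 'joe foo'
    atomic_list_concat(Title, ' ', TitleText),
    format('~w wrote ~w.~n', [NameText, TitleText]).
```

Calling respond(answer([joe, foo], [all, about, figs])) prints the line "joe foo wrote all about figs."; capitalisation and nicer phrasing are left to you.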