Assignment 11: Scene description

Due Friday, 2006-12-08

Download these seven(!) Python modules: utils.py, entity.py, unify.py, utt.py, gen_parse.py, lex_scene.py, scene.py. All of the other modules are imported if you run or import scene.py. You should put your code in a separate module called <name>_description.py that imports scene.py.

For this assignment you'll combine existing pieces of code, which already generate simple sentences, produce random scenes, and find the relations between objects in a scene, into a program that generates a rudimentary scene description.

What's new

Context

To make it easier to inherit contextual information, utterances now have a single 'context' feature whose value is a dict (Struc) with the features 's' (speaker), 'h' (hearer), and 'r' (discourse referents). For this assignment you can ignore 's' and 'h' (though variables for them will get created and inherited automatically whether you like it or not). You will, however, have to worry about discourse referents. To create an utterance with a particular set of discourse referents, use the syntax illustrated in loc1 and loc2 in lex_scene.py. That is, assign the set of discourse referents using the 'context' option; otherwise it won't get passed to the noun phrases in the sentence, and you'll fail to properly generate "the" and "another".
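
As a rough illustration of why the referents travel through the 'context' feature, the sketch below uses plain dicts in place of Struc. The helper make_utterance and the 'pred' key are hypothetical names chosen for this example; this is not the syntax used for loc1 and loc2, so consult lex_scene.py for the real form.

```python
# Plain dicts stand in for Struc. make_utterance and the 'pred' key are
# hypothetical names used only for this illustration.

def make_utterance(pred, referents):
    """Wrap a location predicate with a 'context' dict holding 's', 'h', 'r'."""
    context = {'s': None, 'h': None, 'r': referents}
    return {'pred': pred, 'context': context}

circle = {'cat': 'circle'}
square = {'cat': 'square'}
referents = [circle]          # the circle has already been mentioned

utt = make_utterance({'cat': 'left', 'figure': square, 'ground': circle},
                     referents)

# Once the square has been described, it becomes a discourse referent too.
# The utterance sees the update because it holds the same list object.
referents.append(square)
```

Sharing one list between the description loop and each utterance is one possible design; building a fresh list per utterance would work as well, as long as it is placed under 'r' inside 'context'.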

Grammar and lexicon

For this assignment the sentence-level grammar is simpler, since all we need is the single sentence pattern Loc1Sent. It consists of an adjunct prepositional phrase, followed by the 'head' of the sentence, "there's", followed by the subject noun phrase, for example, "to the left of the oval there's a circle". There are separate sentence-level entries for each of the different "prepositions", each representing the relative location of a "figure" object (referred to by the subject) and a "ground" object (referred to by the object of the preposition). In the example above, the circle is the figure, the oval is the ground, and "to the left of" is the "preposition". The prepositions fall into two groups: one for relations with the figure outside the ground, and one for relations with the figure inside a larger ground (in this case the window that contains the objects).

The noun-phrase grammar is more complicated than before because we want a slot for the determiner ("a", "the", or "another"). There are two sets of noun-phrase entries. The first is a set of three determiner entries, defP, indefP, and anotherP, which are tried in that order during generation, with only one matching. The second is a set of noun entries: squareP, circleP, triangleP, ovalP, and windowP. The noun entries are matched separately during generation. The determiner entries work by checking the set of discourse referents during unification (the only change to unification in this version).
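
The determiner choice can be pictured with a small function over plain dicts. This is only a sketch of the presumed logic, not the unification-based check that the lexical entries actually perform; choose_determiner is a hypothetical name, and the conditions are inferred from the sample output in this assignment.

```python
def choose_determiner(obj, referents):
    """Pick 'the', 'another', or 'a' for obj given the discourse referents."""
    # 'the': this very object has already been mentioned (identity, not equality).
    if any(ref is obj for ref in referents):
        return 'the'
    # 'another': a different object of the same category has been mentioned.
    if any(ref['cat'] == obj['cat'] for ref in referents):
        return 'another'
    # 'a': nothing of this category has been mentioned yet.
    return 'a'

square1 = {'cat': 'square', 'x': 34, 'y': 174}
square2 = {'cat': 'square', 'x': 120, 'y': 60}
oval = {'cat': 'oval', 'x': 163, 'y': 127}
referents = [square1]

print(choose_determiner(square1, referents))  # the
print(choose_determiner(square2, referents))  # another
print(choose_determiner(oval, referents))     # a
```

The three cases are mutually exclusive, so the order of the tests here only keeps the conditions short; it is not meant to mirror the order in which the lexical entries are tried.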

Generation

The sentence generation function generate in gen_parse.py works by taking an incomplete utterance and unifying it with sentence-level and noun-phrase-level lexical entries and then, if this succeeds, ordering the words that it has inherited from the matched entries and returning them in a string. It returns None if it finds no matching entries at the sentence or noun-phrase level.

>>> generate(loc1)
"to the left of the circle there's another square"
>>> generate(loc2)
"above the square there's a oval"
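
To make the contract of generate concrete (a string on success, None when no entry matches), here is a toy stand-in that skips unification entirely. The names toy_generate, PHRASES, and _det are hypothetical, and of the category keys only 'left' appears in this assignment's output; 'above' is an assumed key.

```python
# Toy stand-in for generate: a lookup table replaces the sentence-level
# lexical entries. 'left' appears in the assignment; 'above' is assumed.
PHRASES = {'left': 'to the left of', 'above': 'above'}

def _det(obj, referents):
    """Hypothetical determiner choice based on the discourse referents."""
    if any(ref is obj for ref in referents):
        return 'the'
    if any(ref['cat'] == obj['cat'] for ref in referents):
        return 'another'
    return 'a'

def toy_generate(pred, referents):
    """Return a sentence for pred, or None if no 'entry' matches its category."""
    phrase = PHRASES.get(pred['cat'])
    if phrase is None:
        return None
    ground, figure = pred['ground'], pred['figure']
    # Word order: adjunct prepositional phrase, "there's", subject noun phrase.
    return "%s %s %s there's %s %s" % (
        phrase, _det(ground, referents), ground['cat'],
        _det(figure, referents), figure['cat'])

circle = {'cat': 'circle'}
old_square = {'cat': 'square'}
new_square = {'cat': 'square'}
pred = {'cat': 'left', 'figure': new_square, 'ground': circle}
print(toy_generate(pred, [circle, old_square]))
# to the left of the circle there's another square
```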

Scenes

The Scene class, in scene.py, represents a set of objects with particular sizes and positions (x and y are their center coordinates) in a window of a particular size. Each object is an instance of Indiv, that is, a dict. The objects are stored in the attribute objs, with the window always the first object in the list. The class has a method for finding objects that are near a given object (get_near) and a method that returns a location predicate, an instance of Struc, for a figure-ground pair of objects (make_pred). For example,

>>> scene2 = random_scene()
>>> for o in scene2.objs: print o
indiv17
indiv18
indiv19
indiv20
indiv21
indiv22
indiv23
indiv24
>>> scene2.make_pred(scene2.objs[1], scene2.objs[2])
{'ground': {'y': 127, 'h': 20, 'cat': 'oval', 'w': 20, 'x': 163}, 'figure': {'y': 174, 'h': 20, 'cat': 'square', 'w': 20, 'x': 34}, 'cat': 'left'}
>>> print _
ground: indiv19
figure: indiv18
cat: left
>>> scene2.get_near(scene2.objs[3])
[{'y': 174, 'h': 20, 'cat': 'square', 'w': 20, 'x': 34}]

To create a new random scene, use random_scene. To display the scene, use show, but since this uses Tkinter graphics, don't try it if you're running Idle. To return to the Python prompt, click on the graphics window's close box.
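
The kind of computation that make_pred and get_near perform can be sketched from center coordinates alone. This is a guess at the idea, not the code in scene.py: the functions relation and near, the dominant-axis rule, and the distance threshold are all assumptions, as is the convention that y grows downward (as it does on a Tkinter canvas).

```python
import math

def relation(figure, ground):
    """Name the figure's position relative to the ground (hypothetical rule)."""
    dx = figure['x'] - ground['x']
    dy = figure['y'] - ground['y']
    # Use whichever axis separates the two centers more.
    if abs(dx) >= abs(dy):
        return 'left' if dx < 0 else 'right'
    # Assumes y increases downward, so a smaller y means higher on screen.
    return 'above' if dy < 0 else 'below'

def near(a, b, max_dist=100):
    """True if the centers are within max_dist pixels (assumed threshold)."""
    return math.hypot(a['x'] - b['x'], a['y'] - b['y']) <= max_dist

# The two objects from the make_pred example above.
square = {'cat': 'square', 'x': 34, 'y': 174, 'w': 20, 'h': 20}
oval = {'cat': 'oval', 'x': 163, 'y': 127, 'w': 20, 'h': 20}

print(relation(square, oval))  # left
print(near(square, oval))      # False
```

With these assumptions the square comes out to the left of the oval, which agrees with the 'left' predicate shown in the example session.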

What you have to do

  1. For all six points, write a program that describes the location of all of the objects in the window, each in terms of a given ground object. Because the window is the only given object at the beginning, the first object must be described with reference to the window. The program maintains a list of discourse referents, which is passed to each utterance as the value of the 'r' feature in the 'context' dict (see the examples loc1 and loc2 in lex_scene.py again). Once the position of an object has been described, it is added to this list. Here is an example, using a Description class.
    >>> scene0 = random_scene()
    >>> show(scene0)
    
    >>> desc0 = Description(scene0)
    >>> desc0.describe()
    on the right side of the window there's a circle
    in the upper-right corner of the window there's a square
    on the left side of the window there's another square
    below the square there's a triangle
    
  2. The example in part 1 doesn't look much like a human description because it jumps all around the scene. In fact it describes the objects in the order they appear in the list (which is the order they were created in), and "the square" in the last sentence refers to the first square rather than the second one. For four extra-credit points, write a smarter program that attempts to use the most recently described object as the ground for the next new object if there is an undescribed object that is near enough, otherwise resorting to other previously mentioned objects as the ground, and only resorting to the window as ground if there are no undescribed objects near described objects. Here is an example.
    >>> scene1 = random_scene()
    
    >>> Description(scene1).smart_describe()
    in the upper-left corner of the window there's a square
    to the left of the square there's a triangle
    in the middle of the window there's another square
    above the square there's a oval
    on the left side of the window there's another oval
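
The ground-selection order described in part 2 can be sketched independently of the course modules. The version below works on plain dicts and only plans (figure, ground) pairs; plan_description, near, and the distance threshold are hypothetical, and a real solution would presumably rely on get_near, make_pred(figure, ground), and generate instead.

```python
import math

def dist(a, b):
    return math.hypot(a['x'] - b['x'], a['y'] - b['y'])

def near(a, b, max_dist=60):
    """Assumed nearness test; scene.py has its own get_near method."""
    return dist(a, b) <= max_dist

def plan_description(objs):
    """Return (figure, ground) pairs; objs[0] is taken to be the window."""
    window, undescribed = objs[0], list(objs[1:])
    described = []            # most recently described object is last
    plan = []
    while undescribed:
        figure = ground = None
        # Try the most recent referent first, then older ones.
        for candidate in reversed(described):
            close = [o for o in undescribed if near(o, candidate)]
            if close:
                ground = candidate
                figure = min(close, key=lambda o: dist(o, ground))
                break
        # Nothing undescribed is near a described object: use the window.
        if figure is None:
            figure, ground = undescribed[0], window
        plan.append((figure, ground))
        # Remove by identity so equal-looking dicts are not confused.
        undescribed = [o for o in undescribed if o is not figure]
        described.append(figure)
    return plan

window = {'cat': 'window', 'x': 150, 'y': 150}
a = {'cat': 'square', 'x': 20, 'y': 20}
b = {'cat': 'triangle', 'x': 50, 'y': 20}
c = {'cat': 'square', 'x': 250, 'y': 250}
d = {'cat': 'oval', 'x': 250, 'y': 220}

for figure, ground in plan_description([window, a, c, b, d]):
    print(figure['cat'] + ' relative to ' + ground['cat'])
```

Each planned pair would then be turned into a predicate and a sentence, with the figure appended to the discourse referents after its sentence is produced.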
    
