The first
step is to provide the algorithm with sufficient
world knowledge. To extend the world knowledge, the first
step is to investigate large sets of rules specifying which
discourse entities we can assume are inter-inferable. WordNet
may be helpful for this procedure, though the problem then becomes strongly
related to word sense disambiguation, which is itself already
an intricate problem. Second, the algorithm
should be scaled up to handle global discourse coherence.
The algorithm needs to take the title of the text into account.
Conceptually, having “read” the title, the algorithm would “point” to the
corresponding set of relevant rules it derived from WordNet and map
the noun phrases it encounters in the body of the text to that set of rules.
This corresponds to a human reading a title,
recalling his or her existing knowledge of the
topic, and then relating what (s)he recalls to what (s)he reads next.
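This title-driven rule activation can be sketched as follows. The rule table and its entries here are invented purely for illustration; a real system would derive the topic-to-rule mapping from WordNet relations, as discussed above.

```python
# Hypothetical sketch of title-driven rule activation.
# RULES_BY_TOPIC is an invented stand-in for rule sets derived from WordNet:
# each rule pairs a discourse entity with an entity it is inter-inferable with.
RULES_BY_TOPIC = {
    "surgery": {("surgeon", "doctor"), ("scalpel", "instrument")},
    "banking": {("teller", "employee"), ("loan", "product")},
}

def activate_rules(title):
    """Collect the rule sets whose topic word appears in the title."""
    active = set()
    for word in title.lower().split():
        active |= RULES_BY_TOPIC.get(word, set())
    return active

# Having "read" the title, the algorithm now "points" to the surgery rules.
rules = activate_rules("A Day in Surgery")
```

Noun phrases encountered in the body of the text would then be matched against the activated rule set rather than against all rules at once.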
The third step involves keeping a separate
list of frequently appearing words that are
in the subject position. The more frequent a word, the higher it
is ranked in this second discourse entity list. The most highly
ranked element that matches the target referent becomes the antecedent
if no antecedent is found in the original discourse entity
list. I include only words in the subject position,
due to the strong evidence (GGG93) that the subject will be referred to
again. Since the algorithm processes the text incrementally, each
element in the second entity list reflects only the number of occurrences
of that element so far. But since the subject of
a sentence (the probable center of the utterance) is positioned towards
the beginning of the sentence, the key words of a document should likewise
appear towards the beginning of the document. Thus the second entity list
should be able to provide the algorithm with a “global picture”, thereby
encouraging global coherence.
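The second entity list described above can be sketched as a frequency counter over subject-position words, consulted only as a fallback. The class and predicate names are illustrative, not part of the original algorithm's specification.

```python
from collections import Counter

class SubjectList:
    """Sketch of the second discourse entity list: words seen in subject
    position, ranked by how often they have occurred so far."""

    def __init__(self):
        self.counts = Counter()

    def add_subject(self, noun):
        # Called incrementally, once per subject the algorithm encounters.
        self.counts[noun.lower()] += 1

    def best_match(self, matches_referent):
        """Return the most frequent subject satisfying the predicate.
        Used only when the primary entity list yields no antecedent."""
        for noun, _ in self.counts.most_common():
            if matches_referent(noun):
                return noun
        return None

subjects = SubjectList()
for s in ["court", "judge", "court", "defendant", "court"]:
    subjects.add_subject(s)

# Fallback lookup: the most frequent subject that matches the target referent.
antecedent = subjects.best_match(lambda n: n in {"judge", "court"})
```

Because subjects accumulate from the start of the document, the highest-ranked entries tend to be the document's key words, which is what lets this list stand in for a "global picture".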
I also propose a solution
to the problems arising from (a) incremental
update and (b) the noun phrase “A and B”, which evokes two different
readings: the combined discourse entity “A and B”, and the separate
entities “A” and “B”.
The idea imitates what human readers do. A reader processes a sentence from
left to right, word by word. On encountering
one of the two situations mentioned above, the reader assigns
the target referent an “unknown” tag without affecting his or her
big-picture understanding of the document so far. In situation
(a), the reader knows from experience that the identity behind the “unknown”
tag will be revealed later in the sentence, and replaces the tag with the
first discourse entity that satisfies the three constraints. In
situation (b), there is more freedom, and the tag may never be reassigned.
The reader knows that if this is the case, the ambiguity is intended by the
writer, and whichever interpretation is chosen will not affect comprehension
of the document. An algorithm can imitate what a human does in situation
(a) with relative ease. For situation (b), the algorithm must
have both interpretations ready in the discourse entity list. Their
relative order with respect to each other does not matter, since only one
will be chosen, if any.
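Entering both interpretations of a conjoined noun phrase into the discourse entity list can be sketched as follows; the function name and the simple string split are assumptions for illustration, standing in for whatever parse the algorithm actually has available.

```python
def conjunct_entities(np):
    """Sketch: for a noun phrase "A and B", return every discourse entity
    it evokes: the combined entity plus the two separate entities.
    Only one reading will ever be chosen as an antecedent, so the
    relative order of the returned entities does not matter."""
    entities = [np]  # combined reading: "A and B" as one entity
    if " and " in np:
        # separate readings: "A" and "B" as individual entities
        entities.extend(part.strip() for part in np.split(" and "))
    return entities

entities = conjunct_entities("John and Mary")
```

All three entities go into the discourse entity list; a later referent such as "they" can match the combined entity, while "he" or "she" can match one of the separate entities.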