HMM-Based Semantic Case Frame Analysis


In the domain of spoken language information retrieval, spontaneous effects in speech are very important (Minker, 1999). These include false starts, repetitions, and ill-formed utterances. It would therefore be improvident to base semantic extraction exclusively on a syntactic analysis of the input utterance. Parsing failures due to ungrammatical syntactic constructs may be reduced if the phrases containing important semantic information can be extracted while the non-essential or redundant parts of the utterance are ignored.

Restarts and repeats frequently occur between phrases, and poorly formed syntactic constructs often consist of well-formed phrases that are semantically meaningful. One approach to extracting semantic information is based on case frames. The original concept of a case frame, as described by Fillmore (Fillmore, 1968), is based on a set of universally applicable cases or case values that express the relationship between a verb and its nouns. Bruce (Bruce, 1975) extended Fillmore's theory to any concept-based system and defined an appropriate semantic grammar, whose formalism is given in Fig. For the example query "could you give me a ticket price on [uh] [throat clear] a flight first class from San Francisco to Dallas please", a typical semantic case grammar would instantiate the following terminals:
• price: this reference word identifies the concept airfare (other concepts may be: book, flight, …)
• from: case marker of the case from-city corresponding to the departure city San Francisco

• to: case marker of the case to-city corresponding to the arrival city Dallas
• class: case marker of the case flight-class corresponding to first
• case system: from, to, class, …

The parsing process based on a semantic case grammar typically considers less than 50% of the example query to be semantically meaningful; the hesitations and false starts are ignored. The approach therefore appears well suited for natural language understanding components where the need for semantic guidance in parsing is especially relevant. Case frame analysis may be used in a rule-based case grammar. Here, we apply HMM-based modelling instead (Pieraccini et al., 1992; Minker et al., 1999). In the frame-based representation, the semantic labelling does not consider all the words of the utterance, but only those related to the concept and its cases. However, in order to estimate the model parameters, each word of the utterance must have a corresponding semantic label. Thus, the additional label (null) is assigned to those words not used by the case frame analyzer for the specific application. A semantic sequence consists of the basic labels for the reference words, the case markers (m:case), the case values (v:case), and the irrelevant words (null). Relative occurrences of model states and observations are used to establish the Markov Model, whose topology needs to be fixed prior to training and decoding.
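The counting of relative occurrences described above can be sketched as follows. This is a minimal illustration, not the original system: the toy labelled corpus and the exact label spellings (e.g. "(concept:airfare)", "(m:from-city)") are assumptions for the example.

```python
from collections import defaultdict

# Hypothetical toy training data: each utterance is a list of
# (word, semantic-label) pairs. Words not used by the case frame
# analyzer carry the additional label "(null)".
corpus = [
    [("price", "(concept:airfare)"), ("from", "(m:from-city)"),
     ("san_francisco", "(v:from-city)"), ("to", "(m:to-city)"),
     ("dallas", "(v:to-city)"), ("please", "(null)")],
    [("price", "(concept:airfare)"), ("to", "(m:to-city)"),
     ("boston", "(v:to-city)"), ("uh", "(null)")],
]

# Maximum likelihood estimation by counting event occurrences:
# bigram transitions P(s_j | s_i) and observations P(w_m | s_j).
trans = defaultdict(lambda: defaultdict(int))
emit = defaultdict(lambda: defaultdict(int))
for utterance in corpus:
    prev = "<s>"                      # sentence-start pseudo-state
    for word, label in utterance:
        trans[prev][label] += 1
        emit[label][word] += 1
        prev = label

def p_trans(si, sj):
    """Relative frequency estimate of P(s_j | s_i)."""
    total = sum(trans[si].values())
    return trans[si][sj] / total if total else 0.0

def p_emit(sj, w):
    """Relative frequency estimate of P(w | s_j)."""
    total = sum(emit[sj].values())
    return emit[sj][w] / total if total else 0.0

print(p_trans("(concept:airfare)", "(m:from-city)"))  # → 0.5
```

In practice these raw counts would be smoothed with the back-off and discounting strategy mentioned below, since many word/label events never occur in training.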

Semantic labels are defined as the states sj. Any state, such as the examples (v:at-city) and (null) shown, can follow any other; the model is thus ergodic. In direct analogy to the speech recognition problem (equation 1), decoding consists of maximizing the conditional probability P(S|W) of a state sequence S given the observation sequence W:

[S]opt = argmaxS {P(S)P(W|S)} (2)

Given the dimensionality of the sequence W, the exact computation of the likelihood P(W|S) is intractable. Again, bi-grams are a common approximation that allows robust estimation of the Markov Model parameters: the state transition probabilities P(sj|si) and the observation symbol probability distribution P(wm|sj) in state j. In contrast to speech recognition, the computation of the model parameters can be achieved through maximum likelihood estimation, i.e. by counting event occurrences. Usually a back-off and discounting strategy is applied in order to improve robustness in the face of unseen events. An HMM-based parsing module may be conceived as a probabilistic finite state transducer that translates a sequence of words into a sequence of semantic labels; the labels denote each word's function in the semantic representation. Although the flat semantic model has known limitations with respect to the representation of long-term dependencies, it is often sufficient for practical applications. Several methods, such as contextual observations and garbage models, have been shown to enhance the performance of HMM-based stochastic parsing models (Beuschel et al., 2004).
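The maximization in equation (2) is normally carried out with the Viterbi dynamic-programming algorithm. The sketch below decodes a word sequence into semantic labels under a bigram model; the hand-set toy probabilities, label names, and the "<s>" start state are assumptions for illustration, not the parameters of any real system.

```python
import math

# Toy model (hypothetical values): bigram transitions P(s_j | s_i)
# and observation probabilities P(w | s_j).
states = ["(concept:airfare)", "(m:to-city)", "(v:to-city)", "(null)"]
trans = {
    "<s>": {"(concept:airfare)": 0.7, "(null)": 0.3},
    "(concept:airfare)": {"(m:to-city)": 0.6, "(null)": 0.4},
    "(m:to-city)": {"(v:to-city)": 0.9, "(null)": 0.1},
    "(v:to-city)": {"(null)": 1.0},
    "(null)": {"(concept:airfare)": 0.3, "(null)": 0.7},
}
emit = {
    "(concept:airfare)": {"price": 1.0},
    "(m:to-city)": {"to": 1.0},
    "(v:to-city)": {"dallas": 1.0},
    "(null)": {"uh": 0.5, "please": 0.5},
}

def logp(table, key, item):
    # Floor unseen events at a tiny constant (a crude stand-in for
    # the back-off and discounting strategy mentioned in the text).
    return math.log(table.get(key, {}).get(item, 1e-12))

def viterbi(words):
    """[S]opt = argmax_S P(S)P(W|S), computed in the log domain."""
    delta = {s: logp(trans, "<s>", s) + logp(emit, s, words[0])
             for s in states}
    back = []
    for w in words[1:]:
        prev, ptr, delta = delta, {}, {}
        for sj in states:
            # Since the model is ergodic, every state may precede sj.
            si = max(prev, key=lambda s: prev[s] + logp(trans, s, sj))
            ptr[sj] = si
            delta[sj] = prev[si] + logp(trans, si, sj) + logp(emit, sj, w)
        back.append(ptr)
    # Trace back the optimal semantic label sequence.
    path = [max(delta, key=delta.get)]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

print(viterbi(["uh", "price", "to", "dallas", "please"]))
```

For the disfluent query above, the decoder maps "uh" and "please" to (null) while still recovering the concept and its case, which is exactly the robustness against spontaneous speech effects the section motivates.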
