"Topic Recognition for News Speech Based on Keyword Spotting"
Y.Yamashita, T.Tsunekawa and R.Mizoguchi
Proc. of 5th International Conference on Spoken Language Processing (ICSLP '98), Sydney, 3, pp.839-842 (1998).

Abstract:
This paper describes topic identification for Japanese TV news speech based on keyword spotting. Three thousand nouns are selected as keywords that contribute to topic identification, based on a mutual information criterion and word length. On newspaper text data, this keyword set identified the correct topic for 76.3% of articles. We then performed keyword spotting on TV news speech and identified the topic of each spoken message by computing a score for every topic from the acoustic score of each spotted word and the topic probability of that word. To neutralize the effect of false alarms, the bias of the topics in the keyword set is removed. The topic identification rate is 66.5% when an identification is counted as correct if the correct topic is among the top three topics. Removing the bias improved the identification rate by 6.1%.
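
The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the paper's formula: the abstract does not say how the acoustic score and the topic probability are combined, nor how the bias is removed. The sketch assumes that each spotted keyword contributes its acoustic score multiplied by P(topic | word), and that the bias of a topic is its average probability over the whole keyword set, which is divided out. The names rank_topics and keyword_set_bias and all data structures are hypothetical.

```python
from collections import defaultdict


def keyword_set_bias(topic_prob):
    """Average P(topic | word) over all keywords (assumed definition of bias)."""
    totals = defaultdict(float)
    for dist in topic_prob.values():
        for topic, p in dist.items():
            totals[topic] += p
    n = len(topic_prob)
    return {topic: total / n for topic, total in totals.items()}


def rank_topics(spotted, topic_prob, topic_bias=None, top_n=3):
    """Rank topics for one spoken message.

    spotted    -- list of (word, acoustic_score) pairs from the keyword spotter
    topic_prob -- dict: word -> {topic: P(topic | word)}
    topic_bias -- optional dict: topic -> bias, divided out of the score
    """
    scores = defaultdict(float)
    for word, acoustic in spotted:
        # Assumption: a spotted word votes for each topic with weight
        # acoustic_score * P(topic | word).
        for topic, p in topic_prob.get(word, {}).items():
            scores[topic] += acoustic * p
    if topic_bias:
        # Assumption: bias removal is a per-topic normalization.
        for topic in scores:
            scores[topic] /= topic_bias.get(topic, 1.0)
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_n]
```

Returning the top three topics mirrors the evaluation criterion in the abstract, where an identification counts as correct if the correct topic appears among the top three.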

ftp article (gzipped PS file, 4 pages, 116655 bytes)
ftp article (PDF file, 4 pages, 295182 bytes)