Title: Stochastic F0 Contour Model Based on the Clustering of
F0 Shapes of a Syntactic Unit
Author: Y. Yamashita and T. Ishida
Reference: Proc. of 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 1, pp.533-536 (2001).
Abstract: This paper describes a stochastic model relating the F0 contour of a sentence to its linguistic features, for use in speech synthesis. The F0 contour of a sentence is represented by concatenating the F0 patterns of a Japanese syntactic unit, the bunsetsu. A bunsetsu F0 pattern is composed of the F0 average and the F0 shape. The F0 average is predicted independently for each bunsetsu by a quantification theory from the linguistic features of the bunsetsu. The most probable sequence of bunsetsu F0 shapes for a sentence is found in the F0 shape database using a probabilistic measure. The probability that an F0 contour is observed for a sentence is defined by two kinds of probabilities: the F0 shape production probability and the F0 shape bigram probability. The latter is the probability that two F0 shapes occur adjacently, like a word bigram in speech recognition. Several typical bunsetsu F0 shapes are extracted by clustering the training data and stored in the F0 shape database. The F0 shape production probability is computed for each bunsetsu based on the distribution of linguistic features in the cluster.
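The search for the most probable sequence of F0 shapes can be illustrated with a small sketch. The abstract does not say how the search is carried out, so the code below assumes a Viterbi-style dynamic program, and it assumes that the sentence probability is the product of the per-bunsetsu production probabilities and the bigram probabilities between adjacent shapes. The function name, the argument layout, and the numbers in the usage example are hypothetical and are not taken from the paper.

```python
import math


def most_probable_shape_sequence(production, bigram):
    """Return the index of the chosen F0 shape cluster for each bunsetsu.

    production[t][k]: assumed probability of shape k given the linguistic
                      features of bunsetsu t (F0 shape production).
    bigram[j][k]:     assumed probability that shape k follows shape j
                      (F0 shape bigram).
    """
    def log(p):
        # Work in the log domain; a zero probability becomes -infinity.
        return math.log(p) if p > 0 else float("-inf")

    n_shapes = len(production[0])
    # Best log score of any path ending in each shape at the first bunsetsu.
    score = [log(p) for p in production[0]]
    back = []  # back[t-1][k]: best predecessor of shape k at bunsetsu t

    for t in range(1, len(production)):
        new_score = []
        pointers = []
        for k in range(n_shapes):
            best_j = max(range(n_shapes),
                         key=lambda j: score[j] + log(bigram[j][k]))
            new_score.append(score[best_j]
                             + log(bigram[best_j][k])
                             + log(production[t][k]))
            pointers.append(best_j)
        score = new_score
        back.append(pointers)

    # Trace back from the best final shape to recover the whole sequence.
    path = [max(range(n_shapes), key=lambda k: score[k])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical example: three bunsetsu, two shape clusters.
production = [[0.6, 0.4], [0.5, 0.5], [0.1, 0.9]]
bigram = [[0.9, 0.1], [0.2, 0.8]]
sequence = most_probable_shape_sequence(production, bigram)
```

In this example the first bunsetsu on its own favours shape 0, but the bigram term and the strong preference of the last bunsetsu for shape 1 make the all-ones sequence the most probable overall, which is the kind of context effect the bigram is meant to capture.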
Ftp article (ps-file, 4 pages, 371920 bytes)
Ftp article (PDF-file, 4 pages, 92626 bytes)