Cynthia Speech Engine

Modeling rhythm and intonation with Cyng

The Cyng specification language can be used to model the rhythm and intonation of an utterance spoken by an actual person. The model is played using Cynthia. A Cyng speech program contains the text of an utterance along with transcriptions for the intonation and the rhythm.

The intonation transcription is given in terms of pitch accents. Cyng is still in the early stages of development, so it currently supports only the H* and L* pitch accents.

An example

  • Here is a recording of Faye Chalcraft reading from Alice's Adventures in Wonderland.
  • This is Cynthia's model of Faye's utterance.

The following is a plot of the pitch contour from Faye's utterance:

This is a plot of the pitch contour from Cynthia's utterance:

The Cyng program used to model the original utterance is as follows:


part(1) {
  tune {
    - - H* - - H*
    L* H* - - - - H* H* - L* - -
    H* - - H* - H* H* H* H* -
  }
  rhythm {
    8 8 8 8 10 10 8 10 8
    8 8 8 9 8 8 10 8 8 8 8 8 9 8
    10 7 34 6 8 8 8 8 8 8
  }
  text {
    when she thawed ih over afterwards
    id occurred do her that she ought do have wondered ad this
    bu at the time id all seemed quite na trull
  }
}

Some of the words in the text were changed to produce the desired pronunciation. The original utterance contains several flaps and glottal stops, neither of which is available in the MBROLA databases that Cynthia is currently using. I attempted to simulate these sounds by altering the spelling of some of the words.
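The three transcriptions in the program above line up with each other, and that alignment can be checked mechanically. The following is a minimal Python sketch, not part of Cyng or Cynthia: the names TUNE, RHYTHM, TEXT, line_counts, and check_tune are hypothetical, and the reading of "-" as "no pitch accent on this word" is an assumption based on the example rather than on the Cyng specification.

```python
# Transcriptions copied from the Cyng program above, one string per line.
TUNE = [
    "- - H* - - H*",
    "L* H* - - - - H* H* - L* - -",
    "H* - - H* - H* H* H* H* -",
]
RHYTHM = [
    "8 8 8 8 10 10 8 10 8",
    "8 8 8 9 8 8 10 8 8 8 8 8 9 8",
    "10 7 34 6 8 8 8 8 8 8",
]
TEXT = [
    "when she thawed ih over afterwards",
    "id occurred do her that she ought do have wondered ad this",
    "bu at the time id all seemed quite na trull",
]

# H* and L* are the two supported pitch accents; "-" is assumed to be a placeholder.
ALLOWED_TUNE_SYMBOLS = {"H*", "L*", "-"}


def line_counts(lines):
    """Return the number of whitespace-separated items on each line."""
    return [len(line.split()) for line in lines]


def check_tune(tune_lines, text_lines):
    """Return a list of problem descriptions; the list is empty if none are found."""
    problems = []
    for number, (tune, text) in enumerate(zip(tune_lines, text_lines), start=1):
        symbols = tune.split()
        words = text.split()
        unknown = [s for s in symbols if s not in ALLOWED_TUNE_SYMBOLS]
        if unknown:
            problems.append(f"line {number}: unsupported symbols {unknown}")
        if len(symbols) != len(words):
            problems.append(
                f"line {number}: {len(symbols)} tune symbols for {len(words)} words"
            )
    return problems


if __name__ == "__main__":
    print("words per line: ", line_counts(TEXT))
    print("tune per line:  ", line_counts(TUNE))
    print("rhythm per line:", line_counts(RHYTHM))
    print("problems:", check_tune(TUNE, TEXT))
```

In this example each tune line has one symbol per word. The rhythm lines hold 9, 14, and 10 values, which matches a rough syllable count of each text line. This suggests one rhythm value per syllable, although that is an observation about this program and not something the page states.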

For comparison, here is the result of my first modeling attempt.

The CMU Arctic corpus is a collection of recordings of sentences taken from Project Gutenberg books and read aloud by human speakers. I used Cyng to model the timing and intonation from the speech recordings of one of the speakers. These utterances were produced by Cynthia using Cyng.

Copyright 2001-2017 by Bill Hollingsworth