Experiments with rhythm

A stress-timed language, such as English, is one in which the time intervals between stressed syllables in an utterance are relatively equal. This differs from a syllable-timed language, such as Spanish, in which all of the syllables in an utterance tend to have roughly the same duration.

For English, this means that word stress within a sentence can be produced in part by assigning a metre to the speech. How intelligible the speech is depends directly on how well that metre is assigned.
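The difference between the two timing schemes can be illustrated with a minimal sketch. This is not the method used for the recordings on this page. The function names, the 200 ms syllable length, the 500 ms foot length, and the even split of a foot among its syllables are all assumptions chosen for illustration.

```python
def syllable_timed(stresses, syllable_ms=200.0):
    """Syllable timing: every syllable gets the same duration.

    `stresses` is a list of booleans, one per syllable (True = stressed).
    The default syllable length is an arbitrary illustrative value.
    """
    return [syllable_ms for _ in stresses]


def stress_timed(stresses, foot_ms=500.0):
    """Stress timing: every foot gets the same total duration.

    A foot here is a stressed syllable plus the unstressed syllables
    that follow it. Unstressed syllables before the first stress are
    grouped into a foot of their own. Each foot's time is split evenly
    among its syllables, which is a deliberate simplification.
    """
    feet = []
    for index, stressed in enumerate(stresses):
        if stressed or not feet:
            feet.append([])  # a stress (or the utterance start) opens a new foot
        feet[-1].append(index)

    durations = [0.0] * len(stresses)
    for foot in feet:
        for index in foot:
            durations[index] = foot_ms / len(foot)
    return durations


# Six syllables with stresses on the 1st, 4th and 6th.
pattern = [True, False, False, True, False, True]
print(syllable_timed(pattern))       # [200.0, 200.0, 200.0, 200.0, 200.0, 200.0]
print(stress_timed(pattern, 600.0))  # [200.0, 200.0, 200.0, 300.0, 300.0, 600.0]
```

In the stress-timed output the three stressed syllables begin at 0, 600 and 1200 ms, so the intervals between stresses are equal even though the syllable lengths are not. A more realistic assignment would give the stressed syllable a larger share of its foot.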

The following are recordings of the same sentence produced with different degrees of prosody:

  1. Raw speech with no rhythm assignment
  2. A very simple rhythm assignment
  3. A better rhythm assignment along with a manually assigned intonation contour

The rhythm in utterances 2 and 3 was produced with Cyng.

Here are some preliminary experiments with automatic duration assignment.

