
SSML

The Speech Synthesis Markup Language (SSML) is a W3C standard for marking up text for speech output. It provides tags for controlling voices, rate of speech, volume, gender, and pitch (tone). It also provides tags for controlling how words are spoken, for instance, spelling out abbreviations. SSML is also used within VoiceXML, which is another W3C standard.
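
As a rough illustration of the kinds of tags described above, the following sketch uses the SSML 1.0 voice, prosody, and sub elements. The sentences and attribute values are made up for this example, and whether KTTS or the underlying synthesizer honors each tag is not guaranteed, so treat it as a sketch of the markup rather than a demonstration of what KTTS will speak.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
    "http://www.w3.org/TR/speech-synthesis/synthesis.dtd">

<speak version="1.0" xml:lang="en-US">
    <!-- voice: request a voice by gender -->
    <voice gender="female">
        <!-- prosody: slow the rate of speech and raise the volume -->
        <prosody rate="slow" volume="loud">
            Please listen carefully.
        </prosody>
    </voice>
    <!-- sub: speak the alias text in place of the abbreviation -->
    This standard is published by the
    <sub alias="World Wide Web Consortium">W3C</sub>.
</speak>
```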

At this time, KTTS provides limited and very basic support for SSML. It currently has the following restrictions.

The following sample text can be used to experiment with SSML.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
    "http://www.w3.org/TR/speech-synthesis/synthesis.dtd">

<speak version="1.0" xml:lang="en-US">
    <prosody pitch="low">
        Who's been sleeping in my bed?
    </prosody>
    said papa bear.
    <prosody pitch="medium">
        Who's been sleeping in my bed?
    </prosody>
    said momma bear.
    <prosody pitch="high">
        Who's been sleeping in my bed?
    </prosody>
    said baby bear.
</speak>

More robust support for SSML is planned for the next version of KTTS.
