Speech Synthesis Markup Language

SSML allows finer control over spoken audio output, including how pauses, acronyms, dates, times, and abbreviations are rendered.

SSML (Speech Synthesis Markup Language) is a W3C recommendation that lets the web and other applications that implement it drive a speech synthesizer in a more natural way: improving pronunciation, emphasizing certain words, pacing the dialogue with pauses, playing sounds, and so on. Many companies have already adopted it. [1]
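As a rough illustration of the features listed above, dates, times, and recorded sounds could be marked up along the following lines. This is only a sketch: the SSML specification defines the `say-as` element and its `interpret-as` and `format` attributes but does not standardize their values, so `date`, `mdy`, and `time` here are assumptions that depend on the synthesizer, and the audio URL is hypothetical.

```xml
<?xml version="1.0"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xml:lang="en-US">
  <!-- interpret-as and format values are processor-specific assumptions -->
  The meeting is on <say-as interpret-as="date" format="mdy">10/12/2024</say-as>
  at <say-as interpret-as="time">9:30</say-as>.
  <!-- hypothetical URL; the text inside audio is spoken if the file cannot be played -->
  <audio src="http://www.example.com/chime.wav">chime</audio>
</speak>
```

A processor that does not recognize a given `interpret-as` value may simply read the enclosed text as written, so it is worth checking the documentation of the target synthesizer.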

History

This W3C standard, SSML, is based on the JSGF and/or JSML specifications, which are owned by Sun Microsystems, Inc., California, U.S.A.

Examples

Voice

<?xml version="1.0"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
                   http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
         xml:lang="en-US">   
  <voice gender="female">Mary had a little lamb,</voice>
  <!-- now request a different female voice (variant 2) -->
  <voice gender="female" variant="2">
  Its fleece was white as snow.
  </voice>
  <!-- select a specific voice by name (processor-specific) -->
  <voice name="Mike">I want to be like Mike.</voice>
</speak>

Phoneme

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
                   http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
         xml:lang="en-US">
  <phoneme alphabet="ipa" ph="t&#x259;mei&#x325;&#x27E;ou&#x325;"> tomato </phoneme>
  <!-- This is an example of IPA using character entities -->
  <!-- Because many platform/browser/text editor combinations do not
       copy Unicode text correctly, this example uses the character
       entity form of the IPA symbols. Normally you can use the UTF-8
       representation of the symbols directly: "təmei̥ɾou̥". -->
</speak>

Sub

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
                   http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
         xml:lang="en-US">
  <sub alias="World Wide Web Consortium">W3C</sub>
  <!-- World Wide Web Consortium -->
</speak>

Prosody

<?xml version="1.0"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
                   http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
         xml:lang="en-US">
  The XYZ costs <prosody rate="-10%">$45</prosody>
</speak>

Emphasis

<?xml version="1.0"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
                   http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
         xml:lang="en-US">
  This is a <emphasis> big </emphasis> car!
  That is a <emphasis level="strong"> big </emphasis>
  bank account!
</speak>

Pauses

<?xml version="1.0"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
                   http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
         xml:lang="en-US">
  Take a deep breath <break/>
  then continue.
  Press 1 or wait for the tone. <break time="3s"/>
  I didn't hear you! <break strength="weak"/> Please try again.
</speak>

References

[1] http://www.w3.org/TR/speech-synthesis/
