Integrating PlayHT Voices with Twilio

Incorporating PlayHT's realistic voices into your Twilio application can significantly enhance the user experience. To stream PlayHT voices to Twilio over a bi-directional stream, the audio must meet these specifications:

  • Output Format: The audio must be in the 'mulaw' format.
  • Sample Rate: Use a sample rate of 8000 Hz.
  • Encoding: Encode the audio in base64 format before sending it to Twilio through a 'media' event message.
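
As a rough illustration of what these requirements mean on the wire: mu-law audio at 8000 Hz uses one byte per sample (assuming a single mono channel), so the size of a chunk in bytes maps directly to its playback duration. The helper names below (`mulawDurationMs`, `toMediaPayload`) are hypothetical and exist only for this sketch; they are not part of the PlayHT or Twilio SDKs.

```javascript
// Hypothetical helpers for illustration only.
const SAMPLE_RATE_HZ = 8000; // sample rate required for the Twilio stream

// mu-law stores one 8-bit sample per byte (mono assumed),
// so 8000 bytes correspond to one second of audio.
function mulawDurationMs(byteLength) {
  return (byteLength * 1000) / SAMPLE_RATE_HZ;
}

// Base64-encode a raw mu-law chunk for the `media.payload` field.
function toMediaPayload(chunk) {
  return Buffer.from(chunk).toString('base64');
}
```

For example, a 160-byte chunk holds 20 ms of audio, which is handy when estimating how much speech has already been sent.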

🚧

Make sure you use the <Connect> TwiML verb (with a nested <Stream>) to create a bi-directional stream.
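
A minimal sketch of the TwiML your voice webhook could return, assuming (per Twilio's Media Streams documentation) that bi-directional streams are opened with <Connect> and a nested <Stream>. The function name `buildConnectStreamTwiml` and the example WebSocket URL are placeholders, not values from this guide.

```javascript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXmlAttribute(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: builds the TwiML document for a bi-directional stream.
// `websocketUrl` is a placeholder for your own wss:// endpoint.
function buildConnectStreamTwiml(websocketUrl) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    '  <Connect>',
    `    <Stream url="${escapeXmlAttribute(websocketUrl)}" />`,
    '  </Connect>',
    '</Response>',
  ].join('\n');
}

// Example: return this string from your voice webhook with Content-Type text/xml.
const twiml = buildConnectStreamTwiml('wss://example.com/twilio-stream');
```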

Implementation example with the Node.js SDK

Here is an example using the Node.js SDK for audio streaming:

import * as PlayHT from 'playht';

...

// Assuming ws is a WebSocket connected to Twilio and streamSid contains the stream identifier.
// The function is async because PlayHT.stream() is awaited below.
async function streamGeneratedAudio(ws, streamSid) {
  const streamFromStream = await PlayHT.stream('Hello from realistic voices ready for the phone.', {
    voiceEngine: 'PlayHT2.0-turbo',
    voiceId: 's3://peregrine-voices/oliver_narrative2_parrot_saad/manifest.json',
    outputFormat: 'mulaw',
    sampleRate: 8000,
  });

  // Forward each audio chunk to Twilio as a base64-encoded 'media' message
  streamFromStream.on('data', (data) => {
    const message = JSON.stringify({
      event: 'media',
      streamSid,
      media: {
        payload: data.toString('base64'),
      },
    });

    ws.send(message);
  });
}
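
The example above assumes `streamSid` is already known. One possible way to obtain it, sketched here under the assumption that Twilio sends a JSON 'start' message carrying the stream identifier once the stream opens, is to inspect incoming WebSocket messages. The helper name `streamSidFromMessage` is hypothetical.

```javascript
// Hypothetical helper: returns the stream identifier if `raw` is a Twilio
// 'start' event, otherwise null. `raw` may be a string or a Buffer.
function streamSidFromMessage(raw) {
  let message;
  try {
    message = JSON.parse(raw.toString());
  } catch {
    return null; // ignore anything that is not valid JSON
  }
  if (!message || message.event !== 'start') return null;
  // Assumption: the identifier is at start.streamSid, with a top-level fallback.
  return message.start?.streamSid ?? message.streamSid ?? null;
}

// Possible wiring, assuming `ws` exposes an on('message') handler:
//
//   ws.on('message', (raw) => {
//     const streamSid = streamSidFromMessage(raw);
//     if (streamSid) streamGeneratedAudio(ws, streamSid);
//   });
```

Keeping the parsing in a small pure function makes it easy to test without a live call.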