Realtime Audio Streaming

The Streaming API endpoint returns the audio response as a stream, with our lowest available latency. It is best suited to conversational use cases, and it works with cloned voices too.

To start using the API, first get your API Secret Key and User ID; see Authentication for details. The examples below use placeholders for the credentials. Replace YOUR_SECRET_KEY_HERE and YOUR_USER_ID_HERE with your actual API Secret Key and User ID.

This guide covers using the HTTP endpoints directly. If you are looking for easy-to-use SDKs that wrap this logic, see our Node.js SDK and the Python Streaming SDK.

Calling the Streaming API endpoint

We need to make an HTTP POST request to the https://play.ht/api/v2/tts/stream endpoint. We want to stream the results as soon as possible, so we'll add accept: audio/mpeg to the headers. This is what we want the headers to look like:

'AUTHORIZATION: Bearer YOUR_SECRET_KEY_HERE'
'X-USER-ID: YOUR_USER_ID_HERE'
'accept: audio/mpeg'
'content-type: application/json'
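
As a small illustration, the same four headers can be assembled in Python from environment variables, so that credentials stay out of source code. The environment variable names PLAYHT_SECRET_KEY and PLAYHT_USER_ID and the helper build_headers are assumptions made for this sketch, not names defined by the API.

```python
import os


def build_headers(secret_key: str, user_id: str) -> dict:
    """Return the four request headers described above."""
    return {
        "AUTHORIZATION": f"Bearer {secret_key}",
        "X-USER-ID": user_id,
        "accept": "audio/mpeg",
        "content-type": "application/json",
    }


# Hypothetical environment variable names; fall back to the guide's placeholders.
headers = build_headers(
    os.environ.get("PLAYHT_SECRET_KEY", "YOUR_SECRET_KEY_HERE"),
    os.environ.get("PLAYHT_USER_ID", "YOUR_USER_ID_HERE"),
)
```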

Payload

For the payload, we send the text we want to convert into audio and the voice to use. We also set quality to draft for the lowest latency and output_format to mp3 for the best compatibility. This is what it looks like:

{
  "text": "Hey there, I'm calling in regards to the car you enquired about yesterday.",
  "voice": "larry",
  "quality": "draft",
  "output_format": "mp3"
}

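A minimal sketch of assembling the same payload in Python, assuming a hypothetical helper named build_payload. It covers only the four fields shown above, and the empty-text check is a client-side convenience added for this example rather than documented API behavior.

```python
import json


def build_payload(text: str, voice: str) -> dict:
    """Build the request body with the four fields shown above."""
    if not text.strip():
        # Client-side guard (an assumption of this sketch, not an API rule).
        raise ValueError("text must not be empty")
    return {
        "text": text,
        "voice": voice,
        "quality": "draft",      # lowest latency, per the guide
        "output_format": "mp3",  # best compatibility, per the guide
    }


payload = build_payload(
    "Hey there, I'm calling in regards to the car you enquired about yesterday.",
    "larry",
)
body = json.dumps(payload)  # JSON string to send as the request body
```
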
Making the request

Here's what our final request will look like with cURL, Node.js, and Python:

# Make the request and save binary output in the file 'playht-stream.mp3'
curl --request POST \
     --url https://play.ht/api/v2/tts/stream \
     --header 'AUTHORIZATION: Bearer YOUR_SECRET_KEY_HERE' \
     --header 'X-USER-ID: YOUR_USER_ID_HERE' \
     --header 'accept: audio/mpeg' \
     --header 'content-type: application/json' \
     --output playht-stream.mp3 \
     --data '
{
  "text": "Hey there, I am calling in regards to the car you enquired about yesterday.",
  "voice": "larry",
  "quality": "draft",
  "output_format": "mp3"
}
'

// Node.js (node-fetch v2): make the request and stream the audio into 'playht-stream.mp3'
const fs = require('fs');
const fetch = require('node-fetch');

const url = 'https://play.ht/api/v2/tts/stream';
const options = {
  method: 'POST',
  headers: {
    accept: 'audio/mpeg',
    'content-type': 'application/json',
    AUTHORIZATION: 'Bearer YOUR_SECRET_KEY_HERE',
    'X-USER-ID': 'YOUR_USER_ID_HERE'
  },
  body: JSON.stringify({
    text: "Hey there, I'm calling in regards to the car you enquired about yesterday.",
    voice: 'larry',
    quality: 'draft',
    output_format: 'mp3'
  })
};

fetch(url, options)
  .then(res => {
    if (!res.ok) throw new Error('HTTP ' + res.status);
    // The response body is binary audio, not JSON, so pipe it to a file as it arrives
    res.body.pipe(fs.createWriteStream('playht-stream.mp3'));
  })
  .catch(err => console.error('error: ' + err));

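# Python (requests): make the request and stream the audio into 'playht-stream.mp3'
import requests

url = "https://play.ht/api/v2/tts/stream"

payload = {
    "quality": "draft",
    "output_format": "mp3",
    "speed": 1,
    "sample_rate": 24000,
    "text": "Hey there, I'm calling in regards to the car you enquired about yesterday.",
    "voice": "larry"
}
headers = {
    "accept": "audio/mpeg",
    "content-type": "application/json",
    "AUTHORIZATION": "Bearer YOUR_SECRET_KEY_HERE",
    "X-USER-ID": "YOUR_USER_ID_HERE"
}

# stream=True lets us handle audio chunks as they arrive instead of buffering the whole response
response = requests.post(url, json=payload, headers=headers, stream=True)
response.raise_for_status()

# The body is binary audio, so write the raw bytes to a file rather than printing text
with open("playht-stream.mp3", "wb") as f:
    for chunk in response.iter_content(chunk_size=4096):
        f.write(chunk)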

That's it, you are ready to stream audio from text!

Check Stream Audio for more details on using the streaming endpoint.

Notes:

For High Fidelity clones, the first few requests might be slower while the servers warm up with your voice models.
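
To observe this warm-up effect, one option is to time how long the first audio chunk takes to arrive on successive requests. The sketch below assumes a hypothetical helper, time_to_first_chunk, that accepts any iterable of byte chunks (for example, response.iter_content(chunk_size=4096) from a requests call made with stream=True). It is demonstrated here on a stand-in generator rather than a live request.

```python
import time
from typing import Iterable, Tuple


def time_to_first_chunk(chunks: Iterable[bytes]) -> Tuple[float, int]:
    """Consume a chunk stream.

    Returns (seconds until the first non-empty chunk, total bytes received).
    """
    start = time.perf_counter()
    first = None
    total = 0
    for chunk in chunks:
        if chunk and first is None:
            first = time.perf_counter() - start
        total += len(chunk)
    if first is None:
        raise ValueError("stream produced no audio data")
    return first, total


def fake_stream():
    """Stand-in for a streamed HTTP response body (not a real API call)."""
    time.sleep(0.05)  # simulated delay before the first chunk
    yield b"\x00" * 1024
    yield b"\x00" * 512


latency, size = time_to_first_chunk(fake_stream())
print(f"first chunk after {latency:.3f}s, {size} bytes total")
```

Comparing the measured latency across the first few requests with a given voice against later ones gives a rough picture of how long the warm-up lasts.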