Techniques to minimize latency when streaming audio in real time with the PlayHT API.

Use Play3.0-mini model

For the lowest latency, use our Play3.0-mini model.

Node.js:

import * as PlayHT from "playht";

// Initialize the PlayHT SDK with your credentials
PlayHT.init({
  userId: "<YOUR_USER_ID>",
  apiKey: "<YOUR_API_KEY>",
});

// stream() resolves to a readable stream of audio bytes
const stream = await PlayHT.stream("Hello from a realistic voice.", {
  voiceEngine: "Play3.0-mini",
});
stream.pipe(process.stdout); // pipe to a speaker, file, or HTTP response
cURL:

curl --request POST \
     --url https://api.play.ht/api/v2/tts/stream \
     --header 'X-USER-ID: <YOUR_USER_ID>' \
     --header 'AUTHORIZATION: <YOUR_API_KEY>' \
     --header 'accept: audio/mpeg' \
     --header 'content-type: application/json' \
     --data '
{
  "text": "Hello from a realistic voice.",
  "voice_engine": "Play3.0-mini"
}
'

Use our Client SDKs

Our Node.js and Python SDKs include performance optimizations, cache credentials, and talk directly to our inference workers, which avoids some of the network latency you may incur with the REST API. The difference is small, but for the best latency we recommend the SDKs.
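One way to see the effect is to measure time-to-first-chunk on the readable stream the SDK returns. A minimal sketch, where the `timeToFirstChunk` helper is our own illustration and not part of the PlayHT SDK:

```javascript
// Resolve with the milliseconds elapsed until the first chunk arrives.
// Works with any Node.js readable stream, including the one returned
// by PlayHT.stream().
function timeToFirstChunk(stream) {
  const start = Date.now();
  return new Promise((resolve, reject) => {
    stream.once("data", () => resolve(Date.now() - start));
    stream.once("error", reject);
  });
}

// Example with the SDK (requires credentials and network):
// const stream = await PlayHT.stream("Hello!", { voiceEngine: "Play3.0-mini" });
// console.log(`First chunk after ${await timeToFirstChunk(stream)} ms`);
```

Comparing this number between the SDK and the REST endpoint from your own servers is the most direct way to verify the gain.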

Deploy your servers in the US

Our API and GPU servers are deployed in the AWS US-East and US-West regions. We recommend deploying your servers in a US region to avoid the extra network latency of cross-region traffic.
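As a quick sanity check, you can time a round trip to the API from each candidate region. A minimal sketch; the `timeAsync` helper and the example request are illustrations, not part of the PlayHT SDK:

```javascript
// Measure how long an async operation takes, in milliseconds.
// Run this from servers in different regions to compare round-trip times.
async function timeAsync(fn) {
  const start = Date.now();
  await fn();
  return Date.now() - start;
}

// Example (requires network and valid credentials):
// const ms = await timeAsync(() =>
//   fetch("https://api.play.ht/api/v2/voices", {
//     headers: { "X-USER-ID": "<YOUR_USER_ID>", AUTHORIZATION: "<YOUR_API_KEY>" },
//   })
// );
// console.log(`Round trip took ${ms} ms`);
```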

If you absolutely need to run your servers outside the US and this is hurting your latency, consider our on-prem deployment offering, which lets you deploy on your own cloud in any region. Contact us for help.

Upgrade to our enterprise cluster

If you need consistently low latency and high concurrency with an enterprise-grade SLA, contact us for access to our enterprise cluster.

Use an on-prem deployment

If you need the absolute lowest latency (~100 ms), consider our on-prem offering, which lets you easily deploy the PlayHT API and models in your own cloud. Contact our team to learn more.