This page will help you get started with our API.

If you need support, reach out to us by email or join us on Discord.

Welcome to PlayHT!

PlayHT is building infrastructure for conversational AI voice agents so that every business, developer, or tinkerer can easily build human-like talking AI agents and use them to serve their customers. To do that, PlayHT built the fastest ever realtime generative text-to-speech models, which you can use at scale with any LLM.

Here is a demo of PlayHT's API integrated with ChatGPT, using input and output streaming to achieve the lowest latency. Check this demo's code here.

Getting started is easy. The first thing you need is your credentials: generate an API Secret Key and get your User ID.
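As a minimal sketch of handling these two credentials, the snippet below reads them from environment variables and builds a set of request headers. The environment variable names (`PLAYHT_API_KEY`, `PLAYHT_USER_ID`), the helper name, and the header names (`AUTHORIZATION`, `X-USER-ID`) are assumptions made for illustration; confirm the exact header names in the API reference before relying on them.

```python
import os


def build_auth_headers(env=None):
    """Build request headers from credentials kept in environment variables.

    The variable names and header names here are assumptions for
    illustration; check the API reference for the exact names.
    """
    env = os.environ if env is None else env
    secret_key = env.get("PLAYHT_API_KEY", "")
    user_id = env.get("PLAYHT_USER_ID", "")
    # Fail early with a clear message instead of sending an unauthenticated request.
    if not secret_key or not user_id:
        raise RuntimeError("Set PLAYHT_API_KEY and PLAYHT_USER_ID before calling the API")
    return {"AUTHORIZATION": secret_key, "X-USER-ID": user_id}
```

Keeping the Secret Key in the environment rather than in source code avoids committing it to version control by accident.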

Use the PlayHT API for:

  • Realtime Text to Speech streaming.
  • Async long form Text to Speech generation.
  • Ultra realistic Instant Voice Cloning.

Realtime Audio Streaming (250ms TTFB)

Create Your First Stream in 2 mins

Input Streaming with LLMs

Our API supports input streaming, which makes it seamless to integrate with LLMs like ChatGPT. Check this guide to see an example of how to use it.
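One common way to prepare LLM output for a speech model is to group the streamed text fragments into sentence-sized chunks as they arrive. The sketch below shows that buffering step only; `sentence_chunks` is a hypothetical helper, not part of any PlayHT SDK, and the linked guide remains the authority on what the API itself expects as input.

```python
from typing import Iterable, Iterator

SENTENCE_END = (".", "!", "?")


def sentence_chunks(tokens: Iterable[str]) -> Iterator[str]:
    """Group streamed LLM text fragments into sentence-sized chunks."""
    buffer = ""
    for token in tokens:
        buffer += token
        # Flush as soon as the buffered text ends with sentence punctuation.
        if buffer.rstrip().endswith(SENTENCE_END):
            yield buffer.strip()
            buffer = ""
    # Flush whatever is left when the LLM stream ends.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded chunk can be handed to the speech request while the LLM is still producing the next sentence. The punctuation rule is deliberately naive and would also split on a decimal point that ends a fragment.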

Techniques to guarantee the lowest latency

Check this guide for some techniques to keep in mind when trying to achieve the lowest latency.
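Before tuning anything, it helps to measure latency in your own setup. The sketch below times how long an audio stream takes to deliver its first non-empty chunk (time to first byte). `time_to_first_byte` is a hypothetical helper, and `start_stream` stands in for whatever function opens your audio stream; neither comes from the PlayHT API.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def time_to_first_byte(
    start_stream: Callable[[], Iterable[bytes]],
) -> Tuple[float, Iterator[bytes]]:
    """Return (seconds until the first non-empty chunk, iterator over the audio)."""
    started = time.perf_counter()
    stream = iter(start_stream())
    first = b""
    # Consume chunks until the first one that actually carries data.
    for chunk in stream:
        if chunk:
            first = chunk
            break
    elapsed = time.perf_counter() - started

    def replay() -> Iterator[bytes]:
        # Hand back the first chunk, then the rest of the stream.
        if first:
            yield first
        yield from stream

    return elapsed, replay()
```

Measuring from the moment the request is started, rather than from the first response header, captures the delay a listener actually experiences.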

Audio streaming with Twilio

Follow this guide to integrate realtime audio streaming with Twilio for over-the-phone conversations.

Async Audio Generation

For non-streaming generation of long-form content, where you need a full audio file rather than a realtime buffer stream, get started with this guide.

Realistic Instant Voice Cloning

You can clone any voice instantly with only 30 seconds of speech. Try it in our API Playground, or through the API here.

Pricing & Rate limits

To get more info about the supported plans, please visit the pricing page and select the "API" tab.

With higher plans you can get higher concurrency rate limits, a performance SLA and better support.
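To stay within a concurrency limit on the client side, requests can be gated with a semaphore. The sketch below uses Python's `asyncio.Semaphore`; `run_with_limit` is a hypothetical helper, and `max_concurrent` is a placeholder that you would set to the limit of your own plan.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_with_limit(
    jobs: List[Callable[[], Awaitable[str]]],
    max_concurrent: int,
) -> List[str]:
    """Run async jobs while keeping at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job: Callable[[], Awaitable[str]]) -> str:
        # Wait for a free slot before starting the job.
        async with semaphore:
            return await job()

    # gather returns results in the same order as the input jobs.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Each job is a zero-argument coroutine function, so no request starts until a slot is free.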