API Reference

Can't figure out which model to use? We've got you covered.

Here's how our models stack up:

  • Play 2.0 is ideal for legacy projects, offering reliable performance with good voice cloning and medium speed.
  • Play 3.0 Mini is optimized for developers needing blazing-fast processing for bulk file generation, with improved emotional tones and multilingual support.
  • PlayDialog excels in creating highly emotive and natural speech, with state-of-the-art voice cloning and alphanumeric capabilities, making it perfect for applications requiring expressive outputs.

                         Play 2.0          Play 3.0 Mini          PlayDialog
Speed                    Medium            Blazing Fast           Medium
Time to first audio      230ms             190ms                  350ms
Emotion                  Better            Good                   Best
Alphanumeric Sequences   Medium            Great                  Great
Voice cloning            Good              Better                 Best
Multilingual             English only      Multilingual           Beta
Best for                 Legacy projects   Bulk file generation   Emotive speech

PlayDialog

PlayHT's latest voice model built for fluid, emotive conversation.

It's a more advanced model that can generate turn-based dialogues with multiple voices.

Simple Text-to-Speech with PlayDialog

Use it in your code as follows:

import * as PlayHT from "playht";

// Initialize PlayHT API with your credentials
PlayHT.init({
  userId: "<YOUR_USER_ID>",
  apiKey: "<YOUR_API_KEY>",
});

PlayHT.stream("Hello from a realistic voice.", {voiceEngine: "PlayDialog"});

curl --request POST \
     --url https://api.play.ht/api/v2/tts/stream \
     --header 'X-USER-ID: <YOUR_USER_ID>' \
     --header 'AUTHORIZATION: <YOUR_API_KEY>' \
     --header 'accept: audio/mpeg' \
     --header 'content-type: application/json' \
     --data '
{
  "text": "Hello from a realistic voice.",
  "voice_engine": "PlayDialog"
}
'
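
The streaming endpoint returns raw audio bytes, so a client usually wants to write them to disk as they arrive. The following is a minimal Python sketch using only the standard library. It reuses the URL, headers, and JSON body from the curl example; the helper names build_stream_request and save_stream, the chunk size, and the output path are illustrative choices, not part of the PlayHT API.

```python
import json
import urllib.request

STREAM_URL = "https://api.play.ht/api/v2/tts/stream"


def build_stream_request(text, user_id, api_key, voice_engine="PlayDialog"):
    """Build the POST request from the curl example (hypothetical helper)."""
    body = json.dumps({"text": text, "voice_engine": voice_engine}).encode("utf-8")
    headers = {
        "X-USER-ID": user_id,
        "AUTHORIZATION": api_key,
        "accept": "audio/mpeg",
        "content-type": "application/json",
    }
    return urllib.request.Request(STREAM_URL, data=body, headers=headers, method="POST")


def save_stream(request, path, chunk_size=8192):
    """Send the request and write the audio to `path` chunk by chunk.

    Returns the number of bytes written. urlopen raises HTTPError on a
    non-2xx response, so a failed call never produces a partial audio file
    that looks valid.
    """
    total = 0
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            total += len(chunk)
    return total
```

To use it, call build_stream_request("Hello from a realistic voice.", "<YOUR_USER_ID>", "<YOUR_API_KEY>") and pass the result to save_stream together with a file name such as hello.mp3.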

Multi-Turn Dialog Generation with PlayDialog

Use it in your code as follows:


import requests

url = "https://api.play.ht/api/v2/tts/stream"

payload = {
    "voice": "s3://voice-cloning-zero-shot/baf1ef41-36b6-428c-9bdf-50ba54682bd8/original/manifest.json",
    "output_format": "mp3",
    "text": "Country Mouse: How are you, my town mouse cousin? Town Mouse: I am doing fine, country mouse. Country Mouse: This is the greatest place ever, but it is so much busier than where I am from! Town Mouse: Thank you cousin, I love the hustle and bustle as well.",
    "voice_2": "s3://voice-cloning-zero-shot/e040bd1b-f190-4bdb-83f0-75ef85b18f84/original/manifest.json",
    "voice_engine": "PlayDialog",
    "turn_prefix_2": "Town Mouse:",
    "turn_prefix": "Country Mouse:"
}
headers = {
    "accept": "audio/mpeg",
    "content-type": "application/json",
    "AUTHORIZATION": "<YOUR_API_KEY>",
    "X-USER-ID": "<YOUR_USER_ID>"
}

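response = requests.post(url, json=payload, headers=headers)
# Fail loudly on an HTTP error instead of saving an error body as audio.
response.raise_for_status()

# The request asks for MP3 output, so save the bytes with an .mp3 extension.
with open('dialogue.mp3', 'wb') as f:
    f.write(response.content)
print("Audio file saved as dialogue.mp3")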

curl --request POST \
     --url https://api.play.ht/api/v2/tts/stream \
     --header 'X-USER-ID: <YOUR_USER_ID>' \
     --header 'AUTHORIZATION: <YOUR_API_KEY>' \
     --header 'accept: audio/mpeg' \
     --header 'content-type: application/json' \
     --data '
{
  "voice": "s3://voice-cloning-zero-shot/baf1ef41-36b6-428c-9bdf-50ba54682bd8/original/manifest.json",
  "output_format": "mp3",
  "text": "Country Mouse: How are you, my town mouse cousin? Town Mouse: I am doing fine, country mouse. Country Mouse: This is the greatest place ever, but it is so much busier than where I am from! Town Mouse: Thank you cousin, I love the hustle and bustle as well.",
  "voice_2": "s3://voice-cloning-zero-shot/e040bd1b-f190-4bdb-83f0-75ef85b18f84/original/manifest.json",
  "voice_engine": "PlayDialog",
  "turn_prefix_2": "Town Mouse:",
  "turn_prefix": "Country Mouse:"
}
'
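
In the request above, each turn in "text" begins with the prefix of the voice that should speak it. When the dialogue comes from structured data, the body can be assembled programmatically. The sketch below assumes that convention holds in general (it is inferred from the example, not from a specification); the function name build_dialog_payload and its parameters are hypothetical.

```python
def build_dialog_payload(turns, voice_1, voice_2, speaker_1, speaker_2):
    """Assemble a two-speaker PlayDialog request body.

    `turns` is a list of (speaker, line) pairs. Each line is emitted as
    "<speaker>: <line>", matching the turn prefixes sent with the request.
    """
    parts = []
    for speaker, line in turns:
        if speaker not in (speaker_1, speaker_2):
            raise ValueError(f"unknown speaker: {speaker!r}")
        parts.append(f"{speaker}: {line.strip()}")
    return {
        "voice": voice_1,
        "voice_2": voice_2,
        "turn_prefix": f"{speaker_1}:",
        "turn_prefix_2": f"{speaker_2}:",
        "text": " ".join(parts),
        "output_format": "mp3",
        "voice_engine": "PlayDialog",
    }
```

The returned dictionary can be passed as the json argument of requests.post in the Python example above.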

What's new with PlayDialog

  • PlayDialog is a more advanced model that can generate turn-based dialogues with multiple voices.
  • PlayDialog uses a conversation’s historical context to control prosody, intonation, emotion and pacing.
  • PlayDialog uses an “Adaptive Speech Contextualizer” (ASC), which lets the model draw on the full context and history of a conversation.

Read the full release post here: https://blog.play.ai/blog/introducing-playdialog


Play3.0-mini

PlayHT's latest speech model for realtime use cases.

It's a lightweight, reliable, and cost-efficient multilingual text-to-speech model that supports voice cloning and TTS streaming through the API.

Use it in your code as follows:

import * as PlayHT from "playht";

// Initialize PlayHT API with your credentials
PlayHT.init({
  userId: "<YOUR_USER_ID>",
  apiKey: "<YOUR_API_KEY>",
});

PlayHT.stream("Hello from a realistic voice.", {voiceEngine: "Play3.0-mini"});

curl --request POST \
     --url https://api.play.ht/api/v2/tts/stream \
     --header 'X-USER-ID: <YOUR_USER_ID>' \
     --header 'AUTHORIZATION: <YOUR_API_KEY>' \
     --header 'accept: audio/mpeg' \
     --header 'content-type: application/json' \
     --data '
{
  "text": "Hello from a realistic voice.",
  "voice_engine": "Play3.0-mini"
}
'

What's new with Play3.0-mini

  • Fewer hallucinations and higher accuracy, especially with numbers and alphanumeric sequences.
  • More consistent streaming latency, under 200ms.
  • Uses native 48kHz sampling by default, up from 24kHz, for higher audio quality.
  • Character limit per streaming request increased from 2k to 20k.
  • Supports 36 languages.
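
Text longer than the 20k-character limit still has to be split across several streaming requests. The sketch below shows one way to do that, breaking at sentence boundaries where possible. It assumes the limit counts characters the way Python's len() does; how the service actually measures length is not stated here, and the function name split_for_streaming is hypothetical.

```python
import re

MAX_CHARS = 20_000  # per-request streaming limit quoted above


def split_for_streaming(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters.

    Chunks end at sentence boundaries when possible; a single sentence
    longer than `limit` is cut at the limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any sentence that cannot fit in one request.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as the "text" of its own request, and the resulting audio segments played or concatenated in order.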

Read the full release post here: https://play.ht/news/introducing-play-3-0-mini/


Upgrading to Play3.0-mini

Python SDK

First upgrade the package:

pip install --upgrade pyht

This should upgrade pyht to 0.1.x. When calling tts(), the voice_engine argument should be Play3.0-mini (this is also the default; pass PlayHT2.0 to use the 2.0 model).

Note that the new voice engine is Play3.0-mini, not PlayHT3.0.

Play3.0-mini is multilingual! English is the default; to use a language other than English, pass a Language enum value (from pyht.client) as the language argument to TTSOptions.


Node.js SDK

First upgrade the package:

npm install playht@latest

This should upgrade playht to 0.10.x. When calling PlayHT.stream(), the voiceEngine argument should be Play3.0-mini.

Note that the new voice engine is Play3.0-mini, not PlayHT3.0.

Play3.0-mini is multilingual! English is the default; to use a language other than English, pass it as the language option to PlayHT.stream().


PlayHT2.0-turbo

PlayHT's legacy voice model. Use it in your code as follows:

import * as PlayHT from "playht";

// Initialize PlayHT API with your credentials
PlayHT.init({
  userId: "<YOUR_USER_ID>",
  apiKey: "<YOUR_API_KEY>",
});

PlayHT.stream("Hello from a realistic voice.", {voiceEngine: "PlayHT2.0-turbo"});

curl --request POST \
     --url https://api.play.ht/api/v2/tts/stream \
     --header 'X-USER-ID: <YOUR_USER_ID>' \
     --header 'AUTHORIZATION: <YOUR_API_KEY>' \
     --header 'accept: audio/mpeg' \
     --header 'content-type: application/json' \
     --data '
{
  "text": "Hello from a realistic voice.",
  "voice_engine": "PlayHT2.0-turbo"
}
'