Can't figure out which model to use? We've got you covered.

Here's how our models stack up:

Play 2.0 is ideal for legacy projects, offering reliable performance with good voice cloning and medium speed.
Play 3.0 Mini is optimized for developers needing blazing-fast processing for bulk file generation, with improved emotional tones and multilingual support.
PlayDialog excels in creating highly emotive and natural speech, with state-of-the-art voice cloning and alphanumeric capabilities, making it perfect for applications requiring expressive outputs.

| Feature | Play 2.0 | Play 3.0 Mini | PlayDialog |
| --- | --- | --- | --- |
| Speed | Medium | Blazing Fast | Medium |
| Time to first audio | 230 ms | 190 ms | 350 ms |
| Emotion | Better | Good | Best |
| Alphanumeric sequences | Medium | Great | Great |
| Voice cloning | Good | Better | Best |
| Multilingual | English only | Multilingual | Beta |
| Best for | Legacy projects | Bulk file generation | Emotive speech |
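
If an application needs to choose a model at runtime, the comparison above can be encoded as plain data. The following is only an illustrative sketch: `MODELS`, `RANK`, and `pick_model` are hypothetical names, and the values are copied from the comparison, not read from any API.

```python
# Values copied from the comparison above; all names here are illustrative.
MODELS = {
    "Play 2.0": {"time_to_first_audio_ms": 230, "emotion": "Better", "voice_cloning": "Good"},
    "Play 3.0 Mini": {"time_to_first_audio_ms": 190, "emotion": "Good", "voice_cloning": "Better"},
    "PlayDialog": {"time_to_first_audio_ms": 350, "emotion": "Best", "voice_cloning": "Best"},
}

# Order the qualitative ratings so they can be compared.
RANK = {"Good": 1, "Better": 2, "Best": 3}


def pick_model(priority: str) -> str:
    """Return the model name that scores best for the given priority."""
    if priority == "latency":
        # Lower time to first audio is better.
        return min(MODELS, key=lambda name: MODELS[name]["time_to_first_audio_ms"])
    if priority in ("emotion", "voice_cloning"):
        # Higher qualitative rating is better.
        return max(MODELS, key=lambda name: RANK[MODELS[name][priority]])
    raise ValueError(f"unknown priority: {priority!r}")
```

For example, `pick_model("latency")` selects the model with the lowest time to first audio, while `pick_model("emotion")` selects the one rated highest for emotion.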

`PlayDialog`

PlayHT's latest voice model built for fluid, emotive conversation.

It's a more advanced model that can generate turn-based dialogues with multiple voices.

Simple Text-to-Speech with PlayDialog

Use it in your code as follows:

import * as PlayHT from "playht";

// Initialize PlayHT API with your credentials
PlayHT.init({
  userId: "<YOUR_USER_ID>",
  apiKey: "<YOUR_API_KEY>",
});

PlayHT.stream("Hello from a realistic voice.", {voiceEngine: "PlayDialog"});

curl --request POST \
     --url https://api.play.ht/api/v2/tts/stream \
     --header 'X-USER-ID: <YOUR_USER_ID>' \
     --header 'AUTHORIZATION: <YOUR_API_KEY>' \
     --header 'accept: audio/mpeg' \
     --header 'content-type: application/json' \
     --data '
{
  "text": "Hello from a realistic voice.",
  "voice_engine": "PlayDialog"
}
'

Multi-Turn Dialog Generation with PlayDialog

Use it in your code as follows:


import requests

url = "https://api.play.ht/api/v2/tts/stream"

payload = {
    "voice": "s3://voice-cloning-zero-shot/baf1ef41-36b6-428c-9bdf-50ba54682bd8/original/manifest.json",
    "output_format": "mp3",
    "text": "Country Mouse: How are you, my town mouse cousin? Town Mouse: I am doing fine, country mouse. Country Mouse: This is the greatest place ever, but it is so much busier than where I am from! Town Mouse: Thank you cousin, I love the hustle and bustle as well.",
    "voice_2": "s3://voice-cloning-zero-shot/e040bd1b-f190-4bdb-83f0-75ef85b18f84/original/manifest.json",
    "voice_engine": "PlayDialog",
    "turn_prefix_2": "Town Mouse:",
    "turn_prefix": "Country Mouse:"
}
headers = {
    "accept": "audio/mpeg",
    "content-type": "application/json",
    "AUTHORIZATION": "<YOUR_API_KEY>",
    "X-USER-ID": "<YOUR_USER_ID>"
}

response = requests.post(url, json=payload, headers=headers)

# The request asks for MP3 output, so save the file with a matching extension
with open('dialogue.mp3', 'wb') as f:
    f.write(response.content)
    print("Audio file saved as dialogue.mp3")

curl --request POST \
     --url https://api.play.ht/api/v2/tts/stream \
     --header 'X-USER-ID: <YOUR_USER_ID>' \
     --header 'AUTHORIZATION: <YOUR_API_KEY>' \
     --header 'accept: audio/mpeg' \
     --header 'content-type: application/json' \
     --data '
{
  "voice": "s3://voice-cloning-zero-shot/baf1ef41-36b6-428c-9bdf-50ba54682bd8/original/manifest.json",
  "output_format": "mp3",
  "text": "Country Mouse: How are you, my town mouse cousin? Town Mouse: I am doing fine, country mouse. Country Mouse: This is the greatest place ever, but it is so much busier than where I am from! Town Mouse: Thank you cousin, I love the hustle and bustle as well.",
  "voice_2": "s3://voice-cloning-zero-shot/e040bd1b-f190-4bdb-83f0-75ef85b18f84/original/manifest.json",
  "voice_engine": "PlayDialog",
  "turn_prefix_2": "Town Mouse:",
  "turn_prefix": "Country Mouse:"
}
'
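
Writing the `text` field by hand becomes error-prone as a dialogue grows. Below is a minimal sketch of assembling the request body from a list of turns, assuming the field names used in the requests above. `build_dialogue_payload` is a hypothetical helper and not part of any PlayHT SDK; the voice IDs in the usage note are placeholders.

```python
def build_dialogue_payload(turns, voice, voice_2, turn_prefix, turn_prefix_2):
    """Build a PlayDialog request body from (prefix, line) pairs.

    Hypothetical helper: field names mirror the example requests above.
    """
    allowed = {turn_prefix, turn_prefix_2}
    parts = []
    for prefix, line in turns:
        # Every turn must start with one of the two declared prefixes,
        # otherwise the model cannot tell which voice should speak it.
        if prefix not in allowed:
            raise ValueError(f"unknown turn prefix: {prefix!r}")
        parts.append(f"{prefix} {line.strip()}")
    return {
        "voice": voice,
        "voice_2": voice_2,
        "turn_prefix": turn_prefix,
        "turn_prefix_2": turn_prefix_2,
        "text": " ".join(parts),
        "voice_engine": "PlayDialog",
        "output_format": "mp3",
    }
```

A call such as `build_dialogue_payload([("Country Mouse:", "How are you?"), ("Town Mouse:", "I am doing fine.")], "<VOICE_1>", "<VOICE_2>", "Country Mouse:", "Town Mouse:")` returns a dictionary that can be passed as the `json` argument of `requests.post`.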

What's new with `PlayDialog`

PlayDialog is a more advanced model that can generate turn-based dialogues with multiple voices.
PlayDialog uses a conversation’s historical context to control prosody, intonation, emotion and pacing.
PlayDialog uses an “Adaptive Speech Contextualizer” (ASC) that lets the model draw on the full context and history of a conversation.

Read the full release post here: https://blog.play.ai/blog/introducing-playdialog

`Play3.0-mini`

PlayHT's latest speech model for real-time use cases.

It's a lightweight, reliable, and cost-efficient multilingual text-to-speech model that supports voice cloning and TTS streaming through the API.

Use it in your code as follows:

import * as PlayHT from "playht";

// Initialize PlayHT API with your credentials
PlayHT.init({
  userId: "<YOUR_USER_ID>",
  apiKey: "<YOUR_API_KEY>",
});

PlayHT.stream("Hello from a realistic voice.", {voiceEngine: "Play3.0-mini"});

curl --request POST \
     --url https://api.play.ht/api/v2/tts/stream \
     --header 'X-USER-ID: <YOUR_USER_ID>' \
     --header 'AUTHORIZATION: <YOUR_API_KEY>' \
     --header 'accept: audio/mpeg' \
     --header 'content-type: application/json' \
     --data '
{
  "text": "Hello from a realistic voice.",
  "voice_engine": "Play3.0-mini"
}
'

What's new with `Play3.0-mini`

Reduced hallucinations and increased accuracy, especially with numbers and alphanumeric sequences.
More consistent latency of under 200 ms with streaming.
Uses higher-quality native 48 kHz sampling by default instead of 24 kHz.
Character limit per streaming request increased from 2k to 20k.
Supports 36 languages.

Read the full release post here: https://play.ht/news/introducing-play-3-0-mini/
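
Texts longer than the 20k-character streaming limit still need to be split across several requests. The sketch below shows one possible approach: break on sentence boundaries and pack sentences greedily into chunks. `chunk_text` and `MAX_CHARS` are hypothetical names, and the sketch assumes the limit is counted in characters of the `text` field.

```python
import re

MAX_CHARS = 20_000  # per-request limit quoted above (assumed to count characters)


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence breaks."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip() if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit; close the current chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the `text` of its own streaming request, in order.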

Upgrading to `Play3.0-mini`

Python SDK

First upgrade the package: `pip install --upgrade pyht` (this should upgrade to 0.1.x).

When calling `tts()`, the `voice_engine` argument should be `Play3.0-mini` (this is also the default; pass `PlayHT2.0` to use the 2.0 model).

Note that the new voice engine is `Play3.0-mini`, not `PlayHT3.0`.

Play3.0-mini is multilingual! English is the default; to use a language other than English, pass a `Language` enum value (from `pyht.client`) as the `language` argument to `TTSOptions`.
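
Put together, a call might look like the sketch below. It follows the steps described in this section, but treat the details as assumptions to check against your installed `pyht` version: the `Client(user_id=..., api_key=...)` constructor, the `client.close()` call, and the enum member name `SPANISH` are not confirmed by this page, and `synthesize_to_file` is a hypothetical wrapper.

```python
VOICE_ENGINE = "Play3.0-mini"  # not "PlayHT3.0"


def synthesize_to_file(text, voice, path, user_id, api_key, language_name="SPANISH"):
    """Stream synthesized audio for `text` into the file at `path`."""
    # Imported lazily so this module still loads where pyht is not installed.
    from pyht import Client
    from pyht.client import Language, TTSOptions

    client = Client(user_id=user_id, api_key=api_key)
    try:
        # Look up the Language enum member by name (member name is an assumption).
        options = TTSOptions(voice=voice, language=Language[language_name])
        with open(path, "wb") as f:
            # tts() yields audio chunks as they arrive.
            for chunk in client.tts(text, options, voice_engine=VOICE_ENGINE):
                f.write(chunk)
    finally:
        client.close()
```

Pass the voice's manifest URL as `voice`, your credentials as `user_id` and `api_key`, and an output path whose extension matches the audio format configured in `TTSOptions`.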

Node.js SDK

First upgrade the package: `npm install playht@latest` (this should upgrade to 0.10.x).

When calling `PlayHT.stream()`, the `voiceEngine` argument should be `Play3.0-mini`.

Note that the new voice engine is `Play3.0-mini`, not `PlayHT3.0`.

Play3.0-mini is multilingual! English is the default; to use a language other than English, pass it as the `language` option of `PlayHT.stream()`.

`PlayHT2.0-turbo`

PlayHT's legacy voice model. Use it in your code as follows:

import * as PlayHT from "playht";

// Initialize PlayHT API with your credentials
PlayHT.init({
  userId: "<YOUR_USER_ID>",
  apiKey: "<YOUR_API_KEY>",
});

PlayHT.stream("Hello from a realistic voice.", {voiceEngine: "PlayHT2.0-turbo"});

curl --request POST \
     --url https://api.play.ht/api/v2/tts/stream \
     --header 'X-USER-ID: <YOUR_USER_ID>' \
     --header 'AUTHORIZATION: <YOUR_API_KEY>' \
     --header 'accept: audio/mpeg' \
     --header 'content-type: application/json' \
     --data '
{
  "text": "Hello from a realistic voice.",
  "voice_engine": "PlayHT2.0-turbo"
}
'
