Learn about the different voice models supported by the API

The available options and the characteristics of the generated audio depend on the AI model used to synthesise the voice.

The PlayHT API supports four models: 'PlayHT2.0', 'PlayHT2.0-turbo', 'PlayHT1.0' and 'Standard'. Some APIs work only with specific models. For APIs that support the 'Standard' voice engine, see the Standard & Premium Voices section. The other sections cover APIs that operate on PlayHT voices.

Listing Supported Voices

There are a few ways to list supported voices depending on your requirements:

PlayHT 2.0 and 2.0 Turbo Voices

Our newest conversational voice AI model with added emotion direction and instant cloning. Compatible with the 'PlayHT2.0-turbo' engine, our fastest model for streaming. Supports English only.

PlayHT 1.0 Voices

Lifelike voices ideal for expressive and conversational content. Supports English only.

Cloned Voices

We offer High-Fidelity and instant cloning options. Instant cloning produces a PlayHT 2.0 voice, while High-Fidelity clones are based on our PlayHT 1.0 model.


Voice Cloning and Realtime Streaming

Realtime streaming ('PlayHT2.0-turbo') supports instant clones only, not High-Fidelity clones.
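
The cloning rules above can be written down as a small lookup. This is only a sketch: the names `CLONE_BASE_MODEL` and `supports_realtime_streaming` and the clone-type strings "instant" and "high-fidelity" are hypothetical labels chosen for illustration, not identifiers from the PlayHT API or SDK.

```python
# Hypothetical helper, NOT part of the PlayHT SDK. It only encodes the
# rules stated in this section about cloned voices.

# Which model each kind of clone is based on (per the text above).
CLONE_BASE_MODEL = {
    "instant": "PlayHT2.0",
    "high-fidelity": "PlayHT1.0",
}


def supports_realtime_streaming(clone_type: str) -> bool:
    """Return True if a clone of this type can be used with 'PlayHT2.0-turbo'.

    Only instant clones are supported for realtime streaming.
    """
    if clone_type not in CLONE_BASE_MODEL:
        raise ValueError(f"unknown clone type: {clone_type!r}")
    return clone_type == "instant"
```

For example, `supports_realtime_streaming("high-fidelity")` returns `False`, so an application might check this before requesting a stream with a cloned voice.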

Standard Voices

For multilingual text-to-speech generation, including changing pitch and adding pauses. These voices produce reliable output and support Speech Synthesis Markup Language (SSML). Supports 100+ languages.
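
As a rough illustration of how the language constraints in this section might guide model selection, the sketch below filters the four model names by language. The function name `candidate_engines` and the use of a language code starting with "en" to mean English are assumptions made for this example. It does not call the PlayHT API.

```python
# Hypothetical helper, NOT part of the PlayHT SDK. It encodes only the
# language support described in this section.

# The PlayHT models are described above as English only.
ENGLISH_ONLY_ENGINES = ["PlayHT2.0", "PlayHT2.0-turbo", "PlayHT1.0"]

# 'Standard' is described as supporting 100+ languages.
MULTILINGUAL_ENGINES = ["Standard"]


def candidate_engines(language: str) -> list:
    """Return the model names that could serve the given language.

    Assumption: `language` is a code such as "en", "en-US" or "fr",
    and any code starting with "en" counts as English.
    """
    if language.lower().startswith("en"):
        return ENGLISH_ONLY_ENGINES + MULTILINGUAL_ENGINES
    return list(MULTILINGUAL_ENGINES)
```

With this sketch, an English request can use any of the four models, while any other language falls back to 'Standard'. Whether a specific language is actually among the 100+ supported ones still has to be checked against the Standard voice list.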