In this guide we will go through the steps needed to generate your first audio file using our PlayHT Async Voices API.

Realtime Streaming: This guide covers async audio file generation. If you are looking for realtime streaming, check the realtime streaming guide instead.

Using the V2 API

The first step is to get your API Secret Key and User Id. Please refer to our authentication page if you haven't already.

Your secret key will be sent in the Authorization header. Your User Id should be provided in the X-USER-ID header. An example cURL command is shown below with placeholders for the credentials.

The only two required body parameters are text, which contains the text to be converted to audio, and voice, which identifies the voice to use. In the example below we use a PlayHT2.0 voice. You can get the full list of PlayHT Voices from our /v2/voices endpoint.
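
As a minimal sketch of the rule above, the request body could be assembled and checked in Node.js before it is sent. The helper name buildTtsBody is hypothetical and not part of any PlayHT SDK; the field names text, voice and voice_engine are taken from the cURL example in this guide.

```javascript
// Hypothetical helper: builds the JSON body for the generate audio request.
// Only text and voice are required; voice_engine is added when provided.
function buildTtsBody({ text, voice, voiceEngine }) {
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("text is required and must be a non-empty string");
  }
  if (typeof voice !== "string" || voice.trim() === "") {
    throw new Error("voice is required and must be a non-empty string");
  }
  const body = { text, voice };
  if (voiceEngine) {
    body.voice_engine = voiceEngine;
  }
  return JSON.stringify(body);
}

// Usage with the values from the cURL example in this guide.
console.log(
  buildTtsBody({
    text: "Check out this realistic generated speech!",
    voice: "s3://peregrine-voices/oliver_narrative2_parrot_saad/manifest.json",
    voiceEngine: "PlayHT2.0",
  })
);
```

Validating on the client side is a design choice, not an API requirement; it only surfaces a missing parameter before the request is made.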

We want the API to return a stream of events with the status of the audio generation job, so we send accept: text/event-stream in the request headers. This way we don't need to call the Get text-to-speech job data endpoint to fetch the job status.

Here's the final POST request we are sending to the generate audio endpoint:

curl --request POST \
     --url https://play.ht/api/v2/tts \
     --header 'AUTHORIZATION: Bearer YOUR_SECRET_KEY_HERE' \
     --header 'X-USER-ID: YOUR_USER_ID_HERE' \
     --header 'accept: text/event-stream' \
     --header 'content-type: application/json' \
     --data '
{
  "text": "Check out this realistic generated speech!",
  "voice": "s3://peregrine-voices/oliver_narrative2_parrot_saad/manifest.json",
  "voice_engine": "PlayHT2.0"
}
'

A similar generation call using the Node.js SDK:

import * as PlayHTAPI from "playht";

// Initialize PlayHTAPI
PlayHTAPI.init({
  apiKey: process.env.PLAYHT_API_KEY,
  userId: process.env.PLAYHT_USER_ID,
});

const sentence =
  "hello, play support speaking? Please hold on a second, uh Let me just, um, pull up your details real quick.";

// Generate the audio file; the returned object includes the URL of the result
const response = await PlayHTAPI.generate(sentence, {
  voiceId:
    "s3://voice-cloning-zero-shot/d9ff78ba-d016-47f6-b0ef-dd630f59414e/female-cs/manifest.json",
  outputFormat: "mp3",
  voiceEngine: "PlayHT2.0",
  sampleRate: 44100,
  speed: 1,
});

console.log(response.audioUrl);

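In the cURL command, replace YOUR_SECRET_KEY_HERE and YOUR_USER_ID_HERE with your actual API Secret Key and User Id. The SDK example reads the same credentials from the PLAYHT_API_KEY and PLAYHT_USER_ID environment variables.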
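
For readers who want to call the endpoint from Node.js without the SDK, the cURL request could be reproduced roughly as follows. This is a sketch that assumes Node.js 18 or later, where fetch and TextDecoder are built in; buildRequestInit and logEventStream are hypothetical names, while the URL and header names are copied from the cURL example.

```javascript
// Endpoint taken from the cURL example in this guide.
const TTS_URL = "https://play.ht/api/v2/tts";

// Hypothetical helper: builds the fetch options matching the cURL request.
function buildRequestInit(secretKey, userId, body) {
  return {
    method: "POST",
    headers: {
      AUTHORIZATION: `Bearer ${secretKey}`,
      "X-USER-ID": userId,
      accept: "text/event-stream", // ask for a stream of status events
      "content-type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

// Hypothetical helper: sends the request and prints the event stream as it arrives.
// It is not called here because it needs real credentials and network access.
async function logEventStream(secretKey, userId, body) {
  const response = await fetch(TTS_URL, buildRequestInit(secretKey, userId, body));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const decoder = new TextDecoder();
  for await (const chunk of response.body) {
    process.stdout.write(decoder.decode(chunk, { stream: true }));
  }
}
```

A call such as logEventStream(process.env.PLAYHT_API_KEY, process.env.PLAYHT_USER_ID, { text: "...", voice: "..." }) would then print events as they arrive.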

The response, as specified by the accept header, will be a text event stream:

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0,"stage":"queued"}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.01,"stage":"active"}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.01,"stage":"preload","stageProgress":0}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.11,"stage":"preload","stageProgress":0.5}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.21,"stage":"preload","stageProgress":1}

event: ping
data: 2023-03-27T02:20:37.800Z

event: ping
data: 2023-03-27T02:20:52.801Z

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.21,"stage":"generate","stageProgress":0}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.32,"stage":"generate","stageProgress":0.2}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.53,"stage":"generate","stageProgress":0.6}

event: ping
data: 2023-03-27T02:21:07.801Z

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.55,"stage":"generate","stageProgress":0.64}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.57,"stage":"generate","stageProgress":0.68}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.74,"stage":"generate","stageProgress":1}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.74,"stage":"postprocessing","stageProgress":0}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.82,"stage":"postprocessing","stageProgress":0.33}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.91,"stage":"postprocessing","stageProgress":0.67}

event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.99,"stage":"postprocessing","stageProgress":1}

event: completed
data: {"id":"1i01A5fd8O35iLkhuJ","progress":1,"stage":"complete","url":"https://peregrine-results.s3.amazonaws.com/pigeon/IZ5jJmV1ecnZaVuoIK_0.mp3","duration":2.4107,"size":49965}

The audio URL is in the url property of the final event, completed. In this example it is https://peregrine-results.s3.amazonaws.com/pigeon/IZ5jJmV1ecnZaVuoIK_0.mp3.
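
To illustrate how the URL can be pulled out of such a stream programmatically, here is a minimal sketch that parses a fully received event stream and reads the url property from the completed event. The helper names parseEventStream and findAudioUrl are hypothetical, and the parser handles only the event and data fields that appear in the output above, not the full server-sent events format.

```javascript
// Split a complete text/event-stream payload into { event, data } objects.
function parseEventStream(text) {
  return text
    .split(/\r?\n\r?\n/) // events are separated by a blank line
    .map((block) => block.trim())
    .filter((block) => block !== "")
    .map((block) => {
      let event = "message"; // default event name when none is given
      const dataLines = [];
      for (const line of block.split(/\r?\n/)) {
        if (line.startsWith("event:")) {
          event = line.slice("event:".length).trim();
        } else if (line.startsWith("data:")) {
          dataLines.push(line.slice("data:".length).trim());
        }
      }
      return { event, data: dataLines.join("\n") };
    });
}

// Return the url from the completed event, or null if the job has not completed.
// Only the completed event is parsed as JSON; ping events carry a plain timestamp.
function findAudioUrl(events) {
  const completed = events.find((e) => e.event === "completed");
  return completed ? JSON.parse(completed.data).url : null;
}

// A shortened copy of the stream shown above.
const sampleStream = [
  "event: generating",
  'data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.99,"stage":"postprocessing","stageProgress":1}',
  "",
  "event: ping",
  "data: 2023-03-27T02:21:07.801Z",
  "",
  "event: completed",
  'data: {"id":"1i01A5fd8O35iLkhuJ","progress":1,"stage":"complete","url":"https://peregrine-results.s3.amazonaws.com/pigeon/IZ5jJmV1ecnZaVuoIK_0.mp3","duration":2.4107,"size":49965}',
  "",
].join("\n");

console.log(findAudioUrl(parseEventStream(sampleStream)));
```

When reading the response incrementally, the chunks would need to be buffered until a blank line marks the end of an event before this parser is applied.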

You just generated your first audio with the PlayHT API!

If you are looking for the Standard & Premium Voices API (V1), please refer to its specific doc pages. The API reference in the section on the left covers the full range of parameters and all the available data formats.