In this guide we will go through the steps needed to generate your first audio file using our PlayHT Async Voices API.
Realtime Streaming
This guide is for async audio file generation. If you are looking for realtime streaming, check this guide instead.
Here is a demo of using the Node.js SDK to download an audio file and save it locally:
Using the V2 API
The first step is to get your API Secret Key and User ID. Please refer to our authentication page if you haven't already.
Your secret key is sent in the Authorization header. Your User ID is provided in the X-USER-ID header. An example cURL command is shown below with placeholders for the credentials.
The only two required body parameters are text, which contains the text to be converted to audio, and voice. In the example below we use a PlayHT2.0 voice. You can get the full list of PlayHT voices from our /v2/voices endpoint.
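As a rough sketch of how the voice list could be fetched from Node.js (18 or later, for the built-in fetch): the helper names buildVoicesRequest and listVoices are hypothetical, and the full URL is an assumption that reuses the base URL of the cURL example in this guide. The parsed JSON is returned unchanged, since the exact response shape is not described here.

```javascript
// Assumption: the voices endpoint lives under the same base URL as the
// /v2/tts endpoint used elsewhere in this guide.
const PLAYHT_API_BASE = "https://play.ht/api/v2";

// Hypothetical helper: builds the URL and fetch options for listing voices.
function buildVoicesRequest(secretKey, userId) {
  return {
    url: `${PLAYHT_API_BASE}/voices`,
    options: {
      method: "GET",
      headers: {
        Authorization: `Bearer ${secretKey}`,
        "X-USER-ID": userId,
        accept: "application/json",
      },
    },
  };
}

// Hypothetical helper: sends the request and returns the parsed JSON body.
async function listVoices(secretKey, userId) {
  const { url, options } = buildVoicesRequest(secretKey, userId);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Listing voices failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Calling listVoices(process.env.PLAYHT_API_KEY, process.env.PLAYHT_USER_ID) should then resolve to whatever the endpoint returns; a voice identifier taken from that response can be passed as the voice parameter.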
We want the API to return a stream of events with the status of the audio generation job. To do that, we send accept: text/event-stream in the header. This way we don't need to call the Get text-to-speech job data endpoint to fetch the job status.
Here's the final POST request we are sending to the generate audio endpoint:
curl --request POST \
  --url https://play.ht/api/v2/tts \
  --header 'AUTHORIZATION: Bearer YOUR_SECRET_KEY_HERE' \
  --header 'X-USER-ID: YOUR_USER_ID_HERE' \
  --header 'accept: text/event-stream' \
  --header 'content-type: application/json' \
  --data '
{
  "text": "Check out this realistic generated speech!",
  "voice": "s3://peregrine-voices/oliver_narrative2_parrot_saad/manifest.json",
  "voice_engine": "PlayHT2.0"
}
'
A similar request using the Node.js SDK:
import * as PlayHTAPI from "playht";

// Initialize PlayHTAPI with credentials read from the environment
PlayHTAPI.init({
  apiKey: process.env.PLAYHT_API_KEY,
  userId: process.env.PLAYHT_USER_ID,
});

const sentence =
  "hello, play support speaking? Please hold on a second, uh Let me just, um, pull up your details real quick.";

// Generate the audio file and wait for the job to finish
const response = await PlayHTAPI.generate(sentence, {
  voiceId:
    "s3://voice-cloning-zero-shot/d9ff78ba-d016-47f6-b0ef-dd630f59414e/female-cs/manifest.json",
  outputFormat: "mp3",
  voiceEngine: "PlayHT2.0",
  sampleRate: 44100,
  speed: 1,
});

// Print the URL of the generated audio file
console.log(response.audioUrl);
Replace YOUR_SECRET_KEY_HERE and YOUR_USER_ID_HERE with your actual API Secret Key and User ID.
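For readers who prefer plain HTTP over the SDK, the cURL command above could be mirrored in Node.js (18 or later) roughly as follows. This is a minimal sketch: buildTtsRequest and requestSpeech are hypothetical helper names, and only the headers and body fields shown in the cURL example are used.

```javascript
// Endpoint taken from the cURL example in this guide.
const TTS_URL = "https://play.ht/api/v2/tts";

// Hypothetical helper: checks the two required body parameters and
// returns the URL and options to pass to fetch().
function buildTtsRequest({ secretKey, userId, text, voice, voiceEngine }) {
  if (!text) throw new Error("text is required");
  if (!voice) throw new Error("voice is required");

  const body = { text, voice };
  if (voiceEngine) body.voice_engine = voiceEngine; // optional, e.g. "PlayHT2.0"

  return {
    url: TTS_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${secretKey}`,
        "X-USER-ID": userId,
        accept: "text/event-stream", // ask for job status events
        "content-type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical helper: sends the request and resolves with the whole
// event stream as one string once the server closes the connection.
async function requestSpeech(params) {
  const { url, options } = buildTtsRequest(params);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Speech request failed with HTTP ${response.status}`);
  }
  return response.text();
}
```

Buffering the full stream with response.text() keeps the sketch short; an application that wants live progress updates would read response.body incrementally instead.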
The response, as specified by the accept header, will be a text event stream:
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0,"stage":"queued"}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.01,"stage":"active"}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.01,"stage":"preload","stageProgress":0}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.11,"stage":"preload","stageProgress":0.5}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.21,"stage":"preload","stageProgress":1}
event: ping
data: 2023-03-27T02:20:37.800Z
event: ping
data: 2023-03-27T02:20:52.801Z
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.21,"stage":"generate","stageProgress":0}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.32,"stage":"generate","stageProgress":0.2}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.53,"stage":"generate","stageProgress":0.6}
event: ping
data: 2023-03-27T02:21:07.801Z
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.55,"stage":"generate","stageProgress":0.64}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.57,"stage":"generate","stageProgress":0.68}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.74,"stage":"generate","stageProgress":1}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.74,"stage":"postprocessing","stageProgress":0}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.82,"stage":"postprocessing","stageProgress":0.33}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.91,"stage":"postprocessing","stageProgress":0.67}
event: generating
data: {"id":"1i01A5fd8O35iLkhuJ","progress":0.99,"stage":"postprocessing","stageProgress":1}
event: completed
data: {"id":"1i01A5fd8O35iLkhuJ","progress":1,"stage":"complete","url":"https://peregrine-results.s3.amazonaws.com/pigeon/IZ5jJmV1ecnZaVuoIK_0.mp3","duration":2.4107,"size":49965}
The audio URL will be available in the url property of the last event, completed. In this case, https://peregrine-results.s3.amazonaws.com/pigeon/IZ5jJmV1ecnZaVuoIK_0.mp3.
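To illustrate how the URL could be pulled out of such a stream programmatically, here is a small sketch. The function names parseEventStream and findAudioUrl are hypothetical, and the parser is deliberately simplified: it only understands the event and data fields seen in the sample above and assumes each event carries a single data line.

```javascript
// Splits a text/event-stream body into { event, data } pairs.
// Simplification: one data line per event, other SSE fields are ignored.
function parseEventStream(streamText) {
  const events = [];
  let eventName = "message"; // SSE default when no event field is sent
  for (const rawLine of streamText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("event:")) {
      eventName = line.slice("event:".length).trim();
    } else if (line.startsWith("data:")) {
      events.push({ event: eventName, data: line.slice("data:".length).trim() });
      eventName = "message"; // reset for the next event
    }
  }
  return events;
}

// Returns the url from the completed event, or null if the stream
// does not contain one. Ping events carry a timestamp, not JSON, so
// only the completed event's data is parsed.
function findAudioUrl(streamText) {
  const completed = parseEventStream(streamText).find(
    (e) => e.event === "completed"
  );
  if (!completed) return null;
  return JSON.parse(completed.data).url ?? null;
}
```

Passing the full response text of the request to findAudioUrl yields the audio URL once the job has finished, and null if the stream ended before a completed event arrived.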
You just generated your first audio with the PlayHT API!
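If a local copy of the audio is wanted, the URL from the completed event can be downloaded and written to disk. The following is a minimal sketch for Node.js 18 or later; fileNameFromUrl and saveAudio are hypothetical helper names, and the fallback file name audio.mp3 is an arbitrary choice.

```javascript
// Derives a local file name from the last path segment of the audio URL.
function fileNameFromUrl(audioUrl) {
  const lastSegment = new URL(audioUrl).pathname.split("/").pop();
  return lastSegment || "audio.mp3"; // fallback when the path ends in "/"
}

// Downloads the audio and writes it into the given directory.
// Resolves with the path of the written file.
async function saveAudio(audioUrl, directory = ".") {
  // Dynamic imports keep the sketch usable from both ESM and CommonJS.
  const { writeFile } = await import("node:fs/promises");
  const { join } = await import("node:path");

  const response = await fetch(audioUrl);
  if (!response.ok) {
    throw new Error(`Download failed with HTTP ${response.status}`);
  }
  const bytes = Buffer.from(await response.arrayBuffer());
  const filePath = join(directory, fileNameFromUrl(audioUrl));
  await writeFile(filePath, bytes);
  return filePath;
}
```

For the example URL above, saveAudio would write a file named IZ5jJmV1ecnZaVuoIK_0.mp3 into the current directory.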
If you are looking for the Standard & Premium Voices API (V1), please refer to its specific doc pages. The section on the left covers the full range of parameters and all the data formats available.