Discussions

Voice artifacts processing streaming results from ChatGPT

In order to reduce latency we're following the documentation to implement [input streaming from an LLM](https://docs.play.ht/reference/integrating-with-chatgpt). I'm noticing that this creates artifacts in the generated voice, because LLMs sometimes break words into smaller pieces/tokens. For example, instead of "Cherry Blossom Festival" the LLM might return "Cherry", "Bloss", "om", and "Festival". This makes PlayHT insert an unnatural pause in the pronunciation of the word "blossom". Is this expected? We tried with both the V2 turbo and non-turbo models.
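One possible workaround, offered only as a sketch, is to merge the streamed fragments on the client and forward text to the TTS engine only at sentence boundaries, so that a word like "Blossom" is never split across two requests. The function name `buffer_into_sentences` and the choice of sentence-ending characters are assumptions for illustration; nothing here is part of the PlayHT SDK, and whether this removes the pause depends on how the service handles chunk boundaries.

```python
from typing import Iterable, Iterator

# Assumption: these characters mark a safe point to flush text to the TTS engine.
SENTENCE_END = (".", "!", "?")


def buffer_into_sentences(tokens: Iterable[str]) -> Iterator[str]:
    """Concatenate streamed LLM fragments and yield complete sentences."""
    buffer = ""
    for token in tokens:
        buffer += token
        stripped = buffer.strip()
        # Flush only when the accumulated text ends at a sentence boundary.
        if stripped.endswith(SENTENCE_END):
            yield stripped
            buffer = ""
    # Flush whatever is left when the stream ends without punctuation.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    fragments = ["Visit the Cherry", " Bloss", "om", " Festival", ".", " It", " is", " fun", "!"]
    for sentence in buffer_into_sentences(fragments):
        print(sentence)
```

This trades some latency for cleaner input: the first audio cannot start until the first sentence is complete. The boundary check is deliberately naive, so a fragment ending in a decimal point or an abbreviation would trigger an early flush.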

Posted by Antonio Z 4 days ago

Can I import a \*.docx file into Play.ht?

Can I import a \*.docx file into Play.ht?

Posted by TED HILDEBRANDT 4 days ago

Human Interruption

If I am building an AI to Human conversation app, and I want my AI/TTS to stop speaking if it detects a Human Interruption, how do I make the TTS mute/stop speaking?
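A common client-side pattern for this, sketched here under assumptions, is to play the streamed audio chunk by chunk and check a shared flag before each chunk, so that playback stops as soon as an interruption is detected. The names `play_until_interrupted` and `play_chunk` are hypothetical placeholders; the sketch does not call the PlayHT API, and how the interruption is detected (for example by voice activity detection on the microphone) is outside its scope.

```python
import threading
from typing import Callable, Iterable


def play_until_interrupted(
    chunks: Iterable[bytes],
    play_chunk: Callable[[bytes], None],
    interrupted: threading.Event,
) -> int:
    """Play audio chunks in order, stopping once `interrupted` is set.

    Returns the number of chunks that were actually played.
    """
    played = 0
    for chunk in chunks:
        # Check the flag before every chunk so the stop takes effect quickly.
        if interrupted.is_set():
            break
        play_chunk(chunk)
        played += 1
    return played


if __name__ == "__main__":
    stop = threading.Event()
    heard = []

    def fake_player(chunk: bytes) -> None:
        # Hypothetical stand-in for a real audio output call.
        heard.append(chunk)
        if len(heard) == 2:
            stop.set()  # Simulate the human starting to speak.

    count = play_until_interrupted([b"a", b"b", b"c", b"d"], fake_player, stop)
    print(count, heard)
```

In a real application the event would be set from another thread that listens to the microphone, and the stop latency is bounded by the duration of one chunk, so smaller chunks give a faster cut-off.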

Posted by Shubham Gupta 7 days ago

Utilizing Premium Voice 'Lucia' for Speech-to-Text in Spanish with API

I am interested in using your API to develop a speech-to-text application featuring the premium voice 'Lucia' for Spanish (Spain). Could you provide guidance on how to set up the API for this specific voice and any particular configurations needed for optimal performance in Spanish speech-to-text conversion? Additionally, I would appreciate any documentation or examples that could assist me in this integration.

Posted by Juanma 9 days ago

How to get access to V1 API.

Hello, we are planning to implement the text-to-speech API in our web application. One of the most important requirements is support for Arabic, along with other languages such as Dutch, English, Turkish and Persian. The website states that Arabic is supported. However, when I tried the V2 API, I saw only English voices. Going through the discussions, I learned that other languages are supported in V1 of the API via the endpoint (<https://docs.play.ht/reference/api-convert-tts-standard-premium-voices>). The problem is that any request to that endpoint returns "error": "You don't have access to this API". So, how can I use this API? And how much would it take to implement other languages such as Arabic, Dutch, Turkish and Persian in V2? Best regards, Alaa Semsemea

Posted by Alaa Semsemea 9 days ago

Loading on timeline

Hi, the voice shows as loading in the timeline but nothing happens. Even if I wait long enough, it just shows an error.

Posted by Arif Hasan 13 days ago

So cloned voices via the free plan are not accessible via API?

I entered my credentials here to list cloned voices, but the ID returned was just for a stock voice: <https://docs.play.ht/reference/api-list-cloned-voices> I couldn't see any other way to get the ID of my cloned voice, which is a shame, as it would be useful while testing and building. A lot more documentation would be good, e.g. what audio format and sample rate does the API response use? For cloning a high-fidelity voice on the upgraded plan, should we upload .wav, or is mp3 essentially equivalent? Where one cloned high-fidelity voice is allowed, can it be re-analyzed until it's right, or do we get just one shot? The documentation is rather limited from what I could see. I've had to do a lot of guessing and trial and error in coding to work it out.

Posted by James 19 days ago

code: 'ERR_UNKNOWN_BUILTIN_MODULE' (node.js)

Error [ERR_UNKNOWN_BUILTIN_MODULE]: No such built-in module: node:stream; at new NodeError (node:internal/errors:405:5) at ESMLoader.builtinStrategy (node:internal/modules/esm/translators:259:11) at ESMLoader.moduleProvider (node:internal/modules/esm/loader:468:14) { code: 'ERR_UNKNOWN_BUILTIN_MODULE' }

I am getting this error when I try to run the `npm run dev` command. I suspect it comes from the import of PassThrough shown below:

'use strict'; import fp from "fastify-plugin"; import { PassThrough } from "node:stream;"

Posted by Moses Oyelade 24 days ago

Voice must be a valid voice manifest uri

Hello, I am using the audio download method according to these instructions <https://docs.play.ht/reference/python-sdk-audio-streaming>, but unfortunately not all voices, such as Lance and Oliver, have a link ("voice must be a valid voice manifest uri"). Using the tool <https://play.ht/studio/files/64023049-b527-4072-b50f-bd5a1913de06> both of these voices work in the latest turbo mode, but in the documentation the lists of these voices do not have a URI assigned. Is it possible to download or generate such a URI somewhere for these voices?

Posted by gosc 26 days ago

Creating TTS jobs documentation is misleading

According to the docs (<https://docs.play.ht/reference/api-generate-audio>), this should work (NOTE the `null` values):

```bash
curl --request POST \
  --url https://api.play.ht/api/v2/tts \
  --header "Authorization: Bearer $PLAYHT_SECRET_KEY" \
  --header "X-USER-ID: $PLAYHT_USER_ID" \
  --header 'content-type: application/json' \
  --header 'Accept: application/json' \
  -d '
{
  "text": "What is life?",
  "voice": "s3://mockingbird-prod/abigail_vo_6661b91f-4012-44e3-ad12-589fbdee9948/voices/speaker/manifest.json",
  "quality": "low",
  "output_format": "mp3",
  "voice_engine": "PlayHT2.0",
  "emotion": "female_happy",
  "speed": 1,
  "temperature": null,
  "sample_rate": 24000,
  "seed": null,
  "voice_guidance": null,
  "style_guidance": null
}'

{"error_message":"An unexpected error occurred, please wait a few moments and try again. If the problem persists, please contact support.","error_id":"UNEXPECTED_ERROR"}
```

Alas, no. The problem is that the remote API endpoint can't deserialize `null` values, even though the docs explicitly mention them 😮‍💨

This works like a charm, though (NOTE: omitted the `null` values):

```bash
curl --request POST \
  --url https://api.play.ht/api/v2/tts \
  --header "Authorization: Bearer $PLAYHT_SECRET_KEY" \
  --header "X-USER-ID: $PLAYHT_USER_ID" \
  --header 'content-type: application/json' \
  --header 'Accept: application/json' \
  -d '{
  "text": "What is life?",
  "voice": "s3://mockingbird-prod/abigail_vo_6661b91f-4012-44e3-ad12-589fbdee9948/voices/speaker/manifest.json",
  "quality": "low",
  "output_format": "mp3",
  "voice_engine": "PlayHT2.0",
  "emotion": "female_happy",
  "speed": 1,
  "sample_rate": 24000
}'
```

So, I'd suggest either updating the API docs or fixing this in the backend.
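For anyone building the request body in code, the same workaround (leaving out the `null` fields) can be applied by filtering the payload before it is serialized. This is a minimal sketch: the helper name `drop_nulls` is made up for illustration, the field names are copied from the request above, and the sketch only builds the JSON body without sending it.

```python
import json
from typing import Any, Dict


def drop_nulls(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the payload without keys whose value is None."""
    return {key: value for key, value in payload.items() if value is not None}


if __name__ == "__main__":
    # Field names taken from the curl request in the post above.
    payload = {
        "text": "What is life?",
        "quality": "low",
        "output_format": "mp3",
        "voice_engine": "PlayHT2.0",
        "speed": 1,
        "temperature": None,
        "sample_rate": 24000,
        "seed": None,
        "voice_guidance": None,
        "style_guidance": None,
    }
    # json.dumps would otherwise emit `null` for every None value.
    print(json.dumps(drop_nulls(payload), indent=2))
```

The filter is shallow on purpose, since the request body in the post is a flat object; nested structures would need a recursive variant.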

Posted by Milos G 27 days ago