Make ChatGPT speak by using PlayHT text-to-speech APIs

In this guide, we'll learn how to convert ChatGPT text output into speech using the PlayHT Node.js SDK. You'll need Node.js installed. Get ready to make your bot talk!

Here is a fully working demo, followed by a step-by-step guide to how it works:

💡 Authenticating: Make sure to use your PlayHT and OpenAI API keys in the Replit demo below.

Here is a recording of how the above demo works:

Setting up the environment

First, let's set up the environment. We'll create a new folder called talkingGPT and use npm to install the dependencies. Fire up your terminal and type:

mkdir talkingGPT
cd talkingGPT
npm init -y
npm install --save playht openai

Authenticating

You'll need credentials to authenticate with both the OpenAI and PlayHT APIs. Replace the placeholders in the code with your actual API keys.

import * as PlayHT from 'playht';  
import OpenAI from 'openai';

PlayHT.init({  
  apiKey: 'YOUR_PLAYHT_API_KEY',  
  userId: 'YOUR_PLAYHT_USER_ID',  
});

const openai = new OpenAI({  
  apiKey: 'YOUR_OPENAI_API_KEY',  
  organization: 'YOUR_OPENAI_ORG_ID',  
});
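If you would rather not hardcode keys in the source file, one common alternative is to read them from environment variables. The sketch below is only an illustration: the variable names (PLAYHT_API_KEY, PLAYHT_USER_ID, OPENAI_API_KEY) and the helpers requireEnv and loadCredentials are assumptions made up for this example, not part of either SDK.

```javascript
// Hypothetical helper: returns the value of an environment variable,
// or throws a descriptive error if it is missing or empty.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Collects the credentials used in this guide.
// The variable names are assumptions chosen for this sketch.
function loadCredentials(env = process.env) {
  return {
    playht: {
      apiKey: requireEnv('PLAYHT_API_KEY', env),
      userId: requireEnv('PLAYHT_USER_ID', env),
    },
    openai: {
      apiKey: requireEnv('OPENAI_API_KEY', env),
    },
  };
}
```

With this in place, the setup above could become PlayHT.init(credentials.playht) and new OpenAI(credentials.openai), where credentials is the result of loadCredentials().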

Getting a text stream from ChatGPT

Now we need to convert ChatGPT's API output into a text stream. For that we'll use the stream: true option when calling the OpenAI ChatGPT API. Let's use "Tell me a joke." as the prompt:

const chatGptResponseStream = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Tell me a joke.' }],
  model: 'gpt-3.5-turbo',
  stream: true,
});

This will stream ChatGPT responses as chunk objects. We need to extract only the text into a new stream. We can do that by reading the contents of the object like this:

import { Readable } from 'node:stream';

const textStream = new Readable({
  async read() {
    for await (const part of chatGptResponseStream) {
      this.push(part.choices[0]?.delta?.content || '');
    }
    // End the stream
    this.push(null);
  },
});

Now textStream will provide the text output from ChatGPT's response. Let's move on to converting this text into audio!
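To see the extraction step in isolation, here is a small sketch that runs without any API keys. It uses a stand-in generator (fakeChunks, made up for this example) in place of the real OpenAI response, and it builds the text stream with Readable.from and an async generator rather than a custom read() method. This is an alternative formulation, not the approach used in the rest of this guide. Node may call read() more than once over the life of a stream, so some readers may find the generator version easier to reason about. Whether PlayHT.stream() behaves identically with it is an assumption you should verify.

```javascript
import { Readable } from 'node:stream';

// Yields only the non-empty text fragments from a stream of chat chunks.
async function* extractText(chunkStream) {
  for await (const part of chunkStream) {
    const text = part.choices[0]?.delta?.content;
    if (text) yield text;
  }
}

// Wraps the generator in a Readable. objectMode: false makes consumers
// receive Buffers, like the `new Readable` version in this guide.
function toTextStream(chunkStream) {
  return Readable.from(extractText(chunkStream), { objectMode: false });
}

// Stand-in for the OpenAI response: chunk-shaped objects, some without text.
async function* fakeChunks() {
  yield { choices: [{ delta: { role: 'assistant' } }] };
  yield { choices: [{ delta: { content: 'Why did ' } }] };
  yield { choices: [{ delta: { content: 'the chicken...' } }] };
  yield { choices: [{ delta: {} }] };
}

// Collects everything a stream emits into a single string.
async function collect(stream) {
  let out = '';
  for await (const chunk of stream) {
    out += chunk.toString();
  }
  return out;
}
```

Calling collect(toTextStream(fakeChunks())) resolves to the concatenated text of the chunks that carried content. Chunks with no content are skipped.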

Generating speech

The PlayHT.stream() function conveniently supports a text stream as input. To generate speech while the text is still streaming in, all we need is:

const audioStream = await PlayHT.stream(textStream);

That's it! audioStream will start receiving audio data as soon as it's ready. For this example, let's save the output to a file so we can play it later:

import fs from 'fs';

const fileStream = fs.createWriteStream('hello-chatgpt.mp3');
audioStream.pipe(fileStream);
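pipe() returns immediately and does not report when writing has finished or whether it failed. If you need to know when the file is complete, one option is pipeline from node:stream/promises. The saveStream helper below is a hypothetical name for this sketch. It works with any readable stream, so nothing in it is specific to PlayHT.

```javascript
import fs from 'node:fs';
import { pipeline } from 'node:stream/promises';

// Writes a readable stream to a file and resolves once all data has been
// written. Rejects if either the source or the file stream errors.
async function saveStream(readable, filePath) {
  await pipeline(readable, fs.createWriteStream(filePath));
  return filePath;
}
```

In this guide, that would mean calling await saveStream(audioStream, 'hello-chatgpt.mp3') in place of the pipe() call.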

Putting it all together

This is what the final example looks like:

import * as PlayHT from 'playht';
import OpenAI from 'openai';
import { Readable } from 'node:stream';
import fs from 'fs';

PlayHT.init({
  apiKey: 'YOUR_PLAYHT_API_KEY',
  userId: 'YOUR_PLAYHT_USER_ID',
});

const openai = new OpenAI({
  organization: 'YOUR_OPENAI_ORG_ID',
  apiKey: 'YOUR_OPENAI_API_KEY',
});

const chatGptResponseStream = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Tell me a joke.' }],
  model: 'gpt-3.5-turbo',
  stream: true,
});

const textStream = new Readable({
  async read() {
    for await (const part of chatGptResponseStream) {
      this.push(part.choices[0]?.delta?.content || '');
    }
    // End the stream
    this.push(null);
  },
});

const audioStream = await PlayHT.stream(textStream);

const fileStream = fs.createWriteStream('hello-chatgpt.mp3');
audioStream.pipe(fileStream);

To run it, save the file as talkGPT.mjs, then run it with node:

node talkGPT.mjs

Now a file named hello-chatgpt.mp3 will be created with a ChatGPT joke for you. Go listen to it!

Wrapping up

If you want to see a full server implementation of streaming speech from ChatGPT, check out the example in our SDK repo.
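To give a rough idea of the shape such a server could take, here is a minimal sketch using only Node's built-in http module. It is not the example from the SDK repo. getAudioStream is a hypothetical function you would supply yourself, for instance one that wraps the OpenAI and PlayHT.stream() calls from this guide. The audio/mpeg content type assumes the audio is MP3, as in the file example above.

```javascript
import http from 'node:http';
import { pipeline } from 'node:stream/promises';

// Builds an HTTP server that streams audio to each client as it arrives.
// getAudioStream is a hypothetical async function supplied by the caller
// that resolves to a readable stream of audio data.
function createSpeechServer(getAudioStream) {
  return http.createServer(async (req, res) => {
    try {
      const audioStream = await getAudioStream();
      // Assumption: the audio is MP3.
      res.writeHead(200, { 'Content-Type': 'audio/mpeg' });
      // pipeline ends the response when the audio stream ends.
      await pipeline(audioStream, res);
    } catch (err) {
      if (!res.headersSent) {
        res.writeHead(500, { 'Content-Type': 'text/plain' });
        res.end('Failed to generate speech');
      } else {
        // Headers are already sent, so all we can do is drop the connection.
        res.destroy(err);
      }
    }
  });
}
```

A server built this way could be started with createSpeechServer(getAudioStream).listen(3000), where the port number is an arbitrary choice.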

By now, you've successfully combined the power of ChatGPT with the magic of PlayHT. Tweak the code to make it fit your needs and happy coding! 🎉🤖🔊