Integrating ChatGPT into Voice Prompts: Making Iron Man’s ‘JARVIS’

Ever since I first saw “Iron Man” I have been interested in a JARVIS-like program. The ability to have an assistant that can look up anything quickly just through voice commands seemed just so cool! But the technology just wasn’t really there (or at least it was too out of reach).

Queue ChatGPT and some new speech-to-text and text-to-speech capabilities.

Yes, Amazon Echo, and Siri, and others exist, but I wanted something a little more … contained.

I spent a day thinking about it, and ChatGPT seemed to be closer than it seemed. It just needed some voice libraries on either side of it, the input and output, to make it work — and so I wrote a POC:

"""
REQUIREMENTS:
    > pip install pyttsx3
        https://www.geeksforgeeks.org/text-to-speech-changing-voice-in-python/
    > pip install openai


    Speech to text

    [Windows] > pip install speechrecognition
    [Linux] > sudo apt-get install python3-pyaudio

    > pip install pyaudio
"""

import pyttsx3
import openai
import os
import speech_recognition as sr


class TextToSpeech:

    converter = None

    def __init__(self):
        self.converter = pyttsx3.init()
        self.converter.setProperty('rate', 150)
        self.converter.setProperty('volume', 0.7)

    def say_text(self, text):
        self.converter.say(text)
        self.converter.runAndWait()


class ChatGPT:

    # Generate API keys here: https://platform.openai.com/account/api-keys
    key = "<your openai key here>"
    model_engine = "text-davinci-002"
    temperature = 0.7
    max_tokens = 60

    def __init__(self):
        openai.api_key = self.key

    def prompt_GPT(self, prompt):
        response = openai.Completion.create(
            engine=self.model_engine,
            prompt=prompt,
            temperature=self.temperature,
            max_tokens=self.max_tokens
        )
        return response.choices[0].text.strip()


if __name__ == "__main__":
    tts = TextToSpeech()
    gpt = ChatGPT()
    r = sr.Recognizer()

    while(1):
        try:
            with sr.Microphone() as source2:
                r.adjust_for_ambient_noise(source2, duration=0.2)
                audio2 = r.listen(source2)
                prompt = r.recognize_google(audio2)
                prompt = prompt.lower()

                response = gpt.prompt_GPT(prompt)
                tts.say_text(response)

        except KeyboardInterrupt:
            tts.say_text('Shutting down')
            exit()
        except sr.RequestError as e:
            tts.say_text('Could not request results from google')
        except sr.UnknownValueError:
            tts.say_text('An unknown error occurred')
        except openai.error.Timeout:
            tts.say_text('Openai failed to respond to the prompt and timed out. Could not complete your request.')
        except:
            tts.say_text('An unknown error occurred')

I imagine others will have fun making similar home assistants.

Caleb Shortt

Technology and Other Interesting Topics

Integrating ChatGPT into Voice Prompts: Making Iron Man’s ‘JARVIS’

Share this: