
WebRTC Fork of Simplified OpenAI Realtime Console

Published at
Dec 21, 2024

Simple Realtime Console - now with WebRTC

Note: This is a fork of swyx's simple realtime console

This project was originally created by swyx as a WebSocket-based demo for OpenAI's Realtime API - a simplified version of the official demo but on ozempic 💉. It has since been migrated to use WebRTC for improved audio streaming capabilities.

The original project stripped out SCSS, added Tailwind, and removed roughly 1,200 lines of code while keeping all core functionality. You can see the original diffs here.

Clean Console UI

Key improvements from swyx's version:

  • Suppressed less useful event spam into the console
  • Made transcripts log nicely
  • Added memory injection that starts with initial context
  • Added mute button for better control
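
As a rough illustration of the first two points, here is a minimal sketch of an event filter for the log. The event type strings are ones the Realtime API emits, but which events this fork actually hides and how it formats transcripts are assumptions, and `formatForLog` is a hypothetical name.

```typescript
// Minimal shape of a Realtime server event; only the fields used here.
interface RealtimeEvent {
  type: string;
  transcript?: string;
}

// Returns the line to show in the console, or null to suppress the event.
// Assumption: streaming ".delta" events and rate-limit updates are the "spam".
function formatForLog(event: RealtimeEvent): string | null {
  if (event.type.endsWith(".delta") || event.type === "rate_limits.updated") {
    return null;
  }
  if (event.type === "response.audio_transcript.done") {
    return `assistant: ${event.transcript ?? ""}`;
  }
  if (event.type === "conversation.item.input_audio_transcription.completed") {
    return `user: ${event.transcript ?? ""}`;
  }
  // Everything else is logged by its type only.
  return event.type;
}
```

Returning null rather than an empty string lets the caller skip the entry entirely instead of rendering a blank row.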

Nice Logging

New Features

Setup

  1. Clone this repository
  2. Copy .env.example to .env and add your OpenAI API key

Backend Setup

cd server
npm install
npm run dev

The server will start on http://localhost:3001

Frontend Setup

npm install
npm run dev

The frontend will be available at http://localhost:3000

Using the Console

The console now uses a secure backend to handle API keys. You no longer need to enter your API key in the frontend - just add it to your .env file.
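
To make the key handling concrete, here is a sketch of what the backend might do: exchange the real API key for a short-lived client secret and hand only that to the browser. The endpoint and response shape follow OpenAI's Realtime session API as documented in late 2024. The helper names, model, and voice are assumptions, not taken from this repository.

```typescript
// Plain description of an HTTP request, so the sketch needs no HTTP library.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: the request the server would send to mint a session.
// The default model and voice are assumed values.
function buildSessionRequest(
  apiKey: string,
  model = "gpt-4o-realtime-preview-2024-12-17",
  voice = "verse",
): HttpRequest {
  if (!apiKey) throw new Error("OPENAI_API_KEY is not set");
  return {
    url: "https://api.openai.com/v1/realtime/sessions",
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, voice }),
  };
}

// Extracts the ephemeral token so the real key never leaves the server.
function extractClientSecret(sessionJson: string): string {
  const parsed = JSON.parse(sessionJson) as { client_secret?: { value?: string } };
  const value = parsed.client_secret?.value;
  if (!value) throw new Error("session response had no client_secret.value");
  return value;
}
```

In an Express route, the server would send this request with `fetch`, pass the response text to `extractClientSecret`, and return only the result to the frontend.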

To start a session:

  1. Click Connect (this will request microphone access)
  2. Start speaking! The console uses Voice Activity Detection (VAD) to detect when you start and stop talking
  3. You can interrupt the model at any time
  4. Use the Mute button to control your microphone
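
One common way to implement a mute button over WebRTC is to toggle `enabled` on the microphone's audio tracks, which sends silence without tearing down the connection. The sketch below assumes that approach. Whether this fork does it the same way is not confirmed, and `setMuted` is a hypothetical name.

```typescript
// Structural subsets of MediaStreamTrack / MediaStream, so the helper can be
// exercised outside a browser. A real MediaStream satisfies AudioStreamLike.
interface AudioTrackLike {
  enabled: boolean;
}
interface AudioStreamLike {
  getAudioTracks(): AudioTrackLike[];
}

// Mutes or unmutes every audio track on the stream.
// A disabled track keeps the connection alive but carries silence.
function setMuted(stream: AudioStreamLike, muted: boolean): void {
  for (const track of stream.getAudioTracks()) {
    track.enabled = !muted;
  }
}
```

In the browser this would be called with the stream returned by `navigator.mediaDevices.getUserMedia({ audio: true })`.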

Memory System

One function is enabled:

  • set_memory: Ask the model to remember information, which is stored in a JSON blob shown on the left

The memory starts with some basic initial content to get you started.
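
Here is a sketch of how a set_memory function could be declared and handled. The tool object uses the Realtime API's function-tool format. The `key` and `value` parameter names, the description text, and the `applySetMemory` helper are assumptions for illustration, not confirmed against this fork's source.

```typescript
type Memory = Record<string, string>;

// Tool declaration as it might be sent in a `session.update` event.
// Parameter names are assumed.
const setMemoryTool = {
  type: "function",
  name: "set_memory",
  description: "Saves important data about the user into memory.",
  parameters: {
    type: "object",
    properties: {
      key: { type: "string", description: "Name of the fact to remember" },
      value: { type: "string", description: "The fact itself" },
    },
    required: ["key", "value"],
  },
} as const;

// Applies one function call to the memory blob. `argumentsJson` is the JSON
// string the model supplies as the call's arguments. Returns a new object so
// UI state (e.g. React) can detect the change.
function applySetMemory(memory: Memory, argumentsJson: string): Memory {
  const args = JSON.parse(argumentsJson) as { key?: unknown; value?: unknown };
  if (typeof args.key !== "string" || typeof args.value !== "string") {
    throw new Error("set_memory expects string `key` and `value`");
  }
  return { ...memory, [args.key]: args.value };
}
```

For example, starting from `{ language: "English" }`, a call with arguments `{"key":"city","value":"Lisbon"}` would yield a blob containing both entries.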

Architecture

  • Frontend: React + TypeScript + Vite
  • Backend: Express + TypeScript
  • Communication: WebRTC for real-time audio streaming
  • Security: Backend-generated ephemeral tokens for API access
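
To show how the last two pieces fit together, here is a sketch of the SDP exchange the frontend would perform once it holds an ephemeral token. The URL and `application/sdp` content type follow OpenAI's WebRTC guide from late 2024. The helper name and default model are assumptions, and the fork's actual code may be organized differently.

```typescript
// Plain description of an HTTP request, so the sketch runs without a browser.
interface SdpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the request that carries the browser's SDP offer.
// Browser flow around it (not executed here):
//   1. pc = new RTCPeerConnection(); add the microphone track
//   2. offer = await pc.createOffer(); await pc.setLocalDescription(offer)
//   3. send this request with offer.sdp as the body
//   4. await pc.setRemoteDescription({ type: "answer", sdp: responseText })
function buildSdpOfferRequest(
  offerSdp: string,
  ephemeralKey: string,
  model = "gpt-4o-realtime-preview-2024-12-17", // assumed model name
): SdpRequest {
  return {
    url: `https://api.openai.com/v1/realtime?model=${encodeURIComponent(model)}`,
    method: "POST",
    headers: {
      // The short-lived token from the backend, never the real API key.
      Authorization: `Bearer ${ephemeralKey}`,
      "Content-Type": "application/sdp",
    },
    body: offerSdp,
  };
}
```

Because only the ephemeral token appears in this request, the long-lived key stays on the Express server.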

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT