Show HN: open source framework OpenAI uses for Advanced Voice https://ift.tt/J9UgI2Y

Oktober 04, 2024

Show HN: open source framework OpenAI uses for Advanced Voice Hey HN, we've been working with OpenAI for the past few months on the new Realtime API. The goal is to give everyone access to the same stack that underpins Advanced Voice in the ChatGPT app. Under the hood it works like this: - A user's speech is captured by a LiveKit client SDK in the ChatGPT app - Their speech is streamed using WebRTC to OpenAI’s voice agent - The agent relays the speech prompt over websocket to GPT-4o - GPT-4o runs inference and streams speech packets (over websocket) back to the agent - The agent relays generated speech using WebRTC back to the user’s device The Realtime API that OpenAI launched is the websocket interface to GPT-4o. This backend framework covers the voice agent portion. Besides having additional logic like function calling, the agent fundamentally proxies WebRTC to websocket. The reason for this is because websocket isn’t the best choice for client-server communication. The vast majority of packet loss occurs between a server and client device and websocket doesn’t provide programmatic control or intervention in lossy network environments like WiFi or cellular. Packet loss leads to higher latency and choppy or garbled audio. https://ift.tt/WyiCgoP October 5, 2024 at 12:01AM

Cari Blog Ini

BlogViral

Show HN: open source framework OpenAI uses for Advanced Voice https://ift.tt/J9UgI2Y

Komentar

Posting Komentar

Postingan populer dari blog ini

Show HN: Guish – A GUI for constructing and executing Unix pipelines https://ift.tt/HrXz5ub

Twin Peaks for All: Survey Results

Taken with Transportation Podcast: For the Love of Muni