Sensory will bring generative AI to voice chat for consumer devices.
Sensory, a maker of voice AI for consumer products, has announced that it has integrated ChatGPT and other AI models to drive conversational voice responses on consumer devices.
The integration is aimed at making devices that lack keyboards and large screens smarter. Targeting in-ear voice assistants, smartwatches, smartphones, automotive infotainment systems, and more, it delivers a fast and seamless conversational experience on consumer products and unlocks voice chat capabilities for numerous electronics companies and their customers, the company said.
“Generative AI has the potential to make consumer devices smarter than ever. Integrating this powerful new technology with our robust voice AI stack is a game-changer for the market, and allows our customers to create a new generation of infinitely capable voice assistants tailored to a variety of customized domains,” said Todd Mozer, CEO of Sensory, in a statement.
The company said it has an established reputation for highly accurate voice AI solutions, and that generative AI makes those solutions even more accurate. Sensory’s enabling technology stack includes:
- Wake word recognition.
- Accurate speech-to-text with context and AI-generated prompt engineering to ensure ideal generative AI results.
- Intelligent response selection, which helps avoid unpredictable and incorrect responses, aka ‘AI hallucinations,’ that can occur on platforms relying solely on generative AI.
- Text-to-speech, which lets users hear the generated responses in a natural voice.
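The stages listed above can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions, not Sensory's implementation: every name in it (`detect_wake_word`, `build_prompt`, `select_response`, `handle_utterance`, the wake phrase) is hypothetical, speech-to-text is assumed to have already produced a transcript, and the generative model, the validation check, and the text-to-speech engine are passed in as plain callables.

```python
from typing import Callable, Optional

# All names here are hypothetical placeholders, not Sensory APIs.
WAKE_WORD = "hey device"  # assumed wake phrase, for illustration only
FALLBACK = "Sorry, I can't help with that."


def detect_wake_word(transcript: str) -> Optional[str]:
    """Return the text that follows the wake word, or None if it is absent."""
    text = transcript.strip()
    if text.lower().startswith(WAKE_WORD):
        return text[len(WAKE_WORD):].strip(" ,")
    return None


def build_prompt(query: str, domain: str) -> str:
    """Wrap the recognized text with domain context before calling the model."""
    return f"You are a voice assistant for {domain}. Answer briefly.\nUser: {query}"


def select_response(candidate: str, is_acceptable: Callable[[str], bool]) -> str:
    """Keep the generated answer only if it passes the validation check."""
    return candidate if candidate and is_acceptable(candidate) else FALLBACK


def handle_utterance(
    transcript: str,
    domain: str,
    generate: Callable[[str], str],
    is_acceptable: Callable[[str], bool],
    speak: Callable[[str], None],
) -> Optional[str]:
    """Run one utterance through wake word, prompt, selection, and speech."""
    query = detect_wake_word(transcript)
    if query is None:
        return None  # no wake word, so the device stays silent
    answer = select_response(generate(build_prompt(query, domain)), is_acceptable)
    speak(answer)
    return answer
```

Passing the model and the speech engine in as callables keeps the sketch independent of any particular vendor; a real product would substitute its own recognizers and a far stronger response check.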
Sensory’s conversational AI stack also allows users to ask follow-up questions and issue commands that filter, sort, or add information to the original request, making the conversation more natural and human-like.
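One way to picture follow-up handling is a small dialog context that remembers the results of the original request and refines them in place. This is a hedged sketch only: the `DialogContext` class, its methods, and the sample restaurant data are all hypothetical and are not taken from Sensory's stack.

```python
from dataclasses import dataclass, field


@dataclass
class DialogContext:
    """Hypothetical holder for the results of the original request."""
    results: list = field(default_factory=list)

    def start(self, results):
        """Store the results of a new request, replacing any earlier ones."""
        self.results = list(results)
        return self.results

    def filter_by(self, key, value):
        """Follow-up such as 'only the Thai ones'."""
        self.results = [r for r in self.results if r.get(key) == value]
        return self.results

    def sort_by(self, key, descending=False):
        """Follow-up such as 'sort them by rating'."""
        self.results = sorted(self.results, key=lambda r: r[key], reverse=descending)
        return self.results


# Example turn sequence with made-up data.
ctx = DialogContext()
ctx.start([
    {"name": "Baan", "cuisine": "thai", "rating": 4.1},
    {"name": "Luigi", "cuisine": "italian", "rating": 4.6},
    {"name": "Siam", "cuisine": "thai", "rating": 4.7},
])
ctx.filter_by("cuisine", "thai")
top = ctx.sort_by("rating", descending=True)
```

Because each follow-up operates on the stored results rather than starting a new query, the user does not have to repeat the original request.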
“This launch expands Sensory’s capabilities to bring voice chat capabilities to devices of all types, giving businesses the opportunity to create more engaging and interactive products,” said Mozer.
With Sensory’s hybrid cloud + edge AI platform, customers can implement a number of powerful AI technologies to bolster the end-user experience and security, and can split AI inference duties between edge devices and the cloud.
Take a smartwatch as an example of an ultra-low-power device: light-duty AI such as wake word recognition, speaker verification, simple voice controls, and sound identification can run on-device. More complex AI inference, such as wake-word, speaker, and sound ID revalidation, as well as domain-specific assistants and natural language understanding engines, can be routed to a more powerful connected device like a smartphone. High-horsepower AI inference, such as generative AI and today’s generation of voice chat, improved revalidation, face and object recognition, and more, can be routed to the cloud.
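The three-way split described above can be sketched as a lookup table from task to execution tier. The task names, the `Tier` enum, and the `route` function below are hypothetical labels chosen for illustration; the behavior when a tier is unreachable (returning `None`) is an assumption, since the article does not describe a fallback policy.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    DEVICE = "on-device"
    COMPANION = "connected device"  # e.g. a paired smartphone
    CLOUD = "cloud"


# Hypothetical task names mapped to the tiers named in the article.
ROUTES = {
    "wake_word": Tier.DEVICE,
    "speaker_verification": Tier.DEVICE,
    "simple_voice_control": Tier.DEVICE,
    "sound_identification": Tier.DEVICE,
    "revalidation": Tier.COMPANION,
    "domain_assistant": Tier.COMPANION,
    "natural_language_understanding": Tier.COMPANION,
    "generative_ai": Tier.CLOUD,
    "improved_revalidation": Tier.CLOUD,
    "face_recognition": Tier.CLOUD,
    "object_recognition": Tier.CLOUD,
}


def route(task: str, companion_available: bool = True,
          cloud_available: bool = True) -> Optional[Tier]:
    """Return the tier that should run the task, or None if it is unreachable."""
    tier = ROUTES[task]  # raises KeyError for an unknown task
    if tier is Tier.COMPANION and not companion_available:
        return None  # assumed policy: no fallback to another tier
    if tier is Tier.CLOUD and not cloud_available:
        return None
    return tier
```

For instance, `route("wake_word")` yields `Tier.DEVICE` regardless of connectivity, while `route("generative_ai", cloud_available=False)` yields `None` under the assumed policy.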
SensoryCloud’s voice assistant solution is powered by a technology stack that includes Go, gRPC, NVIDIA Triton, and AWS Global Accelerator. The Go programming language is used to build scalable, high-performance applications that can handle demanding workloads, and gRPC enables the creation of SDKs for communication between components. SensoryCloud also uses proprietary techniques to compress dialog data, which reduces cloud fees and latency.