The World’s First Self-Learning Artificial Intelligence For Speech-to-Text
Soniox Inc has launched the Soniox AI Speech Recognition Platform, which it describes as the world’s first self-learning artificial intelligence for automatic speech recognition. Soniox Speech AI leverages vast amounts of available unlabeled audio and text to teach itself how to recognise complex speech patterns. As a result, Soniox Speech AI can accurately recognise speech in real-world environments on most topics of human knowledge, with a word error rate up to 24 per cent better than that of today’s leading speech systems.
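For readers unfamiliar with the metric, word error rate (WER) is conventionally the number of word substitutions, deletions and insertions needed to turn a transcript into the reference text, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not Soniox's evaluation code. The announcement does not say how the 24 per cent figure was calculated, so the `relative_wer_improvement` helper assumes it is a relative reduction in WER, and the numbers in the usage lines are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Assumed reading of 'X per cent improved WER': relative reduction."""
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only, not measurements from Soniox or any other system.
wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
improvement = relative_wer_improvement(baseline_wer=0.10, new_wer=0.076)
```

Under this reading, a system that lowers WER from 10 per cent to 7.6 per cent has improved by 24 per cent in relative terms, even though the absolute change is 2.4 percentage points.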
Soniox has invented a novel approach to training speech recognition models to overcome today’s speech recognition limitations. In an unsupervised fashion, Soniox Speech AI learns from vast amounts of unlabeled audio and unlabeled text that are publicly available on the internet. It learns to recognise words by exploring different interpretations of spoken words in unlabeled audio and their usage in unlabeled written text. According to the company, Soniox Speech AI can now recognise most words in the English language almost error-free, without requiring direct human supervision.
Previously, humans had to transcribe audio manually to create accurate labelled speech-to-text datasets, which made training speech recognition systems extremely time-consuming and expensive. Collecting labelled data was made harder still by the extreme variety of the speech input and output space. With existing approaches, it was practically infeasible to obtain enough paired audio-transcript data to cover that space.
In contrast, Soniox Speech AI continuously learns and improves automatically as it gains access to more unlabeled audio and unlabeled text. With each iteration, Soniox Speech AI gets slightly better and can correctly interpret and recognise more of the words across human knowledge.
‘Audio is becoming the prevalent medium for rapid, immersive communication’, said Klemen Simonic, Founder and CEO of Soniox. ‘With our self-learning AI platform, Soniox has built the industry’s strongest infrastructure and toolset to build advanced speech and audio understanding solutions. Our self-learning speech AI is the first example of how Soniox can solve hard problems differently. Expect more to come in the near future!’
To make speech recognition accessible and easy to use, Soniox has built both the Soniox web application and the Soniox mobile application (for iOS devices). Among other features, these applications enable users to instantly transcribe audio/video files or live streams, such as meetings and conversations. These products are available for free for up to 5 hours of speech recognition per month.
For developers and businesses, Soniox has developed a speech recognition API that can be used from virtually any programming language and platform. To simplify integration, Soniox offers easy-to-use Python and JavaScript client libraries with tutorials and extensive documentation. Only a few lines of code are required to integrate world-class speech recognition into (almost) any application.
Privacy and security are critical for speech recognition use. Soniox has developed an on-premises deployment of Soniox Speech AI, where the entire system is deployed within the enterprise’s infrastructure. The on-premises deployment supports efficient and distributed processing of large volumes of audio in real-time and low-latency settings. Soniox has also developed an on-mobile-device deployment of Soniox Speech AI for iOS devices. The entire computation takes place on the mobile device and the audio never leaves the device. It also eliminates the requirement for network connectivity while transcribing audio streams.