Lyra V2 open-source audio codec gets faster, higher quality and compatible with more platforms

Lyra V2 is an update to the open-source Lyra audio codec introduced last year by Google, with a new architecture that offers scalable bitrate capabilities, better performance, higher quality audio, and works on more platforms.

Under the hood, Lyra V2 is based on an end-to-end neural audio codec called SoundStream with a “residual vector quantizer” (RVQ) sitting before and after the transmission channel, and that can change the audio bitrate at any time by selecting the number of quantizers to use. Three bitrates are supported: 3.2 kps, 6 kbps, and 9.2 kbps. Lyra V2 leverages artificial intelligence, and a TensorFlow Lite model enables it to run on Android phones, Linux, as well as Mac and Windows although support for the latter two is experimental. iOS and other embedded platforms are not supported at this time, but this may change in the future.

Lyra V2 vs OpusIt gets more interesting once we start to compare Lyra V2 against other audio codecs such as Lyra (V1) and Opus with the new audio codec delivering a higher quality (MUSHRA score) than those at a given bitrate, and the chart above shows Lyra V2 @ 9.2 kbps offers about the same quality as Opus at 14 kbps.

Latency has also been improved from 100ms to 20 ms, making the second generation codec comparable to Opus for WebRTC, which has a typical delay of 26.5 ms, 46.5 ms, and 66.5 ms. Lyra V2 also encodes and decodes five times faster than Lyra V1 to enable real-time audio encoding/decoding and lower power consumption. For instance, the new audio codec takes 0.57 ms to encode and decode a 20 ms audio frame on a Pixel 6 Pro phone, or about 35 times faster than real-time.

Lyra V2 vs AMR NB vs AMR WB Opus

While LyraV1 would compare to AMR-NB, Lyra V2 offers improved quality compared to Enhanced Voice Services (EVS) and Adaptive Multi-Rate Wideband (AMR-WB), and similar quality as Opus while using just about 50% to 60% of the bandwidth.

The source code for Lyra V1/V2 implementation can be found on Github with the C++ API most of the same since the first release, except for a few changes such as the ability to change the bitrate during encoding. The model definitions and weights are also included as .tflite files.

More details and audio samples can be found on Google Open Source Blog.

Share this:

Support CNX Software! Donate via cryptocurrencies, become a Patron on Patreon, or purchase goods on Amazon or Aliexpress

Radxa Orion O6 Armv9 mini-ITX motherboard
Subscribe
Notify of
guest
The comment form collects your name, email and content to allow us keep track of the comments placed on the website. Please read and accept our website Terms and Privacy Policy to post a comment.
5 Comments
oldest
newest
2 years ago

One nitpick: 26.5 ms, 46.5 ms, and 66.5 ms are for combined mode of Opus, but you can go sub-10ms with higher bitrate and computational cost.

Stuart Naylor
2 years ago

Its a strange codec Lyra not just lossy but really complete loss where you are recreating voice based on a MFCC stream.

Pixel 6 Pro phone, or about 35 times faster than real-time

That is on a 6 TOPs NPU and as codecs go, its a jaw dropping amount of processing relative to other codecs.
Its a very strange codec with unparalleled low bandwidth but also decode processing and essentially its not even your own voice but a AI driven deep fake.

Tony Kao
2 years ago

I listened to the released samples and there’s definitely an element of uncanny valley to the compressed speech, it sounds like it’s uncompressed but off.

David Willmore
David Willmore
2 years ago

Why are they comparing a voice band codec against a general purpose audio codec like Opus? Yes, OPUS has SILK for the low bit rates, but that’s not improved for many years. If you want to compare with something meant to compete in this segment, use Codec2. Its max bit rate is the same as lyra v2’s lowest.

Also, compare the computational differences between the two. Codec2 works on microcontrollers. Good luck getting Lyra to run on anything without a dedicated NPU.

David
David
2 years ago

+1 David Willmore. You nailed it. Link to Codec 2:

https://www.rowetel.com/?page_id=452

Boardcon EM3562 Rockchip RK3562 SBC with 8 analog camera inputs