Google Pik Image Format Improves on Lossy JPEG and Lossless PNG

JPEG lossy compression is still used for most photos on the Internet, while PNG remains the preferred format for lossless compression. Back in 2010, Google unveiled WebP to improve on both, but it’s only very recently that I started to see a few WebP images on the Internet.

The company has been working on yet another image format: Pik, a lossy/lossless image format designed for high quality and fast decoding.

Butteraugli Heat Map used by Google Pik

Some of the features enabling high quality:

  • Built-in support for psychovisual modeling via adaptive quantization and XYB color space
  • 4×4..32×32 DCT, AC/DC predictors, chroma from luma, nonlinear loop filter, enhanced DC precision
  • Full-precision (32-bit float) processing, plus support for wide gamut and high dynamic range
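
To give a feel for what a DCT plus quantization stage does, here is a minimal sketch in Python. It is a generic 1D orthonormal DCT-II with a single uniform quantization step, not Pik's actual code: Pik uses 2D transforms of variable size with adaptive quantization, and the function names, the sample row, and the step size below are all assumptions made for illustration.

```python
import math
from typing import List


def dct_1d(samples: List[float]) -> List[float]:
    """Orthonormal DCT-II of a 1D signal."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(s * math.cos(math.pi * (i + 0.5) * k / n)
                  for i, s in enumerate(samples))
        coeffs.append(scale * acc)
    return coeffs


def idct_1d(coeffs: List[float]) -> List[float]:
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        acc = math.sqrt(1.0 / n) * coeffs[0]
        acc += sum(math.sqrt(2.0 / n) * coeffs[k]
                   * math.cos(math.pi * (i + 0.5) * k / n)
                   for k in range(1, n))
        samples.append(acc)
    return samples


def quantize(coeffs: List[float], step: float) -> List[int]:
    """Uniform quantization: the lossy step, small coefficients become 0."""
    return [round(c / step) for c in coeffs]


def dequantize(levels: List[int], step: float) -> List[float]:
    """Map integer levels back to approximate coefficient values."""
    return [q * step for q in levels]


# Hypothetical row of 8 pixel values, chosen only for illustration.
row = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
levels = quantize(dct_1d(row), step=10.0)
restored = idct_1d(dequantize(levels, step=10.0))
print(levels)                           # integer levels handed to the entropy coder
print([round(v, 1) for v in restored])  # approximation of the original row
```

A larger step discards more detail and yields more zero levels; adaptive quantization, as listed above, varies that step across the image rather than using one global value.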

Features allowing fast decoding at over 1 GB/s multi-threaded:

  • Parallel processing of large images
  • SIMD/GPU-friendly, benefits from SSE4 or AVX2
  • Cache-friendly layout
  • Fast and effective entropy coding: context modeling with clustering, rANS
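
For readers unfamiliar with rANS (range Asymmetric Numeral Systems), the core idea fits in a few lines. The sketch below is a toy version, assuming a static frequency table and Python's arbitrary-precision integers in place of the renormalization and byte output a real codec needs. It does not reflect Pik's context modeling or clustering, and all names in it are hypothetical.

```python
from typing import Dict, Tuple


def build_tables(freqs: Dict[str, int]) -> Tuple[Dict[str, int], int]:
    """Return each symbol's cumulative start and the total frequency."""
    cum, total = {}, 0
    for sym in sorted(freqs):
        cum[sym] = total
        total += freqs[sym]
    return cum, total


def rans_encode(message: str, freqs: Dict[str, int]) -> int:
    """Fold the whole message into one integer state."""
    cum, total = build_tables(freqs)
    state = 1
    # Encode in reverse so the decoder emits symbols in forward order.
    for sym in reversed(message):
        f = freqs[sym]
        state = (state // f) * total + cum[sym] + (state % f)
    return state


def rans_decode(state: int, length: int, freqs: Dict[str, int]) -> str:
    """Recover `length` symbols from the integer state."""
    cum, total = build_tables(freqs)
    out = []
    for _ in range(length):
        slot = state % total
        # Find the symbol whose frequency range contains the slot.
        sym = next(s for s in freqs if cum[s] <= slot < cum[s] + freqs[s])
        state = freqs[sym] * (state // total) + slot - cum[sym]
        out.append(sym)
    return "".join(out)


message = "abracadabra"
freqs = {"a": 5, "b": 2, "r": 2, "c": 1, "d": 1}  # symbol counts of the message
state = rans_encode(message, freqs)
print(state.bit_length(), "bits vs", 8 * len(message), "bits uncompressed")
print(rans_decode(state, len(message), freqs))
```

Frequent symbols grow the state by fewer bits than rare ones, which is where the compression comes from. Decoding needs only a modulo, a table lookup, and a multiply per symbol.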

Google Pik is royalty-free, and is said to achieve perceptually lossless encodings at about 40% of the JPEG bitrate, and store fully lossless at about 75% of 8-bit PNG size, or 60% of 16-bit PNG size.
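
As a quick worked example of those ratios, the snippet below applies them to made-up file sizes. The 1000 kB, 4000 kB, and 8000 kB inputs are hypothetical, and the ratios are the claimed figures quoted above rather than measurements.

```python
# Claimed size ratios from the paragraph above (not measured here).
PIK_VS_JPEG = 0.40    # perceptually lossless Pik vs. JPEG
PIK_VS_PNG8 = 0.75    # lossless Pik vs. 8-bit PNG
PIK_VS_PNG16 = 0.60   # lossless Pik vs. 16-bit PNG


def estimated_pik_size(original_kb: float, ratio: float) -> float:
    """Return the estimated Pik size in kB for a given original size."""
    return original_kb * ratio


# Hypothetical input sizes, chosen only for illustration.
jpeg_kb, png8_kb, png16_kb = 1000.0, 4000.0, 8000.0

print(round(estimated_pik_size(jpeg_kb, PIK_VS_JPEG)))    # roughly 400 kB
print(round(estimated_pik_size(png8_kb, PIK_VS_PNG8)))    # roughly 3000 kB
print(round(estimated_pik_size(png16_kb, PIK_VS_PNG16)))  # roughly 4800 kB
```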

The SIMD readme explains that Pik leverages SSE4, AVX2, and ARMv8 instructions to improve performance.

You can give it a try by checking out the source code on GitHub. It should be noted that it’s pretty hard to change standards on the Internet, as shown by the WebP, HEIF, and FLIF projects, all of which are said to be technically superior to JPEG.

Via WorksonArm

15 Comments
tkaiser
5 years ago

> Google Pil is royalty-free

Typo.

And for those looking for a way to get smaller file sizes for images on the internet without (too much) visual harm, look at https://tinypng.com: the methods they use to shrink PNG and JPEG images are explained there (for JPEG at tinyjpg.com).

jqpabc123
5 years ago

Re: tinypng and tinyjpg

Their methods are very rudimentary yet still very slooow.

PNG – They appear to be using color reduction, combining similar colors.

JPG – They appear to be simply removing metadata that a typical camera/phone places in image files. Aside from this, they utilize a very flowery description of what is already inherent in the JPEG algorithm.

At the same time, they lack the ability to re-scale images which is crucial for images on the internet. An image file from your phone has millions of pixels, which makes sense if you intend to print a…

tkaiser
5 years ago

They remove metadata (which can be quite huge with images dropping out of Photoshop for example) but with PNG they do not ‘reduce’ color but switch to an optimized indexed color space. Wrt JPEG it’s explained what they do (JPEG allows for different compression schemes in different ‘parts’ of the image). As for imageoptimizer.net I would believe the only benefit here is stripping (some) metadata and downscaling and reapplying different compression levels. Also it seems it doesn’t support PNG transparency (back to IE6 ‘capabilities’? Seriously?!) At least for screenshots I really don’t want downscaling and a quick test reveals that…

blu
5 years ago

Also, there’s this nice SIMD ISA matrix in the simd section of the project that I find quite useful: https://github.com/google/pik/blob/master/pik/simd/instruction_matrix.pdf

-.-
5 years ago

Is that for their wrapper implementation though? For example, you can definitely do pairwise addition between 8-bit values with one instruction on x86 (pmaddubsw), but the table indicates that it takes 9. Another example is pairwise subtraction on ARM – it definitely doesn’t take 9 instructions to do.
I do see a lot of “9”s on the page – maybe that just means that it’s not implemented in SIMD at all?

It looks better if you ignore the “9”s, but still seems questionable, e.g. integer negation on x86 (8/16/32-bit) is one instruction (psign*), but listed as two on the table.

blu
5 years ago

Yes, it appears to be for their wrapper. And yes, it’s a fuzzy table indeed, where ‘9’ appears to indicate ‘a few’. I haven’t gone over all cells, but those that I’ve glanced at seemed ok. BTW, pmaddubsw does 16-bit saturation of the pairwise results, so it’s not exactly one instruction that does the 8-bit pairwise job, like addp on asimd2 does. Re int negation in ssse3 — it appears they count the mask load as a separate op.

-.-
5 years ago

> BTW, pmaddubsw does 16-bit saturation of the pairwise results

8-bit addition cannot exceed the range of a 16-bit signed int, so saturation doesn’t apply here. The semantics are different to ARM’s addp in that it widens the result – it’s more like *addlp, though I’m sure emulating addp wouldn’t take 9 instructions.

> Re int negation in ssse3: it appears they count the mask load as a separate op.

I suppose that’s a fair point, although it rarely applies in practice (since you can hold a register of all 1’s and just re-use that). Though by that definition,…

blu
5 years ago

> 8-bit addition cannot exceed the range of a 16-bit signed int, so saturation doesn’t apply here. The semantics are different to ARM’s addp in that it widens the result – it’s more like *addlp, though I’m sure emulating addp wouldn’t take 9 instructions.

Saturation doesn’t apply, but the widening does, so it’s not a single op, was my point. To get the effect of asimd2’s addp you’d need something along:

    pmaddubsw v0, v_ones
    pmaddubsw v1, v_ones
    pand v0, v_low_bytes_mask
    pand v1, v_low_bytes_mask
    packuswb v0, v1

which even without the multiplier/mask loading is still a far cry from a single…
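
To make the semantics under discussion concrete, here is a small Python model of that five-instruction sequence next to a model of an 8-bit pairwise add. It is only a sketch: vectors are plain lists of lane values, the helper names are hypothetical, and `pmaddubsw` is modelled only for the case where the second operand is all ones.

```python
from typing import List


def pmaddubsw_ones(v: List[int]) -> List[int]:
    """pmaddubsw against a vector of 1s: adjacent unsigned bytes are
    summed into 16-bit lanes (max 510, so saturation never triggers)."""
    return [v[i] + v[i + 1] for i in range(0, len(v), 2)]


def pand(v: List[int], mask: int) -> List[int]:
    """Bitwise AND of every lane with the same mask value."""
    return [x & mask for x in v]


def packuswb(a: List[int], b: List[int]) -> List[int]:
    """Pack two vectors of 16-bit lanes into unsigned bytes, saturating."""
    return [min(max(x, 0), 255) for x in a + b]


def addp_bytes(a: List[int], b: List[int]) -> List[int]:
    """Model of an 8-bit pairwise add: concatenate both inputs, add
    adjacent pairs, and wrap each sum to 8 bits."""
    joined = a + b
    return [(joined[i] + joined[i + 1]) & 0xFF
            for i in range(0, len(joined), 2)]


def emulated_addp(v0: List[int], v1: List[int]) -> List[int]:
    """The five-instruction sequence from the comment above."""
    lo = pand(pmaddubsw_ones(v0), 0x00FF)  # keep the low byte of each sum
    hi = pand(pmaddubsw_ones(v1), 0x00FF)
    return packuswb(lo, hi)                # values <= 255, so no saturation


v0 = list(range(16))                       # small values, sums fit in a byte
v1 = [250 + (i % 6) for i in range(16)]    # large values, sums overflow a byte
print(pmaddubsw_ones(v1))                  # widened 16-bit sums
print(emulated_addp(v0, v1) == addp_bytes(v0, v1))
```

The widened sums show why `pmaddubsw` alone is closer to a widening pairwise add, while the masked and packed result matches the wrapping 8-bit behaviour.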

-.-
5 years ago

> Add to the fact how the table seems to count const loads as extra ops, and that operation goes beyond ‘5’ and into the ‘9’ category.

I count 7 from your implementation, so I was right about it definitely not being 9.

> For ints the natural LE/GE solution is exactly two ops: LE == GT + not

According to the table, GT is 1 op, NOT is 2 ops, so GT+NOT = 3 ops, not 2. You do bring up a good point with MAX+EQ though – I did not consider that. Can you think of a 2…

blu
5 years ago

> I count 7 from your implementation, so I was right about it definitely not being 9.

Many of the ops there are not 9 ops in either x86 or arm. I believe I’ve already explained my interpretation of the table, with which you disagree. So we agree to disagree. I’ll just comment on a couple of things below.

> According to the table, GT is 1 op, NOT is 2 ops, so GT+NOT = 3 ops, not 2.

You’re right: with the const load that would indeed amount to 3 (as in the not case amounting to 2,…

-.-
5 years ago

> You _really_ don’t want to mix sse and avx ops on an amd64

It was mostly to save typing a third parameter, since you get the idea anyway and the compiler would never mix the two. Though I really see no problem with it since there’s no transition penalty between 128b instructions, and SSE encodings are often shorter than VEX encodings.

> avx would save the move

I derped when I thought of the implementation, but the move isn’t necessary for SSE:

    pxor xmm0, xmm0
    pcmpgtq xmm0, v
    paddq v, xmm0
    pxor v, xmm0

So 4 ops (assuming you…
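
The four instructions above look like the usual branch-free absolute value on 64-bit lanes: build a mask that is all ones where the input is negative, add it, then XOR with it. That reading is an inference, since the surrounding comments are truncated. The Python model below handles a single 64-bit lane under that assumption, and its function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def to_signed64(bits: int) -> int:
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    return bits - (1 << 64) if bits & (1 << 63) else bits


def abs_via_mask(value: int) -> int:
    """Model of pxor / pcmpgtq / paddq / pxor on one 64-bit lane."""
    lane = value & MASK64                          # two's-complement bit pattern
    # pxor xmm0, xmm0 ; pcmpgtq xmm0, v -> all ones where 0 > v, else zero
    mask = MASK64 if to_signed64(lane) < 0 else 0
    lane = (lane + mask) & MASK64                  # paddq v, xmm0 (wraps at 64 bits)
    lane ^= mask                                   # pxor v, xmm0
    return to_signed64(lane)


for sample in (-5, 7, 0, -(1 << 63)):
    print(sample, "->", abs_via_mask(sample))
```

For a negative input the mask equals -1, so the add subtracts one and the XOR flips all bits, which together negate the value. As with any two's-complement absolute value, the most negative 64-bit integer maps to itself.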

blu
5 years ago

My narrative? I’ve explained my ‘narrative’ about three posts ago. Judging by your last response, you’re yet to catch up with that.

> I derped when I thought of the implementation, but the move isn’t necessary for SSE:

Congratulations on shaving off the move from the sane implementation. Perhaps the authors have derped as well in some of their mappings? It’s funny when we hold others to high standards, when we can just derp and come around on a high horse later.

> I claim that 1+2=2. I make no claims in regards to the precision of the answer.

Do…

-.-
5 years ago

> I see the table bugs you a lot

It actually doesn’t. Your claim that it’s useful is what bugs me a lot, even when a number of mistakes were pointed out. Or rather, the endorsement promoting misleading information being disseminated is what bugs me a lot. Wrong information is not useful in my book. “1+2=2” is wrong, and hence not a useful statement in my book. Clearly you think otherwise, which is fine I suppose, but methinks that you may be alone with that view. (note that *accurate* statements don’t have to be precise: “1+2 is between 2 and…

dave
5 years ago

PIK is merely a research project. PIK, avif and jpeg have joined together to make the Jpeg XL image format which will be ready later this year.