09/30/2010 (7:48 pm)

JPEG is a very old lossy image format. By today’s standards, it’s awful compression-wise: practically every video format since the days of MPEG-2 has been able to tie or beat JPEG at its own game. The reasons people haven’t switched to something more modern practically always boil down to a simple one — it’s just not worth the hassle. Even if JPEG can be beaten by a factor of 2, convincing the entire world to change image formats after 20 years is nigh impossible. Furthermore, JPEG is fast, simple, and practically guaranteed to be free of any intellectual property worries. It’s been tried before: JPEG-2000 first, then Microsoft’s JPEG XR, both tried to unseat JPEG. Neither got much of anywhere.

Now Google is trying to dump yet another image format on us, “WebP”. But really, it’s just a VP8 intra frame. There are some obvious practical problems with this new image format in comparison to JPEG; it doesn’t even support all of JPEG’s features, let alone many of the much-wanted features JPEG was missing (alpha channel support, lossless support). It only supports 4:2:0 chroma subsampling, while JPEG can handle 4:2:2 and 4:4:4. Google doesn’t seem interested in adding any of these features either.

But let’s get to the meat and see how these encoders stack up on compressing still images. As I explained in my original analysis, VP8 has the advantage of H.264’s intra prediction, which is one of the primary reasons why H.264 has such an advantage in intra compression. It only has i4x4 and i16x16 modes, not i8x8, so it’s not quite as fancy as H.264’s, but it comes close.

The test files are all around 155KB; download them for the exact filesizes. For all three, I did a binary search of quality levels to get the file sizes close. For x264, I encoded with --tune stillimage --preset placebo. For libvpx, I encoded with --best. For JPEG, I encoded with ffmpeg, then applied jpgcrush, a lossless jpeg compressor. I suspect there are better JPEG encoders out there than ffmpeg; if you have one, feel free to test it and post the results. The source image is the 200th frame of Parkjoy, from derf’s page (fun fact: this video was shot here! More info on the video here.).
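The binary search mentioned above can be sketched in a few lines; this is a generic illustration (the `encoded_size` callback is a hypothetical stand-in for actually invoking an encoder and measuring the output file), assuming only that file size grows monotonically with the quality parameter:

```python
def find_quality(encoded_size, target_bytes, lo=1, hi=100, tol=1024):
    """Binary-search a quality parameter so that encoded_size(q)
    lands within tol bytes of target_bytes.  Assumes size grows
    monotonically with quality."""
    best_q = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        size = encoded_size(mid)
        if abs(size - target_bytes) <= tol:
            return mid
        if size < target_bytes:
            best_q = mid        # under target: remember it, try higher quality
            lo = mid + 1
        else:
            hi = mid - 1
    return best_q

# Toy stand-in: pretend output size is proportional to quality.
q = find_quality(lambda q: q * 2000, target_bytes=155_000, tol=1000)
```

In practice each probe means a full encode, so the search is worth terminating as soon as the size is "close enough" rather than exact.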

This seems rather embarrassing for libvpx. Personally I think VP8 looks by far the worst of the bunch, despite JPEG’s blocking. What’s going on here? VP8 certainly has better entropy coding than JPEG does (by far!). It has better intra prediction (JPEG has just DC prediction). How could VP8 look worse? Let’s investigate.

VP8 uses a 4×4 transform, which tends to blur and lose more detail than JPEG’s 8×8 transform. But that alone certainly isn’t enough to create such a dramatic difference. Let’s investigate a hypothesis — that the problem is that libvpx is optimizing for PSNR and ignoring psychovisual considerations when encoding the image… I’ll encode with --tune psnr --preset placebo in x264, turning off all psy optimizations.

Files: (x264, optimized for PSNR [154KB]) [Note for the technical people: because adaptive quantization is off, to get the filesize on target I had to use a CQM here.]

What a blur! Only somewhat better than VP8, and still worse than JPEG. And that’s using the same encoder and the same level of analysis — the only thing done differently is dropping the psy optimizations. Thus we come back to the conclusion I’ve made over and over on this blog — the encoder matters more than the video format, and good psy optimizations are more important than anything else for compression. libvpx, a much more powerful encoder than ffmpeg’s jpeg encoder, loses because it tries too hard to optimize for PSNR.

These results raise an obvious question — is Google nuts? I could understand the push for “WebP” if it was better than JPEG. And sure, technically as a file format it is, and an encoder could be made for it that’s better than JPEG. But note the word “could”. Why announce it now when libvpx is still such an awful encoder? You’d have to be nuts to try to replace JPEG with this blurry mess as-is. Now, I don’t expect libvpx to be able to compete with x264, the best encoder in the world — but surely it should be able to beat an image format released in 1992?

Earth to Google: make the encoder good first, then promote it as better than the alternatives. The reverse doesn’t work quite as well.

Addendum (added Oct. 2, 03:51):

maikmerten gave me a Theora-encoded image to compare as well. Here’s the PNG and the source (155KB). And yes, that’s Theora 1.2 (Ptalarbvorm) beating VP8 handily. Now that is embarrassing. Guess what the main new feature of Ptalarbvorm is? Psy optimizations…

Addendum (added Apr. 20, 23:33):

There’s a new webp encoder out, written from scratch by skal (available in libwebp). It’s significantly better than libvpx — not like that says much — but it should probably beat JPEG much more readily now. The encoder design is rather unique — it basically uses K-means for a large part of the encoding process. It still loses to x264, but that was expected.
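The post only says that skal’s encoder uses K-means "for a large part of the encoding process" without saying where; as a generic illustration of the algorithm itself (not of libwebp’s actual code), a minimal 1-D K-means looks like this:

```python
def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's algorithm on scalar values: assign each value to
    its nearest centroid, then move each centroid to the mean of its
    cluster.  Returns the final centroids, sorted."""
    centroids = sorted(values[:k])  # naive init: first k values
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Empty clusters keep their old centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two well-separated groups collapse to two centroids:
cents = kmeans_1d([1, 2, 3, 101, 102, 103], k=2)
```

In an encoder this kind of clustering is useful wherever a small set of representatives has to stand in for many observed values (palettes, segment levels, and the like).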

Personally, I think that JPEG XR is a far better alternative since it has a good lossless format and a more sophisticated prediction scheme (compared to JPEG). But Microsoft just didn’t promote it hard enough…

> it doesn’t even support all of JPEG’s features, let alone many of the much-wanted features JPEG was missing (alpha channel support, lossless support). It only supports 4:2:0 chroma subsampling, while JPEG can handle 4:2:2 and 4:4:4. Google doesn’t seem interested in adding any of these features either.

Wrong:

> We plan to add support for a transparency layer, also known as alpha channel in a future update.

Sounds better than expected then, but don’t count your chickens before they hatch. They said they’d make libvpx good too, and look where that’s gone — there hasn’t been much of any work on the encoder in the past month.

I don’t know much about JPEG-XR, as was demonstrated today on #theora when I didn’t even realize that it used a lapped transform. In terms of psy, the only bitstream feature that really matters enormously is whether or not it supports adaptive quantization.

If you can do a test with a JPEG-XR encoder, I’d be happy to post it, but keep in mind that I’ve heard the official encoder is not very good…

That’s something that’s been tried for a while… to begin with there were MPNG and APNG (attempts to get an animated GIF with 24-bit color support). But none of these formats even has motion compensation… not even basic fullpel.

“WebP”, aka VP8, is definitely much more complex. Most importantly, it uses arithmetic coding and a complex deblocking filter, both of which are not present in JPEG. I would guess it’s at least 3 times as intensive to decode.

1. I think it is a terrible idea to test still image compression with a still taken from a video shot with a video recorder, not an actual camera. Since even the source is an inferior image, none of the coders can really shine if they have to work from a crappy source. You can grab a whole lot of amazing pictures under a Creative Commons license.
2. You keep repeating that it is not the format that matters for perceived quality but the psy optimizations. Does this mean that, e.g., your psy-optimization work for x264 could be “ported” to produce only those frame types and features that are also present in VP8, and then we would have much better-looking VP8 encodes? (I also see no point in actually doing it, since I suppose it would still be a bit inferior to x264.)

Imho an h.264-based still image format should have been standardised years ago already… not necessarily just for web use; as said, jpeg has been here for 20 years. I would even be fine with Apple pushing the format (where is that Jobs bloke when one needs him, heh).

Yes, some of x264’s psy optimizations can in theory be ported to work on VP8. Adaptive quantization is the iffy one, as VP8 doesn’t have delta quantizers. It only has “segments”, which cost roughly 2 bits per macroblock to signal, and you can only have 4 of them. Ptalarbvorm has demonstrated that you can get a pretty good portion of the benefit without the precision of H.264’s quantizers (i.e. with only a few quants to pick from), but I’m not so sure about the cost of the segments. 2 bits per macroblock probably isn’t too bad for images though; it’s likely much worse for actual videos.
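To put the 2-bits-per-macroblock figure in perspective, here is a quick back-of-the-envelope calculation (my own arithmetic, assuming a 1080p image and a file size in the ~155KB range used in the tests above):

```python
# VP8 macroblocks are 16x16; a 1920x1080 frame is 120 x 68 macroblocks
# (1080/16 = 67.5, rounded up to a full macroblock row).
mb_w = (1920 + 15) // 16          # 120
mb_h = (1080 + 15) // 16          # 68
macroblocks = mb_w * mb_h         # 8160

overhead_bytes = macroblocks * 2 / 8   # 2 bits per MB -> 2040 bytes

# Against a ~155 KB still, that is on the order of 1.3% of the file:
# tolerable for images, but the same per-frame cost recurs in video.
share = overhead_bytes / (155 * 1024)
```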

Well, another good overview of a new format! (I would assume, I don’t really know much about video/image formats/encoding myself)

However, I believe that you are wrong in one point:

“Earth to Google: make the encoder good first, then promote it as better than the alternatives. The reverse doesn’t work quite as well.”

The Open Source/Free Software way is to “Release early, release often”. Linus Torvalds released the Linux kernel when it could barely do anything, and look how well it did! I’m not saying that this is going to work for Google (it doesn’t always work out), but it is the way it is supposed to be for Open Source software. Only proprietary companies worry about the software looking good when it is first released. Google (I would assume) is counting on the support of many programmers (not unlike yourself) to improve this software. It’s the Open Source/Free Software way.

P.S. I just realized that I assumed (without any direct evidence that I can find) that they released/are releasing the source code for their image converter and whatever other tools are required. If they aren’t, then the comment above is basically invalid. Releasing early and often works best for open source, not nearly as well for proprietary code. Either way, though, they do specifically say it is a developer release… so it isn’t supposed to be ready for production use yet.

How does this compare to libjpeg’s (from v7 on) arithmetic encoder? Using “jpegtran -arithmetic” usually gives a 10–20% size reduction. I don’t know if the IJG jpeg code has implemented other parts of T.851.

If Google wanted alpha channel support they could have implemented the JNG subset of MNG.

An interesting question is who WebP is aimed at: is it purely for semi-pro photographers who know lots of technical stuff already, or is it aimed at anyone who’s putting their snaps online? If it’s aimed at everyone, then the other thing that would be interesting to investigate is what happens if you take an image that’s already been compressed on output from the camera (e.g., low-end consumer cameras that don’t offer RAW output, only jpeg), transcode it, and see what happens in terms of transcoding artifacts. With my image analysis hat on I’m shuddering at the thought, but the human eye is sensitive to different things. So they might be negligible, but they can sometimes be visible even when looking at the images at full magnification. In particular, one would expect that a representation that is closer to JPEG would do marginally better (since in general the re-encoder doesn’t know which parts of the reconstructed-from-jpeg image are artifacts and which are from the original image).

Hmm, why would you encode still images that way? If I take the source image and save it with Photoshop as jpg @ 20 percent (312kB), and save it again as png (3.21MB), I get far better results than your x264 example.

Can’t save it as WebP that way yet (obviously) but if your standard jpg example is already so far off, I doubt that your WebP example is accurate either.

Well, from simply looking at those pictures, it just proves why we don’t need to replace JPEG. At least when it is scaled down (zooming in shows it loses a lot of information).

But 90% of the world doesn’t care. At least on the Web. If we need high quality we use high-bitrate JPEG or PNG to solve the problem. Casual web surfers don’t want the hassle of software incompatibilities just to save 20% of the image file size. If we could get the same image quality at only 20% of the file size I suspect the world would pick it up, but that is not even possible with the best encoder we have: x264.

The only image format that I think could succeed jpeg is an H.264-based image format where the decoding complexity is offloaded to a widely available h.264 hardware decoder. Not to mention a small update to Flash Player would immediately see 90%+ of the internet being able to view h.264 images. But of course, in an imperfect world that is not going to happen due to stupid human politics…

webp looks basically like jpeg with a deblocking filter applied. It is useless, as a deblocking filter can be easily added to jpeg without any compatibility problems.

x264 is much more detailed.

I think adding mandatory deblocking to jpeg, growing blocks from 8×8 to 32×32 (yes, not 16×16, as 32×32 isn’t a performance problem for still images IMHO), and improving “psy” encoding (by, for example, using variable quality for each block: detecting foreground, background, etc.) is a better idea. Much more compatible, easier to implement a decoder for (hey, it is just a tweaked jpeg), and probably better than both jpeg and webp.

For me webp is a no-go, as it does not have HDR support or any alpha channel support. Size improvements don’t matter much to me (though I understand Google cares), but I do not want to store both webp and jpeg for compatibility and dynamically serve one or the other depending on which one the browser supports.

jpeg2000 would actually be much better, as it does not have many of jpeg’s artifacts, and it has an alpha channel and a lossless mode.

I notice that the sample images here are all compressed WAY more than anyone would ever compress them for actual use. Go look at the pictures on flickr, and you’re going to have a REALLY hard time finding images that have their JPEG quality settings set low enough to produce an image that looks that bad. Is it possible that VP8/WebP does better (particularly in comparison to JPEG) when using higher quality settings?

Highly doubtful; there’s nothing about VP8 that would make it better at high rates — in fact, by design, it should be better at low rates (as should H.264). The reason to compare at low rates is simply that it’s easier to see a difference — people love to compare at high rates and say “there’s not much difference” even when the difference is large enough that it represents a factor of 50% in bitrate.

I used an ffmpeg set up explicitly for the purpose of avoiding colorspace range conversion, since that plagues so many comparisons. That is, I kept the colorspace in TV-range YV12 for both encoding and decoding, without conversion, because of the mess that causes.

Of course PSNR is important. If you optimize for PSNR, your image will look worse at the same PSNR than with an encoder that optimizes for psy. x264 will often look better at 40db (optimizing for psy) than at 43db (optimizing for psnr).
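For reference, the dB figures above are just log-scaled mean squared error; a minimal PSNR for 8-bit samples (a sketch, operating on flat sample lists rather than real image planes):

```python
import math

def psnr(ref, test):
    """PSNR in dB between two equal-length sequences of 8-bit samples."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return float('inf')  # identical signals
    return 10 * math.log10(255 ** 2 / mse)

# An RMS error of 2.55 (1% of the 255 sample range) comes out to 40 dB:
val = psnr([100] * 1000, [100 + 2.55] * 1000)
```

Note how coarse the scale is: the 40 dB vs 43 dB gap mentioned above corresponds to halving the MSE, yet says nothing about where in the image the error lives, which is exactly what psy optimization exploits.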

I’ve seen that one before; iirc, last I tested it, it was competitive with x264 and even beat it significantly at times. Given that it was adapted for insanely high compression times, of course, this is not very surprising. Still rather impressive.

Microsoft should subsidize all of the digital camera manufacturers to support JPEG-XR in order to kick-start the use of that format. It is superior by far to WebP, is an open standard, and has a good chance of adoption (particularly in new cameras that will have high dynamic range sensors) if promoted properly.

Mmm… I think the one feature missing in jpg is the alpha channel. PNG is for me the best format, but if the image is big, png is a large file vs jpg. I think the next good step is to improve the png format.

Another problem with Google: as always, Google is a business, always thinking about business. So if they are the ones licensing vp8, and Google MAKES standards (I don’t like that), everyone is going to suffer license problems sooner or later…

Is quality for images produced with libvpx the same as those produced using webpconv? From the way the WebP site is written it sounds like they’re only using libvpx for decoding. While of course you’re still using VP8 I-frames, it may be possible that webpconv is tuned better for quality rather than PSNR.

I am not convinced that introducing a “new” image format makes sense: When I’m browsing the web on a standard broadband connection, images load instantly anyways, and Moore’s law holds not only for the bandwidth of Internet connections, but also for hard drive capacities.

You can tweak all the values in the quant table up/down by a small amount to tweak quality by small amounts. That’s what I did with x264 for the PSNR test, since I couldn’t get the size to quite match.
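That kind of tweak (nudging every quant-table entry up or down by a small amount to fine-tune the output size) might look like this; a generic sketch, not the exact procedure used for the test:

```python
def scale_quant_table(table, factor):
    """Multiply every quantizer step by `factor`, clamping to the
    1..255 range that baseline JPEG's 8-bit tables allow.
    factor > 1 coarsens quantization (smaller file),
    factor < 1 refines it (larger file)."""
    return [min(255, max(1, round(q * factor))) for q in table]

base = [16, 11, 10, 16, 24, 40, 51, 61]  # first row of the example
                                         # luminance table in the JPEG spec
coarser = scale_quant_table(base, 1.1)
finer = scale_quant_table(base, 0.9)
```

Because the step change is fractional, this gives much finer file-size control than stepping a single integer quality knob.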

Why deal with theories? HM has implemented its tiled and multi-dimensional interlacing algorithms, using high-profile H.264 for stills coding. It is a 4:2:2 scheme, utilizing a standard H.264 encoder under mp4, allowing import/export of EXIF, which can be integrated with hardware cellular codecs (a pilot with a leading handset vendor is on the way). Go to http://www.hipixpro.com and you can freely download and experiment with great Windows tools and Android. Viewers will always be free for the PC, and developers are welcome to make their own PC apps. The codec has a huge advantage over WebP as it utilizes inter-intra tools and not just intra. It has to do with out-of-the-box technology, which leaves things within the standard. Try it yourself.

Sorry I didn’t explain. cjpeg is a command line utility that comes with the IJG “reference” libjpeg library.

Note that the usage of non-standard quantization tables (that take human perception into account) is the reason that these jpegs look so nice but it is also those quantization tables that likely cause some decoders to fail with these jpegs.

Here’s the resulting image from the Hipix software referenced in response 68: http://img121.imageshack.us/img121/9370/hipix.png
I used 165 for “Set Target Size” for an output size of 153KB.
IMO it’s nowhere near the quality of x264 on this particular image; there’s significant loss of detail, and noticeable artificial lines added into the image.

For some reason I might prefer Theora over x264. Sure, x264 has more detail, but sometimes it looks too peaky compared to its blurred neighbours. Theora is blurry overall, but has fewer jumps in texture retention and feels more natural.

Maybe there’s also a difference between psy for movies and psy for stills?

And yes, for vp8 that is embarrassing anyway. It fails. Even blocky jpeg is better than blurred lumps…

Lossless-wise, I never understood why JPEG-LS got no attention. Compression is about as good as JPEG2K lossless (possibly a bit better), but it’s so much faster both to encode and decode, also compared with PNG.

Dearest Han, I have no idea what you did, and I don’t have your source images. We tried all the WebP images and soon we’ll upload the gallery. Our gain at immaculate quality, compared to the source JPEGs, is huge. Where WebP saves 10%, we save 40% and more. Try it yourself if you like. Everybody is invited to do that: hipixpro.com

Here’s the resulting image from the DLI software referenced in response 66: http://img814.imageshack.us/img814/7184/dli.png
I used the default settings with “-q 20” for an output size of 153 KB.
The output quality was shocking. I’ve tested numerous codecs that claim better compression than h264/x264, but this is the first time I’ve seen it. The image has similar detail to the x264 image but has significantly fewer artifacts (blocking, distortion around edges, pink chroma).

I used the source image referenced in the article/blog. You’re free to download it and test to see if I somehow compressed the image incorrectly. I’ll get around to trying your software with more images for a better evaluation.

Guys, the use case is clear, and the technology can be used with any H.264 codec (including x264). Go to the hipixpro.com site; there’s a lot of stuff to read and experiment with. The main advantages: no limitation on resolution, using existing HW/SW codecs (could even be WebP) for inter-intra stills coding. This is going into cellular with the existing HS… Try using the presets for simpler operation, and see the results for yourselves.

1. First there was data compression (zip, rar), and everybody compressed their 20–30 KB files with it.
Now, when we use email, we barely care about compression if the attachment is smaller than, say, 1–2 MB. Just try to send something like a .7z.

2. The same is applicable to lossy compression of images. JPEG is just universal like ZIP.

3. MP3 compression is as old as JPEG, and AAC is from the same decade (plus 3–4 years). HE-AAC doesn’t count, as it was a subset for low bitrates and not an entire replacement for AAC. And MP3 is vastly more popular than AAC today.

Not sure how well this applies with relation to colorspace conversions and whatnot, but I just took the source image and simply saved it out with IrfanView as a JPG (no fancy reworking of quant tables), setting the quality down to 18 to reach a 152 KB file size, and it looks substantially better than the jpg DS put up. I’d rate it as slightly better than the theora pic, but a bit worse than the x264 pic (x264 is substantially better in the water reflections, but mostly similar elsewhere). So the bar that jpg sets seems to be a fair bit higher than implied in the article.

On the other hand, the hipixpro image really does look quite impressive, mainly in not introducing a lot of the noise that you get in the x264 image while still retaining that high image quality. If they could add transparency to it (not sure if that’s easy to do, given that it’s based on h.264) I’d be quite happy to switch to it… aside from the fact that Firefox will never support it because it’s still tied to h.264…

Your site, as good as your article seems to be, is unreadable. My eyes hurt after a few seconds. Do yourself a favour and switch your colors to an eye-friendly scheme as described in a thousand books about webdesign for the bloody beginner! Have you ever tried to read your site yourself?

I use the same colorscheme for my site that I use for all my applications on my own computer (at least ones where I can change the color scheme easily).

I personally do not believe that staring into the sun (i.e. dark text on a light background) is suitable for anything any sane person would want to read. Maybe you are of some other species which is adapted to staring into extremely bright lights for long periods of time. I am not, and neither are most of my readers. Well, at least I assume most of my readers are human — I can’t be sure of that!

Above all, I refuse to use a colorscheme that I cannot read comfortably for more than a couple minutes. Black text on a white background falls squarely into that category, giving me headaches quite rapidly. I would rather not have to pop ibuprofen every few hours (and ruin my vision!) just because some random commenter on the internet tells me that I should stare into a lamp all day.

Now, if you are reading this on a device without a backlight (e.g. a Kindle), then it might be useful to have an alternate CSS. But I don’t think anyone is reading this on a Kindle.

If you do in fact happen to be of some alien species that likes staring into bright lights, try this script.

@Alan:
> Note that the usage of non-standard quantization
> tables is the reason that these jpegs look so nice
> but it is also those quantization tables that likely
> cause some decoders to fail with these jpegs.

I strongly doubt that. In JPEG, there’s no scalar quantizer (like the QP in H.264), so the only way to signal the amount of quantization to the decoder is specifying complete quantization tables in the header. A decoder that assumes fixed quantization tables would fail for 99.9% of all files.
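For the curious, the IJG reference code derives those complete tables from a single quality knob by scaling a base table; a sketch following the well-known libjpeg `jpeg_set_quality()` formula (the base row here is from the example luminance table in the JPEG spec):

```python
def ijg_scale_factor(quality):
    """libjpeg's quality (1..100) -> percentage scaling of the base table."""
    quality = max(1, min(100, quality))
    return 5000 // quality if quality < 50 else 200 - quality * 2

def quantize_table(base, quality):
    """Scale a base quantization table the way libjpeg does,
    clamping entries to the baseline-legal 1..255 range."""
    s = ijg_scale_factor(quality)
    return [min(255, max(1, (q * s + 50) // 100)) for q in base]

base_row = [16, 11, 10, 16, 24, 40, 51, 61]  # JPEG Annex K luminance, row 0
row_q50 = quantize_table(base_row, 50)   # scale 100%: table unchanged
row_q10 = quantize_table(base_row, 10)   # scale 500%: much coarser
```

Whatever tables come out of this (or out of any non-standard tuning), they are written into the DQT header segment, which is why a conforming decoder cannot depend on "standard" tables in the first place.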

It seems that Han and some of you guys don’t really understand that it is still pictures and not HDTV frames we are discussing. First – for low res (circa 1MP) x264 and hipix are more or less the same. However there is a huge gap between the efficiency of pure intra to the inter-intra coding we use. The reason? Hipix was designed for cellular handsets and cameras, supporting 5,8,12,16,21 and also 40 and 50MP. the higher the resolution – the more effective it is.

Dark Shikari: inter-intra means IP and IBBP GOPs for high-res images. Read more at hipixpro.com; we have a description of the technology. Try it with 8/12/16/21MP images and you’ll find how effective it is compared to simple intra coding.

Wouldn’t it be possible to create a JPEG encoder targeting SSIM over PSNR, possibly using multiple quant tables to allow some sort of psy optimizations (I’m not sure if JPEG allows switching quant tables in the middle of a scan… I’ll check)?
Moreover, if the (optional) arithmetic coding feature was nowadays sufficiently widespread decoder-side it should allow JPEG to improve quite a lot while retaining compatibility with existing decoders, wouldn’t it?
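(For reference, a global single-window SSIM, a crude stand-in for the sliding-window version an encoder would actually optimize, can be written as follows; this is just the textbook formula, not any particular encoder’s implementation:)

```python
def ssim_global(x, y, L=255):
    """Single-window SSIM over two equal-length lists of 8-bit samples.
    Real implementations average SSIM over small sliding windows; this
    global version only illustrates the formula."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2   # standard stabilizers
    return ((2 * mx * my + c1) * (2 * cov + c2) /
            ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

identical = ssim_global([50, 80, 120, 200], [50, 80, 120, 200])
degraded = ssim_global([50, 80, 120, 200], [52, 78, 125, 195])
```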

Given the information that Hipix performs better on higher resolution images, I re-tested using some of the images from http://www.imagecompression.info/test_images/
Unfortunately Hipix performed similarly as it did with the smaller < 1MP images, ie. results were similar to x264 targeting PSNR. x264 with psychovisual/HVS optimizations is significantly superior on every image tested. To help Hipix's cause, I suggest providing a specific example where it performs better than x264 that the public can verify.

Regarding site colour scheme: while I prefer bright text on dark background (just check out my Windows colour scheme: http://eternallybored.org/imgs/thebat.png), I think that the text on this site is a bit too bright – however since most sites use black text with white background, I browse most of the time with custom CSS enabled anyway, so it doesn’t bother me as much.

Dear Han, since Hipix was designed to enhance the efficiency of “existing h.264 encoders” to compress still images, and to utilize existing HW codecs within the HW constraints (like coding 12MP images using a 720p encoder), I’ll be happy to explore the possibility of enhancing the quality further by using the x264 code within the scheme for PC SW, which we want to be distributed freely (within the limits of the h.264 licensing terms). If it is of interest to you, contact me at ira@human-monitoring.com and we’ll be more than happy to cooperate.

Just wanted to say: if I were Google and trying to improve my image format, I don’t think I could have made a better move than to post it early. Just reading through this thread, they would have most of the brainstorming already done for them, and ready to organise.

PNG is inferior itself: it *only* does lossless compression (hello, tenfold increase in bandwidth; oh wait, wasn’t this what Google meant to address?), but even at lossless compression it is easily beaten by jpeg XR or jpeg 2K. Screenshots of desktops dominated by gui elements are the exception, but hey…

On photographic image?
Quick test here (only with an old HD Photo plugin though – should be at least similar) shows not. Also, it is with a pngout, while no such optimizer was used for the hdp/jxr image.
I don’t even know where to get a proper/good jpeg XR encoder, much less some that would be guaranteed to have better than average compression.

Guys, soon we’ll release Hipix plugins for a few popular picture-editing tools like Photoshop and XnView. The main issue for us is hi-res photos coming from cellular handsets with 8, 12, and soon 16MP, where the CMOS output is compressed using Hipix far better than with JPEG-XR, JP2K, or WebP. We also support the 4:2:2 color format (as coming from the CMOS) and EXIF import/export. PNG is worth nothing for this purpose (as well as for pictures as pictures); it is a great format for other purposes. We are operating in a space which is focused on EXISTING HARDWARE. JP2K never got there, and neither did JPEG-XR. Show me a codec which can encode 60MP per second on an existing cellular device with a 30% higher file size than Hipix and I’ll call it competition.

Next to JPEG2000, LuRaWave (developed by the German Aerospace Company, Deutsche Luft- und Raumfahrt-Technik) could have been one of the best technologies to compete with JPEG. If only it hadn’t relied on plugins…

I haven’t seen any image quality comparison based on only one image and one compression point like this. We can only say this is a special-case analysis of these image coders. In this special case, we can see that VP8 turns the blocky effect into a blurred effect by using the deblocking filter. If you don’t like this blurred image, just turn off the deblocking filter. (P.S. since this is intra coding, the decoder can choose to turn the deblocking filter on or off. I see blocky effects similar to JPEG’s in this case if I turn off the deblocking filter when decoding.) If you want less blocking, just decrease the DC QP value. BTW, the flat quantization matrix for the AC coefficients of VP8 is designed to optimize PSNR, not this kind of subjective viewing.

Please never compare image coders using such small data set and configuration. It is much more misleading than any other comparison.

BTW, there is indeed one problem with VP8 intra coding for still images: it can’t provide lossless and fast transcoding for image rotation and flipping like JPEG can. The intra prediction makes this kind of transcoding lossy and time-consuming.

It has nothing to do with wrong promotion, better compression than JPEG, 20 years of JPEG, etc. etc. etc.

It has to do with who came first and got standardized into a wealth of software and hardware just about the same time as the internet really ignited, blew up in our faces, and spread over to phones and other media devices. This WebP effort is way too late. It’s the same argument as OGG vs MP3. Which came first? MP3. MP3 is and will always be the most supported/common format around, even if it supposedly has flaws compared with OGG. Most people don’t care; MP3 was available first and is supported everywhere. That’s the easy answer to why we do not need JPEG improvements or MP3 improvements; in any case it would take several decades before something like this becomes reality and gets implemented. There are way too many newer video and audio formats to confuse us already; let’s pretend it doesn’t happen to pictures too.

Jason, thank you very much, very interesting conclusions, but let me add a few dimensions of my own to your model:

1st Dimension – patent cleanliness
In this model, one added dimension is patent cleanliness. It is no secret to anybody that the scientific clusters (R&D bundles) supplying modern standards have been intertwined since the 1930s. Not only H.264, but all modern “technology bundles”, such as LTE, WiMAX, etc., have at minimum 50 years of iterative research history behind them. Just picture the iterative graph of research history for all the H.264-related mathematical and engineering work. Almost 99% of this stuff you, and personally I, absorbed as empirical knowledge. What is the point of blaming one or another group of people for applying an algorithm similar to another algorithm (or even a set of algorithms for the problem area)?

It seems that the 20th century arrogated to itself the right to everything newly invented. We are just lucky that the earlier centuries did not do the same; otherwise we could not even have breakfast, lunch, and dinner, or attend to our natural needs.

The main point of Google’s VP8 initiative, in my opinion, is not some qualitatively new innovation. It is simply the creation of a (nearly) innovative flow, free from the patent claims known to Google. Although, of course, given the very troubled patent system of the USA, it is not clear what Google still has hidden up its sleeve. And once this core is formed, the most ambitious people with initiative, such as you, simply have to extend it with empirical and theoretical contributions.

2nd Dimension – energy
It is absolutely clear that now is the time to freeze a full standard for digital content and escape for good from the hell of historical (almost analog) layers in the media. Yes, most of these transformations concern meta-information. However, the major headache is converting heavy media content, such as photos, audio, and especially video. It is quite obvious that we simply do not realize the energy scale of this problem (no power-consumption example needed, because everyone reading this blog already knows about it).
The video industry also has absolutely wild time-domain PWM-like interpolation schemes, such as interlacing and the inverse telecine process, requiring very power- and energy-intensive decoding with incoherent results, because the path back to the source material will not survive 5–10 years!

3rd Dimension – politics
Some states, the so-called “third world” states, have patent policies that differ significantly from the patent policy of the US. And these states are home to 150+160+2500+1500+500 = 4.81 billion people. You can be sure that no one there will ever transfer a cent of money for a patent pool like H.264’s. Why, indeed, for any other list? Because it is too comprehensive and contains a large number of patents intersecting with the local patents of these countries.
In order to avoid problems related to discrediting US patent law, most of the patents on H.264 will one way or another be released in foreign areas.

Some ideas on targets for the x264 project:
1. Create an umbrella of empirical algorithms, especially motion compensation/estimation algorithms, that are maximally protected from patents. Most likely, they will also be used later in VP8.
2. Don’t scold VP8; let the guys relax.
3. R&D: take a still more serious approach to the issue of hw-sw partitioning, introduce a third level of abstraction for the internal API/thread model, and start experiments at least with CPU-based OpenCL (blocked, attached to the current thread model).
4. R&D: redesign the heterogeneous, OpenCL-based thread model.
5. Evolve toward an OpenCL-based heterogeneous design.
Without going into details, OpenCL is the only way for a programmer to get from the von Neumann model to the hardware pipeline without serious bugs in the brain.