There are two colorspaces commonly used in video today, which are defined in Rec. 601 and Rec. 709 respectively. Simply speaking, Rec. 601 is mainly used for analog sources, while Rec. 709 mainly is for HD TV and BD.

So how do VLC handle this? It assumes everything is Rec. 601 and you get something like this:

The bottom left is from a DVD and the top right is from the BD release. In comparison, here is how it looks in Overmix, using Rec. 601 for the DVD release and Rec. 709 for the BD release:

VLC also seems to ignore the gamma difference between Rec. 601/709 and sRGB, and it handles 10 bit content in a way that reduces color accuracy to worse than 8 bits sources. Behold, the histogram from a Hi10p source:

Free stuff might be nice, but this is what you get…

EDIT: I messed up the studio-swing removal in Overmix (which is now fixed), so the colors were slightly off. It was consistent between rec.601/709 so the comparison still holds. Overmix might be nice, but this is what you get…

Just five months later… Here are some early results using artificial data.

Using Wikimedia Commons “picture of the day” for October 31. 2013 by Diego Delso (CC BY-SA 3.0), I created LR (Low Resolution) images which were 4 times smaller in each direction. Each LR image had its own offset, so to have one LR image for all possible offsets, 16 images was created.

To detect the sub-pixel alignment afterwards, the images were upscaled to 4x their size and ordinary pixel-based alignment was used. The upscaled versions were only used for the alignment and thus discarded afterwards. The final image was then rendered at 4x resolution using cubic interpolation, but taking the sub-pixel alignment into account. Lastly the image was deconvolved in GIMP using the G’MIC plugin to remove blur. The results are shown below:

Left side shows the LR (shown upscaled using Nearest neighbor interpolation) and original image respectively. Right side shows the SR (Super Resolution) results, using different interpolation methods. Both are cubic, however the top is using Mitchell and the bottom is using Spline. In simple terms, Spline is more blurry than Mitchell but has less blocking artifacts. Mitchell is usually pretty good choice (as it is a compromise between several other cubic interpolation methods), however the blocking is pretty noticeable here. Using Spline here avoids that and since we attempt to remove blur afterwards it works pretty well. However do notice that Mitchell does recover slightly more detail in the windows to the right.

But while Mitchell often does appear to be slightly more sharp, it tends to mess up more often, which can clearly be seen on the “The power of” building to the left. The windows are strangely mixed up into each other, while they are perfectly aligned when using Spline.

Conclusion

Results are much better than the LR images, however it is more an magnification of 2x instead of the optimal 4x. And to make matters worse, this is generated optimal data without blur or noise.

However this is the simplest way of doing SR and I believe other methods do give better results. Next I want to try the Fourier-based approach which is also one of the early SR methods. It should give pretty good results, but it is not used much anymore because it does not work for rotated or skewed images.

Using artificial data has really shown me why I have had so little success with it so far. I’m mainly working with anime screenshots and the amount of detail which can be restored is probably not that much. My goal is actually more to avoid blurriness that happens when they are not aligned perfectly. Thus while it should have been obvious, lesson learned, do not test on data which you are not sure whether will give an result or not… What I did gain from this is that anime tends to be rather blurry and that image deconvolution can help a lot. When I understand this blurriness in detail I will probably write more about it though.

As I was researching on digital signal processing I found an interesting term: Super Resolution. Super Resolution is a field which attempts to improve the resolution of an image, by using the information in one or more images. This is exactly what I was doing with Overmix, using multiple images to reduce noise.

However another aspect of Super Resolution use sub-pixel shifts in the images to improve the sharpness of the image. This could not only solve the issue with the imperfect alignment I was having, it could straight out improve the quality further than I had thought possible.

(I had actually tried to use sub-pixel alignment when I ran into the issue and I speculated it might could increase sharpness. But after much work I only managed to make it align properly without reducing the blur I was having even without it, so I didn’t press it further.)

Limits

Super Resolution has it limits however. First of all, as it tries to estimate the original image, it cannot magically surpass it and give unlimited precision. If the image was created in “480p”, even a 1080p BD upscale will still only give the “480p” image. If the original was blurry by nature, Super Resolution will result in a blurry image as well, unlike a sharpness filter.

And that raises the question, why is anime blurry and why does it not align on the pixel grid? With one sample, I got the same misalignment with both the 720p TV version and the 1080p BD version. If this was caused by downscaling the issue would be smaller at 1080p, however it isn’t. Most anime does not appear to push the boundaries of 1080p, but since there are misalignment issues I suspect their rendering pipeline isn’t optimal.

The other limit is the available images used for the estimation. If the images we have does not contain any hints on what the original image looks, we can’t guess it. Thus if there are no sub-pixel shifts in an image, Super Resolution can’t do much. And that is actually an issue because most slides only moves vertically which means we only have vertical sub-pixel shifts. In those cases we can only hope to improve detail in the vertical direction.

Using all available information

Since Super resolution uses the information in the images, the more we can get the better.

First of all, the closer we can get to the source the better, as we don’t have to estimate the defects that happens on each conversion. A PNG screenshot is better than a JPEG, and the TV MPEG2 transport stream is better than a 10-bit re-encode.

One thing to notice here is that the PNG screenshot is (with all players I have tried) a 8-bit image, not 10-bit (16-bit*) for Hi10p h264. So using PNG screenshots would loose us 2 bits.

However more importantly, PNG cannot represent an image from a MPEG stream directly. The issue is that PNG only supports RGB and MPEG uses Y’CbCr. Y’CbCr is a different color space invented to reduce the required bandwidth of image/video. The human eye is most sensitive to luminance and not so much to color, which Y’CbCr takes advantage of. MPEG then (normally) uses Chroma subsampling which is the practice of reducing the resolution of the planes containing color information. A 1280×720 encode will normally have one plane at 1280×720 and two at 640×360.

So to save as a PNG, the video player upscales the chroma planes and converts to RGB, losing valuable information.

Going even further, video is compressed using a combination of key- and delta-frames. Key-frames stores a whole image while delta-frames only stores how to get from one frame to another. The specifics about how those frames were compressed is again valuable information. (But I don’t know much about how this is done.)

Status of Overmix

Overmix now accepts a custom file format which can store 8- and 10-bit chroma subsampled Y’CbCr images. I created an application using libVLC that takes the output with minimal preprocessing and stores it in this format. (It also makes it easier to save every frame in the slide.)

Overmix now only uses the Y’ plane to align on, instead of all 3 in RGB. My next goal is to redo the alignment algorithm. Currently it renders an average of all previous added images to align on, as otherwise the slight misalignment would propagate with each added frame. However I will try to use a multi-pass method now, where it will roughly align all images and then do a sub-pixel alignment on the images afterwards. Sub-pixel alignment will, at least in the start, be done by upscaling as optical flow makes no sense to me yet.

Then I need to redo the render system, as it is currently optimized for aligned images, and this will clearly not be the case anymore.

I haven’t worked on Overmix for quite some time due to University stuff, but the next three months I should have plenty of time, so hopefully I will get it done before that is over.

I have been developing a new application named Overmix, which attempts to improve the quality of anime screenshot stitching. This article will shortly explain what stitching is, what issues affect the quality and how Overmix tries to fix those. At the end a short summery of the results for the current progress is given.

Background

One common animation technique is panning where the camera moves/pans over the image, showing only a part of it at a given time:

Very little movement actually happens during the shot, in fact only the mouth is moving (presumably to reduce animation costs). This makes it possible to combine the frames together to one large image, which is known as “stitching”.

Source quality

The issue is however that more often than not, the video quality isn’t that great. The video has been compressed and especially if the source is a TV-transmission or webcast, visual artifacts can be quite noticeable:

Reducing artifacts

A stitch is normally done by taking two frames, finding the offset between the two images and then soften the edges between the images to make the transition less apparent (which is usually done by applying a gradient on the alpha channel).

Since this is a time consuming process, as few frames as possible is used. The idea is to do the opposite, use as many frames as possible. The reason is that the artifacts are not static, for every frame they differ slightly. In result, every frame carries a slightly different set of information. The goal is then to derive the original information, based on this set of inconsistent information.

Just by using the average, we can get quite decent results:

(Right is a single frame, left is the average of all unique frames.)

Results

Noise artifacts has shown to nearly disappear completely when simply averaging every frame with each other, even when the source has a significant amount of noise artifacts. Color banding is also reduced but with much more varying amounts.

Even with modern TV-encodes, stitches sees a significant improvement from using this technique and can visually be tell apart at normal magnification. Surprisingly, even when using good BD-encodes there is usually a slight improvement, but normally requires 2-4 times magnification to be noticeable.

It has shown that it often is not possible to make a perfect alignment when sticking to the pixel grid. This causes the images to be slightly more blurry than originally. It is an area which still requires work.

Using the average to derive the result is not always desirable, as the encode might contain information not related to the image. Such information could be subtitles, TV logos or simply errors in the source. See the following image as example, the most-right column of pixels was completely black and shows up as lines in the averaged image.

However the currently devised algorithms has a tendency to choke on the slight misalignment mentioned previously and cause unwanted artifacts. If this is solved best by fixing the misalignment or by improving the algorithm is up to discussion.

A long time ago on Mindboards some talk was made about displaying text-output like how it is done in a console, but I never ended up writing any code. Since it have been quite some while since I last wrote anything in NXC, I did this as a quick brush-up project.

Supports scrolling up and down with the left and right button on the NXT, and supports the control characters ‘\n’, ‘\t’, ‘\a’ and ‘\b’. ‘\b’ only works on the text you are currently adding though.

Since I’m not working actively on RICcreator in the moment, not much have happened the last few months, however I just rewrote some terrible code which was not safe. RICcreator would crash if the settings.xml file is formed differently than expected, so when a new version changed the format, it would crash if the old .xml file was kept.

XML handling

The XML parser previously used was rapidXML which promises great performance, and since I want to do some game related programming with XML I wanted to learn the API. However as I tried to use it with RICcreator I quickly realized it was rather tedious to work with. So I slacked on the implementation. Most importantly, I didn’t do any validation and simply chained the node lookups. So for example to get to the “settings” node in the “RICcreator” node I did this:

doc.first_node( "RICCreator" )->first_node( "settings" )

However if first_node() can’t find the node it returns NULL and in case of this the second call will try to dereference a NULL pointer and crash the application. To avoid this it should have been done like this:

This is quite some code for a simple lookup, so as said I slacked. Adding nodes to a new document (when saving the settings) was even more tedious as you had to allocate nodes and then add them. (On the other hand, I couldn’t slack here.)

So I have been looking for a simpler C++ XML parser and recently I heard about pugixml. I like the API much better, it is also a lightweight parser and apparently even faster than rapidXML so I tried it out. The previous lookup would look like this in pugixml:

doc.child( "RICCreator" ).child( "settings" )

This doesn’t share the problem rapidXML has, because child() returns a “NULL” object on failure. The NULL objects functions all return another NULL object, so you can chaining like this is completely safe as long as you check the final result.

So I have rewritten all XML handling to use pugixml now and I’m quite happy with the result. The code is a lot prettier and most importantly, it shouldn’t be able to crash like before.

Other changes

The dithering have been changed to use Filter Lite instead of Floyd-Steinberg, which produces nearly as good result but which is quite simpler (and therefore faster).

A fun addition is grayscale importing support. If you want to toy around with grayscale images on your NXT, RICcreator makes this easy by letting you import the same image several times with different thresholds in one step.

The Number opcode is now no longer aligned to multiples of 8, as this limitation has been removed in the enhanced firmware.

A few bugs have been fixed and it should be possible to compile in Visual Studio again. (I haven’t checked after rewriting the XML stuff though.)

The time has finally come where I ventured out in the myriad of paid hosting providers. Free hosting isn’t really worth it and WordPress.com doesn’t allow you to do anything (without paying buckets). WordPress.com is fine if you just want to have a blog, however as a CS student I want to mess with everything. So I tried a local hosting provider and we will see how this goes…

Status on RICcreator

I haven’t had the time to develop on this for quite some time but I can do some work on this regularly now. However I’m not going to do that. While far from perfect, RICcreator have matured and I do no longer feel the need to use nxtRICeditv2 anymore. I haven’t even started the program up in months as I’m using RICcreator for everything RIC related now.

As I’m not seeing people use it I will simply fix bugs and add features when I want the functionality. So while not dropped, development will be slow. I will still fix bugs if anyone reports them and I’m open for feature suggestions. I will do one more release which probably will be the last major update for a while.

While I want to move my focus to other things, RICcreator marks an important milestone for me: I am now creating software which I find useful enough to use in my everyday life. RICcreator is my first medium-sized application and I’m quite satisfied with the result.

What I have been doing the last few months

Superus

I was working on in a project group for a project for CS at university, however 4 out of 7 people in the group decided to drop out of CS. So the lone 3 remainders had to write 80 pages in 2 weeks, which worn me out a bit. Anyway, the project was about image editing and we wrote a couple C programs to do several functions. The code is made available at SourceForge as Superus.

It uses a command line interface and only accept PPM input/output so it might not be that useful for many, however it can read 16-bit PPM, edit it in 64-bit float and save it as 8/16-bit using dithering to ensure maximum quality. So it is possible to do certain things which simple isn’t currently possible in GIMP. (It should also be gamma-correct unless we screwed up on the conversions.)

I have written about half of the code, the parts which handle PPM input/output, image representation (how it is stored and worked with internally in the programs), scaling, brightness, blurring/sharpening and various filters. “nielssonnich” wrote the Command line interface and scripting code.

I might continue to do some experiments with this codebase, but it should be considered to be a one-shot project though. I might also write a post about one of the cases where I had to use Superus instead of GIMP to achieve the results I wanted.

BPM graphing

I wrote a small quick program to graph the BPM changes in a .sm chart for DDR rhythmic games which you can read more about here: BPM graphing [Thirdstyle]

I was in the process of porting it to PHP but I didn’t finish it. However I want to finish it soon as I have some bigger plans (see below) which I hopefully will start on in the summer vacation. It shouldn’t be more than 1-2 days work anyway…

What I’m doing now

I have started on a new project for CS with a new group and our goal is to create a spell checker which can correct grammatical errors in Danish. We have not yet decided whether we will try to parse sentences or try to make something like a neural network, but nevertheless this will be a challenging and interesting project.

Another thing I’m working on is a custom theme for this blog. Now that I’m not limited by WordPress.com I can write my own PHP code so I can achieve just want I want. I’m going to take my sweet time on this as it is low on my priority list.

LDraw viewer

At university we are taught C#, however the exercises are quite boring as we have to write some random code which does nothing more exciting than a “Hello, World!” program. So instead I will be working on a project I wanted to do for nearly a year now, a LDraw viewer. It should load a LDraw CAD model and display it on the screen using OpenGL for the graphics.

I’m obviously not trying to create something special here, as there are already several software out there which has similar functionality. This is purely a study project and in particular I want to get a good grip on OpenGL and 3D graphics. However since I also want to cooperate this into my education I will be writing it in C#, to get some proper experience in it. I haven’t written any code so far, but I will do so within a weeks time.

NXT RPG

You heard right, I haven’t forgotten about this project. It pains me that I have not completed the ‘level 2’ mode which contains the battles, so I want to do something about that. I might do a bit of work on it from time to time, but don’t expect sudden rapid development on it ; )

6wd Off-roader

I’m still not done with this project, however I’m almost finished now. Just missing a bit of reinforcing of the mechanics and improving my software and I will call it done, I want to move on after all.

What I will do in the future

DDR for keyboarders

One of the project ideas I have on my Projects plan page and I will attempt to start on it this summer. (This was why I did the BPM graphing thing.) I’m been playing Stepmania for 4-5 years now however I absolutely hate the engine. It is slow, buggy, old-fashioned and straightforward annoying. I want to try making an engine which is very different in a lot of points but which still keeps the good old game play.

This is a very ambitious project and I might not end up with anything useful. However I want to challenge myself even further and I want to tell the community that there are people out here that wants to move on from the old DDR arcade days.

This is definitely a large project and I have to make use of a lot of technologies and techniques I haven’t used on the PC before, only with NXT or PHP. OpenGL, threading, databases and perhaps even some driver interaction stuff.

Conclusion

There is really a lot of stuff I want to do and not really that much time. Can I pull all this off without going overdue? Probably not, but I will really want to try my hardest to arrange my time so I will complete everything. I tend to waste a lot of my time doing random stuff which is really a shame when there is so much I want to do.

I have and it have started to become annoying. ‘[‘ getting replaced by ‘ %5B’ happens rather rarely, however I hate how spaces are replaced by underscores all the time. Renaming the files manually is even more annoying so I made a program to do this.

Drag and drop files into the program and click “Rename!”. (If a file is open or otherwise locked, it will currently just silently fail.) If you drop a folder, it will rename the folder, not its contents.

Use the check-boxes to turn conversions on and off. Conversions are done in the same order as the check-boxes, so if you turn “Convert spaces to underscores” remember to turn “Convert underscores to spaces” off. ; )

After a weeks delay (3 exams, playing Little Busters! for three days, and a bit of piano training) I’m finally able to do some more on RICcreator.

Lately I have been working on nxtCanvas, the class which attempts to emulate the drawing functions on the NXT.

First of all, LineOut() and RectOut() have been optimized quite a bit since they weren’t that well written… (For example, when drawing a filled rectangle, it first drew a normal rectangle and then a filled one inside that.)

Not minding the small stuff, I finally fixed the EllipseOut() implementation. After doing a bit of math it just worked much more like I wanted in a much simpler procedure… I also added fill-shape, so this is now working for both circles and ellipses. To avoid it drawing on some pixels several times it has quite a bit of exceptions, so it needs to be cleaned up someday. But well, it works.

RIC fonts

I have also finally written a simple implementation of RIC fonts. In short, it is just FontTextOut() without any special font copy options. However this much is enough to get a TextOut() and NumOut() working. I made a RIC font which looks like the default font on the NXT and TextOut() simply calls FontTextOut() using this RIC font.

So the Number opcode is now working, meaning only Polygons aren’t fully supported. And I’m not sure when this will be added, because I’m simply unsure on how it should work…

Interestingly, the font I used is different from the one used in nxtRICedit. I based it on the 1.29 standard firmware source and it looks the same when trying it on my NXT. Was the font different in the 1.0x firmwares? For now I’m not planning to emulate how it would look on the standard or older versions of the firmware, but it would be good to know if there is a difference here.

Editing nxtCanvas

I have also done a bit of work on the GUI side.

You can now edit the copy options settings which opens up the possibility of drawing in white, using fill-shape, and XOR drawing:

There are some performance issues with displaying the canvas. Right now it needs to fully redraw everything each time the nxtCanvas has been changed. It does so by drawing the entire field white and then adds the black pixels. This means the performance decreases the more black pixels there are. (So the performance issues with fill-shape is actually because of the huge black areas.)

There are two solutions to this problem. One is to make nxtCanvas draw directly on the image class QT needs. However since I want to make riclib completely independent on QT for reusability reasons, this is not an option.

The other solution is to only partially redraw the image. It would surely speed up the redrawing but nxtCanvas would need to specify the changes with every drawing operation… I will try to add this some other day.

It might be possible to take a more low-level approach to this, I will try to look into this and see if it is possible to gain anything by doing that.

EDIT START: Just tried that, made quite a difference… It needs to redraw everything each time this way, so no point in partial redrawing in nxtCanvas. Still slows down with more black pixels, but it should rather be treated as an optimization anyway… EDIT END

Future additions to nxtCanvas

Other than features to speed up redrawing, I’m also planning to make it possible to auto expand the canvas when the drawing operation attempts to draw outside the canvas. The issue here would be performance though… I’m not going to add it right now anyway though…

There are also a few functions that need to be added, like shrinking the canvas size to the area actually used, and filling an random shape with a color.

VarMaps?

The most important missing feature right now is editing VarMaps. I just can’t come up with a good interface to do this with though. The solution in nxtRICedit works, but I don’t quite like it… There must be some better way to do it, but I’m still on a loss on how this should be…

Well, this is it for now… I will upload a build in a couple of days, I want to add a little bit to the GUI first.

EDIT: it has been uploaded, revision 80.

EDIT2: I forgot to upload the font file together with it, fonts will not draw without it… Will be fixed in the next release. For now, you can place a ricfont named “font.ric” together with the .exe, it will use that one. If you really want to test it out…

\Quite some time ago, way back to the time of nxtasy.org, I was writing a Bézier curve implementation in NXC. Muntoo was also writing one at the time, you can check his out on his blog, here. His implementation is generally more feature rich, like several different drawing modes, while mine is geared towards pure speed.

To start it off, here is a screenshot of the current progress:

The process is two-step, a initiation function which calculates some basic stuff which takes 11 msec, and then the drawing functions. In the screenshot there are 10 cubic curves drawn by the general Bézier algorithm which takes 419 msec in total, so ~42 msec per function call in average. (The first draw with a new amount of control points is a bit slower as some of the calculations are reused.)

Right now there is only a general Bézier algorithm, the one I optimized for cubic curves isn’t working correctly…

Implementation – v1.1.0

The general Bézier formula looks like this:

So to draw a Bézier curve you would start with t=0, calculate the x and y positions, increment t by 1/”desired precision” and recalculate. This process is repeated until t=1. There is a lot of calculation needed for each t value, yet we might be able to do some of the calculation that doesn’t directly involve t beforehand.
We could for example calculate the binomial for each P and store it in C like this:

Implementation v2.+

NXC is compiled into a bytecode format in which math opcodes are polymorphic. If you try to multiply an array with a scalar, the scalar will be multiplied with each of the elements in the array. This is however a lot faster than manually iterating a loop multiplying each element.

So my idea was to take advantage of this by making an array containing each value of t. Instead of iterating a loop, you would simply do the math operations on this array. The array of t values would also only be calculated once in the beginning of the program, instead of each time you need to draw the curve.

So for the Bézier algorithm there would be two arrays, one containing each value of t and another containing each value of 1-t.

However we want to optimize this further. Since we start at i=0, lets consider that case:

We will only need to calculate , so lets try finding the difference between the first iteration and the next:

We can therefore get to the next iteration by multiplying into and into the next iteration by multiplying again:

This fraction can be calculated without knowing n or i, so we can calculate this at the start of the program and reuse it. We will need to calculate each time we get a new n value (but we can keep the array, so if the next curve have same n value, we also reuse this (and when done right, even the binomial)).

However a small loop is still needed to calculate the sum since ArraySum() isn’t polymorphic. Currently there is also a little bit of multiplication in the loop, but I’m working on removing that too.

The major downside with this approach is that if you use high resolutions the arrays become rather large (and they are using float to boot) so memory usage rise quite a bit. (Some temporary arrays are also needed, so it quickly becomes a few KB.)