Tuesday, January 27, 2009

Recently I faced a problem related to short text compression. The idea was to reduce the space needed to save SMS messages. Since each SMS contains up to 160 characters the classic LZW or Huffman methods won't work out-of-the-box, mostly because of the size of the resulting dictionary. It is even worse if we consider that most messages are less than 80 characters long.

Finally I decided to use a fixed dictionary and a sort of hybrid Huffman code compression (which isn't Huffman at all, but it retains some similarity -somehow). Then I started looking for letter, digraph and trigraph probability in words and texts. There are many resources on the net like this and this one where symbol probability is listed for different languages, including Spanish which was the one I used.

The compression algorithm tries to use the minimum number of bits to encode frequent letters/digraphs/trigraphs. It is not optimal since the dictionary is fixed but it does a good job, reducing message size to 60-70% on most cases. If the right text is picked then it can be reduced to 30% but that is cheating, of course! The space character is one of the most used ones along with the 'e' letter (Spanish at least). Trigraphs and digraphs also play an important role in compression.

Lower/uppercase letters is another issue, but since SMS messages are mostly written in lowercase or uppercase that is not a problem. A trick is to invert the whole text to see which text gets the best compression ratio. This is quite fast since the algorithm is simple and the strings are short. Another trick would be to provide more than one dictionary (maybe three or four) and see which one does better with the desired message. The resulting space overhead is about two or three bits which should be acceptable for long messages.

Another possibility is to compress many messages together with Huffman or any other compression method. The drawback is that the message won't be a unit itself and then message management becomes messy.

Friday, January 16, 2009

Measuring and observing usually (if not always) change what one is measuring. I've never seen Schrödinger's cat, at least not when leading with embedded systems nor in my garden, but the former implication causes many problems when trying to debug time-critical functions/operations in embedded software and hardware.

Usually a printf-like UART output works but that would slow down or even hang code which can't wait for the UART to stream its data out. Even though with a substantial buffer it will come a time where it will get stuck.

There is a nice, well known techinque, that uses output pins to reflect some state, such as which function or interrupt handler the microcontroller is doing at the time. Since setting the output value requires only a few instructions, this approach suits better the time-critical scenarios. This technique is helpful in many ways: one may measure from interrupt latency and cpu usage to interrupt or thread deadlock.

That's how the following code came up to mind, here is a simple header file which is based on the macros I posted some time ago here.

There is a small drawback: the do-while structure won't let any variable declared inside 'block' to be visible outside. That can be solved by removing the do-while, if you need it that way just go ahead and delete those two lines.

Also notice that a return in 'block' won't allow the macro to clear the corresponding pin, so applying the macro to a function with many return points is messy. The alternative is to change the original function's name and call it through a sort of wrapper which does it inside a DBGPIN_BLOCK macro. It's not pure beauty but it works nice, and once you've done the debugging it can be disabled by changing DBGPIN_ENABLED to 0.

Sunday, January 4, 2009

Qt-embedded is astonishing. I'm currently developing a product which uses embedded linux, color 800x480 LCD and touchscreen. Qt-embedded does really nice, faster than gtk on DirectFB and much better-looking.

However, qt won't show any notification to the user when it is busy, be it refreshing an window, loading a frame, etc. This behaviour is somewhat expected, but on a product I'm developing the user needs to be notified (somehow) that the application is doing alright but is busy, so the [in]famous sand clock jumps in.

Since it's a touchscreen device the mouse cursor was disabled when compiling Qt, and in the end it was better since it looks like only B/W cursors are supported on embedded linux (or at least in this platform?).

I thought of a trick by using QTimer, and later I found out some posts when googling. Here is what I finally did:

#ifndef _busy_notify_h_#define _busy_notify_h_

#include <QTimer>#include <QApplication>

/*** Class to show a bitmap inside a widget when* the event loop is busy.*/

class BusyNotify : QObject{ Q_OBJECT

public: // Call to initialise an instance static void init();

BusyNotify(); /** * Show the busy cursor, * it will disappear once the event loop is * done and the timer times out */ void showBusyCursor();

After defining the source and header files above one just needs to call BusyNotify::init() once, at program startup, and then BusyNotifier->showBusyCursor() any time the sand clock is needed. 'showBusyCursor()' comes from an old version based on the mouse busy cursor.

How it works

This class works fine when processing or loading (widgets, etc.) is done in the main loop which is also Qt's event processor. When calling showBusyCursor() Qt will be forced to show the sand clock. Then processing can be done and the QTimer will timeout whenever Qt is able to process events once again (since the 100mseg delay is small enough), so the sand clock will remain on the screen as long as needed. Of course, if processing is done in another thread this won't work, but that's not what it is intended for.