Choose your variables wisely…

I am stuck at home with a high fever, so I hope this post is going to make sense. We’ll see! I named it after a miracle remedy that I would love to have at hand.

One of the fundamental choices any programmer has to make is the data type of each variable. In the past, Matlab worked only with the double data type. It has since been extended with a variety of classes that are important to know. Still, unlike C, Matlab has a default data type (double), so you can get along for a long time without bothering about this. In fact, double is a very conservative choice that you rarely need, and the memory usage of variables is often overlooked.
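As a quick illustration of the default, here is a minimal sketch (the variable names are only for the example) that asks Matlab which class it picked:

```matlab
a = 3;             % no type given, so Matlab stores a double
b = [1 2 3] / 2;   % ordinary arithmetic on doubles stays double
c = uint8(3);      % any other class needs an explicit conversion

disp(class(a))
disp(class(b))
disp(class(c))
```

Only the last variable uses something other than the 8-byte default, and only because it was asked for explicitly.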

There are 11 numerical data types: double, single, int8, int16, int32, int64, uint8, uint16, uint32, uint64 and logical. These data types differ in the way they store information and in their intrinsic resolution. Please take a look at the following table:

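| Class | Max value | Min value | Bytes | Smallest difference |
|---|---|---|---|---|
| logical | 1 | 0 | 1 (yes, Matlab wastes 7 bits here) | 1 |
| int8 | 127 | -128 | 1 | 1 |
| int16 | 32767 | -32768 | 2 | 1 |
| int32 | 2.14e+09 | -2.14e+09 | 4 | 1 |
| int64 | 9.22e+18 | -9.22e+18 | 8 | 1 |
| uint8 | 255 | 0 | 1 | 1 |
| uint16 | 65535 | 0 | 2 | 1 |
| uint32 | 4.29e+09 | 0 | 4 | 1 |
| uint64 | 1.84e+19 | 0 | 8 | 1 |
| single | 3.40e+38 | -3.40e+38 | 4 | 1.1755e-38 |
| double | 1.79e+308 | -1.79e+308 | 8 | 2.2251e-308 |

For single and double, the last column is the smallest positive normalized value (what Matlab calls realmin). The gap between two neighbouring floating-point values is not constant: it grows with the magnitude of the number and is about 1.19e-07 for single and 2.22e-16 for double near 1.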
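The limits in the table can be queried from Matlab itself. The sketch below is one way to do it, using the built-in functions intmin, intmax, realmax, realmin and eps. The loop and the variable names are just for illustration.

```matlab
% Range of every integer class
intTypes = {'int8', 'int16', 'int32', 'int64', ...
            'uint8', 'uint16', 'uint32', 'uint64'};
for k = 1:numel(intTypes)
    t = intTypes{k};
    fprintf('%s (min, max):\n', t);
    disp([intmin(t), intmax(t)]);
end

% Floating-point classes: largest value, smallest positive
% normalized value, and spacing of representable numbers near 1
floatTypes = {'single', 'double'};
for k = 1:numel(floatTypes)
    t = floatTypes{k};
    fprintf('%s: realmax = %g, realmin = %g, eps = %g\n', ...
            t, realmax(t), realmin(t), eps(t));
end
```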

Single and double are not stored the same way as integers. They are “floating-point” numbers. This means that instead of using all 32 or 64 bits to store the digits of the number, some of these bits are dedicated to storing the position of the “point” (the exponent). This enables Matlab to cover a much larger range of numbers than a pure 64-bit integer coding would allow (as can be seen when you compare double and int64), at the price of not representing every value in that range exactly.
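To make the trade-off between range and exactness concrete, here is a small sketch comparing double and int64 around 2^53, the point where a double runs out of bits for consecutive integers. The variable names are illustrative.

```matlab
big = 2^53;                          % 9007199254740992, stored as a double
lostInDouble = (big + 1 == big);     % true: adding 1 changes nothing
gap = eps(big);                      % distance to the next representable double

bigInt = int64(9007199254740992);    % the same value as a 64-bit integer
keptInInt64 = (bigInt + int64(1) ~= bigInt);   % true: the integer counts on

% The range is the other side of the bargain
rangeRatio = realmax / double(intmax('int64'));

disp(lostInDouble)
disp(gap)
disp(keptInInt64)
disp(rangeRatio)
```

Above 2^53 a double can only step in increments of 2 or more, whereas int64 still counts one by one up to its (much smaller) maximum.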

Indeed, as you can see, the two have very different ranges. Double has a huge range of values available to it (up to about 10^308!). If you need to store real data (as opposed to integers), then single is most likely good enough for you. You’ll save half of your memory usage!
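The factor of two is easy to check with whos. This is a minimal sketch on random data, so the array size and names are arbitrary choices for the example:

```matlab
xDouble = rand(1000, 1000);      % default class: double
xSingle = single(xDouble);       % same data rounded to single precision

infoDouble = whos('xDouble');
infoSingle = whos('xSingle');
fprintf('double: %d bytes, single: %d bytes\n', ...
        infoDouble.bytes, infoSingle.bytes);

% Largest rounding error introduced by the conversion
maxErr = max(abs(xDouble(:) - double(xSingle(:))));
fprintf('largest conversion error: %g\n', maxErr);
```

For values between 0 and 1, the conversion error stays below 1e-7, which is far smaller than the noise of most measurements.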

Some complicated mathematical operations do need double to avoid numerical inaccuracy, but 99% of the time you are good to go. Besides, a Matlab function that insists on double will usually say so with an error rather than fail silently. Most Matlab functions accept both single and double.
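One case where single does run out of precision is counting past 2^24. The sketch below is an illustrative example rather than one taken from a real analysis. It also shows that mixing single and double in an expression gives a single result.

```matlab
s = single(16777216);            % 2^24
stuckSingle = (s + 1 == s);      % true: 2^24 + 1 does not exist in single

d = 16777216;                    % the same value as a double
stuckDouble = (d + 1 == d);      % false: double still has room

mixedClass = class(single(1) + 2);   % 'single': the smaller class wins

disp(stuckSingle)
disp(stuckDouble)
disp(mixedClass)
```

So a running sum or a sample counter kept in single can silently stop increasing, which is the kind of operation where double is still the safer choice.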

Integers are good if you only need to apply simple arithmetic to your data. You can still divide two integers, but the result is rounded to the closest integer. In image processing, for instance, nearly all functions accept any data type.
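Here is a short sketch of what integer arithmetic does in practice. Besides the rounding on division, it shows one behaviour the text above does not mention: Matlab integers saturate at the limits of their class instead of wrapping around. The values are arbitrary examples.

```matlab
q = int8(7) / int8(2);             % 3.5 is rounded to the nearest integer: 4
hi = int8(100) + int8(100);        % 200 does not fit in int8: saturates at 127
lo = uint8(5) - uint8(10);         % -5 does not fit in uint8: saturates at 0

% idivide lets you choose another rounding rule explicitly
qFloor = idivide(int32(7), int32(2), 'floor');   % 3

disp(q)
disp(hi)
disp(lo)
disp(qFloor)
```

Saturation is convenient for images (a pixel cannot become brighter than white), but it also means an overflow will not raise an error, so it is worth checking the expected range before choosing a small integer class.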

Why is all this important? Let’s consider two examples:

If you record at 100 kHz (a standard data rate for many acquisition cards) for 100 seconds, you end up with 10^7 data points. In double precision, that means 80 megabytes. This is quite heavy on memory but, more importantly, on the hard drive if you want to save your trace.
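As a sketch of how this plays out, the trace can be preallocated directly in a smaller class by passing the class name to zeros. The choice of single here is an assumption for the example; the right class depends on what your acquisition card delivers.

```matlab
fs = 100e3;                      % sampling rate in Hz
duration = 100;                  % recording length in seconds
nSamples = fs * duration;        % 1e7 data points

bytesIfDouble = nSamples * 8;    % what the default class would cost

% Preallocate the trace as single instead of the default double
trace = zeros(nSamples, 1, 'single');
info = whos('trace');

fprintf('double: %g MB, single: %g MB\n', ...
        bytesIfDouble / 1e6, info.bytes / 1e6);
```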

If you are playing with movies, it becomes even more critical to use the smallest data type possible. For example, a typical camera can record around 500 by 500 pixels. If you image at 20 Hz (just faster than persistence of vision), then 100 seconds of movie is 500 x 500 x 20 Hz x 100 seconds x 8 bytes = 4 gigabytes in memory in double precision! Most of the time, cameras give you 16-bit or even 8-bit integers, so you can save a factor of 4 (or even 8) in memory and on your hard drive (I am not talking about compressing the data yet, this is raw).
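The same arithmetic can be laid out for each candidate class. This sketch only computes the sizes and does not allocate the movie. The frame size, rate and duration are the ones from the example above.

```matlab
nPixels   = 500 * 500;           % pixels per frame
frameRate = 20;                  % frames per second
duration  = 100;                 % seconds
nValues   = nPixels * frameRate * duration;   % 5e8 pixel values in total

classNames    = {'double', 'single', 'uint16', 'uint8'};
bytesPerValue = [8, 4, 2, 1];
movieBytes    = nValues * bytesPerValue;

for k = 1:numel(classNames)
    fprintf('%-7s %4.1f GB\n', classNames{k}, movieBytes(k) / 1e9);
end
```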

In a follow-up post, I go into more detail on how to convert between these types and why you would want to do that.