A large PNG on disk may take up only a couple of megabytes, but I imagine that on the GPU the same image is stored in an uncompressed format that takes up much more space. Is this true? If so, how much space?

4 Answers

JPG and PNG files will almost always be smaller on disk than in memory; they need to be decompressed on the fly to recover raw RGB data, which costs extra processing power during loading and more RAM afterwards. Many modern engines therefore store the same format on disk as they do in memory, producing files the same size as the texture's memory footprint (but larger than a PNG or JPG). RGB/RGBA and S3TC/DXTn/BCn are the most widely used formats, because they can be read straight into memory without any processing (DXT textures are pre-compressed).

If you use an image with mipmaps, the texture will require 4/3 as much memory. Additionally, the texture width and height may be rounded up internally to a power of two on old or less capable hardware, and on some very limited hardware, also forced to be square.
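To illustrate both effects with a short sketch (the helper names are mine, and the next-power-of-two rounding only applies on the old hardware described above):

```python
def next_pow2(n):
    """Round n up to the next power of two (old-hardware padding)."""
    p = 1
    while p < n:
        p *= 2
    return p

def texture_bytes(width, height, bytes_per_pixel=4, mipmaps=False, pad_pow2=False):
    """Approximate uncompressed texture footprint in bytes."""
    if pad_pow2:
        width, height = next_pow2(width), next_pow2(height)
    size = width * height * bytes_per_pixel
    if mipmaps:
        size = size * 4 // 3  # a full mip chain adds roughly 1/3
    return size

print(texture_bytes(640, 480))                 # 1228800 (RGBA, no padding)
print(texture_bytes(640, 480, pad_pow2=True))  # 2097152 (padded to 1024x512)
print(texture_bytes(256, 256, mipmaps=True))   # 349525 (~4/3 of 262144)
```

Note how the power-of-two padding alone can nearly double the footprint of an awkwardly sized texture.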

More info on DXT: it's a lossy compression, meaning some color data is lost when the texture is compressed. This has a negative impact on your texture, distorting sharp borders and creating "blocks" on gradients; but the benefits far outweigh the disadvantages (if a particular texture looks horribly bad in DXT, just keep it uncompressed; the other ones will make up for the size difference). Also, since pixels are compressed in fixed-size blocks, the texture width and height must be multiples of four.

This is correct except for your first sentence - the texture's format on disk can be any highly compressed format, and so it does not take the same space on disk as in VRAM except for disk formats that are direct serializations of the memory formats.
–
user744 Nov 4 '10 at 9:33

Of course it can, but check the assets used in games built with Unreal Engine, Source, etc. They aren't usually compressed on disk, because nowadays there's more than enough disk space to leave resources uncompressed; and the space saved doesn't make up for the extra RAM and CPU time needed to decompress the files on each load.
–
r2d2rigo Nov 4 '10 at 10:48


I think you'll find that varies from engine to engine. Many of the larger engines - especially those that work on consoles - will use a disk format identical or close to the memory format. But it's pretty easy to find PC-only games shipping with PNG or JPEG assets. If you have to go to disk for a load that's going to dominate your time anyway. Plus, Mike specifically mentions PNG and JPEG as the disk format.
–
user744 Nov 4 '10 at 15:35

Most GPUs can only read a few very specific compressed formats, e.g. BC*/DXT*, not formats like PNG. So yes, for the most part it is true that a .png will take more space in video memory than on disk.

Textures can be stored compressed or uncompressed in both video memory and system memory.

For uncompressed textures, the general rule of thumb is that it will take the same amount of space in video memory as it does in uncompressed form in system memory.

For DXT1-compressed textures, the GPU stores 8 bytes for each 4x4 tile in your texture. The uncompressed data (at 8 bits per RGB channel) would ordinarily be 4x4x3 = 48 bytes, so that's a compression ratio of 6:1. For DXT3/DXT5-compressed textures, the GPU stores 16 bytes for each 4x4 tile, a lower compression ratio of 3:1.
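The per-tile arithmetic above can be sketched in a few lines (the function name is mine, not from any graphics API):

```python
def dxt_size(width, height, bytes_per_tile):
    """Compressed size: one fixed-size block per 4x4 tile.
    Width and height must be multiples of four."""
    tiles = (width // 4) * (height // 4)
    return tiles * bytes_per_tile

w, h = 256, 256
rgb_uncompressed = w * h * 3          # 8 bits per RGB channel
dxt1 = dxt_size(w, h, 8)              # 8 bytes per 4x4 tile
dxt5 = dxt_size(w, h, 16)             # 16 bytes per 4x4 tile
print(rgb_uncompressed // dxt1)       # 6  -> 6:1 ratio
print(rgb_uncompressed // dxt5)       # 3  -> 3:1 ratio
```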

There are some caveats with both uncompressed and compressed textures:

Most memory is allocated in fixed-size pages (the size of which varies between GPUs, e.g. 4 KB), and often a page is not sub-allocated and shared with other GPU data. That is, if your texture footprint is smaller than the page size, the footprint in video memory will often still be a full page.

Some GPUs have very specific alignment requirements. In the past, some GPUs required textures to be a power of two in size. This was often needed to support a swizzled representation (see Morton ordering: http://en.wikipedia.org/wiki/Z-order_(curve)) that improves access locality when sampling from the texture. This meant that textures of odd sizes would be padded to preserve these requirements (typically the driver handles this padding). While Morton order is not necessarily used in modern GPUs, there may still be bloat to satisfy a particular GPU's requirements.

Multiple representations of your texture may exist in memory at any point in time, especially if you're using discard locks on them. This can bloat your memory usage until the older representations are no longer in use by the GPU (which typically lags CPU rendering by a few frames).

If you enable mipmapping, the additional mips will consume on average around a third of the base mip level's memory. YMMV based on the above caveats.
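Summing an exact mip chain shows where that "around a third" estimate comes from (this sketch ignores the page-size and alignment caveats discussed above):

```python
def mip_chain_bytes(width, height, bytes_per_pixel=4):
    """Total bytes for a full mip chain down to 1x1 (no padding or alignment)."""
    total = 0
    while True:
        total += width * height * bytes_per_pixel
        if width == 1 and height == 1:
            break
        width = max(1, width // 2)
        height = max(1, height // 2)
    return total

base = 256 * 256 * 4                 # 262144 bytes for the base level
full = mip_chain_bytes(256, 256)     # 349524 bytes for the whole chain
print((full - base) / base)          # ~0.333: the mips add about a third
```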

Let's take a 256 by 256 pixel square texture. If it's uncompressed 32-bit with an alpha channel (Color in XNA) then it takes 256KB (256*256*4 bytes).

16-bit formats (eg: Bgr565) will obviously be half the size - 128KB.

Then you get onto the compressed formats. In XNA you have DXT1, DXT3 and DXT5 (also known as S3 compression). These are lossy, block-based formats - because you know which block a pixel is in, the GPU can sample from them directly. They're also faster to sample, because they use less bandwidth.

The compression ratio of DXT1 is 8:1 and for DXT3 and DXT5 is 4:1.

So a DXT1 image of 256x256 is 32KB. And DXT3 or DXT5 is 64KB.

And then there's mipmapping. If this is enabled, it creates a series of images in graphics memory, each half the width and height of the previous. So for our 256x256 image: 128x128, 64x64, 32x32, 16x16, 8x8, 4x4, 2x2, 1x1. A texture with mipmapping is approximately 133% the size of the original.
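The figures in this answer can be reproduced with a short sketch (the XNA format names appear only in comments; the helper itself is illustrative):

```python
KB = 1024

def tex_kb(width, height, bits_per_pixel, mip=False):
    """Texture size in KB; mip=True adds the ~1/3 mip-chain overhead."""
    size = width * height * bits_per_pixel // 8
    if mip:
        size = size * 4 // 3
    return size / KB

print(tex_kb(256, 256, 32))            # 256.0 (Color: uncompressed 32-bit RGBA)
print(tex_kb(256, 256, 16))            # 128.0 (Bgr565)
print(tex_kb(256, 256, 4))             # 32.0  (DXT1: 4 bits per pixel)
print(tex_kb(256, 256, 8))             # 64.0  (DXT3/DXT5: 8 bits per pixel)
print(tex_kb(256, 256, 32, mip=True))  # ~341.3, i.e. about 133% of 256 KB
```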

Images also have a pitch (or stride), which is the amount of bytes between the end of one line and the start of the next line of pixels. Nobody else has mentioned this so I could be mistaken.
–
CiscoIPPhone Nov 3 '10 at 22:57


Usually "pitch" refers to the length of a scanline in bytes (as in Freetype and SDL), and "stride" refers to the space between elements, which may be pixels or scanlines (as in OpenGL and Python's 3rd slice argument). Both values are necessary to do image processing, but "usually" pitch = width * bytes_per_pixel and stride = 0. The terms are often used loosely and confused, so it's best to check the API docs for your library of choice.
–
user744 Nov 4 '10 at 9:47