I'm working on a program that needs to run on a Linux distro with an ext2 filesystem. This program will write files which may become very large. I notice that ext2 has a maximum file size of 16GB to 64GB. However, one thing on Wikipedia's page that scared me somewhat is the following line:

There are also many userspace programs that can't handle files larger than 2 GB.

...when it's talking about ext2's limitations. Does this mean that I should be careful about letting a file grow larger than 2 GB?


He is working on a PROGRAM that is handling files. Why on earth do you think this is about file systems or administering a server? Shouldn't a programmer tell him what happens when a file crosses the 2 gig boundary?
– Hassan Syed, Dec 8 '09 at 15:24

4 Answers

What you'll find is that some programs use 'fseek' to move around in a file.

int fseek ( FILE * stream, long int offset, int origin );

If they seek relative to the start of the file (SEEK_SET for the origin parameter), the offset parameter is a long. On 32-bit systems that is a signed 32-bit integer, so they can only get 2 gig into the file. On 64-bit Linux a long is 64 bits wide, so this particular limit does not apply there.
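Since the limit comes from the width of long on the build in question, it can be checked directly. The following is only a sketch: fits_in_fseek_offset and report_fseek_limit are hypothetical helper names, not part of any library.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: can this absolute byte position be passed to
 * fseek(stream, offset, SEEK_SET), whose offset parameter is a long? */
bool fits_in_fseek_offset(unsigned long long position)
{
    return position <= (unsigned long long)LONG_MAX;
}

/* Hypothetical helper: print the width of long and the largest
 * SEEK_SET offset that plain fseek can express on this build. */
void report_fseek_limit(void)
{
    printf("long is %zu bits; largest fseek offset is %ld bytes\n",
           sizeof(long) * CHAR_BIT, LONG_MAX);
}
```

On a build where long is 32 bits, fits_in_fseek_offset returns false for any position of 2 GiB or more; where long is 64 bits, the limit is far beyond any ext2 file size.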

For programs that don't use fseek/ftell (for instance, a program that just reads through the entire file in a linear fashion), and for programs that only use fseek to jump back and forth slightly from the current position (SEEK_CUR with offsets < 2G), the offset parameter is not a problem, no matter how big the file is. It's programs that randomly access the file data that are going to run into it. One caveat: a 32-bit program built without large file support can still fail to open a file over 2 GB (EOVERFLOW) or to write past that point (EFBIG).
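As an illustration of the linear case, here is a minimal sketch of a reader that never calls fseek or ftell and keeps its own 64-bit byte count. The name count_bytes_in_file is hypothetical, and the sketch assumes the build can open the file in the first place (a 64-bit build, or a 32-bit one compiled with -D_FILE_OFFSET_BITS=64).

```c
#include <stdio.h>

/* Hypothetical helper: read a file front to back and return its length
 * in bytes, or -1 if it cannot be opened or a read error occurs.
 * No fseek/ftell calls are made, so no long offset is ever involved. */
long long count_bytes_in_file(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    unsigned char buffer[65536];
    long long total = 0;
    size_t n;

    /* fread returns the number of bytes read; 0 means EOF or error. */
    while ((n = fread(buffer, 1, sizeof buffer, fp)) > 0)
        total += (long long)n;

    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : total;
}
```

The running total is a long long, so the count itself does not wrap at 2 GiB even where long is 32 bits.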

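Note that some environments have 64-bit variants of these functions. POSIX systems provide fseeko and ftello, which take an off_t that is 64 bits wide when a 32-bit program is compiled with -D_FILE_OFFSET_BITS=64, and glibc also offers fseeko64 and ftello64. These give the caller a 64 bit signed integer, and thus access to anything they want.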
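On Linux, the usual 64-bit route is the POSIX pair fseeko/ftello together with the _FILE_OFFSET_BITS=64 feature macro. A minimal sketch, assuming a glibc-style environment; write_byte_at is a hypothetical helper name:

```c
#define _FILE_OFFSET_BITS 64  /* must come before any #include */

#include <stdio.h>
#include <sys/types.h>

/* Fail the build if off_t is still 32 bits wide. */
_Static_assert(sizeof(off_t) >= 8, "build with -D_FILE_OFFSET_BITS=64");

/* Hypothetical helper: create (or truncate) the file, write one byte at
 * an absolute offset, and return the resulting file position, or -1 on
 * failure. */
off_t write_byte_at(const char *path, off_t offset, unsigned char value)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    off_t end = -1;
    if (fseeko(fp, offset, SEEK_SET) == 0 && fputc(value, fp) != EOF)
        end = ftello(fp);

    if (fclose(fp) != 0)
        return -1;
    return end;
}
```

Calling it with an offset such as (off_t)3 << 30 should leave a file whose apparent size is just over 3 GiB, typically stored sparsely, which is a quick way to test how other tools cope with large files.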

I've never had problems, and my system logs are routinely larger than 2 gigs on some of my servers with external IPs (the logs rotate weekly, not by size). I also run a couple of massive feeds that produce files that are 3-6 gigs in size, and I haven't had problems with those either.

I'd say it's completely dependent on what user-land programs you need: if there is a deal breaker, you may need to re-evaluate.

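The file size limit is very dependent on the block size of your file system. The single file limit is about 16GB if you have a 1K block size, 256GB for 2K and 2TB for 4K (the block map alone could address about 4TB at 4K, but the inode's 32-bit count of 512-byte sectors caps a file at 2TB). You can check your block size using: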

This is on an ext3 partition, but they'll have the same limits. I would be very surprised if you have a 1K block size partition, and as such, you don't need to worry about the file system.
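To make the block size arithmetic concrete, here is a rough sketch of where those limits come from. It assumes the classic ext2 layout (12 direct block pointers plus single, double and triple indirect blocks, each holding 4-byte pointers) and a cap of 2^32 512-byte sectors per file. Both function names are hypothetical, and the model is a simplification: the kernel also accounts for the indirect blocks themselves, so the real cap is slightly below 2 TiB. Under this model the 4K case is capped by the sector count rather than reaching the roughly 4 TiB that the block map alone would allow.

```c
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical helper: rough upper bound on ext2 file size, in bytes,
 * for a given block size. Simplified model, not the kernel's exact math. */
uint64_t ext2_max_file_size_estimate(uint64_t block_size)
{
    uint64_t ptrs = block_size / 4;        /* 4-byte pointers per block */
    uint64_t blocks = 12                   /* direct blocks */
                    + ptrs                 /* single indirect */
                    + ptrs * ptrs          /* double indirect */
                    + ptrs * ptrs * ptrs;  /* triple indirect */
    uint64_t by_block_map = blocks * block_size;
    uint64_t by_sector_count = (UINT64_C(1) << 32) * 512;  /* 2 TiB */

    return by_block_map < by_sector_count ? by_block_map : by_sector_count;
}

/* Hypothetical helper: block size of the filesystem holding `path`,
 * as reported by statvfs(), or 0 on error. */
unsigned long fs_block_size(const char *path)
{
    struct statvfs info;
    if (statvfs(path, &info) != 0)
        return 0;
    return info.f_bsize;
}
```

Passing fs_block_size("/path/on/the/volume") into ext2_max_file_size_estimate gives a ballpark figure for that volume; for 1K blocks the estimate is 16843020 KiB, a little over 16 GiB.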

Having said that, some programs do lack large file support (files larger than 2GB), but I've not seen one in a very long time. The last one I saw was jsvc from Apache Commons Daemon, which fell over when its log file got larger than 2GB. Pretty much anything written in the last 6 years will work unless someone went out of their way to do something weird.