fstream - binary VS not binary mode

My question is this: apart from handling new lines (which can be 'CR+LF' or 'LF' depending on whether we're using Windows or Unix), is there any difference whatsoever between binary and non-binary mode in fstream?

To clarify my question: if my files don't contain any 'CR' or 'LF' characters, is it totally safe to switch between binary and non-binary mode without any effect?

Replies To: fstream - binary VS not binary mode

Re: fstream - binary VS not binary mode

Posted 04 February 2010 - 02:43 AM

Not really.
Text mode may translate line endings, for example between 'CR+LF' and 'LF'.
But other characters may be affected as well: on Windows, for example, a Ctrl-Z byte (0x1A) in the file is treated as end-of-file in text mode, so the read process stops early.
Also, what those special characters get changed into may depend on the OS or computer system that the code runs on.
With binary files you are SURE that the file will be read as-is, on any computer and regardless of the contents of the file.
The difference in the kind of file IO says it all: text mode is for text-based files,
binary mode is for all other kinds of IO (even text files, if you don't want any interpretation to take place!).
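
To see what a given mode really puts on disk, here is a minimal sketch. The helper name `bytes_on_disk` and the temporary file name are made up for illustration. It writes a string in the chosen mode, then reads the file back in binary mode so nothing is translated on the way in.

```cpp
#include <cstdio>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: write `data` in text or binary mode,
// then return the raw bytes that ended up in the file.
std::string bytes_on_disk(const std::string& path, const std::string& data, bool binary) {
    const std::ios::openmode extra = binary ? std::ios::binary : std::ios::openmode{};
    {
        std::ofstream out(path, std::ios::out | std::ios::trunc | extra);
        if (!out) throw std::runtime_error("cannot open " + path + " for writing");
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
    }   // the file is flushed and closed here

    // Always read back in binary mode so we see the bytes as stored.
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path + " for reading");
    std::string raw((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());
    in.close();
    std::remove(path.c_str());   // clean up the temporary file
    return raw;
}
```

Calling `bytes_on_disk("demo.tmp", "a\nb", true)` should give back exactly the 3 bytes you wrote on any system. With `false`, I would expect 4 bytes on Windows (the 'LF' becomes 'CR+LF') and 3 bytes on Linux, where text mode does no translation.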

Re: fstream - binary VS not binary mode

There are LOTS of differences between binary and text files -- what a silly question.

If you wanted to store the number 1234567 in a text file you would need 7 bytes; in binary you only need 4 (actually only 3, since 1234567 is less than 2^24, but you would probably use 4).
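
A quick way to check those byte counts, as a sketch (the function names `text_size` and `min_binary_size` are just illustrative):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Bytes needed to store the value as decimal text (one byte per digit).
std::size_t text_size(std::uint32_t value) {
    return std::to_string(value).size();
}

// Smallest number of whole bytes that can hold the value in binary.
std::size_t min_binary_size(std::uint32_t value) {
    std::size_t bytes = 1;
    while (value >>= 8) ++bytes;   // drop 8 bits at a time until nothing is left
    return bytes;
}
```

For 1234567 this gives 7 for the text form and 3 for the minimal binary form, while `sizeof(std::uint32_t)` is 4.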

The reason you would use 4 bytes is that 4 bytes is the usual size of an int on most current platforms. So let's say you wanted to store a list of numbers

123454
65423
23
23453
32345

and you wanted to get to the 4th number... well, in a text file you would have to read past the first 3 numbers because you have no way of knowing where the 4th number starts (actually not 100% true, but in general). In a binary file, though, you know the number of bytes used by each number, so you know that the 4th number (index 3) starts at byte index * size = 3 * 4 = 12.

Think of a long list of hundreds, thousands, or hundreds of thousands of numbers... being able to skip to a particular number without having to read all of the other ones in between is a real benefit.
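
Here is a rough sketch of that idea with fstream. The function names and the file layout (plain 4-byte ints in the machine's own byte order, no header) are my assumptions, not a real file format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Write every number as a fixed 4-byte record (host byte order, so not portable).
void write_records(const std::string& path, const std::vector<std::int32_t>& values) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot open " + path);
    for (std::int32_t v : values)
        out.write(reinterpret_cast<const char*>(&v), sizeof v);
}

// Jump straight to record `index` without reading the records before it.
std::int32_t read_record(const std::string& path, std::size_t index) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    const std::streamoff offset = static_cast<std::streamoff>(index * sizeof(std::int32_t));
    in.seekg(offset, std::ios::beg);   // byte offset = index * size
    std::int32_t v = 0;
    if (!in.read(reinterpret_cast<char*>(&v), sizeof v))
        throw std::out_of_range("no record at that index");
    return v;
}
```

With the five numbers from the list above written by `write_records`, `read_record(path, 3)` seeks to byte 12 and returns 23453.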

Then there is the fact that a text file generally only has text characters (ASCII codes 0x20-0x7E and a few control characters). This can make storing data in general very problematic and wasteful. For example SMTP (Simple Mail Transfer Protocol) -- the basic email protocol -- is a text-based protocol, so to send an image via email it generally must be converted into text. To do this the file is sent in base64 encoding, which is about a third larger than the original file: a raw byte can take any of 256 values (8 bits), but a base64 character takes only 64 values (6 bits), so every 3 bytes of data become 4 characters. Storing an image in a text file is VERY wasteful!
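
To make the base64 overhead concrete, here is a small encoder sketch using the standard base64 alphabet with '=' padding. It leaves out the line breaks that mail software adds, and `base64_encode` is just a name I picked.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Encode raw bytes as base64: each group of 3 bytes becomes 4 text characters.
std::string base64_encode(const std::string& in) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    auto byte = [&in](std::size_t i) {
        return static_cast<std::uint32_t>(static_cast<unsigned char>(in[i]));
    };
    std::string out;
    out.reserve(4 * ((in.size() + 2) / 3));
    std::size_t i = 0;
    for (; i + 2 < in.size(); i += 3) {           // full 3-byte groups
        const std::uint32_t n = (byte(i) << 16) | (byte(i + 1) << 8) | byte(i + 2);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += table[n & 63];
    }
    if (i + 1 == in.size()) {                     // 1 leftover byte: 2 chars + "=="
        const std::uint32_t n = byte(i) << 16;
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += "==";
    } else if (i + 2 == in.size()) {              // 2 leftover bytes: 3 chars + "="
        const std::uint32_t n = (byte(i) << 16) | (byte(i + 1) << 8);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += '=';
    }
    return out;
}
```

A 300-byte input comes out as 400 characters, which is where the "about a third larger" figure comes from.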

As another example, look at the plain (ASCII) variant of the PBM image format, a text-based format used to store black and white images. It uses the character '1' for black and '0' for white, so it WASTES at least 7 out of every 8 bits. Sure, the image can be edited in a text editor, but it is hardly an efficient mode of storage.
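
For comparison, this is roughly what packing the same pixels into bits looks like. It is only a sketch: `pack_row` is a made-up helper, and I am assuming the usual layout of 8 pixels per byte with the leftmost pixel in the most significant bit.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack one row of pixels (true = black) into bytes, 8 pixels per byte,
// leftmost pixel in the most significant bit. The last byte is zero-padded.
std::vector<std::uint8_t> pack_row(const std::vector<bool>& row) {
    std::vector<std::uint8_t> bytes((row.size() + 7) / 8, 0);
    for (std::size_t x = 0; x < row.size(); ++x) {
        if (row[x])
            bytes[x / 8] |= static_cast<std::uint8_t>(0x80u >> (x % 8));
    }
    return bytes;
}
```

A row of 16 pixels takes 2 bytes packed this way, against at least 16 bytes as '0' and '1' characters.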

This is only the tip of the iceberg. Binary has MANY advantages over text-based files -- it also has many drawbacks.

For example, it is not easy to change a binary format... adding new features to a binary format can be difficult, and often an earlier version of a program cannot read the files produced by subsequent versions (take MS-Word for example), though this really has more to do with being short-sighted when designing a file format than with anything else. Text formats, on the other hand, tend to be more adaptable to change.

Binary formats are generally tied to a particular program or programs -- that is, you can't just open a binary file in any old text editor. Even opening a file in a hex editor might not give you any information about how the data is actually formatted, whereas with a text file one can open the file in a text editor, analyze the data, and figure out how to use it.

Text files are simple, easy to read, easy to work with.
Binary files tend to be more complex/sophisticated, offer better performance or efficiency, are more difficult to work with.

Re: fstream - binary VS not binary mode

Well you can look at a number of file formats (both text and binary) at www.wotsit.org

It is pretty neat to look at some of the markup languages -- early ones like troff and TeX and RTF. Then look at more modern formats like HTML.

For binary files, I generally think that BMP is a good beginner format. It is pretty simple, and even its basic compression scheme (run-length encoding) is pretty simple to understand... the hard part to understand is why they chose to store the image upside down...