Inspired by this question, I thought I'd provide my implementation. I tried to follow the spirit of the *nix tool chain: read from stdin and write to stdout. This has the added benefit of making buffering very easy (only the current and previous characters and the count need to be kept).

2 comments

@user3629249 2016-01-04 06:04:17

When compiling, always enable all the warnings, then fix those warnings.

For gcc, at a minimum, use: -Wall -Wextra -pedantic (I also use -std=c99 -Wconversion)

The compiler outputs several 'problem' statements:

To start, the main() signature is int main(int argc, char* argv[]), which in this case should be int main( void ), because neither parameter is used:

unused parameter `argc`
unused parameter `argv`

And, because of a missing #include <stdlib.h> statement:

implicit declaration of function: `exit()`
EXIT_FAILURE not declared

And this line:

while (EOF != (current_char = getchar())

has a syntax error (always check for matching numbers of open and close parens):

error: expected ')' before '{'

That error means the posted code was never compiled.
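For illustration, here is a minimal sketch of what a warning-free read loop might look like. The posted code is not shown here, so the function name `rle_encode_stream`, the output format (each character followed by its decimal count) and the variable names are assumptions, not the author's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>  /* EXIT_SUCCESS / EXIT_FAILURE for a driver like the one in the note below */

/* Hypothetical helper: run-length encodes `in` to `out` as <char><decimal count>.
   Returns 0 on success, -1 on an I/O error. */
int rle_encode_stream(FILE *in, FILE *out)
{
    int current_char;         /* int, not char, so that EOF is representable */
    int previous_char = EOF;  /* EOF doubles as "no run started yet" */
    unsigned long count = 0;

    assert(in != NULL && out != NULL);

    /* Every '(' has its ')': this is the condition that failed to compile. */
    while (EOF != (current_char = getc(in))) {
        if (current_char == previous_char) {
            ++count;
            continue;
        }
        /* A new run starts: flush the previous one, if there was one. */
        if (previous_char != EOF &&
            fprintf(out, "%c%lu", previous_char, count) < 0)
            return -1;
        previous_char = current_char;
        count = 1;
    }

    /* Flush the final run. */
    if (previous_char != EOF &&
        fprintf(out, "%c%lu", previous_char, count) < 0)
        return -1;

    return ferror(in) ? -1 : 0;
}
```

A driver could then be as small as `int main(void) { return rle_encode_stream(stdin, stdout) == 0 ? EXIT_SUCCESS : EXIT_FAILURE; }`, which leaves no unused parameters for -Wextra to report.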

@ChrisWue 2016-01-04 21:38:37

As I stated in a comment on SirPython's answer: I accidentally copied a broken version of the code.

@SirPython 2016-01-03 00:33:28

Compressor number or real content?

When you are write_counting, you are writing the ASCII number characters to the new file. However, when you go to decompress this file, how are you going to differentiate between the actual content in the file and the numbers that mark the occurrences of a character?

A possible solution might be to write the count as a raw byte value rather than as ASCII digits. That way, when you encounter an ASCII digit, you can be almost sure that it is part of the content (that is, unless a character occurred so many times in a row that the counter rose into the '0'-'9' range, which is 48 to 57 in ASCII).
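A rough sketch of that idea follows. It goes one step further than the suggestion by fixing the layout to two-byte pairs (the character, then the count as a raw byte), so a decoder can rely on position instead of guessing from the value range. The function name `rle_encode_pairs`, the buffer-based interface and the cap of 255 per run are all assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: encodes `in` as <byte><count> pairs, where count is a
   raw byte in 1..UCHAR_MAX rather than ASCII digits.
   Returns the number of bytes written, or 0 if `out` is too small
   (0 is also returned for empty input). */
size_t rle_encode_pairs(const unsigned char *in, size_t n,
                        unsigned char *out, size_t cap)
{
    size_t w = 0;
    size_t i = 0;

    assert(in != NULL && out != NULL);

    while (i < n) {
        unsigned char value = in[i];
        unsigned char count = 0;

        /* Extend the run, but never past what one count byte can hold;
           a longer run is simply split into several pairs. */
        while (i < n && in[i] == value && count < UCHAR_MAX) {
            ++count;
            ++i;
        }

        if (cap - w < 2)
            return 0;
        out[w++] = value;
        out[w++] = count;
    }
    return w;
}
```

With this layout a run of 300 identical bytes becomes two pairs (255, then 45), so the counter can never be mistaken for content regardless of its value.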

Two ones or twelve?

This is a continuation of the previous point. Let's say your compressor compresses this file:

12

Now suppose I want to decompress the result. Since your compressor writes a number to show the occurrences of each character, the compressed output would be this:

1121

How do I know if all of those numbers are part of the content?
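To make the ambiguity concrete, the sketch below encodes two different inputs with ASCII counts and ends up with the same bytes. It assumes the format implied by the example above (each character followed by its decimal count); `rle_encode_ascii` and `rle_collision_demo` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes <char><decimal count> for each run of `in`
   into `out` as a NUL-terminated string.
   Returns 0 on success, -1 if `out` is too small. */
int rle_encode_ascii(const char *in, size_t n, char *out, size_t cap)
{
    size_t w = 0;
    size_t i = 0;

    assert(in != NULL && out != NULL && cap > 0);
    out[0] = '\0';

    while (i < n) {
        size_t run = 1;
        int written;

        while (i + run < n && in[i + run] == in[i])
            ++run;

        written = snprintf(out + w, cap - w, "%c%zu", in[i], run);
        if (written < 0 || (size_t)written >= cap - w)
            return -1;
        w += (size_t)written;
        i += run;
    }
    return 0;
}

/* Encodes two different inputs into `a` and `b`:
   twelve 'a' then four '3', versus one 'a' then thirty-four '2'. */
int rle_collision_demo(char *a, char *b, size_t cap)
{
    char first[16];
    char second[35];

    memset(first, 'a', 12);
    memset(first + 12, '3', 4);
    memset(second, 'a', 1);
    memset(second + 1, '2', 34);

    if (rle_encode_ascii(first, sizeof first, a, cap) != 0)
        return -1;
    return rle_encode_ascii(second, sizeof second, b, cap);
}
```

Both inputs come out as `a1234` ("a12" + "34" in one case, "a1" + "234" in the other), so no decoder could tell them apart from the compressed data alone.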

The only fix I can think of, unfortunately, would be to follow the above tip and write the raw byte 0x01 instead of the ASCII digit '1'.
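As a sketch of how decompression could then work, the decoder below reads fixed <byte><count> pairs in which the count is a raw byte. The pair layout, the name `rle_decode_pairs` and the buffer-based interface are assumptions made for the example, not part of the reviewed code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expands <byte><count> pairs, where count is a raw
   byte (0x01 means "once"), into `out`.
   Returns the number of bytes written, or 0 on malformed input or when
   `out` is too small. */
size_t rle_decode_pairs(const unsigned char *in, size_t n,
                        unsigned char *out, size_t cap)
{
    size_t w = 0;
    size_t i;

    assert(in != NULL && out != NULL);

    if (n % 2 != 0)
        return 0;  /* a trailing half pair is malformed */

    for (i = 0; i < n; i += 2) {
        unsigned char value = in[i];
        size_t count = in[i + 1];

        /* A zero count never occurs in valid data; also guard the output. */
        if (count == 0 || count > cap - w)
            return 0;

        while (count-- > 0)
            out[w++] = value;
    }
    return w;
}
```

Under this layout the file `12` would be stored as the four bytes '1', 0x01, '2', 0x01, while `11` would be stored as '1', 0x02, so "two ones" and "twelve" can no longer be confused.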

This also exposed a problem: two ls are written after the number that shows how many occurrences of a character there were.

@ChrisWue 2016-01-03 04:56:49

Regarding your Misc bugs: I must have accidentally copied an older version of the code. Sorry about that, I should have checked more carefully. Good point about the number encoding; I have some ideas about that.