I am having a problem tailing a log file. The problem is that my script eventually writes non-printable characters (i.e., square boxes) to my copy of the log file.

At this point, I have basically stripped my code down to the following lines in hopes that I could just compare the original log file to the one I created (they should be identical). But they never are. My script will take several lines in a row (20-30) and represent them with non-printable characters. When I open the original file, of course, those lines contain regular characters. Another interesting fact is that the original file and my file are always the exact same size.

The real problem my script needs to resolve is the fact that I am missing events.

So, I wrote a script that monitors a log file in real time for certain events. When I see one of those events in the log file, I do something.

The script was working; however, I was missing some events. I looked in the log file and saw that the events existed, so I wasn't sure why my script wasn't picking them up. It had to be one of two things:

1. My code to check for the events was wrong, OR
2. File::Tail was not giving me the events.

In order to check whether File::Tail was giving me the events, I took my script and basically tried to recreate the log file, to ensure it was exactly the same. Since it wasn't, I determined that for some reason File::Tail was now printing the characters I see in the log file as non-printable squares. If I figure out why that is happening, I believe my true script will pick up all events.

So I really need to figure out why File::Tail is not printing out the entries as they happen. Is it because of UTF-8?
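If UTF-8 is indeed the culprit, one plausible mechanism (an assumption on my part, not something confirmed by the logs) is a multibyte sequence being read mid-write: decoding only the first byte of a two-byte character yields the replacement character U+FFFD, which most fonts draw as a square box. A minimal sketch of that failure mode using the core Encode module:

use strict;
use warnings;
use Encode qw(decode);

# "é" is the two-byte UTF-8 sequence 0xC3 0xA9. A reader that catches the
# writer mid-character sees only the first byte.
my $partial = "\xC3";
my $decoded = decode('UTF-8', $partial);   # malformed input becomes U+FFFD by default
printf "got U+%04X\n", ord $decoded;       # prints: got U+FFFD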

Another interesting finding:

Let's assume the script has been running all day long and I kill it. If I set File::Tail to start from the beginning of the file, the script works perfectly, UNTIL it gets to the real-time writing. It appears this is when I start getting the non-printable characters.

Let's try opening the debug file with an explicit UTF-8 layer:

open my $debug_fh, '>:encoding(UTF-8)', $debug_log or die "failed to open '$debug_log': $!";

And let's try adjusting the File::Tail object.

Code

my $file = File::Tail->new(
    name        => "/home/logs/2013/05.15/users/Client.log",
    tail        => 0,    # set to -1 if you want to start at the beginning of the file instead of the end
    interval    => 2,
    maxinterval => 10,
    debug       => 1,
);
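For completeness, the loop that consumes this object would look something like this (a sketch; it assumes $debug_fh is the UTF-8 handle opened above):

while (defined(my $line = $file->read)) {
    print {$debug_fh} $line;    # mirror each log entry into the debug copy
}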

From the command line, tail -f > file.txt has the same problem. Bummer.

OK, so I need to find a new way of monitoring a log file that has 3.5 million rows and a file size of 200MB. Any suggestions? I was thinking of the following (a sketch follows the list):

1. Keep a counter of how many rows I have read so far.
2. Run wc -l on the file to get the total number of rows.
3. Subtract the rows read so far from the total.
4. Use File::ReadBackwards to read that many rows and put them into an array.
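A minimal sketch of that idea, assuming a $rows_read counter is already maintained elsewhere (the path and counter value are placeholders, and lines appended between the wc -l and the read remain a race):

use strict;
use warnings;
use File::ReadBackwards;

my $log       = "/home/logs/2013/05.15/users/Client.log";
my $rows_read = 3_400_000;               # placeholder: rows processed so far

chomp( my $total = `wc -l < $log` );     # total rows currently in the file
my $unread = $total - $rows_read;        # rows written since the last read

my $bw = File::ReadBackwards->new($log) or die "can't open $log: $!";
my @lines;
while ( $unread-- > 0 and defined( my $line = $bw->readline ) ) {
    push @lines, $line;                  # newest line first
}
@lines = reverse @lines;                 # restore oldest-first order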

Can you provide more details on the overall process/problem that you are trying to solve?

One possible approach, depending on what your real goal is, would be to open a standard filehandle and process the data as needed. When you get to EOF, use the tell() function to retrieve and store the current byte offset, then close the file. When you reopen the file, use the seek() function to move the file pointer to that offset and begin reading/parsing from that point.
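A rough sketch of that approach (the offset-file name and the processing stub are placeholders):

use strict;
use warnings;
use Fcntl qw(SEEK_SET);

my $log    = "/home/logs/2013/05.15/users/Client.log";
my $statef = "$log.offset";    # placeholder: persists the byte offset between runs

# Restore the offset saved by the previous run, if any.
my $offset = 0;
if ( open my $sfh, '<', $statef ) {
    $offset = 0 + <$sfh>;
    close $sfh;
}

open my $fh, '<', $log or die "can't open $log: $!";
seek $fh, $offset, SEEK_SET;

while ( my $line = <$fh> ) {
    # ... check $line for the events of interest here ...
}

# At EOF, remember where we stopped for the next invocation.
$offset = tell $fh;
close $fh;

open my $out, '>', $statef or die "can't write $statef: $!";
print {$out} $offset;
close $out;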

Unfortunately, I can't post the data, as it contains sensitive financial transactions. So here is why I don't think it's UTF-8 or UTF-16:

If I run a simple Perl script that just tails the original log file and prints to the debug log file, starting from the very beginning of the log file, all the data gets printed perfectly (and I am talking about more than 3/4 of a day of logged data). Once the tail catches up to the real-time log entries, it starts printing the non-printable characters every so often: 5,000 - 10,000 rows work, and then it starts to fail.

If I stop the Perl script and re-run it from the beginning of the log file, when it hits those same rows it rendered as non-printable during real-time processing, it processes them fine.

I would not expect your OS to allow it, but it sounds like Perl is reading data before it is physically written to the disk. Perhaps, as a workaround, you could redirect the log output to a Perl program and have that program write both files. Good Luck, Bill
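A sketch of that workaround, a small tee-style filter (the script name and invocation are hypothetical): the producer's output is piped through one Perl process that writes both copies, so no reader can catch a half-written line.

#!/usr/bin/perl
# Hypothetical usage: producer | perl tee2.pl Client.log file.txt
use strict;
use warnings;

my ( $log, $copy ) = @ARGV;
open my $log_fh,  '>>', $log  or die "can't append to $log: $!";
open my $copy_fh, '>>', $copy or die "can't append to $copy: $!";

# Autoflush both handles so the two files stay in step line by line.
select( ( select($log_fh),  $| = 1 )[0] );
select( ( select($copy_fh), $| = 1 )[0] );

while ( my $line = <STDIN> ) {
    print {$log_fh}  $line;
    print {$copy_fh} $line;
}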

That tells me that the problem is probably at the OS level. However, we haven't been given any details on how/when he's viewing file.txt. Is it as it's being written to via another tail -f process in a separate terminal window? Or is the tail -f process being killed and the file then loaded into a text editor?

I often redirect the output of tail -f on large, fast-growing log files and have never come across his reported issue.