Help: Create a shell script to move only files which have stopped growing


An external service I don't manage pushes media files into a shared directory on my server. I need to move these files into their correct directories automatically. The problem is that if I run my script as a cron job once every 3 minutes, the script copies files which are still being uploaded to my server. So I need to figure out how to have the script check that the files are complete (done uploading) before it moves them.

Check the timestamp of each file; if it is less than some number, say 5 minutes, move it.

Hello proxmity

Try it but it may not work until the file is closed and that will depend on how it is being written. The only robust way is if the program writing the file writes two files, one to indicate completion. For example myfile.part which is receiving the data and myfile; when the upload is finished the program writing the file moves myfile.part to myfile and completion is indicated by myfile.part disappearing.

If the program writing the file doesn't do something like that you could note the file size and, if it hasn't changed after, say, 5 minutes consider it finished. Not robust but maybe "good enough".

Best

Charles
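A minimal sketch of the size-comparison idea, assuming the script runs from cron and keeps a small state file between runs. The function names and the state-file layout are made up for illustration; nothing here comes from the upload service. A file is moved only when its size equals the size recorded by the previous run, so with a 3-minute cron job it has to sit unchanged for roughly 3 minutes. As noted above this is not robust: a stalled upload looks exactly like a finished one.

```shell
#!/bin/sh
# Sketch only: move files whose size has not changed since the previous run.
# File names containing tabs or newlines are not handled.

# Print "size<TAB>name" for every regular file directly inside directory $1.
snapshot() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        # wc -c gives the size in bytes; tr strips padding some systems add.
        printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "${f##*/}"
    done
}

# move_stable INCOMING DEST STATE
# Moves files whose (size, name) pair was already recorded in STATE by the
# previous run, then rewrites STATE with the files that are still waiting.
move_stable() {
    incoming=$1 dest=$2 state=$3
    [ -f "$state" ] || : > "$state"
    snapshot "$incoming" > "$state.new"
    while IFS="$(printf '\t')" read -r size name; do
        if grep -Fxq "$(printf '%s\t%s' "$size" "$name")" "$state"; then
            # Same size as last time: treat the upload as finished.
            mv -- "$incoming/$name" "$dest/"
        else
            # New or still growing: remember the size for the next run.
            printf '%s\t%s\n' "$size" "$name"
        fi
    done < "$state.new" > "$state.tmp"
    mv -- "$state.tmp" "$state"
    rm -f -- "$state.new"
}
```

From cron this could be called as `move_stable /srv/incoming /srv/media /var/tmp/incoming.sizes`; all three paths are placeholders, and the state file should live outside the incoming directory so it is never treated as an upload.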

Unfortunately, I don't control the service which uploads the files to my server, and that's why I can't make the writing program act in any other way than it does today :-/

So I think your second idea is a great one! I figure it would work out well, except I don't know how to write a script like this. It's something I'm looking into now.

Quote:

Originally Posted by onebuck

Hi,

What about using 'sync' then test the timestamp?

Using timestamps is dangerous because the file size can in one case be only 5 MB, but in the next it might be above 150 GB. So the loading time from the external server may vary too much for a timestamp to be useful and, at the same time, let me move the files almost as soon as they are done loading.

If anyone else has any ideas, knows of any scripts etc., please give me a shout!
I'll make this promise: When I have the solution, I'm not leaving this thread empty and unanswered!

Quote:

So I think your second idea is a great one! I figure it would work out well, except I don't know how to write a script like this. It's something I'm looking into now.

Let us know how it goes; you are going to have fun scripting to maintain a log of file names and time stamps, deleting records applying to moved files. It may prove useful to write those deleted lines out to another log, a log of files moved.

Do you know what the remote systems are using to write the files on your system? I'm wondering if the files are held open while they're being written. Mmm ... that would be too flaky over a network link, unless a local process was doing the holding open and being fed data from a remote process.

A gotcha or two to be wary of ...

A smart file upload system would reserve space on the target system (to save running out of space part way through) so files would never change size!

If you are moving the file to another file system it could take a while, making a long time window for an interrupted transfer to restart with unwanted effects of having the file open for reading and writing. Safer to mv the file to a new name before "moving" it to another file system (= copying and deleting it).
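The rename-first suggestion could look something like this. It is a sketch under assumptions: the function name and the `.claimed.` prefix are invented for the example, and the source path is expected to sit on the same file system as its own directory (which is always true for a plain rename in place). A rename inside one directory does not copy any data, so the window in which a restarted transfer could reopen the file under its original name stays short; only the second step may turn into a slow copy-and-delete.

```shell
#!/bin/sh
# Sketch only: rename a file in place first, then move it to its destination.

# claim_and_move SRC DESTDIR
claim_and_move() {
    src=$1 destdir=$2
    # Make sure the path contains a directory part we can split off.
    case $src in
        */*) ;;
        *) src=./$src ;;
    esac
    dir=${src%/*}
    base=${src##*/}
    claimed="$dir/.claimed.$base"   # hypothetical naming scheme

    # Step 1: rename within the same directory (no data is copied).
    mv -- "$src" "$claimed" || return 1

    # Step 2: move to the destination under the original name. If DESTDIR is
    # on another file system, mv copies the data and then deletes the source.
    mv -- "$claimed" "$destdir/$base"
}
```

Note that a writer which still has the file open keeps writing to it after the rename, so this only protects against a transfer that restarts by name; it does not tell you whether the upload was finished.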

Quote:

Check the timestamp of each file; if it is less than some number, say 5 minutes, move it.

Please correct me if I'm wrong, but did you mean, 'if the file timestamp (modified time, presumably) is more than five minutes prior to the current time'?

This idea can only work if each file is moved soon after its mtime is read. It would probably be a bad idea to collect all of the mtimes first and then compare and copy file by file, because by the time you get to the last comparison, the list could be quite old.
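One way to keep the check and the move close together is to test each file's mtime immediately before moving that file, rather than building a list first. The sketch below assumes a find that supports `-mmin` (GNU and BSD find do; it is not in POSIX), and the function name and 5-minute default are only illustrative. Since the mtime is updated whenever data is written, the threshold is about how long writes may pause, not about how long the whole upload takes; an upload that stalls for longer than the threshold would still fool it.

```shell
#!/bin/sh
# Sketch only: move a file if it has not been modified for a while.

# move_if_idle FILE DESTDIR [MINUTES]
# Returns non-zero without moving anything if FILE was modified recently.
move_if_idle() {
    file=$1 destdir=$2 minutes=${3:-5}
    # find prints the path only when FILE is a regular file whose last
    # modification is more than MINUTES minutes in the past.
    if [ -n "$(find "$file" -prune -type f -mmin +"$minutes")" ]; then
        mv -- "$file" "$destdir/"
    else
        return 1
    fi
}
```

In a loop over the incoming directory this would be called once per file, for example `move_if_idle "$f" /srv/media 5`, where the destination path is a placeholder.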

I think the most reliable way to do this would involve lsof. I had thought that there was something like a 'remote' option, but my memory must have gone (must get an upgrade); I think with some combination of fd w, +d and -c you can get a list of 'don't move' files. The only exception that comes to mind is if a transfer starts between getting the 'don't move' list and getting the list of candidate files. Ideally, you would like to prevent new transfers commencing in this time window, which should be short.

(Actually, what you want is probably the opposite of the -X option (-X being thoroughly non-portable, btw), but I'm not sure that is available.)

Just be aware that you really, really, will need to read the man page, as and'ing and or'ing the options in lsof doesn't work in the way that you would think.
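As a starting point, here is a sketch that avoids combining lsof options altogether and simply asks lsof about one file at a time. It relies on lsof exiting with a non-zero status when it finds nothing to list for the given file name; the function names are made up. Two assumptions to check on your own system: lsof can only see processes on the local machine (so files written over NFS by a remote host will look closed), and you may need sufficient privileges to see another user's processes. The race mentioned above still applies: a transfer could open the file between the check and the move.

```shell
#!/bin/sh
# Sketch only: move a file if no local process has it open according to lsof.

# is_open FILE: succeeds if lsof lists at least one process holding FILE open.
is_open() {
    lsof -- "$1" >/dev/null 2>&1
}

# move_if_closed FILE DESTDIR
# Returns 2 if lsof is missing, 1 if FILE is still open, otherwise moves it.
move_if_closed() {
    # Without lsof every file would look closed, so refuse to do anything.
    command -v lsof >/dev/null 2>&1 || return 2
    if is_open "$1"; then
        return 1
    fi
    mv -- "$1" "$2/"
}
```

Checking one file per lsof call is slow for large directories; listing the whole directory once with +d is the faster route, but that is where reading the man page comes in.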