Best way to process a group of files (whole directory, wildcards etc.)

Maybe this is more of a Programming than a Newbie question; I'm not sure. I wrote MS-DOS apps years and years ago, and now I am trying to learn the Unix way.

Although there are certainly already many similar programs, I just made a tiny utility I call "unbreak", to reformat text files for easier e-book reading. It's in C and it uses stdin/stdout. You may see it here: http://derner.com/code/unbreak.c

Now that I have it working the way I like, I want to process groups of files, without having to specify each file. I want to be able to do something like:

unbreak raw/*.txt reformatted/

Of course, in the above example the shell would expand the wildcard and pass the names of all the *.txt files in raw/ as parameters (or the literal pattern raw/*.txt if none exist), followed by the directory name at the end.

OK, I could easily change the code to have it open and process each input file listed and save the output files in the directory named at the end. But is that the best way? Am I missing something easier or more standard? Or maybe it would be simple with some generic shell script?

I ended up making a shell script. I did start adding to "unbreak" the ability to open files in a directory, but I got bored. Anyway, I needed more practice with shell scripts. So I made a general-purpose script, "cmd-dir1-dir2", that uses one directory as input and another as output. For instance, I can do:

cmd-dir1-dir2 unbreak raw unbroke

Seems to work ok even with spaces in filenames or directories. Posting for comment, or in case it is useful to anyone else.
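
For anyone wanting to try the same idea, here is a minimal sketch of what such a wrapper could look like. It is not the original cmd-dir1-dir2; the function name filter_dir is made up, and it assumes the command is a plain stdin/stdout filter like unbreak.

```shell
#!/bin/bash
# Hypothetical stand-in for a "command, input dir, output dir" wrapper:
# run a stdin/stdout filter on every regular file in one directory and
# write a same-named output file into another directory.
# Usage: filter_dir COMMAND INPUT_DIR OUTPUT_DIR
filter_dir() {
    local app indir outdir infile
    app=$1
    indir=$2
    outdir=$3

    # Create the output directory if it does not exist yet.
    mkdir -p -- "$outdir" || return 1

    # Quoting "$indir" and "$infile" keeps names with spaces in one piece.
    for infile in "$indir"/*; do
        # Skip subdirectories, and the unexpanded pattern if indir is empty.
        [ -f "$infile" ] || continue
        # ${infile##*/} strips the directory part, leaving the bare file name.
        "$app" < "$infile" > "$outdir/${infile##*/}"
    done
}
```

With that function defined, filter_dir unbreak raw unbroke would correspond to the invocation shown above. Note that this sketch does no error checking on the command itself.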

I'd check $? after each invocation of $app and bail with a message if it fails. You do obey the conventions, right? I.e. if the app returns 0, it's OK; anything else is an error.
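
A rough sketch of that check, assuming the usual convention that 0 means success. The helper name run_checked is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: run a command, complain on stderr if it fails,
# and hand its exit status back to the caller.
run_checked() {
    local status
    "$@"
    status=$?    # capture immediately; the next command overwrites $?
    if [ "$status" -ne 0 ]; then
        echo "error: '$1' exited with status $status" >&2
    fi
    return "$status"
}
```

Inside a loop over files this could be used as run_checked "$app" < "$infile" > "$outfile" || exit 1, so the script stops at the first failure.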

Well in a perfect world...

It looks like my early version of "unbreak" returns 255, because I never explicitly return a value from main(), which leaves the exit status undefined under C89 (C99 and later treat falling off the end of main() as returning 0). There must be many apps like that. OK, I can say "return 0" at the end, but that's sort of just pretending there is error checking in the app. Real error checking in the case of "unbreak" would mean calling ferror() after every get and put, to make sure a disk didn't fill up or go offline or something. It might be worth it, but it does complicate the code. Anyway, in the script I will try following the suggestion.

Quote:

I prefer deeper indents (4 spaces) but that's just me.

So do I for readability, but I don't like editing and navigating through so many spaces. So I really like tabs. Tabs were made for indenting. Many developers seem to hate tabs with a passion; I don't know why.

When you write to a bash variable, do not include the leading $. Do use the leading $ when reading a bash variable, so:

A=25
echo $A

Actually, I use tabs as well: 1 tab = 4 spaces
My .vimrc says:

set tabstop=4
set shiftwidth=4
set softtabstop=4

garrettderner

07-30-2008 02:49 PM

Thank you Chris! I definitely bookmarked all the links you posted.

Thanks for the vim settings. I hope I may actually be using them one day soon. After 22 yrs of using text editors I still seem too lazy to learn vi or emacs; maybe I am a confirmed n00b? This week I decided to try again. I installed vim and gvim, and succeeded in making some simple changes to my script. But copying and pasting turned out to be beyond me. And I got lost just trying to insert a newline. So I went running back to bluefish with its menus and its nice pretty syntax highlighting.

You can tell I'm hopeless because to create .vimrc I did:

mousepad ~/.vimrc &

But I'm sure if I do get back into coding, vi or emacs will be useful or even indispensable, so maybe I'll keep working at it. One thing I'll need to find is key bindings for the Dvorak keyboard layout I use. Edit command keys that are supposed to be next to each other, or in locations that make sense, are not where they should be, because I am not using QWERTY.

Below is the script with exit status checking; it refuses to continue if the app returns non-zero. Maybe it would be nice if it reported return code -1 as "-1" instead of "255"? I don't know how to do that, or whether it would be desirable.

Also it might be nice to put in switches for forced continuation, and for verbosity.
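
One way such switches might look, as a sketch only. The flag letters -f (force) and -v (verbose) and the function name filter_dir_opts are assumptions, not part of the posted script; option parsing uses the shell's built-in getopts.

```shell
#!/bin/bash
# Hypothetical directory wrapper with two switches:
#   -f  keep going after a failure instead of stopping
#   -v  print each input file name to stderr as it is processed
# Usage: filter_dir_opts [-f] [-v] COMMAND INPUT_DIR OUTPUT_DIR
filter_dir_opts() {
    local force verbose failures opt app indir outdir infile status
    force=0; verbose=0; failures=0

    OPTIND=1    # reset so the function can be called more than once
    while getopts "fv" opt; do
        case $opt in
            f) force=1 ;;
            v) verbose=1 ;;
            *) return 2 ;;    # getopts has already printed a message
        esac
    done
    shift $((OPTIND - 1))

    if [ "$#" -ne 3 ]; then
        echo "usage: filter_dir_opts [-f] [-v] COMMAND INPUT_DIR OUTPUT_DIR" >&2
        return 2
    fi
    app=$1; indir=$2; outdir=$3

    mkdir -p -- "$outdir" || return 1

    for infile in "$indir"/*; do
        [ -f "$infile" ] || continue
        if [ "$verbose" -eq 1 ]; then
            echo "processing: $infile" >&2
        fi
        "$app" < "$infile" > "$outdir/${infile##*/}"
        status=$?
        if [ "$status" -ne 0 ]; then
            echo "error: $app failed on '$infile' (status $status)" >&2
            failures=$((failures + 1))
            # Without -f, stop at the first failure and pass the status on.
            [ "$force" -eq 1 ] || return "$status"
        fi
    done

    # Succeed only if every file was processed without error.
    [ "$failures" -eq 0 ]
}
```

For example, filter_dir_opts -f -v unbreak raw unbroke would process every file, name each one as it goes, and still return non-zero at the end if any of them failed.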

Well, a good reason to know at least the basics of vi/vim is that it's been the default editor on Unix/Linux-based systems for years, so even if a system is broken, the recovery tool will probably have a cut-down version.
Also, it's very low overhead for remote work.
Last but not least, in commercial orgs, sysadmins can be hard to persuade to install your favourite editor.
vi (esp. vim) is a very quick editor once you get used to it.
As for copy/paste, I cheat these days. I use xterms, so I use the mouse: just highlight to copy, and use the centre button (or both buttons) to paste.
Sometimes you need to use Ctrl-C and Ctrl-V as well when copying and pasting between an xterm and a browser or word processor.
Full docs here: http://vimdoc.sourceforge.net/htmldoc/

Re error value: bash only sees exit status values 0-255 (8 bits), so you'll just have to manage with that ;)
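
That said, if you really want 255 shown as -1, the conversion is just arithmetic on the 8-bit value. A small sketch, with signed_status being a made-up name:

```shell
#!/bin/bash
# Hypothetical helper: reinterpret an 8-bit exit status (0-255) as a
# signed value (-128..127), so that 255 is printed as -1.
signed_status() {
    if [ "$1" -gt 127 ]; then
        echo $(( $1 - 256 ))
    else
        echo "$1"
    fi
}
```

It would be called as signed_status "$?" right after the command. Whether this is desirable is another matter: by shell convention a status above 128 usually means the program was killed by a signal (128 plus the signal number), so printing it as a negative number can hide that.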

If your app writes errors to stderr (it should), then redirect stderr to a temp file and print (cat) that file if an error occurs.
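
Roughly like this, as a sketch. The name run_quiet is made up, and it assumes mktemp is available to create the temp file.

```shell
#!/bin/bash
# Hypothetical helper: run a command with its stderr captured in a temp
# file, and show that file only if the command fails.
run_quiet() {
    local errfile status
    errfile=$(mktemp) || return 1

    # stdin and stdout pass straight through; only stderr is captured.
    "$@" 2> "$errfile"
    status=$?

    if [ "$status" -ne 0 ]; then
        cat "$errfile" >&2
    fi
    rm -f -- "$errfile"
    return "$status"
}
```

Since stdin and stdout are left alone, it can wrap a filter directly, e.g. run_quiet unbreak < in.txt > out.txt.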

garrettderner

07-31-2008 08:15 PM

Thanks for the tips and links!

Good arguments for knowing how to use vi/vim. OK, it's on my to-do list, and I'll work on it from time to time. ('Course I've also had learning Morse Code on that list for a long time and haven't done that yet, but I did at least learn my "Alpha Bravo Charlies.")