Posted by kdawson on Monday April 19, 2010 @06:14PM from the ten-little-endians dept.

An anonymous reader writes "Developing GUI script-based applications is time-consuming and expensive. Most Unix-based scripts run in CLI mode or over a secure ssh session. The Unix shells are quite sophisticated programming languages in their own right: scripts are easy to design and quick to build, but they are not user-friendly, in the same way the Unix commands aren't (see the Unix-Haters books). Both Unix and bash provide various tools for building powerful, interactive, user-friendly scripts that run under the bash shell on Linux or Unix. What tools do you use to spice up your scripts on the Linux or Unix platforms?"

I know this is troll-ish, but the way I view it, a script is just that... a script: a series of commands to be executed in a specific order, designed to automate a repetitive task. Basic logic, control, and input are generally OK, but interaction is, in my opinion, an indicator that your task is out of scope for a "script" and should become a full-fledged application.

(you may now freely argue amongst yourselves on the difference between a script and an application)

There are a metric ass-tonne of dialog-type apps out there. Just google for your favorite toolkit's name plus "dialog" and you'll probably find something.

The CLI is powerful because it's a CLI; you do not need or want pretty dialog boxes. Help is what's available via man, --help, useful error messages, and the contents of /var/log. It works over a 9600 baud serial line, and works well enough that you can ssh in from your smartphone with one bar and fix something at 3am before a GUI would even reach its login screen. A good CLI expects things to be piped into and out of it, and can get any required information via the command line. The power of the CLI is that you can chain bits together to do things, or wrap scripts around other scripts, and do useful work.
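That chaining is the whole trick. A minimal sketch (the sample input is made up for the demo):

```shell
# Each tool reads stdin and writes stdout, so small pieces compose:
# find the most frequent word in some sample input.
top=$(printf 'foo bar foo baz foo bar\n' \
      | tr ' ' '\n' \
      | sort \
      | uniq -c \
      | sort -rn \
      | head -1)
echo "$top"
```

Every stage is replaceable: swap `head -1` for `head -5`, or `tr` for `sed`, and the rest of the pipe neither knows nor cares.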

You point to a 20-year-old book that mostly bitches about how slow/ugly X is. Guess what: things have come a long way. I run one laptop with native X, and it looks good and is responsive; I export X all the time over ssh to my primary desktops. Take a step back and think about why you're trying to shoehorn GUI functions onto a CLI. If you really need to do it, look at some of the toolkits that can detect whether an X server is present, use it if so, fall back to a text GUI otherwise, and still run entirely headless from the pure command line. But think long and hard about why you would want to do this.
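A sketch of that detection logic in plain sh (zenity and dialog are just examples of such tools; substitute whatever your toolkit provides):

```shell
# Pick the richest UI the environment actually supports.
if [ -n "$DISPLAY" ] && command -v zenity >/dev/null 2>&1; then
    ui=gui        # an X server is reachable and a GUI dialog tool exists
elif command -v dialog >/dev/null 2>&1; then
    ui=tui        # no X, but curses-style dialogs are available
else
    ui=cli        # headless: fall back to plain prompts on stdin/stdout
fi
echo "selected UI: $ui"
```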

It implies that the script will only be run by human users (and probably, human users who happen to run a particular flavour of GUI).
Traditional shell scripts are written for all users, not just human users.

Why should developers care about non-human users? It's what makes automation possible. Every time a script delegates work to another script, that's a non-human user scenario.

If you build enough scripts that can be used by all users, then you have a critical mass and your system becomes really powerful. If you build enough scripts that can only be used by human users, then your system stays weak, for it is limited by the actions of a single human operator.
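In shell terms, the non-human-user case is just one script consuming another's output. A toy sketch (list_users is a made-up stand-in for a real stand-alone script):

```shell
# One "script" producing plain text on stdout...
list_users() {
    printf '%s\n' alice bob carol
}

# ...and a second consumer that never involves a human at all.
count=$(list_users | wc -l)
echo "user count: $count"
```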

Simple solution: don't use filenames with spaces in them. They're an incredibly stupid idea. If you need something that looks like a space, use an underscore. The same practice has been done in C since the early 70s, since having spaces inside C tokens would be stupid.

Simpler solution. Don't use computers.

Seriously now. You expect all the end users out in the world to stop using spaces... just so your script works?

Please, for the love of $DEITY, learn Perl or Python or Ruby or SOMETHING. VB's syntax is not predictable or reasonable if you've programmed with any other language or know how a computer works. And the other languages are actually cross-platform and can do everything VB can do and then some.

My vote is for perl. It's more common in a "base install" than any other scripting language (in the BSDs and most Linux distros) and has a non-trivial amount of power. It's good at dealing with path and input permutations, and you can interface it with pretty much anything. Hell, PCRE came from perl, and that's used almost everywhere these days: perl got a lot of things right for the little that's wrong, at least in terms of being a good scripting language.

I avoid "shell" scripting (csh, sh, bash) if at all possible, too. The contortions necessary to do the frequently-needed evaluations take quite a bit longer, even with a chain of awk/sed/grep and the like. Unlike those languages, perl is entirely self-contained and does not have any system-specific oddities (e.g. with a shell script, many system binaries differ between systems, and an option/parameter pair on one system might do something entirely different on another, or not work at all).

I realize perl can often (usually) be difficult to read. But for my purposes, it's good enough, because I'm a bit of a prolific comment writer as a matter of process.

As said previously, scripts are scripts and don't often need a GUI. But for grep's sake, make them consistent!!! The only spicing up _really_ needed are some standards:

o output errors to STDERR; normal output to STDOUT
o include (-h, --help) processing - and send it to STDOUT so the help can be piped to 'less'
o use getopt(1) or process-getopt(1) so that options on the CLI parse in a predictable and standard way
o keep it terse except for errors so that the user can easily see if it worked or not without scanning vast output
o provide a --verbose option to help with tracking down those errors

... and the most annoying thing of all - make sure --help _always_ works, even if the script body itself can't - at least the user can then be told about what the prerequisites are.
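A minimal skeleton following those conventions (the script name and options are illustrative, not from any real tool):

```shell
usage() { echo "usage: myscript [-v] [-h] FILE"; }

main() {
    VERBOSE=0
    while getopts vh opt; do
        case $opt in
            v) VERBOSE=1 ;;
            h) usage; exit 0 ;;          # help goes to STDOUT, so it pipes to less
            *) usage >&2; return 2 ;;    # errors and bad usage go to STDERR
        esac
    done
    shift $((OPTIND - 1))
    [ $# -eq 1 ] || { echo "myscript: expected one FILE" >&2; return 2; }
    if [ "$VERBOSE" -eq 1 ]; then
        echo "processing $1" >&2         # --verbose chatter stays on STDERR
    fi
    echo "$1"                            # terse result on STDOUT
}

main -v some_file
```

Because usage() has no dependencies, -h works even when the rest of the script can't.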
Head over to http://mywiki.wooledge.org/BashFAQ [wooledge.org] for much wisdom on how to write better bash scripts.

So you've never reset IFS (or set it to a newline or tab) in your shell scripts? Your scripts just bail out instead of handling perfectly valid filenames? Because they are valid. You seem to transform the weakness of shell programs into an O/S guideline, and preach to that effect.

Twenty years ago, the shell creators gave you the ability to enclose $VARIABLES in "$QUOTES". Methinks you BELIEVE you know how to script bash, but you have not really learned anything beyond typing commands in an interactive shell. Shell quoting is so fundamental, and it is mentioned so early in the bash manual, that I have a REALLY HARD time believing you are a competent software developer (unless you program mirc scripts or in visual basic).

You still need to work with the user's files, which will inevitably have spaces in them. If a space is a valid character in the filesystem then your scripts need to reflect that. Erroring out is not a valid solution.
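A sketch of coping with such names in plain sh (the temp directory and files exist only for the demo):

```shell
# Quoted expansions keep a space-containing name as one word.
tmpdir=$(mktemp -d)
touch "$tmpdir/my file.txt" "$tmpdir/plain.txt"

found=0
for f in "$tmpdir"/*; do      # glob results don't get re-split...
    if [ -f "$f" ]; then      # ...as long as you always quote "$f"
        found=$((found + 1))
    fi
done
echo "files seen: $found"     # both names counted, space and all

rm -rf "$tmpdir"
```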

I will quickly write a shell script any time I have some simple task I want to automate. You cannot beat the convenience:

cd /some/directory/$1
some_program --foo $2 --bar $3
rm -f *.temp

Wow, three lines, and it runs the program, then cleans up the temp files that program always litters in my directory. And I don't have to memorize the --foo and --bar options! Shell scripts rock!

The problem comes when you start to do nontrivial things. When you start processing lists of files, and the files can contain spaces, the amount of quoting drives me insane. At that point I rewrite in Python.

The spaces-in-file-names problem can bite even this trivial shell script! If any of the three arguments ($1, $2, $3) is specified as a string containing spaces, this script won't work, because the shell word-splits at every step where it expands something. If you pass "my file.txt" as the second argument, $2 won't expand to "my file.txt" as one word; it gets split into two. So to be fully safe, the above program needs to be:

cd /some/directory/"$1"
some_program --foo "$2" --bar "$3"
rm -f *.temp

And woe is you if you forget the quotes.
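The failure mode is easy to demonstrate (the filename here is invented):

```shell
arg="my file.txt"
unquoted=$(printf '<%s>\n' $arg)     # word-splits into two arguments
quoted=$(printf '<%s>\n' "$arg")     # stays one argument
echo "$unquoted"
echo "$quoted"
```

The unquoted form prints two bracketed words, the quoted form one; anything downstream of the unquoted expansion sees the wrong argument count.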

Python loses in convenience for running a program... here's a Python equivalent of the above:

import os
import subprocess as sp
import sys

os.chdir(os.path.join("/some/directory", sys.argv[1]))
sp.check_call(["some_program", "--foo", sys.argv[2], "--bar", sys.argv[3]])
sp.check_call("rm -f *.temp", shell=True)  # run in a shell to get wildcard expansion

At first glance this looks horrible. It's much more than the three terse lines of the original. But it's easier to get right, and this is safer to run. If the user specifies something silly for the first arg, or doesn't provide it, this program will immediately stop after trying to change directories. The original would change to "/some/directory" and blindly run on, trying to run "some_program" there, and who knows what would happen? Likewise, if "some_program" fails, this script will stop immediately, and the deleting of the *.temp files will not occur (making it easier to debug what's going on). Finally, in this code we don't have to worry about quoting the arguments; we can just use the arguments and it just works. It is much harder to write a fail-safe shell script: you would have to explicitly test that $1 is provided, and you would have to check the result of running "some_program" to see if it failed or not.

The nontrivial scripts I write tend to have a lot of logic in the scripts themselves, and Python is much much more pleasant and effective for evaluating the logic. If I want to write a script that sweeps through a bunch of directories and deletes files that match certain criteria, it is so much easier to write the tests on the file in Python. If I write ten lines of "if" statements to look at a filename, that is ten lines where I didn't need to fuss with the double quotes. In Python, you can do things like

junk_extension = (".temp", ".tmp", ".junk")
if filename.endswith(junk_extension):
    os.remove(filename)

Shell scripting cannot match this convenience. And note that if I use the native Python os.remove() I don't need to worry about quoting the filename; it can have spaces in it and os.remove() doesn't care.

Other people might prefer to use Perl or Ruby. Any of those, or Python, is much better than shell scripts for anything nontrivial.

Look, regardless of your "special" way of naming files, the point is that *other people* who don't share your retarded opinion are going to put spaces in the filenames sooner or later-- so you need to be able to cope with it!

I use sh and relatives (and vi) because they're ubiquitous, stable, small, light, and reasonably fast, consistent, capable, and fairly understandable. The boot and service scripts in /etc are shell scripts, and by default system utilities such as cron call on sh. Everything entered at a command line is interpreted by a shell. sh is as much a part of UNIX systems as C. You might as well suggest GNU/Linux be rewritten in a better language than C.

And if you're going to suggest that, why not also reexamine the basic architecture of UNIX? If anyone produces an open, formally verified microkernel OS in Haskell that actually works, isn't dog slow, and has sufficient functionality and apps to be useful, I'll surely check it out. I'd love to see more consistency between how applications accept parameters from the command line and how programming languages handle parameters: the former tend to be named and unordered, while the latter are anonymous and ordered. Then there's the de facto standard for libraries, worked out in the days when memory and disk space were extremely limited. It doesn't support enough meta information, making it necessary for a compiler to read header files, and it has made libraries many little worlds of their own. As long as a programmer sticks to C/C++, it is relatively easy to call C library functions, but step outside that and it becomes a huge pain. Hence we have monstrous collections of duplicate functionality and wrapper code such as CPAN, abominations such as SWIG, attempts to bridge things by providing some commonality and standardization such as CORBA, and separate worlds such as the gigantic collection of Java libraries.

Something like Perl or Java is heavy enough to be impractical on a slow computer with little RAM; it can take over 5 seconds just to load the interpreter. I'm not familiar enough with Python or Ruby to know if they're as heavy. You can't always be sure they're there, whereas whatever is used in /etc/rc.d, and is run in a terminal, is guaranteed to be present. I don't know about a "pysh", but there is a "perlsh" for use in a terminal. I've never seen perlsh used, though, and it seems to demand a nasty, hackish sort of interaction: press Enter twice to execute commands, as one press of Enter is apparently used as a statement or formatting break. Maybe that's because those languages actually aren't too suitable for an interactive environment? As for connecting to the web, there's wget, wput, and curl.
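You can at least probe what a target box actually ships with before depending on it; a quick sketch (the interpreter list is just a sample):

```shell
# Report which interpreters are on PATH (sh is the only safe bet).
report=$(for interp in sh perl python ruby; do
    if command -v "$interp" >/dev/null 2>&1; then
        echo "$interp: present"
    else
        echo "$interp: missing"
    fi
done)
echo "$report"
```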

It could be a lot worse. Bash is pretty nice compared to MS DOS batch language.

A series of commands to be executed in a specific order, designed to automate a repetitive task. Basic logic, control, and input are generally OK, but interaction is, in my opinion, an indicator that your task is out of scope for a "script" and should become a full-fledged application.

So if your script needs to just ask for a path or a couple inputs to create a configuration file, should you build an installation utility? What if the script is just to ask for how many days of log files to keep? Should we install xdialog or zenity in order to put a nice GUI around "How many days of log files should be kept (1 or more, 0 to disable cleanup)?"
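For a question that small, a plain read with a default does the job; a sketch (the function and variable names are invented):

```shell
# Prompt with a default; empty input (or EOF) falls back to 7 days.
ask_days() {
    printf 'How many days of log files should be kept (0 to disable cleanup) [7]: ' >&2
    read -r days || days=""
    [ -n "$days" ] || days=7
    echo "$days"
}

days_kept=$(ask_days </dev/null)   # unattended run: takes the default
echo "keeping $days_kept days of logs"
```

Because the prompt goes to stderr and the answer to stdout, the same function works interactively and inside command substitution.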

Sure you can get the same result, but the syntactic sugar in your example is much more verbose, and conceptually more complex.

For each of the three components, there's a mental context switch (File object on the left, reader object in the middle, and substitution method on the right).

The shell language does the right thing by handling components more uniformly (i.e. they all have STDIN/STDOUT regardless of the nature of the command). The user needs to know what each command will do, but he does not need to know whether the result is an array object, or a stream object, or a file object, etc.

The shell also has less redundancy. Compare "cat foo.txt" with File("foo.txt"): there should be no need for both parentheses and quotes. Now, in the wider scheme of Ruby this redundancy makes perfect sense, but users don't need it; only programmers do.

Users need the bare minimum to communicate with the machine in a language that takes 30 seconds or less to type (or speak in a microphone...), but still lets them do as much as possible.

It's an interface issue; it's got little to do with the range of things that can be done in the language. Ruby is much more powerful than bash, but bash is still better at starting and stopping programs (and rc is better than bash...).

I don't think I said it was nontrivial; I just said that Python was more convenient. If you wanted to test a single file and see whether it ended with one of three extensions in a shell script, what would you do?

Oh, everyone's favorite user-friendly command, find(1). What an amazingly baroque set of command-line arguments it takes!

I trust you realize that using find(1) to delete a single file is about like using a chainsaw to cut butter to put on a piece of toast.

But if you want to remove all files that end with *.temp, *.tmp, or *.junk from a whole set of directories, it's this simple and friendly command:

find /path/to/top/directory \( -regex ".*\.temp" -o -regex ".*\.tmp" -o -regex ".*\.junk" \) -exec rm {} \;

Don't forget that you have to put a backslash before the parentheses or the shell complains. Don't forget to put a space between your escaped parentheses and the find(1) predicates. Don't forget to use those parentheses or else the -exec command will bind with the last predicate (in this example, it would only delete the *.junk files).

Or, if you know your target platform is using GNU find(1), you can shorten it a lot:

find /path/to/top/directory -regextype awk -regex ".*\.(temp|tmp|junk)" -exec rm {} \;

That assumes you already knew that in AWK it is legal to put regular expression alternatives in parentheses, separated by vertical bars. Of course, you can also do this trick without specifying AWK mode, but then you need to backslash-escape the parens and the vertical bars in the regexp that specifies the alternatives:

find /path/to/top/directory -regex ".*\.\(temp\|tmp\|junk\)" -exec rm {} \;

You can do crazy powerful things with find(1), but its syntax is annoying. I'd rather write a simple Python script using os.walk, such as:

import os
import sys

junk_extension = (".temp", ".tmp", ".junk")
for dirpath, dirnames, filenames in os.walk(sys.argv[1]):
    for name in filenames:
        if name.endswith(junk_extension):
            os.remove(os.path.join(dirpath, name))

And really, if I'm doing this a lot, I'll write a simple Python function that hides some of the ugly details. And again, the Python solution is more bulletproof; it doesn't matter if any filenames have spaces in them, you get a sensible error message if you forget to specify an argument, etc. find(1) scares me; its syntax is tricky, and you are doing things over whole directory trees. If I'm going to automatically delete a bunch of files, I kind of want them to be the correct ones, and the Python is much easier to get correct.

Okay, I wrote lots of code. Your turn. Please show us all your most elegant solution, in shell script, to the problem of identifying whether a file has any extension from (".temp", ".tmp", ".junk").

You don't have to tell me. I'm fond of Perl. I do admit that it's easy to write obtuse code, but if you just try a little bit, you can write readable and straightforward code. In fact, some of the often-derided syntactic constructs make Perl easier to read, not harder.

For example, even opening a file descriptor and then iterating over it is awkward in Bourne shell: you end up stowing it as some FD number over 2, and then writing odd redirections like "<&3" on every line that touches it.
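The sort of thing I mean, sketched in plain sh (the file and its contents are invented for the demo):

```shell
# Read a file line-by-line on descriptor 3.
tmp=$(mktemp)
printf 'first\nsecond\n' > "$tmp"

exec 3< "$tmp"       # stow the open file on FD 3
read -r line1 <&3    # every read must name the descriptor again
read -r line2 <&3
exec 3<&-            # and you have to remember to close it
echo "$line1 $line2"

rm -f "$tmp"
```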

The problem is that in most half-decent languages you can express almost anything; it is about choosing to refrain from expressing. As I stated, to me shell scripting languages are mostly about setting up environments and starting programs. I've seen people create monstrous programs in Bash -because apparently they f...ing could- using complex arrays and reinventing clib functions, resulting in a badly performing system, rendering their product unmanageable and thus a liability.

I have been programming for only three decades, two of them professionally, and I'm still looking for the fine equilibrium between what you could do and what you should do. My focus in this era is on Java, Perl and Bourne shell. Java can do most things I need at the application level on most platforms and is well accepted almost everywhere within the corporate world. Perl is suitable for slightly more complex system stuff, is readily available on most systems, and the skills are still around. Bourne shell I use for straightforward system stuff, and it is available virtually everywhere.

I'd refrain from using Bash or Korn shells (not readily available everywhere, historically challenged), from writing specific system programs in Java (no intrinsic POSIX support), and from some other obvious permutations of language, applicability and practicality.

A rather frivolous parallel would be to compare programmers and music composers. The best ones are mostly technically accomplished and excel at refraining from using phrases that would not fit the composition.

Back to the Bourne shell. It is IMHO a truly remarkable piece of work, to be used for what it does best.

It doesn't have to be that way. I always try to write Perl to be readable rather than concise. Sure, I might take 6 lines to do something that can be done in 1, and may use some other stuff that isn't strictly necessary, but I'm writing to get the job done and for maintenance, not to show how clever I am.