Given many of your recent questions, either language (as well as Python...) can do the job as there is overlap in their functionalities. What you should do next is look at a little of each & determine which seems more intuitive in terms of syntax, usage, script construction, etc.

Recognize that if you simply want custom sorting, then awk(1) may very well be your best choice (for now...). However, if you continue down this path of wanting custom scripts for this or that need, then you should begin assessing which language meets your longer-term goals & go with the best choice. It takes time & effort to climb the learning curve of any language, & continually flipping from one choice to the next is counterproductive.

...happens to have a number of good Python articles written by some influential members of the Python community.

Although this partial/simple book list is O'Reilly-centric, O'Reilly cornered the market when it comes to Perl titles. Other good non-O'Reilly titles exist, but when starting out with the language, staying with the animal books is a reasonable choice.

As for Python, O'Reilly has some good titles, but they did not capture the Python book market as they did with Perl. Python came out after Perl, & the industry was at a different point in its maturation. These may be contributing factors as to the difference.

Thanks a lot for your suggestions.
I think right now I might first use awk, which seems from the outside "smaller" and "simpler", but then I'll have to learn at least Perl. In fact, yesterday I found a conversion tool (Encode::HanConvert) which I will need very often to convert simplified Chinese characters to traditional ones and vice versa. This tool is in Perl, so I guess it has all I need. As far as Python goes, I presently cannot understand the difference between the two, so maybe with time I will.

As far as Python goes, I presently cannot understand the difference between the two, so maybe with time I will.

From the perspective of the English speaking hordes, Python's syntax is more "English"-like without the plethora of special characters & special nuisances required by other languages (specifically Perl). Some find this minimized amount of computer science cruft makes Python easier to write than other languages modeled on C (like Perl). Personally, I don't have such misgivings about Perl, but I know many that do.

How this "ease of use" translates to those speaking Chinese is unknown to me. Maybe the simplicity doesn't translate at all.

As for the goals of both languages, they are very similar, but Perl comes from a heritage inheriting the syntax & mindset of both shell & C programming. Python doesn't duplicate this lineage.

And for what it is worth, awk also inherits various idiosyncrasies from both shell & C programming. awk has a lot of power & served as a prominent scripting language alternative until Perl (& later Python...) arrived on the scene.

Well, I'm neither English nor Chinese mother tongue, so the "Englishness" does not make a big difference to me. Maybe with time I might learn all three languages, but now I'll go first for awk and then Perl, and if its syntax is similar to shell and C, it will also help me understand Unix better, I think.

I use awk mostly for field-related, single-shot programs. If I needed advice, I would ask at http://www.unix.com/shell-programming-scripting/ -- that's a hot-bed of awk questions and answers. I have seen some very complex and creative solutions there, as well as gentle answers for novice users. As usual, it is in one's best interest to try to solve a problem first, then -- as necessary -- post sample input, desired results, and actual results.

Well, I'm neither English nor Chinese mother tongue, so the "Englishness" does not make a big difference to me. Maybe with time I might learn all three languages, but now I'll go first for awk and then Perl, and if its syntax is similar to shell and C, it will also help me understand Unix better, I think.

This is not what ocicat meant. He meant that Python is more like a natural language (ANY language) and has less syntax: for example, Python doesn't require a semicolon (;) at the end of each statement, and it doesn't require curly braces ({ }) or parentheses ( () ) in many places where most other languages do, and so forth.

This is very different from other languages which sometimes require excessive parentheses (*cough* lisp *cough*).
The syntax of many languages seems to be designed so that the parser/compiler can easily understand & read the language; Python's syntax is designed so that it is easier for humans to understand & read the language ... This may make the compiler slightly harder to write, but you only write a compiler once, and you write code many times.
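
To make the point about semicolons and braces concrete, here is a minimal Python sketch. The function name and the word list are made up for illustration; nothing here comes from the earlier posts. Indentation alone marks the blocks, and no statement ends in a semicolon:

```python
# Hypothetical example: find the longest word in a list.
# Note: no semicolons, no curly braces, no parentheses around the conditions.
def longest_word(words):
    longest = ""
    for word in words:
        if len(word) > len(longest):
            longest = word
    return longest


print(longest_word(["awk", "perl", "python"]))
```

The same loop in Perl or C would need braces around each block, a semicolon after each statement, and parentheses around the `if` condition.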

I see, thank you for the explanation.
In the meantime, if anyone could direct me to the relevant part of awk or Perl I should study first to solve my sorting problem, I'd be very grateful (I could not find it on Google).

awk is great; the syntax is a lot like C, only less finicky about declarations. So if you know C you can get started quickly (and perhaps vice versa).

But long ago in my first brushes with awk, I was very confused and bogged down in the command-line syntax, patterns and pre-defined variables. The big picture was missing, and it really didn't start to click until I realized a simple analogy that made it clear.

So here's my mini-contribution to awk 101. (For those who know awk, allow me the leniency of over-simplification in describing this analogy.) In a language like C, the functions have names. The code within the function block gets executed when the function is called by name, either from another such function or from main().

The analogy is that awk is like this, except the "functions" don't have a name: instead they have a pattern associated with them. The code in a "function" block gets executed when the pattern matches (part of) an input-data line.

To me, that's awk in a nutshell, the rest is details. (Of course, the "functions" are called "action statements" and awk does have named functions of its own just like in C.)
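
For readers who think in code, the analogy can be mimicked in a few lines of Python. This is only a rough emulation of the idea, not how awk is implemented: `run_rules` and the two sample rules are hypothetical names invented here, and real awk actions print for themselves rather than returning strings. In awk itself the first rule would look roughly like `/^a/ { print "starts with a: " $0 }`.

```python
import re


def run_rules(rules, lines):
    """Run every (pattern, action) pair against every input line.

    Like awk, an action has no name: it fires whenever its pattern
    matches some part of the current line.
    """
    output = []
    for line in lines:
        for pattern, action in rules:
            if re.search(pattern, line):
                output.append(action(line))
    return output


# Two nameless "functions", each selected by a pattern instead of a name.
rules = [
    (r"^a", lambda line: "starts with a: " + line),
    (r"n$", lambda line: "ends with n: " + line),
]

for result in run_rules(rules, ["abc", "lmn", "zzz"]):
    print(result)
```

A line that matches no pattern (here "zzz") simply triggers nothing, and a line that matches several patterns triggers each matching action in turn.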

OK guys, I've started reading tutorials and all. I've also found a tool which should help with this sorting of mine: Unicode::Collate (from CPAN).
Now I have this test file:

Code:

abc
aab
bbc
mmn
lmn
aaa
ššš
sss
zzz

If I sort it, I get this:

Code:

$ sort test
aaa
aab
abc
bbc
lmn
mmn
sss
zzz
ššš
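
That ordering is what a plain code point comparison produces: š is U+0161, which is numerically above z (U+007A). The post does not say which locale was active, so this is only a guess at the cause, but a small Python check (assuming the file is UTF-8, where byte order and code point order agree) reproduces the same result:

```python
# The nine test lines from the post, in their original order.
words = ["abc", "aab", "bbc", "mmn", "lmn", "aaa", "ššš", "sss", "zzz"]

# Python compares strings code point by code point, with no locale rules.
by_code_point = sorted(words)

print(hex(ord("z")), hex(ord("š")))  # 0x7a 0x161
for word in by_code_point:
    print(word)
```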

If I sort it with this Perl script, which I worked out from the usage notes of Unicode::Collate, I get this (and it's really slow!):

Code:

aaa
ššš
aab
abc
bbc
lmn
mmn
sss
zzz

As you see, the "ššš" are not after "z" which is already an improvement, but they should be right after "s".
Do I have to explicitly tell Perl where to put them? How?
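
One way to picture "explicitly telling" a sort where š belongs is a custom alphabet in which š is its own letter right after s. The sketch below is in Python rather than Perl and does not use Unicode::Collate; `ALPHABET`, `RANK` and `alphabet_key` are hypothetical names, and the alphabet string is an assumption about the desired order, not something taken from a real locale definition:

```python
import unicodedata

# Assumed target order: the basic Latin letters with š inserted after s.
ALPHABET = unicodedata.normalize("NFC", "abcdefghijklmnopqrsštuvwxyz")
RANK = {letter: position for position, letter in enumerate(ALPHABET)}


def alphabet_key(word):
    """Turn a word into a tuple of alphabet positions for sorting.

    Characters outside ALPHABET are pushed after all known letters and
    ordered among themselves by code point.
    """
    normalized = unicodedata.normalize("NFC", word)
    return tuple(RANK.get(ch, len(ALPHABET) + ord(ch)) for ch in normalized)


words = ["abc", "aab", "bbc", "mmn", "lmn", "aaa", "ššš", "sss", "zzz"]
custom_sorted = sorted(words, key=alphabet_key)
for word in custom_sorted:
    print(word)
```

With this key, "ššš" lands directly after "sss" and before "zzz". The sketch is case-sensitive and knows nothing about accents it was not told about, so it is a toy compared with a full collation library.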
Here's the script (don't laugh too loud):

I still don't completely understand why ū sorts in the proper order without being treated as an extra letter, as you have to do for š, but eventually I will. Anyway, it gives the expected result: