I'm running a very time-consuming script which takes many hours to end. Watching top I see that it's only taking 5% of the CPU at best, usually around 3%.

Is there any way to force the script to use more CPU in order to end faster?

Edit:

Basically, the script brute-forces 5-character passwords given the salt and the hash.

Not at home right now, but something like:

charset=({a..z})   # bash array of candidate characters
for i in "${charset[@]}"; do
  for j in "${charset[@]}"; do
    for k in "${charset[@]}"; do
      for l in "${charset[@]}"; do
        for m in "${charset[@]}"; do
          # $1 is the salt, $2 the hash we are looking for
          pass=$(openssl passwd -salt "$1" "$i$j$k$l$m")
          if [ "$pass" = "$2" ]; then
            echo "Password: $i$j$k$l$m"
            exit
          fi
        done
      done
    done
  done
done

You should consider posting your script so we can tell you whether it's possible or not. If your script relies heavily on disk I/O, you could give it the highest priority and it still wouldn't go faster than your hard drive.
–
Kiwy, Mar 12 '14 at 9:26

Dude, you need a different script - in fact, you need a different programming language. That said, your problem could be openssl and /dev/random. You could try moving your mouse a lot and smashing keys on the keyboard a lot. If your processing time jumps from 3 hours to 3 minutes you might want to look into pseudo randomness generators. Catting/copying files from nowhere to nowhere can help a lot too.
–
mikeserv, Mar 12 '14 at 11:26

Spawning a new openssl process for each attempted password will have a noticeable impact on performance. You should consider rewriting your script to hash passwords without spawning thousands of processes or (preferably) switch to a more sophisticated tool like John the Ripper.
–
n.st, Mar 12 '14 at 11:31

If I wanted to use it for real brute-forcing, I would use John or Hydra. But that script is part of my Master's programme, and I need to use that for "learning purposes".
–
yzT, Mar 12 '14 at 11:41

You may consider building your script around the -in option to openssl passwd. That way you reduce your total number of execve calls. I agree with the others: you're probably better off writing this in C. The openssl tool is mostly just a vehicle for the project maintainers to do a proof of concept on how you would implement the underlying libraries.
–
Bratchley, Mar 12 '14 at 13:46

This took ~426 minutes. I actually hit Ctrl+C at that point, so it hadn't finished, but I didn't want to wait any longer than this!

NOTE: Both these runs were on this CPU:

brand = "Intel(R) Core(TM) i5 CPU M 560 @ 2.67GHz"

Improvement #2 - Using nice?

The next logical step would be to nice the above runs so that they get a higher scheduling priority and can consume more resources (a negative niceness such as -20 requires root).

$ nice -n -20 ./pass.bash ab hhhhh

But this will only get you so far. One of the "flaws" in your approach is the calling of openssl repeatedly. With {a..z}^5 you're calling openssl 26^5 = 11881376 times.

One major improvement would be to generate the patterns of {a..z}.... and save them to a file, and then pass this as a single item to openssl one time. Thankfully openssl has 2 key features that we can exploit to get what we want.

Improvement #3 - our call structure to openssl

The openssl passwd command provides the switches -stdin and -table, which we can use to make a single invocation of openssl regardless of how many passwords we want to pass to it. This single modification removes all the overhead of having to start openssl, do the work, and exit for every candidate; instead we keep a single instance running and feed it as many passwords as we want.

The -table switch is also crucial, since it tells openssl to print the original password alongside its hashed version, so we can make fairly quick work of looking for our match.

Here's an example using just 3 characters to show what we're changing:
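
A minimal sketch of that single-invocation structure, not necessarily the script used for the timings here. The helper names gen3 and find_match are made up for illustration. It assumes an openssl build whose passwd command accepts -salt, -stdin and -table and prints each result as the password, a tab, then the hash; which hash algorithm is the default depends on the OpenSSL version.

```shell
# gen3 CHARS...
# Print every 3-character candidate over CHARS, one per line.
gen3() {
    for i in "$@"; do for j in "$@"; do for k in "$@"; do
        printf '%s\n' "$i$j$k"
    done; done; done
}

# find_match HASH
# Read "password<TAB>hash" lines on stdin and print the password whose
# hash equals HASH. Exit status is 1 if nothing matched.
find_match() {
    awk -F '\t' -v target="$1" '
        ($2 "") == (target "") { print $1; found = 1; exit }
        END { exit !found }'
}

# When called as: script SALT HASH
# a single openssl process hashes the whole candidate stream.
if [ "$#" -eq 2 ]; then
    gen3 a b c d e f g h i j k l m n o p q r s t u v w x y z |
        openssl passwd -salt "$1" -stdin -table |
        find_match "$2"
fi
```

Because awk exits at the first match, the rest of the pipeline stops early instead of hashing the remaining candidates. Going from 3 to 5 characters only means two more nested loops in the generator.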

This is a massive improvement! This same search that was taking more than 426 minutes is now done in ~1 minute! If we search through to say "nnnnn" that's roughly in the middle of the {a..z}^5 character set space. {a..n} is 14 characters, and we're taking 5 of them.

Conclusions

So with a restructuring we're running much faster. This approach scales much better too as we add a 6th, 7th, etc. character to the overall length of the password.

Be warned, though, that we're using a smallish character set: only the lowercase letters. If you mix in all the digits, both cases, and special characters, you typically get ~96 characters per position. This may not seem like a big deal, but it increases your pool tremendously:

$ echo 26^5 | bc
11881376
$ echo 96^5 | bc
8153726976

Adding all those characters just increased our search space by a factor of about 686, nearly 3 orders of magnitude. If we go up to roughly 10-12 characters in length, it really puts a brute-force approach out of reach.

Using a proper salt as well as additional nonces throughout the construction of a hashed password can add still more stumbling blocks.

The first example takes exactly the same time as my original brute-force script. Maybe it has something to do with running the script in a VM? To be exact, it takes 50s on 3 chars. The VM setup is 2 cores at 3GHz and 2 GB RAM.
–
yzT, Mar 12 '14 at 21:31

The nice example also takes 50s. The -table example is the good one.
–
yzT, Mar 12 '14 at 21:40

Hmm, what does the =~ do? It's the first time I've seen it.
–
yzT, Mar 12 '14 at 22:10

I don't understand one thing. According to man bash, =~ compares two expressions, doesn't it? However, if we are just matching two strings, why doesn't == work?
–
yzT, Mar 13 '14 at 9:40

Why the echo ... | while read when you could have done for i in ...?
–
Stéphane Chazelas, Mar 13 '14 at 10:58

Your script isn't just sitting on its hands! It's probably waiting for a resource other than the CPU; perhaps it's manipulating lots of files and waiting for the disks, or sending lots of stuff over the network and waiting for that.

Look at the aggregate resource usage lines at the top of top and you'll see something like

The us number is the amount of CPU actually being used by the logic of your script (or other things that it calls). sy and wa are CPU time spent in system calls and waiting for I/O. If these numbers are high while your script is running, and there's no other activity on the system, then something other than the CPU is your bottleneck.
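
A rough sketch of where those numbers come from, assuming a Linux system where the aggregate "cpu" line of /proc/stat lists user, nice, system, idle, iowait, irq, softirq and steal counters in that order. The function name cpu_breakdown is made up for illustration, and the rounding will not match top exactly.

```shell
# cpu_breakdown PREV CUR
# PREV and CUR are two "cpu ..." lines sampled from /proc/stat some time
# apart. Prints the share of the elapsed CPU time spent in user code (us),
# in the kernel (sy) and waiting for I/O (wa), as whole percentages.
cpu_breakdown() (
    read -r label u1 n1 s1 i1 w1 q1 sq1 st1 rest <<EOF
$1
EOF
    read -r label u2 n2 s2 i2 w2 q2 sq2 st2 rest <<EOF
$2
EOF
    du=$((u2 - u1)); ds=$((s2 - s1)); dw=$((w2 - w1))
    total=$((du + (n2 - n1) + ds + (i2 - i1) + dw + (q2 - q1) + (sq2 - sq1) + (st2 - st1)))
    # Guard against identical samples (no elapsed time)
    [ "$total" -gt 0 ] || total=1
    echo "us=$((100 * du / total))% sy=$((100 * ds / total))% wa=$((100 * dw / total))%"
)

# Typical use while the script is running:
#   prev=$(grep '^cpu ' /proc/stat); sleep 5
#   cur=$(grep '^cpu ' /proc/stat)
#   cpu_breakdown "$prev" "$cur"
```

A large sy share while the password script runs would point at process creation overhead rather than hashing work.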

Your script is probably using more than 3% as explained by Flup but you are also wasting a lot of time spawning a copy of openssl every time you want to create a password (which is probably the main reason for the missing cpu time). If you have a multicore/threaded machine you will also not be able to stress more than a single thread of execution with your script so you will be using 1/max_number_of_threads of possible CPU time.
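
One way to keep more than one core busy, sketched under assumptions: split the search space by first character and start one background copy per slice. The names slice_for_worker and run_sliced are hypothetical, and the command passed to run_sliced stands in for a search script that takes its first characters as arguments.

```shell
# slice_for_worker N K CHARS...
# Print the characters assigned to worker K (0-based) out of N workers,
# dealing CHARS out round-robin.
slice_for_worker() (
    n=$1; k=$2; shift 2
    idx=0; out=""
    for c in "$@"; do
        if [ $((idx % n)) -eq "$k" ]; then
            out="$out${out:+ }$c"
        fi
        idx=$((idx + 1))
    done
    printf '%s\n' "$out"
)

# run_sliced N CMD CHARS...
# Start N background copies of CMD. Each copy receives its slice of
# first characters as separate arguments.
run_sliced() {
    n_workers=$1; cmd=$2; shift 2
    k=0
    while [ "$k" -lt "$n_workers" ]; do
        # Unquoted on purpose: the slice is split into one argument per character
        "$cmd" $(slice_for_worker "$n_workers" "$k" "$@") &
        k=$((k + 1))
    done
    wait    # block until every worker has finished
}
```

For example, run_sliced 2 ./search.sh a b c d would start ./search.sh a c and ./search.sh b d side by side. This sketch does not stop the other workers when one of them finds the password.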

You will be better off implementing the code in something other than a shell script, something that can natively do the hashing for you. Your openssl command is incomplete so my suggestions are a bit vague, but generally perl, python and ruby will have modules that implement the standard stuff (like scrypt and bcrypt).

I need to use a shell script. It's not for "production", it's for learning purposes. Maybe a -crypt is missing from the command, but nothing else. And -crypt is the default, I think.
–
yzT, Mar 12 '14 at 11:43

Oh.. you're running openssl passwd then. Running the openssl binary charset^5 times is always going to take a while. If you want it to complete quicker, you'll need to do it differently.
–
Matt, Mar 12 '14 at 12:15

Yeah, indeed passwd was missing :D. That's what my teacher wants me to do: modify the script so it is quicker, and I thought about increasing the resource usage. What other solutions do I have? I've tried using a dictionary with every possible combination, but it takes just as long as brute-forcing.
–
yzT, Mar 12 '14 at 12:54

@yzT, one invocation of openssl can process as many passwords as you want if you use -stdin (and newline is not part of your charset).
–
Stéphane Chazelas, Mar 12 '14 at 13:00

If you have a multicore/threaded machine you could run multiple copies as well.
–
Matt, Mar 12 '14 at 13:01

In addition to the various implementation optimizations in @slm's answer, there is a huge algorithmic optimization to make. (There may also be some cryptanalytic attacks; I'm not sure how strong the UNIX crypt algorithm is, but that is probably beyond the scope of a shell script.)

The problem you're trying to solve was probably described as "brute force a password". In order to do that, you generated all possibilities in alphabetical order, and tested them, exiting at the first match. If you model a password as "random string of 5 characters", then this is actually optimal.

A few people will use an actually random password. A few, not most. Even for things where they really ought to. Especially for things they don't consider important. We've had this confirmed recently and repeatedly by numerous leaked password databases.

The password you're looking for could be dbKbuW. But it's many, many times more likely to be qwerty, abc123, or 123456. We know from the leaked databases that the top thousand or so most common passwords will crack most accounts.

So the real speed up is to generate your guesses in a better order. At minimum, sort your letter-lists by frequency in English (or, better yet, in passwords). You should be trying e a lot earlier than z. Run through a list of 1000 or so common passwords first, before even starting your brute-force iteration across the search space (wastes a little time if it doesn't work, but depending on your users it'll work well over half the time).

If you look at how actual password cracking tools work, they run through a few lists of common passwords, then through a dictionary, then through variations (e.g., "change e to 3"), then through various password generating strategies, and only if none of that works, then finally start iterating through the search space.
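
A minimal sketch of that ordering, using three characters to keep it short. The function name guesses and the file name common.txt are hypothetical, and the letter order in FREQ_ORDER is only an approximate English frequency ranking; an order derived from real password data would be better.

```shell
# Approximate English letter-frequency order (an assumption, see above).
FREQ_ORDER="e t a o i n s h r d l c u m w f g y p b v k j x q z"

# guesses WORDLIST CHARS...
# Print candidates in the order they should be tried:
#   1. every line of WORDLIST (a file of common passwords), if it is readable
#   2. every 3-character string over CHARS, earlier CHARS first
# Words from the list that are also 3-character strings appear twice.
guesses() {
    wordlist=$1; shift
    if [ -f "$wordlist" ] && [ -r "$wordlist" ]; then
        cat "$wordlist"
    fi
    for i in "$@"; do for j in "$@"; do for k in "$@"; do
        printf '%s\n' "$i$j$k"
    done; done; done
}

# Possible use, feeding a single hashing process:
#   guesses common.txt $FREQ_ORDER | openssl passwd -salt "$salt" -stdin -table
```

The generator only decides the order of the candidates; the matching step stays the same, so a likely password is simply reached after far fewer hashes.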