Posted by timothy on Friday July 15, 2011 @07:44PM
from the picasso-on-security dept.

Makoss writes "Normal cryptographic hash functions turn any input text or data into a compact set of bits; useful for computers, not useful for humans. Visual hash functions turn data into graphical representations which are more easily recognizable and memorable to humans. You've seen Identicons and other simple geometric image generators already, but Vash takes the technique beyond basic geometry and produces some really striking images."

Unless the name is grossly misleading, "hash" implies one way, by design.

With a suitably poorly designed hash algorithm, it may be possible to recover certain inputs; but that's a bug, not a feature (also, assuming the hash produces outputs of some limited size and accepts inputs of size bounded only by your computational resources and patience, as they tend to, it is easy to see that it cannot be reversible in general because the set of possible inputs is vastly larger than the set of possible outputs,

After downloading and reading bits of the docs (but not the code), it appears that it hashes the data you give it (SHA-1 or MD5) and builds the graphic based on the hash.

(You specify the hashing algorithm by a parameter, and, no, they don't recommend the parameter that specifies MD5. I didn't read far enough to guess as to why the parameter is not the name of the algorithm.)

So, since it appears that not every geek here is familiar with hashing (Huh?), I'll point out the obvious: The hash does not give enough information to reproduce the original data. (But what about very short inputs, like passwords, which they, erm, suggest?) Also, since the hash is cryptographically hard, reversing it is rather difficult even if you can afford to search through the pseudo-reversion set.
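To make the "not enough information" point concrete, here is a minimal sketch. It assumes a toy 8-bit hash (the first byte of SHA-256, a stand-in chosen only for illustration, not anything Vash does). With 256 possible outputs and 257 distinct inputs, two inputs must share an output, so no decoder could tell them apart.

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """Toy 8-bit hash: the first byte of SHA-256, so only 256 possible outputs."""
    return hashlib.sha256(data).digest()[0]

def first_collision(count: int = 257):
    """Hash `count` distinct inputs and return the first pair that collides."""
    seen = {}
    for i in range(count):
        msg = f"input-{i}".encode()
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg
    return None  # unreachable for count > 256 (pigeonhole principle)

a, b, h = first_collision()
print(a, b, h)  # two different inputs, one shared hash value
```

The same counting argument applies to a real 512-bit hash; the collisions are just far too rare to stumble on.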

.md is the extension for Markdown, which GitHub automatically turns into pretty HTML.

(You specify the hashing algorithm by a parameter, and, no, they don't recommend the parameter that specifies MD5. I didn't read far enough to guess as to why the parameter is not the name of the algorithm.)

The algorithm string also controls the node frequencies of the guided random walk that builds the function tree. Different algorithm specifiers can give you wildly different looking images. At the moment, it just changes the hash function, but future versions will add new node types and need a way to specify those parameters to generate backwards-compatible images.

Might want to give a nod to the node-walk in the readme, and you may want to add sub-parameters, including, not just a set of good node frequencies or other parameters of the node walk, am I making sense?, but also, perhaps, a salt file to help work with short input sets like passwords and passphrases and account numbers and such.
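A rough sketch of the salt idea, assuming the salt is simply mixed in with a standard key-derivation function before anything is drawn. This uses Python's `hashlib.pbkdf2_hmac`; the function name `salted_digest` and the round count are illustrative choices, not part of Vash.

```python
import hashlib
import os

def salted_digest(secret: bytes, salt: bytes, rounds: int = 100_000) -> bytes:
    """Stretch a short secret with a salt so equal secrets give unrelated digests."""
    return hashlib.pbkdf2_hmac("sha256", secret, salt, rounds)

salt = os.urandom(16)  # would be stored in the hypothetical salt file
digest = salted_digest(b"hunter2", salt)
print(digest.hex())
```

The resulting digest, rather than the raw password, would then be what gets turned into an image.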

(Started to add PINs to that list, but now I'm thinking that's a really bad idea. On the other hand, this sort of thing suggests t

... and you may want to add sub-parameters, including, not just a set of good node frequencies or other parameters of the node walk, am I making sense?

I've already got an issue in the tracker for that:-). I want to make the full suite of low-level parameters tweakable through a special algorithm string, so that people can make their own unique stylings using Vash's generator, without having to do any coding.

"Hash" the word doesn't imply much of anything; but any function for which the set of legal inputs is larger than the set of possible outputs must be one-way in general (there may be a number, potentially a very large one, of special cases that are reversible; but it cannot be reversible in general). Most functions described as hash functions fall into that category. They may well have other problems, like making it quite easy to construct inputs that yield a desired output, or having a set of reversibl

Well, to be secure, you'll want to hash your data with a standard hash algorithm, and then submit the hash to this "vash" thing. Who knows, it might actually be useful, once the actual hashing algorithm is separate from representation.

What's too bad is that the system is (comparatively) slow: having a new vash computed on the contents of (say) a password field after each keystroke would make entering passwords under error-prone conditions (such as touchscreen keyboards, or pitiful human weakness) much, much easier without being nearly as insecure as the prevailing "show the last character entered until you enter the next one" scheme.

Since humans are pretty good at visual recognition, they'd pick up that the picture was 'wrong' after a

(Just to save anybody the trouble of correcting me, it would be the case that the sequence of images thus generated would make brute-forcing the password much easier: If you knew the sequence of N images, with M legal characters for the password, you'd only have to do N rounds of between 1 and M guess-and-checks, rather than between 1 and M^N guess-and-checks... So, it would be grossly unsafe to let an attacker have accurate copies of your image sequence. It would still, though, be substantially less unsafe

If you hashed each character as it was entered, that would be a pretty serious issue (it would basically just reduce the whole thing to a computationally pathological substitution cipher, which would be pretty pointless); but if you re-hash the entire contents of the password field every time a character is added or removed, you merely (if one can use the word to describe such a gigantic loss...) suffer the loss of complexity from M^N guesses to N rounds of M guesses...
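The per-keystroke leak can be sketched as follows. This assumes the observer captures a hash of every prefix of the password (SHA-256 stands in for the image here) and that the password uses only lowercase letters; both are assumptions made for the demo.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase  # assumed set of legal characters

def digest(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()

def leaked_sequence(password: str) -> list:
    """Hash of every prefix: what someone watching the images would collect."""
    return [digest(password[:i]) for i in range(1, len(password) + 1)]

def recover(leaks: list):
    """Recover the password one character at a time; returns (password, guesses)."""
    known, guesses = "", 0
    for target in leaks:
        for ch in ALPHABET:
            guesses += 1
            if digest(known + ch) == target:
                known += ch
                break
        else:
            raise ValueError("character outside ALPHABET")
    return known, guesses

password, guesses = recover(leaked_sequence("vash"))
print(password, guesses, "guesses; full search space:", len(ALPHABET) ** 4)
```

Each round costs at most one pass over the alphabet, so the work grows with password length times alphabet size instead of with the size of the whole password space.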

Don't think so. The whole idea of the "visual hash" is to make a long, complicated hash value into something easier to memorize. As a consequence, you're not memorizing the equivalent of 128 bits (or more, for longer hashes) worth of data when you memorize the picture. So brute-forcing a "collision" is much easier to do when you're trying to duplicate a picture than if you were trying to duplicate an exact hash output. You just need something that looks sufficiently like the target image, which won't be nearly as

SSH has been generating random ascii art with ssh-keygen for ages now, using basically the same principle. Besides, for most users, this should be way more secure than throwing a raw hex string at them and expecting them to manually diff it against their paper copy.
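For anyone who hasn't seen it, the ssh-keygen picture comes from a random walk over a small grid driven by the fingerprint bits. Below is a simplified sketch in that style (often called the "drunken bishop"). It hashes arbitrary bytes with SHA-256 and omits the frame, so it is an illustration of the idea and is not guaranteed to reproduce real ssh-keygen output.

```python
import hashlib

SYMBOLS = " .o+=*BOX@%&#/^SE"  # visit counts 0..14, then start 'S' and end 'E'
WIDTH, HEIGHT = 17, 9

def random_art(data: bytes) -> str:
    """Walk a 17x9 grid diagonally, two digest bits per step, counting visits."""
    digest = hashlib.sha256(data).digest()
    grid = [[0] * WIDTH for _ in range(HEIGHT)]
    x, y = WIDTH // 2, HEIGHT // 2
    for byte in digest:
        for _ in range(4):  # four 2-bit moves per byte, low bits first
            x += 1 if byte & 1 else -1
            y += 1 if byte & 2 else -1
            x = max(0, min(WIDTH - 1, x))   # stay inside the grid
            y = max(0, min(HEIGHT - 1, y))
            if grid[y][x] < len(SYMBOLS) - 3:
                grid[y][x] += 1
            byte >>= 2
    grid[HEIGHT // 2][WIDTH // 2] = len(SYMBOLS) - 2  # mark start with 'S'
    grid[y][x] = len(SYMBOLS) - 1                     # mark end with 'E'
    return "\n".join("".join(SYMBOLS[c] for c in row) for row in grid)

print(random_art(b"example key material"))
```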

No. Cryptographic hashes, as opposed to any other type of hashing, of which there are many (error correction checksums, etc.), are specifically designed to make a predictable change in input string result in a very unpredictable change in hash output.

If you are not doing anything special to filter the images, it's not terribly difficult to find a duplicate. If Alice is concerned about her security, she would do well to check every bit of her fingerprint twice. If Alice is my grandmother, on the other hand, I would be lucky if she even glances at the fingerprint at all, much less verifies it. In short, the point of Vash is to augment existing security mechanisms to make them more accessible to an audience with less understanding of public key cryptog

No idea how useful it is going to be. One application I can see is feedback whether you typed in the right password/passphrase before it gets stored (to prevent giving one system the password for a different system). A second one is giving feedback about password/passphrase correctness, again before processing.

I did a couple of things like this back in the mid-90's. One used iterated fractals. I think the original idea was by Ian Goldberg, and I added the coloring.

http://www.tastyrabbit.net/visprint/

But I wasn't satisfied by the fact that lots of different hash values produced similar-looking images, so I also cooked up one that had a guarantee that a single-bit change in the hash led to at least a single-bit difference in the image, and came up with these snowflakes:
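The simplest way to get that kind of guarantee is to make the rendering one-to-one, so every hash bit owns a pixel. The sketch below is only an illustration of that property, not the snowflake algorithm itself; the function names are made up for the example.

```python
import hashlib

def hash_to_bitmap(digest: bytes, width: int = 16) -> list:
    """Render each bit of the digest as exactly one pixel ('#' or '.')."""
    bits = "".join(f"{byte:08b}" for byte in digest)
    pixels = bits.replace("1", "#").replace("0", ".")
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

def pixel_difference(a: list, b: list) -> int:
    """Count the pixels that differ between two bitmaps of the same shape."""
    return sum(p != q for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))

digest = hashlib.sha256(b"example").digest()
flipped = bytes([digest[0] ^ 0x80]) + digest[1:]  # flip a single hash bit
print(pixel_difference(hash_to_bitmap(digest), hash_to_bitmap(flipped)))
```

A raw bitmap like this is hard for people to remember, which is presumably why the snowflakes add structure on top while keeping the mapping one-to-one.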

http://members.shaw.ca/dlakwi/snowflake/snowflake.html

Could be this is a better and slicker implementation than any of this stuff, but the underlying ideas are not quite new.

From a quick look at the example hash images, it looks like the code is just randomly choosing placement, coloring and alpha levels of predefined graphic elements. For instance, almost every image I saw had an image of a flower-like object.

While this does make for unique and more pleasing-to-the-eye images, I doubt that humans would feel confident in picking out their unique hash among similar others. The graphical elements themselves would have to be generated via an algorithm for the images to feel truly unique ("feel" being determined by the limitations of human visual processing and pattern recognition abilities).

One of the potential uses listed on the Vash FAQ is to recognize changes in cryptographic keys for security purposes. I don't know enough about how the code generates the images to know whether a minor change in the key would generate a completely different picture, or merely move over the flower a little to the left and change the red to a bit lighter hue. If the latter, most would be hard-pressed to spot any difference at a quick glance.

Perhaps having the algorithm also add a unique animation sequence would help make these visual representations more identifiable to users. If a flower's rotation suddenly goes from 6 RPM to 60 RPM, that would be a much quicker tipoff that something has changed.

After playing around with the "try it yourself" input, it does seem that the generated images differ quite drastically from small changes in the source text. So monitoring changes in keys seems a plausible use.

Individual bit flips will lead to wildly different images.
What's actually going on behind the curtain is: the input data gets run through SHA512, which seeds a Mersenne Twister, which drives a guided random walk over a collection of input nodes to build a tree. Random elements (like position and color) are only a very small part of the algorithm. However, as you note, we really do need more terminal elements, and that is one of the main things I want to do for version 2.
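A toy version of that pipeline, to show the shape of it: SHA-512 of the input seeds a Mersenne Twister (CPython's `random.Random` is one), which then drives a weighted random walk that grows an expression tree. The node names and weights below are invented for the sketch and are not Vash's actual node set.

```python
import hashlib
import random

# Hypothetical node types and frequencies, made up for this sketch.
OPERATORS = ["add", "multiply", "sin", "blend"]
OPERATOR_WEIGHTS = [3, 2, 2, 1]
TERMINALS = ["x", "y", "constant", "flower"]

def build_tree(rng: random.Random, depth: int):
    """Grow a tree: stop at a terminal, or pick a weighted operator and recurse."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choices(OPERATORS, weights=OPERATOR_WEIGHTS, k=1)[0]
    arity = 1 if op == "sin" else 2
    children = [build_tree(rng, depth - 1) for _ in range(arity)]
    return (op, *children)

def visual_hash_tree(data: bytes, depth: int = 4):
    """data -> SHA-512 -> PRNG seed -> deterministic tree."""
    seed = int.from_bytes(hashlib.sha512(data).digest(), "big")
    rng = random.Random(seed)  # Mersenne Twister in CPython
    return build_tree(rng, depth)

print(visual_hash_tree(b"hello"))
print(visual_hash_tree(b"hellp"))
```

Rendering the tree into pixels is a separate step that this sketch leaves out.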

Random elements (like position and color) are only a very small part of the algorithm.

Tell me more of these non-deterministic "random" elements. How exactly can you produce a deterministic resulting image?

Might I suggest you use pseudo-random elements instead? Additionally, I suggest that you take a look at turtle graphics [wikipedia.org] -- You can then easily produce distinct shapes with wildly different "spiro-graph like" images... [wikipedia.org]

Finally, I would suggest creating many significant improvements and then stopping. You know what makes Cryptographic Hashing functions useful? The fact that, given the

Finally, I would suggest creating many significant improvements and then stopping. You know what makes Cryptographic Hashing functions useful? The fact that, given the same inputs, version 0.9alpha and version 99-nonnillion point oh of a SHA-512 implementation will always generate the same results.

This is the purpose of the algorithm string. The same algorithm string and data pairing will produce the same images, regardless of Vash's version.

However, as you note, we really do need more terminal elements, and that is one of the main things I want to do for version 2.

o_O Stop.

I'm sorry, perhaps I took the above statement to mean you would, in fact, be changing the output of Vash in a future version...
Let's face it -- The visuals do need to be improved, do so significantly, then stop.

Perhaps I'm totally off base here, and the currently released version of Vash isn't actually considered a version that your future versions of Vash will have their output compared against... (In which case I'd like to apologize for our discrepancy in versioning terminologies)

If you are building a system that uses hashes that cannot upgrade cleanly when the hash function it is using is compromised, then your system has a serious design problem. Our documentation notes that you should store the algorithm identifier alongside the hash because not doing so is simply wrong, no matter what hash function you are using, visual or otherwise.
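Storing the identifier next to the hash can be as simple as the sketch below. It uses Python `hashlib` algorithm names and a `$` separator as stand-ins; Vash's own algorithm strings and storage format are not shown in this thread, so treat the record layout as an assumption.

```python
import hashlib
import hmac

def make_record(data: bytes, algorithm: str = "sha512") -> str:
    """Return 'algorithm$hexdigest' so the hash can be upgraded later."""
    digest = hashlib.new(algorithm, data).hexdigest()
    return f"{algorithm}${digest}"

def verify_record(data: bytes, record: str) -> bool:
    """Recompute with whatever algorithm the record names, then compare."""
    algorithm, _, expected = record.partition("$")
    actual = hashlib.new(algorithm, data).hexdigest()
    return hmac.compare_digest(actual, expected)

old = make_record(b"my key", "sha256")   # written by an older deployment
new = make_record(b"my key", "sha512")   # written after an upgrade
print(verify_record(b"my key", old), verify_record(b"my key", new))
```

Old records keep verifying with their original algorithm while new ones use the replacement, which is the clean upgrade path being described.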

Perhaps having the algorithm also add a unique animation sequence would help make these visual representations more identifiable to users. If a flower's rotation suddenly goes from 6 RPM to 60 RPM, that would be a much quicker tipoff that something has changed.

I agree the images seem unremarkable and not very memorable. How about using the hash to walk the space of facial parameters, generating character faces instead of curves.

It's amazing how many Miis one can recognize and remember. Use 2 different hashes and generate a male/female pair.

But that's why it's using a cryptographic hash first (SHA-512). The process is text -> SHA-512 hash -> image. Even if the image generation algorithm might only move the flower a little to the left if you change one bit of the hash, it doesn't matter. Because if you change one bit of the text, the hash will completely change, and so will the image. There is no computationally feasible way to find an input whose hash differs from the original by just one bit.
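This is easy to check for yourself. The sketch below flips one bit of an arbitrary sample text and counts how many of the 512 SHA-512 output bits change; the sample string is just a placeholder.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Number of differing bits between the SHA-512 digests of a and b."""
    ha = int.from_bytes(hashlib.sha512(a).digest(), "big")
    hb = int.from_bytes(hashlib.sha512(b).digest(), "big")
    return bin(ha ^ hb).count("1")

original = b"correct horse battery staple"
flipped = bytes([original[0] ^ 0x01]) + original[1:]  # change a single input bit
print(bit_difference(original, flipped), "of 512 bits differ")
```

For a well-behaved hash you would expect roughly half the bits to change, so whatever the image generator does downstream starts from an essentially unrelated value.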

Vash makes extensive use of structure, intensity, and position in its image generation. Despite its visually striking and distinctive impact, color plays only a small role in differentiating between Vash images.

I'll try to put together some specific thoughts on *why* I think VisualIDs did it better, and what the issues with Vash are, in the morning. For the time being, I guess I'll just throw that out there as a conversation-seed and let people Google it....

Part of it is that, going by the examples, this statement in the Vash FAQ is just flat-out wrong:

How does Vash work for color blind or other visually impaired individuals?

Despite its visually striking and distinctive impact, color plays only a small role in differentiating between Vash images.

Rather, it shows that the intent is right but the execution has failed: that no two images are differentiated only by being coloured differently is good, but that the shapes composing a given image are defined ent