Posted
by
Soulskill
on Friday June 27, 2014 @08:31AM
from the grab-your-binoculars-and-go-code-watching dept.

An anonymous reader writes "Many people reading this site probably have a functional understanding of how algorithms work. But whether you know algorithms down to highly mathematical abstractions or simply as a fuzzy series of steps that transform input into output, it can be helpful to visualize what's going on under the hood. That's what Mike Bostock has done in a new article. He walks through algorithms for sampling, shuffling, and maze generation, using beautiful and fascinating visualizations to show how each algorithm works and how it differs from other options.

He says, "I find watching algorithms endlessly fascinating, even mesmerizing. Particularly so when randomness is involved. ... Being able to see what your code is doing can boost productivity. Visualization does not supplant the need for tests, but tests are useful primarily for detecting failure and not explaining it. Visualization can also discover unexpected behavior in your implementation, even when the output looks correct. ...Even if you just want to learn for yourself, visualization can be a great way to gain deep understanding. Teaching is one of the most effective ways of learning, and implementing a visualization is like teaching yourself."
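To make the idea concrete, here's a minimal Python sketch (not from the article, which builds its visualizations in D3/JavaScript) that instruments a Fisher-Yates shuffle so that each swap yields a snapshot, i.e. one "frame" an animation could render:

```python
import random

def fisher_yates_trace(items, seed=42):
    """Fisher-Yates shuffle instrumented for visualization:
    yields a snapshot of the array after every swap."""
    rng = random.Random(seed)
    a = list(items)
    for i in range(len(a) - 1, 0, -1):
        j = rng.randrange(i + 1)  # pick from the not-yet-fixed prefix a[0..i]
        a[i], a[j] = a[j], a[i]
        yield list(a)

frames = list(fisher_yates_trace(range(8)))
print(len(frames))  # 7 frames: one per swap for an 8-element array
```

Consecutive frames differ by at most one swap, which is exactly what makes the animated version easy to follow.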

It's ironic you post this on this particular article. It highlights how the placement of receptors in your eye affects how you see the world. What makes you think your own built-in bias (apparently culture-jamming) is any less than anyone else's?

I drive many other programmers batty because when they ask me for help, the first thing I do is "survey the scene", the code surrounding their point of interest, rather than listen to anything they think or *know* about what is happening. Once I have my bearings, a

A group I was working with had this strange phenomenon where their windowed machine learning algorithm would just crap out on certain training sets.

A few weeks later, I was presenting my results and casually mentioned that, hey, the dataset I got is just outright missing 20% of the data, but there's still enough to illustrate the results, next slide. One of the leads asks, "Wait. Why didn't anyone notice this?"

I had assumed that it was just my dataset. Nope, turns out that the data just wasn't there at all (t
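For what it's worth, a basic completeness check would have surfaced a gap like that immediately. A hypothetical sketch (the keys and records here are made up for illustration, not from the anecdote):

```python
def missing_fraction(expected_keys, records):
    """Fraction of expected keys with no record at all.

    expected_keys: the full set of identifiers (e.g. time slots)
    the dataset is supposed to cover; records: mapping key -> row.
    """
    missing = [k for k in expected_keys if k not in records]
    return len(missing) / len(expected_keys)

# Toy data: 10 expected hourly slots, 2 absent entirely.
expected = list(range(10))
records = {k: "row" for k in expected if k not in (3, 7)}
print(missing_fraction(expected, records))  # 0.2
```

Run against each incoming training set, a one-liner like this turns "nobody noticed for weeks" into a failed sanity check on day one.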

I'm sure this is fun cool stuff to play with but... I'm pretty sure my imagination is still better than what a computer screen can show me. "Playing computer" is one of the first practices most new programmers learn, and if you're good at it, it is one of the most powerful tools in your arsenal.

Here's hoping "kids these days" don't skip out on the importance of a programmer's imagination in favor of these newfangled tools.

Actually, after reading the article, I'd call what he's doing extremely good basic engineering and model design/view. It's very cool for the problems he presents, but looks like a ton of work. To be generally useful, it seems he'd have to come up with rules of thumb and generalizations for what it is typically important to see/understand in a given algorithm and a way to identify *what* to model/visualize that isn't completely subjective.

I think the overall point he's making is that visualizing an algorithm's behavior can offer us better insight, faster, vs. just looking at our code and our error logs. I'm sure there are ubermensch programmers out there that never have their programs exhibit unexpected behavior, and always understand exactly why a test fails, but I'm not one of them.

I encountered this firsthand when I spent a couple of days trying to write a simple algorithm to detect clipping on an oscilloscope output. We had a secondary

> Once you get familiar with the tools to make these kinds of visualizations, it can become very straightforward to develop one for your specific use case.

That's the thing, we need a Visualization for Understanding 101 class in comp-sci or something similar.

I guess I had scientific modeling in physics as part of EE, but over half the focus was on how to gather and deal with the physicalities of real-world data, which isn't so important when you're modeling something that lives inside a computer to begin with.

Note: I thought it was obvious reading digital output as analog (or merely hooking together input to output on two sensitive instruments) is always going to cause a lot of artifacts and distortion. You don't chain 2 microscopes together and expect to get twice the magnification with no problems...

Maybe my description wasn't clear. These were two oscilloscopes, reading from the same source in parallel. One scope was looking at a high resolution, set to trigger on anything above noise, but with something like a 1V maximum amplitude (hence clipping at 1V). The second was set to trigger at anything over 1V, and had a much larger view window, so there was no data lost to clipping.

The motivation here was to get detail data while simultaneously making sure to capture the full amplitude of big signals. N
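A rough sketch of the kind of clipping detector described above, assuming clipping shows up as runs of consecutive samples pinned at the rail voltage (the 1 V rail and minimum run length are illustrative, not from the original setup):

```python
def clipped_runs(samples, rail, min_run=3):
    """Return (start, length) for each run of samples pinned at the rail.

    A scope that clips at +/-rail produces flat-topped runs at that
    value; brief single-sample touches are ignored via min_run.
    """
    runs, start = [], None
    for i, s in enumerate(samples):
        if abs(s) >= rail:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs

# A burst hard-limited at 1.0 V: samples 2-5 sit on the rail.
sig = [0.0, 0.5, 1.0, 1.0, 1.0, 1.0, 0.6, 0.1, -0.4, -1.0, -1.0, -0.5]
print(clipped_runs(sig, rail=1.0))  # [(2, 4)]
```

The hard part in practice is exactly what a visualization helps with: picking `rail` and `min_run` so noise riding near the rail doesn't produce false positives.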

I've always visualized what's going on, this is how I do everything. Doesn't everyone program this way? Think about something for a while, building up the model in your head, then visualize the interactions among the parts. If the problem is too complex to visualize, then I simplify and add abstractions until I can visualize it. This iteration naturally creates simplified abstract layers in my code.

I debug most of my code this way also. When I'm internally visualizing stuff, I lose track of what's going on around me. I wouldn't be surprised if a brainscan would show activity in my visual cortex.

First the light is, and then I see the yarns and the strings that link them. Then the strings start vibrating and I can hear them talk to each other. Then I hear the races in the polyphony and I reach my arms out to the strings. I sip from the source and smell the algorithms. It is only then that the notes come to my fingers.

I can't speak for anyone else, but I'm definitely this way. When I get to work in the morning I return to the massive subjective mechano-city in my head where the repository lives and plug it in to the objective reality that is Mercurial. Much of bug-fixing is ferreting out the differences between the two. Reading code is removing the fog of the unknown in my head, and writing is creating new streets, buildings, and even neighborhoods. (Well... You could hardly call it something so terrestrial, more like

I'm not surprised, but it's interesting how the visualization you describe doesn't quite match how I "see" them. It would be a cool topic to categorize the different types of visualizations and find correlations with who knows what.

Assuming I'm taking this the correct way, I find there to also be a "fog" of the sort when I work with stuff. I would describe my visualization as a "living" yet static picture. Different "parts" of the picture interact with other near-by parts, yet nothing actually

Holy crap, you just described in perfect detail the way my brain works. I've never really been able to describe it, but you just did it perfectly... even right down to your past experiences in school. I particularly like how you describe it as a living, yet static picture. The best I've ever been able to describe it is: "I have a lot of RAM, but a slow CPU." Someone should make a Myers-Briggs-like classification specifically for engineers (predominantly INTP/INTJ types?). There are clearly some common t

I identify with your school experiences, although maybe not in as extreme a way; I could generally muddle through rote memorization to a certain degree, but my retention was terrible. The understanding was left behind when the specifics faded away...

Anyhow, for me, the "picture" exists, but it's more tactile than visual. There are visual aspects, but it's not how I process most of the information. Loops are spinning wheels when they don't have a clear exit condition, and feel like unrolled spirals when the

> when the animations end, the resulting mazes are difficult to distinguish from each other. The animations are useful for showing how the algorithm works, but fail to reveal the resulting tree structure.

Sure, I visualise how algorithms work, otherwise I can't code them. But the difference in maze generators struck me. Because, as quoted, they're difficult to distinguish from each other.

Other things, such as sorting algorithms, are fairly obvious though... but still nice to see, again.
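For reference, one of the maze generators the article animates (randomized depth-first search, a.k.a. the recursive backtracker) can be sketched in a few lines. This is a generic implementation, not Bostock's code; the point the quote makes is that the *tree* it returns looks much like any other generator's once the animation stops:

```python
import random

def dfs_maze(width, height, seed=1):
    """Randomized depth-first search maze on a width x height grid.

    Returns the set of carved passages as frozensets of two adjacent
    cells. The passages form a spanning tree of the grid, so any two
    cells are connected by exactly one path.
    """
    rng = random.Random(seed)
    visited = {(0, 0)}
    stack = [(0, 0)]
    passages = set()
    while stack:
        x, y = stack[-1]
        options = [(x + dx, y + dy)
                   for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                   if 0 <= x + dx < width and 0 <= y + dy < height
                   and (x + dx, y + dy) not in visited]
        if options:
            nxt = rng.choice(options)       # carve toward a random unvisited neighbor
            passages.add(frozenset({(x, y), nxt}))
            visited.add(nxt)
            stack.append(nxt)
        else:
            stack.pop()                     # dead end: backtrack
    return passages

maze = dfs_maze(8, 8)
print(len(maze))  # a spanning tree of 64 cells always has 63 passages
```

Any seed yields exactly n-1 passages for n cells; what differs between algorithms is the *shape* of the tree (DFS tends toward long winding corridors), which is precisely what the static end state hides and the animation reveals.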

That's not what the article is about. It's about generating graphical visualizations of what your code is doing on a computer screen, not generating internal visualizations of what you think your code is doing in your head. In fact, I'd guess that most of the value of the former comes from noticing the ways in which it's different from the latter.

Hmm... Get much performance out of your systems that way? Procedural methods have a time and place, and have given us most of what we have now. Functional methods have given us... a pale, slow, anemic cousin of what procedural has done? (That can't fail, in theory, much like the Titanic...)

Honestly, garbage collection did way more for the industry than any other sea-change... Here, was some scut work we could freely let the computer do completely for us. (At least when any kind of hard time-keeping does

Like many things in physics, it's really a discrete signal at a rate so high as to be continuous for all practical purposes. If you could get your sampling rate ridiculous enough, you'd start to detect individual photons.

I don't think he meant the existence of a photon at a certain time, but to actually be flooded with photons, like from a spot light, and still be able to distinguish the individual photons.

Someplace, like MIT or something, was playing with a 1-trillion-FPS camera, and by processing the individual photons, they could see around corners to a certain degree. They could use non-polished surfaces like a mirror.

This is interesting work, and very well presented no doubt. But it shows why your PhD guru is making you spend seemingly unreasonable time doing literature surveys. At first glance this work seems to be very close to solution adaptive meshing techniques used in computational physics.

There's no claim to novelty for any of the algorithms. If you scroll down you'll also see imagery of quicksort: stop the presses, this was invented in 1960. The point is the graphical presentation of existing algorithms. This is all made very clear in the prose, and in the summary, and in the title. So well done Captain Blowhard, top marks for your knowledge of a somewhat related domain (although what you're talking about really has fuck all to do with Poisson-disc distributions) but minus several milli
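For anyone curious, the quicksort imagery the parent mentions amounts to snapshotting the array after each partition step. A minimal sketch (generic Lomuto-partition quicksort, not the article's code, which renders its frames in D3):

```python
def quicksort_frames(items):
    """In-place quicksort (Lomuto partition) that records a snapshot of
    the whole array after every partition -- the frames a visualization
    would draw."""
    a = list(items)
    frames = []

    def partition(lo, hi):
        pivot = a[hi]
        i = lo
        for j in range(lo, hi):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]   # pivot lands in its final position
        frames.append(list(a))
        return i

    def sort(lo, hi):
        if lo < hi:
            p = partition(lo, hi)
            sort(lo, p - 1)
            sort(p + 1, hi)

    sort(0, len(a) - 1)
    return a, frames

out, frames = quicksort_frames([5, 3, 8, 1, 9, 2])
print(out)  # [1, 2, 3, 5, 8, 9]
```

The algorithm is 1960 vintage, sure; the frames are what's new, and stepping through them shows the pivots freezing into place one by one in a way the final sorted array can't.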