Author: Zbynek Stara

Interactions with the final version of my final project are presented below:

Zen is a zen-garden simulation that invites users to slow down and relax during the hectic time of final exams and projects. It does so by presenting an outline of the user in a field of flowers (thank you Lama for making such beautiful plants!) and allowing one to wade through. If one moves too much, the color changes from green to yellow to red, which has consequences for the growth of the plants in the garden.

Yellow state – “slow down”

Red state – “take a break”

In the red state, the plants do not grow, and touching them with one’s hands or feet causes them to wither. It is only when the user slows down and reaches the green state that they get to experience the reward – planting their own flowers for everyone to see!

If one stays calm for a short while, pink lotus plants spawn in the regions occupied by what the Kinect detector sees as people. These can be picked up by people’s hands, and planted in the garden.

If one stays calm for a slightly longer while, a purple plant sprouts in one of the person’s hands; the person can plant those as well! As can be seen from the video, these plants are easier to plant (because the person does not need to actively pick up the plant with their hand).

I am very happy that I was able to implement most of the changes requested during user testing! Planting was a big part of the challenge, and I had to rework the majority of the code to make it possible; after making it happen for the purple plants however, adding the lotus plants was very easy.

Additional signifiers were added to tell people what they should try doing with their plants – this helped explain the interaction better, but I am afraid people still did not have enough patience to wait and see what happens. Also, markings were added to the floor that showed people precisely where they should stand in order to be seen by the Kinect camera (blue area), and precisely where they should plant their flowers (green area with a plant symbol). This proved fairly intuitive.

Unfortunately, I did not have enough time to implement the wind-like effects that would bend the flowers with people’s movement. To make the interaction more intuitive, I added the functionality to shrink flowers when they come in contact with a user’s hand. This is not precisely an intuitively expected interaction, but it did engage the users and showed them that there is something that can be done with the visualization, that it is not just a static arrangement of flowers.

I was struggling with interference from people behind my detection area. The other visualization was too close and the Kinect was mistakenly detecting those users as my users. This was a problem because the visualization beyond my detection area had a very long interaction – so people, once detected, would not be un-detected unless the Kinect was forced to forget them (by me blocking their body with my body). The problem was alleviated a little by tweaks to the code that ensured that only one person would be tracked by the Kinect at a time (which was difficult because the library code I was using did not work as expected), but the visualization still required constant surveillance on my part – which is obviously not ideal. I realized too late that I should have requested a blanket or a screen to prevent background people from interfering with the visualization…

Nevertheless, I am pleased with the end result, I think people liked it much more than they did during user testing, and I think they appreciated the ability to leave their mark for others to see in the visualization.

The code is presented below. It is the longest I have written for this class, surpassing even the CM Visualizations project.

I asked two people who were not familiar with my final project to test it and to offer feedback on what worked and what did not.

When it comes to my first tester, I received the following feedback:

The first thing he tried to do (not in the video above) was to run his hand through the flowers, expecting them to react (e.g. bending). Nothing happened, which left him disappointed.

He wanted more interaction with the hand flowers – he tried to plant them in the ground, but nothing happened.

He also suggested removing the stems of the hand flowers, to make them look more natural.

He wanted more body flower colors, to engage the users (“Let’s guess which color will grow next!”) while they are in the calm state.

He said that he was waiting for something to happen; and that the waiting was good with a chair, but frustrating when standing.

He also remarked that the Kinect silhouette looks ugly and should be smoothed.

Additionally, at first, he stood too close to the screen – which meant that he missed out on the wandering-through-the-field interaction.

My second tester had additional feedback:

Her first attempted interaction was to run her hand through the background flowers.

She also tried to plant the hand flowers into the ground.

She attempted to pick up one of the body flowers with her hand, and move it elsewhere.

Like the first tester, she did not notice the stomping interaction. She also stood too close to the TV.

Unlike the first tester, she said that the color scheme is nice, and that no new colors need to be added to the body flowers.

She suggested keeping score – e.g. how many body flowers did one manage to grow, how long one was calm, how many flowers were planted.

To address the concerns and suggestions, I plan to do the following for Wednesday:

Mark an interaction area on the ground to signal to users how far away from the TV they should stand.

Implement the planting interaction – this is relatively straightforward, and was one of the top expected interactions.

Allow body flowers to be picked up and planted – similar to the first point, it is simple to implement and would add a playful element to the project, without detracting from the narrative of growth-from-calmness.

If I have time, implement the interaction between the hands and the background flowers – this is the number one expected interaction, but it is more difficult to implement. Also, it encourages people to move around more, which might be distracting from the purpose of the project.

I decided not to alter the color scheme and shape of the flowers since one of my testers liked it; I also decided against keeping score for people’s calmness, because it contradicts the calm-down spirit of the project. Smoothing out the silhouette was deemed too difficult and not too necessary for the quality of the visualization (especially since people’s feet will be hidden by the ground flowers).

The Crystallic visualization transforms live video frames into a grid of interconnected areas of distinct colors.

Input frames are sampled at every nth pixel, in both the x and y dimensions (where n is a pre-set constant number, for example 7). The selected pixel’s color is compared to a list of 26 colors; the closest color among the options is identified. Then, the algorithm considers the neighbors of the sampled pixel (where neighbors are n pixels away from the sampled pixel in each dimension). If the neighbor has the same identified color, a line is drawn between the two pixels. This produces white-space boundaries between the distinct color bands identified in the frame.
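The sampling loop described above can be sketched as follows. This is a hedged Python translation of the logic – my actual implementation is in Processing, and the names `nearest_color` and `crystallic_links` are illustrative, not taken from my code:

```python
# Illustrative Python sketch of the Crystallic sampling loop.

def nearest_color(pixel, palette):
    """Return the palette color with the smallest squared RGB distance."""
    return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, c)))

def crystallic_links(frame, width, height, palette, n=7):
    """Sample every nth pixel and link same-colored neighbors.

    frame maps (x, y) -> (r, g, b) for the sampled coordinates.
    Returns the ((x1, y1), (x2, y2)) segments to draw; the gaps
    between differently-colored regions become the white-space
    boundaries between color bands.
    """
    links = []
    for y in range(0, height, n):
        for x in range(0, width, n):
            color = nearest_color(frame[(x, y)], palette)
            # Only look right and down, so each neighbor pair is
            # considered once (a subset of the clockwiseExtraX/Y offsets).
            for dx, dy in ((n, 0), (0, n)):
                neighbor = (x + dx, y + dy)
                if neighbor in frame and nearest_color(frame[neighbor], palette) == color:
                    links.append(((x, y), neighbor))
    return links
```

In the real sketch, the neighbor offsets come from the clockwiseExtraX and clockwiseExtraY arrays, which is what makes the patterns below possible.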

It is possible to change the look of the visualization by selecting a different set of clockwiseExtraX and clockwiseExtraY. The different values in these two arrays represent the different neighbors to consider. By removing some values, the visualization considers fewer neighbors.

The visualization can thus be modified to have a square pattern

a slanted-squares pattern

or a drawing-like, diagonal-line pattern

Furthermore, changing the sampling distance changes the granularity of the visualization. This produces a more modern-art look:

However, reducing the sampling distance slows down the visualization; I determined that the value should not be reduced below 7 if near-real-time responsiveness is desired.

An additional problem concerned the choice of colors in the color palette. Originally, the visualization used only 9 colors – all the combinations of 0 v. 255 for RGB values. This led to visualizations that featured too many flat surfaces; the banding effect was too extreme. To increase the variety of colors, the set of HTML/CSS named colors was considered instead. However, since this palette contrasted the “extreme” colors (using only 0 and 255 in RGB) with two non-extreme colors (orange and rebeccaPurple), the two non-extreme colors proved to be the closest match for too many sampled colors. The result was an over-abundance of purple in the output.

A solution was to return to the constructed palette, increasing the number of different combinations to 27 by adding a third RGB color level. Thus, again, each palette color should have an equal slice of the sampled color space. This was still not optimal, however:

There was an overabundance of gray in the output visualization in bad light conditions (which means, basically, all the time), causing the person’s face to blend with the background. Removing the gray color from the palette proved to be an appropriate solution to the problem; thus, the final number of colors in the palette was reduced to 26.
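The final palette can be sketched in a few lines of Python: three levels per RGB channel give 27 combinations, and dropping the all-middle gray leaves 26. The exact middle level (127 here) is my assumption – the post only says a third RGB color level was added:

```python
# Sketch of the 26-color palette: 3 levels per channel minus gray.
from itertools import product

LEVELS = (0, 127, 255)  # the middle level is assumed, not documented

palette = [rgb for rgb in product(LEVELS, repeat=3) if rgb != (127, 127, 127)]
```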

I liked Golan Levin’s overview of the techniques for computer vision. His exposition allowed me to look at the complex problem with new eyes, and made me realize that simple algorithms may be used for a complex effect – frame differencing, background subtraction, color tracking, and thresholding; all of which we have mentioned in class. At the same time, I liked Levin’s mention of the state-of-the-art techniques, and everything in between. I felt like that provided perspective to the field and showed me that despite its accessibility, computer vision can also answer some complicated questions. (Consider the question of gaze direction detection – not only does it require tracking of one’s pupils; the orientation of the face in 3D space is also required, as is some notion of depth in the field of view.)

I learned the most from Levin’s emphasis on the importance of physical conditions when using computer vision. His insistence that the assumptions of the different algorithms be taken into account when designing the interactive art piece made me realize how prevalent these problems are. At the same time, it illustrated how impossible-to-solve software questions (e.g. how can I know whether this dark spot in the frame is a person’s hair or a black area on the background wall that just happens to be next to the person’s head?) can be solved by preparation of the scene (e.g. perhaps just use a green screen behind the person. Or the person can be illuminated by sharp light and stand in front of a black wall.).

I have one complaint about the article – despite all of its talk about bringing a fresh, artistic set of perspectives to computer vision, four out of the six examples revolve around surveillance. Although it is an important topic – and perhaps very natural, given the fact that computer vision systems must necessarily use a video-recording device – I would have appreciated being exposed to more variety, to get my creativity going in more directions than just surveillance.

When it comes to computing, I have to say, I like IM more than regular CS. It is so much more satisfying; it is literally enough for me to make an LED light up to make me happy! It is so much easier to make me feel like I actually accomplished something. I actually feel happy when playing with wires and LEDs and buttons. I would not claim that the class has made me a better person, but I think that the experience I got in this class will be very important for me in the future, especially since data visualization is a very hot topic right now.

For this week’s assignment, I decided to go back to Assignment 9 and make a custom controller that allows the user to control how the visualization looks.

I wanted the controller to be easy/ergonomic to use – I used flat cables so that the user does not need to weave their hand through to reach a button; I also bent the resistors down so they would be out of the way.

The nine function buttons are color-coded as much as possible given the limitation of four colors available in the lab. There are three paired controls (region/center region toggle – green, outlines/center outline – red, stats/speed stats toggle – yellow) and three individual buttons (links, person, hitpoints display modes switches – each with a different color).

In addition, the controller has four potentiometers in the upper left corner. These allow the user to change the four dynamic variables that affect the visualization at runtime: number of people, number of colors, speed of simulation, and the amount of “ghosts” (object traces).

The dynamic changing of ghosting is the biggest change from the original program – there used to be four distinct settings with set amounts of transparency applied. Now, the user can use the fourth potentiometer to choose from ten different settings, for a more customized visualization.

It is important to note that the original functionality – using the keyboard to interact with the visualization – has been preserved. Thus, one can still use number keys to change settings, and function keys to change the dynamic variables. Note, however, that the spacebar (refresh the board) and the letter keys (add a new person at the position of the mouse cursor) have not been replicated on the controller.

I liked the point the authors made about the difference between Waze and older GPS navigation systems. Both of those technologies are relatively new (the 2000s vs. the 2010s) and yet they are worlds apart. That is because of the big data revolution made possible by the proliferation of Internet-connected devices. Not only can we know precisely where we are in the world (GPS) and what roads to take (GPS navigation), we can now pool every other user’s position and use this to make an accurate map that reflects current traffic (Waze).

The authors argue that it is precisely that third element that drives the digital revolution. There is a good reason why there is a flood of startups being founded right now in the technology market, and it is not only because of the lack of barriers to entry. No – it is because just about every single market and service can be disrupted by pooling data and knowledge from everyone else.

This trend also shows no signs of stopping. It made a big impression on me when the authors pointed out that we are just one step away from running out of SI prefixes when it comes to the amount of information exchanged on the internet, and how quickly we got there. (Watson’s database, after all, takes up “only” 4 terabytes – I have 25% of that in my laptop right now! And that is 200 million documents, including the whole Wikipedia!)

It is incredibly exciting to live in this era of infinite possibility! Just think of all the hard problems computer scientists were trying to solve for the past 60 years that became possible thanks to this big data revolution: Speech recognition with Siri, image recognition, beating the best human in Go, self-driving cars. Video recognition is fast becoming a reality too, and we are talking about a problem that makes speech recognition seem like a piece of cake. When put into this sort of context, Waze does not actually seem all that exciting anymore – that is how amazing the technology has become! With these sorts of tools becoming a reality soon, who knows what kind of products people will build that seem like sci-fi to us right now…

Capulets and Montegues is a game in which several rival families struggle to control a city. A wealth of additional options allows the game to be transformed into an interactive art piece, too – all within a few key presses! Oof though; this assignment turned out to be much bigger than I expected, but I am very pleased with the result.

The basic game setup looks like this:

Settings 110112113 (more about those later)

There are 10 groups of people (class Person in the code) in the city, identified by the color of their clothes, starting out from random positions. They have a few desires in life – they do not want to be too close to the edge of the map, they do not want to be packed too closely, and they do not want to be near their enemies, but they do want to get as close to their friends as they can while working their way towards the center of the city. The game moves them in accordance with these motivations, according to a set of weighting variables.
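The movement rule boils down to a weighted sum of desire vectors. Here is a hypothetical Python sketch of that step – the desire names, structure, and weights are illustrative, not lifted from the Processing code:

```python
# Illustrative sketch of the weighted-motivation movement step.

def movement_step(pos, desires, weights):
    """Move pos by the weighted sum of the desire vectors.

    desires and weights are dicts keyed by desire name, e.g.
    'toward_center', 'away_from_edge', 'toward_friends',
    'away_from_enemies'. Each desire is a (dx, dy) vector.
    """
    dx = sum(weights[name] * vec[0] for name, vec in desires.items())
    dy = sum(weights[name] * vec[1] for name, vec in desires.items())
    return (pos[0] + dx, pos[1] + dy)
```

Tuning the weights is what gives each motivation more or less pull on a Person's path.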

The region each Person controls is identified by a correspondingly-colored Voronoi cell. The most important district of the city is the central one: the citadel. Holding it gives points to the closest Person around (first number in the top-left stats rectangle); that is why the color of the central region is more saturated and why it is encircled by a wall.

The colored lines between people show interactions: enemies hit each other, while friends give each other a small health boost (the health of each Person is shown by the little number above their head). That is why people tend to hang out in groups in this game! To make it easier to defend, the citadel gives a small extra health boost to its current owner.

To prevent any one group from snowballing out of control, the game has maximum and minimum limits for the number of group members (the second number in the stats rectangle). A balancing algorithm reduces the allowed maximum for the group that is the current leader in score (“Leading” in the fourth column), and reduces it even further for the current holder of the center. On the other hand, the current score loser is allowed to have more members. When a group has more members than its allowed quota (“Over” in the fifth column), all members of the offending group suffer severe health decreases.

As people die (from crossing the population quota, succumbing to enemy attacks, or being squeezed too close to the edge of the screen or to other people), the game respawns them at a random position, potentially as members of another group. If there is a group with fewer members than what its minimum limit dictates (“Under”), it is selected as the recipient of this reborn Person. If all groups are at or above their minimum limits, the algorithm selects the group with the lowest number of points that is not yet maxed out; ties are broken in favor of the group with the fewest members (“Next”).
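The respawn selection rule can be sketched directly from that description. This is a Python sketch with illustrative data structures – the real code works on the Swarm class, not on dicts:

```python
# Sketch of the respawn group-selection rule.

def respawn_group(groups):
    """Pick the group a reborn Person should join.

    groups: list of dicts with keys 'members', 'min', 'max', 'score'.
    Returns the index of the chosen group.
    """
    # First priority: any group below its minimum quota ("Under").
    under = [i for i, g in enumerate(groups) if g["members"] < g["min"]]
    if under:
        return under[0]
    # Otherwise: lowest score among groups not yet at their maximum,
    # breaking ties in favor of the fewest members ("Next").
    eligible = [i for i, g in enumerate(groups) if g["members"] < g["max"]]
    return min(eligible, key=lambda i: (groups[i]["score"], groups[i]["members"]))
```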

There are a few ways to interact with the game, although it can evolve on its own in pleasant ways. The most straightforward one first: pressing the spacebar respawns all people on the map.

The most fun way to interact with the game, though, is to play favorites – when a key is pressed that corresponds to a group (first column in stats), a member of that group is spawned at the mouse cursor’s position (replacing a randomly chosen existing person). Spamming group members at the center can quickly propel that group to score leader status! (Even though the members may quickly despawn if their numbers are over the group’s maximum limit.)

In addition, several keys are used to adjust parameters of the simulation. The -/+ keys change the speed of the simulation (lower right corner of the screen). The </> keys change the number of people in the simulation; meanwhile, the [/] keys adjust the number of groups/colors in the simulation (between 1 and 10).

I am proud that I was able to figure out a way to change those last two variables at runtime – almost everything in my program depends on them staying constant during each simulation step. This is not the case between simulation steps, however. So, I can afford to replace the Swarm class’s personArray with a completely new one with a new number of people or group colors in that time.

Changing the number of colors turned out to be easy. Removing a color respawns any members of that group as a different color. Adding a color did not even require any intervention on my part – the balancing algorithm notices that the added color is underrepresented in the sample and spawns new members automatically.

Adjusting the number of people was more difficult but ultimately also solvable. All people from the old array get copied over during the process, except the ones for whom there is no more room – those are discarded. Conversely, if additional people are necessary, they are spawned as new randomized instances of Person. There need to be at least 10 people in the simulation. There is no explicit upper limit on the number of people, apart from available computer resources (the Voronoi and Delaunay calculations slow the simulation a lot) and screen space (past a certain point, the people start overlapping, despite the built-in repulsion value).
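The resize step itself is simple once it only happens between simulation steps. A Python sketch, with `spawn` as an illustrative stand-in for the randomized Person constructor:

```python
# Sketch of the between-steps resize of personArray: overflow people
# are discarded, missing slots are filled with randomized newcomers.

def resize_people(people, new_count, spawn):
    kept = people[:new_count]
    return kept + [spawn() for _ in range(new_count - len(kept))]
```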

I was not satisfied with these interactions, however. I wanted more. Having put so much work into the game’s core algorithm, I saw that it was held back by the visuals. I started moving away from the original concept by using different links between people. Instead of showing the health effects, I connected neighboring people of the same color with their corresponding Delaunay link:

Settings 1100221113

The people came next. Even though I love the simple humanlike visuals (with random hair color, height, and width!), in visualization, simpler is better. Enter dots:

Settings 1100211113

Let’s face it, though, the numbers are quite ugly. Also the stats rectangles. And while we are making this into an interactive media visualization, why not remove the dots as well? Stunning:

Settings 1100200003

Or keep the hitpoints instead of the lines? I feel like that makes it look like some sort of military map:

Settings 1100001003

That effect is much more pronounced, though, if we also remove the region colors (I still like to highlight the center region though):

Settings 0100001003

Okay, that might have been too conceptual – let’s roll back. What if we used the Delaunay links on their own?

Settings 0100200003

Okay, let’s do one more step back and add the dots back in. We get a map of the night sky! (I love this one.)

Settings 0100210003

All of these visualizations were made possible by the last kind of interactivity: visualization options. Pressing the keys 1234567890 toggles whether and how the different elements of the visualization are displayed.

The last setting was the one that truly revolutionized what is possible to do with the visualization – object trails. It became possible to use the different visual components to construct beautiful interactive art. There is definitely some desktop wallpaper potential!

The code is presented below. There are two classes: Person (data and methods for individual people in the simulation), and Swarm (data and methods for the collection of people). Also notice the wealth of customizable variables – adjusting everything from default settings and options, through min/max member limits, distance from boundary/others, health effects, score effects, to movement weights:

For this assignment, I decided to replicate a simple loop-based algorithm from Computer Graphics & Art, the SNEKAD:

Obviously, the original consists of discrete cells with blocks of horizontal or vertical lines. I made that an integral part of my design with a double for-loop – one for each row, and one for each cell in a row.

The probability that a block has horizontal lines increases with the column index of the cell – the farther right we go, the more horizontal lines we get. I made that a part of the design, as well:

The number of blocks in a row (26) and in a column (36) corresponds to the original graphic. In order to fit this many blocks onto my screen, I had to make the blocks relatively small – 24 pixels to the side. (This also explains the weird dimensions of the window – 624 = 24*26 and 864 = 24*36.)

Unfortunately, that meant that I was not able to approximate the wood texture that seems to be a feature of the original graphic (although it is also possible that it was just an artifact of scanning). It is impossible to assign non-integer line thicknesses via strokeWeight in Processing – and the jump from 1 to 2 is too noticeable to produce a nice pattern.

I had a small issue when trying to match the probability distribution of the original image – while the first column never had any horizontal blocks, the last column always had a few vertical ones. This imbalance was due to the fact that the probability of a block being horizontal depends on the leftmost x value of the block. Thus, while the leftmost column always had a probability of 0 for horizontal blocks, the rightmost column always had a probability below 100. Changing the multiplier to 110 instead of 100 fixed the issue.
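The orientation rule with the 110 multiplier can be sketched as follows. The block size and column count match the dimensions above; the function name is mine, and the real code lives in a Processing sketch:

```python
# Sketch of the block-orientation rule with the 110 multiplier fix.
import random

BLOCK = 24            # pixels per block side
COLS = 26             # blocks per row
WIDTH = BLOCK * COLS  # 624

def is_horizontal(x, rng=random):
    """Decide a block's orientation from its leftmost x coordinate.

    At x = 0 the probability is 0 (always vertical); with the
    multiplier at 110, the rightmost column (x = 600) gets a
    probability above 100, so it is always horizontal.
    """
    return rng.uniform(0, 100) < (x / WIDTH) * 110
```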

I like the fake 3D feeling of the blocks, caused by the blocks’ trailing white space. I considered removing it by tweaking the block size or the spacing, but I realized that its presence makes the experience more interesting; in the end, I decided to keep it.

The code is presented below. Each block is generated with the elem function. The window regenerates on mouse press. Additionally, the program is adjustable, with all important variables presented as constants. This allows the user to tweak the look of the output as they want; I particularly like a version with large spaces between lines:

It has a very different feel from the original graphic – it looks like some sort of a taxonomic hierarchy, a flowchart with a lot of branches. It is amazing how a small change in two variables completely overhauls our perception.

Far from explaining the concept of new media, I feel like Lev Manovich only managed to muddle my understanding of what new media is, what it should be, and what it is not.

I do not think that the definition of new media as “anything that was generated with an aid of a computer” (I am simplifying, of course) captures a sufficient or a necessary condition. I fail to see how “being able to be described with a mathematical function” is so revolutionary a concept, and, nevertheless, physical and analogue objects of old media can be “translated” by finding an appropriate function to describe them. A professor in my J-term class (Wasting Time on the Internet) argued that everything on a computer is text, for example – because everything digital can be changed into a .txt file and read as text. However, I considered that to be such an arbitrary and useless observation. (We could similarly argue that everything digital is a number…) I am not convinced that it matters that a digital image can be translated, modified, or “programmed” more easily than an analogue image. It is still an image, fundamentally! A picture is a picture is a picture. So what?

Conversely, even if we accept the criteria for new media, I fail to see how one can argue that analogue film and photography do not belong in the category that digital films and photography do. New media is not the only thing that can be manipulated algorithmically. One can manipulate analogue media, too – it is just more tedious. Similarly, one can argue that these borderline old media are modular, too – we can take out a still, and copy it, can’t we?

Ultimately, I do not think that the distinction between new and old media should be a binary one, but rather it should be a spectrum. And, anyways, I am not convinced that the difference is so fundamental that we need a distinction, at all!