Homework 1 – Survey

Homework 2 – Two interesting visualizations

Find two visualizations of data from a domain that is of interest to you (health, politics, finance, economics, world demographics, sports, etc.). They can be information presentations (static) or information visualizations (interactive; just take a screenshot to include). For each visualization, do a one-page write-up. Each one-pager should have the visualization (about half the page) and then:

1. What variables (things like population, cost, RBIs, GDP) are represented in the vis?

2. What message(s) is the visualization intended to convey?

3. A critique of each – pros and cons – how “good” is the vis in conveying the message(s) of part 2? This is clearly a subjective judgement; there is no right or wrong answer. I want you to get in the habit of critiquing visualizations.

Bring hardcopy to class. I will ask each of you to share/discuss with neighbors and may ask some of you to show your visualization and share your pros and cons with the class. No T-Square turn-in.

Think of this assignment as a first step in identifying a potential project, which is why I suggest that you be interested in the type of data being portrayed.

Homework 3 – Create a visualization with GapMinder

Use GapMinder to create a visualization. Do NOT use the default data of life expectancy and GDP per capita and population; explore the types of data in the drop downs along each axis and in the drop down in the Size area at lower right, and choose variables of interest to you. (Bubble sizes can be bound to a variety of data, not just population.)

Prepare one page with a screenshot of the visualization AND a bullet list of insights/understandings that can be learned from your visualization. The longer the list, the better. An insight/understanding is NOT equivalent to a data point (Ford sold 3 million cars in 2011) but rather something that comes from seeing multiple data points (all car companies are selling more cars, but Ford has been gaining market share at the expense of GM and Fiat).

Bring hardcopy to class. I will ask each of you to share/discuss with neighbors and may ask some of you to show your visualization to the class. No T-Square turn-in.

Homework 4 – Data Source and Sketch

Locate a data source on the WWW with data on a topic of interest to you – it could be related to what you found for HW2, or something else. Give the URL and a one- or two-sentence description of the data.

Sketch (pencil sketch is fine, no need to be fancy) an Information Presentation to represent the data. It should be more than just a bar chart or pie chart. It should encode at least 4 or 5 variables (possibly but not necessarily using multiple linked views).

As you do this, continue to think in terms of an interesting project, remembering that you will soon need to do a 30-second “elevator pitch” to the entire class about a project idea!

List the variables that are being represented, their data types, and the visual encodings you are using. Also give the number of cases in the data set (assuming it is multivariate; if not, give information about the tree, network, or other more complex data schema) AND the interaction methods you would use to turn the Information Presentation into an InfoVis.

Trouble finding data? Go to the resources tab on this site, and scroll way down. Or, Google “census data” or “economic data” or “xyz data” where xyz is any subject of interest to you, and you will likely find some interesting data!

Bring hardcopy to class (no more than two pages, it all might fit on one page). I will ask each of you to share/discuss with neighbors and may ask some of you to show your visualization to the class. No T-Square turn-in.

Homework 5 – Timeline

Create an Information Presentation (not Visualization) of your life, intended for a potential employer. Include school, work, skills/expertise, and extracurricular activities. You can sketch with pencil on paper, or use Photoshop or a similar computer-based tool.

Write a paragraph describing some ways in which you could make this into an InfoVis.

Submit on T-Square as a pdf, and bring a hardcopy of the visualization (not the paragraph) to class on the due date to share with your neighbors and show to class.

Homework 6

Write a short essay question for the upcoming test AND your answer to the question. No T/F or multiple choice questions! You will earn more points for questions that have to do with understanding/explaining than for regurgitation! (“Name the four data types” is a regurgitation question; “Why do we care about different types of data, like nominal, ordinal etc., and give an example” is an understanding question; “Here is some sample data, how would you visualize it, and justify your choices” is a really good question because it requires synthesis of multiple ideas.) The most creative and challenging questions that require both understanding and synthesis will receive full credit.

If you have a question that is like someone else’s that is already posted, then make up a new question. I will discuss selected questions with the entire class as part of our test review.

Post your question and answer on Piazza in the HW6 folder by 8pm the day before the test review session. After 8pm I’ll add comments to the questions.

One or two of the submitted questions are guaranteed to be on the test!

Homework 7

Use and critique Tableau – an Information Visualization System that does not require programming. This assignment will familiarize you with a full-featured InfoVis system – Tableau – which will be demonstrated in class.

The goals of the assignment are for you to learn the capabilities provided by Tableau (it is one of the best commercial systems), learn the basic visualization methods that it provides and assess its utility in analyzing data.

You can write the report on this homework by yourself, or you can do it with a partner (which I encourage, it will be more fun and you will learn more). If you write with a partner, you will both receive the same grade. You may ask others for help with downloading and figuring out how to use Tableau. The paper and its ideas should be developed by you or by your two-person team.

2. Examine the data sets – Browse several data sets to decide which one to use for the rest of this assignment. Decide on one, and then use the system to explore it further.

3. Develop three interesting questions about the selected data set – put yourself in the shoes of a data analyst, and think about all the different kinds of analysis tasks that a person might want to perform. For instance, someone working with breakfast cereal data might have analysis tasks like:

• Find all the information on Cocoa Pebbles.

• Identify the cereal with the least fat that is also high in fibre.

• What is the distribution of carbohydrates in the cereals?

• Does high fat mean high calories?

• Which of the following three cereals is best for people on a diet?

Do NOT make all of your questions be about correlations or min or max values.

4. Write a report.

Part 1 – List your three questions and answers, along with a screenshot showing the visualization you used to answer each question. One page per question: screenshot and narrative. Each question should be answered with a different visualization, so three different visualizations (and not just different data overlaid on a map as can be done in Gapminder).

Part 2 – Critique the system. What are the system’s strengths and weaknesses? For what kinds of user tasks is the system particularly well suited? Focus more here on the visualization techniques as opposed to the particular user interface quirks, though you should feel free to comment on UI aspects when they are particularly good or bad. Describe characteristics of the UI using the concepts and terminology you have learned in class. This second part should be close to 2 pages.

Submission: Your document should be in PDF format and is limited to a maximum of 5 pages, no cover sheet. Use Times Roman 12 point type with normal margins, 1.5 line spacing. Submit the paper via T-Square. If you worked with a partner, only one of you needs to submit it to T-Square, but ensure both partners’ names are on it.

Homework 8 – Test 2 Question

Write a short essay question for the upcoming test AND your answer to the question. No T/F or multiple choice questions! You will earn more points for questions that have to do with understanding/explaining than for regurgitation! (“Name the four data types” is a regurgitation question; “Why do we care about different types of data, like nominal, ordinal etc., and give an example” is an understanding question; “Here is some sample data, how would you visualize it, and justify your choices” is a really good question because it requires synthesis of multiple ideas.) The most creative and challenging questions that require both understanding and synthesis will receive full credit.

If you have a question that is like someone else’s that is already posted, then make up a new question. I will discuss selected questions with the entire class as part of our test review.

Post your question and answer on Piazza in the HW8 folder by 8pm the day before the test review session. After 8pm I’ll add comments to the questions.

One or two of the submitted questions are guaranteed to be on the test!

Programming Homework 1 – Set Up GitHub

For the first programming homework, we will need you to set up a GitHub account. You will be using GitHub to submit source code for the other programming assignments, and to share with the teaching assistants.

Step 1: Go to https://github.gatech.edu/ and create an account using your Georgia Tech credentials. The user name will be the local part of your official email address (burdell will be the username for burdell@gatech.edu).

Step 2: Create a repository on GitHub. Provide an easily understood and logical name for the repository; for example, cs-4460-fall-15. Make sure your repository is private. Give read access to the TAs – (lxu315, sreshta3, qhou6, xyuan39). Additional help on creating a repository can be found here – https://help.github.com/articles/create-a-repo/

Working with D3.js

D3.js is the JavaScript InfoVis toolkit we will use for the programming assignments. Download it to your personal computer; this will allow you to code and test your work without an active internet connection. Go through the following short tutorial on the fundamentals and setup of D3.

Programming Homework 2 – Diving into D3

This homework assignment is relatively simple. You will make modifications to the code from the Introduction to D3 lecture. This will help you gain familiarity with D3 and a stronger grasp of it.

This assignment requires you to –
(1) Read data from the provided CSV file instead of a JSON file (Link to data file)
(2) Draw a bar chart for the provided hypothetical data using D3

Here are a few specifics that you need to work on –
(1) Make the SVG background color #cfcfcf
(2) Add the following style conditions for the SVG rectangles.
If the average GPA is less than 1, make the rectangle red.
If the average GPA is 1 or more, but less than 2, make the rectangle orange.
If the average GPA is 2 or more, but less than 3, make the rectangle yellow.
If the average GPA is 3 or more, but less than 4, make the rectangle blue.
If the average GPA is 4 or more, make the rectangle gold.

(3) Add the following style conditions to the bar text labels.
If the average GPA is less than 1, make the bar label’s color white.
If the average GPA is 1 or more, but less than 2, make the bar label’s color black.
If the average GPA is 2 or more, but less than 3, make the bar label’s color black.
If the average GPA is 3 or more, but less than 4, make the bar label’s color gold.
If the average GPA is 4 or more, make the bar label’s color black.

(4) Add black borders to the rectangles
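
One possible way to keep the color rules in (2) and (3) readable is to put them in two small functions. This is only a sketch: the names barFill and labelFill are made up for illustration and are not part of the lecture code, and it assumes the GPA has already been converted from CSV text to a number.

```javascript
// Hypothetical helpers (not from the lecture code).
// Both expect the average GPA as a number, not as the raw CSV string.

// Rectangle color, following the thresholds in (2).
function barFill(gpa) {
  if (gpa < 1) return "red";
  if (gpa < 2) return "orange";
  if (gpa < 3) return "yellow";
  if (gpa < 4) return "blue";
  return "gold"; // 4 or more
}

// Label color, following the thresholds in (3).
function labelFill(gpa) {
  if (gpa < 1) return "white";
  if (gpa < 3) return "black"; // covers both the 1-2 and 2-3 ranges
  if (gpa < 4) return "gold";
  return "black"; // 4 or more
}
```

In D3, functions like these could be passed as accessors, for example `.attr("fill", function (d) { return barFill(+d.gpa); })`, where `gpa` is an assumed column name; use whatever the provided CSV file actually calls it.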

The resulting chart should look something like this –

Upload the assignment to your GitHub accounts. We will use the latest copy that was uploaded before the deadline for grading.

Programming Homework 3 – Filtering and Animations

For this assignment, you will build on what you learnt in the “D3 Deep Dive” lecture, and create a more detailed filtering mechanism. The code from lecture can be found under T-Square Resources.

Add a filtering mechanism on the page to allow the user to select a Department (CS, MATH, MGT…).
We recommend using drop-down menus (aka HTML Select).

Add a text-box to enter a numerical value for the GPA.

Add a Filter button on the page (explained below).

Animate the filtering behavior, utilizing both duration and delay methods provided by D3.

Expected filtering behavior: When the user clicks on the Filter button, your code should filter by BOTH conditions given below:

The selected Department in the drop down menu.

Courses whose average GPA is greater than or equal to the value entered in the text box.

Example: If the user selects “CS” in the drop down menu and types “3.04” in the text field, your graph should only show CS courses with a GPA greater than or equal to 3.04.
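
The expected behavior above can be sketched as a single filter over the data rows. This is a minimal illustration, not the required solution: the function name filterCourses, the field names course, department and gpa, and the sample rows are all made up, so adapt them to the real data file.

```javascript
// Hypothetical sketch: keep rows that satisfy BOTH conditions.
// Assumed row shape: { course: "...", department: "CS", gpa: 3.1 }.
function filterCourses(rows, department, minGpa) {
  return rows.filter(function (row) {
    return row.department === department && row.gpa >= minGpa;
  });
}

// Made-up sample rows, for illustration only.
const sampleCourses = [
  { course: "CS A", department: "CS", gpa: 3.10 },
  { course: "CS B", department: "CS", gpa: 2.90 },
  { course: "CS C", department: "CS", gpa: 3.04 },
  { course: "MATH A", department: "MATH", gpa: 3.50 }
];

// The text box gives you a string, so convert it before comparing.
const shownCourses = filterCourses(sampleCourses, "CS", parseFloat("3.04"));
console.log(shownCourses.map(function (row) { return row.course; }));
// logs the two CS courses at or above 3.04: "CS A" and "CS C"
```

Note that if the text box holds something that is not a number, parseFloat returns NaN and every comparison fails, so nothing is shown. Whether that is the behavior you want is a design decision for you to make.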

Warning: Be wary of corrupt data in the data set, and make sure to handle it appropriately in your code.
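
What counts as “corrupt” is for you to discover in the data. As one hedged example, the sketch below assumes a corrupt row is one with a missing department or with a GPA that is blank, non-numeric, or negative. The function name cleanRows and the field names department and gpa are hypothetical.

```javascript
// Hypothetical sketch: convert the GPA text to a number and drop rows
// that fail some assumed sanity checks. The checks are assumptions;
// inspect the real data set to decide what needs handling.
function cleanRows(rawRows) {
  return rawRows
    .map(function (row) {
      const text = String(row.gpa == null ? "" : row.gpa).trim();
      // Number("") would be 0, so treat a blank value as NaN explicitly.
      const gpa = text === "" ? NaN : Number(text);
      return Object.assign({}, row, { gpa: gpa }); // copy; input untouched
    })
    .filter(function (row) {
      const hasDepartment =
        typeof row.department === "string" && row.department.trim() !== "";
      return hasDepartment && Number.isFinite(row.gpa) && row.gpa >= 0;
    });
}
```

Running a step like this once, right after the CSV is loaded, means the drawing and filtering code can assume every row is usable.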

Submission guidelines: Please create a folder (e.g. PHW3) in your 4460 repository on GitHub with all the files necessary to run your visualization. We will use the latest copy that was uploaded before the deadline for grading.

Programming Homework 4 – Linking

The last programming homework incorporates all the concepts discussed so far. You will need to –

1. Use the CSV data set on Cereals. (Link to data file)
2. Using filtering, preprocess the data in your code to get rid of rows with negative values (i.e. don’t modify the data file itself).
3. Draw a bar chart and a scatter plot side by side, on the same page.
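
For step 2, a minimal sketch of in-code filtering might look like the following. The function name dropNegativeRows is made up, the list of numeric columns is something you pass in yourself, and the sketch assumes the values were already converted to numbers when the CSV was loaded.

```javascript
// Hypothetical sketch: return a new array without rows that have a
// negative value in any of the listed numeric columns.
// The original array (and the data file) are left untouched.
function dropNegativeRows(rows, numericColumns) {
  return rows.filter(function (row) {
    return numericColumns.every(function (column) {
      return row[column] >= 0;
    });
  });
}
```

For example, `dropNegativeRows(cereals, ["Calories", "Sugars"])` would keep only rows where both values are zero or more, assuming those are the column names in the file.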

On the bar chart, display the Manufacturer column on one axis, and the average Calories for that manufacturer on the other axis.
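
The bar chart needs one value per manufacturer, so the rows have to be grouped and averaged first. Here is one way that could be sketched in plain JavaScript; the function name averageCaloriesByManufacturer and the output field names are made up, and the column names Manufacturer and Calories are assumed to match the file.

```javascript
// Hypothetical sketch: group rows by manufacturer and average Calories.
// Assumes Calories is already a number in every row.
function averageCaloriesByManufacturer(rows) {
  const totals = new Map(); // manufacturer -> { sum, count }
  for (const row of rows) {
    const entry = totals.get(row.Manufacturer) || { sum: 0, count: 0 };
    entry.sum += row.Calories;
    entry.count += 1;
    totals.set(row.Manufacturer, entry);
  }
  // One object per manufacturer, in order of first appearance.
  return Array.from(totals, function (pair) {
    return {
      manufacturer: pair[0],
      averageCalories: pair[1].sum / pair[1].count
    };
  });
}
```

The resulting array can be bound to the bars directly, one element per bar.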

On the scatter plot,
1) Display Calories on one axis.
2) Display Sugars on the other axis.
3) Pick your own color scheme, and encode Manufacturer data using colors.
4) Add a key for the graph denoting which color represents which Manufacturer.
5) Encode Serving Size Weight as the size of the scatter plot circles.
6) Add details on demand in a tooltip, to display the Cereal Name, Calories, and Sugars on mouse-hover.

You will also need to implement the following events –

When an individual bar on the bar chart is clicked:
1) Change the opacity of all scatter plot circles that do not belong to the clicked Manufacturer to 25%.
2) Use a transition with a duration and delays when changing the opacity.
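
As a rough sketch of the bar-click behavior, the opacity rule and a staggered delay can each be written as a small function. The names circleOpacity and staggerDelay, the 20 ms step, and the choice to keep matching circles fully opaque are assumptions made for illustration.

```javascript
// Hypothetical sketch: opacity for one circle after a bar is clicked.
// Circles of the clicked manufacturer stay fully opaque (an assumption);
// all others drop to 25%.
function circleOpacity(circleDatum, clickedManufacturer) {
  return circleDatum.Manufacturer === clickedManufacturer ? 1 : 0.25;
}

// Hypothetical sketch: delay in milliseconds for the i-th element, so
// the circles fade one after another rather than all at once.
function staggerDelay(index, stepMs = 20) {
  return index * stepMs;
}
```

With D3 these could be plugged into a transition on the circle selection, along the lines of `.transition().duration(500).delay(function (d, i) { return staggerDelay(i); }).style("opacity", function (d) { return circleOpacity(d, clicked); })`, where `clicked` holds the manufacturer of the clicked bar and the 500 ms duration is arbitrary.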

When an individual scatter plot circle is clicked:
1) Change the color of bars whose average Calories count is higher than that of the clicked circle.
2) Use a transition with a duration and delays when modifying the color.
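
The circle-click rule can be sketched the same way, as a function that picks a color for each bar. The name barColor, the field name averageCalories, and both default colors are hypothetical; pick colors that fit your own scheme.

```javascript
// Hypothetical sketch: color for one bar after a circle is clicked.
// Bars whose average Calories exceed the clicked cereal's Calories get
// the highlight color; all other bars keep the base color.
function barColor(barDatum, clickedCalories,
                  highlightColor = "crimson", baseColor = "steelblue") {
  return barDatum.averageCalories > clickedCalories
    ? highlightColor
    : baseColor;
}
```

A bar whose average exactly equals the clicked value is not highlighted here, since the assignment says “higher than”.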

Submission guidelines: Please create a folder (e.g. PHW4) in your 4460 repository on GitHub with all the files necessary to run your visualization. We will use the latest copy that was uploaded before the deadline for grading. As always, make use of the code skeletons provided on T-Square. They serve as a good starting point, and you won’t have to code everything from scratch.