Assignments

Homework Assignments (HW)

These individual assignments will help you develop your knowledge of design principles for Information Visualization. Each assignment is due by the start of class on its due date. Unless otherwise described, submit your work to t-square.

The assignments together count for 25% of your total grade, broken down as follows:

  • HW1: 1 point
  • HW2: 2 points
  • HW3: 1 point
  • HW4: 3 points
  • HW5: 1 point
  • PHW1: 1 point
  • PHW2: 3 points
  • PHW3: 3 points
  • PHW4: 5 points
  • PHW5: 5 points

Homework 1: Survey

Complete this programming background survey. Nothing to submit on t-square. Only submit 1 response per person.


Homework 2: Project Pitches – Data Source and Sketch

Locate a data source on the WWW with data on a topic of interest to you – it could be related to what you found for HW2, or something else. Give the URL and a description of the data in a couple of sentences.

Sketch (pencil sketch is fine, no need to be fancy) an Information Presentation to represent the data. It should be more than just a bar chart or pie chart.  It should encode at least 4 or 5 variables of the data (possibly but not necessarily using multiple linked views). Use captions to describe what the visuals in your sketch represent. E.g., what are the nodes, what are the edges, what does size mean, what mappings exist between color and categories in your data, etc.

As you do this, continue to think in terms of an interesting project. You will turn in this assignment for your classmates to see, and “pitch” your idea to the group. As such, include answers to questions such as:

  • What variables (things like population, cost, RBIs, GDP) are represented in the vis?
  • What message(s) is the visualization intended to convey?
  • What questions should people be able to answer using this vis?
  • A critique of each – pros and cons – how “good” is the vis in conveying the message of part 2? This is clearly a subjective judgement and there is no right or wrong; I want you to get in the habit of critiquing visualizations.

List the variables that are being represented, their data types, and the visual encodings you are using. Also give the number of cases in the data set (assuming it is multivariate; if not, give info about the tree, network, or other more complex data schema) and the interaction methods you would use to turn the Info Presentation into an InfoVis.

Trouble finding data? Go to the resources tab on this site, and scroll way down. Or, Google “census data” or “economic data” or “xyz data” where xyz is any subject of interest to you, and you will likely find some interesting data! Finding data is important! As you’ve probably found out by now, it’s hard to create a data visualization without data.

There are two submissions for this assignment. First, you will turn in your 2-3 page pdf on t-square. Second, you will post your submission on Piazza (under the HW2 folder) so that your classmates know how to get a hold of you to potentially form groups for your project. Simply paste your content into the text field (including images). At this point, you should be forming (or have formed) groups for your project. This assignment helps you do that. We will also compile all of the submitted pdfs into a single one and post it to the resources folder on t-square.


Homework 3: Test 1 Question

Write a short essay question for the upcoming test AND your answer to the question. No T/F or multiple choice questions! You will earn more points for questions that have to do with understanding/explaining than regurgitation! (“Name the four data types” is a regurgitation question; “Why do we care about different types of data, like nominal, ordinal etc. and give an example?” is an understanding question; “Here is some sample data, how would you visualize, and justify your choices” is a really good question because it requires synthesis of multiple ideas.) The most creative and challenging questions that require both understanding and synthesis will receive full credit.

If you have a question that is like someone else’s that is already posted, then make up a new question. I will discuss selected questions with the entire class as part of our test review.

Post your question and answer on Piazza in the ‘Test 1 Questions’ folder by 8pm the day before the test review session. After 8pm the TAs and Instructor will begin adding comments to the questions.
Update: Please also submit your question and answer as a pdf to t-square for grading purposes.

One or two of the submitted questions are guaranteed to be on the test!


Homework 4: Try Tableau

Use and critique Tableau – an Information Visualization System that does not require programming. This assignment will familiarize you with a full-featured InfoVis system – Tableau – which will be demonstrated in class.

The goals of the assignment are for you to learn the capabilities provided by Tableau (it is one of the best commercial systems), learn the basic visualization methods that it provides, and assess its utility in analyzing data.

You can write the report on this homework by yourself, or you can do it with a partner (which I encourage, it will be more fun and you will learn more). Note only groups of 2 are allowed, no larger. If you write with a partner, you will both receive the same grade. You may ask others for help with downloading and figuring out how to use Tableau. The paper and its ideas should be developed by you or by your two-person team.

The assignment has four parts:

1. Gain familiarity with Tableau – Familiarize yourself with the visualization techniques and the user interfaces during the class presentation, and via on-line videos at http://www.tableausoftware.com/learn/training

2. Examine the data sets – Browse several data sets to decide which one to use for the rest of this assignment. Decide on one, and then use the system to explore it further.

3. Develop three interesting questions about the selected data set – put yourself in the shoes of a data analyst, and think about all the different kinds of analysis tasks that a person might want to perform. For instance, someone working with breakfast cereal data might have analysis tasks like:

• Find all the information on Cocoa Pebbles.

• Identify the cereal with the least fat that is also high in fibre.

• What is the distribution of carbohydrates in the cereals?

• Does high fat mean high calories?

• Which of the following three cereals is best for people on a diet?

Do NOT make all of your questions be about correlations or min or max values.

4. Write a report in two parts.

Part 1 – List your three questions and answers, along with a screen shot showing the visualization you used to answer each question. One page per question – screen shot and narrative. Each question should be answered with a different visualization – so three different visualizations (and not just different data overlaid on a map as can be done in Gapminder).

Part 2 – Critique the system. What are the system’s strengths and weaknesses? For what kinds of user tasks is the system particularly well suited? Focus more here on the visualization techniques as opposed to the particular user interface quirks, though you should feel free to comment on UI aspects when they are particularly good or bad. Describe characteristics of the UI using the concepts and terminology you have learned in class. This second part should be close to 2 pages.

Submission: Your document should be in PDF format and is limited to a maximum of 5 pages, no cover sheet. Use Times Roman 12 point type with normal margins, 1.5 line spacing. Submit the paper via T-Square. If you worked with a partner, only one of you needs to submit it to T-Square, but ensure both partners’ names are on it.


Homework 5: Test 2 Question

Write a short essay question for the upcoming test AND your answer to the question. No T/F or multiple choice questions! You will earn more points for questions that have to do with understanding/explaining than regurgitation! (“Name the four data types” is a regurgitation question; “Why do we care about different types of data, like nominal, ordinal etc. and give an example?” is an understanding question; “Here is some sample data, how would you visualize, and justify your choices” is a really good question because it requires synthesis of multiple ideas.) The most creative and challenging questions that require both understanding and synthesis will receive full credit.

If you have a question that is like someone else’s that is already posted, then make up a new question. I will discuss selected questions with the entire class as part of our test review.

Post your question and answer on Piazza in the ‘Test 2 Question’ folder by 8pm the day before the test review session. After 8pm I’ll add comments to the questions.

One or two of the submitted questions are guaranteed to be on the test!


Programming Homework Assignments (PHW)

These individual assignments will teach you the basic skills for developing visualizations. You are expected to complete these assignments using d3.js. The code you create will be placed into the Georgia Tech GitHub account, where the TAs and instructors will grade your assignments.

D3.js is the JavaScript InfoVis toolkit we will use for the programming assignments. Download it to your personal computer; this will allow you to code and test your work without an active internet connection. Go through the following short tutorials on the fundamentals and setup of D3.

(1) http://alignedleft.com/tutorials/d3/fundamentals
(2) http://alignedleft.com/tutorials/d3/setup

Warning: There are many existing examples and source code widely available online. While these are great resources for you to learn, note that copying these is considered a breach of the rules from the Office of Student Integrity, and will be handled accordingly. Be careful and thoughtful. Many of the assignments will ask you to start from existing source code or examples. In these cases, it is expected that parts of your assignments will resemble the original.


Programming Homework 1: Set Up GitHub

For the first programming homework, we will need you to set up a Georgia Tech GitHub account. You will be using GitHub to submit source code for the programming assignments in this course, and to share with the teaching assistants.

Step 0: If this is your first time using GitHub, this link (https://guides.github.com/activities/hello-world/) should help you familiarize yourself.

Step 1: Go to https://github.gatech.edu/ and create an account by using your Georgia Tech credentials. The user name will be the local part of your official email address (e.g., burdell would be the username for burdell@gatech.edu).

Step 2: Create a repository on GitHub. Provide an easily understood and logical name for the repository; for example, cs-4460-spring-17. Make sure your repository is private. Give read access (i.e., add us as collaborators – don’t worry, we won’t push code to your repo) to the TAs (sdas37, jayanth6, ayshwarya6, lxu315) and instructor (aendert). Additional help on creating a repository can be found here: https://help.github.com/articles/create-a-repo/

Step 3: Make your first commit on the repository. Additional help on making a commit and setting up GitHub on Windows and OS X can be found here: https://guides.github.com/introduction/getting-your-project-on-github/. For GitHub on Linux, we recommend SmartGitHg (http://www.syntevo.com/smartgithg/index.html). This commit can be an existing d3 visualization, or simply a readme file that has your name and contact info. The point is that you should be able to commit files at this point.

What you will turn in: In t-square, submit a link to your repository. In this repo, you will need at least one file you created and uploaded. If you want, you can use code from one of the starting visualizations for your other PHW assignments below.


Programming Homework 2: Make an Existing D3 Vis Better

This homework assignment is relatively simple. You will make modifications to existing d3 code and turn it into something more complex.

Start with this code:
https://bl.ocks.org/d3noob/bdf28027e0ce70bd132edc64f1dd7ea4

This assignment requires you to –
(1) Read data from the provided CSV file instead of a JSON file (use the State-GPA.csv file on t-square)
(2) Draw a bar chart for the provided hypothetical data using D3

Here are a few specifics that you need to work on:
(1) Make the SVG background color #cfcfcf
(2) Add the following style conditions for the SVG rectangles.
If the average GPA is less than 1, make the rectangle red.
If the average GPA is 1 or more, but less than 2, make the rectangle orange.
If the average GPA is 2 or more, but less than 3, make the rectangle yellow.
If the average GPA is 3 or more, but less than 4, make the rectangle blue.
If the average GPA is 4 or more, make the rectangle gold.

(3) Add the following style conditions to the bar text labels.
If the average GPA is less than 1, make the bar label’s color white.
If the average GPA is 1 or more, but less than 2, make the bar label’s color black.
If the average GPA is 2 or more, but less than 3, make the bar label’s color black.
If the average GPA is 3 or more, but less than 4, make the bar label’s color gold.
If the average GPA is 4 or more, make the bar label’s color black.

(4) Add black borders to the rectangles

Note that the code for the starting visualization has a y-axis that shows the value. In the resulting visualization, you should have the value embedded inside the top of the bar. It is up to you if you want to keep the y-axis (and double encode the value), or hide the y-axis since your dataset is small and you have the value inside the top of the bar.
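The color rules in (2) and (3) above can be expressed as two small threshold functions. This is only a minimal sketch: the function names barFill and labelFill are hypothetical, and it assumes the average GPA has already been converted from the CSV string to a number.

```javascript
// Hypothetical helpers (names are not part of the starting code).
// Each takes a numeric average GPA and returns a CSS color name.

// Rectangle color, following the thresholds listed in (2).
function barFill(gpa) {
  if (gpa < 1) return "red";
  if (gpa < 2) return "orange";
  if (gpa < 3) return "yellow";
  if (gpa < 4) return "blue";
  return "gold"; // 4 or more
}

// Label color, following the thresholds listed in (3).
function labelFill(gpa) {
  if (gpa < 1) return "white";
  if (gpa < 3) return "black"; // covers both the 1-2 and 2-3 ranges
  if (gpa < 4) return "gold";
  return "black"; // 4 or more
}
```

In D3, functions like these could be passed as the value of a fill attribute or style, for example .attr("fill", d => barFill(d.gpa)), where gpa is an assumed field name for the parsed value.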

The resulting chart should look something like this:

Submission instructions: Upload the assignment to your GitHub account. Make a folder called ‘PHW2’ and place your source code in there. We will only use the latest copy that was uploaded before the submission deadline for grading. Download this folder and place it into a .zip archive and upload it to t-square before the submission deadline.


Programming Homework 3: Filtering and Animations

For this assignment, you will build on what you learnt in the d3 lecture, and create a more detailed filtering mechanism. The starting code and data file are in the t-square resources folder under sample code (called d3-filtering.zip).

This assignment requires you to –

  1. Read data from the provided CSV file (provided in the .zip).
  2. Using D3, draw a bar chart for the provided data.
  3. Add a drop-down filtering mechanism on the page to allow the user to select a color for the filter.
    We recommend using drop-down menus (aka HTML Select).
    Give a choice of at least 3 colors.
  4. Add a text-box to enter a numerical value for the cutoff.
  5. Add a Filter button on the page (explained below).
  6. Animate the filtering behavior, utilizing both duration and delay methods provided by D3 (as in the sample code). The bars that should be filtered out should be hidden (not visible).
  7. Add a “Reset Filter” button which clears all filters and allows the user to start over.

Expected filtering behavior: When the user clicks on the Filter button, your code should filter as follows:

  1. The selected color from the dropdown is used to color the bars that fall within the filter.
  2. Letters whose frequency is greater than or equal to the value entered in the text box turn the selected color.
  3. All other bars are hidden.
  4. Clicking “Reset Filter” should reset the filters and let the user start over (all bars should be blue again, and shown)

Example: If the user selects “Red” in the drop down menu and types “0.01” in the text field, your visualization should show all the bars that are greater than or equal to 0.01 in Red, and hide all the other bars. If the user clicks “Reset Filter”, all bars return to blue and are shown again, and the user can start over with another filter.
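The expected behavior can be separated from the drawing code by first computing, for each bar, whether it stays visible and what color it gets. The sketch below is one possible way to do that, not the sample code from the .zip. The field names letter and frequency, the helper names, the sample rows, and the choice to treat a non-numeric cutoff as a reset are all assumptions.

```javascript
// Hypothetical helpers; `letter` and `frequency` are assumed field names.
const DEFAULT_COLOR = "blue";

// Show every bar again in the default color.
function resetFilter(data) {
  return data.map(d => ({ ...d, visible: true, fill: DEFAULT_COLOR }));
}

// Mark bars at or above the cutoff as visible and give them the chosen color.
function applyFilter(data, cutoffText, color) {
  const cutoff = parseFloat(cutoffText); // text boxes yield strings
  if (Number.isNaN(cutoff)) return resetFilter(data); // assumption: bad input resets
  return data.map(d => {
    const keep = d.frequency >= cutoff;
    return { ...d, visible: keep, fill: keep ? color : DEFAULT_COLOR };
  });
}

// Sample rows for illustration only.
const sample = [
  { letter: "A", frequency: 0.08167 },
  { letter: "Q", frequency: 0.00095 },
  { letter: "Z", frequency: 0.00074 },
];
const filtered = applyFilter(sample, "0.01", "red"); // only "A" stays visible
const restored = resetFilter(filtered);              // all bars blue and shown
```

The visible and fill flags could then drive a D3 transition on the bars, for example by chaining transition(), duration(), and delay() before setting the opacity and fill styles.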

Submission guidelines: Please create a folder (e.g. PHW3) in your 4460 repository on GitHub with all the files necessary to run your visualization. We will only use the latest copy that was uploaded before the deadline for grading. Download this folder and place it into a .zip archive and upload it to t-square before the submission deadline.


Programming Homework 4: Improve the Treemap Visualization

For this assignment, you will develop a treemap from the following existing example:
http://bl.ocks.org/mbostock/4063582

As we talked about in class, treemaps can be generated using a number of different algorithms which change the way the space is filled. The current example uses the squarified algorithm. Also, there are two main methods of quantifying the proportion for each region (in this case, by ‘size’ and by ‘count’). The radio button toggles at the bottom let you switch between the two.

For this assignment, you will need to perform two updates to the current visualization, as described below.

Update 1: Add a radio toggle to switch between the existing squarified algorithm and the ‘slice-dice’ layout. Place the radio button below the existing “size” and “count” toggle. You will notice that the slice-dice layout has tradeoffs, and may result in a less visually appealing visualization (which is ok for the assignment). For instance, you’ll likely have to truncate many of the labels, as slice-and-dice suffers from long, skinny rectangles. Further, be sure that the algorithm works for both the “size” and “count” toggle (responds to it and shows the correct visualization). You may find these links helpful:

http://d3-wiki.readthedocs.io/zh_CN/master/Treemap-Layout/

https://github.com/d3/d3-hierarchy/blob/master/README.md#treemapSliceDice
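To see why slice-and-dice tends to produce long, skinny rectangles, it can help to look at the algorithm on its own. The following is an illustrative re-implementation under simplifying assumptions (no padding or rounding, and a hypothetical node shape of { name, value } or { name, children }); it is not d3's own source. For the assignment itself, the built-in d3.treemapSliceDice from the README linked above can be passed to the treemap layout's tile method.

```javascript
// Illustrative re-implementation of slice-and-dice; not d3's own source code.
// Nodes are assumed to look like { name, value } or { name, children: [...] }.

// Sum the leaf values beneath a node, using `leafValue` to read each leaf.
function total(node, leafValue) {
  return node.children
    ? node.children.reduce((sum, child) => sum + total(child, leafValue), 0)
    : leafValue(node);
}

// Lay out `node` in the rectangle [x0, x1] x [y0, y1]. Even depths divide the
// rectangle along x, odd depths along y, in proportion to each child's total.
function sliceDice(node, x0, y0, x1, y1, leafValue = n => n.value, depth = 0, out = []) {
  out.push({ name: node.name, x0, y0, x1, y1 });
  if (!node.children) return out;
  const sum = total(node, leafValue);
  const alongX = depth % 2 === 0;
  let offset = alongX ? x0 : y0;
  for (const child of node.children) {
    const share = sum > 0 ? total(child, leafValue) / sum : 0;
    const extent = (alongX ? x1 - x0 : y1 - y0) * share;
    if (alongX) sliceDice(child, offset, y0, offset + extent, y1, leafValue, depth + 1, out);
    else sliceDice(child, x0, offset, x1, offset + extent, leafValue, depth + 1, out);
    offset += extent;
  }
  return out;
}

// Tiny made-up hierarchy for illustration.
const tree = {
  name: "root",
  children: [
    { name: "a", value: 1 },
    { name: "b", children: [{ name: "c", value: 1 }, { name: "d", value: 2 }] },
  ],
};
const bySize = sliceDice(tree, 0, 0, 100, 100);            // uses each leaf's value
const byCount = sliceDice(tree, 0, 0, 100, 100, () => 1);  // every leaf counts as 1
```

Passing a different leafValue accessor is one way to mirror the “size” and “count” toggle: the layout code stays the same and only the quantity being summed changes.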

Update 2: Add a nested header (i.e., labeled nesting). Add another toggle to select between ‘nested’ and ‘non-nested’ versions of treemaps. Use nesting to show the labels at each level of the hierarchy. The nesting should work for both toggles for ‘size’ vs. ‘count’, as well as ‘squarified’ vs. ‘slice-dice’. Your resulting visualization should look something like this (with 3 options / 6 toggle buttons):

In the above visualization, there is a unit showing the count or size of each partition. You do not have to do that for this assignment (just the label is enough). You can use different color schemes if you like, as long as they are valid and effective for hierarchical data such as this.

Submission instructions: Submit the resulting visualization (which should have 3 sets of toggles with 2 options each) to your GitHub repo. Create a folder called ‘PHW4’ and submit all your source code there. We will only use the latest copy that was uploaded before the submission deadline for grading. Download this folder and place it into a .zip archive and upload it to t-square before the submission deadline.


Programming Homework 5: Brushing & Linking

The last programming homework incorporates all the concepts discussed so far. You will need to –

1. Use the CSV data set on Cereals. (data is in the resources folder on t-square, called ‘cereals.csv’)
2. Using filtering, preprocess the data and get rid of rows with negative values (i.e., don’t modify the data file; do this in your code).
3. Draw a bar chart and a scatter plot side by side, on the same page.

On the bar chart, display the Manufacturer column on one axis, and the average Calories for that manufacturer on the other axis.
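The preprocessing and averaging steps can be sketched with plain array operations before any drawing happens. This is a minimal sketch, not the provided code skeleton: the helper names are hypothetical, the column names Manufacturer, Calories, and Sugars are assumed to match the headers in cereals.csv, and the sample rows are made up.

```javascript
// Assumed column names; adjust to match the headers in cereals.csv.
const NUMERIC_COLUMNS = ["Calories", "Sugars"];

// CSV parsers return strings, so convert the numeric columns first.
function toNumbers(row) {
  const out = { ...row };
  for (const key of NUMERIC_COLUMNS) out[key] = Number(row[key]);
  return out;
}

// Keep only rows in which every numeric column is non-negative.
// (NaN >= 0 is false, so rows with unparseable values are dropped too.)
function dropNegativeRows(rows) {
  return rows.filter(r => NUMERIC_COLUMNS.every(k => r[k] >= 0));
}

// Average Calories per Manufacturer: [{ manufacturer, avgCalories }, ...].
function averageCalories(rows) {
  const sums = new Map();
  for (const r of rows) {
    const s = sums.get(r.Manufacturer) || { total: 0, n: 0 };
    s.total += r.Calories;
    s.n += 1;
    sums.set(r.Manufacturer, s);
  }
  return Array.from(sums, ([manufacturer, s]) => ({
    manufacturer,
    avgCalories: s.total / s.n,
  }));
}

// Made-up rows for illustration only.
const raw = [
  { Manufacturer: "K", Calories: "110", Sugars: "3" },
  { Manufacturer: "K", Calories: "90", Sugars: "5" },
  { Manufacturer: "G", Calories: "100", Sugars: "-1" }, // dropped: negative value
  { Manufacturer: "G", Calories: "120", Sugars: "9" },
];
const averages = averageCalories(dropNegativeRows(raw.map(toNumbers)));
```

The resulting array has one entry per manufacturer, which is a convenient shape to bind to the bars of the bar chart.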

On the scatter plot,
1) Display Calories on one axis.
2) Display Sugars on the other axis.
3) Pick your own color scheme, and encode Manufacturer data using colors.
4) Add a key for the graph denoting which color represents which Manufacturer.
5) Encode Serving Size Weight as the size of the scatter plot circles.
6) Add details on demand in a tooltip, to display the Cereal Name, Calories, and Sugars on mouse-hover.

You will also need to implement the following events –

When an individual bar on the bar chart is clicked:
1) Change the opacity of all scatter plot circles that do not belong to the clicked Manufacturer, to 25%.
2) Use a transition, with duration and delays while affecting the opacity.

When an individual scatter plot circle is clicked:
1) Change the color of bars whose average Calories count is higher than that of the clicked circle.
2) Use a transition, with duration and delays while modifying the color.
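The linking logic behind both click events reduces to two small decisions: which circles to dim, and which bars to recolor. The sketch below shows one way to compute them independently of the rendering code. The helper names, the field names, and the sample data are assumptions made for illustration.

```javascript
// Hypothetical helpers for the two click events; field names are assumed.
const DIMMED_OPACITY = 0.25;

// Bar click: circles of other manufacturers drop to 25% opacity.
function circleOpacity(cereal, clickedManufacturer) {
  return cereal.Manufacturer === clickedManufacturer ? 1 : DIMMED_OPACITY;
}

// Circle click: names of manufacturers whose average Calories exceed the
// Calories of the clicked cereal. These are the bars to recolor.
function barsAbove(manufacturerAverages, clickedCereal) {
  return manufacturerAverages
    .filter(a => a.avgCalories > clickedCereal.Calories)
    .map(a => a.manufacturer);
}

// Made-up data for illustration only.
const cereals = [
  { Name: "Cereal One", Manufacturer: "K", Calories: 110 },
  { Name: "Cereal Two", Manufacturer: "G", Calories: 90 },
];
const manufacturerAverages = [
  { manufacturer: "K", avgCalories: 105 },
  { manufacturer: "G", avgCalories: 95 },
];

const opacities = cereals.map(c => circleOpacity(c, "K"));      // [1, 0.25]
const highlighted = barsAbove(manufacturerAverages, cereals[1]); // ["K", "G"]
```

In the click handlers, these results could be applied inside a D3 transition so that the opacity and color changes animate with a duration and delay, as the assignment requires.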

As always, make use of the code skeletons provided on T-Square. They serve as a good starting point, and you won’t have to code everything from scratch.

Submission guidelines: Please create a folder (e.g. PHW5) in your 4460 repository on GitHub with all the files necessary to run your visualization. We will only use the latest copy that was uploaded before the deadline for grading. Download this folder and place it into a .zip archive and upload it to t-square before the submission deadline.