Project – CS 8803 – Visual Data Analysis

Description

This is a semester-long group project. I will facilitate some in-class discussions about project groupings, but you should explore ideas amongst yourselves as well. I want the teams to be balanced in terms of background and experience.

The group project for this semester will consist of building a visual data analysis prototype to use for a sensemaking task. The task is to analyze a simulated intelligence analysis dataset (one of the VAST challenge datasets, available on t-square). It’s like you get to play a detective, and are tasked with “finding something fishy” in the data.

D3 is the preferred software development tool. It is becoming the de facto standard for InfoVis programming. However, there are many other development environments that you’re welcome to use. If you choose to use something other than D3, please let the instructor know so that we can make sure it’s a good choice for you.

You are not allowed to simply use commercially available analysis tools. These include Excel, Tableau, MS BI, Spotfire, and likely others. If you are not sure about a library or toolkit you want to use, please check with the instructor before you plan to use it.

Also, using specific devices for your projects (e.g., tablets, watches, VR) will require instructor approval. However, that doesn’t mean you shouldn’t try them. These devices offer interesting interaction and display options that can make for really nice projects!

Team sizes are set to 4. Exceptions may be made to allow 3 or 5, but you’ll need instructor approval first. If you slack off in sharing the workload of the project, you will receive a lower grade than your teammates. Because the project counts for a significant amount of your grade, slacking off is not a good idea. If you have the impression that a team member is slacking off at any time during your project, you should first speak with them to address your concern. If that does not help, please notify the instructor and TA.

There are two primary components to this project:
Component 1: Developing an Interactive Analysis Prototype: Your team will design and develop a prototype for interactive data analysis. Specifically, you will pick 1 visual metaphor, 1 analytic model, and at least 1 interaction. For example, you could choose a spatial cluster layout, generated by PCA, that shows relationships between documents spatially, and give people the ability to steer the PCA decomposition using a set of graphical controls (direct manipulation, dynamic querying).
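
As a rough illustration of the PCA example above, here is a minimal sketch of a "steerable" PCA layout. It assumes each document has already been turned into a numeric feature vector (for instance, term counts) and that the graphical controls are sliders that set one weight per feature. The names weighted_pca_layout and weights, and the sample data, are hypothetical and only for illustration; the sketch uses plain power iteration with deflation rather than any particular library, and it is written in Python only to keep it short, so it is not a recommendation over D3.

```python
import math


def _top_component(cov, iters=500):
    """Approximate the dominant eigenvector of a symmetric matrix by power iteration."""
    n = len(cov)
    # Deterministic, non-uniform start vector (assumed not orthogonal to the answer).
    v = [1.0 + 0.1 * i for i in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            # No variance left in any direction: return a null axis.
            return [0.0] * n, 0.0
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(n)) for i in range(n))
    return v, eigenvalue


def weighted_pca_layout(rows, weights):
    """Project rows onto the top two principal components of the weighted data.

    rows:    list of equal-length numeric feature vectors (one per document)
    weights: one non-negative weight per feature, e.g. set by user sliders
    Returns a list of (x, y) positions, one per row.
    """
    n_rows, n_feat = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n_rows for j in range(n_feat)]
    # Center each feature, then scale it by the user-chosen weight.
    x = [[(r[j] - means[j]) * weights[j] for j in range(n_feat)] for r in rows]
    cov = [
        [sum(row[i] * row[j] for row in x) / (n_rows - 1) for j in range(n_feat)]
        for i in range(n_feat)
    ]
    axes = []
    for _ in range(2):
        v, lam = _top_component(cov)
        axes.append(v)
        # Deflation: remove the component just found so the next one can surface.
        cov = [
            [cov[i][j] - lam * v[i] * v[j] for j in range(n_feat)]
            for i in range(n_feat)
        ]
    return [
        tuple(sum(row[j] * axis[j] for j in range(n_feat)) for axis in axes)
        for row in x
    ]


# Made-up feature vectors for four documents (illustration only).
docs = [[-3.0, -1.0, 0.1], [3.0, -1.0, -0.1], [-3.0, 1.0, -0.1], [3.0, 1.0, 0.1]]
default_layout = weighted_pca_layout(docs, [1.0, 1.0, 1.0])
steered_layout = weighted_pca_layout(docs, [0.0, 1.0, 1.0])  # user mutes feature 0
```

With equal weights, the first axis follows the highest-variance feature; setting that feature’s weight to 0 makes the layout reorganize around the remaining features. That recomputation is the "steering" a user would see when moving a slider.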

Think about the methods that you will use to visualize the output of your model. You will take that information, bind it to a visual encoding, and place it in a visual metaphor. What is the visualization telling the user about the data, and about the model? Think carefully about how you choose to display specific outputs of specific models. There are many types of models out there, and many visual metaphors, so there are plenty of combinations.
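
To make "binding model output to a visual encoding" concrete, here is a small sketch of a linear scale, similar in spirit to the linear scales D3 provides. The function name linear_scale, the score values, and the pixel range are all assumptions made for illustration.

```python
def linear_scale(domain, output_range):
    """Return a function mapping values in `domain` linearly onto `output_range`."""
    d0, d1 = domain
    r0, r1 = output_range
    if d1 == d0:
        # Degenerate domain: every value maps to the middle of the output range.
        return lambda value: (r0 + r1) / 2

    def scale(value):
        t = (value - d0) / (d1 - d0)  # position of value within the domain, 0..1
        return r0 + t * (r1 - r0)

    return scale


# Hypothetical model output: a relevance score per document, between 0 and 1.
scores = {"doc_a": 0.0, "doc_b": 0.5, "doc_c": 1.0}

# Visual encoding: score -> circle radius in pixels (2 px to 20 px).
radius = linear_scale((0.0, 1.0), (2.0, 20.0))
radii = {doc: radius(score) for doc, score in scores.items()}
```

The same model output could just as well be bound to position, color, or opacity; which channel you choose changes what the visualization tells the user about the data and the model.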

For the interaction, think about the most usable way for users to interact with the visualization and the model. Is it to give users direct, graphical controls over the values of the model’s parameters? Is it to allow them to perform interactions in the visualization, and steer the model based on inferences computed on the interaction logs (e.g., semantic interaction, v2pi, etc.)?

Plan ahead with your overall design of the tool. You will want to help support the process of analysis. That is, your tool should help you interactively explore the data and make sense of it. As you’ll learn throughout this class, that’s a complicated high-level task. Therefore, features for tracking your process are important, as are components that support asking questions and forming hypotheses.

Component 2: Describe your Analytic Provenance: Your team will go through the dataset and “solve” the challenge (i.e., detect suspicious activity in the data) and generate an “analytic provenance document” that you will submit as part of Milestone 2. Given the content we talked about in class, your team will prepare a document (max 10 pages) describing your process. It should include information regarding what structured analytic techniques you used (i.e., what’s your process?), what hypotheses and evidence you found (and discarded), how you worked collaboratively (or individually), how your tool helped foster sensemaking activities, how you handled cognitive biases, etc. This document should contain screenshots, pictures of things you drew on whiteboards (if you did), and information that helps answer the questions above. In general, reading this document should give us an idea of how you analyzed the dataset, and what role your prototype played in helping you do so.

Milestones

(Milestone 1) Project proposal presentation: your team will present a short, 10-12 minute presentation to the class regarding your design and implementation plan for your prototype. Your presentation should convey information about your project including: What visual metaphor have you chosen? What analytic model have you chosen? What interactions will you design for? What analytic processes will this support?

(Milestone 2) Final Demonstration and Analytic Provenance Document: Your team will demo your prototype to the class as part of your presentation. This can be a live demo, or a video with you talking over it. Prepare a scenario/use case to show how your tool was able to help create insights for this dataset. This should include explaining the views, functionality, and other information about your system to help your audience understand what it does. Plan for a 10-minute presentation/demo during class. Additionally, you will submit your analytic provenance document (described above) to t-square at this time (max 10 pages).

(Milestone 3) Video Demonstration of Prototype: Your team will prepare a short 2-5 minute video demonstrating the functionality of your system. This is intended to give the audience a short, concise description of what your team has done over the course of the semester. Please email a link to the instructor (Alex Endert) by 11:59pm on the due date.

Milestones are graded individually, and add up to the final grade for your project. Each milestone is weighted equally (i.e., worth 1/3 of your project).

Grading and Tips

Grading: We will evaluate the overall quality of your project, including all milestones and components as described above.

Remember that there are two major components of these projects: the tool itself, and the ability of the tool to support your analytic process. Both will be counted equally. That means that if you build a tool that is awesome, but it doesn’t let you do your analysis task, your grade will reflect that. Make sure you think about all the concepts we have talked about in class regarding what it means to help people make sense of data, and make sure that your tool design and implementation take those into consideration. Please find a time to talk to the TAs or the instructor if you want feedback on early designs or ideas.

Great projects will typically require a significant amount of original work (coding, designing, reporting, video creation, bug fixing, …). Some of the code will be new, some heavily modified, some from libraries on the web. I expect each project to have a combination of these. Functionality of great projects typically includes two or more simultaneous views of data with linking between the views, details on demand, and some means of selecting which aspects of the data set are displayed (maybe dynamic queries). Whatever interactions you provide should support users in answering questions about the data!
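
For teams unsure what "linking between the views" and "dynamic queries" might look like in code, the following is a minimal sketch under stated assumptions: a shared selection object notifies every registered view when the selection changes, and a dynamic query is a set of attribute ranges (as slider positions might supply). SharedSelection, dynamic_query, the two stand-in views, and the records are all hypothetical, made-up illustrations.

```python
from dataclasses import dataclass, field


@dataclass
class SharedSelection:
    """Holds the currently selected record ids and notifies every registered view."""
    selected: set = field(default_factory=set)
    listeners: list = field(default_factory=list)

    def subscribe(self, callback):
        """Register a view's update function."""
        self.listeners.append(callback)

    def select(self, ids):
        """Replace the selection and push it to all linked views."""
        self.selected = set(ids)
        for callback in self.listeners:
            callback(self.selected)


def dynamic_query(records, ranges):
    """Return ids of records whose attributes fall inside every (low, high) range."""
    return [
        rec["id"]
        for rec in records
        if all(low <= rec[attr] <= high for attr, (low, high) in ranges.items())
    ]


# Made-up records for illustration only.
records = [
    {"id": 1, "hour": 9, "amount": 50},
    {"id": 2, "hour": 14, "amount": 900},
    {"id": 3, "hour": 23, "amount": 950},
]

# Two stand-in "views" that simply remember what they should highlight.
timeline_highlight, map_highlight = [], []
selection = SharedSelection()
selection.subscribe(lambda ids: timeline_highlight.append(sorted(ids)))
selection.subscribe(lambda ids: map_highlight.append(sorted(ids)))

# The user drags sliders: late-night hours and large amounts only.
selection.select(dynamic_query(records, {"hour": (20, 24), "amount": (500, 1000)}))
```

Because both views subscribe to the same selection, any filter or brush applied in one place updates the other, which is what lets a user ask a question in one view and see the answer in another.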

Poor projects will typically consist of existing code found on the web, simply applied to your data. The reports will be brief, and without the necessary detail to address the questions for the milestones above. The visualization design will consist of existing visualizations, where the data is simply applied to the known technique. The teams often will not form clear expectations of what each team member should do, and much of the work is put on 1 team member who is unable to complete the entire project by himself or herself. The prototype will have limited to no interaction, and it is often unclear what questions a user is supposed to answer using the vis. Teams who create poor projects often do not meet with the TAs to discuss their progress, ideas, and status, and thus lack guidance along the way and are surprised when major flaws are found late, resulting in a poor grade.

I want each of you to succeed in your project, so make sure you and your team do the things that “great projects” do!