2018 Data Storytelling Studio @ MIT

Redrawing Borders using Facebook Likes

People often feel a connection to their community, town, or region, and that connection shows up most visibly in support for a sports team. While the borders between cities, counties, and states are often arbitrary, social media lets us see where our allegiances truly lie around this country. This New York Times visualization shows which MLB team has the greatest fraction of Facebook likes within each zip code. Aggregating the data literally paints an interesting picture, in some cases redrawing borderlines around the nation. The creators of this visualization overlay team colors onto the map, creating a heat map of allegiance to each team.

Created for the statistically minded sports fan, this visualization attempts to show where each team has developed a following. The authors focus on the boundaries between rivals or metro areas, to test the hunches of baseball fans about where one team’s turf ends and another’s begins. Diving into the data, we can see how certain teams, such as the New York Yankees, have expansive influence well into Connecticut, but surprisingly also around Las Vegas, North Carolina, and Montana. Conversely, teams like the Oakland Athletics or New York Mets are unable to capture even the zip code around their own stadium.

This is an effective visualization because it elaborates on an important question: what are the distinct regions of the US? The sample size for the data is large and the results are both plausible and surprising. The graphic is relatively easy to read and guides the viewer to some of the interesting contrasts, like between the LA Angels and Dodgers or the Cleveland Indians and Cincinnati Reds. The flaw with this visualization is that it hides information that I would like to see. For example, the color shading doesn’t show anything about which team is second most popular. While the percentage of supporters for each team is thought-provoking, it would be helpful to see how many supporters are recorded for each team, or at least the sample size for each zip code.

This visualization is aesthetically attractive, surprising, and also plausible enough to be credible. To take it to the next level I want to see more information overlaid about the second-place teams, how other sports compare, and how other measures of cultural influence interact with the fan heat map for America’s favorite pastime.
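
The extra layers I am asking for (the runner-up team and the sample size per zip code) are cheap to compute once the raw like counts are available. The following is a minimal Python sketch, not the New York Times method: the like counts are made-up numbers, and the function name summarize_by_zip and the row layout (zip code, team, likes) are my own assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical like counts as (zip_code, team, likes). These numbers are
# invented for illustration and are NOT taken from the NYT/Facebook data.
likes = [
    ("06510", "Yankees", 420),
    ("06510", "Red Sox", 380),
    ("06510", "Mets", 200),
    ("94621", "Giants", 510),
    ("94621", "Athletics", 490),
]


def summarize_by_zip(rows):
    """Return the top team, its share, the runner-up, and the sample size per zip."""
    # Add up likes per team within each zip code.
    totals = defaultdict(dict)
    for zip_code, team, count in rows:
        totals[zip_code][team] = totals[zip_code].get(team, 0) + count

    summary = {}
    for zip_code, teams in totals.items():
        sample_size = sum(teams.values())
        if sample_size == 0:
            continue  # no likes recorded, so no share can be computed

        # Rank teams from most to fewest likes.
        ranked = sorted(teams.items(), key=lambda item: item[1], reverse=True)
        top_team, top_count = ranked[0]
        runner_up = ranked[1][0] if len(ranked) > 1 else None

        summary[zip_code] = {
            "top_team": top_team,
            "top_share": top_count / sample_size,
            "runner_up": runner_up,
            "sample_size": sample_size,
        }
    return summary


if __name__ == "__main__":
    for zip_code, info in summarize_by_zip(likes).items():
        print(zip_code, info)
```

Shading each zip code by top_share while listing runner_up and sample_size in a tooltip would keep the map readable and still expose how close each border really is.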

Source: https://www.nytimes.com/interactive/2014/04/23/upshot/24-upshot-baseball.html

Visualizing how the American diet contributes to California drought

“Your Contribution to the California Drought”

This data presentation by the New York Times depicts how much water California uses to grow various foods that are part of the American diet. The goal is to show how our consumption habits contribute to a drought in California and, conversely, how a drought in California can alter our eating habits. They also seem to want the reader to do something about it, judging by their usage of “your” in their title “Your Contribution to the California Drought.”

They show California export and import data as well as agricultural data. Their target audience appears to be American consumers, perhaps more specifically whoever is stuck with the household chore of going to the grocery store. They show not only images of the produce California exports that requires the most gallons of water to grow, but also the produce that requires the least. Pairing images of foods with the number of gallons of water required is simple, but very relatable. This simple depiction makes some of the data more shocking and surprising (e.g. that it takes 42.5 gallons to produce 3 mandarin oranges). Furthermore, showing the food cut up or on a plate, instead of using some online graphic, makes the data easier to digest because it is how food is generally presented to us. However, they only have a quote about how the drought will cause consumers to pay more or switch to other products, which isn’t a very compelling call to action. It seems like there is another data story that they skimmed over with that quote, a story about how much food prices will go up and so on. Overall, I think they are effective in conveying how the American diet contributes to the California drought, but they do not do a great job explaining what one can do. I had a “so what?” feeling at the end.

About the Data Storytelling Studio

We are swimming in data – “Big” and small, global and personal. And we are also facing complicated problems like Climate Change and inequality whose stories can only be told with data. The need for public understanding of data-driven issues is higher than ever before. But raw data doesn’t make a good story… and that’s where you come in.

This class is focused on how to tell stories with data to create social change. We will learn through case studies, invited guests, examples, and hands-on work with tools and technologies. We will introduce basic methods for researching, cleaning, and analyzing datasets, but the focus is on creative methods and media for data presentation and storytelling. This is NOT a data science course, nor is it a data visualization course, nor is it a statistics course! We will consider the emotional, aesthetic and practical effects of different presentation methods as well as how to develop metrics for assessing impact. Over the course of the semester, students will work in small groups to create “sketches”, each using a different technique for telling a data-driven story. Think about a sketch as a half-realized project, where you have implemented just enough of the most important details of the idea for us to understand your vision. A sketch is NOT a fully realized presentation of a data story. For the final project, students will have the chance to expand on one of these sketches to create a fully realized presentation of a data-driven story. Why is this called a studio? This course has few lectures and lots of group project work time.

The course is open to all technical levels and backgrounds. We will prioritize students with a strong background in one or more of the following areas: journalism, software development, data analysis, documentary, visual and performing arts.

As part of the Boston Civic Media Consortium, the course will have a special focus on Climate Change data. Most examples will use data related to this topic, homework assignments will be related to it, and sketches and final projects must be connected to it as well. We will take a broad view of the topic, including everything from traditional datasets about the warming oceans to data on migration caused by the effects of climate change.

Learning Objectives

Students will learn techniques for finding a story in data, building a basic set of tool-assisted data analysis skills
Students will build things that tell data-driven stories with a rich set of digital and non-digital tools, online and offline
Students will practice arts- and rhetoric-based approaches to telling data-driven stories
Students will learn to connect data stories to meaningful, situated social action
Students will learn basic techniques for measuring the impact of data-driven storytelling
Students will learn basic ethnographic and anthropological approaches to identifying and researching audiences

Logistics

Spring 2018 Semester
Category: HASS Arts
Units: 3-0-9
Meets Tuesday & Thursday from 11am-12:30pm in room E15-341
Undergrad (CMS.631) and Graduate (MAS.784/CMS.831) meet together
Admission is by permission of the instructor – please fill in this quick enrollment survey

Faculty

Instructor: Rahul Bhargava <rahulb@mit.edu> Research Scientist, MIT Center for Civic Media

Faculty Sponsor: Jim Paradis – Robert M. Metcalfe Professor of Writing and Comparative Media Studies