Putting It Together: Statistics
The following is excerpted from "The Trials and Tribulations of Data Visualization for Good" by Jake Porway.
The trials and tribulations of data visualization for good
“I love big data. It’s got such potential for storytelling.” At DataKind, we hear some version of this narrative every week. As more and more social organizations dip their toes into using data, invariably the conversation about data visualization comes up. There is a growing feeling that data visualization, with its combination of “engaging visuals” and “data-driven interactivity”, may be the magic bullet that turns opaque spreadsheets and dry statistics into funding, proof, and global action.
However, after four years of applying data-driven techniques to social challenges at DataKind, we feel that data visualization, while it does have an important place in our work, is a mere sliver of what it takes to work with data. Worse, the ubiquity of data visualization tools has led to a wasteland of confusing, ugly, and sometimes unhelpful pie charts, word clouds, and worse.
The challenge is that data visualization is not an end goal; it is a process. It is often the final step in a long manufacturing chain along which data is poked, prodded, and molded to get to that pretty graph. Ignoring that process is at best misinformed, and at worst destructive. Let me show you an example:
In New York City, we had a very controversial program called Stop and Frisk that allowed police officers to stop people on the street they felt were a potential threat in an attempt to find and reclaim illegal weapons. After a Freedom of Information Act (FOIA) request by the New York Civil Liberties Union (NYCLU) resulted in the New York Police Department (NYPD) releasing all of their Stop and Frisk data publicly, people flocked to the data to independently pick apart how effective the program was. The figure below comes from WNYC, a public radio station located in New York City. Here they’ve shaded each city block brighter pink the more stops and frisks occurred there. The green dots on the map indicate where guns were found.
What the figure shows is that the green dots do not appear as close to the hot pink squares as one would believe they should. The implication, then, is that Stop and Frisk may not actually be all that effective in getting guns off the street.
![Map of New York City, against a black backdrop. The boroughs appear in shades of purple (majority area) and pink (smaller areas scattered in the city) to reflect number of police stops per block. Small dots of green, hard to see, note where guns were found during police stops.](https://s3-us-west-2.amazonaws.com/courses-images/wp-content/uploads/sites/1141/2016/12/28174610/Screen-Shot-2016-03-15-at-9.19.55-AM.png)
![Map of New York City against black backdrop, which is a smaller subset of previous map. Compared to previous map, this one also shows hues of purple and pink, though the pink is much more prominent in the map overall. Green dots to show guns found during police stops are significantly bigger and more prominent across the map.](https://s3-us-west-2.amazonaws.com/courses-images/wp-content/uploads/sites/1141/2016/12/28174826/Screen-Shot-2016-03-15-at-9.20.05-AM.png)
![Heat Map of The Bronx. A key shows shades of orange, from light to dark, to reflect number of police stops. Green circles indicate guns found during police stops; bigger green circles reflect more guns found in an area.](https://s3-us-west-2.amazonaws.com/courses-images/wp-content/uploads/sites/1141/2016/12/28175052/Screen-Shot-2016-03-15-at-9.20.17-AM.png)
- Humans don’t make decisions based on data, at least not alone. Plato once said “Human behavior flows from three main sources: desire, emotion, and knowledge.” I want to believe he listed those aspects in that order intentionally. Study after study has shown that humans rationalize beliefs with data, not vice versa. If behavior change were driven by data and graphs alone, we would be 50 years into a united battle against climate change. Conversely, we will leap to conclusions from data visualizations that “feel” right, but are not rigorously tested, like the conclusions from the Stop and Frisk images above.
- The public still treats data and data visualization as “fact” and “science”. I believe the public has gained enough visual literacy to question photojournalists’ or documentary filmmakers’ motives, aware that there is an auteur behind the final piece who intends for us to walk away with their chosen understanding. We have yet to bring that same skepticism to data visualization, though we need to. The result of this illiteracy is that we are less critical of graphs and charts than of written arguments, because the use of data gives the sense that “fact” or “science” is at work, even if what we’re doing is little more than visually bloviating.
- The data or visualization you see at the end of the road is opaque to interrogation. It is difficult, if not impossible, to know where that “58%” statistic or that flashy bar graph, grinning up at you from the page, came from. Because we don’t have ways to know how the data was collected, manipulated, and designed, we can’t answer any of the questions we might want to raise above. If point 2 means we need to treat data visualization as photojournalism, then this point implores us to go further and require forensic photographers in this work.
Celebrating Visualization
No surprise, creating data visualization well simply entails designing in a way that leads people to make scientific conclusions themselves. There are many examples of insightful, persuasive, and downright clever data visualizations, but perhaps one of the best visualization practices I know of is to turn the idea of visualization on its head. Data visualization is incredibly good for allowing one to ask questions, not answer them. The huge amount of data that we have available to us now means that we need visual techniques just to help us make sense of what we need to try to make sense of.
So where do we go from here?
First off, you can boycott the tyranny of pie charts and word clouds, rail against those three pitfalls, and share these last two examples far and wide. But I think we can also all go out and start thinking about how data can truly be used to its fullest advantage. Aside from just using “data for machines,” the best data visualization should raise questions and inspire exploration, not just sum up information or try to tell us the answer. Today we have more information than ever before and we have a new opportunity to use it to mobilize others, provided we do so with sensitivity. Now, more than ever, we need to all be out there on the front lines looking beyond data visualization as merely a way to satisfy our funders’ requirements and instead looking at data as a way to ask deep questions of our world and our future.
Licenses & Attributions
CC licensed content, Original
- Revision and Adaptation. Provided by: Lumen Learning. License: CC BY: Attribution.
CC licensed content, Shared previously
- The Trials and Tribulations of Data Visualization for Good. Authored by: Jake Porway. Located at: https://digitalimpact.io/the-trials-and-tribulations-of-data-visualization-for-good/. License: CC BY-NC: Attribution-NonCommercial.