By: Laura Schewel on October 30th, 2014


Big Data + Small Data for Smart, Sustainable and Equitable Cities


I just wrapped up a few days at the always-impressive VERGE Conference. This post summarizes the most important messages I drew from a panel discussion including StreetLight Data, StreetWize, and the City of San Francisco’s Neighborhood Empowerment Network. The focus of the conference was “Resilient Cities,” and the focus of the panel was the interaction of sustainability, social justice, and technology.

To create cities that are equitable, pleasant, resilient, safe, sustainable, and every other adjective we want to append to our cities, we need both Big and Small Data. The combination of Big and Small Data particularly matters when we confront the intersection of climate change and equity.

First, let’s define “Small Data”: for this post, “Small Data” means data that captures the voice of individuals. For example, asking the local homeowner who knows exactly where storm water tends to pool on her block may be the best way to figure out where to put storm controls. Technology can magnify Small Data. Apps like StreetWize or SeeClickFix act as enzymes, reducing the time and energy it takes to make those voices heard. This barrier reduction is especially powerful for communities with fewer financial and advocacy resources, for which the traditional channels are especially onerous.

Now on to Big Data. For the context of this post, “Big Data” means data that is created at massive scale, usually by being “thrown off” as a side product of another system. For example, cellular networks delivering an SMS or phone call to the right phone at the right time initially created some of StreetLight’s Big Data. StreetLight takes that data and, after anonymizing and aggregating it, up-cycles it to create new metrics about how different populations (with different incomes) travel to important places like the grocery store or work.
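To make "anonymizing and aggregating" a little more concrete, here is a minimal sketch of one common pattern: drop the device identifier, count trips per group, and suppress any group too small to report safely. This is an illustration only, not StreetLight's actual pipeline. The record layout, the made-up values, and the `MIN_GROUP_SIZE` threshold are all assumptions.

```python
from collections import Counter

# Hypothetical records: (device_id, income_band, destination). All values are made up.
records = [
    ("a1", "low", "grocery"),
    ("a2", "low", "grocery"),
    ("a3", "low", "work"),
    ("b1", "high", "grocery"),
    ("b2", "high", "work"),
    ("b3", "high", "work"),
]

MIN_GROUP_SIZE = 2  # assumed suppression threshold, not a StreetLight parameter


def aggregate(records, min_group_size=MIN_GROUP_SIZE):
    """Count trips per (income_band, destination) and suppress small groups."""
    # The device ID is discarded here, so it never reaches the output.
    counts = Counter((band, dest) for _device, band, dest in records)
    # Keep only groups large enough that no single trip stands out.
    return {key: n for key, n in counts.items() if n >= min_group_size}


summary = aggregate(records)
```

In this toy data, only the low-income grocery trips and the high-income work trips survive, because the other two groups contain a single trip each.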

Now that we’ve defined the terms at some length, here are their relative strengths in the context of creating sustainable and equitable cities.

Big Data's Strengths

  • Big Data covers many people at once. That large scale means that nationally standardized measures can be made in a few instants and at low marginal cost. Such scalability is an oft-sought, not-so-oft-found asset for Smart City solutions.
  • StreetLight makes sure that our Big Data sources are as fully representative as possible of the entire US population. Some Big Data solutions can unintentionally exclude certain populations during data collection, because those populations do not engage in the types of originating activities that the data source is designed to capture. For example, the elderly are far less likely to use geolocational features of a Smart Phone, even if they have a Smart Phone. Thus systems that collect spatial data from Smart Phone apps may under-represent the elderly. Recent research has indicated that this type of systemic Big Data exclusion can lead to policies that are unintentionally biased in favor of those whose data is captured. For this reason, StreetLight captures data from all types of phones and works hard to correct any residual under-representation.
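One standard way to correct for under-representation is post-stratification weighting: each group in the sample is weighted by the ratio of its share of the population to its share of the sample. The sketch below shows the arithmetic only. It is not a description of StreetLight's own correction method, and the group names, sample counts, and population shares are invented for illustration.

```python
def poststratification_weights(sample_counts, population_shares):
    """Return a weight per group: population share divided by sample share."""
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        sample_share = count / total
        weights[group] = population_shares[group] / sample_share
    return weights


# Hypothetical numbers: the 65+ group is 15% of the population
# but only 10% of the devices in the sample.
sample_counts = {"under_65": 900, "65_plus": 100}
population_shares = {"under_65": 0.85, "65_plus": 0.15}

weights = poststratification_weights(sample_counts, population_shares)

# Weighted counts now match the population shares while keeping the same total.
weighted_counts = {g: sample_counts[g] * w for g, w in weights.items()}
```

Here each record from the under-represented group counts for about 1.5 records, and each record from the over-represented group counts for slightly less than one.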

Correlated Weakness

  • The massiveness and power of Big Data creates a correlated weakness, which I’ll call the “illusion of Big Data infallibility.” Billions of data points and gorgeous maps can be dazzling, and are certainly buzz-wordy. This means that they can be given more credit and credence than they are always due (whereas Small Data is often given less credence and credibility than its due). Big Data like StreetLight’s can do remarkable things, but it can’t answer every question, and we can make mistakes!

Small Data's Strengths

  • Small Data’s strength is that it leverages the most powerful computer ever built: the human brain. Humans are extraordinary sensors; we capture vibe, emotion, intention, opinion, insults, and irregularity in ways that machines cannot replicate.

Correlated Weakness

  • The human time and energy that must be spent to create good Small Data limits its scalability (both within one community and to new communities). New technology can reduce the time and energy burden, but can’t eliminate it. Relatedly, Small Data tends to be hard to compare between communities and projects. This means that the wisdom created by Small Data projects can be difficult to share and adapt to new environments.

Of course, these strengths and weaknesses are complementary. Wise cities will take advantage of both Big Data and Small Data, fitting the best solution to the problem at hand and combining them to correct for each other’s biases.

I think this data dyad resonates especially loudly in rooms where the discussion centers on climate change and equity, because these issues ricochet between the massive and the hyperlocal. A carbon dioxide molecule enters the atmosphere from the tailpipe of a car and immediately becomes ubiquitous, fungible and literally invisible. It will, with countless numbers of its colleagues, affect a family living by the ocean 50 years in the future and 5,000 miles away. That is BIG.

And simultaneously, that molecule was released by a decision to drive to work today, made by one under-paid Oakland woman, forced by rising housing prices to live 15 miles from her job. Her decision weighed the cost of gas for her 21 mpg car, the middling transit system connecting her home to her work, and how safe she feels walking home from the bus at 1 AM. At StreetLight, we hope that by collecting data about that commute and synthesizing it with billions of other records, we can support the creation of better transit systems so she doesn't take that drive in the future.

These problems, deeply connected, are simultaneously massive and minute in scale. The data techniques we use to measure and eventually solve them must also simultaneously be massive and minute. 

