Taking the Long View: Give Big Data Long Context

The era of big data, the term used to describe data sets so large and complex that they are difficult to comprehend, is alive and well. Marketers use data to glean insights into consumers, while political pollsters use large data sets to categorize and target voters.

In business, the big data craze is equally strong. In human resources, executives leverage human capital analytics to measure and make workforce decisions.

But big data also has its limitations. According to Samuel Arbesman, senior scholar at the Ewing Marion Kauffman Foundation, an education and entrepreneurship research foundation, there’s another class of data worth paying attention to: long data.

Arbesman, an applied mathematician and network scientist, says that the long data concept will play a major role in the future analysis of human existence and other subjects with deep histories.

And as businesses continue to collect vast troves of data, long data is likely to become a valued tool to study trends that affect every organization’s workforce, products and services.

Talent Management spoke with Arbesman about the concept. Edited excerpts follow:

What is long data, and how is it different from big data?
Big data, as we’re all familiar with, is just vast amounts of information, whether it’s data from cars, from your cellphone or from medical devices. And the idea is that by using this large amount of data you can extract anything you want. It’s been very, very powerful. The problem, though, with big data is that often even though it’s very rich and might cut across a wide cross-section of humanity and various situations, it doesn’t have a long time scale.

So for example, let’s say you’re trying to understand how people move around or how people interact with each other. And so you’re using cellphone data, you’re using big data from mobile phones. You often only have maybe data over the course of days, weeks or months, or if you’re lucky maybe a year or two. The problem is, though, that that’s essentially a snapshot of humanity; you don’t actually see people interact or do what they do over swaths of time. And so when I think about long data, I think about trying to see these long patterns over vast reaches of time relative to human civilization.

So it’s really big data with more of a time context to it. Is that one way to think about it?
Right. And the truth is some of these long data sets are not particularly big, so they might not actually have the same number of data points as traditional big data. That being said, to make most big data useful we have to make it medium data; otherwise it’s just too overwhelming. You have to find a way of reducing the dimensions, and I think with long data you might not have the same number of data points, but you want to have data over a longer scale. Ideally you want data that is both big and long, but long data is pretty much this argument that we need to think over longer time scales than we traditionally do.

How might a human resources executive apply this concept? Has there been enough accumulated data in recent history for companies to be able to apply long data?
I think we’re getting there. I think once you have the question you want to ask, sometimes the data might be more available than we might realize. When it comes to human resources you might be able to say, in what way has technology changed how we hire and retain employees? And people say, historically, whenever there’s a large technological change, it introduces these massive displacements of personnel.

For example, when the automobile came around, all of the buggy whip manufacturers went out of business. If we can actually find the data of how technological changes impacted people who were hired by various companies, or maybe even fired by various companies, I think maybe then it could help place a context on human resources, but also then help us plan for future technological changes.

A lot of HR executives grew up in an era when the data rush didn’t exist. What would you advise them on how to approach this subject?
Well, I think I wouldn’t be too hasty. When people hear a lot about big data, I think there’s initially this rush to get as much data as possible in the hopes that meaning will rise out of it. The goal ultimately is to figure out what the really good questions are, and if you have a good question that you want answered, then often the correct sort of data set will appear: you will then realize what sort of data set you need to gather or what data you need to acquire.

So ultimately it’s much more about the types of questions you want answered, and once you can figure that out more clearly then I think it’s a lot more manageable. And people who maybe didn’t grow up with big data or are maybe new to information science and data science, I wouldn’t view that as immediately a strike against you.

First, gain a sense of familiarity with the space — sort of recognize what is possible. But then after that don’t immediately rush in and try to gather as much data as possible. Make sure you know the kinds of questions you want to answer. I think having that perspective can be much more powerful than simply gathering as much data as possible.

Frank Kalman is an associate editor at Talent Management magazine. He can be reached at fkalman@talentmgt.com.