The Big Data Machine
This story is from the Spring 2012 issue of OPEN magazine.
Profiles by David McKay Wilson
Illustrations by Quick Honey
Click-throughs. Tweets. Likes. Purchases. Keywords. Location coordinates. Bounces. Reviews.
Some of these terms entered the lexicon less than five years ago. But today they’re all examples of information being collected, parsed, analyzed and shared by all corners of all companies, across industries. Big Data, as this avalanche of information has come to be known, is transforming the way companies operate, from one end of the value chain to the other.
The origin of this revolution is the technological explosion that took us from floppy disks that held a measly few kilobytes of data to thumb drives that have no trouble with multiple gigabytes. The same advances that allowed for the creation of (very) smart phones enabled computers to process data by the multi-terabyte, and fast. Suddenly, crunching oceans of numbers took no time at all.
The key is knowing what data to collect and ferreting out the answers hidden in the patterns. We spoke with three McCombs alumni who are working with Big Data to improve school districts across the country, measure influence in the Twitterverse and help Googlers, the ultimate purveyors of Big Data, help each other. And Professor Anitesh Barua, one of several McCombs scholars researching in this emergent area, puts this revolution into perspective.
The education accountability movement has inundated U.S. public schools with mounds of data. But many educators remain perplexed about how to mine that information to improve classroom learning.
Enter Sarah Glover, MBA ’00, executive director of the Strategic Data Project at Harvard University’s Graduate School of Education, where she heads up a $23 million program that aims to transform the use of education data to improve student achievement.
The project, supported by the Bill and Melinda Gates Foundation, has taken on increased importance as school districts gallop toward public education’s new frontier: evaluating teacher performance based, in part, on data from student performance on standardized tests.
“Most would agree that our focus on student proficiency has moved us collectively forward, but that’s not enough,” says Glover, 42, of Arlington, Mass. “We’ve outgrown it. Now all the effort is connecting teacher evaluations to measures of student growth in a way that’s appropriately attributed to teachers and does not account for what’s beyond a teacher’s control.”
Glover, who earned an MBA and a master’s degree in public policy in a joint program at the McCombs School and the LBJ School of Public Affairs, joined the Strategic Data Project (SDP) in February 2010. That’s when SDP began working with its first cohort of school districts in Fort Worth, Boston, Charlotte-Mecklenburg, N.C., and Fulton and Gwinnett counties in Georgia.
In 2011 and 2012, the project expanded its reach with partnerships in Philadelphia, Denver, Los Angeles, Albuquerque, Delaware, Massachusetts, New York, Kentucky and Colorado.
SDP fellows—recruited from the fields of public policy, economics, education, statistics, and business administration—become employees of these districts for two years and collaborate on data-driven analyses that can have an immediate impact on policy decisions that affect student outcomes.
For instance, data show that a student’s eighth-grade achievement level can be a substantial predictor of whether that student graduates from high school. But a closer look shows that eighth-grade performance is not destiny, because students with similar eighth-grade scores at different schools graduate at varying rates.
Glover’s team sifts through the data to find relevant factors—it could be a school’s guidance counselors, its curriculum or the standards it sets for its students.
“We want to see what practices are in place and to reveal the variations in a way that can be acted upon,” says Glover. “Using a district’s own data to show the reality of what is happening helps to illuminate some things and make a compelling case to act on it.”
SDP has several standard analyses that have provided insight into school district recruitment, placement and retention practices. One measures the relationship between advanced degrees obtained by teachers and student performance in their classes. It’s an important metric, partly because most teacher pay scales provide increased pay for higher degrees.
However, SDP’s findings may cause districts to rethink their teacher pay scales.
“We call it the chart of nothing,” says Glover, referring to the results in district after district that show no correlation between student performance and advanced teacher degrees. “Having advanced degrees does not increase teacher effectiveness.”
Another analysis explores the relationship between new teachers and low-performing students. Results in four of five districts found that novice teachers were regularly placed with low-performing students.
“It’s well-understood anecdotally, but after we show them the data, it has been a bit of a show-stopper,” she says. “If you strategically want to improve achievement, why would you disproportionately place novice teachers with low-performing students?”
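The placement comparison described above can be sketched in a few lines of code. This is only an illustration, not SDP's actual method: the function name, the record layout and the sample figures are all invented for the example, and a real analysis would work from a district's own student and teacher records.

```python
def low_performing_share(assignments):
    """Share of each teacher group's students who are low-performing.

    `assignments` is a hypothetical list of
    (teacher_is_novice, student_is_low_performing) pairs.
    """
    totals = {True: 0, False: 0}
    low = {True: 0, False: 0}
    for is_novice, is_low in assignments:
        totals[is_novice] += 1
        low[is_novice] += int(is_low)
    return {
        ("novice" if is_novice else "experienced"): low[is_novice] / totals[is_novice]
        for is_novice in totals
        if totals[is_novice] > 0
    }


# Made-up sample data, for illustration only.
sample = (
    [(True, True)] * 6 + [(True, False)] * 4      # novice teachers
    + [(False, True)] * 3 + [(False, False)] * 7  # experienced teachers
)
shares = low_performing_share(sample)
print(shares)
```

In this invented sample, a larger share of the novice teachers' students are low-performing than of the experienced teachers' students, which is the kind of gap the analysis is meant to surface.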
Leaders at the sprawling Charlotte-Mecklenburg district, which serves 141,000 students, used those findings as part of an initiative to make principals accountable for teacher assignments, to better reach the district’s goal of boosting achievement for low-performing students.
“Accountability became more nuanced,” Glover says. “Teachers were asked to think of using data as a strategic act. They need to think how to place their teachers in ways that would be best for student growth.”
Gathering good data isn’t always easy, though. Cheating and gaming by test administrators and the pressure of creating new, high-quality tests each year can potentially cloud the data collected.
And as with any effective data analysis, comparing apples to apples is key. Glover and her team are working to nail down 10 to 12 indicators (such as a district’s high-school completion rate, college enrollment rate, and rate of college persistence into the second year) that all schools would measure. The result would be figures like the price-to-earnings ratio that stock analysts use to assess the financial health of a publicly traded company.
“Novice teachers assigned to teaching low-performing students could be one,” Glover says. “It would be easy to track, and could potentially have high impact.”
Ken Cho, MBA ’03, the co-founder and chief strategy officer of Spredfast, calls himself the “Godfather of SMMS.”
That’s the acronym for Social Media Management System, the analytics toolbox that empowers companies to analyze their presence on the burgeoning number of online and mobile channels. Such tools allow companies to listen to what’s being said about them and provide the data that lets them be both proactive and reactive in the rapidly developing social media world.
“Everything with social media is so unstructured,” says Cho, 39. “We are pulling in data from Tweets, status updates on Facebook, blog posts, online videos and video comments. We suck out all the information that’s measured on each platform.”
Spredfast, which opened in Austin in 2008 with 16 employees, had grown to 75 by the end of 2011. Cho says he expects to double his workforce by this summer. Clients include IBM, Nokia, Wells Fargo, CNN, Warner Bros. and AARP, the media-savvy organization for Americans age 50 and older. (The McCombs School also uses Spredfast’s tools.)
“You wouldn’t think AARP would be part of our target demographic,” Cho says. “But AARP has at least 60 social-media managers—one in every state and 10 in Washington, D.C. They are ramped up with multiple geographically specific campaigns and are one of the most forward-thinking organizations we work with.”
Spredfast’s success comes from its ability to scour the social web: aggregating data from blogs and online forums and presenting it to companies in useful ways. Spredfast taps hundreds of data sources to pull in all the conversations about a company, using search engines such as Boardreader.
Such a search, for example, may find that postings about a given company are 65 percent positive and 35 percent negative. The company can then adjust its message to respond to the negativity. He says the mobile phone manufacturer, Nokia, has hundreds of employees on the Spredfast system, analyzing the data that comes streaming in and using it to recalibrate its presence in the public sphere.
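The positive-versus-negative breakdown mentioned above amounts to a simple tally once each post has a sentiment label. The sketch below assumes the labels already exist. The `sentiment_share` function and the sample labels are hypothetical, and nothing here reflects how Spredfast actually classifies or counts posts.

```python
from collections import Counter


def sentiment_share(labels):
    """Return the percentage of posts carrying each sentiment label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Made-up labels for 20 posts about a company, for illustration only.
labels = ["positive"] * 13 + ["negative"] * 7
shares = sentiment_share(labels)
print(shares)
```

The hard part in practice is assigning the labels in the first place; the arithmetic that turns them into a headline percentage is the easy step shown here.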
Spredfast has also evolved into marketing. A client will launch an experimental marketing campaign on Twitter and Facebook, and Spredfast will track engagement and aggregate the comments in its “customer care analytics.” The client can then respond directly from Spredfast to the person making the comment.
“Companies may want to respond to customer inquiries within 90 minutes, and those metrics are measured by our platform,” Cho says.
From the early days of social media—way back in the early 2000s—Cho had a sense it would be an important new industry.
“I saw Facebook and MySpace getting traction, and I knew I needed to get into the social space,” he says.
Before Spredfast, Cho held leadership roles at Enron, Lehman Brothers and PriceWaterhouse. After earning his MBA in 2003 he joined IBM, where he served in sales and business development roles, including managing the computer giant’s VISA credit card account. He left IBM in 2007 to set up private-label social networks for the Special Olympics, Save the Children and Oracle.
When he co-founded Spredfast with Scott McCaskill, he was focused on the growing popularity of Facebook, right at the moment it expanded from the college community into the general public. His bet was that Facebook would expand beyond personal communications to become a corporate platform as well.
His company developed an application in 2008 that gave companies a presence on Facebook. But that business model crashed a year later when Facebook changed its application programming interface, broke Spredfast’s corporate applications and launched its own “Page” for companies. Cho says he then realized that Spredfast needed to go beyond Facebook and develop a business involving multiple online channels.
“It feels like that was 20 years ago,” he says. “But it was only yesterday. This speed in this industry is just crazy.”
Spredfast’s new frontier is what Cho calls “predictive analytics,” in which his programs will develop a profile for a company, based on what people in its target demographic are saying about its products in social networks. The company can then design a marketing campaign to target those users. He says the amount of data about consumers that’s now available online is unprecedented. The online public provides a treasure trove for those who want to analyze and package it for marketers with something to sell.
“It provides real-time information for marketers,” Cho says. “There’s so much data out there, and so much more to be learned.”
Google is the king of Big Data, using its well-tuned analytics to deliver consumers’ eyes to the messages of its advertisers and find answers for the hundreds of millions of its search engine users.
That mindset also takes hold within the company, with Google using data in novel ways to increase productivity and enhance the quality of a “Googler’s” work life.
Sudhir Giri, MBA ’96, global head of learning technologies at Google, says his company’s explosive workforce growth has created a difficult internal problem: How does a Googler know who in Google is good at what? Where can Googlers find the person they need for help?
“Skill-finding in a company gets more difficult the bigger it gets, as people look to leverage each other’s expertise and skill-sets,” says Giri, 43, who came to Google’s London office in 2007 after managing learning programs for consulting firms Accenture and Deloitte for nine years.
At his previous employers, Giri says the human resources office would circulate a survey, asking employees to complete a skills profile, then enter that information into a database. But Giri says the surveys were ineffective. Some employees didn’t fill them out. Others neglected to update their profiles as their skills improved. Yet others were perplexed by how to benchmark themselves: they might consider themselves great project managers, while their colleagues may have a more dispiriting view.
To help Googlers more easily find the right collaborators, Giri’s team used a process called crowdsourcing to develop a database nicknamed “GWhiz.” It was sorely needed. As Google’s workforce grew from 22,000 in 2010 to 32,000 by the third quarter of 2011, it became increasingly difficult to keep up with the huge influx of talent.
Through a simple online tool, Googlers were encouraged to “tag” their co-workers with skills they had. Googlers could also tag themselves.
Such tags could include workplace skills such as project management or content creation. They could also highlight aptitude in cheese-making, weaponizing office supplies or ballroom dancing.
“It ended up being fun to see what people were tagged with,” says Giri, whose own tags include learning optimization, chess and learning strategies. “As people had more fun with the tool, we generated more and more data.”
If a Googler was looking for a project manager, they’d type that phrase into a simple search box and quickly see a list of people identified by others with that expertise. Those topping the list had been tagged the most times for that skill.
Once tagged, an employee was notified and asked if he or she knew others with that skill. That created a built-in viral component, spurring the creation of more data on skill identification within the company.
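The tag-and-search mechanism described above can be illustrated with a small sketch. It is not Google's implementation: the `SkillDirectory` class, its method names and the sample employees are all invented, and the only behavior taken from the description is that people are tagged with skills and that search results are ordered by how many times each person was tagged.

```python
from collections import Counter, defaultdict


class SkillDirectory:
    """Hypothetical crowdsourced skills index: skill -> person -> tag count."""

    def __init__(self):
        self._tags = defaultdict(Counter)

    def tag(self, person, skill):
        # Normalize the skill so "Project Management" and
        # "project management" count as the same tag.
        self._tags[skill.strip().lower()][person] += 1

    def search(self, skill, limit=10):
        # Most-tagged people first; ties are broken alphabetically.
        counts = self._tags.get(skill.strip().lower(), Counter())
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:limit]


directory = SkillDirectory()
for person in ["Ana", "Ana", "Ben", "Ana"]:
    directory.tag(person, "project management")
directory.tag("Ben", "cheese-making")

print(directory.search("Project Management"))
```

A production tool would presumably also record who applied each tag, so that one enthusiastic colleague could not inflate a ranking alone. The sketch leaves that out for brevity.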
“People really got into it,” says Giri. “And people could look at an individual’s profile and get a rudimentary CV.”
Giri says Google has thrived by creating a culture within the corporation that supports experimentation, knowledge sharing and a dedication by its staff to engage in learning. To foster what Giri calls “a learning ecosystem of teachers and learners,” his team has created a program called “Googler to Googler,” which links employees who want to teach with others intent on learning.
Employees are encouraged to share their expertise through short videos, which are produced with assistance from technical staff and go up on Google’s internal YouTube channel. The online courses are catalogued and made accessible through a Google search engine. The project’s next phase is developing a tool with the data to recommend courses to Googlers.
“It could be a way for one’s peers to suggest learning opportunities for me,” says Giri.
Launching programs to train employees can confound executives who understand that one size does not fit all. At Google, Giri’s team is developing a system in which employees are encouraged to create “learning paths,” which link together resources to better one’s performance on the job.
It may start with a YouTube video on presentation skills, and then be linked to other resources, which could include an actual class that Google offers.
“We’ve had Googlers publish a number of learning paths, which are findable and discoverable,” says Giri. “You can join them and get on that path.”
Once you’ve joined the path, the online tool tracks your progress, and others on that same path can see where you are on your learning journey. If it’s a five-step path, and several Googlers find themselves on Step 2, Giri says they could form a study group and do it together.
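As a rough illustration of the study-group idea above, the sketch below groups people by the step they have reached on a single learning path. The `study_groups` function, the progress data and the minimum group size are assumptions made for the example. The description does not say how Google's tool stores or compares progress.

```python
from collections import defaultdict


def study_groups(progress, min_size=2):
    """Group people who are on the same step of one learning path.

    `progress` is a hypothetical mapping of person -> current step number.
    Only steps shared by at least `min_size` people are returned.
    """
    by_step = defaultdict(list)
    for person, step in progress.items():
        by_step[step].append(person)
    return {
        step: sorted(people)
        for step, people in by_step.items()
        if len(people) >= min_size
    }


# Made-up progress on a five-step path, for illustration only.
progress = {"Ana": 2, "Ben": 2, "Chloe": 4, "Dev": 2, "Eli": 1}
groups = study_groups(progress)
print(groups)
```

Here the three people on Step 2 would be candidates for a study group, while those alone on a step are left out.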
“The key is to create more useful content, and make it easier to share and track progress,” he says. “It has such interesting implications. In some way, you are annotating the Web itself, taking objects on the Web—unique URLs—and linking them together. They might be resources that exist on other learning paths that you could link to. We’re working to put the basic infrastructure in place to make it happen.”