Can Big Data and Privacy Coexist?
Can Big Data and privacy coexist, or will respondents get relegated to the cheap seats when decisions are being made about and with their own data?
A summer of privacy scandals “have focused public attention on how governments, businesses and other entities collect vast amounts of data about peoples’ lives and how that information, now called Big Data, is analyzed and used,” according to Jules Polonetsky, executive director of the Future of Privacy Forum (FPF).
Big Data’s growing use, and the fervor surrounding it inside and outside the survey, opinion and marketing research profession, mean that privacy concerns regarding Big Data need to be addressed sooner rather than later. FPF and the Stanford Law School attempted to do just that on September 10 with an all-day symposium in Washington, DC, “Big Data and Privacy: Making Ends Meet.”
The first panel looked at Big Data from a broad perspective. Omer Tene, senior fellow at FPF, considered potential negative societal impacts from secondary privacy risks, such as the risk that 24-hour surveillance poses to the functioning of a democratic society. Tene made one of many references during the day to Jeremy Bentham’s “panopticon,” and pointed to solutions such as a proposal for every organization to have an internal privacy advisory board modeled after the Institutional Review Boards required by the federal Human Subject Rule for almost any entity receiving federal funding. (MRA will grapple with that particular idea in a separate article later.)
Professor Neil Richards from the Washington University School of Law identified the lack of transparency as the biggest problem with Big Data, observing that every knowledgeable discussion about the practices of Big Data entities gets bogged down by strict non-disclosure agreements. Richards complained that there is no transparency, either for the algorithms behind the scenes or for how data is being collected and used. Big Data is being used “to shape identity,” according to Richards, and consumers may be completely unaware of its effects. For instance, he cited Netflix’s prediction algorithms, which recommend to users only what they are expected to like, so that users are losing “the chance to explore.” He took this argument further, claiming that Big Data is creating a “power paradox” with big winners and big losers and is creating “digital or income inequality.”
Continuing in that vein, Deirdre Mulligan, assistant professor at the UC Berkeley School of Information, expressed concern about discriminatory effects of Big Data and how Big Data erodes autonomous individual decision-making with “big hidden influences.”
Jules Polonetsky, executive director of FPF, asked if it “would be creepier if the computer could detect what things would be perceived as creepy by a consumer and then suppress them?” Are we going to be asking the machine to have an ethic built into the system? He felt that it would constitute a “level of sophistication that is creepy and scary.” Mulligan retorted that the algorithms have to be written by someone and perhaps the code-writers should be the target of concern.
Turning back to transparency, Assistant Illinois Attorney General Erik Jones noted that “benefits tend to be overrated in debates … when the harms are not yet understood.” Jones felt it may be easy to say that classification is acceptable when it merely identifies groups of consumers, but that it is open to abuse when used to offer products that could harm some consumers.
Martin Abrams, executive director of the Information Accountability Foundation, noted that legal issues for Big Data arise primarily outside of the United States, where innovators need permission from the law to explore or use data. Citing the case of Colombia, Abrams noted that the country’s strict consent-based data protection law leaves no room for repurposing data for Big Data analytics. By contrast, the United States mostly allows for experimentation, although the Federal Trade Commission (FTC) regularly shuts down Big Data uses with consent orders when the agency identifies “unfair” or “deceptive” practices.
Pondering various solutions, Mulligan pinpointed “Big Data ethics,” especially “privacy by design” (a favorite principle of the FTC, highlighted in the agency’s Privacy Report last year). Abrams agreed that there “has to be an ethical filter on all applications of Big Data” and “an ethical process behind it.”
Key questions that the panel could not quite answer: (1) should discrimination be part of the privacy discussion (Big Data classifying different people into different buckets to receive different treatment); and (2) is Big Data inherently good or bad, or are actors using it in good or bad ways?
The societal impact of Big Data
Opening the second panel, New York Times reporter Natasha Singer, who writes regularly about what she dubs the “surveillance economy,” identified a “fundamental shift in the social order” where consumers are making choices about what data they share without knowing the implications of those choices. Of data companies, Singer said, “like Santa, they know when we’re sleeping and know when we’re awake.”
Jonas Lerman, attorney-adviser at the U.S. State Department, pointed out that individuals have not yet figured out how to handle “the creepiness factor” of following every aspect of people’s lives on social media. Having acknowledged that, Lerman failed to reflect on how we should expect companies and institutions to handle “the creepiness factor.”
The conversation then turned to concerns about being categorized and segmented (a fundamental part of marketing research). Joseph Jerome, a legal fellow at FPF, said, “I don’t want my girlfriend to be transparent to me; I want her to be open with me.” He complained about a new website from consumer data broker Acxiom, AboutTheData.com, which tags him as a BMW owner who plays golf, “which is not true.”
Singer responded that the Acxiom site informs consumers how they’re being classified and segmented. “But what inferences are being made about me? What is the impact on me and my family? What am I being left out of?” An audience member chimed in, saying that “segmentation is dangerous” and treats consumers monolithically.
Berin Szoka, president of TechFreedom, rejected the panelists’ complaints about segmentation as a variation on old Marxist arguments. He encapsulated panelists’ statements as “capitalism oppresses us by putting us in little boxes,” and wondered why Acxiom was drawing criticism rather than kudos for its transparency efforts.
Do we need a new legal regime of data governance for Big Data?
That was the question posed to another panel.
Justin Brookman, director of consumer privacy for the Center for Democracy & Technology, responded yes, that “we need a comprehensive privacy law.” In the meantime, he would encourage the FTC to use its Section 5 authority to prosecute “unfair” or “deceptive” practices as aggressively as possible. He concluded that “there is a real cost” to a tougher legal regime, “but it is worth it.” Such a regime will “by necessity inhibit some uses of Big Data” but be beneficial overall.
Michael Donohue, senior policy analyst at the Organization for Economic Co-operation and Development (OECD), noted that the OECD’s privacy guidelines have not solved the Big Data problem yet, since they are the consensus views of 34 countries. Of the guidelines, newly revised after 30 years, he said there had been a huge change in scale. Donohue noted that the OECD now calls for each country to have comprehensive privacy laws.
The divided views from Capitol Hill were provided by Senate Commerce Committee staffer Christian Fjeld, with whom MRA has worked on data privacy and security legislation in the past. He sketched out a divide between the likes of his Democrat Chairman Jay Rockefeller (WV), who believes there needs to be a new regime of notice, choice, and security, and some Republican Senators, who think that regulation could pose a danger to the benefits of Big Data and cheap or free online content. That divide could be seen reasonably clearly over Chairman Rockefeller’s Do-Not-Track Online Act.
Fjeld also highlighted the Chairman’s own efforts to inform a new legal regime, with multiple Committee investigatory efforts over the last few years, to dig up information and bring greater transparency to complex technological issues. Asked if these efforts have helped companies “get their act together,” Fjeld replied, “I would hope so.”
Capital University Law School Professor Dennis Hirsch drew broad comparisons from his experience in environmental law, summarizing that, “Big Data is like big oil.” For all the huge benefits, he contended, there are negative aspects and “data spills.” He reached further to compare Big Data privacy problems to the threat of climate change (which, depending on one’s views, could indicate that there are no problems or huge looming problems). His comparison did make sense on the assumption that new technologies could be developed to provide the same benefits with significantly less externality cost. The key would be the incentives for developing that technology, which is where Hirsch suggested government could step in with subsidies or tax credits.
Identifying possible harms from Big Data, according to Felix Wu, associate professor at the Benjamin N. Cardozo School of Law, requires recognition that not every one of them will be considered a harm by everyone. They include harms from surveillance, disclosure and discrimination. While he suggested it was still too early to engage in any risk-benefit analysis for still-amorphous harms, he didn’t think that should limit regulators to simply “harms cognizable under current law,” such as identity theft, which has a tangible financial impact and can be handled under tort law.
“Self-regulation has been insufficient to give consumers the feeling of control over their data,” according to Brookman, who also contended that having the FTC or Congress “hyper-regulating every little thing as it comes up each year is impractical.” Ultimately, he said, consumers need the right to control certain uses of their data, like marketing. For instances where there is no practical way to opt out, like video surveillance or credit reporting, Brookman said we need to have robust controls.
That discussion sparked another over the current legal regimes’ focus on consumer notice and choice. “I’m a notice and choice skeptic,” said Hirsch. “If people don’t know what they’re doing,” a focus on data use won’t be helpful, so regulators “need to determine which secondary uses are improper and then prohibit them,” as has happened in many states with laws restricting employer access to their employees’ and job applicants’ social network accounts.
Looping back to Hirsch’s earlier environmental comparisons, an audience member observed that “big oil technology is pretty static,” but data technology is advancing at “light speed” and is an “early adopter industry.” How can government regulate something like that? Hirsch responded as before, noting the need to put a value on negative externalities, punish them, and create the right incentives for privacy protection.
As FTC Chair Edith Ramirez noted in an August 19 speech, “There is little doubt that the skyrocketing ability of business to store and analyze vast quantities of data will bring about seismic changes in ways that may be unimaginable today.” However, she said, “not everything breathtaking is new” and “the fact that Big Data may be transformative does not mean that the challenges it poses are, as some claim, novel or beyond the ability of our legal institutions to respond.”
MRA is discussing these issues regularly with legislators and regulators, including the FTC. Big Data excites some of them, but many find it a frightening prospect that must be controlled and curtailed. We must continue to engage and educate them, but more importantly, we must imagine new solutions that better respect and protect the source of the research profession’s prosperity: the people who provide us our data.