On September 15, the FTC convened a public workshop to explore, in the words of FTC Chairwoman Edith Ramirez (D), “whether and how big data helps to include or exclude certain consumers from full opportunity in the marketplace.”

Ramirez opened the workshop by noting that the world’s output of data is doubling every two years and that “advances in computational and statistical methods mean that this mass of information can be examined to identify correlations, make predictions, draw inferences, and glean new insight.”

Hitting upon concerns raised in the FTC’s data broker report, Ramirez sketched a picture of the Big Data world: “The proliferation of connected devices, the plummeting cost of collecting, storing, and processing information, and the ability of data brokers and others to combine offline and online data mean that companies can accumulate virtually unlimited amounts of consumer information and store it indefinitely.”

Big Data keeps getting bigger and more useful, but it is also, said Ramirez, growing in its capacity to “reinforce disadvantages faced by low-income and underserved communities. As businesses segment consumers to determine what products are marketed to them, the prices they are charged, and the level of customer service they receive, the worry is that existing disparities will be exacerbated.”

According to Ramirez, discrimination “is what big data does in the commercial sphere – analyzes vast amounts of information to differentiate among us at lightning speed through a complex and opaque process. But is it unfair, biased, or even illegal discrimination? And if so, can steps be taken to level the playing field?”

As in the data broker report, segmentation was again cast as a boogeyman. Some data brokers “create segments or clusters of consumers with high concentrations of minorities or low income individuals.” While Ramirez acknowledged that “there may be legitimate reasons why businesses would want to sort consumers in this fashion,” such segmentation could be used for what she calls “discrimination by algorithm” and what the White House Big Data report has referred to as “digital redlining.”

Ramirez was on the top 10 list of government players in consumer privacy in 2014.

What’s on the policy horizon for Big Data
Forty-one percent of Hispanics and African Americans could not be scored by traditional credit scoring mechanisms, according to Mark MacCarthy, vice president for public policy at the Software & Information Industry Association (SIIA). As a result, he said, alternative scoring methods made possible by Big Data could give more than 80 percent of those consumers a score. Such observations are expanded upon in “Big Data: A Tool for Fighting Discrimination and Empowering Groups,” a recent report from the Future of Privacy Forum and the Anti-Defamation League.
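Taken at face value, MacCarthy’s two figures combine into a quick back-of-envelope estimate. The combination below is our arithmetic, not his, and is purely illustrative:

```python
# Back-of-envelope combination of MacCarthy's two figures (our arithmetic,
# not his): what share of the overall group could be newly scorable?
unscorable_share = 0.41    # cannot be scored by traditional credit models
recoverable_share = 0.80   # of those, could get a score via alternative data
newly_scorable = unscorable_share * recoverable_share
print(f"{newly_scorable:.0%} of the overall group")  # prints "33% of the overall group"
```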

Carnegie Mellon University Professor Alessandro Acquisti dismissed the report, insisting that Big Data won’t grow the “economic pie” but will only “increase inequality.”

Pamela Dixon, founder and executive director of the World Privacy Forum, referenced “The Scoring of America,” the report she produced with Bob Gellman earlier this year. The report emphasizes the need for “statistical parity,” which the authors define as “fairness in how people are treated in analytics, including predictive analytics.” For Dixon, the “root of the matter” comes “the moment that a person” is categorized, “classified in some way, or is scored in some way; that triggers a data paradox.” The Big Data “classification effect,” or “data paradox,” arises because the same data can be used for good or for ill. She felt we “have to do something about this in terms of fairness structures.”
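The report’s definition is qualitative. In the broader fairness literature, “statistical parity” is commonly formalized as requiring that the rate of favorable outcomes be roughly equal across groups. The sketch below illustrates that common formulation rather than anything prescribed by the WPF report; the data and the tolerance are invented for the example:

```python
# Common quantitative reading of "statistical parity": favorable-outcome
# rates should be similar across groups. The data and the 0.1 tolerance are
# invented for illustration; the WPF report defines the term qualitatively.

def positive_rate(outcomes):
    """Fraction of favorable (1) decisions in a list of 0/1 outcomes."""
    return sum(outcomes) / len(outcomes)

def parity_gap(group_a, group_b):
    """Absolute difference in favorable-outcome rates between two groups."""
    return abs(positive_rate(group_a) - positive_rate(group_b))

# Hypothetical loan approvals (1 = approved) for two consumer segments.
segment_a = [1, 1, 0, 1, 1, 0, 1, 1]  # 75% approved
segment_b = [1, 0, 0, 1, 0, 0, 1, 0]  # 37.5% approved

gap = parity_gap(segment_a, segment_b)
print(f"parity gap: {gap:.3f}")       # 0.375, well above a 0.1 tolerance
```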

“We’re being profiled constantly every day, and we don’t know where our data is going,” said Nicol Turner-Lee, VP and chief research and policy officer for the Minority Media and Telecommunications Council. She observed that the Internet of Things requires “extensive data collection” around connected devices, which presents potential threats to consumers.

“Big Data is immature,” said Dixon. She stressed that there’s no “firm scalpel-like legislative definition of Big Data” and that “there are no global solutions to the various problems it poses.” However, there are “surgical strike solutions.” Dixon said that we cannot “just throw out the existing information fairness structures,” such as the Human Subjects Rule, the FCRA, or the Fair Information Practice Principles (FIPPs), but that “we need to look at what do we do in terms of what I would call statistical parity or statistical fairness.” How do you choose your data sets? Where do you get the data? Did the individual give it knowingly or willingly? If someone is found to have HIV/AIDS, that information is both “sensitive” and “highly prejudicial,” observed Dixon. “HIPAA was right,” and so was the Human Subjects Rule with its IRB structure.

To that point, MacCarthy commented that specific laws already exist for specific kinds of data in specific circumstances. An all-encompassing data law that applies in all circumstances for all consumers, he said, would be the wrong way to go.

Addressing the continuing relevance of the FIPPs, Turner-Lee said that “The Internet is this big, big buffet of places you can go,” with algorithms being created around every corner. It is “a hard ecosystem to distinguish,” which makes “a general framework like the FIPPs” appropriate, whether the data is small or big. Taking aim at “segmented marketing,” she said that “we’ve not been able to see the exact discrimination that happens,” but she is certain “it is going to happen.”

Turner-Lee argued that transparency has become one of the most important issues. “Do most consumers know… particularly minority consumers… how their data is being used?” She wondered whether consumers understand what they are giving up in order to participate in online conversations. Are they “losing something of social value,” leading to higher rates when they apply for a car loan?

Meaningful human consent for participation, as in the Human Subjects Rule, would be the bedrock foundation from which to work, according to Dixon. Although the Rule was originally “an ethical framework with no legislative teeth, it appealed to our humanity.” Even today, she said, we see violations of it as “an unfairness.”

Turner-Lee commented that, while there may be some data collections in which we would want everyone to participate, such as improvements to the smart grid, what about when a person walks into a fully wired IoT home? Do they “have the ability to opt out of that environment, because you don’t want someone to see?” Does the privacy framework balance “use versus harm,” and can consumers understand privacy notices?

Of course, if you put too much weight on transparency to consumers as the defense against unfairness and privacy violations, MacCarthy mused, “you’re doing customers and consumers a disservice,” “disresponsibilizing” and frustrating them. To him, making data collectors and users responsible makes more sense.

Just because the scale of the data has changed doesn’t necessarily mean we need to change the structures. According to Dixon, we need to look at the underlying ethics involved, not just the existing laws. Collection limitation may be hard to regulate, she said, but that doesn’t mean we should throw it out to focus entirely on data use instead.

Harmful uses of Big Data not addressed by the current legal and regulatory landscape
Data is not inherently good or bad, said Chris Calabrese, legislative counsel for the American Civil Liberties Union. Data “reflects the disparities in our society” and our job is to make sure that Big Data does not exacerbate economic and racial disparities.

Dan Castro, senior analyst for the Information Technology and Innovation Foundation, said that the workshop hadn’t heard any specifics of harm; certainly “not enough to say there might be a problem.”

Automated decision-making via Big Data, even at high levels of reliability, “like 96 percent, still leaves millions of people out of luck,” according to Jeremy Gillula, staff technologist at the Electronic Frontier Foundation. “We need to make sure the underlying technology is working as we think it should.”
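Gillula’s arithmetic is easy to reconstruct. As a rough illustration, with a US-scale population figure that is our assumption rather than his:

```python
# Rough illustration of Gillula's point: even 96 percent reliability,
# applied at population scale, leaves millions misclassified.
# The population figure is our assumption, not Gillula's.
population = 320_000_000           # approximate US population
reliability = 0.96
misclassified = population * (1 - reliability)
print(f"{misclassified:,.0f} people")  # prints "12,800,000 people"
```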

While the FTC workshop heard a lot about risks and anecdotes, “we haven’t agreed on what the harm is,” according to Michael Spadea, a director at Promontory Financial Group. Setting a “perfect regulatory regime” isn’t the goal, he said, because it “would probably kill the economy.” The real goal should be the most benefit with the least harm to consumers. As yet, there is “no evidence” of a “regulatory gap.” Identifying harm is the critical piece, since companies need clarity on the risks they should be acting to mitigate.

The UC Berkeley School of Information asked scholars to define Big Data and got forty different answers. As Chris Wolf, senior partner at Hogan Lovells, explained, “We’re painting with an awfully large brush,” in terms of definitions and uses of Big Data.

Getting over the transparency hurdle
The FTC data broker report evinced concerns about the combination of consumer tracking (online and offline) with advanced data analytics, and the same concerns came up at this workshop. And while the FTC emphasizes the need for transparency in Big Data analytics, some panelists questioned how important that would really be. According to Joseph Turow, professor at the University of Pennsylvania’s Annenberg School for Communication, consumers’ transparency worries might be generational and dissipate over time. Older people, he said, are “higher on the creep factor.” More and more consumers are demanding integrated experiences, so as they understand how Big Data works, they may care less about the kind of transparency the FTC is seeking.

However, Turow was outraged that companies will not reveal all of their clients and every source of their data. He indicated particular annoyance at Acxiom’s testimony at a December hearing of the Senate Commerce Committee, when the company’s representative could not name specifically to whom the company sells data.

Data products and their use have evolved over time as privacy and acceptance have evolved. However, according to Calabrese, “We’re woefully inadequate in transparency right now.” The FTC data broker report indicated there were at least 1,500 data points about each consumer, Calabrese said, but Acxiom’s About the Data website doesn’t show all of them or explain them. “The individual consumer should be able to know if they are the target” of something bad.

How often should consumers be notified? Spadea pointed out that inundating consumers with notifications will “drive them to distraction” or they’ll tune out the notices, which puts them at even more risk. A better approach is to “risk-gate” the data – the most sensitive data should be treated differently, Spadea explained.

Of course, not everyone should always get notice all the time, but it is “not a technologically infeasible thing” to allow consumers who want to find out what is going on with their data to do so, said Gillula.

Turow calls this the “century of data,” where the things we say we can’t do today will be done in the future, and he worries about whether or not we’ll have the tools to control those things.

Best practices to restrict bad things done with Big Data without impairing the positive uses
In risk-benefit analysis, said Calabrese, it is “easy when the benefits are to companies and the harms are to consumers.” Regulation is necessary to mitigate harms and prevent them from falling on specific classes of people.

Castro pointed out in response that just because some people lack access to computers doesn’t mean that everyone should stop using computers; that is not how you “bridge a digital divide.” The smart approach is to share access with those people and “bring them up.”

Deidentification could have great potential to protect consumers without hindering the use of Big Data, said Wolf. Spadea agreed, pointing out that there are lots of tools in the toolbox to mitigate data risks.
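No panelist specified a technique, but a common first step in deidentification is suppressing direct identifiers and coarsening quasi-identifiers. The sketch below is a minimal, hypothetical illustration; the field names and generalization rules are our assumptions, and real deployments would layer on stronger protections such as k-anonymity or differential privacy:

```python
# Minimal, hypothetical deidentification sketch: drop direct identifiers
# and coarsen quasi-identifiers. Field names and rules are invented for
# illustration; no panelist endorsed this specific approach.

DIRECT_IDENTIFIERS = {"name", "email", "ssn"}

def deidentify(record):
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue                        # suppress outright
        elif field == "zip_code":
            out[field] = value[:3] + "**"   # keep only the 3-digit ZIP prefix
        elif field == "age":
            out[field] = f"{(value // 10) * 10}s"  # bucket age by decade
        else:
            out[field] = value
    return out

record = {"name": "Jane Doe", "email": "jane@example.com",
          "zip_code": "19104", "age": 37, "segment": "auto-loan shopper"}
print(deidentify(record))
# {'zip_code': '191**', 'age': '30s', 'segment': 'auto-loan shopper'}
```

The familiar caveat is that coarsened quasi-identifiers can still be re-identified in combination, which is why deidentification is a risk-reduction tool rather than a guarantee.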

FTC seeks to grow that toolbox
FTC Commissioner Julie Brill (D) (also on the top 10 list of government players in consumer privacy in 2014) stated her belief “that big data analytics can bring significant benefits to consumers and to society, but we must endow the big data ecosystem with appropriate privacy and data security protections in order to achieve these benefits.” In her remarks, she advocated “legislation to create greater transparency and accountability for data brokers, as well as their sources and their customers… so we could begin to understand how these profiles are being used in fact, and whether and under what circumstances they are harming vulnerable populations.”

Ultimately, Brill said, regardless of any legislation, “data brokers should find out how their clients are using these products, tell the rest of us what they learn about these actual uses, take steps to ensure any inappropriate uses cease immediately, and develop systems to protect against such inappropriate uses in the future.”