Madden said big data essentially involves the “three V’s” – Volume (because the data is massive), Velocity (because data can be generated quite quickly), and Variety (because the data being used comes from many sources, including social media).

“It’s really taking different kinds of data sources … putting those together and trying to draw out some insights from that information,” Madden said.

There is an ethical debate surrounding big data. Some say there should be boundaries; others disagree. Madden said using big data to make decisions based on “sheer correlation and not necessarily causation” can be problematic, especially for people of color.

Taylor Moore of the Center for Democracy and Technology.

“It comes down to who is designing the technology, who’s at the table, and what sort of values are being built into the system,” she said.

Moore said it is also important to consider how big data is being used, by whom, and what impact it has on the broader issue of Internet freedom and access. She pointed to the use of algorithms to define what is legitimate news.

“Who is defining ‘legitimate’? What does that do to small voices? Are we not going to get a story about Ferguson (Missouri) in lieu of something else?” she said.

Huang said there is a fine line with big data: no one wants to over-police it, yet when it goes awry, there should be some accountability.

“We don’t want to squash innovation … but what happens when these sort of unintentional ill effects occur?” he asked.

Mary Madden of the Data & Society Research Institute.

Moore and Madden said there is an “algorithmic accountability” movement, and there are even algorithm ombudsmen. Madden cited an investigation by ProPublica that found that software used by prosecutors erroneously predicted that black defendants were twice as likely as white defendants to reoffend.

“There are opportunities for companies to become leaders in exposing some of these weaknesses of algorithms … and by doing that, they engender public trust,” Madden said.

Moore doesn’t favor policymakers stepping into the big data debate, even though there are myriad issues that must be sorted out.

“You want to provide a fence around this information but also not to impinge knowledge,” she said.