Big Data is growing in importance for organizational research, prompting the OCIS Division to sponsor a PDW on Big Data at the 2016 Academy of Management Meeting in Anaheim, California. Welcoming participants, incoming OCIS Division Chair Mary Beth Watson-Manheim explained that the OCIS Executive Committee explored different PDW topics and settled on Big Data as potentially affecting many different research areas in OCIS and the larger AOM membership. The committee was thus pleased to have assembled an outstanding group of experts to discuss Big Data from different perspectives, focusing on implications for research on organizations and technology, including the opening up of new research areas and methods, as well as the funding opportunities and ethical dilemmas involved. The PDW comprised a keynote talk by Alex (Sandy) Pentland, MIT, and short presentations followed by a panel discussion with Anindya Ghose, NYU; M. Lynne Markus, Bentley University; Ashish Thapliyal, Citrix Systems Inc.; and Heng Xu, National Science Foundation and Penn State University.

Summary of Prof. Alex (Sandy) Pentland’s Keynote Address:

In his keynote address, Prof. Pentland highlighted three main points. First, Big Data does not mean just analyzing social media data. Big Data should be more than just Twitter analytics because many other “digital breadcrumbs” are being created, and they are becoming increasingly accessible. For example, Prof. Pentland shared how Big Data on staff communication patterns allowed bank managers to visualize which of their bank units talked to each other most often before, during, and after crisis periods. Other interesting Big Data projects include using dynamic social networks to predict collective influence; using content-free, language-independent analytics to predict collective intelligence; using large sets of credit card records (e.g., over 100 million) to predict human foraging behavior; and analyzing demographic and socioeconomic data from government and UN open data initiatives.

Second, Big Data analytics fundamentally changes the scientific method, as new mathematical techniques allow better-informed management decisions. Researchers need to fundamentally re-think research methods and organizational theories for dealing with Big Data phenomena. The key challenge is to adequately capture the micro-processes underlying the generation of Big Data, and this may require some creative combination of inductive and deductive scientific approaches. For example, strong designs such as multiple randomized controlled trials can be employed to deduce disruptions in large communication network data sets. Such disruptions in communication patterns can indicate that “social changes” are happening, and an inductive approach may then be leveraged to probe more deeply into what kind of social changes are happening and what micro-processes might be driving them. Researchers need to be open to new paradigms, methods, and theories that can emerge from the revolution!

Finally, inherent features of Big Data require re-thinking privacy, that is, the control and use of personal data. This calls for a “new deal in data”: the right to possess, control, and dispose of your personal data, even if it is an atomistic point in a Big Data set. Users typically do not own the data they co-create with organizations, but they should have rights over how it is used. Moreover, digital identity, digital labor, and the digital economy are likely to become part of a large socioeconomic ecosystem; accountability for data and protection against unauthorized access are therefore key.

Prof. Anindya Ghose shared his unique perspective on Big Data research opportunities gained from interdisciplinary research with his colleagues on mobile marketing and the mobile economy. First, Prof. Ghose outlined two major forces shaping the mobile economy: (1) granular, user-level mobile channel data obtained via mobile ads and mobile coupons; and (2) data science tools for statistical modeling, predictive analytics, randomized field experiments, and machine learning. On these foundations rests a constellation of nine forces shaping mobile marketing effectiveness: Context, Tech Mix, Social Dynamics, Trajectory, Weather, Crowdedness, Saliency, Time, and Location. Crucial to all this is that consumers now expect brands and retailers to know who they are, where they are, where they’re going, what’s nearby, what’s going on, what they need, what they’ve bought, what they’re interested in, and what they respond to. This opens an avenue for asking novel and less obvious questions about consumer behavior, and it also allows creative research designs to answer those questions. For example, in examining marketing effectiveness, Prof. Ghose and his colleagues used mobile data to study whether consumer travel patterns are a stronger predictor of mobile coupon redemption, and how geo-fencing, geo-targeting, and the use of beacons can positively influence value creation by firms. “Simply put, mobile systems are data generators, and mobile data itself further generates tons of data too. The future of research is incredibly exciting,” he said.

Dr. Ashish Thapliyal, Principal, Architect, Machine Intelligence at Citrix, shared a boots-on-the-ground view of Big Data in the real world. Many billion-dollar organizations now use Big Data to boost both their internal and external outlooks on value. At Citrix, for example, the internal goal is to achieve organizational efficiency, product quality, and growth. The value chain comprises four key steps: (1) collect data from sources such as usage surveys and sales support logs; (2) collate them in data stores using Data Lake, Splunk, Oracle, etc.; (3) clean and digest the data using tools such as Hadoop, Spark, and custom tools; and (4) extract insights with the help of data scientists, analysts, and developers. The external goal, on the other hand, is to build intelligence into products to help customers achieve the outcomes they desire. The value chain here has an extra final step that uses the extracted insights to design product features. Dr. Thapliyal explained that organizations have to navigate many challenges to extract value from Big Data, not least the influx of terabytes of data a day and the need to anonymize individual data points in Big Data. Yet, “firms that do not engage in data driven decisions will likely die in the future – the writings are on the wall!”
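To make the four-step internal value chain concrete, the following is a minimal Python sketch, not a description of Citrix's actual systems. The records, field names, and function names are all hypothetical, and plain in-memory lists stand in for the data stores and processing tools named above.

```python
from collections import Counter

# Hypothetical raw inputs; real sources would be usage surveys and support logs.
USAGE_SURVEY = [
    {"user": "u1", "feature": "screen share"},
    {"user": "u2", "feature": " Screen Share "},
]
SUPPORT_LOG = [
    {"user": "u3", "feature": "recording"},
    {"user": None, "feature": "recording"},  # incomplete record
]


def collect():
    """Step 1: gather raw records from each source."""
    return {"survey": USAGE_SURVEY, "support": SUPPORT_LOG}


def collate(sources):
    """Step 2: pool all records in one store, tagging each with its source."""
    return [dict(rec, source=name) for name, recs in sources.items() for rec in recs]


def clean(records):
    """Step 3: drop incomplete records, normalize text, remove the user field.

    Removing the user field is only a crude stand-in for anonymization.
    """
    cleaned = []
    for rec in records:
        if rec["user"] is None:
            continue
        cleaned.append({"feature": rec["feature"].strip().lower(),
                        "source": rec["source"]})
    return cleaned


def extract_insights(records):
    """Step 4: a toy 'insight', namely which features are mentioned most."""
    return Counter(rec["feature"] for rec in records).most_common()


insights = extract_insights(clean(collate(collect())))
print(insights)
```

The external value chain described above would add one more function after `extract_insights`, turning the ranked features into product design decisions.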

Panel: Dr. Heng Xu – NSF Priority Areas of Interest in Big Data

Big Data is now an important priority for the National Privacy Research Strategy in the United States, according to Dr. Heng Xu, who shed light on the evaluation process for Big Data grants at NSF. A submission for a Big Data-focused grant is classified as either concerning a foundational issue or introducing an innovative application before funding recommendations are made. In this evaluation process, NSF uses a model called the Social, Behavioral and Economic (SBE) perspective of Big Data, in which researchers are challenged to combine designed data (i.e., data originating from designed sources such as scientific instruments, large-scale surveys, and large-scale simulations) with organic data (i.e., data produced without explicit data collection designs, such as data generated by mobile apps and ubiquitous sensing apps, social interaction data from social network sites, Twitter feeds, click streams, etc.). Under the SBE scheme, NSF grants to the social sciences have gone up considerably in the last three years. Organizational researchers should thus aim to apply for grants with Big Data projects that creatively combine designed and organic data.

Panel: Prof. M. Lynne Markus – New Ethical Issues Characteristic of Big Data Research

Drawing on her deep experience studying the social, economic, ethical, and workforce implications of Big Data and investigating a major research misconduct case, Prof. M. Lynne Markus discussed the ethical and misconduct concerns raised by Big Data research. Two prominent concerns are (1) non-transparency, the inability to review or replicate published research because of lack of access to proprietary data and platforms, and (2) circumvention of university research ethics review through partnerships with corporations and claims to use “public” data. Even as research shows that people can be re-identified by matching so-called “anonymized” data sets, Big Data research advocates are calling for excluding all social and behavioral research involving public or purchased data sets from human subjects protection reviews (https://www.nap.edu/catalog/18614/proposed-revisions-to-the-common-rule-for-the-protection-of-human-subjects-in-the-behavioral-and-social-sciences). Factors contributing to the ethical concerns about Big Data research include inadequate ethics codes in many academic societies and journals, and fragmented ethical control hierarchies, whereby academic misconduct is overseen by different authorities than those that deal with human subjects protection. Journal editors and reviewers have limited ability to address ethical concerns because of weak consensus, norms, practices, and rules regarding conflict of interest and ethics review disclosures, open data/code peer reviews, and research replications. This should sound alarm bells for all stakeholders, because “Big Data is The New Oil” for academic researchers, and, as we have learned from financial crises, fraud increases more during boom times than during bust times.
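The re-identification risk mentioned above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the two data sets, the names, and the choice of ZIP code, birth year, and sex as matching fields are all hypothetical and do not come from the PDW or from any real study.

```python
# Hypothetical "anonymized" research data: names removed, other fields kept.
anonymized = [
    {"zip": "02139", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "92802", "birth_year": 1975, "sex": "F", "diagnosis": "migraine"},
]

# Hypothetical public or purchased data set that still carries names.
public = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1975, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1982, "sex": "M"},
    {"name": "Carol Example", "zip": "92802", "birth_year": 1990, "sex": "F"},
]

# Fields that are not names but can jointly single out a person.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def reidentify(anonymized_rows, public_rows, keys=QUASI_IDENTIFIERS):
    """Link the two data sets on the shared fields.

    A row counts as re-identified only when exactly one named person
    in the public data has the same combination of values.
    """
    index = {}
    for row in public_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row["name"])

    matches = []
    for row in anonymized_rows:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:
            matches.append((names[0], row["diagnosis"]))
    return matches


matches = reidentify(anonymized, public)
for name, diagnosis in matches:
    print(name, "->", diagnosis)
```

In this toy case two of the three "anonymized" rows are linked back to a named person, even though neither data set on its own pairs a name with a diagnosis.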