Two Essays on Content Engineering with Unstructured Data: Business Insights from User-Generated Content

Ko, Eun Hee
(Spring 2019)

Abstract

My dissertation essays are motivated by the challenges that marketing practitioners face in a market environment where consumer behavior is changing quickly as platforms expand into new media native to computers and mobile devices, an expansion that has prompted continuous growth in marketing expenditures. While a wide range of research studies user-generated content (UGC) and its impact on marketing or consumer purchasing behavior, few studies examine content characteristics using large-scale field data. Moreover, most existing empirical research on the semantics of UGC pays limited attention to content beyond text. To fill this gap, I initiated and advanced several projects during my Ph.D. program that investigate content features not only from text but also from images. In doing so, I bring a variety of methodological approaches to my research (natural language processing, machine learning, and image processing techniques) and merge public and proprietary datasets, both longitudinal and cross-sectional.

The first essay of my dissertation examines consumer engagement, measured as the number of likes and comments tied to a brand-themed social media post on Instagram. I study consumer engagement with brand-themed user-generated content, that is, image-based social media posts tagged with #brandname, an increasingly common way for consumers to engage with brands. I describe consumer engagement using characteristics of a post's image and text (visual sentiment, visual complexity, text sentiment, and text complexity), which I construct using deep convolutional neural networks (Deep CNNs), a computer vision application programming interface (API), and natural language processing (NLP). Using data from over 86,000 Instagram posts collectively hashtagged with 86 product brand names, I find that visual sentiment and text sentiment are positively associated with consumer engagement. Visual complexity and text complexity both positively affect consumer engagement at low and moderate levels, and their effects become negative at high levels: too much information, from either the image or the text, attenuates consumer engagement. Around the middle of the range of visual complexity there is an optimal level that makes a post rich and engaging.

The second essay of my dissertation investigates factors that characterize manipulated reviews, concentrating on unstructured text data and on brand strength as a factor associated with the incidence of suspicious online reviews. Studying over 270,000 Amazon.com reviews from 16 product categories, I find that approximately 3% of reviews are ones consumers would be suspicious about. Extreme emotions (e.g., fear, joy) explain whether a review is viewed as suspicious better than mixed emotions (e.g., anticipation, surprise) or low-arousal emotions (e.g., sadness). I argue that weaker brands have an incentive to manipulate reviews. I find that weak brand status, described by lower advertising effort, is associated with suspicious reviews that are promotional (positive) in nature. The effect fades, however, for suspicious reviews that are denigrating (negative).

Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
