This is a joint article written with Mr. Tommi Salenius who works as a digital marketing specialist at Parcero Marketing Partners.

Introduction

Facebook advertising is a powerful form of online marketing for many purposes, ranging from direct response campaigns to brand visibility and awareness. However, competition on the ad platform increases every year, as companies raise their investments because Facebook advertising, relatively speaking, works very well.

Figure 1 shows how Facebook’s revenue, which comes almost exclusively from advertising, has grown during the last nine years. Last year, almost $40,000,000,000 (that’s forty billion dollars) was spent on Facebook ads.

Figure 1. Facebook worldwide ad revenue statistics from Statista.com.

Increasing budgets imply increasing competition, which means that advertisers need to increase their bids to maintain the same visibility. Therefore, to make a profit on Facebook, advertisers need to continuously optimize their accounts.

To illustrate the power of Facebook advertising for online sales, Figure 2 shows an example from a profitable Facebook account targeting direct online sales.

In this example, every euro invested in Facebook ads has generated direct online sales worth €10. This means that with a budget of €100,000 you can make sales worth €1,000,000 if your target group is large enough and there is demand for your product (assuming that sales grow linearly, of course).

The case of international Facebook advertising

Facebook is also one of the best choices to advertise globally, given its user base of more than two billion monthly active users (source: Statista.com).

Using the Locations feature in Facebook Ads targeting, several geographic targeting criteria can be chosen:

worldwide (type “Worldwide”)

country group or geographic region (e.g., type in “Europe”)

free trade area (e.g., type in “GCC, the Gulf Cooperation Council”)

sub-regions within a country (e.g., type in “Washington”)

other features (e.g., type in “Emerging markets”).

Figure 3 illustrates the Facebook targeting interface.

Figure 3. Targeting interface in Facebook Ads.

At the time of writing (October, 2018), the global targeting options in Facebook include the following:

Country groups

Africa

Asia

Caribbean

Central America

Europe

North America

Oceania

South America

Free Trade Areas

AFTA (ASEAN Free Trade Area)

APEC (Asia-Pacific Economic Cooperation)

CISFTA (Commonwealth of Independent States Free Trade Area)

EEA (European Economic Area)

GCC (Gulf Cooperation Council)

MERCOSUR

NAFTA (North American Free Trade Agreement)

Other Areas

Android app countries (paid)

Android app countries (all)

Emerging markets

Euro area

iTunes app store countries

Despite the tremendous potential of global advertising in Facebook Ads, companies often do not exploit this potential to the fullest. Moreover, we have observed that large international accounts tend to be messy and not well optimized. Therefore, in the following, we provide a checklist that can be used to audit such international Facebook Ads accounts.

Checklist for auditing international Facebook advertising

Here is a checklist for auditing Facebook paid advertising for international companies. This checklist is a concrete tool that can be used to evaluate your Facebook ad account’s current performance and identify development areas that can move you toward the desired results. There are four sections: A) Account setup, B) Ad campaigns, C) Organic content, and D) International aspect.

Section A: Account setup

1. Is Facebook Business Manager activated? Benefit: Gain more control over user rights and the possibility to operate with partners.

2. Is the Facebook pixel installed and configured? Benefit: Makes it possible to track business-related goals, for example, sales, visitors, blog reading times, etc.

3. Is additional software being used besides the Facebook ad platform? Benefit: Specific tools (e.g., Smartly, AdEspresso, Qwaya) can enhance Facebook performance by providing special features. If they are not used, they should at least be explored.

4. Is the international Facebook page feature in use? Benefit: This feature enables a unified follower count for country pages but separate content for each country.

Section B: Ad campaigns

6. Are Facebook campaign goals aligned with business goals? Benefit: The campaign goals (e.g., reach, engagement, traffic, sales, leads) should be traced back to the overall marketing strategy to ensure they match what is wanted.

7. What is the Facebook strategy of the current campaigns? Benefit: In auditing, it is useful to mentally classify the types of campaigns used in the ad account. These can include:

8. Is there something that works already? Benefit: Verifying what already works makes it possible to focus efforts on proven areas (e.g., some campaigns generate sales at low cost, data shows that specific creatives are working, different demographics are responding to ads).

10. Does the campaign structure follow best practices? Benefit: A clear division of campaigns provides better trackability and optimization. There should be different campaigns for all goals: prospecting and retargeting, upselling and cross-selling, reach and sales, etc.

11. What auction type is used? Benefit: Auction vs. fixed price: with auction you get better results if you beat the competition.

12. What placements are used? Benefit: Performance varies across placements; therefore, they should be tested. The Facebook ad platform offers these placements: Facebook, Instagram, Audience Network, and Facebook Messenger. Based on our experiments, Audience Network usually performs poorly, and Instagram is more expensive than Facebook. Moreover, Messenger ads might be perceived as more annoying than other placements because they invade the user’s private space (the inbox).

13. What ad content types have been tested? Benefit: A good account has tested various ad types (incl. carousel, link ad, Instagram story, video, image, canvas).

15. What levels of retargeting are utilized? Benefit: A good account uses “deep retargeting”, meaning that retargeting is tailored to particular sections of the website (e.g., main page, category pages, product pages, blog articles, cart, upselling, cross-selling).

16. What lookalike audience types are used? Benefit: Lookalike audiences can work because they retrieve similar users by “cross-pollinating” the targeted subset of users with Facebook’s known information about other users. These options should have been tested (website, email, page likes, purchased lookalikes).

17. Is A/B testing performed systematically? Benefit: A/B tests are a sign of active campaign management (at both the ad set and ad level). Facebook Ads provides a native option for A/B testing as a special campaign type (this campaign type can be used, e.g., for testing different creatives, target groups, or technical settings).

18. How well are the assets structured? Benefit: Clear naming principles make it easier to analyze and optimize (e.g., are campaigns, ad sets, and ads named systematically).

19. Is UTM tagging used? Benefit: UTM parameters enable tracking visitor performance in other analytics software, such as Google Analytics. The tagging can be done manually or automatically; the main point is that it should be done.

20. What attribution model is used? Benefit: Choosing a different attribution model can drastically change the interpretation of account performance. There are two types of conversions in Facebook: view conversions and click conversions. To get a more conservative view, include only the click conversions with a short attribution window (e.g., 1 day). To get a rosier picture, include view conversions with a long attribution window (e.g., 28 days). There is no absolutely right or wrong attribution model.

21. Is dynamic advertising used? Benefits:

dynamic advertising can be used both in retargeting and in new customer acquisition

it offers a wide range of options if the technical setup is done correctly, e.g., automated price promotions

22. Is advanced configuration of dynamic advertising used? Benefit: This is an underused yet high-potential feature of Facebook Ads. It makes it possible to customize automatic advertising (e.g., prefer products with a high gross margin, show the right products in the right geographic areas).

23. Are rules used for optimization? Benefit: Rules enable monitoring of, and automatic responses to, business-critical conditions (e.g., notifications about data anomalies, adjusting the budget based on results, etc.).

24. Is the budget spent effectively? Benefit: Facebook Ads can waste budget, but there can also be much potential for scaling up the spend. Based on performance metrics, one should analyze whether the budget should be decreased or increased, what the potential reach of the target groups is, how well those target groups are reached, and with what impression frequency.

25. What bid strategy is used? Benefit: A good account has tested several options, including lowest cost (standard), lowest cost with bid cap (risk of delivery issues), and target cost (can be used for scaling up the budget).
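
Item 19 above notes that UTM tagging can be done manually or automatically. As a minimal sketch of the automatic route, the following Python function appends the standard utm_source, utm_medium, and utm_campaign parameters to a landing-page URL while keeping any existing query parameters. The function name add_utm, the example URL, and the campaign name are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse


def add_utm(url, source, medium, campaign):
    """Return url with UTM parameters added, keeping existing query parameters."""
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query))
    query.update({
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
    })
    return urlunparse(parts._replace(query=urlencode(query)))


# Hypothetical landing page and campaign name.
tagged = add_utm(
    "https://example.com/shop?lang=fi", "facebook", "paid_social", "fi_retargeting"
)
print(tagged)
```

Generating the tags from the campaign name in this way also supports the systematic naming asked for in item 18, since the same name then appears in both Facebook Ads and the analytics software.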

Section C: Organic content

26. Is there enough quality content to be believable in the eyes of customers if they visit the Facebook page? Benefit: Visitors may want to check the quality of the page. Having little or no organic content creates mistrust.

27. How active are the Facebook followers of the page? Benefit: There may be an opportunity to get insights from followers or turn their enthusiasm into more business. Engagement rate is a good metric, i.e., post responses divided by post impressions.

28. Is organic content reaching the target group? Benefit: If not, maybe it should be advertised. Many Facebook pages produce fairly good content that reaches nobody organically.

29. Should the focus be on organic content or paid advertising? Benefit: The strategic roles of organic and paid should be addressed. What is the role of organic content? What is the role of paid advertising? Note: multiple ads can be advertised and A/B tested without publishing these on the news feed.

Section D: International aspect

30. Are the ads translated? When advertising in, e.g., 10 countries with different languages, the ads should also be written in 10 different languages. Note that one country can contain multiple language groups, requiring localization even within a single country.

31. Does the campaign structure support multiple languages? Each language should be placed in a separate target group. For example, a campaign could be named after the country, and it should contain a different ad group for each language.

32. Is there enough budget to advertise internationally to all target groups? If you are targeting several countries, cities, and languages, these all need different budgets. To make an impact, it is usually not wise to divide the budget into pieces that are too small.

33. Is there other localization besides translation? A common error is to assume that localization is only about language. However, it is also about culture, customs, and ethnicity. For example, the value propositions or communicated benefits may be entirely different when the same product is promoted to culturally different target groups (e.g., the collectivity-individuality aspect might differ). Another example is that imagery matters for the ethnic match between the target audience and the people shown in the ads.

34. Have country-specific legal restrictions been taken into consideration? E.g., different countries have different restrictions on promoting alcohol products, and European countries have strict rules for handling data under the GDPR.

35. How do normalized metrics vary by country? Compare performance using normalized metrics, such as ROI or ROAS, because they adjust for variation between markets. For example, Facebook Ads bids can be ten times more expensive in the US than in Vietnam. Similarly, purchasing power differs, so the average conversion value in Vietnam can be one tenth of that in the US, meaning that advertising would be equally profitable in both markets.

36. What are the city-level performance differences? Another common mistake is to assume that country is a detailed enough segmentation criterion for performance differences. However, performance can vary greatly by city, e.g., in big countries like China or the US. Moreover, rural areas can differ from urban areas because people’s tastes, values, and behavior are different. To account for this, Facebook advertisers should segment by city in addition to country (e.g., compare the top 5 cities of each country).

37. What are the segment similarities across countries? Each impression has a cost. And each impression also adds information about customer responses. However, in the Facebook Ads account the performance values are siloed across different campaigns and ad sets. Therefore, to optimize such accounts, data needs to be combined. For example, if targeting 12 countries, the performance by demographic groups can be aggregated to give more statistical power (higher reliability for found similarities and differences).
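
Items 35 and 37 can be sketched together. The example below, written in Python, computes ROAS (revenue divided by ad spend) per country and then pools the same rows by demographic group across countries. All figures and group labels are invented for illustration, and the helper name roas_by is hypothetical; in practice the rows would come from an export of the account's ad set data.

```python
from collections import defaultdict

# Hypothetical ad set rows: (country, demographic group, spend, revenue) in euros.
rows = [
    ("US", "women 25-34", 1000.0, 4000.0),
    ("US", "men 25-34", 1000.0, 2000.0),
    ("VN", "women 25-34", 100.0, 400.0),
    ("VN", "men 25-34", 100.0, 150.0),
]


def roas_by(rows, key_index):
    """Sum spend and revenue per key, then return revenue / spend for each key."""
    spend = defaultdict(float)
    revenue = defaultdict(float)
    for row in rows:
        key = row[key_index]
        spend[key] += row[2]
        revenue[key] += row[3]
    return {key: revenue[key] / spend[key] for key in spend}


roas_by_country = roas_by(rows, 0)  # normalized comparison between markets
roas_by_group = roas_by(rows, 1)    # demographic groups pooled across countries
print(roas_by_country)
print(roas_by_group)
```

Note that pooling by summing spend and revenue weights each country by its budget, so large markets dominate the pooled figure. Whether that is desirable depends on the question being asked.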

Conclusion

This list of 37 items is a good starting point for analysing any Facebook Ads account running international campaigns. Besides these steps, Facebook account level data can be used for analysis purposes to find patterns in the data. For example, making country level breakdowns is made easy in the user interface of Facebook Ads platform.

About the authors:

Tommi Salenius is a Digital Marketing Manager at Elämyslahjat.fi, a Finnish e-commerce company that sells experience gifts. Tommi also works at Parcero Marketing Partners as a Lead Digital Marketing Strategist. www.tommisalenius.com

Joni Salminen is a Digital Marketing Manager at Elämyslahjat.fi, a Finnish e-commerce company selling experience gifts. Joni is also a board member at Konvertigo Digital Agency that runs digital marketing campaigns to over 100 countries. www.jonisalminen.com

In our APG team (APG = Automatic Persona Generation), we have set the goal of doing value-driven system development. “Value-driven” means that each feature we add or incorporate, solves a real user problem (i.e., provides real value). Since our clients are typically operating in the business domain, their problems deal with understanding their customers better. That’s the space APG operates in.

To discover real user needs, we’ve been carrying out several user studies. However, there are many issues in conducting user studies. The feedback we get is not always relevant or valid. For example, some participants might not be truly engaged or interested in the system and just participate out of duty or because they were “forced to”. Similarly, users may just brainstorm features that they would not really use but that “sound cool”. Moreover, when compiling the feedback, we find that there are a lot of requests for new features. Say, the users want 10 new features, but we have time and resources for two and therefore need to prioritize.

Below, I’m sharing three principles we’ve developed in order to cope with these situations.

1. Who does the feedback come from? => not all people are engaged, motivated, or knowledgeable enough to give useful feedback. Therefore, we have to consider if a person is just “shooting ideas” or if he or she actually wants to provide useful feedback. We then prioritize the comments from the people whose feedback indicates they are taking the commenting more seriously.

2. How repetitive is the feedback? => if the request comes from many organizations and many people within an organization, it is more likely to be a real problem to solve. If it’s a rare request, the problem is probably also very rare and not worth focusing on.

3. Is the feedback traceable to a real problem the user has? => this question tries to clarify if the request is a nice-to-have or a painkiller. We need to solve real problems with the system, so nice-to-haves need to be minimized. Even if many motivated people suggest a new feature, it could still be a nice-to-have if we cannot logically connect it to a real problem.
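
To make the three principles concrete, here is a rough Python sketch of how a backlog of requests could be filtered and ranked. It is only one possible reading of the principles, not a description of how the APG team actually scores feedback. The class FeatureRequest, its fields, the ranking order, and the example requests are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FeatureRequest:
    name: str
    engaged_requesters: int    # principle 1: people who gave considered feedback
    organizations: int         # principle 2: distinct organizations asking for it
    solves_real_problem: bool  # principle 3: traceable to a real user problem


def prioritize(requests):
    """Drop nice-to-haves, then rank the rest by breadth of demand."""
    painkillers = [r for r in requests if r.solves_real_problem]
    return sorted(
        painkillers,
        key=lambda r: (r.organizations, r.engaged_requesters),
        reverse=True,
    )


# Hypothetical requests collected from user studies.
requests = [
    FeatureRequest("export personas to PDF", 4, 3, True),
    FeatureRequest("animated avatars", 6, 2, False),
    FeatureRequest("filter personas by country", 2, 1, True),
]
ranked = [r.name for r in prioritize(requests)]
print(ranked)
```

In this sketch the popular but untraceable request is dropped entirely, which mirrors the point that even many motivated requesters do not turn a nice-to-have into a painkiller.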

Conclusion

Nice-to-have features are like a disease; everything can be done, but only a few things are worth doing. With nice-to-have-features, the system will not have active usage. The goal of value-driven development is to develop a system that has real users that actively use it.

Therefore, focusing on distinguishing the most useful feedback from a lot of interviews, think-alouds and comments is crucial, especially for small teams and startups that are forced to focus their development efforts.

This is an unpublished exploratory study we wrote with Professor Jim Jansen for Machine Learning and Data Analytics Symposium (MLDAS2018), held in Doha, Qatar.

Change of landscape: For a long time, automation has been invading the field of marketing. Examples of marketing automation include the various scripts, rules, and software solutions that optimize pay-per-click spending, machine learning techniques utilized in targeting of display advertising (Google, 2017), automated tools that generate ad copy variations, and Web analytics platforms that automatically monitor the health of marketing performance, alerting the end users automatically in case of anomalies.

In particular, several steps of progress towards automating analytics insights are currently being made in the industry and research fronts of data analytics.

For example, there are several tools providing automated reporting functions (e.g., Google Analytics, TenScores, Quill Engage, etc.). While some of these tools require pre-configuration such as creating report templates, it is becoming more common that the tool itself chooses the relevant insights it wants to portray, and then delivers these insights to the decision makers, typically pinging via email. An example of such an approach is provided in Figure 1 that shows Quill Engage, a tool that automatically creates fluent text reports from Google Analytics data.

As can be seen from Figure 1, the automatic analytics tool quickly displays key information and then aims to provide context to explain the trends in the key performance indicators.

Benefits: The benefits of automatic analytics are obvious. First of all, automation spares decision makers’ time, as they are not forced to log into systems, but receive the insights conveniently in their email inboxes and can rapidly take action. Since cognitive limitations (Tversky & Kahneman, 1974) impose serious constraints on decision makers dealing with ever-increasing amounts of “big data,” smart tools that pre-process and mine the data at the user’s convenience are highly beneficial.

The core issue that automatic analytics is solving is complexity.

As a marketing manager, one has many platforms to manage and many campaigns to run within each platform. Multiple data sources, platforms, and metrics quickly introduce a degree of complexity that hinders effective processing of information by human beings, constrained by limitations of cognitive capacity.

In general, there are two primary use cases for business analytics: (1) deep analyses that provide strategic insights, and (2) day-to-day analyses that provide operational or tactical support. While one periodically needs to perform deep analyses on strategic matters, such as updating online marketing strategy, creating a new website structure, etc., the daily decisions cannot afford a thorough use of tens of reports and hundreds of potential metrics. That is why many reports and metrics are not used by decision makers in the industry at all.

The solution to this condition has to be automation: the systems have to direct human users’ attention toward noteworthy things. This means detecting anomalies in marketing performance, predicting their impact, and presenting them in an actionable format to decision makers, preferably by pinging them via email or other channels, such as SMS. The systems could even directly create tasks and push them to project management applications like Trello. A requisite for automatic analytics should therefore be the well-known SMART formula, meaning Specific, Measurable, Appropriate, Realistic and Timely goals (Shahin & Mahbod, 2007). Through this principle, decision makers are able to rapidly turn insights into action.

Interfaces for automatic analytics: To accomplish the goal of automatic analytics, one trending area is natural language systems, where users find the information by asking the system questions in free format. For example, previously, Google Analytics had a feature called Intelligence Events, which detected anomalies. Currently, Google provides automatic insights via a mobile app, in which the user can ask the system in natural language to provide information. An example of this is provided in Figure 2.

However, even asking the system requires effort and prior knowledge. For example, what if the question is not relevant or misses an important trend in the data? For such cases, the system must anticipate, and in fact analyze the data beforehand. This form of “intelligent anticipation” is a central feature in automatic analytics systems.

Examples: In the following, we provide some examples of current state-of-the-art tools of automatic analytics. We then generalize some principles and guidelines based on an overview of these tools.

First, in Figure 3, we present a screenshot from an email sent by TenScores, a tool that automatically scans Quality Scores for Google AdWords campaigns.

Figure 3: TenScores, the Automatic Quality Score Monitoring Tool.

In search-engine advertising, Quality Scores are important because they influence the click prices paid by the advertisers (Jansen & Schuster, 2011; Salminen, 2009). In this particular case, the tool informs when there is a change in the average Quality Score of the account.

From a user experience perspective, the threshold for alerting the user is set to a very small change, resulting in many emails being sent to the users. This highlights the risk of automation becoming “spammy,” leading to a loss of user interest. The correct threshold should be set experimentally, e.g., according to open rates, by experimenting with different increments of messaging frequency and impact thresholds.

In Figure 4, we can see a popular Finnish online marketplace, Tori.fi. Tori sends automatic emails to its corporate clients, showing how their listings have performed compared to previous period, and enabling the corporate clients to take direct action from within the email.

For example, one can click the blue button to boost a particular listing that is not performing well. In addition, there is a separate section (not visible in the screenshot) showing the best performing listings.

Risks: There are also risks associated with automatic analytics. For example, in search-engine advertising, brands are bidding against one another (Jansen & Schuster, 2011). Thus, an obvious step for Google to further optimize its revenue by providing transparent auction information would be to send automatic emails when the relative position (i.e., competitiveness) of a brand decreases, prompting advertisers to take action.

This potential scenario also raises questions about morality and ethics of automated analytics, especially in click auctions where the platform owners have an incentive to recommend actions that inflate click prices (Salminen, 2009). For example, in another online advertising platform, Bing Ads, the “Opportunities” feature gives suggestions marketers can implement in a click of a button. However, many of these suggestions relate to increasing the bid prices (see Figure 5).

If the default recommendation is always to raise bids, the feature does not add value to the end user but might in fact destroy it. From an end user point of view, therefore, managers are encouraged to take recommendations with a grain of salt in such cases. From a research point of view, it is an interesting question to find out how much the automatic recommendations drive user actions.

Discussion: The current situation is that marketing optimization consists of various micro-tasks that are inter-connected and require analytics skills and creativity to be solved in an optimal way. The role of automated analytics, at least with the current maturity of technology, is pre-filtering this space of potential tasks into a number that is manageable for human decision makers, and, potentially, assigning the tasks priority according to their predicted performance impact.

In this scenario, humans are still needed to make the final decisions. The human decides which suggestions or insights to act upon. Nevertheless, the prospect of automatic filtering and sorting is highly beneficial in maneuvering the fragmented channel and campaign landscape taking place in practical online marketing work.

Practical guidelines:

As each vertical has its own KPIs, metrics and questions, there is a requirement of using many tools. For example, search-engine optimizers require drastically different information than display advertisers, and therefore it makes no sense to create a single solution. Instead, an organization should derive the tools from its business objectives and based on the specific information needed to achieve them.

An example of fine-grained automatic analytics is TenScores, which specializes in monitoring one metric in one channel (Quality Score in Google AdWords). Their approach makes sense because Quality Score is such an important metric for keyword advertisers and its optimization involves considerable complexity, enabling TenScores to provide in-depth recommendations that are valuable to end users.

However, even though the tools may be channel-specific, their operating principles can be similar. For example, stream filtering and anomaly detection algorithms are generalizable to many types of data, and thus have wide applicability. Moreover, setting the frequency threshold for pinging decision makers is a key issue that should be experimented with when designing automatic analytics systems.
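
To illustrate how the alerting threshold controls the number of pings, here is a minimal Python sketch of a simple anomaly detector. It flags a day when its value deviates from the mean of the preceding days by more than a chosen number of standard deviations. This z-score rule is an assumption made for illustration and is not the method used by any of the tools mentioned; the function name find_anomalies, its parameters, and the data are hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, window=7, threshold=2.0):
    """Flag indices whose value deviates from the trailing window mean
    by more than `threshold` sample standard deviations."""
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            alerts.append(i)
    return alerts


# Hypothetical daily conversion counts with a sharp drop on the last day.
daily_conversions = [50, 52, 49, 51, 50, 53, 48, 51, 50, 30]

sensitive = find_anomalies(daily_conversions, threshold=2.0)
strict = find_anomalies(daily_conversions, threshold=15.0)
print(sensitive)  # the drop on the last day (index 9) is flagged
print(strict)     # nothing is flagged at this much stricter threshold
```

A low threshold produces more emails and a high one risks missing real changes, which is why the text above recommends tuning it experimentally, for example against open rates.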

Even if there is automation, it is too early to speak of real artificial intelligence. The current systems always have manually set parameters and thresholds, and miss important things that are clear for individuals. For example, the previously shown Quill Engage cannot provide an explanation why the sales dropped when going from December to January — yet, this is apparent to any individual working in the gift business: Christmas season was the reason.

Implications for developers of automatic analytics systems: Developers of various analytics systems should no longer expect that their users log in to the system to browse reports. Instead, the critical information needs to be automatically mined and sent to decision makers in an actionable format (cf. SMART principle). There is already a considerable shift in the industry to this direction which will only be emphasized as customers realize the benefits of automatic analytics. Thus, we believe the future of analytics is more about detecting anomalies and opportunities, and giving decision makers easy choices to act upon. Of course, there are also new concerns in this environment, such as biased recommendations by online ad platforms – is the system advising you to increase bids because it maximizes your profit or because it increases the owner’s revenue?

Conclusion: Analytics software providers are planning to move toward providing automated insights, and researchers should follow suit. Open questions are many, especially relating to users’ interaction with automatic analytics insights: how responsive are users to the provided recommendations? What information do the users require? What actions do users take based on the information? We expect interesting studies in this field in the near future.

I’ve noticed that some people struggle to communicate effectively via email, so maybe sharing these tips will help someone.

Tips:

1. include *one message* per email — when you include 2 or more, the others easily get ignored. It’s better to send a new message, like “ps. one more thing…”

2. don’t make people think why you move from A to B, but make it evident from the text. Like, make a logical argument that explains itself. Find supporting evidence when needed and be truthful to yourself.

3. use short sentences, short paragraphs — people are scanning so shortness sells.

4. use plain words, don’t make people think

5. use words and phrases that cannot be misunderstood

6. be personal, use people’s names to catch their attention

7. use bolding and lists to facilitate scanning — in text-only, use *asterisk symbols* to emphasize

8. include the next steps — too many emails end up in a limbo, like what should I do after reading it?

Moreover,

9. do the thinking for the reader, so it’s easy to take action. Sometimes this means writing a single email can take an hour or more.

10. include all the relevant people when forwarding or replying — maximum transparency, maximum information

11. however, when you want a specific response, send your message individually. For example, don’t send survey links as mass-posting; approach people personally.

I got introduced to network pictures by Valtteri Kaartemo a few years back, and thought it was a cool idea. Since then, I’ve realized — after talking to many startups — that it’s more than a cool idea. It’s actually useful.

That’s because startups routinely overlook their networks and just focus on competitors. They position themselves against competitors, not alongside collaborators. This can be very detrimental to success, because often the most connected startups do the best: they get the biggest investment rounds, biggest sales deals, etc. They are just liked more.

So you need to network. And using network pictures can help.

What is a network picture?

The idea of a network picture is that you draw your business as a network diagram (i.e., you in the middle as the central node, and other players linked to you as first- or second-degree nodes).

An example of a network picture (source: [1]):

The others in the picture can be any parties with a logical relation to your startup. They can be:

collaborators

customers

investors

suppliers/vendors

resellers/distributors

marketing/business development agencies

freelancers

friends and family

research institutes/universities

state departments

entrepreneurship societies

corporations with venture programs

press/media

associations/non-profits.

They can be companies or individual people (e.g., influencers, decision makers, etc.).

Basically, those are the actors that your business interacts with (or should interact with). You are not an island.

How to use network pictures for your startup?

Now, the important thing is this: you first draw the current situation, and then the vision. I repeat,

1. Draw current situation

2. Draw vision

—

3. Compare the two

The point is that when drawing the vision, you automatically make apparent your desired state of mind which makes it easier to create a tangible plan for networking. It’s about making the vision explicit.
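
The comparison in step 3 can be sketched in a few lines of Python by treating each network picture as the set of parties directly linked to your startup. This is a simplification that ignores second-degree nodes, and all party names below are hypothetical.

```python
# Each network picture is modeled as the set of parties linked to the startup.
# All party names are hypothetical.
current = {"customer A", "supplier B", "freelancer C"}
vision = {
    "customer A",
    "supplier B",
    "local university",
    "industry association",
    "investor D",
}

to_build = sorted(vision - current)       # links that exist only in the vision
to_keep = sorted(vision & current)        # links already in place
to_reconsider = sorted(current - vision)  # current links absent from the vision

print("Build:", to_build)
print("Keep:", to_keep)
print("Reconsider:", to_reconsider)
```

The "Build" list is effectively the networking plan: each entry is a connection that the vision requires but the current situation lacks.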

It also helps you consider possibilities that you had overlooked. Like, “Oh, we should check if the local university has any research projects that coincide with our product development roadmap“. Or, “We could meet up with the industry association people to ask if they find potential in our tech.” Things like that.

Through this process, you (hopefully) realize that you’re not an island, and that there are many parties you could (and should) involve in your business at varying degrees of commitment. You can continue by analyzing the motives and win-wins that your connections to current and future parties entail. A good approach has been illustrated in [2]:

Conclusion

Often, business planning for startups focuses on competitors, but collaborators can be even more important. Start by drawing them, and then make the connections happen.

If you’re a startup founder and haven’t thought about the importance of networks, you should. There is research that shows networks and connections matter — and common sense supports this argument, too. You are acting in an ecosystem of other players. It’s the market, not your garage, that matters.

Footnotes:

[1] Kaartemo, V. (2013). Network development process of international new ventures in internet-enabled markets: service ecosystems approach (Doctoral dissertation). Turku School of Economics, Turku, Finland. Retrieved from http://tsenet.fi/wp-content/uploads/2013/11/network-development-process-of-international-new-ventures.pdf

This is a joint piece by Dr. Joni Salminen and Professor Jim Jansen. The authors are working on a system for automatic persona generation at the Qatar Computing Research Institute. The system is available online at https://persona.qcri.org.

Introduction

Personas are fictive characterizations of the core audience or customers of a company, introduced into software development and marketing in the 1990s (see Jenkinson, 1994; Cooper, 1999). Personas capture and summarize key elements of key customer segments so that decision makers could better understand their audience or customers, not just by using numbers but also referring to qualitative attributes, such as key pain points and desires, needs and wants. We refer to persona creation as “giving faces to data,” as personas are ideally based on real data on customer behavior. Figure 1 shows an example of a data-driven persona in which the attributes are inferred automatically from social media data.

Figure 1: Data-driven persona.

While personas have been argued to have many benefits in the academic literature (see e.g., Nielsen, 2004; Pruitt & Grudin, 2003; Salminen et al., 2017), we are constantly facing the same questions from new client organizations wishing to use our system for automatic persona generation (APG) (An et al., 2017). Namely, they want to know how to use personas in practice. While we often make the analogy that personas are like any other analytics system, meaning that the use cases depend on the client’s information needs (i.e., what they want to know about the customers), this answer is still a bit puzzling to them.

For that reason, we decided to write this piece outlining some key use cases for personas. These are meant as examples, as the full range of use cases is much wider. We will first explore some general use cases, and then proceed to elaborate on more specific persona use cases by different organizational units.

General Use Cases of Personas

In general, there are three main purposes personas serve:

1) Customer Insights. This deals with getting to know your core audience, users or customers better. For example, APG enables an organization to understand its audience’s topics of interest and preferred social media content. Who uses? Everyone in the organization.

2) Creation Activities. Using persona information to create better products, content, marketing communication, or other outputs. Who uses? Everyone in the organization dealing with customer-facing outputs.

3) Communication. Using personas for communication across departments. While it is difficult to discuss a spreadsheet, it is much easier to communicate about a person. Sharing the persona work across divisions thus increases the chance for realization of benefits. Personas make data communicable and keep team members focused on the customer needs. Who uses? Everyone in the organization.

Specific Use Cases of Personas

In addition to shared use cases of personas, there are more specific use cases. For example, product managers can use the information to design a product that meets the needs or desires of core customers, and marketing can use personas to craft messages that resonate. Here, we are outlining specific examples of use cases within organizational units. More specifically, we allocate these use cases under four sections.

1) Customer Insights and Reporting

Journey Mapping: Plot the stages and paths of the persona lifecycle, documenting each persona’s unique state of mind, needs and concerns at each stage. Understand your website visitors’ customer journey.

Persona Discovery: Document the individuals involved in the purchase process in a way that allows decision makers to empathize with them in a consistent way.

Brand Discovery: Uncover how your core customers feel about your product or service and how they rationalize their purchase decisions.

Reporting and Feedback: Report and review data and insights to drive strategic decisions, as well as provide information to the organization as a whole.

2) Creation Activities

Planning Product Offerings: With the help of personas, organizations can more easily build the features that suit their customers’ needs. Consider the goals, desires, and limitations of core customers to guide feature, interface, and design choices.

Role Playing: Personas help product developers “get into character” and understand the circumstances of their users. They facilitate genuine understanding of the thoughts, feelings, and behaviors of core customers. Individuals have a natural tendency to relate to other humans, and it’s important to tap into this trait when making design and product development choices.

Content Creation: Content creators can leverage personas for delivery of content that will be most relevant and useful to their audience. When planning for content, we might ask “Would Jamal understand this?” or “Would Jamal be attracted by this?” Personas help one determine what kind of content is needed to resonate with core customers and in which tone or style to deliver the content. Naturally, customer analytics can and should be used to verify the results.

3) Persona Experimentation

Channel and Offering Alignment: Align every piece of offerings and marketing activity to a persona and purchase stage, identifying new channels and needs where opportunities exist.

Prediction of Popularity: Predict how a given persona will react to content, marketing messages, or products. This is a particular advantage of data-driven personas that enable using the underlying topical interests of the persona to model the likely match between personas and a given content unit.

Experimentation and Optimization: Carry out well-designed experiments with personas to produce statistically valid business insights and apply the results to optimize performance. For example, you could run Facebook Ads campaigns targeting segments corresponding to the core personas and analyze whether the campaigns perform better than broader or other customer segments.
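As a rough illustration of the experimentation idea, the sketch below compares a persona-targeted campaign with a broad campaign using a two-proportion z-test. The function name and all the numbers are hypothetical and chosen only for illustration; this is one common way to check such a difference, not a description of how APG or Facebook Ads evaluates campaigns.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both campaigns convert at the same rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: clicks out of impressions for each campaign
z, p = two_proportion_z_test(conv_a=120, n_a=4000, conv_b=90, n_b=4000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests that the difference between the two campaigns is unlikely to be due to chance alone, assuming the two audiences were otherwise comparable.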

4) Strategic Decision Making

Strategic Marketing: When you understand where your core customers spend their time online, you are able to focus your marketing spend on these channels. For example, if the data shows that your core customers prefer YouTube over Facebook, you can increase your marketing spend in the former. Think how you might describe your product for this particular type of person. For example, would Bridget better understand your offering as a “social media service” or as an “enterprise customer management tool”? Depending on the answer, the communicative strategy would be different.

Sales Strategies: Targeted offerings can help organizations convert more potential customers to subscribers, followers and customers. You can also use personas to tailor lead generation strategies, which is likely to improve your lead quality and performance. By approaching your messages from a human perspective, you can create sales and marketing communication that is tailored to your core customers and, therefore, is likely to perform better.

Executives: Key decision makers can keep personas in mind while making strategic decisions. In fact, a persona can become a “silent member in the boardroom,” evoked to question the customer impact of the considered decisions.

Examples for the APG system

In the following, we will include some use case examples from the APG system that generates personas automatically from online analytics and social media data. The system is currently fully functional, and we are accepting a limited number of new clients with free-of-charge research licenses. See the end of this post for more details.

Figure 2: This functionality enables the client to generate personas from their chosen data source (currently, the following platforms are supported: YouTube, Facebook, Google Analytics). The client can choose between 5 and 15 personas.

Figure 4: This feature enables an easy comparison of the personas across their key attributes. Improves understanding of the core customer segments.

Figure 5: This feature shows which personas most often react to which individual content.

Figure 6: This feature shows how the interests and other information of the personas change over time. Currently, APG generates new personas on a monthly basis.

Figure 7: This feature enables a gap analysis of the current audience and potential audience. The statistics are retrieved from actual audience data of the organization and the corresponding Facebook audience (via Facebook Marketing API).

Conclusion

Forrester Research (2010) reports a 20% productivity improvement with teams that use personas. Yet, using personas is not always straightforward. Ultimately, the exact use cases depend on the client’s information needs. These needs can best be found by collaborating with persona creators to provide tailored personas that are useful specifically for a given organization in their practical decision making.

Through means of “co-creation,” clients and persona creators can figure out together how the personas could be useful for real usage scenarios. According to our experience, useful questions for defining the client’s information needs include:

What are your objectives for content creation / marketing?

What kind of customer-related decisions do you make?

What kind of customer information do you need?

What analytics information are you currently using?

What kind of customer-related questions do you not currently get good answers to?

How would you use personas in your own work?

What information do you find useful in the persona mockup?

What information is missing from the persona mockup?

If you are interested in the possibilities of automatic persona generation for your organization, don’t hesitate to contact us! Professor Jim Jansen will gladly provide more information: [email protected]. However, please note that for automatic persona generation to be useful for your organization, you need to have at least hundreds (preferably thousands) of content pieces published online with a wide audience viewing them. APG is great at summarizing complex audiences, but if you don’t have enough data, persona generation is better done via manual methods.

Web 2.0 was about all the pretty, shiny things about social media, like user-generated content, blogs, customer participation, “everyone has a voice,” etc. Now, Web 3.0 is all about the dark side: algorithmic bias, filter bubbles, group polarization, flame wars, cyberbullying, etc. We discovered that maybe everyone should not have a voice, after all. Or at least that voice should be used with more attention to what you are saying.

While it is tempting to blame Facebook, media, or “technology” for all this (just as it is easy to praise it for the other things), the truth is that individuals should accept more responsibility for their own behavior. Technology provides platforms for communication and information, but it does not generate communication and information; people do.

In consequence, I’m very skeptical about technological solutions to the Web 3.0 problems; they seem not to be technological problems but social ones, requiring primarily social solutions and secondly hybrid solutions. We should start respecting the opinions of others, get educated about different views, and learn how to debate based on facts and find fundamental differences without resorting to argumentation errors. Here, machines have only limited power – it’s up to us to re-learn these things and keep teaching them to new generations. It’s quite pitiful that even though our technology is 1000x better than in Ancient Greece, our ability to debate properly is one tenth of what it was 2000 years ago.

Avoiding enslavement by machines requires going back to the basics of humanity.

Missing values are a critical issue in statistics and machine learning (which is “advanced statistics”). Data imputation deals with ways to fill those missing values.

Andriy Burkov made this statement a few days ago [1]:

“The best way to fill a missing value of an attribute is to build a classifier (if the attribute is binary) or a regressor (if the attribute is real-valued) using other attributes as “X” and the attribute you want to fix as “y”.”
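To make the quoted idea concrete, here is a minimal sketch in plain Python for the real-valued case: a one-predictor least-squares line is fitted on the complete rows and then used to fill the missing value. The function names, the column names, and the toy data are all hypothetical; in practice one would typically use several predictors and a tested library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def impute_with_regression(rows, x_key, y_key):
    """Return copies of rows where missing y_key values (None) are predicted from x_key."""
    complete = [r for r in rows if r[y_key] is not None]
    intercept, slope = fit_line(
        [r[x_key] for r in complete], [r[y_key] for r in complete]
    )
    filled = []
    for r in rows:
        new_row = dict(r)
        if new_row[y_key] is None:
            new_row[y_key] = intercept + slope * new_row[x_key]
        filled.append(new_row)
    return filled

# Hypothetical toy data: "income" is the attribute to fix, "age" is the predictor
rows = [
    {"age": 25, "income": 30.0},
    {"age": 35, "income": 40.0},
    {"age": 45, "income": 50.0},
    {"age": 40, "income": None},
]
filled = impute_with_regression(rows, "age", "income")
print(filled[3]["income"])
```

For a binary attribute, the same structure applies with a classifier in place of the regression line.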

However, the issue is not that simple. As noted by one participant:

From Franco Costa, Developer: Java, Cloud, Machine Learning:

What if it is totally independent from the other features? Nothing to learn.

The discussion then quickly expanded and many machine learning experts offered their own experiences and tips for solving this problem. At the time of writing (March 8, 2018), there are 69 answers.

Also, I got advice from one of my mentors: whenever more than 50% of the values in a column are missing, we can simply omit that column (if we can), provided we have enough other features to build a model.
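The 50% rule of thumb mentioned in this comment could be sketched as follows. The function name, the threshold parameter, and the example rows are hypothetical, and missing values are assumed to be represented as None.

```python
def drop_sparse_columns(rows, max_missing_share=0.5):
    """Drop columns whose share of missing (None) values exceeds max_missing_share."""
    keep = []
    for col in rows[0].keys():
        missing = sum(1 for r in rows if r[col] is None)
        if missing / len(rows) <= max_missing_share:
            keep.append(col)
    return [{col: r[col] for col in keep} for r in rows]

# Hypothetical data: "income" is missing in 3 of 4 rows, "city" in 2 of 4
rows = [
    {"age": 31, "income": None, "city": "Doha"},
    {"age": 45, "income": None, "city": None},
    {"age": 28, "income": 52.0, "city": None},
    {"age": 39, "income": None, "city": "Turku"},
]
cleaned = drop_sparse_columns(rows)
print(list(cleaned[0].keys()))
```

Here "income" is dropped because more than half of its values are missing, while "city" sits exactly at 50% and is kept.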

2) ASK WHY

From Kevin Gray, Reality Science:

It’s of fundamental importance to do our best to understand why missing data are missing. Two excellent sources for an in-depth look at this topic are Applied Missing Data Analysis (Enders) and Handbook of Missing Data Methodology (Molenberghs et al.). Outlier Analysis (Aggarwal) is also relevant. FIML and MI are very commonly used by statisticians, among other approaches.

In some analyses I have done in the past, including “missing” as a value for prediction has itself given some interesting results. The fact that a value is missing for a given observation is sometimes associated with the outcome you want to predict.

From Tero Keski-Valkama, A Hacker and a Machine Learning Generalist:

Also, you can try to check if the value being missing encodes some real phenomenon (like the responder chooses to skip the question about gender, or a machine dropping temperature values above a certain threshold) by trying to train a classifier to predict whether a value would be missing or not. It’s not always the case that values being missing are just independent random noise.
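A full classifier is beyond a short example, but a much simpler stand-in for the same check is to compare another observed variable between rows where the value is missing and rows where it is present. The sketch below does this for a hypothetical sensor log; the function name, column names, and data are assumptions made for illustration.

```python
from statistics import mean

def missingness_gap(rows, target, other):
    """Mean of `other` among rows where `target` is missing vs. where it is observed.

    Assumes both groups are non-empty.
    """
    missing = [r[other] for r in rows if r[target] is None]
    observed = [r[other] for r in rows if r[target] is not None]
    return mean(missing), mean(observed)

# Hypothetical sensor log: temperature readings drop out when the load is high
rows = [
    {"load": 10, "temperature": 41.0},
    {"load": 20, "temperature": 45.5},
    {"load": 30, "temperature": 52.0},
    {"load": 80, "temperature": None},
    {"load": 90, "temperature": None},
]
load_when_missing, load_when_observed = missingness_gap(rows, "temperature", "load")
print(load_when_missing, load_when_observed)
```

A large gap between the two means hints that the values are not missing at random, which is the situation the comment warns about.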

From Vishnu Sai, Decision Scientist at Mu Sigma Inc.:

In my experience, I’ve found that the technique for filling up missing values depends on the business scenario.

I think it’s important to understand the underlying cause of the missing values. If your data was gathered by survey, for example, some people will realise their views are socially unpopular and will keep them to themselves. You can’t just average out that bias – you need to take steps to reduce it during measurement. For example, design your survey process to eliminate social pressure on the respondent.

For non-human measurements, sometimes instruments can be biased or faulty. We need to understand if those biases/faults are themselves a function of the underlying measurements – do we lose data just as our values become high or low for example? This is where domain knowledge is useful – making intelligent decisions of what to do, not blind assumptions.

If you’ve done all that and still have some missing values, then you’ll be in a far stronger position to answer your question intelligently.

3) USE MISSING VALUES AS A FEATURE

One of my cases was a predictive model of use of antibiotics by patients with chronic bronchitis. One of the variables was smoking with about 20% of missing values. It turned out that having no information in the clinical record about smoking status was itself a strong predictor of use of antibiotics because patients missing this data were receiving worse healthcare in general. By using imputation methods you somehow lose that information.

From Kirstin Juhl, Full Stack Software Developer/Manager at UnitedHealth Group:

Julio Bonis Sanz: Interesting, something that I wouldn’t have thought of: missing values as a feature itself.

That’s fine if you have one attribute with missing values. Or two. But what if many of your features have missing values? Do recursive filling? But that can lead to error propagation. I like to think that there is value in missing values, and so giving them their own distinct label (which, e.g., a tree-based classifier can isolate) can be an effective option.
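Giving missing values their own label, as suggested in this section, could look roughly like the sketch below. The function name, the "missing" label, the indicator column suffix, and the example data are hypothetical choices, and missing values are assumed to be None.

```python
def add_missing_indicator(rows, column, missing_label="missing"):
    """Replace None in `column` with a distinct label and add a 0/1 indicator column."""
    encoded = []
    for r in rows:
        is_missing = r[column] is None
        new_row = dict(r)
        new_row[column] = missing_label if is_missing else r[column]
        new_row[f"{column}_missing"] = int(is_missing)
        encoded.append(new_row)
    return encoded

# Hypothetical clinical records where smoking status is sometimes not recorded
rows = [{"smoking": "yes"}, {"smoking": None}, {"smoking": "no"}]
encoded = add_missing_indicator(rows, "smoking")
print(encoded[1])
```

A model trained on the encoded rows can then learn from the fact that the value was absent, instead of having that signal overwritten by an imputed value.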

4) USE TESTED PACKAGES SUCH AS MICE OR RANDOM FOREST

From Jehan Gonsal, Senior Insights Analyst at AIMIA:

MCMC methods seem like the best way to go. I’ve used the MICE package before and found it to be very easy to audit and theoretically defensible.

This is great advice! In one of my projects, I have used the R package called MICE which does the regression to find out the missing values. It works much better than the mean method.

From Nihit Save, Data Analyst at CMS Computers Limited (INDIA):

Multivariate Imputation using Chained Equation (MICE) is an excellent algorithm which tries to achieve the same. https://www.r-bloggers.com/imputing-missing-data-with-r-mice-package/

From ROHIT MAHAJAN, Research Scholar – Data Science and Machine Learning at Aegis School of Data Science:

In R there are many packages like MICE, Amelia and, most importantly, “missForest” which will do this for you. But it takes too much time if the data is more than 500 MB. I always follow this regressor/classifier approach for the most important attributes.

From Knut Jägersberg, Data Analyst:

Another way to deal with missing values in a model-based manner is by using random forests, which work for both categorical and continuous variables: https://github.com/stekhoven/missForest . This algorithm can easily be reimplemented with, e.g., a faster RF implementation such as ranger (https://github.com/imbs-hl/ranger), and then scales well to larger datasets.

5) USE INTERPOLATION

From Sekhar Maddula, Actively looking for Data-science roles:

Partly agree with Andriy Burkov. But at the same time, there are a few methods specific to the technique/algorithm. For example, for time-series data, you may consider the interpolation methods available in the R package “imputeTS”. I also expect that there are many interpolation methods in the field of mathematics. We may need to try an appropriate one.
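As an illustration of the interpolation idea for time series, here is a minimal linear interpolation sketch in plain Python. It is not the imputeTS API; the function name and the data are hypothetical, and it assumes equally spaced observations with missing points stored as None.

```python
def interpolate_linear(series):
    """Fill interior None gaps by linear interpolation between the nearest known points.

    Leading and trailing gaps have only one known neighbour and are left as None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled

# Hypothetical equally spaced measurements with gaps
series = [10.0, None, None, 16.0, None, 20.0, None]
print(interpolate_linear(series))
```

Linear interpolation is only one option; for seasonal or strongly curved series, another interpolation method may fit better.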

6) ANALYZE DISTRIBUTIONS

One more way of dealing with the missing values is to identify the distribution of the remaining values and fill the missing values by sampling randomly from that distribution. Works fine in a lot of cases.
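One simple reading of this suggestion is to use the observed values themselves as the empirical distribution and draw replacements from them at random. The sketch below does that; the function name, the seed parameter, and the data are hypothetical, and fitting a parametric distribution instead would be an equally valid interpretation.

```python
import random

def impute_from_empirical(values, seed=0):
    """Fill None entries with random draws from the observed (non-missing) values."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    observed = [v for v in values if v is not None]
    return [rng.choice(observed) if v is None else v for v in values]

# Hypothetical column with two missing entries
values = [3.2, None, 4.1, 3.8, None, 4.4]
filled = impute_from_empirical(values)
print(filled)
```

Because the draws come from the observed values, the filled column keeps roughly the same spread as the original data, unlike filling every gap with the mean.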

From Tero Keski-Valkama, A Hacker and a Machine Learning Generalist:

If you are going to use a classifier or a regressor to fill the missing values, you should sample from the predicted distribution rather than just picking the value with the largest probability.
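The difference between sampling from the predicted distribution and always picking the most probable value can be seen in a small sketch. The predicted probabilities below are hypothetical stand-ins for the output of some classifier, and the function names are made up for this example.

```python
import random

def impute_by_argmax(class_probs):
    """Always pick the most probable class."""
    return max(class_probs, key=class_probs.get)

def impute_by_sampling(class_probs, rng):
    """Draw one class at random, weighted by its predicted probability."""
    labels = list(class_probs)
    weights = [class_probs[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]

# Hypothetical classifier output for rows with a missing binary attribute
predicted = {"yes": 0.6, "no": 0.4}
rng = random.Random(42)
sampled = [impute_by_sampling(predicted, rng) for _ in range(1000)]
share_yes = sampled.count("yes") / len(sampled)
print(impute_by_argmax(predicted), round(share_yes, 2))
```

With the largest-probability rule, every missing value would become "yes", whereas sampling keeps the share of "yes" close to the predicted 60% and so preserves the variation in the attribute.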

SUMMARY

This was the best summary comment I found:

“I have used MICE package in R to deal with imputing and luckily it produced better results. But in general we should take care of the following:

Why is the data missing? Is it for some meaningful reason or what?

How much data is missing?

Fit a model with non missing values.

Now apply the imputing technique and fit the model and compare with the earlier one.”

FOOTNOTES

Introduction

At Qatar Computing Research Institute (QCRI), we are developing a system for automatic persona generation (APG). The demo is available online at https://persona.qcri.org

As a part of this research, we’re interested in the information needs of end users of personas [1]. People working in different domains are interested in different information, after all. For example, journalists want to know what type of news the personas are consuming, while e-commerce marketers want to know what products they are buying.

We have reviewed a lot of material relating to interviewing customers in order to create the persona profiles because, although our approach is based on automation and computational techniques, we are interested in experimenting with mixed personas utilizing qualitative data to enrich the automatically generated personas [2].

This brief post shares some of the key insights we’ve found.

Persona Information

In general, when creating personas we need to query two types of information:

Customer information => this means what information we can learn about the customers

For the information needs of persona users, we have developed an Information Needs Questionnaire with eight questions:

What are your objectives for content creation / marketing?

What kind of customer-related decisions do you make?

What kind of customer information do you need?

What analytics information are you currently using?

What kind of customer-related questions do you not currently get good answers to?

How would you use personas in your own work?

What information do you find useful in the persona mockup?

What information is missing from the mockup?

The purpose of these questions is to discover the interviewee’s professional information needs. This is useful for developing analytics systems, e.g. automatic persona generation, but also extends to traditional persona creation.

In the following, we summarize some questions intended for customers.

From Mr. Steve Cartwright (2015) [3]

“I know that when I am preparing buyer personas I have a whole heap of questions that I ask in fact I have a PowerPoint I go through with clients, this enables me to generate the personas that I need. However, if you start by simply asking:

· Who are they?

· What do these people do?

· Are they married, singles, living with a partner?

· What problems or concerns do they have, that your industry niche can solve?

· Where do they hang out and what do they do online?

· Are these people decision makers, influencers or referral sources?

Just those six questions are all you need to get started and to start to understand who your customers are and to turn your business into a customer centric one.”

***

From “Nisha” (2013) [4]:

“Questions for B2B marketers to delve into while creating buyer personas include:

Buyer experience and reporting officer of the prospect

Professional background of the prospect

Kind of organization

Organizations’ segment focus

History of purchases

Change in role in past few years

Market forces influencing buyers

Most urgent problems

What funded initiatives does the buyer have

What are the motivations that drive the buyer

What are the buyer’s needs?

What is the budget?

Who is involved in the decision-making?

Attitude of the company towards the product/service”

***

From Jesse Ness [5] (2016):

“Demographic questions:

These are the most basic questions that you should be asking your target customers, such as:

· Are they married?

· How old are they?

· Where do they live?

· Do they have children? How many? What ages?

· Which country/city did they grow up in?

Education questions:

Our early school and college education help shape us as adults. People usually tend to answer these questions more honestly.

· What level of education did they complete?

· Which schools did they attend? Public or Private?

· What did they study?

· Were they popular at school?

· Which extra-curricular activities (if any) did they take part in?

Career questions:

Questions about the working life of your prospects reveal a lot of interesting details about them.

· What industry do they work in?

· What is their current job level?

· What was their first full-time job?

· How did they end up where they are today?

· Has their career track been traditional or did they switch from another industry?

Financial questions:

Your customers’ finances will tell you what they can afford and how easily they make their purchasing decisions.

· How often do they buy high-ticket items?

· How much are they worth?

· Are they responsible for making purchasing decisions in the household?

Keep in mind that people tend to answer financial questions incorrectly, even in anonymous online surveys. Some might even construe this as an invasion of their privacy. Temper your results accordingly (usually by decreasing the stated average income).”

Conclusion

There is a myriad of questions one can ask customers when creating persona profiles. However, they should be based on first defining internal information needs. In the persona creation process, the above question lists serve as inspiration.