Classification, manipulation, psychological profiles. Do you know how companies know about you?

Abstract

This text investigates profiling techniques that aim to depict the most precise possible image of an individual. It discusses how businesses and institutions exploit knowledge about users to manipulate their audience with personalized content. Ethical issues related to profiling, as well as ways for users to gain awareness of the examination made possible by their data, are addressed. The core of the thesis elaborates on a proposal for a fictitious tool designed to inform users about classification and empower them against it. This tool is embedded in a near-future scenario in which, in a way similar to the General Data Protection Regulation (GDPR), online data is made accessible to its users, is erasable, and is regulated to a greater extent. Finally, the constructed prospect is evaluated, and a personal reflection and outlook are formulated.

1. Foreword

In the digital as well as in the analog world, everybody has a personal perspective that is fed by themselves and their environment and is filled with bias. This reality equally applies to us, two Interface Design students at the University of Applied Sciences in Potsdam, living in a highly technology-oriented environment. Our study focus on (big) data and visualization raised our awareness of the discriminatory impacts of technology. In addition, we worked for a data-driven media intelligence company, which collects and visualizes public data such as digital news, social media content and newspaper articles in order to enable third-party companies to understand their target groups and predict trends.

Data’s computability enables efficient and profitable user investigation, leveraged to analyze people and to predict their intentions, fears, and desires. Nonetheless, extracting value is not trivial: raw information is not humanly readable and needs to be visualized in order to be understood and contextualized. Geo coordinates, for example, are of little interest to individuals, but businesses are able to extract very valuable information from them. Knowing where their customers have been, what they like and which websites they have visited helps them decide where to place advertisements, understand which products customers potentially need, or even adapt their ways of communicating with their target group.

Data-processing techniques such as face recognition, sentiment analysis, and psychological examination form the basis for classification. Used by Facebook, collected information primarily enables targeted advertisement, yet used by other parties, it can enable inconspicuous microtargeted manipulation. In the offline world, providing personal information like name, age, geolocation or bank account number to strangers, or granting them permission to create a psychological profile of oneself, appears unreasonable. In the online world, in contrast, giving away data that is redistributed to unknown companies is very common. This ambivalent behavior stems from a lack of awareness about which data is available, what methods are used to interpret it and how the results of analyses can be used for manipulation.

2. Comprehension of consumers is the new oil

The internet has become a source of great interest to businesses, institutions, and governments. As The Economist states in its 2017 article, paraphrasing the British mathematician Clive Humby, “the world’s most valuable resource is no longer oil, but data” (1). The leading tech companies’ profits have drastically surpassed those of the major oil businesses (2). As opposed to oil, data is produced at an increasingly high pace (3) and is far from being a scarce and finite resource (4). Nonetheless, similar to oil, data needs to be refined in order to be exploited for a specific purpose and thereby to acquire its real value (5).

Companies, governmental agencies, and universities work closely together (6) and share know-how about processing techniques in order to transform the ‘raw’ (not yet processed) data into valuable intelligence. These methods mostly consist of software algorithms that statistically analyze large sets of data and produce interpretations of the information. Those procedures can comprise, for example, face recognition, computer vision, supervised machine learning, deep learning, text analysis or sentiment analysis. Some of those techniques have a direct impact on the users' experience (for example, when face recognition is enabled to suggest tagging a friend in an uploaded picture). However, most of the information is invisible to the users, since it is stored in hidden server farms and is used with unclear intentions.

In the case of businesses, the most common justification for the use of data is the improvement and personalization of the services. Facebook’s data policy page states, under the section “How we use this information?”, that data is used to “[…] personalize and improve [their] Products” (7). While this is conceivable, it is hardly the main goal. The major purpose of harvesting and processing data is to gain valuable and exploitable insights about users, which are eventually used by external parties willing to pay the price for this information (8). A more discreet paragraph of Facebook’s data policy declares that the company uses data to “[…] help advertisers and other partners measure the effectiveness and distribution of their ads and services, and understand the types of people who use their services and how people interact with their websites, apps, and services” (9). This better illustrates Facebook’s strategy and constitutes the reason for its cost-free nature.

The magnitude of information, which, in the cases of Facebook or Google, is willingly produced by the users themselves, forms a knowledge base that can be of high value to advertisers, researchers or even law enforcement (10). As Apple’s CEO Tim Cook affirmed in 2014, “when an online service is free, you’re not the customer – you’re the product” (11). Following Clive Humby’s analogy, one could proclaim that the comprehension of consumers is the new oil, and not the data itself.

‣ 2.1 Criticism

Big tech companies are criticized for overstepping the limits of privacy by aggressively intruding into users' intimate lives and for trading this information to uninvolved parties (12). They have been suspected of provoking intellectual isolation (filter bubbles) (13) by “[…] narrowing fields of vision and potentially creating echo chambers of reinforced belief”, contributing to an increase of extremism and a segmentation of society (14).

Moreover, data leaks are frequent and information can fall into the wrong hands. Cybercrimes such as identity theft or ransomware attacks are common, even more so with the increase of smart wearables (15) and interconnected smart home devices (16). Millions of account credentials of major tech companies have been stolen in the past (17) and with them, the valuable information that they contain.

Documents ‘leaked’ by Edward Snowden in 2013 revealed several global surveillance programs that raised great privacy concerns among the public (18). PRISM, one of the most controversial programs revealed by the whistleblower, collects targeted communications by soliciting companies such as Google or Yahoo to deliver data under the FISA Amendments Act of 2008 (19).

Since 2013, a debate has taken place (20). The EU General Data Protection Regulation (GDPR) was adopted in 2016 and has been the most important regulation of privacy since the Data Protection Directive adopted in 1995 (21). The GDPR aims to give users control over their personal data and to unify regulations across European countries. It enables users to request access to, as well as the deletion of, any data that they previously produced and that a service gathered.

Although such regulations improve the protection of users, they do not necessarily prevent cybercrimes from being committed. More importantly, they cannot guarantee that misuse is avoided. As the following sections show, marketers are devoted to gathering information and influencing audiences. Regulations partially protect citizens from profiling abuses, but they cannot constitute the only resource for that aim.

‣ 2.2 Protection and alternatives

Protecting personal information in the analog world might seem easier than online. Critical documents can be stored in a vault. Individuals can still be tracked and stalked offline. Nevertheless, analog surveillance can never attain the extent of coverage that online monitoring supports. Methods such as scroll- or mouse-tracking are performed without anyone noticing, making the concealment of intimate personal information extremely difficult. For example, moving the cursor over an image can be interpreted as interest. Correlated with other information, data can be leveraged to create a detailed image of the user. While stalking and collecting data in online contexts is much easier, the interpretation of this data is complex. However, processing techniques are evolving and the huge collections of data now available allow much better analyses.

Data is the user’s property but is not under the user’s ownership. Whereas property implies belonging, ownership relies on possession. Companies like Facebook own user data and are able to delete or study it. Consequently, one option to avoid data abuses could be to prevent companies from owning user data.

Technologies that grant users both property and ownership already exist, but those are complex and currently require users to have advanced technical knowledge. Peer2Peer or Mesh networks, for example, allow anonymous and safe browsing. In a Peer2Peer network, users do not communicate with a centralized server but via multiple devices within the network simultaneously. Data is stored on all devices that require it. Any device can serve as a sender, recipient or just as a transmitter that forwards information to another device. No centralized, big server farm or company of any sort is involved. Peer2Peer networks like Hypha (22), created by Aral Balkan, have a very complicated initialization process and require decentralized servers and many individual users to host the data. Also, because data is duplicated, such networks rely on very large amounts of storage space.

Mesh networks have a completely independent infrastructure. Users communicate and request data via radio waves. The German Freifunk (23) relies on a network connecting multiple individuals, each required to buy specialized hardware. Because sending data with overly strong radio transmitters is forbidden in Germany, the distance of communication is limited.

With both of the latter secure browsing mechanisms, users are not browsing the regular World Wide Web, with all its services and websites, but are rather limited to their own closed web. Social networks, news pages, search engines or online shopping platforms need to be created exclusively for the closed network. Because those alternatives do not enable user monitoring, business models relying on microtargeting are not conceivable.

To analyze the potential of such systems, one can imagine software that combines various technologies to ensure the users’ sovereignty over data while providing an enjoyable experience. For this proposal to be successful at scale, businesses still rely on user data. Nevertheless, they are obligated to communicate transparently which data is requested and for which purpose. Users are empowered to edit, modify, and delete their data. Finally, an overview of the exploited data, as well as of the applications of this usage, is provided.

‣ 2.3 Anonymity is not the answer

The aforementioned proposal cannot solve the core of privacy issues, as it relies on anonymity to protect user integrity. Anonymity does protect the user from being identified; however, it also has various disadvantages. “It allows some users to break the law or violate the rights of others, such as defamation, cybercrimes, bullying, propagation of racist or anti-Semitic ideas, involving damage to other individuals or to society as a whole” (24).

Furthermore, anonymity cannot be guaranteed in every domain. Agreements and contracts concluded with a bank or an insurance company (most of them now being concluded online) require identity verification.

A transition back to a fully anonymous cyberspace is unrealistic and not intended. For many users, due to fear of exclusion or a lack of alternatives, removing an account is not an option.

As Rainer Rehak declares in his talk at the 2018 Chaos Communication Congress (35C3) (25), the core of the issue is finding ways to protect the ‘weak’ actors (i.e. citizens or consumers) by regulating the ‘strong’ actors (i.e. businesses and governments). Regulation alone, however, is not enough. In the physical world, people reveal their true identity and take responsibility for their acts. When people interact in public contexts, social conventions help them avoid negative judgment and unwanted attention. Similarly, conventions of appropriate online behavior should be developed to diminish the damages of manipulation.

‣ 2.4 Manipulation

There are various kinds of manipulation such as crowd-, data-, internet-, market-, media-, psychological- or social-manipulation. While incomplete, this list indicates the omnipresence of manipulation and its various applications.

An experiment that Facebook conducted on November 2nd, 2010, tested whether it was possible to increase participation in the U.S. midterm elections by manipulating the Timeline (26). A feature that allowed millions of users to indicate whether they had already voted was presented on the interface. One group was able to see which friends had already voted, and another was not. The results showed that the first group displayed a 0.39% higher probability of voting (27). This appears insignificant, but at scale, it means that about 60,000 people might have voted solely because of Facebook’s experiment (28). Despite its relatively minor effect, the experiment indicates that such manipulations are effective.

In March 2018 the whistleblower Christopher Wylie released a cache of documents prompting the Facebook–Cambridge Analytica (CA) data scandal (29). Wylie worked for CA as a data consultant and was able to give detailed information on how CA collected and analyzed data to enable psychological manipulation by political parties.

In 2014, a psychologist at the University of Cambridge, Aleksandr Kogan, published an app on Facebook called “This Is Your Digital Life”. Each visitor of the app was paid $3 to $4 to complete a survey (30). The app analyzed not only the user’s profile with all of its likes, posts, comments and more, but also those of the user’s whole network. That is how Kogan was able to collect data from approximately 87 million Facebook users (31). He used techniques such as the OCEAN model, which is explained later, to create psychological profiles.

Data and analyses enabled Kogan to list relations between factors such as habitat, sex or comments to estimate political views, life satisfaction or fears. CA advised on how to influence and manipulate the targeted audience. Donald Trump used CA services (32) to shape people’s thinking and to manipulate the votes. He conducted campaigns that led potential Clinton supporters to question their choice (33). Ted Cruz spent around 5.8 million US$ (34) to influence attitudes, fan fears or communicate with each user in a personalized way. According to election forecasts, Ted Cruz’s percentage grew from 5% up to more than 35% (35). Selecting the target group and inspecting specific demographics, psychographics and personality traits facilitates the communication with the group (36).

The practice of targeting single individuals by analyzing their personality is called microtargeting (37). Specifically prepared and personalized ads, statements or fake news are spread into the interface of each user.

CA is not the only company that performs and sells microtargeting and profile analyses. Companies like TargetPoint (38) or Grassroots Targeting (39) use similar techniques.

As Antonio García Martínez said, “none of this is even novel: It’s best practice for any smart Facebook advertiser. Custom Audiences was launched almost six (!) [sic!] years ago, marketed publicly at the time, and only now is becoming a mainstream talking point. The ads auction has been studied by marketers and academics for even longer. The only surprise is how surprising it can still seem to many” (40).

The scandal is particularly disturbing, not only because CA ‘stole’ and analyzed information, but also since it counted on a psychological approach to manipulate.

‣ 2.5 Psychological profiles

Methods for psychological profiling are attempts to classify individuals into holistic categories on which cultural, social and economic backgrounds have no influence. As the American psychologist Gerard Saucier declares, “an optimal model will be replicable across methods, cross-culturally generalizable, comprehensive, and high in utility” (41).

The aim of developing systems to categorize individuals arose a long time ago. As early as the fourth century BC, Aristotle defined different kinds of human beings (42). The psychological models described below were established around 100 years ago.

The DISG theory (DISC in English) of William Moulton Marston is based on the assumption that every ‘normal’ person (without mental illness) can be categorized using four criteria that define a person's traits in relative terms: dominance, influence, steadiness, and conscientiousness (43).

The Enneagram of Personality (44) relies on nine terms to categorize base personality types in a subjective way. Each of these types is combined with ten different perspectives such as ‘Basic fear’, ‘Ego fixation’, ‘Passion’ or ‘Stress/Disintegration’ (45).

As opposed to the abovementioned model, Carl Gustav Jung’s method divides human beings into rational and irrational people. Each can be split into four subcategories (‘thinking’, ‘feeling’, ‘sensing’ and ‘intuition’ (46)), producing eight different kinds of personalities. Additionally, each person is classified as either extraverted or introverted.

The Big Five personality traits model, also called the OCEAN model, was used by Cambridge Analytica for microtargeting. One of its advantages is its ability to represent a practically unlimited number of personalities, as it is not limited to binary classifications and produces individual and unique results. Another difference is the sort of user data that constitutes the source of the analysis. While in most models, users are interviewed or observed, the OCEAN model takes advantage of existing textual data (47).

Martin Gerlach, a former physicist at Northwestern University, states that we “[…] don't have enough empirical evidence [to show] that something like this [the concept of personality types] really exists” (48). However, first results and evaluations also showed that factors for these types “are at least partially genetically predetermined” (49) and thereby enable the estimation of personalities. “It is not yet clear that this is the ‘optimal’ model” (50), says Saucier, but Richard Robins, a personality researcher at the University of California, argues that “this is by far the most valid estimate we have of how people cluster into types” (51).

The ability of the OCEAN model to express the uniqueness of people reinforces the effectiveness of classification and its manipulative character.

‣ 2.6 The limits of classification

In the case of content personalization, the classification of users is inevitable, because marketing experts are required to use understandable terminology in order to define their target groups. When creating an advertising campaign on Facebook, marketers are invited to select among a list of classifications that characterize the target audience (52). Thus, a user with a ‘complicated’ or ‘engaged’ relationship can be picked out of the mass and addressed with tailored advertisements.

As a result of semantic selection, the user is reduced to a description that is limited by the boundaries of the vocabulary and is interpreted subjectively. The selection of words for categorization constitutes a source of marginalization. The word negro, for instance, was first used around 1440 by the Spanish and Portuguese to describe people of dark-colored skin, as the word literally means ‘black’ (53). However, the term is associated with colonial history and is considered offensive (54). In 2010, the word was dropped from the U.S. census and the options were limited to ‘black’ or ‘African American’ (55).

The classification of gender is another example of intensive social debate, in which vocabulary plays a central role. ‘Woman’ and ‘man’ as gender descriptors are considered a source of discrimination (56). In the 70s, the LGBT community questioned the immutable perception of sex, both in its social/cultural and physical definition (57). From this time forward, a series of different terms defining diverse gender forms and sexual orientations have emerged. Sam Killermann, a social justice advocate, provides on his website itspronouncedmetrosexual.com a list of more than 50 LGBTQ+ vocabulary definitions, such as androsexual, bigender or skoliosexual (58). Those recent terminologies are attempts to establish norms that include people who do not feel fairly represented by the heteronormative gender definitions.

Although the latter might be more appropriate to describe the variety of gender perceptions or sexual orientations, these definitions are still limited and remain open to interpretation. Likewise, they are charged with cultural values.

All GAFAM companies (Google, Apple, Facebook, Amazon, and Microsoft) are US-based, carry Western values and use English to describe their data, despite the global and multicultural nature of their audience.

Both the vocabulary used for classification and the manipulation it enables represent complex issues. The selection of labels will certainly evolve and become less discriminatory, yet labeling will still be crucial in online profiling.

‣ 2.7 Avoiding the classifications

We assume that by making labeling unreliable, we can minimize the risks of manipulation. Personalized content that relies on target groups should become inaccurate and inefficient. However, we do not believe that this inaccuracy should be provoked by chaos, because it could have negative effects on the user experience. Rather, the risks should be minimized by blurring the classifications and by reducing the amount of available data.

Blurring and limiting classification requires internet users to realize that all data is a source of labeling. To gain this awareness, users need to be able to relate to their own individual classifications.

Once the necessary awareness is reached, change becomes possible. Yet knowing which actions to take requires knowledge about how individual classification mechanisms work. Unfortunately, because they mostly rely on machine learning, there is no generic answer. Reverse engineering the services’ mechanisms might help to foresee how they function. Accordingly, online conventions that suggest actions could be developed. They could consist of the following principles:

Reducing the amount of data limits the source material for classification.

Ensuring that classifications become as vague or as uncertain as possible reduces their effectiveness.

Diminishing the match-level of classifications can avoid the assignment into target groups.

Countering a classification by strengthening a contradictory classification could help to dull the match-level of both classifications.

In the proposal that follows, attention is focused on the first three principles, as the last one deteriorates the user experience.

3. The App - Labelless

In the scenario of the following proposal, further regulation in the spirit of the General Data Protection Regulation (GDPR) forces companies to give users access to the terms they are labeled with. In fact, Facebook already presents a similar yet very minimal list of statements used for targeted advertising in the users’ ad preferences (59). In a comparable yet enforced way, each service that practices profiling is required to provide an API that gives other tools access to the classifications for further analysis, once access is granted by the user.

Information collected from the user base enables the training of an algorithm that estimates how content is linked to classifications. Actions that limit the risks of manipulation could thereby be suggested.

We designed a tool that informs its visitors about their classifications. In this chapter, we first explain how users are introduced to the application so that they can progressively familiarize themselves with the tool. Second, we illustrate how each classification is represented and codified on the interface. Next, each of the views available in our tool is presented, along with the analysis it facilitates. We then demonstrate how users can set goals in order to restrict the degree of their allocation into a category (match-level). Finally, we reflect on the strengths and improvement potential of our proposal and provide an outlook on possible next steps.

‣ 3.1 Initiation and access grant

The purpose of the tool is delicate and its approach complex. Users cannot be expected to instantly comprehend either all the intricacies of the issue or the visual codification of the tool. Therefore, they are introduced to the application and its purpose step by step. Once they are familiar with the concepts, the visitors are invited to connect a service for which an analysis should be created. Using a ‘single sign-on’ mechanism such as the typical ‘login with Facebook’ procedure, the tool is linked with the services’ APIs. Once the connection is established, the results of the processed data are displayed.

‣ 3.2 Visualizing the classifications

The proposal uses a visual language that codifies various aspects of the user-classification.

Each individual categorization is represented by a circle or, in other words, a bubble. Each of them communicates two vital types of metrics:

The size describes the match-level

The size of a circle describes how strongly the subject is associated with a category. To indicate the level of correspondence with a descriptive term, the bubble with the highest match-level appears the largest and vice versa. Additionally, each bubble is composed of three circles that reflect the uncertainty of the estimation. The outer circle defines the maximum value, the middle circle represents the average and the smallest the minimum estimation. The distances between those three circles suggest the precision of the classification. A bubble with closely spaced circles appears sharper and should alert the visitors that their data makes precise classification possible. On the other hand, bubbles with circles of dissimilar radii are considered less precise and therefore less reliable.
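The three-circle encoding can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `MatchEstimate` class, match-levels normalized to the range 0 to 1, a radius that scales linearly with the match-level, and the sharpness threshold are not part of the proposal itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MatchEstimate:
    """Hypothetical match-level estimate for one classification (values in 0..1)."""
    label: str
    minimum: float
    average: float
    maximum: float

    def radii(self, max_radius: float = 100.0) -> Tuple[float, float, float]:
        """Radii of the inner, middle and outer circle.

        Assumption: the radius grows linearly with the match-level.
        """
        return (self.minimum * max_radius,
                self.average * max_radius,
                self.maximum * max_radius)

    def spread(self) -> float:
        """Distance between maximum and minimum estimate; small means sharp."""
        return self.maximum - self.minimum


def is_sharp(estimate: MatchEstimate, threshold: float = 0.1) -> bool:
    """True if the circles are close enough to warn the user (assumed threshold)."""
    return estimate.spread() <= threshold


cycling = MatchEstimate("cycling", minimum=0.5, average=0.625, maximum=0.75)
print(cycling.radii(80.0), is_sharp(cycling))
```

A bubble whose circles lie far apart, as in this usage example, would be drawn with a wide band and would not trigger the warning.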

The color describes the amount of underlying data

We assume that large quantities of data constitute a greater source for classification. Therefore, we consider it crucial to convey the amount of information on which the estimations are based. We use a color scale going from blue to pink to color-code each bubble. Less saturated blue circles are based on less data and highly saturated pink circles on more information.
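One possible reading of this color scale is a linear interpolation between two endpoint colors. The sketch below rests on assumptions: the two RGB endpoints, the function name `data_amount_color` and the normalization by the largest data amount among the bubbles are illustrative choices, not values taken from the design.

```python
from typing import Tuple

# Assumed endpoints of the scale: a pale blue and a saturated pink.
LOW_DATA_COLOR: Tuple[int, int, int] = (170, 200, 255)
HIGH_DATA_COLOR: Tuple[int, int, int] = (255, 20, 147)


def data_amount_color(data_points: int, max_data_points: int) -> str:
    """Map an amount of underlying data to a hex color between blue and pink."""
    if max_data_points <= 0:
        ratio = 0.0
    else:
        # Clamp to 0..1 so that outliers cannot leave the scale.
        ratio = min(max(data_points / max_data_points, 0.0), 1.0)
    channels = [
        round(low + (high - low) * ratio)
        for low, high in zip(LOW_DATA_COLOR, HIGH_DATA_COLOR)
    ]
    return "#{:02x}{:02x}{:02x}".format(*channels)


print(data_amount_color(0, 100))    # pale blue end of the scale
print(data_amount_color(100, 100))  # saturated pink end of the scale
```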

These two representative dimensions are key to the understanding of our interface. The use of both size and color is replicated in the different visualizations of the tool.

‣ 3.3 The views

The tool offers two views: The so-called cluster-view gives an overview of the average match-level for each category. The so-called grid-view explains with greater detail the data-based origins of the estimations.

The cluster-view

As its name suggests, the cluster-view visualizes each classification in organized and understandable groups that form clusters and are placed in cloud-like formations on the canvas. Within each cluster, the bubbles are attracted to each other, as if they were magnets or subject to a gravitational force. As a consequence, the biggest bubbles occupy the center of the groups and the smallest reside on the margin.

Hovering over a bubble displays a tooltip that indicates numeric metrics about the match-level and the amount of underlying data.

The grid-view

The grid-view offers details about the sources of the estimations, accessible through two perspectives.

Data perspective:

This representation allows the most precise examination. It displays what estimations are made in each data node and with which match-level.

In the left sidebar, the same groups as in the cluster-view are listed. When a group is expanded, its child classifications are revealed. In the main area, to the right of the sidebar, a grid of thin grey lines represents the data nodes such as posts, likes, website visits, uploaded pictures, or any piece of data used for classification.

On some of the intersections between horizontal lines (classifications) and vertical lines (data elements), a blue bar indicates the degree to which the data element is associated with the corresponding category. As in the bubbles of the cluster-view, the bar is composed of three shades that illustrate the uncertainty of the match-level. Hovering over a category, a data node or a bar displays a tooltip filled with detailed metrics.

On a group’s row, the blue bars are radially oriented, summarized in a star-like shape. The lopsided distribution of the star’s spikes can express if a data element is the source of an intensive classification, and reveal the sharpness of the estimation.

An additional inspection angle is provided by the time-based summary of the categories. This view is organized in the same way as the data perspective. However, instead of showing the individual data nodes, the x-axis displays the classification over time. Each term is illustrated by a line chart. Like the bubbles, the chart is colored to indicate the amount of underlying data. The gradient coloring results from the progressive transition from one month’s color to the next’s. As in the data perspective, group rows summarize child classifications with a star-like form. Yet in this mode, each branch of the star is colored to illustrate the amount of underlying data.

Wave chart:
The choice of this particular form, as opposed to a simple line chart, is justified by its analogy to the bubble representation. In fact, the bubble and its three circles can also be seen as a mirrored bar chart (like in the data perspective).

‣ 3.5 Setting goals

For each category, users are invited to set a match-level goal. When the user drags an elastic slider, a bordered circle representing the targeted objective is displayed on the bubble. While the interaction is ongoing, tips progressively slide into a sidebar on the right. Each tip is accompanied by an indication of the potential effect it has on the bubble.

The more challenging the goal, the more impactful the tips. These can consist of either the deletion or creation of data. Modifications that are executable via the services’ APIs can be performed without leaving the tool. The creation of alternative content, however, must be done by the users themselves. Once set, the objective circles are displayed on the bubbles and visible from the cluster-view. This way, the evolution of the classification can be observed in a later visit.
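How tips could be matched to a goal can be sketched as a simple greedy selection. This is only one conceivable approach under stated assumptions: the `Tip` structure, the idea that each tip carries an estimated reduction of the match-level, and the rule of picking the most impactful tips first are hypothetical and not specified by the proposal.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tip:
    """Hypothetical suggestion shown in the sidebar."""
    description: str
    estimated_reduction: float  # assumed expected drop in match-level (0..1)
    via_api: bool  # True if the tool could perform it through the service's API


def suggest_tips(current: float, goal: float, candidates: List[Tip]) -> List[Tip]:
    """Pick the most impactful tips until the projected match-level meets the goal."""
    selected: List[Tip] = []
    projected = current
    ranked = sorted(candidates, key=lambda tip: tip.estimated_reduction, reverse=True)
    for tip in ranked:
        if projected <= goal:
            break  # the goal is already reached, no further tips are needed
        selected.append(tip)
        projected -= tip.estimated_reduction
    return selected


candidates = [
    Tip("Delete an old post", 0.125, via_api=True),
    Tip("Remove a page like", 0.25, via_api=True),
    Tip("Publish alternative content", 0.0625, via_api=False),
]
for tip in suggest_tips(current=0.75, goal=0.375, candidates=candidates):
    print(tip.description)
```

With this rule, a more ambitious goal yields a longer list of tips, which matches the behavior described above; a real tool would have to estimate the effects from the services' data rather than assume them.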

‣ 3.6 Design Evaluation

Strengths

The choice of the visual language of the tool allows a detailed analysis of the user’s classification.

The representation of each category with circles and the color coding, once comprehended, facilitates their investigation.

The match-level and its uncertainty are visualized consistently throughout the whole design.

The selected forms are well suited to summarizing individual classifications, both over time and in total.

The interactions, because they fit the visual language, feel natural and contribute to the familiarization with the complex concepts of the tool.

The two views enable different depths of investigation.

Improvement potential

The chosen size and color scales are not well suited for comparison and only permit a rough approximation.

The list of provided classifications is far from exhaustive.

The vocabulary used in the groups’ titles and the terms selected for the categories inside those groups are inspired by Facebook’s ad-settings. However, deeper research could help to appropriately justify this selection.

The color-scale is certainly not accessible for color-blind people, and its shades may be hard to tell apart.

The use of light shades results in low contrast that can disadvantage people with limited eyesight.

The relatedness between various categories is not extensively visualized.

The indication of which service is associated with an estimation is not emphasized.

The tool could alert the users more precisely about high-priority risks.

Currently, the designs do not support investigation on mobile devices.

Next steps

A prototype that implements an explorable introduction and outlines the underlying purpose of the tool can be developed.

Extended filters (data-type, sharpness…) can enable a more precise inspection.

To improve the understanding of the personal classifications, a comparison with others can be provided.

4. Conclusion

Profiling is complex, and finding solutions that fully protect users seems impossible. Our thesis relies on the rather pessimistic assumption that manipulative and dishonest profiling will continue to happen. Yet we did not want to practice Critical Design in a way that depicts either a utopian or a dystopian future and merely points a finger at an issue without offering any kind of solution.

We identified labeling as an effective method of influence. As a result, the app focuses on visualizing and modifying the extent of those classification estimations. There are certainly other ways to protect users that our tool does not cover. There are also forms of manipulation that are not limited to the classification of users, for which our tool provides no solution and which might have greater manipulative impact. However, we decided to focus on user labeling because it is understandable by and accessible to most people, and because we assume that all practitioners of profiling rely on terminology to classify their audience. That said, we do not think that all forms of manipulation are harmful to users. In our opinion, however, even manipulation that serves an honest goal should not take place unless it is communicated transparently and comprehensively.

We are aware that by suggesting an approach to a hypothetical and relatively effective solution, we open ourselves to critique. For instance, we do not clarify who is to develop the tool we propose and what interests they might have. Would the tool be commercial and, if so, would it not serve the ‘rich’ instead of the ‘weak’? Moreover, it is not clear whether the reverse-engineering mechanisms behind the metrics of our proposal would work, and whether the produced classifications would be reliable enough to yield actionable tips. By observing the app, the services could also find alternative, non-identifiable strategies to counteract the algorithms.

The evolution of technology has made the world smarter, smaller and more connected. Because this evolution has been so effective, it is our responsibility to take control of individual data. Technology has evolved and will continue to evolve. Keeping abreast of this development is important in order to avoid manipulation.

We chose to study interface design out of our fascination with technology. Yet the pace at which it has evolved is so high that we doubt our ability to follow it, to understand it and to educate the following generations in a way that prevents great imbalances of power between those who know and those who do not. We suspect that this problem is as old as humanity, universal and not limited to technology. Yet our thesis relies on the necessity we see to address this topic with different solution-based approaches. We hope to stimulate reflection and to nourish the debate with a visual rather than a technical approach.

5. Personal reflection

Writing this thesis has enabled us to dive deep into the highly topical subjects of privacy, big data scandals, profiling, targeted advertisement, microtargeting, and technologies that can offer alternatives to the world wide web. Despite our initially very limited knowledge and naive grasp of the topic, we were able to learn a great deal and to begin forming our own opinion. We only scratched the surface and are frankly overwhelmed by its immensity. However, we are neither discouraged nor weary. On the contrary, we developed a great interest and are confident that we will keep cultivating related knowledge.

As mentioned in the foreword, we live in a technology-versed surrounding that doubtlessly influences our perspective. The degree to which we consider profiling to be an issue might also originate from our social bubble and professional situation. Many people probably neither consider profiling an issue nor have the luxury to question it. Nevertheless, given the increasing dependence of humanity on technology, and because as interface designers we participate in its development, we believe that it is our role to question such topics. As a matter of fact, during our studies at the University of Applied Sciences, we have been encouraged to do so. Consequently, we wish to go through life, as professional designers but also as users, aware and critical.