Getting the Most from Facebook Topic Data: Part Two

Our previous post in the ‘getting more from Facebook topic data’ series looked at how non-public networks can deliver greater volumes of data and much richer data than public networks. We also touched on the new mindset and tools required to extract the insight from non-public networks such as Facebook – enter PYLON.

As a company, we have a rich history of providing industry-leading analytics tools, and with PYLON we have created a whole new approach to analytics. Not only does it help companies analyze data from non-public networks but, crucially, it also protects user privacy, which is increasingly important to consumers in a post-Snowden world.

All about PYLON – deep insight with data privacy

PYLON is the very first social analytics platform for non-public networks. DataSift’s technology is installed within the firewall of a non-public network, where it receives and processes the incoming stream of data generated by users posting and engaging with content. In the case of Facebook, this is an awful lot of engagement – billions and billions of data points are generated every single day – which leads to incredibly deep insight into an audience’s likes and opinions on a variety of topics.

Using the PYLON Application Programming Interface (API), analysts filter the non-public posts and engagements. The filtered results are stored in an index, which analysts then query. The query results are anonymized and aggregated to protect the privacy of the non-public network users.

This is vital – before any results or insights are shared outside the non-public network’s boundaries, PYLON ensures that all Personally Identifiable Information (PII) is removed. The underlying content is available for analysis but is never shared.
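
To make the flow above more concrete (filter, index, query, then return only anonymised aggregates), here is a minimal Python sketch. It is an illustration only and does not use the PYLON API: the record fields, the function names, and the `MIN_UNIQUE_AUTHORS` threshold are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical threshold: buckets with fewer unique authors are withheld.
MIN_UNIQUE_AUTHORS = 3


def build_index(interactions, keyword):
    """Filter step: keep only interactions whose text mentions the keyword."""
    needle = keyword.lower()
    return [i for i in interactions if needle in i["text"].lower()]


def aggregate(index, dimension):
    """Query step: return per-bucket counts only, never text or author ids."""
    counts = defaultdict(int)
    authors = defaultdict(set)
    for item in index:
        key = item[dimension]
        counts[key] += 1
        authors[key].add(item["author_id"])
    return {
        key: {"interactions": counts[key], "unique_authors": len(authors[key])}
        for key in counts
        if len(authors[key]) >= MIN_UNIQUE_AUTHORS  # withhold small audiences
    }


# Made-up sample data standing in for the non-public stream.
interactions = [
    {"author_id": 1, "gender": "female", "text": "Loved the new Star Wars trailer"},
    {"author_id": 2, "gender": "female", "text": "star wars tickets booked!"},
    {"author_id": 3, "gender": "female", "text": "Star Wars marathon tonight"},
    {"author_id": 3, "gender": "female", "text": "Still thinking about Star Wars"},
    {"author_id": 4, "gender": "male", "text": "Star Wars was fine"},
    {"author_id": 5, "gender": "male", "text": "Going for a run"},
]

index = build_index(interactions, "star wars")
result = aggregate(index, "gender")
print(result)
```

In this sketch the analyst only ever receives `result`. The raw text and author identifiers stay inside the two functions, and any bucket with too few unique authors is dropped rather than reported.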

Working with PYLON

The question “how can an analyst work with a dataset that cannot be seen, touched, manipulated, or played with?” is one that comes up frequently. But in reality, working with data from a non-public network – aggregated and anonymised data – is not so different from working with more standard datasets.

As with data from public networks, analysis on PYLON begins with aggregated results, something that marketers are highly familiar with. Analysts search for what people are saying, carry out aggregated analysis, see trends by product or brand, and discover people’s sentiment.

Furthermore, PYLON allows analysts to write queries that run within a non-public network’s firewall and operate on the raw text. Although analysts cannot ask PYLON to output all the words recorded in an index, they can still do a lot of processing and analysis of that text to obtain deep insights.

Analysis findings

PYLON’s deep and sophisticated multi-dimensional analysis results are represented statistically. Analysts explore the index by the number of interactions and the number of authors across the available dimensions, such as geography, gender, and hashtags.
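
As a rough illustration of what a multi-dimensional, statistical result looks like, the Python sketch below counts interactions and unique authors for each combination of two dimensions. The field names (`region`, `gender`, `author_id`) and the `breakdown` helper are hypothetical. In practice this kind of calculation runs behind the firewall, and the analyst sees only the resulting counts.

```python
from collections import defaultdict


def breakdown(records, outer, inner):
    """Count interactions and unique authors for each (outer, inner) pair."""
    counts = defaultdict(lambda: defaultdict(int))
    authors = defaultdict(lambda: defaultdict(set))
    for r in records:
        counts[r[outer]][r[inner]] += 1
        authors[r[outer]][r[inner]].add(r["author_id"])
    return {
        o: {
            i: {
                "interactions": counts[o][i],
                "unique_authors": len(authors[o][i]),
            }
            for i in counts[o]
        }
        for o in counts
    }


# Made-up records; author ids are opaque and never appear in the output.
records = [
    {"author_id": "a1", "region": "England", "gender": "female"},
    {"author_id": "a1", "region": "England", "gender": "female"},
    {"author_id": "a2", "region": "England", "gender": "male"},
    {"author_id": "a3", "region": "Scotland", "gender": "female"},
]

table = breakdown(records, "region", "gender")
print(table)
```

Reporting both figures matters: one very active author can generate many interactions, so the two numbers can tell quite different stories about the size of an audience.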

The interactions analysed are further enriched with classification, reflecting the topics and categories of the stories shared such as movies, brands, or famous people mentioned, giving a more structured understanding of the stories.

All insights are anonymised, aggregated, and provided as ‘audience level’ demographic data. Because this is made available behind the firewall, non-public networks are able to share self-declared demographics such as age, gender and location information contributed by the authors or members.

PYLON analysis techniques

Although data from non-public networks is not much different from what analysts have worked with before, it does require some new analysis techniques. Some of these techniques are out-of-the-box PYLON features designed to help extract deep insights from the data, and include:

Categorisation – PYLON can build a custom, domain-specific taxonomy of reactions, attitudes, and features. This adds further context to query results for detailed, structured analysis.

Topic graphs – these are a great visual representation of the most frequently occurring relationships between topics in the data. PYLON makes it possible to perform open-ended discovery on topics popular with an audience, enabling analysts to understand the data and associated trends.

Top “n” of attributes – PYLON enables analysts to explore large arrays of results by automatically detecting the strongest or most interesting results. All results are presented in descending order of size so the most popular results can be identified easily.

Links – shared links are a great way to measure engagement with a campaign, website or any published content. In PYLON these are automatically detected in the data, and the most engaged links are returned when queried.

Super public text – some people on non-public networks choose to make the text of their postings public via privacy settings. PYLON makes available a sample of such posts with the full verbatim text, allowing analysts to validate their results and provide examples of the supporting data.
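
Two of the techniques above, top “n” and topic graphs, come down to simple counting, which the short Python sketch below illustrates. It is not PYLON code: the `top_n` and `topic_edges` helpers and the sample data are assumptions made for the example. The edge weights of a topic graph are taken here to be the number of interactions in which two topics appear together.

```python
from collections import Counter
from itertools import combinations


def top_n(values, n):
    """Return the n most frequent values, largest count first."""
    return Counter(values).most_common(n)


def topic_edges(topic_sets, n):
    """Count how often each pair of topics appears in the same interaction."""
    pairs = Counter()
    for topics in topic_sets:
        # Sort so that each pair is always counted under the same key.
        for a, b in combinations(sorted(set(topics)), 2):
            pairs[(a, b)] += 1
    return pairs.most_common(n)


# Made-up data: the topics detected in four interactions, and four shared links.
topic_sets = [
    ["BMW", "Audi", "Cars"],
    ["BMW", "Cars"],
    ["BMW", "Cars"],
    ["Audi", "Cars"],
]
links = ["example.com/a", "example.com/b", "example.com/a", "example.com/a"]

top_links = top_n(links, 1)
edges = topic_edges(topic_sets, 3)
print(top_links)
print(edges)
```

The strongest edges returned by `topic_edges` are the ones a topic graph would draw most prominently, while `top_n` applied to links gives the most shared links first.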

PYLON embraces a privacy-first approach that enables analysts to tap into extensive datasets from non-public networks for full exploration and analysis, providing enormous insight for marketers. The next post in our series will look in more detail at the biggest dataset of all – Facebook topic data.