Let’s start with the biggest announcement of them all, Elastic is opening the source code of the XPack. This mean that you now not only will be able to access the Elastic stack source code, but also the subscription-based code of XPack that up until now have been inaccessible. This opens the opportunity for you as a developer to contribute back code.

Data rollups is a great new feature for anyone with the need to look at old data but feel the storage costs are too high. With rollups only predetermined metrics and terms will be stored. Still allowing you to analyze these dimensions of your data but no longer being able to view the individual documents.

Azure monitoring available in Xpack Basic. Elastic will in an upcoming 6.x release an Azure Monitoring Module, which will consist of a bundle of Kibana dashboards and make it really easy to get started exploring your Azure infrastructure. The monitoring module will be released as part of the XPack basic version – in other words, it will be free to use.

Forecasting was the big new thing in X-packs Machine learning component. As the name suggest the machine learning module can now not only spot anomalies in your data but also predict how it will change in the future.

Security in Kibana will get an update to make it work more like the Security module in Elasticsearch. This will also mean that one of the most requested security questions for Kibana will be resolved, giving users access to only some dashboards.

Dashboard are great and a fundamental part of Kibana but sometimes you want to present your data in more dynamic ways with less focus on data density. This is where Canvascomes in. Canvas is a new Kibana module to produce infographics rather than dashboards but still using live data from Elasticsearch.

Monitoring of Kubernetes and Docker containers will be made a lot easier with the Elastic stack. A new infra component will be created just for this growing use case. This component will be powered by data collected by Beats which now also has an auto discovery functionality within Kubernetes. This will give an overview of not only your Kubernetes cluster but also the individual containers within the cluster.

Geo capabilities within Kibana will be extended to support multiple map layers. This will make it possible to do more kinds of visualizations on maps. Furthermore, work is being done on supporting not only Geo points but also shapes.

One problem some have had with maps is that you need access to the Elastic map service and if you deploy the Elastic stack within a company network this might not be reachable. To solve this work is being done to make it possible to deploy the Elastic maps service locally.

Swiftype App Search – A set of API’s and developer tools that makes it quick to build user faced search applications

Elastic has also started to look at replacing the Zen protocol used to keep clusters in sync. Currently a PoC is being made to try to create a consensus algorithm that follow modern academic best practices. With the added benefit to remove the minimum master nodes setting, currently one of the most common pitfalls when running Elasticsearch in production.

ECE – Elastic Cloud Enterprise is big focus for Elastic and make it possible for customers to setup a fully service-based search solution being maintained by Elastic.

Another new regulation from EU? Will this affect us? It seems so complex. Can’t we sit back and wait for the first fine to come and then act if necessary?

We have to care and act – start planning now!

I think we have to care and act now. Start planning now so you get it right. The GDPR is a good thing. This is not another EU thing about the right size of a strawberry or how bendy a banana could be. This is about the fact that all individuals should feel safe giving their personal information to business. Cyber security is a good thing, not protecting our data and our customers’ data is a bad thing for us. Credit card numbers and personal data leaks out from companies worldwide with large business risks, companies don’t just face fines or reputational damage, they can have their permission to issue credit cards and other financial services products withdrawn by the regulator and responsible employees faces imprisonment. We can only guess whether a company needs to be GDPR compliant or not to be allowed to compete in a bidding process?

What is the General Data Protection Regulation (GDPR)?

The General Data Protection Regulation (GDPR) is a new legal framework approved by the European Union (EU) to strengthen and unify data protection of personal information. GDPR will replace the current data protection directive (in Sweden Personuppgiftslagen, PUL) and applies from 25 May 2018.

Who is affected?

GDPR has global reach and applies to all companies worldwide that process personal data of European Union citizens.

Identify personal data and protect it

GDPR widely defines what constitutes personal data. Organisations needs to fully understand what information they have, where it is located and how it was collected. Discover, classify and manage all information, both structured and unstructured data and secure it.

Data breach notifications

GDPR requires organisations to notify the local data protection authority of a data breach within 72 hours after discovery.

Do you have the right to store this information? Explicit consent

Personal data should be gathered under strict conditions. Organisations need to ask for consent to collect personal data and they need to be clear about how they will use the information.

The right of access

Individuals will have the right to obtain access to their personal data and other supplementary information in a portable format. You must provide a copy of the information free of charge. GDPR also give individuals the right to have personal data corrected if it is inaccurate or incomplete.

The right to be forgotten

GDPR also introduces the right to be forgotten, or erased. Data are not to be hold for any longer than absolutely necessary, and data should not be used in any other way than it was originally collected for.

Penalties and fines

Companies that fails to protect customer data adequately will face significant fines up to €20m, or up to 4% of global turnover. This should be a serious incentive for companies to start preparing now.

First steps to GDPR compliance

Create awareness and allocate resources
First step is to make sure that your organisation is aware of the new EU legislation and what it means for you. How will your business be affected by the new regulation? You need to allocate enough resources, make sure you involve decision-makers and stakeholders in your organisation. Last, but not least, start today!

Content Inventory
The second step is to discover and classify all your information to identify exactly what types of personal identifiable data you have, where you have it and how it is collected.

In accordance to sources, the birth of the intranet fell on a 1994 – 1996, that was true prehistory from an IT systems point of view. Intranet history is bound up with the development of Internet – the global network. The idea of WWW, proposed in 1989 by Tim Berners-Lee and others, which aim was to enable the connection and access to many various sources, became the prototype for the first internal networks. The goal of intranet invention was to increase employees productivity through the easier access to documents, their faster circulation and more effective communication. Although, access to information was always a crucial matter, in fact, intranet offered lots more functionalities, i.e.: e-mail, group work support, audio-video communication, texts or personal data searching.

Overload of information

Over the course of the years, the content placed on WWW servers had becoming more important than other intranet components. First, managing of more and more complicated software and required hardware led to development of new specializations. Second, paradoxically the easiness of information printing became a source of serious problems. There was too much information, documents were partly outdated, duplicated, without homogeneous structure or hierarchy. Difficulties in content management and lack of people responsible for this process led to situation, when final user was not able to reach desired piece of information or this had been requiring too much effort.

Google to the rescue

As early as in 1998 the Gartner company made a document which described this state of Internet as a “Wild West”. In case of Internet, this problem was being solved by Yahoo or Google, which became a global leader on information searching. In internal networks it had to be improved by rules of information publishing and by CMS and Enterprise Search software. In many organizations the struggle for easier access to information is still actual, in the others – it has just began.

And the Search approached

It was search engine which impacted the most on intranet perception. From one side, search engine is directly responsible for realization of basic assumptions of knowledge management in the company. From the other, it is the main source of complaints and frustration among internal networks users. There are many reasons of this status quo: wrong or unreadable searching results, lack of documents, security problems and poor access to some resources. What are the consequences of such situation? First and foremost, they can be observed in high work costs (duplication of tasks, diminution in quality, waste of time, less efficient cooperation) as well as in lost chances for business. It must not be forgotten that search engine problems often overshadow using of intranet as a whole.

How to measure efficiency?

In 2002 Nielsen Norman Group consultants estimated that productivity difference between employees using the best and the worst corporate network is about 43%. On the other hand, annual report of Enterprise Search and Findability Survey shows that in situation, when almost 60% of companies underline the high importance of information searching for their business, nearly as 45% of employees have problem with finding the information.
Leaving aside comfort and level of employees satisfaction, the natural effect of implementation and improvement of Enterprise Search solutions is financial benefit. Contrary to popular belief, investments profits and savings from reaching the information faster are completely countable. Preparing such calculations is not pretty easy. The first step is: to estimate time, which is spent by employees on searching for information, to calculate what percentage of quests end in a fiasco and how long does it take to perform a task without necessary materials. It should be pointed out that findings of such companies as IDC or AIIM shows that office workers set aside at least 15-35% of their working hours for searching necessary information.
Problems with searching are rarely connected with technical issues. Search engines, currently present on our market, are mature products, regardless of technologies type (commercial/open-source). Usually, it is always a matter of default installation and leaving the system in untouched state just after taking it “out of the box”. Each search engine is different because it deals with various documents collections. Another thing is that users expectations and business requirements are changing continually. In conclusion, ensuring good quality searching is an unremitting process.

Knowledge workers main tool?

Intranet has become a comprehensive tool used for companies goals accomplishment. It supports employees commitment and effectiveness, internal communication and knowledge sharing. However, its main task is to find information, which is often hide in stack of documents or dispersed among various data sources. Equipped with search engine, intranet has become invaluable working tool practically in all sectors, especially in specific departments as customer service or administration.

So, how is your company’s access to information?

This text makes an introduction to series of articles dedicated to intranet searching. Subsequent articles are intended to deal with: search engine function in organization, benefit from using Enterprise Search, requirements of searching information system, the most frequent errors and obstacles of implementations and systems architecture.

This is the seventh post in a series (1, 2, 3, 4, 5, 6) on the challenges organisations face as they move from having online content and tools hosted firmly on their estate to renting space in the cloud. We will help you to consider the options and guide on the steps you need to take.

Starting from our first post we have covered different aspects you need to consider as you take each step including information structure and how it is managed using Office 365 and SharePoint as a technology example. Planning for migration.

Do not even think about moving into the cloud apartment without a proper cleaning of the content buckets. Moving from an architected household to a rented place, taxes a structured audit. Clean out all redundant, outdated and trivial matter (ROT). The very same habit you have cleaning up the attic when moving out from your old house.

It is also a good idea to decorate and add any features to your new cloud apartment before the content furniture is there. It means the content will fit with any new design and adapt to any extra functionality with new features like windows and doors. This can be done by reviewing and updating your publishing templates at the same time. This will save time in the future.

Leaning upon the information governance standards, it should be easy to address the cleaning before moving, for all content owners who have been appointed to a set of collections or habitats. Most organisations could use a content vacuum cleaner, or rather use the search facilities and metric means to deliver up to date reports on:

Active / in-Active habitats

No clear ownership or the owner has left the building

Metadata and link quality to content and collections to be moved across to the cloud apartments.

Review publishing templates and update features or design to be used in the Cloud

When all active habitats and qualified content buckets have been revisited by their set of curators and information owners. The preparation and use of moving boxes, should be applied.

All moving boxes do need proper tagging, so that any moving company will be able to sort out where about the stuff should be placed in the new house, or building. For collections, and habitats, this means using the very same set of questions stated for adding a new habitat or collection to the cloud apartment house. Who, why, where and so forth, through the use of a structured workflow and form. When this first cleaning steps have been addressed, there should be automatic metadata enhancement, aligned with the information management processes to be used in the new cloud.

With decent resource descriptions and cleaned up content through the audit (ROT), this last step will auto-tag content based upon the business rules applied for the collection or habitat. Then been loaded into the content moving truck, or loading dock. Ready to added to the cloud.

All content that neither have proper assigned information ownership, or are in such a shape that migration can’t be done should persist on the estate or be archived or purged. This means that all metadata and links to either content bucket or habitat that won’t be moved in the first instances, should at least have correct and unique uri:s, address, to this content. And in the case a bucket or habitat have been run down by a demolition firm, purged. All inter-linkage to that piece of content or collection have to be changed.

This is typically a perfect quality report, to the information owners and content editors, that they need to work through prior to actually loading the content on the content dock.

Finally when all rotten data, deserted habitats and unmanageable buckets have been weeded out. It is time to prepare the moving truck, sending the content into its new destination.

Our final thread will cover how will the organisation and it habitants will be able to find content in this mix of clouds, and things left behind on the old estate? Cloud Search and Enterprise Search, seamless or a nightmare?

This is the fourth post in a series (1, 2, 3,5, 6, 7) on the challenges organisations face as they move from having online content and tools hosted firmly on their estate to renting space in the cloud. We will help you to consider the options and guide on the steps you need to take.

In the first post we set out the most common challenges you are likely to face and how you may overcome these. In the second post we focused on how Office 365 and SharePoint can play a part in moving to the cloud. In the third post we covered how they can help join up your organisation online using their collaboration tools and features.

In this post we will cover engagement and how sorting and categorisation of artifacts, according to a simple-to-understand and easy-to-use standard, will form the bits and parts of the curation and cultivation process.

All document libraries should have one standard listing of all items – with two very distinct audiences: being either actors within the habitat or the people contributing, acting and joining the daily conversation; and secondly, those visitors who pass-by the habitat to collect, link and act upon the content presented within the habitats realm.

This makes it very easy for visitors to find their way around a habitat, if the visitors’ area (business lounge) is pretty much aligned to the overarching theme of the site… and all artifacts that the project team like to share wider, have been listed in a virtual bookshelf, with major versions only. The visitors’ area, has all the relevant data, presented upfront. Basically the answers to the questions set when starting the project. The visitors’ area shouldn’t be a backdrop, but rather a storefront. The content has to be of good quality. Then there should be options to engage with the inner-living-room of the habitat, and enter the messy on-going conversations, depending on access-rights. But the default setting, should always be openfor unexpected “internal” (within the realm of the organisation) visitors. If the visitors’ area is compiled in a nice and easy to use manner, most visitors are just happy to pick the best-read from the bookshelf, or at least raise a questions for the team! The social construct for this is “welcoming a stranger”, since that visitor might link to your team’s content, cross-linking into his social-spaces.

The habitat’s livingroom and social conversations, will address new context-specific organising principles. A team might want to add new list-items, sort categories or introduce very local what-goes-where themes. This may be especially so when the team consists of actors who have different roles and responsibilities with regard to the overall outcome. And because of this, there may be a certain mix of tools or services in this one habitat of many, where they hang-out for project tasks.

The contextual adjustment is where the curator has to work on a cultivation process that glues the team together. The shared terminology within a group conversation, is what match their practices together. At inception, the curator picks a bouquet of on-topic terms from the controlled vocabularies. Mixing this with everyday use, and contributions from all members, this can be the fruitful and semantically-enhanced conversations with end-user generated tags or “folksonomies”. The same goes for interior design of links, tools, chosen content types and other forms of artifacts that the team will be needing to fulfill their goals and outcome.

The governance of the habitat, leans very much on the shared experiences in the group, and assigned responsibilities for stewardship and curation – where publishing standards, guidelines and training should be part of the mix.

This is the third post in a series (1, 2,4, 5, 6, 7) on the challenges organisations face as they move from having online content and tools hosted firmly on their estate to renting space in the cloud. We will help you to consider the options and guide on the steps you need to take.

In the first post we set out the most common challenges you are likely to face and how you may overcome these. In the second post we focused on how Office 365 and SharePoint can play a part in moving to the cloud. Here we cover how they can help join up your organisation online using their collaboration tools and features.

When arranging the habitat, it is key to address the theme of collaboration. Since each of these themes, derives different feature settings of artifacts and services. In many cases, teamwork is situated in the context of a project. Other themes for collaboration are the line of business unit teamwork, or the more learning networks a.k.a communities of practice. I will leave these later themes for now.

Most enterprises have some project management process (i.e. PMP) that all projects do have to adhere to, with added complementary documentation, and reporting mechanisms. This is so the leadership within the organisation will be able to align resources, govern the change portfolio across different business units. Given this structure, it is very easy to depict measurable outcomes, as project documents have to be produced, regardless of what the project is supposed to contribute towards.

Why? usually defined in project description, setting common ground for the goals and expected outcome. ( dc.description )

How? defines used processes, practices and tools to create the expected outcome for the project, with links to common resources as the PMP framework, but also links to other key data-sets. Like ERP record keeping and masterdata, for project number and other measures not stored in the habitat, but still pillars to align to the overarching model. (dc.relation)

When these questions have been answered, the resource description for the habitat is set. In Sharepoint the properties bag (code) feature. During the lifespan of the on-going project, all contribution, conversations and creation of things can inherit rule-based metadata for the artifacts from the collections resource description. This reduces the burden weighing on the actors building the content, by enabling automagic metadata completion where applicable. And from the wayfinding, and findability within and between habitats, these resource descriptions will be the building blocks for a sustainable information architecture. In our next post we will cover how to encourage employee engagement with your content.

This is the second post in a series (1, 3, 4, 5, 6, 7) on the challenges organisations face as they move from having online content and tools hosted firmly on their estate to renting space in the cloud. We will help you to consider the options and guide on the steps you need to take.

In the first post we set out the most common challenges you are likely to face and how you may overcome these. In this post we focus on how Office 365 and SharePoint online can play a part in moving to the cloud.

Let us be pragmatic and down-to-earth! It is time to roll up our sleeves and consider using Office 365, as one example of how organisations can make this transition from their estate to the cloud. Given that this is the collaborative space many organisations consider using, Office 365 is compelling as a one-size-fits-all, instant build and just roll-out enterprise-wide approach to take sometimes without an Information Architecture plan whatsoever!

In the Office 365 environment, one has to map the terrain, so that there are distinct districts to where things relate – the same goes for the structure of neighborhoods of clustered habitats. But where it gets tough is to have an agile and resilient city plan for the real-world experience. This is actually the pillar construction in a digital domain, aiming for resilience and emerging uses over the time… but with a simple and agreed upon game plan.

Pace-layering the information architecture

Most organisations have an ontology of entities, things, that are generic, as stated in the W3C Organisation schema. And these perspectives, domain models, vocabularies and ontologies, add up to become districts, and neighborhoods in the Information Architecture map, with a few angles:

Regardless of line of Business for an organisation, these pan out as pretty good structural elements on which to build upon. Since an enterprise is a social construct with agreed borders, it is populated with people who act and interplay in various different ways, and have a multitude facets with regards to everyday work. Some entities change more frequently, generally in the organisational units further down in the leaves, and less so in the top main branches. The vocabularies within an organisation needs to be the center pillar, to reduce linguistic insecurities.

From an Information Architecture perspective, in using Office 365 or Sharepoint it is wise to use pace-layering to the building blocks, on to which navigational constructs are built upon. This means, using the highest level of the organisational unit tree branch, a pretty stable foundation for the site-structure can be built. This is where content and teamsites live. More fluid navigational themes (temporal, or topic entities) can then be added.

This goes for activities undertaken within daily practices, where a set of professions and disciplines interact. All of these activities lay out a tapestry of overarching business processes. The outcome or result, might be a thing that is detailed as topic taxonomies. For example, a product structure for a specific manufacturing industry. Since all organisations have actor networks in their ecology, it is preferable to add these entities into the structure, as clients, partners, competitor, regulatory agencies, social networks, communities of practice and so forth.

All of these set of terms, have to be maintained in a Managed Metadata Service, a.k.a TermStore. In most organisations there are other sources of their controlled vocabularies, hence mapping is key, to have aligned master term sets. Either through subscription models (batch) or enterprise-linked-data sets. All these actions, defines the terrain, so we map the ecosystem as taxonomic chartographers.

The Building Blocks: Artifacts and Collections

Office 365 comes with a pretty organised set of tools, themes and things to build upon. For more website related things, one could either use published web sites / portals, or enterprise wikis. The other main services are digital habitats, or collaborative spaces, team-sites. And lasty there are ESN (Enterprise Social Networks) like Yammer, and instant messaging tools like Lync, and Exchange Services like mail and calendar. Sharepoint Online and Office 365 is a Swiss Army Knife.

The emerging hyper-connected and agile enterprises of today are stigmatised by their IS/IT-legacy, so the question is: Will emerging web and semantic technologies and practices undo this stigma?

Semantic Technologies and Linked-Open-Data (LOD) have evolved since Tim Berners-Lee introduced their basic concepts, and they are now part of everyday business on the Internet, thanks mainly due to their uptake by information and data-run companies like Google, social networks like Facebook and large content sites, like Wikipedia. The enterprise information landscape is ready to be enhanced by semantic web, to increase findability and usability. This change will enable a more agile digital workplace where members of staff can use cloud based services, anywhere, anytime on any device, in combination with the set of legacy systems backing their line-of-business. All in all, more efficient organising principles for information and data.

The Corporate Information Landscape of today

In everyday workplace we use digital tools to cope with the tasks at hand. These tools have been set into action to address meta models to structure the social life dealing with information and data. The legacy of more than 60 years of digital records keeping, has left us in an extremely complex environment, where most end-users have a multitude of spaces where they are supposed to contribute. In many cases their information environment lacks interoperability.

A good, or rather bad example of this, is the electronic health records (EHR) of a hospital, where several different health professionals try to codify their on-going work in order to make better informed decisions regarding the different medical treatments. While this is a good thing, it is heavily hampered with closed-down silos of data that do not work in conjunction with the new more agile work practices. It is not uncommon to have more than 20 different information systems employed to do provisioning during a workday.

The information systems architecture, in any organisation or enterprise, may comprise of home-grown legacy systems from the past, or bought off-the-shelf software suites and extremely complex enterprise-wide information systems like ERP, BI, CRM and the like. The connections between these information systems (or integration points) often resemble “spaghetti” syndrome, point-to-point. The work practice for many IT professionals is to map this landscape of connections and information flows, using for example Enterprise Architecture models. Many organisations use information integration engines, like enterprise-service-bus applications, or master data applications, as means to decouple the tight integration and get away from the proprietary software lock-in.

On top of all these schema-based, structured data, information systems, lies the social and collaborative layer of services, with things like intranet (web based applications), document management, enterprise wide social networks (e.g. Yammer) and collaborative platforms (e.g SharePoint) and more obviously e-mail, instant messaging and voice/video meeting applications. All of these platforms and spaces where one carries out work tasks, have either semi-structured (document management) or unstructured data.

Wayfinding

A matter of survival in the enterprise information environment, requires a large dose of endurance, and skills. Many end-users get lost in their quest to find the relevant data when they should be concentrating on making well-informed decisions. Wayfinding is our in-built adaptive way of coping with the unexpected and dealing with it. Finding different pathways and means to solve the issues. In other words … Findability.

Outside-in and Inside-Out

Today most organisations and enterprises workers act on the edge of the corporate landscape – in network conversations with customers, clients, patients/citizens, partners, or even competitors, often employing means not necessarily hosted inside the corporate walls. On the Internet we see newly emerging technologies become used and adapted at a faster rate and in a more seamless fashion than the existing cumbersome ones of the internal information landscape. So the obvious question raised in all this flux is: why can’t our digital workplace (the inside information landscape) be as easy to use and to find things / information as in the external digital landscape? Why do I find knowledgeable peers in communities of practice more easily outside than I do on the inside? Knowledge sharing on the outpost of the corporate wall is vivid, and truly passionate whereas inside it is pretty stale and lame to say the least.

Release the DATA now

Aggregate technologies, such as Business Intelligence and Datawarehouse, use a capture, clean-up, transform and load mechanism (ETL) from all the existing supporting information systems. The problem is that the schemas and structures of things do not compile that easily. Different uses and contexts make even the most central terms difficult to unleash into a new context. This simply does not work. The same problem can be seen in the enterprise search realm where we try to cope with both unstructured or semi-structured data. One way of solving all this is to create one standard that all the others have to follow and including a least common denominator combined with master data management. In some cases this can work, but often the set of failures fromsuch efforts are bigger than those arising from trying to squeeze an enterprise into a one-size-fits-all mega-matrix ERP-system.

Why is that? you might ask, from the blueprint it sounds compelling. Just align the business processes and then all data flows will follow a common path. The reality unfortunately is way more complex because any organisation comprises of several different processes, practices, professions and disciplines. These all have a different perspectives of the information and data that is to be shared. This is precisely why we have so many applications in the first place! To what extent are we able to solve this with emerging semantic technologies? These technologies are not a silver bullet, far from it! The Web however shows a very different way of integration thinking, with interoperability and standards becoming the main pillars that all the other things rely on. If you use agreed and controlled vocabularies and standards, there is a better chance of actually being able to sort out all the other things.

Remember that most members of staff, work on the edges of the corporate body, so they have to align themselves to the lingo from all the external actor-networks and then translate it all into codified knowledge for the inside.

Semantic Interoperability

Today most end-users use internet applications and services that already use semantic enhancements to bridge the gap between things, without ever having to think about such things. One very omnipresent social network is Facebook, that relies upon the FOAF (Friend-of-a-Friend) standard for their OpenGraph. Using a Graph to connect data, is the very corner stone of linked-data and the semantic web. A thing (entity) has descriptive properties, and relations to other entities. One entity’s property might be another entity in the Graph. The simple relationship subject-predicate-object. Hence from the graph we get a very flexible and resilient platform, in stark contrast to the more traditional fixed schemas.

The Semantic Web and Linked-Data are a way to link different data sets that may grow from a multitude of schemas and contexts into one fluid interlinked experience. If all internal supporting systems or at least the aggregate engines could simply apply a semantic texture to all the bits and bytes flowing around, it could well provide a solution to the area where other set ups have failed. Remember that these linked-data sets are resilient by nature.

There is a set of controlled vocabularies (thesauri, ontologies and taxonomies) that capture all the of topics, themes and entities that make up the world. These vocabs have to some extent already been developed, classified and been given sound resource descriptors (RDF). The Linked-Open-Data clouds are experiencing a rapid growth of meaningful expressions. WikiData, dbPedia, Freebase and many more ontologies have a vast set of crispy and useful data that when intersected with internal vocabularies, can make things so much easier. A very good example of such useful vocabularies, are the ones developed by professional information science people is that of the Getty Institute’s recently released thesari for AAT (Arts and Architecture), CONA (Cultural Object Authority) and TGN (Geographical Names). These are very trustworthy resources, and using linked-data anybody developing a web or mobile app can reuse their namespace for free and with high accuracy. And the same goes for all the other data-sets in the linked-open-data cloud. Many governments have declared open data as the main innovation space in which to release their things, under the realm of the “Commons”.

Inaddition to this, all major search engines have agreed on a set of very simple-to-use schemas captured in the www.schema.org world. These schemas have been very well received from their very inception by the webmaster community. All of these are feeding into the Google Knowledge Graph and all the other smart-things (search-enabled) we are using daily.

From the corporate world, these Internet mega-trends, have, or should have, a big impact on the way we do information management inside the corporate walls. This would be particularly the case if the siloed repositories and records were semantically enhanced from their inception (creation), for subsequent use and archiving. We would then see more flexible and fluid information management within the digital workplace.

The name of the game is interoperability at every level: not just technical device specifics, but interoperability at the semantic level and at the level we use governing principles for how we organise our data and information, regardless of their origin.

Stepping down, to some real-life examples

In the law enforcement system in any country, there is a set of actor-networks at play: the police, attorneys, courts, prisons and the like. All of them work within an inter-organisational process from capturing a suspect, filing a case, running a court session, judgement, sentencing and imprisonment; followed at the end by a reassimilated member of society. Each of these actor-networks or public agencies have their own internal information landscape with supporting information systems, and they all rely on a coherent and smooth flow of information and data between each other. The problem is that while they may use similar vocabularies, the contexts in which they are used may be very different due to their different responsibilities and enacted environment (laws, regulations, policies, guidelines, processes and practices) when looking from a holistic perspective.

A way to supersede this would be to infuse semantic technologies and shared controlled vocabularies throughout, so that the mix of internal information systems could become interoperable regardless of the supporting information system or storage type. In such a case linked-open-data and semantic enhancements could glue and bridge the gaps to form one united composite, managed by just one individual’s record keeping. In such a way, the actual content would not be exposed, rather a metadata schema would be employed to cross any of the previously existing boundaries.

This is a win-win situation, as semantic technologies and any linked-open-data tinkering use the shared conversation (terms and terminologies) that already exists within the various parts of the process. While all parts cohere to the semantic layers, there is no need to reconfigure internal processes or apply other parties’ resource descriptions and elements. In such a way only parts of schemas are used that are context specific for a given part of a process, and so allowing the lingo of the related practices and professions to be aligned.

This is already happening in practice in the internal workplace environment of an existing court, where a shared intranet is based on such organising principles as already mentioned, uses applied sound and pragmatic information management practices and metadata standards like Dublin Core and Common Vocabularies – all of which are infused in Content Provisioning.

For the members of staff, working inside a court setting, this is a major improvement, as they use external databases everyday to gain insights in order to carry out their duties. And when the internal workplace uses such a set up, their knowledge sharing can grow – leading to both improved wayfinding and findability.

Yet another interesting case, is a service company that operates on a global scale. They are an authoritative resource in their line-of-business, maintaining a resource of rules and regulations that have become a canonical reference. By moving into a new expanded digital workplace environment (internet, extranet and intranet) and using semantic enhancement and search, they get a linked-data set that can be used by clients, competitors and all others working within their environment. At the same time their members of staff can use the very same vocabularies to semantically enhance their provision of information and data into the different information systems internally.

The last example is an industrial company with a mix of products within their line-of-business. They have grown through M&A over the years, and ended up in a dead-end mess of information systems that do not interoperate at all. A way to overcome the effect of past mergers and aquisitions, was to create an information governance framework. Applying it with MDM and semantic search they were able to decouple data and information, and as a result making their workplace more resilient in a world of constant flux.

One could potentially apply these pragmatic steps to any line of business, since most themes and topics have been created and captured by the emerging semantic web and linked-data realm. It is only a matter of time before more will jump on this bandwagon in order to take advantage of changes that have the ability to make them a canonical reference, and a market leader. Just think of the film industry’s IMDB.

A final thought: Are the vendors ready and open-minded enough to alter their software and online services in order to realise this outlined future enterprise information landscape?

IBM Content Analytics with Enterprise Search (ICA) has its strength in natural language processing (NLP) which is achieved in the UIMA pipeline. From a Swedish perspective, one concern with ICA has always been its lack of NLP for Swedish. Previously the Swedish support in ICA consisted only of dictionary-based lemmatization (word: “sprang” -> lemma: “springa”). However, for a number of other languages ICA has also provided part of speech (PoS) tagging and sentiment analysis. One of the benefits of the PoS tagger is its ability to disambiguate words, which belong to multiple classes (e.g. “run” can be both a noun and a verb) as well as assign tags to words, which are not found in the dictionary. Furthermore, the POS tagger is crucial when it comes to improving entity extraction, which is important when a deeper understanding of the indexed text is needed.

IBM uses ICA and its NLP support together with several of their products. The jeopardy playing computer Watson may be the most famous example, even if it is not a real product. Watson used NLP in its UIMA pipeline when it analyzed its data from sources such as Wikipedia and Imdb.

One product which leverage from ICA and its NLP capabilities is Content and Predictive Analytics for Healthcare. This product helps doctors to determine which action to take for a patient given the patient’s journal and the symptoms. By also leveraging the predictive analytics from SPSS it is possible to suggest the next action for the patient.

ICA can also be connected directly to IBM Cognos or SPSS where ICA is the tool which creates structure to unstructured data. By using the NLP or sentiment analytics in ICA, structured data can be extracted from text documents. This data can then be fed to IBM Cognos, SPSS or non IBM products such as Splunk.

ICA can also be used on its own as a text miner or a search platform, but in many cases ICA delivers its maximum value together with other products. ICA is a product which helps enriching data by creating structure to unstructured data. The processed data can then be used by other products which normally work with structured data.

To achieve a high NPS, Net Promoter Score, the customer experience (cx) is crucial and a critical factor behind a positive customer experience is the ease of doing business. For companies who interact with their customers through the web (which ought to be almost every company these days) this of course implies a need to have good Findability and search on the website in order for visitors to be able to find what they are looking for without effort.

The concept of NPS was created by Fred Reichheld and his colleagues of Bain and Co who had an increasing recognition that measuring customer satisfaction on its own wasn’t enough to make conclusions of customer loyalty. After some research together with Satmetrix they came up with a single question that they deemed to be the only relevant one for predicting business success “How likely are you to recommend company X to a friend or colleague.” Depending upon the answer to that single question, using a scale of 0 to 10, the respondent would be considered one of the following:

The Net Promoter Score model

The idea is that Promoters—the loyal, enthusiastic customers who love doing business with you—are worth far more to your company than passive customers or detractors. To obtain the actual NPS score the percentage of Detractors is deducted from the percentage of Promoters.

Touch point experience (the degree of warmth and understanding conveyed by front-line employees)

According to ‘voice of the customer’ research conducted by British customer experience consultancy Cape Consulting the ease of doing business and the touch point experience accounts for 60 % of the Net Promoter Score, with some variations between different industry sectors. Both factors are directly correlated to how easy it is for customers to find what they are looking for on the web and how easily front-line employees can find the right information to help and guide the customer.

Successful companies devote much attention to user experience on their website but when trying to figure out how most visitors will behave website owners tend to overlook the search function. Hence visitors who are unfamiliar with the design struggle to find the product or information they are looking for causing unnecessary frustration and quite possibly the customer/potential customer runs out of patience with the company.

Ideally, Findability on a company website or ecommerce site is a state where desired content is displayed immediately without any effort at all. Product recommendations based on the behavior of previous visitors is an example but it has limitations and requires a large set of data to be accurate. When a visitor has a very specific query, a long tail search, the accuracy becomes even more important because there will be no such thing as a close enough answer. Imagine a visitor to a logistics company website looking for information about delivery times from one city to another, an ecommerce site where the visitor has found the right product but wants to know the company’s return policy before making a purchase or a visitor to a hospital’s website looking for contact details to a specific department. Examples like these are situations where there is only one correct answer and failure to deliver that answer in a simple and reliable manner will negatively impact the customer experience and probably create a frustrated visitor who might leave the site and look at the competition instead.

Investing in search have positive impacts on NPS and the bottom line

Google has taught people how to search and what to expect from a search function. Step one is to create a user friendly search function on your website but then you must actively maintain the master data, business rules, relevance models and the zero-results hits to make sure the customer experience is aligned. Also, take a look at the keywords and phrases your visitors use when searching. This is useful business intelligence about your customers and it can also indicate what type of information you should highlight on your website. Achieving good Findability on your website requires more than just the right technology and modern website design. It is an ongoing process that successfully managed can have a huge impact on the customer experience and your NPS which means your investment in search will generate positive results on your bottom line.