The Foundation of Data Quality

Blazent's CEO, Charlie Piper, and Dan Ortega introduce the company's strategy and vision and its value to customers and MSP partners. Together, Charlie and Dan describe how Blazent's platform finds the most accurate data to improve decision making in IT and beyond.

State agencies have long been built on a foundation of bureaucracy, process and structure, imposing governmental culture and value systems on the citizens and organizations that interact with them. The impact shows not only in the inherent inefficiencies that result, but also in the steadily increasing governmental costs associated with providing service. Fortunately, the environment is changing. Government agencies are increasingly looking to private industry as an example of modern customer-centric interactions and the internal capabilities needed to enable them.
State IT organizations have been some of the strongest proponents of IT service management, enterprise architecture and data governance standards. While it may appear that these approaches perpetuate the bureaucratic mindset, in reality, they establish a framework where the lines between government/private industry can be blurred, and citizens can benefit from the strengths of government organizations in new and innovative ways.
State processes have always been data-centric – collecting, processing and analyzing information to support the agency’s charter. Recently, however, the interpretation of this charter has changed to include a stronger focus on the efficient use of resources and the effectiveness of the organization in making a positive impact on its served community. While standards provide a framework for transparency, responsiveness and connectivity, achieving success relies strongly on implementation. How IT systems are implemented, both internally to the organization and in conjunction with the broader ecosystem of public and private partner organizations, is critical for determining whether the organization’s charter can be effectively fulfilled in the context of modern interactions and under the present-day cost constraints.

One of the goals of health reform and digital medical records efforts during the past decade has been enabling the creation of unified medical records. This “patient health timeline” would be a complete digital chronology of the patient’s lifetime medical history (including symptoms, test results, diagnoses, provider notes and treatment activities) that providers can use when treating the patient.

An ambitious goal, the “patient health timeline” has been a difficult vision to realize due to the volume and fragmentation of patient health records – some of which have been digitized and some still reside in paper form only.

Fragmentation: Health records for a single patient are spread across the systems of a number of healthcare providers, insurance companies, pharmacies, hospitals and treatment centers. Each of these systems is unique, with no standard means of integrating patient data. Properly contextualizing data through an accurate set of relationships is key to establishing the integrity of integrated data from different sources.

Accuracy: There are portions of a patient’s health record which are relatively static throughout their lifetime (family medical history, allergies, chronic conditions and demographic data) and other portions that change with the patient’s health status and general aging (height/weight, reported symptoms, diagnoses and treatments, mental state, etc.). Even for the static portions (e.g., profile information), provider records often contain conflicting information.

Patient Privacy: Regulations require patients to grant specific authorization for the use and sharing of personal health records. Compiling the patient health timeline would require the patient to authorize the integration of the data, approve uses of the timeline after it is compiled, and retain the ability to revoke authorization for specific data points or sets in the future.

For almost a decade, companies have been investing in IT systems to support business process automation and to enable data-driven decision making. The good news is that those investments have generated acceptable ROIs, most core functions have IT systems to support them, and the leaders of those functions use the generated data to make decisions every day. What happens now?
Most IT systems produce reports aligned to the business functions the software is designed to support – providing improved functional insights to end-users and decision makers. This reporting is sufficient (and in some cases ideal) to support the discrete needs of the individual business function or process and, over time, has enabled companies to independently optimize sales, manufacturing, finance, customer support, IT and other functions. The downside has been a tendency to create siloed business behavior and blind spots to data in other parts of the organization.
Modern businesses are becoming more aware of the blurred dividing lines across organizations, as business leaders work together to address mounting cost pressures and retain their competitive advantage. The low-hanging fruit of functional optimization has already been harvested, and it is becoming clear to many leaders that optimizing cross-functionally across the company not only leads to greater efficiency and reduced duplication, but also creates opportunities and potential value on a much larger scale.
To enable cross-functional optimization, IT organizations must deliver capabilities to business decision makers to look at data across the organization, allowing them to gain the integrated insights they need.

Operational Technology (OT) consists of hardware and software that are designed to detect or cause changes in physical processes through direct monitoring and control of devices. As companies increasingly embrace OT, they face a dilemma as to whether to keep these new systems independent or integrate them with their existing IT systems. As IT leaders evaluate the alternatives, there are several key barriers to IT/OT integration to consider.
Business Process Knowledge
Manageability & Support
Dependency Risk – Two of the key challenges of enterprise IT environments are managing the complex web of dependencies and managing the risk of service impact when a dependent component fails or is unavailable. With traditional IT, the impact typically falls on some human activity, and the user is able to mitigate it through some type of manual workaround. For OT, companies must be very careful managing the dependencies on IT components to avoid the risk of impacting physical processes when and where humans are not available to intervene and mitigate the situation.
Management of OT Data – The data produced by OT devices can be large, diverse in content, time sensitive for consumption and geographically distributed (sometimes not even connected to the corporate network). In comparison, most IT systems have some level of tolerance for time delays, are relatively constrained in size and content and reliably connected to company networks, making them accessible to the IT staff for data management and support.
Security – IT systems are a common target for malicious behavior by those wishing to harm the company. The integration of OT systems with IT creates additional vulnerability targets with the potential of impacting not just people but also physical processes.
Segmentation of IT

In the wake of the most recent (May 2017) malware attack impacting computer systems around the world, company executives are in urgent discussions with IT leaders, asking them to provide assessments of risks and vulnerabilities and recommendations to safeguard the company’s information and operations. CIOs and IT leaders strongly depend on the accuracy, completeness and trustworthiness of the data at their disposal to make informed decisions. How confident are you of the data being used to protect your organization from harm?
There are commonly at least 5 independent sources of data that must be combined to identify what devices are potentially vulnerable and what business functions depend on them. When these data sets are gathered, there will undoubtedly be a large number of duplicates, partial records, records for devices that have been retired or replaced, conflicting data about the same device and records with old data that is inaccurate. According to Gartner, at any moment, as much as 40% of enterprise data is inaccurate, missing or incomplete. Data quality technology can help integrate the data, resolve the issues, alert data management staff to areas that need attention and help decision makers understand the accuracy and completeness of the data on which they depend.
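For illustration, here is a minimal sketch of how device records from several independent sources might be reconciled, with duplicates merged and conflicts flagged for data management staff. The source names, fields and merge rules are assumptions for the example, not Blazent's actual algorithm:

```python
# Minimal sketch: reconciling device records from multiple inventory sources.
# Source names and field names are hypothetical, not a specific product's schema.

def reconcile(sources):
    """Merge device records keyed by serial number, flagging conflicts."""
    merged, issues = {}, []
    for source_name, records in sources.items():
        for rec in records:
            key = rec.get("serial")
            if key is None:                      # partial record: no reliable identity
                issues.append((source_name, rec, "missing serial"))
                continue
            existing = merged.setdefault(key, {})
            for field_name, value in rec.items():
                if field_name in existing and existing[field_name] != value:
                    issues.append((source_name, rec, f"conflict on {field_name}"))
                else:
                    existing.setdefault(field_name, value)
    return merged, issues

sources = {
    "discovery_tool": [{"serial": "A1", "os": "Windows 10", "owner": "Finance"}],
    "cmdb_export":    [{"serial": "A1", "os": "Windows 7"},   # stale OS entry
                       {"hostname": "srv-02"}],               # partial record
}
devices, issues = reconcile(sources)
print(len(devices), len(issues))   # 1 merged device, 2 flagged issues
```

A real platform would also apply precedence and freshness rules when sources disagree; the sketch simply keeps the first value seen and reports the conflict.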
Blazent has been a leader in providing Data Quality solutions for more than 10 years and is an expert in integrating the types of IT operational data needed to help CIOs and IT leaders assemble an accurate and unified big picture view of their technology ecosystem. With data quality and trustworthiness enabled by Blazent’s technology, your leaders and decision makers can be confident that the information they are using to assess vulnerabilities and risks will lead to solid recommendations and decisions that protect your organization from harm.

What will you do when your job and the future of your company hinge on your ability to analyze almost every piece of data your company ever created against everything known about your markets, competitors and customers – and the impact of your decision will determine success or failure? That future is closer than you think. Data on an entirely different level is coming, and much faster than anyone realizes. Are you prepared for this new paradigm?

•Technologists have been talking about “big data” as a trend for more than a decade, saying that it is coming “soon.” “Soon” is now in your rear-view mirror.
•Companies have been capturing and storing operational and business process data for more than 20 years (sometimes longer), providing a deep vault of historical data, assuming you can access it.
•IoT is leading to the creation of a massive stream of new operational data at an unprecedented rate. If you think volumes are high now, you’ve seen nothing yet.
•The free flow of user-generated (un-curated) information across social media has enabled greater contextual insights than ever before, but concurrently the signal-to-noise ratio is off the charts.

What does all this mean? It means big data is already driving everything we do. The analytics capabilities of IT systems are becoming more sophisticated and easier for business leaders to use to analyze and tune their businesses. For them to be successful and make good decisions, however, the data on which they rely must be trustworthy, complete, accurate and inclusive of all available data sets.

Delivering the underlying quality data that leaders need is no small feat for the IT department. The problem has transformed from “not enough data” to “too much of a good thing.” The challenge facing most organizations is filtering through the noise in the data and amplifying the signal of information that is relevant and actionable for decision-making.

1. What are they? An accurate inventory of what assets and configuration items exist in your IT ecosystem is the foundation of your CMDB. Your asset/CI records may come from discovery tools, physical inventories, supplier reports, change records, or even spreadsheets, but whatever their origin, you must know what assets you have in your environment.
2. Where are they? Asset location may not seem relevant at first, but the physical location of hardware, software and supporting infrastructure impacts what types of SLAs you can provide to users, the cost of service contracts with suppliers and, in some areas, regulatory requirements.
3. Why do we have them? Understanding the purpose of an asset is the key to unlocking the value it provides to the organization. Keep in mind that an asset’s purpose may change over time as the business evolves.
4. To what are they connected? Dependency information is critical for impact assessment, portfolio management, incident diagnosis and coordination of changes.
5. Who uses them? User activities and business processes should both be represented in the CMDB as CIs (they are part of your business/IT ecosystem).
6. How much are they costing? Assets incur both direct and indirect costs for your organization. Some examples may include support contracts, licensing, infrastructure capacity, maintenance and upgrades, service desk costs, taxes and management/overhead by IT staff.
7. How old are they? Nothing is intended to be in your environment forever. Understanding the age and the expected useful life of each of your assets helps you understand the past and future costs (TCO) and informs decisions about when to upgrade versus when to replace an asset.
8. How often are they changing? Change requests, feature backlogs and change management records provide valuable insights into the fitness of the asset for use (both intended use and incidental).
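The eight questions above can be pictured as a single CI record. The sketch below is purely illustrative; the field names and types are assumptions for the example, not a standard CMDB schema:

```python
# Illustrative only: a CI record shaped around the eight questions above.
# Field names are assumptions, not a standard CMDB schema.
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class ConfigurationItem:
    name: str                                        # 1. what is it?
    location: str                                    # 2. where is it?
    purpose: str                                     # 3. why do we have it?
    depends_on: list = field(default_factory=list)   # 4. connections
    used_by: list = field(default_factory=list)      # 5. users / business processes
    annual_cost: float = 0.0                         # 6. direct + indirect cost
    deployed: Optional[date] = None                  # 7. age / useful life
    change_count: int = 0                            # 8. change frequency

ci = ConfigurationItem(
    name="erp-db-01", location="DC-East", purpose="ERP database",
    depends_on=["san-03"], used_by=["order-to-cash"],
    annual_cost=42_000.0, deployed=date(2015, 6, 1), change_count=14,
)
print(ci.purpose)
```

Each attribute answers one of the questions, which is why gaps in any of them leave a blind spot in the CMDB.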

Machine Learning is a game changer for business process optimization – enabling organizations to achieve levels of cost and quality efficiency never imagined previously. For the past 30 years, business process optimization was a tedious, time-consuming manual effort. Those tasked with this effort had to examine process output quality and review a very limited set of operational data to identify optimization opportunities based on historical process performance. Process changes would require re-measurement and comparison to pre-change data to evaluate the effectiveness of the change. Often, improvement impacts were either unmeasurable or failed to satisfy management's expectations.
With modern machine-learning capabilities, process management professionals are able to integrate a broad array of sensors and monitoring mechanisms to capture large volumes of operational data from their business processes. This data can be ingested, correlated and analyzed in real-time to provide a comprehensive view of process performance. Before machine learning, managing the signals from instrumented processes was limited to either pre-defined scenarios or the review of past performance. These limitations have now been removed.
In business process optimization, there is an important distinction to be made between “change” and “improvement.” Machine-learning systems can correlate a large diversity of data sources – even without pre-defined relationships. They provide the ability to qualify operational (process) data with contextual (cost/value) data to help process managers quantify the impacts of inefficiencies and the potential benefits of changes. This is particularly important when developing a business justification for process optimization investments.

The WannaCry ransomware worm ravaged computers across 150 countries. The attacks began May 12, 2017, infecting PCs of organizations that had not applied security updates to some versions of Microsoft Windows. This menace paired ransomware that encrypted computers and demanded payment with a worm that enabled it to spread quickly. The ransomware encrypts all the user’s data, then a pop-up message appears demanding a $300 Bitcoin payment in return for the decryption key.
In the UK, the National Health System attack resulted in hospital workers being unable to review patient health histories, causing postponed surgeries and increasing risks to all new patients. Medical staff reported seeing computers go down “one by one” as the attack took hold, locking machines and demanding money to release the data.
Organizations had only days to patch their Windows end-user and server systems. Once on a system, the malware discovers on what subnet it is located, so it can infect its neighbors. Anti-virus software is the next defense when a worm has breached a machine. Ensuring total coverage of IT infrastructure is critical. Any chinks in the armor must be detected and remediated. Anti-virus products detect strings of code known as virus signatures before killing the offending program. When these products fail, network administrators are forced to redirect suspicious traffic to IP sinkholes, steering it out of harm’s way.
Just like anti-virus software, patch management solutions usually require a management agent to be installed on the target system. Not surprisingly, 100% coverage is very rare.
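Once an accurate, consolidated inventory exists, the coverage gaps described above reduce to simple set arithmetic: compare what you know you own against what the agents actually report. The host names and source sets below are hypothetical:

```python
# Sketch: finding coverage gaps between a consolidated asset inventory and
# the hosts that anti-virus and patch agents actually report. Hypothetical data.

inventory = {"ws-001", "ws-002", "srv-01", "srv-02", "srv-03"}   # all known assets
av_agents = {"ws-001", "srv-01", "srv-02"}                       # hosts with AV agent
patched   = {"ws-001", "ws-002", "srv-01"}                       # hosts with the update

unprotected = inventory - av_agents      # no anti-virus agent installed
unpatched   = inventory - patched        # missing the security update
exposed     = unprotected & unpatched    # neither defense in place

print(sorted(exposed))   # ['srv-03']
```

The hard part, of course, is the `inventory` set itself: if it is incomplete, the exposed machines never show up in the comparison at all.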
Despite encouraging reports of waning threat activity, WannaCry continues to pose significant risks. Blazent provides a SaaS solution that enables its customers to take advantage of five or more data sources to build an accurate inventory of their IT assets, such as end-user systems and servers.

People are the heart and mind of your business. Processes form the backbone of your operations. Data is the lifeblood that feeds everything you do. For your business to operate at peak performance and deliver the results you seek, people, processes and data must be healthy individually, as well as work in harmony. Technology has always been important to bringing people, process and data together; however, technology’s importance is evolving. As it does, the relationships among people, processes and technology are also changing.
People are the source of the ideas and the engine of critical thinking that enables you to turn customer needs and market forces into competitive (and profitable) opportunities for your business. The human brain is uniquely wired to interpret a large volume of information from the environment, analyze it and make decisions about how to respond.
Business and manufacturing processes provide the structure of your company’s operations – aligning the activities and efforts of your people into efficient and predictable workflows. Processes are critical to enable the effective allocation of the organization’s resources and ensure consistent and repeatable outcomes in both products and business functions.
Operational data enables the people and process elements of your company to work together, providing both real-time and historical indications of what activities are taking place and how well they are performing. The ability of companies to fine tune their organization effectively for optimal business performance will be largely dependent on the quality and trustworthiness of the data assets they have at their disposal. Business processes have become more data-centric, and technology adoption has expanded the possibilities for new and diverse instrumentation. Bringing all of the operational, environmental and strategic data sources together to enable decision making has become critical to business success.

The term “dynamic enterprise” was introduced in 2008 as an enterprise architecture concept. Rather than striving for stability, predictability and maturity, dynamic enterprises began focusing on continuous and transformational growth – embracing change as the only constant. This shift began with the proliferation of social media and user-generated (Web 2.0) content, which started to replace the curated information previously available.
As the data consumption trends evolved within the business environment, technologists (including Tim Berners-Lee, the inventor of the World Wide Web) were working behind the scenes on standards for a Semantic Web (Web 3.0), where computers could consume and analyze all of the content and information available.
Making the data readable by computers was only part of the challenge. Most companies still lacked the technology capabilities and know-how to take advantage of the information at their disposal. Advancements in machine learning and cloud infrastructure during the past 3 years have finally unlocked the potential of big data to the masses. A few large cloud service providers have invested in computing infrastructure and developed the capabilities to ingest and process vast quantities of data. They have analyzed, correlated and made it available to users in the form of cloud services that require neither the technical expertise nor the capital investment that were former barriers to adoption.
As more enterprises and individuals leverage machine learning to draw insights from data, those insights become part of the “learned knowledge” of the system itself, and help the computer understand context and consumption behavior patterns that further improve its capability to bridge the human-information divide.

Improvements in IT data quality and analysis tools have enabled IT management to spend less time looking into the past and more time enabling the dynamic enterprise of the future. This allows them to anticipate business events more accurately, forecast costs and capacity, and identify operational risks before they appear. Empowered by technology-driven insights and technology-enabled prediction ability, IT leaders have secured a long-sought seat at the table with their business counterparts during the strategic planning process. IT management becoming more predictive is good. Right? Perhaps, but there are some risks to consider.

Technology-enabled prediction is only as good as the underlying data, and does a poor job of addressing unknown variables. Human intuition and analysis skills have traditionally been used to fill gaps in available data, interpret meaning and project future events. The predictive abilities of most IT leaders are heavily dependent on the quality of information and technology-enabled processing power at their disposal. Modern machine learning systems have made tremendous strides in analyzing large volumes of data to identify trends and patterns based on past and current observations. Their capability to do so is limited, however, by the quality and dependability of data inputs. “Garbage in-garbage out” has been the rule for many years.

Learning how to harness the power of technology and information and applying it to create valuable predictive insights for an organization is definitely good; IT leaders should be commended for bringing new capabilities to the decision-making table. As we all know, however, no information is perfect, and technology has its limitations. Becoming entirely reliant on technology for prediction and losing the ability to apply a human filter is a risky situation for businesses. As with many business decisions, it is important to balance the potential benefits with the acceptable risk profile for your organization.

Sexy may not be the first word that comes to mind when you think about your CMDB and the operational data of your company… but (seriously) maybe it should be! After all, your CMDB has a number of attractive qualities and (with some care and feeding) could be the ideal partner for a lasting long-term relationship. There are lots of potential reasons this can work, but let’s focus on the top three:
Substance: Your CMDB is not shallow and fickle, it is strong and deep, with a history as long as your company’s. The CMDB is built on a core of your master data and pulls together all of the facets of operational data your company creates every day. It contains the complex web of connective tissue that can help you understand how your company works. Those insights then become part of the CMDB itself – enabling the strength of your data to be balanced by the wisdom that comes from analytics and self-awareness.
Long-term potential: You may lust after the latest new tool or trend, but your CMDB will stand by your company’s side through thick and thin, long into the future. It will grow and evolve with you, always be honest about what’s going on, and work with you to provide insights to get your company through troubled times. As your company changes with new markets, products, customers, and competitors or becomes a part of something bigger through acquisition or partnership, your CMDB is there to help you navigate the changes and achieve success.
Air of mystery: You may never fully understand all of the secrets that your CMDB holds about your company. As you unlock one insight, the potential for others seems to appear magically. What would you expect from something that brings together all parts of your company data and the complex interrelationships in one place for you to explore?
Deep substance, long-term potential and an air of mystery. Maybe your CMDB is sexier than you think.

Throughout history, business has always struggled with the challenge of data accuracy and integrity. Executives constantly ask their IT leaders how they can improve the quality and integrity of data in order to obtain the insights needed to guide their company effectively. While it sounds reasonable, it may well be the wrong question. Rather than focusing on the quality of raw data, a better approach is to focus on the quality of insights available and the speed/cost to obtain them by asking, “How can we better leverage the data we already have to cost effectively obtain the insights we need?”
Advances in machine learning, data science and correlation analysis during the past decade have enabled a broader range of capabilities to analyze data from disparate operational processes and information systems. This has been accomplished without developing some of the structured relationships and incurring data-model-integration costs associated with traditional data warehousing and reporting approaches
Through assessment of the trends and relationships between different data elements, modern data analysis systems are able to “discover” a variety of insights that may not have been available in the past. Examples include undocumented dependencies within operational processes, sources of data inaccuracy and the evolution of operational processes over time. Instead of focusing on what is “known” about operational data, modern methods focus on understanding what is “unknown” about operational data.
Is data integrity the key to operational insights or is it the elephant in the room? That depends on how organizations want to view the situation. Data Integrity at both the informational and operational level is a core requirement of any modern business, and has been an area of focus for Blazent since the early days of Big Data.

Each day, with every customer transaction, employee task and business process, companies generate vast amounts of operational data that provides leaders and managers with insight into what is working well and what requires attention. Operational data is particularly important to those responsible for stewarding the information and technology assets of their organization.
In this context, operational data is particularly important to IT, which is why it is so critical to understand the different types of operational data on which IT leaders rely.
Business operational data is all about the business processes and user experiences, which IT enables with the technology and services it provides. The reason organizations invest in technology is to improve the productivity and effectiveness of business operations. Process and user-related data evaluated over time provides a contextual picture into how effectively the technology is achieving that goal.
IT operational data is concerned with the content of “what” technology components are operating and being used. IT operational data is important as a part of the IT planning process to understand capacity utilization and determine where scalability constraints exist, as well as to understand the cost of services provided to users and to assess security and risk considerations of the business-technology ecosystem. Within IT service management processes, operational data is critical to ensure performance and availability Service Level Agreements (SLAs) are honored, and to drive technology cost reduction through infrastructure optimization.
Operational data provides IT with the critical picture it needs to understand and optimize the role it plays in the context of the company.

Over the next five years, machine learning is poised to play a pivotal and transformational role in how IT infrastructure is managed. Two key scenarios are possible: transforming infrastructure from a set of under-utilized capital assets to a highly efficient set of operational resources through dynamic provisioning based on consumption; and the identification of configurations, dependencies and the cause/effect of usage patterns through correlation analysis.
In the world of IT infrastructure, it’s all about efficient use of resources. With on-premise infrastructure (compute, storage and network) utilization rates for most organizations in the low single digits, the cloud has sold the promise of a breakthrough. For those organizations moving to Infrastructure as a Service (IaaS), utilization in the middle to high teens is possible, and for those moving to Platform as a Service (PaaS), utilization in the mid-twenties is within reach.
Dynamic provisioning driven by demand is essentially the same operational concept as power grids and municipal water systems – capacity allocation driven by where resources are consumed, rather than where they are produced.
The second part of the breakthrough relates to right-sizing infrastructure. Whether this is network capacity or compute Virtual Machine size – machine learning will enable analysis of the patterns of behavior by users and correlate them to the consumption of infrastructure resources.
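As a hedged sketch of what such right-sizing might look like, the function below suggests a VM size from observed utilization samples. The 95th-percentile rule and 20% headroom are illustrative assumptions, not a documented algorithm:

```python
# Sketch: right-sizing a VM from observed CPU utilization samples.
# The 95th-percentile rule and 20% headroom are illustrative assumptions.
import math

def recommend_vcpus(current_vcpus, cpu_samples, headroom=0.20):
    """Suggest a vCPU count covering 95th-percentile load plus headroom."""
    samples = sorted(cpu_samples)                # fractional utilization, 0.0-1.0
    p95 = samples[int(0.95 * (len(samples) - 1))]
    return max(1, math.ceil(current_vcpus * p95 * (1 + headroom)))

# An 8-vCPU machine that mostly idles at ~10-15% utilization:
samples = [0.10, 0.12, 0.08, 0.15, 0.11, 0.09, 0.13, 0.10, 0.14, 0.30]
print(recommend_vcpus(8, samples))   # 2
```

A production system would also weigh memory, I/O and time-of-day patterns; the point is that the recommendation falls out of the observed consumption, not the provisioned capacity.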
In the near term, these benefits will be much more tactical. Automated discovery combined with behavioral correlation analysis will virtually eliminate the need for manual inventory and mapping of components and configuration items in the IT ecosystem to reveal how the ecosystem is operating.
Today, IT has the opportunity to automate the mapping of components in their infrastructure to provide a more accurate and actionable picture.

How well prepared is your organization for growth? What are the challenges to making progress?
IT systems and the data they contain for the organization are often seen as a foundational capability, an underpinning function, or simply a static resource separate from the organization’s core value chain. Framing data as a part of a value chain can enable you to see the downstream impact of upstream IT data improvements in the activities that consume them.
When the integrity and quality of IT data improves, leaders have more confidence in the decisions they make. They have the ability to evaluate opportunities and problems faster and more easily, without the need to question and independently validate the information they are continuously receiving. This increase in confidence can lead to the pursuit of more ambitious and more broadly scoped business opportunities, as well as the ability to preemptively mitigate organizational risks.

Improving the quality of the data available from IT enables data professionals to see process-performance variances more easily and quickly, and to correlate previously independent data sets that can drive new operational insights.

Data integration improvements across IT systems improve the efficiency of employees involved in executing transactional processes by reducing the need for redundant data entry tasks to keep operational data in sync as transactions flow through business processes. By removing manual tasks, managers and leaders have greater transparency into operational performance with a lower risk of intentional data manipulation and/or human error.
Blazent provides an automated solution that delivers the highest data quality, using information gained from multiple sources to create refined data records.

Having data you can rely on is foundational to good decision making. Data Integrity is an important requirement which can be defined in many ways. The Technopedia definition of Data Integrity focuses on three key attributes of completeness, accuracy, and consistency.
In this video, we review these attributes in the context of IT Service and Operations Management. So, to begin:
Completeness: A data record such as a description of an IT asset needs to be complete in order to satisfy the needs of all its consumers. For example, IT Operations cares about whether the asset is active, as well as its location, while Finance wants to manage attribution of software licenses. Gaps in the attribute data can impair an organization’s ability to manage the asset.
Accuracy: Wrong or misleading data helps no one. Inaccuracy can stem from manual input errors, mishandled conflicts between sources, or poor IT discovery tools that miss or double-count an asset.
Consistency: This is one of the harder data integrity issues to resolve. If you only have a single source of data, it is likely to be consistent (although potentially consistently wrong). However, in order to verify the data, it has to be validated against multiple sources. Deciding which source is the most accurate is complicated, and setting up automated precedence rules can be challenging without the right tool.
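A precedence rule of the kind described can be sketched in a few lines: when sources disagree on an attribute, take the value from the most trusted source that actually reported one. The source names and ranking below are assumptions for illustration, not Blazent's actual logic:

```python
# Sketch of automated precedence rules: when sources disagree on an attribute,
# take the value from the most trusted source that reported one.
# The source names and their ranking are illustrative assumptions.

PRECEDENCE = ["discovery_tool", "cmdb", "spreadsheet"]   # most -> least trusted

def resolve(attribute_values):
    """attribute_values maps source name -> reported value; return best value."""
    for source in PRECEDENCE:
        value = attribute_values.get(source)
        if value is not None:
            return value
    return None   # no source reported this attribute

reported = {"spreadsheet": "Dallas", "cmdb": "Austin", "discovery_tool": None}
print(resolve(reported))   # Austin: discovery reported nothing, so cmdb wins
```

Real precedence rules are usually per-attribute (a discovery tool may be authoritative for OS version but not for ownership), which is where tooling support becomes valuable.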
Achieving and maintaining data integrity can be done using various error-checking methods such as normalization and validation procedures. Blazent’s Data Integrity platform was designed to make data management scalable through an automated process for exception handling. To learn more about the importance of good data integrity in an IT Service Management context, you can read Blazent’s white paper on Data Powered IT Service Management, available on the resources page of our website at www.blazent.com.