Posts Tagged ‘analytics’

Tomorrow I will be giving a webinar on creating business cases for Big Data. One reason for the webinar is that there is very little information available on creating Big Data business cases. Most of what is available boils down to “trust me, Big Data will be of value.” Most information on the internet basically states:

More information, loaded into a central Hadoop repository, will enable better analytics, thus making our company more profitable.

Although this statement seems logically true, and most analytical companies have accepted it, it illustrates the three most common mistakes we see in creating a business case for Big Data.

The first mistake is not directly linking the business case to the corporate strategy. The corporate strategy is the overall approach the company is taking to create shareholder value. By linking the business case to the objectives in the corporate strategy, one can illustrate the strategic nature of Big Data and show how the initiative will support the overall company goals. Read the rest of this post »

In the Hadoop space we have a number of terms for the Hadoop File System when it is used for data management. Data Lake is probably the most popular. I have heard it called a Data Refinery, as well as some other not-so-mentionable names. The one that has stuck with me is the Data Reservoir, mainly because it is the most accurate water analogy for what actually happens in a Hadoop implementation that is used for data storage and integration.

Consider that data is first landed in the Hadoop file system. This is the unprocessed data, just like water running into a reservoir from different sources. Data in this form is only fit for limited use, such as analytics by trained power users. The data is then processed, just as water is processed. Process water and you end up with water that is consumable. Go one step further and distill it, and you have water that is suitable for medical applications. Data works the same way in a Big Data environment. Process it enough and one ends up with conformed dimensions and fact tables. Process it even more, and you have data that is suitable for basing bonuses on or even publishing to government regulators. Read the rest of this post »
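To make the reservoir analogy concrete, here is a minimal Python sketch of the three stages: landed, processed, and distilled. The zone names, the `Sale` record, and the validation rules are all hypothetical illustrations, not part of any Hadoop API. A real implementation would run as jobs over files in the cluster rather than over in-memory lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    # Hypothetical conformed record; field names are assumptions.
    store: str
    amount: float


def land(raw_lines):
    """Raw zone: keep every incoming line untouched, like water entering the reservoir."""
    return list(raw_lines)


def conform(raw_lines):
    """Processed zone: parse and standardize; skip lines that cannot be parsed."""
    conformed = []
    for line in raw_lines:
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2:
            continue
        store, amount = parts
        try:
            conformed.append(Sale(store.upper(), float(amount)))
        except ValueError:
            continue
    return conformed


def certify(sales):
    """Distilled zone: stricter rules (here: a non-empty store and a non-negative amount)."""
    return [s for s in sales if s.store and s.amount >= 0]


raw = land(["north, 120.50", "South,80", "garbled line", "east,-5", ",10"])
conformed = conform(raw)
certified = certify(conformed)
```

Each stage keeps less data than the one before it, but what remains can be trusted for more demanding uses, which is the point of the analogy.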

The data warehouse has been a part of the EIM vernacular for nearly 20 years. The vision of a single source of the truth and a single repository for reporting and analysis are two objectives that have resulted in a never-ending journey. The data warehouse has never had enough data, and the quality required for a single version of the truth demands significant investment that only rare business cases could support. Further, the role of the analytical database has generally been difficult to fulfill: ad-hoc analysis on large sets of complex data has been a significant challenge for the traditional data warehouse. Historically, to address this, companies have implemented appliances, analytical data marts, or a varying set of database features and compromises (think bitmap indexing, a variety of hardware and software caching techniques, and indexed stored data, to name a few), all with significant investment and usually adding significant overhead. Read the rest of this post »

While there are several BI technologies, with more entering the fray every day, SSRS has remained a key player in this area for quite some time now. One of the biggest advantages of SSRS reporting is that it involves the participation of the end user and that it is very intuitive to use.

Let’s go back a few years, to when Excel was the go-to tool for dashboarding. Every time a director or VP wanted a report, he would go to his developers to extract information from the database to help him make dashboards for his meetings. The end user had to rely on the developers to extract the information and then had to spend several minutes, if not hours, making a dashboard. This all works fine when the meeting is scheduled for a specific day of the week or month. We all know that is a myth and most meetings happen impromptu. In such cases, there is not enough time to extract the data and turn that information into graphs.

This is where SSRS came in as a key player. Backed by the strong foundation of Microsoft, SSRS brought in some of the best and most needed features:

Easy connection to databases.

User-friendly interface allowing users to design reports and make changes on the fly.

Report generation on a button click.

Subscription-based delivery to deliver reports on a specific day and time of the month.

While these features may not look groundbreaking at first glance, they actually bring a lot of value. They save a lot of time, and time in business translates directly into revenue. The developers can design dashboards once and deploy them to a server. The VP or director can press a button to get these reports on his machine. Furthermore, the reports can be exported in several formats. What I really like about the reports, though, is the look and feel. Microsoft retained the aesthetics of MS Excel reports, and by that I mean that a pie chart in Excel and the same chart in SSRS look exactly the same. This is a great feature for the audience, since most people do not like to see the look of their reports change over time. Another great feature is that SSRS has fantastic security options, and one can implement role-based reporting.

In summary, SSRS is a power-packed tool, and you should reap the benefits of the great features that come with it.

For information on Microsoft’s future BI roadmap and self-service BI options check out this post over on our Microsoft blog

Years of work went into building the elusive single version of the truth. Despite all the attempts from IT and the business, Excel reporting and Access databases were impossible to eliminate. Excel is the number one BI tool in the industry, and for good reasons: accessibility, speed and familiarity. Almost all BI tools export data to Excel for those reasons. The business will produce the insight it needs as soon as the data is available, manually or otherwise. It is time to come to terms with the fact that change is imminent and that there is no such thing as Perfect Data, only data that is good enough for the business. As the saying goes:

‘Perfect is the enemy of Good!’

So waiting for all the business rules and perfect data before producing the report or analytics is too late for the business. Speed is of the essence: when the data is available, the business wants it; stale data is as good as no data.

In the changing paradigm of Data Management, agile ideas and tools are in play. Waiting months, weeks or even a day to analyze data from the Data Warehouse is a problem. Data Discovery through Agile BI tools, which double as ETL, significantly reduces the time it takes to make data available. Data Virtualization provides access to data in real time, along with metadata, for quicker insights. In-Memory data appliances produce analytics in a fraction of the time required by a traditional Data Warehouse/BI stack.

We are moving from gourmet sit-down dining to a fast food concept for data access and analytical insights. Both have their place, with their own benefits and shortcomings, and they complement each other in terms of use and the value they bring to the business. In the following series, let’s look at this new set of tools and how they help Agile Data Management throughout the life cycle.

Big Data is on everyone’s mind these days. Creating an analytical environment involving Big Data technologies is exciting and complex: new technology, and new ways of looking at data that otherwise remained dark or unavailable. The exciting part of implementing a Big Data solution is making it production ready.

Once the enterprise comes to rely on the solution, dealing with typical production issues is a must. Expanding the data lakes, adding multiple applications that access and change the data, and deploying new statistical learning solutions can all hit overall platform performance. In the end, user experience and trust will become an issue if the environment is not managed properly. Models that used to run in minutes may take hours or days, depending on the data changes and algorithm changes deployed. Having the right DevOps process framework is important to the success of Big Data solutions.

In many organizations the Data Scientist reports to the business and not to IT. Knowing the business and technological requirements and setting up the DevOps process are key to making the solutions production ready.

Development to Production – Deployment Performance (Application changes)

Service SLA Performance (incidents, outages)

Security robustness / compliance

One of the top issues is Big Data security. How secure is the data, and who has access to and oversight of it? Putting together a governance framework to manage the data is vital for the overall health and compliance of Big Data solutions. Big Data is just gaining traction, and many of the best practices for Big Data DevOps scenarios have yet to mature.

The speed at which we receive information from multiple devices, together with ever-changing customer interactions that provide new forms of customer experience, creates DATA! Any company that knows how to harness that data and produce actionable information is going to make a big difference to its bottom line. So why Virtualization? The simple answer is Business Agility.

As we build the new information infrastructure and tools for modern Enterprise Information Management, one has to adapt and change. Over the last 15 years, the Enterprise Data Warehouse has matured to the point of having a proper ETL framework and dimensional models.

With the new ‘Internet of Things’ (IoT), a lot more data is created and consumed from external sources. Cloud applications create data that may not be readily available for analysis. Not having that data for analysis can greatly change the resulting critical insights.

Major Benefits of Virtualization

Additional considerations

Address performance impact of Virtualization on the underlying Application and the overall refresh delays appropriately

It is not a replacement for Data Integration (ETL) but it is a quicker way to get data access in a controlled way

May not include all the business rules, which implies that data quality may still be an issue

In conclusion, having a Virtualization tool in the Enterprise Data Management portfolio of products will add more agility to Data Management. However, use Virtualization appropriately, to solve the right kind of problem, and not as a replacement for traditional ETL.
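The trade-off between virtualization and ETL can be shown with a small Python sketch. It is a toy model under stated assumptions: `VirtualView` and `etl_snapshot` are hypothetical names, and the sources are plain in-memory lists standing in for real applications. It does not represent any particular virtualization product.

```python
class VirtualView:
    """Reads from the sources at query time; nothing is copied."""

    def __init__(self, *sources):
        # Each source is a callable returning a list of row dicts (an assumption).
        self._sources = sources

    def rows(self):
        combined = []
        for fetch in self._sources:
            combined.extend(fetch())  # every query hits the underlying source
        return combined


def etl_snapshot(*sources):
    """Copies the rows once; later source changes stay invisible until the next load."""
    copied = []
    for fetch in sources:
        copied.extend(dict(row) for row in fetch())
    return copied


crm = [{"customer": "A", "spend": 100}]
web = [{"customer": "B", "spend": 40}]

view = VirtualView(lambda: crm, lambda: web)
snapshot = etl_snapshot(lambda: crm, lambda: web)

crm.append({"customer": "C", "spend": 75})  # new data arrives in a source

fresh_count = len(view.rows())  # the view sees the new row
stale_count = len(snapshot)     # the snapshot does not
```

The view returns current data without a load cycle, but every call to `rows()` goes back to the sources, which is the performance impact on the underlying application noted in the considerations above.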

Cloud BI comes in different shapes and forms, ranging from just visualization to a full-blown EDW combined with visualization and Predictive Analytics. The truth of the matter is that every niche product vendor offers some unique feature that other product suites do not. In most cases you need more than one BI suite to meet all the needs of the Enterprise.

Decentralization definitely helps the business achieve agility and respond quickly to market challenges. By the same token, that is how companies may end up with silos of information across the enterprise.

Let us look at some scenarios where a cloud BI solution is very attractive for departmental use.

Time to Market

Getting the business case built and approved for big CapEx projects is a time-consuming proposition. Wait times for HW/SW and IT involvement mean much longer delays in scheduling the project. Not to mention the pushback to use the existing reports, or to wait for the next release, which is allegedly around the corner forever.

Deployment Delays

Business users have an immediate need for analysis and decision-making. A typical turnaround for IT to bring in new sources of data is anywhere between 90 and 180 days. This is an absolute killer for a business that wants the data now for analysis. Spreadsheets are still the top BI tool for just this reason. With Cloud BI (not just the tool), business users get not only the visualization and other product features but also data that is not otherwise available. Customer analytics with social media analysis, for example, is available as a third-party BI solution. In the case of such value-added analytics, there is a business reason to go for these solutions.

Tool Capabilities

Power users need ways to slice and dice the data, and need to integrate other non-traditional sources (Excel, departmental cloud applications) to produce a combined analysis. Many BI tools come with lightweight integration (mostly push integration) to make this a reality without too much of an IT bottleneck.

So if we can add new capability without much delay and within the departmental budget, where is the rub?

The issue is not looking at the Enterprise Information in a holistic way. Though speed is critical, it is equally important to engage Governance and IT to secure the information and share it appropriately, so that it can be integrated into the Enterprise Data Asset.

As we move into the future of cloud-based solutions, we will be able to solve many of the bottlenecks, but we will also have to deal with the security, compliance and risk mitigation of leaving the data in the cloud. Forging a strategy to meet the various BI demands of the enterprise, with proper Governance, will yield the optimum use of resources and solution mix.

Perficient is exhibiting and presenting this week at KScope14 in Seattle, WA. On Monday, June 23, I presented my retail-focused solution offering, built upon the success of Perficient’s Retail Pathways but using the Oracle suite of products. To fit the discussion within a one-hour window, I chose restaurant operations to represent the solution.

Here is the abstract for my presentation.

Multi-unit, multi-concept restaurant companies face challenging reporting requirements. How should they compare promotion, holiday, and labor performance data across concepts? How should they maximize fraud detection capabilities? How should they arm restaurant operators with the data they need to react to changes affecting day-to-day operations as well as over-time goals? An industry-leading data model, integrated metadata, and prebuilt reports and dashboards deliver the answers to these questions and more. Deliver relevant, actionable mobile analytics for the restaurant industry with an integrated solution of Oracle Business Intelligence and Oracle Endeca Information Discovery.

We have tentatively chosen to brand this offering as Crave – Designed by Perficient. Powered by Oracle. This way we can differentiate this new Oracle-based offering from the current Retail Pathways offering.

Looking at these numbers it is easy to see why more and more technology vendors want to provide solutions to ‘Big Data’ problems.

In my previous blog, I mentioned how we’ll soon get to a place where it will be more expensive for a company not to store data than to store data – some pundits claim that we’ve already reached this pivotal point.

Either way, it would be greatly beneficial to become familiar with at least some of the technologies whose vendors have made a substantial investment in the Big Data space.

One such technology is SAP HANA – a Big Data enabler. I am sure that some of you have heard this name before… but what is SAP HANA exactly?

The acronym H.AN.A. in ‘SAP HANA’ stands for High-performance ANalytical Appliance. If I went beyond the name and described SAP HANA in one sentence, I would say that SAP HANA is a database on steroids, perfectly capable of handling Big Data in-memory, and one of the few in-memory computing technologies that can be used as an enabler of Big Data solutions.

“SAP HANA is a flexible, data-source-agnostic toolset (meaning it does not care where the data comes from) that allows you to hold and analyze massive volumes of data in real time, without the need to aggregate or create highly complex physical data models. The SAP HANA in-memory database solution is a combination of hardware and software that optimizes row-based, column-based, and object-based database technologies to exploit parallel processing capabilities. We want to say the key part again: SAP HANA is a database. The overall solution requires special hardware and includes software and applications – but at its heart, SAP HANA is a database”.

Or as I put it, SAP HANA is a database on steroids… but with no side-effects, of course. Most importantly though, SAP HANA is a ‘Big Data Enabler’, capable of:

Safeguarding the integrity of the data by reducing or eliminating data migrations, transformations, and extracts across a variety of environments

Ensuring overall governance of key system points, measures and metrics

All with very large amounts of data, in-memory and in real time… could this be a good fit for your company? Or, if you are already using SAP HANA, I’d love to hear from you and see how you have implemented this great technology and what benefits you’ve seen working with it.

My next blog post will focus on SAP HANA’s harmonious, or almost harmonious, co-existence with Hadoop…