Practical Data Cleaning

The basics of data cleaning are remarkably simple, yet few take the time to get organized from the start.

If you want to get the most out of your data, you need to treat it with respect. By getting prepared and following a few simple rules, your data cleaning processes can be simple, fast, and effective.

The Practical Data Cleaning webinar is a thorough introduction to the basics of data cleaning.

Is it worth it for companies to spend millions of dollars a year on software that can't keep up with constantly evolving open source alternatives? What are the advantages and disadvantages of keeping enterprise licenses, and how secure is open source software, really?

Join Data Society CEO Merav Yuravlivker as she goes over software trends in the data science space and where big companies are headed in 2017 and beyond.

About the speaker: Merav Yuravlivker is the Co-founder and Chief Executive Officer of Data Society. She has over 10 years of experience in instructional design, training, and teaching. Merav has helped bring new insights to businesses and move their organizations forward through implementing data analytics strategies and training. Merav manages all product development and instructional design for Data Society and heads all consulting projects related to the education sector. She is passionate about increasing data science knowledge from the executive level to the analyst level.

The best services have one thing in common: a superb customer experience. Banking services are no exception to this rule, and indeed the quest for an effortless, well informed, and personalized customer experience is one of the main goals of today's innovation in digital banking services.

Echoing Maslow's "pyramid of needs", customers are seeking a more intimate and meaningful experience in which banking services actively assist them in performing and managing their financial lives. Predictive APIs play a fundamental role here: they enable a new set of customer journeys, such as automatically categorizing transactions, detecting and alerting on recurring payments, pre-approving credit requests, or providing better tools to fight fraud without blocking legitimate customer transactions.

In this talk, I will focus on how to provide better banking services using predictive APIs. I will describe the path to get there and the challenges of implementing predictive APIs in a strictly audited and regulated domain such as banking. Finally, I will briefly introduce a number of data science techniques for implementing those customer journeys and describe how big/fast data engineering can be used to build predictive data pipelines.
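Two of the customer journeys mentioned above, automatic transaction categorization and recurring-payment detection, can be sketched in a few lines of Python. This is an illustrative toy, not the speaker's implementation: the keyword table stands in for a trained classification model, and the 25-35 day window used to define a "monthly" cadence is an arbitrary assumption.

```python
from collections import defaultdict
from datetime import date

# Hypothetical keyword rules standing in for a trained classifier.
CATEGORY_KEYWORDS = {
    "groceries": ["supermarket", "grocery"],
    "transport": ["metro", "taxi", "fuel"],
    "subscriptions": ["netflix", "spotify"],
}

def categorize(description):
    """Assign a spending category from the transaction description."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(k in text for k in keywords):
            return category
    return "other"

def recurring_merchants(transactions, min_occurrences=3):
    """Flag merchants whose charges arrive at a roughly monthly cadence.

    `transactions` is a list of (merchant, date) pairs.
    """
    by_merchant = defaultdict(list)
    for merchant, day in transactions:
        by_merchant[merchant].append(day)
    recurring = []
    for merchant, days in by_merchant.items():
        days.sort()
        gaps = [(b - a).days for a, b in zip(days, days[1:])]
        if len(days) >= min_occurrences and all(25 <= g <= 35 for g in gaps):
            recurring.append(merchant)
    return recurring
```

A real predictive API would expose this behind an endpoint and learn the categories from labeled data, but the shape of the journey is the same: raw transactions in, enriched, alert-ready signals out.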

Hear firsthand from one of the nation’s leading healthcare providers, Intermountain Healthcare, about what leading healthcare providers are actually accomplishing with big data and machine learning (cognitive computing, artificial intelligence, deep learning, etc.).

Intermountain has evaluated between 300 and 400 big data and analytic solutions and actively collaborates with the other leading healthcare providers in the United States to implement the solutions that are delivering improved healthcare outcomes and cost reductions.

Splunk and Dell EMC will share insights into the challenges and opportunities customers are seeing in the market: the need to reduce costs and improve efficiency within IT (operational analytics), to improve compliance (security analytics), and to contend with shadow IT that emerges when the business does not receive the right service from IT, all while the CIO's priority remains keeping the lights on.

Today, data is everywhere. As more data streams into cloud-based systems, the combination of data and computing resources gives us an unprecedented opportunity to perform very sophisticated data analysis and to explore advanced machine learning methods such as deep learning.

Clouds pack very large amounts of computing and storage resources, which can be dynamically allocated to create powerful analytical environments. By accessing these clusters of machines, data analysts and data scientists can quickly and cost-effectively evaluate more hypotheses and scenarios in parallel.
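The parallel evaluation of hypotheses described above can be sketched with Python's standard library, using a thread pool as a stand-in for a dynamically allocated analytics cluster. The dataset and the threshold "scenarios" are invented for illustration; in a real cloud setting each scenario would run on its own worker against data staged in cloud storage.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy dataset standing in for data staged in cloud storage.
readings = [12, 7, 25, 31, 9, 18, 22, 5]

def evaluate_scenario(threshold):
    """One 'hypothesis': how many readings exceed a candidate threshold?"""
    return threshold, sum(1 for r in readings if r > threshold)

# Fan the scenarios out in parallel, as a cluster scheduler would.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(evaluate_scenario, [10, 20, 30]))
```

The point is the shape of the workflow, not the workload: because each scenario is independent, adding machines (or threads) shortens the wall-clock time to compare all of them.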

The number of analytical tools supported on the various clouds is increasing by the day, spanning from traditional vendor-provided RDBMS databases to open source analytics projects such as Hadoop Hive, Spark, and H2O. Alongside provisioning tools and solutions on the cloud, managed services for data science, big data, and analytics are becoming a popular offering of many clouds.

Analytics in the cloud provides whole new ways for data analysts, data scientists, and business developers to interact with each other, share data and experiments, and develop insights that improve business processes and results. In this talk, I will describe a number of data analytics solutions for the cloud and how they can be added to your current cloud and on-premise landscape.

The classic unimodal data warehouse architecture has reached its limits: it primarily supports structured data, not newer data types such as social, streaming, and IoT data. A new BI architecture, such as the “logical data warehouse”, is required to augment traditional, rigid unimodal data warehouse systems with a bimodal data warehouse architecture that supports requirements that are experimental, flexible, explorative, and self-service oriented.

Learn from Logical Data Warehousing expert Rick van der Lans how you can implement an agile data strategy using a bimodal Logical Data Warehouse architecture.
In this webinar, you will learn:

Organizations, already awash in customer data, know geospatial capabilities can put a new “lens” on existing reports. Data from smartphones, GPS devices, and social media has organizations eager to factor in customer location, origin, or destination, along with time of day.

Join IBM Product Marketing Manager David Clement and IBM Senior Product Manager Rick Blackwell and explore the new, world-class mapping and geospatial capabilities for IBM Cognos Analytics and Watson Analytics. Discover how you can add geographic dimension to visualizing critical business information in reports and dashboards in Cognos Analytics.

David Burden - CEO, Daden Limited, an immersive learning and visualisation company

This webinar will look at the challenges currently facing VR across a variety of "serious business" use cases, from education and training to data visualisation, and what the technology needs to do to get beyond the "wow" factor and become a productive, useful and truly in-demand technology.

-The value of Big Data and which skills are required to deliver that value
-How to get started with Big Data projects
-What to do if progress is limited
-Business opportunities around customer insight, supply chain analytics, and more

Rapid data growth from a wide range of new data sources is significantly outpacing organizations’ abilities to manage data with existing systems. Today’s data architectures and IT budgets are straining under the pressure. In response, the center of gravity in the data architecture is shifting from structured transactional systems to cloud-based modern data architectures and applications, with Hadoop at its core.

Join this live and on-demand video panel as the panelists discuss how the landscape is changing and offer insights into how organizations are successfully navigating this shift to capture new business opportunities while driving out cost.

The duo will discuss a successful case study on data-driven decision making.

They will tackle:
-How to implement data solutions quickly and efficiently in the cloud
-What are the challenges of data-driven decision making?
-How to discover data pain-points across an organisation and solve these accurately
-The importance of real-time analytics in generating actionable insights

-Moving beyond dashboards and applying the “5 Whys” technique to data
-Best practice tips for exploring and manipulating data
-The need to think about “data exploration” not as a task in itself, but as part of a person’s goal to make an impact on their business

- As the founder of Trifacta, tell us a bit about your company and just what is data wrangling?
- How does it differ from ETL?
- You have just announced a new server edition of Trifacta; can you tell us more about this?
- Can you give us some examples of how your customers are leveraging Big Data?
- What makes a big data project successful?
- What advice would you give to companies starting out with a big data project?
- What are the biggest hurdles to overcome?
- What use cases are the most prevalent at the moment and will that change over time?

1) What are some of the challenges data professionals face when developing their own cloud applications?
2) How important is it to provide end users with real-time insights?
3) Why is your database choice critical for transforming customer experience?
4) How have customer expectations changed in the past 5 years?

Charlie will discuss:
-Do search engines and Big Data systems share any history?
-How can search engines be used to make sense of Big Data?
-What are the options available for those wanting to add full-text search to their Big Data stack?
-Why is open source search a better choice than a closed, commercial alternative?

Dave will discuss:
-Riak, the world's most resilient NoSQL database and what makes Riak unique in the category
-How Riak handles and resolves scalability and availability challenges when dealing with Big Data in this new connected world
-How his team at Basho is helping to solve challenges of too much data being created by the IoT
-The definition of "Data Gravity"
-How and why Agglomeration is a game changer for businesses
-How to get started with Riak, and whether it is available open source

Sean will give an overview of how you can get the most out of your data -- from cloud-based analytics to data visualization, Sean will break down the challenges you face in your quest to become a data-driven professional.

Companies have embraced the concept of the data lake or data hub to serve their data storage and data-driven application needs. However, gaps remain in the maturity and capability of the Hadoop stack, leaving organisations struggling with how to reap the benefits of these data lakes and how to create analytic applications that deliver value to end users.

For data lakes to succeed, organisations need to learn and understand the differences between these big data scenarios:
1. Data discovery and exploratory analysis
2. Analytic applications and operationalisation of analytics across the enterprise

Richard will examine these two scenarios, where and when each one is appropriate, and how to mature from one to the other.