Data Visualization in Data Science

Having data is not enough: adding context is essential to understand the data, find patterns, and engage audiences. Data visualization is a key element of data science, the interdisciplinary field concerned with finding insights in data. In this webinar, we explore the roles of data visualization at different stages of the data science process and why it is essential. We also look at how data is encoded visually with shape, size, color, and other variables, and how the basic principles of visual encoding can be applied to build better visualizations. We cover narratives, types of bias, and maps. Finally, we look at the various tools, both open source and off-the-shelf, that are used in data science to build effective data visualizations.
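As a taste of the visual-encoding principles mentioned above, here is a minimal sketch (our illustration using matplotlib; the webinar does not name a specific tool, and the data is invented) showing how one scatter plot can encode four variables at once via horizontal position, vertical position, marker size, and marker color:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical data: four variables per record
x = [1, 2, 3, 4, 5]                  # encoded as horizontal position
y = [2, 4, 1, 5, 3]                  # encoded as vertical position
population = [30, 80, 45, 120, 60]   # encoded as marker size
category = [0, 1, 0, 2, 1]           # encoded as marker color

fig, ax = plt.subplots()
scatter = ax.scatter(x, y, s=population, c=category, cmap="viridis")
ax.set_xlabel("x variable")
ax.set_ylabel("y variable")
fig.colorbar(scatter, label="category")
fig.savefig("encoding_demo.png")
```

Position is the most accurately perceived channel, so it is usually reserved for the most important variables; size and color carry the secondary ones.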

IT is a key player in the digital and cognitive transformation of business processes, delivering solutions for improved business value with analytics. This session will explain, step by step, the journey to secure production while adopting new analytics technologies that leverage mainframe core business assets.

Mixed reality is the result of blending the physical world with the digital world. Though it is a relatively new technology and its adoption is still at an early stage, mixed reality devices and applications are projected to define the next technological era after smartphones.

The webinar will give a brief overview of potential mixed reality use cases that provide not only an immersive experience but also revenue streams for their creators.

Data scientists are rare and highly valued individuals, and for good reason: making sense of data and using machine learning libraries requires an unusual blend of advanced skills. Why is it, then, that data scientists spend the majority of their time getting data ready for models and only a fraction actually doing the high-value work?

In this talk we introduce the concept of Data Fabric, a new way to provide a self-service model for data, where data scientists can easily discover, curate, share, and accelerate data analysis using Python, R, and visualization tools, no matter where the data is managed, no matter the structure, and no matter the size.

We will talk through the role of Apache Arrow, the in-memory columnar data standard that is accelerating analytics for GPU-based processing, as well as the role of Pandas and Arrow in providing unprecedented speed in accessing datasets from Python.

Researchers generate huge amounts of valuable unstructured data and articles from research every day. The potential for this information is huge: cancer and pharmaceutical breakthroughs, advances in technology and cultural research that can improve the world we live in.

This webinar discusses how text mining and Machine Learning can be used to make connections across this broad range of files and help drive innovation and research. We discuss using Kubernetes microservices to analyse the data and then applying Machine Learning and graph databases to simplify the reuse of the data.

The Internet of Things (IoT) envisions that everything in the physical world is connected seamlessly and securely integrated through the Internet. New products are being developed under the umbrella of IoT, opening up different opportunities. This webinar will discuss the future potential of IoT and the trends in its adoption and standardisation.

Today most companies collect more data than ever and, as we all know, data is the new oil. However, gaining insights and turning them into action is easier said than done. In my experience this is a challenge for many companies, including innovative FinTechs.

In order to create a data-driven business and organisational culture, it is important to integrate data collection and an appreciation for data-driven truth from the start of a venture. This webinar gives a brief overview of the hurdles and challenges BI faces in growing FinTech companies and how they can be overcome. It will also briefly cover new BI trends and tools and how they could impact businesses.

Big Data has increased the demand for big data management solutions that operate at scale and meet business requirements. Big Data organizations realize quickly that scaling from small, pilot projects to large-scale production clusters involves a steep learning curve. Despite tremendous progress, critically important areas including multi-tenancy, performance optimization, and workflow monitoring remain areas where the operations team still needs management help.

Intended for enterprises who already have a data lake or are setting up their first data lake, this presentation will discuss how to implement data lakes with operations tools that automatically optimize clusters with solutions for monitoring, performance tuning, and troubleshooting in production environments.

Sean is the co-founder and CTO of Pepperdata. Previously, Sean was the founding GM of Microsoft’s Silicon Valley Search Technology Center, where he led the integration of Facebook and Twitter content into Bing search. Prior to Microsoft, Sean managed the Yahoo Search Technology team, the first production user of Hadoop. Sean joined Yahoo through the acquisition of Inktomi, and holds a B.S. in Engineering and Applied Science from Caltech.

Public cloud deployments have become irresistible in terms of flexibility, low barriers to entry, security, and developer friendliness. But the sheer inertia of traditional data lakes makes them difficult to transition to the cloud. In this talk we'll look at examples of how leading companies have made the transition using open source technologies and hybrid strategies.

Instead of following a "lift and shift" strategy for moving data lake workloads to the cloud, there are considerations unique to the cloud that should be weighed alongside traditional approaches to compute (e.g., GPU, FPGA), storage (object store vs. file store), integrations, and security.

Viewers will take away techniques they can immediately apply to their own projects.

The concept of data lakes evolved to address the challenges and opportunities of managing big data.

Organizations are investing massive amounts of time and money to upgrade existing data infrastructures and build data lakes whether on-premises or in the cloud.

This talk will discuss architectures and design options for implementing data lakes with open source tools. Also covered are the challenges of upgrading and migrating from existing data warehouses, metadata management, supporting self-service, and managing production deployments.

As an enterprise customer, you are potentially using IBM Z in a hybrid cloud implementation. Let's understand how to benefit from cloud access to mainframe data without moving it outside IBM Z, thereby improving security, reducing integration challenges, and addressing your GDPR auditor's needs.

The Web is the most powerful communication medium and the largest public data repository that humankind has created. Its content ranges from great reference sources such as Wikipedia to ugly fake news. Indeed, social (digital) media is just an amplifying mirror of ourselves. Hence, the main challenge of search engines and other websites that rely on web data is to assess the quality of such data. However, as all people have their own biases, web content, as well as our web interactions, are tainted with many biases.

Data bias includes redundancy and spam, while interaction bias includes activity and presentation bias. In addition, sometimes algorithms add bias, particularly in the context of search and recommendation systems. As bias generates bias, we stress the importance of de-biasing data as well as using the context and other techniques such as explore & exploit, to break the filter bubble.
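The explore & exploit technique mentioned above can be sketched as an epsilon-greedy selection rule, a standard bandit strategy (this framing and the numbers are our illustration, not code from the talk): instead of always showing the item with the best-known score, the system occasionally shows a random one, which keeps feedback loops from locking users into a filter bubble.

```python
import random

def epsilon_greedy(avg_reward, epsilon=0.1, rng=random):
    """Pick an item index: mostly exploit the best-known item, but with
    probability `epsilon` explore a random one, so items never shown
    still get a chance to collect feedback."""
    if rng.random() < epsilon:
        return rng.randrange(len(avg_reward))                           # explore
    return max(range(len(avg_reward)), key=avg_reward.__getitem__)      # exploit

# Illustrative click-through estimates for three items
estimates = [0.12, 0.35, 0.20]
choice = epsilon_greedy(estimates, epsilon=0.0)  # epsilon=0 means pure exploitation
```

Tuning `epsilon` trades short-term relevance against long-term coverage; a pure-exploit policy (epsilon = 0) is exactly the self-reinforcing loop the talk warns about.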

The main goal of this talk is to make people aware of the different biases that affect all of us on the Web. Awareness is the first step to be able to fight and reduce the vicious cycle of bias.

Ricardo Baeza-Yates's areas of expertise are web search and data mining, information retrieval, data science, and algorithms. He has been CTO of NTENT, a semantic search technology company based in California, USA, since 2016. Before that, he was VP of Research at Yahoo Labs, based first in Barcelona, Spain, and later in Sunnyvale, California, from January 2006 to February 2016. He is also a part-time professor at the DTIC of Universitat Pompeu Fabra in Barcelona, Spain, as well as at the DCC of Universidad de Chile in Santiago.

Iver van de Zand will talk about and demo the latest SAP innovations for analytics in the cloud. Key themes are live connectivity and the closed loop of combined business intelligence, planning, and predictive analytics, all in one environment that is fully ready for big data.

Perhaps one of the buzziest of buzzwords for the past few years has been "Blockchain". But for those who are well-versed in the technology and the goings on of the industry, it is definitely more than just hype.

Join this session where a panel of experienced Blockchain specialists will discuss:

-Viability of Use Cases - Just because a centralized database can be more efficient doesn't mean it's the most viable

-Preventing Blockchain from Being Just a Trend - if blockchain isn't viable as the solution, it shouldn't be used

-Public vs Private Blockchain Use

-State of the Industry - Understanding that a lot of the platforms and frameworks are in their early stage of development and that support may not be readily available

In this webinar, we will learn about image recognition with deep learning. After a brief overview of what deep learning is, and why it matters, we will learn how to classify dogs from cats. That is, how to train a model to recognize dog images from cat images.

We use Keras, an easy-to-use Python deep learning library that sits on top of TensorFlow, together with "fine-tuning", a very important skill for any deep learning practitioner, to train a model to classify the images.

Once we have trained our model to classify dog and cat images with high accuracy, we dig into the details of the trained model and look at its building blocks, i.e., convolutional neural networks (CNNs), the fully connected block, and activation functions, to develop an understanding of how the deep learning model works.
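To make those building blocks concrete, here is a minimal NumPy sketch (our illustration, not the webinar's Keras code) of the two operations a convolutional layer chains together: a 2-D convolution that slides a small kernel over the image, followed by a ReLU activation:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution as deep learning frameworks implement it
    (cross-correlation): at each position, multiply the kernel with the
    image patch under it elementwise and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """ReLU activation: zero out negative responses."""
    return np.maximum(x, 0)

# A vertical-edge detector applied to a tiny toy image
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[-1, 1],
                   [-1, 1]], dtype=float)
feature_map = relu(conv2d(image, kernel))  # responds only where the edge is
```

A CNN learns many such kernels instead of hand-crafting them; the fully connected block at the end then combines the resulting feature maps into a class score (dog vs. cat).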

David Siegel, Blockchain, decentralization and business agility expert

Still confused about this whole Blockchain thing? Interested in investing in digital currencies, but not sure where to start? Want to get a better idea of the threats and opportunities?

David Siegel is a Blockchain, decentralization, and business agility expert who has been a high-level management and strategy consultant to companies like Sony, Hewlett Packard, Amazon, NASA, Intel, and many start-ups. David has been praised for his ability to explain Blockchain in the simplest and most interesting way.

What you will learn:
-What is Bitcoin?
-What is the blockchain?
-What is Ethereum? What is Ether?
-What is a distributed application?
-What is a smart contract?
-What is a triple ledger?
-What about identity and security?
-What business models are at risk?
-What are the opportunities?
-What should we do?

Today the payments industry faces a rebirth by necessity. Financial institutions process massive volumes of customer and payments transaction data, much of it unstructured and untapped.

Cognitive systems have the ability to understand, reason, and learn. In financial services, applying cognitive capabilities to real-world payments issues such as safer and faster payments is yielding significant results. Furthermore, risk and compliance and segment-of-one engagement are areas where the ROI is tremendous when advanced analytics and artificial intelligence are leveraged together.

Learn from real world use cases of how financial institutions globally have gained significant competitive advantage by becoming a truly Cognitive Bank.

HDFS on Kubernetes: Lessons Learned is a webinar presentation intended for software engineers, developers, and technical leads who develop Spark applications and are interested in running Spark on Kubernetes. Pepperdata has been exploring Kubernetes as a potential Big Data platform with several other companies as part of a joint open source project.

Accurate software asset management (SAM) is critical to a company’s IT operations in order to optimize usage and cost, and ensure that license compliance is achieved to avoid true-ups and penalties. Keeping a company’s software application library up-to-date for SAM has been a continuous struggle for asset managers given the time it takes for manual entry of new titles, the changing information in software titles, and the sheer number of titles across an enterprise organization. Every software title not recognized in their library causes potential license compliance breaches (leading to true-ups and potential fines) and reduces an organization’s ability to optimize their software license usage for cost optimization.

Micro Focus (HPE Software) has added the industry-leading IDOL unstructured search engine to bring machine learning to Universal Discovery, drastically improving the software license recognition and teaching process for quicker recognition and a consistently up-to-date SAI over time. Join us on September 13th for an introduction to this exciting new capability and to understand the business value it can bring to your SAM program.

• Introduction to machine learning and the IDOL engine
• Machine learning in Universal Discovery
• Current SAM with machine learning success stories
• Live Q&A about this new technology

Learn about analyzing and visualizing big data in Tableau. Hear tips & tricks, see examples and ask your burning questions during this live webinar.

We will give you a live demo on how to connect Tableau to Exasol's dataviz environment where you can explore large public datasets.

We'll show you how to tackle these data sets when starting your analysis and take you through some example visualizations to inspire your own work.

We'll be working with data in Exasol for an upcoming Makeover Monday challenge later in September, so use this opportunity to ask us any questions you have ahead of time so you can make the most out of playing with big data.

Data is the foundation of any organization and therefore, it is paramount that it is managed and maintained as a valuable resource.

Subscribe to this channel to learn best practices and emerging trends in a variety of topics including data governance, analysis, quality management, warehousing, business intelligence, ERP, CRM, big data and more.