Spark Summit 2017: June 6-7 (San Francisco, CA)


“Apache® Spark™ is a powerful open source processing engine built around speed, ease of use, and sophisticated analytics. It was started at UC Berkeley in 2009 and is now developed at the vendor-independent Apache Software Foundation. Since its release, Spark has seen rapid adoption by enterprises across a wide range of industries. Internet powerhouses such as Yahoo, eBay and Netflix have deployed Spark at massive scale, processing multiple petabytes of data on clusters of over 8,000 nodes. Apache Spark has also become the largest open source community in big data, with over 1000 contributors from 250+ organizations.”


Guests Included

Shafaq Abdullah
Director of Data Infrastructure, The Honest Company

How Insnap acquisition made-over Jessica Alba’s company with data analytics
One day, regular line-of-business people might be able to handle high-level data monetization themselves with a do-it-yourself tool. Today is not that day for The Honest Co. The natural home and personal care brand, co-founded by actress Jessica Alba, acquired and integrated a big data software-as-a-service startup called Insnap Inc. in 2016 to become more data-driven. Read the full blog post with highlights from his interview at SiliconANGLE.com.

John Cavanaugh
Master Architect, HP

How Spark helped build a community inside a distributed company
It’s hard to appreciate just how large organizations can be until someone has to get all those departments and splintered businesses talking to each other. They need a common platform to share data. Hewlett Packard Enterprise Co., a company currently in a downsizing transition of its own, found that common platform in the open-source data management platform, Apache Spark. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Matt Fryer
VP, Chief Data Science Officer, Hotels.com

Hotels.com embraces data-driven culture
Whether you are looking for a luxury hotel on a secluded beach or a place to stay in midtown Manhattan that won’t break the bank, the process of sorting through options and booking online can seem pretty straightforward. But behind most travel websites today is a complex data engine that pays close attention to what you may have booked before, and it strives to filter every possible detail in order to make recommendations that you’ll like. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Ali Ghodsi
CEO and Co-Founder, Databricks

Will serverless functions beat DevOps in race to democratize analytics?
Enterprise data scientists and developers fed up with data that just sits there and doesn’t make money might take heart in a prediction from Ali Ghodsi, chief executive officer and co-founder of Databricks Inc. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Michael Greene
VP, Software & Service Group, Intel

Silicon-software project gives ‘deep learning’ new meaning
Artificial intelligence workloads are set to increase 12-fold by 2020, according to Michael Greene, vice president, Software and Service Group, and general manager of system technologies and optimization at Intel Corp. Scalable hardware-software integrations better get cracking to accommodate them, Greene said today during Spark Summit in San Francisco, California. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Mark Grover
Software Developer & Author, Cloudera

Cloudera aims to change the way data is engineered
Developing an accurate data science model is a challenging process on its own. Scaling the model from a development environment to a production cluster presents another set of operational challenges that Cloudera Inc. aims to address with two new product offerings: Data Science Workbench and Altus. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Matthew Hunt
Technologist, Bloomberg

Could Apache Spark become a universal computation engine?
Spark Summit keynotes are known for their surprises, and this year the stand-out changes were in data streaming, with sub-millisecond times predicted for some workloads. With multiple avenues open for potential success, the community is watching as Spark matures to fulfill the promise of what it could be. But does that promise include becoming a database? Read the full blog post with highlights from his interview at SiliconANGLE.com.

Wesley Kerr
Data Scientist, Riot Games

A game of data science: the analytics architecture behind Riot Games
With the help of modern analytics, Riot Games Inc. developed a highly successful computer game called League of Legends, in which players form teams of champions and compete with other players around the world. Wesley Kerr, senior data scientist at Riot Games, explained how his organization is leveraging data science to improve player experience and weed out unsavory behavior. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Rob Lantz
Director of Predictive Analytics, Novetta

Automating entity recognition, extraction and resolution
Identifying and extracting relevant entities from masses of stored data is a complex and tedious task. Advanced analytics company Novetta Solutions LLC is using the functionality of Databricks Inc., a cloud-based data management service, to speed up and even automate the process. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Ash Munshi
CEO, Pepperdata

Can big data DevOps see what abstraction is hiding?
DevOps that have worked well in other areas of information technology can’t hack it in big data, according to Ash Munshi, chief executive officer of Pepperdata Inc. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Nathan Murith
Senior Software Development Manager, Autodesk

Data science, machine learning fuels new wave of Autodesk technology
The real world is a tricky place. Taking a design from concept to reality can be a complex task, especially when that design is a building. The construction industry relies on both precise planning and constant awareness on site to minimize the problems that happen when ideas meet the real world. Management tools that help in this regard are very valuable. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Clarke Patterson
Senior Director of Product Marketing, Confluent

Confluent makes Kafka easier in the cloud, grants new data powers to small business
Where data processing is concerned, speed is life. Companies can ship off some workloads to batch processing, but they need to process the critical stuff in real-time while a customer or system interaction is happening. Not all companies have the resources to make this happen, however. That’s where big data company Confluent Inc. and distributed data streaming platform Apache Kafka step in. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Jags Ramnarayan
CTO, SnappyData

How SnappyData is enriching Spark as a hybrid database
Enriching Apache Spark so it’s not just a platform but also a store is just part of the innovation occurring at SnappyData Inc., according to Jags Ramnarayan, founder and chief technical officer of SnappyData. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Eric Siegel
Founder, Predictive Analytics World

Author goes beyond the bells and whistles of big data buzz
As big data and machine learning tools become essential for modern business, the primary goal is no longer predicting the likelihood of making a sale. It’s now a matter of increasing the likelihood that a customer will buy. This key distinction forms the essential thesis behind former Columbia University Professor Eric Siegel’s book, “Predictive Analytics: The Power to Predict Who Will Click, Lie, Buy or Die.” Read the full blog post with highlights from his interview at SiliconANGLE.com.

Octavian Tanase
SVP Data ONTAP Software and Systems Group, NetApp

Dynamic data: Companies push for real-time insight tools
Good data is business gold. With it a company can predict trends, secure new customers and cut costs. Processing data to find the good stuff can take a while, though, and requires storage. Then there’s the added data pouring in from Internet of Things devices, where data comes in from the edge of the network with all the cost and latency that implies. There’s a growing market in solutions to these problems, according to Octavian Tanase, vice president of the Data ONTAP Software and Systems Group at NetApp Inc. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Detecting malicious insiders through behavioral analytics
It is no longer enough for companies to be sure they are safe from external attack. A comprehensive and up-to-date risk management strategy has to include monitoring employees who have the access to compromise systems and steal data. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Jennifer Wu
Director, Cloud Management, Cloudera

Cloudera aims to change the way data is engineered
Developing an accurate data science model is a challenging process on its own. Scaling the model from a development environment to a production cluster presents another set of operational challenges that Cloudera Inc. aims to address with two new product offerings: Data Science Workbench and Altus. Read the full blog post with highlights from her interview at SiliconANGLE.com.

Reynold Xin
Chief Architect & Co-Founder, Databricks

Spark doubles down on streaming, data warehousing and deep learning
The Apache Spark community has been wrestling with a wide range of big data challenges in information technology, and Databricks Inc. (which was founded by Spark’s creators) is taking steps to address the enterprise need for machine learning and speedier data processing. Read the full blog post with highlights from his interview at SiliconANGLE.com.

Matei Zaharia
Chief Technologist & Co-Founder, Databricks

What new apps are chugging on Spark 2.2’s real-time engine?
Apache Spark 2.2 has achieved event-by-event data streaming by trimming some fat from its execution process. So what new applications will the leaner, meaner engine drive online? Read the full blog post with highlights from his interview at SiliconANGLE.com.