In my previous blog, “Talend & Apache Spark: A Technical Primer,” I walked you through how Talend Spark jobs equate to Spark Submit. In this blog post, I want to continue evaluating Talend Spark configurations with Apache Spark Submit. First, we are going to look at how you can map the options in the Apache Spark Configuration tab in the Talend Spark Job to what you can pass as a...

A few years ago, Starbucks’ director of analytics and business intelligence, Joe LaCugna, said the Seattle coffee giant once struggled to make sense of the data pouring in from its loyalty card holders, who at the time numbered over 13 million and accounted for 36 percent of all Starbucks’ transactions. The same was true of the coffee conglomerate’s social media data: the company had mountains of it but still couldn’t quite figure out what to do with it, according to Mr. LaCug...

When it comes to solutions for the big data sector, there is a clear split between the legacy and next-generation approaches to software development. Legacy vendors in this space generally have their own large internal development organizations, dedicated to building proprietary, bespoke software. It’s an approach that has worked well over the years. However, the big data market has always moved at lightning speed, and it’s had a strong element of open source fro...

When I graduated from college in the late 1990s, it was just in time to enjoy the Y2K crisis. If you remember those fun times, then you are old enough to enjoy this blog. I graduated with a Management Information Systems (MIS) degree, which is a cross between Computer Science (CS) and Business Management, and although I was stronger in CS than Business Management, I survived. There was a class spanning both disciplines that I partially excelled in called Database Theory, wh...

More and more companies around the globe are realizing that big data and deeper analytics can help improve their revenue and profitability. As such, they are building data lakes using new big data technologies and tools, so they can answer questions such as: How do we increase production while maintaining costs? How do we improve customer intimacy and share of wallet? What new business opportunities should we pursue? Big data is playing a major role in digital transformation pr...

Introduction – The beauty of being truly native. The purpose of this post is to share my latest experience with Talend in the field, which is also the first time I have gotten to see Talend's capacity to perform SQL queries inside any Talend Big Data Batch job using the Spark framework. In doing so, I want to teach you how to apply SQL analytic and windowing functions to process data i...

Companies are becoming increasingly aware that they have a goldmine in their hands: their data. For all businesses, the situation is clear – their future depends on how quickly and efficiently they can turn data into accurate insights. It is no longer a matter of simply collecting and managing data either; rather, it’s about quickly extracting value from the astronomical volume of data available. Speed is key...

In a recent post, I brought up a very serious issue that every data-driven business is now grappling with: how to calibrate the “virtue” of data. In other words, when is it ethical to use all the vast volumes of data we now have access to, and when is it not? Should companies use predictive analytics to gauge whether a customer may be pregnant? Should insurance companies use data to pr...

About the Author: Dimitri Volkmann, Digital Twin Thought Leader, GE Digital. This article assumes you have a certain familiarity with Digital Twin concepts; if you don't, start with this shorter article. The Digital Twin &...