Paris "Apéro Big Data" Report

This post reports on the first “Apéro Big Data” (which translates as “Big Data Aperitif”), held at Google’s Paris offices on July 2, 2013. During the evening, participants discovered how certain technologies provide a simple, quick and inexpensive answer to big data analysis.
Ysance, Talend, Google and QlikView partnered on the event.
The talks covered the challenges big data poses for next-generation BI, Talend’s vision of big data, Google’s cloud offerings, the use of Google BigQuery, and the creation of a big data QlikView application.

The first presentation, by Romain Chaumais of Ysance, explains the challenges big data poses for next-generation BI.

First, Romain presents the different forms of business intelligence found in a company: exploratory, enterprise and departmental BI.
Exploratory BI focuses on very large data sources that have little or no structure and are not collected and processed over time. Enterprise BI focuses on stable big data sources that are controlled and industrialized. Departmental BI covers data sources that are low-volume, corporate, local, scalable and outside the IT department (the DSI, in French).

Big data technology is transforming traditional BI architectures. The promises of big data discussed during the talk include availability to business users, integration with existing BI applications, adaptation to real-time uses, elasticity and scalability, inexpensive platforms, support for unstructured data and high-capacity storage. Above all, big data must not sit in a business silo; it must be accessible to all departments in a company.

The second presentation, by Bahaaldine Azarmi of Talend, highlights his company’s big data products.

Bahaaldine explains Talend’s big data vision: this new type of data represents a significant paradigm shift in corporate technology. Companies can now capture trillions of bytes of information, and the desire to collect and analyze this data keeps growing.

The speaker then introduces Talend Platform for Big Data, including its data collection, processing and data quality capabilities.
Bahaaldine builds a job that collects data from Google Cloud Storage, converts it on a Hadoop platform and loads the result into Google BigQuery.

Once large amounts of data from the Web have been extracted, they are enriched to make them usable and meaningful. And only once Talend ESB is connected to Google Cloud Storage are the Hive jobs created, graphically!
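
The collect, convert and load flow described above can be illustrated with a minimal Python sketch. This is not Talend's job or any Google API: the sample export, the function names and the in-memory list standing in for the destination table are all hypothetical, chosen only to show the shape of such a pipeline.

```python
import csv
import io

# Hypothetical raw export, standing in for a file read from cloud storage.
RAW_EXPORT = (
    "user,page,duration_ms\n"
    "alice,/home,1200\n"
    "bob,/search,\n"
    "alice,/search,800\n"
)


def collect(raw_text):
    """Collection step: parse the raw export into row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Conversion step: drop incomplete rows and cast the duration to int."""
    cleaned = []
    for row in rows:
        if not row["duration_ms"]:
            continue  # simple data quality rule: skip rows with no duration
        cleaned.append(
            {
                "user": row["user"],
                "page": row["page"],
                "duration_ms": int(row["duration_ms"]),
            }
        )
    return cleaned


def load(rows, table):
    """Load step: append rows to a list standing in for a warehouse table."""
    table.extend(rows)
    return len(rows)


warehouse_table = []  # hypothetical destination, not a real BigQuery table
loaded = load(transform(collect(RAW_EXPORT)), warehouse_table)
print(f"loaded {loaded} rows")
```

In a real deployment each step would be replaced by the corresponding connector; the point here is only the separation between collection, conversion with a quality check, and loading.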

Christophe Baroux of Google, the third speaker, presents Google Cloud and Google BigQuery.

The speaker reviews two Google tools. The first is Google Compute Engine, an "Infrastructure as a Service" (IaaS) product offering a RESTful API to manage resources such as disks and images.
The second is Google BigQuery, which allows interactive analysis of large amounts of data and works with Google Cloud Storage. It is a hosted service that can be used alongside MapReduce.

In terms of numbers, we learned that Google sorts a terabyte of data on 1,000 cores in 56 seconds with MapR. For equivalent performance at this capacity, Google Cloud costs $582 versus $584,000 for a standard server farm.
We also learned that Google BigQuery can query 291 million rows in 1.6 seconds.
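
To put these figures in perspective, the short Python sketch below derives a few ratios from them. The input numbers are those quoted in the talk; the derived values are simple arithmetic, and the use of a decimal terabyte (10^12 bytes) is an assumption.

```python
# Figures as reported during the presentation.
cloud_cost_usd = 582
server_farm_cost_usd = 584_000
rows_scanned = 291_000_000
query_seconds = 1.6
sort_seconds = 56
sort_cores = 1000

# Assumption: a decimal terabyte (10**12 bytes).
terabyte_bytes = 10**12

# How many times more expensive the server farm option is.
cost_ratio = server_farm_cost_usd / cloud_cost_usd

# Rows scanned per second by the BigQuery example.
rows_per_second = rows_scanned / query_seconds

# Aggregate and per-core sort throughput.
sort_gb_per_second = terabyte_bytes / sort_seconds / 10**9
sort_mb_per_second_per_core = terabyte_bytes / sort_seconds / sort_cores / 10**6

print(f"Cost ratio (server farm / cloud): {cost_ratio:.0f}x")
print(f"BigQuery scan rate: {rows_per_second / 1e6:.1f} million rows/s")
print(f"Sort throughput: {sort_gb_per_second:.1f} GB/s overall")
print(f"Sort throughput: {sort_mb_per_second_per_core:.1f} MB/s per core")
```

In other words, the quoted cost gap is about three orders of magnitude, and the BigQuery example scans well over a hundred million rows per second.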

The fourth presentation, by QlikTech’s Loïc Formont, outlines how a QlikView big data application is built.

The company presented its big data solutions, such as QlikView Business Discovery, and stressed that these tools are simple to use, apply to all business areas, and support decision making, collaboration and mobility.
With Business Discovery, each user can analyze their own data and make discoveries. All users can access the data and contribute, which leads to more efficient decisions.

We then followed the creation of a QlikView big data application based on data available in a Hadoop cluster and in BigQuery.
The example focused on search queries for purchasing a plane ticket and relied on QlikView's associative search. The demonstration covered concepts such as traffic, clicks, visits, sessions and KPIs.
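
As an illustration of how KPIs of this kind can be derived from clicks and sessions, here is a minimal Python sketch. It is not the QlikView application shown in the demo: the event data, the action names and the `compute_kpis` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical clickstream events as (session_id, action) pairs.
events = [
    ("s1", "search"), ("s1", "click"), ("s1", "purchase"),
    ("s2", "search"), ("s2", "click"),
    ("s3", "search"),
]


def compute_kpis(events):
    """Aggregate raw events into a few session-level KPIs."""
    actions_by_session = defaultdict(list)
    for session_id, action in events:
        actions_by_session[session_id].append(action)

    sessions = len(actions_by_session)
    clicks = sum(1 for _, action in events if action == "click")
    # A session counts as converted if it contains at least one purchase.
    converted = sum(
        1 for actions in actions_by_session.values() if "purchase" in actions
    )
    return {
        "sessions": sessions,
        "clicks": clicks,
        "clicks_per_session": clicks / sessions if sessions else 0.0,
        "conversion_rate": converted / sessions if sessions else 0.0,
    }


kpis = compute_kpis(events)
print(kpis)
```

With the sample data above, one of the three sessions ends in a purchase, so the conversion rate is one third.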

This big data evening allowed attendees to discover several technologies, including those of Google, Talend and QlikView. During the presentations and networking discussions, participants were able to better understand how certain technologies provide a low-cost, rapid and simple answer to big data analysis. The event also brought together people interested in understanding how to set up a big data platform effectively, in discussing major issues and in sharing their different projects.

Patrick is a marketing and sales manager with over 10 years of experience in medium and large companies. This has given him the opportunity to work in different roles in the United Kingdom and France. He also founded a website creation company during his studies.