HPCC Systems Open Source Big Data Platform
HPCC Systems is an open source Big Data analytics solution for businesses of all sizes, allowing them to improve critical time to results and decisions. Subscribe to our channel to keep informed of the latest HPCC Systems events.
Webcasts for data science and Big Data analytics professionals
https://www.brighttalk.com/channel/15091

The Download: Tech Talks by the HPCC Systems Community, Episode 24
Thu, 23 May 2019 15:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation, and further build and link the relationships within our HPCC Systems community.
Featured speakers include:
Itauma Itauma, PhD, Keiser University, HPCC Systems Community Innovator - Cervical Cancer Risk Factors: Exploratory Analysis using HPCC Systems
Cervical cancer is a leading cause of cancer-related death among women with about half a million new cases worldwide in 2018. 90% of cervical cancer deaths occur in low resource settings. This mortality could be reduced through effective prevention, screening and treatment programs. I will explain how an exploratory analysis of a cervical cancer database was performed using HPCC Systems with data visualizations and how the findings could be beneficial.
Itauma Itauma has a PhD in Instructional Design and Technology from Keiser University and is a student in the Harvard Business Analytics Program.
Lili Xu, Software Engineer III, LexisNexis Risk Solutions - Automatically cluster your data with the HPCC Systems massively scalable K-Means machine learning bundle
Imagine you are sitting in front of thousands of articles and trying to organize them into different folders. How would you accomplish it, and how long would you expect it to take? If you have some sort of data but have no clue how to cluster it efficiently, then Lili's talk will provide insight on a great place to start.
Lili Xu is in the final stages of completing her PhD in Computer Science at Clemson University. Now an employee, Lili has completed three internships with HPCC Systems on machine learning.
Richard Taylor, Chief Trainer, HPCC Systems, LexisNexis® Risk Solutions - ECL Tips and Tricks: DICTIONARY does it!
Richard Taylor has worked with the HPCC Systems technology platform and the ECL programming language for over 15 years.
https://www.brighttalk.com/webcast/15091/354543
Presented by: HPCC Systems
Tags: Machine Learning, hpcc systems, Data Analytics, healthcare

The Download: Tech Talks by the HPCC Systems Community, Episode 23
Thu, 25 Apr 2019 15:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation, and further build and link the relationships within our HPCC Systems community.
Featured speakers include:
Jeremy Meier and David Noh, both Undergraduate Students at Clemson University - An Investigation into Time Series Analysis
Over the past several months, our team has worked closely with a dataset having roughly 16,000 total observations, recording both the date and balance in financial data. Focusing on individual accounts with a size of around 400 observations, our first goal was to compare statistical metrics and techniques used commonly in time series analysis on the given data sets. We dove deep into two major industry standard methods for understanding and predicting on a dataset. Using insights learned from these observations, we hope to better predict future balances in the dataset, as well as find any anomalies or misbehavior in the data in order to provide business value.
Roger Dev, Sr Architect, LexisNexis Risk Solutions - TextVectors - Machine Learning for Textual Data
Text Vectorization allows for the mathematical treatment of textual information. Words, phrases, sentences, and paragraphs can be organized as points in high-dimensional space such that closeness in space implies closeness of meaning. HPCC Systems' new TextVectors module supports vectorization for words, phrases, or sentences in a parallelized, high-performance, and user-friendly package.
Allan Wrobel, Consulting Software Engineer, LexisNexis Risk Solutions - ECL Tips and Tricks: Leveraging the power of HPCC Systems. Using AGGREGATE.
The ECL built-in function AGGREGATE has been seen by many in the community as ‘complex’ and as such has been underused. However, by using AGGREGATE you can be sure you’re playing to the strengths of HPCC Systems.
https://www.brighttalk.com/webcast/15091/354541
Presented by: HPCC Systems
Tags: hpcc systems, ecl, Machine Learning, Data Analytics, Big Data

HPCC Systems Community Focus: 5 Questions with Jo Prichard
Wed, 17 Apr 2019 16:00:00 +0000
In this session, we are highlighting some of the rock stars of the HPCC Systems Community. Today's session is 5 Questions with Jo Prichard.
https://www.brighttalk.com/webcast/15091/356133
Presented by: Jo Prichard, Flavio Villanustre
Tags: Big Data, Big Data Analytics, HPCC Systems, Data Science

The Download: Tech Talks by the HPCC Systems Community, Episode 22
Thu, 21 Mar 2019 15:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation, and further build and link the relationships within our HPCC Systems community.
Featured speakers include:
Vincent Freeh, Professor, NC State University - HPCC Systems as a Service (HaaS)
There are numerous reasons to use an IaaS for HPCC Systems instead of dedicated hardware, especially if the workload does not execute 24/7. We developed a CloudFormation Template and an AMI for HPCC Systems and a reference architecture for HPCC Systems in AWS. Significant effort was expended to determine the best set of resources for HPCC Systems clusters. Furthermore, we created a program to create and manage HPCC Systems clusters in AWS from the command line. This talk will present the tools we created and also explain the reference architecture and many of the configuration options.
David de Hilster, Consulting Software Engineer, LexisNexis Risk Solutions - New ECL IDE Features in 7.0
The ECL IDE is an integrated development environment for ECL programmers to create, edit, and execute ECL code within the HPCC Systems platform. The latest 7.0 version includes new features and enhancements such as a more comprehensive autocomplete, tooltips and F12 capabilities. In this talk, David will discuss how users can leverage these features and more.
Bob Foreman, Senior Software Engineer, HPCC Systems, LexisNexis Risk Solutions - ECL Tip: A Tiny Trove of TABLE Tidbits
This month’s ECL Tip of the Month will focus on the ECL TABLE function. Common (and some not so common) use cases will be discussed. The code examples demonstrated will also be available for download.
https://www.brighttalk.com/webcast/15091/350933
Presented by: HPCC Systems
Tags: hpcc systems, ecl, Cloud Architecture, Big Data, Visual Analytics, Data Analytics, Open Source, Cloud Computing

HPCC Systems Community Focus: 5 Questions with Anupam Sengupta
Fri, 15 Mar 2019 13:00:00 +0000
In this session, we are highlighting some of the rock stars of the HPCC Systems Community. Today's session is 5 Questions with Anupam Sengupta.
Anupam is a co-founder and the CTO of GuardHat.
https://www.brighttalk.com/webcast/15091/350497
Presented by: Anupam Sengupta, Flavio Villanustre
Tags: Big Data, Big Data Analytics, HPCC Systems, GuardHat, Industrial IoT

The Download: Tech Talks by the HPCC Systems Community, Episode 21
Thu, 21 Feb 2019 16:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation, and further build and link the relationships within our HPCC Systems community.
Featured speakers include:
Adwait Joshi, CEO DataSeers - HPCC Systems - An IoT use case for Payments
Traditionally, we have all used Thor for data processing and ROXIE indexes for data pulls. Think about using ROXIE for data ingest and Thor for pulling data directly into the back-end repository. This talk will explain how DataSeers has designed a real-time transaction monitoring system using HPCC Systems, Kafka, ElasticSearch and MySQL, pushing the envelope for a typical use case. Learn the roadblocks we encountered, how we worked around them, and how we hardened the system to be truly disaster resistant with all open source technologies.
Yanrui Ma, Software Architect, LexisNexis Risk Solutions - Dynamic ESDL Has Become More Dynamic In 7.0
In this talk, Yanrui will talk about some of the major changes with Dynamic ESDL in 7.0, with a focus on the mechanisms and enhancements that have made it even more dynamic. He’ll give a demo of creating a DESDL service with the improved “esdl” command line to show you how easy and quick it can be. He’ll also go over DESDL related ECL Watch changes in 7.0, and some of the upcoming DESDL features.
Bob Foreman, Senior Software Engineer, HPCC Systems, LexisNexis Risk Solutions - ECL Tip: All About the ECL SET
This month’s ECL Tip spotlights the ECL SET definition, value type, and other supported functions that use it. Several code examples and best practices will be demonstrated.
https://www.brighttalk.com/webcast/15091/348066
Presented by: HPCC Systems
Tags: hpcc systems, big data, Data Analytics, Machine Learning, Open Source

HPCC Systems Community Focus: 5 Questions with Richard Chapman
Thu, 14 Feb 2019 13:30:00 +0000
In this session, we are highlighting some of the rock stars of the HPCC Systems Community. Today's session is 5 Questions with Richard Chapman.
Richard has been with LexisNexis Risk Solutions for more than 25 years. He is the VP of Research and Development and the leader of the HPCC Systems development team. Richard wrote the code to create the HPCC Systems query cluster, also known as ROXIE, which stands for Richard’s Online XML Inquiry Engine. He was one of the original designers of ECL, which was created as a data-centric programming language for easily expressing problems involving large quantities of data.
https://www.brighttalk.com/webcast/15091/349753
Presented by: Richard Chapman, Flavio Villanustre
Tags: Big Data, Big Data Analytics, hpcc systems

The Download: Tech Talks by the HPCC Systems Community, Episode 20
Thu, 24 Jan 2019 16:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation, and further build and link the relationships within our HPCC Systems community.
Featured speakers and topics include:
•Rob Mansfield, Senior Data Scientist, Proagrica - Dapper - A bundle to make your ECL neater
Have you ever written a long project for a simple column rename and thought, this should be easier? What about nicely named output statements? Yeah, they bother me too. Oh, and DEDUP(SORT(DISTINCT()))? There is a better way! Learn how Dapper can help!
•Bob Foreman, Senior Software Engineer, HPCC Systems, LexisNexis Risk Solutions - ECL Tip: The Seven Faces (Forms) of Dr. LOOP (Function)
The LOOP function has always been a powerful, yet tough ECL function to understand and use. Bob will review and examine the upcoming major changes to this documentation and showcase new examples.
•Lorraine Chapman, Consulting Business Analyst, LexisNexis Risk Solutions - Update on Academic Collaboration
Lorraine will share an update on recent collaboration, upcoming academic events and the 2019 HPCC Systems Internship Program.
https://www.brighttalk.com/webcast/15091/345758
Presented by: HPCC Systems
Tags: HPCC Systems, ECL, Big Data, Data Analytics

HPCC Systems Community Focus: 5 Questions with Lili Xu
Thu, 24 Jan 2019 14:30:00 +0000
In this session, we are highlighting some of the rock stars of the HPCC Systems Community. Today's session is 5 Questions with Lili Xu.
Lili is in the final stages of completing her PhD in Computer Science. She has worked in the DICE lab directed by Dr. Apon in the school of computing at Clemson University.
Lili has completed three internships with the HPCC Systems team, working on machine learning applications. Her research area is machine learning, natural language processing and high performance computing. We are pleased that Lili has joined the team as a LexisNexis employee.
https://www.brighttalk.com/webcast/15091/347850
Presented by: Lili Xu, Flavio Villanustre
Tags: Big Data, Big Data Analytics, Computer Science, Clemson University

HPCC Systems Community Focus: 5 Questions with Amy Apon
Tue, 18 Dec 2018 18:30:00 +0000
In this session, we are highlighting some of the rock stars of the HPCC Systems Community. Today's session is 5 Questions with Amy Apon, Ph.D.
Dr. Apon maintains an active research program at Clemson. Areas of research interest include cloud computing, performance modeling and analysis of parallel and distributed systems, data-intensive computing, emerging parallel architectures, and the impact of high performance computing on research competitiveness. Her research is currently supported by the National Science Foundation, the Department of Education, BMW, HPCC Systems, LexisNexis, Elsevier Scopus, RELX Group, and Amazon.
https://www.brighttalk.com/webcast/15091/344871
Presented by: Flavio Villanustre and Amy Apon
Tags: Big Data, Big Data Analytics, computer science, clemson university

Focus on FinTech [Season 2 Ep. 7]: Data & Innovation in 2019
Tue, 11 Dec 2018 16:00:00 +0000
In the season finale of Focus on FinTech we look at the topics on the minds of FinTech professionals at Money 20/20.
From how FinTechs and traditional finance can work together, to the importance of data and innovation, discover the top trends in FinTech from industry experts like Peerstreet and Dataseers.
https://www.brighttalk.com/webcast/15091/342931
Presented by: Eric Hazard, CEO, Vested Ventures
Tags: Fintech, Finance Directors, focus on fintech, Innovation, Investment Management

The Download: Tech Talks by the HPCC Systems Community, Episode 19
Thu, 15 Nov 2018 15:00:00 +0000
Speakers and topics for this episode include:
Jayashree Ukkinagatti, Rashtreeya Vidyalaya College of Engineering, India
Set up Automatic Builds for the continuous integration of ECL queries stored in GIT using Jenkins
Software developers often work in isolated teams. If they need to integrate their changes with a different code base, waiting days to do so may create many merge conflicts, make bugs hard to fix, or lead to duplicated effort. In this presentation, Jayashree will speak about setting up automatic builds to integrate ECL queries stored in Git using Jenkins deployment pipeline techniques, triggered when a pull request is made for additions or changes to those queries.
Nicole Navarro, New College of Florida
Measuring the geo-social distribution of Opioid Prescriptions
Drug overdose was the leading cause of accidental death in the US in 2015, and the number of drug overdoses involving opioids in 2016 was 42,249 – an increase of 18% per year since 2014. In this talk, Nicole will explain how she utilized the open source HPCC Systems capabilities around knowledge engineering to create data features and interactive visualizations. These were designed to allow research into Drug Socialization across social groups and geographical regions with a focus on opioid prescription rates.
https://www.brighttalk.com/webcast/15091/340791
Presented by: HPCC Systems
Tags: hpcc systems, Big Data, Data Analytics

HPCC Systems Community Focus: 5 Questions with David Dasher
Mon, 1 Oct 2018 13:30:00 +0000
In this session, we are highlighting some of the rock stars of the HPCC Systems Community. Today's session is 5 Questions with David Dasher.
David Dasher is the Chief Technology Officer and Founder of CPL Online, the leading provider of e-Learning and digital services to the UK’s hospitality sector, that since 2018 has been part of CGA Group.
With over 25 years’ experience within the IT sector, he has worked extensively in the UK’s corporate sector developing database, marketing, and management solutions. Under David’s leadership, CPL Online has established itself as a market leader and enjoyed several years of strong year on year growth.
https://www.brighttalk.com/webcast/15091/335851
Presented by: Flavio Villanustre and David Dasher
Tags: HPCC Systems, CPL Online, Big Data Analytics

The Download: Tech Talks by the HPCC Systems Community, Episode 17
Thu, 13 Sep 2018 14:00:00 +0000
Speakers and topics for this episode include:
Farah Al Shanik, Clemson University - Equivalence Terms for Text Search Bundle
Text Search Bundle (TSB) is an open source project for searching XML text documents. It contains many subtasks, one being equivalence terms, which we can consider strong synonyms for TSB. There are several kinds of term equivalence: initialisms, abbreviations, synonyms, and similarity based on context. We used HPCC Systems to develop a text search tool that uses the Moby thesaurus to return a set of synonyms and the word2vec algorithm to return similar words. We then built a dataset of state names and their abbreviations to return the set of related documents, and improved initialism handling so that TSB finds strings with or without punctuation.
Soukaina Filali, Georgia State University - Fraud Detection on Transactional Data using a Time Series Mining Approach
The project consists of distinguishing fraudulent pre-paid cards from non-fraudulent ones using patterns mined from their respective historical bank transaction data. There are numerous types of card programs, each of which comes with different fraud risk levels. Every fraud category has representative patterns that a human manually monitors on a daily basis. The goal here is to combine the features engineered by domain experts with time series shapelet mining techniques to provide an automated fraud detection solution, which can potentially help in early fraud detection.
Lili Xu, Clemson University & Gus Reyna, LexisNexis - Using HPCC Systems ML to Map Thousands of Public Records Data Descriptions to Standard Codes
Incorporating public records data into business processes is challenging, given disparate descriptions across states for similar events and the need to find standards that give a consistent meaning for use. This session tells the story of how HPCC Systems ML addressed the problem of mapping thousands of disparate public record data descriptions to a corresponding set of standard codes.
https://www.brighttalk.com/webcast/15091/333186
Presented by: HPCC Systems
Tags: ecl, hpcc systems, Machine Learning

HPCC Systems Community Focus: 5 Questions with Itauma Itauma
Thu, 23 Aug 2018 15:00:00 +0000
In this session, we are highlighting some of the rock stars of the HPCC Systems Community. Today's session is 5 Questions with Itauma Itauma.
Itauma Itauma is a doctoral candidate at Keiser University and a computer science instructor at Wayne State University. His interests lie in learning analytics and utilizing HPCC Systems for educational research. He has an undergraduate degree in Electrical Engineering from the University of Ilorin and two Master's degrees: a Master of Science in Computer Engineering from Istanbul Technical University, majoring in human-robot interaction, and a Master of Science in Computer Science from Wayne State University, where his thesis was based on leveraging HPCC Systems for Big Data analytics.
https://www.brighttalk.com/webcast/15091/333918
Presented by: Itauma Itauma
Tags: HPCC Systems, Big Data

The Download: Tech Talks by the HPCC Systems Community, Episode 16
Thu, 2 Aug 2018 15:00:00 +0000
This episode will feature our 2018 HPCC Systems summer interns:
Shah Muhammad Hamdi, PhD student, CS at Georgia State University - Dimensionality Reduction and Feature Selection in ECL-ML
Hamdi will discuss the parallel implementation of Principal Component Analysis (PCA) using the Parallel Block Basic Linear Algebra Subsystem (PBblas) library and ECL implementations of feature selection algorithms for the HPCC Systems platform.
Robert Kennedy, PhD student in Computer Science at Florida Atlantic University - Parallel Distributed Deep Learning on HPCC Systems
Robert will cover what he implemented during his summer internship. Combining HPCC Systems and Google’s TensorFlow, Robert created a parallel stochastic gradient descent algorithm to provide a basis for future deep neural network research and to enhance HPCC Systems’ distributed neural network training capabilities.
Aramis Tanelus, programmer and senior at American Heritage High School where he is the lead programmer for the Advanced Robotics Team - Developing HPCC Systems Data Ingestion APIs for Common Robotic Sensors.
Aramis’s project will make it easy for anyone in robotics around the world to ingest data from common robotic sensors into an HPCC Systems platform for use in data analysis. Aramis will be speaking about his work on the autonomous agricultural robot and implementing new packages for the Robot Operating System to interface with HPCC Systems for big data analysis.
Saminda Wijeratne, Masters student, Computational Science and Engineering at Georgia Institute of Technology, Atlanta - MPI Proof of Concept
The built-in "Message Passing" library in HPCC Systems is designed to handle communications among dissimilar components and perform non-trivial communication patterns among them. Saminda will explore how this library currently operates and how we can introduce a different implementation, such as an existing popular library called MPI.
https://www.brighttalk.com/webcast/15091/330079
Presented by: HPCC Systems
Tags: Machine Learning, Big Data, HPCC Systems, ECL

The Download: Tech Talks by the HPCC Systems Community, Episode 15
Thu, 28 Jun 2018 15:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community. This episode will feature three speakers on the following topics:
Jingqing Zhang, Imperial College London
Deep Sequence Learning and Text Classification
Bob Foreman, LexisNexis Risk Solutions
ECL Summer Code Camp Review
On May 16th, five HPCC Systems Ambassadors along with Flavio Villanustre met with eight iRISE2 members for a two-hour ECL Code Camp. The event was a great success, and I thought I’d share with the community what we did and some of the ECL ideas that came out of it. Tips from Data Ingestion to ECL to Data Evaluation will be included in this segment.
https://www.brighttalk.com/webcast/15091/323691
Presented by: HPCC Systems
Tags: HPCC Systems, ECL, Big Data, Machine Learning, text classification, Data Science

Boost Mobile & Digital Banking Engagement while Reducing Churn With Big Data
Tue, 22 May 2018 14:30:00 +0000
Join Anirudh Shah, Founder & CEO, 3LOQ Labs, and Flavio Villanustre, VP Technology, HPCC Systems, to learn how 3LOQ is solving the problem of customer churn with open source big data and machine learning technology. 3LOQ addresses this challenge by deploying proprietary machine learning algorithms to analyze billions of data points and map out dynamic feature recommendations to reinforce repeated usage of a product. The end result? Reduced churn with high customer engagement for businesses.
3LOQ recently partnered with a leading Indian banking institution to increase adoption of their digital channels. The project yielded impressive results for the client, including a:
· 45% reduction in customer churn
· 145% increase in digital banking transactions
· 75% increase in users who made four or more transactions per month
In this webcast, Flavio will give an overview of one of the key tech tools that contributes to 3LOQ's success, the completely free, open source HPCC Systems big data platform. Anirudh will share how 3LOQ Labs leverages this platform to:
• Analyze four terabytes of data combined with built-in analytics libraries to create personalized recommendations
• Utilize efficient coding in an implicitly parallel platform that allows prototypes to be developed and iterated quickly
• Enable horizontal scaling on commodity hardware, with the flexibility to deploy both on premises and in the cloud
https://www.brighttalk.com/webcast/15091/317803
Presented by: Anirudh Shah, Founder & CEO, 3LOQ Labs and Flavio Villanustre, VP Technology, HPCC Systems
Tags: Open Source, Big Data, Data Analytics, Mobile Banking, Digital Banking, Retail Banking, decision science, bank cards, Analytics, Machine Learning

The Download: Tech Talks by the HPCC Systems Community, Episode 14
Thu, 17 May 2018 15:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community. This episode will feature three speakers on the following topics:
Tai Donovan, Robotics Director, American Heritage School - High School Autonomous Agricultural Project
A group of 5-6 students are working on an autonomous agricultural project with the goal of providing time sensitive data to the owner-operator/farmer/grower of a production farm. Tai will discuss their challenges and how he is using HPCC Systems.
Lorraine Chapman, Consulting Business Analyst, LexisNexis Risk Solutions - Meet Our Summer Interns
By the end of 2018, ten students will have completed projects as part of the HPCC Systems intern program. Find out about these students, including where and what they are studying, the projects they will be working on and the intern experience we provide to help them feel part of the team. Lorraine will also speak about how you can get involved with the program by being a mentor, or contributing a project idea for a new feature or enhancement to the HPCC Systems platform and/or Machine Learning Library.
Richard Taylor, Chief Trainer, HPCC Systems, LexisNexis Risk Solutions – Current/Longest Event Sequence by Month
Richard will discuss processing event dates to discover for each event within a given time frame: the current number of sequential months the event occurred, and the longest contiguous month-by-month sequence. This topic is based on questions from one of our Statistical Modelers (new to ECL) regarding how to approach the problem in a non-procedural manner. The example code will make use of the GROUP and HAVING functions.
https://www.brighttalk.com/webcast/15091/318027
Presented by: HPCC Systems
Tags: hpcc systems, Machine Learning, autonomous vehicles

The Download: Tech Talks by the HPCC Systems Community, Episode 13
Thu, 19 Apr 2018 15:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.
Episode 13 includes Tech Talks featuring speakers from our community on topics covering the Future of Automotive Telemetry: Assessing Autonomous Vehicle Risk Implications using Simulated Data, Developing A Custom, Pluggable HPCC Systems Security Manager and Understanding the ECL Watch Graphs. View the full details at hpccsystems.com
https://www.brighttalk.com/webcast/15091/311103
Presented by: HPCC Systems
Tags: HPCC Systems, Big Data, graph processing, Data Analytics, autonomous

The Download: Tech Talks by the HPCC Systems Community, Episode 12
Thu, 15 Mar 2018 15:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.
Episode 12 includes Tech Talks featuring speakers from our community on topics covering exploratory data analysis, geospatial solutions and ECL Tips leveraging the HPCC Systems platform.
1) Itauma Itauma, PhD Candidate, Keiser University - Conducting exploratory data analysis in educational research using HPCC Systems®
2) Ignacio Calvo, LexisNexis Risk Solutions - Big Data and Geospatial with HPCC Systems®
3) Bob Foreman, Senior Software Engineer, HPCC Systems, LexisNexis Risk Solutions - ECL Tip of the Month
https://www.brighttalk.com/webcast/15091/306651
Presented by: HPCC Systems
Tags: hpcc systems, geospatial, Big Data, ecl, Machine Learning

No Ordinary Hard Hat: Improving Health & Safety with Open Source Big Data
Fri, 2 Mar 2018 15:00:00 +0000
Over 4,000 U.S. workers die on the job every year. While new wearable technologies are aggressively entering consumer applications, industrial safety equipment has not seen a fundamental innovation in the last decade.
Join us to learn how Guardhat and its CTO, Anupam Sengupta, use open source big data technology to address this issue with the company's “smart hard hat ecosystem”, an industrial wearable that uses IoT and wireless communications systems to protect and empower industrial workers.
In this webcast, Flavio will give an overview of the completely free, open source HPCC Systems big data platform.
Anupam will share how Guardhat leveraged this platform to:
• Allow real-time complex event processing of vast amounts of streaming data.
• Enable horizontal scaling on commodity hardware, with the flexibility to deploy both on premises and in the cloud.
• Support big data analytics including the ability to analyze, identify, and predict trends.
• Enable rapid green-field development
https://www.brighttalk.com/webcast/15091/303415
Presented by: Anupam Sengupta, CTO Guardhat & Flavio Villanustre, VP Technology, HPCC Systems
Tags: Open Source, Big Data, Industrial IoT, Health IT, Health & Safety, Industrial Manufacturing

The Download: Tech Talks by the HPCC Systems Community, Episode 11
Thu, 15 Feb 2018 16:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.
Episode 11 includes Tech Talks featuring speakers from our community on topics covering Big Data solutions, Spark Integration and other ECL Tips leveraging the HPCC Systems platform.
1) Raj Chandrasekaran, CTO & Co-Founder, ClearFunnel - Scaling Data Science capabilities: Leveraging a homogeneous Big Data ecosystem
2) James McMullan, Software Engineer III, LexisNexis Risk Solutions - HDFS Connector Preview
3) Bob Foreman, Senior Software Engineer, LexisNexis Risk Solutions - Building a RELATIONal Dataset - A Valentine’s Day Special!
https://www.brighttalk.com/webcast/15091/302435
Presented by: HPCC Systems
Tags: hpcc systems, Cloud Computing, AWS, Big Data, SPARK

The Download: Tech Talks by the HPCC Systems Community, Episode 10
Thu, 18 Jan 2018 16:00:00 +0000
Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.
Episode 10 will kick off our first Tech Talk in 2018 and includes 15 minute Tech Talks featuring speakers from the community:
1) Chris Gropp, PhD candidate, Clemson University - Asking the Right Questions with Machine Learning
The HPCC Systems Machine Learning Library contains a number of powerful tools, but it is important to use them properly. Chris will discuss how to ask the right questions by taking a step backwards from the methods themselves and examining the requirements defined by the applications.
2) Rodrigo Pastrana, Software Architect, LexisNexis Risk Solutions - Creating Front-facing Web Services to Deliver your HPCC Systems Query Data
The HPCC Systems platform provides everything you need to easily create production grade web services to deliver your query data. Rodrigo will discuss the tools and frameworks provided by the HPCC Systems platform and walk through the end-to-end creation of a sample web service.
3) Richard Taylor, Chief Trainer, HPCC Systems, LexisNexis Risk Solutions – ECL Tips and Cool Tricks
Join Richard for the latest tips and tricks with using ECL! In this session, he will talk about the PARSE function and interesting techniques used in data parsing.
https://www.brighttalk.com/webcast/15091/295587
HPCC Systems
"HPCC Systems", "Big Data", "machine learning"

Baselines & Benchmarks – Making Open Source Big Data Analytics Easy
Thu, 7 Dec 2017 17:00:00 +0000
Bringing heterogeneous data into a homogeneous data warehouse environment is one of the most daunting aspects of any big data implementation.
Even though Apache Spark and HPCC Systems Thor can be thought of as complementary, there is interest in comparing their performance with data analytics-related benchmarks, specifically transformation, cleaning, normalization, and aggregation. Join us to hear how HPCC Systems Thor's performance compares to Apache Spark utilizing standard benchmarking methodologies.
Learn how these benchmarks and HPCC Systems can help you establish new baselines that:
• Improve the speed and accuracy of the transformation, cleaning, normalization, and aggregation processes
• Enable efficient use of developer resources and development budgets
• Facilitate the use of standard hardware, operating systems, and protocols
https://www.brighttalk.com/webcast/15091/290773
Arjuna Chala, Sr. Director of Special Projects for the HPCC Systems
ETL, "big data", SPARK, "HPCC Systems", Benchmarking, "Open Source"

How Open Sourced Big Data is Helping to Fuel World Sustainability
Wed, 6 Dec 2017 15:00:00 +0000
As farmers grapple with how they will feed an increasing global population, the need to harness data and analytics has become more critical. Changing diets, demand for healthier food options, and decreasing water availability are just some of the challenges that face agriculture today. Combine that with global market volatility and rising input costs such as water, chemicals, seeds, and more and it is harder than ever for farmers to be profitable and sustainable.
As Proagrica searched for ways to help the agriculture industry use data-driven decision making for crops and livestock production, they decided to adopt HPCC Systems® as their big data partner. HPCC Systems not only delivered a scalable, resilient and secure platform, but it also met their projected future expansion needs.
Join Jeff Bradshaw, Group CTO for Adaptris within Proagrica and Flavio Villanustre, Vice President of Technology for HPCC Systems, as they discuss how they implemented HPCC Systems and the Adaptris enterprise service bus to incorporate massive, diverse data sets from within complex, secure environments. Jeff and Flavio will share insights on best implementation practices to deliver clean/quality data, provide data security, and deliver the real-time analytics their customers demanded.
In this webcast, Proagrica will share their mission and their experiences with:
· The open source, end to end big data processing platform, HPCC Systems
· Rapid development with a growing set of real time data sources
· Seamlessly integrating with existing infrastructure to simplify deployment and management capabilities
· Real-time data ingestion and flexible stream processing from massively diverse data sets
· Enabling real-time analytics while providing for data security
https://www.brighttalk.com/webcast/15091/287559
Jeff Bradshaw, Group CTO, Adaptris within Proagrica & Flavio Villanustre, Vice President of Technology, HPCC Systems
"Big Data", Agriculture, Analytics, "Data Analytics", "Open Source", "Data Science"

The Download: Tech Talks by the HPCC Systems Community, Episode 9
Thu, 16 Nov 2017 16:00:00 +0000
Join us November 16 for another episode of The Download: HPCC Systems Community Tech Talks!
This series of workshops is specifically designed for the community by the community with the goal to share knowledge, spark innovation, and further build and link the relationships within our HPCC Systems community.
Featured speakers and topics include:
Robert Pelley, Architect for UK and Ireland, LexisNexis Risk Solutions - Integrating REDIS with HPCC Systems in high volume UK infrastructure.
REDIS (REmote DIctionary Server), an in-memory caching technology, will be integrated into the LexisNexis application stack to provide an additional cache at the entry point of the application infrastructure. This will serve as a cache of Product Responses, whereas existing caches will continue to serve Vendor Responses. The aim of the REDIS front-end Product Response cache is to improve system throughput and response times.
Bob Foreman, Senior Software Engineer, LexisNexis Risk Solutions – ECL Tips: The Bright Green Data Generation Machine (DataGen)
Bob will talk about DataGen, the ECL code generator, share best practices and tips for using it to generate random data, and walk through a short demo.
https://www.brighttalk.com/webcast/15091/285841
Flavio Villanustre
"hpcc systems", "Big Data"

The Download: Tech Talks by the HPCC Systems Community, Episode 7
Thu, 14 Sep 2017 15:00:00 +0000
This series of workshops is specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.
Featured speakers and topics include:
Xiaoming Wang (Ming), Consultant Software Engineer, LexisNexis Risk Solutions - Initial HPCC Systems integration with Jupyter Notebook
The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more. In this talk, Ming will give an overview of plans for implementing a TypeScript-based ECL kernel that uses the HPCC Systems JavaScript libraries to submit ECL code and return Workunit results rendered in Jupyter Notebook cells.
Bob Foreman, Senior Software Engineer, LexisNexis Risk Solutions – ECL Tips: PROCESS and AGGREGATE transform functions
Have you ever wanted to expand the power of your ECL ITERATE and ROLLUP statements? Bob Foreman discusses the next level PROCESS and AGGREGATE transform functions, and illustrates practical examples that were shown in our HPCC Systems forums.
https://www.brighttalk.com/webcast/15091/274097
Flavio Villanustre
HPCCSystems, BigData, Jupyter

The Download: Tech Talks by the HPCC Systems Community, Episode 6
Tue, 1 Aug 2017 15:00:00 +0000
Join us as we continue this series of workshops specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.
Episode 6 will include 15-minute Tech Talks from our HPCC Systems summer interns.
This session will include Lorraine Chapman, Consulting Business Analyst, LexisNexis Risk Solutions, who manages our HPCC Systems internship program, along with our 2017 HPCC Systems summer interns. Come hear the cool projects they have been working on!
Lili Xu - PhD student of Computer Science at Clemson University, USA
Extending the YinYang K-Means machine learning algorithm in ECL
Vivek Nair - PhD student of Computer Science at North Carolina State University, USA
Working to allow Spark to use HPCC Systems as a datastore and provide ECL programmers with the ability to access Spark algorithms.
George Mathew - PhD student of Computer Science at North Carolina State University, USA
Implementing the Gradient Trees machine learning algorithm in ECL to build predictive models.
https://www.brighttalk.com/webcast/15091/270335
Flavio Villanustre
HPCCSystems, BigData, MachineLearning

The Download: Tech Talks by the HPCC Systems Community, Episode 5
Thu, 25 May 2017 15:00:00 +0000
This series of workshops is specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.
Our featured speakers and topics include:
1. Jeff Bradshaw, CTO, Adaptris - Interlok Deep Dive
In this talk, Jeff will explain how Interlok is used within the HPCC Systems platform, specifically the Thor component, and developing entity models for delivering data insights.
2. Jon Burger, Sr Architect, LexisNexis Risk Solutions - Hive360, Cloud Ported HPCC Systems Platform
HPCC Systems is excited to announce the creation of the Hive360 & Swarm360 stacks. Hive360 and its companion Swarm360 are a set of AWS cloud formation scripts designed to easily create a scalable, self-configuring, self-healing, on-demand HPCC platform within an existing AWS VPC. This talk will introduce you to Hive360 and its components, give a brief demonstration of the process and answer any questions you have about this technology.
3. Rodrigo Pastrana, Software Architect, LexisNexis Risk Solutions - SQL on HPCC Systems
HPCC Systems provides a SQL interface into its data files and published ROXIE queries called WsSQL. As its name implies, this functionality is provided as a web service, which allows interactive and/or programmatic SQL based access to HPCC Systems.
4. Bob Foreman, Senior Software Engineer, HPCC Systems, LexisNexis Risk Solutions - ECL Tip of the Month
This session will showcase an “ECL Tip of the Month”, presented by one of our ECL instructors, Bob Foreman. The tip will usually be something interesting that was posted on our HPCC Systems Support Forums, or a cool teaching example found in one of our many ECL classes.
https://www.brighttalk.com/webcast/15091/258047
Flavio Villanustre
HPCCSystems, BigData, MachineLearning

The Download: Tech Talks by the HPCC Systems Community, Episode 4
Thu, 20 Apr 2017 15:00:00 +0000
This series of workshops is specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.
Our featured speakers and topics include:
Gordon Smith, Lead Architect, LexisNexis Risk Solutions - “Visualizer” – the ECL Bundle
John Holt, Lead Architect, LexisNexis Risk Solutions – An Update of the Machine Learning Bundles
David de Hilster, Consulting Software Engineer, LexisNexis Risk Solutions - The ECL IDE Goes Multi-Language – Computer Languages that Is!
Jessica Lorti, Director Marketing, LexisNexis Risk Solutions - HPCC Systems Marketing & Web Site Update
https://www.brighttalk.com/webcast/15091/255225
Flavio Villanustre
HPCCSystems, bigdata

The Download: Tech Talks by the HPCC Systems Community, Episode 3
Thu, 30 Mar 2017 15:00:00 +0000
This series of workshops is specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.
1. Joselito (Joey) Chua, PhD, Manager Software Engineer, Optimal Decisions Group - Prescriptive Analytics - a Software Engineering Perspective
This talk presents an overview of prescriptive techniques involving simulation and optimisation, the engineering challenges in building prescriptive tools, and HPCC solutions for those challenges.
2. Jill Luber, Senior Architect, LexisNexis Risk Solutions - Migrating an ECL code repository into Git, Part II
This session will take a quick look at a migration plan that moved ECL production code, production processes and developers out of MySQL/SVN and into a Git code management culture. This includes migrating both ROXIE and Thor processes to use Git branches across multiple HPCC Systems environments, all while continuing production data builds and releases.
3. Michael Gardner, Software Engineer II, LexisNexis Risk Solutions - HPCC Systems Platform: Java APIs and tools
This presentation covers the Java APIs and tools released by the HPCC Systems Platform team. These projects include wsclient, rdf2hpcc, clienttools, and jdbc. These open source projects, which can be found in the hpcc-systems GitHub repositories, are designed to give downstream developers a consistent means of interfacing with the HPCC Systems Platform and to facilitate the workflow of common tasks a downstream developer might be concerned with.
4. Bob Foreman, Senior Software Engineer, HPCC Systems, LexisNexis Risk Solutions - In Search of the Lost Tutorial – the best ECL lesson you have never seen.
In this presentation, Bob will explore David Bayliss’ ECL Bible Tutorial, with particular focus on the GRAPH function and building the inverted index for the ROXIE search.
https://www.brighttalk.com/webcast/15091/249491
HPCC Systems
hpccsystems;bigdataanalyticsmachinelearning

The Download: Tech Talks by the HPCC Systems Community, Episode 2
Thu, 16 Feb 2017 16:00:00 +0000
The purpose of the workshop will be to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.
1. Fujio Turner, Solutions Architect, Couchbase - Mobile/IoT & HPCC Systems
Fujio will discuss the challenges around IoT and address the following questions:
As more mobile and embedded devices generate more data, what does that mean now and for the future?
What has to change in an organization's infrastructure to keep up?
And how can I best take advantage of this new stream of information?
2. Jacob Pellock, Sr Director Software Engineering, LexisNexis - Operationalizing jobs on Thor utilizing Python, Git and HPCC Systems client tools - Part I
So you’ve set up your HPCC Systems cluster and you’ve written your ECL code. Now you want to take the ECL you’ve written into production. Jacob will explain what technologies we’ve leveraged in bringing our LexisNexis data warehouse into production.
3. Roger Dev, Sr Architect, LexisNexis - Basic Linear Algebra Subsystem (BLAS) and Parallel Block BLAS (PBBlas) libraries for HPCC Systems.
Manipulation of matrix data via Linear Algebra operations lies at the heart of many data-mining and machine-learning techniques. New modules for HPCC provide highly scalable and performant implementations of these operations. BLAS provides an industry-standardized set of highly-optimized linear algebra operations. PBBlas extends these operations to mega-scale, splitting the operations into parallelizable units that can be balanced across an HPCC cluster. This talk provides an introduction to BLAS, describes the techniques and features of PBBlas, and provides an overview of the PBBlas interface.
4. Richard Taylor, Chief Trainer, HPCC Systems - HPCC Systems Training: Updates and Deep Dives on Cool Code
Richard will be presenting an update on what’s going on with ECL/HPCC/SALT/KEL training courses. He will also be selecting some interesting code snippets.
https://www.brighttalk.com/webcast/15091/244033
HPCC Systems Community
machinelearning;IoTPBblas;graphanalytics

The Download: Tech Talks by the HPCC Systems Community, Episode 1
Thu, 12 Jan 2017 16:00:00 +0000
Introducing The Download: Tech Talks by the HPCC Systems community!
We had a great lineup of speakers for our first Tech Talk on January 12:
Flavio Villanustre, VP Technology, LexisNexis
Anirudh Shah, Co-Founder, 3Loq
Allan Wrobel, Sr Software Engineer, LexisNexis
Lorraine Chapman, Consulting Business Analyst, HPCC Systems
Detailed Agenda
Flavio Villanustre, VP Technology, LexisNexis
Welcome and HPCC Systems Update
Anirudh Shah, Co-Founder, 3Loq
How we use HPCC Systems to process more than 500 monthly marketing campaigns at the largest private bank in India across the bank's entire portfolio.
Our experience with HPCC Systems in production
Automation and data sanity frameworks
Allan Wrobel, Senior Engineer, LexisNexis
Making full use of Superfiles to make order of magnitude improvements to build times on THOR. (plus fringe benefits)
Thor is well known for making short work of processing billions of records, which promotes a tendency to use brute force in its deployment. Watch how the UK team chose efficiency over brute force to reduce the processing time for a daily build of a billion-record ingest file from 12 hours to 2 hours, and enabled further speed increases in other processes.
Lorraine Chapman, Consulting Business Analyst, HPCC Systems
In 2015, HPCC Systems was an accepted organization for Google Summer of Code (GSoC) taking on 2 students involved in this program. However, we had the bandwidth to support more students and so the HPCC Systems summer internship program was born. Four students joined the program in 2015 and four more in 2016. We will apply for GSoC and run our intern program again in 2017. Hear how the programs work, how projects are identified and find out about student successes on these programs.
https://www.brighttalk.com/webcast/15091/240577
Flavio Villanustre, Anirudh Shah, Allan Wrobel, Lorraine Chapman
dataanalyticsmarketingGSoCdataprocessing