Big Data & Analytics

LENOVO BIG DATA & ANALYTICS SOLUTIONS

Insights with Faster Time-to-Value

The analysis of data drives decisions in every business. To gain better business insights, you need to manage the volume, variety, and velocity of data, while applying analytics. With Lenovo-engineered big data validated designs on Lenovo servers, you can harness the power of Apache™ Hadoop® and Apache™ Spark® with Cloudera®, Hortonworks®, and MapR®. Furthermore, Lenovo servers provide highly reliable and flexible foundations for your business analytics solutions so you can unlock the value of your data and deliver insights faster.

Outstanding scalability so you can grow as your workloads grow

Industry-leading transaction processing so you can make better, faster business decisions

High-throughput capacity that enables you to respond more quickly

Optimized systems and validated designs for faster time to value

Lenovo Big Data Validated Design for Cloudera Enterprise

With the Lenovo Big Data Validated Design for Cloudera Enterprise, Lenovo delivers a certified solution for both Apache Hadoop and Apache Spark environments. These tested configurations perform complex analytics on structured and unstructured data. These offerings provide the planning, design considerations, and best practices for deploying Cloudera Enterprise with a variety of Lenovo products, including the high-performance Lenovo ThinkSystem SR650 and SR630 servers. The Lenovo Big Data Validated Design for Cloudera Enterprise may also be deployed with Lenovo x3550 M5 and x3650 M5 servers (System x servers) to support both Apache Hadoop and Apache Spark architectures.

Additionally, these validated designs address the deployment of big data platforms with Cloudera Enterprise onto a VMware vSphere virtualized server environment. Supported by the Lenovo ThinkSystem and System x servers, this aspect of the Lenovo Big Data Validated Design for Cloudera Enterprise lets administrators manage big data solutions in a manner similar to other VMware vSphere virtualized workloads.

Lenovo Big Data Validated Design for Cloudera Streaming Analytics

For businesses that need to analyze streaming data in real time, Lenovo offers a solution that addresses streaming analytics with a unique big data reference architecture. This Cloudera Enterprise architecture for Apache Spark streaming and machine learning uses the Lenovo ThinkSystem SR650 for data nodes and the ThinkSystem SR630 for management nodes. The data nodes use Intel Xeon Platinum 8168 processors to support a suite of applications, including the Apache Kafka stream processing platform, the HBase database, and the Elasticsearch analytics engine. Kafka stream data is stored on 4 TB Intel P4500 Series NVMe solid-state drives, chosen for their high IOPS and bandwidth. Elasticsearch search data is stored on low-latency Intel® Optane™ SSD DC P4800X Series drives. The solution is designed and tested to scale from half-rack to multi-rack configurations, and it is ideal for those who require a high-performance, low-latency platform for real-time streaming analytics applications.
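
For readers planning a deployment, the node roles described above can be summarized as a small data structure. The sketch below is illustrative only: the class name NodeRole, the STREAMING_LAYOUT table, and the storage_for helper are hypothetical names chosen for this example, not part of any Lenovo or Cloudera tooling, and the values simply restate the hardware and services named in the paragraph above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class NodeRole:
    """One server role in the reference architecture (hypothetical helper type)."""
    server_model: str
    processor: Optional[str] = None          # None where the text names no processor
    services: Tuple[str, ...] = ()           # applications hosted on this role
    storage: Dict[str, str] = field(default_factory=dict)  # service -> storage tier


# Values restate the description above; the structure itself is an assumption.
STREAMING_LAYOUT = {
    "data": NodeRole(
        server_model="Lenovo ThinkSystem SR650",
        processor="Intel Xeon Platinum 8168",
        services=("Apache Kafka", "HBase", "Elasticsearch"),
        storage={
            "Apache Kafka": "4 TB Intel P4500 Series NVMe SSD",
            "Elasticsearch": "Intel Optane SSD DC P4800X Series",
        },
    ),
    "management": NodeRole(server_model="Lenovo ThinkSystem SR630"),
}


def storage_for(service: str) -> Optional[str]:
    """Return the storage tier named for a service, or None if the text gives none."""
    for role in STREAMING_LAYOUT.values():
        if service in role.storage:
            return role.storage[service]
    return None


if __name__ == "__main__":
    for name, role in STREAMING_LAYOUT.items():
        print(name, "->", role.server_model)
```

A lookup such as storage_for("Apache Kafka") returns the NVMe tier, while storage_for("HBase") returns None because the description does not name a dedicated storage tier for HBase.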

Lenovo Big Data Validated Design for Hortonworks Data Platform

This Lenovo Big Data Validated Design Reference Architecture for Hortonworks Data Platform provides a thoroughly tested and integrated solution that combines the benefits of leading-edge technologies with mature, enterprise-ready features. Starting with a preconfigured, Hortonworks-certified hardware platform, this solution helps your team get analytics up and running quickly.

The Lenovo ThinkSystem product portfolio serves as the high performance foundation for this solution with the Lenovo ThinkSystem SR630 rack server, SR650 rack server and Lenovo ThinkSystem network switches with speeds up to 100Gbps. This solution can also be deployed with the Lenovo System x product portfolio using the Lenovo x3650 M5 and x3550 M5 servers.

Lenovo Big Data Validated Design for MapR Converged Data Platform

The Lenovo Big Data Validated Design for MapR Converged Data Platform is engineered to run the latest MapR Edition of the Apache™ Hadoop® and Apache™ Spark® distributions for a simple, reliable and high performance big data solution. On top of the platform services, MapR packages a broad set of Hadoop and Spark open source ecosystem projects that enable big data applications. The goal is to provide an open platform that lets administrators choose the right tool for the job. Built on Lenovo rack servers and Lenovo RackSwitch network switches, this architecture simplifies Hadoop and Spark implementation by combining the extensive API support and ease of use of MapR with the performance and reliability of Lenovo servers and switches.

The Lenovo product portfolio serves as the high performance foundation for this solution with the Lenovo SR650 rack server and Lenovo ThinkSystem network switches with speeds up to 100Gbps. This solution can also be deployed with the Lenovo System x product portfolio using the Lenovo x3650 M5 server.

Lenovo Converged Analytics Platform for IBM

As businesses look to embrace data-driven decision-making processes to gain a competitive advantage or secure operational efficiencies, Lenovo, IBM and Intel have created a converged analytics platform to make that transition as smooth as possible. The Lenovo converged analytics platform for IBM provides enterprise-level support for a company’s analytical needs, whether they involve open source Apache Spark, data warehousing, or machine learning application development. Data scientists can define and develop suitable machine learning models using the IBM Data Science Experience (DSX) Local offering. In addition, IBM DSX Local may be integrated with and draw upon a data lake featuring Hadoop and Spark databases along with IBM Big SQL and/or IBM Db2 Warehouse applications, enabling businesses to tap into a wealth of existing corporate information. The Lenovo converged analytics platform provides the ability to implement one, two or all three of these applications based on a company’s unique requirements. An added bonus is the ability to run this platform as part of an IBM Cloud Private deployment, an on-premises private cloud designed for enterprises to develop, test and run cloud-native applications.

Lenovo Big Data Validated Design for IBM SQL Analytics

SQL Analytics on Big Data clusters is now an essential workload for all enterprises. This workload is gaining adoption both for its features and its performance. Spark SQL enables organizations to realize the benefits of real-time analytics and gain faster insights in today’s competitive business environment.

This solution supports the deployment of either Apache Spark SQL or IBM Big SQL on a distributed cluster of Lenovo ThinkSystem SR630 management nodes and ThinkSystem SR650 data nodes. With its in-memory processing engine and support for running SQL analytics, the solution provides significant performance and cost advantages for analyzing massive datasets. The cluster is combined with Intel NVMe drives and Lenovo NE10032 100Gbps high-speed data networking switches.

Lenovo Big Data Validated Design for IBM Db2 Warehouse

This solution is designed and optimized to deliver predictable performance for an IBM Db2 Warehouse infrastructure and accelerate its deployment. This solution is based on the IBM Db2 Warehouse analytics software, which is a software-defined data warehouse for private clouds and virtual private clouds that support Docker container technology. The solution is deployed on a cluster of Lenovo ThinkSystem SR630 management nodes and Lenovo ThinkSystem SR650 data nodes optimized for data warehouse workloads. Networking is supported by Lenovo ThinkSystem NE10032 100Gbps high speed data networking switches.

This solution is designed to support the IBM Data Science Experience (DSX) Local software, a new IBM data science platform that gives data scientists and data engineers an out-of-the-box, on-premises enterprise solution with a suite of data science tools. The solution is deployed on a cluster of Lenovo ThinkSystem SR650 data and management nodes optimized for machine learning and deep learning workloads. Networking is supported by the Lenovo ThinkSystem G8272 10Gbps data networking switches.

Lenovo Solution for SAP Data Hub – Unified, Scalable Analytics

The Lenovo Solution for SAP Data Hub is a critical starting point to enable a successful Digital Transformation. The solution enables unified analytics of multi-layer, multi-vendor, multi-location environments (on-premises, cloud, hybrid cloud). In addition, it provides executive-level oversight of devices and data.

New data sources from use cases such as IoT, video imaging, mobile devices and real-time monitoring are driving the exponential growth of data. Many enterprises have difficulty storing all this data, as well as gaining business insights and competitive advantage from it. At the same time, corporate data landscapes are growing increasingly complex, making it hard and costly to capture the maximum value from the available data.

The Lenovo Solution for SAP Data Hub helps address these challenges. With it, you can perform data integration, orchestrate the movement of data, and provide governance capabilities for data across a complex and diverse data landscape. You can also use Big Data processing to create uniquely powerful data pipelines. Existing data and processes can be managed, shared, and distributed across the enterprise with seamless, unified, and enterprise-ready monitoring and landscape management capabilities.

Lenovo Validated Design for Splunk Enterprise

The Lenovo Big Data Validated Design for Splunk is designed to provide an analytic platform for machine data. Applications, sensors, systems, web servers and other technology infrastructure generate data every second of the day. Splunk Enterprise software takes this machine data and provides a unified way to organize and extract real-time insights. With Splunk's ability to bring in insights from your relational databases, you can get value from the full spectrum of your data. The Lenovo product portfolio serves as the high performance foundation for this solution with the Lenovo SR650 rack server and Lenovo ThinkSystem network switches with speeds up to 100Gbps.