Want to increase data scientist productivity? Use IBM PowerAI

Data scientists surface new and valuable insights from a wide variety of relational, semi-structured and unstructured data sources. This ‘magic’ comes from combining modern accelerated IT infrastructure with powerful machine learning (ML) and deep learning (DL) algorithms. Data scientists often hold advanced academic degrees and bring deep skills and experience spanning multiple programming languages and supporting tools. Unfortunately, they can still struggle to stay productive and to reduce the time they spend on low-value tasks.

Based on our team’s work on hundreds of engagements, we have observed three major inhibitors that reduce data scientists’ productivity and cause unnecessary loss of time and money:

Difficulty accessing or using data science tools, AI frameworks and modern accelerated hardware delays AI project start-up and slows a data scientist’s ability to deliver early results from new proofs of concept.

Lack of access to high-quality training data limits data scientists’ ability to reach the required accuracy with the chosen ML/DL algorithms.

Hardware and software infrastructure that is too slow or too expensive is a major hindrance, particularly when training and fine-tuning new ML or DL models on the associated big data.

IBM PowerAI was designed from the ground up with the next generation of data scientists in mind. Our goal is to create an enterprise software distribution of the open source machine learning / deep learning frameworks and then add value and support around this core.

Simplicity: IBM PowerAI includes the most popular deep learning frameworks, including all required dependencies and files, precompiled and ready to deploy. PowerAI Enterprise software and the accelerated Power servers it runs on are fully supported by IBM technical support. Our pre-packaged integrations, IBM support, and performance benefits can save data scientists valuable time and significantly increase their productivity, especially during the critical start-up phase of a new project.

Unique capabilities: IBM PowerAI includes a library called SnapML that GPU-accelerates common machine learning algorithms such as logistic regression, linear regression, and support vector machines (SVMs). With PowerAI’s distributed deep learning, data scientists can scale a single TensorFlow job across hundreds of GPUs in tens of servers with 95% scaling efficiency. Large model support lets a model use system memory in addition to GPU memory with little to no performance impact, making significantly larger and more accurate deep learning models possible.
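To make the 95% scaling figure above concrete, here is a minimal Python sketch of how such a number is usually computed. It assumes the figure means scaling efficiency, that is, measured speedup divided by the number of GPUs. The function names and the throughput numbers are hypothetical and chosen only to illustrate the arithmetic; they are not PowerAI APIs or IBM benchmark results.

```python
def scaling_efficiency(throughput_1gpu: float, throughput_ngpu: float, n_gpus: int) -> float:
    """Return the fraction of ideal linear speedup achieved on n_gpus.

    Throughput can be in any consistent unit, e.g. images per second.
    """
    if n_gpus < 1 or throughput_1gpu <= 0:
        raise ValueError("n_gpus must be >= 1 and throughput_1gpu must be > 0")
    speedup = throughput_ngpu / throughput_1gpu  # how many times faster than one GPU
    return speedup / n_gpus                      # 1.0 would be perfect linear scaling


def effective_gpus(n_gpus: int, efficiency: float) -> float:
    """Return how many 'perfectly used' GPUs a job is worth at a given efficiency."""
    return n_gpus * efficiency


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    one_gpu = 100.0      # images/s on a single GPU
    many_gpu = 24320.0   # images/s on 256 GPUs
    eff = scaling_efficiency(one_gpu, many_gpu, 256)
    print(f"efficiency: {eff:.2f}, effective GPUs: {effective_gpus(256, eff):.1f}")
```

Under this reading, a job running on 256 GPUs at 95% efficiency does the work of roughly 243 ideally scaled GPUs, so nearly all of the added hardware goes toward shortening training time.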

PowerAI is an open platform: Data scientists, ISVs, business partners, systems integrators, and other developers and clients can build on the software and extend it in new ways that make it more effective for clients.