
If there is one subset of machine learning that spurs the most excitement, that seems most like the intelligence in artificial intelligence, it’s deep learning. Deep learning models, also known as deep neural networks, power complex pattern-recognition systems that deliver everything from automated language translation to image identification.

Deep learning holds enormous promise for analyzing unstructured data. There are just three problems: It’s hard to do, it requires large amounts of data, and it uses lots of processing power. Naturally, great minds are at work to overcome these challenges.

What’s now brewing in this space isn’t just a clash for supremacy between competing deep learning frameworks, such as Google’s TensorFlow and Baidu’s Paddle. Rivalry between software frameworks is a given in almost any corner of IT.

The newest part of the story is about hardware versus software. Will the next big advances in deep learning come by way of dedicated hardware designed for training models and serving predictions? Or will better, smarter, and more efficient algorithms put that power into many more hands without the need for a hardware assist? Finally, will deep learning become accessible to the rest of us, or will we always need computer science PhDs to put this technology to work?

Microsoft’s CNTK 2.0: Taking the fight to TensorFlow

Leave it to Microsoft to assume the role of rival. Its pushback against Google on the deep learning front comes in the form of the Cognitive Toolkit, or CNTK for short. The 2.0 revision of CNTK challenges TensorFlow on multiple fronts. CNTK now provides a Java API, allowing more direct integration with the likes of the Spark processing framework, and supports code written for the popular neural network library Keras, which is essentially a front end for TensorFlow. Thus Keras users can transition gracefully away from Google’s solution and toward Microsoft’s.
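That Keras compatibility is low-friction in practice: Keras 2.x let users pick the backend via an environment variable (or the keras.json config file) before Keras is imported. A minimal sketch, assuming both Keras and CNTK are installed:

```python
# Sketch: selecting CNTK as the Keras backend (Keras 2.x convention).
# Assumes Keras and CNTK are both installed; Keras defaults to TensorFlow.
import os

os.environ["KERAS_BACKEND"] = "cntk"

# Keras reads KERAS_BACKEND at import time, so the import must come after
# the variable is set; existing Keras model code then runs unchanged on CNTK.
# import keras
# print(keras.backend.backend())  # "cntk"
```

The same switch works in reverse, which is part of why the backend abstraction made Keras a neutral ground between the competing frameworks.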

But Microsoft’s most direct and meaningful challenge to TensorFlow was making CNTK faster and more accurate, and providing Python APIs that expose both low-level and high-level functionality. Microsoft even went so far as to draw up a list of reasons to switch from TensorFlow to CNTK, with those benefits at the top.

Speed and accuracy aren’t just bragging points. If Microsoft’s system is faster than TensorFlow by default, it gives people options beyond simply throwing more hardware at the problem, such as accelerating TensorFlow with Google’s custom (and proprietary) TPU processors. It also means third-party projects that interface with both TensorFlow and CNTK, such as Spark, will gain a boost. TensorFlow and Spark already work together, courtesy of Yahoo, but if CNTK and Spark offer more payoff for less work, CNTK becomes an appealing option in all of the places Spark has already conquered.

Graphcore and Wave Computing: The hardware’s the thing

One of the downsides to Google’s TPUs is that they’re available only in the Google cloud. For those already invested in GCP, that might not be an issue, but for everyone else (and there’s a lot of “everyone else”) it’s a potential blocker. Alternatives such as Nvidia’s general-purpose GPUs are available with fewer strings attached.

Several companies have recently unveiled specialized silicon that outperforms GPUs for deep learning applications. Startup Graphcore has a deep learning processor, a specialized piece of silicon designed to process the graph data used in neural networks. The challenge, according to the company, is to create hardware optimized to run networks that recur or feed into each other and into other networks.

One of the ways Graphcore has sped things up is by keeping the model for the network as close to the silicon as possible, and avoiding round trips to external memory. Avoiding data movement whenever possible is a common approach to speeding up machine learning, but Graphcore is taking that approach to another level.

Wave Computing is another startup offering special-purpose hardware for deep learning. Like Graphcore, the company believes GPUs can be pushed only so far for such applications before their inherent limitations reveal themselves. Wave Computing’s plan is to build “dataflow appliances,” rackmount systems using custom silicon that can deliver 2.9 petaops of compute (note that’s “petaops” for fixed-point operations, not “petaflops” for floating-point operations). Such speeds are more than 30 times the 92 teraops provided by Google’s TPU.
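The gap between those two numbers is simple to check, with the caveat that both are vendor-quoted fixed-point figures rather than independent benchmarks:

```python
# Back-of-envelope comparison of the quoted fixed-point throughput figures.
# Both numbers are vendor claims, not independently benchmarked results.
wave_teraops = 2.9 * 1000   # 2.9 petaops per appliance, expressed in teraops
tpu_teraops = 92            # first-generation Google TPU figure cited above

ratio = wave_teraops / tpu_teraops
print(f"{ratio:.1f}x")      # roughly 31.5x
```

Whether that ratio survives contact with real workloads, where memory bandwidth and model structure matter as much as raw ops, is exactly what independent benchmarks would need to show.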

Claims like that will need independent benchmarks to bear them out, and it isn’t yet clear whether the price per petaop will be competitive with other solutions. But Wave is ensuring that, price aside, prospective users will be well supported: TensorFlow will be the first framework supported by the product, with CNTK, Amazon’s MXNet, and others to follow.

Brodmann17: Less model, more speed

Whereas Graphcore and Wave Computing are out to one-up TPUs with better hardware, other third parties are out to demonstrate how better frameworks and better algorithms can deliver more powerful machine learning. Some are addressing environments that lack ready access to gobs of processing power, such as smartphones.

Brodmann17’s approach, according to CEO and co-founder Adi Pinhas, is to take existing, standard neural network modules and use them to create a much smaller model. Pinhas said the smaller models amount to “less than 10% of the data for the training, compared to other popular deep learning architectures,” but require around the same amount of training time. The end result is a slight trade-off of accuracy for speed: faster prediction, but also lower power consumption and a smaller memory footprint.

Don’t expect to see any of this delivered as an open source offering, at least not at first. Brodmann17’s business model is to provide an API for cloud solutions and an SDK for local computing. That said, Pinhas did say, “We hope to widen our offering in the future,” so the commercial-only offering may be just the initial step.

Databricks, the company founded by Apache Spark’s creators, is attacking the problem from the software side. Its Deep Learning Pipelines project approaches the integration of deep learning and Spark from the perspective of Spark’s own ML Pipelines. Spark workflows can call into libraries like TensorFlow and Keras (and, presumably, CNTK as well now). Models for those frameworks can be trained at scale in the same way Spark does other things at scale, by way of Spark’s own metaphors for handling both data and deep learning models.
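The ML Pipelines metaphor being borrowed here is straightforward: a workflow is a chain of stages, each consuming the previous stage’s output, so a deep learning model can slot in as just another stage. A minimal pure-Python sketch of that chaining idea (the class names are illustrative stand-ins, not Spark’s actual API):

```python
# Illustrative sketch of the pipeline-of-stages metaphor.
# These classes are stand-ins, not Spark's ML Pipelines API.

class Featurizer:
    """Stand-in for a stage that turns raw inputs into feature values,
    e.g. a pretrained network used as a feature extractor."""
    def transform(self, rows):
        return [x * 0.5 for x in rows]

class Classifier:
    """Stand-in for a stage that maps features to class labels."""
    def transform(self, rows):
        return [1 if x > 0 else 0 for x in rows]

class Pipeline:
    """Chains stages so each consumes the previous stage's output."""
    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows

pipeline = Pipeline([Featurizer(), Classifier()])
print(pipeline.transform([2.0, -1.0, 0.5]))  # [1, 0, 1]
```

In the real project, the featurizer stage wraps a deep learning model and Spark distributes the work across the cluster, but the composition model the user sees is the same.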

Many data wranglers are already familiar with Spark and already working with it. To put deep learning in their hands, Databricks is letting them start where they are, rather than forcing them to figure out TensorFlow on its own.

Deep learning for all?

A common thread through many of these announcements and initiatives is how they are meant to, as Databricks put it in its own press release, “democratize artificial intelligence and data science.” Microsoft’s own line about CNTK 2.0 is that it is “part of Microsoft’s broader initiative to make AI technology accessible to everyone, everywhere.”

The inherent complexity of deep learning isn’t the only hurdle to be overcome. The entire workflow for deep learning remains an ad-hoc creation. There is a vacuum to be filled, and the commercial outfits behind all of the platforms, frameworks, and clouds are vying to fill it with something that resembles an end-to-end solution.

The next vital step won’t just be about finding the one true deep learning framework. From the look of it, there is room for plenty of them. It will be about finding a single consistent workflow that many deep learning frameworks can be a part of—wherever they may run, and whoever may be behind them.


Copyright 2017 IDG Communications. ABN 14 001 592 650. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of IDG Communications is prohibited.