This past year has been heralded as the Year of Artificial Intelligence, in which so-called “deep learning” technologies based on neural networks have revolutionized everything from image recognition to voice transcription to machine translation. At the same time, another revolution of sorts has also been at work: an acceleration of the trend towards open sourcing software and hardware designs, while data becomes the lifeblood that increasingly drives the online world.

Last month Google open sourced its TensorFlow machine learning platform, purpose-built for deep learning applications. Not to be outdone, last Thursday Facebook open sourced the hardware designs and specifications for its proprietary neural network computing platform. Yet what makes both of these announcements so remarkable is that they represent not throw-away public relations stunts, but rather the opening of the very production tools the two companies use to power all of their own services. Put another way, TensorFlow is not a “light” shareware preview; it is the fully developed software that Google itself uses internally for all of its own research. Joining the fray, startup OpenAI, co-chaired by Elon Musk and Sam Altman and backed by more than one billion dollars in commitments, intends to develop deep learning platforms and make them freely available to the research community.

Yet, the lifeblood that powers these algorithms and the software packages and hardware platforms that run them is data. Neural networks, like the rest of their machine learning brethren, require vast archives of training data with sufficient diversity in their composition to allow the algorithms to properly discern and distinguish the contours of the categorizations they are attempting to learn.

Unlike the rest of the open source ecosystem, in which having the source code to an application allows you to build upon and extend it, the world of machine learning is driven by data. In fact, IBM’s acquisition of the Weather Company’s digital and data assets, announced in October, was intended to give IBM’s Watson machine learning algorithms access to high-velocity, high-volume global weather information, rather than specific software packages or algorithms.

This means that merely having the same software as Google running on the same hardware as Facebook, or even access to whole ecosystems like OpenAI’s, is of limited use without the data to train them. At the same time, opening these designs offers considerable benefits to Google and Facebook. For Google, if its software becomes the standard tool taught in computer science courses, it gains legions of potential future employees fully trained on its software, while research teams extend and enhance that software free of charge. For Facebook, the more companies that purchase machines built to its design, the lower Facebook’s own costs become as the market grows.

This represents a cataclysmic shift in which hardware schematics and software source code that formerly were the most fiercely protected secrets a company possessed are now openly shared. In the cloud world of today, hardware is increasingly commoditized and software is simply a click away on GitHub.

The relative infancy of the deep learning world means that a great deal of the leading expertise is shared with the academic world, where openness and publication are key tenets. Both Google and Facebook actively encourage their engineers to publish their latest innovations in the academic literature and cultivate strong relationships with the academic world.

In contrast, Apple faces much more of an uphill battle in its quest to apply deep learning algorithms to improve services like its voice recognition and automated assistant. Unlike its peers, Apple has focused on devices over data, meaning it lacks the vast in-house data archives needed to robustly train neural networks. Its history of strict secrecy and non-disclosure agreements is also making it difficult for the company to hire some of the top engineers and collaborate more closely with the academic community.

In the AI world of the future, software and hardware will become commodity building blocks, with the real power lying in the vast archives of data powering the online world. In this world, data is as valuable as gold.