How DataSphere Uses Machine Learning to Automate Data Management

DataSphere uses machine learning to solve resource
management problems, fixing numerous other IT problems along the way. To meet service level agreements
(SLAs), enterprise IT pros need to allocate finite resources in storage
infrastructure: capacity, performance load, and anything else consumable, such as IOPS per unit of time, or how much data can be written over the media’s lifetime before it wears out. These resources have to be handed out to achieve the most value for the business. At the same time that IT is expected to keep business data humming, it is also under pressure to contain costs. Historically, IT has allocated storage resources manually, often reacting to events after they have impacted the business.

DataSphere begins with an enterprise’s existing
infrastructure and uses reinforcement learning to continually optimize the
placement and movement of data across all storage, including flash, shared, and
cloud storage, to meet SLAs. Its algorithm simulates a market economy, where
storage containers are “landlords” with floor space to lease and the data
objects (individual files) are “tenants” who are looking to lease real estate
that suits their tastes/needs.
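
To make the analogy concrete, here is a minimal sketch of that landlord/tenant matching in Python. All names, fields, and the placement rule are illustrative assumptions, not DataSphere’s actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Container:            # the "landlord"
    name: str
    free_gb: float
    max_iops: int
    price_per_gb: float     # cost of "leasing" a GB on this tier

@dataclass
class DataObject:           # the "tenant"
    name: str
    size_gb: float
    needed_iops: int

def place(obj, containers):
    """Lease space in the cheapest container that meets the tenant's needs."""
    candidates = [c for c in containers
                  if c.free_gb >= obj.size_gb and c.max_iops >= obj.needed_iops]
    if not candidates:
        return None         # contention: no landlord can host this tenant
    best = min(candidates, key=lambda c: c.price_per_gb)
    best.free_gb -= obj.size_gb
    return best
```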

Following is a simplified summary of how DataSphere’s
machine learning engine generally works:

Admins assign Service Level Objectives (SLOs) to data that define the performance, price, and protection the data requires. They can do this using predefined workflows or powerful objective expressions.
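
An SLO along these lines might be expressed as a small structured object covering those three dimensions. The schema, field names, and workflow below are hypothetical illustrations, not DataSphere’s real syntax:

```python
# Hypothetical SLO shape: performance, price, and protection targets.
gold_slo = {
    "name": "gold",
    "performance": {"min_iops": 5000, "max_latency_ms": 2},
    "price": {"max_cost_per_gb_month": 0.30},
    "protection": {"replicas": 2, "snapshot_interval_hours": 4},
}

# A predefined workflow could simply bundle an objective with a data
# selector, e.g. "apply the gold objective to the trading database".
assignment = {"slo": gold_slo, "applies_to": "path:/mnt/trading/**"}
```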

DataSphere discovers the capabilities of the storage resources and maps the active performance of the entire topology, from storage to clients. This holistic view enables DataSphere to measure the end-to-end performance of the IT infrastructure and learn how to better allocate resources to the clients that need them. It also eliminates the need to manually map the performance of the network, since DataSphere measures the net impact of all layers on performance.
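
One way to picture “measuring the net impact of all layers” is an end-to-end probe: rather than modeling client, network, and storage separately, time a real operation through the whole path. A minimal sketch, with an assumed scratch file standing in for real traffic:

```python
import os
import tempfile
import time

def probe_end_to_end(read_fn, n_trials: int = 5) -> float:
    """Median latency (seconds) of a full client-to-storage read.

    read_fn performs one real read through every layer (client, network,
    storage), so the timing captures their combined effect; no per-layer
    network model is needed.
    """
    samples = []
    for _ in range(n_trials):
        start = time.perf_counter()
        read_fn()
        samples.append(time.perf_counter() - start)
    return sorted(samples)[len(samples) // 2]

# Example: time reads of a scratch file on the mount being measured.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(4096))
    path = f.name
latency = probe_end_to_end(lambda: open(path, "rb").read())
print(f"observed end-to-end latency: {latency * 1e3:.3f} ms")
```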

Using machine learning, DataSphere maps data objects to the IT-defined objectives. When data falls out of alignment with its objectives, DataSphere makes adjustments, feeding the results of each change to the infrastructure back into the engine’s economic simulation. As it learns what data needs, DataSphere can recommend changes to objectives that better meet the business’s requirements, as well as notify admins when more performance or capacity is needed.
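
A drastically simplified version of that observe-act-learn loop might look like the following, where an exponential moving average stands in for the economic simulation, and the measurement and data-mover hooks are caller-supplied placeholders:

```python
def rebalance_step(objects, containers, perf_estimate,
                   measure_iops, move, alpha=0.2):
    """One simplified observe -> act -> learn pass (illustrative only).

    measure_iops(obj) and move(obj, container) are caller-supplied hooks;
    perf_estimate maps each container to its learned IOPS estimate.
    """
    for obj in objects:
        observed = measure_iops(obj)
        # Learn: fold the new observation into the container's estimate.
        est = perf_estimate[obj["container"]]
        perf_estimate[obj["container"]] = (1 - alpha) * est + alpha * observed
        # Act: if the object misses its objective, move it to the container
        # the current estimates rank highest, and record the new placement.
        if observed < obj["slo_min_iops"]:
            best = max(containers, key=lambda c: perf_estimate[c])
            if best != obj["container"]:
                move(obj, best)
                obj["container"] = best
```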

When contention occurs, or when DataSphere predicts it will, it can perform “financial arbitrage” to move data in ways that deliver the highest value to the business. For example, the performance of a mission-critical application should be protected over data being used by a back-end process. DataSphere determines this value, the “income” a data tenant has to spend, based on two factors: the price paid for a given service level and the penalties incurred when that service level is not met.
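
That income calculation can be sketched directly; the tenant names and dollar amounts below are invented for illustration:

```python
def income(price_paid: float, miss_penalty: float) -> float:
    """A tenant's spending power: the fee paid for its service level
    plus the penalty the business would incur if that level is missed."""
    return price_paid + miss_penalty

tenants = [
    {"name": "trading_db",  "budget": income(price_paid=100.0, miss_penalty=500.0)},
    {"name": "nightly_etl", "budget": income(price_paid=20.0,  miss_penalty=5.0)},
]

# Under contention, the scarce fast tier goes to whoever can "pay" the
# most, so the mission-critical database outbids the back-end batch job.
winner = max(tenants, key=lambda t: t["budget"])
print(winner["name"])   # -> trading_db
```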

Over time, DataSphere learns
what data’s “tastes” are and will be able to predict the needs of all data in
the system. Just as a robot using machine learning to navigate runs into fewer
and fewer objects over time, DataSphere will need to perform arbitrage less and
less often.
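
The prediction idea can be illustrated with a toy forecast: extrapolate an object’s demand history and move it before the forecast crosses the current tier’s headroom. Nothing here reflects DataSphere’s internal models:

```python
def forecast_next(history, alpha=0.3):
    """Exponentially weighted forecast of the next demand sample."""
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

# If forecast demand will exceed the current tier's headroom (1,200 IOPS
# here), migrate proactively, before arbitrage is ever needed.
iops_history = [800, 900, 1100, 1400, 1800]
if forecast_next(iops_history) > 1200:
    print("schedule proactive move to a faster tier")
```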

From the day DataSphere is installed (in just a few hours), it enables enterprises to automate the placement and movement of data without impacting applications. As DataSphere learns about the topology, from storage to clients, and about what data needs, it optimizes how data is managed to meet business goals, getting better and better over time.

With DataSphere, IT can deliver more consistent application
response times and greater application uptime, and contribute to higher profits
through increased utilization and the savings that come with reducing
overprovisioning. Want to learn more about how DataSphere can automate
intelligent data management? Connect with us at deepdive@primarydata.com to schedule
a meeting or demo.