
File storage growth is out of control at many companies, and with it the number of network-attached storage systems. But you can fight NAS sprawl with a number of technologies.

At the end of 2008, Framingham, Mass.-based research firm IDC reported that, for the first time ever, more data was stored on network-attached storage (NAS) systems or filers than on storage-area network (SAN) storage. In addition, IDC's more recent forecasts predict an acceleration of this trend. It's not only the number of files that's growing, but their size as well.

All of this translates into more installed NAS systems. Adding more NAS systems is an understandable reaction to file growth as network-attached storage systems are typically self-contained and preconfigured for rapid installation, and are easy to implement, operate, manage and use. But most traditional NAS systems are also silos, so they contribute to NAS sprawl. The consequences of NAS sprawl can be summed up by the often-repeated adage, "I loved my first NAS filer, I really liked my second, but by my tenth I was pulling my hair out."

Five ways NAS sprawl causes problems

NAS sprawl generally creates five major IT challenges (these are the biggest; there are others as well). All of them are difficult, and all are complicated by the limited number of tasks a data storage administrator can complete in a given time.

1. System management. Even though NAS management is far simpler than SAN storage management, it still requires some care, feeding and time.

2. Managing client and application access to data. Each NAS system must be mounted on every server and workstation that requires access. Mounts are disruptive to applications, so they require scheduled downtime for the affected server applications. With more NAS systems you have more mounts -- 10 filers mounted on 20 servers means 200 mount points to track -- and that adds up to more scheduled downtime.

3. File location. Policies for file placement must be set based on performance, accessibility, age, access frequency, storage cost, availability, data protection and so forth. Policy setting is the easy part, but actually moving the files to the appropriate NAS system is a time-consuming manual data migration process. And it's an ongoing one. When the migration is done, the originating application must be re-pointed at the correct NAS system; this isn't such a big deal with a couple of NAS systems, but it's compounded as NAS systems are added.

4. NAS load balancing. Load balancing is required to get better utilization or to meet applications' performance requirements. Because load balancing is also a manual process to set up and manage, it becomes a major time sink even if you have identically configured NAS boxes.

5. Protecting, replicating and/or backing up files. Different NAS systems have different methods for snapshots, continuous data protection (CDP), mirroring and replication. Some are well integrated with common backup frameworks and virtualization platforms, such as Windows Volume Shadow Copy Service (VSS), VMware or Citrix Systems Inc.'s XenServer, but others aren't. So there are more tasks requiring more time, training and experience. Even with identical NAS systems, each one requires separate touch points for data protection setup, operation and management.
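Two of the chores above lend themselves to simple automation sketches. The following Python fragment illustrates an age-based placement policy (challenge No. 3) and greedy capacity-based load balancing (challenge No. 4); the tier names, age threshold and filer capacities are hypothetical, not taken from any product:

```python
# Hypothetical age-based placement policy (challenge No. 3): files untouched
# for 90+ days belong on a low-cost archive filer. Tier names are invented.
AGE_THRESHOLD_DAYS = 90

def place_file(last_access_epoch, now_epoch):
    """Return the NAS tier a file belongs on under the age-based policy."""
    age_days = (now_epoch - last_access_epoch) / 86400
    return "archive-nas" if age_days >= AGE_THRESHOLD_DAYS else "primary-nas"

# Hypothetical greedy load balancing (challenge No. 4): each share is placed
# on the filer with the most free capacity, largest shares first.
def assign_shares(filers, shares):
    """filers: {name: free_gb}; shares: {name: size_gb} -> {share: filer}."""
    free = dict(filers)
    placement = {}
    for share, size in sorted(shares.items(), key=lambda s: -s[1]):
        target = max(free, key=free.get)  # filer with the most free space
        placement[share] = target
        free[target] -= size
    return placement

print(place_file(0, 120 * 86400))  # archive-nas
print(assign_shares({"nas1": 500, "nas2": 800},
                    {"homes": 300, "builds": 400, "scratch": 100}))
```

In real environments these decisions also weigh availability, protection level and cost; and as noted above, the hard part is carrying out the resulting migrations, not writing the policy.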

These challenges get more difficult, take more time and make it more likely that errors will occur as NAS sprawl grows.

Technologies that can help with NAS sprawl

The industry took note of this sobering situation. The result is the current availability of four technologies designed to solve some or all of these challenges, albeit in completely different ways. They include: operating system built-ins such as Microsoft's Distributed File System (DFS) for CIFS as well as Linux/Unix automounters for NFS; file virtualization systems; clustered NAS systems; and private cloud and grid storage. A brief analysis of each of these technologies illustrates what they do and don't do to meet the aforementioned challenges.

1. Operating system built-ins

Microsoft Distributed File System (DFS) is part of Microsoft's Windows 2003 and 2008 server operating systems; DFS was developed for the small- and medium-sized business (SMB) Windows-only (CIFS) market. DFS Namespaces enables multiple file servers' shared folders to be grouped into one or more logical namespaces. Users see the namespace as a single shared folder and are automatically connected to available shared folders in the same Active Directory Domain Services site. This sidesteps the need for LAN or WAN routing. DFS Replication can automatically synchronize folders between local file servers or remote NAS systems on a wide-area network.

Linux/Unix automounters are intended for NFS users. Automounters mount and unmount directories from other systems on the network as they're needed. They get their mounting instructions from centralized maps, which can be flat files, NIS maps or sections of an LDAP directory. Automounters are far easier to use than managing multiple static NFS mounts. Automounter advantages are readily apparent when there's a service failure. If a remote file server becomes unavailable, an automounter will simply time out and unmount the directory without alarming users. With static NFS, mounts will hang until the file server is back up and running again.
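As a rough illustration of what those centralized maps look like, here's a minimal parser for the classic indirect-map line format (key, optional mount options, then host:/export); the map entries and filer names are invented:

```python
# Sample indirect automounter map text; keys, options and filers are made up.
MAP_TEXT = """\
home   -rw,soft   filer1:/export/home
tools  -ro        filer2:/export/tools
"""

def parse_map(text):
    """Parse 'key [-options] host:/export' lines into {key: (options, location)}."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blanks, comments and malformed lines
        key = fields[0]
        opts = fields[1].lstrip("-") if fields[1].startswith("-") else ""
        location = fields[-1]
        entries[key] = (opts, location)
    return entries

print(parse_map(MAP_TEXT)["home"])  # ('rw,soft', 'filer1:/export/home')
```

The automounter daemon consults exactly this kind of key-to-location mapping when a user first touches a directory, mounts it on demand, and unmounts it after an idle timeout.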

Pros

Easy integration with Linux and Unix environments

No additional licensing costs

Low upfront total costs

Eliminates server hangs when the mounted service fails

Works in conjunction with most NFS NAS systems

Solves many of the user and mount issues with multiple NFS NAS systems

Cons

Requires significant Linux or Unix expertise

Not easy to set up

Only works with NFS

Lacks built-in replication capabilities

Doesn't work with Windows (CIFS)-based NAS systems

No file-level granularity

Doesn't solve many of the management problems with multiple NAS systems

2. File virtualization systems

File virtualization systems separate the physical location of a file from the representation of that file, essentially eliminating the requirement for a user or application to know exactly where files are stored: they see only a single global namespace (GNS). Depending on how it's implemented, file virtualization allows transparent file access, load balancing, data storage tiering, file migration, and even snapshots and replication for multiple homogeneous or heterogeneous NAS systems.
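The global namespace idea can be sketched as a lookup table that maps logical paths onto physical filer volumes; the paths and volume names below are invented, and a real product resolves this inside the appliance or file system rather than in client code:

```python
# Invented global namespace: one logical tree, two physical filers behind it.
NAMESPACE = {
    "/corp/engineering": "nas3:/vol/eng",
    "/corp/finance":     "nas1:/vol/fin",
}

def resolve(logical_path):
    """Longest-prefix match of a logical path onto its physical location."""
    for prefix in sorted(NAMESPACE, key=len, reverse=True):
        if logical_path == prefix or logical_path.startswith(prefix + "/"):
            return NAMESPACE[prefix] + logical_path[len(prefix):]
    raise KeyError(logical_path)

print(resolve("/corp/finance/q3.xls"))  # nas1:/vol/fin/q3.xls
```

Migrating a share then amounts to updating the table entry; clients keep using the same logical path, which is what makes the file migrations described below transparent.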

File virtualization implementations can usually leverage Microsoft's DFS and/or Linux/Unix automounters by acting as a management layer. This allows them to automatically update the DFS Namespace to include NAS filers and file servers, while also providing common management for multiple dissimilar NAS systems. F5 Networks Inc.'s ARX file virtualization appliance also provides available disk space monitoring, while others (Avere Systems Inc.'s FXT Series and EMC Corp.'s Celerra NS with FAST) provide storage tiering.

No additional software is required to leverage DFS Namespace and Linux/Unix automounters. If the file virtualization technology fails, the file maps for Windows and mounts for Linux/Unix remain intact, allowing users and applications access to their files. Not all the file virtualization systems work with DFS or automounters, and some that do don't necessarily require them.

There are two types of file virtualization products: shared path and split path.

Shared-path file virtualization systems share the control and data path, which means that all connections to the NAS and all data to/from the NAS flow through the virtualization system. Shared-path file virtualization systems are full proxies that touch every file and every packet in the path before it's written or read.

Pros

Allows files to be migrated in real-time even when in use; the file virtualization system updates the global namespace with the new physical location of the file

Definable policies using file metadata such as file type, creation date or when last accessed

Cons

Added latency to pass through file virtualization system can be a bottleneck affecting response times and IOPS

Single point of failure; a dead-box failure cuts off all access to the NAS and/or file systems

Scalability is limited by the throughput of the shared-path file virtualization system

Split-path file virtualization systems separate the control and data paths, so the NAS connections and all data to/from the NAS don't pass through the file virtualization system. Split-path file virtualization is typically deployed as an x86 appliance connected to the LAN switch. The appliance manages the namespace to direct file requests to the appropriate NAS or file system without intercepting any packets.

Pros

Nondisruptive implementation for applications/users

Highly scalable

File virtualization system failure won't cut off access to data

Protects current investment in NAS and file systems

Relatively easy file migration

If it uses Microsoft DFS for the namespace, DFS will always have the most recent namespace configuration allowing users and applications to access their files

Heterogeneous NAS support

Easy to operate

Cons

Usually requires agents on application servers and workstations for transparent file migration; agents must be managed and maintained

Tends to be Windows (CIFS) focused with limited NFS support

Shared-path and split-path systems are typically mutually exclusive. But EMC's Rainfinity is primarily a split-path system, except when moving files, when it operates as shared path. That eliminates the need for split-path agents for file migrations while avoiding the shared-path scalability, performance and single-point-of-failure issues.

3. Clustered NAS systems

Clustered NAS systems use a distributed file system running concurrently on multiple NAS nodes. Data and metadata can be striped across both the cluster and underpinning block (direct-attached storage [DAS] or SAN) storage subsystems. Clustering also provides access to all files from any of the clustered nodes regardless of the physical location of the file. The number and location of the nodes are transparent to the users and applications accessing them.
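As a rough sketch of the striping idea, the fragment below assumes a simple round-robin layout of fixed-size chunks across nodes; real clustered NAS systems use their own distributed placement and protection algorithms:

```python
# Illustrative round-robin striping: chunk i of a file lands on node i mod N.
def stripe(file_size, chunk_size, nodes):
    """Return [(chunk_index, node)] for a file striped across cluster nodes."""
    chunks = -(-file_size // chunk_size)  # ceiling division
    return [(i, nodes[i % len(nodes)]) for i in range(chunks)]

layout = stripe(file_size=10, chunk_size=4, nodes=["node-a", "node-b", "node-c"])
print(layout)  # [(0, 'node-a'), (1, 'node-b'), (2, 'node-c')]
```

Because any node can consult the layout, a client can read the whole file from whichever node it happens to contact, which is what makes node count and location transparent.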

Although clustering appears similar to file virtualization, the key difference is that all system nodes must be from the same vendor and often configured similarly. Some exceptions to this include BlueArc Corp.'s Titan and Mercury series, and NetApp's Ontap GX.

Clustered NAS systems typically provide transparent replication and fault tolerance, so that if one or more nodes fail, the system continues functioning without any data loss. Clustered NAS systems are distinguished by their large file systems that can scale to hundreds of terabytes (or more) of addressable capacity.

Pros

Linearly scale to many nodes and high capacities, with millions to billions of managed file objects; aggregate throughput and IOPS scale independently of one another

Easy to grow

Pay-as-you-go architecture

Built-in fault tolerance

Centralized management

Easy data protection

Simple file access

Cons

Rip-and-replace solution; can't reuse current NAS systems

No support for heterogeneous NAS systems

No ability to migrate files from current NAS systems to the clustered system

Higher hardware and license costs, but may be offset by significantly lower management costs

Clustered NAS does a very good job of resolving most network-attached storage sprawl challenges. It eliminates or at least mitigates the multisystem management issue depending on the scale of the environment. User and application access is simplified with load balancing built in, and data protection and replication are also part of the architecture. Clustered NAS does fall a little short on storage tiering; it makes tiering easier, but doesn't automate the process (with the exception of EMC's Celerra NS-960 with FAST using Rainfinity).

4. Private cloud or grid storage systems

Private cloud or grid storage systems are somewhat similar to clustered NAS systems, but grid storage adds peer-to-peer clustering that enables a single image of files across geographically dispersed, long-distance and cross-domain operations.

Geographic location "awareness" adds another dimension to NAS sprawl management by centralizing control, management and access for distributed environments. Based on access performance and/or data protection policies, files are replicated and moved to the geographic location that best meets the policy. Whether you have only a few remote or branch offices or hundreds, grid or private cloud storage can make a lot of sense.
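A geography-aware placement policy of that kind might look like the following sketch, which keeps one copy at the accessing site and the next copy in a different region for protection; the site names and regions are invented:

```python
# Invented site-to-region map; a real grid system discovers this topology.
SITES = {"nyc": "us-east", "sfo": "us-west", "lon": "eu"}

def replica_sites(access_site, copies=2):
    """Place the primary copy locally, remaining copies in other regions."""
    placement = [access_site]
    home_region = SITES[access_site]
    for site, region in SITES.items():
        if len(placement) >= copies:
            break
        if site != access_site and region != home_region:
            placement.append(site)
    return placement

print(replica_sites("nyc"))  # ['nyc', 'sfo']
```

The point of the sketch is that placement follows policy, not administrator effort: adding a branch office means adding a map entry, not another manually managed filer.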

There are currently two commercially available private cloud storage systems: Bycast Inc.'s StorageGRID and EMC Atmos. The Bycast StorageGRID runs on x86 nodes that sit in front of standard DAS or SAN storage, so it can use already installed block storage. EMC Atmos also runs on x86 nodes but can only use its own JBOD storage. Bycast's product is a bit more mature with hundreds of installations and OEM deals with HP and IBM.

Pros

Same pros as clustered NAS

Same or lower cost than clustered NAS

Management of geographically dispersed locations

Distributed geographically aware access with centralized management, protection and replication of all files

Geographically aware, policy-based file replication and movement

DAS and SAN investment protection or use of very low-cost storage

Cons

Limited number of vendors with mature technology

No automated storage tiering at this time

Startup costs can be more than other technologies (but long-term costs will likely be less)

File storage growth is bordering on out of control, with many companies struggling to get a handle on their network-attached storage systems. This NAS sprawl creates serious management problems that can tax overworked IT staffs and jeopardize users' access to corporate data. But the four technologies described above are available today and can resolve many of the issues and challenges created by NAS sprawl.

Take a pragmatic approach, and implement the least amount of new technology that best meets current and forecasted requirements. That will help minimize risk, lessen the strain on CapEx and OpEx budgets, and can make a world of difference with NAS management.
