Archive for January, 2014

The first big challenge to overcome with any new NoSQL database deployment is figuring out how to deploy the cluster in an environment that lets you scale as needed within a single data center and even across multiple data centers. To save cash, many customers make the mistake of trialing the product on cheap hardware with limited RAM across clusters that are inadequate for the application.

We think there’s a better way to run your evaluation. At GoGrid, we’ve made it possible to deploy a 5-node Riak cluster on beefy, high-performance machines with the click of a button. Check out the specs we’re providing as an orchestrated deployment using our 1-Button Deploy™ technology:

Once the first cluster is deployed, you can point-and-click to add more nodes as you need them. Geek out for a moment on what you can do with this technology: You can run a user/session store for your application, use it to target and serve advertising, perform MapReduce operations, or any number of other things with just a few clicks of the mouse. And you can do it all in 4 easy steps.

Step 1: Log in to GoGrid

To get started, log in to your GoGrid account at https://my.gogrid.com to access the management console. If you don’t yet have an account, go ahead and create one: visit www.gogrid.com and click the Get Started button in the upper right-hand corner of the screen.

The concept of Big Data escapes many, but those who take the time to understand and use it benefit significantly. It’s predicted that Big Data will become huge in 2014, but companies that make business intelligence solutions a priority early in January will undoubtedly gain an edge over their competition. From data-based decisions to improving customer satisfaction, here are 5 ways Big Data can improve your business in 2014.

How Big Data will change your business in 2014

1. Access trends through social media
Big Data and the analytics that decipher it can easily zero in on a particular trend in social media and capitalize on it in ways that are advantageous to your business. Qubole uses the example of a location-based trend: by analyzing unstructured data, a business can detect when a store is getting a notable amount of buzz on social media. Using that information, the business’s marketing team can then send out a blast on social media encouraging users to visit the store and giving them an incentive to make a purchase. That’s just one example of capitalizing on social media trends with Big Data.

2. Rely less on gut feelings
In the past, many business decisions have been made based solely on instinct. Increasingly, however, companies are relying on data collected through market research or online trends. Big Data gives companies even more information to go on, enabling them to make business decisions that are more factually based and considerably less risky. If you have actual data, you can weigh real-life pros and cons before plunging into the deep end.

3. Stay ahead of the competition
Because Big Data is expected to go mainstream in the next year, businesses can get a head start on the competition by familiarizing themselves with business intelligence solutions. The sooner businesses start to use trends found in unstructured data, the larger the head start they’ll have. It’s strategies like this that can turn underdog companies into market leaders and keep top businesses at the forefront of their industry.

4. Make customers more satisfied
One of the biggest ways Big Data is changing businesses is by improving customer service. According to The Wall Street Journal, Netflix began using unstructured data for this purpose in 2008. After an outage, the company used the data to spot problem areas and improve the technology. It also used the data to inform the viewing suggestions it offers customers. Big Data lets Netflix know where the most traffic is on its website and helps on-site engineers plan for better network capacity. Now, Netflix is a top company for on-demand Internet streaming.

Maintaining data security in the healthcare sector is hard. Although all businesses worry about securing confidential data, their concerns don’t compare to the burden on companies that manage personal health information and must comply with the Health Insurance Portability and Accountability Act (HIPAA) and other relevant regulations. Unfortunately, the sensitive nature of these assets makes them even more desirable to cybercriminals. The result: Patient health information is being targeted more frequently and more aggressively than ever before. Fortunately, the evolving IT landscape has provided a way to address these threats: proactive security monitoring to identify and mitigate potential risks and encryption to protect the data itself.

Outside attacks are only one aspect of the problem, however: Negligent insiders are also putting their organizations at risk. Studies have shown that roughly 94% of healthcare firms have experienced at least 1 data breach within the past 2 years. Because these incidents cost the industry upwards of $7 billion per year, administrators must proactively seek strategies that cut down the chances of unwanted security problems.

Financial repercussions of a data breach

Due to the regulations governing personal health information, the reputation damage and bottom-line costs of a data breach are often exacerbated by compliance fines. What is more troubling is that breaches are increasing in both frequency and cost. Experts believe that the financial repercussions of data breaches increased by $400,000 between 2010 and 2012, with more than half of companies losing $500,000 or more in 2012. With the price tag expected to rise 10 percent year-over-year through 2016, businesses must plan ahead to limit their exposure.
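To get a feel for what 10 percent year-over-year growth means in practice, here is a rough Python sketch of the compounding arithmetic. The $500,000 baseline is borrowed from the 2012 loss threshold mentioned above purely for illustration, and treating 2012 as the starting year is our assumption; the function name `project_cost` is hypothetical.

```python
def project_cost(baseline, annual_rate, years):
    """Compound a baseline cost at a fixed annual rate.

    Returns a list with the baseline followed by one value per year.
    """
    return [round(baseline * (1 + annual_rate) ** y, 2) for y in range(years + 1)]


# Illustrative assumption: a $500,000 loss in 2012 growing 10% per year to 2016.
for year, cost in zip(range(2012, 2017), project_cost(500_000, 0.10, 4)):
    print(year, f"${cost:,.0f}")
```

Under those assumptions, the same incident would cost a little over $730,000 by 2016, which is why the growth rate matters as much as the starting figure.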

To illustrate the effect of data breaches on healthcare organizations and the magnitude of the response required, we’ve put together the following infographic, “Keep Your Patient Health Info Secure in the Cloud.” Part of our series of 60-second guides, the graphic will show you in only a minute why the cloud is powering new ways to secure some of the most personal information available: details about our health.

GoGrid just launched Raw Disk Cloud Servers, the perfect choice for your Hadoop data node. These purpose-built Cloud Servers run on a redundant 10-Gbps network fabric on the latest Intel Ivy Bridge processors. What sets these servers apart, however, is the massive amount of raw storage in JBOD (Just a Bunch of Disks) configuration. You can deploy up to 45 x 4 TB SAS disks on 1 Cloud Server.

These servers are designed to serve as Hadoop data nodes, which are typically deployed in a JBOD configuration. This setup maximizes available storage space on the server and also aids in performance. There are roughly 2 cores allocated per spindle, giving these servers additional MapReduce processing power. In addition, these disks aren’t a virtual allocation from a larger device. Each volume is actually a dedicated, physical 4 TB hard drive, so you get the full drive per volume with no initial write penalty.

Hadoop in the cloud

Most Hadoop distributions call for a name node supporting several data nodes. GoGrid offers a variety of SSD Cloud Servers that would be perfect for the Hadoop name node. Because they are also on the same 10-Gbps high-performance fabric as the Raw Disk Cloud Servers, SSD servers provide low-latency private connectivity to your data nodes. I recommend using at least the X-Large SSD Cloud Server (16 GB RAM), although you may need a larger server, depending on the size of your Hadoop cluster. Because the Hadoop name node stores metadata in memory, you’ll want more RAM if you have a lot of files to process. You can use any size Raw Disk Cloud Server, but you’ll want to deploy at least 3. Also, each Raw Disk Cloud Server has a different allocation of raw disks, which is illustrated in the table below. The Cloud Server in the illustration is the smallest size that has multiple disks per Cloud Server. Hadoop defaults to a replication factor of 3, so to protect your data from failure, you’ll want to have at least 3 data nodes to distribute data. Although Hadoop attempts to replicate data to different racks, there’s no guarantee that your Cloud Servers will be on different racks.
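Because every block is stored 3 times by default, raw disk space overstates how much data a cluster can actually hold. The Python sketch below is a back-of-the-envelope estimate only: the function name `hdfs_capacity_tb` is hypothetical, and it ignores space reserved for the operating system, temporary MapReduce output, and other non-HDFS use.

```python
def hdfs_capacity_tb(data_nodes, disks_per_node, disk_tb=4, replication=3):
    """Return (raw_tb, usable_tb) for a cluster of JBOD data nodes.

    usable_tb is raw capacity divided by the replication factor; real
    clusters will hold less because some space is reserved for other uses.
    """
    if data_nodes < 1 or disks_per_node < 1 or replication < 1:
        raise ValueError("data_nodes, disks_per_node and replication must be >= 1")
    raw_tb = data_nodes * disks_per_node * disk_tb
    usable_tb = raw_tb / replication
    return raw_tb, usable_tb


# 3 data nodes, each with the maximum of 45 x 4 TB disks mentioned above.
raw, usable = hdfs_capacity_tb(data_nodes=3, disks_per_node=45)
print(f"raw: {raw} TB, usable at 3x replication: {usable:.0f} TB")
# prints "raw: 540 TB, usable at 3x replication: 180 TB"
```

With exactly 3 data nodes and a replication factor of 3, each node ends up holding a full copy of the data, so usable capacity is roughly that of a single node.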

Note that the example below is for illustrative purposes only and is not representative of a typical Hadoop cluster; for example, most Cloudera and Hortonworks sizing guides start at 8 nodes. These configurations can differ greatly depending on whether you intend to use the cluster for development, production, or production with HBase added. The differences include RAM and disk sizes (less of both for development, most likely more for HBase). Plus, if you’re thinking of using these nodes for production, you should consider adding a second name node.

I watched an interview this morning where Snapchat’s CEO was discussing the recent exposure of its users’ phone numbers and names and something he said stood out for me: “Tech businesses are susceptible to hacking attacks. You have to work really, really, really hard with law enforcement, security experts, and various external and internal groups to make sure that you’re addressing security concerns.”

I have to agree with him: It takes a lot of effort to keep up with the latest security threats and vulnerabilities, to continuously assess existing security safeguards, to open channels of communication with security peers in other organizations, and to work with local and federal law enforcement to solve common security problems. Even companies like Target that spend millions on security are clearly challenged every day to identify and remove vulnerabilities to protect their customers’ data.

The rapid growth of cloud services and cloud service providers has only added new areas of concern for organizations hoping to leverage the benefits of the cloud. Organizations must perform their due diligence in identifying the right cloud service provider for their needs—preferably one that’s had time to develop security best practices based on firsthand experience and hard-won expertise. Securing a company’s production environment requires a cloud partner that is mature and has dedicated resources to provide robust security services and products.

Consider the recent DigitalOcean security revelation that its customers can view data from a VM previously used by another customer. According to one reporter, a DigitalOcean customer “noted that DigitalOcean was not by default scrubbing user’s data from its hard drives after a virtual machine instance was deleted.” Why not? DigitalOcean confided that the deletes were taking too long to complete and resulted in potential performance degradation of its services.

I recognize that challenge because GoGrid addressed this same issue years ago. All our deleted VMs go through an automated secure scrubbing process that ensures a previous customer’s data isn’t inadvertently shared with a new customer—and we do so without impacting our production environment. Was that easy to accomplish? No, it wasn’t. In fact, it took a lot of engineering work and resources to develop the right way to secure our customers’ data without impacting performance. Taking technical shortcuts when it comes to security often results in unexpected consequences that can affect an organization’s overall security—and ultimately, its reputation.
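To make the idea of scrubbing concrete, here is a minimal Python sketch that overwrites a file-backed disk image with zeros before deleting it. This is not GoGrid’s process, which isn’t described here; the function names are hypothetical, and overwriting in place is not guaranteed to reach the underlying blocks on SSDs or copy-on-write filesystems.

```python
import os


def scrub(path, chunk_size=1024 * 1024):
    """Overwrite every byte of the file at path with zeros.

    Returns the number of bytes overwritten.
    """
    size = os.path.getsize(path)
    zeros = bytes(chunk_size)
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(zeros[:n])
            remaining -= n
        # Push the zeros out of Python's and the OS's buffers before returning.
        f.flush()
        os.fsync(f.fileno())
    return size


def scrub_and_delete(path):
    """Zero the file's contents, then remove it."""
    size = scrub(path)
    os.remove(path)
    return size
```

Even this toy version hints at the performance trade-off: the time to delete grows with the size of the disk rather than staying constant, which is why doing it without affecting production takes real engineering work.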