

Guest post written by Franz Aman

Franz Aman is chief marketing officer and senior VP of Business Strategy at SGI.


A recent survey conducted by Capgemini and the Economist Intelligence Unit raises questions about how “data-driven” today’s enterprises truly are. The study shows that companies and organizations are struggling with the enormous volumes of data, often of poor quality, that they collect and create, and many are wrestling to free that data from organizational silos. Moreover, almost 55 percent of the survey’s global respondents say the need for Big Data management is not recognized at senior levels of their organizations. Corporations and C-level executives need to understand that merely collecting and storing Big Data doesn’t add value to a company’s bottom line.

In contrast, the U.S. government is moving rapidly to exploit the full potential of Big Data. In March the Obama Administration launched a $200 million Big Data initiative. The objective of this initiative, one of the largest public technology investments in recent history, is to analyze Big Data and achieve advances in sectors such as security, healthcare, education, the environment and the sciences. Already, the government’s Big Data initiative is helping the Army perform real-time analysis of the huge volumes of intelligence information crucial to saving the lives of soldiers in war zones. For example, the Army deployed a Hadoop cluster in Afghanistan. Known as a network connection point, the cluster is housed in a 20-foot shipping container that can be dropped into a remote area. This technology gives battlefield commanders the ability to do real-time analysis of intelligence reports filed from around the world.

For the last six years, a major government postal carrier has also been leveraging Big Data technologies for fraud detection. The organization scans over four trillion pieces of mail every year, checking for duplicates and fraudulent postage. Indeed, the U.S. Federal Government’s use of Big Data may provide some valuable lessons for enterprises that are struggling to extract value from Big Data resources.
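To make the duplicate-scanning idea concrete, here is a minimal Python sketch of one common technique for this kind of check: a Bloom filter, which keeps memory bounded even when the stream of mailpiece identifiers runs into the billions. The carrier’s actual system isn’t described in this article, so the barcode values and filter sizing below are hypothetical.

```python
# Minimal sketch of duplicate-postage detection using a Bloom filter.
# A Bloom filter answers "possibly seen before" / "definitely new" in
# constant time and fixed memory; the IDs and sizes here are hypothetical.
import hashlib

class BloomFilter:
    def __init__(self, size_bits=10_000_000, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

seen = BloomFilter()
for barcode in ["0420123456789", "0420987654321", "0420123456789"]:
    if seen.might_contain(barcode):
        # May be a false positive, so flag for review rather than reject.
        print("possible duplicate, flag for review:", barcode)
    else:
        seen.add(barcode)
```

The trade-off is that a Bloom filter can produce occasional false positives but never false negatives, which is why flagged items go to a review queue rather than being rejected outright.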

Hiring a Skilled Workforce

Government organizations are hiring “quants,” people destined to become Big Data specialists: scientists with unusual combinations of math, science and IT backgrounds. Government agencies are investing in their futures, providing on-the-job training to develop expert skills such as “open-source analysis” of websites and social media networks to detect the recruiting strategies and information pipelines of extremist organizations. From the Army’s hiring of Big Data specialists to NASA’s appointment of a principal scientist for data mining, the federal government is investing heavily in talent acquisition and development.

Extracting value from Big Data requires trained experts who understand semantics, statistics, algorithms and analytics. Such experts are currently hard to find. According to a recent McKinsey study, the U.S. is facing a shortage of talent with the expertise to understand and make decisions based on Big Data.

Ironically, the private sector today is competing with the government to acquire Big Data talent. So it is time for businesses to target their talent-training investments at Big Data. One tried-and-true approach is to partner with universities and colleges to ensure that the next generation of workers is well versed in Big Data technologies. Until then, Big Data expertise will come at a premium price.

Big Data in the Data Center

Having built the largest Big Data deployments in the data center to date, government agencies have discovered the benefits of system density and strong management capabilities. The government understands how to move and manage data at very large scale while deriving an information advantage that can literally mean the difference between life and death.

So when Big Data grows too big for standard commercial computer systems, what is the best way to go? Do you scale up to a high-performance computing system, or scale out by adding nodes to a distributed computing network?

For enterprises, the most strategic and economically viable option in the long run will be the scale-up approach. While this decision is certainly tempered by the needs of the company, scaling up lets enterprises provision accurately for their current Big Data needs, buying only limited hardware up front and adding racks cost-effectively as data volumes increase.

A complete Big Data solution should support enterprises from the ingest phase through analysis and archiving, and should grow with the company as data demands dictate.

For example, during the ingest phase the Big Data solution should be capable of gulping massive volumes of data at high velocity. This can include everything from video, to RFID and sensor signals, to massive volumes of social media content from the likes of Facebook and Twitter (Twitter alone now carries more than 400 million Tweets per day). Ingest can happen either in parallel into a cluster or into a scale-up, in-memory system capable of taking in 4TB of data, roughly the equivalent of the Library of Congress, in three to four seconds.
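As a quick sanity check on that figure, moving 4TB in three to four seconds implies an aggregate bandwidth on the order of a terabyte per second. The short sketch below just works that arithmetic; the 64-node cluster size is hypothetical, used only to show what the per-node share would look like in the parallel-ingest case.

```python
# Back-of-the-envelope check on the ingest figure quoted above:
# 4 TB in 3-4 seconds implies roughly 1.0-1.3 TB/s aggregate bandwidth.
TB = 1e12  # terabyte in bytes (decimal)

data_bytes = 4 * TB
nodes = 64  # hypothetical cluster size, for illustration only
for seconds in (3, 4):
    aggregate = data_bytes / seconds
    print(f"{seconds}s window: {aggregate / TB:.2f} TB/s aggregate, "
          f"{aggregate / nodes / 1e9:.1f} GB/s per node across {nodes} nodes")
```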

Once the data has been ingested, the next critical step is storage. To gain maximum value, data may need to be pre-processed in memory and then stashed away for further analysis, including trending and comparisons over time. This requires fast disk, often petabytes of it, together with software that makes it possible to find the data down the road. Zero-watt disk archives offer the benefit of staying online, keeping data accessible at all times while sleeping when not in use, thereby keeping power bills low.
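One way to picture the archiving side is a simple age-based tiering policy: data that hasn’t been touched in months migrates from fast disk to the powered-down archive tier. This is a minimal sketch of the idea only; the paths and the 90-day threshold are hypothetical, and real archive software would track access in a catalog rather than relying on filesystem timestamps.

```python
# Illustrative age-based tiering: files idle for 90 days move from the
# fast "hot" pool to a low-power archive mount. Paths are hypothetical.
import os
import shutil
import time

HOT_TIER = "/data/hot"          # hypothetical fast-disk pool
ARCHIVE_TIER = "/data/archive"  # hypothetical zero-watt archive mount
IDLE_SECONDS = 90 * 24 * 3600   # migrate after 90 idle days

now = time.time()
for name in os.listdir(HOT_TIER):
    path = os.path.join(HOT_TIER, name)
    # Note: access times are unreliable on volumes mounted with noatime.
    if os.path.isfile(path) and now - os.path.getatime(path) > IDLE_SECONDS:
        shutil.move(path, os.path.join(ARCHIVE_TIER, name))
```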

When it comes to analysis, enterprises are often faced with two classic problems. Either they have to find the “needle in the haystack,” in which case a Hadoop scale-out system will do the trick, or they have to understand complex relationships requiring graph algorithms, in which case a scale-up system might be the answer.
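The “needle in the haystack” case maps naturally onto Hadoop’s streaming interface, where a mapper reads records on standard input and emits only the ones that match. A minimal sketch, assuming newline-delimited text records and a hypothetical “FLAGGED” marker:

```python
#!/usr/bin/env python3
# Minimal needle-in-the-haystack mapper for Hadoop Streaming. Hadoop runs
# one copy of this script per input split; the record layout and the
# "FLAGGED" marker are hypothetical.
import sys

NEEDLE = "FLAGGED"  # hypothetical marker we are scanning for

for line in sys.stdin:
    if NEEDLE in line:
        # With zero reducers, the matching records are the job's output.
        sys.stdout.write(line)
```

Run under Hadoop Streaming with -numReduceTasks 0, the matches land directly in the output directory while the cluster scans all input splits in parallel, which is exactly the scale-out pattern described above.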

Often all we want is a single-bit answer: yes or no. Sometimes we need deeper understanding, and visualization becomes the only way to grasp complex patterns easily. And more frequently than ever before, we want to take immediate action and transact on Big Data, where NoSQL databases are quickly becoming the tool of choice.
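As an illustration of that single-bit, transact-in-place pattern, here is a minimal sketch against a key-value NoSQL store. It assumes a Redis server running locally and the open-source redis Python client; the key and member names are hypothetical.

```python
# Hedged sketch of the "single-bit answer" pattern on a key-value store.
# Assumes a Redis server at localhost:6379 and the `redis` client library.
import redis

r = redis.Redis(host="localhost", port=6379)

# Transact on each incoming event: record the card seen this hour...
r.sadd("cards_seen_this_hour", "card:4111-xxxx")

# ...and answer the yes/no question in constant time at the decision point.
is_repeat = r.sismember("cards_seen_this_hour", "card:4111-xxxx")
print("seen before this hour?", bool(is_repeat))
```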

As companies begin to make bigger investments, and as advanced technologies and a skilled workforce become more available, the era of Big Data and Big Data analysis is just getting started. In the next five years, we will see businesses and industries transform just as they did when they stumbled into the digital era. Learning from the government, imagine the effect Big Data technologies will have on the healthcare industry when it comes to readmissions and survival rates, or the impact of instantaneous fraud detection on consumer sentiment in retail. Whether in the public or the private sector, the possibilities are endless and limited only by our creativity.