How Governments Can (and Should) Use Hadoop

Some of the core functions of government—both state and federal—are keeping citizens safe, representing constituents, and ensuring a healthy, efficient infrastructure. As shrinking budgets threaten to jeopardize the resources available to the public sector, it is increasingly hard for governments to meet their ever-changing responsibilities.

That’s where Apache Hadoop comes in. This open-source framework for distributed storage and processing helps fill the gaps, enabling government, intelligence, and defense agencies and their contractors to handle data-intensive tasks while minimizing costs.

Case Studies

From pinpointing terrorist threats to monitoring road conditions, Hadoop is a powerful and comprehensive storage and processing platform that helps the government meet its mandates. Below is just a sampling of big data Hadoop use cases in government.

Gauging Public Opinion

Analytics jobs running on Hadoop can scan and analyze social media posts, tweets, and instant messages. This allows analysts to get a handle on public sentiment about bills, motions, and other government actions. While more direct methods such as surveys may be effective for smaller groups of people, a broader metric is needed to ensure an accurate read of the citizenry. Hadoop can store and process vast amounts of this data quickly and at low cost.
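As a rough illustration of what such a job might look like, the sketch below tallies sentiment per bill in the map-and-reduce style that Hadoop jobs follow. It runs locally in plain Python rather than on a cluster. The keyword lists, the hashtag convention for bills, and the function names are all illustrative assumptions, not part of Hadoop or of any agency's actual pipeline.

```python
from collections import defaultdict

# Hypothetical keyword lists; a real analysis would use a trained sentiment model.
POSITIVE = {"support", "great", "approve"}
NEGATIVE = {"oppose", "bad", "reject"}


def map_post(post):
    """Map step: emit (bill_tag, score) pairs for one post."""
    words = [w.strip(".,!?").lower() for w in post.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    # Assumption: posts reference bills with hashtags such as "#hb101".
    return [(w, score) for w in words if w.startswith("#")]


def reduce_scores(pairs):
    """Reduce step: sum the scores emitted for each bill tag."""
    totals = defaultdict(int)
    for tag, score in pairs:
        totals[tag] += score
    return dict(totals)


def run_job(posts):
    """Run the map step over every post, then reduce the combined output."""
    pairs = [pair for post in posts for pair in map_post(post)]
    return reduce_scores(pairs)
```

On a real cluster, the map step would run in parallel across many machines and the framework would group the pairs by tag before the reduce step. The logic per post and per tag would stay about this simple.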

Protecting Critical Networks from Internal and External Threats

Although much of the data stored on government servers is considered “exhaust,” some is highly sensitive and may pertain to issues of national security. To maintain data integrity, it is imperative that vulnerabilities and threats are identified and blocked. The difficulty lies in spotting threats amid benign data and correspondence. Hadoop’s processing power lets government agencies more easily zero in on glitches, intrusions, malware, and other threats.

The broader approach that this kind of analysis supports is described in a Lockheed Martin paper, “Intelligence-Driven Computer Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains.” Its central argument is that network defenses tailored to their adversaries can create an intelligence “feedback loop,” enabling defenders to establish information superiority and decrease the adversary’s odds of success with each subsequent intrusion attempt.

It’s also essential to prepare for internal threats. Large-scale analysis of network and access logs on Hadoop can contribute to “information superiority” by helping agencies detect malicious users before they compromise or crash government networks.
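To make the idea of separating threats from benign traffic concrete, here is a minimal sketch that flags source addresses with repeated failed logins. The log line format, the `LOGIN_FAILED` event name, and the threshold are assumptions chosen for illustration. A production system would run equivalent logic as a distributed job over far larger logs.

```python
from collections import Counter


def flag_suspicious(log_lines, threshold=3):
    """Return source addresses with at least `threshold` failed logins."""
    failures = Counter()
    for line in log_lines:
        # Assumed line format: "<timestamp> <source_ip> <event>"
        parts = line.split()
        if len(parts) == 3 and parts[2] == "LOGIN_FAILED":
            failures[parts[1]] += 1
    # Sort so the result is stable regardless of the order of the log.
    return sorted(ip for ip, count in failures.items() if count >= threshold)
```

Counting per source is the same group-and-aggregate pattern a MapReduce job uses, which is why this kind of check scales well on Hadoop.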

Preventing Fraud & Waste

Using Apache Hadoop to uncover fraudulent benefits claims, one federal agency reduced ETL time from nine hours to just 60 minutes. Thanks to this increased efficiency, the agency was able to create new data models focused on fraud, waste, and abuse. Data processing was accelerated by a factor of three, and searches were expanded to include additional legacy systems.

Efficiency is key to preventing the misuse of government programs. By speeding up data processing, Hadoop helps agencies strengthen their fraud and waste prevention efforts.

Identifying Terrorist Threats on Social Media

Terrorist networks often evade detection by communicating through social media sites. By analyzing the vast amounts of data generated by social networks, analysts using Hadoop can not only identify potential terrorist threats but also pinpoint accomplices. With suitable scanning and filtering algorithms, the platform can provide agencies with actionable evidence against users with malicious intent, while protecting innocent users.

Decreasing Budget Pressures by Offloading Expensive Workloads

Hadoop can assist government agencies by reducing data storage costs and by providing nonstop access to information.

Largely due to scrutiny and budgetary pressure from the federal government, many agencies are seeking digital storage at a lower cost. Hadoop allows them to offload certain data sets at a better value. The system is structured to allow agencies to easily access data without compromising their day-to-day operations and without increasing expenses.

Crowdsource Reporting for Repairs to Roads and Public Infrastructure

Citizens often accuse their local governments of placing too low a priority on pothole repair and other infrastructure problems. In most cases, the city simply isn’t aware of the problem. With up-to-date reports, repairs can be made faster and more efficiently.

Ground sensors, imaging, and photos captured by citizens are all effective ways to gauge the condition of roads and infrastructure. Quickly responding to problems is critical to ensuring a city’s safety, aesthetics, and function. Hadoop can store all of that data for prioritization and rapid response.
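One simple way to turn stored reports into a priority list is to group them by location and rank locations by report count. The sketch below assumes each report is a dictionary with `lat` and `lon` fields and buckets reports into a coarse grid by rounding coordinates. Both the field names and the rounding precision are illustrative choices.

```python
from collections import Counter


def grid_cell(lat, lon, precision=3):
    """Round coordinates so that nearby reports fall into the same cell."""
    return (round(lat, precision), round(lon, precision))


def prioritize(reports, top_n=3):
    """Count reports per cell and return the most-reported cells first."""
    counts = Counter(grid_cell(r["lat"], r["lon"]) for r in reports)
    return counts.most_common(top_n)
```

A real prioritization would likely also weigh severity, road traffic, and the age of each report. Report count alone is only a starting point.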

Responding to “Open Records” & Freedom of Information Requests

While open records acts entitle citizens to receive requested data in a timely manner, that data is often scattered across several platforms, and IT teams may have trouble locating and compiling all of it quickly. Hadoop can improve efficiency and accountability by storing multiple data sets, retaining them for years, and combining them to meet specific data requests.
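Combining data sets to answer a request amounts to a join on a shared key. The following sketch assumes every data set is a list of dictionaries that share a hypothetical `case_id` field. Real records systems rarely share keys this cleanly, so treat it as an outline of the idea rather than a working integration.

```python
def compile_response(request_id, *datasets):
    """Merge the fields of every record matching request_id across data sets."""
    combined = {}
    for dataset in datasets:
        for record in dataset:
            # Assumption: "case_id" is the key shared by all data sets.
            if record.get("case_id") == request_id:
                combined.update(record)
    return combined
```

For example, calling `compile_response("A1", permits, emails)` would return one dictionary holding the permit fields and the email fields recorded for case `A1`, or an empty dictionary if nothing matches.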