Let’s keep things in perspective. Big data might be a wickedly complex discipline, but it isn’t entirely new. We’ve been dealing with data, structured and unstructured, for thousands of years, and even large data sets are nothing new. Some elements of the big data revolution might be new, but established best practices already exist.

“Big data is still data, and there are 30-plus years of best practices established for managing enterprise-style data that we shouldn’t throw away,” notes Jeff Pollock, vice president of product management for Oracle.

Still, putting big data to work for your company might feel like an overwhelming task. Relax. Break big data down into smaller components and follow these six best practices, and managing big data might be easier than you think.

1. Don’t Silo Your Big Data Effort

The first best practice is not to isolate big data as its own special program within your company. You might need data scientists and a team for managing big data, but treat big data as just part of your existing data management ecosystem.

Putting big data in a silo is one of the biggest mistakes that Pollock sees among enterprise customers.

“Businesses that deliberately or accidentally treat new big data initiatives as if big data somehow exists in a vacuum and is not a part of the overall data management ecosystem are creating havoc for themselves,” says Pollock. “In every area from security to governance, integration, process management or analytics – big data should be understood as part of the data ecosystem and not as a separate and special program.”

2. Incomplete Data Works, Too

Insights don’t require all the trees in the forest. While complete data sets fully analyzed are never a bad thing, businesses can make sense of incomplete, messy or large data sets by leveraging derived or summarized data.

Creating derived data sets of sampled or summarized data is a crucial best practice, according to Gartner distinguished analyst and big data expert Douglas Laney.

“Many types of analytics don’t require complete datasets,” he notes.
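As a rough illustration of what a derived data set can look like, the sketch below draws a reproducible random sample and builds a per-group summary using only the Python standard library. The `orders` records, the field names and the helper functions `sample_rows` and `summarize` are hypothetical, chosen for this example rather than taken from any particular product.

```python
import random
import statistics
from collections import defaultdict


def sample_rows(rows, k, seed=0):
    """Return a reproducible random sample of up to k rows."""
    rng = random.Random(seed)
    rows = list(rows)
    return rng.sample(rows, min(k, len(rows)))


def summarize(rows, key, value):
    """Collapse rows into one summary record (count and mean) per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        group: {"count": len(vals), "mean": statistics.fmean(vals)}
        for group, vals in groups.items()
    }


# Hypothetical order records standing in for a much larger data set.
orders = [
    {"region": "east", "amount": 10.0},
    {"region": "east", "amount": 30.0},
    {"region": "west", "amount": 5.0},
]

summary = summarize(orders, "region", "amount")  # summarized derived set
sample = sample_rows(orders, 2)                  # sampled derived set
```

Analysts can then work against `summary` or `sample` instead of the full data set, which is often enough for trend or distribution questions.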

3. Govern from the Start

While messy data can be useful for a business, big data management is nonetheless much easier when there is selection and structuring at the edge whenever possible. Good big data management includes governance upstream and right from the start of an initiative.

“Govern from the start. Rather than postponing data governance to a future phase, include it from the very start,” says Pollock at Oracle. “The consequences and costs of trying to clean up a data swamp after it’s been left to fester are much greater than the effort it takes to include proper governance from the beginning.”

Laney at Gartner also stresses this point.

“Failing to address up-stream data quality issues that originate in business processes” is a huge mistake, says Laney. “This is a result of organizations failing to recognize and treat information as an actual corporate asset.”
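One way to picture governance at the start is a validation step at the point of ingestion, so that bad records are set aside before they reach shared storage. The sketch below is a minimal example under assumed names: the `REQUIRED_FIELDS` schema, the `validate` and `ingest` functions and the sample records are all hypothetical.

```python
# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"customer_id": str, "amount": float}


def validate(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


def ingest(records):
    """Split incoming records into accepted and quarantined lists."""
    accepted, quarantined = [], []
    for record in records:
        problems = validate(record)
        if problems:
            # Keep the reasons with the record so the source can be fixed.
            quarantined.append({"record": record, "problems": problems})
        else:
            accepted.append(record)
    return accepted, quarantined


incoming = [
    {"customer_id": "c1", "amount": 19.99},
    {"customer_id": "c2"},                 # missing amount
    {"customer_id": 3, "amount": 5.0},     # wrong type for customer_id
]
accepted, quarantined = ingest(incoming)
```

Quarantining rather than silently dropping records keeps the upstream quality problem visible, which fits Laney's point about fixing issues where they originate.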

4. Assume Nothing

Hadoop and other open source tools are becoming de rigueur for big data. While these tools are quite useful, don’t treat them the same way that you treat mature enterprise products. Many of today’s big data tools are still evolving quickly and are therefore rough around the edges. As a result, be rigorous in planning and testing your tools before relying on them.

“There is a myriad of examples where project issues crop up in situations that would be unimaginable with mature software,” says Pollock. “Some first- and second-generation open source big data tools are still rudimentary and problems come up in unexpected places. Be prepared to have big data issues in areas that would never be a problem elsewhere.”

5. Use Virtual Integration Layers

Big data comes from many sources and arrives in many forms. Businesses can save themselves the pain of integration by using virtualization technology to bring together disparate big data sources with virtual integration layers.

An important best practice, according to Laney, is creating virtual integration layers that reduce the need to physically pre-integrate data while still presenting multiple sources in a single view to applications or individuals.
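To make the idea concrete, here is a minimal sketch of a virtual integration layer: sources stay where they are, and a thin view reads them on demand and presents one stream of rows. The `VirtualView` class, the source names and the inline CSV and JSON text are hypothetical stand-ins. Real data virtualization products do far more, such as query pushdown and caching.

```python
import csv
import io
import json


class VirtualView:
    """Present several sources as one stream of rows without copying them."""

    def __init__(self):
        self._sources = {}

    def register(self, name, reader):
        # reader is a zero-argument callable returning an iterable of dicts.
        self._sources[name] = reader

    def rows(self):
        # Each source is read lazily, only when the view is queried.
        for name, reader in self._sources.items():
            for row in reader():
                yield {"source": name, **row}


# Hypothetical sources in two different formats.
CSV_TEXT = "id,name\n1,Ada\n"
JSON_TEXT = '[{"id": "2", "name": "Grace"}]'


def read_csv():
    return csv.DictReader(io.StringIO(CSV_TEXT))


def read_json():
    return json.loads(JSON_TEXT)


view = VirtualView()
view.register("crm_csv", read_csv)
view.register("billing_json", read_json)

names = [row["name"] for row in view.rows()]
```

Applications query `view.rows()` and never need to know that one source is CSV and the other JSON.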

6. Create a Single Source of Truth

Think about building a single master “source of truth” for your big data, such as a data lake, so various teams within your business can define the data in the way that makes the most sense for their needs.

“Be sure to allow for multiple versions of the truth,” stresses Laney. “For example, ‘Customer’ doesn’t mean exactly the same thing to finance as it does to sales or marketing or shipping.”

By using a master data solution, you also enable easier product, customer and other linkages across multiple data sets.
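A small sketch can show how one master data set supports several versions of the truth. Here each team applies its own definition of "customer" to the same records. The `MASTER` records, the field names and the team rules in `DEFINITIONS` are invented for illustration and are not how any specific master data product works.

```python
# Hypothetical master records shared by every team.
MASTER = [
    {"id": "c1", "has_paid_invoice": True, "open_opportunity": False},
    {"id": "c2", "has_paid_invoice": False, "open_opportunity": True},
    {"id": "c3", "has_paid_invoice": True, "open_opportunity": True},
]

# Each team's own definition of "customer", expressed as a rule over
# the same master records (assumed rules, for illustration only).
DEFINITIONS = {
    "finance": lambda r: r["has_paid_invoice"],
    "sales": lambda r: r["has_paid_invoice"] or r["open_opportunity"],
}


def customers_for(team):
    """Return the ids that count as customers under a team's definition."""
    rule = DEFINITIONS[team]
    return [r["id"] for r in MASTER if rule(r)]


finance_customers = customers_for("finance")
sales_customers = customers_for("sales")
```

Because both lists come from the same master records and share the same `id` values, the differing definitions can still be linked back to one another.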

Managing big data can be complex, but these six best practices can make it easier.