JSON and XML for Big Data Engineers

JavaScript Object Notation (JSON) was specified and popularized by Douglas Crockford as a lightweight data format, based on a subset of JavaScript syntax, that is easy for both humans and machines to read and write. JSON is generally considered terse compared with other interchange formats such as XML. Once you are familiar with it, even complex JSON data structures are fairly easy to read. Although JSON is based on a subset of the JavaScript programming language, it is language independent: parsers and generators exist for virtually every mainstream language.
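To make the language-independence point concrete, here is a minimal sketch in Python using the standard-library json module. The record and its field names are hypothetical, invented only for illustration; the point is that nested objects, arrays, booleans, and nulls survive a round trip through JSON text.

```python
import json

# Hypothetical record, invented for this illustration.
record = {
    "id": 42,
    "name": "sensor-a",
    "tags": ["temperature", "indoor"],
    "location": {"lat": 12.97, "lon": 77.59},
    "active": True,
    "last_error": None,
}

# Serialize the Python dict to JSON text (True -> true, None -> null).
text = json.dumps(record, indent=2)

# Parse the JSON text back into Python objects.
parsed = json.loads(text)

print(text)
print(parsed["location"]["lat"])
```

The same text could be read by a JSON parser in Java, Scala, or any other language, which is what makes the format useful for moving data between systems.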

The flexibility of XML has made it increasingly prevalent in programming environments. In the traditional Unix® world, configuration files are usually plain-text files with tab-delimited name/value pairs or colon-separated fields; in the open source world, by contrast, configuration files are often XML documents. Most well-known application servers also use XML-based configuration files, and the Apache Ant build tool relies on XML build files to define its targets and tasks.
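As a rough illustration of what an XML-based configuration file looks like to a program, the sketch below parses a small Ant-style build file with Python's standard xml.etree.ElementTree module. The project, target names, and directories are made up for the example, and list_targets is a hypothetical helper, not part of Ant.

```python
import xml.etree.ElementTree as ET

# A small Ant-style build file; the project and target names are made up.
BUILD_XML = """\
<project name="demo" default="compile">
  <target name="init">
    <mkdir dir="build"/>
  </target>
  <target name="compile" depends="init">
    <javac srcdir="src" destdir="build"/>
  </target>
</project>
"""


def list_targets(xml_text):
    """Return the default target name and a dict describing each target."""
    root = ET.fromstring(xml_text)
    targets = {}
    for target in root.findall("target"):
        targets[target.get("name")] = {
            # get() returns None when the attribute is absent.
            "depends": target.get("depends"),
            # Each child element of a target is one task.
            "tasks": [task.tag for task in target],
        }
    return root.get("default"), targets


default_target, targets = list_targets(BUILD_XML)
print(default_target)
for name, info in targets.items():
    print(name, info)
```

Because the structure is explicit in the markup, the same few lines work for any file that follows this shape, which is harder to guarantee with ad hoc delimited text files.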

A tremendous amount of data in the business world and the scientific community is stored in neither JSON nor XML. To give you some perspective, by some estimates roughly 80% to 90% of all software programs in the early 1990s were written in either COBOL or Fortran™ (and NASA scientists were still using Fortran in 2004). Data integration and migration can therefore be a complex problem. The movement toward XML as a standard for data representation is intended to simplify the exchange of data between systems. You probably already know that XML is ubiquitous in the Java world, yet you might still be asking yourself: what's all the fuss about XML? In broad terms, XML is to data what relational theory is to databases; both provide a standardized mechanism for representing data.

A nontrivial database schema consists of a set of tables with some type of parent/child (or master/detail) relationship, so the data can be viewed hierarchically. An XML document also represents data in a parent/child relationship. One important difference is that database schemas can model many-to-many relationships, such as the one that exists between a student entity and a class entity. XML documents are strictly hierarchical: each element has exactly one parent, all under a single root node, so relationships are one-to-many. People sometimes make the analogy that XML is to data what Java is to code; both are portable, which helps you avoid the problems inherent in proprietary systems.
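The student/class difference can be sketched in a few lines of Python. The data below is invented for illustration, and to_xml is a hypothetical helper. The relational side stores the many-to-many relationship once, in a join table; the XML side has to pick one parent (here, the class), so a student enrolled in two classes is repeated under each of them.

```python
import xml.etree.ElementTree as ET

# Relational view: two entity tables plus a join table (hypothetical data).
students = {1: "Ana", 2: "Ben"}
classes = {10: "Math", 20: "Physics"}
enrollments = [(1, 10), (1, 20), (2, 10)]  # (student_id, class_id)


def to_xml(students, classes, enrollments):
    """Flatten the many-to-many data into a single-rooted XML tree."""
    root = ET.Element("school")
    for class_id, title in classes.items():
        class_el = ET.SubElement(root, "class", id=str(class_id), title=title)
        for student_id, enrolled_class in enrollments:
            if enrolled_class == class_id:
                # The student is copied under every class they attend.
                ET.SubElement(
                    class_el,
                    "student",
                    id=str(student_id),
                    name=students[student_id],
                )
    return root


root = to_xml(students, classes, enrollments)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Ana (id 1) is enrolled in two classes, so she appears twice in the tree.
print(len(root.findall(".//student[@id='1']")))
```

An alternative, not shown here, is to list students once and have each class refer to them by id, which avoids the duplication but moves the work of resolving references to the consuming program.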
