Archives: big data

The knowledge graph was introduced by Google in 2012. It presents the information you are searching for more directly than a page of SERPs (search engine results pages) does. The knowledge graph for businesses (which we at Casanova refer to as the “business showcase”) is huge for representing your business interests online. If your business doesn’t appear in the knowledge graph, you are seriously losing out. How do you get into the knowledge graph? The three major search engines, Google, Bing, and Yahoo, each have their own knowledge graph (under different names) and their own methods for entering it. Google My Business is the way to enter the knowledge graph for Google. It allows you to enter business information and to get a Google+ page for your business, which does wonders for your SEO. Bing Places is the way to enter the knowledge graph for Bing. These tools are paramount to your business’s online presence and should not be overlooked by any business.

The knowledge graph is aimed at semantic search with three main goals: answer, converse, and anticipate. The knowledge graph seeks to become a robot that humans can talk to in order to get information and to learn. Where do you stand with understanding the knowledge graph and how it relates to your business?

Overview

While the term “big data” is often considered a buzzword, the intention behind the term and the concept itself are not quite so new. Simply put, “big data” refers to gathering a large volume of information and storing it for later processing, specifically in data sets so large that traditional methods for organizing and processing the data prove inadequate. One might argue that the creation of libraries was one of the first practices of what we now call “big data,” as that quantity of data in written form far exceeded what anyone until then would have encountered, and it forced the first librarians to develop new schemas to organize it. While in the distant past this information might have been measured in the number of scrolls or books stored, we now measure it in terabytes and petabytes, and occasionally even exabytes. Suffice it to say, when we speak of “big data,” we speak of data sets so large that they prove problematic for the conventional means of managing data.

Big Data: A Constantly Moving Target

Because of the nature of its definition, when to use the term is not something that can be pinned down. There is no stable minimum size above which one can say that a data set officially qualifies as “big data.” It depends on the current levels of processing power, the nature of the data itself, and what the end product is supposed to be. The exponential growth of data over time complicates matters further.

As technology advances, the rate at which data is produced increases, and as this production rate increases, advanced technology can harness this data to advance technology at an even faster rate. It is a self-accelerating cycle.

We’re seeing this culminate (for now, anyway) in the internet of things. Data is being produced and consumed by an ever-growing list of devices via the internet, and this data is often captured by companies to form data sets of unprecedented size. We’re seeing two major forces working to expand this data.

First, the adoption of internet-connected embedded systems spreads awareness of their existence, even to “non-geeks,” further increasing their adoption. Second, this adoption drives down cost, which, in turn, also leads to further adoption. The result is an ever-accelerating increase in the amount of data produced and transmitted.

The other side of this is that as technology adapts to big data, data sets that were previously considered nearly intractably large become, after some time, trivial to store and even to manage.

We can see this even on a consumer scale. Only ten years ago, the idea that a consumer could even store 500 GB on a personal computer would have seemed laughable. Now it’s considered a little on the smaller side for a home PC. The search features on consumer operating systems have also incorporated indexing to better cope with the markedly increased amount of data consumers can now store on their PCs, among other adaptations and innovations.

So What is This Used For?

Glad you asked. Most commonly, big data is used in three major applications:

Descriptive statistics

Predictive analytics

Machine learning

Descriptive statistics is a discipline that takes a data set and uses quantitative analysis to describe features of the data set itself. Very commonly encountered descriptive statistics are the ones used to summarize the performance of athletes. Average points scored per game and batting average are both examples of descriptive statistics.
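To make that concrete, here is a minimal sketch in Python of the two athlete statistics mentioned above. The numbers are made up for illustration, and the helper name `batting_average` is our own, not part of any library.

```python
from statistics import mean, median, stdev

# Hypothetical box-score data for one player (invented for illustration).
points_per_game = [12, 18, 25, 9, 16]
hits, at_bats = 27, 90


def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at-bats, the usual definition of batting average."""
    return hits / at_bats if at_bats else 0.0


# Each entry describes the data set itself; nothing here predicts anything.
summary = {
    "mean_points": mean(points_per_game),
    "median_points": median(points_per_game),
    "stdev_points": stdev(points_per_game),
    "batting_average": round(batting_average(hits, at_bats), 3),
}
print(summary)
```

Note that every value in `summary` is computed purely from data that already exists, which is what separates descriptive statistics from the predictive work described next.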

Predictive analytics, on the other hand, is a discipline that uses data to construct models that are themselves used to predict the future. Generally these predictions are informed by the trends identified by descriptive statistics.
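As a rough illustration, the simplest predictive model is a straight trend line fitted to past observations and extended one step into the future. The sketch below uses ordinary least squares, which is only one of many possible modeling choices. The season data and the names `fit_line` and `forecast_season_6` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: a player's average points per game over five seasons.
seasons = [1, 2, 3, 4, 5]
avg_points = [10.0, 12.0, 14.0, 16.0, 18.0]

# Fit the trend on the past, then extend it to a season we have not seen yet.
slope, intercept = fit_line(seasons, avg_points)
forecast_season_6 = slope * 6 + intercept
print(forecast_season_6)
```

Real predictive models are usually far richer than a single line, but the shape of the work is the same: summarize the past, then extrapolate from it.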

Machine learning is a field of computer science that has its roots in artificial intelligence, particularly pattern recognition. The goal of machine learning is to enable computer programs to “learn” without being explicitly programmed to do so, generally by recognizing patterns in data and using these patterns to predict results. This field can itself be broken down into three major categories:

Supervised learning: A “teacher” gives the computer program a training data set along with the desired output for each input. The goal of the machine learning algorithm becomes finding a general rule that maps the inputs to the desired outputs.

Unsupervised learning: As the name implies, there is no explicit “teacher.” The algorithm first must uncover structure within the data. Sometimes that is, in fact, the purpose of using this method: teasing out hidden structures within ostensibly unstructured data.

Reinforcement learning: This is sort of the hot-cold game for computers. There is no “teacher”; rather, the program must learn by trial and error to maximize some value called the “reward.”
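The three categories above can each be sketched in a few lines of Python. The techniques chosen here (a nearest-neighbor classifier, a two-cluster k-means, and an epsilon-greedy bandit) are our own picks for illustration; each category contains many other algorithms. All data and function names are hypothetical, and the payouts in the last example are fixed rather than noisy to keep the sketch short.

```python
import random


# --- Supervised: the "teacher" supplies labelled examples. ---
def nearest_neighbor(train, x):
    """Predict the label of x from the closest labelled training point."""
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label


# Hypothetical labelled data: (hours of practice, outcome).
labelled = [(1.0, "miss"), (2.0, "miss"), (8.0, "hit"), (9.0, "hit")]


# --- Unsupervised: no labels, so the algorithm looks for structure. ---
def two_means(values, rounds=10):
    """Split 1-D values into two clusters with a tiny k-means (k = 2)."""
    low, high = min(values), max(values)  # initial cluster centres
    for _ in range(rounds):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not near_high:  # all values identical: nothing to split
            break
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high


# --- Reinforcement: no teacher, only a reward after each action. ---
def epsilon_greedy(rewards, steps=100, epsilon=0.1, seed=0):
    """Learn which option ("arm") pays best from rewards alone."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)
    pulls = [0] * len(rewards)
    for step in range(steps):
        if step < len(rewards):
            arm = step  # try every arm once before exploiting
        elif rng.random() < epsilon:
            arm = rng.randrange(len(rewards))  # explore a random arm
        else:
            arm = max(range(len(rewards)), key=estimates.__getitem__)  # exploit
        reward = rewards[arm]  # fixed payout per arm, for simplicity
        pulls[arm] += 1
        # Running average of the rewards seen so far for this arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return max(range(len(rewards)), key=estimates.__getitem__)


print(nearest_neighbor(labelled, 7.5))
print(two_means([1.0, 1.2, 0.8, 9.0, 9.5, 10.0]))
print(epsilon_greedy([0.2, 0.8, 0.5]))
```

The difference between the three lies in what the program is handed: labelled answers in the first case, raw values in the second, and nothing but a score after each attempt in the third.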

Big Data and the Future

Going forward, big data will likely only increase in prominence. With applications ranging from targeted advertising to cancer research, the utility and power it holds are too attractive to ignore. But as with anything that has the potential to do so much good, its potential for great evil is equally large. Already we’re seeing big data used by the Chinese government to collect detailed data on the behaviors of its citizenry, enabling it to conduct Orwellian surveillance. However, it’s important to remember that big data itself is neither good nor bad. It simply is. As with any other nascent discipline, we must decide as a society where the ethical boundaries lie.

SEO, in the world of 2016, is an ever more pressing issue. Digital real estate is very important, especially to a business. The measure of a website’s success is typically based on ROI. One of the best investments that a business can make is in its digital real estate. SEO not only improves search rankings, but also improves the usability of a site, along with its functionality and design. An SEO’s insistence on search and page rankings allows for a more concise site that is readable by the search engines and thus more readable by humans. Creating a business website without utilizing SEO can be damaging to the business.