Most human cognitive abilities rely on or interact with what we call knowledge. How do people navigate the world? How do they solve problems, how do they comprehend their surroundings, and on what basis do they make decisions and draw inferences? For all of these questions, knowledge, the mental representation of the world, is part of the answer.

What is knowledge? According to Merriam-Webster's online dictionary, knowledge is “the range of one’s information and understanding” and “the circumstance or condition of apprehending truth or fact through reasoning”. Thus, knowledge is a structured collection of information that can be acquired through learning, perception or reasoning.

This chapter deals with the structures, both in human brains and in computational models, that represent knowledge about the world. First, the idea of concepts and categories as a model for storing and sorting information is introduced; then the concept of semantic networks is presented and, closely related to these ideas, an attempt is made to explain the way humans store and handle information. Apart from the biological aspect, we are also going to talk about knowledge representation in artificial systems, which can be helpful tools to store and access knowledge and to draw quick inferences.

After looking at how knowledge is stored and made available in the human brain and in artificial systems, we will take a closer look at the human brain with regard to hemispheric specialisation. Since the two hemispheres differ in which type of knowledge is stored in each of them, this topic is connected not only to knowledge representation but also to many other chapters of this book. Where, for example, is memory located, and which parts of the brain are relevant for emotions and motivation? In this chapter we focus on the general differences between the right and the left hemisphere. We consider the question of whether they differ in what and how they process information, and give an overview of experiments that have contributed to scientific progress in this field.

Concepts are mental representations, and they are essential for many cognitive functions, including memory, reasoning and using and understanding language. One function of concepts is the categorisation of knowledge, which has been studied intensively. In the course of this chapter, we will focus on this function of concepts.

Imagine you wake up every single morning and start wondering about all the things you have
never seen before. Think about how you would feel if an unknown car parked in front of your
house. You have seen thousands of cars but since you have never seen this specific car in this
particular position, you would not be able to provide yourself with any explanation. Since
we are able to find an explanation, the questions we need to ask ourselves are: How are we
able to abstract from prior knowledge and why do we not start all over again if we are
confronted with a slightly new situation? The answer is easy: We categorise knowledge.
Categorisation is the process by which things are placed into groups called categories.

Categories are so-called “pointers of knowledge”. You can imagine a category as a box in which similar objects are grouped, and which is labeled with common properties and other general information about the category. Our brain not only memorises specific examples of members of a category, but also stores general information that all members have in common and which therefore defines the category. Coming back to the car example, this means that our brain does not only store what your car, your neighbors' and your friends' cars look like, but also provides us with the general information that most cars have four wheels, need to be fueled and so on. Because categorisation allows us to recognise new objects as members of a category and thus to get an immediate general picture of a scene, it saves us much time and energy that we would otherwise have to spend investigating new objects.
It helps us to focus on the important details in our environment and enables us to draw the correct inferences. To make this obvious, imagine yourself standing at the side of a road, wanting to cross it. A car approaches from the left. Now the only thing you need to know about this car is the general information provided by the category: that it will run you over if you don't wait until it has passed. You don't need to care about the car's color, number of doors and so on. If you were not able to immediately assign the car to the category "car" and infer the necessity to step back, you would get hit because you would still be busy examining the details of that specific, unknown car. Categorisation has therefore proved very helpful for survival during evolution, and it allows us to navigate quickly and efficiently through our environment.

Take a look at the following picture! You will see four different kinds of cars. They differ in
shape, color and other features, nonetheless you are probably sure that they are all cars.

What makes us so convinced about the identity of these objects? Maybe we can try to find a definition which describes all these cars. Do all of them have four wheels? No, there are some which have only three. Do all cars run on petrol? No, that is not true for all cars either. Apparently we will fail to come up with a definition. The reason for this failure is that we have to generalise to make a definition. That would perhaps work for geometrical objects, but obviously not for natural things: members of one category do not share completely identical features, which is why it is problematic to find an appropriate definition. There are, however, similarities between members of one category, so what about this similarity? The famous philosopher Ludwig Wittgenstein asked himself this question and claimed to have found a solution. He developed the idea of family resemblance: members of a category resemble each other in several ways. For example, cars differ in shape, color and many other properties, but every car somehow resembles other cars. The following two approaches determine categories by similarity.
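One way to make family resemblance concrete (in the spirit of Rosch and Mervis's later work on quantifying it) is to count, for each member, how often its features occur among the other members of the category. The members and feature lists below are invented for illustration:

```python
# Toy family-resemblance score: for each member, count how often each of
# its features also occurs in the OTHER members of the category.
members = {
    "sedan":      {"four wheels", "engine", "doors", "petrol"},
    "sports car": {"four wheels", "engine", "doors"},
    "trike car":  {"three wheels", "engine", "doors", "petrol"},
    "electric":   {"four wheels", "engine", "doors", "battery"},
}

def resemblance(name):
    """Sum, over this member's features, the number of other members sharing each feature."""
    return sum(
        sum(1 for other, feats in members.items() if other != name and f in feats)
        for f in members[name]
    )

for name in members:
    print(name, resemblance(name))  # the sedan scores highest: most shared features
```

No member shares all of its features with every other member, yet the member with the most widely shared features (here the sedan) comes out as the most "family-resembling" one.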

The prototype approach was proposed by Rosch in 1973. A prototype is an average of all members of a particular category, but it is not an actual, really existing member of the category. Even extremely varied features of members within one category can be explained by this approach. Differences among category members are represented by different degrees of prototypicality: members which resemble the prototype very strongly are highly prototypical, while members which differ from the prototype in many ways are low-prototypical. There seem to be connections to the idea of family resemblance, and indeed some experiments showed that high prototypicality and high family resemblance are strongly connected.
The typicality effect describes the fact that highly prototypical members are recognised faster as members of a category. For example, participants had to decide whether statements like “A penguin is a bird.” or “A sparrow is a bird.” are true. Their decisions were much faster for “sparrow”, a highly prototypical member of the category “bird”, than for an atypical member such as “penguin”. Participants also tend to prefer prototypical members of a category when asked to list objects of that category. Concerning the bird example, they rather list “sparrow” than “penguin”, which is a quite intuitive result. In addition, highly prototypical objects are strongly affected by priming.

The typicality effect can also be explained by a third approach, which is concerned with exemplars. Similar to a prototype, an exemplar is a typical member of a category. The difference between exemplars and prototypes is that exemplars are actually existing members of a category that a person has encountered in the past. This approach, too, involves the similarity of an object to a standard, only that the standard here consists of many examples, each one called an exemplar, rather than a single average.

Again the typicality effect can be shown: objects that are similar to many examples we have encountered are classified faster than objects which are similar to few examples. You have seen a sparrow more often in your life than a penguin, so you should recognise the sparrow faster.

For both the prototype and the exemplar approach there are experiments whose results support one approach or the other. Some researchers claim that the exemplar approach has fewer problems with variable categories and with atypical cases within categories. For example, the category “games” is quite difficult to capture with the prototype approach: how would you find an average case for all games, like football, golf and chess? The reason could be that the exemplar approach uses “real” category members and stores all the information about the individual exemplars, which can be useful when encountering other members later. Another point on which the approaches can be compared is how well they work for differently sized categories: the exemplar approach seems to work better for smaller categories, and prototypes do better for larger categories.
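The difference between the two approaches can be sketched with invented numeric feature vectors: the prototype approach compares a new item against a single averaged point per category, while the exemplar approach compares it against every stored member. Everything below (the feature dimensions, the categories, the numbers) is a toy illustration:

```python
# Minimal sketch of prototype vs. exemplar categorisation.
# Feature vectors are invented, e.g. [body size, wing length].

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def prototype(category):
    """Average all stored members into one abstract point (not a real member)."""
    n = len(category)
    return [sum(v[i] for v in category) / n for i in range(len(category[0]))]

def classify_prototype(item, categories):
    return min(categories, key=lambda c: dist(item, prototype(categories[c])))

def classify_exemplar(item, categories):
    """Compare against every stored member; the nearest exemplar wins."""
    return min(categories,
               key=lambda c: min(dist(item, ex) for ex in categories[c]))

categories = {
    "bird": [[0.2, 0.9], [0.3, 1.0], [0.25, 0.8]],   # small, long wings
    "cat":  [[0.5, 0.0], [0.6, 0.0], [0.55, 0.05]],  # larger, no wings
}

sparrow = [0.22, 0.95]
print(classify_prototype(sparrow, categories))
print(classify_exemplar(sparrow, categories))   # both agree on a typical member
```

For a typical item like the sparrow both approaches agree; they come apart mainly for atypical items that are far from the average but close to one remembered exemplar.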

Some researchers concluded that people may use both approaches: when we initially learn something about a category, we average the exemplars we have seen into a prototype. It would be very unhelpful in early learning if we already took into account all the exceptions a category has. As we get to know some of these exemplars in more detail, the stored information is strengthened.

“We know generally what cats are (the prototype), but we know specifically our own cat the
best (an exemplar).” (Minda & Smith, 2001)

Now that we know about the different approaches to forming categories, let us look at the structure of a category and the relationships between categories. The basic idea is that larger categories can be split up into smaller, more specific ones.

Rosch stated that by this process three levels of categorisation are created: the superordinate level (e.g. “animal”), the basic level (e.g. “dog”) and the subordinate level (e.g. “retriever”).

It is interesting that the loss of information from the basic level up to the superordinate level is quite large, while the gain in information from the basic level down to the subordinate level is rather small. Scientists wanted to find out whether one of these levels is preferred over the others. They asked participants to name presented objects as quickly as possible. The result was that the subjects tended to use the basic-level name, which carries the optimal amount of stored information. A picture of a retriever would therefore be named “dog” rather than “animal” or “retriever”. It is important to note that the levels differ from person to person, depending on factors such as expertise and culture.

One factor which influences our categorisation is knowledge itself. Experts pay more attention to specific features of objects in their area than non-experts do. For example, after being shown pictures of birds, bird experts tend to use the subordinate name (blackbird, sparrow) while non-experts just say "bird". The basic level in an expert's area of interest is lower than the basic level of a layperson. Knowledge and experience therefore affect categorisation.

Another factor is culture. Imagine a people living in close contact with their natural environment, who therefore have a greater knowledge about plants and the like than, for example, students in Germany. If you ask the latter what they see in nature, they use the basic level ‘tree’; if you give the same task to the people closer to nature, they will tend to answer in terms of lower-level concepts such as ‘oak tree’.

There is evidence that some areas in the brain are selective for different categories, but it is not very probable that there is a corresponding brain area for each category. Results of neurophysiological research point to a kind of double dissociation between living and non-living things, and fMRI studies have found evidence that these are indeed represented in different brain areas. It is important to note that there is nevertheless much overlap between the brain areas activated by different categories. Moreover, going one step further down to the physical level, there is a connection to mental categories as well: there seem to be neurons which respond better to objects of a particular category, so-called “category-specific neurons”. These neurons fire not in response to a single object only, but to many objects within one category. This leads to the idea that many neurons probably fire when a person recognises a particular object, and that these combined firing patterns may represent the object.

The "Semantic Network approach" proposes that the concepts of the mind are arranged in networks, in other words, in a functional storage system for the 'meanings' of words. Of course, the concept of a semantic net is very flexible. In a graphical illustration of such a semantic net, the concepts of our mental dictionary are represented by nodes, each of which in this way represents a piece of knowledge about our world.

The properties of a concept can be placed, or "stored", next to the node representing that concept. Links between the nodes indicate the relationships between the objects. The links can not only show that there is a relationship; they can also indicate the kind of relation, for example by their length.

Every concept in the net stands in a dynamic relation to other concepts, which may have prototypically similar characteristics or functions.

Semantic Network according to Collins and Quillian with nodes, links, concept names and properties.

One of the first scientists who thought about structural models of human memory that could
be run on a computer was Ross Quillian (1967). Together with Allan Collins, he developed
the Semantic Network with related categories and with a hierarchical organisation.

In the picture on the right hand side, Collins and Quillian's network with added properties at each node is shown. As already mentioned, the skeleton nodes are interconnected by links, and concept names are attached to the nodes. As in the section "Hierarchical Organisation of Categories", general concepts are at the top and more particular ones at the bottom. By looking at the concept "car", one gets the information that a car has four wheels, an engine and windows, and furthermore that it moves around, needs fuel and is man-made.

These pieces of information must be stored somewhere, and it would take too much space if every detail were stored at every level. So the information about cars in general is stored at the basic level, and further information about specific cars, e.g. a BMW, is stored at the lower level, where you do not need the fact that the BMW also has four wheels if you already know that it is a car. This way of storing shared properties at a higher-level node is called cognitive economy.

To avoid redundancy, Collins and Quillian conceived of this as an inheritance principle for information: information that is shared by several concepts is stored in the highest parent node containing it, so that all child nodes below the information bearer can also access the information about these properties. However, there are exceptions. Sometimes a special car has not four wheels but three; this specific property is stored in the child node.
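Cognitive economy with exceptions can be sketched as a small hierarchy in which a property lookup walks up the parent links unless the node stores a local exception. The node names and properties below are invented for illustration:

```python
# Sketch of cognitive economy: properties stored once at a parent node are
# inherited by its children, while exceptions are stored locally.

class Node:
    def __init__(self, name, parent=None, **props):
        self.name, self.parent, self.props = name, parent, props

    def lookup(self, prop):
        """Walk up the hierarchy until the property is found (inheritance)."""
        if prop in self.props:
            return self.props[prop]
        if self.parent is not None:
            return self.parent.lookup(prop)
        return None

vehicle = Node("vehicle", moves=True)
car     = Node("car", parent=vehicle, wheels=4, engine=True)
bmw     = Node("BMW", parent=car)                      # inherits everything from "car"
trike   = Node("three-wheeler", parent=car, wheels=3)  # local exception

print(bmw.lookup("wheels"))    # inherited from "car": 4
print(trike.lookup("wheels"))  # exception stored at the child node: 3
print(bmw.lookup("moves"))     # inherited from the top node: True
```

The "four wheels" fact is stored exactly once, at the "car" node, and the three-wheeler overrides it locally, just as the text describes.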

The logical structure of the network is convincing, since it can be shown that the time needed to retrieve a concept correlates with distances in the network. This correlation was demonstrated with the sentence-verification technique: in experiments, participants had to answer statements about concepts with "yes" or "no". It actually took longer to say "yes" when the concept-bearing nodes were further apart.

The phenomenon that adjacent concepts are activated is called spreading activation. These concepts are far more easily accessed by memory; they are "primed". This was studied and supported by David Meyer and Roger Schvaneveldt (1971) with a lexical-decision task: participants had to decide whether pairs of letter strings were both real words. They were faster at confirming real word pairs when the concepts of the two words were close together in the assumed network.
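Spreading activation of this kind can be sketched over a toy network: activating one concept raises the activation of its neighbours, with the effect decaying over distance. The concepts, links and the decay factor below are invented for illustration:

```python
# Sketch of spreading activation in a small semantic network.
links = {
    "bread":  ["butter", "bakery"],
    "butter": ["bread", "milk"],
    "bakery": ["bread"],
    "milk":   ["butter", "cow"],
    "cow":    ["milk"],
}

def spread(start, decay=0.5, threshold=0.1):
    """Breadth-first spread of activation from one concept node."""
    activation = {start: 1.0}
    frontier = [start]
    while frontier:
        nxt = []
        for node in frontier:
            for neigh in links.get(node, []):
                a = activation[node] * decay
                if a > activation.get(neigh, 0.0) and a >= threshold:
                    activation[neigh] = a
                    nxt.append(neigh)
        frontier = nxt
    return activation

act = spread("bread")
print(act)  # "butter" ends up more strongly primed than the distant "cow"
```

A nearby concept like "butter" receives high residual activation (it is primed), while a distant one like "cow" receives much less, mirroring the faster lexical decisions for related word pairs.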

While having the ability to explain many questions, the model has some flaws.

The typicality effect is one of them. It is known that "reaction times for more typical members of a category are faster than for less typical members" (MITECS). This contradicts the assumption of Collins and Quillian's model that the distance in the net is responsible for reaction time. It was also determined experimentally that some properties are stored at specific nodes rather than inherited, so cognitive economy is called into question. Furthermore, there are examples of faster concept retrieval even though the distances in the network are longer.

These points led to another version of the Semantic Network approach: Collins and Loftus Model.

Collins and Loftus (1975) tried to overcome these problems by, among other extensions, using shorter or longer links depending on relatedness, and by adding interconnections between formerly not directly linked concepts. The former hierarchical structure was also replaced by a structure individual to each person. As shown in the picture on the right, the new model represents interpersonal differences acquired during a person's lifetime; they manifest themselves in the layout and the varying lengths of the links between the same concepts.

An example: The concept "vehicle" is connected to car, truck or bus by short links, and to fire
engine or ambulance with longer links.

After these enhancements the model is so powerful that some researchers criticised it for being too flexible. In their opinion the model is no longer a scientific theory, because it is not falsifiable. Furthermore, we do not know how long these links are within us. How should they be measured, and could they be measured at all?

As noted above, every concept in a semantic net stands in a dynamic relation to other concepts with prototypically similar characteristics or functions, and the neural networks in the brain are organised similarly. Furthermore, it is useful to include the features of "spreading activation" and "parallel distributed activity" in such a semantic net in order to handle the complexity of our very sophisticated environment.

The connectionists did this by modelling their networks on the neural networks of the nervous system. Every node of the diagram represents a neuron-like processing unit. These units can be divided into three subgroups: input units, which are activated by stimulation from the environment; hidden units, which receive signals from input units and pass them on to output units; and output units, whose pattern of activation represents the initial stimulus. Excitatory and inhibitory connections between units, just like synapses in the brain, allow input to be analysed and evaluated. For computing the outcome of such a system, it is useful to attach a certain "weight" to each input of the connectionist system, mimicking the strength of a stimulus in the human nervous system.

It needs to be emphasised that connectionist networks are not models of how the nervous system actually works; they are a hypothetical approach to representing categories in network patterns. Another name for the connectionist approach is the Parallel Distributed Processing (PDP) approach, since processing takes place along parallel lines and the output is distributed across many units.

First, a stimulus is presented to the input units. The links then pass the signal on to the hidden units, which distribute it to the output units via further links. In the first trial the output units show a wrong pattern; after many repetitions the pattern finally becomes correct. This is achieved by back propagation: the error signals are sent back to the hidden units and the signals are reprocessed. During these repeated trials, the "weights" of the connections are gradually calibrated on the basis of the error signals in order to finally obtain the right output pattern. After a correct pattern has been achieved for one stimulus, the system is ready to learn a new concept.
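The training loop described above can be sketched as a minimal network with back propagation. Everything here is a toy illustration: the network size (two input, two hidden, one output unit), the learning rate and the target pattern (the output should mirror the first input unit) are invented, and the network has no bias terms:

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# A tiny 2-2-1 network. Weights start out random; back propagation
# adjusts them a little after each trial.
w_ih = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
w_ho = [random.uniform(-1, 1) for _ in range(2)]

def forward(x):
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_ih]
    out = sigmoid(sum(w * h for w, h in zip(w_ho, hidden)))
    return hidden, out

def train(data, epochs=2000, lr=0.5):
    for _ in range(epochs):
        for x, target in data:
            hidden, out = forward(x)
            # Error signal at the output unit, sent back towards the hidden units.
            d_out = (out - target) * out * (1 - out)
            for j in range(2):
                d_hidden = d_out * w_ho[j] * hidden[j] * (1 - hidden[j])
                w_ho[j] -= lr * d_out * hidden[j]        # calibrate weights ...
                for i in range(2):
                    w_ih[j][i] -= lr * d_hidden * x[i]   # ... based on the error

# Invented toy pattern: the output should mirror the first input unit.
data = [([1, 0], 1), ([0, 1], 0), ([1, 1], 1), ([0, 0], 0)]
error_before = sum((forward(x)[1] - t) ** 2 for x, t in data)
train(data)
error_after = sum((forward(x)[1] - t) ** 2 for x, t in data)
print(f"squared error before: {error_before:.3f}, after: {error_after:.3f}")
```

The first pass produces a wrong pattern; the repeated error-driven weight adjustments gradually reduce the output error, which is exactly the learning process described in the text.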

The PDP approach is important for knowledge representation studies. It is far from perfect, but it is making progress. The process of learning enables the system to make generalisations, because similar concepts create similar patterns: after learning one car, the system can recognise similar patterns as other cars, or may even predict what other cars look like. Furthermore, the system is protected against total failure. Damage to single units will not cause the system's complete breakdown, but will only delete the patterns which use those units. This is called graceful degradation and is often found in patients with brain lesions. These two arguments lead to a third: the PDP approach is organised similarly to the human brain, and some effective computer programs developed on this basis have been able to predict the consequences of human brain damage.

On the other hand, the connectionist approach is not without problems. Formerly learned concepts can be overwritten by new concepts. In addition, PDP cannot explain processes more complex than concept learning. Neither can it explain the phenomenon of rapid learning, which does not require extensive training. It is assumed that rapid learning takes place in the hippocampus, while conceptual and gradual learning is located in the cortex.

In conclusion, the PDP approach can explain some features of knowledge representation
very well but fails for some complex processes.

There are different theories on how living beings, especially humans, encode information into knowledge. We may think of diverse mental representations of the same object. When reading the written word "car", we call this a discrete symbol: it matches all imaginable cars and is therefore not bound to one special vehicle. It is an abstract, or amodal, representation. This is different if we instead see a picture of a car, perhaps a red sports car. Now we speak of a non-discrete symbol, an imaginable picture that appears in front of our inner eye and that fits only certain cars of sufficiently similar appearance.

The Propositional Approach is one possible way to model mental representations in the human brain. It works with discrete symbols which are strongly connected among each other. The use of discrete symbols necessitates clear definitions of each symbol, as well as information about the syntactic rules and the context dependencies under which the symbols may be used. The symbol "car" is only comprehensible to people who understand English and have seen a car before and therefore know what a car is. The Propositional Approach is an explicit way to explain mental representation.

Definitions of propositions differ between fields of research and are still under discussion. One possibility is the following:
"Traditionally in philosophy a distinction is made between sentences and the ideas underlying those sentences, called propositions. A single proposition may be expressed by an almost unlimited number of sentences. Propositions are not atomic, however; they may be broken down into atomic concepts called 'Concepts'."

In addition, mental propositions deal with the storage, retrieval and interconnection of information as knowledge in the human brain. There is much discussion about whether the brain really works with propositions, or whether it processes information to and from knowledge in another way, or perhaps in more than one way.

One possible alternative to the Propositional Approach is the Imagery Approach. Since here the representation of knowledge is understood as the storage of images as we see them, it is also called the analogical or perceptual approach. In contrast to the Propositional Approach it works with non-discrete symbols and is modality-specific; it is an implicit approach to mental representation. The picture of the sports car implicitly includes seats of some kind. If it is additionally mentioned that they are off-white, the image changes to a more specific one. How two non-discrete symbols are combined is not as predetermined as it is for discrete symbols: the picture of the off-white seats may exist without the red car around it, just as the red car existed before without the off-white seats.
The Imagery and the Propositional Approaches are also discussed in chapter 8.

Computational knowledge representation is concerned with how knowledge can be represented symbolically and how it can be manipulated in automated ways. Almost all of the theories mentioned above evolved in symbiosis with computer science. On the one hand, computer science uses the human brain as an inspiration for computational systems, on the other hand, artificial models are used to further our understanding of the biological basis of knowledge representation.

Knowledge representation is connected to many other fields related to information processing, e.g. logic, linguistics, reasoning, and the philosophical aspects of these fields. In particular, it is one of the crucial topics of Artificial Intelligence, as it deals with information encoding, storing and usage for computational models of cognition.

There are three main points that need to be addressed with regard to computational knowledge representation: The process, the formalisms and the applications of knowledge engineering.

The process of developing computational knowledge-based systems is called knowledge engineering. This process involves assessing the problem, developing a structure for the knowledge base and implementing actual knowledge into the knowledge base. The main task for knowledge engineers is to identify an appropriate conceptual vocabulary.

There are different kinds of knowledge, for instance rules of games, attributes of objects and temporal relations, and each type is best expressed by its own specific vocabulary. Related conceptual vocabularies that are able to describe objects and their relationships are called ontologies. These conceptual vocabularies are highly formal, and each is able to express meaning in a specific field of knowledge. They are used for queries and assertions to knowledge bases and make the sharing of knowledge possible. In order to represent different kinds of knowledge in one framework, Jerry Hobbs (1985) proposed the principle of ontological promiscuity, whereby several ontologies are mixed together to cover a range of different knowledge types.

A query to a system that represents knowledge about a world made of everyday items, and that can perform actions in this world, may look like this: “Take the cube from the table!”. This query could be processed as follows. First, since we live in a temporal world, the action needs to be processed in a way that can be broken down into successive steps. Secondly, we make general statements about the rules of our system, for example that gravitational forces have a certain effect. Finally, we work out the chain of tasks that have to be done to take the cube from the table: 1) reach out for the cube with the hand, 2) grab it, 3) raise the hand with the cube, etc. Logical reasoning is the perfect tool for this task, because a logical system can also recognise whether the task is possible at all.
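The chain of steps could be modelled very schematically as operators transforming a world state. The state fields and action names below are invented; the precondition checks are a crude stand-in for a logical system recognising whether the task is possible at all:

```python
# Very schematic model of "Take the cube from the table!" as successive
# state transformations. State fields and action names are invented.

def reach(s):
    assert s["hand"] == "empty", "hand must be free before reaching"
    return {**s, "hand_at": "cube"}

def grab(s):
    assert s["hand_at"] == "cube", "hand must be at the cube to grab it"
    return {**s, "hand": "cube"}

def raise_hand(s):
    assert s["hand"] == "cube", "the cube must be grabbed before raising"
    return {**s, "cube_on": None, "hand_at": "raised"}

state = {"cube_on": "table", "hand": "empty", "hand_at": "rest"}
for action in (reach, grab, raise_hand):  # the successive steps from the text
    state = action(state)
print(state)  # the cube is now in the hand, no longer on the table
```

If a precondition fails (for example grabbing before reaching), the sequence is rejected, which is the schematic counterpart of a logical system proving that a plan is impossible.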

There is a problem with the procedure described above, called the frame problem. The system in the example deals with changing states: the actions that take place change the environment, that is, the cube changes its place. Yet the system has so far made no propositions about the table. We need to make sure that, after the cube is picked up from the table, the table does not change its state; it should not disappear or break down. This could happen, since the table is no longer needed: the system states that the cube is in the hand and omits any information about the table. To tackle the frame problem, special axioms or similar devices have to be stated. The frame problem has not been solved completely, but there are different approaches to a resolution. Some add spatial and temporal boundaries for objects to the system's world (Hayes 1985). Others try more direct modelling by performing transformations on state descriptions: before the transformation the cube is on the table; after the transformation the table still exists, but independently of the cube.

The type of knowledge representation formalism determines how information is stored. Most knowledge representation applications are developed for a specific purpose, for example a digital map for robot navigation or a graph-like account of events for visualizing stories.

Each knowledge representation formalism needs a strict syntax, semantics and inference procedure in order to be clear and computable. Most formalisms share features that help them express information clearly: a semantic-network-like structure, hierarchies of concepts (e.g. vehicle -> car) and property inheritance (e.g. red cars have four wheels because cars have four wheels). Some formalisms also make it possible to add new information to the system without creating inconsistencies, or to make a "closed-world" assumption, under which anything not stated in the knowledge base is taken to be false. For example, if the information that there is gravitation on earth is omitted, a closed-world system would wrongly conclude that our world has none.

A problem for knowledge representation formalisms is that expressive power and efficient deductive reasoning tend to be mutually exclusive. If a formalism has great expressive power, it is able to describe a wide range of (different) information, but it is not able to draw inferences efficiently from (given) data; an example is second-order logic. If, conversely, a formalism is restricted so that inference is efficient, it has a poorer range of what it can describe. Logic restricted to Horn clauses is an example: a Horn clause is a disjunction of literals with at most one positive literal, and this restriction yields very good decision procedures for inference while limiting expressiveness. The logic programming language Prolog is based on Horn clauses. So the formalism has to be tailored to the application of the KR system by a compromise between expressiveness and deductive complexity: in order to gain deductive power, expressiveness is sacrificed, and vice versa.
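A minimal forward-chaining sketch over ground (effectively propositional) Horn clauses illustrates why inference in this restricted fragment is so straightforward. The facts and rules below are invented for illustration:

```python
# Propositional Horn clauses, written as (body, head): if all atoms in the
# body are known, the head may be derived. Forward chaining repeats this
# until nothing new can be added.

facts = {"car(bmw)", "has_wheels(car)"}                     # invented knowledge base
rules = [
    ({"car(bmw)"}, "vehicle(bmw)"),                         # every car is a vehicle
    ({"vehicle(bmw)"}, "moves(bmw)"),                       # every vehicle moves
    ({"car(bmw)", "has_wheels(car)"}, "has_wheels(bmw)"),   # property inheritance
]

def forward_chain(facts, rules):
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived

result = forward_chain(facts, rules)
print(sorted(result))
```

Because each rule has at most one positive conclusion, this loop terminates quickly and derives every consequence; the price, as the text notes, is that such rules can express far less than full first- or second-order logic.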

With the growth of the field of knowledge bases, many different standards have been developed. They all have different syntactic restrictions. To allow intertranslation, different "interchange" formalisms have been created. One example is the Knowledge Interchange Format which is basically first-order set theory plus LISP (Genesereth et al. 1992).

Computational knowledge representation is mostly used not as a model of cognition but to make pools of information accessible, i.e. as an extension of database technology. In these cases general rules and models are not needed: with growing storage capacity, one can create simple knowledge bases stating all specific facts. The information is stored as sentential knowledge, that is, knowledge saved in the form of sentences comparable to propositions and program code. Knowledge is here seen as a reservoir of useful information rather than as supporting a model of cognitive activity. Increased memory sizes have made it feasible to use such "compute-intensive" representations that simply list all the particular facts rather than stating general rules. These allow the use of statistical techniques such as Markov simulation, but seem to abandon any claim to psychological plausibility.

Artificial intelligence or intelligence added to a system that can be arranged in a scientific context or Artificial Intelligence (English: Artificial Intelligence or simply abbreviated AI) is defined as the intelligence of a scientific entity. This system is generally considered a computer. Intelligence is created and incorporated into a machine (computer) in order to be able to do work as human beings can. Several types of fields that use artificial intelligence include expert systems, computer games (games), fuzzy logic, artificial neural networks and robotics.
Many things that seem difficult for human intelligence are relatively unproblematic for computers, for example transforming equations, solving integrals, or playing chess or backgammon. On the other hand, things that seem to demand little intelligence from humans, such as recognising objects and faces or playing football, are still difficult to realise computationally.

Although AI has a strong science-fiction connotation, it forms a very important branch of computer science, dealing with intelligent behaviour, learning and adaptation in machines. Research in AI is concerned with making machines automate tasks that require intelligent behaviour. Examples include control, planning and scheduling, answering diagnostic and customer questions, and the recognition of handwriting, speech and faces. These have become separate disciplines, each focused on providing solutions to real-life problems. AI systems are now often used in economics, medicine, engineering and the military, and have been built into several home computer and video game software applications.
Artificial intelligence aims not only to understand intelligent systems but also to construct them.
There is no fully satisfactory definition of 'intelligence':
1. intelligence: the ability to acquire knowledge and to use it
2. or: intelligence is what is measured by an 'intelligence test'

Broadly speaking, AI is divided into two schools: conventional AI and computational intelligence (CI). Conventional AI mostly involves methods now classified as machine learning, characterised by formalism and statistical analysis. It is also known as symbolic AI, logical AI, neat AI and GOFAI (Good Old-Fashioned Artificial Intelligence). Its methods include:
1. Expert systems: apply reasoning capabilities to reach a conclusion. An expert system can process large amounts of known information and provide conclusions based on it.
2. Case-based reasoning
3. Bayesian networks
4. Behaviour-based AI: a modular method of building AI systems by hand
Computational intelligence involves iterative development or learning (e.g. parameter tuning in connectionist systems). The learning is based on empirical data and is associated with non-symbolic AI, scruffy AI and soft computing. Its main methods include:
1. Neural networks: systems with very strong pattern-recognition capabilities
2. Fuzzy systems: techniques for reasoning under uncertainty; they have been used extensively in modern industrial and consumer-product control systems
3. Evolutionary computation: applies biologically inspired concepts such as populations, mutation and "survival of the fittest" to generate ever better solutions to a problem.
These methods mainly divide into evolutionary algorithms (e.g. genetic algorithms) and swarm intelligence (e.g. ant colony algorithms).
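As a toy illustration of the connectionist methods listed above, the following sketch (the task and the training data, learning the logical AND function, are invented for the example) trains a single perceptron by iteratively tuning its parameters on empirical data:

```python
# A single perceptron: the simplest connectionist pattern recogniser.
def train_perceptron(samples, epochs=20, lr=0.1):
    """Iteratively adjust weights and bias from labelled examples."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            w[0] += lr * err * x1   # nudge each weight toward the target
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# Empirical data: the logical AND function as labelled examples.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
```

Note how this differs from the conventional, symbolic methods above: no rule "output 1 iff both inputs are 1" is ever stated; the behaviour emerges from parameter tuning alone.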
Hybrid intelligent systems attempt to combine these two groups. Expert inference rules can be generated by neural networks, and production rules can be obtained from statistical learning, as in ACT-R. A promising new approach, intelligence amplification, tries to achieve artificial intelligence in an evolutionary development process as a side effect of amplifying human intelligence through technology.

History of artificial intelligence
In the early 17th century, René Descartes argued that the body of an animal is nothing but a complicated machine. Blaise Pascal built the first mechanical digital calculating machine in 1642. In the 19th century, Charles Babbage and Ada Lovelace worked on programmable mechanical calculating machines.
Bertrand Russell and Alfred North Whitehead published Principia Mathematica, which revolutionised formal logic. In 1943, Warren McCulloch and Walter Pitts published "A Logical Calculus of the Ideas Immanent in Nervous Activity", which laid the foundations for neural networks.
The 1950s were a period of active effort in AI. The first working AI programs were written in 1951 to run on the Ferranti Mark I machine at the University of Manchester (UK): a draughts-playing program written by Christopher Strachey and a chess-playing program written by Dietrich Prinz. John McCarthy coined the term "artificial intelligence" at the first conference devoted to the subject, in 1956; he also invented the Lisp programming language. Alan Turing introduced the "Turing test" as a way of operationalising a test of intelligent behaviour. Joseph Weizenbaum built ELIZA, a chatterbot implementing Rogerian psychotherapy.
During the 1960s and 1970s, Joel Moses demonstrated the power of symbolic reasoning for integration problems in the Macsyma program, the first successful knowledge-based program in the field of mathematics. Marvin Minsky and Seymour Papert published Perceptrons, demonstrating the limits of simple neural networks, and Alain Colmerauer developed the computer language Prolog. Ted Shortliffe demonstrated the power of rule-based systems for knowledge representation and inference in medical diagnosis and therapy, in what is sometimes called the first expert system. Hans Moravec developed the first computer-controlled vehicle to negotiate cluttered obstacle courses autonomously.
In the 1980s, neural networks came into widespread use thanks to the backpropagation algorithm, first described by Paul John Werbos in 1974. In 1982, physicists such as John Hopfield used statistical techniques to analyse the storage and optimisation properties of neural networks. The psychologists David Rumelhart and Geoffrey Hinton continued research on neural network models of memory. In 1985, at least four research groups rediscovered the backpropagation learning algorithm, which was successfully implemented in computer science and psychology. The 1990s marked major achievements in many areas of AI and demonstrations of many applications. Most notably, Deep Blue, a chess-playing computer, beat Garry Kasparov in a famous six-game match in 1997. DARPA stated that the costs saved by applying AI methods to scheduling units in the first Gulf War repaid the US government's entire investment in AI research since the 1950s.
The DARPA Grand Challenge, which began in 2004 and continues to this day, is a race for a $2 million prize in which vehicles drive themselves, without communication with humans, using GPS, computers and sophisticated sensors, across several hundred miles of challenging desert terrain.

After having dealt with how knowledge is stored in the brain, we now turn to the question of whether the
brain is specialised and, if it is specialised, which functions are located where and which knowledge
is present in which hemisphere. These questions can be subsumed under the topic “hemispheric specialisation” or “lateralisation of processing” which looks at the differences in processing between the two hemispheres of the human
brain.

Differences between the hemispheres can be traced back as far as 3.5 million years, as evidenced by fossils of australopithecines (ancient ancestors of Homo sapiens). Because these differences have been present for so long and have survived selective pressure, they must be useful in some way for our cognitive processes.

Although at first glance the two hemispheres look identical, they differ in various ways.

Concerning anatomy, some areas are larger in one hemisphere than in the other, and their tissue contains more dendritic spines. An example is what used to be called "Broca's area" in the left hemisphere. This area, which is (among other things) important for speech production, shows greater branching in the left hemisphere than the corresponding area in the right hemisphere. Because of the left hemisphere's importance for language, with which we will deal later, one can conclude that anatomical differences have consequences for lateralisation of function.

Neurochemistry is another domain the hemispheres differ in: The left hemisphere is dominated by the
neurotransmitter dopamine, whereas the right hemisphere shows higher concentrations of
norepinephrine. Theories suggest that modules specialised on cognitive processes are distributed over
the brain according to the neurotransmitter needed. Thus, a cognitive function relying on dopamine
would be located in the left hemisphere.

The two hemispheres are interconnected via the corpus callosum, the major cortical connection. With its
250 million nerve fibres it is like an Autobahn for neural data connecting the two hemispheres. There
are in fact smaller connections between the hemispheres but these are little paths in comparison. All
detailed higher order information must pass through the corpus callosum when being transferred from
one hemisphere to the other. The transfer time, which can be measured with ERP, lies between 5 and 20 ms.

Hemispheric specialisation has been of interest since the days of Paul Broca and Karl Wernicke, who
discovered the importance of the left hemisphere for speech in the 1860s. Broca examined a number of patients who could not produce speech but whose understanding of language was not impaired, whereas Wernicke examined patients with the opposite symptoms (i.e. who could produce speech but did not understand anything). Both Broca and Wernicke found that their patients' brains had damage to distinct areas of the left hemisphere.

Because in those days language was seen as the cognitive process superior to all others, the left hemisphere was believed to be superior to the right, a view expressed in the "cerebral dominance theory" developed by J. H. Jackson. The right hemisphere was seen as a "spare tire [...] having few functions of its own" (Banich, p. 94).
This view was not challenged until the 1930s. In that decade and the following ones, research dramatically changed this picture. Of special importance for showing the role of the right hemisphere was Sperry, who conducted several experiments in 1974 for which he won the Nobel Prize in Physiology or Medicine in 1981.

Sperry's experiments were conducted with people suffering from the so-called "split-brain syndrome" after having undergone a commissurotomy, an operation in which the corpus callosum is sectioned so that communication between the hemispheres is severed. With his
pioneering experiments, Sperry wanted to find out whether the left hemisphere really plays such an
important role in speech processing as suggested by Broca and Wernicke.

Sperry used different experimental designs in his studies, but the basic assumption behind all
experiments of this type was that perceptual information received at one side of the body is processed
in the contra-lateral hemisphere of the brain. In one of the experiments the subjects, while blindfolded, had to recognise objects by touching them with one hand only. Sperry then asked the patients to name the object they had felt and found that they could not name it when touching it with the left hand (which is linked to the right hemisphere). The question that arose was whether this inability was due to the right hemisphere merely functioning as a "spare tire", or due to something else. Sperry therefore changed the design of his experiment so that the patients had to show that they recognised the objects by using them the right way. For example, if they recognised a pencil they would use it to write. With this changed design, no difference in performance between the two hands was found.

In a different experiment conducted by Sperry et al., the word sky was shown to one visual field of a patient and the word scraper to the other. The patients then had to draw the whole word they had seen with one hand. They were not able to synthesise this into skyscraper; instead they drew a scraper overlapped by some clouds. It was thus concluded that each hemisphere took control of the hand to draw what it had seen.

Other experiments have been conducted to gain more knowledge about hemispheric specialisation. They were carried out with epileptic individuals who were about to undergo surgery in which parts of one of their hemispheres were going to be removed. Before the surgery it was important to find out which hemisphere was responsible for speech in the individual. This was done using the Wada technique, in which a barbiturate is injected into one of the arteries supplying the brain with blood. Shortly after the injection, the contra-lateral side of the body is paralysed. If the person is still able to speak, the anaesthetised hemisphere is not responsible for speech production in this individual. With the results of this technique it could be estimated that 95% of all adult right-handers use their left hemisphere for speech.

Research with people who have brain lesions or have even undergone a commissurotomy has some major drawbacks: the reason for such surgery is usually epileptic seizures, so it is possible that these patients' brains are not typical, or that other areas were damaged during the surgery. Also, these studies have been performed with very small numbers of subjects, so their statistical reliability may not be high.

In addition to experiments with split-brain patients, studies with neurologically intact individuals have been conducted to measure perceptual asymmetries. These usually use one of three methods: the "divided visual field technique", "dichaptic presentation" and "dichotic presentation". Each of them again rests on the basic assumption that perceptual information received at one side of the body is processed in the contra-lateral hemisphere.

Highly simplified picture of the visual pathway.

The divided visual field technique is based on the fact that the visual field can be divided into the right
(RVF) and left visual field (LVF). Each visual field is processed independently from the other in the
contra-lateral hemisphere. The divided visual field technique includes two different experimental designs: the experimenter can present one picture in just one of the visual fields and let the subject respond to this stimulus, or show two different pictures, one in each visual field.

A problem with the divided visual field technique is that the stimulus must be presented for less than 200 ms, because that is how long the eyes can fixate one point before shifting the visual field.

In the dichaptic presentation technique the subject is presented with two objects at the same time, one in each hand (cf. Sperry's experiments).

The dichotic presentation technique enables researchers to study the processing of auditory
information. Here, different information is presented simultaneously to each ear.
Experiments with these techniques found that a sensory stimulus is processed 20 to 100 ms faster when it is initially directed to the hemisphere specialised for that task, and that the response is 10% more accurate.

Explanations for this include three hypotheses, namely the direct access theory, the callosal relay
model and the activating-orienting model.
The direct access theory assumes that information is processed in that hemisphere to which it is initially
directed. This may result in less accurate responses, if the initial hemisphere is the unspecialised
hemisphere.
The callosal relay model states that information initially directed to the unspecialised hemisphere is transferred to the specialised hemisphere over the corpus callosum. This transfer is time-consuming and causes loss of information.
The activating-orienting model assumes that a given input activates the specialised hemisphere. This
activation then places additional attention on the contra-lateral side of the activated hemisphere,
“making perceptual information on that side even more salient”. (Banich)

All the experiments mentioned above share some basic findings: the left hemisphere is superior at verbal tasks such as the processing of speech, speech production and recognition of letters, whereas the right hemisphere excels at non-verbal tasks such as face recognition, tasks involving spatial skills such as line orientation, and distinguishing different pitches of sound. This is evidence against the cerebral dominance theory, which had appointed the right hemisphere as a mere spare tire. In fact, both hemispheres are distinct and excel at different tasks, and neither can be omitted without a high impact on cognitive performance.

Although the hemispheres are so distinct and are experts at their assigned functions, they also have limited abilities to perform the tasks for which the other hemisphere is specialised. The picture above gives an overview of which hemisphere gives rise to which ability.

Experiment on local and global processing with patients with left- or right-hemisphere damage

There are two sets of approaches to the question of hemispheric specialisation. One set of theories approaches the topic by asking "What tasks is each hemisphere specialised for?". Theories of this set attribute the hemispheres' different levels of ability at higher cognitive skills to their different levels of ability at processing sensory information. One such theory is the "spatial frequency hypothesis", which states that the left hemisphere is important for fine-detail analysis and high spatial frequencies in visual images, whereas the right hemisphere is important for low spatial frequencies.
We have pursued this approach above.

The other approach does not focus on what type of information is processed by each hemisphere but
rather on how each hemisphere processes information. This set of theories assumes that the left
hemisphere processes information in an analytic, detail- and function-focused way and that it places
more importance on temporal relations between information, whereas the right hemisphere is believed
to go about the processing of information in a holistic way, focusing on spatial relations and on
appearance rather than on function.

The picture above shows an exemplary response to different target stimuli in an experiment on global
and local processing with patients who suffer right- or left-hemisphere damage. Patients with damage
to the right hemisphere often fail to attend to the global form, but recognise details without problems. For patients with left-hemisphere damage the reverse is true.
This experiment supports the assumption that the hemispheres differ in the way they process information.

Why is transfer between the hemispheres needed at all, if the hemispheres are so distinct in functioning, anatomy and chemistry, and if transfer takes time and degrades the quality of the information? The reason is that the hemispheres, although so different, do interact. This interaction has important advantages because, as studies by Banich and Belger have shown, it may "enhance the overall processing capacity under high demand conditions" (Banich). (Under low demand conditions transfer makes less sense, because the cost of transferring the information to the other hemisphere is higher than the advantage of parallel processing.)

The two hemispheres can interact over the corpus callosum in different ways. This is measured by first
computing performance of each hemisphere individually and then measuring the overall performance of
the whole brain. In some tasks one hemisphere may dominate the other in the overall performance, so
the overall performance is as good or bad as the performance of one of the single hemispheres. What’s
surprising is that the dominating hemisphere may very well be the one that is less specialised, so here
is another example of a situation where parallel processing is less effective than processing in just one
half of the brain.

Another way of how the hemispheres interact is that overall processing is an average of performance of
the two individual hemispheres.

The third and most surprising way the hemispheres can interact is that, when performing a task together, they behave totally differently than when performing the same task individually. This can be compared to the social behaviour of people: individuals behave differently in groups than they would on their own.

After having looked at hemispheric specialisation from a general point of view, we now focus on differences between individuals. Aspects that may have an impact on lateralisation are age, gender and handedness.

Age could be one factor determining to what extent each hemisphere is used for specific tasks.
Researchers have suggested that lateralisation develops with age until puberty. Thus infants should not
have functionally-lateralised brains. Here are four pieces of evidence that speak against this hypothesis:

Infants already show the same brain anatomy as adults, which means that the brain of a newborn is already lateralised. Following the hypothesis that anatomy is linked to function, this means that lateralisation does not develop at a later period in life.

Differences in perceptual asymmetries (that is, in the superior performance at processing verbal vs. non-verbal material in the different hemispheres) cannot be observed between children aged 5 and 13: children aged 5 process the material the same way 13-year-olds do.

Experiments with 1-week-old infants showed that they respond with more interest to verbal material presented to the right ear than to the left ear, and with more interest to non-verbal material presented to the left ear. The infants' interest was measured by the frequency of their sucking on a soother.

Although children who undergo hemispherectomy (the surgical removal of one hemisphere) do develop the cognitive skills of the missing hemisphere (in contrast to adults or adolescents, who can only partly compensate for missing brain parts), they do not develop these skills to the same extent as a child who underwent hemispherectomy of the other hemisphere. For example, a child whose right hemisphere has been removed will develop spatial skills, but not to the extent of a child whose left hemisphere has been removed and who thus still possesses the right hemisphere.

Handedness is another factor that might influence brain lateralisation. There is statistical evidence that left-handers (10% of the population) have a different brain organisation than right-handers. Whereas 95% of right-handed people show superior processing of verbal material in the left hemisphere, there is no comparably high figure for verbal superiority of one hemisphere in left-handers: 70% of left-handers process verbal material in the left hemisphere, 15% process it in the right hemisphere (the functions of the hemispheres are simply switched around), and the remaining 15% are not lateralised, meaning that they process language in both hemispheres.
Thus, as a group, left-handers seem to be less lateralised. However, a single left-handed individual can be just as lateralised as the average right-hander.

Gender is also believed to have an impact on hemispheric specialisation. In animal studies it was found that hormones create brain differences between the genders that are related to reproductive functions. In humans it is hard to determine to what extent differences are really caused by hormones and to what extent culture and schooling are responsible.

One brain area for which a difference between the genders has been observed is the corpus callosum. Although one study found that the corpus callosum is larger in women than in men, these results could not be replicated. Instead, it was found that the posterior part of the corpus callosum is more bulbous in women than in men. This might, however, be related to the fact that the average woman has a smaller brain than the average man; the bulbousness of the posterior section might thus be related to brain size and not to gender.

In experiments that compare performance in various tasks between the genders, the cultural aspect is of great importance, because men and women might use different problem-solving strategies due to schooling.

Although the two hemispheres look like each other's mirror images at first glance, this impression is misleading. On closer inspection, the hemispheres differ not only in their anatomy and chemistry, but most importantly in their function. Although both hemispheres can perform all basic cognitive tasks,
there exists a specialisation for specific cognitive demands. In most people, the left hemisphere is an
expert at verbal tasks, whereas the right hemisphere has superior abilities in non-verbal tasks.
Despite the functional distinctness the hemispheres communicate with each other via the corpus
callosum.

This fact was exploited in Sperry's experiments with split-brain patients. These stand out among experiments measuring perceptual asymmetries because they were the first to refute the cerebral dominance theory, and they were recognised with the Nobel Prize in Physiology or Medicine.

Individual factors such as age, gender or handedness have no or only very little impact on hemispheric functioning.