The Forum for Artificial Intelligence meets every
other week (or so) to discuss scientific, philosophical, and cultural
issues in artificial intelligence. Both technical research topics and
broader inter-disciplinary aspects of AI are covered, and all
are welcome to attend!

Friday, May 5th, 3:30pm

Coffee at 3:15pm

ACE 6.304

Towards Efficient Boolean Circuit Satisfiability Checking

Ilkka Niemela
Head of the Laboratory for Theoretical Computer Science
Helsinki University of Technology

Boolean circuits offer a natural, structured, and compact representation of
Boolean functions for many application domains such as computer aided verification.
We study satisfiability checking methods for Boolean circuits. As a starting point
we take the successful Davis-Putnam-Logemann-Loveland (DPLL) procedure for satisfiability
checking of propositional formulas in conjunctive normal form and study its generalization
to Boolean circuits. We employ a tableau formulation where DPLL propagation rules correspond
to tableau deduction rules and splitting corresponds to a tableau cut rule. It turns out that
Boolean circuits enable interesting deduction (simplification) rules not typically available
in DPLL where the idea is to exploit the structure of the circuit. We also study the relative
efficiency of different variations of the cut (splitting) rule obtained by restricting the use
of cut in several natural ways. A number of exponential separation results are obtained showing
that the more restricted variations cannot polynomially simulate the less restricted ones. The
results also apply to DPLL for formulas in conjunctive normal form obtained from Boolean circuits
by using Tseitin's translation. Thus DPLL with the considered cut restrictions, such as allowing
splitting only on the variables corresponding to the input gates, cannot polynomially simulate
DPLL with unrestricted splitting.
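
The Tseitin translation mentioned above can be made concrete in a few lines. The sketch below (illustrative Python; the circuit, variable numbering, and helper names are invented, and this is the standard textbook encoding rather than the authors' BCSat implementation) shows how each gate receives its own CNF variable and a handful of clauses:

```python
def tseitin_and(g, a, b):
    # g <-> (a AND b)  becomes  (-g | a), (-g | b), (g | -a | -b)
    return [[-g, a], [-g, b], [g, -a, -b]]

def tseitin_or(g, a, b):
    # g <-> (a OR b)  becomes  (-g | a | b), (g | -a), (g | -b)
    return [[-g, a, b], [g, -a], [g, -b]]

def tseitin_not(g, a):
    # g <-> NOT a  becomes  (-g | -a), (g | a)
    return [[-g, -a], [g, a]]

# Circuit: inputs x1=1, x2=2; g3 = x1 AND x2; g4 = NOT g3; output g5 = g3 OR g4.
# Asserting the output gate with the unit clause [5] completes the CNF.
cnf = [[5]] + tseitin_and(3, 1, 2) + tseitin_not(4, 3) + tseitin_or(5, 3, 4)

def satisfied(cnf, assignment):
    """Check a total assignment (dict: var -> bool) against all clauses."""
    return all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in cnf)

# Splitting only on the input gates x1 and x2 (the restricted cut rule the
# abstract discusses) leaves every internal gate determined by propagation.
model = {1: True, 2: True, 3: True, 4: False, 5: True}
print(satisfied(cnf, model))  # True
```

After fixing x1 and x2, the values of g3, g4, and g5 follow by unit propagation; the separation results concern how much extra search restricting the cut rule in this way can cost.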

(This is joint work with Tommi Junttila and Matti Jarvisalo.)

About the speaker:

Ilkka Niemela has been professor and head of the Laboratory for Theoretical
Computer Science at Helsinki University of Technology since 2000. He
received his doctoral degree in computer science in 1993 from Helsinki
University of Technology and worked in 1993 as an International Fellow
at SRI International, in 1995-1996 as a research scientist and acting
professor in the Department of Computer Science of the University of
Koblenz-Landau, Germany and in 1998-2000 as a senior research fellow of
the Academy of Finland.

Dr. Niemela's current research interests include automated reasoning,
knowledge representation, computational complexity, computer aided
verification, automated testing and product configuration. At Helsinki
University of Technology he leads the computational logic group which has
developed a number of state-of-the-art software tools for automated
reasoning, such as the Smodels system for answer set programming and BCSat
for Boolean circuit satisfiability checking, leading to applications in
areas like automated planning, product configuration, and bounded model
checking. Dr. Niemela is an author of more than 100 papers, has been a
member of the program committee for over 40 international conferences and
has given several invited talks and tutorials.

Dr. Niemela is a member of the Executive Committee of the Association for
Logic Programming (ALP), Editorial Board Member of Theory and Practice of
Logic Programming and Journal of Artificial Intelligence Research as well
as a Steering Committee Member of the International Workshops on
Nonmonotonic Reasoning and of the International Conferences on Logic
Programming and Nonmonotonic Reasoning.

Friday, April 14th, 11:00am

Coffee at 10:45am

ACES 2.302 (Avaya Auditorium)

Learning by Reading: An Experiment in Text Analysis

Eduard Hovy
USC Information Sciences Institute
University of Southern California

A few years ago, three research groups participated in an audacious experiment called Project Halo: (manually) converting
the information contained in one chapter of a high school chemistry textbook into knowledge representation statements, and
then having the knowledge representation system take the high school AP exam. Surprisingly, all three systems passed,
albeit at a relatively low level of performance. Could one do the same, automatically? If not fully, how far can one go?
Since October, several projects have taken up this challenge, or aspects of it. Our Learning by Reading project at ISI,
drawing part-time participation of experts in NLP and KR&R, addresses the problem from the perspective of NLP. After
suitable analysis and preparation, we parse the Chemistry textbook and then convert the results into very shallow
pre-logic predications, which are asserted to a knowledge base. The evaluation, still in progress, has two aspects. In
the first, we pose questions to the system at various levels (text only, knowledge level without inference, and knowledge
level with inference), and compare performance. In the second, we (in conjunction with other groups) compare the systems'
bottom-up, automatically derived representations to the top-down ones created by hand by those groups. Although this
project (and related projects) are merely pilot studies, they are nonetheless likely to generate some interesting
conclusions regarding the gap between what automated systems can deliver and what human knowledge engineers deem
necessary, in the fascinating endeavor of learning by reading.
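
As a toy illustration of what "very shallow pre-logic predications" might look like (the representation, sentence, and function names below are invented for this sketch and are not ISI's actual output format):

```python
kb = set()

def add_predications(triples):
    # Assert shallow predicate-argument triples into a simple knowledge base.
    kb.update(triples)

# Hypothetical shallow analysis of "Atoms combine to form molecules."
add_predications([
    ("combine", "agent", "atom"),
    ("form", "agent", "atom"),
    ("form", "result", "molecule"),
])

def query(pred, role):
    """Text-level lookup: which arguments fill this role of this predicate?"""
    return sorted(arg for p, r, arg in kb if p == pred and r == role)

print(query("form", "result"))  # ['molecule']
```

Lookups like this correspond to the "knowledge level without inference" tier of the evaluation; deeper tiers would chain such predications together.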

About the speaker:

Eduard Hovy leads the Natural Language Research Group at the Information Sciences Institute of the University of Southern
California. He is also Deputy Director of the Intelligent Systems Division, as well as a research associate professor of
the Computer Science Department of USC and Advisory Professor of the Beijing University of Posts and Telecommunications.
He completed a Ph.D. in Computer Science (Artificial Intelligence) at Yale University in 1987. His research focuses on
information extraction, automated text summarization, question answering, the semi-automated construction of large lexicons
and ontologies, machine translation, and digital government. Dr Hovy regularly serves in an advisory capacity to funders
of NLP research in the US and EU. He is the author or co-editor of five books and over 170 technical articles. In 2001
Dr. Hovy served as President of the Association for Computational Linguistics (ACL) and in 2001-03 as President of the
International Association of Machine Translation (IAMT). Dr. Hovy regularly co-teaches a course in the Masters Degree
Program in Computational Linguistics at the University of Southern California, as well as occasional short courses on MT
and other topics at universities and conferences. He has served on the Ph.D. and M.S. committees for students from USC,
Carnegie Mellon University, the Universities of Toronto, Karlsruhe, Pennsylvania, Stockholm, Waterloo, Nijmegen, Pretoria,
and Ho Chi Minh City.

Friday, April 7th, 11am

Coffee at 10:45am

ACES 6.304

Set-based logic programming

V.W. Marek
Department of Computer Science
University of Kentucky

Since its inception, Logic Programming (along with one of its practical
incarnations, Answer Set Programming, or ASP) has pursued two directions
at once. One treats programs as sets of formulas of some logic
formalism. The other treats programs as descriptions of inductively
defined sets of objects.

Recently, with the spectacular results of researchers in the first area
showing that there is a logic that correctly formalizes fundamental
aspects of answer set programming, one could get the impression that the
dichotomy of approaches presented above has been settled, with logic
(in this case maximal intermediate logic N) providing the ultimate
explanation of the subject.

This presentation discusses the other approach to ASP and its
consequences. We show that by assigning a "sense" to atoms, one can use
the ASP paradigm in areas different from classical logic
programming. From this perspective it turns out that Gelfond-Lifschitz
stable models involve two distinct concepts, not one. We show how
various classical mathematical notions can be formalized in the
resulting formalism.

The reported research is joint with H.A. Blair of Syracuse University
and J.B. Remmel of UCSD.
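
The Gelfond-Lifschitz stable-model condition underlying ASP can be checked directly for small ground programs. A brute-force Python sketch (the rule format and example program are invented for illustration):

```python
from itertools import chain, combinations

# A rule is (head, positive_body, negative_body); a set M of atoms is a
# stable model iff M equals the minimal model of the reduct P^M.

def reduct(program, m):
    # Drop rules whose negative body meets M; strip negation from the rest.
    return [(h, pos) for (h, pos, neg) in program if not (set(neg) & m)]

def minimal_model(positive_program):
    # Least fixpoint of the one-step consequence operator.
    m, changed = set(), True
    while changed:
        changed = False
        for h, pos in positive_program:
            if set(pos) <= m and h not in m:
                m.add(h)
                changed = True
    return m

def stable_models(program, atoms):
    subsets = chain.from_iterable(combinations(atoms, k)
                                  for k in range(len(atoms) + 1))
    return [set(s) for s in subsets
            if minimal_model(reduct(program, set(s))) == set(s)]

# p :- not q.   q :- not p.   (the classic program with two stable models)
prog = [("p", [], ["q"]), ("q", [], ["p"])]
print(stable_models(prog, ["p", "q"]))  # [{'p'}, {'q'}]
```

The reduct step and the minimal-model step are exactly the "two distinct concepts" a set-based reading of stable models pulls apart.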

About the speaker:

V.W. Marek received his Ph.D. in 1968, and D.Sc. in 1972, from the
University of Warsaw, Poland. Originally interested in Set Theory and
Recursion Theory, he shifted his interests in the mid-1980s to
issues of Knowledge Representation and Nonmonotonic Reasoning.

For a number of years he taught at Warsaw University where, as a
successor of Mostowski, he was head of the Foundations of Mathematics
Group. In 1983 he moved to the Department of Computer Science at the
University of Kentucky. He
spent longer periods of time at Cornell University at the Mathematical
Sciences Institute, where for several years he was an associate
researcher, and later at the University of California, San Diego.

He is an author of four books (most recently on Satisfiability), and an
author of over 150 journal and refereed conference papers. He was
a member of program committees of numerous conferences and started the
conference series "Logic Programming and Nonmonotonic Reasoning".

Friday, March 31st, 3pm

Coffee at 2:45pm

ACES 2.302

Data-Driven Discourse Parsing

Jason Baldridge
Department of Linguistics
University of Texas at Austin

Computing the structure of discourse is both representationally and
computationally challenging. It is largely agreed that discourses
consist of segments that are related to one another through
rhetorical relations and the goals and intentions of the speaker(s).
While some theories postulate a context-free tree representation of
discourse structure, there are strong arguments that quite general
acyclic graphs are representationally necessary for adequately
capturing the rhetorical connections of discourse segments within a
text or dialog. This leads to an explosion of alternative potential
analyses that is difficult to rein in even with very sophisticated
machine learning models. Another challenge is that there are many
sources of information --e.g., sentence moods, discourse cue phrases,
goals and intentions, and domain-specific information-- that go into
the determination of segmentation and rhetorical relationships. This
information can be difficult to utilize effectively, especially in the
face of data sparsity.

In this talk, I will discuss data and a statistical parser for
analyzing appointment scheduling dialogs. The parser, which is based
on the sentence parsing models of Collins, builds discourse structures
of Segmented Discourse Representation Theory. I will highlight some of
the adequacies and inadequacies of this approach for this task, and
then present a new approach based on recent developments in
sentence-level dependency parsing. Though this approach brings with it
new representational challenges, it promises to greatly improve both
the process of annotation and accuracy in the automatic recovery of
discourse structures.

About the speaker:

Jason Baldridge is an assistant professor in the Department of
Linguistics at the University of Texas at Austin. He completed his
dissertation on categorial grammars at the University of Edinburgh in
2002, advised by Mark Steedman. From 2002 to 2005, he held a
post-doctoral position at Edinburgh working with Alex Lascarides and
Miles Osborne. His current work includes research on probabilistic
parsing for Portuguese, discriminative parse ranking models,
probabilistic discourse parsing for Segmented Discourse Representation
Theory, active learning, and formal syntax using categorial grammars
and other constraint-based formalisms. With Nicholas Asher, he
recently began an NSF-funded project to investigate the integration of
discourse structure and coreference resolution using machine
learning. He has been active for many years in the creation and
promotion of open source software for natural language processing.

Friday Mar. 10, 11:00am

Coffee at 10:45am

ACES 2.302 Avaya Auditorium

Learning in Artificial Sensorimotor Systems

Daniel D. Lee
Department of Electrical and Systems Engineering
University of Pennsylvania

Many algorithms in machine learning involve changing the underlying
dimensionality of the data set. Unsupervised learning techniques such
as principal components analysis typically involve dimensionality
reduction, whereas supervised learning techniques such as support
vector machines can be understood as mapping the data to a higher
dimensional space. Equivalent problems emerge when considering
information processing in sensorimotor systems. Sensory processing
requires mapping high-dimensional sensory inputs onto a smaller number
of perceptually-relevant features, whereas motor learning involves
driving a large number of actuator parameters with a smaller number of
control variables. I will describe some of our recently developed
learning algorithms that utilize changes in dimensionality, and
demonstrate their application on some prototypical robotic systems.
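
The dimensionality-reduction half of this picture can be sketched generically in NumPy. This is plain principal components analysis on synthetic data, not the speaker's own algorithms:

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 noisy 3-D "sensory" points that really live near a 1-D subspace.
t = rng.normal(size=(200, 1))
X = t @ np.array([[2.0, 1.0, 0.5]]) + 0.01 * rng.normal(size=(200, 3))

Xc = X - X.mean(axis=0)                      # center the data
# Eigendecomposition of the covariance gives the principal directions.
eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc / len(Xc))
top = eigvecs[:, -1]                         # largest-variance direction
Z = Xc @ top                                 # the 1-D reduced representation

explained = eigvals[-1] / eigvals.sum()      # fraction of variance retained
print(explained > 0.99)  # True: one component captures almost everything
```

Mapping sensory input onto `Z` is the "fewer perceptually relevant features" direction; a kernelized map (as in support vector machines) goes the other way, to a higher-dimensional space.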

About the speaker:

Daniel D. Lee is currently an Associate Professor of Electrical and
Systems Engineering at the University of Pennsylvania, with a
secondary appointment in the Department of Bioengineering. He
received his B.A. in Physics from Harvard University in 1990, and his
Ph.D. in Condensed Matter Physics from the Massachusetts Institute of
Technology in 1995. He was a researcher at Bell Laboratories, Lucent
Technologies, from 1995-2001 in the Theoretical Physics and Biological
Computation departments. His research focuses on understanding the
general principles that biological systems use to process and organize
information, and on applying that knowledge to build better artificial
sensorimotor systems. He resides in New Jersey with his wife Lisa,
four-year old son Jordan, and two-year old daughter Jessica.

Friday, February 24th, 11am

Coffee at 10:45am

ACES 2.302 Avaya Auditorium

The Key to Speed: How Supermultiplicative Speedups Enable the
Optimization of Very Large, Hard Problems

David E. Goldberg
Department of Industrial and Enterprise Systems Engineering
University of Illinois at Urbana-Champaign

Genetic algorithms (GAs)—search procedures based on the mechanics of
natural selection and genetics—have been used across the spectrum of
human endeavor, especially in problems that defy solution by
traditional methods of search, optimization, and machine learning.
The folklore of GAs suggests that they can often be effective
performers, but they are rarely considered to be speed demons, and
some authors complain that they are notoriously slow. Recent work has
established a design theory and methodology for competent GAs—genetic
algorithms that solve large classes of hard problems, quickly,
reliably and accurately—and those studies have been augmented by a
growing literature on GA efficiency enhancement that explores a variety
of methods for speeding GA solutions. The news from competence and
efficiency studies has been good, suggesting that nearly decomposable
problems can be solved in subquadratic times and that efficiency
enhancement modes such as parallelism, time continuation,
hybridization, and evaluation relaxation can be combined to yield
multiplicative speedups over competent GAs without efficiency
enhancement. These results are welcome in and of themselves, but a
number of recent studies have broken through the multiplicative
barrier to achieve supermultiplicative speedups on hard problems
through the tight integration of model building and efficiency
enhancement. These results are just now being understood and fully
explored, but they offer the promise of optimizing very large, hard
problems day in and day out. This talk explores the foundations,
principles, and promise of supermultiplicative speedup of competent
procedures. Examples will be given of the integration of distribution
estimation with evaluation relaxation and time continuation, and a
final discussion will suggest how these results are leading to the
near-term possibility of using these techniques to effectively
optimize problems with 100s of millions or even billions of variables.
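
The distinction between multiplicative and supermultiplicative speedups is simple arithmetic, sketched below with invented numbers:

```python
def multiplicative_speedup(speedups):
    """Combined speedup when independent enhancements compose."""
    total = 1.0
    for s in speedups:
        total *= s
    return total

# Hypothetical individual speedups from, say, parallelism, evaluation
# relaxation, and time continuation:
individual = [4.0, 3.0, 2.5]
expected = multiplicative_speedup(individual)
print(expected)  # 30.0

# A tightly integrated system measuring, say, 45x beats the product of
# its parts: that excess is what "supermultiplicative" names.
print(45.0 > expected)  # True
```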

About the speaker:

David E. Goldberg (BSE, 1975, MSE, 1976, PhD, 1983, Civil Engineering,
University of Michigan, Ann Arbor) is the Jerry S. Dobrovolny
Distinguished Professor of Entrepreneurial Engineering at the
University of Illinois at Urbana-Champaign (UIUC) and director of the
Illinois Genetic Algorithms Laboratory (IlliGAL,
http://www-illigal.ge.uiuc.edu/). Between 1976 and 1980 he held a
number of positions at Stoner Associates of Carlisle, PA, including
Project Engineer and Marketing Manager. Following his doctoral
studies he joined the Engineering Mechanics faculty at the University
of Alabama, Tuscaloosa, in 1984, and he moved to the University of
Illinois in 1990. Professor Goldberg was a 1985 recipient of a
U.S. National Science Foundation Presidential Young Investigator
Award, and in 1995 he was named an Associate of the Center for
Advanced Study at UIUC. He was founding chairman of the International
Society for Genetic and Evolutionary Computation
(http://www.isgec.org/), and his book Genetic Algorithms in Search,
Optimization and Machine Learning (Addison-Wesley, 1989) is one of the
most widely cited texts in computer science. His research focuses on
the design, analysis, and application of genetic algorithms (computer
procedures based on the mechanics of natural genetics and
selection) and other innovating machines. His recent book, The Design
of Innovation: Lessons from and for Competent Genetic Algorithms
(http://www-doi.ge.uiuc.edu/), discusses (1) how to design scalable
genetic algorithms and (2) how such procedures are similar to certain
processes of human innovation.

Wednesday, February 22nd, 4:00pm

Coffee at 3:45pm

Room ACES 2.302 (Avaya Auditorium)

Effective Short-Term Opponent Exploitation in Simplified Poker

Robert Holte
Department of Computing Science
University of Alberta

Poker is a game filled with interesting AI challenges. Uncertainty in poker stems from
two key sources, the shuffled deck and an adversary whose strategy is unknown. One approach
is to find pessimistic game theoretic solutions (i.e. minimax), but human players have
idiosyncratic weaknesses that can be effectively exploited if a model of their strategy
can be learned by observing their play. However, games against humans last for at most a
few hundred hands, so learning must be fast to be effective. We explore the effectiveness
of two approaches to opponent modelling in the context of Kuhn poker, a small game for
which game theoretic solutions are known. Parameter estimation and expert algorithms are
both studied. Experiments demonstrate that, even in this small game, convergence to maximally
exploitive solutions in a small number of hands is impractical, but that good (i.e. better
than Nash or breakeven) performance can be achieved in a short period of time. Finally, we
also show that even amongst a set of strategies with equal game theoretic value, in particular
the set of Nash equilibrium strategies, some are preferable because they speed learning of the
opponent's strategy by exploring it more effectively.
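
The parameter-estimation approach can be sketched as follows. This is a generic Beta-posterior estimate of a single opponent tendency; the numbers and the "bluff frequency" parameter are invented for illustration, not the authors' actual Kuhn-poker parameterization:

```python
def beta_posterior_mean(bluffs_seen, opportunities, a=1.0, b=1.0):
    """Posterior mean of an opponent's bluff frequency under a Beta(a, b) prior."""
    return (a + bluffs_seen) / (a + b + opportunities)

# After observing 15 bluffs in 20 opportunities:
est = beta_posterior_mean(15, 20)
print(round(est, 3))  # 0.727

# A simple exploitive response: call marginal hands against frequent bluffers.
print(est > 0.5)  # True -> call more often
```

With only a few hundred hands available, how quickly such estimates sharpen, and how informatively one's own play probes the opponent, is exactly what separates practical exploitation from convergence to the maximally exploitive strategy.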

About the speaker:

Professor Robert Holte is a well-known member of the international machine
learning research community, former editor-in-chief of the leading
international journal in this field (Machine Learning), and current
director of the Alberta Ingenuity Centre for Machine Learning. His main
scientific contributions are his seminal works on the problem of small
disjuncts and the performance of very simple classification rules. His
current machine learning research investigates cost-sensitive learning and
learning in game-playing (for example: opponent modelling in poker, and
the use of learning for gameplay analysis of commercial computer
games). In addition to machine learning he undertakes research in
single-agent search (pathfinding): in particular, the use of automatic
abstraction techniques to speed up search. He has over 55 scientific
papers to his credit, covering both pure and applied research, and has
served on the steering committee or program committee of numerous major
international AI conferences.

Friday, February 17th, 3:00pm

Coffee at 2:45pm

ACES 2.402

Intrinsic Motivation and Computational Reinforcement Learning

Andrew Barto
Department of Computer Science
University of Massachusetts at Amherst

Motivation is a key factor in human learning. We learn best when we
are highly motivated to learn. Psychologists distinguish between
extrinsically-motivated behavior, which is behavior undertaken to
achieve some externally supplied reward, such as a prize, a high
grade, or a high-paying job, and intrinsically-motivated behavior,
which is behavior done for its own sake. Is there an analogous
distinction for machine learning systems? Can we say of a machine
learning system that it is motivated to learn, and if so, can it be
meaningful to distinguish between extrinsic and intrinsic motivation?
Further, is intrinsic motivation something that we—as machine
learning researchers—should care about? In this talk, I argue that
the answer to each of these questions is “yes.” After
presenting a brief overview of the history of ideas related to
intrinsic motivation in machine learning, I describe some of our
recent computational experiments that explore these ideas within the
framework of computational reinforcement learning (RL). It is a
common perception that computational RL only deals with extrinsic
reward because an RL agent is typically seen as receiving reward
signals only from its external environment. To the contrary,
however, I argue that the computational RL framework is particularly
well suited for incorporating principles of intrinsic motivation, and
I present our view that extending learning in this direction is
important for creating competent adaptive agents.
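
One common computational reading of intrinsic reward can be sketched deterministically (this is illustrative only, not the speaker's model): the agent learns from a total reward that adds a novelty bonus, decaying with visit counts, to the extrinsic payoff.

```python
n_arms = 5
q = [0.0] * n_arms
counts = [0] * n_arms
extrinsic = [0.0, 0.0, 0.0, 1.0, 0.0]   # only arm 3 pays off extrinsically

for step in range(300):
    # Visit everything early (the intrinsically motivated phase), then greedily.
    arm = step % n_arms if step < 50 else max(range(n_arms), key=lambda a: q[a])
    counts[arm] += 1
    bonus = 1.0 / counts[arm]            # intrinsic novelty reward, decays away
    q[arm] += 0.1 * (extrinsic[arm] + bonus - q[arm])

print(max(range(n_arms), key=lambda a: q[a]))  # 3: extrinsic reward wins out
```

Early on, the bonus makes exploration rewarding for its own sake; in the long run it vanishes and the extrinsic signal dominates, which is the division of labor the talk examines within the RL framework.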

About the speaker:

Andrew Barto is Professor of Computer Science, University of
Massachusetts, Amherst. He received his B.S. with distinction in
mathematics from the University of Michigan in 1970, and his Ph.D. in
Computer Science in 1975, also from the University of Michigan. He
joined the Computer Science Department of the University of
Massachusetts Amherst in 1977 as a Postdoctoral Research Associate,
became an Associate Professor in 1982, and has been a Full Professor
since 1991. He is Co-Director of the Autonomous Learning Laboratory
and a core faculty member of the Neuroscience and Behavior Program of
the University of Massachusetts. His research centers on learning in
natural and artificial systems, and he has studied machine learning
algorithms since 1977, contributing to the development of the
computational theory and practice of reinforcement learning. His
current research centers on models of motor learning and reinforcement
learning methods for real-time planning and control, with specific
interest in autonomous mental development through intrinsically
motivated reinforcement learning.

Friday, February 10th, 11:00am

Coffee at 10:45am

ACES 6.304

Link Mining

Lise Getoor
Computer Science Department/UMIACS
University of Maryland at College Park

A key challenge for data mining is tackling the problem of mining richly structured datasets, where the objects are linked in some
way. Links among the objects may demonstrate certain patterns, which can be helpful for many data mining tasks and are usually hard to
capture with traditional statistical models. Recently there has been a surge of interest in this area, fueled largely by interest in
web and hypertext mining, but also by interest in mining social networks, security and law enforcement data, bibliographic citations
and epidemiological records.

Link mining includes both descriptive and predictive modeling of link data. Classification and clustering in linked relational domains
require new data mining models and algorithms. Furthermore, with the introduction of links, new predictive tasks come to light.
Examples include predicting the number of links, predicting the type of link between two objects, inferring the existence of a link,
inferring the identity of an object, finding co-references, and discovering subgraph patterns.

In this talk, I will give an overview of this newly emerging research area. I will describe novel aspects of the modeling, learning
and inference tasks. Then, as time permits, I will describe some of my group's recent work on link-based classification and entity
resolution in linked data.
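
One of the predictive tasks listed above, inferring the existence of a link, has a classic simple baseline: score a candidate pair by its number of common neighbors in the observed graph. A sketch on an invented toy graph:

```python
# Observed (symmetric) toy graph as adjacency sets.
graph = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
    "e": {"a"},
}

def common_neighbor_score(u, v):
    return len(graph[u] & graph[v])

# Rank candidate non-edges; high scores are predicted missing links.
candidates = [("b", "d"), ("b", "e"), ("d", "e")]
ranked = sorted(candidates, key=lambda p: -common_neighbor_score(*p))
print(ranked[0])  # ('b', 'd') shares two neighbors, so it ranks first
```

Statistical relational models of the kind the talk describes replace this single hand-picked score with learned models over many link patterns at once.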

About the speaker:

Prof. Lise Getoor is an assistant professor in the Computer Science
Department at the University of Maryland, College Park. She received her
PhD from Stanford University in 2001. Her current work includes research
on link mining, statistical relational learning and representing
uncertainty in structured and semi-structured data. Her work in these
areas has been supported by NSF, NGA, KDD, ARL and DARPA. In July 2004,
she co-organized the third in a series of successful workshops on
statistical relational learning, http://www.cs.umd/srl2004. She has
published numerous articles in machine learning, data mining, database and
AI forums. She is a member of AAAI Executive council, is on the editorial
board of the Machine Learning Journal and JAIR and has served on numerous
program committees including AAAI, ICML, IJCAI, KDD, SIGMOD, UAI, VLDB,
and WWW.

Friday, January 27th, 11:00am

Coffee at 10:45am

ACES 2.302 Avaya Auditorium

Rethinking State, Action, and Reward in Reinforcement Learning

Satinder Singh
Computer Science & Engineering
University of Michigan, Ann Arbor

Over the last decade and more, there has been rapid theoretical and
empirical progress in reinforcement learning (RL) using the well-
established formalisms of Markov decision processes (MDPs) and
partially observable MDPs or POMDPs. At the core of these formalisms
are particular formulations of the elemental notions of state,
action, and reward that have served the field of RL so well. In this
talk, I will describe recent progress in rethinking these basic
elements to take the field beyond (PO)MDPs. In particular, I will
briefly describe older work on flexible notions of actions called
options, briefly describe some recent work on intrinsic rather than
extrinsic rewards, and then spend the bulk of my time on recent work
on predictive representations of state. I will conclude by arguing
that taken together these advances point the way for RL to address
the many challenges of building an artificial intelligence.

About the speaker:

Satinder Singh is an Associate Professor of Electrical Engineering and
Computer Science at the University of Michigan, Ann Arbor. His main
research interest is in the old-fashioned goal of Artificial
Intelligence: that of building autonomous agents that can learn to be
broadly competent in complex, dynamic, and uncertain environments. The
field of reinforcement learning (RL) has focused on this goal, and
accordingly his deepest contributions are in RL. More recently he has
also been contributing to computational game theory and mechanism
design.

Thursday, January 26th, 11:00am

Coffee at 10:45am

ACES 2.402

Detecting Online Credit Card Fraud: A Data Driven Approach

David Moriarty
Director of Data Mining
Apple Computer

A consistent problem plaguing online merchants today is the
growth and evolution of online credit card fraud. Thieves
harvest credit card numbers from a myriad of sources, place
online orders from all over the world, and ship to countless
drop locations. Current estimates place the problem at 1 to
1.5 billion dollars annually of which the online merchant holds
complete liability.

In this talk, I will describe the problem of eCommerce fraud
and outline various detection measures that online merchants
employ. Like many merchants, Apple Computer leverages
techniques from data mining, machine learning, and statistics
to efficiently discover fraud patterns and adapt to new
trends. Some topics that I will discuss include evaluating
fraud patterns in historic data, discovering efficient pattern
matching rules, building fraud predictive models, inferencing
through order linkages, and anomaly detection.
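
The anomaly-detection ingredient can be sketched generically (z-scores over order amounts; the data and threshold are invented, and a real deployment combines many such signals with learned rules and models):

```python
import statistics

def flag_anomalies(amounts, z_threshold=2.5):
    """Flag amounts more than z_threshold population standard deviations
    from the mean. (With only ten orders, the largest achievable z-score
    is below 3, so a modest threshold is used here.)"""
    mu = statistics.mean(amounts)
    sd = statistics.pstdev(amounts)
    return [x for x in amounts if abs(x - mu) > z_threshold * sd]

orders = [42.0, 38.5, 51.0, 47.2, 39.9, 44.1, 46.3, 41.8, 43.0, 950.0]
print(flag_anomalies(orders))  # [950.0]
```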

About the speaker:

David Moriarty is the Director of Data Mining at Apple
Computer, where he leads a group of scientists developing
analytic solutions to large-scale business problems.
Specifically, Dr. Moriarty leverages data patterns to optimize
strategic decisions in various business areas, including fraud
detection, product quality, logistics, and sales. Dr. Moriarty
received a M.S. and Ph.D. in computer science from the
University of Texas at Austin specializing in artificial
intelligence and machine learning. He regularly serves on
journal and conference review committees and is a founding
member of the Merchant Risk Council. Before Apple Computer, David
designed intelligent algorithms at the Naval Research
Laboratory, Daimler-Chrysler Research Center, USC Information
Sciences Institute, and Intelligent Technologies Corporation.

Friday, January 20th, 11:00am

Coffee at 10:45am

ACES 2.302 Avaya Auditorium

Support Vector Machines for Structured Outputs

Thorsten Joachims
Department of Computer Science
Cornell University

Over the last decade, much of the research on discriminative learning
has focused on problems like classification and regression, where the
prediction is a single univariate value. But what if we need to
predict complex objects like trees, orderings, or alignments? Such
problems arise, for example, when a natural language parser needs to
predict the correct parse tree for a given sentence, when one needs to
optimize a multivariate performance measure like the F1-score, or
when predicting the alignment between two proteins.

This talk discusses a support vector approach to predicting complex
objects. It generalizes the idea of margins to complex prediction
problems and a large range of loss functions. While the resulting
training problems have exponential size, there is a simple algorithm
that allows training in polynomial time. The algorithm is implemented
in the SVM-Struct software and empirical results will be given for
several examples.
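
SVM-Struct itself trains by solving a quadratic program with a cutting-plane algorithm; the sketch below shows only the shared core of structured prediction, using the much simpler structured perceptron on toy tag sequences (all data and names invented):

```python
from collections import defaultdict
from itertools import product

TAGS = ["DET", "NOUN"]

def features(words, tags):
    # Joint features f(x, y): counts of (word, tag) pairs.
    f = defaultdict(int)
    for word, tag in zip(words, tags):
        f[(word, tag)] += 1
    return f

def predict(w, words):
    # argmax over all tag sequences: exhaustive here for clarity,
    # dynamic programming in real systems.
    def score(tags):
        return sum(w.get(k, 0) * v for k, v in features(words, tags).items())
    return max(product(TAGS, repeat=len(words)), key=score)

def train(data, epochs=5):
    w = defaultdict(int)
    for _ in range(epochs):
        for words, gold in data:
            pred = predict(w, words)
            if pred != tuple(gold):          # update toward the gold structure
                for k, v in features(words, gold).items():
                    w[k] += v
                for k, v in features(words, pred).items():
                    w[k] -= v
    return w

data = [(["the", "dog"], ["DET", "NOUN"]), (["a", "cat"], ["DET", "NOUN"])]
w = train(data)
print(predict(w, ["the", "cat"]))  # ('DET', 'NOUN')
```

The support vector formulation replaces this mistake-driven update with margin constraints between the correct structure and every incorrect one, weighted by a loss function, which is where the exponential-size training problem comes from.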

About the speaker:

Thorsten Joachims is an Assistant Professor in the Department of Computer Science at Cornell University. In 2001, he finished his
dissertation with the title "The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory, and Algorithms", advised by Prof.
Katharina Morik at the University of Dortmund. From there he also received his Diplom in Computer Science in 1997 with a thesis on
WebWatcher, a browsing assistant for the Web. His research interests center on a synthesis of theory and system building in the field of
machine learning, with a focus on Support Vector Machines and machine learning with text. He authored the SVM-Light algorithm and
software for support vector learning. From 1994 to 1996 he was a visiting scientist at Carnegie Mellon University with Prof. Tom
Mitchell.

Friday, December 16th, 11:00am

Coffee at 10:45am

TAY 3.128

Learning Hierarchical Task Networks
from Problem Solving

Pat Langley
Institute for the Study of Learning and Expertise and Stanford University

In this talk, I present a novel approach to representing, utilizing, and
learning hierarchical structures. The new formalism - teleoreactive logic
programs - involves a special form of hierarchical task network that
indexes methods by the goals they achieve. These structures can be used
for reactive but goal-directed execution, and they can be interleaved
with problem solving over primitive operators to address tasks for which
there are no stored methods. Successful problem solving leads to the
incremental creation of new methods that handle analogous tasks directly
in the future. The learning module determines the structure of the
hierarchy, the heads or indices of component methods, and the conditions
on these methods. I report experiments on three domains that demonstrate
rapid learning of both disjunctive and recursive structures that transfer
well to more complex tasks. In closing, I discuss related research on
learning from problem solving and propose directions for future research.
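To make the idea of goal-indexed methods concrete, here is a minimal Python sketch of a method library indexed by the goals its methods achieve. All names (Method, applicable, the drive example) are hypothetical illustrations, not the actual teleoreactive logic program formalism:

```python
from dataclasses import dataclass, field

@dataclass
class Method:
    goal: str                                      # head: the goal this method achieves
    conditions: set = field(default_factory=set)   # when the method applies
    subgoals: list = field(default_factory=list)   # ordered subgoals or primitive operators

# Methods are indexed by the goal they achieve, so retrieval is goal-directed.
library: dict = {}

def add_method(m: Method) -> None:
    """Cache a method learned from a successful problem-solving episode."""
    library.setdefault(m.goal, []).append(m)

def applicable(goal: str, state: set) -> list:
    """Retrieve stored methods for a goal whose conditions hold in the state."""
    return [m for m in library.get(goal, []) if m.conditions <= state]

# After solving a task with primitive operators, a new method is stored so
# analogous tasks can be handled directly in the future:
add_method(Method(goal="at(B)", conditions={"at(A)", "road(A,B)"},
                  subgoals=["drive(A,B)"]))
assert [m.subgoals for m in applicable("at(B)", {"at(A)", "road(A,B)"})] == [["drive(A,B)"]]
```

When no stored method matches, control would fall back to problem solving over primitive operators, and the resulting solution would be cached via add_method.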

This talk describes work done jointly with Dongkyu Choi and Seth Rogers.

About the speaker:

Dr. Pat Langley serves as Director of the Institute for the Study of
Learning and Expertise, Consulting Professor of Symbolic Systems at
Stanford University, and Head of the Computational Learning Laboratory
at Stanford's Center for the Study of Language and Information. He
has contributed to the fields of artificial intelligence and cognitive
science for over 25 years, having published 200 papers and five books
on these topics, including the text Elements of Machine Learning.
Professor Langley is considered a co-founder of the field of machine
learning, where he championed both experimental studies of learning
algorithms and their application to real-world problems before either
were popular and before the phrase 'data mining' became widespread.
Dr. Langley is an AAAI Fellow, he was founding Executive Editor of the
journal Machine Learning, and he was Program Chair for the Seventeenth
International Conference on Machine Learning. His research has dealt
with learning in planning, reasoning, language, vision, robotics,
and scientific knowledge discovery, and he has contributed novel
methods to a variety of paradigms, including logical, probabilistic,
and case-based learning. His current research focuses on methods for
constructing explanatory process models in scientific domains and on
integrated architectures for intelligent physical agents.

Friday, December 9th, 11:00am

Coffee at 10:45am

ACES 2.402

A Connectionist Model of Sentence Comprehension in Visual Worlds

Recent "visual worlds" studies, wherein researchers study language in
situated settings by monitoring eye-movements in a visual scene during
sentence processing, have revealed much about the interaction of scene
information with other information sources such as linguistic, semantic,
and world knowledge, as well as the time course of their influence on
comprehension. These studies underscore how adaptively the human sentence
processor uses all available information to interpret and disambiguate a
sentence more rapidly, and even to anticipate upcoming arguments.
Furthermore, some of these experiments have begun to
provide insight into how the acquisition of language affects the influence
of different information sources.

In this talk, I will describe the modelling of five experiments that
trade off scene context with a variety of linguistic factors using a Simple
Recurrent Network that has been modified to integrate a scene representation
with the standard incremental input of a sentence. The results show that
the model captures the qualitative behavior observed during the experiments,
while retaining the ability to develop the correct interpretation in the
absence of visual input. Moreover, the network correctly models the
empirical observation of the relative importance of visual context over
learned stereotypical associations and supports a developmental account of
how this preference is acquired.
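As a rough illustration of the architecture described above, the following NumPy sketch shows one incremental step of a Simple Recurrent Network extended with a scene input pathway. The layer sizes, random weights, and names are purely illustrative assumptions (a real model would be trained), not the model's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_scene, n_hid, n_out = 8, 6, 10, 5   # illustrative layer sizes

# Untrained weights, for shape illustration only.
W_in  = rng.standard_normal((n_hid, n_in))
W_sc  = rng.standard_normal((n_hid, n_scene))  # scene-representation pathway
W_rec = rng.standard_normal((n_hid, n_hid))    # recurrent (context) weights
W_out = rng.standard_normal((n_out, n_hid))

def srn_step(word_vec, scene_vec, context):
    """One incremental step: combine the current word, the scene, and the
    context carried over from earlier in the sentence."""
    h = np.tanh(W_in @ word_vec + W_sc @ scene_vec + W_rec @ context)
    y = W_out @ h      # the interpretation at this point in the sentence
    return y, h

context = np.zeros(n_hid)
scene = rng.standard_normal(n_scene)   # an absent scene could be a zero vector
for word in [rng.standard_normal(n_in) for _ in range(3)]:
    y, context = srn_step(word, scene, context)
assert y.shape == (n_out,)
```

Feeding a zero vector on the scene pathway corresponds to comprehension in the absence of visual input, which the abstract notes the model still handles.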

About the speaker:

Marshall R. Mayberry, III is currently a postdoctoral researcher at
the Department of Computational Linguistics and Phonetics in
Saarbruecken, Germany. He received his Ph.D. in Computer Science in
2003 from the University of Texas at Austin, where he also received
a MS in Computer Science in 1998, as well as a BS in both Computer
Science and Mathematics in 1993.

His research interests have revolved around the modelling of human
language sentence processing with neural networks. The focus of these
models has been on performance issues such as incrementality,
anticipation, adaptation, and robustness. His dissertation work
demonstrated that a neural network model could be scaled up to parse
sentences from the medium-sized LinGO Redwoods Treebank into semantic
graph structures. His current research has concentrated on the
modelling of how a variety of linguistic and non-linguistic factors
interact during sentence processing.

Coffee at 10:45am

TAY 3.128

CSE Department
University of Texas at Arlington

Computer games provide fertile ground for the study of how humans play games, interact with objects in virtual
environments, and transfer knowledge across scenarios. The genre of first-person shooter (FPS) games comprises
16.3% of the electronic entertainment market and 19% of online gaming. They provide immersive, engaging,
and highly interactive worlds that allow players to engage in behaviors similar to those in the real world. Our
work has involved observing human players complete assigned tasks in these FPS games, deconstructing the immense
environments of this genre into sets of interactive feature points to track human interaction, and evaluating
this captured data to better understand the notions of human performance and human-consistency. We have
produced our own agents based in part on an architecture for intelligence we designed named the D'Artagnan
Cognitive Architecture (DCA). Understanding human and agent performance through a set of performance and clustering
metrics, we have shown that a basic implementation of DCA was able to complete 29% of a reference level set, of
which 73% were within human performance levels and 15.4% were within our definition of human-consistency. We have
further explored the use of neural network control of artificial characters in FPS games and spatial reasoning in
these complex environments. Our current work involves developing a set of scenarios for understanding the transfer
of knowledge between source and target environments and developing artificial agents that can exhibit the same
types of knowledge transfer in our Urban Combat Testbed.

About the speaker:

Dr. Youngblood is currently a Faculty Research Associate, a PI for the DARPA-funded and ISLE-led Transfer Learning
effort, and Chief Scientist for the NSF ITR-funded MavHome Project at the University of Texas at Arlington. He is a
former US Navy Submariner, professional software engineer, and earned his Honors BS in Computer Science and Engineering
in 1999, MS in CSE in 2002, and Ph.D. in CSE in 2005. His research interests are in Entertainment Computing,
Intelligent Systems, Pervasive Computing, and Autonomous Systems.

Monday, November 21st, 1:00pm

Coffee at 12:45pm

Avaya Auditorium, ACES 2.302

Global Inference in Learning for Natural Language Processing

Department of Computer Science and the Beckman Institute
University of Illinois at Urbana-Champaign

Natural language decisions often involve assigning values to sets
of variables where complex and expressive dependencies can
influence, or even dictate, what assignments are possible.
Dependencies may range from simple statistical correlations to
those that are constrained by deeper structural, relational and
semantic properties of the text.

I will describe research on a framework that combines learning and
inference for this problem, of inferring structured and
constrained output. The inference process of assigning globally
optimal values to mutually dependent variables is formalized as an
optimization problem and is solved as an integer linear
programming (ILP) problem. Several key issues will be discussed,
including the incorporation of both statistical and declarative
constraints and training paradigms.

The work will be described in the context of the Semantic Role
Labeling tasks, inferring a shallow semantic analysis of sentences
at the level of "who did what to whom, how, when and why".
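The flavor of constrained global inference can be conveyed with a toy example. Real systems solve an integer linear program; the brute-force search below, with made-up scores and a single no-repeated-role constraint, only illustrates the idea of choosing a globally optimal assignment subject to a declarative constraint:

```python
from itertools import product

# Toy SRL-style inference (hypothetical scores): assign a role to each of
# three argument candidates so that total score is maximal, subject to the
# declarative constraint that each core role is used at most once.
ROLES = ["ARG0", "ARG1", "NONE"]
scores = [  # scores[i][role] for candidate phrase i
    {"ARG0": 2.0, "ARG1": 0.5, "NONE": 0.1},
    {"ARG0": 1.8, "ARG1": 1.5, "NONE": 0.2},
    {"ARG0": 0.1, "ARG1": 0.3, "NONE": 0.4},
]

def feasible(assign):
    """Declarative constraint: no core role appears twice."""
    core = [r for r in assign if r != "NONE"]
    return len(core) == len(set(core))

# Globally optimal assignment over mutually dependent variables.
best = max(
    (a for a in product(ROLES, repeat=len(scores)) if feasible(a)),
    key=lambda a: sum(scores[i][r] for i, r in enumerate(a)),
)
```

Note that labeling each candidate independently would pick ARG0 twice here; the constraint forces the second candidate to its globally optimal role, ARG1, which is the point of inference over mutually dependent variables.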

About the speaker:

Dan Roth is an Associate Professor in the Department of Computer
Science at the University of Illinois at Urbana-Champaign and the
Beckman Institute of Advanced Science and Technology (UIUC). He is
a fellow of the Institute of Advanced Studies at the University of
Illinois and a Willett Faculty Scholar of the College of
Engineering. Prof. Roth received his B.A. summa cum laude in Mathematics
from the Technion, Israel, in 1982 and his Ph.D. in Computer
Science from Harvard University in 1995.

Professor Roth's research spans both theoretical work in machine
learning and intelligent reasoning and work on applying learning
and inference to intelligent human-computer interaction --
focusing on learning and inference for natural language
understanding tasks and intelligent access to free-form
textual information. Prof. Roth has published over 80 papers in
machine learning, natural language processing, knowledge
representation and reasoning and has developed a learning system
that has been used widely in this field. His paper "Learning in
Natural Language" received the best paper award at the
International Joint Conference on Artificial Intelligence (IJCAI)
in 1999. Among other awards are the NSF CAREER Award (1999), the
Xerox Award for Faculty Research (2001,2005), and the University
of Illinois Award for Research with Undergraduates (2002).

Prof. Roth has presented several invited talks at international
conferences, including keynote addresses at the Conference on
Natural Language Learning (CoNLL-2000), Empirical Methods in
Natural Language Processing (EMNLP-2002) and the European
Conference on Machine Learning (ECML-2002). He was an editor of
the Journal of Computational Linguistics, and is currently an
action editor for the Machine Learning Journal and on the
editorial board of Computational Intelligence and the Journal of
Artificial Intelligence Research; he has served on the committees
of all major conferences in Machine Learning, Learning Theory,
Computational Linguistics and Artificial Intelligence. He was the
program co-chair of CoNLL'02 and the program co-chair of ACL'03,
the main international meeting of the Association for
Computational Linguistics and the natural language processing
community. He is currently the elected president of SIGNLL, the
Association for Computational Linguistics' Special Interest Group
on Natural Language Learning.

Friday, November 11th, 11:00am

Coffee at 10:45am

ACES 2.302 Avaya Auditorium

Top-down and bottom-up influences in human language comprehension

Department of Brain and Cognitive Sciences
MIT

In this talk, I will summarize recent language processing data from my lab
that explore the influence of a variety of factors on human language
comprehension, including syntactic structure, working memory, lexical
frequency, and discourse context. Experimental evidence for a number of
hypotheses will be summarized including:

(1) Local connections between syntactic/semantic dependents are easier to
process than more distant connections. This factor helps to explain why
sentences like (b) are so much easier to understand than sentences like (a),
despite having the same words and meaning:
a. The reporter who the senator who John met at the party attacked was
criticized by the editor.
b. At the party, John met the senator who attacked the reporter who was
criticized by the editor.

(2) The human sentence processor is sensitive to syntactic expectations that
are relatively certain to occur, such as a verb following a sequence like
"The claim that the baseball player would hold out for more money ...". The
greater the number of open expectations, the greater the local processing
load.

(3) The human sentence processor is sensitive to the frequencies of
different senses of words. For example, the word "that" is 78%
complementizer; 11% determiner; 11% pronoun in large corpora of written
text. The high bias of complementizers affects people's processing of this
word, even in environments where complementizers are virtually impossible,
e.g., "I visited that hotel last week."

Data from multiple languages will be summarized, including English, Japanese
and Chinese.

About the speaker:

Ted Gibson was educated at Queen's University and Cambridge University, and
received his Ph.D. in Computational Linguistics from Carnegie Mellon
University in 1991. He is currently a Professor of Cognitive Science in
the Department of Brain and Cognitive Sciences, with a joint position in
the Linguistics Department at MIT, where he has been a faculty member
since 1993.

His research interests include all factors that make putting words,
phrases and sentences together easy or difficult to process, primarily in
comprehension, but also in production. Four major research avenues that
he has been pursuing in recent years are: (1) word order and sentence
complexity / working memory and sentence complexity; (2) syntactic
representational issues (e.g., are sentences context-free?); (3) discourse
coherence representation issues (e.g., are discourse structures
context-free?); and (4) the relationship between intonational boundary
information and syntactic structure. Two primary kinds of methods are used
in order to investigate these issues: (1) behavioral methods like reading
and listening paradigms in order to gather reaction time and response
accuracy data; and (2) corpus analyses.

Friday, November 11th, 3:00pm

Coffee at 2:45pm

TAY 3.128

Semantics for Autonomy and Interdependence in Agents

Dr. Carl E. Hewitt, Associate Professor (Emeritus)

Department of Electrical Engineering and Computer Science
MIT

In this talk I present Participatory Semantics for autonomy
and interdependence in agents. It is based on Actor semantics
(which in turn is based on physics) where Actors are the
universal primitives of concurrent digital computation.
In response to a message that it receives, an Actor can
make local decisions, create more Actors, send more messages,
and determine how to respond to the next message received.
A serializer is an Actor that is continually open to the
arrival of messages. A distinguishing characteristic of
the Actor model is that every message sent to a serializer
must arrive, although this can take an unbounded amount of
time. (However, the Actor model can be augmented with metrics.)
Participatory Semantics of commitments provides means for
studying issues of autonomy and interdependence in agents.
Interdependence can be manifested in conversation and negotiation
with others; autonomy in reasoning.
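As a rough illustration of the Actor primitives described above (making local decisions, creating more Actors, sending messages, and choosing how to respond to the next message), here is a minimal synchronous Python sketch. The names and the single-threaded scheduler are simplifying assumptions; real Actor systems are concurrent, and message arrival may take unbounded time:

```python
from collections import deque

class Actor:
    """A serializer-like Actor: continually open to arriving messages."""
    def __init__(self, behavior):
        self.behavior = behavior   # how to respond to the next message
        self.mailbox = deque()     # messages that have arrived but not been handled

    def send(self, msg):
        self.mailbox.append(msg)   # delivery is guaranteed, eventually

    def step(self):
        if self.mailbox:
            msg = self.mailbox.popleft()
            # The behavior may make local decisions, send messages, create
            # actors, and return a (possibly new) behavior for the next message.
            self.behavior = self.behavior(self, msg) or self.behavior

def counter(actor, msg):
    """Example behavior: count "inc" messages; keeps the same behavior."""
    actor.count = getattr(actor, "count", 0) + (1 if msg == "inc" else 0)

a = Actor(counter)
a.send("inc"); a.send("inc"); a.send("noop")
while a.mailbox:
    a.step()
assert a.count == 2
```

Returning a new behavior from the handler is what lets an Actor determine how it responds to the next message it receives.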

About the speaker:

Carl E. Hewitt is an Associate Professor (Emeritus) in the
Electrical Engineering and Computer Science department at
the Massachusetts Institute of Technology (MIT).

Carl is known for his design of Planner, the first
Artificial Intelligence programming language based on procedural
plans invoked using pattern-directed invocation
from assertions and goals.

Thursday, October 27th, 3:45pm

Coffee at 4:00pm

Taylor 2.106

A Quantitative Theory of Neural Computation

Division of Engineering and Applied Sciences
Harvard University

A central open question of neuroscience is to identify the data structures
and algorithms that are used in neural systems to support successive acts
of basic tasks such as memorization and association. We describe a theory
of neural computation based on three physical parameters: the number n of
neurons, the number d of synaptic connections per neuron, and the inverse
synaptic strength k expressed as the number of presynaptic action
potentials needed to cause a postsynaptic action potential. Our fourth
parameter r expresses the number of neurons that represent a real world
item. We describe a computational mechanism for realizing hierarchical
memorization and other cognitive tasks that implies a relationship among
these four parameters. For the locust olfactory system estimates for all
four parameters are available and we show that these numbers are in
agreement with the theory's predictions. In human medial temporal lobe
neurons that represent invariant concepts have been identified and we
offer a quantitative mechanistic explanation of these otherwise
paradoxical findings. More generally, we identify two useful regimes for
neural computation, one with r and k large where each neuron may represent
many items, and another in which r is small, k is 1 and every neuron
represents at most one item.
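One way to see how the four parameters interact is a back-of-the-envelope calculation. Under an independent-edge approximation (my assumption, not necessarily the talk's model), the chance that a neuron receives at least k of its inputs from the r neurons representing an item is a binomial tail probability:

```python
from math import comb

def p_fire(n, d, k, r):
    """Chance a neuron fires when an item's r neurons are active: with d
    random connections per neuron among n neurons, each item neuron is
    presynaptic with probability ~ d/n, so the number of active inputs is
    Binomial(r, d/n) and the neuron fires when at least k arrive."""
    p = d / n
    return sum(comb(r, j) * p**j * (1 - p)**(r - j) for j in range(k, r + 1))

# Illustrative numbers only (not the locust estimates from the talk):
n, d, k, r = 100_000, 1_000, 2, 500
assert 0.0 < p_fire(n, d, k, r) < 1.0
assert p_fire(n, d, k + 1, r) < p_fire(n, d, k, r)  # larger k (weaker synapses) -> fewer firings
```

Such a calculation makes the two regimes plausible: with k large and r large many neurons cross threshold and each may represent many items, while with k = 1 and r small only directly connected neurons respond.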

About the speaker:

Leslie Valiant was educated at King's College, Cambridge; Imperial
College, London; and at Warwick University where he received his Ph.D. in
computer science in 1974. He is currently T. Jefferson Coolidge Professor
of Computer Science and Applied Mathematics in the Division of Engineering
and Applied Sciences at Harvard, where he has taught since 1982. Before
coming to Harvard he had taught at Carnegie-Mellon University, Leeds
University, and the University of Edinburgh.

His work has ranged over several areas of theoretical computer science,
particularly complexity theory, computational learning, and parallel
computation. He also works in computational neuroscience, where his
interests include understanding memory and learning.

He received the Nevanlinna Prize at the International Congress of
Mathematicians in 1986 and the Knuth Award in 1997. He is a Fellow of the
Royal Society (London) and a member of the National Academy of Sciences
(USA).

Thursday, September 29th, 3:30pm

Coffee at 3:15pm

Taylor 3.128

Probabilistic Policy Reuse

Computer Science Department
Carnegie Mellon University

We define Policy Reuse as a Reinforcement Learning technique guided by
past policies, which poses the challenge of balancing three choices:
exploiting the policy currently being learned, exploring new random
actions, and exploiting past policies. Policy Reuse rests on two
cornerstones: an exploration strategy that biases exploration of the
domain with a predefined past policy, and a similarity metric that
estimates how similar past policies are to a new one. Policy Reuse
contributes three main capabilities to lifelong Reinforcement
Learning. First, it gives Reinforcement Learning algorithms a
mechanism to bias an exploration process by reusing a set of past
policies that we call the Policy Library; second, it provides an
incremental method to build such a library of policies; and third,
our method to build the Policy Library has a novel
side-effect in terms of learning the structure of the domain, i.e.,
the basis or the "eigen-policies" of the domain. We demonstrate
theoretically that, if some conditions are satisfied, reusing such a
set of "eigen-policies" allows us to bound the minimal expected
gain received while learning a new policy. We also provide empirical results
demonstrating that Policy Reuse improves the learning performance over
different strategies that learn from scratch.
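A hedged sketch of the kind of exploration strategy described above: with some probability follow the past policy, otherwise act epsilon-greedily on the new task's Q-values. The function and parameter names are my assumptions, and a full implementation would also decay the reuse probability over the episode so control shifts from the past policy to the new one:

```python
import random

def pi_reuse_action(state, past_policy, Q, actions, psi=0.9, epsilon=0.1):
    """Choose an action balancing three options: exploit a past policy
    (probability psi), explore randomly (probability epsilon otherwise),
    or exploit the Q-values of the policy currently being learned."""
    if random.random() < psi:
        return past_policy(state)              # bias exploration with the past policy
    if random.random() < epsilon:
        return random.choice(actions)          # explore a new random action
    return max(actions, key=lambda a: Q.get((state, a), 0.0))  # exploit the new policy

# Tiny usage example with made-up Q-values and a trivial past policy:
actions = ["left", "right"]
Q = {("s0", "left"): 0.2, ("s0", "right"): 0.7}
past = lambda s: "left"
a = pi_reuse_action("s0", past, Q, actions, psi=0.0, epsilon=0.0)
assert a == "right"   # with psi=0 and epsilon=0, acts greedily on Q
```

How often the past policy actually improves returns during such reuse is one natural signal for the similarity metric the abstract mentions.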

About the speaker:

Fernando Fernandez is a postdoctoral fellow at the Computer Science
Department of CMU. He received his Ph.D. degree in Computer Science
from University Carlos III of Madrid (UC3M) in 2003. He received his
B.Sc. in 1999 from UC3M, also in Computer Science. In the fall of
2000, Fernando was a visiting student at the Center for Engineering
Science Advanced Research at Oak Ridge National Laboratory
(Tennessee). From 2001 he was an assistant and later an associate professor at
UC3M. He is the recipient of a pre-doctoral FPU fellowship award from
the Spanish Ministry of Education (MEC), a Doctoral Prize from UC3M, and a
MEC-Fulbright postdoctoral Fellowship.

Fernando is interested in intelligent systems that operate in
continuous and stochastic domains. In his thesis, entitled
"Reinforcement Learning in Continuous State Spaces", he studied
different discretization methods of the state space in Reinforcement
Learning problems, specifically Nearest Prototype approaches. Since
arriving at CMU, he has focused his research on the transfer of
policies between different Reinforcement Learning tasks and on how to
bias the exploration of new learning processes with previously learned
policies. Applications of his research include robot soccer, adaptive
educational systems, and tourism support tools.