Department of Knowledge Engineering

 

SIKS-DKE Colloquia


2012


Monday, December 10, 2012, 16:00-17:00
Dr. Ziv Hellman, Tel Aviv University, Israel.
Title: Bayesian Games over Continuum States
Abstract:  We present an example of a Bayesian game with a continuum of states that possesses no Bayesian approximate equilibria, resolving a question that had been open since Simon (2003) exhibited a game with no Bayesian exact equilibria. To achieve this we show a close relationship between strategic considerations in overlapping generations games and certain Bayesian games. The result naturally leads to the question of necessary and sufficient conditions for the existence of Bayesian equilibria in games with a continuum of states, a question that will also be addressed in the talk, time permitting.
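
As background for readers outside game theory, here is the standard notion of an approximate equilibrium in a Bayesian game; this is textbook material, not the talk's construction:

```latex
% In a Bayesian game a state \omega is drawn from a state space \Omega,
% each player i observes only her own signal about \omega, and then
% chooses an action. A strategy profile \sigma is a Bayesian
% \varepsilon-equilibrium if no player can gain more than \varepsilon
% in expectation by deviating unilaterally:
\[
  \mathbb{E}_{\omega}\big[\, u_i(\sigma_i, \sigma_{-i}; \omega) \,\big]
  \;\ge\;
  \mathbb{E}_{\omega}\big[\, u_i(\tau_i, \sigma_{-i}; \omega) \,\big] - \varepsilon
  \qquad \text{for all players } i \text{ and all strategies } \tau_i .
\]
% An exact Bayesian equilibrium is the case \varepsilon = 0; the talk's
% example admits no \varepsilon-equilibrium for some \varepsilon > 0.
```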


Wednesday, November 28, 2012, 11:00-12:00
Dr. Nathan Sturtevant, University of Denver
Title: Parallelizing External Memory Search 
Abstract:  In this talk we look in detail at several forms of external-memory search and how they have been run in parallel. In addition to comparing metrics provided in research papers, we also describe our new approach, which minimizes writes in external-memory search. We describe the parallel performance of the approach, and provide the details of performing several breadth-first searches of the single-agent Chinese Checkers game, which has 1.88x10^12 states.
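
To make the setting concrete, here is a minimal sketch of breadth-first search with delayed duplicate detection, one standard ingredient of external-memory search; all names are illustrative, and the write-minimizing approach described in the talk differs in its details:

```python
import os

def external_bfs(start, successors, layer_dir="bfs_layers"):
    """BFS with one disk file per depth layer; states are strings.

    Returns the depth of the last non-empty layer. For brevity the
    successor buffer is an in-RAM set; a real implementation streams
    successors to disk and deduplicates by merging sorted runs.
    """
    os.makedirs(layer_dir, exist_ok=True)

    def write_layer(depth, states):
        path = os.path.join(layer_dir, "layer_%d.txt" % depth)
        with open(path, "w") as f:
            for s in sorted(states):      # sorted files enable merge dedup
                f.write(s + "\n")
        return path

    def read_layer(path):
        with open(path) as f:
            return [line.strip() for line in f]

    layer_paths = [write_layer(0, {start})]
    depth = 0
    while True:
        succ = set()
        for s in read_layer(layer_paths[-1]):
            succ.update(successors(s))
        # Delayed duplicate detection: subtract states seen in earlier
        # layers only after the whole frontier has been expanded.
        for path in layer_paths:
            succ.difference_update(read_layer(path))
        if not succ:
            return depth
        depth += 1
        layer_paths.append(write_layer(depth, succ))
```

For undirected state spaces only the previous two layers need subtracting, which is one of the trade-offs such implementations exploit.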


Thursday, November 22, 2012, 16:00-17:00
Dr. Marc Lanctot, Department of Knowledge Engineering, Maastricht University.
Title: Monte-Carlo Sampling & Regret Minimization for Equilibrium Computation and Decision-Making in Large Extensive-Form Games
Abstract:  In this talk I will give an overview of my Ph.D. research work at the University of Alberta with Michael Bowling. Generally, we investigate the problem of decision-making in large two-player zero-sum games using Monte Carlo sampling and regret minimization methods. We demonstrate four contributions. The first is Monte Carlo Counterfactual Regret Minimization (MCCFR): a generic family of sample-based algorithms that compute near-optimal equilibrium strategies. Secondly, we develop a theory for applying counterfactual regret minimization to a generic subset of imperfect recall games as well as a lossy abstraction mechanism for reducing the size of very large games. Thirdly, we describe Monte Carlo *-Minimax Search (MCMS): an adversarial search algorithm based on *-Minimax that uses sparse sampling. Finally, we present variance reduction techniques that can be used in these settings, with a focused application to Monte Carlo Tree Search (MCTS).
We thoroughly evaluate our algorithms in practice using several different domains and sampling strategies. I will conclude the talk by outlining potential future research projects.
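
As a taste of the machinery, here is regret matching, the per-information-set update at the heart of CFR and its Monte Carlo variants; this is a generic sketch, not code from the thesis:

```python
# Regret matching at a single decision point: play each action with
# probability proportional to its positive cumulative regret. Full CFR
# applies this at every information set of the game tree and averages
# the strategies over iterations to approach equilibrium.
def regret_matching(cumulative_regret):
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regret)
    return [1.0 / n] * n          # no positive regret: play uniformly
```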


Wednesday, November 14, 2012, 11:00-12:00
Abdallah Saffidine, LAMSADE, Université Paris-Dauphine, France
Title: Alpha-beta pruning for games with simultaneous moves (joint work with Hilmar Finnsson and Michael Buro)
Abstract:  Alpha-Beta pruning is one of the most powerful and fundamental MiniMax search improvements. It was designed for sequential two-player zero-sum perfect information games. In this work we introduce an Alpha-Beta-like sound pruning method for the more general class of “stacked matrix games” that allow for simultaneous moves by both players. This is accomplished by maintaining upper and lower bounds for achievable payoffs in states with simultaneous actions, and by pruning dominated actions based on the feasibility of certain linear programs. Empirical data shows considerable savings in terms of expanded nodes and computation time compared to naive depth-first move computation without pruning.
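
For reference, a sketch of the sequential alpha-beta recursion that this work generalizes; in the simultaneous-move setting the max/min alternation below is replaced by upper and lower bounds on the value of a matrix game at each state, plus the LP-based domination tests mentioned above:

```python
# Classic sequential alpha-beta. `children(state)` returns a list of
# successor states; `evaluate(state)` scores leaves from Max's view.
def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    succ = children(state)
    if depth == 0 or not succ:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for child in succ:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break          # beta cutoff: Min would avoid this node
        return value
    value = float("inf")
    for child in succ:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break              # alpha cutoff: Max would avoid this node
    return value
```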


September 28, 2012, 13:30-14:30
Dr. Damien Ernst, University of Liège
Title: Meta-learning for exploration-exploitation strategies in reinforcement learning
Abstract:  In this talk, I will present a meta-learning approach for learning exploration-exploitation (E/E) strategies in reinforcement learning. This approach supposes that prior information is available on the E/E problem to be played and works as follows: (i) it models prior knowledge in the form of a probability distribution over the target class of E/E problems; (ii) it chooses a large hypothesis space of candidate E/E strategies; and (iii) it solves an optimization problem to find a candidate E/E strategy of maximal average performance over a sample of problems drawn from the prior distribution.
I will illustrate this meta-learning approach on different types of E/E problems and for two different hypothesis spaces: one where E/E strategies are numerically parameterized and another where E/E strategies are represented as small symbolic formulas.
I will discuss results showing that this meta-learning approach systematically outperforms state-of-the-art E/E strategies published in the literature.
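
A toy rendering of the three-step recipe, under our own illustrative assumptions: the prior is a distribution over two-armed Bernoulli bandits, the hypothesis space is epsilon-greedy strategies indexed by epsilon, and the optimization is a simple grid search over candidate values:

```python
import random

def run_epsilon_greedy(arm_probs, epsilon, horizon=500):
    """Play one two-armed bandit with epsilon-greedy; return total reward."""
    counts, values, total = [0, 0], [0.0, 0.0], 0
    for _ in range(horizon):
        if random.random() < epsilon:
            a = random.randrange(2)                     # explore
        else:
            a = 0 if values[0] >= values[1] else 1      # exploit
        r = 1 if random.random() < arm_probs[a] else 0
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]        # incremental mean
        total += r
    return total

def meta_learn(candidate_epsilons, n_problems=200):
    best = None
    for eps in candidate_epsilons:
        # (i) sample problems from the prior; (iii) average performance.
        score = sum(run_epsilon_greedy([random.random(), random.random()], eps)
                    for _ in range(n_problems)) / n_problems
        if best is None or score > best[1]:
            best = (eps, score)
    return best[0]

print(meta_learn([0.0, 0.05, 0.1, 0.2, 0.5]))
```

The talk's hypothesis spaces (numeric parameterizations and small symbolic formulas) are far richer than a single epsilon, but the averaging-over-sampled-problems structure is the same.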


July 20, 2012, 11:00-12:00
Prof. Dr. Reuven Rubinstein, Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, Haifa, Israel
Title: The Splitting Method
Abstract:  We present a generic randomized algorithm, called the splitting or cloning algorithm, for combinatorial optimization, counting, and sampling uniformly on complex sets, such as the set defined by the constraints of an integer program. Like the original classic randomized algorithms, ours uses a sequential sampling plan to decompose a "difficult" problem into a sequence of easy ones. It is, in fact, a combination of MCMC (Markov Chain Monte Carlo), like the Gibbs sampler, with a specially designed cloning mechanism. The latter runs multiple Markov chains in parallel while making sure that all of them are in steady state at each iteration.
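
A minimal sketch of the splitting/cloning idea on a toy counting problem, where all design choices (level rule, bit-flip move, parameters) are our illustrative assumptions rather than the algorithm's actual components:

```python
import random

def splitting_count(n=30, k=25, pop=1000, rho=0.1):
    """Estimate |{x in {0,1}^n : sum(x) >= k}| by levels and cloning."""
    samples = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    level, product = 0, 1.0
    while level < k:
        scores = sorted(sum(s) for s in samples)
        # Next level: the (1 - rho) quantile of current scores, forced to
        # make progress and capped at the target threshold k.
        level = min(k, max(level + 1, scores[int((1 - rho) * pop)]))
        elites = ([s for s in samples if sum(s) >= level]
                  or [max(samples, key=sum)])     # guard against extinction
        product *= len(elites) / pop              # conditional probability
        # Clone elites back to full size, then one Gibbs-like bit-flip
        # move per clone, rejected if it drops below the current level.
        samples = [list(random.choice(elites)) for _ in range(pop)]
        for s in samples:
            i = random.randrange(n)
            s[i] ^= 1
            if sum(s) < level:
                s[i] ^= 1                         # reject: undo the flip
    return product * 2 ** n                       # estimated set size

print(splitting_count())   # true count is 174,437 for n=30, k=25
```

The estimate is the product of the per-level conditional probabilities times the size of the full space, mirroring the "sequence of easy problems" decomposition described above.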


July 11, 2012, 16:00-17:00
Tom J. de Jong, Institute of Biology, Leiden University

Title: Games Plants Play
Abstract:  Evolution implies change in allele frequencies. The evolutionary game is played between different alleles that are coupled to different phenotypes and therefore pass on different numbers of copies to the next generation. It can be predicted that an allele that increases seed production will win the game. We could also say that this winning allele has the highest fitness. This seems obvious. In many situations it is, however, not easy to predict which allele/phenotype combination will prevail and how this depends on the environment. Biologists have therefore developed several models to predict evolutionary change under different scenarios. I will give examples of 1) straightforward optimization, 2) Evolutionarily Stable Strategy (ESS) models, 3) intragenomic conflict and 4) intergenomic conflict. The examples illustrate that the outcome of evolution is not always what is best for the species or even for alleles in the genome of the parents.
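
For reference, the textbook Evolutionarily Stable Strategy condition behind the ESS models mentioned under 2), due to Maynard Smith and Price:

```latex
% With u(x, y) the fitness of a rare individual playing x in a population
% playing y, a strategy s* is an ESS if, for every mutant strategy s != s*,
\[
  u(s^{*}, s^{*}) > u(s, s^{*})
  \quad\text{or}\quad
  \big(\, u(s^{*}, s^{*}) = u(s, s^{*}) \ \text{and}\ u(s^{*}, s) > u(s, s) \,\big),
\]
% i.e. a population playing s* cannot be invaded by any rare mutant.
```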


Friday, July 6, 2012, 13:00-14:00
Peter Reutemann, Waikato ML group, Hamilton, New Zealand
Title: Real-world workflow applications

Abstract:  Building applications with predictive models on real-world data can be a complex task, due to noisy and/or multidimensional data. Using a workflow approach allows the professional/researcher to keep an overview of all the required pre-processing steps and models in use.
But not all workflow engines are alike. Most engines nowadays follow the "canvas" approach, with the user placing operators on a large canvas and then connecting them manually, determining how the data flows. Despite being rather intuitive, this approach gets tedious once one develops flows beyond a few tens of operators. ADAMS, short for Advanced Data mining And Machine learning System, instead uses a layout technique with implicit data flow. This frees the user from having to connect operators, resulting in rapid workflow development and a very compact layout, which easily scales up to several thousand operators. In my talk I will present the ADAMS system, what sets it apart from other workflow engines, where its strengths and weaknesses lie, and some of its applications currently in operation.


Friday, July 6, 2012, 14:30-15:30
Sander M. Bohte, NeuroInformatics Lab, Life Sciences Group, CWI Amsterdam
Title:  Efficient Coding in Spiking Neural Networks
Abstract:  While biological neurons communicate via fast electrical pulses, it is still unclear how information is encoded in these pulses. We propose a simple model of dynamical neural coding with spiking neurons which 1) is consistent with biologically realistic spiking neuron models, 2) matches key experimental data, and 3) maximizes information transmission. Since neural coding is central to neural information processing, the framework allows for efficient and continuous analog computation with networks of spiking neurons. Possible applications of such networks are continuous-time computations like dynamical control, for instance in cognitive robotics.
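
As a concrete (and deliberately simplified) example of the kind of spiking neuron model referred to above, here is a leaky integrate-and-fire neuron; this is our illustrative assumption, not the coding model proposed in the talk:

```python
def lif_spike_train(input_current, dt=1e-3, tau=0.02,
                    v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Integrate input current; emit a spike (1) whenever threshold is hit."""
    v, spikes = v_rest, []
    for i_t in input_current:
        # Membrane potential leaks toward rest and integrates the input.
        v += dt * (-(v - v_rest) / tau + i_t)
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset                  # reset after firing
        else:
            spikes.append(0)
    return spikes

# Spike count over one simulated second of constant drive.
print(sum(lif_spike_train([60.0] * 1000)))
```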


Monday, June 25, 2012, 15:30-16:30
Nikos Vlassis, Luxembourg Centre for Systems Biomedicine, University of Luxembourg

Title:  Stochastic POMDP controllers: Their computational properties and potential applications in biology.
Abstract:  Partially observable Markov decision processes (POMDPs) are stochastic models for decision making under uncertainty. One way to tackle the complexity of POMDPs is via finite-state controllers, which are probabilistic mappings from observation sequences to actions. In this talk I will give a short introduction to POMDPs and stochastic controllers, then I will present some recent results on the computational (in)tractability of this class of models (recently published work), and finally I will outline some potential applications in molecular biology (ongoing work).
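
A sketch of the object under discussion: a stochastic finite-state controller keeps a small internal memory node, samples actions from a per-node distribution, and updates the node stochastically on each observation, thereby compressing the observation history. The class structure and names are illustrative:

```python
import random

class StochasticFSC:
    def __init__(self, action_probs, transition_probs):
        # action_probs[node][action] = P(action | node)
        # transition_probs[node][obs][node2] = P(node2 | node, obs)
        self.action_probs = action_probs
        self.transition_probs = transition_probs
        self.node = 0                    # start in node 0

    def _sample(self, dist):
        """Draw an index from a discrete probability vector."""
        r, acc = random.random(), 0.0
        for outcome, p in enumerate(dist):
            acc += p
            if r < acc:
                return outcome
        return len(dist) - 1

    def act(self):
        return self._sample(self.action_probs[self.node])

    def observe(self, obs):
        # Internal memory update: the node summarizes what has been seen.
        self.node = self._sample(self.transition_probs[self.node][obs])
```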

Nikos Vlassis received a diploma and a PhD from the National Technical University of Athens, Greece. From 2000 until 2006 he was a faculty member at the Universiteit van Amsterdam, and from 2007 until 2010 he was a faculty member at the Technical University of Crete. Currently he is a PI at the Luxembourg Centre for Systems Biomedicine. His research interests are in Machine Learning and Bioinformatics.


June 14, 2012, 16:00-17:00
Michael Johanson, Department of Computing Science, University of Alberta, Edmonton, Canada
Title: Achieving Professional-Level Strength in Computer Poker
Abstract:  Realistic decision-making scenarios often involve multiple agents and incomplete information. The game of poker is a canonical example of such scenarios, where human professionals wager and bluff in casinos for high stakes.  As viewed by computing scientists, the game is a challenging domain with imperfect information, stochastic events, and an emphasis on identifying and exploiting an opponent's errors in order to maximize one's expected winnings.  Even the smallest popular variant played by human professionals has 10^18 game states and 10^14 decision points in the game tree, and simply writing a complete policy for this domain would require nearly two petabytes of disk.  In 2008 the University of Alberta produced a program called Polaris that defeated top human professionals for the first time in a meaningful poker match, and since then progress has continued to drive towards superhuman and optimal play. Creating Polaris required us to address a stack of challenges including state space abstraction, efficient algorithms for finding robust strategies, modelling and adapting to an opponent given limited observations, and accurately evaluating the resulting agent. In this talk, I will give an overview of these challenges, describe the core general algorithm behind our programs, and describe its competitions against top human professional players.
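
A back-of-the-envelope check of the storage claim, under our own assumptions (one probability per legal action, stored as an 8-byte double, with roughly 2-3 actions per decision point):

```latex
\[
  10^{14} \ \text{decision points}
  \times 2.5 \ \text{actions}
  \times 8 \ \text{bytes}
  \;=\; 2 \times 10^{15} \ \text{bytes}
  \;=\; 2 \ \text{petabytes},
\]
% consistent with the "nearly two petabytes" figure quoted above.
```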


May 30, 2012, 15:30-16:30
Matthew E. Taylor, Department of Computer Science, Lafayette College

Title: Real-Life Learning Agents
Abstract:  Agents, defined as programs or robots that interact with their environment, are becoming increasingly common. However, the current generation of agents is often unable to robustly interact with each other, or with humans, severely limiting the number of tasks that they can accomplish. Furthermore, these agents are typically unable to adapt to their environment, a critical skill when programmers do not have full knowledge of agents' future environments or when the agents' environment may change over time. This talk will discuss recent work in combining autonomous learning of sequential decision-making tasks with transfer learning, a general approach to sharing knowledge between agents with different capabilities, resulting in significant improvements to learning speeds and abilities.
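
One simple instance of the transfer mechanism described above, under our own illustrative assumptions: warm-starting a target task's Q-table from a source task via a hand-given inter-task mapping:

```python
def transfer_q_values(source_q, map_state, map_action, target_sa_pairs):
    """Warm-start a target Q-table from a learned source Q-table.

    map_state / map_action translate target states and actions into
    their source-task analogues (the inter-task mapping).
    """
    target_q = {}
    for s, a in target_sa_pairs:
        # Default to 0.0 when the mapping has no analogue, so those
        # entries are simply learned from scratch in the target task.
        target_q[(s, a)] = source_q.get((map_state(s), map_action(a)), 0.0)
    return target_q
```

Ordinary reinforcement learning then proceeds from this biased starting point instead of from zeros, which is where the learning-speed improvements come from.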

Matthew E. Taylor graduated magna cum laude with a double major in computer science and physics from Amherst College in 2001. After working for two years as a software developer, he began his Ph.D. with an MCD fellowship from the College of Natural Sciences. He received his doctorate from the Department of Computer Sciences at the University of Texas at Austin in the summer of 2008, supervised by Peter Stone.
Matt recently completed a two-year postdoctoral research position at the University of Southern California with Milind Tambe and is now an assistant professor in the computer science department at Lafayette College. His current research interests include intelligent agents, multi-agent systems, reinforcement learning, and transfer learning.


May 9, 2012, 16:00-17:00
Ruben van der Zwaan, Maastricht University
Title: Intentions matter
Abstract:  Everyone uses a search engine to find information on the internet. Despite rapid advances in relating keywords to relevant pages, ordering the results remains difficult. People have different intentions when searching, even if they use the exact same keyword! The keyword itself can be ambiguous, and different people expect different results. For example, when searching for "Maastricht", both the municipality and the tourist information website are highly relevant. However, to a tourist the municipality information is less interesting, while to a local it is relevant. Another difference is the quantity of results: for research one needs many results, but for navigational purposes a single result suffices.
To see why ordering results merely based on their significance to the majority can lead to bad orderings, consider the following example. What is the first thought about "lion"? Among other things, Lion can refer to the recent operating system for Apple computers; the species; a shop for beds; or the acronym of the Leiden Institute of Physics (LION). If we only consider the intent of the majority when searching for "lion", results for the operating system and the species will probably dominate the first few pages of search results.
Our goal is to find, for a given keyword, an ordering of relevant pages that minimizes the average effort of ALL users.
This combinatorial problem can be elegantly stated and is related to other well-known problems. Unfortunately, it is very unlikely that an efficient algorithm exists that finds the optimal ordering, since the problem is NP-hard. I will present an intuitive method that finds, in polynomial time, an ordering that is at most a constant factor worse than the optimal ordering.
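
The flavor of such constant-factor methods can be seen in the greedy algorithm for the closely related min-sum set cover problem, where each user accepts a set of pages, a user's effort is the position of the first acceptable page, and greedily placing the page that satisfies the most unsatisfied users yields a 4-approximation (Feige, Lovász, and Tetali). Whether this matches the talk's exact algorithm is our assumption:

```python
def greedy_ordering(pages, users):
    """users: one set of acceptable pages per user. Returns a page order."""
    remaining = [set(u) for u in users]   # users not yet satisfied
    unplaced = set(pages)
    order = []
    while unplaced and any(remaining):
        # Place the page acceptable to the most not-yet-satisfied users.
        best = max(unplaced, key=lambda p: sum(1 for u in remaining if p in u))
        order.append(best)
        unplaced.discard(best)
        remaining = [u for u in remaining if best not in u]
    order.extend(sorted(unplaced))        # pages nobody wants go last
    return order

# Echoing the "lion" example: most users want the OS or the species.
print(greedy_ordering(
    ["lion-os", "lion-species", "bed-shop", "lion-physics"],
    [{"lion-os"}, {"lion-os"}, {"lion-species"}, {"lion-physics"}]))
```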


April 18, 2012, 15:00-16:00
Michael Siek, Delft University of Technology and UNESCO-IHE
Title: Chaotic Behavior: From Self-Similarity To Prediction

Abstract:  Chaos theory has become an important field of study in mathematics for understanding and modeling complex dynamical systems, with many areas of application, including engineering, biology, economics, geophysics, and psychology. Methods in chaos theory have been significantly enhanced by research emerging since Edward Lorenz's discovery in 1963, during his experiments with a simplified atmospheric model, of the sensitivity to initial conditions that led to chaos theory, popularly known as the butterfly effect.
Behavior in a complex dynamical system appears irregular or unpredictable but is actually deterministic. Nowadays, a large number of dynamical systems are known to behave in chaotic manners, whereas previously these systems were believed to act randomly. Much of what one calls coincidence or chance is actually chaos, for example rolling a pair of dice. Dynamical systems characterized by deterministic chaos are predictable, and if one can control the chaos, then with just a little bit of mathematics everything becomes controllable.
Several case studies, such as storm surge dynamics in the North Sea, have been explored to assess the predictability of chaotic models. The model exploits the dynamical neighbors (self-similarity) found in the higher-dimensional reconstructed phase space as the main control for prediction. The self-similarity here is identified as similar storm surge patterns or developments in the past. A number of additional features can be incorporated into a chaotic model to enhance its predictability, e.g. the use of a data assimilation technique or ensemble chaotic models. The modeling results show that the predictive chaotic model outperforms other machine learning techniques, such as neural network models. This demonstrates that modeling techniques based on chaos theory can serve as an efficient tool for accurate and reliable short- and mid-term predictions.
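
A compact sketch of the prediction scheme the abstract outlines: delay-embed the time series to reconstruct the phase space, locate the nearest past neighbors of the current state (the self-similarity), and average their successors. Embedding dimension, delay, and neighbor count are illustrative:

```python
def embed(series, dim=3, delay=1):
    """Delay-embed a scalar time series into dim-dimensional vectors."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + j * delay] for j in range(dim)) for i in range(n)]

def predict_next(series, dim=3, delay=1, k=3):
    """Predict the next value from the k nearest phase-space neighbors."""
    points = embed(series, dim, delay)
    query = points[-1]
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))
    # Candidate neighbors exclude the final point, whose successor is
    # exactly the unknown value we are trying to predict.
    candidates = sorted(range(len(points) - 1),
                        key=lambda i: dist(points[i], query))
    step = (dim - 1) * delay + 1   # index offset from point i to its successor
    return sum(series[i + step] for i in candidates[:k]) / k
```

Data assimilation and ensembles, mentioned above, would respectively correct the embedded state with fresh observations and average several such predictors.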


March 27, 2012, 13:30-14:30
Eduardo F. Morales, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Mexico
Title: Teaching a Robot How to Perform New Tasks Through Demonstrations and Feedback
Abstract:  Service robots are rapidly expanding and soon will become part of our everyday life. To personalize service robots to the user's needs, robots will have to learn new tasks according to the preferences of the users who will have to teach them in natural and accessible ways.
In this talk, we will show how to teach a robot a new task using simple demonstrations and voice feedback. We consider three demonstration modes: (i) controlling the robot with a joystick, (ii) instructing the robot with voice, and (iii) executing the task while the robot watches with a Kinect sensor. Demonstrations provide initial guidance and focus the search space; they are followed by a reinforcement learning process that improves over the original sequences performed by the user. The low-level sensor information of the user's demonstration is transformed into a more abstract representation, allowing the system to learn general control policies applicable to different instances of the task. During the learning process the user can provide on-line feedback, in the form of actions or qualifiers on the robot's performance, which is translated into extra rewards for additional guidance. The effectiveness of the proposed approach is shown in simulation and on real robots for simple navigation tasks and for a pick-and-place task.
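
A minimal sketch of the demonstration-then-refinement loop, with all names and constants being our illustrative assumptions: Q-values are seeded from demonstrated state-action pairs, and verbal feedback enters the subsequent Q-learning updates as extra reward:

```python
def seed_from_demonstrations(demos, bonus=1.0):
    """Bias a Q-table toward demonstrated (abstracted) state-action pairs."""
    q = {}
    for state, action in demos:
        q[(state, action)] = bonus
    return q

def q_update(q, s, a, r, s_next, actions, feedback=0.0,
             alpha=0.1, gamma=0.95):
    """One Q-learning step; `feedback` is the user's reward-translated input."""
    best_next = max(q.get((s_next, a2), 0.0) for a2 in actions)
    target = (r + feedback) + gamma * best_next
    q[(s, a)] = q.get((s, a), 0.0) + alpha * (target - q.get((s, a), 0.0))
```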


March 7, 2012, 15:30
Johan Kwisthout, Radboud University, Faculty of Science, Nijmegen
Title: Parameterized Complexity Analysis - A New and Powerful Tool in the Cognitive Scientist's Toolbox
Abstract:  Many computational models of cognition - like models of intention recognition, action planning, or problem solving - are based on Bayesian abduction: determining the most probable explanation for what is observed. These models often perform well in experimental settings, yet face the theoretical problem of intractability: Bayesian abduction is in general NP-hard, meaning that such models cannot serve as computationally and psychologically plausible explanations of the tasks and processes which they model. Proponents of such Bayesian models often assume approximation, fast-and-frugal heuristics, or similar means to speed up computation.
Alas, approximate Bayesian inference happens to be NP-hard as well. A better approach would be to identify the sources of complexity of the model using a parameterized complexity analysis. In this talk, based on recent work with Iris van Rooij, Todd Wareham, and Mark Blokpoel, I will show how parameterized complexity analyses can aid the Cognitive Scientist, both in identifying constraints that - when met - render these models tractable, and in generating empirically testable hypotheses that predict under which circumstances human cognitive performance will deteriorate. In short: I will advocate that parameterized complexity analyses are as powerful for a Cognitive Scientist as water pump pliers are for a plumber.
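
For readers new to the framework, the central definition (standard material, not specific to the talk's results):

```latex
% A parameterized problem with input size n and parameter k is
% fixed-parameter tractable (FPT) if it can be solved in time
\[
  f(k) \cdot n^{O(1)}
\]
% for some computable function f of k alone. When the troublesome
% parameter is small in practice, the exponential blow-up is confined
% to f(k), so a model can remain tractable even though the
% unparameterized problem is NP-hard.
```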