Department of Data Science and Knowledge Engineering

SIKS-DKE Colloquia 2018

Title: Model and signal processing approaches for Atrial Fibrillation signals

Speaker: Prof. Luca Mainardi, Politecnico di Milano

When and Where:

Date: Friday, May 18, 2018

Time: 14:00-15:00

Room: 0.009

Location: Maastricht University, Department of Data Science and Knowledge Engineering, Bouillonstraat 8-10

Atrial Fibrillation (AF) is the most common sustained disorder of cardiac rhythm and is estimated to affect 1.5%-2% of the general population, with a prevalence that increases with age and reaches nearly 10% in octogenarians. Given the aging of the population, the disease is reaching pandemic proportions, which stimulates the development of methodologies and technologies for investigating AF events in large populations, both for screening purposes and for monitoring the efficacy of treatment and therapy. The talk will describe modern signal processing and modelling techniques applied to a variety of biosignals acquired from AF patients using both traditional devices (ECG and blood pressure measurements) and novel technologies (wristband devices or contactless measurements). A general overview will be given of the contributions of these methods to solving the challenging problems of AF patient screening and management.

Title: Modern Game AI Algorithms Solve Real-World Problems: From Chemical Retrosynthesis to Peruvian Bug Control Campaigns

Speaker: Dr. Mike Preuss, Universität Münster

When and Where:

Date: Wednesday, February 28, 2018

Time: 16:00-17:00

Room: 2.015

Location: Maastricht University, Department of Data Science and Knowledge Engineering, Bouillonstraat 8-10

Monte Carlo Tree Search and Deep Neural Networks in combination have pushed the limits of what AI can do in areas where humans were perceived as dominant over machines, as in the game of Go. While this line of research continues towards even more complex game problems, there are many other application areas that could benefit greatly from these new techniques. Chemical retrosynthesis (you know the product, but not how to get there) is one of these, and we show that this problem can be tackled very effectively with MCTS/DNN. But it does not have to end here. We generalize the approach and then present another testbed that is currently under investigation: directing inspectors in bug control campaigns in Arequipa, a city in Peru.

Title: Semi-Supervised Learning and Applications

Speaker: Dr. Siamak Mehrkanoon, KU Leuven

When and Where:

Date: Tuesday, February 20, 2018

Time: 15:30-16:30

Room: 0.015

Location: Maastricht University, Department of Data Science and Knowledge Engineering, Bouillonstraat 8-10

In many applications, ranging from machine learning to data mining, obtaining labeled samples is costly and time consuming. On the other hand, with the recent development of information technologies one can easily encounter a huge amount of unlabeled data coming from the web, smartphones, satellites, etc. In these situations, one may consider designing an algorithm that can learn from both labeled and unlabeled data. Starting from the core formulation of Kernel Spectral Clustering (KSC), which is an unsupervised algorithm, extensions towards integrating the available side information and devising a semi-supervised algorithm are the scope of the first part of the talk. In particular, the multi-class semi-supervised learning model (MSS-KSC) will be introduced, which can address both semi-supervised classification and clustering. The labeled data points are incorporated into the KSC formulation at the primal level by adding a regularization term. This converts the solution of KSC from an eigenvalue problem to a system of linear equations in the dual. The algorithm realizes a low-dimensional embedding for discovering micro clusters. Though the portion of labeled instances is small, one can easily encounter a huge amount of unlabeled data points. Therefore, in order to make the model scalable to large-scale data, two approaches are proposed: fixed-size and reduced kernel MSS-KSC (FS-MSS-KSC and RD-MSS-KSC). The former relies on the Nyström method for approximating the feature map and solves the problem in the primal, whereas the latter uses a reduced kernel technique and solves the problem in the dual. Both approaches possess the out-of-sample extension property to unseen data points.

In today’s applications, evolving data streams are ubiquitous. Due to the complex underlying dynamics and non-stationary behavior of real-life data, the demand for adaptive learning mechanisms is increasing. An incremental multi-class semi-supervised kernel spectral clustering (I-MSS-KSC) algorithm is proposed for on-line clustering/classification of time-evolving data. It uses the available side information to continuously adapt the initial MSS-KSC model and learn the underlying complex dynamics of the data stream. The performance of the proposed method is demonstrated on synthetic data sets and real-life videos. Furthermore, for the video segmentation tasks, Kalman filtering is used to provide the labels for the objects in motion, thereby regularizing the solution of I-MSS-KSC.

Manual labeling of sufficient training data for diverse application domains is a costly, laborious task and often prohibitive. Therefore, designing models that can leverage rich labeled data in one domain and be applicable to a different but related domain is highly desirable. In particular, domain adaptation or transfer learning algorithms seek to generalize a model trained on a source domain (training data) to a new target domain (test data). The most common underlying assumption of many machine learning models is that both training and test data exhibit the same distribution or the same feature domains. However, in many real-life problems there is a distributional, feature-space and/or dimension mismatch between the two domains, or the statistical properties of the data evolve in time. Here a brief overview will be provided of the Regularized Semi-Paired Kernel Canonical Correlation Analysis (RSP-KCCA) formulation for learning a latent space for the domain adaptation problem. The optimization problem is formulated in the primal-dual LS-SVM setting, where side information can be readily incorporated through regularization terms. The proposed model learns a joint representation of the data set across different domains by solving a generalized eigenvalue problem or a linear system of equations in the dual. The approach is naturally equipped with the out-of-sample extension property, which plays an important role for model selection.
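The Nyström feature-map approximation underlying fixed-size kernel methods such as FS-MSS-KSC can be sketched as follows. This is a generic illustration, not the speaker's implementation; the RBF kernel and the choice of landmarks are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystroem_features(X, landmarks, gamma):
    """Approximate feature map Phi such that Phi @ Phi.T ~= K(X, X),
    built from a small set of landmark points: Phi = K_nm @ K_mm^{-1/2}."""
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    K_nm = rbf_kernel(X, landmarks, gamma)
    w, V = np.linalg.eigh(K_mm)
    w = np.maximum(w, 1e-12)              # guard against numerical negatives
    return (K_nm @ V) / np.sqrt(w) @ V.T  # multiply by K_mm^{-1/2}

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
Phi = nystroem_features(X, X[:20], gamma=0.5)  # 200 points, 20 landmarks
```

With all points taken as landmarks the approximation is exact; with far fewer landmarks than points, downstream problems can be solved in the primal at much lower cost.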

SIKS-DKE Colloquia 2017

Title: GVGAI as a Tool for Game Design

Speaker: Dr. Jialin Liu, Queen Mary University of London

When and Where:

Date: Friday, October 27, 2017

Time: 11:00-12:00

Room: 0.015

Location: Maastricht University, Department of Data Science and Knowledge Engineering, Bouillonstraat 8-10

The General Video Game AI (GVGAI, http://www.gvgai.net/) framework and competition have attracted many practitioners, researchers and students during the last couple of years. This benchmark proposes the challenge of creating agents that are able to play any game they are given, even one not known in advance, in the absence of any domain knowledge. It has been widely used as a research testbed and educational material by many universities. Different tracks have been designed for diverse research purposes, including single-player and two-player planning tracks and a single-player learning track, in which no forward model is given. After briefly introducing the framework and its different tracks, I will present a new feature recently added to the framework: the possibility of setting the parameters of a game described in the Video Game Definition Language (VGDL) and exploring the different possibilities the game rules offer. I will describe how to make a VGDL game parametrizable, and how to define a game space for it. Finally, I will talk about how to tune game parameters to make a game either more balanced or more challenging using some new variants of Evolutionary Algorithms.

SIKS-DKE Colloquia 2016

Title: Computational Neuroimaging of human audition

Speaker: Prof. Elia Formisano (Maastricht University, Cognitive Neuroscience)

When and Where:

Date: Wednesday, November 2, 2016

Time: 16:00-17:00

Room: 0.015

Location: Maastricht University, Department of Data Science and Knowledge Engineering, Bouillonstraat 8-10

How does the human brain encode and analyze sensory information? So far, functional neuroimaging research could only describe the response patterns evoked in the human brain by sensory stimuli or cognitive tasks, mostly at the level of regional activations. A comprehensive account of the neural basis of human perception and cognition, however, requires deriving multi-scale, deterministic models that enable specific predictions on how the brain represents (new) sensory and cognitive events. Here, I will present recent progress in this direction, obtained by the combination of ultra-high field (7 tesla or more) functional magnetic resonance imaging (fMRI) and computational modeling. In particular, I will focus on results that provide insights into how real-life sounds and scenes are encoded in the human brain. Using innovative experimental designs, high-resolution (~1 mm) MR image acquisitions and machine learning algorithms, we embed computational models of sound representations in the analysis of measured fMRI response patterns and derive models that can predict the brain responses to new sounds and contexts.

Title: Cooperative Game Theory Tools to Design Coalitional Control Networks

Speaker: Francisco Javier Muros Ponce, University of Seville

When and Where:

Date: Thursday, April 14, 2016

Time: 16:00-17:00

Room: 0.015

Location: Maastricht University, Department of Data Science and Knowledge Engineering, Bouillonstraat 8-10

In coalitional control, the connections among the different parts of a control network evolve dynamically to achieve a trade-off between communication burden and control performance. In particular, the communication links with a low contribution to the overall system are disconnected. Likewise, the control law is adapted to these changes. Given that the control objective can be described as a cooperative game, it is possible to apply classical cooperative game theory tools to gain insight into the distributed control problem. In particular, a method to partition non-centralized dynamical linear systems based on the relevance of the possible interconnections is analyzed. Moreover, the possibility of imposing constraints via these game-theoretical tools, so as to treat some links (and agents) as more critical or more dispensable within the communication network, is also considered.
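As a hedged illustration of the classical cooperative game theory tools mentioned above (a generic textbook computation, not the speaker's specific method), the Shapley value of each player can be obtained by averaging marginal contributions over all player orderings:

```python
from itertools import permutations

def shapley_values(players, v):
    """Shapley value: average marginal contribution of each player over all
    orderings. v maps a frozenset of players to the coalition's worth."""
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            phi[p] += v(coalition | {p}) - v(coalition)  # marginal contribution
            coalition |= {p}
    return {p: phi[p] / len(orders) for p in phi}

# Example: a coalition is worth 1 as soon as two of three agents cooperate.
v = lambda S: 1.0 if len(S) >= 2 else 0.0
phi = shapley_values(["a", "b", "c"], v)
```

In a coalitional control setting the characteristic function would instead measure, e.g., the control-performance gain a coalition of linked agents achieves, and the resulting values can rank links and agents by relevance.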

Title: Natural Topology

Speaker: Dr. Frank Waaldijk, Nijmegen

When and Where:

Date: Tuesday, March 22, 2016

Time: 13:15-14:00

Room: 0.015

Location: Maastricht University, Department of Data Science and Knowledge Engineering, Bouillonstraat 8-10

Natural Topology (NToP) describes how to build the real numbers (R) in a very simple way. We start from the topology of closed rational intervals, without using any metric concepts. As a result, we obtain R in a way that corresponds closely to how data are obtained in the natural sciences. This correspondence makes NToP a candidate to allow a smooth transition from constructive (theoretical) math to applied math and exact computation. (On the theoretical side, NToP covers `all separable T_1 spaces', which are the natural habitat for constructive math.)

SIKS-DKE Seminar on Recent Data Analysis and Modeling Methods

Wednesday 17 February 2016, 16:00-17:50

Department of Data Science and Knowledge Engineering, Maastricht University

Bouillonstraat 8-10, room 0.015

16:00-16:10 Opening and brief introduction, Ralf Peeters

16:10-16:55 Recurrence plots for the analysis of complex systems

Norbert Marwan, Potsdam Institute for Climate Impact Research, Germany

Recurrence is a fundamental property of dynamical systems, which can be exploited to characterise the system's behaviour in phase space. A powerful tool for their visualisation and analysis is the recurrence plot. Methods based on recurrence plots have proven to be very successful, especially in analysing short, noisy and nonstationary data. Recurrence Plots (RPs) have found applications in such diverse fields as life sciences, astrophysics, earth sciences, meteorology, biochemistry, and finance, where they are used to provide measures of dynamical properties, complexity or dynamical transitions. Theoretical results show how closely RPs are linked to dynamical invariants like entropies and dimensions. Moreover, they are successful tools for coupling and synchronisation analysis or advanced surrogate tests.
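The core construction of a recurrence plot can be sketched in a few lines: two time points recur when the corresponding states are closer than a threshold. This is a minimal illustration assuming a plain Euclidean threshold; real analyses typically first embed the series in phase space.

```python
import numpy as np

def recurrence_plot(x, eps):
    """Binary recurrence matrix: R[i, j] = 1 when states i and j are within eps."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a scalar series as 1-D phase-space points
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)  # pairwise distances
    return (d <= eps).astype(int)

# A periodic signal produces the characteristic diagonal-line structure.
R = recurrence_plot(np.sin(np.linspace(0, 4 * np.pi, 200)), eps=0.1)
```

Quantities such as the fraction of recurrence points or the lengths of diagonal lines in R are the basis of recurrence quantification analysis.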

16:55-17:05 break

17:05-17:50 2EPT: Financial modelling using random variables with rational characteristic function

Bernard Hanzon, Department of Mathematics, University College Cork, Ireland

Exponential-polynomial-trigonometric (EPT) functions form a general class of functions that has a long history (dating back at least to Euler and d'Alembert in the 1740s!), is well-studied and has many nice properties. Here we report on the possibilities this class offers for use as probability density functions of non-Gaussian random variables in financial modelling.

Topics in the talk include: representation of such functions (using so-called state space realization techniques from linear systems theory); positivity/non-negativity issues; extension to probability measures on the real line (2EPT), which leads to the class of all probability measures on the real line with rational characteristic function; characterization of the class of infinitely divisible 2EPT probability density functions and the corresponding Lévy processes; and applications in finance and financial option pricing.

If time permits, some remarks will be made about the link with related (discrete probability) GPT density functions and their estimation (where maximum likelihood estimation corresponds directly to Kullback-Leibler divergence minimization) and the potential usage for EPT and 2EPT estimation. A large part of this research is based on joint work with Conor Sexton (Barclays, London) and Finbarr Holland (UCC).

Title: Strong Truthfulness in Peer Prediction

Speaker: Dr. David Parkes (Harvard University)

Joint GSBE-DKE seminar.

When and Where:

Date: Wednesday, January 11, 2016

Time: 16.00-17.15

Room: A1.22

Location: Maastricht University, Graduate School of Business and Economics, Tongersestraat 12

We study the problem of information elicitation without verification (peer prediction). Agents may invest effort to receive correlated, noisy signals about an environment ("task"), and it is these signals that we want to elicit. In our model, we follow Dasgupta and Ghosh (2013) and allow agents to report signals on overlapping sets of independent tasks. We characterize conditions for which the prior-free DG mechanism generalizes from binary to multiple signals, while retaining "strong truthfulness," so that truthful reporting yields the maximum payoff across all equilibria (tied only with reporting permutations). Our analysis also yields a greatly simplified proof of their result for binary signals. We then introduce a simple generalization that, with knowledge of the signal distribution, is able to align incentives in general environments.

In an analysis of peer-evaluation data from a massive open online course (MOOC) platform, we investigate how well student peer grading fits our models, and evaluate how the proposed scoring mechanisms would perform in practice. We find some surprises in the distributions, but conclude that our modified mechanisms would do well.

(Joint with Rafael Frongillo and Victor Shnayder)
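A simplified, illustrative sketch of a Dasgupta-Ghosh style score for binary signals follows. The mechanisms analysed in the talk are more general (multiple signals, prior-free guarantees); the function and variable names here are hypothetical.

```python
def dg_score(a_bonus, b_bonus, a_penalty, b_penalty):
    """Score for agent a paired with agent b: 1 if they agree on the shared
    'bonus' task, minus the empirical rate at which their reports on disjoint
    'penalty' tasks happen to agree (which removes blind-agreement incentives)."""
    agree = 1.0 if a_bonus == b_bonus else 0.0
    base = sum(1.0 for x in a_penalty for y in b_penalty if x == y)
    base /= len(a_penalty) * len(b_penalty)
    return agree - base

# Agreement on the bonus task, 50% chance agreement on penalty tasks.
score = dg_score(1, 1, a_penalty=[0, 1], b_penalty=[0, 1])
```

The penalty term is what makes uninformative strategies (always report 1, say) yield zero expected payoff, pushing truthful reporting to the top.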

Title: Probability done Computably

Speaker: Dr. Pieter Collins (Maastricht University, Department of Data Science and Knowledge Engineering)

When and Where:

Date: Wednesday, January 6, 2016

Time: 16.00-17.00

Room: 0.009

Location: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10

The mathematical foundations of probability are traditionally based on observable events being elements of a sigma-algebra of sets. Unfortunately, while we can then *define* probabilities of events, we cannot *compute* probabilities of general events. In this talk I will describe work on an alternative approach to probability based on *valuations* on *open* sets, and show that this allows us to recover a complete theory of probability and random variables, including conditioning and stochastic integrals, and in such a way that all operations can be implemented (maybe not very efficiently...) on a digital computer.

This talk should be of interest to anybody working on anything random.

SIKS-DKE Colloquia 2015

Title: Switched symplectic graphs and their 2-ranks

Speaker: Dr. Aida Abiad (Maastricht University, KE department)

When and Where:

Date: Wednesday, November 4th, 2015

Time: 13.30-14.30

Room: 0.015

Location: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10

In this work we apply Godsil-McKay switching to the symplectic graphs over F_2 with at least 63 vertices and prove that the 2-rank of (the adjacency matrix of) the graph increases after switching. This shows that the switched graph is a new strongly regular graph with the same parameters as the symplectic graph over F_2 but different 2-rank.

For the symplectic graph on 63 vertices we investigate repeated switching by computer and find many new strongly regular graphs with the same parameters but different 2-ranks.

Using these results and a recursive construction method for the symplectic graph from Hadamard matrices, we also obtain several graphs with the same parameters as the symplectic graphs over F_2, but different 2-ranks.

This is joint work with Willem Haemers.
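Computing the 2-rank of an adjacency matrix amounts to Gaussian elimination over the field F_2. A small generic routine (not the authors' code) using row bitmasks:

```python
def rank_gf2(matrix):
    """Rank of a 0/1 matrix over F_2, via Gaussian elimination on row bitmasks."""
    basis = []  # reduced pivot rows, each with a distinct highest set bit
    for row in matrix:
        r = int("".join(str(x) for x in row), 2)  # row as an integer bitmask
        for b in basis:
            r = min(r, r ^ b)  # cancel b's pivot bit in r if it is set
        if r:
            basis.append(r)
            basis.sort(reverse=True)  # keep pivots in descending bit order
    return len(basis)

# Third row is the XOR of the first two, so the 2-rank is 2.
r = rank_gf2([[1, 0, 1], [0, 1, 1], [1, 1, 0]])
```

Over F_2, addition is XOR, so each elimination step is a single integer XOR; this is what makes 2-rank computations on 63-vertex graphs cheap enough for the repeated-switching searches described above.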

Title: Decomposing a three-way dataset of TV-ratings when this is impossible

Speaker: Dr. Alwin Stegeman (Heijmans Institute for Psychological Research, University of Groningen)

When and Where:

Date: Wednesday, October 28th, 2015

Time: 16.00-17.00

Room: 0.015

Location: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10

This talk is about finding a best rank-R approximation to a three-way array, where a three-way array of size IxJxK can be seen as K matrices of size IxJ. For a matrix, a best rank-R approximation always exists and can be obtained as the truncated singular value decomposition, which is what is done in principal component analysis. For a three-way array, existence for R>1 is not guaranteed. When a best rank-R approximation does not exist, trying to compute it results in rank-1 components diverging to infinite magnitude while the rank-R approximation converges to an optimal boundary point of the rank-R set. This situation can be avoided by imposing orthogonality constraints or by including specific interaction terms in the rank-R approximation (Stegeman, 2012). The latter boils down to computing a three-way Jordan canonical form of the optimal boundary point (Stegeman, 2013). We demonstrate this procedure for a three-way dataset containing 15 TV shows that are scored on 16 rating scales by 30 persons.

Trying to compute a rank-3 approximation results in two diverging rank-1 terms, while fitting a model with one additional interaction term yields an interpretable solution (Stegeman, 2014).
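For contrast, the well-behaved matrix case mentioned above (best rank-R approximation via the truncated SVD, guaranteed by the Eckart-Young theorem) can be sketched as:

```python
import numpy as np

def best_rank_r(M, r):
    """Best rank-r approximation of a matrix in the Frobenius norm:
    truncate the singular value decomposition (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :r] * s[:r] @ Vt[:r, :]  # broadcasting scales column j by s[j]

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))
M2 = best_rank_r(M, 2)
```

The approximation error equals the root-sum-square of the discarded singular values; it is precisely this guarantee that fails for three-way arrays, where the rank-R set is not closed.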

Title: Planning in Large-Scale Multi-Agent Decision Problems

Speaker: Tom Pepels, M.Sc. (Thales Nederland)

When and Where:

Date: Tuesday, June 23rd, 2015

Time: 16.00-17.00

Room: 0.015

Location: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10

Current research in planning at Thales Research and Technology in Delft focuses on two types of domains. First, Unmanned Vehicle (UxV) path planning focuses on executing missions with a heterogeneous set of UxVs on global and local scales. Second, planning in maritime domains, with a large, partially observable state space in which a set of agents must accomplish a goal.

Unmanned vehicles (UxVs) are increasingly employed for surveillance, reconnaissance and intelligence missions in both civilian and military operations. Planning the trajectories of multiple heterogeneous UxVs is a challenging problem. Based on high-level mission descriptions defined by human operators, the available UxVs have to develop a joint plan. Planning such missions with specific constraints for multiple UxVs, while avoiding breakdowns and collisions and limiting fuel consumption, is an active area of research at our lab in Delft.

Current research focuses on planning on two levels: first, the global level, in which the combined set of UxVs and mission objectives is used to generate a globally optimal plan; second, local planning, in which individual objectives are solved, such as scanning an area with multiple UxVs using lawn-mower patterns.

Scenario-based reasoning is a method developed to assist decision-makers with exploring possible future scenarios. In the maritime smuggling domain this technique can be used to reason about possible intercept locations, in order to capture a smuggler’s vessel. Challenges in this domain are: 1) a large, open state space, 2) sparse observations, and 3) large uncertainty in the smuggler’s goals and initial location. Given a set of agents, the goal is to define a plan which will result in the highest likelihood of capturing the smuggler.

Title: Monte Carlo Tree Search: A Reinforcement Learning Method

Speaker: Tom Vodopivec, M.Sc. (University of Ljubljana)

When and Where: (date and location are changed)

Date: Friday, June 19th, 2015

Time: 15.00-16.00

Room: 0.009

Location: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10

A few years ago, a revolution took place among learning agents for playing games. Researchers managed to successfully combine bandit algorithms with tree search – they devised the upper confidence bounds applied to trees (UCT) algorithm. The field of Monte Carlo tree search (MCTS) was born. Since then, a connection between MCTS methods and the field of reinforcement learning (RL) started to emerge; however, this was not immediately apparent, nor did it impact the larger gaming community. MCTS methods are still treated as (completely) novel by many applied researchers, which slows down the transition of knowledge between the two fields.

In this talk, we will take a closer look at the relationship between RL and MCTS methods and argue that MCTS is actually a RL method. We will justify this by giving an overview of the dynamics of both fields and by demonstrating an example on-policy Monte Carlo reinforcement learning method that behaves identically to the UCT algorithm. Furthermore, we will show how to improve the performance of MCTS methods by enhancing them with RL concepts, specifically with temporal difference learning and eligibility traces. We will provide experimental results on several classic games that show the superiority of such methods over traditional MCTS methods and discuss some caveats. Our overall purpose is to improve the general understanding of MCTS and to speed up the cross-fertilization with the field of reinforcement learning.
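For reference, the UCB1-based selection step at the heart of UCT can be sketched as follows. This is a generic illustration; the statistics stored per node vary across implementations.

```python
import math

def uct_select(children):
    """UCT child selection: maximise mean reward plus an exploration bonus.
    Each child is a dict with visit count n and total reward w; C is the
    exploration constant (sqrt(2) in the original UCT formulation)."""
    N = sum(c["n"] for c in children)  # parent visit count
    C = math.sqrt(2)

    def ucb(c):
        if c["n"] == 0:
            return float("inf")  # always try unvisited children first
        return c["w"] / c["n"] + C * math.sqrt(math.log(N) / c["n"])

    return max(children, key=ucb)

# The unvisited child is selected before any statistics are compared.
best = uct_select([{"n": 10, "w": 7}, {"n": 3, "w": 2}, {"n": 0, "w": 0}])
```

Viewed through the RL lens the talk advocates, the w/n term is a Monte Carlo value estimate, which is exactly what temporal difference variants then replace.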

Title: Network Creation Games

Speaker: Dr. Matús Mihalák (DKE)

When and Where:

Date: Wednesday, March 25th, 2015

Time: 16.00-17.00

Room: 0.015

Location: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10

Network creation games model the effect of self-interested agents during the process of creating communication networks such as the Internet. In this strategic game, every player represents a vertex of a graph (network) and decides to create a set of adjacent edges (to other vertices/players). The decisions S of the players form a graph G(S)=(V,E). Each created edge costs a fixed amount A, which is a parameter of the game. Every player wants to pay as little as possible and at the same time have small distances to the other vertices in the resulting graph. These two goals are not compatible, and a trade-off between the two objectives needs to be made. Since its introduction in the early period of algorithmic game theory, this game and its plentiful variants have been omnipresent in research. The main research questions are the existence of pure Nash equilibria (i.e., of graphs in which no player/vertex regrets its decision about the created edges), their structural properties, and the quality of such equilibria, i.e., the price of anarchy/stability of such games. Interestingly, the Nash equilibria of the game induce an interesting class of graphs, and studying the properties of such graphs is an intriguing question in itself. Despite all the active research in the area, the main questions are still (partially) open. In this talk I will focus on the original network creation game, survey some of the main results, discuss the main open questions, and report on the latest progress of the efforts to answer them.
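The cost a player minimises (A per bought edge, plus the sum of graph distances to all other vertices) can be illustrated as follows; the adjacency-list representation and BFS over unit-length edges are illustrative assumptions.

```python
from collections import deque

def player_cost(adj, v, bought, A):
    """Cost of player v in a network creation game: A per edge v bought,
    plus the sum of shortest-path distances from v to every vertex."""
    dist = {v: 0}
    queue = deque([v])
    while queue:  # breadth-first search over the undirected graph
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return A * bought + sum(dist.values())

# Path 0-1-2 where player 0 bought the edge to 1:
# edge cost A*1 = 2, distance cost 0+1+2 = 3.
cost = player_cost({0: [1], 1: [0, 2], 2: [1]}, 0, bought=1, A=2)
```

A graph is a pure Nash equilibrium exactly when no player can lower this cost by changing the set of edges it buys.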

Title: Agent-based simulation supporting smart grid management

Speaker: Chiara Federica Sironi (UM DKE / University of Milan-Bicocca)

When and Where:

Date: Thursday, March 19th, 2015

Time: 16.00-17.00

Room: 2.015

Location: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10

The need for electricity has always encouraged research into better ways to produce and manage it efficiently. Over the years, in order to reduce harmful carbon emissions, the sources used to produce energy have started changing from non-renewable (like fossil fuels) to renewable (like solar, wind or hydroelectric energy). Moreover, new technologies that can exploit renewable energy, like wind farms or solar panels, have been developed. Power grids need to be modified to face these changes in energy production, and also need to be adapted to the introduction of new actors, like electric vehicles, which consume a high amount of energy, and prosumers, which can both consume and produce energy. Research and studies aim at improving the current power grid by integrating distributed intelligence and automatic control systems that monitor and manage all the nodes of the network and the network itself. From this, the concept of the smart grid has emerged. A smart grid requires the investigation and careful evaluation of different approaches and management mechanisms in order to determine their feasibility, their impact on the existing power system, their scalability and their ability to regulate interaction between relevant entities in the system. This talk presents an agent-based simulator that models the smart grid and all the actors in it. The main purpose of this simulator is to allow the simulation of different scenarios and heterogeneous management mechanisms, so that these mechanisms can be evaluated and tested. To achieve this goal, a modular architecture has been designed that makes the simulator easily extendible and adaptable to different situations.

Title: Multivariate Lifetime Data in Presence of Right Censoring Data and Cure Fraction

Speaker: Prof. Jorge Alberto Achcar, Medical School, University of São Paulo

When and Where:

Date: Monday, February 23rd, 2015

Time: 11.00-13.00

Room: 0.015

Location: Maastricht University, Department of Knowledge Engineering, Bouillonstraat 8-10

Abstract: In this talk, we introduce some existing approaches to model multivariate lifetime data. In particular, we discuss the use of “frailties” or latent variables to capture the dependence of multivariate lifetime data, and copula functions assuming the popular Weibull distribution to model bivariate lifetime data. We also explore, as a special case, the Block and Basu bivariate exponential distribution to model bivariate lifetime data assuming right-censored data, a situation very common in medical and engineering applications. We also present some approaches to model lifetime data in the presence of a cure fraction, a situation very common in medical studies where a proportion of the individuals are susceptible to a disease and the other proportion are cured or not susceptible. We consider a univariate mixture cure fraction model and a bivariate mixture cure fraction model. Inference for the proposed models is presented under both the classical and Bayesian approaches. A Bayesian analysis for the Block and Basu bivariate exponential distribution in the presence of a cure fraction is presented using standard available MCMC (Markov Chain Monte Carlo) methods to simulate samples from the joint posterior distribution of interest. Finally, we present some illustrations with medical data and some discussion of the software used to obtain the inferences of interest.

For an overview of the SIKS-DKE colloquia in previous years, see 2014, 2013, 2012, 2011, 2010, 2009.

Coordination: Jean Derks, Mark Winands