University of Twente



''Parlepraatjes'' are the group's weekly colloquium talks, held on Mondays at 15:45 in room INF 2126.

27/10/03: Martijn van Otterlo, A Talk about my Ongoing Research

Martijn van Otterlo

Reinforcement learning (RL) is the main learning paradigm for behavior learning. It amounts to learning which actions achieve goals, where actions are rewarded (or punished) during learning. By learning how to maximize the rewards it receives, the agent can learn how to solve tasks. For multi-agent systems, reinforcement learning is the main paradigm as well.
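To make the reward-maximization idea concrete, here is a minimal sketch of tabular Q-learning, a standard RL algorithm, on a toy five-state corridor. It is an illustration only and not the method presented in the talk: the environment, the function names (step, q_learning, greedy_policy) and all parameter values are assumptions chosen for the example.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)    # move left or move right

def step(state, action):
    """Toy environment (an assumption): reward 1 on reaching the goal, else 0."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q[state][action_index] from rewards alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)                      # explore
            else:
                best = max(q[state])                      # exploit, random tie-break
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

def greedy_policy(q):
    """Best learned action for every non-goal state."""
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]
```

Calling greedy_policy(q_learning()) yields the action the agent prefers in each non-goal state. Note that the table stores one entry per state, so nothing learned in one state carries over to another.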

One of the main problems in RL is the 'generalization problem': how can the agent transfer learned knowledge to 'similar' situations? For example, if the agent can learn how to play the game 'Go' on a 9x9 board, how can it use this knowledge on a 19x19 board? And, to begin with, how can it reuse a move learned at the bottom of the board for what is basically the same move at the top of the board? Usually, generalization is done using neural networks, decision trees and other generalizers from machine learning.
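As a small illustration of the bottom-versus-top question only, the sketch below shares one value-table entry between a board position and its vertical mirror image. This is a hand-coded symmetry, not one of the generalizers mentioned above and not the approach of the talk. The stone encoding (row, col, colour) and the names mirror, canonical, update and lookup are assumptions made for the example.

```python
def mirror(row, col, size):
    """Reflect a point top-to-bottom on a size x size board."""
    return size - 1 - row, col

def canonical(stones, move, size):
    """Key shared by a (position, move) pair and its vertical mirror image.

    stones: iterable of (row, col, colour) tuples; move: (row, col).
    """
    original = (tuple(sorted(stones)), move)
    flipped = (
        tuple(sorted((*mirror(r, c, size), colour) for r, c, colour in stones)),
        mirror(*move, size),
    )
    return min(original, flipped)   # pick one representative deterministically

values = {}   # learned move values, indexed by canonical key

def update(stones, move, size, value):
    values[canonical(stones, move, size)] = value

def lookup(stones, move, size):
    return values.get(canonical(stones, move, size), 0.0)

# A value learned for a move near the bottom edge of a 9x9 board ...
update({(8, 2, "b"), (8, 3, "w")}, (7, 2), 9, 0.7)
# ... is available for the mirrored move near the top edge.
top_value = lookup({(0, 2, "b"), (0, 3, "w")}, (1, 2), 9)
```

The trick covers exactly one symmetry and still says nothing about moving from 9x9 to 19x19 boards, because every key contains absolute board coordinates.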

In recent years, new forms of generalization have been investigated. Hierarchical RL focuses on the structure of tasks: if an agent has learned a skill for moving from one room to another, it can reuse this skill in another learning task that involves, as a subtask, moving from one place to another. Another new form of generalization is the use of more powerful representation languages, such as first-order logic.

In this talk I will introduce this new research direction, sketch some of its problems, and present one of my own endeavours in this field: a method called CARCASS that enables the specification of the learning problem in (a subset of) first-order logic, and the application of RL algorithms to learn abstract strategies.

Last modified at $Date: 2003/10/27 08:22:36 $ by Hendri Hondorp