A curious robot: An explorative-exploitive inference algorithm
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
A curious robot: An explorative-exploitive inference algorithm. / Pedersen, Kim Steenstrup; Johansen, Peter.
RoboMat 07: Coimbra, Portugal, 17-19 September, 2007. ed. / Helder Araujo; Maria Isabel Ribeiro. CIM (Centro Internacional de Matematica), 2007. p. 51-57.
RIS
TY - GEN
T1 - A curious robot: An explorative-exploitive inference algorithm
AU - Pedersen, Kim Steenstrup
AU - Johansen, Peter
PY - 2007
Y1 - 2007
N2 - We propose a sequential learning algorithm with a focus on robot control. It is initialised by a teacher who directs the robot through a series of example solutions of a problem. Left alone, the control chooses its next action by prediction based on a variable order Markov chain model selected to minimise an MDL criterion based on generalised code length L_a of the past robot-environment interaction. The user specifies the parameter a and, as a result, the robot can be directed towards exploratory behaviour if confidence in the teacher is low (a<0), and towards goal-seeking exploitive behaviour if confidence in the teacher is high (a>0). The novelty of the proposed method lies in the use of generalised code length in the MDL model selection criterion.
M3 - Article in proceedings
SN - 9789899501133
SP - 51
EP - 57
BT - RoboMat 07
A2 - Araujo, Helder
A2 - Ribeiro, Maria Isabel
PB - CIM (Centro Internacional de Matematica)
Y2 - 17 September 2007 through 19 September 2007
ER -
ID: 2030803