A curious robot: An explorative-exploitive inference algorithm

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

A curious robot: An explorative-exploitive inference algorithm. / Pedersen, Kim Steenstrup; Johansen, Peter.

RoboMat 07: Coimbra, Portugal, 17-19 September, 2007. ed. / Helder Araujo; Maria Isabel Ribeiro. CIM (Centro Internacional de Matematica), 2007. p. 51-57.

Harvard

Pedersen, KS & Johansen, P 2007, A curious robot: An explorative-exploitive inference algorithm. in H Araujo & MI Ribeiro (eds), RoboMat 07: Coimbra, Portugal, 17-19 September, 2007. CIM (Centro Internacional de Matematica), pp. 51-57, Workshop of Robotics and Mathematics (RoboMat 2007), Coimbra, Portugal, 17/09/2007. <http://labvis.isr.uc.pt/robomat/program.html>

APA

Pedersen, K. S., & Johansen, P. (2007). A curious robot: An explorative-exploitive inference algorithm. In H. Araujo, & M. I. Ribeiro (Eds.), RoboMat 07: Coimbra, Portugal, 17-19 September, 2007 (pp. 51-57). CIM (Centro Internacional de Matematica). http://labvis.isr.uc.pt/robomat/program.html

Vancouver

Pedersen KS, Johansen P. A curious robot: An explorative-exploitive inference algorithm. In Araujo H, Ribeiro MI, editors, RoboMat 07: Coimbra, Portugal, 17-19 September, 2007. CIM (Centro Internacional de Matematica). 2007. p. 51-57

Author

Pedersen, Kim Steenstrup ; Johansen, Peter. / A curious robot: An explorative-exploitive inference algorithm. RoboMat 07: Coimbra, Portugal, 17-19 September, 2007. editor / Helder Araujo ; Maria Isabel Ribeiro. CIM (Centro Internacional de Matematica), 2007. pp. 51-57

Bibtex

@inproceedings{b3adca60b54e11dcbee902004c4f4f50,

title = "A curious robot: An explorative-exploitive inference algorithm",

abstract = "We propose a sequential learning algorithm with a focus on robot control. It is initialised by a teacher who directs the robot through a series of example solutions of a problem. Left alone, the control chooses its next action by prediction based on a variable order Markov chain model selected to minimise a MDL criterion based on generalised code length La of the past robot-environment interaction. The user specifies the parameter a and as a result, the robot can be directed towards exploratory behaviour if confidence in the teacher is low (a<0), and towards goal-seeking exploitive behaviour if confidence in the teacher is high (a>0). The novelty of the proposed method lies in the use of generalised code length in the MDL model selection criterion.",

author = "Pedersen, {Kim Steenstrup} and Peter Johansen",

year = "2007",

language = "English",

isbn = "9789899501133",

pages = "51--57",

editor = "Helder Araujo and Ribeiro, {Maria Isabel}",

booktitle = "RoboMat 07",

publisher = "CIM (Centro Internacional de Matematica)",

note = "Conference date: 17-09-2007 Through 19-09-2007",

}

RIS

TY - CONF

T1 - A curious robot: An explorative-exploitive inference algorithm

AU - Pedersen, Kim Steenstrup

AU - Johansen, Peter

PY - 2007

Y1 - 2007

N2 - We propose a sequential learning algorithm with a focus on robot control. It is initialised by a teacher who directs the robot through a series of example solutions of a problem. Left alone, the control chooses its next action by prediction based on a variable order Markov chain model selected to minimise a MDL criterion based on generalised code length La of the past robot-environment interaction. The user specifies the parameter a and as a result, the robot can be directed towards exploratory behaviour if confidence in the teacher is low (a<0), and towards goal-seeking exploitive behaviour if confidence in the teacher is high (a>0). The novelty of the proposed method lies in the use of generalised code length in the MDL model selection criterion.

AB - We propose a sequential learning algorithm with a focus on robot control. It is initialised by a teacher who directs the robot through a series of example solutions of a problem. Left alone, the control chooses its next action by prediction based on a variable order Markov chain model selected to minimise a MDL criterion based on generalised code length La of the past robot-environment interaction. The user specifies the parameter a and as a result, the robot can be directed towards exploratory behaviour if confidence in the teacher is low (a<0), and towards goal-seeking exploitive behaviour if confidence in the teacher is high (a>0). The novelty of the proposed method lies in the use of generalised code length in the MDL model selection criterion.

M3 - Article in proceedings

SN - 9789899501133

SP - 51

EP - 57

BT - RoboMat 07

A2 - Araujo, Helder

A2 - Ribeiro, Maria Isabel

PB - CIM (Centro Internacional de Matematica)

Y2 - 17 September 2007 through 19 September 2007

ER -

ID: 2030803

Department of Computer Science