Direct policy search: intrinsic vs. extrinsic perturbations

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Standard

Direct policy search: intrinsic vs. extrinsic perturbations. / Heidrich-Meisner, V.; Igel, Christian.

Workshop New Challenges in Neural Computation. ed. / B. Hammer; T. Villmann. 2010. pp. 33-39 (Machine Learning Reports, Vol. 04/2010).


Harvard

Heidrich-Meisner, V & Igel, C 2010, Direct policy search: intrinsic vs. extrinsic perturbations. in B Hammer & T Villmann (eds), Workshop New Challenges in Neural Computation. Machine Learning Reports, vol. 04/2010, pp. 33-39, Workshop New Challenges in Neural Computation 2010, Karlsruhe, Germany, 21/09/2010. <https://www.techfak.uni-bielefeld.de/~fschleif/mlr/mlr_04_2010.pdf>

APA

Heidrich-Meisner, V., & Igel, C. (2010). Direct policy search: intrinsic vs. extrinsic perturbations. In B. Hammer, & T. Villmann (Eds.), Workshop New Challenges in Neural Computation (pp. 33-39). Machine Learning Reports Vol. 04/2010. https://www.techfak.uni-bielefeld.de/~fschleif/mlr/mlr_04_2010.pdf

Vancouver

Heidrich-Meisner V, Igel C. Direct policy search: intrinsic vs. extrinsic perturbations. In Hammer B, Villmann T, editors, Workshop New Challenges in Neural Computation. 2010. p. 33-39. (Machine Learning Reports, Vol. 04/2010).

Author

Heidrich-Meisner, V.; Igel, Christian. / Direct policy search: intrinsic vs. extrinsic perturbations. Workshop New Challenges in Neural Computation. ed. / B. Hammer; T. Villmann. 2010. pp. 33-39 (Machine Learning Reports, Vol. 04/2010).

BibTeX

@inproceedings{734fe197561945f4b251d06b45daa963,
title = "Direct policy search: intrinsic vs. extrinsic perturbations",
abstract = "Reinforcement learning (RL) is a biologically inspired learning paradigm based on trial-and-error learning. A successful RL algorithm has to balance exploration of new behavioral strategies and exploitation of already obtained knowledge. In the initial learning phase exploration is the dominant process. Exploration is realized by stochastic perturbations, which can be applied at different levels. When considering direct policy search in the space of neural network policies, exploration can be applied on the synaptic level or on the level of neuronal activity. We propose neuroevolution strategies (NeuroESs) for direct policy search in RL. Learning using NeuroESs can be interpreted as modelling of extrinsic perturbations on the level of synaptic weights. In contrast, policy gradient methods (PGMs) can be regarded as intrinsic perturbation of neuronal activity. We compare these two approaches conceptually and experimentally.",
author = "V. Heidrich-Meisner and Christian Igel",
year = "2010",
language = "English",
series = "Machine Learning Reports",
pages = "33--39",
editor = "B. Hammer and T. Villmann",
booktitle = "Workshop New Challenges in Neural Computation",
note = "Conference date: 21-09-2010 Through 21-09-2010",

}
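
The abstract above contrasts two places where exploration noise can enter a neural network policy: on the synaptic weights (extrinsic perturbation, as in evolution strategies) or on the neuronal activity, i.e. the actions (intrinsic perturbation, as in policy gradient methods). The sketch below is only a minimal illustration of that distinction on a toy task with a linear policy; the environment, the simple keep-the-best selection, and all names are assumptions made for illustration, not the NeuroES or policy gradient algorithms evaluated in the paper.

```python
# Minimal sketch (assumptions: toy environment, linear policy, keep-the-best selection).
# It only illustrates where the exploration noise is injected, not the paper's methods.
import numpy as np

class ToyEnv:
    """Hypothetical 2-D point-mass task: reward is higher the closer the state is to the origin."""
    def reset(self):
        self.state = np.random.uniform(-1.0, 1.0, size=2)
        return self.state
    def step(self, action):
        self.state = self.state + 0.1 * action
        reward = -float(np.sum(self.state ** 2))
        return self.state, reward, False

def rollout(weights, env, intrinsic_sigma=0.0, horizon=50):
    """Return of one episode; optional Gaussian noise on the action (intrinsic perturbation)."""
    state, ret = env.reset(), 0.0
    for _ in range(horizon):
        action = weights @ state                                             # deterministic linear policy
        action = action + intrinsic_sigma * np.random.randn(*action.shape)   # perturb neuronal activity
        state, reward, done = env.step(action)
        ret += reward
        if done:
            break
    return ret

def extrinsic_step(weights, env, popsize=10, sigma=0.1):
    """ES-flavoured exploration: perturb the synaptic weights and keep the best offspring.
    This simple hill-climber stands in for the CMA-ES-based NeuroESs discussed in the paper."""
    offspring = [weights + sigma * np.random.randn(*weights.shape) for _ in range(popsize)]
    returns = [rollout(w, env) for w in offspring]                           # policies act deterministically
    return offspring[int(np.argmax(returns))]

if __name__ == "__main__":
    env, weights = ToyEnv(), np.zeros((2, 2))
    for _ in range(20):
        weights = extrinsic_step(weights, env)                               # extrinsic: noise in weight space
    print("return after extrinsic search:", rollout(weights, env))
    print("return with intrinsic noise:  ", rollout(weights, env, intrinsic_sigma=0.1))
```

In this sketch the extrinsic variant explores by sampling whole candidate weight vectors and evaluating each with a deterministic policy, whereas the intrinsic variant keeps the weights fixed within an episode and explores through stochastic actions, which is the level at which policy gradient methods inject their perturbations.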

RIS

TY - GEN

T1 - Direct policy search: intrinsic vs. extrinsic perturbations

AU - Heidrich-Meisner, V.

AU - Igel, Christian

PY - 2010

Y1 - 2010

N2 - Reinforcement learning (RL) is a biologically inspired learning paradigm based on trial-and-error learning. A successful RL algorithm has to balance exploration of new behavioral strategies and exploitation of already obtained knowledge. In the initial learning phase exploration is the dominant process. Exploration is realized by stochastic perturbations, which can be applied at different levels. When considering direct policy search in the space of neural network policies, exploration can be applied on the synaptic level or on the level of neuronal activity. We propose neuroevolution strategies (NeuroESs) for direct policy search in RL. Learning using NeuroESs can be interpreted as modelling of extrinsic perturbations on the level of synaptic weights. In contrast, policy gradient methods (PGMs) can be regarded as intrinsic perturbation of neuronal activity. We compare these two approaches conceptually and experimentally.

AB - Reinforcement learning (RL) is a biologically inspired learning paradigm based on trial-and-error learning. A successful RL algorithm has to balance exploration of new behavioral strategies and exploitation of already obtained knowledge. In the initial learning phase exploration is the dominant process. Exploration is realized by stochastic perturbations, which can be applied at different levels. When considering direct policy search in the space of neural network policies, exploration can be applied on the synaptic level or on the level of neuronal activity. We propose neuroevolution strategies (NeuroESs) for direct policy search in RL. Learning using NeuroESs can be interpreted as modelling of extrinsic perturbations on the level of synaptic weights. In contrast, policy gradient methods (PGMs) can be regarded as intrinsic perturbation of neuronal activity. We compare these two approaches conceptually and experimentally.

M3 - Article in proceedings

T3 - Machine Learning Reports

SP - 33

EP - 39

BT - Workshop New Challenges in Neural Computation

A2 - Hammer, B.

A2 - Villmann, T.

Y2 - 21 September 2010 through 21 September 2010

ER -

ID: 33863042