Rafati Heravi, Jacob.
Learning Representations in Reinforcement Learning.
Record type: Bibliographic - electronic resource : Monograph/item
Title/Author: Learning Representations in Reinforcement Learning.
Author: Rafati Heravi, Jacob.
Publisher: Ann Arbor : ProQuest Dissertations & Theses, 2019
Physical description: 155 p.
Note: Source: Dissertations Abstracts International, Volume: 81-03, Section: B.
Note: Advisor: Noelle, David C.
Contained by: Dissertations Abstracts International 81-03B.
Subject: Artificial intelligence.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=13880098
ISBN: 9781085630283
MARC record:
LDR  03557nmm a2200313 4500
001  570768
005  20200514111955.5
008  200901s2019 ||||||||||||||||| ||eng d
020    $a 9781085630283
035    $a (MiAaPQ)AAI13880098
035    $a AAI13880098
040    $a MiAaPQ $c MiAaPQ
100 1  $a Rafati Heravi, Jacob. $3 857460
245 10 $a Learning Representations in Reinforcement Learning.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2019
300    $a 155 p.
500    $a Source: Dissertations Abstracts International, Volume: 81-03, Section: B.
500    $a Advisor: Noelle, David C.
502    $a Thesis (Ph.D.)--University of California, Merced, 2019.
506    $a This item must not be sold to any third party vendors.
520    $a Reinforcement Learning (RL) algorithms allow artificial agents to improve their action selection policy to increase rewarding experiences in their environments. The Temporal Difference (TD) learning algorithm, a model-free RL method, attempts to find an optimal policy through learning the values of the agent's actions at any state by computing the expected future rewards, without having access to a model of the environment. TD algorithms have been very successful on a broad range of control tasks, but learning can become intractably slow as the state space grows. This has motivated methods for using parameterized function approximation for the value function and developing methods for learning internal representations of the agent's state, to effectively reduce the size of the state space and restructure state representations in order to support generalization. This dissertation investigates biologically inspired techniques for learning useful state representations in RL, as well as optimization methods for improving learning. There are three parts to this investigation. First, failures of deep RL algorithms to solve some relatively simple control problems are explored. Taking inspiration from the sparse codes produced by lateral inhibition in the brain, this dissertation offers a method for learning sparse state representations. Second, the challenges of RL in efficient exploration of environments with sparse, delayed reward feedback, as well as the scalability issues in large-scale applications, are addressed. The hierarchical structure of motor control in the brain prompts the consideration of approaches to learning action selection policies at multiple levels of temporal abstraction, that is, learning to select subgoals separately from the action selection policies that achieve those subgoals. This dissertation offers a novel model-free Hierarchical Reinforcement Learning framework, including approaches to automatic subgoal discovery based on unsupervised learning over memories of past experiences. Third, more complex optimization methods than those typically used in deep learning and deep RL are explored, focusing on improving learning while avoiding the need to fine-tune many hyperparameters. This dissertation offers limited-memory quasi-Newton optimization methods to efficiently solve highly nonlinear and nonconvex optimization problems for deep learning and deep RL applications. Together, these three contributions provide a foundation for scaling RL to more complex control problems through the learning of improved internal representations.
590    $a School code: 1660.
650  4 $a Artificial intelligence. $3 194058
650  4 $a Computer science. $3 199325
650  4 $a Applied mathematics. $3 377601
690    $a 0800
690    $a 0984
690    $a 0364
710 2  $a University of California, Merced. $b Electrical Engineering and Computer Science. $3 857461
773 0  $t Dissertations Abstracts International $g 81-03B.
790    $a 1660
791    $a Ph.D.
792    $a 2019
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=13880098
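
To make the ideas in the 520 abstract above concrete, here is a minimal sketch of temporal-difference value learning with linear function approximation over a sparse state encoding. It illustrates the general technique only, not the dissertation's implementation; the chain environment, feature sizes, and hyperparameters are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    n_states, n_features, k_active = 50, 200, 10   # assumed sizes
    alpha, gamma = 0.1, 0.99                       # step size and discount factor

    # Fixed random sparse encoder: each state activates only k_active of n_features,
    # a stand-in for a learned sparse representation (e.g. one shaped by lateral inhibition).
    phi = np.zeros((n_states, n_features))
    for s in range(n_states):
        phi[s, rng.choice(n_features, size=k_active, replace=False)] = 1.0

    w = np.zeros(n_features)                       # value-function weights: V(s) ~ w . phi[s]

    def td0_update(s, r, s_next, done):
        # One TD(0) update toward the bootstrapped target r + gamma * V(s').
        target = r + (0.0 if done else gamma * (w @ phi[s_next]))
        td_error = target - w @ phi[s]
        w[:] += alpha * td_error * phi[s]
        return td_error

    # Toy usage: a random walk on a chain, reward 1.0 for reaching the right end.
    for episode in range(200):
        s = n_states // 2
        done = False
        while not done:
            s_next = s + rng.choice([-1, 1])
            done = s_next in (0, n_states - 1)
            r = 1.0 if s_next == n_states - 1 else 0.0
            td0_update(s, r, s_next, done)
            s = s_next

The abstract's third theme is limited-memory quasi-Newton optimization. The sketch below uses SciPy's generic L-BFGS-B routine to minimize a small nonconvex loss in place of plain stochastic gradient descent; it is not the dissertation's algorithm, and the tiny model and synthetic data are assumptions for illustration.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 5))
    y = np.sin(X @ rng.normal(size=5))             # synthetic regression targets (assumption)

    def loss(w_flat):
        # Mean squared error of a tiny nonlinear model tanh(X w); nonconvex in w.
        pred = np.tanh(X @ w_flat)
        return float(np.mean((pred - y) ** 2))

    # L-BFGS-B keeps only a limited history of curvature pairs, so memory stays small
    # even for large parameter vectors; gradients are approximated numerically here.
    res = minimize(loss, x0=np.zeros(5), method="L-BFGS-B")
    print("final loss:", res.fun)
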
Holdings (1 item):
Barcode: 000000178142
Location: Electronic collection
Circulation category: Book
Material type: Thesis/dissertation
Call number: TH 2019
Use type: General use (Normal)
Loan status: On shelf
Holds: 0
Multimedia file: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=13880098