Hierarchical DDPG for manipulator motion planning in dynamic environments
dc.contributor.author | Um, Dugan | |
dc.contributor.author | Nethala, Prasad | |
dc.contributor.author | Shin, Hocheol | |
dc.creator.orcid | https://orcid.org/0000-0002-7489-2991 | en_US |
dc.creator.orcid | https://orcid.org/0000-0002-7489-2991 | |
dc.date.accessioned | 2022-09-07T21:35:29Z | |
dc.date.available | 2022-09-07T21:35:29Z | |
dc.date.issued | 2022-08-03 | |
dc.description.abstract | In this paper, a hierarchical reinforcement learning (HRL) architecture, namely a “Hierarchical Deep Deterministic Policy Gradient (HDDPG)”, is proposed and studied. An HDDPG uses a manager and worker formation similar to other HRL structures. Unlike others, however, the HDDPG enables workers and the manager to share an identical environment and state, while each Deep Deterministic Policy Gradient (DDPG) agent requires a unique reward system. The HDDPG therefore allows easy structural expansion, with the manager selecting a worker's action probabilistically. Owing to this innate structural advantage, the HDDPG is well suited to building a general AI that handles complex time-horizon tasks with various conflicting sub-goals. The experimental results demonstrate its usefulness on a manipulator motion planning problem in a dynamic environment, where path planning and collision avoidance conflict with each other. The proposed HDDPG is compared with a HAM and a single DDPG for performance evaluation. The results show that the HDDPG achieves more than a 40% reward gain and more than twice the reward improvement rate. Another important feature of the proposed HDDPG is its biased manager training capability: by adding a preference factor to each worker, the manager can be trained to prefer a certain worker, achieving a better success rate for a specific objective when needed. | en_US |
dc.identifier.citation | Um, D., Nethala, P., & Shin, H. (2022). Hierarchical DDPG for Manipulator Motion Planning in Dynamic Environments. AI, 3(3), 645–658. MDPI AG. Retrieved from http://dx.doi.org/10.3390/ai3030037 | en_US |
dc.identifier.doi | https://doi.org/10.3390/ai3030037 | |
dc.identifier.uri | https://hdl.handle.net/1969.6/93950 | |
dc.language.iso | en_US | en_US |
dc.rights | Attribution 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
dc.subject | reinforcement learning (RL) | en_US |
dc.subject | hierarchical reinforcement learning (HRL) | en_US |
dc.subject | deep deterministic policy gradient (DDPG) | en_US |
dc.subject | hierarchical deep deterministic policy gradient (HDDPG) | en_US |
dc.subject | hierarchical abstract machines (HAM) | en_US |
dc.title | Hierarchical DDPG for manipulator motion planning in dynamic environments | en_US |
dc.type | Article | en_US |
Files
Original bundle
1 - 1 of 1
- Name:
- Hierarchical DDPG for Manipulator Motion Planning in Dynamic Environments.pdf
- Size:
- 6.13 MB
- Format:
- Adobe Portable Document Format
- Description:
License bundle
1 - 1 of 1
- Name:
- license.txt
- Size:
- 1.72 KB
- Format:
- Item-specific license agreed upon submission
- Description: