Hierarchical DDPG for manipulator motion planning in dynamic environments
dc.contributor.author | Um, Dugan | |
dc.contributor.author | Nethala, Prasad | |
dc.contributor.author | Shin, Hocheol | |
dc.creator.orcid | https://orcid.org/0000-0002-7489-2991 | en_US |
dc.creator.orcid | https://orcid.org/0000-0002-7489-2991 | |
dc.date.accessioned | 2022-09-07T21:35:29Z | |
dc.date.available | 2022-09-07T21:35:29Z | |
dc.date.issued | 2022-08-03 | |
dc.description.abstract | In this paper, a hierarchical reinforcement learning (HRL) architecture, namely a “Hierarchical Deep Deterministic Policy Gradient (HDDPG)”, is proposed and studied. An HDDPG uses a manager and worker formation similar to other HRL structures. Unlike others, however, the HDDPG enables workers and the manager to share an identical environment and state, while each Deep Deterministic Policy Gradient (DDPG) agent requires a unique reward system. The HDDPG therefore allows easy structural expansion, with the manager selecting a worker's action probabilistically. Owing to this innate structural advantage, the HDDPG is well suited to building a general AI that handles complex time-horizon tasks with various conflicting sub-goals. The experimental results demonstrate its usefulness on a manipulator motion planning problem in a dynamic environment, where path planning and collision avoidance conflict with each other. The proposed HDDPG is compared with a HAM and a single DDPG for performance evaluation. The results show that the HDDPG achieves more than a 40% reward gain and more than twice the reward improvement rate. Another important feature of the proposed HDDPG is its biased manager training capability: by adding a preference factor to each worker, the manager can be trained to prefer a certain worker, achieving a better success rate for a specific objective when needed. | en_US |
dc.identifier.citation | Um, D., Nethala, P., & Shin, H. (2022). Hierarchical DDPG for Manipulator Motion Planning in Dynamic Environments. AI, 3(3), 645–658. MDPI AG. Retrieved from http://dx.doi.org/10.3390/ai3030037 | en_US |
dc.identifier.doi | https://doi.org/10.3390/ai3030037 | |
dc.identifier.uri | https://hdl.handle.net/1969.6/93950 | |
dc.language.iso | en_US | en_US |
dc.rights | Attribution 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
dc.subject | reinforcement learning (RL) | en_US |
dc.subject | hierarchical reinforcement learning (HRL) | en_US |
dc.subject | deep deterministic policy gradient (DDPG) | en_US |
dc.subject | hierarchical deep deterministic policy gradient (HDDPG) | en_US |
dc.subject | hierarchical abstract machines (HAM) | en_US |
dc.title | Hierarchical DDPG for manipulator motion planning in dynamic environments | en_US |
dc.type | Article | en_US |
Files
Original bundle
1 - 1 of 1
- Name:
- Hierarchical DDPG for Manipulator Motion Planning in Dynamic Environments.pdf
- Size:
- 6.13 MB
- Format:
- Adobe Portable Document Format
- Description:
License bundle
1 - 1 of 1
- Name:
- license.txt
- Size:
- 1.72 KB
- Format:
- Item-specific license agreed upon submission
- Description: