Autonomous harvesting via hierarchical reinforcement learning in dynamic environments

dc.contributor.advisor: Huang, Yuxia (Lucy)
dc.contributor.advisor: Um, Dugan
dc.contributor.author: Nethala, Prasad
dc.contributor.committeeMember: Starek, Michael
dc.creator.orcid: https://orcid.org/0000-0002-7489-2991
dc.date.accessioned: 2023-08-07T20:01:52Z
dc.date.available: 2023-08-07T20:01:52Z
dc.date.issued: 2023-05
dc.description: A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE in Geospatial Surveying Engineering from Texas A&M University-Corpus Christi.
dc.description.abstract: Smart farming not only requires geospatial navigation but also relies on microprocessors and sensors to perform functions such as controlling temperature and irrigation systems. Advanced phenotyping modalities such as IoT and digital-twin technologies have raised agricultural productivity to an unprecedented extent. However, crop cultivation and harvesting technology must advance further to take full advantage of data-driven crop production. Farming areas are often unstructured, with dynamic objects such as human workers and farm machinery, so a smart harvesting robot needs autonomous navigation and obstacle avoidance. Because the objectives of goal-reaching and obstacle avoidance conflict, especially in a dynamic environment, harvesting is a challenging task for a robotic system. This thesis proposes a novel Hierarchical Reinforcement Learning architecture: a robust, multitask-capable AI model that enables an autonomous mobile manipulator to achieve terrain coverage while avoiding dynamic obstacles. The manipulator is assumed to be equipped with a sensitive skin for omnidirectional proximity sensing. The proposed architecture is modeled with both the Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO) algorithms, yielding two hierarchical variants, Hierarchical Deep Deterministic Policy Gradient (HDDPG) and Hierarchical Proximal Policy Optimization (HPPO), each of which autonomously manages two separate agents for the goal-reaching and obstacle-avoidance objectives. Transfer learning is adopted both to assess whether the trained models were overfit or underfit and to learn a generalized policy. After training in a simple environment with few constraints, the algorithms were evaluated in a simulated crowded orchard containing a variety of dynamic obstacles, where the task was to collect fallen fruit. The evaluation metrics include harvesting percentage, number of goal touches, number of obstacle touches, navigation distance, and navigation time. HDDPG outperformed the remaining algorithms by 70% in total average reward and traveled the minimum pixel distance, whereas HPPO collected the most fruit; DDPG and PPO were unable to complete the test environment because they became trapped in local minima. Both hierarchical architectures, HDDPG and HPPO, generalized successfully to new situations beyond the training environments with robust performance.
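
The two-level arbitration the abstract describes, a supervisory layer delegating to separate goal-reaching and obstacle-avoidance agents, can be illustrated with a minimal Python sketch. This is not the thesis implementation: the class names, the skin-sensor threshold, and the simple switching rule below are hypothetical assumptions standing in for the trained HDDPG/HPPO coordination.

    import numpy as np

    class SubPolicy:
        """Stand-in for a trained low-level actor (e.g., a DDPG or PPO policy)."""
        def __init__(self, name):
            self.name = name

        def act(self, observation):
            # A real actor network would map the observation to a continuous action.
            return np.zeros(2)

    class HierarchicalController:
        """Top-level agent that arbitrates between the two sub-objectives."""
        def __init__(self, skin_threshold=0.3):
            self.goal_reacher = SubPolicy("goal_reaching")
            self.avoider = SubPolicy("obstacle_avoidance")
            # Hypothetical normalized proximity below which the omnidirectional
            # "sensitive skin" hands control to the avoidance agent.
            self.skin_threshold = skin_threshold

        def act(self, observation, skin_readings):
            # Any nearby dynamic obstacle triggers avoidance; otherwise pursue the goal.
            if np.min(skin_readings) < self.skin_threshold:
                active = self.avoider
            else:
                active = self.goal_reacher
            return active.name, active.act(observation)

    # One control step: all skin readings are far, so the goal-reaching agent acts.
    controller = HierarchicalController()
    observation = np.random.rand(8)        # placeholder state vector
    skin = np.array([0.9, 0.7, 0.8, 0.6])  # normalized proximity readings
    policy_name, action = controller.act(observation, skin)
    print(policy_name, action)

In the thesis, this arbitration is managed by the learned hierarchical policy rather than a fixed rule; the hard-coded threshold above is only a readable stand-in for that learned coordination.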
dc.description.college: College of Engineering
dc.description.department: Geospatial Surveying Engineering
dc.format.extent: 67 pages
dc.identifier.uri: https://hdl.handle.net/1969.6/96898
dc.language.iso: en_US
dc.rights: This material is made available for use in research, teaching, and private study, pursuant to U.S. Copyright law. The user assumes full responsibility for any use of the materials, including but not limited to, infringement of copyright and publication rights of reproduced materials. Any materials used should be fully credited with their source. All rights are reserved and retained regardless of current or future development or laws that may apply to fair use standards. Permission for publication of this material, in part or in full, must be secured with the author and/or publisher.
dc.subject: HDDPG
dc.subject: hierarchical reinforcement learning
dc.subject: HPPO
dc.subject: reinforcement learning
dc.title: Autonomous harvesting via hierarchical reinforcement learning in dynamic environments
dc.type: Text
dc.type.genre: Thesis
thesis.degree.discipline: Geospatial Surveying Engineering
thesis.degree.grantor: Texas A&M University-Corpus Christi
thesis.degree.level: Masters
thesis.degree.name: Master of Science

Files

Original bundle

Name: Nethala_Prasad_Thesis.pdf
Size: 1.04 MB
Format: Adobe Portable Document Format
