Autonomous harvesting via hierarchical reinforcement learning in dynamic environments

dc.contributor.advisor: Huang, Yuxia (Lucy)
dc.contributor.advisor: Um, Dugan
dc.contributor.author: Nethala, Prasad
dc.contributor.committeeMember: Starek, Michael
dc.creator.orcid: https://orcid.org/0000-0002-7489-2991
dc.date.accessioned: 2023-08-07T20:01:52Z
dc.date.available: 2023-08-07T20:01:52Z
dc.date.issued: 2023-05
dc.description: A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE in Geospatial Surveying Engineering from Texas A&M University-Corpus Christi.
dc.description.abstract: Smart farming not only requires geospatial navigation but also relies on microprocessors and sensors to perform functions such as controlling temperature and irrigation systems. Advanced phenotyping modalities such as IoT and digital-twin technologies have raised agricultural productivity to an unprecedented extent. However, crop cultivation and harvesting technology must advance further to take full advantage of data-driven crop production. Farming areas are often unstructured, with dynamic objects such as human workers and farm machinery, so a smart harvesting robot needs autonomous navigation and obstacle avoidance. Because the objectives of goal-reaching and obstacle avoidance conflict, especially in a dynamic environment, harvesting is a challenging task for a robotic system. This thesis proposes a novel Hierarchical Reinforcement Learning architecture: a robust, multitask-capable AI model that enables an autonomous mobile manipulator to achieve terrain coverage while avoiding dynamic obstacles. The manipulator is assumed to be equipped with a sensitive skin for omnidirectional proximity sensing. The proposed architecture is modeled with both the Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO) algorithms, yielding two hierarchical variants, Hierarchical Deep Deterministic Policy Gradient (HDDPG) and Hierarchical Proximal Policy Optimization (HPPO), each of which autonomously manages two separate agents for the goal-reaching and obstacle-avoidance objectives. Transfer learning is adopted both to assess whether the trained models were overfit or underfit and to learn a generalized policy. After training in a simple environment with few constraints, the algorithms were evaluated in a simulated crowded orchard containing a variety of dynamic obstacles, where the task was to collect fallen fruit. The evaluation metrics include harvesting percentage, number of goal touches, number of obstacle touches, navigation distance, and navigation time. HDDPG outperformed the remaining algorithms by 70% in total average reward and traveled the minimum pixel distance, whereas HPPO collected the most fruit; DDPG and PPO were unable to complete the test environment because they became trapped in local minima. Both hierarchical architectures, HDDPG and HPPO, generalized successfully to new situations beyond the training environments with robust performance.
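
The two-level arbitration the abstract describes, a supervisory layer delegating to separate goal-reaching and obstacle-avoidance agents, can be illustrated with a minimal Python sketch. This is not the thesis implementation: the class names, the skin-sensor threshold, and the simple switching rule below are hypothetical assumptions standing in for the trained HDDPG/HPPO coordination.

    import numpy as np

    class SubPolicy:
        """Stand-in for a trained low-level actor (e.g., a DDPG or PPO policy)."""
        def __init__(self, name):
            self.name = name

        def act(self, observation):
            # A real actor network would map the observation to a continuous action.
            return np.zeros(2)

    class HierarchicalController:
        """Top-level agent that arbitrates between the two sub-objectives."""
        def __init__(self, skin_threshold=0.3):
            self.goal_reacher = SubPolicy("goal_reaching")
            self.avoider = SubPolicy("obstacle_avoidance")
            # Hypothetical normalized proximity below which the omnidirectional
            # "sensitive skin" hands control to the avoidance agent.
            self.skin_threshold = skin_threshold

        def act(self, observation, skin_readings):
            # Any nearby dynamic obstacle triggers avoidance; otherwise pursue the goal.
            if np.min(skin_readings) < self.skin_threshold:
                active = self.avoider
            else:
                active = self.goal_reacher
            return active.name, active.act(observation)

    # One control step: all skin readings are far, so the goal-reaching agent acts.
    controller = HierarchicalController()
    observation = np.random.rand(8)        # placeholder state vector
    skin = np.array([0.9, 0.7, 0.8, 0.6])  # normalized proximity readings
    policy_name, action = controller.act(observation, skin)
    print(policy_name, action)

In the thesis, this arbitration is managed by the learned hierarchical policy rather than a fixed rule; the hard-coded threshold above is only a readable stand-in for that learned coordination.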
dc.description.college: College of Engineering
dc.description.department: Geospatial Surveying Engineering
dc.format.extent: 67 pages
dc.identifier.uri: https://hdl.handle.net/1969.6/96898
dc.language.iso: en_US
dc.rights: This material is made available for use in research, teaching, and private study, pursuant to U.S. Copyright law. The user assumes full responsibility for any use of the materials, including but not limited to, infringement of copyright and publication rights of reproduced materials. Any materials used should be fully credited with their source. All rights are reserved and retained regardless of current or future development or laws that may apply to fair use standards. Permission for publication of this material, in part or in full, must be secured with the author and/or publisher.
dc.subject: HDDPG
dc.subject: hierarchical reinforcement learning
dc.subject: HPPO
dc.subject: reinforcement learning
dc.title: Autonomous harvesting via hierarchical reinforcement learning in dynamic environments
dc.type: Text
dc.type.genre: Thesis
thesis.degree.discipline: Geospatial Surveying Engineering
thesis.degree.grantor: Texas A&M University-Corpus Christi
thesis.degree.level: Masters
thesis.degree.name: Master of Science

Files

Original bundle

Name: Nethala_Prasad_Thesis.pdf
Size: 1.04 MB
Format: Adobe Portable Document Format
