Deep Reinforcement Learning as Guidance for Aerospace Robotics



Hovell, Kirk Charles




The ability of a manipulator-equipped chaser spacecraft to autonomously capture a target spacecraft is an unsolved prerequisite for space debris removal and on-orbit servicing. This thesis investigates the use of deep reinforcement learning (DRL) to improve the capabilities of a manipulator-equipped chaser at this task. DRL allows behaviour to be learned, rather than designed, according to a simple reward function. However, DRL learns this behaviour through trial and error, which is not feasible to perform on board a spacecraft. Training must therefore be performed in simulation, with the resulting behaviour transferred to the spacecraft. Transferring behaviour learned in simulation to a real robot is difficult due to dynamics differences between the simulator and the real world, i.e., the simulation-to-reality gap. This thesis develops, over the course of four increasingly difficult applications, a solution to the simulation-to-reality gap by restricting DRL to learning only the guidance portion of the guidance, navigation, and control system needed for autonomous spacecraft operations. The first application is spacecraft proximity operations (without capture), where a DRL-based guidance strategy that issues desired-velocity signals is designed, trained, and evaluated in simulation and experiment. Next, the DRL-based guidance strategy is improved and applied to a quadrotor proximity-operations scenario. Here, it is demonstrated in simulation and experiment that desired-acceleration signals lead to better performance than desired-velocity signals. These two proof-of-concept results show that the proposed DRL-based guidance strategy is viable for bringing DRL to real aerospace vehicles.
Next, the DRL-based guidance strategy is applied to a more difficult scenario: a multi-agent cooperative quadrotor runway-inspection task, in which fault-tolerant behaviour is successfully learned and demonstrated both in simulation and at a real, outdoor, GPS-driven quadrotor facility. Finally, with the DRL-based guidance strategy now developed, the author returns to the central motivation for this research: autonomous manipulator-based capture of a spinning spacecraft. The DRL-based guidance strategy learns this task in simulation and is successfully transferred to an experimental facility, where similar results are obtained. Capture also succeeds in experiment despite large perturbations and initial conditions not seen during training. Improvements to the experimental facility were made to enable this research.
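The core architectural idea in the abstract can be sketched as follows: a learned policy produces only a guidance signal (e.g., a desired velocity), while a conventional tracking controller, which remains unchanged between simulation and reality, converts that signal into actuator commands. This is a minimal, hypothetical illustration, not the thesis's actual implementation; the network, its random weights, and the gain value are stand-ins for a trained policy and tuned controller.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer policy standing in for a trained DRL guidance network:
# maps the state (relative position and velocity) to a desired velocity.
W1, b1 = rng.standard_normal((16, 4)) * 0.1, np.zeros(16)
W2, b2 = rng.standard_normal((2, 16)) * 0.1, np.zeros(2)

def guidance_policy(state):
    """Learned guidance: state -> desired-velocity command."""
    h = np.tanh(W1 @ state + b1)
    return W2 @ h + b2

def velocity_controller(v_desired, v_actual, kp=2.0):
    """Conventional controller that tracks the guidance signal.

    Because only guidance is learned, this controller stays fixed when
    moving from simulation to the real vehicle, which is what narrows
    the simulation-to-reality gap.
    """
    return kp * (v_desired - v_actual)  # commanded acceleration

# One guidance-plus-control step for a chaser at position p with velocity v.
p = np.array([1.0, -0.5])
v = np.array([0.0, 0.2])
state = np.concatenate([p, v])

v_des = guidance_policy(state)          # learned guidance layer
accel_cmd = velocity_controller(v_des, v)  # conventional control layer
```

In this split, retraining or redeploying the policy never touches the controller, so the low-level dynamics seen by the vehicle are handled by a component that was never trained in simulation.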


Engineering - Aerospace
Artificial Intelligence




Carleton University


Supervisor and co-author: 
Steve Ulrich
Murat Bronz

Thesis Degree Name: 

Doctor of Philosophy

Thesis Degree Level: 


Thesis Degree Discipline: 

Engineering, Aerospace

Parent Collection: 

Theses and Dissertations

Items in CURVE are protected by copyright, with all rights reserved, unless otherwise indicated. They are made available with permission from the author(s).