Addressing Undesirable Emergent Behavior in Deep Reinforcement Learning UAS Ground Target Tracking

AIAA 2022-2544
Session: Other Topics in Aerospace XII

Deep reinforcement learning algorithms have been used to produce agent policies for unmanned air systems that track ground targets with non-gimballed cameras. These policies often rely on simplifying abstractions of the environment, which mandate a relatively small number of states and actions and sometimes produce undesirable emergent behavior. This paper investigates the use of a learning-based algorithm in conjunction with a flight controller to eliminate undesirable emergent behavior in the non-gimballed-camera, fixed-wing unmanned air system ground target tracking problem. The approaches investigated are the fidelity of the dynamical model, reward structure shaping, the low-level controller, and changes to the action space and duration. These approaches mitigate undesirable emergent behavior and result in a learning method that is stable during training, is resilient to hyperparameter values, and produces a flight controller that is able to keep the target in the image frame of the camera. Results presented in the paper show that a proper combination of these techniques can greatly reduce the likelihood of the agent exhibiting undesirable emergent behavior while still providing acceptable target tracking performance with minimal ringing and smooth learning.