## Publications

#### A BibTeX file that includes all references is found here. You can also follow our publication updates via Google Scholar.

## 2022

Bridging the model-reality gap with Lipschitz network adaptation
S. Zhou, K. Pereida, W. Zhao, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 7, iss. 1, p. 642–649, 2022.

As robots venture into the real world, they are subject to unmodeled dynamics and disturbances. Traditional model-based control approaches have been proven successful in relatively static and known operating environments. However, when an accurate model of the robot is not available, model-based design can lead to suboptimal and even unsafe behaviour. In this work, we propose a method that bridges the model-reality gap and enables the application of model-based approaches even if dynamic uncertainties are present. In particular, we present a learning-based model reference adaptation approach that makes a robot system, with possibly uncertain dynamics, behave as a predefined reference model. In turn, the reference model can be used for model-based controller design. In contrast to typical model reference adaptation control approaches, we leverage the representative power of neural networks to capture highly nonlinear dynamics uncertainties and guarantee stability by encoding a certifying Lipschitz condition in the architectural design of a special type of neural network called the Lipschitz network. Our approach applies to a general class of nonlinear control-affine systems even when our prior knowledge about the true robot system is limited. We demonstrate our approach in flying inverted pendulum experiments, where an off-the-shelf quadrotor is challenged to balance an inverted pendulum while hovering or tracking circular trajectories.

@article{zhou-ral22,
author = {Siqi Zhou and Karime Pereida and Wenda Zhao and Angela P. Schoellig},
title = {Bridging the model-reality gap with {Lipschitz} network adaptation},
journal = {{IEEE Robotics and Automation Letters}},
year = {2022},
volume = {7},
number = {1},
pages = {642--649},
doi = {10.1109/LRA.2021.3131698},
urlvideo = {http://tiny.cc/lipnet-pendulum},
abstract = {As robots venture into the real world, they are subject to unmodeled dynamics and disturbances. Traditional model-based control approaches have been proven successful in relatively static and known operating environments. However, when an accurate model of the robot is not available, model-based design can lead to suboptimal and even unsafe behaviour. In this work, we propose a method that bridges the model-reality gap and enables the application of model-based approaches even if dynamic uncertainties are present. In particular, we present a learning-based model reference adaptation approach that makes a robot system, with possibly uncertain dynamics, behave as a predefined reference model. In turn, the reference model can be used for model-based controller design. In contrast to typical model reference adaptation control approaches, we leverage the representative power of neural networks to capture highly nonlinear dynamics uncertainties and guarantee stability by encoding a certifying Lipschitz condition in the architectural design of a special type of neural network called the Lipschitz network. Our approach applies to a general class of nonlinear control-affine systems even when our prior knowledge about the true robot system is limited. We demonstrate our approach in flying inverted pendulum experiments, where an off-the-shelf quadrotor is challenged to balance an inverted pendulum while hovering or tracking circular trajectories.}
}

Tag-based visual-inertial localization of unmanned aerial vehicles in indoor construction environments using an on-manifold extended Kalman filter
N. Kayhani, W. Zhao, B. McCabe, and A. P. Schoellig
Automation in Construction, vol. 135, p. 104112, 2022.

Automated visual data collection using autonomous unmanned aerial vehicles (UAVs) can improve the accessibility and accuracy of the frequent data required for indoor construction inspections and tracking. However, robust localization, as a critical enabler for autonomy, is challenging in ever-changing, cluttered, GPS-denied indoor construction environments. Rapid alterations and repetitive low-texture areas on indoor construction sites jeopardize the reliability of typical vision-based solutions. This research proposes a tag-based visual-inertial localization method for off-the-shelf UAVs with only a camera and an inertial measurement unit (IMU). Given that tag locations are known in the BIM, the proposed method estimates the UAV’s global pose by fusing inertial data and tag measurements using an on-manifold extended Kalman filter (EKF). The root-mean-square error (RMSE) achieved in our experiments in laboratory and simulation, being as low as 2–5 cm, indicates the potential of deploying the proposed method for autonomous navigation of low-cost UAVs in indoor construction environments.

@article{kayhani-autocon22,
author = {Navid Kayhani and Wenda Zhao and Brenda McCabe and Angela P. Schoellig},
title = {Tag-based visual-inertial localization of unmanned aerial vehicles in indoor construction environments using an on-manifold extended {Kalman} filter},
journal = {Automation in Construction},
year = {2022},
volume = {135},
pages = {104112},
issn = {0926-5805},
doi = {10.1016/j.autcon.2021.104112},
url = {https://www.sciencedirect.com/science/article/pii/S092658052100563X},
keywords = {Indoor localization, Unmanned aerial vehicle, Extended Kalman filter, SE(3), On-manifold state estimation, Autonomous navigation, Building information model, Construction robotics, AprilTag},
abstract = {Automated visual data collection using autonomous unmanned aerial vehicles (UAVs) can improve the accessibility and accuracy of the frequent data required for indoor construction inspections and tracking. However, robust localization, as a critical enabler for autonomy, is challenging in ever-changing, cluttered, GPS-denied indoor construction environments. Rapid alterations and repetitive low-texture areas on indoor construction sites jeopardize the reliability of typical vision-based solutions. This research proposes a tag-based visual-inertial localization method for off-the-shelf UAVs with only a camera and an inertial measurement unit (IMU). Given that tag locations are known in the BIM, the proposed method estimates the UAV's global pose by fusing inertial data and tag measurements using an on-manifold extended Kalman filter (EKF). The root-mean-square error (RMSE) achieved in our experiments in laboratory and simulation, being as low as 2--5 cm, indicates the potential of deploying the proposed method for autonomous navigation of low-cost UAVs in indoor construction environments.}
}

## 2021

Robust adaptive model predictive control for guaranteed fast and accurate stabilization in the presence of model errors
K. Pereida, L. Brunke, and A. P. Schoellig
International Journal of Robust and Nonlinear Control, vol. 31, iss. 18, p. 8750–8784, 2021.

@article{pereida-ijrnc21,
author = {Karime Pereida and Lukas Brunke and Angela P. Schoellig},
title = {Robust adaptive model predictive control for guaranteed fast and accurate stabilization in the presence of model errors},
journal = {{International Journal of Robust and Nonlinear Control}},
year = {2021},
volume = {31},
number = {18},
pages = {8750--8784},
doi = {10.1002/rnc.5712},
}

A deep learning approach for rock fragmentation analysis
T. Bamford, K. Esmaeili, and A. P. Schoellig
International Journal of Rock Mechanics and Mining Sciences, vol. 145, p. 104839, 2021.

In mining operations, blast-induced rock fragmentation affects the productivity and efficiency of downstream operations including digging, hauling, crushing, and grinding. Continuous measurement of rock fragmentation is essential for optimizing blast design. Current methods of rock fragmentation analysis rely on either physical screening of blasted rock material or image analysis of the blasted muckpiles; both are time consuming. This study aims to present and evaluate the measurement of rock fragmentation using deep learning strategies. A deep neural network (DNN) architecture was used to predict characteristic sizes of rock fragments from a 2D image of a muckpile. The data set used for training the DNN model is composed of 61,853 labelled images of blasted rock fragments. An exclusive data set of 1,263 labelled images was used to test the DNN model. The percent error for coarse characteristic size prediction ranges within ±25% when evaluated using the test set. Model validation on orthomosaics for two muckpiles shows that the deep learning method achieves good accuracy (lower mean percent error) compared to manual image labelling. Validation on screened piles shows that the DNN model prediction is similar to manual labelling accuracy when compared with sieving analysis.

@article{bamford-ijrmms21,
title = {A deep learning approach for rock fragmentation analysis},
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
journal = {{International Journal of Rock Mechanics and Mining Sciences}},
year = {2021},
volume = {145},
doi = {10.1016/j.ijrmms.2021.104839},
pages = {104839},
abstract = {In mining operations, blast-induced rock fragmentation affects the productivity and efficiency of downstream operations including digging, hauling, crushing, and grinding. Continuous measurement of rock fragmentation is essential for optimizing blast design. Current methods of rock fragmentation analysis rely on either physical screening of blasted rock material or image analysis of the blasted muckpiles; both are time consuming. This study aims to present and evaluate the measurement of rock fragmentation using deep learning strategies. A deep neural network (DNN) architecture was used to predict characteristic sizes of rock fragments from a 2D image of a muckpile. The data set used for training the DNN model is composed of 61,853 labelled images of blasted rock fragments. An exclusive data set of 1,263 labelled images was used to test the DNN model. The percent error for coarse characteristic size prediction ranges within ±25\% when evaluated using the test set. Model validation on orthomosaics for two muckpiles shows that the deep learning method achieves good accuracy (lower mean percent error) compared to manual image labelling. Validation on screened piles shows that the DNN model prediction is similar to manual labelling accuracy when compared with sieving analysis.},
}

Meta learning with paired forward and inverse models for efficient receding horizon control
C. D. McKinnon and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 6, iss. 2, p. 3240–3247, 2021.

This paper presents a model-learning method for Stochastic Model Predictive Control (SMPC) that is both accurate and computationally efficient. We assume that the control input affects the robot dynamics through an unknown (but invertible) nonlinear function. By learning this unknown function and its inverse, we can use the value of the function as a new control input (which we call the input feature) that is optimised by SMPC in place of the original control input. This removes the need to evaluate a function approximator for the unknown function during optimisation in SMPC (where it would be evaluated many times), reducing the computational cost. The learned inverse is evaluated only once at each sampling time to convert the optimal input feature from SMPC to a control input to apply to the system. We assume that the remaining unknown dynamics can be accurately represented as a model that is linear in a set of coefficients, which enables fast adaptation to new conditions. We demonstrate our approach in experiments on a large ground robot using a stereo camera for localisation.

@article{mckinnon-ral21,
title = {Meta Learning With Paired Forward and Inverse Models for Efficient Receding Horizon Control},
author = {Christopher D. McKinnon and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2021},
volume = {6},
number = {2},
pages = {3240--3247},
doi = {10.1109/LRA.2021.3063957},
abstract = {This paper presents a model-learning method for Stochastic Model Predictive Control (SMPC) that is both accurate and computationally efficient. We assume that the control input affects the robot dynamics through an unknown (but invertible) nonlinear function. By learning this unknown function and its inverse, we can use the value of the function as a new control input (which we call the input feature) that is optimised by SMPC in place of the original control input. This removes the need to evaluate a function approximator for the unknown function during optimisation in SMPC (where it would be evaluated many times), reducing the computational cost. The learned inverse is evaluated only once at each sampling time to convert the optimal input feature from SMPC to a control input to apply to the system. We assume that the remaining unknown dynamics can be accurately represented as a model that is linear in a set of coefficients, which enables fast adaptation to new conditions. We demonstrate our approach in experiments on a large ground robot using a stereo camera for localisation.}
}
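
The input-feature idea in the abstract above can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the authors' implementation: the nonlinearity `g`, the bisection-based `g_inverse` (standing in for a learned inverse model), the scalar dynamics, and the one-step planner `plan_feature` are all hypothetical and chosen for illustration.

```python
import math

# Stand-in for the unknown input nonlinearity (hypothetical; the paper learns it from data).
def g(u):
    return math.tanh(u) + 0.5 * u  # strictly increasing, so it is invertible

# Stand-in for the learned inverse: bisection on the monotone function g.
def g_inverse(v, lo=-10.0, hi=10.0, iters=60):
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < v:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

A = 0.9  # toy scalar dynamics: x_next = A * x + g(u)

def plan_feature(x, x_ref):
    # One-step "optimiser" over the input feature v = g(u).
    # The dynamics are linear in v, so g is never evaluated here.
    return x_ref - A * x

x, x_ref = 1.0, 0.2
v_star = plan_feature(x, x_ref)   # optimal input feature
u_star = g_inverse(v_star)        # inverse evaluated once per sampling time
x_next = A * x + g(u_star)        # apply the real input to the "true" system
```

The planner only ever works with `v`, so its cost does not depend on how expensive the model of `g` is; the inverse is called a single time to recover the input that is actually applied.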

Do we need to compensate for motion distortion and Doppler effects in spinning radar navigation?
K. Burnett, A. P. Schoellig, and T. D. Barfoot
IEEE Robotics and Automation Letters, vol. 6, iss. 2, p. 771–778, 2021.

@article{burnett-ral21,
title = {Do We Need to Compensate for Motion Distortion and {Doppler} Effects in Spinning Radar Navigation?},
author = {Keenan Burnett and Angela P. Schoellig and Timothy D. Barfoot},
journal = {{IEEE Robotics and Automation Letters}},
year = {2021},
volume = {6},
number = {2},
pages = {771--778},
doi = {10.1109/LRA.2021.3052439},
}

Learning-based bias correction for time difference of arrival ultra-wideband localization of resource-constrained mobile robots
W. Zhao, J. Panerati, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 6, iss. 2, p. 3639–3646, 2021.

Accurate indoor localization is a crucial enabling technology for many robotics applications, from warehouse management to monitoring tasks. Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization is a promising lightweight, low-cost solution that can scale to a large number of devices, making it especially suited for resource-constrained multi-robot applications. However, the localization accuracy of standard, commercially available UWB radios is often insufficient due to significant measurement bias and outliers. In this letter, we address these issues by proposing a robust UWB TDOA localization framework comprising (i) learning-based bias correction and (ii) M-estimation-based robust filtering to handle outliers. The key properties of our approach are that (i) the learned biases generalize to different UWB anchor setups and (ii) the approach is computationally efficient enough to run on resource-constrained hardware. We demonstrate our approach on a Crazyflie nano-quadcopter. Experimental results show that the proposed localization framework, relying only on the onboard IMU and UWB, provides an average of 42.08% localization error reduction (in three different anchor setups) compared to the baseline approach without bias compensation. We also show autonomous trajectory tracking on a quadcopter using our UWB TDOA localization approach.

@article{zhao-ral21,
title = {Learning-based Bias Correction for Time Difference of Arrival Ultra-wideband Localization of Resource-constrained Mobile Robots},
author = {Wenda Zhao and Jacopo Panerati and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2021},
volume = {6},
number = {2},
pages = {3639--3646},
doi = {10.1109/LRA.2021.3064199},
urlvideo = {https://youtu.be/J32mrDN5ws4},
abstract = {Accurate indoor localization is a crucial enabling technology for many robotics applications, from warehouse management to monitoring tasks. Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization is a promising lightweight, low-cost solution that can scale to a large number of devices, making it especially suited for resource-constrained multi-robot applications. However, the localization accuracy of standard, commercially available UWB radios is often insufficient due to significant measurement bias and outliers. In this letter, we address these issues by proposing a robust UWB TDOA localization framework comprising (i) learning-based bias correction and (ii) M-estimation-based robust filtering to handle outliers. The key properties of our approach are that (i) the learned biases generalize to different UWB anchor setups and (ii) the approach is computationally efficient enough to run on resource-constrained hardware. We demonstrate our approach on a Crazyflie nano-quadcopter. Experimental results show that the proposed localization framework, relying only on the onboard IMU and UWB, provides an average of 42.08\% localization error reduction (in three different anchor setups) compared to the baseline approach without bias compensation. We also show autonomous trajectory tracking on a quadcopter using our UWB TDOA localization approach.}
}
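
As a rough illustration of what M-estimation-based robust filtering can look like, the sketch below down-weights outlying measurements in a scalar Kalman-style update. It is an assumption-laden example, not the paper's filter: the Huber weight function, the tuning constant `k = 1.345`, and the names `huber_weight` and `robust_update` are illustrative choices that do not come from the abstract.

```python
def huber_weight(residual, sigma, k=1.345):
    # Huber weight: 1 for small normalised residuals, k/|z| for large ones.
    z = abs(residual) / sigma
    return 1.0 if z <= k else k / z

def robust_update(x, P, z, R, k=1.345):
    # Scalar measurement update with direct observation model z = x + noise.
    r = z - x                       # innovation
    sigma = (P + R) ** 0.5          # innovation standard deviation
    w = huber_weight(r, sigma, k)
    R_eff = R / w                   # inflate measurement noise for outliers
    K = P / (P + R_eff)             # Kalman gain with the inflated noise
    return x + K * r, (1.0 - K) * P

# Inlier: behaves like a standard Kalman update (gain 0.5 here).
x_in, P_in = robust_update(0.0, 1.0, 0.5, 1.0)
# Outlier: the gain shrinks, so the estimate moves far less than halfway to 10.
x_out, P_out = robust_update(0.0, 1.0, 10.0, 1.0)
```

With these numbers, a standard update would move the estimate to 5.0 for the outlying measurement, whereas the weighted update stays much closer to the prior.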

Exploiting differential flatness for robust learning-based tracking control using Gaussian processes
M. Greeff and A. Schoellig
IEEE Control Systems Letters, vol. 5, iss. 4, p. 1121–1126, 2021.

Learning-based control has been shown to outperform conventional model-based techniques in the presence of model uncertainties and systematic disturbances. However, most state-of-the-art learning-based nonlinear trajectory tracking controllers still lack any formal guarantees. In this letter, we exploit the property of differential flatness to design an online, robust learning-based controller to achieve both high tracking performance and probabilistically guarantee a uniform ultimate bound on the tracking error. A common control approach for differentially flat systems is to try to linearize the system by using a feedback (FB) linearization controller designed based on a nominal system model. Performance and safety are limited by the mismatch between the nominal model and the actual system. Our proposed approach uses a nonparametric Gaussian Process (GP) to both improve FB linearization and quantify, probabilistically, the uncertainty in our FB linearization. We use this probabilistic bound in a robust linear quadratic regulator (LQR) framework. Through simulation, we highlight that our proposed approach significantly outperforms alternative learning-based strategies that use differential flatness.

@article{greeff-lcss21,
title = {Exploiting Differential Flatness for Robust Learning-Based Tracking Control using {Gaussian} Processes},
author = {Melissa Greeff and Angela Schoellig},
journal = {{IEEE Control Systems Letters}},
year = {2021},
volume = {5},
number = {4},
pages = {1121--1126},
doi = {10.1109/LCSYS.2020.3009177},
urlvideo = {https://youtu.be/ZFzZkKjQ3qw},
abstract = {Learning-based control has been shown to outperform conventional model-based techniques in the presence of model uncertainties and systematic disturbances. However, most state-of-the-art learning-based nonlinear trajectory tracking controllers still lack any formal guarantees. In this letter, we exploit the property of differential flatness to design an online, robust learning-based controller to achieve both high tracking performance and probabilistically guarantee a uniform ultimate bound on the tracking error. A common control approach for differentially flat systems is to try to linearize the system by using a feedback (FB) linearization controller designed based on a nominal system model. Performance and safety are limited by the mismatch between the nominal model and the actual system. Our proposed approach uses a nonparametric Gaussian Process (GP) to both improve FB linearization and quantify, probabilistically, the uncertainty in our FB linearization. We use this probabilistic bound in a robust linear quadratic regulator (LQR) framework. Through simulation, we highlight that our proposed approach significantly outperforms alternative learning-based strategies that use differential flatness.}
}

Online spatio-temporal calibration of tightly-coupled ultrawideband-aided inertial localization
A. Goudar and A. P. Schoellig
in Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2021. Accepted.

The combination of ultrawideband (UWB) radios and inertial measurement units (IMU) can provide accurate positioning in environments where the Global Positioning System (GPS) service is either unavailable or has unsatisfactory performance. The two sensors, IMU and UWB radio, are often not co-located on a moving system. The UWB radio is typically located at the extremities of the system to ensure reliable communication, whereas the IMUs are located closer to its center of gravity. Furthermore, without hardware or software synchronization, data from heterogeneous sensors can arrive at different time instants resulting in temporal offsets. If uncalibrated, these spatial and temporal offsets can degrade the positioning performance. In this paper, using observability and identifiability criteria, we derive the conditions required for successfully calibrating the spatial and the temporal offset parameters of a tightly-coupled UWB-IMU system. We also present an online method for jointly calibrating these offsets. The results show that our calibration approach results in improved positioning accuracy while simultaneously estimating (i) the spatial offset parameters to millimeter precision and (ii) the temporal offset parameter to millisecond precision.

@INPROCEEDINGS{goudar-iros21,
author = {Abhishek Goudar and Angela P. Schoellig},
title = {Online Spatio-temporal Calibration of Tightly-coupled Ultrawideband-aided Inertial Localization},
booktitle = {{Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS)}},
year = {2021},
note = {Accepted},
abstract = {The combination of ultrawideband (UWB) radios and inertial measurement units (IMU) can provide accurate positioning in environments where the Global Positioning System (GPS) service is either unavailable or has unsatisfactory performance. The two sensors, IMU and UWB radio, are often not co-located on a moving system. The UWB radio is typically located at the extremities of the system to ensure reliable communication, whereas the IMUs are located closer to its center of gravity. Furthermore, without hardware or software synchronization, data from heterogeneous sensors can arrive at different time instants resulting in temporal offsets. If uncalibrated, these spatial and temporal offsets can degrade the positioning performance. In this paper, using observability and identifiability criteria, we derive the conditions required for successfully calibrating the spatial and the temporal offset parameters of a tightly-coupled UWB-IMU system. We also present an online method for jointly calibrating these offsets. The results show that our calibration approach results in improved positioning accuracy while simultaneously estimating (i) the spatial offset parameters to millimeter precision and (ii) the temporal offset parameter to millisecond precision.},
}

Mobile manipulation in unknown environments with differential inverse kinematics control
A. Heins, M. Jakob, and A. P. Schoellig
in Proc. of the Conference on Robots and Vision (CRV), 2021, p. 64–71.

Mobile manipulators combine the large workspace of mobile robots with the interactive capabilities of manipulator arms, making them useful in a variety of domains including construction and assistive care. We propose a differential inverse kinematics whole-body control approach for position-controlled industrial mobile manipulators. Our controller is capable of task-space trajectory tracking, force regulation, obstacle and singularity avoidance, and pushing an object toward a goal location, with limited sensing and knowledge of the environment. We evaluate the proposed approach through extensive experiments on a 9 degree-of-freedom omnidirectional mobile manipulator. A video demonstrating many of the experiments can be found at http://tiny.cc/crv21-mm.

@INPROCEEDINGS{heins-crv21,
author = {Adam Heins and Michael Jakob and Angela P. Schoellig},
title = {Mobile Manipulation in Unknown Environments with Differential Inverse Kinematics Control},
booktitle = {{Proc. of the Conference on Robots and Vision (CRV)}},
year = {2021},
pages = {64--71},
urlvideo = {http://tiny.cc/crv21-mm},
abstract = {Mobile manipulators combine the large workspace of mobile robots with the interactive capabilities of manipulator arms, making them useful in a variety of domains including construction and assistive care. We propose a differential inverse kinematics whole-body control approach for position-controlled industrial mobile manipulators. Our controller is capable of task-space trajectory tracking, force regulation, obstacle and singularity avoidance, and pushing an object toward a goal location, with limited sensing and knowledge of the environment. We evaluate the proposed approach through extensive experiments on a 9 degree-of-freedom omnidirectional mobile manipulator. A video demonstrating many of the experiments can be found at http://tiny.cc/crv21-mm.},
}

Learning to fly—a Gym environment with PyBullet physics for reinforcement learning of multi-agent quadcopter control
J. Panerati, H. Zheng, S. Zhou, J. Xu, A. Prorok, and A. P. Schoellig
in Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2021. Accepted.

Robotic simulators are crucial for academic research and education as well as the development of safety-critical applications. Reinforcement learning environments (simple simulations coupled with a problem specification in the form of a reward function) are also important to standardize the development (and benchmarking) of learning algorithms. Yet, full-scale simulators typically lack portability and parallelizability. Vice versa, many reinforcement learning environments trade off realism for high sample throughputs in toy-like problems. While public data sets have greatly benefited deep learning and computer vision, we still lack the software tools to simultaneously develop, and fairly compare, control theory and reinforcement learning approaches. In this paper, we propose an open-source OpenAI Gym-like environment for multiple quadcopters based on the Bullet physics engine. Its multi-agent and vision-based reinforcement learning interfaces, as well as the support of realistic collisions and aerodynamic effects, make it, to the best of our knowledge, a first of its kind. We demonstrate its use through several examples, either for control (trajectory tracking with PID control, multi-robot flight with downwash, etc.) or reinforcement learning (single and multi-agent stabilization tasks), hoping to inspire future research that combines control theory and machine learning.

@INPROCEEDINGS{panerati-iros21,
author = {Jacopo Panerati and Hehui Zheng and SiQi Zhou and James Xu and Amanda Prorok and Angela P. Schoellig},
title = {Learning to Fly---a {Gym} Environment with {PyBullet} Physics for Reinforcement Learning of Multi-agent Quadcopter Control},
booktitle = {{Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS)}},
year = {2021},
note = {Accepted},
abstract = {Robotic simulators are crucial for academic research and education as well as the development of safety-critical applications. Reinforcement learning environments (simple simulations coupled with a problem specification in the form of a reward function) are also important to standardize the development (and benchmarking) of learning algorithms. Yet, full-scale simulators typically lack portability and parallelizability. Vice versa, many reinforcement learning environments trade off realism for high sample throughputs in toy-like problems. While public data sets have greatly benefited deep learning and computer vision, we still lack the software tools to simultaneously develop, and fairly compare, control theory and reinforcement learning approaches. In this paper, we propose an open-source OpenAI Gym-like environment for multiple quadcopters based on the Bullet physics engine. Its multi-agent and vision-based reinforcement learning interfaces, as well as the support of realistic collisions and aerodynamic effects, make it, to the best of our knowledge, a first of its kind. We demonstrate its use through several examples, either for control (trajectory tracking with PID control, multi-robot flight with downwash, etc.) or reinforcement learning (single and multi-agent stabilization tasks), hoping to inspire future research that combines control theory and machine learning.},
}

## 2020

Haul road monitoring in open pit mines using unmanned aerial vehicles: a case study at Bald Mountain mine site
F. Medinac, T. Bamford, M. Hart, M. Kowalczyk, and K. Esmaeili
Mining, Metallurgy & Exploration, vol. 37, p. 1877–1883, 2020.

Improved haul road conditions can positively impact mine operations resulting in increased safety, productivity gains, increased tire life, and lower maintenance costs. For these reasons, a monitoring program is required to ensure the operational efficiency of the haul roads. Currently, at Bald Mountain mine, monthly site severity studies, ad hoc inspections by frontline supervisors, or operator feedback reporting is used to assess road conditions. These methods are subjective and provide low temporal resolution data. This case study presents novel unmanned aerial vehicle (UAV) technologies, applied on a critical section of haul road at Bald Mountain, to showcase the potential for monitoring haul roads. The results show that orthophotos and digital elevation models can be used to assess the road smoothness condition and to check the road design compliance. Moreover, the aerial mapping allows detection of surface water, rock spillage, and potholes on the road that can be quickly repaired/removed by the dedicated road maintenance team.

@article{medinac-mme20,
title = {Haul road monitoring in open pit mines using unmanned aerial vehicles: A case study at {Bald Mountain} mine site},
author = {Filip Medinac and Thomas Bamford and Matthew Hart and Michal Kowalczyk and Kamran Esmaeili},
journal = {{Mining, Metallurgy \& Exploration}},
year = {2020},
volume = {37},
pages = {1877--1883},
doi = {10.1007/s42461-020-00291-w},
abstract = {Improved haul road conditions can positively impact mine operations resulting in increased safety, productivity gains, increased tire life, and lower maintenance costs. For these reasons, a monitoring program is required to ensure the operational efficiency of the haul roads. Currently, at Bald Mountain mine, monthly site severity studies, ad hoc inspections by frontline supervisors, or operator feedback reporting is used to assess road conditions. These methods are subjective and provide low temporal resolution data. This case study presents novel unmanned aerial vehicle (UAV) technologies, applied on a critical section of haul road at Bald Mountain, to showcase the potential for monitoring haul roads. The results show that orthophotos and digital elevation models can be used to assess the road smoothness condition and to check the road design compliance. Moreover, the aerial mapping allows detection of surface water, rock spillage, and potholes on the road that can be quickly repaired/removed by the dedicated road maintenance team.},
}

Deep neural networks as add-on modules for enhancing robot performance in impromptu trajectory tracking
S. Zhou, M. K. Helwa, and A. P. Schoellig
The International Journal of Robotics Research, p. 1–22, 2020.

High-accuracy trajectory tracking is critical to many robotic applications, including search and rescue, advanced manufacturing, and industrial inspection, to name a few. Yet the unmodeled dynamics and parametric uncertainties of operating in such complex environments make it difficult to design controllers that are capable of accurately tracking arbitrary, feasible trajectories from the first attempt (i.e., impromptu trajectory tracking). This article proposes a platform-independent, learning-based “add-on” module to enhance the tracking performance of black-box control systems in impromptu tracking tasks. Our approach is to pre-cascade a deep neural network (DNN) to a stabilized baseline control system, in order to establish an identity mapping from the desired output to the actual output. Previous research involving quadrotors showed that, for 30 arbitrary hand-drawn trajectories, the DNN-enhancement control architecture reduces tracking errors by 43% on average, as compared with the baseline controller. In this article, we provide a platform-independent formulation and practical design guidelines for the DNN-enhancement approach. In particular, we: (1) characterize the underlying function of the DNN module; (2) identify necessary conditions for the approach to be effective; (3) provide theoretical insights into the stability of the overall DNN-enhancement control architecture; (4) derive a condition that supports data-efficient training of the DNN module; and (5) compare the novel theory-driven DNN design with the prior trial-and-error design using detailed quadrotor experiments. We show that, as compared with the prior trial-and-error design, the novel theory-driven design allows us to reduce the input dimension of the DNN by two thirds while achieving similar tracking performance.

@article{zhou-ijrr20,
title = {Deep neural networks as add-on modules for enhancing robot performance in impromptu trajectory tracking},
author = {Siqi Zhou and Mohamed K. Helwa and Angela P. Schoellig},
journal = {{The International Journal of Robotics Research}},
year = {2020},
volume = {0},
number = {0},
pages = {1--22},
doi = {10.1177/0278364920953902},
urlvideo = {https://youtu.be/K-DrZGFvpN4},
abstract = {High-accuracy trajectory tracking is critical to many robotic applications, including search and rescue, advanced manufacturing, and industrial inspection, to name a few. Yet the unmodeled dynamics and parametric uncertainties of operating in such complex environments make it difficult to design controllers that are capable of accurately tracking arbitrary, feasible trajectories from the first attempt (i.e., impromptu trajectory tracking). This article proposes a platform-independent, learning-based ``add-on'' module to enhance the tracking performance of black-box control systems in impromptu tracking tasks. Our approach is to pre-cascade a deep neural network (DNN) to a stabilized baseline control system, in order to establish an identity mapping from the desired output to the actual output. Previous research involving quadrotors showed that, for 30 arbitrary hand-drawn trajectories, the DNN-enhancement control architecture reduces tracking errors by 43\% on average, as compared with the baseline controller. In this article, we provide a platform-independent formulation and practical design guidelines for the DNN-enhancement approach. In particular, we: (1) characterize the underlying function of the DNN module; (2) identify necessary conditions for the approach to be effective; (3) provide theoretical insights into the stability of the overall DNN-enhancement control architecture; (4) derive a condition that supports data-efficient training of the DNN module; and (5) compare the novel theory-driven DNN design with the prior trial-and-error design using detailed quadrotor experiments. We show that, as compared with the prior trial-and-error design, the novel theory-driven design allows us to reduce the input dimension of the DNN by two thirds while achieving similar tracking performance.}
}

Continuous monitoring and improvement of the blasting process in open pit mines using unmanned aerial vehicle techniques
T. Bamford, F. Medinac, and K. Esmaeili
Remote Sensing, vol. 12, iss. 17, p. 2801, 2020.

The current techniques used for monitoring the blasting process in open pit mines are manual, intermittent and inefficient and can expose technical manpower to hazardous conditions. This study presents the application of unmanned aerial vehicle (UAV) systems for monitoring and improving the blasting process in open pit mines. Field experiments were conducted in different open pit mines to assess rock fragmentation, blast-induced damage on final pit walls, blast dynamics and the accuracy of blastholes including production and pre-split holes. The UAV-based monitoring was done in three different stages, including pre-blasting, blasting and post-blasting. In the pre-blasting stage, pit walls were mapped to collect structural data to predict in situ block size distribution and to develop as-built pit wall digital elevation models (DEM) to assess blast-induced damage. This was followed by mapping the production blasthole patterns implemented in the mine to investigate drillhole alignment. To monitor the blasting process, a high-speed camera was mounted on the UAV to investigate blast initiation, sequencing, misfired holes and stemming ejection. In the post-blast stage, the blasted rock pile (muck pile) was monitored to estimate fragmentation and assess muck pile configuration, heave and throw. The collected aerial data provide detailed information and high spatial and temporal resolution on the quality of the blasting process and significant opportunities for process improvement. The current challenges with regard to the application of UAVs for blasting process monitoring are discussed, and recommendations for obtaining the most value out of a UAV application are provided.

@article{bamford-rs20,
title = {Continuous Monitoring and Improvement of the Blasting Process in Open Pit Mines Using Unmanned Aerial Vehicle Techniques},
author = {Thomas Bamford and Filip Medinac and Kamran Esmaeili},
journal = {{Remote Sensing}},
year = {2020},
volume = {12},
number = {17},
doi = {10.3390/rs12172801},
pages = {2801},
abstract = {The current techniques used for monitoring the blasting process in open pit mines are manual, intermittent and inefficient and can expose technical manpower to hazardous conditions. This study presents the application of unmanned aerial vehicle (UAV) systems for monitoring and improving the blasting process in open pit mines. Field experiments were conducted in different open pit mines to assess rock fragmentation, blast-induced damage on final pit walls, blast dynamics and the accuracy of blastholes including production and pre-split holes. The UAV-based monitoring was done in three different stages, including pre-blasting, blasting and post-blasting. In the pre-blasting stage, pit walls were mapped to collect structural data to predict in situ block size distribution and to develop as-built pit wall digital elevation models (DEM) to assess blast-induced damage. This was followed by mapping the production blasthole patterns implemented in the mine to investigate drillhole alignment. To monitor the blasting process, a high-speed camera was mounted on the UAV to investigate blast initiation, sequencing, misfired holes and stemming ejection. In the post-blast stage, the blasted rock pile (muck pile) was monitored to estimate fragmentation and assess muck pile configuration, heave and throw. The collected aerial data provide detailed information and high spatial and temporal resolution on the quality of the blasting process and significant opportunities for process improvement. The current challenges with regard to the application of UAVs for blasting process monitoring are discussed, and recommendations for obtaining the most value out of a UAV application are provided.},
}

To share or not to share? Performance guarantees and the asymmetric nature of cross-robot experience transfer
M. J. Sorocky, S. Zhou, and A. P. Schoellig
IEEE Control Systems Letters, vol. 5, iss. 3, p. 923–928, 2020.

In the robotics literature, experience transfer has been proposed in different learning-based control frameworks to minimize the costs and risks associated with training robots. While various works have shown the feasibility of transferring prior experience from a source robot to improve or accelerate the learning of a target robot, there are usually no guarantees that experience transfer improves the performance of the target robot. In practice, the efficacy of transferring experience is often not known until it is tested on physical robots. This trial-and-error approach can be extremely unsafe and inefficient. Building on our previous work, in this paper we consider an inverse module transfer learning framework, where the inverse module of a source robot system is transferred to a target robot system to improve its tracking performance on arbitrary trajectories. We derive a theoretical bound on the tracking error when a source inverse module is transferred to the target robot and propose a Bayesian-optimization-based algorithm to estimate this bound from data. We further highlight the asymmetric nature of cross-robot experience transfer that has often been neglected in the literature. We demonstrate our approach in quadrotor experiments and show that we can guarantee positive transfer on the target robot for tracking random periodic trajectories.

@article{sorocky-lcss20,
title = {To Share or Not to Share? Performance Guarantees and the Asymmetric Nature of Cross-Robot Experience Transfer},
author = {Michael J. Sorocky and Siqi Zhou and Angela P. Schoellig},
journal = {{IEEE Control Systems Letters}},
year = {2020},
volume = {5},
number = {3},
pages = {923--928},
doi = {10.1109/LCSYS.2020.3005886},
urlvideo2 = {https://youtu.be/wVAxJO-pejQ},
abstract = {In the robotics literature, experience transfer has been proposed in different learning-based control frameworks to minimize the costs and risks associated with training robots. While various works have shown the feasibility of transferring prior experience from a source robot to improve or accelerate the learning of a target robot, there are usually no guarantees that experience transfer improves the performance of the target robot. In practice, the efficacy of transferring experience is often not known until it is tested on physical robots. This trial-and-error approach can be extremely unsafe and inefficient. Building on our previous work, in this paper we consider an inverse module transfer learning framework, where the inverse module of a source robot system is transferred to a target robot system to improve its tracking performance on arbitrary trajectories. We derive a theoretical bound on the tracking error when a source inverse module is transferred to the target robot and propose a Bayesian-optimization-based algorithm to estimate this bound from data. We further highlight the asymmetric nature of cross-robot experience transfer that has often been neglected in the literature. We demonstrate our approach in quadrotor experiments and show that we can guarantee positive transfer on the target robot for tracking random periodic trajectories.}
}

Variational inference with parameter learning applied to vehicle trajectory estimation
J. N. Wong, D. J. Yoon, A. P. Schoellig, and T. D. Barfoot
IEEE Robotics and Automation Letters, vol. 5, iss. 4, p. 5291–5298, 2020.

We present parameter learning in a Gaussian variational inference setting using only noisy measurements (i.e., no groundtruth). This is demonstrated in the context of vehicle trajectory estimation, although the method we propose is general. The letter extends the Exactly Sparse Gaussian Variational Inference (ESGVI) framework, which has previously been used for large-scale nonlinear batch state estimation. Our contribution is to additionally learn parameters of our system models (which may be difficult to choose in practice) within the ESGVI framework. In this letter, we learn the covariances for the motion and sensor models used within vehicle trajectory estimation. Specifically, we learn the parameters of a white-noise-on-acceleration motion model and the parameters of an Inverse-Wishart prior over measurement covariances for our sensor model. We demonstrate our technique using a 36 km dataset consisting of a car using lidar to localize against a high-definition map; we learn the parameters on a training section of the data and then show that we achieve high-quality state estimates on a test section, even in the presence of outliers. Lastly, we show that our framework can be used to solve pose graph optimization even with many false loop closures.

@article{wong-ral20b,
title = {Variational Inference with Parameter Learning Applied to Vehicle Trajectory Estimation},
author = {Jeremy N. Wong and David J. Yoon and Angela P. Schoellig and Timothy D. Barfoot},
journal = {{IEEE Robotics and Automation Letters}},
year = {2020},
volume = {5},
number = {4},
pages = {5291--5298},
doi = {10.1109/LRA.2020.3007381},
urlvideo = {https://youtu.be/WTj7Cl0wXFo},
abstract = {We present parameter learning in a Gaussian variational inference setting using only noisy measurements (i.e., no groundtruth). This is demonstrated in the context of vehicle trajectory estimation, although the method we propose is general. The letter extends the Exactly Sparse Gaussian Variational Inference (ESGVI) framework, which has previously been used for large-scale nonlinear batch state estimation. Our contribution is to additionally learn parameters of our system models (which may be difficult to choose in practice) within the ESGVI framework. In this letter, we learn the covariances for the motion and sensor models used within vehicle trajectory estimation. Specifically, we learn the parameters of a white-noise-on-acceleration motion model and the parameters of an Inverse-Wishart prior over measurement covariances for our sensor model. We demonstrate our technique using a 36 km dataset consisting of a car using lidar to localize against a high-definition map; we learn the parameters on a training section of the data and then show that we achieve high-quality state estimates on a test section, even in the presence of outliers. Lastly, we show that our framework can be used to solve pose graph optimization even with many false loop closures.}
}

Online trajectory generation with distributed model predictive control for multi-robot motion planning
C. E. Luis, M. Vukosavljev, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 5, iss. 2, p. 604–611, 2020.

We present a distributed model predictive control (DMPC) algorithm to generate trajectories in real-time for multiple robots. We adopted the on-demand collision avoidance method presented in previous work to efficiently compute non-colliding trajectories in transition tasks. An event-triggered replanning strategy is proposed to account for disturbances. Our simulation results show that the proposed collision avoidance method can reduce, on average, around 50% of the travel time required to complete a multi-agent point-to-point transition when compared to the well-studied Buffered Voronoi Cells (BVC) approach. Additionally, it shows a higher success rate in transition tasks with a high density of agents, with more than 90% success rate with 30 palm-sized quadrotor agents in an 18 m³ arena. The approach was experimentally validated with a swarm of up to 20 drones flying in close proximity.

@article{luis-ral20,
title = {Online Trajectory Generation with Distributed Model Predictive Control for Multi-Robot Motion Planning},
author = {Carlos E. Luis and Marijan Vukosavljev and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2020},
volume = {5},
number = {2},
pages = {604--611},
doi = {10.1109/LRA.2020.2964159},
abstract = {We present a distributed model predictive control (DMPC) algorithm to generate trajectories in real-time for multiple robots. We adopted the on-demand collision avoidance method presented in previous work to efficiently compute non-colliding trajectories in transition tasks. An event-triggered replanning strategy is proposed to account for disturbances. Our simulation results show that the proposed collision avoidance method can reduce, on average, around 50\% of the travel time required to complete a multi-agent point-to-point transition when compared to the well-studied Buffered Voronoi Cells (BVC) approach. Additionally, it shows a higher success rate in transition tasks with a high density of agents, with more than 90\% success rate with 30 palm-sized quadrotor agents in an 18 m$^3$ arena. The approach was experimentally validated with a swarm of up to 20 drones flying in close proximity.}
}

A data-driven motion prior for continuous-time trajectory estimation on SE(3)
J. N. Wong, D. J. Yoon, A. P. Schoellig, and T. D. Barfoot
IEEE Robotics and Automation Letters, vol. 5, iss. 2, p. 1429–1436, 2020.

Simultaneous trajectory estimation and mapping (STEAM) is a method for continuous-time trajectory estimation in which the trajectory is represented as a Gaussian Process (GP). Previous formulations of STEAM used a GP prior that assumed either white-noise-on-acceleration (WNOA) or white-noise-on-jerk (WNOJ). However, previous work did not provide a principled way to choose the continuous-time motion prior or its parameters on a real robotic system. This paper derives a novel data-driven motion prior where ground truth trajectories of a moving robot are used to train a motion prior that better represents the robot’s motion. In this approach, we use a prior where latent accelerations are represented as a GP with a Matérn covariance function and draw a connection to the Singer acceleration model. We then formulate a variation of STEAM using this new prior. We train the WNOA, WNOJ, and our new latent-force prior and evaluate their performance in the context of both lidar localization and lidar odometry of a car driving along a 20 km route, where we show improved state estimates compared to the two previous formulations.

@article{wong-ral20,
title = {A Data-Driven Motion Prior for Continuous-Time Trajectory Estimation on {SE(3)}},
author = {Jeremy N. Wong and David J. Yoon and Angela P. Schoellig and Timothy D. Barfoot},
journal = {{IEEE Robotics and Automation Letters}},
year = {2020},
volume = {5},
number = {2},
pages = {1429--1436},
doi = {10.1109/LRA.2020.2969153},
urlvideo = {https://youtu.be/xUGl3w6meZg},
abstract = {Simultaneous trajectory estimation and mapping (STEAM) is a method for continuous-time trajectory estimation in which the trajectory is represented as a Gaussian Process (GP). Previous formulations of STEAM used a GP prior that assumed either white-noise-on-acceleration (WNOA) or white-noise-on-jerk (WNOJ). However, previous work did not provide a principled way to choose the continuous-time motion prior or its parameters on a real robotic system. This paper derives a novel data-driven motion prior where ground truth trajectories of a moving robot are used to train a motion prior that better represents the robot's motion. In this approach, we use a prior where latent accelerations are represented as a GP with a Mat\'{e}rn covariance function and draw a connection to the Singer acceleration model. We then formulate a variation of STEAM using this new prior. We train the WNOA, WNOJ, and our new latent-force prior and evaluate their performance in the context of both lidar localization and lidar odometry of a car driving along a 20 km route, where we show improved state estimates compared to the two previous formulations.}
}

Catch the ball: accurate high-speed motions for mobile manipulators via inverse dynamics learning
K. Dong, K. Pereida, F. Shkurti, and A. P. Schoellig
in Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2020, p. 6718–6725.

Mobile manipulators consist of a mobile platform equipped with one or more robot arms and are of interest for a wide array of challenging tasks because of their extended workspace and dexterity. Typically, mobile manipulators are deployed in slow-motion collaborative robot scenarios. In this paper, we consider scenarios where accurate high-speed motions are required. We introduce a framework for this regime of tasks including two main components: (i) a bi-level motion optimization algorithm for real-time trajectory generation, which relies on Sequential Quadratic Programming (SQP) and Quadratic Programming (QP), respectively; and (ii) a learning-based controller optimized for precise tracking of high-speed motions via a learned inverse dynamics model. We evaluate our framework with a mobile manipulator platform through numerous high-speed ball catching experiments, where we show a success rate of 85.33%. To the best of our knowledge, this success rate exceeds the reported performance of existing related systems and sets a new state of the art.

@INPROCEEDINGS{dong-iros20,
author = {Ke Dong and Karime Pereida and Florian Shkurti and Angela P. Schoellig},
title = {Catch the Ball: Accurate High-Speed Motions for Mobile Manipulators via Inverse Dynamics Learning},
booktitle = {{Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS)}},
year = {2020},
pages = {6718--6725},
urlvideo2 = {https://youtu.be/LlWN3cGUIbk},
abstract = {Mobile manipulators consist of a mobile platform equipped with one or more robot arms and are of interest for a wide array of challenging tasks because of their extended workspace and dexterity. Typically, mobile manipulators are deployed in slow-motion collaborative robot scenarios. In this paper, we consider scenarios where accurate high-speed motions are required. We introduce a framework for this regime of tasks including two main components: (i) a bi-level motion optimization algorithm for real-time trajectory generation, which relies on Sequential Quadratic Programming (SQP) and Quadratic Programming (QP), respectively; and (ii) a learning-based controller optimized for precise tracking of high-speed motions via a learned inverse dynamics model. We evaluate our framework with a mobile manipulator platform through numerous high-speed ball catching experiments, where we show a success rate of 85.33\%. To the best of our knowledge, this success rate exceeds the reported performance of existing related systems and sets a new state of the art.},
}

Optimal geometry for ultra-wideband localization using Bayesian optimization
W. Zhao, M. Vukosavljev, and A. P. Schoellig
in Proc. of the International Federation of Automatic Control (IFAC) World Congress, 2020, p. 15481–15488.

This paper introduces a novel algorithm to find a geometric configuration of ultra-wideband sources in order to provide optimal position estimation performance with Time-Difference-of-Arrival measurements. Different from existing works, we aim to achieve the best localization performance for a user-defined region of interest instead of a single target point. We employ an analysis based on the Cramér-Rao lower bound and dilution of precision to formulate an optimization problem. A Bayesian optimization-based algorithm is proposed to find an optimal geometry that achieves the smallest estimation variance upper bound while ensuring source placement constraints. The approach is validated through simulation and experimental results in 2D scenarios, showing an improvement over a naive source placement.

@INPROCEEDINGS{zhao-ifac20,
author = {Wenda Zhao and Marijan Vukosavljev and Angela P. Schoellig},
title = {Optimal Geometry for Ultra-wideband Localization using {Bayesian} Optimization},
booktitle = {{Proc. of the International Federation of Automatic Control (IFAC) World Congress}},
year = {2020},
volume = {53},
number = {2},
pages = {15481--15488},
urlvideo = {https://youtu.be/5mqKOfWpEWc},
abstract = {This paper introduces a novel algorithm to find a geometric configuration of ultra-wideband sources in order to provide optimal position estimation performance with Time-Difference-of-Arrival measurements. Different from existing works, we aim to achieve the best localization performance for a user-defined region of interest instead of a single target point. We employ an analysis based on the Cram\'{e}r-Rao lower bound and dilution of precision to formulate an optimization problem. A Bayesian optimization-based algorithm is proposed to find an optimal geometry that achieves the smallest estimation variance upper bound while ensuring source placement constraints. The approach is validated through simulation and experimental results in 2D scenarios, showing an improvement over a naive source placement.},
}

A perception-aware flatness-based model predictive controller for fast vision-based multirotor flight
M. Greeff, T. D. Barfoot, and A. P. Schoellig
in Proc. of the International Federation of Automatic Control (IFAC) World Congress, 2020, p. 9412–9419.

Despite the push toward fast, reliable vision-based multirotor flight, most vision-based navigation systems still rely on controllers that are perception-agnostic. Given that these controllers ignore their effect on the system’s localisation capabilities, they can produce an action that allows vision-based localisation (and consequently navigation) to fail. In this paper, we present a perception-aware flatness-based model predictive controller (MPC) that accounts for its effect on visual localisation. To achieve perception awareness, we first develop a simple geometric model that uses over 12 km of flight data from two different environments (urban and rural) to associate visual landmarks with a probability of being successfully matched. In order to ensure localisation, we integrate this model as a chance constraint in our MPC such that we are probabilistically guaranteed that the number of successfully matched visual landmarks exceeds a minimum threshold. We show how to simplify the chance constraint to a nonlinear, deterministic constraint on the position of the multirotor. With desired speeds of 10 m/s, we demonstrate in simulation (based on real-world perception data) how our proposed perception-aware MPC is able to achieve faster flight while guaranteeing localisation compared to similar perception-agnostic controllers. We illustrate how our perception-aware MPC adapts the path constraint along the path based on the perception model by accounting for camera orientation, path error and location of the visual landmarks. The result is that repeating the same geometric path but with the camera facing in opposite directions can lead to different optimal paths flown.

@INPROCEEDINGS{greeff-ifac20,
author = {Melissa Greeff and Timothy D. Barfoot and Angela P. Schoellig},
title = {A Perception-Aware Flatness-Based Model Predictive Controller for Fast Vision-Based Multirotor Flight},
booktitle = {{Proc. of the International Federation of Automatic Control (IFAC) World Congress}},
year = {2020},
volume = {53},
number = {2},
pages = {9412--9419},
urlvideo = {https://youtu.be/aBEce5aWfvk},
abstract = {Despite the push toward fast, reliable vision-based multirotor flight, most vision-based navigation systems still rely on controllers that are perception-agnostic. Given that these controllers ignore their effect on the system’s localisation capabilities, they can produce an action that allows vision-based localisation (and consequently navigation) to fail. In this paper, we present a perception-aware flatness-based model predictive controller (MPC) that accounts for its effect on visual localisation. To achieve perception awareness, we first develop a simple geometric model that uses over 12 km of flight data from two different environments (urban and rural) to associate visual landmarks with a probability of being successfully matched. In order to ensure localisation, we integrate this model as a chance constraint in our MPC such that we are probabilistically guaranteed that the number of successfully matched visual landmarks exceeds a minimum threshold. We show how to simplify the chance constraint to a nonlinear, deterministic constraint on the position of the multirotor. With desired speeds of 10 m/s, we demonstrate in simulation (based on real-world perception data) how our proposed perception-aware MPC is able to achieve faster flight while guaranteeing localisation compared to similar perception-agnostic controllers. We illustrate how our perception-aware MPC adapts the path constraint along the path based on the perception model by accounting for camera orientation, path error and location of the visual landmarks. The result is that repeating the same geometric path but with the camera facing in opposite directions can lead to different optimal paths flown.},
}

Visual localization with Google Earth images for robust global pose estimation of UAVs
B. Patel, T. D. Barfoot, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2020, p. 6491–6497.

We estimate the global pose of a multirotor UAV by visually localizing images captured during a flight with Google Earth images pre-rendered from known poses. We metrically localize real images with georeferenced rendered images using a dense mutual information technique to allow accurate global pose estimation in outdoor GPS-denied environments. We show the ability to consistently localize throughout a sunny summer day despite major lighting changes while demonstrating that a typical feature-based localizer struggles under the same conditions. Successful image registrations are used as measurements in a filtering framework to apply corrections to the pose estimated by a gimballed visual odometry pipeline. We achieve less than 1 metre and 1 degree RMSE on a 303 metre flight and less than 3 metres and 3 degrees RMSE on six 1132 metre flights as low as 36 metres above ground level conducted at different times of the day from sunrise to sunset.

@INPROCEEDINGS{patel-icra20,
title = {Visual Localization with {Google Earth} Images for Robust Global Pose Estimation of {UAV}s},
author = {Bhavit Patel and Timothy D. Barfoot and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2020},
pages = {6491--6497},
urlvideo = {https://tiny.cc/GElocalization},
abstract = {We estimate the global pose of a multirotor UAV by visually localizing images captured during a flight with Google Earth images pre-rendered from known poses. We metrically localize real images with georeferenced rendered images using a dense mutual information technique to allow accurate global pose estimation in outdoor GPS-denied environments. We show the ability to consistently localize throughout a sunny summer day despite major lighting changes while demonstrating that a typical feature-based localizer struggles under the same conditions. Successful image registrations are used as measurements in a filtering framework to apply corrections to the pose estimated by a gimballed visual odometry pipeline. We achieve less than 1 metre and 1 degree RMSE on a 303 metre flight and less than 3 metres and 3 degrees RMSE on six 1132 metre flights as low as 36 metres above ground level conducted at different times of the day from sunrise to sunset.}
}

Context-aware cost shaping to reduce the impact of model error in safe, receding horizon control
C. D. McKinnon and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2020, p. 2386–2392.

This paper presents a method to enable a robot using stochastic Model Predictive Control (MPC) to achieve high performance on a repetitive path-following task. In particular, we consider the case where the accuracy of the model for robot dynamics varies significantly over the path, motivated by the fact that the models used in MPC must be computationally efficient, which limits their expressive power. Our approach is based on correcting the cost predicted using a simple learned dynamics model over the MPC horizon. This discourages the controller from taking actions that lead to higher cost than would have been predicted using the dynamics model. In addition, stochastic MPC provides a quantitative measure of safety by limiting the probability of violating state and input constraints over the prediction horizon. Our approach is unique in that it combines both online model learning and cost learning over the prediction horizon and is geared towards operating a robot in changing conditions. We demonstrate our algorithm in simulation and experiment on a ground robot that uses a stereo camera for localization.

@INPROCEEDINGS{mckinnon-icra20,
title = {Context-aware Cost Shaping to Reduce the Impact of Model Error in Safe, Receding Horizon Control},
author = {Christopher D. McKinnon and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2020},
pages = {2386--2392},
doi = {10.1109/ICRA40945.2020.9197521},
urlvideo = {https://youtu.be/xrgcO2-A9bo},
abstract = {This paper presents a method to enable a robot using stochastic Model Predictive Control (MPC) to achieve high performance on a repetitive path-following task. In particular, we consider the case where the accuracy of the model for robot dynamics varies significantly over the path, motivated by the fact that the models used in MPC must be computationally efficient, which limits their expressive power. Our approach is based on correcting the cost predicted using a simple learned dynamics model over the MPC horizon. This discourages the controller from taking actions that lead to higher cost than would have been predicted using the dynamics model. In addition, stochastic MPC provides a quantitative measure of safety by limiting the probability of violating state and input constraints over the prediction horizon. Our approach is unique in that it combines both online model learning and cost learning over the prediction horizon and is geared towards operating a robot in changing conditions. We demonstrate our algorithm in simulation and experiment on a ground robot that uses a stereo camera for localization.}
}

Experience selection using dynamics similarity for efficient multi-source transfer learning between robots
M. J. Sorocky, S. Zhou, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2020, p. 2739–2745.

In the robotics literature, different knowledge transfer approaches have been proposed to leverage the experience from a source task or robot, real or virtual, to accelerate the learning process on a new task or robot. A commonly made but infrequently examined assumption is that incorporating experience from a source task or robot will be beneficial. For practical applications, inappropriate knowledge transfer can result in negative transfer or unsafe behaviour. In this work, inspired by a system gap metric from robust control theory, the nu-gap, we present a data-efficient algorithm for estimating the similarity between pairs of robot systems. In a multi-source inter-robot transfer learning setup, we show that this similarity metric allows us to predict relative transfer performance and thus informatively select experiences from a source robot before knowledge transfer. We demonstrate our approach with quadrotor experiments, where we transfer an inverse dynamics model from a real or virtual source quadrotor to enhance the tracking performance of a target quadrotor on arbitrary hand-drawn trajectories. We show that selecting experiences based on the proposed similarity metric effectively facilitates the learning of the target quadrotor, improving performance by 62% compared to a poorly selected experience.

@INPROCEEDINGS{sorocky-icra20,
author = {Michael J. Sorocky and Siqi Zhou and Angela P. Schoellig},
title = {Experience Selection Using Dynamics Similarity for Efficient Multi-Source Transfer Learning Between Robots},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2020},
pages = {2739--2745},
urlvideo = {https://youtu.be/8m3mOkljujM},
abstract = {In the robotics literature, different knowledge transfer approaches have been proposed to leverage the experience from a source task or robot—real or virtual—to accelerate the learning process on a new task or robot. A commonly made but infrequently examined assumption is that incorporating experience from a source task or robot will be beneficial. For practical applications, inappropriate knowledge transfer can result in negative transfer or unsafe behaviour. In this work, inspired by a system gap metric from robust control theory, the nu-gap, we present a data-efficient algorithm for estimating the similarity between pairs of robot systems. In a multi-source inter-robot transfer learning setup, we show that this similarity metric allows us to predict relative transfer performance and thus informatively select experiences from a source robot before knowledge transfer. We demonstrate our approach with quadrotor experiments, where we transfer an inverse dynamics model from a real or virtual source quadrotor to enhance the tracking performance of a target quadrotor on arbitrary hand-drawn trajectories. We show that selecting experiences based on the proposed similarity metric effectively facilitates the learning of the target quadrotor, improving performance by 62\% compared to a poorly selected experience.},
}

## 2019

Distributed iterative learning control for multi-agent systems
A. Hock and A. P. Schoellig
Autonomous Robots, vol. 43, iss. 8, p. 1989–2010, 2019.

The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle has access only to the information of its neighbors. The desired trajectory is only available to one (or few) vehicle(s). We present a distributed iterative learning control (ILC) approach where each vehicle learns from the experience of its own and its neighbors’ previous task repetitions and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove convergence of the learning scheme for any linear, causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function, which only depends on the tracking error derivative (D-type ILC). This extension provides more degrees of freedom in the ILC design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows the use of an additional consensus feedback controller to compensate for non-repetitive disturbances. Possible robustness extensions for the ILC algorithm are discussed, the so-called Q-filter and a Kalman filter for disturbance estimation. Finally, this is the first work to show distributed ILC in experiment. With a team of two quadrotors, the practical applicability of the proposed distributed multi-agent ILC approach is attested and the benefits of the theoretic extension are analyzed. In a second experimental setup with a team of four quadrotors, we evaluate the impact of different communication graph structures on the learning performance. The results indicate that there is a trade-off between fast learning convergence and formation synchronicity, especially during the first iterations.

@article{hock-auro19,
title = {Distributed iterative learning control for multi-agent systems},
author = {Andreas Hock and Angela P. Schoellig},
journal = {{Autonomous Robots}},
year = {2019},
volume = {43},
number = {8},
pages = {1989--2010},
doi = {10.1007/s10514-019-09845-4},
abstract = {The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle has access only to the information of its neighbors. The desired trajectory is only available to one (or few) vehicle(s). We present a distributed iterative learning control {(ILC)} approach where each vehicle learns from the experience of its own and its neighbors’ previous task repetitions and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove convergence of the learning scheme for any linear, causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function, which only depends on the tracking error derivative {(D-type ILC)}. This extension provides more degrees of freedom in the {ILC} design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows the use of an additional consensus feedback controller to compensate for non-repetitive disturbances. Possible robustness extensions for the {ILC} algorithm are discussed, the so-called {Q-filter} and a {Kalman} filter for disturbance estimation. Finally, this is the first work to show distributed ILC in experiment. With a team of two quadrotors, the practical applicability of the proposed distributed multi-agent {ILC} approach is attested and the benefits of the theoretic extension are analyzed. In a second experimental setup with a team of four quadrotors, we evaluate the impact of different communication graph structures on the learning performance. The results indicate that there is a trade-off between fast learning convergence and formation synchronicity, especially during the first iterations.}
}

A modular framework for motion planning using safe-by-design motion primitives
M. Vukosavljev, Z. Kroeze, A. P. Schoellig, and M. E. Broucke
IEEE Transactions on Robotics, vol. 35, iss. 5, p. 1233–1252, 2019.

In this paper, we present a modular framework for solving a motion planning problem among a group of robots. The proposed framework utilizes a finite set of low-level motion primitives to generate motions in a gridded workspace. The constraints on allowable sequences of motion primitives are formalized through a maneuver automaton. At the high level, a control policy determines which motion primitive is executed in each box of the gridded workspace. We state general conditions on motion primitives to obtain provably correct behavior so that a library of safe-by-design motion primitives can be designed. The overall framework yields a highly robust design by utilizing feedback strategies at both the low and high levels. We provide specific designs for motion primitives and control policies suitable for multirobot motion planning; the modularity of our approach enables one to independently customize the designs of each of these components. Our approach is experimentally validated on a group of quadrocopters.

@article{vukosavljev-tro19,
title = {A modular framework for motion planning using safe-by-design motion primitives},
author = {Marijan Vukosavljev and Zachary Kroeze and Angela P. Schoellig and Mireille E. Broucke},
journal = {{IEEE Transactions on Robotics}},
year = {2019},
volume = {35},
number = {5},
pages = {1233--1252},
doi = {10.1109/TRO.2019.2923335},
urlvideo = {http://tiny.cc/modular-3alg},
abstract = {In this paper, we present a modular framework for solving a motion planning problem among a group of robots. The proposed framework utilizes a finite set of low-level motion primitives to generate motions in a gridded workspace. The constraints on allowable sequences of motion primitives are formalized through a maneuver automaton. At the high level, a control policy determines which motion primitive is executed in each box of the gridded workspace. We state general conditions on motion primitives to obtain provably correct behavior so that a library of safe-by-design motion primitives can be designed. The overall framework yields a highly robust design by utilizing feedback strategies at both the low and high levels. We provide specific designs for motion primitives and control policies suitable for multirobot motion planning; the modularity of our approach enables one to independently customize the designs of each of these components. Our approach is experimentally validated on a group of quadrocopters.}
}

There’s no place like home: visual teach and repeat for emergency return of multirotor UAVs during GPS failure
M. Warren, M. Greeff, B. Patel, J. Collier, A. P. Schoellig, and T. D. Barfoot
IEEE Robotics and Automation Letters, vol. 4, iss. 1, p. 161–168, 2019.

Redundant navigation systems are critical for safe operation of UAVs in high-risk environments. Since most commercial UAVs almost wholly rely on GPS, jamming, interference and multi-pathing are real concerns that usually limit their operations to low-risk environments and VLOS. This paper presents a vision-based route-following system for the autonomous, safe return of UAVs under primary navigation failure such as GPS jamming. Using a Visual Teach and Repeat framework to build a visual map of the environment during an outbound flight, we show the autonomous return of the UAV by visually localising the live view to this map when a simulated GPS failure occurs, controlling the vehicle to follow the safe outbound path back to the launch point. Using gimbal-stabilised stereo vision alone, without reliance on external infrastructure or inertial sensing, Visual Odometry and localisation are achieved at altitudes of 5-25 m and flight speeds up to 55 km/h. We examine the performance of the visual localisation algorithm under a variety of conditions and also demonstrate closed-loop autonomy along a complicated 450 m path.

@article{warren-ral19,
title = {There's No Place Like Home: Visual Teach and Repeat for Emergency Return of Multirotor {UAV}s During {GPS} Failure},
author = {Michael Warren and Melissa Greeff and Bhavit Patel and Jack Collier and Angela P. Schoellig and Timothy D. Barfoot},
journal = {{IEEE Robotics and Automation Letters}},
year = {2019},
volume = {4},
number = {1},
pages = {161--168},
doi = {10.1109/LRA.2018.2883408},
urlvideo = {https://youtu.be/oJaQ4ZbvsFw},
abstract = {Redundant navigation systems are critical for safe operation of UAVs in high-risk environments. Since most commercial UAVs almost wholly rely on GPS, jamming, interference and multi-pathing are real concerns that usually limit their operations to low-risk environments and VLOS. This paper presents a vision-based route-following system for the autonomous, safe return of UAVs under primary navigation failure such as GPS jamming. Using a Visual Teach and Repeat framework to build a visual map of the environment during an outbound flight, we show the autonomous return of the UAV by visually localising the live view to this map when a simulated GPS failure occurs, controlling the vehicle to follow the safe outbound path back to the launch point. Using gimbal-stabilised stereo vision alone, without reliance on external infrastructure or inertial sensing, Visual Odometry and localisation are achieved at altitudes of 5-25 m and flight speeds up to 55 km/h. We examine the performance of the visual localisation algorithm under a variety of conditions and also demonstrate closed-loop autonomy along a complicated 450 m path.}
}

Provably robust learning-based approach for high-accuracy tracking control of Lagrangian systems
M. K. Helwa, A. Heins, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 4, iss. 2, p. 1587–1594, 2019.

Lagrangian systems represent a wide range of robotic systems, including manipulators, wheeled and legged robots, and quadrotors. Inverse dynamics control and feed-forward linearization techniques are typically used to convert the complex nonlinear dynamics of Lagrangian systems to a set of decoupled double integrators, and then a standard, outer-loop controller can be used to calculate the commanded acceleration for the linearized system. However, these methods typically depend on having a very accurate system model, which is often not available in practice. While this challenge has been addressed in the literature using different learning approaches, most of these approaches do not provide safety guarantees in terms of stability of the learning-based control system. In this paper, we provide a novel, learning-based control approach based on Gaussian processes (GPs) that ensures both stability of the closed-loop system and high-accuracy tracking. We use GPs to approximate the error between the commanded acceleration and the actual acceleration of the system, and then use the predicted mean and variance of the GP to calculate an upper bound on the uncertainty of the linearized model. This uncertainty bound is then used in a robust, outer-loop controller to ensure stability of the overall system. Moreover, we show that the tracking error converges to a ball with a radius that can be made arbitrarily small. Furthermore, we verify the effectiveness of our approach via simulations on a 2 degree-of-freedom (DOF) planar manipulator and experimentally on a 6 DOF industrial manipulator.

@article{helwa-ral19,
title = {Provably Robust Learning-Based Approach for High-Accuracy Tracking Control of {L}agrangian Systems},
author = {Mohamed K. Helwa and Adam Heins and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2019},
volume = {4},
number = {2},
pages = {1587--1594},
doi = {10.1109/LRA.2019.2896728},
urlvideo = {https://youtu.be/CBmZ4F79gmI},
abstract = {Lagrangian systems represent a wide range of robotic systems, including manipulators, wheeled and legged robots, and quadrotors. Inverse dynamics control and feed-forward linearization techniques are typically used to convert the complex nonlinear dynamics of Lagrangian systems to a set of decoupled double integrators, and then a standard, outer-loop controller can be used to calculate the commanded acceleration for the linearized system. However, these methods typically depend on having a very accurate system model, which is often not available in practice. While this challenge has been addressed in the literature using different learning approaches, most of these approaches do not provide safety guarantees in terms of stability of the learning-based control system. In this paper, we provide a novel, learning-based control approach based on Gaussian processes (GPs) that ensures both stability of the closed-loop system and high-accuracy tracking. We use GPs to approximate the error between the commanded acceleration and the actual acceleration of the system, and then use the predicted mean and variance of the GP to calculate an upper bound on the uncertainty of the linearized model. This uncertainty bound is then used in a robust, outer-loop controller to ensure stability of the overall system. Moreover, we show that the tracking error converges to a ball with a radius that can be made arbitrarily small. Furthermore, we verify the effectiveness of our approach via simulations on a 2 degree-of-freedom (DOF) planar manipulator and experimentally on a 6 DOF industrial manipulator.}
}

Trajectory generation for multiagent point-to-point transitions via distributed model predictive control
C. E. Luis and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 4, iss. 2, p. 357–382, 2019.

This paper introduces a novel algorithm for multiagent offline trajectory generation based on distributed model predictive control (DMPC). By predicting future states and sharing this information with their neighbours, the agents are able to detect and avoid collisions while moving towards their goals. The proposed algorithm computes transition trajectories for dozens of vehicles in a few seconds. It reduces the computation time by more than 85% compared to previous optimization approaches based on sequential convex programming (SCP), while causing only a small impact on the optimality of the plans. We replaced the previous compatibility constraints in DMPC, which limit the motion of the agents in order to avoid collisions, by relaxing the collision constraints and enforcing them only when required. The approach was validated both through extensive simulations for a wide range of randomly generated transitions and with teams of up to 25 quadrotors flying in confined indoor spaces.

@article{luis-ral19,
title = {Trajectory Generation for Multiagent Point-To-Point Transitions via Distributed Model Predictive Control},
author = {Carlos E. Luis and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2019},
volume = {4},
number = {2},
pages = {357--382},
urlvideo = {https://youtu.be/ZN2e7h-kkpw},
abstract = {This paper introduces a novel algorithm for multiagent offline trajectory generation based on distributed model predictive control (DMPC). By predicting future states and sharing this information with their neighbours, the agents are able to detect and avoid collisions while moving towards their goals. The proposed algorithm computes transition trajectories for dozens of vehicles in a few seconds. It reduces the computation time by more than 85\% compared to previous optimization approaches based on sequential convex programming (SCP), while causing only a small impact on the optimality of the plans. We replaced the previous compatibility constraints in DMPC, which limit the motion of the agents in order to avoid collisions, by relaxing the collision constraints and enforcing them only when required. The approach was validated both through extensive simulations for a wide range of randomly generated transitions and with teams of up to 25 quadrotors flying in confined indoor spaces.}
}

Learn fast, forget slow: safe predictive control for systems with locally linear actuator dynamics performing repetitive tasks
C. D. McKinnon and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 4, iss. 2, p. 2180–2187, 2019.

We present a control method for improved repetitive path following for a ground vehicle that is geared towards long-term operation where the operating conditions can change over time and are initially unknown. We use weighted Bayesian Linear Regression to model the unknown actuator dynamics, and show how this simple model is more accurate in both its estimate of the mean behaviour and model uncertainty than Gaussian Process Regression and generalizes to novel operating conditions with little or no tuning. In addition, it allows us to use fast adaptation and long-term learning in one, unified framework, to adapt quickly to new operating conditions and learn repetitive model errors over time. This comes with the added benefit of lower computational cost, longer look-ahead, and easier optimization when the model is used in a robust, Model Predictive controller (MPC). In order to fully capitalize on the long prediction horizons that are possible with this new approach, we use Tube MPC to reduce predicted uncertainty growth. We demonstrate the effectiveness of our approach in experiment on a 900 kg ground robot showing results over 2.7 km of driving with both physical and artificial changes to the robot’s dynamics. All of our experiments are conducted using a stereo camera for localization.

@article{mckinnon-ral19,
title={Learn Fast, Forget Slow: Safe Predictive Control for Systems with Locally Linear Actuator Dynamics Performing Repetitive Tasks},
author={Christopher D. McKinnon and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2019},
volume = {4},
number = {2},
pages = {2180--2187},
urlvideo={https://youtu.be/fLNMtYabuU4},
abstract = {We present a control method for improved repetitive path following for a ground vehicle that is geared towards long-term operation where the operating conditions can change over time and are initially unknown. We use weighted Bayesian Linear Regression to model the unknown actuator dynamics, and show how this simple model is more accurate in both its estimate of the mean behaviour and model uncertainty than Gaussian Process Regression and generalizes to novel operating conditions with little or no tuning. In addition, it allows us to use fast adaptation and long-term learning in one, unified framework, to adapt quickly to new operating conditions and learn repetitive model errors over time. This comes with the added benefit of lower computational cost, longer look-ahead, and easier optimization when the model is used in a robust, Model Predictive controller (MPC). In order to fully capitalize on the long prediction horizons that are possible with this new approach, we use Tube MPC to reduce predicted uncertainty growth. We demonstrate the effectiveness of our approach in experiment on a 900 kg ground robot showing results over 2.7 km of driving with both physical and artificial changes to the robot's dynamics. All of our experiments are conducted using a stereo camera for localization.}
}

Transfer learning for high-precision trajectory tracking through L1 adaptive feedback and iterative learning
K. Pereida, D. Kooijman, R. R. P. R. Duivenvoorden, and A. P. Schoellig
International Journal of Adaptive Control and Signal Processing, vol. 33, iss. 2, p. 388–409, 2019.

Robust and adaptive control strategies are needed when robots or automated systems are introduced to unknown and dynamic environments where they are required to cope with disturbances, unmodeled dynamics, and parametric uncertainties. In this paper, we demonstrate the capabilities of a combined L1 adaptive control and iterative learning control (ILC) framework to achieve high-precision trajectory tracking in the presence of unknown and changing disturbances. The L1 adaptive controller makes the system behave close to a reference model; however, it does not guarantee that perfect trajectory tracking is achieved, while ILC improves trajectory tracking performance based on previous iterations. The combined framework in this paper uses L1 adaptive control as an underlying controller that achieves a robust and repeatable behavior, while the ILC acts as a high-level adaptation scheme that mainly compensates for systematic tracking errors. We illustrate that this framework enables transfer learning between dynamically different systems, where learned experience of one system can be shown to be beneficial for another different system. Experimental results with two different quadrotors show the superior performance of the combined L1-ILC framework compared with approaches using ILC with an underlying proportional-derivative controller or proportional-integral-derivative controller. Results highlight that our L1-ILC framework can achieve high-precision trajectory tracking when unknown and changing disturbances are present and can achieve transfer of learned experience between dynamically different systems. Moreover, our approach is able to achieve precise trajectory tracking in the first attempt when the initial input is generated based on the reference model of the adaptive controller.

@ARTICLE{pereida-acsp18,
title={Transfer Learning for High-Precision Trajectory Tracking Through {L1} Adaptive Feedback and Iterative Learning},
author={Karime Pereida and Dave Kooijman and Rikky R. P. R. Duivenvoorden and Angela P. Schoellig},
journal={{International Journal of Adaptive Control and Signal Processing}},
year={2019},
volume = {33},
number = {2},
pages = {388--409},
doi={10.1002/acs.2887},
abstract={Robust and adaptive control strategies are needed when robots or automated systems are introduced to unknown and dynamic environments where they are required to cope with disturbances, unmodeled dynamics, and parametric uncertainties. In this paper, we demonstrate the capabilities of a combined L1 adaptive control and iterative learning control (ILC) framework to achieve high-precision trajectory tracking in the presence of unknown and changing disturbances. The L1 adaptive controller makes the system behave close to a reference model; however, it does not guarantee that perfect trajectory tracking is achieved, while ILC improves trajectory tracking performance based on previous iterations. The combined framework in this paper uses L1 adaptive control as an underlying controller that achieves a robust and repeatable behavior, while the ILC acts as a high-level adaptation scheme that mainly compensates for systematic tracking errors. We illustrate that this framework enables transfer learning between dynamically different systems, where learned experience of one system can be shown to be beneficial for another different system. Experimental results with two different quadrotors show the superior performance of the combined L1-ILC framework compared with approaches using ILC with an underlying proportional-derivative controller or proportional-integral-derivative controller. Results highlight that our L1-ILC framework can achieve high-precision trajectory tracking when unknown and changing disturbances are present and can achieve transfer of learned experience between dynamically different systems. Moreover, our approach is able to achieve precise trajectory tracking in the first attempt when the initial input is generated based on the reference model of the adaptive controller.},
}

Active training trajectory generation for inverse dynamics model learning with deep neural networks
S. Zhou and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2019, p. 1784–1790.

Inverse dynamics models have been used in robot control algorithms to realize a desired motion or to enhance a robot’s performance. As robot dynamics and their operating environments become more complex, there is a growing trend of learning uncertain or unknown dynamics from data. While techniques such as deep neural networks (DNNs) have been successfully used to learn inverse dynamics, it is usually implicitly assumed that the learning modules are trained on sufficiently rich datasets. In practical implementations, this assumption typically results in a trial-and-error training process, which can be inefficient or unsafe for robot applications. In this paper, we present an active trajectory generation framework that allows us to systematically design informative trajectories for training DNN inverse dynamics modules. In particular, we introduce an episode-based algorithm that integrates a spline trajectory optimization approach with DNN active learning for efficient data collection. We consider different DNN uncertainty estimation techniques and active learning heuristics in our work and illustrate the proposed active training trajectory generation approach in simulation. We show that the proposed active training trajectory generation outperforms ad hoc, intuitive training approaches.

@INPROCEEDINGS{zhou-cdc19,
author = {Siqi Zhou and Angela P. Schoellig},
title = {Active Training Trajectory Generation for Inverse Dynamics Model Learning with Deep Neural Networks},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2019},
pages = {1784--1790},
abstract = {Inverse dynamics models have been used in robot control algorithms to realize a desired motion or to enhance a robot’s performance. As robot dynamics and their operating environments become more complex, there is a growing trend of learning uncertain or unknown dynamics from data. While techniques such as deep neural networks (DNNs) have been successfully used to learn inverse dynamics, it is usually implicitly assumed that the learning modules are trained on sufficiently rich datasets. In practical implementations, this assumption typically results in a trial-and-error training process, which can be inefficient or unsafe for robot applications. In this paper, we present an active trajectory generation framework that allows us to systematically design informative trajectories for training DNN inverse dynamics modules. In particular, we introduce an episode-based algorithm that integrates a spline trajectory optimization approach with DNN active learning for efficient data collection. We consider different DNN uncertainty estimation techniques and active learning heuristics in our work and illustrate the proposed active training trajectory generation approach in simulation. We show that the proposed active training trajectory generation outperforms ad hoc, intuitive training approaches.},
}

Trajectory tracking for quadrotors with attitude control on $\mathcal{S}^2\times \mathcal{S}^1$
D. Kooijman, A. P. Schoellig, and D. J. Antunes
in Proc. of the European Control Conference (ECC), 2019, p. 4002–4009.

The control of a quadrotor is typically split into two subsequent problems: finding desired accelerations to control its position, and controlling its attitude and the total thrust to track these accelerations and to track a yaw angle reference. While the thrust vector, generating accelerations, and the angle of rotation about the thrust vector, determining the yaw angle, can be controlled independently, most attitude control strategies in the literature, relying on representations in terms of quaternions, rotation matrices or Euler angles, result in an unnecessary coupling between the control of the thrust vector and of the angle about this vector. This leads, for instance, to undesired position tracking errors due to yaw tracking errors. In this paper we propose to tackle the attitude control problem using an attitude representation in the Cartesian product of the 2-sphere and the 1-sphere, denoted by $\mathcal{S}^2\times \mathcal{S}^1$. We propose a non-linear tracking control law on $\mathcal{S}^2\times \mathcal{S}^1$ that decouples the control of the thrust vector and of the angle of rotation about the thrust vector, and guarantees almost global asymptotic stability. Simulation results highlight the advantages of the proposed approach over previous approaches.

@INPROCEEDINGS{kooijman-ecc19,
author = {Dave Kooijman and Angela P. Schoellig and Duarte J. Antunes},
title = {Trajectory Tracking for Quadrotors with Attitude Control on {$\mathcal{S}^2\times \mathcal{S}^1$}},
booktitle = {{Proc. of the European Control Conference (ECC)}},
year = {2019},
pages = {4002--4009},
abstract = {The control of a quadrotor is typically split into two subsequent problems: finding desired accelerations to control its position, and controlling its attitude and the total thrust to track these accelerations and to track a yaw angle reference. While the thrust vector, generating accelerations, and the angle of rotation about the thrust vector, determining the yaw angle, can be controlled independently, most attitude control strategies in the literature, relying on representations in terms of quaternions, rotation matrices or Euler angles, result in an unnecessary coupling between the control of the thrust vector and of the angle about this vector. This leads, for instance, to undesired position tracking errors due to yaw tracking errors. In this paper we propose to tackle the attitude control problem using an attitude representation in the Cartesian product of the 2-sphere and the 1-sphere, denoted by $\mathcal{S}^2\times \mathcal{S}^1$. We propose a non-linear tracking control law on $\mathcal{S}^2\times \mathcal{S}^1$ that decouples the control of the thrust vector and of the angle of rotation about the thrust vector, and guarantees almost global asymptotic stability. Simulation results highlight the advantages of the proposed approach over previous approaches.},
}

aUToTrack: a lightweight object detection and tracking system for the SAE AutoDrive challenge
K. Burnett, S. Samavi, S. Waslander, T. D. Barfoot, and A. P. Schoellig
in Proc. of the Conference on Computer and Robot Vision (CRV), 2019, p. 209–216. Best poster presentation award.

The University of Toronto is one of eight teams competing in the SAE AutoDrive Challenge – a competition to develop a self-driving car by 2020. After placing first at the Year 1 challenge [1], we are headed to MCity in June 2019 for the second challenge. There, we will interact with pedestrians, cyclists, and cars. For safe operation, it is critical to have an accurate estimate of the position of all objects surrounding the vehicle. The contributions of this work are twofold: First, we present a new object detection and tracking dataset (UofTPed50), which uses GPS to ground truth the position and velocity of a pedestrian. To our knowledge, a dataset of this type for pedestrians has not been shown in the literature before. Second, we present a lightweight object detection and tracking system (aUToTrack) that uses vision, LIDAR, and GPS/IMU positioning to achieve state-of-the-art performance on the KITTI Object Tracking benchmark. We show that aUToTrack accurately estimates the position and velocity of pedestrians, in real-time, using CPUs only. aUToTrack has been tested in closed-loop experiments on a real self-driving car (seen in Figure 1), and we demonstrate its performance on our dataset.

@INPROCEEDINGS{burnett-crv19,
author = {Keenan Burnett and Sepehr Samavi and Steven Waslander and Timothy D. Barfoot and Angela P. Schoellig},
title = {{aUToTrack:} A lightweight object detection and tracking system for the {SAE} {AutoDrive} Challenge},
booktitle = {{Proc. of the Conference on Computer and Robot Vision (CRV)}},
year = {2019},
pages = {209--216},
note = {Best poster presentation award},
urlvideo = {https://youtu.be/FLCgcgzNo80},
abstract = {The University of Toronto is one of eight teams competing in the SAE AutoDrive Challenge – a competition to develop a self-driving car by 2020. After placing first at the Year 1 challenge [1], we are headed to MCity in June 2019 for the second challenge. There, we will interact with pedestrians, cyclists, and cars. For safe operation, it is critical to have an accurate estimate of the position of all objects surrounding the vehicle. The contributions of this work are twofold: First, we present a new object detection and tracking dataset (UofTPed50), which uses GPS to ground truth the position and velocity of a pedestrian. To our knowledge, a dataset of this type for pedestrians has not been shown in the literature before. Second, we present a lightweight object detection and tracking system (aUToTrack) that uses vision, LIDAR, and GPS/IMU positioning to achieve state-of-the-art performance on the KITTI Object Tracking benchmark. We show that aUToTrack accurately estimates the position and velocity of pedestrians, in real-time, using CPUs only. aUToTrack has been tested in closed-loop experiments on a real self-driving car (seen in Figure 1), and we demonstrate its performance on our dataset.},
}

Learning probabilistic models for safe predictive control in unknown environments
C. D. McKinnon and A. P. Schoellig
in Proc. of the European Control Conference (ECC), 2019, p. 2472–2479.

Researchers rely increasingly on tools from machine learning to improve the performance of control algorithms on real world tasks and enable robots to operate for long periods of time without intervention. Many of these algorithms require a model for the dynamics of the robot. In particular, researchers designing methods for safe learning control often rely on an upper bound on model error to make guarantees about the worst-case closed-loop performance of their algorithm. There are different options for how to learn such a model of the robot dynamics. We study probabilistic models for use in the context of stochastic model predictive control. Two popular choices for learning the robot dynamics are Gaussian Process (GP) regression and various forms of local linear regression. In this paper, we present a study comparing GPs with a particular form of local linear regression for learning robot dynamics with the aim of guaranteeing safety when a robot operates in novel conditions. We show results based on experimental data from a 900 kg ground robot using vision for localisation.

@INPROCEEDINGS{mckinnon-ecc19,
author = {Christopher D. McKinnon and Angela P. Schoellig},
title = {Learning Probabilistic Models for Safe Predictive Control in Unknown Environments},
booktitle = {{Proc. of the European Control Conference (ECC)}},
year = {2019},
pages = {2472--2479},
abstract = {Researchers rely increasingly on tools from machine learning to improve the performance of control algorithms on real world tasks and enable robots to operate for long periods of time without intervention. Many of these algorithms require a model for the dynamics of the robot. In particular, researchers designing methods for safe learning control often rely on an upper bound on model error to make guarantees about the worst-case closed-loop performance of their algorithm. There are different options for how to learn such a model of the robot dynamics. We study probabilistic models for use in the context of stochastic model predictive control. Two popular choices for learning the robot dynamics are Gaussian Process (GP) regression and various forms of local linear regression. In this paper, we present a study comparing GPs with a particular form of local linear regression for learning robot dynamics with the aim of guaranteeing safety when a robot operates in novel conditions. We show results based on experimental data from a 900 kg ground robot using vision for localisation.},
}

Improved tag-based indoor localization of UAVs using extended Kalman filter
N. Kayhani, A. Heins, W. Zhao, M. Nahangi, B. McCabe, and A. P. Schoellig
in Proc. of the International Symposium on Automation and Robotics in Construction (ISARC), 2019, p. 624–631.

Indoor localization and navigation of unmanned aerial vehicles (UAVs) is a critical function for autonomous flight and automated visual inspection of construction elements in continuously changing construction environments. The key challenge for indoor localization and navigation is that the global positioning system (GPS) signal is not sufficiently reliable for state estimation. Having used the AprilTag markers for indoor localization, we showed a proof-of-concept that a camera-equipped UAV can be localized in a GPS-denied environment; however, the accuracy of the localization was inadequate in some situations. This study presents the implementation and performance assessment of an Extended Kalman Filter (EKF) for improving the estimation process of a previously developed indoor localization framework using AprilTag markers. An experimental set up is used to assess the performance of the updated estimation process in comparison to the previous state estimation method and the ground truth data. Results show that the state estimation and indoor localization are improved substantially using the EKF. To have a more robust estimation, we extract and fuse data from multiple tags. The framework can now be tested in real-world environments given that our continuous localization is sufficiently robust and reliable.

@INPROCEEDINGS{kayhani-isarc19,
author = {Navid Kayhani and Adam Heins and Wenda Zhao and Mohammad Nahangi and Brenda McCabe and Angela P. Schoellig},
title = {Improved Tag-based Indoor Localization of {UAV}s Using Extended {Kalman} Filter},
booktitle = {{Proc. of the International Symposium on Automation and Robotics in Construction (ISARC)}},
year = {2019},
pages = {624--631},
doi = {10.22260/ISARC2019/0083},
abstract = {Indoor localization and navigation of unmanned aerial vehicles (UAVs) is a critical function for autonomous flight and automated visual inspection of construction elements in continuously changing construction environments. The key challenge for indoor localization and navigation is that the global positioning system (GPS) signal is not sufficiently reliable for state estimation. Having used the AprilTag markers for indoor localization, we showed a proof-of-concept that a camera-equipped UAV can be localized in a GPS-denied environment; however, the accuracy of the localization was inadequate in some situations. This study presents the implementation and performance assessment of an Extended Kalman Filter (EKF) for improving the estimation process of a previously developed indoor localization framework using AprilTag markers. An experimental set up is used to assess the performance of the updated estimation process in comparison to the previous state estimation method and the ground truth data. Results show that the state estimation and indoor localization are improved substantially using the EKF. To have a more robust estimation, we extract and fuse data from multiple tags. The framework can now be tested in real-world environments given that our continuous localization is sufficiently robust and reliable.},
}

Knowledge transfer between robots with similar dynamics for high-accuracy impromptu trajectory tracking
S. Zhou, A. Sarabakha, E. Kayacan, M. K. Helwa, and A. P. Schoellig
in Proc. of the European Control Conference (ECC), 2019, p. 1–8.

In this paper, we propose an online learning approach that enables the inverse dynamics model learned for a source robot to be transferred to a target robot (e.g., from one quadrotor to another quadrotor with different mass or aerodynamic properties). The goal is to leverage knowledge from the source robot such that the target robot achieves high-accuracy trajectory tracking on arbitrary trajectories from the first attempt with minimal data recollection and training. Most existing approaches for multi-robot knowledge transfer are based on post-analysis of datasets collected from both robots. In this work, we study the feasibility of impromptu transfer of models across robots by learning an error prediction module online. In particular, we analytically derive the form of the mapping to be learned by the online module for exact tracking, propose an approach for characterizing similarity between robots, and use these results to analyze the stability of the overall system. The proposed approach is illustrated in simulation and verified experimentally on two different quadrotors performing impromptu trajectory tracking tasks, where the quadrotors are required to accurately track arbitrary hand-drawn trajectories from the first attempt.

@INPROCEEDINGS{zhou-ecc19,
author = {Siqi Zhou and Andriy Sarabakha and Erdal Kayacan and Mohamed K. Helwa and Angela P. Schoellig},
title = {Knowledge Transfer Between Robots with Similar Dynamics for High-Accuracy Impromptu Trajectory Tracking},
booktitle = {{Proc. of the European Control Conference (ECC)}},
year = {2019},
pages = {1--8},
urlvideo = {https://youtu.be/Pj_irRLHsD8},
abstract = {In this paper, we propose an online learning approach that enables the inverse dynamics model learned for a source robot to be transferred to a target robot (e.g., from one quadrotor to another quadrotor with different mass or aerodynamic properties). The goal is to leverage knowledge from the source robot such that the target robot achieves high-accuracy trajectory tracking on arbitrary trajectories from the first attempt with minimal data recollection and training. Most existing approaches for multi-robot knowledge transfer are based on post-analysis of datasets collected from both robots. In this work, we study the feasibility of impromptu transfer of models across robots by learning an error prediction module online. In particular, we analytically derive the form of the mapping to be learned by the online module for exact tracking, propose an approach for characterizing similarity between robots, and use these results to analyze the stability of the overall system. The proposed approach is illustrated in simulation and verified experimentally on two different quadrotors performing impromptu trajectory tracking tasks, where the quadrotors are required to accurately track arbitrary hand-drawn trajectories from the first attempt.},
}

Building a winning self-driving car in six months
K. Burnett, A. Schimpe, S. Samavi, M. Gridseth, C. W. Liu, Q. Li, Z. Kroeze, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2019, p. 9583–9589.

The SAE AutoDrive Challenge is a three-year competition to develop a Level 4 autonomous vehicle by 2020. The first set of challenges were held in April of 2018 in Yuma, Arizona. Our team (aUToronto/Zeus) placed first. In this paper, we describe our complete system architecture and specialized algorithms that enabled us to win. We show that it is possible to develop a vehicle with basic autonomy features in just six months relying on simple, robust algorithms. We do not make use of a prior map. Instead, we have developed a multi-sensor visual localization solution. All of our algorithms run in real-time using CPUs only. We also highlight the closed-loop performance of our system in detail in several experiments.

@INPROCEEDINGS{burnett-icra19,
author = {Keenan Burnett and Andreas Schimpe and Sepehr Samavi and Mona Gridseth and Chengzhi Winston Liu and Qiyang Li and Zachary Kroeze and Angela P. Schoellig},
title = {Building a Winning Self-Driving Car in Six Months},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2019},
pages = {9583--9589},
urlvideo = {http://tiny.cc/zeus-y1},
abstract = {The SAE AutoDrive Challenge is a three-year competition to develop a Level 4 autonomous vehicle by 2020. The first set of challenges were held in April of 2018 in Yuma, Arizona. Our team (aUToronto/Zeus) placed first. In this paper, we describe our complete system architecture and specialized algorithms that enabled us to win. We show that it is possible to develop a vehicle with basic autonomy features in just six months relying on simple, robust algorithms. We do not make use of a prior map. Instead, we have developed a multi-sensor visual localization solution. All of our algorithms run in real-time using CPUs only. We also highlight the closed-loop performance of our system in detail in several experiments.},
}

Point me in the right direction: improving visual localization on UAVs with active gimballed camera pointing
B. Patel, M. Warren, and A. P. Schoellig
in Proc. of the Conference on Computer and Robot Vision (CRV), 2019, p. 105–112. Best paper award, robot vision.

Robust autonomous navigation of multirotor UAVs in GPS-denied environments is critical to enable their safe operation in many applications such as surveillance and reconnaissance, inspection, and delivery services. In this paper, we use a gimballed stereo camera for localization and demonstrate how the localization performance and robustness can be improved by actively controlling the camera’s viewpoint. For an autonomous route-following task based on a recorded map, multiple gimbal pointing strategies are compared: off-the-shelf passive stabilization, active stabilization, minimization of viewpoint orientation error, and pointing the camera optical axis at the centroid of previously observed landmarks. We demonstrate improved localization performance using an active gimbal-stabilized camera in multiple outdoor flight experiments on routes up to 315 m, and with 6-25 m altitude variations. Scenarios are shown where a static camera frequently fails to localize while a gimballed camera attenuates perspective errors to retain localization. We demonstrate that our orientation matching and centroid pointing strategies provide the best performance; enabling localization despite increasing velocity discrepancies between the map-generation flight and the live flight from 3-9 m/s, and 8 m path offsets.

@INPROCEEDINGS{patel-crv19,
author = {Bhavit Patel and Michael Warren and Angela P. Schoellig},
title = {Point Me In The Right Direction: Improving Visual Localization on {UAV}s with Active Gimballed Camera Pointing},
booktitle = {{Proc. of the Conference on Computer and Robot Vision (CRV)}},
year = {2019},
pages = {105--112},
note = {Best paper award, robot vision},
abstract = {Robust autonomous navigation of multirotor UAVs in GPS-denied environments is critical to enable their safe operation in many applications such as surveillance and reconnaissance, inspection, and delivery services. In this paper, we use a gimballed stereo camera for localization and demonstrate how the localization performance and robustness can be improved by actively controlling the camera’s viewpoint. For an autonomous route-following task based on a recorded map, multiple gimbal pointing strategies are compared: off-the-shelf passive stabilization, active stabilization, minimization of viewpoint orientation error, and pointing the camera optical axis at the centroid of previously observed landmarks. We demonstrate improved localization performance using an active gimbal-stabilized camera in multiple outdoor flight experiments on routes up to 315 m, and with 6-25 m altitude variations. Scenarios are shown where a static camera frequently fails to localize while a gimballed camera attenuates perspective errors to retain localization. We demonstrate that our orientation matching and centroid pointing strategies provide the best performance; enabling localization despite increasing velocity discrepancies between the map-generation flight and the live flight from 3-9 m/s, and 8 m path offsets.},
}

Fast and in sync: periodic swarm patterns for quadrotors
X. Du, C. E. Luis, M. Vukosavljev, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2019, p. 9143–9149.

This paper aims to design quadrotor swarm performances, where the swarm acts as an integrated, coordinated unit embodying moving and deforming objects. We divide the task of creating a choreography into three basic steps: designing swarm motion primitives, transitioning between those movements, and synchronizing the motion of the drones. The result is a flexible framework for designing choreographies comprised of a wide variety of motions. The motion primitives can be intuitively designed using few parameters, providing a rich library for choreography design. Moreover, we combine and adapt existing goal assignment and trajectory generation algorithms to maximize the smoothness of the transitions between motion primitives. Finally, we propose a correction algorithm to compensate for motion delays and synchronize the motion of the drones to a desired periodic motion pattern. The proposed methodology was validated experimentally by generating and executing choreographies on a swarm of 25 quadrotors.

@INPROCEEDINGS{du-icra19,
author = {Xintong Du and Carlos E. Luis and Marijan Vukosavljev and Angela P. Schoellig},
title = {Fast and in sync: periodic swarm patterns for quadrotors},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2019},
pages = {9143--9149},
abstract = {This paper aims to design quadrotor swarm performances, where the swarm acts as an integrated, coordinated unit embodying moving and deforming objects. We divide the task of creating a choreography into three basic steps: designing swarm motion primitives, transitioning between those movements, and synchronizing the motion of the drones. The result is a flexible framework for designing choreographies comprised of a wide variety of motions. The motion primitives can be intuitively designed using few parameters, providing a rich library for choreography design. Moreover, we combine and adapt existing goal assignment and trajectory generation algorithms to maximize the smoothness of the transitions between motion primitives. Finally, we propose a correction algorithm to compensate for motion delays and synchronize the motion of the drones to a desired periodic motion pattern. The proposed methodology was validated experimentally by generating and executing choreographies on a swarm of 25 quadrotors.},
}

Hierarchically consistent motion primitives for quadrotor coordination
M. Vukosavljev, A. P. Schoellig, and M. E. Broucke
Technical Report, arXiv, 2019.

We present a hierarchical framework for motion planning of a large collection of agents. The proposed framework starts from low level motion primitives over a gridded workspace and provides a set of rules for constructing higher level motion primitives. Our hierarchical approach is highly scalable and robust making it an ideal tool for planning for multi-agent systems. Results are demonstrated experimentally on a collection of quadrotors that must navigate a cluttered environment while maintaining a formation.

@TECHREPORT{vukosavljev-report19,
author = {Marijan Vukosavljev and Angela P. Schoellig and Mireille E. Broucke},
title = {Hierarchically consistent motion primitives for quadrotor coordination},
year = {2019},
institution = {arXiv},
urlvideo = {http://tiny.cc/hier-moprim},
abstract = {We present a hierarchical framework for motion planning of a large collection of agents. The proposed framework starts from low level motion primitives over a gridded workspace and provides a set of rules for constructing higher level motion primitives. Our hierarchical approach is highly scalable and robust making it an ideal tool for planning for multi-agent systems. Results are demonstrated experimentally on a collection of quadrotors that must navigate a cluttered environment while maintaining a formation.},
}

Robust adaptive model predictive control for high-accuracy trajectory tracking in changing conditions
K. Pereida and A. P. Schoellig
Short Paper and Presentation, in Proc. of the Algorithms and Architectures for Learning in-the-Loop Systems in Autonomous Flight Workshop at IEEE International Conference on Robotics and Automation (ICRA), 2019.

Robots are being deployed in unknown and dynamic environments where they are required to handle disturbances, unmodeled dynamics, and parametric uncertainties. Sophisticated control strategies can guarantee high performance in these changing environments. In this work, we propose a novel robust adaptive model predictive controller that combines robust model predictive control (MPC) with an underlying $\mathcal{L}_1$ adaptive controller to improve trajectory tracking of a system subject to unknown and changing disturbances. The $\mathcal{L}_1$ adaptive controller forces the system to behave close to a specified linear reference model. The controlled system may still deviate from the reference model, but this deviation is shown to be upper bounded. An outer-loop robust MPC uses this upper bound, the linear reference model and system constraints to calculate the optimal reference input that minimizes the given cost function. The proposed robust adaptive MPC is able to achieve high-accuracy trajectory tracking even in the presence of unknown disturbances. We show preliminary experimental results of an adaptive MPC on a quadrotor. The adaptive MPC has a lower trajectory tracking error compared to a predictive, non-adaptive approach, even when wind disturbances are applied.

@MISC{pereida-icra19a,
author = {Karime Pereida and Angela P. Schoellig},
title = {Robust Adaptive Model Predictive Control for High-Accuracy Trajectory Tracking in Changing Conditions},
year = {2019},
howpublished = {Short Paper and Presentation, in Proc. of the Algorithms and Architectures for Learning in-the-Loop Systems in Autonomous Flight Workshop at IEEE International Conference on Robotics and Automation (ICRA)},
urlvideo = {https://youtu.be/xuyLst5mkEE},
abstract = {Robots are being deployed in unknown and dynamic environments where they are required to handle disturbances, unmodeled dynamics, and parametric uncertainties. Sophisticated control strategies can guarantee high performance in these changing environments. In this work, we propose a novel robust adaptive model predictive controller that combines robust model predictive control (MPC) with an underlying $\mathcal{L}_1$ adaptive controller to improve trajectory tracking of a system subject to unknown and changing disturbances. The $\mathcal{L}_1$ adaptive controller forces the system to behave close to a specified linear reference model. The controlled system may still deviate from the reference model, but this deviation is shown to be upper bounded. An outer-loop robust MPC uses this upper bound, the linear reference model and system constraints to calculate the optimal reference input that minimizes the given cost function. The proposed robust adaptive MPC is able to achieve high-accuracy trajectory tracking even in the presence of unknown disturbances. We show preliminary experimental results of an adaptive MPC on a quadrotor. The adaptive MPC has a lower trajectory tracking error compared to a predictive, non-adaptive approach, even when wind disturbances are applied.},
}

Diversity in robotics: from diverse teams to diverse impact
K. Pereida and M. Greeff
Short Paper and Presentation, in Proc. of the Debates on the Future of Robotics Research Workshop at the IEEE International Conference on Robotics and Automation (ICRA), 2019.

Roboticists develop technologies that are used by people worldwide, consequently impacting many aspects of human life – from healthcare and law enforcement to autonomous transportation. The development of these technologies involves design and innovation – both of which rely on personal choice and experience. Hence, personal biases, whether intentionally or unintentionally, tend to be embedded in the final product designs. Homogeneous teams of designers and engineers are more likely to develop products that overlook the needs of a given part of the population – even missing gaps for potential technological innovation. In this talk we emphasize some of the negative impacts a lack of diversity has on robotic innovation by highlighting examples of embedded biases within certain technologies and providing some evidence that this is linked to a lack of diverse teams. If our aim as a community is to increase research capacity, creativity, and broaden the impact of robotics, making it a more diverse field must be a goal.

@MISC{pereida-icra19b,
author = {Karime Pereida and Melissa Greeff},
title = {Diversity in Robotics: From Diverse Teams to Diverse Impact},
year = {2019},
howpublished = {Short Paper and Presentation, in Proc. of the Debates on the Future of Robotics Research Workshop at the IEEE International Conference on Robotics and Automation (ICRA)},
abstract = {Roboticists develop technologies that are used by people worldwide, consequently impacting many aspects of human life -- from healthcare and law enforcement to autonomous transportation. The development of these technologies involves design and innovation -- both of which rely on personal choice and experience. Hence, personal biases, whether intentionally or unintentionally, tend to be embedded in the final product designs. Homogeneous teams of designers and engineers are more likely to develop products that overlook the needs of a given part of the population -- even missing gaps for potential technological innovation. In this talk we emphasize some of the negative impacts a lack of diversity has on robotic innovation by highlighting examples of embedded biases within certain technologies and providing some evidence that this is linked to a lack of diverse teams. If our aim as a community is to increase research capacity, creativity, and broaden the impact of robotics, making it a more diverse field must be a goal.},
}

Data-efficient multi-robot, multi-task transfer learning for trajectory tracking
K. Pereida, M. K. Helwa, and A. P. Schoellig
Abstract and Poster, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA), 2019.

Learning can significantly improve the performance of robots in uncertain and changing environments; however, typical learning approaches need to start a new learning process for each new task or robot as transferring knowledge is cumbersome or not possible. In this work, we introduce a multi-robot, multi-task transfer learning framework that allows a system to complete a task by learning from a few demonstrations of another task executed on a different system. We focus on the trajectory tracking problem where each trajectory represents a different task. The proposed learning control architecture has two stages: (i) a multi-robot transfer learning framework that combines $\mathcal{L}_1$ adaptive control and iterative learning control, where the key idea is that the adaptive controller forces dynamically different systems to behave as a specified reference model; and (ii) a multi-task transfer learning framework that uses theoretical control results (e.g., the concept of vector relative degree) to learn a map from desired trajectories to the inputs that make the system track these trajectories with high accuracy. This map is used to calculate the inputs for a new, unseen trajectory. We conduct experiments on two different quadrotor platforms and six different trajectories where we show that using information from tracking a single trajectory learned by one quadrotor reduces, on average, the first-iteration tracking error on another quadrotor by 74%.

@MISC{pereida-icra19c,
author = {Karime Pereida and Mohamed K. Helwa and Angela P. Schoellig},
title = {Data-Efficient Multi-Robot, Multi-Task Transfer Learning for Trajectory Tracking},
year = {2019},
howpublished = {Abstract and Poster, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA)},
abstract = {Learning can significantly improve the performance of robots in uncertain and changing environments; however, typical learning approaches need to start a new learning process for each new task or robot as transferring knowledge is cumbersome or not possible. In this work, we introduce a multi-robot, multi-task transfer learning framework that allows a system to complete a task by learning from a few demonstrations of another task executed on a different system. We focus on the trajectory tracking problem where each trajectory represents a different task. The proposed learning control architecture has two stages: (i) a \emph{multi-robot} transfer learning framework that combines $\mathcal{L}_1$ adaptive control and iterative learning control, where the key idea is that the adaptive controller forces dynamically different systems to behave as a specified reference model; and (ii) a \emph{multi-task} transfer learning framework that uses theoretical control results (e.g., the concept of vector relative degree) to learn a map from desired trajectories to the inputs that make the system track these trajectories with high accuracy. This map is used to calculate the inputs for a new, unseen trajectory. We conduct experiments on two different quadrotor platforms and six different trajectories where we show that using information from tracking a single trajectory learned by one quadrotor reduces, on average, the first-iteration tracking error on another quadrotor by 74\%.},
}

Towards scalable online trajectory generation for multi-robot systems
C. E. Luis, M. Vukosavljev, and A. P. Schoellig
Abstract and Poster, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA), 2019.

We present a distributed model predictive control (DMPC) algorithm to generate trajectories in real-time for multiple robots, taking into account their trajectory tracking dynamics and actuation limits. An event-triggered replanning strategy is proposed to account for disturbances in the system. We adopted the on-demand collision avoidance method presented in previous work to efficiently compute non-colliding trajectories in transition tasks. Preliminary results in simulation show a higher success rate than previous online methods based on Buffered Voronoi Cells (BVC), while maintaining computational tractability for real-time operation.

@MISC{luis-icra19,
author = {Carlos E. Luis and Marijan Vukosavljev and Angela P. Schoellig},
title = {Towards Scalable Online Trajectory Generation for Multi-robot Systems},
year = {2019},
howpublished = {Abstract and Poster, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA)},
abstract = {We present a distributed model predictive control (DMPC) algorithm to generate trajectories in real-time for multiple robots, taking into account their trajectory tracking dynamics and actuation limits. An event-triggered replanning strategy is proposed to account for disturbances in the system. We adopted the on-demand collision avoidance method presented in previous work to efficiently compute non-colliding trajectories in transition tasks. Preliminary results in simulation show a higher success rate than previous online methods based on Buffered Voronoi Cells (BVC), while maintaining computational tractability for real-time operation.},
}

Knowledge transfer between robots with online learning for enhancing robot performance in impromptu trajectory tracking
S. Zhou, A. Sarabakha, E. Kayacan, M. K. Helwa, and A. P. Schoellig
Abstract and Presentation, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA), 2019.

As robot dynamics become more complex, learning from data is emerging as an alternative for obtaining accurate dynamic models to assist control system designs or to enhance robot performance. Though effective, common model learning techniques rely on rich datasets collected from the robots, and the learned experience is often platform-specific. In this work, we propose an online learning approach for transferring deep neural network (DNN) inverse dynamics models across two robots and analyze the role of dynamic similarity in the transfer problem. We demonstrate our proposed knowledge transfer approach with two different quadrotors on impromptu trajectory tracking tasks, in which the quadrotors are required to track arbitrary hand-drawn trajectories accurately from the first attempt. With this work, we illustrate that (i) we can relate the transferability of DNN inverse models to the robot dynamic properties, and (ii) when the transfer is feasible, we can significantly reduce data recollections that would be otherwise costly or risky for robot applications. Given a heterogeneous robot team, we envision having to train only one of the agents to allow the whole team to achieve higher performance.

@MISC{zhou-icra19,
author = {Siqi Zhou and Andriy Sarabakha and Erdal Kayacan and Mohamed K. Helwa and Angela P. Schoellig},
title = {Knowledge Transfer Between Robots with Online Learning for Enhancing Robot Performance in Impromptu Trajectory Tracking},
year = {2019},
howpublished = {Abstract and Presentation, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA)},
abstract = {As robot dynamics become more complex, learning from data is emerging as an alternative for obtaining accurate dynamic models to assist control system designs or to enhance robot performance. Though effective, common model learning techniques rely on rich datasets collected from the robots, and the learned experience is often platform-specific. In this work, we propose an online learning approach for transferring deep neural network (DNN) inverse dynamics models across two robots and analyze the role of dynamic similarity in the transfer problem. We demonstrate our proposed knowledge transfer approach with two different quadrotors on impromptu trajectory tracking tasks, in which the quadrotors are required to track arbitrary hand-drawn trajectories accurately from the first attempt. With this work, we illustrate that (i) we can relate the transferability of DNN inverse models to the robot dynamic properties, and (ii) when the transfer is feasible, we can significantly reduce data recollections that would be otherwise costly or risky for robot applications. Given a heterogeneous robot team, we envision having to train only one of the agents to allow the whole team to achieve higher performance.}
}

## 2018

Data-efficient multi-robot, multi-task transfer learning for trajectory tracking
K. Pereida, M. K. Helwa, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 3, iss. 2, p. 1260–1267, 2018.

Transfer learning has the potential to reduce the burden of data collection and to decrease the unavoidable risks of the training phase. In this paper, we introduce a multi-robot, multi-task transfer learning framework that allows a system to complete a task by learning from a few demonstrations of another task executed on another system. We focus on the trajectory tracking problem where each trajectory represents a different task, since many robotic tasks can be described as a trajectory tracking problem. The proposed, multi-robot transfer learning framework is based on a combined L1 adaptive control and iterative learning control approach. The key idea is that the adaptive controller forces dynamically different systems to behave as a specified reference model. The proposed multi-task transfer learning framework uses theoretical control results (e.g., the concept of vector relative degree) to learn a map from desired trajectories to the inputs that make the system track these trajectories with high accuracy. This map is used to calculate the inputs for a new, unseen trajectory. Experimental results using two different quadrotor platforms and six different trajectories show that, on average, the proposed framework reduces the first-iteration tracking error by 74% when information from tracking a different, single trajectory on a different quadrotor is utilized.

@article{pereida-ral18,
title = {Data-Efficient Multi-Robot, Multi-Task Transfer Learning for Trajectory Tracking},
author = {Karime Pereida and Mohamed K. Helwa and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2018},
volume = {3},
number = {2},
doi = {10.1109/LRA.2018.2795653},
pages = {1260--1267},
abstract = {Transfer learning has the potential to reduce the burden of data collection and to decrease the unavoidable risks of the training phase. In this paper, we introduce a multi-robot, multi-task transfer learning framework that allows a system to complete a task by learning from a few demonstrations of another task executed on another system. We focus on the trajectory tracking problem where each trajectory represents a different task, since many robotic tasks can be described as a trajectory tracking problem. The proposed, multi-robot transfer learning framework is based on a combined L1 adaptive control and iterative learning control approach. The key idea is that the adaptive controller forces dynamically different systems to behave as a specified reference model. The proposed multi-task transfer learning framework uses theoretical control results (e.g., the concept of vector relative degree) to learn a map from desired trajectories to the inputs that make the system track these trajectories with high accuracy. This map is used to calculate the inputs for a new, unseen trajectory. Experimental results using two different quadrotor platforms and six different trajectories show that, on average, the proposed framework reduces the first-iteration tracking error by 74\% when information from tracking a different, single trajectory on a different quadrotor is utilized.},
}

The regular indefinite linear quadratic optimal control problem: stabilizable case
M. Vukosavljev, A. P. Schoellig, and M. E. Broucke
SIAM Journal on Control and Optimization, vol. 56, iss. 1, p. 496–516, 2018.

This paper addresses an open problem in the area of linear quadratic optimal control. We consider the regular, infinite-horizon, stability-modulo-a-subspace, indefinite linear quadratic problem under the assumption that the dynamics are stabilizable. Our result generalizes previous works dealing with the same problem in the case of controllable dynamics. We explicitly characterize the unique solution of the algebraic Riccati equation that gives the optimal cost and optimal feedback control, as well as necessary and sufficient conditions for the existence of optimal controls.

@article{vukosavljev-sicon18,
title = {The regular indefinite linear quadratic optimal control problem: stabilizable case},
author = {Marijan Vukosavljev and Angela P. Schoellig and Mireille E. Broucke},
journal = {{SIAM Journal on Control and Optimization}},
year = {2018},
volume = {56},
number = {1},
pages = {496--516},
doi = {10.1137/17M1143137},
abstract = {This paper addresses an open problem in the area of linear quadratic optimal control. We consider the regular, infinite-horizon, stability-modulo-a-subspace, indefinite linear quadratic problem under the assumption that the dynamics are stabilizable. Our result generalizes previous works dealing with the same problem in the case of controllable dynamics. We explicitly characterize the unique solution of the algebraic Riccati equation that gives the optimal cost and optimal feedback control, as well as necessary and sufficient conditions for the existence of optimal controls.},
}

On the construction of safe controllable regions for affine systems with applications to robotics
M. K. Helwa and A. P. Schoellig
Automatica, vol. 98, p. 323–330, 2018.

This paper studies the problem of constructing in-block controllable (IBC) regions for affine systems. That is, we are concerned with constructing regions in the state space of affine systems such that all the states in the interior of the region are mutually accessible within the region’s interior by applying uniformly bounded inputs. We first show that existing results for checking in-block controllability on given polytopic regions cannot be easily extended to address the question of constructing IBC regions. We then explore the geometry of the problem to provide a computationally efficient algorithm for constructing IBC regions. We also prove the soundness of the algorithm. We then use the proposed algorithm to construct safe speed profiles for robotic systems. As a case study, we present several experimental results on unmanned aerial vehicles (UAVs) to verify the effectiveness of the proposed algorithm; these results include using the proposed algorithm for real-time collision avoidance for UAVs.

@article{helwa-auto18,
title = {On the construction of safe controllable regions for affine systems with applications to robotics},
author = {Mohamed K. Helwa and Angela P. Schoellig},
journal = {{Automatica}},
volume = {98},
pages = {323--330},
doi = {10.1016/j.automatica.2018.09.019},
year = {2018},
abstract = {This paper studies the problem of constructing in-block controllable (IBC) regions for affine systems. That is, we are concerned with constructing regions in the state space of affine systems such that all the states in the interior of the region are mutually accessible within the region’s interior by applying uniformly bounded inputs. We first show that existing results for checking in-block controllability on given polytopic regions cannot be easily extended to address the question of constructing IBC regions. We then explore the geometry of the problem to provide a computationally efficient algorithm for constructing IBC regions. We also prove the soundness of the algorithm. We then use the proposed algorithm to construct safe speed profiles for robotic systems. As a case study, we present several experimental results on unmanned aerial vehicles (UAVs) to verify the effectiveness of the proposed algorithm; these results include using the proposed algorithm for real-time collision avoidance for UAVs.},
}

An inversion-based learning approach for improving impromptu trajectory tracking of robots with non-minimum phase dynamics
S. Zhou, M. K. Helwa, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 3, iss. 3, p. 1663–1670, 2018.

This letter presents a learning-based approach for impromptu trajectory tracking for non-minimum phase systems, i.e., systems with unstable inverse dynamics. Inversion-based feedforward approaches are commonly used for improving tracking performance; however, these approaches are not directly applicable to non-minimum phase systems due to their inherent instability. In order to resolve the instability issue, existing methods have assumed that the system model is known and used preactuation or inverse approximation techniques. In this work, we propose an approach for learning a stable, approximate inverse of a non-minimum phase baseline system directly from its input–output data. Through theoretical discussions, simulations, and experiments on two different platforms, we show the stability of our proposed approach and its effectiveness for high-accuracy, impromptu tracking. Our approach also shows that including more information in the training, as is commonly assumed to be useful, does not lead to better performance but may trigger instability and impact the effectiveness of the overall approach.

@article{zhou-ral18,
title = {An Inversion-Based Learning Approach for Improving Impromptu Trajectory Tracking of Robots With Non-Minimum Phase Dynamics},
author = {SiQi Zhou and Mohamed K. Helwa and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2018},
volume = {3},
number = {3},
doi = {10.1109/LRA.2018.2801471},
pages = {1663--1670},
abstract = {This letter presents a learning-based approach for impromptu trajectory tracking for non-minimum phase systems, i.e., systems with unstable inverse dynamics. Inversion-based feedforward approaches are commonly used for improving tracking performance; however, these approaches are not directly applicable to non-minimum phase systems due to their inherent instability. In order to resolve the instability issue, existing methods have assumed that the system model is known and used preactuation or inverse approximation techniques. In this work, we propose an approach for learning a stable, approximate inverse of a non-minimum phase baseline system directly from its input–output data. Through theoretical discussions, simulations, and experiments on two different platforms, we show the stability of our proposed approach and its effectiveness for high-accuracy, impromptu tracking. Our approach also shows that including more information in the training, as is commonly assumed to be useful, does not lead to better performance but may trigger instability and impact the effectiveness of the overall approach.},
}

Level-headed: gimbal-stabilised visual teach & repeat for improved high-speed path-following
M. Warren, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2018.

Operating in rough, unstructured terrain is an essential requirement for any truly field-deployable ground robot. Search-and-rescue, border patrol and agricultural work all require operation in environments with little established infrastructure for easy navigation. This presents challenges for sensor-based navigation such as vision, where erratic motion and feature-poor environments test feature tracking and hinder the performance of repeat matching of point features. For vision-based route-following methods such as Visual Teach and Repeat (VT&R), maintaining similar visual perspective of salient point features is critical for reliable odometry and accurate localisation over long periods. In this paper, we investigate a potential solution to these challenges by integrating a gimballed camera with VT&R on a Grizzly Robotic Utility Vehicle (RUV) for testing at high speeds and in visually challenging environments. We investigate the benefits and drawbacks of using an actively gimballed camera to attenuate image motion and control viewpoint. We compare the use of a gimballed camera to our traditional fixed stereo configuration and demonstrate cases of improved performance in Visual Odometry (VO), localisation, and path following in several sets of outdoor experiments.

@INPROCEEDINGS{warren-icra18,
author = {Michael Warren and Angela P. Schoellig and Tim D. Barfoot},
title = {Level-headed: gimbal-stabilised visual teach \& repeat for improved high-speed path-following},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year={2018},
abstract = {Operating in rough, unstructured terrain is an essential requirement for any truly field-deployable ground robot. Search-and-rescue, border patrol and agricultural work all require operation in environments with little established infrastructure for easy navigation. This presents challenges for sensor-based navigation such as vision, where erratic motion and feature-poor environments test feature tracking and hinder the performance of repeat matching of point features. For vision-based route-following methods such as Visual Teach and Repeat (VT\&R), maintaining similar visual perspective of salient point features is critical for reliable odometry and accurate localisation over long periods. In this paper, we investigate a potential solution to these challenges by integrating a gimballed camera with VT\&R on a Grizzly Robotic Utility Vehicle (RUV) for testing at high speeds and in visually challenging environments. We investigate the benefits and drawbacks of using an actively gimballed camera to attenuate image motion and control viewpoint. We compare the use of a gimballed camera to our traditional fixed stereo configuration and demonstrate cases of improved performance in Visual Odometry (VO), localisation, and path following in several sets of outdoor experiments.},
}

Pre- and post-blast rock block size analysis using UAV-Lidar based data and discrete fracture network
F. Medinac, T. Bamford, K. Esmaeili, and A. P. Schoellig
in Proc. of the 2nd International Discrete Fracture Network Engineering (DFNE), 2018.

@INPROCEEDINGS{medinac-dfne18,
author = {Filip Medinac and Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title = {Pre- and post-blast rock block size analysis using {UAV-Lidar} based data and discrete fracture network},
booktitle = {{Proc. of the 2nd International Discrete Fracture Network Engineering (DFNE)}},
year = {2018},
abstract = {Drilling and blasting is one of the key processes in open pit mining, required to reduce in-situ rock block size to rock fragments that can be handled by mine equipment. It is a significant cost driver of any mining operation which can influence the downstream mining processes. In-situ rock block size influences the muck pile size distribution after blast, and the amount of drilling and explosive required to achieve a desired distribution. Thus, continuous measurement of pre- and post-blast rock block size distribution is essential for the optimization of the rock fragmentation process. This paper presents the results of a case study in an open pit mine where an Unmanned Aerial Vehicle (UAV) was used for mapping of the pit walls before blast. Pit wall mapping and aerial data was used as input to generate a 3D Discrete Fracture Network (DFN) model of the rock mass and to estimate the in-situ block size distribution. Data collected by the UAV was also used to estimate the post-blast rock fragment size distribution. The knowledge of in-situ and blasted rock size distributions can be related to assess blast performance. This knowledge will provide feedback to production engineers to adjust the blast design.},
}

Adaptive model predictive control for high-accuracy trajectory tracking in changing conditions
K. Pereida and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, p. 7831–7837.

Robots and automated systems are increasingly being introduced to unknown and dynamic environments where they are required to handle disturbances, unmodeled dynamics, and parametric uncertainties. Robust and adaptive control strategies are required to achieve high performance in these dynamic environments. In this paper, we propose a novel adaptive model predictive controller that combines model predictive control (MPC) with an underlying L_1 adaptive controller to improve trajectory tracking of a system subject to unknown and changing disturbances. The L_1 adaptive controller forces the system to behave in a predefined way, as specified by a reference model. A higher-level model predictive controller then uses this reference model to calculate the optimal reference input based on a cost function, while taking into account input and state constraints. We focus on the experimental validation of the proposed approach and demonstrate its effectiveness in experiments on a quadrotor. We show that the proposed approach has a lower trajectory tracking error compared to non-predictive, adaptive approaches and a predictive, non-adaptive approach, even when external wind disturbances are applied.

@INPROCEEDINGS{pereida-iros18,
author={Karime Pereida and Angela P. Schoellig},
title={Adaptive Model Predictive Control for High-Accuracy Trajectory Tracking in Changing Conditions},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year={2018},
pages={7831--7837},
abstract={Robots and automated systems are increasingly being introduced to unknown and dynamic environments where they are required to handle disturbances, unmodeled dynamics, and parametric uncertainties. Robust and adaptive control strategies are required to achieve high performance in these dynamic environments. In this paper, we propose a novel adaptive model predictive controller that combines model predictive control (MPC) with an underlying $\mathcal{L}_1$ adaptive controller to improve trajectory tracking of a system subject to unknown and changing disturbances. The $\mathcal{L}_1$ adaptive controller forces the system to behave in a predefined way, as specified by a reference model. A higher-level model predictive controller then uses this reference model to calculate the optimal reference input based on a cost function, while taking into account input and state constraints. We focus on the experimental validation of the proposed approach and demonstrate its effectiveness in experiments on a quadrotor. We show that the proposed approach has a lower trajectory tracking error compared to non-predictive, adaptive approaches and a predictive, non-adaptive approach, even when external wind disturbances are applied.},
}

Flatness-based model predictive control for quadrotor trajectory tracking
M. Greeff and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, p. 6740–6745.

The use of model predictive control for quadrotor applications requires balancing trajectory tracking performance and constraint satisfaction with fast computational demands. This paper proposes a Flatness-based Model Predictive Control (FMPC) approach that can be applied to quadrotors, and more generally, differentially flat nonlinear systems. Our proposed FMPC couples feedback model predictive control with feedforward linearization. The proposed approach has the computational advantage that, similar to linear model predictive control, it only requires solving a convex quadratic program instead of a nonlinear program. However, unlike linear model predictive control, we still account for the nonlinearity in the model through the use of an inverse term. In simulation, we demonstrate improved robustness over approaches that couple model predictive control with feedback linearization. In experiments using quadrotor vehicles, we demonstrate improved trajectory tracking compared to classical linear and nonlinear model predictive controllers.

@INPROCEEDINGS{greeff-iros18,
author={Melissa Greeff and Angela P. Schoellig},
title={Flatness-based Model Predictive Control for Quadrotor Trajectory Tracking},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year={2018},
urldata = {../../wp-content/papercite-data/data/greeff-icra18-supplementary.pdf},
pages={6740--6745},
abstract={The use of model predictive control for quadrotor applications requires balancing trajectory tracking performance and constraint satisfaction with fast computational demands. This paper proposes a Flatness-based Model Predictive Control (FMPC) approach that can be applied to quadrotors, and more generally, differentially flat nonlinear systems. Our proposed FMPC couples feedback model predictive control with feedforward linearization. The proposed approach has the computational advantage that, similar to linear model predictive control, it only requires solving a convex quadratic program instead of a nonlinear program. However, unlike linear model predictive control, we still account for the nonlinearity in the model through the use of an inverse term. In simulation, we demonstrate improved robustness over approaches that couple model predictive control with feedback linearization. In experiments using quadrotor vehicles, we demonstrate improved trajectory tracking compared to classical linear and nonlinear model predictive controllers.},
}

Hybrid model predictive control for crosswind stabilization of hybrid airships
J. F. M. Foerster, M. K. Helwa, X. Du, and A. P. Schoellig
in Proc. of the International Symposium on Experimental Robotics (ISER), 2018, p. 499–510.

Hybrid airships are heavier-than-air vehicles that generate a majority of the lift using buoyancy. The resulting high energy efficiency during operation and short take-off and landing distances make this vehicle class very suited for a number of logistics applications. However, the range of safe operating conditions can be limited due to a high susceptibility to crosswinds during taxiing, take-off and landing. The goal of this work is to design and implement an automated counter-gust system (CGS) that stabilizes a hybrid airship against wind disturbances during ground operations by controlling thrusters that are mounted to the wingtips. The CGS controller should compute optimal control inputs, run autonomously without pilot intervention, be computationally efficient to run on onboard hardware, and be flexible regarding adaption to future aircraft.

@INPROCEEDINGS{foerster-iser18,
author={Julian F. M. Foerster and Mohamed K. Helwa and Xintong Du and Angela P. Schoellig},
title={Hybrid Model Predictive Control for Crosswind Stabilization of Hybrid Airships},
booktitle={{Proc. of the International Symposium on Experimental Robotics (ISER)}},
year={2018},
pages={499--510},
doi={10.1007/978-3-030-33950-0_43},
abstract={Hybrid airships are heavier-than-air vehicles that generate a majority of the lift using buoyancy. The resulting high energy efficiency during operation and short take-off and landing distances make this vehicle class very suited for a number of logistics applications. However, the range of safe operating conditions can be limited due to a high susceptibility to crosswinds during taxiing, take-off and landing. The goal of this work is to design and implement an automated counter-gust system (CGS) that stabilizes a hybrid airship against wind disturbances during ground operations by controlling thrusters that are mounted to the wingtips. The CGS controller should compute optimal control inputs, run autonomously without pilot intervention, be computationally efficient to run on onboard hardware, and be flexible regarding adaption to future aircraft.},
}

Automated localization of UAVs in GPS-denied indoor construction environments using fiducial markers
M. Nahangi, A. Heins, B. McCabe, and A. P. Schoellig
in Proc. International Symposium on Automation and Robotics in Construction (ISARC), 2018, p. 88–94.

Unmanned Aerial Vehicles (UAVs) have opened a wide range of opportunities and applications in different sectors including construction. Such applications include: 3D mapping from 2D images and video footage, automated site inspection, and performance monitoring. All of the above-mentioned applications perform well outdoors where GPS is quite reliable for localization and navigation of UAVs. Indoor localization and consequently indoor navigation have remained relatively untapped, because GPS is not sufficiently reliable and accurate in indoor environments. This paper presents a method for localization of aerial vehicles in GPS-denied indoor construction environments. The proposed method employs AprilTags that are linked to previously known coordinates in the 3D building information model (BIM). Using cameras on-board the UAV and extracting the transformation from the tag to the camera’s frame, the UAV can be localized on the site. It can then use the previously computed information for navigation between critical locations on construction sites. We use an experimental setup to verify and validate the proposed method by comparing with an indoor localization system as the ground truth. Results show that the proposed method is sufficiently accurate to perform indoor navigation. Moreover, the method does not intensify the complexity of the construction execution as the tags are simply printed and placed on available surfaces at the construction site.

@INPROCEEDINGS{nahangi-isarc18,
author={Mohammad Nahangi and Adam Heins and Brenda McCabe and Angela P. Schoellig},
title={Automated Localization of {UAV}s in {GPS}-Denied Indoor Construction Environments Using Fiducial Markers},
booktitle = {{Proc. International Symposium on Automation and Robotics in Construction (ISARC)}},
year = {2018},
pages={88--94},
abstract = {Unmanned Aerial Vehicles (UAVs) have opened a wide range of opportunities and applications in different sectors including construction. Such applications include: 3D mapping from 2D images and video footage, automated site inspection, and performance monitoring. All of the above-mentioned applications perform well outdoors where GPS is quite reliable for localization and navigation of UAVs. Indoor localization and consequently indoor navigation have remained relatively untapped, because GPS is not sufficiently reliable and accurate in indoor environments. This paper presents a method for localization of aerial vehicles in GPS-denied indoor construction environments. The proposed method employs AprilTags that are linked to previously known coordinates in the 3D building information model (BIM). Using cameras on-board the UAV and extracting the transformation from the tag to the camera’s frame, the UAV can be localized on the site. It can then use the previously computed information for navigation between critical locations on construction sites. We use an experimental setup to verify and validate the proposed method by comparing with an indoor localization system as the ground truth. Results show that the proposed method is sufficiently accurate to perform indoor navigation. Moreover, the method does not intensify the complexity of the construction execution as the tags are simply printed and placed on available surfaces at the construction site.},
}

Experience-based model selection to enable long-term, safe control for repetitive tasks under changing conditions
C. D. McKinnon and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, p. 2977–2984.

Learning has propelled the cutting edge of performance in robotic control to new heights, allowing robots to operate with high performance in conditions that were previously unimaginable. The majority of the work, however, assumes that the unknown parts are static or slowly changing. This limits them to static or slowly changing environments. However, in the real world, a robot may experience various unknown conditions. This paper presents a method to extend an existing single mode GP-based safe learning controller to learn an increasing number of non-linear models for the robot dynamics. We show that this approach enables a robot to re-use past experience from a large number of previously visited operating conditions, and to safely adapt when a new and distinct operating condition is encountered. This allows the robot to achieve safety and high performance in a large number of operating conditions that do not have to be specified ahead of time. Our approach runs independently from the controller, imposing no additional computation time on the control loop regardless of the number of previous operating conditions considered. We demonstrate the effectiveness of our approach in experiments on a 900 kg ground robot with both physical and artificial changes to its dynamics. All of our experiments are conducted using vision for localization.

@inproceedings{mckinnon-iros18,
author = {Christopher D. McKinnon and Angela P. Schoellig},
title = {Experience-Based Model Selection to Enable Long-Term, Safe Control for Repetitive Tasks Under Changing Conditions},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year = {2018},
urlslides = {../../wp-content/papercite-data/slides/mckinnon-iros18-slides.pdf},
pages = {2977--2984},
abstract = {Learning has propelled the cutting edge of performance in robotic control to new heights, allowing robots to operate with high performance in conditions that were previously unimaginable. The majority of the work, however, assumes that the unknown parts are static or slowly changing. This limits them to static or slowly changing environments. However, in the real world, a robot may experience various unknown conditions. This paper presents a method to extend an existing single mode GP-based safe learning controller to learn an increasing number of non-linear models for the robot dynamics. We show that this approach enables a robot to re-use past experience from a large number of previously visited operating conditions, and to safely adapt when a new and distinct operating condition is encountered. This allows the robot to achieve safety and high performance in a large number of operating conditions that do not have to be specified ahead of time. Our approach runs independently from the controller, imposing no additional computation time on the control loop regardless of the number of previous operating conditions considered. We demonstrate the effectiveness of our approach in experiments on a 900\,kg ground robot with both physical and artificial changes to its dynamics. All of our experiments are conducted using vision for localization.},
}

Evaluation of UAV system accuracy for automated fragmentation measurement
T. Bamford, K. Esmaeili, and A. P. Schoellig
in Proc. of the 12th International Symposium on Rock Fragmentation by Blasting (FRAGBLAST), 2018, p. 715–730.

The current practice of collecting rock fragmentation data is highly manual and provides data with low temporal and spatial resolution. Unmanned Aerial Vehicle (UAV) technology can increase both temporal and spatial data resolution without exposing technicians to hazardous conditions. Our previous work using UAV technology to acquire real-time rock fragmentation data has shown comparable quality results to sieving in a lab environment. However, when applied to a mining environment, it is essential to quantify the accuracy of scale estimation and rock size distribution by considering various sources of uncertainties such as the UAV GPS, which provides noisy measurements. In the current paper, we investigate the accuracy of application of UAVs to collect photographic data for fragmentation analysis. This is done by evaluating the accuracy of the 3D model generated using the UAV system, estimated image scale, and the measured rock size distribution. This paper also investigates the impact of flight altitude on the measured rock size distribution.

@inproceedings{bamford-fragblast12,
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title = {Evaluation of {UAV} system accuracy for automated fragmentation measurement},
booktitle = {{Proc. of the 12th International Symposium on Rock Fragmentation by Blasting (FRAGBLAST)}},
year = {2018},
pages = {715--730},
abstract = {The current practice of collecting rock fragmentation data is highly manual and provides data with low temporal and spatial resolution. Unmanned Aerial Vehicle (UAV) technology can increase both temporal and spatial data resolution without exposing technicians to hazardous conditions. Our previous work using UAV technology to acquire real-time rock fragmentation data has shown comparable quality results to sieving in a lab environment. However, when applied to a mining environment, it is essential to quantify the accuracy of scale estimation and rock size distribution by considering various sources of uncertainties such as the UAV GPS, which provides noisy measurements. In the current paper, we investigate the accuracy of application of UAVs to collect photographic data for fragmentation analysis. This is done by evaluating the accuracy of the 3D model generated using the UAV system, estimated image scale, and the measured rock size distribution. This paper also investigates the impact of flight altitude on the measured rock size distribution.},
}

Learning of coordination policies for robotic swarms
Q. Li, X. Du, Y. Huang, Q. Sykora, and A. P. Schoellig
Technical Report, arXiv, 2018.

Inspired by biological swarms, robotic swarms are envisioned to solve real-world problems that are difficult for individual agents. Biological swarms can achieve collective intelligence based on local interactions and simple rules; however, designing effective distributed policies for large-scale robotic swarms to achieve a global objective can be challenging. Although it is often possible to design an optimal centralized strategy for smaller numbers of agents, those methods can fail as the number of agents increases. Motivated by the growing success of machine learning, we develop a deep learning approach that learns distributed coordination policies from centralized policies. In contrast to traditional distributed control approaches, which are usually based on human-designed policies for relatively simple tasks, this learning-based approach can be adapted to more difficult tasks. We demonstrate the efficacy of our proposed approach on two different tasks, the well-known rendezvous problem and a more difficult particle assignment problem. For the latter, no known distributed policy exists. From extensive simulations, it is shown that the performance of the learned coordination policies is comparable to the centralized policies, surpassing state-of-the-art distributed policies. Thereby, our proposed approach provides a promising alternative for real-world coordination problems that would be otherwise computationally expensive to solve or intangible to explore.

@TECHREPORT{li-icra18,
title = {Learning of Coordination Policies for Robotic Swarms},
institution = {arXiv},
author = {Qiyang Li and Xintong Du and Yizhou Huang and Quinlan Sykora and Angela P. Schoellig},
year = {2018},
abstract = {Inspired by biological swarms, robotic swarms are envisioned to solve real-world problems that are difficult for individual agents. Biological swarms can achieve collective intelligence based on local interactions and simple rules; however, designing effective distributed policies for large-scale robotic swarms to achieve a global objective can be challenging. Although it is often possible to design an optimal centralized strategy for smaller numbers of agents, those methods can fail as the number of agents increases. Motivated by the growing success of machine learning, we develop a deep learning approach that learns distributed coordination policies from centralized policies. In contrast to traditional distributed control approaches, which are usually based on human-designed policies for relatively simple tasks, this learning-based approach can be adapted to more difficult tasks. We demonstrate the efficacy of our proposed approach on two different tasks, the well-known rendezvous problem and a more difficult particle assignment problem. For the latter, no known distributed policy exists. From extensive simulations, it is shown that the performance of the learned coordination policies is comparable to the centralized policies, surpassing state-of-the-art distributed policies. Thereby, our proposed approach provides a promising alternative for real-world coordination problems that would be otherwise computationally expensive to solve or intangible to explore.},
}

## 2017

A real-time analysis of post-blast rock fragmentation using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
International Journal of Mining, Reclamation and Environment, vol. 31, iss. 6, p. 439–456, 2017.

The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments improves the quality of the image data and automates the data collection process. This work presents the results of laboratory-scale using a UAV. The goal is to highlight the benefits of aerial fragmentation analysis in terms of both prediction accuracy and time effort. The pile was manually photographed and the results of the manual method were compared to the UAV method.

@article{bamford-ijmre17,
title = {A Real-Time Analysis of Post-Blast Rock Fragmentation Using {UAV} Technology},
author = {Bamford, Thomas and Esmaeili, Kamran and Schoellig, Angela P.},
journal = {{International Journal of Mining, Reclamation and Environment}},
year = {2017},
volume = {31},
number = {6},
doi = {10.1080/17480930.2017.1339170},
pages = {439--456},
publisher = {Taylor \& Francis},
urlvideo = {https://youtu.be/q0syk6J_JHY},
abstract = {The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments improves the quality of the image data and automates the data collection process. This work presents the results of laboratory-scale using a UAV. The goal is to highlight the benefits of aerial fragmentation analysis in terms of both prediction accuracy and time effort. The pile was manually photographed and the results of the manual method were compared to the UAV method.},
}

Optimizing a drone network to deliver automated external defibrillators
J. J. Boutilier, S. C. Brooks, A. Janmohamed, A. Byers, J. E. Buick, C. Zhan, A. P. Schoellig, S. Cheskes, L. J. Morrison, and T. C. Y. Chan
Circulation, 2017. In press.

BACKGROUND Public access defibrillation programs can improve survival after out-of-hospital cardiac arrest (OHCA), but automated external defibrillators (AEDs) are rarely available for bystander use at the scene. Drones are an emerging technology that can deliver an AED to the scene of an OHCA for bystander use. We hypothesize that a drone network designed with the aid of a mathematical model combining both optimization and queuing can reduce the time to AED arrival. METHODS We applied our model to 53,702 OHCAs that occurred in the eight regions of the Toronto Regional RescuNET between January 1st 2006 and December 31st 2014. Our primary analysis quantified the drone network size required to deliver an AED one, two, or three minutes faster than historical median 911 response times for each region independently. A secondary analysis quantified the reduction in drone resources required if RescuNET was treated as one large coordinated region. RESULTS The region-specific analysis determined that 81 bases and 100 drones would be required to deliver an AED ahead of median 911 response times by three minutes. In the most urban region, the 90th percentile of the AED arrival time was reduced by 6 minutes and 43 seconds relative to historical 911 response times in the region. In the most rural region, the 90th percentile was reduced by 10 minutes and 34 seconds. A single coordinated drone network across all regions required 39.5% fewer bases and 30.0% fewer drones to achieve similar AED delivery times. CONCLUSIONS An optimized drone network designed with the aid of a novel mathematical model can substantially reduce the AED delivery time to an OHCA event.

@article{boutilier-circ17,
title={Optimizing a Drone Network to Deliver Automated External Defibrillators},
author = {Boutilier, Justin J. and Brooks, Steven C. and Janmohamed, Alyf and Byers, Adam and Buick, Jason E. and Zhan, Cathy and Schoellig, Angela P. and Cheskes, Sheldon and Morrison, Laurie J. and Chan, Timothy C. Y.},
journal={Circulation},
year={2017},
doi = {10.1161/CIRCULATIONAHA.116.026318},
publisher = {American Heart Association, Inc.},
note = {In press},
abstract = {BACKGROUND Public access defibrillation programs can improve survival after out-of-hospital cardiac arrest (OHCA), but automated external defibrillators (AEDs) are rarely available for bystander use at the scene. Drones are an emerging technology that can deliver an AED to the scene of an OHCA for bystander use. We hypothesize that a drone network designed with the aid of a mathematical model combining both optimization and queuing can reduce the time to AED arrival. METHODS We applied our model to 53,702 OHCAs that occurred in the eight regions of the Toronto Regional RescuNET between January 1st 2006 and December 31st 2014. Our primary analysis quantified the drone network size required to deliver an AED one, two, or three minutes faster than historical median 911 response times for each region independently. A secondary analysis quantified the reduction in drone resources required if RescuNET was treated as one large coordinated region. RESULTS The region-specific analysis determined that 81 bases and 100 drones would be required to deliver an AED ahead of median 911 response times by three minutes. In the most urban region, the 90th percentile of the AED arrival time was reduced by 6 minutes and 43 seconds relative to historical 911 response times in the region. In the most rural region, the 90th percentile was reduced by 10 minutes and 34 seconds. A single coordinated drone network across all regions required 39.5% fewer bases and 30.0% fewer drones to achieve similar AED delivery times. CONCLUSIONS An optimized drone network designed with the aid of a novel mathematical model can substantially reduce the AED delivery time to an OHCA event.},
}

Safe model-based reinforcement learning with stability guarantees
F. Berkenkamp, M. Turchetta, A. P. Schoellig, and A. Krause
in Proc. of Neural Information Processing Systems (NIPS), 2017, p. 908–918.

Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety in terms of stability guarantees. Specifically, we extend control theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.

@INPROCEEDINGS{berkenkamp-nips17,
title = {Safe model-based reinforcement learning with stability guarantees},
booktitle = {{Proc. of Neural Information Processing Systems (NIPS)}},
author = {Felix Berkenkamp and Matteo Turchetta and Angela P. Schoellig and Andreas Krause},
year = {2017},
pages = {908--918},
abstract = {Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety in terms of stability guarantees. Specifically, we extend control theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.},
}

Towards visual teach & repeat for GPS-denied flight of a fixed-wing UAV
M. Warren, M. Paton, K. MacTavish, A. P. Schoellig, and T. D. Barfoot
in Proc. of the 11th Conference on Field and Service Robotics (FSR), 2017, p. 481–498.

Most consumer and industrial Unmanned Aerial Vehicles (UAVs) rely on combining Global Navigation Satellite Systems (GNSS) with barometric and inertial sensors for outdoor operation. As a consequence these vehicles are prone to a variety of potential navigation failures such as jamming and environmental interference. This usually limits their legal activities to locations of low population density within line-of-sight of a human pilot to reduce risk of injury and damage. Autonomous route-following methods such as Visual Teach & Repeat (VT&R) have enabled long-range navigational autonomy for ground robots without the need for reliance on external infrastructure or an accurate global position estimate. In this paper, we demonstrate the localisation component of VT&R outdoors on a fixed-wing UAV as a method of backup navigation in case of primary sensor failure. We modify the localisation engine of VT&R to work with a single downward facing camera on a UAV to enable safe navigation under the guidance of vision alone. We evaluate the method using visual data from the UAV flying a 1200 m trajectory (at altitude of 80 m) several times during a multi-day period, covering a total distance of 10.8 km using the algorithm. We examine the localisation performance for both small (single flight) and large (inter-day) temporal differences from teach to repeat. Through these experiments, we demonstrate the ability to successfully localise the aircraft on a self-taught route using vision alone without the need for additional sensing or infrastructure.

@INPROCEEDINGS{warren-fsr17,
author={Michael Warren and Michael Paton and Kirk MacTavish and Angela P. Schoellig and Tim D. Barfoot},
title={Towards visual teach \& repeat for {GPS}-denied flight of a fixed-wing {UAV}},
booktitle={{Proc. of the 11th Conference on Field and Service Robotics (FSR)}},
year={2017},
pages={481--498},
abstract={Most consumer and industrial Unmanned Aerial Vehicles (UAVs) rely on combining Global Navigation Satellite Systems (GNSS) with barometric and inertial sensors for outdoor operation. As a consequence these vehicles are prone to a variety of potential navigation failures such as jamming and environmental interference. This usually limits their legal activities to locations of low population density within line-of-sight of a human pilot to reduce risk of injury and damage. Autonomous route-following methods such as Visual Teach & Repeat (VT&R) have enabled long-range navigational autonomy for ground robots without the need for reliance on external infrastructure or an accurate global position estimate. In this paper, we demonstrate the localisation component of VT&R outdoors on a fixed-wing UAV as a method of backup navigation in case of primary sensor failure. We modify the localisation engine of VT&R to work with a single downward facing camera on a UAV to enable safe navigation under the guidance of vision alone. We evaluate the method using visual data from the UAV flying a 1200 m trajectory (at altitude of 80 m) several times during a multi-day period, covering a total distance of 10.8 km using the algorithm. We examine the localisation performance for both small (single flight) and large (inter-day) temporal differences from teach to repeat. Through these experiments, we demonstrate the ability to successfully localise the aircraft on a self-taught route using vision alone without the need for additional sensing or infrastructure.},
}

Multi-robot transfer learning: a dynamical system perspective
M. K. Helwa and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, p. 4702–4708.

Multi-robot transfer learning allows a robot to use data generated by a second, similar robot to improve its own behavior. The potential advantages are reducing the time of training and the unavoidable risks that exist during the training phase. Transfer learning algorithms aim to find an optimal transfer map between different robots. In this paper, we investigate, through a theoretical study of single-input single-output (SISO) systems, the properties of such optimal transfer maps. We first show that the optimal transfer learning map is, in general, a dynamic system. The main contribution of the paper is to provide an algorithm for determining the properties of this optimal dynamic map including its order and regressors (i.e., the variables it depends on). The proposed algorithm does not require detailed knowledge of the robots’ dynamics, but relies on basic system properties easily obtainable through simple experimental tests. We validate the proposed algorithm experimentally through an example of transfer learning between two different quadrotor platforms. Experimental results show that an optimal dynamic map, with correct properties obtained from our proposed algorithm, achieves 60-70% reduction of transfer learning error compared to the cases when the data is directly transferred or transferred using an optimal static map.

@INPROCEEDINGS{helwa-iros17,
author={Mohamed K. Helwa and Angela P. Schoellig},
title={Multi-Robot Transfer Learning: A Dynamical System Perspective},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year={2017},
pages={4702--4708},
doi={10.1109/IROS.2017.8206342},
abstract={Multi-robot transfer learning allows a robot to use data generated by a second, similar robot to improve its own behavior. The potential advantages are reducing the time of training and the unavoidable risks that exist during the training phase. Transfer learning algorithms aim to find an optimal transfer map between different robots. In this paper, we investigate, through a theoretical study of single-input single-output (SISO) systems, the properties of such optimal transfer maps. We first show that the optimal transfer learning map is, in general, a dynamic system. The main contribution of the paper is to provide an algorithm for determining the properties of this optimal dynamic map including its order and regressors (i.e., the variables it depends on). The proposed algorithm does not require detailed knowledge of the robots’ dynamics, but relies on basic system properties easily obtainable through simple experimental tests. We validate the proposed algorithm experimentally through an example of transfer learning between two different quadrotor platforms. Experimental results show that an optimal dynamic map, with correct properties obtained from our proposed algorithm, achieves 60-70% reduction of transfer learning error compared to the cases when the data is directly transferred or transferred using an optimal static map.},
}

Aerial rock fragmentation analysis in low-light condition using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
in Proc. of Application of Computers and Operations Research in the Mining Industry (APCOM), 2017, p. 4-1–4-8.

In recent years, Unmanned Aerial Vehicle (UAV) technology has been introduced into the mining industry to conduct terrain surveying. This work investigates the application of UAVs with artificial lighting for measurement of rock fragmentation under poor lighting conditions, representing night shifts in surface mines or working conditions in underground mines. The study relies on indoor and outdoor experiments for rock fragmentation analysis using a quadrotor UAV. Comparison of the rock size distributions in both cases shows that adequate artificial lighting enables similar accuracy to ideal lighting conditions.

@INPROCEEDINGS{bamford-apcom17,
author={Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title={Aerial Rock Fragmentation Analysis in Low-Light Condition Using {UAV} Technology},
booktitle={{Proc. of Application of Computers and Operations Research in the Mining Industry (APCOM)}},
year={2017},
pages = {4-1--4-8},
urlslides={../../wp-content/papercite-data/slides/bamford-apcom17-slides.pdf},
abstract={In recent years, Unmanned Aerial Vehicle (UAV) technology has been introduced into the mining industry to conduct terrain surveying. This work investigates the application of UAVs with artificial lighting for measurement of rock fragmentation under poor lighting conditions, representing night shifts in surface mines or working conditions in underground mines. The study relies on indoor and outdoor experiments for rock fragmentation analysis using a quadrotor UAV. Comparison of the rock size distributions in both cases shows that adequate artificial lighting enables similar accuracy to ideal lighting conditions.},
}

A framework for multi-vehicle navigation using feedback-based motion primitives
M. Vukosavljev, Z. Kroeze, M. E. Broucke, and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, p. 223–229.

We present a hybrid control framework for solving a motion planning problem among a collection of heterogeneous agents. The proposed approach utilizes a finite set of low-level motion primitives, each based on a piecewise affine feedback control, to generate complex motions in a gridded workspace. The constraints on allowable sequences of successive motion primitives are formalized through a maneuver automaton. At the higher level, a control policy generated by a shortest path non-deterministic algorithm determines which motion primitive is executed in each box of the gridded workspace. The overall framework yields a highly robust control design on both the low and high levels. We experimentally demonstrate the efficacy and robustness of this framework for multiple quadrocopters maneuvering in a 2D or 3D workspace.

@INPROCEEDINGS{vukosavljev-iros17,
author={Marijan Vukosavljev and Zachary Kroeze and Mireille E. Broucke and Angela P. Schoellig},
title={A Framework for Multi-Vehicle Navigation Using Feedback-Based Motion Primitives},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year={2017},
pages={223--229},
doi={10.1109/IROS.2017.8202161},
urlslides = {../../wp-content/papercite-data/slides/vukosavljev-iros17-slides.pdf},
abstract={We present a hybrid control framework for solving a motion planning problem among a collection of heterogeneous agents. The proposed approach utilizes a finite set of low-level motion primitives, each based on a piecewise affine feedback control, to generate complex motions in a gridded workspace. The constraints on allowable sequences of successive motion primitives are formalized through a maneuver automaton. At the higher level, a control policy generated by a shortest path non-deterministic algorithm determines which motion primitive is executed in each box of the gridded workspace. The overall framework yields a highly robust control design on both the low and high levels. We experimentally demonstrate the efficacy and robustness of this framework for multiple quadrocopters maneuvering in a 2D or 3D workspace.},
}

Design of deep neural networks as add-on blocks for improving impromptu trajectory tracking
S. Zhou, M. K. Helwa, and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2017, p. 5201–5207.

This paper introduces deep neural networks (DNNs) as add-on blocks to baseline feedback control systems to enhance tracking performance of arbitrary desired trajectories. The DNNs are trained to adapt the reference signals to the feedback control loop. The goal is to achieve a unity map between the desired and the actual outputs. In previous work, the efficacy of this approach was demonstrated on quadrotors; on 30 unseen test trajectories, the proposed DNN approach achieved an average impromptu tracking error reduction of 43% as compared to the baseline feedback controller. Motivated by these results, this work aims to provide platform-independent design guidelines for the proposed DNN-enhanced control architecture. In particular, we provide specific guidelines for the DNN feature selection, derive conditions for when the proposed approach is effective, and show in which cases the training efficiency can be further increased.

@INPROCEEDINGS{zhou-cdc17,
author={Siqi Zhou and Mohamed K. Helwa and Angela P. Schoellig},
title={Design of Deep Neural Networks as Add-on Blocks for Improving Impromptu Trajectory Tracking},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2017},
pages={5201--5207},
abstract = {This paper introduces deep neural networks (DNNs) as add-on blocks to baseline feedback control systems to enhance tracking performance of arbitrary desired trajectories. The DNNs are trained to adapt the reference signals to the feedback control loop. The goal is to achieve a unity map between the desired and the actual outputs. In previous work, the efficacy of this approach was demonstrated on quadrotors; on 30 unseen test trajectories, the proposed DNN approach achieved an average impromptu tracking error reduction of 43% as compared to the baseline feedback controller. Motivated by these results, this work aims to provide platform-independent design guidelines for the proposed DNN-enhanced control architecture. In particular, we provide specific guidelines for the DNN feature selection, derive conditions for when the proposed approach is effective, and show in which cases the training efficiency can be further increased.}
}

Constrained Bayesian optimization with particle swarms for safe adaptive controller tuning
R. R. P. R. Duivenvoorden, F. Berkenkamp, N. Carion, A. Krause, and A. P. Schoellig
in Proc. of the IFAC (International Federation of Automatic Control) World Congress, 2017, p. 12306–12313.

Tuning controller parameters is a recurring and time-consuming problem in control. This is especially true in the field of adaptive control, where good performance is typically only achieved after significant tuning. Recently, it has been shown that constrained Bayesian optimization is a promising approach to automate the tuning process without risking system failures during the optimization process. However, this approach is computationally too expensive for tuning more than a couple of parameters. In this paper, we provide a heuristic in order to efficiently perform constrained Bayesian optimization in high-dimensional parameter spaces by using an adaptive discretization based on particle swarms. We apply the method to the tuning problem of an L1 adaptive controller on a quadrotor vehicle and show that we can reliably and automatically tune parameters in experiments.

@INPROCEEDINGS{duivenvoorden-ifac17,
author = {Rikky R.P.R. Duivenvoorden and Felix Berkenkamp and Nicolas Carion and Andreas Krause and Angela P. Schoellig},
title = {Constrained {B}ayesian Optimization with Particle Swarms for Safe Adaptive Controller Tuning},
booktitle = {{Proc. of the IFAC (International Federation of Automatic Control) World Congress}},
year = {2017},
pages = {12306--12313},
abstract = {Tuning controller parameters is a recurring and time-consuming problem in control. This is especially true in the field of adaptive control, where good performance is typically only achieved after significant tuning. Recently, it has been shown that constrained Bayesian optimization is a promising approach to automate the tuning process without risking system failures during the optimization process. However, this approach is computationally too expensive for tuning more than a couple of parameters. In this paper, we provide a heuristic in order to efficiently perform constrained Bayesian optimization in high-dimensional parameter spaces by using an adaptive discretization based on particle swarms. We apply the method to the tuning problem of an L1 adaptive controller on a quadrotor vehicle and show that we can reliably and automatically tune parameters in experiments.},
}

Learning multimodal models for robot dynamics online with a mixture of Gaussian process experts
C. D. McKinnon and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, p. 322–328.

For decades, robots have been essential allies alongside humans in controlled industrial environments like heavy manufacturing facilities. However, without the guidance of a trusted human operator to shepherd a robot safely through a wide range of conditions, they have been barred from the complex, ever changing environments that we live in from day to day. Safe learning control has emerged as a promising way to start bridging algorithms based on first principles to complex real-world scenarios by using data to adapt, and improve performance over time. Safe learning methods rely on a good estimate of the robot dynamics and of the bounds on modelling error in order to be effective. Current methods focus on either a single adaptive model, or a fixed, known set of models for the robot dynamics. This limits them to static or slowly changing environments. This paper presents a method using Gaussian Processes in a Dirichlet Process mixture model to learn an increasing number of non-linear models for the robot dynamics. We show that this approach enables a robot to re-use past experience from an arbitrary number of previously visited operating conditions, and to automatically learn a new model when a new and distinct operating condition is encountered. This approach improves the robustness of existing Gaussian Process-based models to large changes in dynamics that do not have to be specified ahead of time.

@INPROCEEDINGS{mckinnon-icra17,
author = {Christopher D. McKinnon and Angela P. Schoellig},
title = {Learning multimodal models for robot dynamics online with a mixture of {G}aussian process experts},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2017},
pages = {322--328},
doi = {10.1109/ICRA.2017.7989041},
abstract = {For decades, robots have been essential allies alongside humans in controlled industrial environments like heavy manufacturing facilities. However, without the guidance of a trusted human operator to shepherd a robot safely through a wide range of conditions, they have been barred from the complex, ever changing environments that we live in from day to day. Safe learning control has emerged as a promising way to start bridging algorithms based on first principles to complex real-world scenarios by using data to adapt, and improve performance over time. Safe learning methods rely on a good estimate of the robot dynamics and of the bounds on modelling error in order to be effective. Current methods focus on either a single adaptive model, or a fixed, known set of models for the robot dynamics. This limits them to static or slowly changing environments. This paper presents a method using Gaussian Processes in a Dirichlet Process mixture model to learn an increasing number of non-linear models for the robot dynamics. We show that this approach enables a robot to re-use past experience from an arbitrary number of previously visited operating conditions, and to automatically learn a new model when a new and distinct operating condition is encountered. This approach improves the robustness of existing Gaussian Process-based models to large changes in dynamics that do not have to be specified ahead of time.},
}

High-precision trajectory tracking in changing environments through L1 adaptive feedback and iterative learning
K. Pereida, R. R. P. R. Duivenvoorden, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, p. 344–350.

As robots and other automated systems are introduced to unknown and dynamic environments, robust and adaptive control strategies are required to cope with disturbances, unmodeled dynamics and parametric uncertainties. In this paper, we propose and provide theoretical proofs of a combined L1 adaptive feedback and iterative learning control (ILC) framework to improve trajectory tracking of a system subject to unknown and changing disturbances. The L1 adaptive controller forces the system to behave in a repeatable, predefined way, even in the presence of unknown and changing disturbances; however, this does not imply that perfect trajectory tracking is achieved. ILC improves the tracking performance based on experience from previous executions. The performance of ILC is limited by the robustness and repeatability of the underlying system, which, in this approach, is handled by the L1 adaptive controller. In particular, we are able to generalize learned trajectories across different system configurations because the L1 adaptive controller handles the underlying changes in the system. We demonstrate the improved trajectory tracking performance and generalization capabilities of the combined method compared to pure ILC in experiments with a quadrotor subject to unknown, dynamic disturbances. This is the first work to show L1 adaptive control combined with ILC in experiment.

@INPROCEEDINGS{pereida-icra17,
author = {Karime Pereida and Rikky R. P. R. Duivenvoorden and Angela P. Schoellig},
title = {High-precision trajectory tracking in changing environments through {L1} adaptive feedback and iterative learning},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2017},
pages = {344--350},
doi = {10.1109/ICRA.2017.7989041},
urlslides = {../../wp-content/papercite-data/slides/pereida-icra17-slides.pdf},
abstract = {As robots and other automated systems are introduced to unknown and dynamic environments, robust and adaptive control strategies are required to cope with disturbances, unmodeled dynamics and parametric uncertainties. In this paper, we propose and provide theoretical proofs of a combined L1 adaptive feedback and iterative learning control (ILC) framework to improve trajectory tracking of a system subject to unknown and changing disturbances. The L1 adaptive controller forces the system to behave in a repeatable, predefined way, even in the presence of unknown and changing disturbances; however, this does not imply that perfect trajectory tracking is achieved. ILC improves the tracking performance based on experience from previous executions. The performance of ILC is limited by the robustness and repeatability of the underlying system, which, in this approach, is handled by the L1 adaptive controller. In particular, we are able to generalize learned trajectories across different system configurations because the L1 adaptive controller handles the underlying changes in the system. We demonstrate the improved trajectory tracking performance and generalization capabilities of the combined method compared to pure ILC in experiments with a quadrotor subject to unknown, dynamic disturbances. This is the first work to show L1 adaptive control combined with ILC in experiment.},
}

Deep neural networks for improved, impromptu trajectory tracking of quadrotors
Q. Li, J. Qian, Z. Zhu, X. Bao, M. K. Helwa, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, p. 5183–5189.

Trajectory tracking control for quadrotors is important for applications ranging from surveying and inspection, to film making. However, designing and tuning classical controllers, such as proportional-integral-derivative (PID) controllers, to achieve high tracking precision can be time-consuming and difficult, due to hidden dynamics and other non-idealities. The Deep Neural Network (DNN), with its superior capability of approximating abstract, nonlinear functions, proposes a novel approach for enhancing trajectory tracking control. This paper presents a DNN-based algorithm as an add-on module that improves the tracking performance of a classical feedback controller. Given a desired trajectory, the DNNs provide a tailored reference input to the controller based on their gained experience. The input aims to achieve a unity map between the desired and the output trajectory. The motivation for this work is an interactive “fly-as-you-draw” application, in which a user draws a trajectory on a mobile device, and a quadrotor instantly flies that trajectory with the DNN-enhanced control system. Experimental results demonstrate that the proposed approach improves the tracking precision for user-drawn trajectories after the DNNs are trained on selected periodic trajectories, suggesting the method’s potential in real-world applications. Tracking errors are reduced by around 40-50% for both training and testing trajectories from users, highlighting the DNNs’ capability of generalizing knowledge.

@INPROCEEDINGS{li-icra17,
author = {Qiyang Li and Jingxing Qian and Zining Zhu and Xuchan Bao and Mohamed K. Helwa and Angela P. Schoellig},
title = {Deep neural networks for improved, impromptu trajectory tracking of quadrotors},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2017},
pages = {5183--5189},
doi = {10.1109/ICRA.2017.7989607},
urlvideo = {https://youtu.be/r1WnMUZy9-Y},
abstract = {Trajectory tracking control for quadrotors is important for applications ranging from surveying and inspection, to film making. However, designing and tuning classical controllers, such as proportional-integral-derivative (PID) controllers, to achieve high tracking precision can be time-consuming and difficult, due to hidden dynamics and other non-idealities. The Deep Neural Network (DNN), with its superior capability of approximating abstract, nonlinear functions, proposes a novel approach for enhancing trajectory tracking control. This paper presents a DNN-based algorithm as an add-on module that improves the tracking performance of a classical feedback controller. Given a desired trajectory, the DNNs provide a tailored reference input to the controller based on their gained experience. The input aims to achieve a unity map between the desired and the output trajectory. The motivation for this work is an interactive “fly-as-you-draw” application, in which a user draws a trajectory on a mobile device, and a quadrotor instantly flies that trajectory with the DNN-enhanced control system. Experimental results demonstrate that the proposed approach improves the tracking precision for user-drawn trajectories after the DNNs are trained on selected periodic trajectories, suggesting the method’s potential in real-world applications. Tracking errors are reduced by around 40-50% for both training and testing trajectories from users, highlighting the DNNs’ capability of generalizing knowledge.},
}

Virtual vs. real: trading off simulations and physical experiments in reinforcement learning with Bayesian optimization
A. Marco, F. Berkenkamp, P. Hennig, A. P. Schoellig, A. Krause, S. Schaal, and S. Trimpe
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, p. 1557–1563.

In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.

@INPROCEEDINGS{marco-icra17,
author = {Alonso Marco and Felix Berkenkamp and Philipp Hennig and Angela P. Schoellig and Andreas Krause and Stefan Schaal and Sebastian Trimpe},
title = {Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with {B}ayesian Optimization},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
month = may,
year = {2017},
pages = {1557--1563},
doi = {10.1109/ICRA.2017.7989186},
abstract = {In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.}
}

Point-cloud-based aerial fragmentation analysis for application in the minerals industry
T. Bamford, K. Esmaeili, and A. P. Schoellig
Technical Report, arXiv, 2017.

This work investigates the application of Unmanned Aerial Vehicle (UAV) technology for measurement of rock fragmentation without placement of scale objects in the scene to determine image scale. Commonly practiced image-based rock fragmentation analysis requires a technician to walk to a rock pile, place a scale object of known size in the area of interest, and capture individual 2D images. Our previous work has used UAV technology for the first time to acquire real-time rock fragmentation data and has shown comparable quality of results; however, it still required the (potentially dangerous) placement of scale objects, and continued to make the assumption that the rock pile surface is planar and that the scale objects lie on the surface plane. This work improves our UAV-based approach to enable rock fragmentation measurement without placement of scale objects and without the assumption of planarity. This is achieved by first generating a point cloud of the rock pile from 2D images, taking into account intrinsic and extrinsic camera parameters, and then taking 2D images for fragmentation analysis. This work represents an important step towards automating post-blast rock fragmentation analysis. In experiments, a rock pile with known size distribution was photographed by the UAV with and without using scale objects. For fragmentation analysis without scale objects, a point cloud of the rock pile was generated and used to compute image scale. Comparison of the rock size distributions show that this point-cloud-based method enables producing measurements with better or comparable accuracy (within 10% of the ground truth) to the manual method with scale objects.

@TECHREPORT{bamford-iros17,
title = {Point-cloud-based aerial fragmentation analysis for application in the minerals industry},
institution = {arXiv},
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
year = {2017},
abstract = {This work investigates the application of Unmanned Aerial Vehicle (UAV) technology for measurement of rock fragmentation without placement of scale objects in the scene to determine image scale. Commonly practiced image-based rock fragmentation analysis requires a technician to walk to a rock pile, place a scale object of known size in the area of interest, and capture individual 2D images. Our previous work has used UAV technology for the first time to acquire real-time rock fragmentation data and has shown comparable quality of results; however, it still required the (potentially dangerous) placement of scale objects, and continued to make the assumption that the rock pile surface is planar and that the scale objects lie on the surface plane. This work improves our UAV-based approach to enable rock fragmentation measurement without placement of scale objects and without the assumption of planarity. This is achieved by first generating a point cloud of the rock pile from 2D images, taking into account intrinsic and extrinsic camera parameters, and then taking 2D images for fragmentation analysis. This work represents an important step towards automating post-blast rock fragmentation analysis. In experiments, a rock pile with known size distribution was photographed by the UAV with and without using scale objects. For fragmentation analysis without scale objects, a point cloud of the rock pile was generated and used to compute image scale. Comparison of the rock size distributions show that this point-cloud-based method enables producing measurements with better or comparable accuracy (within 10% of the ground truth) to the manual method with scale objects.},
}

## 2016

Robust constrained learning-based NMPC enabling reliable mobile robot path tracking
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
International Journal of Robotics Research, vol. 35, iss. 13, pp. 1547–1563, 2016.

This paper presents a Robust Constrained Learning-based Nonlinear Model Predictive Control (RC-LB-NMPC) algorithm for path-tracking in off-road terrain. For mobile robots, constraints may represent solid obstacles or localization limits. As a result, constraint satisfaction is required for safety. Constraint satisfaction is typically guaranteed through the use of accurate, a priori models or robust control. However, accurate models are generally not available for off-road operation. Furthermore, robust controllers are often conservative, since model uncertainty is not updated online. In this work our goal is to use learning to generate low-uncertainty, non-parametric models in situ. Based on these models, the predictive controller computes both linear and angular velocities in real-time, such that the robot drives at or near its capabilities while respecting path and localization constraints. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, off-road environments. The paper presents experimental results, including over 5 km of travel by a 900 kg skid-steered robot at speeds of up to 2.0 m/s. The result is a robust, learning controller that provides safe, conservative control during initial trials when model uncertainty is high and converges to high-performance, optimal control during later trials when model uncertainty is reduced with experience.

@ARTICLE{ostafew-ijrr16,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Robust Constrained Learning-Based {NMPC} Enabling Reliable Mobile Robot Path Tracking},
year = {2016},
journal = {{International Journal of Robotics Research}},
volume = {35},
number = {13},
pages = {1547--1563},
doi = {10.1177/0278364916645661},
url = {http://dx.doi.org/10.1177/0278364916645661},
eprint = {http://dx.doi.org/10.1177/0278364916645661},
urlvideo = {https://youtu.be/3xRNmNv5Efk},
abstract = {This paper presents a Robust Constrained Learning-based Nonlinear Model Predictive Control (RC-LB-NMPC) algorithm for path-tracking in off-road terrain. For mobile robots, constraints may represent solid obstacles or localization limits. As a result, constraint satisfaction is required for safety. Constraint satisfaction is typically guaranteed through the use of accurate, a priori models or robust control. However, accurate models are generally not available for off-road operation. Furthermore, robust controllers are often conservative, since model uncertainty is not updated online. In this work our goal is to use learning to generate low-uncertainty, non-parametric models in situ. Based on these models, the predictive controller computes both linear and angular velocities in real-time, such that the robot drives at or near its capabilities while respecting path and localization constraints. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, off-road environments. The paper presents experimental results, including over 5 km of travel by a 900 kg skid-steered robot at speeds of up to 2.0 m/s. The result is a robust, learning controller that provides safe, conservative control during initial trials when model uncertainty is high and converges to high-performance, optimal control during later trials when model uncertainty is reduced with experience.},
}

Learning-based nonlinear model predictive control to improve vision-based mobile robot path tracking
C. J. Ostafew, J. Collier, A. P. Schoellig, and T. D. Barfoot
Journal of Field Robotics, vol. 33, iss. 1, pp. 133–152, 2016.

This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm to achieve high-performance path tracking in challenging off-road terrain through learning. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) as a function of system state, input, and other relevant variables. The GP is updated based on experience collected during previous trials. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 3.0 km of travel by three significantly different robot platforms with masses ranging from 50 kg to 600 kg and at speeds ranging from 0.35 m/s to 1.2 m/s. Planned speeds are generated by a novel experience-based speed scheduler that balances overall travel time, path-tracking errors, and localization reliability. The results show that the controller can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.

@ARTICLE{ostafew-jfr16,
author = {Chris J. Ostafew and Jack Collier and Angela P. Schoellig and Timothy D. Barfoot},
title = {Learning-based nonlinear model predictive control to improve vision-based mobile robot path tracking},
year = {2016},
journal = {{Journal of Field Robotics}},
volume = {33},
number = {1},
pages = {133--152},
doi = {10.1002/rob.21587},
urlvideo={https://youtu.be/lxm-2A6yOY0?list=PLC12E387419CEAFF2},
urlvideo3={http://youtu.be/MwVElAn95-M?list=PLC0E5EB919968E507},
urlvideo4={http://youtu.be/Pu3_F6k6Fa4?list=PLC0E5EB919968E507},
abstract = {This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm to achieve high-performance path tracking in challenging off-road terrain through learning. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) as a function of system state, input, and other relevant variables. The GP is updated based on experience collected during previous trials. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 3.0 km of travel by three significantly different robot platforms with masses ranging from 50 kg to 600 kg and at speeds ranging from 0.35 m/s to 1.2 m/s. Planned speeds are generated by a novel experience-based speed scheduler that balances overall travel time, path-tracking errors, and localization reliability. The results show that the controller can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.}
}

Distributed iterative learning control for a team of quadrotors
A. Hock and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2016, pp. 4640–4646.

The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle has only access to the information of its neighbors. The desired trajectory is only available to one (or few) vehicles. We present a distributed iterative learning control (ILC) approach where each vehicle learns from the experience of its own and its neighbors’ previous task repetitions, and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove stability for any causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function that only depends on the tracking error derivative (D-type ILC). Our extension provides more degrees of freedom in the ILC design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows us to use an additional consensus feedback controller to compensate for non-repetitive disturbances. Experiments with two quadrotors attest the effectiveness of the proposed distributed multi-agent ILC approach. This is the first work to show distributed ILC in experiment.

@INPROCEEDINGS{hock-cdc16,
author = {Andreas Hock and Angela P. Schoellig},
title = {Distributed iterative learning control for a team of quadrotors},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2016},
pages = {4640--4646},
doi = {10.1109/CDC.2016.7798976},
urlvideo = {https://youtu.be/Qw598DRw6-Q},
urlvideo2 = {https://youtu.be/JppRu26eZgI},
urlslides = {../../wp-content/papercite-data/slides/hock-cdc16-slides.pdf},
abstract = {The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle has only access to the information of its neighbors. The desired trajectory is only available to one (or few) vehicles. We present a distributed iterative learning control (ILC) approach where each vehicle learns from the experience of its own and its neighbors’ previous task repetitions, and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove stability for any causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function that only depends on the tracking error derivative (D-type ILC). Our extension provides more degrees of freedom in the ILC design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows us to use an additional consensus feedback controller to compensate for non-repetitive disturbances. Experiments with two quadrotors attest the effectiveness of the proposed distributed multi-agent ILC approach. This is the first work to show distributed ILC in experiment.},
}

On the construction of safe controllable regions for affine systems with applications to robotics
M. K. Helwa and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2016, pp. 3000–3005.

This paper studies the problem of constructing in-block controllable (IBC) regions for affine systems. That is, we are concerned with constructing regions in the state space of affine systems such that all the states in the interior of the region are mutually accessible through the region’s interior by applying uniformly bounded inputs. We first show that existing results for checking in-block controllability on given polytopic regions cannot be easily extended to address the question of constructing IBC regions. We then explore the geometry of the problem to provide a computationally efficient algorithm for constructing IBC regions. We also prove the soundness of the algorithm. Finally, we use the proposed algorithm to construct safe speed profiles for fully-actuated robots and for ground robots modeled as unicycles with acceleration limits.

@INPROCEEDINGS{helwa-cdc16,
author = {Mohamed K. Helwa and Angela P. Schoellig},
title = {On the construction of safe controllable regions for affine systems with applications to robotics},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2016},
pages = {3000--3005},
doi = {10.1109/CDC.2016.7798717},
urlslides = {../../wp-content/papercite-data/slides/helwa-cdc16-slides.pdf},
urlvideo = {https://youtu.be/s_N7zTtCjd0},
abstract = {This paper studies the problem of constructing in-block controllable (IBC) regions for affine systems. That is, we are concerned with constructing regions in the state space of affine systems such that all the states in the interior of the region are mutually accessible through the region’s interior by applying uniformly bounded inputs. We first show that existing results for checking in-block controllability on given polytopic regions cannot be easily extended to address the question of constructing IBC regions. We then explore the geometry of the problem to provide a computationally efficient algorithm for constructing IBC regions. We also prove the soundness of the algorithm. Finally, we use the proposed algorithm to construct safe speed profiles for fully-actuated robots and for ground robots modeled as unicycles with acceleration limits.},
}

Safe learning of regions of attraction for uncertain, nonlinear systems with Gaussian processes
F. Berkenkamp, R. Moriconi, A. P. Schoellig, and A. Krause
in Proc. of the IEEE Conference on Decision and Control (CDC), 2016, pp. 4661–4666.

Control theory can provide useful insights into the properties of controlled, dynamic systems. One important property of nonlinear systems is the region of attraction (ROA), a safe subset of the state space in which a given controller renders an equilibrium point asymptotically stable. The ROA is typically estimated based on a model of the system. However, since models are only an approximation of the real world, the resulting estimated safe region can contain states outside the ROA of the real system. This is not acceptable in safety-critical applications. In this paper, we consider an approach that learns the ROA from experiments on a real system, without ever leaving the true ROA and, thus, without risking safety-critical failures. Based on regularity assumptions on the model errors in terms of a Gaussian process prior, we use an underlying Lyapunov function in order to determine a region in which an equilibrium point is asymptotically stable with high probability. Moreover, we provide an algorithm to actively and safely explore the state space in order to expand the ROA estimate. We demonstrate the effectiveness of this method in simulation.

@INPROCEEDINGS{berkenkamp-cdc16,
author = {Felix Berkenkamp and Riccardo Moriconi and Angela P. Schoellig and Andreas Krause},
title = {Safe learning of regions of attraction for uncertain, nonlinear systems with {G}aussian processes},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2016},
pages = {4661--4666},
doi = {10.1109/CDC.2016.7798979},
urlvideo = {https://youtu.be/bSv-pNOWn7c},
urlslides={../../wp-content/papercite-data/slides/berkenkamp-cdc16-slides.pdf},
urlcode = {https://github.com/befelix/lyapunov-learning},
urlcode2 = {http://berkenkamp.me/jupyter/lyapunov},
abstract = {Control theory can provide useful insights into the properties of controlled, dynamic systems. One important property of nonlinear systems is the region of attraction (ROA), a safe subset of the state space in which a given controller renders an equilibrium point asymptotically stable. The ROA is typically estimated based on a model of the system. However, since models are only an approximation of the real world, the resulting estimated safe region can contain states outside the ROA of the real system. This is not acceptable in safety-critical applications. In this paper, we consider an approach that learns the ROA from experiments on a real system, without ever leaving the true ROA and, thus, without risking safety-critical failures. Based on regularity assumptions on the model errors in terms of a Gaussian process prior, we use an underlying Lyapunov function in order to determine a region in which an equilibrium point is asymptotically stable with high probability. Moreover, we provide an algorithm to actively and safely explore the state space in order to expand the ROA estimate. We demonstrate the effectiveness of this method in simulation.}
}

A real-time analysis of rock fragmentation using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
in Proc. of the International Conference on Computer Applications in the Minerals Industries (CAMI), 2016.

Accurate measurement of blast-induced rock fragmentation is of great importance for many mining operations. The post-blast rock size distribution can significantly influence the efficiency of all the downstream mining and comminution processes. Image analysis methods are one of the most common methods used to measure rock fragment size distribution in mines regardless of criticism for lack of accuracy to measure fine particles and other perceived deficiencies. The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments can not only improve the quality of the image data but also automate the data collection process. Ultimately, real-time acquisition of high temporal- and spatial-resolution data based on UAV technology will provide a broad range of opportunities for both improving blast design without interrupting the production process and reducing the cost of the human operator.

@INPROCEEDINGS{bamford-cami16,
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title = {A real-time analysis of rock fragmentation using {UAV} technology},
booktitle = {{Proc. of the International Conference on Computer Applications in the Minerals Industries (CAMI)}},
year = {2016},
urlvideo = {https://youtu.be/q0syk6J_JHY},
urlslides={../../wp-content/papercite-data/slides/bamford-cami16-slides.pdf},
abstract = {Accurate measurement of blast-induced rock fragmentation is of great importance for many mining operations. The post-blast rock size distribution can significantly influence the efficiency of all the downstream mining and comminution processes. Image analysis methods are one of the most common methods used to measure rock fragment size distribution in mines regardless of criticism for lack of accuracy to measure fine particles and other perceived deficiencies. The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments can not only improve the quality of the image data but also automate the data collection process. Ultimately, real-time acquisition of high temporal- and spatial-resolution data based on UAV technology will provide a broad range of opportunities for both improving blast design without interrupting the production process and reducing the cost of the human operator.},
}

Unscented external force estimation for quadrotors and experiments
C. D. McKinnon and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016, pp. 5651–5657.

In this paper, we describe an algorithm, based on the well-known Unscented Quaternion Estimator, to estimate external forces and torques acting on a quadrotor. This formulation uses a non-linear model for the quadrotor dynamics, naturally incorporates process and measurement noise, requires only a few parameters to be tuned manually, and uses singularity-free unit quaternions to represent attitude. We demonstrate in simulation that the proposed algorithm can outperform existing methods. We then highlight how our approach can be used to generate force and torque profiles from experimental data, and how this information can later be used for controller design. Finally, we show how the resulting controllers enable a quadrotor to stay in the wind field of a moving fan.

@INPROCEEDINGS{mckinnon-iros16,
author = {Christopher D. McKinnon and Angela P. Schoellig},
title = {Unscented external force estimation for quadrotors and experiments},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year = {2016},
pages = {5651--5657},
doi = {10.1109/IROS.2016.7759831},
urlvideo = {https://youtu.be/YFA3kHabY38},
abstract = {In this paper, we describe an algorithm, based on the well-known Unscented Quaternion Estimator, to estimate external forces and torques acting on a quadrotor. This formulation uses a non-linear model for the quadrotor dynamics, naturally incorporates process and measurement noise, requires only a few parameters to be tuned manually, and uses singularity-free unit quaternions to represent attitude. We demonstrate in simulation that the proposed algorithm can outperform existing methods. We then highlight how our approach can be used to generate force and torque profiles from experimental data, and how this information can later be used for controller design. Finally, we show how the resulting controllers enable a quadrotor to stay in the wind field of a moving fan.},
}

Safe and robust quadrotor maneuvers based on reach control
M. Vukosavljev, I. Jansen, M. E. Broucke, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 5677–5682.

In this paper, we investigate the synthesis of piecewise affine feedback controllers to execute safe and robust quadrocopter maneuvers. The methodology is based on formulating the problem as a reach control problem on a polytopic state space. Reach control has so far only been developed in theory and has not been tested experimentally in a real system before. We demonstrate that these theoretical tools can achieve aggressive, albeit safe and robust, quadrocopter maneuvers without the need for a predefined open-loop reference trajectory. In a proof-of-concept demonstration, the reach controller is implemented in one translational direction while the other degrees of freedom are stabilized by separate controllers. Experimental results on a quadrocopter show the effectiveness and robustness of this control approach.

@INPROCEEDINGS{vukosavljev-icra16,
author = {Marijan Vukosavljev and Ivo Jansen and Mireille E. Broucke and Angela P. Schoellig},
title = {Safe and robust quadrotor maneuvers based on reach control},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2016},
pages = {5677--5682},
doi = {10.1109/ICRA.2016.7487789},
urlvideo={https://youtu.be/l4vdxdmd2xc},
urlslides={../../wp-content/papercite-data/slides/vukosavljev-icra16-slides.pdf},
urlslides2={../../wp-content/papercite-data/slides/vukosavljev-icra16-slides2.pdf},
abstract = {In this paper, we investigate the synthesis of piecewise affine feedback controllers to execute safe and robust quadrocopter maneuvers. The methodology is based on formulating the problem as a reach control problem on a polytopic state space. Reach control has so far only been developed in theory and has not been tested experimentally in a real system before. We demonstrate that these theoretical tools can achieve aggressive, albeit safe and robust, quadrocopter maneuvers without the need for a predefined open-loop reference trajectory. In a proof-of-concept demonstration, the reach controller is implemented in one translational direction while the other degrees of freedom are stabilized by separate controllers. Experimental results on a quadrocopter show the effectiveness and robustness of this control approach.}
}

Safe controller optimization for quadrotors with Gaussian processes
F. Berkenkamp, A. P. Schoellig, and A. Krause
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2016, p. 491–496.

One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to obtain an initial controller, but ultimately the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning step, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters on the real system, safety-critical system failures may happen. In this paper, we overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing safety. It models the underlying performance measure as a Gaussian process and only explores new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.

@INPROCEEDINGS{berkenkamp-icra16,
author = {Felix Berkenkamp and Angela P. Schoellig and Andreas Krause},
title = {Safe controller optimization for quadrotors with {G}aussian processes},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2016},
month = {May},
pages = {491--496},
doi = {10.1109/ICRA.2016.7487170},
urlcode = {https://github.com/befelix/SafeOpt},
abstract = {One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to obtain an initial controller, but ultimately the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning step, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters on the real system, safety-critical system failures may happen. In this paper, we overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing safety. It models the underlying performance measure as a Gaussian process and only explores new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.},
}

A preliminary study of transfer learning between unicycle robots
K. V. Raimalwala, B. A. Francis, and A. P. Schoellig
in Proc. of the AAAI Spring Symposium Series, 2016, p. 53–59.

Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. The goal of this work is to understand in which cases a simple, alignment-based transfer of data is beneficial. A scalar, linear, time-invariant (LTI) transformation is applied to the output from a source system to align with the output from a target system. In a theoretic study, we have already shown that for linear, single-input, single-output systems, the upper bound of the transformation error depends on the dynamic properties of the source and target system, and is small for systems with similar response times. We now consider two nonlinear, unicycle robots. Based on our previous work, we derive analytic error bounds for the linearized robot models. We then provide simulations of the nonlinear robot models and experiments with a Pioneer 3-AT robot that confirm the theoretical findings. As a result, key characteristics of alignment-based transfer learning observed in our theoretic study prove to be also true for real, nonlinear unicycle robots.

@INPROCEEDINGS{raimalwala-aaai16,
author = {Kaizad V. Raimalwala and Bruce A. Francis and Angela P. Schoellig},
title = {A preliminary study of transfer learning between unicycle robots},
booktitle = {{Proc. of the AAAI Spring Symposium Series}},
year = {2016},
pages = {53--59},
abstract = {Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. The goal of this work is to understand in which cases a simple, alignment-based transfer of data is beneficial. A scalar, linear, time-invariant (LTI) transformation is applied to the output from a source system to align with the output from a target system. In a theoretic study, we have already shown that for linear, single-input, single-output systems, the upper bound of the transformation error depends on the dynamic properties of the source and target system, and is small for systems with similar response times. We now consider two nonlinear, unicycle robots. Based on our previous work, we derive analytic error bounds for the linearized robot models. We then provide simulations of the nonlinear robot models and experiments with a Pioneer 3-AT robot that confirm the theoretical findings. As a result, key characteristics of alignment-based transfer learning observed in our theoretic study prove to be also true for real, nonlinear unicycle robots.},
}

Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics
F. Berkenkamp, A. Krause, and A. P. Schoellig
Technical Report, arXiv, 2016.

Robotic algorithms typically depend on various parameters, the choice of which significantly affects the robot’s performance. While an initial guess for the parameters may be obtained from dynamic models of the robot, parameters are usually tuned manually on the real system to achieve the best performance. Optimization algorithms, such as Bayesian optimization, have been used to automate this process. However, these methods may evaluate unsafe parameters during the optimization process that lead to safety-critical system failures. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is often not desirable in robotics. For example, high-gain controllers might achieve low average tracking error (performance), but can overshoot and violate input constraints. In this paper, we present a generalized algorithm that allows for multiple safety constraints separate from the objective. Given an initial set of safe parameters, the algorithm maximizes performance but only evaluates parameters that satisfy safety for all constraints with high probability. To this end, it carefully explores the parameter space by exploiting regularity assumptions in terms of a Gaussian process prior. Moreover, we show how context variables can be used to safely transfer knowledge to new situations and tasks. We provide a theoretical analysis and demonstrate that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters in experiments on a quadrotor vehicle.

@TECHREPORT{berkenkamp-tr16,
title = {Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics},
institution = {arXiv},
author = {Berkenkamp, Felix and Krause, Andreas and Schoellig, Angela P.},
year = {2016},
urlvideo = {https://youtu.be/GiqNQdzc5TI},
urlcode = {https://github.com/befelix/SafeOpt},
abstract = {Robotic algorithms typically depend on various parameters, the choice of which significantly affects the robot's performance. While an initial guess for the parameters may be obtained from dynamic models of the robot, parameters are usually tuned manually on the real system to achieve the best performance. Optimization algorithms, such as Bayesian optimization, have been used to automate this process. However, these methods may evaluate unsafe parameters during the optimization process that lead to safety-critical system failures. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is often not desirable in robotics. For example, high-gain controllers might achieve low average tracking error (performance), but can overshoot and violate input constraints. In this paper, we present a generalized algorithm that allows for multiple safety constraints separate from the objective. Given an initial set of safe parameters, the algorithm maximizes performance but only evaluates parameters that satisfy safety for all constraints with high probability. To this end, it carefully explores the parameter space by exploiting regularity assumptions in terms of a Gaussian process prior. Moreover, we show how context variables can be used to safely transfer knowledge to new situations and tasks. We provide a theoretical analysis and demonstrate that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters in experiments on a quadrotor vehicle.}
}

Rock fragmentation analysis using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
Professional Magazine Article, Ontario Professional Surveyor (OPS) Magazine, Assn. of Ontario Land Surveyors, 2016.

@MISC{bamford-ops16,
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title = {Rock fragmentation analysis using {UAV} technology},
year = {2016},
volume = {59},
number = {4},
pages = {14--16},
howpublished = {Professional Magazine Article, Ontario Professional Surveyor (OPS) Magazine, Assn. of Ontario Land Surveyors},
}

Quantifying the value of drone-delivered AEDs in cardiac arrest response
J. J. Boutilier, S. C. Brooks, A. Janmohamed, A. Byers, C. Zhan, J. E. Buick, A. P. Schoellig, L. J. Morrison, S. Cheskes, and T. C. Y. Chan
Abstract and Oral Presentation, in American Heart Association (AHA) Resuscitation Science Symposium, 2016.

@MISC{boutilier-aha16,
author = {J. J. Boutilier and S. C. Brooks and A. Janmohamed and A. Byers and C. Zhan and J. E. Buick and A. P. Schoellig and L. J. Morrison and S. Cheskes and T. C. Y. Chan},
title = {Quantifying the value of drone-delivered {AEDs} in cardiac arrest response},
howpublished = {Abstract and Oral Presentation, in American Heart Association (AHA) Resuscitation Science Symposium},
year = {2016},
}

Safe automatic controller tuning for quadrotors
F. Berkenkamp, A. Krause, and A. P. Schoellig
Video Submission, Assn. of the Advancement of Artificial Intelligence (AAAI) AI Video Competition, 2016.

@MISC{berkenkamp-aaai16,
author = {Felix Berkenkamp and Andreas Krause and Angela P. Schoellig},
title = {Safe automatic controller tuning for quadrotors},
howpublished = {Video Submission, Assn. of the Advancement of Artificial Intelligence (AAAI) AI Video Competition},
urlvideo = {https://youtu.be/7ZkZlxXHgTY?list=PLuOoXrWK6Kz5ySULxGMtAUdZEg9SkXDoq},
year = {2016},
}

Data-driven interaction for quadrotors based on external forces
C. McKinnon and A. P. Schoellig
Video Submission, Assn. of the Advancement of Artificial Intelligence (AAAI) AI Video Competition, 2016.

@MISC{mckinnon-aaai16,
author = {Chris McKinnon and Angela P. Schoellig},
title = {Data-driven interaction for quadrotors based on external forces},
howpublished = {Video Submission, Assn. of the Advancement of Artificial Intelligence (AAAI) AI Video Competition},
urlvideo = {https://youtu.be/x0RL7Jh6F9s?list=PLuOoXrWK6Kz5ySULxGMtAUdZEg9SkXDoq},
year = {2016},
}

## 2015

An upper bound on the error of alignment-based transfer learning between two linear, time-invariant, scalar systems
K. V. Raimalwala, B. A. Francis, and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015, p. 5253–5258.

Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. This paper studies a simplified TL scenario with the goal of understanding in which cases a simple, alignment-based transfer of data is possible and beneficial. Two linear, time-invariant (LTI), single-input, single-output systems are tasked to follow the same reference signal. A scalar, LTI transformation is applied to the output from a source system to align with the output from a target system. An upper bound on the 2-norm of the transformation error is derived for a large set of reference signals and is minimized with respect to the transformation scalar. Analysis shows that the minimized error bound is reduced for systems with poles that lie close to each other (that is, for systems with similar response times). This criterion is relaxed for systems with poles that have a larger negative real part (that is, for stable systems with fast response), meaning that poles can be further apart for the same minimized error bound. Additionally, numerical results show that using the reference signal as input to the transformation reduces the minimized bound further.

@INPROCEEDINGS{raimalwala-iros15,
author = {Kaizad V. Raimalwala and Bruce A. Francis and Angela P. Schoellig},
title = {An upper bound on the error of alignment-based transfer learning between two linear, time-invariant, scalar systems},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {5253--5258},
year = {2015},
doi = {10.1109/IROS.2015.7354118},
note = {},
abstract = {Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. This paper studies a simplified TL scenario with the goal of understanding in which cases a simple, alignment-based transfer of data is possible and beneficial. Two linear, time-invariant (LTI), single-input, single-output systems are tasked to follow the same reference signal. A scalar, LTI transformation is applied to the output from a source system to align with the output from a target system. An upper bound on the 2-norm of the transformation error is derived for a large set of reference signals and is minimized with respect to the transformation scalar. Analysis shows that the minimized error bound is reduced for systems with poles that lie close to each other (that is, for systems with similar response times). This criterion is relaxed for systems with poles that have a larger negative real part (that is, for stable systems with fast response), meaning that poles can be further apart for the same minimized error bound. Additionally, numerical results show that using the reference signal as input to the transformation reduces the minimized bound further.}
}

Safe and robust learning control with Gaussian processes
F. Berkenkamp and A. P. Schoellig
in Proc. of the European Control Conference (ECC), 2015, p. 2501–2506.

This paper introduces a learning-based robust control algorithm that provides robust stability and performance guarantees during learning. The approach uses Gaussian process (GP) regression based on data gathered during operation to update an initial model of the system and to gradually decrease the uncertainty related to this model. Embedding this data-based update scheme in a robust control framework guarantees stability during the learning process. Traditional robust control approaches have not considered online adaptation of the model and its uncertainty before. As a result, their controllers do not improve performance during operation. Typical machine learning algorithms that have achieved similar high-performance behavior by adapting the model and controller online do not provide the guarantees presented in this paper. In particular, this paper considers a stabilization task, linearizes the nonlinear, GP-based model around a desired operating point, and solves a convex optimization problem to obtain a linear robust controller. The resulting performance improvements due to the learning-based controller are demonstrated in experiments on a quadrotor vehicle.

@INPROCEEDINGS{berkenkamp-ecc15,
author = {Felix Berkenkamp and Angela P. Schoellig},
title = {Safe and robust learning control with {G}aussian processes},
booktitle = {{Proc. of the European Control Conference (ECC)}},
pages = {2501--2506},
year = {2015},
doi = {10.1109/ECC.2015.7330913},
urlvideo={https://youtu.be/YqhLnCm0KXY?list=PLC12E387419CEAFF2},
urlslides={../../wp-content/papercite-data/slides/berkenkamp-ecc15-slides.pdf},
abstract = {This paper introduces a learning-based robust control algorithm that provides robust stability and performance guarantees during learning. The approach uses Gaussian process (GP) regression based on data gathered during operation to update an initial model of the system and to gradually decrease the uncertainty related to this model. Embedding this data-based update scheme in a robust control framework guarantees stability during the learning process. Traditional robust control approaches have not considered online adaptation of the model and its uncertainty before. As a result, their controllers do not improve performance during operation. Typical machine learning algorithms that have achieved similar high-performance behavior by adapting the model and controller online do not provide the guarantees presented in this paper. In particular, this paper considers a stabilization task, linearizes the nonlinear, GP-based model around a desired operating point, and solves a convex optimization problem to obtain a linear robust controller. The resulting performance improvements due to the learning-based controller are demonstrated in experiments on a quadrotor vehicle.}
}

Conservative to confident: treating uncertainty robustly within learning-based control
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2015, p. 421–427.

Robust control maintains stability and performance for a fixed amount of model uncertainty but can be conservative since the model is not updated online. Learning-based control, on the other hand, uses data to improve the model over time but is not typically guaranteed to be robust throughout the process. This paper proposes a novel combination of both ideas: a robust Min-Max Learning-Based Nonlinear Model Predictive Control (MM-LB-NMPC) algorithm. Based on an existing LB-NMPC algorithm, we present an efficient and robust extension, altering the NMPC performance objective to optimize for the worst-case scenario. The algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous trials as a function of system state, input, and other relevant variables. Nominal state sequences are predicted using an Unscented Transform and worst-case scenarios are defined as sequences bounding the 3σ confidence region. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results from testing on a 50 kg skid-steered robot executing a path-tracking task. The results show reductions in maximum lateral and heading path-tracking errors by up to 30% and a clear transition from robust control when the model uncertainty is high to optimal control when model uncertainty is reduced.

@INPROCEEDINGS{ostafew-icra15,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Conservative to confident: treating uncertainty robustly within learning-based control},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {421--427},
year = {2015},
doi = {10.1109/ICRA.2015.7139033},
note = {},
abstract = {Robust control maintains stability and performance for a fixed amount of model uncertainty but can be conservative since the model is not updated online. Learning-based control, on the other hand, uses data to improve the model over time but is not typically guaranteed to be robust throughout the process. This paper proposes a novel combination of both ideas: a robust Min-Max Learning-Based Nonlinear Model Predictive Control (MM-LB-NMPC) algorithm. Based on an existing LB-NMPC algorithm, we present an efficient and robust extension, altering the NMPC performance objective to optimize for the worst-case scenario. The algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous trials as a function of system state, input, and other relevant variables. Nominal state sequences are predicted using an Unscented Transform and worst-case scenarios are defined as sequences bounding the 3σ confidence region. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results from testing on a 50 kg skid-steered robot executing a path-tracking task. The results show reductions in maximum lateral and heading path-tracking errors by up to 30% and a clear transition from robust control when the model uncertainty is high to optimal control when model uncertainty is reduced.}
}

A flying drum machine
X. Wang, N. Dalal, T. Laidlow, and A. P. Schoellig
Technical Report, 2015.

This paper proposes the use of a quadrotor aerial vehicle as a musical instrument. Using the idea of interactions based on physical contact, a system is developed that enables humans to engage in artistic expression with a flying robot and produce music. A robotic user interface that uses physical interactions was created for a quadcopter. The interactive quadcopter was then programmed to drive playback of drum sounds in real time in response to physical interaction. An intuitive mapping was developed between machine movement and art/creative composition. Challenges arose in meeting real-time latency requirements mainly due to delays in input detection. They were overcome through the development of a quick input detection method, which relies on accurate yet fast digital filtering. Successful experiments were conducted with a professional musician who used the interface to compose complex rhythm patterns. A video accompanying this paper demonstrates his performance.

@TECHREPORT{wang-tr15,
author = {Xingbo Wang and Natasha Dalal and Tristan Laidlow and Angela P. Schoellig},
title = {A Flying Drum Machine},
year = {2015},
urlvideo={https://youtu.be/d5zG-BWB7lE?list=PLD6AAACCBFFE64AC5},
abstract = {This paper proposes the use of a quadrotor aerial vehicle as a musical instrument. Using the idea of interactions based on physical contact, a system is developed that enables humans to engage in artistic expression with a flying robot and produce music. A robotic user interface that uses physical interactions was created for a quadcopter. The interactive quadcopter was then programmed to drive playback of drum sounds in real time in response to physical interaction. An intuitive mapping was developed between machine movement and art/creative composition. Challenges arose in meeting real-time latency requirements mainly due to delays in input detection. They were overcome through the development of a quick input detection method, which relies on accurate yet fast digital filtering. Successful experiments were conducted with a professional musician who used the interface to compose complex rhythm patterns. A video accompanying this paper demonstrates his performance.}
}

Safe controller optimization for quadrotors with Gaussian processes
F. Berkenkamp, A. P. Schoellig, and A. Krause
Abstract and Presentation in Proc. of the Second Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015.

One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to design an initial controller, but ultimately, the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters, safety-critical system failures may happen. We overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SAFEOPT, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SAFEOPT automatically optimizes the parameters of a control law while guaranteeing system safety and stability. It achieves this by modeling the underlying performance measure as a Gaussian process and only exploring new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.
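
The safe-exploration rule described in this abstract can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the SAFEOPT algorithm itself: the Gaussian process posterior is assumed to be already available as plain lists of means and standard deviations, the confidence scaling `beta` and all function names are hypothetical, and the next candidate is simply the most uncertain safe parameter.

```python
def safe_candidates(means, stds, threshold, beta=2.0):
    """Indices whose lower confidence bound stays above the safety threshold."""
    return [i for i, (m, s) in enumerate(zip(means, stds))
            if m - beta * s >= threshold]


def next_parameter(means, stds, threshold, beta=2.0):
    """Pick the most uncertain candidate among those believed to be safe."""
    safe = safe_candidates(means, stds, threshold, beta)
    if not safe:
        return None  # no parameter can be evaluated safely
    return max(safe, key=lambda i: stds[i])


# Hypothetical posterior over four candidate controller parameters.
means = [1.0, 0.9, 0.5, 0.2]
stds = [0.05, 0.2, 0.3, 0.1]
choice = next_parameter(means, stds, threshold=0.4)
```

With these illustrative numbers, only the first two candidates have a lower bound above the threshold, so exploration is restricted to them even though the third candidate is the most uncertain overall.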

@MISC{berkenkamp-iros15,
author = {Felix Berkenkamp and Angela P. Schoellig and Andreas Krause},
title = {Safe controller optimization for quadrotors with {G}aussian processes},
howpublished = {Abstract and Presentation in Proc. of the Second Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year = {2015},
abstract = {One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to design an initial controller, but ultimately, the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters, safety-critical system failures may happen. We overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SAFEOPT, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SAFEOPT automatically optimizes the parameters of a control law while guaranteeing system safety and stability. It achieves this by modeling the underlying performance measure as a Gaussian process and only exploring new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.},
}

## 2014

Application-driven design of aerial communication networks
T. Andre, K. A. Hummel, A. P. Schoellig, E. Yanmaz, M. Asadpour, C. Bettstetter, P. Grippa, H. Hellwagner, S. Sand, and S. Zhang
IEEE Communications Magazine, vol. 52, iss. 5, pp. 129-137, 2014. Authors 1 to 4 contributed equally.

Networks of micro aerial vehicles (MAVs) equipped with various sensors are increasingly used for civil applications, such as monitoring, surveillance, and disaster management. In this article, we discuss the communication requirements raised by applications in MAV networks. We propose a novel system representation that can be used to specify different application demands. To this end, we extract key functionalities expected in an MAV network. We map these functionalities into building blocks to characterize the expected communication needs. Based on insights from our own and related real-world experiments, we discuss the capabilities of existing communications technologies and their limitations to implement the proposed building blocks. Our findings indicate that while certain requirements of MAV applications are met with available technologies, further research and development is needed to address the scalability, heterogeneity, safety, quality of service, and security aspects of multi-MAV systems.

@ARTICLE{andre-com14,
author = {Torsten Andre and Karin A. Hummel and Angela P. Schoellig and Evsen Yanmaz and Mahdi Asadpour and Christian Bettstetter and Pasquale Grippa and Hermann Hellwagner and Stephan Sand and Siwei Zhang},
title = {Application-driven design of aerial communication networks},
journal = {{IEEE Communications Magazine}},
note={Authors 1 to 4 contributed equally},
volume = {52},
number = {5},
pages = {129--137},
year = {2014},
doi = {10.1109/MCOM.2014.6815903},
abstract = {Networks of micro aerial vehicles (MAVs) equipped with various sensors are increasingly used for civil applications, such as monitoring, surveillance, and disaster management. In this article, we discuss the communication requirements raised by applications in MAV networks. We propose a novel system representation that can be used to specify different application demands. To this end, we extract key functionalities expected in an MAV network. We map these functionalities into building blocks to characterize the expected communication needs. Based on insights from our own and related real-world experiments, we discuss the capabilities of existing communications technologies and their limitations to implement the proposed building blocks. Our findings indicate that while certain requirements of MAV applications are met with available technologies, further research and development is needed to address the scalability, heterogeneity, safety, quality of service, and security aspects of multi-MAV systems.}
}

A platform for aerial robotics research and demonstration: The Flying Machine Arena
S. Lupashin, M. Hehn, M. W. Mueller, A. P. Schoellig, and R. D’Andrea
Mechatronics, vol. 24, iss. 1, pp. 41-54, 2014.

The Flying Machine Arena is a platform for experiments and demonstrations with fleets of small flying vehicles. It utilizes a distributed, modular architecture linked by robust communication layers. An estimation and control framework along with built-in system protection components enable prototyping of new control systems concepts and implementation of novel demonstrations. More recently, a mobile version has been featured at several eminent public events. We describe the architecture of the Arena from the viewpoint of system robustness and its capability as a dual-purpose research and demonstration platform.

@ARTICLE{lupashin-mech14,
author = {Sergei Lupashin and Markus Hehn and Mark W. Mueller and Angela P. Schoellig and Raffaello D'Andrea},
title = {A platform for aerial robotics research and demonstration: {The Flying Machine Arena}},
journal = {{Mechatronics}},
volume = {24},
number = {1},
pages = {41--54},
year = {2014},
doi = {10.1016/j.mechatronics.2013.11.006},
urlvideo={https://youtu.be/pcgvWhu8Arc?list=PLuLKX4lDsLIaVjdGsZxNBKLcogBnVVFQr},
abstract = {The Flying Machine Arena is a platform for experiments and demonstrations with fleets of small flying vehicles. It utilizes a distributed, modular architecture linked by robust communication layers. An estimation and control framework along with built-in system protection components enable prototyping of new control systems concepts and implementation of novel demonstrations. More recently, a mobile version has been featured at several eminent public events. We describe the architecture of the Arena from the viewpoint of system robustness and its capability as a dual-purpose research and demonstration platform.}
}

So you think you can dance? Rhythmic flight performances with quadrocopters
A. P. Schoellig, H. Siegel, F. Augugliaro, and R. D’Andrea
in Controls and Art, A. LaViers and M. Egerstedt, Eds., Springer International Publishing, 2014, pp. 73-105.

This chapter presents a set of algorithms that enable quadrotor vehicles to “fly with the music”; that is, to perform rhythmic motions that are aligned with the beat of a given music piece.

@INCOLLECTION{schoellig-springer14,
author = {Angela P. Schoellig and Hallie Siegel and Federico Augugliaro and Raffaello D'Andrea},
title = {So you think you can dance? {Rhythmic} flight performances with quadrocopters},
booktitle = {{Controls and Art}},
editor = {Amy LaViers and Magnus Egerstedt},
publisher = {Springer International Publishing},
pages = {73--105},
year = {2014},
doi = {10.1007/978-3-319-03904-6_4},
urldata={../../wp-content/papercite-data/data/schoellig-springer14-files.zip},
urlslides={../../wp-content/papercite-data/slides/schoellig-springer14-slides.pdf},
abstract = {This chapter presents a set of algorithms that enable quadrotor vehicles to "fly with the music"; that is, to perform rhythmic motions that are aligned with the beat of a given music piece.}
}

Learning-based robust control: guaranteeing stability while improving performance
F. Berkenkamp and A. P. Schoellig
in Proc. of the Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2014.

To control dynamic systems, modern control theory relies on accurate mathematical models that describe the system behavior. Machine learning methods have proven effective at compensating for initial errors in these models and at achieving high-performance maneuvers by adapting the system model and control online. However, these methods usually do not guarantee stability during the learning process. On the other hand, the control community has traditionally accounted for model uncertainties by designing robust controllers. Robust controllers use a mathematical description of the uncertainty in the dynamic system derived prior to operation and guarantee robust stability for all uncertainties. Unlike machine learning methods, robust control does not improve the control performance by adapting the model online. This paper combines machine learning and robust control theory for the first time with the goal of improving control performance while guaranteeing stability. Data gathered during operation is used to reduce the uncertainty in the model and to learn systematic errors. Specifically, a nonlinear, nonparametric model of the unknown dynamics is learned with a Gaussian Process. This model is used for the computation of a linear robust controller, which guarantees stability around an operating point for all uncertainties. As a result, the robust controller improves its performance online while guaranteeing robust stability. A simulation example illustrates the performance improvements due to the learning-based robust controller.

@INPROCEEDINGS{berkenkamp-iros14,
author = {Felix Berkenkamp and Angela P. Schoellig},
title = {Learning-based robust control: guaranteeing stability while improving performance},
booktitle = {{Proc. of the Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year = {2014},
urlvideo={https://youtu.be/YqhLnCm0KXY?list=PLC12E387419CEAFF2},
urlslides={../../wp-content/papercite-data/slides/berkenkamp-iros14-slides.pdf},
abstract = {To control dynamic systems, modern control theory relies on accurate mathematical models that describe the system behavior. Machine learning methods have proven effective at compensating for initial errors in these models and at achieving high-performance maneuvers by adapting the system model and control online. However, these methods usually do not guarantee stability during the learning process. On the other hand, the control community has traditionally accounted for model uncertainties by designing robust controllers. Robust controllers use a mathematical description of the uncertainty in the dynamic system derived prior to operation and guarantee robust stability for all uncertainties. Unlike machine learning methods, robust control does not improve the control performance by adapting the model online. This paper combines machine learning and robust control theory for the first time with the goal of improving control performance while guaranteeing stability. Data gathered during operation is used to reduce the uncertainty in the model and to learn systematic errors. Specifically, a nonlinear, nonparametric model of the unknown dynamics is learned with a Gaussian Process. This model is used for the computation of a linear robust controller, which guarantees stability around an operating point for all uncertainties. As a result, the robust controller improves its performance online while guaranteeing robust stability. A simulation example illustrates the performance improvements due to the learning-based robust controller.}
}

Design of norm-optimal iterative learning controllers: the effect of an iteration-domain Kalman filter for disturbance estimation
N. Degen and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2014, pp. 3590-3596.

Iterative learning control (ILC) has proven to be an effective method for improving the performance of repetitive control tasks. This paper revisits two optimization-based ILC algorithms: (i) the widely used quadratic-criterion ILC law (QILC) and (ii) an estimation-based ILC law using an iteration-domain Kalman filter (K-ILC). The goal of this paper is to analytically compare both algorithms and to highlight the advantages of the Kalman-filter-enhanced algorithm. We first show that for an iteration-constant estimation gain and an appropriate choice of learning parameters both algorithms are identical. We then show that the estimation-enhanced algorithm with its iteration-varying optimal Kalman gains can achieve both fast initial convergence and good noise rejection by (optimally) adapting the learning update rule over the course of an experiment. We conclude that the clear separation of disturbance estimation and input update of the K-ILC algorithm provides an intuitive architecture to design learning schemes that achieve both low noise sensitivity and fast convergence. To benchmark the algorithms we use a simulation of a single-input, single-output mass-spring-damper system.
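
As a rough illustration of the separation between disturbance estimation and input update summarized above, the following minimal Python sketch runs an iteration-domain Kalman filter on a scalar toy problem. It is not the paper's implementation: the mass-spring-damper benchmark is replaced by a hypothetical static plant `y = g*u + d + noise`, the desired output is assumed to be zero, and the gain `g`, the noise variances, and all function names are assumptions chosen to keep the example short.

```python
import random


def kalman_gains(p0, process_var, meas_var, iterations):
    """Scalar iteration-domain Kalman gains for a random-walk disturbance."""
    gains = []
    p = p0
    for _ in range(iterations):
        p_pred = p + process_var          # predict: d_j = d_{j-1} + w
        k = p_pred / (p_pred + meas_var)  # optimal gain for z_j = d_j + v
        p = (1.0 - k) * p_pred            # posterior variance
        gains.append(k)
    return gains


def run_kilc(d_true, g=2.0, noise_std=0.1, iterations=50, seed=0):
    """Estimate a repeating disturbance and cancel it with the next input."""
    rng = random.Random(seed)
    gains = kalman_gains(p0=1.0, process_var=0.0,
                         meas_var=noise_std ** 2, iterations=iterations)
    d_hat, u = 0.0, 0.0
    errors = []
    for k in gains:
        # One trial of the toy plant; the desired output is zero.
        y = g * u + d_true + rng.gauss(0.0, noise_std)
        errors.append(abs(y))
        z = y - g * u             # disturbance measurement
        d_hat += k * (z - d_hat)  # estimation step (iteration-varying gain)
        u = -d_hat / g            # input update step: minimizes |g*u + d_hat|
    return d_hat, errors, gains


d_hat, errors, gains = run_kilc(d_true=0.5)
```

In this toy setting the gain starts close to one, so the first trials correct most of the error, and then decays from trial to trial, so later measurements are averaged and noise has less influence on the input.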

@INPROCEEDINGS{degen-cdc14,
author = {Nicolas Degen and Angela P. Schoellig},
title = {Design of norm-optimal iterative learning controllers: the effect of an iteration-domain {K}alman filter for disturbance estimation},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
pages = {3590--3596},
year = {2014},
doi = {10.1109/CDC.2014.7039947},
urlslides = {../../wp-content/papercite-data/slides/degen-cdc14-slides.pdf},
abstract = {Iterative learning control (ILC) has proven to be an effective method for improving the performance of repetitive control tasks. This paper revisits two optimization-based ILC algorithms: (i) the widely used quadratic-criterion ILC law (QILC) and (ii) an estimation-based ILC law using an iteration-domain Kalman filter (K-ILC). The goal of this paper is to analytically compare both algorithms and to highlight the advantages of the Kalman-filter-enhanced algorithm. We first show that for an iteration-constant estimation gain and an appropriate choice of learning parameters both algorithms are identical. We then show that the estimation-enhanced algorithm with its iteration-varying optimal Kalman gains can achieve both fast initial convergence and good noise rejection by (optimally) adapting the learning update rule over the course of an experiment. We conclude that the clear separation of disturbance estimation and input update of the K-ILC algorithm provides an intuitive architecture to design learning schemes that achieve both low noise sensitivity and fast convergence. To benchmark the algorithms we use a simulation of a single-input, single-output mass-spring-damper system.}
}

Learning-based nonlinear model predictive control to improve vision-based mobile robot path-tracking in challenging outdoor environments
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2014, pp. 4029-4036.

This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm for an autonomous mobile robot to reduce path-tracking errors over repeated traverses along a reference path. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous traversals as a function of system state, input and other relevant variables. Modelling the disturbance as a GP enables interpolation and extrapolation of learned disturbances, a key feature of this algorithm. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 1.8 km of travel by a four-wheeled, 50 kg robot travelling through challenging terrain (including steep, uneven hills) and by a six-wheeled, 160 kg robot learning disturbances caused by unmodelled dynamics at speeds ranging from 0.35 m/s to 1.0 m/s. The speed is scheduled to balance trial time, path-tracking errors, and localization reliability based on previous experience. The results show that the system can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.

@INPROCEEDINGS{ostafew-icra14,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Learning-based nonlinear model predictive control to improve vision-based mobile robot path-tracking in challenging outdoor environments},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {4029--4036},
year = {2014},
doi = {10.1109/ICRA.2014.6907444},
urlvideo = {https://youtu.be/MwVElAn95-M?list=PLC12E387419CEAFF2},
abstract = {This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm for an autonomous mobile robot to reduce path-tracking errors over repeated traverses along a reference path. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous traversals as a function of system state, input and other relevant variables. Modelling the disturbance as a GP enables interpolation and extrapolation of learned disturbances, a key feature of this algorithm. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 1.8 km of travel by a four-wheeled, 50 kg robot travelling through challenging terrain (including steep, uneven hills) and by a six-wheeled, 160 kg robot learning disturbances caused by unmodelled dynamics at speeds ranging from 0.35 m/s to 1.0 m/s. The speed is scheduled to balance trial time, path-tracking errors, and localization reliability based on previous experience. The results show that the system can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.}
}

Speed daemon: experience-based mobile robot speed scheduling
C. J. Ostafew, A. P. Schoellig, T. D. Barfoot, and J. Collier
in Proc. of the International Conference on Computer and Robot Vision (CRV), 2014, pp. 56-62. Best Robotics Paper Award.

A time-optimal speed schedule results in a mobile robot driving along a planned path at or near the limits of the robot’s capability. However, deriving models to predict the effect of increased speed can be very difficult. In this paper, we present a speed scheduler that uses previous experience, instead of complex models, to generate time-optimal speed schedules. The algorithm is designed for a vision-based, path-repeating mobile robot and uses experience to ensure reliable localization, low path-tracking errors, and realizable control inputs while maximizing the speed along the path. To our knowledge, this is the first speed scheduler to incorporate experience from previous path traversals in order to address system constraints. The proposed speed scheduler was tested in over 4 km of path traversals in outdoor terrain using a large Ackermann-steered robot travelling between 0.5 m/s and 2.0 m/s. The approach to speed scheduling is shown to generate fast speed schedules while remaining within the limits of the robot’s capability.

@INPROCEEDINGS{ostafew-crv14,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot and J. Collier},
title = {Speed daemon: experience-based mobile robot speed scheduling},
booktitle = {{Proc. of the International Conference on Computer and Robot Vision (CRV)}},
pages = {56--62},
year = {2014},
doi = {10.1109/CRV.2014.16},
urlvideo = {https://youtu.be/Pu3_F6k6Fa4?list=PLC12E387419CEAFF2},
abstract = {A time-optimal speed schedule results in a mobile robot driving along a planned path at or near the limits of the robot's capability. However, deriving models to predict the effect of increased speed can be very difficult. In this paper, we present a speed scheduler that uses previous experience, instead of complex models, to generate time-optimal speed schedules. The algorithm is designed for a vision-based, path-repeating mobile robot and uses experience to ensure reliable localization, low path-tracking errors, and realizable control inputs while maximizing the speed along the path. To our knowledge, this is the first speed scheduler to incorporate experience from previous path traversals in order to address system constraints. The proposed speed scheduler was tested in over 4 km of path traversals in outdoor terrain using a large Ackermann-steered robot travelling between 0.5 m/s and 2.0 m/s. The approach to speed scheduling is shown to generate fast speed schedules while remaining within the limits of the robot's capability.},
note = {Best Robotics Paper Award}
}

A proof-of-concept demonstration of visual teach and repeat on a quadrocopter using an altitude sensor and a monocular camera
A. Pfrunder, A. P. Schoellig, and T. D. Barfoot
in Proc. of the International Conference on Computer and Robot Vision (CRV), 2014, pp. 238-245.

This paper applies an existing vision-based navigation algorithm to a micro aerial vehicle (MAV). The algorithm has previously been used for long-range navigation of ground robots based on on-board 3D vision sensors such as stereo or Kinect cameras. A teach-and-repeat operational strategy enables a robot to autonomously repeat a manually taught route without relying on an external positioning system such as GPS. For MAVs, we show that a monocular downward-looking camera combined with an altitude sensor can be used as a 3D vision sensor, replacing other resource-expensive 3D vision solutions. The paper also includes a simple path tracking controller that uses feedback from the visual and inertial sensors to guide the vehicle along a straight and level path. Preliminary experimental results demonstrate reliable, accurate and fully autonomous flight of an 8-m-long (straight and level) route, which was taught with the quadrocopter fixed to a cart. Finally, we present the successful flight of a more complex, 16-m-long route.

@INPROCEEDINGS{pfrunder-crv14,
author = {Andreas Pfrunder and Angela P. Schoellig and Timothy D. Barfoot},
title = {A proof-of-concept demonstration of visual teach and repeat on a quadrocopter using an altitude sensor and a monocular camera},
booktitle = {{Proc. of the International Conference on Computer and Robot Vision (CRV)}},
pages = {238--245},
year = {2014},
doi = {10.1109/CRV.2014.40},
urlvideo = {https://youtu.be/BRDvK4xD8ZY?list=PLuLKX4lDsLIaJEVTsuTAVdDJDx0xmzxXr},
urlslides = {../../wp-content/papercite-data/slides/pfrunder-crv14-slides.pdf},
abstract = {This paper applies an existing vision-based navigation algorithm to a micro aerial vehicle (MAV). The algorithm has previously been used for long-range navigation of ground robots based on on-board 3D vision sensors such as stereo or Kinect cameras. A teach-and-repeat operational strategy enables a robot to autonomously repeat a manually taught route without relying on an external positioning system such as GPS. For MAVs, we show that a monocular downward-looking camera combined with an altitude sensor can be used as a 3D vision sensor, replacing other resource-expensive 3D vision solutions. The paper also includes a simple path tracking controller that uses feedback from the visual and inertial sensors to guide the vehicle along a straight and level path. Preliminary experimental results demonstrate reliable, accurate and fully autonomous flight of an 8-m-long (straight and level) route, which was taught with the quadrocopter fixed to a cart. Finally, we present the successful flight of a more complex, 16-m-long route.}
}

## 2013

Dance of the flying machines: methods for designing and executing an aerial dance choreography
F. Augugliaro, A. P. Schoellig, and R. D’Andrea
IEEE Robotics & Automation Magazine, vol. 20, iss. 4, pp. 96-104, 2013.

Imagine a troupe of dancers flying together across a big open stage, their movement choreographed to the rhythm of the music. Their performance is both coordinated and skilled; the dancers are well rehearsed, and the choreography well suited to their abilities. They are no ordinary dancers, however, and this is not an ordinary stage. The performers are quadrocopters, and the stage is the ETH Zurich Flying Machine Arena, a state-of-the-art mobile testbed for aerial motion control research.

@ARTICLE{augugliaro-ram13,
author = {Federico Augugliaro and Angela P. Schoellig and Raffaello D'Andrea},
title = {Dance of the Flying Machines: Methods for Designing and Executing an Aerial Dance Choreography},
journal = {{IEEE Robotics \& Automation Magazine}},
volume = {20},
number = {4},
pages = {96--104},
year = {2013},
doi = {10.1109/MRA.2013.2275693},
urlvideo={http://youtu.be/NRL_1ozDQCA?t=21s},
urlslides={../../wp-content/papercite-data/slides/augugliaro-ram13-slides.pdf},
abstract = {Imagine a troupe of dancers flying together across a big open stage, their movement choreographed to the rhythm of the music. Their performance is both coordinated and skilled; the dancers are well rehearsed, and the choreography well suited to their abilities. They are no ordinary dancers, however, and this is not an ordinary stage. The performers are quadrocopters, and the stage is the ETH Zurich Flying Machine Arena, a state-of-the-art mobile testbed for aerial motion control research.}
}

Visual teach and repeat, repeat, repeat: iterative learning control to improve mobile robot path tracking in challenging outdoor environments
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013, pp. 176-181.

This paper presents a path-repeating, mobile robot controller that combines a feedforward, proportional Iterative Learning Control (ILC) algorithm with a feedback-linearized path-tracking controller to reduce path-tracking errors over repeated traverses along a reference path. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied, extreme environments. The paper presents experimental results including over 600 m of travel by a four-wheeled, 50 kg robot travelling through challenging terrain including steep hills and sandy turns and by a six-wheeled, 160 kg robot at gradually increased speeds up to three times faster than the nominal, safe speed. In the absence of a global localization system, ILC is demonstrated to reduce path-tracking errors caused by unmodelled robot dynamics and terrain challenges.
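
The proportional ILC idea mentioned in this abstract can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the controller from the paper: the robot is replaced by a hypothetical first-order plant `x(k+1) = a*x(k) + b*u(k) + d(k)`, the feedback-linearized path-tracking controller is omitted, and the plant parameters, learning gain `gamma`, and function names are all made up for the example.

```python
def run_trial(u, a, b, disturbance, x0=0.0):
    """Simulate x(k+1) = a*x(k) + b*u(k) + d(k) and return x(1..N)."""
    x, states = x0, []
    for u_k, d_k in zip(u, disturbance):
        x = a * x + b * u_k + d_k
        states.append(x)
    return states


def p_type_ilc(reference, disturbance, a=0.3, b=1.0, gamma=0.5, trials=20):
    """Repeat the same path, updating the feedforward input after each trial."""
    u = [0.0] * len(reference)
    max_errors = []
    for _ in range(trials):
        states = run_trial(u, a, b, disturbance)
        errors = [r - x for r, x in zip(reference, states)]
        max_errors.append(max(abs(e) for e in errors))
        # Proportional update: u_{j+1}(k) = u_j(k) + gamma * e_j(k+1)
        u = [u_k + gamma * e for u_k, e in zip(u, errors)]
    return u, max_errors


# Hypothetical reference and a repeating, unmodelled disturbance.
reference = [1.0] * 10
disturbance = [0.2 if k >= 5 else 0.0 for k in range(10)]
_, max_errors = p_type_ilc(reference, disturbance)
```

For this toy plant and gain, the largest tracking error shrinks on every repetition because the same disturbance recurs on each traverse and the feedforward input gradually absorbs it.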

@INPROCEEDINGS{ostafew-iros13,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Visual teach and repeat, repeat, repeat: Iterative learning control to improve mobile robot path tracking in challenging outdoor environments},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {176-181},
year = {2013},
doi = {10.1109/IROS.2013.6696350},
abstract = {This paper presents a path-repeating, mobile robot controller that combines a feedforward, proportional Iterative Learning Control (ILC) algorithm with a feedback-linearized path-tracking controller to reduce path-tracking errors over repeated traverses along a reference path. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied, extreme environments. The paper presents experimental results including over 600 m of travel by a four-wheeled, 50 kg robot travelling through challenging terrain including steep hills and sandy turns and by a six-wheeled, 160 kg robot at gradually-increased speeds up to three times faster than the nominal, safe speed. In the absence of a global localization system, ILC is demonstrated to reduce path-tracking errors caused by unmodelled robot dynamics and terrain challenges.}
}
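
As a rough illustration of the proportional ILC idea summarized in the entry above (ostafew-iros13), the following Python sketch applies the update u_{j+1} = u_j + k e_j to a toy static plant with a disturbance that repeats on every traverse. The plant, the gain, the numbers, and all function names are assumptions made for illustration; this is not the controller from the paper, which combines ILC with a feedback-linearized path-tracking controller.

```python
def run_trial(u, disturbance):
    """Toy plant: output equals feedforward input plus a repeating disturbance."""
    return [ui + di for ui, di in zip(u, disturbance)]


def ilc_update(u, error, gain):
    """Proportional ILC: correct each input sample by gain * error."""
    return [ui + gain * ei for ui, ei in zip(u, error)]


def rms(values):
    """Root-mean-square of a list of numbers."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


reference = [0.0, 0.5, 1.0, 0.5, 0.0]       # desired output per sample (assumed)
disturbance = [0.2, -0.1, 0.3, 0.0, -0.2]   # identical on every trial (assumed)
gain = 0.5                                  # hypothetical learning gain

u = list(reference)                         # first trial: no learned correction yet
rms_history = []
for trial in range(6):
    y = run_trial(u, disturbance)
    error = [r - yi for r, yi in zip(reference, y)]
    rms_history.append(rms(error))
    u = ilc_update(u, error, gain)          # learn from this trial for the next one

print([round(e, 4) for e in rms_history])
```

On this toy plant the error shrinks by the factor (1 - gain) from one trial to the next, which is the behaviour a proportional learning law is meant to produce when the disturbance repeats exactly.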

Improving tracking performance by learning from past data
A. P. Schoellig
PhD Thesis, Diss. ETH No. 20593, Institute for Dynamic Systems and Control, ETH Zurich, Switzerland, 2013. Awards: ETH Medal, Dimitris N. Chorafas Foundation Prize.

@PHDTHESIS{schoellig-eth13,
author = {Angela P. Schoellig},
title = {Improving tracking performance by learning from past data},
school = {Diss. ETH No. 20593, Institute for Dynamic Systems and Control, ETH Zurich},
doi = {10.3929/ethz-a-009758916},
year = {2013},
urlabstract = {../../wp-content/papercite-data/pdf/schoellig-eth13-abstract.pdf},
urlslides = {../../wp-content/papercite-data/slides/schoellig-eth13-slides.pdf},
urlvideo2 = {https://youtu.be/7r281vgfotg?list=PLD6AAACCBFFE64AC5},
note = {Awards: ETH Medal, Dimitris N. Chorafas Foundation Prize}
}

## 2012

Optimization-based iterative learning for precise quadrocopter trajectory tracking
A. P. Schoellig, F. L. Mueller, and R. D’Andrea
Autonomous Robots, vol. 33, iss. 1-2, pp. 103-127, 2012.

Current control systems regulate the behavior of dynamic systems by reacting to noise and unexpected disturbances as they occur. To improve the performance of such control systems, experience from iterative executions can be used to anticipate recurring disturbances and proactively compensate for them. This paper presents an algorithm that exploits data from previous repetitions in order to learn to precisely follow a predefined trajectory. We adapt the feed-forward input signal to the system with the goal of achieving high tracking performance – even in the presence of model errors and other recurring disturbances. The approach is based on a dynamics model that captures the essential features of the system and that explicitly takes system input and state constraints into account. We combine traditional optimal filtering methods with state-of-the-art optimization techniques in order to obtain an effective and computationally efficient learning strategy that updates the feed-forward input signal according to a customizable learning objective. It is possible to define a termination condition that stops an execution early if the deviation from the nominal trajectory exceeds a given bound. This allows for safe learning that gradually extends the time horizon of the trajectory. We developed a framework for generating arbitrary flight trajectories and for applying the algorithm to highly maneuverable autonomous quadrotor vehicles in the ETH Flying Machine Arena testbed. Experimental results are discussed for selected trajectories and different learning algorithm parameters.

@ARTICLE{schoellig-auro12,
author = {Angela P. Schoellig and Fabian L. Mueller and Raffaello D'Andrea},
title = {Optimization-based iterative learning for precise quadrocopter trajectory tracking},
journal = {{Autonomous Robots}},
volume = {33},
number = {1-2},
pages = {103-127},
year = {2012},
doi = {10.1007/s10514-012-9283-2},
urlvideo={http://youtu.be/goVuP5TJIUU?list=PLC12E387419CEAFF2},
abstract = {Current control systems regulate the behavior of dynamic systems by reacting to noise and unexpected disturbances as they occur. To improve the performance of such control systems, experience from iterative executions can be used to anticipate recurring disturbances and proactively compensate for them. This paper presents an algorithm that exploits data from previous repetitions in order to learn to precisely follow a predefined trajectory. We adapt the feed-forward input signal to the system with the goal of achieving high tracking performance - even in the presence of model errors and other recurring disturbances. The approach is based on a dynamics model that captures the essential features of the system and that explicitly takes system input and state constraints into account. We combine traditional optimal filtering methods with state-of-the-art optimization techniques in order to obtain an effective and computationally efficient learning strategy that updates the feed-forward input signal according to a customizable learning objective. It is possible to define a termination condition that stops an execution early if the deviation from the nominal trajectory exceeds a given bound. This allows for safe learning that gradually extends the time horizon of the trajectory. We developed a framework for generating arbitrary flight trajectories and for applying the algorithm to highly maneuverable autonomous quadrotor vehicles in the ETH Flying Machine Arena testbed. Experimental results are discussed for selected trajectories and different learning algorithm parameters.}
}

Limited benefit of joint estimation in multi-agent iterative learning
A. P. Schoellig, J. Alonso-Mora, and R. D’Andrea
Asian Journal of Control, vol. 14, iss. 3, pp. 613-623, 2012.

This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual’s learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent’s disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. When the agents are identical and noise comes from measurement only, joint estimation yields a noticeable improvement in performance. However, when process noise is encountered or when the agents have an individual disturbance component, the benefit of joint estimation is negligible.

@ARTICLE{schoellig-ajc12,
author = {Angela P. Schoellig and Javier Alonso-Mora and Raffaello D'Andrea},
title = {Limited benefit of joint estimation in multi-agent iterative learning},
journal = {{Asian Journal of Control}},
volume = {14},
number = {3},
pages = {613-623},
year = {2012},
doi = {10.1002/asjc.398},
urldata={../../wp-content/papercite-data/data/schoellig-ajc12-files.zip},
urlslides={../../wp-content/papercite-data/slides/schoellig-ajc12-slides.pdf},
abstract = {This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual's learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent's disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. When the agents are identical and noise comes from measurement only, joint estimation yields a noticeable improvement in performance. However, when process noise is encountered or when the agents have an individual disturbance component, the benefit of joint estimation is negligible.}
}
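
The measurement-noise case described in the entry above (schoellig-ajc12) can be illustrated with a small Monte Carlo sketch. It assumes identical agents that share one constant disturbance and differ only in their measurement noise, and it replaces the estimator studied in the paper with a plain sample mean over a single trial. The disturbance value, noise level, and function names are hypothetical.

```python
import random
import statistics


def estimate_errors(num_agents, num_trials, noise_std, seed=0):
    """Return mean squared error of independent and joint disturbance estimates.

    Every agent measures the same disturbance corrupted by its own
    measurement noise; there is no process noise in this toy setup.
    """
    rng = random.Random(seed)
    true_disturbance = 1.0  # hypothetical common disturbance
    independent_sq, joint_sq = [], []
    for _ in range(num_trials):
        measurements = [true_disturbance + rng.gauss(0.0, noise_std)
                        for _ in range(num_agents)]
        independent = measurements[0]            # agent 0 uses only its own data
        joint = statistics.fmean(measurements)   # data of all agents pooled
        independent_sq.append((independent - true_disturbance) ** 2)
        joint_sq.append((joint - true_disturbance) ** 2)
    return statistics.fmean(independent_sq), statistics.fmean(joint_sq)


mse_independent, mse_joint = estimate_errors(num_agents=4, num_trials=5000,
                                             noise_std=0.1)
print(mse_independent, mse_joint)
```

With pure measurement noise and identical agents, the pooled mean of N measurements has variance sigma^2 / N, so the joint estimate is visibly better here. The sketch does not model process noise or agent-specific disturbance components, which are the cases where the abstract reports a negligible benefit.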

Generation of collision-free trajectories for a quadrocopter fleet: a sequential convex programming approach
F. Augugliaro, A. P. Schoellig, and R. D’Andrea
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 1917-1922.

This paper presents an algorithm that generates collision-free trajectories in three dimensions for multiple vehicles within seconds. The problem is cast as a non-convex optimization problem, which is iteratively solved using sequential convex programming that approximates non-convex constraints by using convex ones. The method generates trajectories that account for simple dynamics constraints and is thus independent of the vehicle’s type. An extensive a posteriori vehicle-specific feasibility check is included in the algorithm. The algorithm is applied to a quadrocopter fleet. Experimental results are shown.

@INPROCEEDINGS{augugliaro-iros12,
author = {Federico Augugliaro and Angela P. Schoellig and Raffaello D'Andrea},
title = {Generation of collision-free trajectories for a quadrocopter fleet: A sequential convex programming approach},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {1917-1922},
year = {2012},
doi = {10.1109/IROS.2012.6385823},
urlvideo = {https://youtu.be/wwK7WvvUvlI?list=PLD6AAACCBFFE64AC5},
abstract = {This paper presents an algorithm that generates collision-free trajectories in three dimensions for multiple vehicles within seconds. The problem is cast as a non-convex optimization problem, which is iteratively solved using sequential convex programming that approximates non-convex constraints by using convex ones. The method generates trajectories that account for simple dynamics constraints and is thus independent of the vehicle's type. An extensive a posteriori vehicle-specific feasibility check is included in the algorithm. The algorithm is applied to a quadrocopter fleet. Experimental results are shown.}
}

Iterative learning of feed-forward corrections for high-performance tracking
F. L. Mueller, A. P. Schoellig, and R. D’Andrea
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 3276-3281.

We revisit a recently developed iterative learning algorithm that enables systems to learn from a repeated operation with the goal of achieving high tracking performance of a given trajectory. The learning scheme is based on a coarse dynamics model of the system and uses past measurements to iteratively adapt the feed-forward input signal to the system. The novelty of this work is an identification routine that uses a numerical simulation of the system dynamics to extract the required model information. This allows the learning algorithm to be applied to any dynamic system for which a dynamics simulation is available (including systems with underlying feedback loops). The proposed learning algorithm is applied to a quadrocopter system that is guided by a trajectory-following controller. With the identification routine, we are able to extend our previous learning results to three-dimensional quadrocopter motions and achieve significantly higher tracking accuracy due to the underlying feedback control, which accounts for non-repetitive noise.

@INPROCEEDINGS{mueller-iros12,
author = {Fabian L. Mueller and Angela P. Schoellig and Raffaello D'Andrea},
title = {Iterative learning of feed-forward corrections for high-performance tracking},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {3276-3281},
year = {2012},
doi = {10.1109/IROS.2012.6385647},
urlslides = {../../wp-content/papercite-data/slides/mueller-iros12-slides.pdf},
abstract = {We revisit a recently developed iterative learning algorithm that enables systems to learn from a repeated operation with the goal of achieving high tracking performance of a given trajectory. The learning scheme is based on a coarse dynamics model of the system and uses past measurements to iteratively adapt the feed-forward input signal to the system. The novelty of this work is an identification routine that uses a numerical simulation of the system dynamics to extract the required model information. This allows the learning algorithm to be applied to any dynamic system for which a dynamics simulation is available (including systems with underlying feedback loops). The proposed learning algorithm is applied to a quadrocopter system that is guided by a trajectory-following controller. With the identification routine, we are able to extend our previous learning results to three-dimensional quadrocopter motions and achieve significantly higher tracking accuracy due to the underlying feedback control, which accounts for non-repetitive noise.}
}

Feed-forward parameter identification for precise periodic quadrocopter motions
A. P. Schoellig, C. Wiltsche, and R. D’Andrea
in Proc. of the American Control Conference (ACC), 2012, pp. 4313-4318.

This paper presents an approach for precisely tracking periodic trajectories with a quadrocopter. In order to improve temporal and spatial tracking performance, we propose a feed-forward strategy that adapts the motion parameters sent to the vehicle controller. The motion parameters are either adjusted on the fly or, in order to avoid initial transients, identified prior to the flight performance. We outline an identification scheme that tunes parameters for a large class of periodic motions, and requires only a small number of identification experiments prior to flight. This reduced identification is based on analysis and experiments showing that the quadrocopter’s closed-loop dynamics can be approximated by three directionally decoupled linear systems. We show the effectiveness of this approach by performing a sequence of periodic motions on real quadrocopters using the tuned parameters obtained by the reduced identification.

@INPROCEEDINGS{schoellig-acc12,
author = {Angela P. Schoellig and Clemens Wiltsche and Raffaello D'Andrea},
title = {Feed-forward parameter identification for precise periodic quadrocopter motions},
booktitle = {{Proc. of the American Control Conference (ACC)}},
pages = {4313-4318},
year = {2012},
doi = {10.1109/ACC.2012.6315248},
urlvideo = {http://tiny.cc/MusicInMotion},
urlslides = {../../wp-content/papercite-data/slides/schoellig-acc12-slides.pdf},
abstract = {This paper presents an approach for precisely tracking periodic trajectories with a quadrocopter. In order to improve temporal and spatial tracking performance, we propose a feed-forward strategy that adapts the motion parameters sent to the vehicle controller. The motion parameters are either adjusted on the fly or, in order to avoid initial transients, identified prior to the flight performance. We outline an identification scheme that tunes parameters for a large class of periodic motions, and requires only a small number of identification experiments prior to flight. This reduced identification is based on analysis and experiments showing that the quadrocopter's closed-loop dynamics can be approximated by three directionally decoupled linear systems. We show the effectiveness of this approach by performing a sequence of periodic motions on real quadrocopters using the tuned parameters obtained by the reduced identification.}
}

An aerial robotics demonstration for controls research at the ETH Flying Machine Arena
R. Ritz, M. W. Müller, F. Augugliaro, M. Hehn, S. Lupashin, A. P. Schoellig, and R. D’Andrea
Swiss Society for Automatic Control Bulletin, 2012.

@MISC{ritz-sga12,
author = {Robin Ritz and Mark W. M{\"u}ller and Federico Augugliaro and Markus Hehn and Sergei Lupashin and Angela P. Schoellig and Raffaello D'Andrea},
title = {An aerial robotics demonstration for controls research at the {ETH Flying Machine Arena}},
year = {2012},
number = {463},
pages = {2-15},
howpublished = {Swiss Society for Automatic Control Bulletin}
}

A. P. Schoellig, F. L. Mueller, and R. D’Andrea
Video Submission, AI and Robotics Multimedia Fair, Conference on Artificial Intelligence (AI), Assn. for the Advancement of Artificial Intelligence (AAAI), 2012.

@MISC{schoellig-aaai12,
author = {Angela P. Schoellig and Fabian L. Mueller and Raffaello D'Andrea},
howpublished = {Video Submission, AI and Robotics Multimedia Fair, Conference on Artificial Intelligence (AI), Assn. for the Advancement of Artificial Intelligence (AAAI)},
year = {2012},
}

## 2011

Sensitivity of joint estimation in multi-agent iterative learning control
A. P. Schoellig and R. D’Andrea
in Proc. of the IFAC (International Federation of Automatic Control) World Congress, 2011, pp. 1204-1212.

We consider a group of agents that simultaneously learn the same task, and revisit a previously developed algorithm, where agents share their information and learn jointly. We have already shown that, as compared to an independent learning model that disregards the information of the other agents, and when assuming similarity between the agents, a joint algorithm improves the learning performance of an individual agent. We now revisit the joint learning algorithm to determine its sensitivity to the underlying assumption of similarity between agents. We note that an incorrect assumption about the agents’ degree of similarity degrades the performance of the joint learning scheme. The degradation is particularly acute if we assume that the agents are more similar than they are in reality; in this case, a joint learning scheme can result in a poorer performance than the independent learning algorithm. In the worst case (when we assume that the agents are identical, but they are, in reality, not) the joint learning does not even converge to the correct value. We conclude that, when applying the joint algorithm, it is crucial not to overestimate the similarity of the agents; otherwise, a learning scheme that is independent of the similarity assumption is preferable.

@INPROCEEDINGS{schoellig-ifac11,
author = {Angela P. Schoellig and Raffaello D'Andrea},
title = {Sensitivity of joint estimation in multi-agent iterative learning control},
booktitle = {{Proc. of the IFAC (International Federation of Automatic Control) World Congress}},
pages = {1204-1212},
year = {2011},
doi = {10.3182/20110828-6-IT-1002.03687},
urlslides = {../../wp-content/papercite-data/slides/schoellig-ifac11-slides.pdf},
urldata = {../../wp-content/papercite-data/data/schoellig-ifac11-files.zip},
abstract = {We consider a group of agents that simultaneously learn the same task, and revisit a previously developed algorithm, where agents share their information and learn jointly. We have already shown that, as compared to an independent learning model that disregards the information of the other agents, and when assuming similarity between the agents, a joint algorithm improves the learning performance of an individual agent. We now revisit the joint learning algorithm to determine its sensitivity to the underlying assumption of similarity between agents. We note that an incorrect assumption about the agents' degree of similarity degrades the performance of the joint learning scheme. The degradation is particularly acute if we assume that the agents are more similar than they are in reality; in this case, a joint learning scheme can result in a poorer performance than the independent learning algorithm. In the worst case (when we assume that the agents are identical, but they are, in reality, not) the joint learning does not even converge to the correct value. We conclude that, when applying the joint algorithm, it is crucial not to overestimate the similarity of the agents; otherwise, a learning scheme that is independent of the similarity assumption is preferable.}
}

Feasibility of motion primitives for choreographed quadrocopter flight
A. P. Schoellig, M. Hehn, S. Lupashin, and R. D’Andrea
in Proc. of the American Control Conference (ACC), 2011, pp. 3843-3849.

This paper describes a method for checking the feasibility of quadrocopter motions. The approach, meant as a validation tool for preprogrammed quadrocopter performances, is based on first principles models and ensures that a desired trajectory respects both vehicle dynamics and motor thrust limits. We apply this method towards the eventual goal of using parameterized motion primitives for expressive quadrocopter choreographies. First, we show how a large class of motion primitives can be formulated as truncated Fourier series. We then show how the feasibility check can be applied to such motions by deriving explicit parameter constraints for two particular parameterized primitives. The predicted feasibility constraints are compared against experimental results from quadrocopters in the ETH Flying Machine Arena.

@INPROCEEDINGS{schoellig-acc11,
author = {Angela P. Schoellig and Markus Hehn and Sergei Lupashin and Raffaello D'Andrea},
title = {Feasibility of motion primitives for choreographed quadrocopter flight},
booktitle = {{Proc. of the American Control Conference (ACC)}},
pages = {3843-3849},
year = {2011},
doi = {10.1109/ACC.2011.5991482},
urlslides = {../../wp-content/papercite-data/slides/schoellig-acc11-slides.pdf},
urldata = {../../wp-content/papercite-data/data/schoellig-acc11-files.zip},
abstract = {This paper describes a method for checking the feasibility of quadrocopter motions. The approach, meant as a validation tool for preprogrammed quadrocopter performances, is based on first principles models and ensures that a desired trajectory respects both vehicle dynamics and motor thrust limits. We apply this method towards the eventual goal of using parameterized motion primitives for expressive quadrocopter choreographies. First, we show how a large class of motion primitives can be formulated as truncated Fourier series. We then show how the feasibility check can be applied to such motions by deriving explicit parameter constraints for two particular parameterized primitives. The predicted feasibility constraints are compared against experimental results from quadrocopters in the ETH Flying Machine Arena.}
}
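
To make the truncated-Fourier-series formulation in the entry above (schoellig-acc11) more concrete, the sketch below evaluates a one-dimensional primitive p(t) = a0 + sum_k [a_k cos(k w t) + b_k sin(k w t)] and its analytic second derivative, then compares the peak acceleration with a single bound. The bound `ACC_LIMIT`, the amplitude and frequency, and all function names are assumptions; the paper derives its constraints from first-principles vehicle dynamics and motor thrust limits, which this simplified check does not reproduce.

```python
import math


def fourier_position(t, omega, a0, cos_coeffs, sin_coeffs):
    """Evaluate p(t) = a0 + sum_k [a_k cos(k w t) + b_k sin(k w t)]."""
    p = a0
    for k, (a_k, b_k) in enumerate(zip(cos_coeffs, sin_coeffs), start=1):
        p += a_k * math.cos(k * omega * t) + b_k * math.sin(k * omega * t)
    return p


def fourier_acceleration(t, omega, cos_coeffs, sin_coeffs):
    """Second time derivative of the truncated series above."""
    acc = 0.0
    for k, (a_k, b_k) in enumerate(zip(cos_coeffs, sin_coeffs), start=1):
        scale = -(k * omega) ** 2
        acc += scale * (a_k * math.cos(k * omega * t)
                        + b_k * math.sin(k * omega * t))
    return acc


def peak_acceleration(omega, cos_coeffs, sin_coeffs, samples=1000):
    """Sample one period and return the largest acceleration magnitude."""
    period = 2.0 * math.pi / omega
    return max(abs(fourier_acceleration(i * period / samples, omega,
                                        cos_coeffs, sin_coeffs))
               for i in range(samples))


# Side-to-side swing: amplitude 0.5 m at 0.8 Hz (illustrative values).
omega = 2.0 * math.pi * 0.8
cos_coeffs, sin_coeffs = [0.5], [0.0]
ACC_LIMIT = 15.0  # m/s^2, hypothetical stand-in for a thrust-derived limit

peak = peak_acceleration(omega, cos_coeffs, sin_coeffs)
feasible = peak <= ACC_LIMIT
print(f"peak acceleration {peak:.2f} m/s^2, feasible: {feasible}")
```

For a single harmonic the peak acceleration is amplitude times omega squared, so the bound translates directly into a constraint on the amplitude-frequency pair. This is the kind of explicit parameter constraint the abstract refers to, although the actual constraints in the paper are derived differently.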

The Flying Machine Arena as of 2010
S. Lupashin, A. P. Schoellig, M. Hehn, and R. D’Andrea
Video Submission, in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2011.

The Flying Machine Arena (FMA) is an indoor research space built specifically for the study of autonomous systems and aerial robotics. In this video, we give an overview of this testbed and some of its capabilities. We show the FMA infrastructure and hardware, which includes a fleet of quadrocopters and a motion capture system for vehicle localization. The physical components of the FMA are complemented by specialized software tools and components that facilitate the use of the space and provide a unified framework for communication and control. The flexibility and modularity of the experimental platform is highlighted by various research projects and demonstrations.

@MISC{lupashin-icra11,
author = {Sergei Lupashin and Angela P. Schoellig and Markus Hehn and Raffaello D'Andrea},
title = {The {Flying Machine Arena} as of 2010},
howpublished = {Video Submission, in Proc. of the IEEE International Conference on Robotics and Automation (ICRA)},
year = {2011},
pages = {2970-2971},
doi = {10.1109/ICRA.2011.5980308},
urlvideo = {https://youtu.be/pcgvWhu8Arc?list=PLuLKX4lDsLIaVjdGsZxNBKLcogBnVVFQr},
abstract = {The Flying Machine Arena (FMA) is an indoor research space built specifically for the study of autonomous systems and aerial robotics. In this video, we give an overview of this testbed and some of its capabilities. We show the FMA infrastructure and hardware, which includes a fleet of quadrocopters and a motion capture system for vehicle localization. The physical components of the FMA are complemented by specialized software tools and components that facilitate the use of the space and provide a unified framework for communication and control. The flexibility and modularity of the experimental platform is highlighted by various research projects and demonstrations.},
}

## 2010

A simple learning strategy for high-speed quadrocopter multi-flips
S. Lupashin, A. P. Schoellig, M. Sherback, and R. D’Andrea
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2010, pp. 1642-1648.

We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first-principles model. We start by formulating an N-flip maneuver as a five-step primitive with five adjustable parameters. Optimization using a low-order first-principles 2D vertical plane model of the quadrocopter yields an initial set of parameters and a corrective matrix. The maneuver is then repeatedly performed with the vehicle. At each iteration the state error at the end of the primitive is used to update the maneuver parameters via a gradient adjustment. The method is demonstrated at the ETH Zurich Flying Machine Arena testbed on quadrotor helicopters performing and improving on flips, double flips and triple flips.

@INPROCEEDINGS{lupashin-icra10,
author = {Sergei Lupashin and Angela P. Schoellig and Michael Sherback and Raffaello D'Andrea},
title = {A simple learning strategy for high-speed quadrocopter multi-flips},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {1642-1648},
year = {2010},
doi = {10.1109/ROBOT.2010.5509452},
urlvideo = {https://youtu.be/bWExDW9J9sA?list=PLC12E387419CEAFF2},
abstract = {We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first-principles model. We start by formulating an N-flip maneuver as a five-step primitive with five adjustable parameters. Optimization using a low-order first-principles 2D vertical plane model of the quadrocopter yields an initial set of parameters and a corrective matrix. The maneuver is then repeatedly performed with the vehicle. At each iteration the state error at the end of the primitive is used to update the maneuver parameters via a gradient adjustment. The method is demonstrated at the ETH Zurich Flying Machine Arena testbed on quadrotor helicopters performing and improving on flips, double flips and triple flips.}
}
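
The iterative parameter correction outlined in the entry above (lupashin-icra10) can be sketched as a loop that maps the final-state error through a corrective matrix obtained from a model. Everything numeric below is invented for illustration: a two-parameter maneuver, a linear "true" error map, and a corrective matrix taken as the inverse of a deliberately imperfect model Jacobian. The paper uses five parameters and a first-principles quadrocopter model, which are not reproduced here.

```python
def terminal_error(params, ideal_params, true_jacobian):
    """Toy experiment: final-state error is linear in the parameter offset."""
    offset = [p - q for p, q in zip(params, ideal_params)]
    return [sum(j * o for j, o in zip(row, offset)) for row in true_jacobian]


def update_params(params, error, corrective_matrix, step):
    """Apply params <- params - step * C * error."""
    correction = [sum(c * e for c, e in zip(row, error))
                  for row in corrective_matrix]
    return [p - step * d for p, d in zip(params, correction)]


def norm(vec):
    """Euclidean norm of a list of numbers."""
    return sum(v * v for v in vec) ** 0.5


TRUE_JACOBIAN = [[2.0, 0.3], [0.1, 1.5]]      # unknown to the learner (assumed)
IDEAL_PARAMS = [1.2, 0.8]                     # parameters with zero final error
CORRECTIVE = [[0.5, 0.0], [0.0, 1.0 / 1.5]]   # inverse of a diagonal model Jacobian
STEP = 0.5                                    # hypothetical step size

params = [1.0, 1.0]                           # initial guess from the model
error_norms = []
for iteration in range(10):
    error = terminal_error(params, IDEAL_PARAMS, TRUE_JACOBIAN)
    error_norms.append(norm(error))
    params = update_params(params, error, CORRECTIVE, STEP)

print([round(n, 4) for n in error_norms])
```

Even though the corrective matrix ignores the cross-coupling terms of the true map, the final-state error in this toy example still contracts from one iteration to the next, which mirrors the idea of combining a coarse model with repeated experiments.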

Independent vs. joint estimation in multi-agent iterative learning control
A. P. Schoellig, J. Alonso-Mora, and R. D’Andrea
in Proc. of the IEEE Conference on Decision and Control (CDC), 2010, pp. 6949-6954.

This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. The agents improve their performance by using the knowledge gained from previous executions. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual’s learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent’s disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. We analytically derive an upper bound of the performance improvement due to joint estimation. Results are obtained for two limiting cases: (i) pure process noise, and (ii) pure measurement noise. The benefits of information sharing are negligible in (i). For (ii), a performance improvement is observed when a high similarity between the agents is guaranteed.

@INPROCEEDINGS{schoellig-cdc10,
author = {Angela P. Schoellig and Javier Alonso-Mora and Raffaello D'Andrea},
title = {Independent vs. joint estimation in multi-agent iterative learning control},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
pages = {6949-6954},
year = {2010},
doi = {10.1109/CDC.2010.5717888},
urlslides = {../../wp-content/papercite-data/slides/schoellig-cdc10-slides.pdf},
urldata = {../../wp-content/papercite-data/data/schoellig-cdc10-files.zip},
abstract = {This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. The agents improve their performance by using the knowledge gained from previous executions. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual's learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent's disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. We analytically derive an upper bound of the performance improvement due to joint estimation. Results are obtained for two limiting cases: (i) pure process noise, and (ii) pure measurement noise. The benefits of information sharing are negligible in (i). For (ii), a performance improvement is observed when a high similarity between the agents is guaranteed.}
}

A platform for dance performances with multiple quadrocopters
A. P. Schoellig, F. Augugliaro, and R. D’Andrea
in Proc. of the Workshop on Robots and Musical Expressions at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2010, pp. 1-8.

This paper presents a platform for rhythmic flight with multiple quadrocopters. We envision an expressive multimedia dance performance that is automatically composed and controlled, given a random piece of music. Results in this paper prove the feasibility of audio-motion synchronization when precisely timing the side-to-side motion of a quadrocopter to the beat of the music. An illustration of the indoor flight space and the vehicles shows the characteristics and capabilities of the experimental setup. Prospective features of the platform are outlined and key challenges are emphasized. The paper concludes with a proof-of-concept demonstration showing three vehicles synchronizing their side-to-side motion to the music beat. Moreover, a dance performance to a remix of the sound track ‘Pirates of the Caribbean’ gives a first impression of the novel musical experience. Future steps include an appropriate multiscale music analysis and the development of algorithms for the automated generation of choreography based on a database of motion primitives.

@INPROCEEDINGS{schoellig-iros10,
author = {Angela P. Schoellig and Federico Augugliaro and Raffaello D'Andrea},
title = {A platform for dance performances with multiple quadrocopters},
booktitle = {{Proc. of the Workshop on Robots and Musical Expressions at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {1-8},
year = {2010},
urlvideo = {https://youtu.be/aaaGJKnJdrg?list=PLD6AAACCBFFE64AC5},
urlslides = {../../wp-content/papercite-data/slides/schoellig-iros10-slides.pdf},
abstract = {This paper presents a platform for rhythmic flight with multiple quadrocopters. We envision an expressive multimedia dance performance that is automatically composed and controlled, given a random piece of music. Results in this paper prove the feasibility of audio-motion synchronization when precisely timing the side-to-side motion of a quadrocopter to the beat of the music. An illustration of the indoor flight space and the vehicles shows the characteristics and capabilities of the experimental setup. Prospective features of the platform are outlined and key challenges are emphasized. The paper concludes with a proof-of-concept demonstration showing three vehicles synchronizing their side-to-side motion to the music beat. Moreover, a dance performance to a remix of the sound track 'Pirates of the Caribbean' gives a first impression of the novel musical experience. Future steps include an appropriate multiscale music analysis and the development of algorithms for the automated generation of choreography based on a database of motion primitives.}
}

Synchronizing the motion of a quadrocopter to music
A. P. Schoellig, F. Augugliaro, and R. D’Andrea
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2010, pp. 3355-3360.

This paper presents a quadrocopter flying in rhythm to music. The quadrocopter performs a periodic side-to-side motion in time to a musical beat. Underlying controllers are designed that stabilize the vehicle and produce a swinging motion. Synchronization is then achieved by using concepts from phase-locked loops. A phase comparator combined with a correction algorithm eliminates the phase error between the music reference and the actual quadrocopter motion. Experimental results show fast and effective synchronization that is robust to sudden changes in the reference amplitude and frequency. Changes in frequency and amplitude are tracked precisely when adding an additional feedforward component, based on an experimentally determined look-up table.

@INPROCEEDINGS{schoellig-icra10,
author = {Angela P. Schoellig and Federico Augugliaro and Raffaello D'Andrea},
title = {Synchronizing the motion of a quadrocopter to music},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {3355-3360},
year = {2010},
doi = {10.1109/ROBOT.2010.5509755},
urlslides = {../../wp-content/papercite-data/slides/schoellig-icra10-slides.pdf},
urlvideo = {https://youtu.be/Kx4DtXv_bPo?list=PLD6AAACCBFFE64AC5},
abstract = {This paper presents a quadrocopter flying in rhythm to music. The quadrocopter performs a periodic side-to-side motion in time to a musical beat. Underlying controllers are designed that stabilize the vehicle and produce a swinging motion. Synchronization is then achieved by using concepts from phase-locked loops. A phase comparator combined with a correction algorithm eliminates the phase error between the music reference and the actual quadrocopter motion. Experimental results show fast and effective synchronization that is robust to sudden changes in the reference amplitude and frequency. Changes in frequency and amplitude are tracked precisely when adding an additional feedforward component, based on an experimentally determined look-up table.}
}

## 2009

Optimization-based iterative learning control for trajectory tracking
A. P. Schoellig and R. D’Andrea
in Proc. of the European Control Conference (ECC), 2009, pp. 1505-1510.

In this paper, an optimization-based iterative learning control approach is presented. Given a desired trajectory to be followed, the proposed learning algorithm improves the system performance from trial to trial by exploiting the experience gained from previous repetitions. Taking advantage of the a-priori knowledge about the system's dominating dynamics, a data-based update rule is derived which adapts the feedforward input signal after each trial. By combining traditional model-based optimal filtering methods with state-of-the-art optimization techniques such as convex programming, an effective and computationally highly efficient learning strategy is obtained. Moreover, the derived formalism allows for the direct treatment of input and state constraints. Different (nonlinear) performance objectives can be specified defining the overall learning behavior. Finally, the proposed algorithm is successfully applied to the benchmark problem of swinging up a pendulum using open-loop control only.

@INPROCEEDINGS{schoellig-ecc09,
author = {Angela P. Schoellig and Raffaello D'Andrea},
title = {Optimization-based iterative learning control for trajectory tracking},
booktitle = {{Proc. of the European Control Conference (ECC)}},
pages = {1505-1510},
year = {2009},
urlslides = {../../wp-content/papercite-data/slides/schoellig-ecc09-slides.pdf},
urlvideo = {https://youtu.be/W2gCn6aAwz4?list=PLC12E387419CEAFF2},
abstract = {In this paper, an optimization-based iterative learning control approach is presented. Given a desired trajectory to be followed, the proposed learning algorithm improves the system performance from trial to trial by exploiting the experience gained from previous repetitions. Taking advantage of the a-priori knowledge about the system's dominating dynamics, a data-based update rule is derived which adapts the feedforward input signal after each trial. By combining traditional model-based optimal filtering methods with state-of-the-art optimization techniques such as convex programming, an effective and computationally highly efficient learning strategy is obtained. Moreover, the derived formalism allows for the direct treatment of input and state constraints. Different (nonlinear) performance objectives can be specified defining the overall learning behavior. Finally, the proposed algorithm is successfully applied to the benchmark problem of swinging up a pendulum using open-loop control only.}
}

## 2008

Verification of the performance of selected subsystems for the LISA mission (in German)
P. F. Gath, D. Weise, T. Heinrich, A. P. Schoellig, and S. Otte
in Proc. of the German Aerospace Congress, German Society for Aeronautics and Astronautics (DGLR), 2008.

As part of a study of the system performance of alternative payload concepts for the LISA mission (Laser Interferometer Space Antenna), Astrium is currently verifying the performance of individual payload subsystems. This is done through theoretical investigations based on simulations as well as experimental laboratory investigations.

@INPROCEEDINGS{gath-gac08,
author = {Peter F. Gath and Dennis Weise and Thomas Heinrich and Angela P. Schoellig and S. Otte},
title = {Verification of the performance of selected subsystems for the {LISA} mission {(in German)}},
booktitle = {{Proc. of the German Aerospace Congress, German Society for Aeronautics and Astronautics (DGLR)}},
year = {2008},
abstract = {As part of a study of the system performance of alternative payload concepts for the LISA mission (Laser Interferometer Space Antenna), Astrium is currently verifying the performance of individual payload subsystems. This is done through theoretical investigations based on simulations as well as experimental laboratory investigations.}
}

Learning through experience – Optimizing performance by repetition
A. P. Schoellig and R. D’Andrea
Abstract and Poster, in Proc. of the Robotics Challenges for Machine Learning Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2008.

The goal of our research is to develop a strategy which enables a system, executing the same task multiple times, to use the knowledge of the previous trials to learn more about its own dynamics and enhance its future performance. Our approach, which falls in the field of iterative learning control, combines methods from both areas, traditional model-based estimation and control and purely data-based learning.

@MISC{schoellig-iros08,
author = {Angela P. Schoellig and Raffaello D'Andrea},
title = {Learning through experience -- {O}ptimizing performance by repetition},
howpublished = {Abstract and Poster, in Proc. of the Robotics Challenges for Machine Learning Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year = {2008},
urlvideo = {https://youtu.be/W2gCn6aAwz4?list=PLC12E387419CEAFF2},
urlslides = {../../wp-content/papercite-data/slides/schoellig-iros08-slides.pdf},
abstract = {The goal of our research is to develop a strategy which enables a system, executing the same task multiple times, to use the knowledge of the previous trials to learn more about its own dynamics and enhance its future performance. Our approach, which falls in the field of iterative learning control, combines methods from both areas, traditional model-based estimation and control and purely data-based learning.},
}

## 2007

A hybrid Bellman equation for bimodal systems
P. Caines, M. Egerstedt, R. Malhame, and A. P. Schoellig
in Hybrid Systems: Computation and Control, A. Bemporad, A. Bicchi, and G. Buttazzo, Eds., Springer Berlin Heidelberg, 2007, vol. 4416, pp. 656-659.

In this paper we present a dynamic programming formulation of a hybrid optimal control problem for bimodal systems with regional dynamics. In particular, based on optimality-zone computations, a framework is presented in which the resulting hybrid Bellman equation guides the design of optimal control programs with, at most, N discrete transitions.

@INCOLLECTION{caines-springer07,
author={Peter Caines and Magnus Egerstedt and Roland Malhame and Angela P. Schoellig},
title={A Hybrid {Bellman} Equation for Bimodal Systems},
booktitle={{Hybrid Systems: Computation and Control}},
editor={Bemporad, Alberto and Bicchi, Antonio and Buttazzo, Giorgio},
publisher={Springer Berlin Heidelberg},
pages={656-659},
year={2007},
volume={4416},
series={Lecture Notes in Computer Science},
doi={10.1007/978-3-540-71493-4_54},
abstract = {In this paper we present a dynamic programming formulation of a hybrid optimal control problem for bimodal systems with regional dynamics. In particular, based on optimality-zone computations, a framework is presented in which the resulting hybrid Bellman equation guides the design of optimal control programs with, at most, N discrete transitions.}
}

A hybrid Bellman equation for systems with regional dynamics
A. P. Schoellig, P. E. Caines, M. Egerstedt, and R. P. Malhamé
in Proc. of the IEEE Conference on Decision and Control (CDC), 2007, pp. 3393-3398.

In this paper, we study hybrid systems with regional dynamics, i.e., systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, we focus our attention on the optimal control problem associated with such systems, and we present a Hybrid Bellman Equation for such systems that provides a characterization of global optimality, given an upper bound on the number of switches. Not surprisingly, the solution will be hybrid in nature in that it will depend on not only the continuous control signals, but also on discrete decisions as to what domains the system should go through in the first place. A number of examples are presented to highlight the operation of the proposed approach.

@INPROCEEDINGS{schoellig-cdc07,
author = {Angela P. Schoellig and Peter E. Caines and Magnus Egerstedt and Roland P. Malham\'e},
title = {A hybrid {B}ellman equation for systems with regional dynamics},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
pages = {3393-3398},
year = {2007},
doi = {10.1109/CDC.2007.4434952},
abstract = {In this paper, we study hybrid systems with regional dynamics, i.e., systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, we focus our attention on the optimal control problem associated with such systems, and we present a Hybrid Bellman Equation for such systems that provides a characterization of global optimality, given an upper bound on the number of switches. Not surprisingly, the solution will be hybrid in nature in that it will depend on not only the continuous control signals, but also on discrete decisions as to what domains the system should go through in the first place. A number of examples are presented to highlight the operation of the proposed approach.}
}

Topology-dependent stability of a network of dynamical systems with communication delays
A. P. Schoellig, U. Münz, and F. Allgöwer
in Proc. of the European Control Conference (ECC), 2007, pp. 1197-1202.

In this paper, we analyze the stability of a network of first-order linear time-invariant systems with constant, identical communication delays. We investigate the influence of both system parameters and network characteristics on stability. In particular, a non-conservative stability bound for the delay is given such that the network is asymptotically stable for any delay smaller than this bound. We show how the network topology changes the stability bound. As an example, we use these results to answer the question of whether a symmetric or skew-symmetric interconnection is preferable for a given set of subsystems.

@INPROCEEDINGS{schoellig-ecc07,
author = {Angela P. Schoellig and Ulrich M\"unz and Frank Allg\"ower},
title = {Topology-Dependent Stability of a Network of Dynamical Systems with Communication Delays},
booktitle = {{Proc. of the European Control Conference (ECC)}},
pages = {1197-1202},
year = {2007},
abstract = {In this paper, we analyze the stability of a network of first-order linear time-invariant systems with constant, identical communication delays. We investigate the influence of both system parameters and network characteristics on stability. In particular, a non-conservative stability bound for the delay is given such that the network is asymptotically stable for any delay smaller than this bound. We show how the network topology changes the stability bound. As an example, we use these results to answer the question of whether a symmetric or skew-symmetric interconnection is preferable for a given set of subsystems.}
}

Optimal control of hybrid systems with regional dynamics
A. P. Schoellig
Master's Thesis, Georgia Institute of Technology, USA, 2007.

In this work, hybrid systems with regional dynamics are considered. These are systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, the attention is focused on the optimal control problem associated with such systems. More precisely, given a specific cost function, the goal is to determine the optimal path of going from a given starting point to a fixed final state during an a priori specified time horizon. The key characteristic of the approach presented in this thesis is a hierarchical decomposition of the hybrid optimal control problem, yielding a framework which allows a solution on different levels of control. On the highest level of abstraction, the regional structure of the state space is taken into account and a discrete representation of the connections between the different regions provides global accessibility relations between regions. These are used on a lower level of control to formulate the main theorem of this work, namely, the Hybrid Bellman Equation for multimodal systems, which, in fact, provides a characterization of global optimality, given an upper bound on the number of transitions along a hybrid trajectory. Not surprisingly, the optimal solution is hybrid in nature, in that it depends on not only the continuous control signals, but also on discrete decisions as to what domains the system’s continuous state should go through in the first place. The main benefit of the proposed approach lies in the fact that a hierarchical Dynamic Programming algorithm can be used, representing both a theoretical characterization of the hybrid solution’s structural composition and, from a more application-driven point of view, a numerically implementable calculation rule yielding globally optimal solutions in a regional dynamics framework.
The operation of the recursive algorithm is highlighted by the consideration of numerous examples, among them, a heterogeneous multi-agent problem.

@MASTERSTHESIS{schoellig-gatech07,
author = {Angela P. Schoellig},
title = {Optimal control of hybrid systems with regional dynamics},
school = {Georgia Institute of Technology},
urlslides = {../../wp-content/papercite-data/slides/schoellig-gatech07-slides.pdf},
year = {2007},
abstract = {In this work, hybrid systems with regional dynamics are considered. These are systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, the attention is focused on the optimal control problem associated with such systems. More precisely, given a specific cost function, the goal is to determine the optimal path of going from a given starting point to a fixed final state during an a priori specified time horizon. The key characteristic of the approach presented in this thesis is a hierarchical decomposition of the hybrid optimal control problem, yielding a framework which allows a solution on different levels of control. On the highest level of abstraction, the regional structure of the state space is taken into account and a discrete representation of the connections between the different regions provides global accessibility relations between regions. These are used on a lower level of control to formulate the main theorem of this work, namely, the Hybrid Bellman Equation for multimodal systems, which, in fact, provides a characterization of global optimality, given an upper bound on the number of transitions along a hybrid trajectory. Not surprisingly, the optimal solution is hybrid in nature, in that it depends on not only the continuous control signals, but also on discrete decisions as to what domains the system's continuous state should go through in the first place. The main benefit of the proposed approach lies in the fact that a hierarchical Dynamic Programming algorithm can be used, representing both a theoretical characterization of the hybrid solution's structural composition and, from a more application-driven point of view, a numerically implementable calculation rule yielding globally optimal solutions in a regional dynamics framework.
The operation of the recursive algorithm is highlighted by the consideration of numerous examples, among them, a heterogeneous multi-agent problem.},
}

## 2006

Stability of a network of dynamical systems with communication delays (in German)
A. P. Schoellig
Semester Project, University of Stuttgart, Germany, 2006.

@MASTERSTHESIS{schoellig-stuttgart06,
author = {Angela P. Schoellig},
title = {Stability of a network of dynamical systems with communication delays {(in German)}},
school = {University of Stuttgart},
type = {Semester Project},
urlslides = {../../wp-content/papercite-data/slides/schoellig-stuttgart06-slides.pdf},
year = {2006},
}