Publications

A BibTeX file that includes all references can be found here. You can also follow our publication updates via Google Scholar.

2018

Model predictive path-following for constrained differentially flat systems
M. Greeff and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2018. Submitted.
[View BibTeX] [View Abstract] [Download PDF] [Download Additional Material] [More Information]
For many tasks, predictive path-following control can significantly improve the performance and robustness of autonomous robots over traditional trajectory tracking control. It does this by prioritizing closeness to the path over timed progress along the path and by looking ahead to account for changes in the path. We propose a novel predictive path-following approach that couples feedforward linearization with path-based model predictive control. Our approach has a few key advantages. By utilizing the differential flatness property, we reduce the path-based model predictive control problem from a nonlinear to a convex optimization problem. Robustness to disturbances is achieved by a dynamic path reference, which adjusts its speed based on the robot’s progress. We also account for key system constraints. We demonstrate these advantages in experiment on a quadrotor. We show improved performance over a baseline trajectory tracking controller by keeping the quadrotor closer to the desired path under nominal conditions, with an initial offset and under a wind disturbance.

@INPROCEEDINGS{greeff-icra18,
author={Melissa Greeff and Angela P. Schoellig},
title={Model Predictive Path-Following for Constrained Differentially Flat Systems},
booktitle={{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year={2018},
note={Submitted},
urllink={https://arxiv.org/abs/1710.02555},
urldata = {../../wp-content/papercite-data/data/greeff-icra18-supplementary.pdf},
abstract={For many tasks, predictive path-following control can significantly improve the performance and robustness of autonomous robots over traditional trajectory tracking control. It does this by prioritizing closeness to the path over timed progress along the path and by looking ahead to account for changes in the path. We propose a novel predictive path-following approach that couples feedforward linearization with path-based model predictive control. Our approach has a few key advantages. By utilizing the differential flatness property, we reduce the path-based model predictive control problem from a nonlinear to a convex optimization problem. Robustness to disturbances is achieved by a dynamic path reference, which adjusts its speed based on the robot’s progress. We also account for key system constraints. We demonstrate these advantages in experiment on a quadrotor. We show improved performance over a baseline trajectory tracking controller by keeping the quadrotor closer to the desired path under nominal conditions, with an initial offset and under a wind disturbance.},
}
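
As a minimal illustration of the flat-output idea above (this is not the paper's formulation, which includes a dynamic path reference and system constraints), the sketch below predicts a chain-of-integrators flat model over a horizon and computes the flat input by unconstrained least squares; the model, horizon, and reference values are arbitrary placeholders.

import numpy as np

# Minimal sketch (not the paper's formulation): finite-horizon tracking of a
# path sample in the flat-output space, modelled as a double integrator.
dt, N = 0.1, 20                        # step size and prediction horizon
A = np.array([[1.0, dt], [0.0, 1.0]])  # flat state: (position, velocity)
B = np.array([[0.5 * dt**2], [dt]])    # flat input: acceleration

def prediction_matrices(A, B, N):
    """Stack the predictions x_k = Phi x_0 + Gamma u for k = 1..N."""
    n, m = A.shape[0], B.shape[1]
    Phi = np.vstack([np.linalg.matrix_power(A, k) for k in range(1, N + 1)])
    Gamma = np.zeros((N * n, N * m))
    for k in range(1, N + 1):
        for j in range(k):
            Gamma[(k - 1) * n:k * n, j * m:(j + 1) * m] = \
                np.linalg.matrix_power(A, k - 1 - j) @ B
    return Phi, Gamma

x0 = np.array([0.0, 0.0])
# Placeholder path samples in the flat output: desired (position, velocity).
path = np.column_stack([np.linspace(0.05, 1.0, N), np.full(N, 0.5)])
Phi, Gamma = prediction_matrices(A, B, N)

# Unconstrained least-squares surrogate of the convex path-following problem:
#   min_u  || Phi x0 + Gamma u - path ||^2 + lam * ||u||^2
lam = 1e-2
H = Gamma.T @ Gamma + lam * np.eye(N)
g = Gamma.T @ (path.reshape(-1) - Phi @ x0)
u = np.linalg.solve(H, g)
print("first flat input to apply:", u[0])

Adding linear input and state limits to the same stacked prediction turns this least-squares problem into the kind of constrained convex quadratic program described in the abstract.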

An inversion-based learning approach for improving impromptu trajectory tracking of robots with non-minimum phase dynamics
S. Zhou, M. K. Helwa, and A. P. Schoellig
in Robotics and Automation Letters (RA-L) and the Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2018. Submitted.
[View BibTeX] [View Abstract] [Download PDF] [More Information]

This paper presents a learning-based approach for impromptu trajectory tracking of non-minimum phase systems, i.e., systems with unstable inverse dynamics. In the control systems literature, inversion-based feedforward approaches are commonly used to improve trajectory tracking performance; however, these approaches are not directly applicable to non-minimum phase systems due to the inherent instability. To resolve the instability issue, existing approaches assume that models of the systems are known and deal with the non-minimum phase dynamics through pre-actuation or inverse approximation techniques. In this work, we extend our deep-neural-network-enhanced impromptu trajectory tracking approach to the challenging case of non-minimum phase systems. Through theoretical discussions, simulations, and experiments, we show the stability and effectiveness of our proposed learning approach. In fact, for a known system, our approach performs as well as or better than a typical model-based approach but does not require a prior model of the system. Interestingly, our approach also shows that including more information in training (as is commonly assumed to be useful) does not lead to better performance but may trigger instability issues and impede the effectiveness of the overall approach.

@INPROCEEDINGS{zhou-icra18,
author={SiQi Zhou and Mohamed K. Helwa and Angela P. Schoellig},
title={An Inversion-Based Learning Approach for Improving Impromptu Trajectory Tracking of Robots with Non-Minimum Phase Dynamics},
booktitle={{Robotics and Automation Letters (RA-L) and the Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year={2018},
note={Submitted},
urllink={https://arxiv.org/abs/1709.04407},
abstract={This paper presents a learning-based approach for impromptu trajectory tracking of non-minimum phase systems, i.e., systems with unstable inverse dynamics. In the control systems literature, inversion-based feedforward approaches are commonly used to improve trajectory tracking performance; however, these approaches are not directly applicable to non-minimum phase systems due to the inherent instability. To resolve the instability issue, existing approaches assume that models of the systems are known and deal with the non-minimum phase dynamics through pre-actuation or inverse approximation techniques. In this work, we extend our deep-neural-network-enhanced impromptu trajectory tracking approach to the challenging case of non-minimum phase systems. Through theoretical discussions, simulations, and experiments, we show the stability and effectiveness of our proposed learning approach. In fact, for a known system, our approach performs as well as or better than a typical model-based approach but does not require a prior model of the system. Interestingly, our approach also shows that including more information in training (as is commonly assumed to be useful) does not lead to better performance but may trigger instability issues and impede the effectiveness of the overall approach.},
}
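
The sketch below is an illustrative stand-in for the DNN module above, assuming a toy first-order, minimum phase closed loop: it learns an approximate inverse of the closed-loop dynamics from data (with ridge regression in place of a deep network) and uses a one-step output preview to pre-adapt the reference. The system, features, and trajectory are invented; the paper's contribution is precisely how to choose such training features so that the learned inverse remains stable for non-minimum phase systems.

import numpy as np

# Illustrative stand-in for a DNN add-on module: learn an approximate inverse
# of a toy closed loop from data, then pre-adapt the reference for an unseen
# ("impromptu") desired trajectory using a one-step output preview.
rng = np.random.default_rng(0)

def closed_loop_step(y, r):
    # Toy baseline-controlled system: unit DC gain, but a sluggish response.
    return 0.7 * y + 0.3 * r

# 1) Collect training data by exciting the closed loop with random references.
T = 2000
y, X, R = 0.0, [], []
for _ in range(T):
    r = rng.uniform(-1, 1)
    y_next = closed_loop_step(y, r)
    X.append([y, y_next])   # features: current output and next (desired) output
    R.append(r)
    y = y_next
X, R = np.array(X), np.array(R)

# 2) Fit r ~ w^T [y_k, y_{k+1}]; a DNN would replace this linear map.
lam = 1e-6
w = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ R)

# 3) Test on an unseen desired trajectory, with and without the add-on.
t = np.arange(300)
y_des = np.sin(2 * np.pi * t / 100)

def rollout(adapt):
    y, err = 0.0, []
    for k in range(len(t) - 1):
        r = w @ np.array([y, y_des[k + 1]]) if adapt else y_des[k + 1]
        y = closed_loop_step(y, r)
        err.append(y - y_des[k + 1])
    return np.sqrt(np.mean(np.square(err)))

print("RMS tracking error, baseline reference:", rollout(False))
print("RMS tracking error, learned inverse   :", rollout(True))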

2017

[DOI] A real-time analysis of post-blast rock fragmentation using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
International Journal of Mining, Reclamation and Environment, vol. 31, iss. 6, pp. 439-456, 2017.
[View BibTeX] [View Abstract] [Download PDF] [View Video]
The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments improves the quality of the image data and automates the data collection process. This work presents the results of laboratory-scale rock fragmentation analysis using a UAV. The goal is to highlight the benefits of aerial fragmentation analysis in terms of both prediction accuracy and time effort. The rock pile was also photographed manually, and the results of the manual method were compared to those of the UAV method.

@article{bamford-ijmre17,
title = {A Real-Time Analysis of Post-Blast Rock Fragmentation Using {UAV} Technology},
author = {Bamford, Thomas and Esmaeili, Kamran and Schoellig, Angela P.},
journal = {{International Journal of Mining, Reclamation and Environment}},
year = {2017},
volume = {31},
number = {6},
doi = {10.1080/17480930.2017.1339170},
pages = {439--456},
publisher = {Taylor \& Francis},
urlvideo = {https://youtu.be/q0syk6J_JHY},
abstract = {The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments improves the quality of the image data and automates the data collection process. This work presents the results of laboratory-scale rock fragmentation analysis using a UAV. The goal is to highlight the benefits of aerial fragmentation analysis in terms of both prediction accuracy and time effort. The rock pile was also photographed manually, and the results of the manual method were compared to those of the UAV method.},
}
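
For context on how fragmentation results like the ones above are typically compared (this is not code from the paper), the snippet below summarizes two sets of fragment-size measurements as percentile sizes of their size distributions; the sizes are synthetic, and a real analysis would usually weight fragments by area or volume rather than by count.

import numpy as np

# Illustrative only: summarize fragment-size measurements (e.g., from image
# segmentation) as percentile sizes so two methods can be compared.
rng = np.random.default_rng(1)
sizes_uav = rng.lognormal(mean=np.log(0.15), sigma=0.50, size=500)    # metres
sizes_manual = rng.lognormal(mean=np.log(0.16), sigma=0.55, size=500)

def passing_size(sizes, p):
    """Size below which p percent of the measured fragments fall."""
    return np.percentile(sizes, p)

for name, s in [("UAV", sizes_uav), ("manual", sizes_manual)]:
    print(f"{name:>6}: P50 = {passing_size(s, 50):.3f} m, "
          f"P80 = {passing_size(s, 80):.3f} m")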

[DOI] Optimizing a drone network to deliver automated external defibrillators
J. J. Boutilier, S. C. Brooks, A. Janmohamed, A. Byers, J. E. Buick, C. Zhan, A. P. Schoellig, S. Cheskes, L. J. Morrison, and T. C. Y. Chan
Circulation, 2017. In press.
[View BibTeX] [View Abstract] [Download PDF]

BACKGROUND Public access defibrillation programs can improve survival after out-of-hospital cardiac arrest (OHCA), but automated external defibrillators (AEDs) are rarely available for bystander use at the scene. Drones are an emerging technology that can deliver an AED to the scene of an OHCA for bystander use. We hypothesize that a drone network designed with the aid of a mathematical model combining both optimization and queuing can reduce the time to AED arrival. METHODS We applied our model to 53,702 OHCAs that occurred in the eight regions of the Toronto Regional RescuNET between January 1st 2006 and December 31st 2014. Our primary analysis quantified the drone network size required to deliver an AED one, two, or three minutes faster than historical median 911 response times for each region independently. A secondary analysis quantified the reduction in drone resources required if RescuNET was treated as one large coordinated region. RESULTS The region-specific analysis determined that 81 bases and 100 drones would be required to deliver an AED ahead of median 911 response times by three minutes. In the most urban region, the 90th percentile of the AED arrival time was reduced by 6 minutes and 43 seconds relative to historical 911 response times in the region. In the most rural region, the 90th percentile was reduced by 10 minutes and 34 seconds. A single coordinated drone network across all regions required 39.5% fewer bases and 30.0% fewer drones to achieve similar AED delivery times. CONCLUSIONS An optimized drone network designed with the aid of a novel mathematical model can substantially reduce the AED delivery time to an OHCA event.

@article{boutilier-circ17,
title={Optimizing a Drone Network to Deliver Automated External Defibrillators},
author = {Boutilier, Justin J. and Brooks, Steven C. and Janmohamed, Alyf and Byers, Adam and Buick, Jason E. and Zhan, Cathy and Schoellig, Angela P. and Cheskes, Sheldon and Morrison, Laurie J. and Chan, Timothy C. Y.},
journal={Circulation},
year={2017},
doi = {10.1161/CIRCULATIONAHA.116.026318},
publisher = {American Heart Association, Inc.},
note = {In press},
abstract = {BACKGROUND Public access defibrillation programs can improve survival after out-of-hospital cardiac arrest (OHCA), but automated external defibrillators (AEDs) are rarely available for bystander use at the scene. Drones are an emerging technology that can deliver an AED to the scene of an OHCA for bystander use. We hypothesize that a drone network designed with the aid of a mathematical model combining both optimization and queuing can reduce the time to AED arrival. METHODS We applied our model to 53,702 OHCAs that occurred in the eight regions of the Toronto Regional RescuNET between January 1st 2006 and December 31st 2014. Our primary analysis quantified the drone network size required to deliver an AED one, two, or three minutes faster than historical median 911 response times for each region independently. A secondary analysis quantified the reduction in drone resources required if RescuNET was treated as one large coordinated region. RESULTS The region-specific analysis determined that 81 bases and 100 drones would be required to deliver an AED ahead of median 911 response times by three minutes. In the most urban region, the 90th percentile of the AED arrival time was reduced by 6 minutes and 43 seconds relative to historical 911 response times in the region. In the most rural region, the 90th percentile was reduced by 10 minutes and 34 seconds. A single coordinated drone network across all regions required 39.5% fewer bases and 30.0% fewer drones to achieve similar AED delivery times. CONCLUSIONS An optimized drone network designed with the aid of a novel mathematical model can substantially reduce the AED delivery time to an OHCA event.},
}
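
As a rough illustration of the siting idea only (the paper's model combines integer optimization with queuing and is far more detailed), the snippet below runs a greedy maximum-coverage heuristic that places drone bases at candidate sites to cover as many historical cardiac-arrest locations as possible within a flight radius; all coordinates, radii, and counts are invented.

import numpy as np

# Rough illustration only (not the paper's model): greedy maximum coverage of
# historical OHCA locations by drone bases placed at candidate sites.
rng = np.random.default_rng(2)
ohca = rng.uniform(0, 50, size=(500, 2))       # historical OHCA locations (km)
candidates = rng.uniform(0, 50, size=(40, 2))  # candidate base sites (km)
radius_km = 8.0                                # reachable within the time budget

dist = np.linalg.norm(ohca[:, None, :] - candidates[None, :, :], axis=2)
covers = dist <= radius_km          # covers[i, j]: OHCA i within range of site j

def greedy_bases(covers, n_bases):
    covered = np.zeros(covers.shape[0], dtype=bool)
    chosen = []
    for _ in range(n_bases):
        gains = (covers & ~covered[:, None]).sum(axis=0)
        j = int(np.argmax(gains))   # site that adds the most new coverage
        chosen.append(j)
        covered |= covers[:, j]
    return chosen, covered.mean()

bases, fraction = greedy_bases(covers, n_bases=5)
print("chosen sites:", bases, "| fraction of OHCAs covered:", round(fraction, 3))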

Towards visual teach & repeat for GPS-denied flight of a fixed-wing UAV
M. Warren, M. Paton, K. MacTavish, A. P. Schoellig, and T. D. Barfoot
in Proc. of the 11th Conference on Field and Service Robotics (FSR), 2017. Accepted.
[View BibTeX] [View Abstract] [Download PDF]

Most consumer and industrial Unmanned Aerial Vehicles (UAVs) rely on combining Global Navigation Satellite Systems (GNSS) with barometric and inertial sensors for outdoor operation. As a consequence, these vehicles are prone to a variety of potential navigation failures such as jamming and environmental interference. This usually limits their legal activities to locations of low population density within line-of-sight of a human pilot to reduce the risk of injury and damage. Autonomous route-following methods such as Visual Teach & Repeat (VT&R) have enabled long-range navigational autonomy for ground robots without reliance on external infrastructure or an accurate global position estimate. In this paper, we demonstrate the localisation component of VT&R outdoors on a fixed-wing UAV as a method of backup navigation in case of primary sensor failure. We modify the localisation engine of VT&R to work with a single downward-facing camera on a UAV to enable safe navigation under the guidance of vision alone. We evaluate the method using visual data from the UAV flying a 1200 m trajectory (at an altitude of 80 m) several times during a multi-day period, covering a total distance of 10.8 km using the algorithm. We examine the localisation performance for both small (single-flight) and large (inter-day) temporal differences from teach to repeat. Through these experiments, we demonstrate the ability to successfully localise the aircraft on a self-taught route using vision alone without the need for additional sensing or infrastructure.

@INPROCEEDINGS{warren-fsr17,
author={Michael Warren and Michael Paton and Kirk MacTavish and Angela P. Schoellig and Tim D. Barfoot},
title={Towards visual teach \& repeat for {GPS}-denied flight of a fixed-wing {UAV}},
booktitle={{Proc. of the 11th Conference on Field and Service Robotics (FSR)}},
year={2017},
note={Accepted},
abstract={Most consumer and industrial Unmanned Aerial Vehicles (UAVs) rely on combining Global Navigation Satellite Systems (GNSS) with barometric and inertial sensors for outdoor operation. As a consequence, these vehicles are prone to a variety of potential navigation failures such as jamming and environmental interference. This usually limits their legal activities to locations of low population density within line-of-sight of a human pilot to reduce the risk of injury and damage. Autonomous route-following methods such as Visual Teach & Repeat (VT&R) have enabled long-range navigational autonomy for ground robots without reliance on external infrastructure or an accurate global position estimate. In this paper, we demonstrate the localisation component of VT&R outdoors on a fixed-wing UAV as a method of backup navigation in case of primary sensor failure. We modify the localisation engine of VT&R to work with a single downward-facing camera on a UAV to enable safe navigation under the guidance of vision alone. We evaluate the method using visual data from the UAV flying a 1200 m trajectory (at an altitude of 80 m) several times during a multi-day period, covering a total distance of 10.8 km using the algorithm. We examine the localisation performance for both small (single-flight) and large (inter-day) temporal differences from teach to repeat. Through these experiments, we demonstrate the ability to successfully localise the aircraft on a self-taught route using vision alone without the need for additional sensing or infrastructure.},
}

Multi-robot transfer learning: a dynamical system perspective
M. K. Helwa and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017. Accepted.
[View BibTeX] [View Abstract] [Download PDF] [More Information]

Multi-robot transfer learning allows a robot to use data generated by a second, similar robot to improve its own behavior. The potential advantages are reducing the time of training and the unavoidable risks that exist during the training phase. Transfer learning algorithms aim to find an optimal transfer map between different robots. In this paper, we investigate, through a theoretical study of single-input single-output (SISO) systems, the properties of such optimal transfer maps. We first show that the optimal transfer learning map is, in general, a dynamic system. The main contribution of the paper is to provide an algorithm for determining the properties of this optimal dynamic map including its order and regressors (i.e., the variables it depends on). The proposed algorithm does not require detailed knowledge of the robots’ dynamics, but relies on basic system properties easily obtainable through simple experimental tests. We validate the proposed algorithm experimentally through an example of transfer learning between two different quadrotor platforms. Experimental results show that an optimal dynamic map, with correct properties obtained from our proposed algorithm, achieves 60-70% reduction of transfer learning error compared to the cases when the data is directly transferred or transferred using an optimal static map.

@INPROCEEDINGS{helwa-iros17,
author={Mohamed K. Helwa and Angela P. Schoellig},
title={Multi-Robot Transfer Learning: A Dynamical System Perspective},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year={2017},
note={Accepted},
urllink={https://arxiv.org/abs/1707.08689},
abstract={Multi-robot transfer learning allows a robot to use data generated by a second, similar robot to improve its own behavior. The potential advantages are reducing the time of training and the unavoidable risks that exist during the training phase. Transfer learning algorithms aim to find an optimal transfer map between different robots. In this paper, we investigate, through a theoretical study of single-input single-output (SISO) systems, the properties of such optimal transfer maps. We first show that the optimal transfer learning map is, in general, a dynamic system. The main contribution of the paper is to provide an algorithm for determining the properties of this optimal dynamic map including its order and regressors (i.e., the variables it depends on). The proposed algorithm does not require detailed knowledge of the robots’ dynamics, but relies on basic system properties easily obtainable through simple experimental tests. We validate the proposed algorithm experimentally through an example of transfer learning between two different quadrotor platforms. Experimental results show that an optimal dynamic map, with correct properties obtained from our proposed algorithm, achieves 60-70% reduction of transfer learning error compared to the cases when the data is directly transferred or transferred using an optimal static map.},
}
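
A small numerical illustration of why the optimal transfer map is dynamic (the two systems, the map order, and the regressors below are chosen by hand; in the paper they are determined by the proposed algorithm): fitting a static gain between the inputs of two first-order systems that reproduce the same output leaves a residual, while a first-order dynamic map fitted by least squares recovers the relationship almost exactly.

import numpy as np

# Numerical illustration of a static vs. a dynamic transfer map between the
# inputs of two simulated SISO "robots" that track the same output.
rng = np.random.default_rng(3)

# Two robots: first-order SISO systems y_{k+1} = a*y_k + b*u_k.
a1, b1 = 0.8, 0.5   # source robot
a2, b2 = 0.6, 0.9   # target robot

# Inputs that make both robots produce the same output trajectory y.
T = 1000
y = np.cumsum(rng.normal(0.0, 0.05, T))
u_src = (y[1:] - a1 * y[:-1]) / b1
u_tgt = (y[1:] - a2 * y[:-1]) / b2

# Static map: u_tgt ~ c * u_src (a single gain).
c = np.sum(u_src * u_tgt) / np.sum(u_src * u_src)
err_static = np.sqrt(np.mean((c * u_src - u_tgt) ** 2))

# Dynamic map: u_tgt[k] ~ w0*u_src[k] + w1*u_src[k-1] + w2*u_tgt[k-1].
X = np.column_stack([u_src[1:], u_src[:-1], u_tgt[:-1]])
w = np.linalg.lstsq(X, u_tgt[1:], rcond=None)[0]
err_dynamic = np.sqrt(np.mean((X @ w - u_tgt[1:]) ** 2))

print("RMS transfer error, static map :", err_static)
print("RMS transfer error, dynamic map:", err_dynamic)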

Aerial rock fragmentation analysis in low-light condition using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
in Proc. of Application of Computers and Operations Research in the Mining Industry (APCOM), 2017, pp. 4-1–4-8.
[View BibTeX] [View Abstract] [Download PDF] [Download Slides]

In recent years, Unmanned Aerial Vehicle (UAV) technology has been introduced into the mining industry to conduct terrain surveying. This work investigates the application of UAVs with artificial lighting for measurement of rock fragmentation under poor lighting conditions, representing night shifts in surface mines or working conditions in underground mines. The study relies on indoor and outdoor experiments for rock fragmentation analysis using a quadrotor UAV. Comparison of the rock size distributions in both cases show that adequate artificial lighting enables similar accuracy to ideal lighting conditions.

@INPROCEEDINGS{bamford-apcom17,
author={Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title={Aerial Rock Fragmentation Analysis in Low-Light Condition Using {UAV} Technology},
booktitle={{Proc. of Application of Computers and Operations Research in the Mining Industry (APCOM)}},
year={2017},
pages = {4-1--4-8},
urlslides={../../wp-content/papercite-data/slides/bamford-apcom17-slides.pdf},
abstract={In recent years, Unmanned Aerial Vehicle (UAV) technology has been introduced into the mining industry to conduct terrain surveying. This work investigates the application of UAVs with artificial lighting for measurement of rock fragmentation under poor lighting conditions, representing night shifts in surface mines or working conditions in underground mines. The study relies on indoor and outdoor experiments for rock fragmentation analysis using a quadrotor UAV. Comparison of the rock size distributions in both cases show that adequate artificial lighting enables similar accuracy to ideal lighting conditions.},
}

A framework for multi-vehicle navigation using feedback-based motion primitives
M. Vukosavljev, Z. Kroeze, M. E. Broucke, and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017. Accepted.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [More Information]

We present a hybrid control framework for solving a motion planning problem among a collection of heterogeneous agents. The proposed approach utilizes a finite set of low-level motion primitives, each based on a piecewise affine feedback control, to generate complex motions in a gridded workspace. The constraints on allowable sequences of successive motion primitives are formalized through a maneuver automaton. At the higher level, a control policy generated by a shortest path non-deterministic algorithm determines which motion primitive is executed in each box of the gridded workspace. The overall framework yields a highly robust control design on both the low and high levels. We experimentally demonstrate the efficacy and robustness of this framework for multiple quadrocopters maneuvering in a 2D or 3D workspace.

@INPROCEEDINGS{vukosavljev-iros17,
author={Marijan Vukosavljev and Zachary Kroeze and Mireille E. Broucke and Angela P. Schoellig},
title={A Framework for Multi-Vehicle Navigation Using Feedback-Based Motion Primitives},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year={2017},
note={Accepted},
urllink={https://arxiv.org/abs/1707.06988},
urlvideo={https://www.youtube.com/watch?v=qhDQyvYNVEc},
abstract={We present a hybrid control framework for solving a motion planning problem among a collection of heterogeneous agents. The proposed approach utilizes a finite set of low-level motion primitives, each based on a piecewise affine feedback control, to generate complex motions in a gridded workspace. The constraints on allowable sequences of successive motion primitives are formalized through a maneuver automaton. At the higher level, a control policy generated by a shortest path non-deterministic algorithm determines which motion primitive is executed in each box of the gridded workspace. The overall framework yields a highly robust control design on both the low and high levels. We experimentally demonstrate the efficacy and robustness of this framework for multiple quadrocopters maneuvering in a 2D or 3D workspace.},
}
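
The snippet below illustrates only the high-level layer described in the abstract, assuming a hand-made grid and obstacle set: a backward breadth-first search over the gridded workspace assigns each free box the neighbouring box to move to next, standing in for the per-box choice of motion primitive; the maneuver automaton and the low-level feedback primitives themselves are not modelled.

from collections import deque

# Illustration of the high-level layer only: policy[box] = next box on a
# shortest path to the goal, computed by backward breadth-first search.
ROWS, COLS = 5, 7
obstacles = {(1, 2), (2, 2), (3, 2), (1, 5)}
goal = (2, 6)

def neighbours(cell):
    r, c = cell
    for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        nr, nc = r + dr, c + dc
        if 0 <= nr < ROWS and 0 <= nc < COLS and (nr, nc) not in obstacles:
            yield (nr, nc)

policy, frontier = {goal: goal}, deque([goal])
while frontier:
    cell = frontier.popleft()
    for nxt in neighbours(cell):
        if nxt not in policy:
            policy[nxt] = cell
            frontier.append(nxt)

start = (4, 0)
path = [start]
while path[-1] != goal:
    path.append(policy[path[-1]])
print("box sequence:", path)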

Design of deep neural networks as add-on blocks for improving impromptu trajectory tracking
S. Zhou, M. K. Helwa, and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2017. Accepted.
[View BibTeX] [View Abstract] [Download PDF] [More Information]

This paper introduces deep neural networks (DNNs) as add-on blocks to baseline feedback control systems to enhance tracking performance of arbitrary desired trajectories. The DNNs are trained to adapt the reference signals to the feedback control loop. The goal is to achieve a unity map between the desired and the actual outputs. In previous work, the efficacy of this approach was demonstrated on quadrotors; on 30 unseen test trajectories, the proposed DNN approach achieved an average impromptu tracking error reduction of 43% as compared to the baseline feedback controller. Motivated by these results, this work aims to provide platform-independent design guidelines for the proposed DNN-enhanced control architecture. In particular, we provide specific guidelines for the DNN feature selection, derive conditions for when the proposed approach is effective, and show in which cases the training efficiency can be further increased.

@INPROCEEDINGS{zhou-cdc17,
author={SiQi Zhou and Mohamed K. Helwa and Angela P. Schoellig},
title={Design of Deep Neural Networks as Add-on Blocks for Improving Impromptu Trajectory Tracking},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2017},
note={Accepted},
urllink = {https://arxiv.org/pdf/1705.10932.pdf},
abstract = {This paper introduces deep neural networks (DNNs) as add-on blocks to baseline feedback control systems to enhance tracking performance of arbitrary desired trajectories. The DNNs are trained to adapt the reference signals to the feedback control loop. The goal is to achieve a unity map between the desired and the actual outputs. In previous work, the efficacy of this approach was demonstrated on quadrotors; on 30 unseen test trajectories, the proposed DNN approach achieved an average impromptu tracking error reduction of 43% as compared to the baseline feedback controller. Motivated by these results, this work aims to provide platform-independent design guidelines for the proposed DNN-enhanced control architecture. In particular, we provide specific guidelines for the DNN feature selection, derive conditions for when the proposed approach is effective, and show in which cases the training efficiency can be further increased.}
}

Constrained Bayesian optimization with particle swarms for safe adaptive controller tuning
R. R. P. R. Duivenvoorden, F. Berkenkamp, N. Carion, A. Krause, and A. P. Schoellig
in Proc. of the IFAC (International Federation of Automatic Control) World Congress, 2017, pp. 12306-12313.
[View BibTeX] [View Abstract] [Download PDF]

Tuning controller parameters is a recurring and time-consuming problem in control. This is especially true in the field of adaptive control, where good performance is typically only achieved after significant tuning. Recently, it has been shown that constrained Bayesian optimization is a promising approach to automate the tuning process without risking system failures during the optimization process. However, this approach is computationally too expensive for tuning more than a couple of parameters. In this paper, we provide a heuristic in order to efficiently perform constrained Bayesian optimization in high-dimensional parameter spaces by using an adaptive discretization based on particle swarms. We apply the method to the tuning problem of an L1 adaptive controller on a quadrotor vehicle and show that we can reliably and automatically tune parameters in experiments.

@INPROCEEDINGS{duivenvoorden-ifac17,
author = {Rikky R.P.R. Duivenvoorden and Felix Berkenkamp and Nicolas Carion and Andreas Krause and Angela P. Schoellig},
title = {Constrained {B}ayesian Optimization with Particle Swarms for Safe Adaptive Controller Tuning},
booktitle = {{Proc. of the IFAC (International Federation of Automatic Control) World Congress}},
year = {2017},
pages = {12306--12313},
abstract = {Tuning controller parameters is a recurring and time-consuming problem in control. This is especially true in the field of adaptive control, where good performance is typically only achieved after significant tuning. Recently, it has been shown that constrained Bayesian optimization is a promising approach to automate the tuning process without risking system failures during the optimization process. However, this approach is computationally too expensive for tuning more than a couple of parameters. In this paper, we provide a heuristic in order to efficiently perform constrained Bayesian optimization in high-dimensional parameter spaces by using an adaptive discretization based on particle swarms. We apply the method to the tuning problem of an L1 adaptive controller on a quadrotor vehicle and show that we can reliably and automatically tune parameters in experiments.},
}
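
A compact sketch of the particle-swarm step above, with placeholder acquisition and safety functions (in the method itself both come from Gaussian process models of the performance and of the safety condition, and every constant below is an arbitrary illustrative value): the swarm maximizes the acquisition value over controller parameters while discarding candidates whose estimated probability of being safe is too low.

import numpy as np

# Particle swarm over controller parameters with a safety-probability filter.
# The acquisition and safety functions here are placeholders.
rng = np.random.default_rng(4)
DIM, N_PARTICLES, ITERS = 2, 40, 60
LOW, HIGH = -1.0, 1.0

def acquisition(theta):                 # placeholder acquisition function
    return -np.sum((theta - 0.3) ** 2, axis=-1)

def prob_safe(theta):                   # placeholder probability of safety
    return 1.0 / (1.0 + np.exp(10.0 * (np.linalg.norm(theta, axis=-1) - 0.8)))

def constrained_value(theta, p_min=0.95):
    return np.where(prob_safe(theta) >= p_min, acquisition(theta), -np.inf)

x = rng.uniform(LOW, HIGH, size=(N_PARTICLES, DIM))
v = np.zeros_like(x)
p_best, p_val = x.copy(), constrained_value(x)
for _ in range(ITERS):
    g_best = p_best[np.argmax(p_val)]   # best safe candidate found so far
    r1, r2 = rng.uniform(size=(2, N_PARTICLES, 1))
    v = 0.7 * v + 1.5 * r1 * (p_best - x) + 1.5 * r2 * (g_best - x)
    x = np.clip(x + v, LOW, HIGH)
    val = constrained_value(x)
    better = val > p_val
    p_best[better], p_val[better] = x[better], val[better]

print("suggested controller parameters:", p_best[np.argmax(p_val)])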

[DOI] Learning multimodal models for robot dynamics online with a mixture of Gaussian process experts
C. D. McKinnon and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, pp. 322-328.
[View BibTeX] [View Abstract] [Download PDF]

For decades, robots have been essential allies alongside humans in controlled industrial environments like heavy manufacturing facilities. However, without the guidance of a trusted human operator to shepherd a robot safely through a wide range of conditions, they have been barred from the complex, ever-changing environments that we live in from day to day. Safe learning control has emerged as a promising way to bridge the gap between algorithms based on first principles and complex real-world scenarios by using data to adapt and improve performance over time. Safe learning methods rely on a good estimate of the robot dynamics and of the bounds on modelling error in order to be effective. Current methods focus on either a single adaptive model or a fixed, known set of models for the robot dynamics. This limits them to static or slowly changing environments. This paper presents a method using Gaussian Processes in a Dirichlet Process mixture model to learn an increasing number of non-linear models for the robot dynamics. We show that this approach enables a robot to re-use past experience from an arbitrary number of previously visited operating conditions, and to automatically learn a new model when a new and distinct operating condition is encountered. This approach improves the robustness of existing Gaussian Process-based models to large changes in dynamics that do not have to be specified ahead of time.

@INPROCEEDINGS{mckinnon-icra17,
author = {Christopher D. McKinnon and Angela P. Schoellig},
title = {Learning multimodal models for robot dynamics online with a mixture of {G}aussian process experts},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2017},
pages = {322--328},
doi = {10.1109/ICRA.2017.7989041},
abstract = {For decades, robots have been essential allies alongside humans in controlled industrial environments like heavy manufacturing facilities. However, without the guidance of a trusted human operator to shepherd a robot safely through a wide range of conditions, they have been barred from the complex, ever-changing environments that we live in from day to day. Safe learning control has emerged as a promising way to bridge the gap between algorithms based on first principles and complex real-world scenarios by using data to adapt and improve performance over time. Safe learning methods rely on a good estimate of the robot dynamics and of the bounds on modelling error in order to be effective. Current methods focus on either a single adaptive model or a fixed, known set of models for the robot dynamics. This limits them to static or slowly changing environments. This paper presents a method using Gaussian Processes in a Dirichlet Process mixture model to learn an increasing number of non-linear models for the robot dynamics. We show that this approach enables a robot to re-use past experience from an arbitrary number of previously visited operating conditions, and to automatically learn a new model when a new and distinct operating condition is encountered. This approach improves the robustness of existing Gaussian Process-based models to large changes in dynamics that do not have to be specified ahead of time.},
}
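
A heavily simplified sketch of the idea above, not the paper's Dirichlet process inference: each expert is a small Gaussian process; an incoming sample is assigned to the expert with the highest predictive log-likelihood, and a new expert is spawned when no existing expert explains the sample well. The kernel, noise level, threshold, and data stream are all invented for illustration.

import numpy as np

# Growing set of GP "experts" with likelihood-based assignment and spawning.
NOISE = 0.1

def rbf(A, B, ell=0.5, sf=1.0):
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=2)
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

class GPExpert:
    def __init__(self, x, y):
        self.X, self.y = np.atleast_2d(x), np.atleast_1d(y).astype(float)

    def log_pred(self, x, y):
        K = rbf(self.X, self.X) + NOISE**2 * np.eye(len(self.X))
        k = rbf(self.X, np.atleast_2d(x)).ravel()
        mu = k @ np.linalg.solve(K, self.y)
        var = 1.0 + NOISE**2 - k @ np.linalg.solve(K, k)
        return -0.5 * (np.log(2 * np.pi * var) + (y - mu) ** 2 / var)

    def add(self, x, y):
        self.X = np.vstack([self.X, np.atleast_2d(x)])
        self.y = np.append(self.y, y)

def assign(stream, spawn_threshold=-8.0):
    experts = []
    for x, y in stream:
        if not experts:
            experts.append(GPExpert(x, y))
            continue
        scores = [e.log_pred(x, y) for e in experts]
        best = int(np.argmax(scores))
        if scores[best] < spawn_threshold:
            experts.append(GPExpert(x, y))   # new operating condition detected
        else:
            experts[best].add(x, y)
    return experts

# Toy stream of (state, disturbance) pairs whose regime changes halfway through.
rng = np.random.default_rng(5)
xs = rng.uniform(-1, 1, size=(60, 1))
ys = np.where(np.arange(60) < 30, np.sin(3 * xs[:, 0]), np.sin(3 * xs[:, 0]) + 2.0)
experts = assign(zip(xs, ys))
print("number of experts learned:", len(experts))

A fixed spawning threshold is a crude stand-in for the role the Dirichlet process prior plays in the paper's model.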

[DOI] High-precision trajectory tracking in changing environments through L1 adaptive feedback and iterative learning
K. Pereida, R. R. P. R. Duivenvoorden, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, pp. 344-350.
[View BibTeX] [View Abstract] [Download PDF] [Download Slides] [More Information]

As robots and other automated systems are introduced to unknown and dynamic environments, robust and adaptive control strategies are required to cope with disturbances, unmodeled dynamics and parametric uncertainties. In this paper, we propose and provide theoretical proofs of a combined L1 adaptive feedback and iterative learning control (ILC) framework to improve trajectory tracking of a system subject to unknown and changing disturbances. The L1 adaptive controller forces the system to behave in a repeatable, predefined way, even in the presence of unknown and changing disturbances; however, this does not imply that perfect trajectory tracking is achieved. ILC improves the tracking performance based on experience from previous executions. The performance of ILC is limited by the robustness and repeatability of the underlying system, which, in this approach, is handled by the L1 adaptive controller. In particular, we are able to generalize learned trajectories across different system configurations because the L1 adaptive controller handles the underlying changes in the system. We demonstrate the improved trajectory tracking performance and generalization capabilities of the combined method compared to pure ILC in experiments with a quadrotor subject to unknown, dynamic disturbances. This is the first work to show L1 adaptive control combined with ILC in experiment.

@INPROCEEDINGS{pereida-icra17,
author = {Karime Pereida and Rikky R. P. R. Duivenvoorden and Angela P. Schoellig},
title = {High-precision trajectory tracking in changing environments through {L1} adaptive feedback and iterative learning},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2017},
pages = {344--350},
doi = {10.1109/ICRA.2017.7989044},
urllink = {http://ieeexplore.ieee.org/abstract/document/7989044/},
urlslides = {../../wp-content/papercite-data/slides/pereida-icra17-slides.pdf},
abstract = {As robots and other automated systems are introduced to unknown and dynamic environments, robust and adaptive control strategies are required to cope with disturbances, unmodeled dynamics and parametric uncertainties. In this paper, we propose and provide theoretical proofs of a combined L1 adaptive feedback and iterative learning control (ILC) framework to improve trajectory tracking of a system subject to unknown and changing disturbances. The L1 adaptive controller forces the system to behave in a repeatable, predefined way, even in the presence of unknown and changing disturbances; however, this does not imply that perfect trajectory tracking is achieved. ILC improves the tracking performance based on experience from previous executions. The performance of ILC is limited by the robustness and repeatability of the underlying system, which, in this approach, is handled by the L1 adaptive controller. In particular, we are able to generalize learned trajectories across different system configurations because the L1 adaptive controller handles the underlying changes in the system. We demonstrate the improved trajectory tracking performance and generalization capabilities of the combined method compared to pure ILC in experiments with a quadrotor subject to unknown, dynamic disturbances. This is the first work to show L1 adaptive control combined with ILC in experiment.},
}
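
The serial architecture can be summarized with a generic iterative learning update (the notation is illustrative, not the paper's exact formulation):

\[
u_{j+1}(t) = u_j(t) + L\, e_j(t), \qquad e_j(t) = y_{\mathrm{des}}(t) - y_j(t),
\]

where j indexes trial repetitions and L is a learning filter. The L1 adaptive inner loop keeps the map from the reference input u to the output y close to a fixed reference model even as the system configuration or disturbances change, which is what allows the feedforward input learned by ILC to generalize across configurations.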

[DOI] Deep neural networks for improved, impromptu trajectory tracking of quadrotors
Q. Li, J. Qian, Z. Zhu, X. Bao, M. K. Helwa, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, pp. 5183-5189.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [More Information]

Trajectory tracking control for quadrotors is important for applications ranging from surveying and inspection to film making. However, designing and tuning classical controllers, such as proportional-integral-derivative (PID) controllers, to achieve high tracking precision can be time-consuming and difficult, due to hidden dynamics and other non-idealities. The Deep Neural Network (DNN), with its superior capability of approximating abstract, nonlinear functions, offers a novel approach for enhancing trajectory tracking control. This paper presents a DNN-based algorithm as an add-on module that improves the tracking performance of a classical feedback controller. Given a desired trajectory, the DNNs provide a tailored reference input to the controller based on their gained experience. The input aims to achieve a unity map between the desired and the output trajectory. The motivation for this work is an interactive “fly-as-you-draw” application, in which a user draws a trajectory on a mobile device, and a quadrotor instantly flies that trajectory with the DNN-enhanced control system. Experimental results demonstrate that the proposed approach improves the tracking precision for user-drawn trajectories after the DNNs are trained on selected periodic trajectories, suggesting the method’s potential in real-world applications. Tracking errors are reduced by around 40-50% for both training and testing trajectories from users, highlighting the DNNs’ capability of generalizing knowledge.

@INPROCEEDINGS{li-icra17,
author = {Qiyang Li and Jingxing Qian and Zining Zhu and Xuchan Bao and Mohamed K. Helwa and Angela P. Schoellig},
title = {Deep neural networks for improved, impromptu trajectory tracking of quadrotors},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2017},
pages = {5183--5189},
doi = {10.1109/ICRA.2017.7989607},
urllink = {https://arxiv.org/abs/1610.06283},
urlvideo = {https://youtu.be/r1WnMUZy9-Y},
abstract = {Trajectory tracking control for quadrotors is important for applications ranging from surveying and inspection to film making. However, designing and tuning classical controllers, such as proportional-integral-derivative (PID) controllers, to achieve high tracking precision can be time-consuming and difficult, due to hidden dynamics and other non-idealities. The Deep Neural Network (DNN), with its superior capability of approximating abstract, nonlinear functions, offers a novel approach for enhancing trajectory tracking control. This paper presents a DNN-based algorithm as an add-on module that improves the tracking performance of a classical feedback controller. Given a desired trajectory, the DNNs provide a tailored reference input to the controller based on their gained experience. The input aims to achieve a unity map between the desired and the output trajectory. The motivation for this work is an interactive “fly-as-you-draw” application, in which a user draws a trajectory on a mobile device, and a quadrotor instantly flies that trajectory with the DNN-enhanced control system. Experimental results demonstrate that the proposed approach improves the tracking precision for user-drawn trajectories after the DNNs are trained on selected periodic trajectories, suggesting the method’s potential in real-world applications. Tracking errors are reduced by around 40-50% for both training and testing trajectories from users, highlighting the DNNs’ capability of generalizing knowledge.},
}

[DOI] Virtual vs. real: trading off simulations and physical experiments in reinforcement learning with Bayesian optimization
A. Marco, F. Berkenkamp, P. Hennig, A. P. Schoellig, A. Krause, S. Schaal, and S. Trimpe
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, pp. 1557-1563.
[View BibTeX] [View Abstract] [Download PDF] [More Information]

In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.

@INPROCEEDINGS{marco-icra17,
author = {Alonso Marco and Felix Berkenkamp and Philipp Hennig and Angela P. Schoellig and Andreas Krause and Stefan Schaal and Sebastian Trimpe},
title = {Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with {B}ayesian Optimization},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
month = {may},
year = {2017},
pages = {1557--1563},
doi = {10.1109/ICRA.2017.7989186},
urllink = {https://arxiv.org/abs/1703.01250},
abstract = {In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.}
}

Safe model-based reinforcement learning with stability guarantees
F. Berkenkamp, M. Turchetta, A. P. Schoellig, and A. Krause
Technical Report, arXiv, 2017.
[View BibTeX] [View Abstract] [Download PDF] [More Information]

Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety in terms of stability guarantees. Specifically, we extend control theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.

@TECHREPORT{berkenkamp-nips17,
title = {Safe model-based reinforcement learning with stability guarantees},
institution = {arXiv},
author = {Felix Berkenkamp and Matteo Turchetta and Angela P. Schoellig and Andreas Krause},
year = {2017},
urllink = {https://arxiv.org/abs/1705.08551},
abstract = {Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety in terms of stability guarantees. Specifically, we extend control theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.}
}
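
The snippet below illustrates only the certification step on a toy one-dimensional system, with the Gaussian process confidence interval replaced by a fixed model-error bound (the dynamics, Lyapunov function, grid, and bound are all invented): it checks a Lyapunov decrease condition on a grid under the worst-case model and reports the largest Lyapunov sublevel set on which the decrease holds everywhere.

import numpy as np

# Toy 1-D certification step: Lyapunov decrease under a worst-case model,
# then the largest certified sublevel set.
dt = 0.1

def v(x):                                   # Lyapunov candidate V(x) = x^2
    return x ** 2

def x_next_worst(x, model_err=0.3):
    # Closed-loop model x_dot = -x + 0.5*x^3, discretized, plus a worst-case
    # model-error term (a stand-in for the GP confidence bound in the paper).
    nominal = x + dt * (0.5 * x**3 - x)
    return nominal + np.sign(nominal) * dt * model_err * np.abs(x)

xs = np.linspace(-2.0, 2.0, 401)
decrease = v(x_next_worst(xs)) < v(xs) - 1e-9   # decrease certified per grid point
decrease[np.isclose(xs, 0.0)] = True            # the equilibrium itself

# Largest level c such that the sublevel set {V(x) <= c} is entirely certified.
c_safe = 0.0
for c in np.sort(v(xs)):
    if np.all(decrease[v(xs) <= c]):
        c_safe = c
print("certified region of attraction: V(x) <=", round(float(c_safe), 3))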

Point-cloud-based aerial fragmentation analysis for application in the minerals industry
T. Bamford, K. Esmaeili, and A. P. Schoellig
Technical Report, arXiv, 2017.
[View BibTeX] [View Abstract] [Download PDF] [More Information]

This work investigates the application of Unmanned Aerial Vehicle (UAV) technology for measurement of rock fragmentation without placement of scale objects in the scene to determine image scale. Commonly practiced image-based rock fragmentation analysis requires a technician to walk to a rock pile, place a scale object of known size in the area of interest, and capture individual 2D images. Our previous work has used UAV technology for the first time to acquire real-time rock fragmentation data and has shown comparable quality of results; however, it still required the (potentially dangerous) placement of scale objects, and continued to make the assumption that the rock pile surface is planar and that the scale objects lie on the surface plane. This work improves our UAV-based approach to enable rock fragmentation measurement without placement of scale objects and without the assumption of planarity. This is achieved by first generating a point cloud of the rock pile from 2D images, taking into account intrinsic and extrinsic camera parameters, and then taking 2D images for fragmentation analysis. This work represents an important step towards automating post-blast rock fragmentation analysis. In experiments, a rock pile with known size distribution was photographed by the UAV with and without using scale objects. For fragmentation analysis without scale objects, a point cloud of the rock pile was generated and used to compute image scale. Comparison of the rock size distributions show that this point-cloud-based method enables producing measurements with better or comparable accuracy (within 10% of the ground truth) to the manual method with scale objects.

@TECHREPORT{bamford-iros17,
title = {Point-cloud-based aerial fragmentation analysis for application in the minerals industry},
institution = {arXiv},
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
year = {2017},
urllink = {https://arxiv.org/abs/1703.01945},
abstract = {This work investigates the application of Unmanned Aerial Vehicle (UAV) technology for measurement of rock fragmentation without placement of scale objects in the scene to determine image scale. Commonly practiced image-based rock fragmentation analysis requires a technician to walk to a rock pile, place a scale object of known size in the area of interest, and capture individual 2D images. Our previous work has used UAV technology for the first time to acquire real-time rock fragmentation data and has shown comparable quality of results; however, it still required the (potentially dangerous) placement of scale objects, and continued to make the assumption that the rock pile surface is planar and that the scale objects lie on the surface plane. This work improves our UAV-based approach to enable rock fragmentation measurement without placement of scale objects and without the assumption of planarity. This is achieved by first generating a point cloud of the rock pile from 2D images, taking into account intrinsic and extrinsic camera parameters, and then taking 2D images for fragmentation analysis. This work represents an important step towards automating post-blast rock fragmentation analysis. In experiments, a rock pile with known size distribution was photographed by the UAV with and without using scale objects. For fragmentation analysis without scale objects, a point cloud of the rock pile was generated and used to compute image scale. Comparison of the rock size distributions show that this point-cloud-based method enables producing measurements with better or comparable accuracy (within 10% of the ground truth) to the manual method with scale objects.},
}
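
A back-of-the-envelope illustration of the scale computation enabled by the point cloud (all numbers are hypothetical, and the paper additionally accounts for non-planar pile surfaces): with the camera's focal length in pixels and the reconstructed distance to the pile, the metres-per-pixel image scale follows from the pinhole relation, with no physical scale object required.

import numpy as np

# Image scale from the reconstruction: metres per pixel ~ distance / focal length.
focal_length_px = 3000.0                      # from camera calibration
cam_position = np.array([0.0, 0.0, 12.0])     # camera pose from the reconstruction
pile_points = np.array([[1.0, 2.0, 1.5],      # a few pile points from the cloud (m)
                        [0.5, 1.0, 2.0],
                        [-1.0, 0.5, 1.2]])

depth = np.median(np.linalg.norm(pile_points - cam_position, axis=1))
metres_per_pixel = depth / focal_length_px
print(f"median camera-to-pile distance: {depth:.2f} m")
print(f"image scale: {metres_per_pixel * 1000:.2f} mm/pixel")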

2016

[DOI] Robust constrained learning-based NMPC enabling reliable mobile robot path tracking
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
International Journal of Robotics Research, vol. 35, iss. 13, pp. 1547-1563, 2016.
[View BibTeX] [View Abstract] [Download PDF] [View Video]
This paper presents a Robust Constrained Learning-based Nonlinear Model Predictive Control (RC-LB-NMPC) algorithm for path-tracking in off-road terrain. For mobile robots, constraints may represent solid obstacles or localization limits. As a result, constraint satisfaction is required for safety. Constraint satisfaction is typically guaranteed through the use of accurate, a priori models or robust control. However, accurate models are generally not available for off-road operation. Furthermore, robust controllers are often conservative, since model uncertainty is not updated online. In this work our goal is to use learning to generate low-uncertainty, non-parametric models in situ. Based on these models, the predictive controller computes both linear and angular velocities in real-time, such that the robot drives at or near its capabilities while respecting path and localization constraints. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, off-road environments. The paper presents experimental results, including over 5 km of travel by a 900 kg skid-steered robot at speeds of up to 2.0 m/s. The result is a robust, learning controller that provides safe, conservative control during initial trials when model uncertainty is high and converges to high-performance, optimal control during later trials when model uncertainty is reduced with experience.

@ARTICLE{ostafew-ijrr16,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Robust Constrained Learning-Based {NMPC} Enabling Reliable Mobile Robot Path Tracking},
year = {2016},
journal = {{International Journal of Robotics Research}},
volume = {35},
number = {13},
pages = {1547-1563},
doi = {10.1177/0278364916645661},
url = {http://dx.doi.org/10.1177/0278364916645661},
eprint = {http://dx.doi.org/10.1177/0278364916645661},
urlvideo = {https://youtu.be/3xRNmNv5Efk},
abstract = {This paper presents a Robust Constrained Learning-based Nonlinear Model Predictive Control (RC-LB-NMPC) algorithm for path-tracking in off-road terrain. For mobile robots, constraints may represent solid obstacles or localization limits. As a result, constraint satisfaction is required for safety. Constraint satisfaction is typically guaranteed through the use of accurate, a priori models or robust control. However, accurate models are generally not available for off-road operation. Furthermore, robust controllers are often conservative, since model uncertainty is not updated online. In this work our goal is to use learning to generate low-uncertainty, non-parametric models in situ. Based on these models, the predictive controller computes both linear and angular velocities in real-time, such that the robot drives at or near its capabilities while respecting path and localization constraints. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, off-road environments. The paper presents experimental results, including over 5 km of travel by a 900 kg skid-steered robot at speeds of up to 2.0 m/s. The result is a robust, learning controller that provides safe, conservative control during initial trials when model uncertainty is high and converges to high-performance, optimal control during later trials when model uncertainty is reduced with experience.},
}

[DOI] Distributed iterative learning control for a team of quadrotors
A. Hock and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2016, pp. 4640-4646.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [View 2nd Video] [Download Slides] [More Information]

The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle has access only to the information of its neighbors. The desired trajectory is only available to one (or a few) vehicles. We present a distributed iterative learning control (ILC) approach where each vehicle learns from the experience of its own and its neighbors’ previous task repetitions, and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove stability for any causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function that only depends on the tracking error derivative (D-type ILC). Our extension provides more degrees of freedom in the ILC design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows us to use an additional consensus feedback controller to compensate for non-repetitive disturbances. Experiments with two quadrotors attest to the effectiveness of the proposed distributed multi-agent ILC approach. This is the first work to show distributed ILC in experiment.

@INPROCEEDINGS{hock-cdc16,
author = {Andreas Hock and Angela P. Schoellig},
title = {Distributed iterative learning control for a team of quadrotors},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2016},
pages = {4640-4646},
doi = {10.1109/CDC.2016.7798976},
urllink = {http://arxiv.org/ads/1603.05933},
urlvideo = {https://youtu.be/Qw598DRw6-Q},
urlvideo2 = {https://youtu.be/JppRu26eZgI},
urlslides = {../../wp-content/papercite-data/slides/hock-cdc16-slides.pdf},
abstract = {The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle has access only to the information of its neighbors. The desired trajectory is only available to one (or a few) vehicles. We present a distributed iterative learning control (ILC) approach where each vehicle learns from the experience of its own and its neighbors’ previous task repetitions, and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove stability for any causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function that only depends on the tracking error derivative (D-type ILC). Our extension provides more degrees of freedom in the ILC design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows us to use an additional consensus feedback controller to compensate for non-repetitive disturbances. Experiments with two quadrotors attest to the effectiveness of the proposed distributed multi-agent ILC approach. This is the first work to show distributed ILC in experiment.},
}
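
The learning update at the core of distributed ILC can be illustrated with a small numerical sketch. The toy example below is not the experimental setup of the paper: it uses a two-vehicle chain with illustrative scalar dynamics and hand-picked PD-type learning gains, where the first agent observes the reference and the second agent only observes its neighbor.

import numpy as np

# Toy setup (illustrative, not the paper's quadrotor dynamics): two agents
# with identical scalar dynamics x[k+1] = a*x[k] + b*u[k], connected in a
# chain. Agent 0 sees the reference; agent 1 only sees agent 0's output.
a, b = 0.9, 0.1
N = 50
t = np.arange(N)
ref = np.sin(2 * np.pi * t / N)            # desired trajectory

def rollout(u):
    """Simulate one task repetition and return the output trajectory."""
    x, y = 0.0, np.zeros(N)
    for k in range(N):
        y[k] = x
        x = a * x + b * u[k]
    return y

u = [np.zeros(N), np.zeros(N)]             # feedforward input of each agent
kp, kd = 5.0, 1.0                          # illustrative PD-type learning gains

for iteration in range(20):
    y0, y1 = rollout(u[0]), rollout(u[1])
    e0 = ref - y0                          # agent 0 learns from the reference
    e1 = y0 - y1                           # agent 1 learns from its neighbor
    for i, e in enumerate((e0, e1)):
        de = np.gradient(e)                # D-type term (error derivative)
        # causal PD-type update, shifted by one step (relative degree one)
        u[i] = u[i] + np.roll(kp * e + kd * de, -1)
    print(f"iter {iteration:2d}  |e0| = {np.linalg.norm(e0):.3f}"
          f"  |e1| = {np.linalg.norm(e1):.3f}")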

[DOI] On the construction of safe controllable regions for affine systems with applications to robotics
M. K. Helwa and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2016, pp. 3000-3005.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides] [More Information]

This paper studies the problem of constructing in-block controllable (IBC) regions for affine systems. That is, we are concerned with constructing regions in the state space of affine systems such that all the states in the interior of the region are mutually accessible through the region’s interior by applying uniformly bounded inputs. We first show that existing results for checking in-block controllability on given polytopic regions cannot be easily extended to address the question of constructing IBC regions. We then explore the geometry of the problem to provide a computationally efficient algorithm for constructing IBC regions. We also prove the soundness of the algorithm. Finally, we use the proposed algorithm to construct safe speed profiles for fully-actuated robots and for ground robots modeled as unicycles with acceleration limits.

@INPROCEEDINGS{helwa-cdc16,
author = {Mohamed K. Helwa and Angela P. Schoellig},
title = {On the construction of safe controllable regions for affine systems with applications to robotics},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2016},
pages = {3000-3005},
doi = {10.1109/CDC.2016.7798717},
urllink = {https://arxiv.org/abs/1610.01243},
urlslides = {../../wp-content/papercite-data/slides/helwa-cdc16-slides.pdf},
urlvideo = {https://youtu.be/s_N7zTtCjd0},
abstract = {This paper studies the problem of constructing in-block controllable (IBC) regions for affine systems. That is, we are concerned with constructing regions in the state space of affine systems such that all the states in the interior of the region are mutually accessible through the region’s interior by applying uniformly bounded inputs. We first show that existing results for checking in-block controllability on given polytopic regions cannot be easily extended to address the question of constructing IBC regions. We then explore the geometry of the problem to provide a computationally efficient algorithm for constructing IBC regions. We also prove the soundness of the algorithm. Finally, we use the proposed algorithm to construct safe speed profiles for fully-actuated robots and for ground robots modeled as unicycles with acceleration limits.},
}

[DOI] Safe learning of regions of attraction for uncertain, nonlinear systems with Gaussian processes
F. Berkenkamp, R. Moriconi, A. P. Schoellig, and A. Krause
in Proc. of the IEEE Conference on Decision and Control (CDC), 2016, pp. 4661-4666.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Code] [Code 2] [Download Slides] [More Information]

Control theory can provide useful insights into the properties of controlled, dynamic systems. One important property of nonlinear systems is the region of attraction (ROA), a safe subset of the state space in which a given controller renders an equilibrium point asymptotically stable. The ROA is typically estimated based on a model of the system. However, since models are only an approximation of the real world, the resulting estimated safe region can contain states outside the ROA of the real system. This is not acceptable in safety-critical applications. In this paper, we consider an approach that learns the ROA from experiments on a real system, without ever leaving the true ROA and, thus, without risking safety-critical failures. Based on regularity assumptions on the model errors in terms of a Gaussian process prior, we use an underlying Lyapunov function in order to determine a region in which an equilibrium point is asymptotically stable with high probability. Moreover, we provide an algorithm to actively and safely explore the state space in order to expand the ROA estimate. We demonstrate the effectiveness of this method in simulation.

@INPROCEEDINGS{berkenkamp-cdc16,
author = {Felix Berkenkamp and Riccardo Moriconi and Angela P. Schoellig and Andreas Krause},
title = {Safe learning of regions of attraction for uncertain, nonlinear systems with {G}aussian processes},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2016},
pages = {4661-4666},
doi = {10.1109/CDC.2016.7798979},
urllink = {http://arxiv.org/abs/1603.04915},
urlvideo = {https://youtu.be/bSv-pNOWn7c},
urlslides={../../wp-content/papercite-data/slides/berkenkamp-cdc16-slides.pdf},
urlcode = {https://github.com/befelix/lyapunov-learning},
urlcode2 = {http://berkenkamp.me/jupyter/lyapunov},
abstract = {Control theory can provide useful insights into the properties of controlled, dynamic systems. One important property of nonlinear systems is the region of attraction (ROA), a safe subset of the state space in which a given controller renders an equilibrium point asymptotically stable. The ROA is typically estimated based on a model of the system. However, since models are only an approximation of the real world, the resulting estimated safe region can contain states outside the ROA of the real system. This is not acceptable in safety-critical applications. In this paper, we consider an approach that learns the ROA from experiments on a real system, without ever leaving the true ROA and, thus, without risking safety-critical failures. Based on regularity assumptions on the model errors in terms of a Gaussian process prior, we use an underlying Lyapunov function in order to determine a region in which an equilibrium point is asymptotically stable with high probability. Moreover, we provide an algorithm to actively and safely explore the state space in order to expand the ROA estimate. We demonstrate the effectiveness of this method in simulation.}
}
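
The core certification step can be sketched compactly for a one-dimensional example. The code below is a much-simplified stand-in for the paper's algorithm: the dynamics, Lyapunov function, kernel hyperparameters, and confidence scaling are all illustrative, and the decrease condition is only checked outside a small neighbourhood of the origin.

import numpy as np

# Simplified one-dimensional stand-in for the idea: certify a level set of a
# Lyapunov function as a region-of-attraction (ROA) estimate by requiring
# that a GP upper confidence bound on the decrease of V is negative.
# Dynamics, Lyapunov function, kernel, and thresholds are all illustrative.

def rbf(a, b, ell=0.3):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

f = lambda x: 0.7 * x - 0.2 * np.sin(3 * x)      # true (unknown) closed loop
V = lambda x: x ** 2                             # Lyapunov candidate

# Noisy observations of the decrease dV(x) = V(f(x)) - V(x) at safe states
rng = np.random.default_rng(0)
X = np.linspace(-0.5, 0.5, 15)
y = V(f(X)) - V(X) + 0.01 * rng.standard_normal(len(X))

# GP posterior of dV on a grid of candidate states
grid = np.linspace(-1.5, 1.5, 301)
K = rbf(X, X) + 1e-4 * np.eye(len(X))
k_star = rbf(grid, X)
mean = k_star @ np.linalg.solve(K, y)
var = 1.0 - np.einsum('ij,ji->i', k_star, np.linalg.solve(K, k_star.T))
ucb = mean + 2.0 * np.sqrt(np.maximum(var, 0.0))

# Largest level set {V(x) <= c} on which the upper bound of dV is negative.
# The decrease condition is only checked outside a small neighbourhood of
# the origin (V > tau), where dV necessarily approaches zero.
tau, safe_c = 0.05, 0.05
for c in np.unique(V(grid)):
    inside = (V(grid) <= c) & (V(grid) > tau)
    if inside.any() and not np.all(ucb[inside] < 0):
        break
    safe_c = c
print(f"certified ROA estimate: V(x) <= {safe_c:.3f}")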

A real-time analysis of rock fragmentation using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
in Proc. of the International Conference on Computer Applications in the Minerals Industries (CAMI), 2016.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides] [More Information]

Accurate measurement of blast-induced rock fragmentation is of great importance for many mining operations. The post-blast rock size distribution can significantly influence the efficiency of all the downstream mining and comminution processes. Image analysis methods are one of the most common methods used to measure rock fragment size distribution in mines, despite criticism of their lack of accuracy in measuring fine particles and other perceived deficiencies. The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments can not only improve the quality of the image data but also automate the data collection process. Ultimately, real-time acquisition of high temporal- and spatial-resolution data based on UAV technology will provide a broad range of opportunities for both improving blast design without interrupting the production process and reducing the cost of the human operator.

@INPROCEEDINGS{bamford-cami16,
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title = {A real-time analysis of rock fragmentation using {UAV} technology},
booktitle = {{Proc. of the International Conference on Computer Applications in the Minerals Industries (CAMI)}},
year = {2016},
urllink = {http://arxiv.org/abs/1607.04243},
urlvideo = {https://youtu.be/q0syk6J_JHY},
urlslides={../../wp-content/papercite-data/slides/bamford-cami16-slides.pdf},
abstract = {Accurate measurement of blast-induced rock fragmentation is of great importance for many mining operations. The post-blast rock size distribution can significantly influence the efficiency of all the downstream mining and comminution processes. Image analysis methods are one of the most common methods used to measure rock fragment size distribution in mines, despite criticism of their lack of accuracy in measuring fine particles and other perceived deficiencies. The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments can not only improve the quality of the image data but also automate the data collection process. Ultimately, real-time acquisition of high temporal- and spatial-resolution data based on UAV technology will provide a broad range of opportunities for both improving blast design without interrupting the production process and reducing the cost of the human operator.},
}

[DOI] Unscented external force estimation for quadrotors and experiments
C. D. McKinnon and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016, pp. 5651-5657.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [More Information]

In this paper, we describe an algorithm, based on the well-known Unscented Quaternion Estimator, to estimate external forces and torques acting on a quadrotor. This formulation uses a non-linear model for the quadrotor dynamics, naturally incorporates process and measurement noise, requires only a few parameters to be tuned manually, and uses singularity-free unit quaternions to represent attitude. We demonstrate in simulation that the proposed algorithm can outperform existing methods. We then highlight how our approach can be used to generate force and torque profiles from experimental data, and how this information can later be used for controller design. Finally, we show how the resulting controllers enable a quadrotor to stay in the wind field of a moving fan.

@INPROCEEDINGS{mckinnon-iros16,
author = {Christopher D. McKinnon and Angela P. Schoellig},
title = {Unscented external force estimation for quadrotors and experiments},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year = {2016},
pages = {5651-5657},
doi = {10.1109/IROS.2016.7759831},
urllink = {http://arxiv.org/abs/1603.02772},
urlvideo = {https://youtu.be/YFA3kHabY38},
abstract = {In this paper, we describe an algorithm, based on the well-known Unscented Quaternion Estimator, to estimate external forces and torques acting on a quadrotor. This formulation uses a non-linear model for the quadrotor dynamics, naturally incorporates process and measurement noise, requires only a few parameters to be tuned manually, and uses singularity-free unit quaternions to represent attitude. We demonstrate in simulation that the proposed algorithm can outperform existing methods. We then highlight how our approach can be used to generate force and torque profiles from experimental data, and how this information can later be used for controller design. Finally, we show how the resulting controllers enable a quadrotor to stay in the wind field of a moving fan.},
}
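
The underlying idea, that unmodelled external forces appear as a residual between measured accelerations and the thrust-plus-gravity model, can be sketched in a few lines. The snippet below is a deliberately simplified, low-pass-filtered residual estimator, not the unscented Kalman filter of the paper; mass, filter gain, and the hover scenario are illustrative.

import numpy as np

# Deliberately simplified residual estimator (not the paper's unscented
# Kalman filter): the external force is whatever the thrust-plus-gravity
# model cannot explain in the measured acceleration, low-pass filtered.
m, g = 0.5, np.array([0.0, 0.0, -9.81])          # mass [kg], gravity [m/s^2]

def external_force_update(accel_world, R_wb, thrust, f_prev, alpha=0.1):
    """One update of the filtered external force estimate [N]."""
    f_model = R_wb @ np.array([0.0, 0.0, thrust]) + m * g   # modelled forces
    residual = m * accel_world - f_model                    # unexplained force
    return (1 - alpha) * f_prev + alpha * residual          # low-pass filter

# Example: hovering (level attitude, thrust = m*g) in a 0.3 N wind along x
f_est = np.zeros(3)
for _ in range(200):
    accel = np.array([0.3 / m, 0.0, 0.0])        # wind-induced acceleration
    f_est = external_force_update(accel, np.eye(3), m * 9.81, f_est)
print(f_est)                                     # approaches [0.3, 0, 0]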

[DOI] Safe and robust quadrotor maneuvers based on reach control
M. Vukosavljev, I. Jansen, M. E. Broucke, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 5677-5682.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides] [Download 2nd Slides] [More Information]

In this paper, we investigate the synthesis of piecewise affine feedback controllers to execute safe and robust quadrocopter maneuvers. The methodology is based on formulating the problem as a reach control problem on a polytopic state space. Reach control has so far only been developed in theory and has not been tested experimentally in a real system before. We demonstrate that these theoretical tools can achieve aggressive, albeit safe and robust, quadrocopter maneuvers without the need for a predefined open-loop reference trajectory. In a proof-of-concept demonstration, the reach controller is implemented in one translational direction while the other degrees of freedom are stabilized by separate controllers. Experimental results on a quadrocopter show the effectiveness and robustness of this control approach.

@INPROCEEDINGS{vukosavljev-icra16,
author = {Marijan Vukosavljev and Ivo Jansen and Mireille E. Broucke and Angela P. Schoellig},
title = {Safe and robust quadrotor maneuvers based on reach control},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2016},
pages = {5677-5682},
doi = {10.1109/ICRA.2016.7487789},
urllink = {https://arxiv.org/abs/1610.02385},
urlvideo={https://youtu.be/l4vdxdmd2xc},
urlslides={../../wp-content/papercite-data/slides/vukosavljev-icra16-slides.pdf},
urlslides2={../../wp-content/papercite-data/slides/vukosavljev-icra16-slides2.pdf},
abstract = {In this paper, we investigate the synthesis of piecewise affine feedback controllers to execute safe and robust quadrocopter maneuvers. The methodology is based on formulating the problem as a reach control problem on a polytopic state space. Reach control has so far only been developed in theory and has not been tested experimentally in a real system before. We demonstrate that these theoretical tools can achieve aggressive, albeit safe and robust, quadrocopter maneuvers without the need for a predefined open-loop reference trajectory. In a proof-of-concept demonstration, the reach controller is implemented in one translational direction while the other degrees of freedom are stabilized by separate controllers. Experimental results on a quadrocopter show the effectiveness and robustness of this control approach.}
}

[DOI] Safe controller optimization for quadrotors with Gaussian processes
F. Berkenkamp, A. P. Schoellig, and A. Krause
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 491-496.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [View 2nd Video] [Code] [More Information]

One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to obtain an initial controller, but ultimately the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning step, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters on the real system, safety-critical system failures may happen. In this paper, we overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing safety. It models the underlying performance measure as a Gaussian process and only explores new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.

@INPROCEEDINGS{berkenkamp-icra16,
author = {Felix Berkenkamp and Angela P. Schoellig and Andreas Krause},
title = {Safe controller optimization for quadrotors with {G}aussian processes},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2016},
month = {May},
pages = {491--496},
doi = {10.1109/ICRA.2016.7487170},
urllink = {http://arxiv.org/abs/1509.01066},
urlvideo = {https://www.youtube.com/watch?v=GiqNQdzc5TI},
urlvideo2 = {https://www.youtube.com/watch?v=IYi8qMnt0yU},
urlcode = {https://github.com/befelix/SafeOpt},
abstract = {One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to obtain an initial controller, but ultimately the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning step, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters on the real system, safety-critical system failures may happen. In this paper, we overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing safety. It models the underlying performance measure as a Gaussian process and only explores new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.},
}
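
The safe-exploration loop described above can be sketched with a one-dimensional toy problem. The code below is a generic illustration of the idea, not the released SafeOpt implementation linked under [Code]: the performance function, kernel, confidence scaling, and threshold are all assumptions, and parameters are only evaluated while their Gaussian-process lower confidence bound stays above the safe threshold.

import numpy as np

# Generic toy version of the safe-exploration loop (not the released SafeOpt
# package): only parameters whose GP lower confidence bound exceeds a safe
# performance threshold are evaluated, and among those the most uncertain
# one is tried next.

rng = np.random.default_rng(1)
perf = lambda k: -(k - 2.0) ** 2 + 4.0           # unknown performance, peak at 2
j_min, noise, beta = 0.5, 0.05, 2.0              # safety threshold, noise, scaling

def rbf(a, b, ell=0.7):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

candidates = np.linspace(0.0, 4.0, 81)           # discretized parameter set
X = np.array([1.0])                              # initial safe controller gain
Y = np.array([perf(1.0) + noise * rng.standard_normal()])

for _ in range(15):
    K = rbf(X, X) + noise ** 2 * np.eye(len(X))
    k_star = rbf(candidates, X)
    mean = k_star @ np.linalg.solve(K, Y)
    var = 1.0 - np.einsum('ij,ji->i', k_star, np.linalg.solve(K, k_star.T))
    std = np.sqrt(np.maximum(var, 1e-12))
    safe = mean - beta * std > j_min             # lower bound above threshold
    if not safe.any():
        break
    idx = np.flatnonzero(safe)[np.argmax(std[safe])]   # most uncertain safe point
    k_next = candidates[idx]
    X = np.append(X, k_next)
    Y = np.append(Y, perf(k_next) + noise * rng.standard_normal())

print(f"best evaluated parameter: {X[np.argmax(Y)]:.2f} (true optimum at 2.0)")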

A preliminary study of transfer learning between unicycle robots
K. V. Raimalwala, B. A. Francis, and A. P. Schoellig
in Proc. of the AAAI Spring Symposium Series, 2016, pp. 53-59.
[View BibTeX] [View Abstract] [Download PDF]

Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. The goal of this work is to understand in which cases a simple, alignment-based transfer of data is beneficial. A scalar, linear, time-invariant (LTI) transformation is applied to the output from a source system to align with the output from a target system. In a theoretic study, we have already shown that for linear, single-input, single-output systems, the upper bound of the transformation error depends on the dynamic properties of the source and target system, and is small for systems with similar response times. We now consider two nonlinear, unicycle robots. Based on our previous work, we derive analytic error bounds for the linearized robot models. We then provide simulations of the nonlinear robot models and experiments with a Pioneer 3-AT robot that confirm the theoretical findings. As a result, key characteristics of alignment-based transfer learning observed in our theoretic study prove to be also true for real, nonlinear unicycle robots.

@INPROCEEDINGS{raimalwala-aaai16,
author = {Kaizad V. Raimalwala and Bruce A. Francis and Angela P. Schoellig},
title = {A preliminary study of transfer learning between unicycle robots},
booktitle = {{Proc. of the AAAI Spring Symposium Series}},
year = {2016},
pages = {53--59},
abstract = {Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. The goal of this work is to understand in which cases a simple, alignment-based transfer of data is beneficial. A scalar, linear, time-invariant (LTI) transformation is applied to the output from a source system to align with the output from a target system. In a theoretic study, we have already shown that for linear, single-input, single-output systems, the upper bound of the transformation error depends on the dynamic properties of the source and target system, and is small for systems with similar response times. We now consider two nonlinear, unicycle robots. Based on our previous work, we derive analytic error bounds for the linearized robot models. We then provide simulations of the nonlinear robot models and experiments with a Pioneer 3-AT robot that confirm the theoretical findings. As a result, key characteristics of alignment-based transfer learning observed in our theoretic study prove to be also true for real, nonlinear unicycle robots.},
}
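
The alignment step itself reduces to a scalar least-squares fit between recorded output trajectories. The sketch below uses two illustrative first-order linear systems (not the unicycle robots of the paper) driven by the same reference to compute the optimal scalar transformation and its residual error.

import numpy as np

# Alignment-based transfer in its simplest form: a single scalar gain maps
# the recorded output of a source system onto the output of a target system
# driven by the same reference. The two first-order systems are illustrative
# stand-ins, not the unicycle robots of the paper.

def simulate(pole, u, dt=0.02):
    """First-order system x_dot = pole*x + u, zero initial state."""
    x, y = 0.0, np.zeros(len(u))
    for k, uk in enumerate(u):
        y[k] = x
        x += dt * (pole * x + uk)
    return y

t = np.arange(0, 10, 0.02)
ref = np.sin(0.5 * t)                          # common reference input
y_source = simulate(-1.0, ref)                 # source system
y_target = simulate(-1.5, ref)                 # target system (faster pole)

# Least-squares optimal scalar alignment: alpha = <y_s, y_t> / <y_s, y_s>
alpha = (y_source @ y_target) / (y_source @ y_source)
err = np.linalg.norm(y_target - alpha * y_source) / np.linalg.norm(y_target)
print(f"alpha = {alpha:.3f}, relative alignment error = {err:.3f}")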

Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics
F. Berkenkamp, A. Krause, and A. P. Schoellig
Technical Report, arXiv, 2016.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Code] [More Information]

Robotics algorithms typically depend on various parameters, the choice of which significantly affects the robot’s performance. While an initial guess for the parameters may be obtained from dynamic models of the robot, parameters are usually tuned manually on the real system to achieve the best performance. Optimization algorithms, such as Bayesian optimization, have been used to automate this process. However, these methods may evaluate parameters during the optimization process that lead to safety-critical system failures. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed and applied in robotics, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is not desirable in most cases. In this paper, we define separate functions for performance and safety. We present a generalized SafeOpt algorithm that, given an initial safe guess for the parameters, maximizes performance but only evaluates parameters that satisfy all safety constraints with high probability. It achieves this by modeling the underlying and unknown performance and constraint functions as Gaussian processes. We provide a theoretical analysis and demonstrate in experiments on a quadrotor vehicle that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters. Moreover, we show an extension to context- or environment-dependent, safe optimization in the experiments.

@TECHREPORT{berkenkamp-tr16,
title = {Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics},
institution = {arXiv},
author = {Berkenkamp, Felix and Krause, Andreas and Schoellig, Angela P.},
year = {2016},
urllink = {http://arxiv.org/abs/1602.04450},
urlvideo = {https://youtu.be/GiqNQdzc5TI},
urlcode = {https://github.com/befelix/SafeOpt},
abstract = {Robotics algorithms typically depend on various parameters, the choice of which significantly affects the robot’s performance. While an initial guess for the parameters may be obtained from dynamic models of the robot, parameters are usually tuned manually on the real system to achieve the best performance. Optimization algorithms, such as Bayesian optimization, have been used to automate this process. However, these methods may evaluate parameters during the optimization process that lead to safety-critical system failures. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed and applied in robotics, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is not desirable in most cases. In this paper, we define separate functions for performance and safety. We present a generalized SafeOpt algorithm that, given an initial safe guess for the parameters, maximizes performance but only evaluates parameters that satisfy all safety constraints with high probability. It achieves this by modeling the underlying and unknown performance and constraint functions as Gaussian processes. We provide a theoretical analysis and demonstrate in experiments on a quadrotor vehicle that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters. Moreover, we show an extension to context- or environment-dependent, safe optimization in the experiments.}
}

Rock fragmentation analysis using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
Professional Magazine Article, Ontario Professional Surveyor (OPS) Magazine, Assn. of Ontario Land Surveyors, 2016.
[View BibTeX] [Download PDF] [More Information]

@MISC{bamford-ops16,
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title = {Rock fragmentation analysis using {UAV} technology},
year = {2016},
volume = {59},
number = {4},
pages = {14-16},
howpublished = {Professional Magazine Article, Ontario Professional Surveyor (OPS) Magazine, Assn. of Ontario Land Surveyors},
urllink = {http://publications.aols.org/OPS-Magazine/2016Fall/},
}

Quantifying the value of drone-delivered AEDs in cardiac arrest response
J. J. Boutilier, S. C. Brooks, A. Janmohamed, A. Byers, C. Zhan, J. E. Buick, A. P. Schoellig, L. J. Morrison, S. Cheskes, and T. C. Y. Chan
Abstract and Oral Presentation, in American Heart Association (AHA) Resuscitation Science Symposium, 2016.
[View BibTeX]

@MISC{boutilier-aha16,
author = {J. J. Boutilier and S. C. Brooks and A. Janmohamed and A. Byers and C. Zhan and J. E. Buick and A. P. Schoellig and L. J. Morrison and S. Cheskes and T.C.Y. Chan},
title = {Quantifying the value of drone-delivered {AEDs} in cardiac arrest response},
howpublished = {Abstract and Oral Presentation, in American Heart Association (AHA) Resuscitation Science Symposium},
year = {2016},
}

Safe automatic controller tuning for quadrotors
F. Berkenkamp, A. Krause, and A. P. Schoellig
Video Submission, Assn. for the Advancement of Artificial Intelligence (AAAI) AI Video Competition, 2016.
[View BibTeX] [View Video]

@MISC{berkenkamp-aaai16,
author = {Felix Berkenkamp and Andreas Krause and Angela P. Schoellig},
title = {Safe automatic controller tuning for quadrotors},
howpublished = {Video Submission, Assn. for the Advancement of Artificial Intelligence (AAAI) AI Video Competition},
urlvideo = {https://youtu.be/7ZkZlxXHgTY?list=PLuOoXrWK6Kz5ySULxGMtAUdZEg9SkXDoq},
year = {2016},
}

Data-driven interaction for quadrotors based on external forces
C. McKinnon and A. P. Schoellig
Video Submission, Assn. for the Advancement of Artificial Intelligence (AAAI) AI Video Competition, 2016.
[View BibTeX] [View Video]

@MISC{mckinnon-aaai16,
author = {Chris McKinnon and Angela P. Schoellig},
title = {Data-driven interaction for quadrotors based on external forces},
howpublished = {Video Submission, Assn. for the Advancement of Artificial Intelligence (AAAI) AI Video Competition},
urlvideo = {https://youtu.be/x0RL7Jh6F9s?list=PLuOoXrWK6Kz5ySULxGMtAUdZEg9SkXDoq},
year = {2016},
}

2015

[DOI] Learning-based nonlinear model predictive control to improve vision-based mobile robot path tracking
C. J. Ostafew, J. Collier, A. P. Schoellig, and T. D. Barfoot
Journal of Field Robotics, vol. 33, iss. 1, pp. 133-152, 2015.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [View 2nd Video] [View 3rd Video] [View 4th Video]
This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm to achieve high-performance path tracking in challenging off-road terrain through learning. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) as a function of system state, input, and other relevant variables. The GP is updated based on experience collected during previous trials. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 3.0 km of travel by three significantly different robot platforms with masses ranging from 50 kg to 600 kg and at speeds ranging from 0.35 m/s to 1.2 m/s. Planned speeds are generated by a novel experience-based speed scheduler that balances overall travel time, path-tracking errors, and localization reliability. The results show that the controller can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.

@ARTICLE{ostafew-jfr15,
author = {Chris J. Ostafew and Jack Collier and Angela P. Schoellig and Timothy D. Barfoot},
title = {Learning-based nonlinear model predictive control to improve vision-based mobile robot path tracking},
year = {2015},
journal = {{Journal of Field Robotics}},
volume = {33},
number = {1},
pages = {133-152},
doi = {10.1002/rob.21587},
urlvideo={https://youtu.be/lxm-2A6yOY0?list=PLC12E387419CEAFF2},
urlvideo2={https://youtu.be/M9xhkHCzpMo?list=PL0F1AD87C0266A961},
urlvideo3={http://youtu.be/MwVElAn95-M?list=PLC0E5EB919968E507},
urlvideo4={http://youtu.be/Pu3_F6k6Fa4?list=PLC0E5EB919968E507},
abstract = {This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm to achieve high-performance path tracking in challenging off-road terrain through learning. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) as a function of system state, input, and other relevant variables. The GP is updated based on experience collected during previous trials. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 3.0 km of travel by three significantly different robot platforms with masses ranging from 50 kg to 600 kg and at speeds ranging from 0.35 m/s to 1.2 m/s. Planned speeds are generated by a novel experience-based speed scheduler that balances overall travel time, path-tracking errors, and localization reliability. The results show that the controller can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.}
}
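
The learn-from-experience mechanism can be illustrated with a drastically simplified, one-dimensional stand-in for LB-NMPC. In the sketch below the dynamics, the disturbance, and the horizon-one "predictive" controller are all illustrative, and a simple interpolation of logged disturbances stands in for the Gaussian Process model.

import numpy as np

# Drastically simplified, one-dimensional stand-in for the learn-from-
# experience idea: a nominal model is combined with a disturbance correction
# interpolated from the previous trial.

dt, N = 0.1, 100
path = np.linspace(0.0, 5.0, N)                  # reference positions
true_dist = lambda x: 0.3 * np.sin(1.5 * x)      # unknown, state-dependent

def run_trial(dist_model):
    """One pass along the path; returns max error and logged (state, dist)."""
    x, states, dists, errs = 0.0, [], [], []
    for k in range(N - 1):
        # choose u so the *corrected* model lands on the next reference point
        u = (path[k + 1] - x - dt * dist_model(x)) / dt
        x_next = x + dt * (u + true_dist(x))     # real system
        dists.append((x_next - (x + dt * u)) / dt)   # disturbance seen at state x
        states.append(x)
        errs.append(abs(x_next - path[k + 1]))
        x = x_next
    return max(errs), np.array(states), np.array(dists)

dist_model = lambda q: 0.0                       # trial 1: nominal model only
for trial in range(3):
    err, xs, ds = run_trial(dist_model)
    print(f"trial {trial + 1}: max path-tracking error = {err:.4f} m")
    order = np.argsort(xs)                       # refit disturbance vs. state
    dist_model = lambda q, xp=xs[order], fp=ds[order]: float(np.interp(q, xp, fp))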

[DOI] An upper bound on the error of alignment-based transfer learning between two linear, time-invariant, scalar systems
K. V. Raimalwala, B. A. Francis, and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015, pp. 5253-5258.
[View BibTeX] [View Abstract] [Download PDF] [More Information]

Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. This paper studies a simplified TL scenario with the goal of understanding in which cases a simple, alignment-based transfer of data is possible and beneficial. Two linear, time-invariant (LTI), single-input, single-output systems are tasked to follow the same reference signal. A scalar, LTI transformation is applied to the output from a source system to align with the output from a target system. An upper bound on the 2-norm of the transformation error is derived for a large set of reference signals and is minimized with respect to the transformation scalar. Analysis shows that the minimized error bound is reduced for systems with poles that lie close to each other (that is, for systems with similar response times). This criterion is relaxed for systems with poles that have a larger negative real part (that is, for stable systems with fast response), meaning that poles can be further apart for the same minimized error bound. Additionally, numerical results show that using the reference signal as input to the transformation reduces the minimized bound further.

@INPROCEEDINGS{raimalwala-iros15,
author = {Kaizad V. Raimalwala and Bruce A. Francis and Angela P. Schoellig},
title = {An upper bound on the error of alignment-based transfer learning between two linear, time-invariant, scalar systems},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {5253--5258},
year = {2015},
doi = {10.1109/IROS.2015.7354118},
urllink = {http://hdl.handle.net/1807/69365},
note = {},
abstract = {Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. This paper studies a simplified TL scenario with the goal of understanding in which cases a simple, alignment-based transfer of data is possible and beneficial. Two linear, time-invariant (LTI), single-input, single-output systems are tasked to follow the same reference signal. A scalar, LTI transformation is applied to the output from a source system to align with the output from a target system. An upper bound on the 2-norm of the transformation error is derived for a large set of reference signals and is minimized with respect to the transformation scalar. Analysis shows that the minimized error bound is reduced for systems with poles that lie close to each other (that is, for systems with similar response times). This criterion is relaxed for systems with poles that have a larger negative real part (that is, for stable systems with fast response), meaning that poles can be further apart for the same minimized error bound. Additionally, numerical results show that using the reference signal as input to the transformation reduces the minimized bound further.}
}

[DOI] Safe and robust learning control with Gaussian processes
F. Berkenkamp and A. P. Schoellig
in Proc. of the European Control Conference (ECC), 2015, pp. 2501-2506.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides]

This paper introduces a learning-based robust control algorithm that provides robust stability and performance guarantees during learning. The approach uses Gaussian process (GP) regression based on data gathered during operation to update an initial model of the system and to gradually decrease the uncertainty related to this model. Embedding this data-based update scheme in a robust control framework guarantees stability during the learning process. Traditional robust control approaches have not considered online adaptation of the model and its uncertainty before. As a result, their controllers do not improve performance during operation. Typical machine learning algorithms that have achieved similar high-performance behavior by adapting the model and controller online do not provide the guarantees presented in this paper. In particular, this paper considers a stabilization task, linearizes the nonlinear, GP-based model around a desired operating point, and solves a convex optimization problem to obtain a linear robust controller. The resulting performance improvements due to the learning-based controller are demonstrated in experiments on a quadrotor vehicle.

@INPROCEEDINGS{berkenkamp-ecc15,
author = {Felix Berkenkamp and Angela P. Schoellig},
title = {Safe and robust learning control with {G}aussian processes},
booktitle = {{Proc. of the European Control Conference (ECC)}},
pages = {2501--2506},
year = {2015},
doi = {10.1109/ECC.2015.7330913},
urlvideo={https://youtu.be/YqhLnCm0KXY?list=PLC12E387419CEAFF2},
urlslides={../../wp-content/papercite-data/slides/berkenkamp-ecc15-slides.pdf},
abstract = {This paper introduces a learning-based robust control algorithm that provides robust stability and performance guarantees during learning. The approach uses Gaussian process (GP) regression based on data gathered during operation to update an initial model of the system and to gradually decrease the uncertainty related to this model. Embedding this data-based update scheme in a robust control framework guarantees stability during the learning process. Traditional robust control approaches have not considered online adaptation of the model and its uncertainty before. As a result, their controllers do not improve performance during operation. Typical machine learning algorithms that have achieved similar high-performance behavior by adapting the model and controller online do not provide the guarantees presented in this paper. In particular, this paper considers a stabilization task, linearizes the nonlinear, GP-based model around a desired operating point, and solves a convex optimization problem to obtain a linear robust controller. The resulting performance improvements due to the learning-based controller are demonstrated in experiments on a quadrotor vehicle.}
}

[DOI] Conservative to confident: treating uncertainty robustly within learning-based control
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2015, pp. 421-427.
[View BibTeX] [View Abstract] [Download PDF]

Robust control maintains stability and performance for a fixed amount of model uncertainty but can be conservative since the model is not updated online. Learning-based control, on the other hand, uses data to improve the model over time but is not typically guaranteed to be robust throughout the process. This paper proposes a novel combination of both ideas: a robust Min-Max Learning-Based Nonlinear Model Predictive Control (MM-LB-NMPC) algorithm. Based on an existing LB-NMPC algorithm, we present an efficient and robust extension, altering the NMPC performance objective to optimize for the worst-case scenario. The algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous trials as a function of system state, input, and other relevant variables. Nominal state sequences are predicted using an Unscented Transform and worst-case scenarios are defined as sequences bounding the 3σ confidence region. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results from testing on a 50 kg skid-steered robot executing a path-tracking task. The results show reductions in maximum lateral and heading path-tracking errors by up to 30% and a clear transition from robust control when the model uncertainty is high to optimal control when model uncertainty is reduced.

@INPROCEEDINGS{ostafew-icra15,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Conservative to confident: treating uncertainty robustly within learning-based control},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {421--427},
year = {2015},
doi = {10.1109/ICRA.2015.7139033},
note = {},
abstract = {Robust control maintains stability and performance for a fixed amount of model uncertainty but can be conservative since the model is not updated online. Learning-based control, on the other hand, uses data to improve the model over time but is not typically guaranteed to be robust throughout the process. This paper proposes a novel combination of both ideas: a robust Min-Max Learning-Based Nonlinear Model Predictive Control (MM-LB-NMPC) algorithm. Based on an existing LB-NMPC algorithm, we present an efficient and robust extension, altering the NMPC performance objective to optimize for the worst-case scenario. The algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous trials as a function of system state, input, and other relevant variables. Nominal state sequences are predicted using an Unscented Transform and worst-case scenarios are defined as sequences bounding the 3σ confidence region. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results from testing on a 50 kg skid-steered robot executing a path-tracking task. The results show reductions in maximum lateral and heading path-tracking errors by up to 30% and a clear transition from robust control when the model uncertainty is high to optimal control when model uncertainty is reduced.}
}
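
The min-max modification can be illustrated on a scalar, one-step example. The sketch below is not the paper's MM-LB-NMPC: it replaces the unscented prediction over a horizon with a single step, bounds the learned disturbance by its 3σ interval, and adds a small input penalty so that the robust and certainty-equivalent inputs differ; all numbers are illustrative.

import numpy as np

# Scalar, one-step stand-in for the min-max idea: the input minimizes the
# cost of the worst case within the 3-sigma bounds of a learned disturbance.
dt, x, x_ref = 0.1, 0.0, 1.0
mu_d = 0.2                                       # learned disturbance mean
u_grid = np.linspace(-5.0, 25.0, 301)            # candidate inputs

def worst_case_cost(u, sigma_d, rho=0.01):
    # the cost is convex in d, so the worst case over the 3-sigma interval
    # is attained at one of its endpoints
    d_bounds = np.array([mu_d - 3 * sigma_d, mu_d + 3 * sigma_d])
    x_next = x + dt * u + dt * d_bounds
    return np.max((x_next - x_ref) ** 2) + rho * u ** 2

for sigma_d in (1.0, 0.3, 0.05):
    costs = [worst_case_cost(u, sigma_d) for u in u_grid]
    u_star = u_grid[int(np.argmin(costs))]
    print(f"sigma_d = {sigma_d:4.2f}  ->  robust input u = {u_star:5.2f}")
# As sigma_d shrinks, the robust input approaches the certainty-equivalent
# input (mean disturbance only, same input penalty), which is about 4.9 here.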

A flying drum machine
X. Wang, N. Dalal, T. Laidlow, and A. P. Schoellig
Technical Report, 2015.
[View BibTeX] [View Abstract] [Download PDF] [View Video]

This paper proposes the use of a quadrotor aerial vehicle as a musical instrument. Using the idea of interactions based on physical contact, a system is developed that enables humans to engage in artistic expression with a flying robot and produce music. A robotic user interface that uses physical interactions was created for a quadcopter. The interactive quadcopter was then programmed to drive playback of drum sounds in real-time in response to physical interaction. An intuitive mapping was developed between machine movement and art/creative composition. Challenges arose in meeting real-time latency requirements mainly due to delays in input detection. They were overcome through the development of a quick input detection method, which relies on accurate yet fast digital filtering. Successful experiments were conducted with a professional musician who used the interface to compose complex rhythm patterns. A video accompanying this paper demonstrates his performance.

@TECHREPORT{wang-tr15,
author = {Xingbo Wang and Natasha Dalal and Tristan Laidlow and Angela P. Schoellig},
title = {A Flying Drum Machine},
year = {2015},
urlvideo={https://youtu.be/d5zG-BWB7lE?list=PLD6AAACCBFFE64AC5},
abstract = {This paper proposes the use of a quadrotor aerial vehicle as a musical instrument. Using the idea of interactions based on physical contact, a system is developed that enables humans to engage in artistic expression with a flying robot and produce music. A robotic user interface that uses physical interactions was created for a quadcopter. The interactive quadcopter was then programmed to drive playback of drum sounds in real-time in response to physical interaction. An intuitive mapping was developed between machine movement and art/creative composition. Challenges arose in meeting real-time latency requirements mainly due to delays in input detection. They were overcome through the development of a quick input detection method, which relies on accurate yet fast digital filtering. Successful experiments were conducted with a professional musician who used the interface to compose complex rhythm patterns. A video accompanying this paper demonstrates his performance.}
}
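
The input-detection step described above amounts to fast filtering and thresholding of the vehicle's acceleration signal. The sketch below detects simulated taps in a noisy signal with a first-order high-pass filter, a threshold, and a refractory period; the filter coefficient, threshold, and synthetic signal are illustrative and not the values used in the project.

import numpy as np

# Minimal sketch of threshold-based tap detection on a noisy acceleration
# signal, in the spirit of the fast digital filtering the report describes.
fs = 200.0                                  # sample rate [Hz]
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(0)
accel = 0.05 * rng.standard_normal(len(t))  # sensor noise
for tap in (0.8, 1.9, 3.1):                 # three simulated taps
    accel += 2.0 * np.exp(-((t - tap) / 0.01) ** 2)

# simple first-order high-pass filter (removes slow drift, keeps impacts)
alpha, hp = 0.9, np.zeros_like(accel)
for k in range(1, len(accel)):
    hp[k] = alpha * (hp[k - 1] + accel[k] - accel[k - 1])

# threshold with a refractory period so one tap triggers one drum hit
threshold, refractory, last_hit, hits = 0.5, int(0.2 * fs), -10**9, []
for k, v in enumerate(np.abs(hp)):
    if v > threshold and k - last_hit > refractory:
        hits.append(t[k])
        last_hit = k
print("detected taps at t =", [round(float(h), 2) for h in hits])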

Safe controller optimization for quadrotors with Gaussian processes
F. Berkenkamp, A. P. Schoellig, and A. Krause
Abstract and Presentation in Proc. of the Second Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015.
[View BibTeX] [View Abstract] [Download PDF]

One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to design an initial controller, but ultimately, the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters, safety-critical system failures may happen. We overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SAFEOPT, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SAFEOPT automatically optimizes the parameters of a control law while guaranteeing system safety and stability. It achieves this by modeling the underlying performance measure as a Gaussian process and only exploring new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.

@MISC{berkenkamp-iros15,
author = {Felix Berkenkamp and Angela P. Schoellig and Andreas Krause},
title = {Safe controller optimization for quadrotors with {G}aussian processes},
howpublished = {Abstract and Presentation in Proc. of the Second Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year = {2015},
abstract = {One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to design an initial controller, but ultimately, the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters, safety-critical system failures may happen. We overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SAFEOPT, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SAFEOPT automatically optimizes the parameters of a control law while guaranteeing system safety and stability. It achieves this by modeling the underlying performance measure as a Gaussian process and only exploring new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.},
}

2014

[DOI] Application-driven design of aerial communication networks
T. Andre, K. A. Hummel, A. P. Schoellig, E. Yanmaz, M. Asedpour, C. Bettstetter, P. Grippa, H. Hellwagner, S. Sand, and S. Zhang
IEEE Communications Magazine, vol. 52, iss. 5, pp. 129-137, 2014. Authors 1 to 4 contributed equally.
[View BibTeX] [View Abstract] [Download PDF] [More Information]
Networks of micro aerial vehicles (MAVs) equipped with various sensors are increasingly used for civil applications, such as monitoring, surveillance, and disaster management. In this article, we discuss the communication requirements raised by applications in MAV networks. We propose a novel system representation that can be used to specify different application demands. To this end, we extract key functionalities expected in an MAV network. We map these functionalities into building blocks to characterize the expected communication needs. Based on insights from our own and related real-world experiments, we discuss the capabilities of existing communications technologies and their limitations to implement the proposed building blocks. Our findings indicate that while certain requirements of MAV applications are met with available technologies, further research and development is needed to address the scalability, heterogeneity, safety, quality of service, and security aspects of multi-MAV systems.

@ARTICLE{andre-com14,
author = {Torsten Andre and Karin A. Hummel and Angela P. Schoellig and Evsen Yanmaz and Mahdi Asedpour and Christian Bettstetter and Pasquale Grippa and Hermann Hellwagner and Stephan Sand and Siwei Zhang},
title = {Application-driven design of aerial communication networks},
journal = {{IEEE Communications Magazine}},
note={Authors 1 to 4 contributed equally},
volume = {52},
number = {5},
pages = {129-137},
year = {2014},
doi = {10.1109/MCOM.2014.6815903},
urllink = {http://nes.aau.at/?p=1176},
abstract = {Networks of micro aerial vehicles (MAVs) equipped with various sensors are increasingly used for civil applications, such as monitoring, surveillance, and disaster management. In this article, we discuss the communication requirements raised by applications in MAV networks. We propose a novel system representation that can be used to specify different application demands. To this end, we extract key functionalities expected in an MAV network. We map these functionalities into building blocks to characterize the expected communication needs. Based on insights from our own and related real-world experiments, we discuss the capabilities of existing communications technologies and their limitations to implement the proposed building blocks. Our findings indicate that while certain requirements of MAV applications are met with available technologies, further research and development is needed to address the scalability, heterogeneity, safety, quality of service, and security aspects of multi-MAV systems.}
}

[DOI] A platform for aerial robotics research and demonstration: The Flying Machine Arena
S. Lupashin, M. Hehn, M. W. Mueller, A. P. Schoellig, and R. D’Andrea
Mechatronics, vol. 24, iss. 1, pp. 41-54, 2014.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [More Information]

The Flying Machine Arena is a platform for experiments and demonstrations with fleets of small flying vehicles. It utilizes a distributed, modular architecture linked by robust communication layers. An estimation and control framework along with built-in system protection components enable prototyping of new control systems concepts and implementation of novel demonstrations. More recently, a mobile version has been featured at several eminent public events. We describe the architecture of the Arena from the viewpoint of system robustness and its capability as a dual-purpose research and demonstration platform.

@ARTICLE{lupashin-mech14,
author = {Sergei Lupashin and Markus Hehn and Mark W. Mueller and Angela P. Schoellig and Raffaello D'Andrea},
title = {A platform for aerial robotics research and demonstration: {The Flying Machine Arena}},
journal = {{Mechatronics}},
volume = {24},
number = {1},
pages = {41-54},
year = {2014},
doi = {10.1016/j.mechatronics.2013.11.006},
urllink = {http://flyingmachinearena.org/},
urlvideo={https://youtu.be/pcgvWhu8Arc?list=PLuLKX4lDsLIaVjdGsZxNBKLcogBnVVFQr},
abstract = {The Flying Machine Arena is a platform for experiments and demonstrations with fleets of small flying vehicles. It utilizes a distributed, modular architecture linked by robust communication layers. An estimation and control framework along with built-in system protection components enable prototyping of new control systems concepts and implementation of novel demonstrations. More recently, a mobile version has been featured at several eminent public events. We describe the architecture of the Arena from the viewpoint of system robustness and its capability as a dual-purpose research and demonstration platform.}
}

[DOI] So you think you can dance? Rhythmic flight performances with quadrocopters
A. P. Schoellig, H. Siegel, F. Augugliaro, and R. D’Andrea
in Controls and Art, A. LaViers and M. Egerstedt, Eds., Springer International Publishing, 2014, pp. 73-105.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Additional Material] [Download Slides] [More Information]

This chapter presents a set of algorithms that enable quadrotor vehicles to "fly with the music"; that is, to perform rhythmic motions that are aligned with the beat of a given music piece.

@INCOLLECTION{schoellig-springer14,
author = {Angela P. Schoellig and Hallie Siegel and Federico Augugliaro and Raffaello D'Andrea},
title = {So you think you can dance? {Rhythmic} flight performances with quadrocopters},
booktitle = {{Controls and Art}},
editor = {Amy LaViers and Magnus Egerstedt},
publisher = {Springer International Publishing},
pages = {73-105},
year = {2014},
doi = {10.1007/978-3-319-03904-6_4},
urldata={../../wp-content/papercite-data/data/schoellig-springer14-files.zip},
urlslides={../../wp-content/papercite-data/slides/schoellig-springer14-slides.pdf},
urllink = {http://www.tiny.cc/MusicInMotionSite},
urlvideo={https://www.youtube.com/playlist?list=PLD6AAACCBFFE64AC5},
abstract = {This chapter presents a set of algorithms that enable quadrotor vehicles to "fly with the music"; that is, to perform rhythmic motions that are aligned with the beat of a given music piece.}
}

Learning-based robust control: guaranteeing stability while improving performance
F. Berkenkamp and A. P. Schoellig
in Proc. of the Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2014.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides] [More Information]

To control dynamic systems, modern control theory relies on accurate mathematical models that describe the system behavior. Machine learning methods have proven to be an effective method to compensate for initial errors in these models and to achieve high-performance maneuvers by adapting the system model and control online. However, these methods usually do not guarantee stability during the learning process. On the other hand, the control community has traditionally accounted for model uncertainties by designing robust controllers. Robust controllers use a mathematical description of the uncertainty in the dynamic system derived prior to operation and guarantee robust stability for all uncertainties. Unlike machine learning methods, robust control does not improve the control performance by adapting the model online. This paper combines machine learning and robust control theory for the first time with the goal of improving control performance while guaranteeing stability. Data gathered during operation is used to reduce the uncertainty in the model and to learn systematic errors. Specifically, a nonlinear, nonparametric model of the unknown dynamics is learned with a Gaussian Process. This model is used for the computation of a linear robust controller, which guarantees stability around an operating point for all uncertainties. As a result, the robust controller improves its performance online while guaranteeing robust stability. A simulation example illustrates the performance improvements due to the learning-based robust controller.

@INPROCEEDINGS{berkenkamp-iros14,
author = {Felix Berkenkamp and Angela P. Schoellig},
title = {Learning-based robust control: guaranteeing stability while improving performance},
booktitle = {{Proc. of the Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year = {2014},
urllink = {http://www.cs.unm.edu/amprg/mlpc14Workshop/},
urlvideo={https://youtu.be/YqhLnCm0KXY?list=PLC12E387419CEAFF2},
urlslides={../../wp-content/papercite-data/slides/berkenkamp-iros14-slides.pdf},
abstract = {To control dynamic systems, modern control theory relies on accurate mathematical models that describe the system behavior. Machine learning methods have proven to be an effective method to compensate for initial errors in these models and to achieve high-performance maneuvers by adapting the system model and control online. However, these methods usually do not guarantee stability during the learning process. On the other hand, the control community has traditionally accounted for model uncertainties by designing robust controllers. Robust controllers use a mathematical description of the uncertainty in the dynamic system derived prior to operation and guarantee robust stability for all uncertainties. Unlike machine learning methods, robust control does not improve the control performance by adapting the model online. This paper combines machine learning and robust control theory for the first time with the goal of improving control performance while guaranteeing stability. Data gathered during operation is used to reduce the uncertainty in the model and to learn systematic errors. Specifically, a nonlinear, nonparametric model of the unknown dynamics is learned with a Gaussian Process. This model is used for the computation of a linear robust controller, which guarantees stability around an operating point for all uncertainties. As a result, the robust controller improves its performance online while guaranteeing robust stability. A simulation example illustrates the performance improvements due to the learning-based robust controller.}
}

[DOI] Design of norm-optimal iterative learning controllers: the effect of an iteration-domain Kalman filter for disturbance estimation
N. Degen and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2014, pp. 3590-3596.
[View BibTeX] [View Abstract] [Download PDF] [Download Slides]

Iterative learning control (ILC) has proven to be an effective method for improving the performance of repetitive control tasks. This paper revisits two optimization-based ILC algorithms: (i) the widely used quadratic-criterion ILC law (QILC) and (ii) an estimation-based ILC law using an iteration-domain Kalman filter (K-ILC). The goal of this paper is to analytically compare both algorithms and to highlight the advantages of the Kalman-filter-enhanced algorithm. We first show that for an iteration-constant estimation gain and an appropriate choice of learning parameters both algorithms are identical. We then show that the estimation-enhanced algorithm with its iteration-varying optimal Kalman gains can achieve both fast initial convergence and good noise rejection by (optimally) adapting the learning update rule over the course of an experiment. We conclude that the clear separation of disturbance estimation and input update of the K-ILC algorithm provides an intuitive architecture to design learning schemes that achieve both low noise sensitivity and fast convergence. To benchmark the algorithms we use a simulation of a single-input, single-output mass-spring-damper system.

@INPROCEEDINGS{degen-cdc14,
author = {Nicolas Degen and Angela P. Schoellig},
title = {Design of norm-optimal iterative learning controllers: the effect of an iteration-domain {K}alman filter for disturbance estimation},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
pages = {3590-3596},
year = {2014},
doi = {10.1109/CDC.2014.7039947},
urlslides = {../../wp-content/papercite-data/slides/degen-cdc14-slides.pdf},
abstract = {Iterative learning control (ILC) has proven to be an effective method for improving the performance of repetitive control tasks. This paper revisits two optimization-based ILC algorithms: (i) the widely used quadratic-criterion ILC law (QILC) and (ii) an estimation-based ILC law using an iteration-domain Kalman filter (K-ILC). The goal of this paper is to analytically compare both algorithms and to highlight the advantages of the Kalman-filter-enhanced algorithm. We first show that for an iteration-constant estimation gain and an appropriate choice of learning parameters both algorithms are identical. We then show that the estimation-enhanced algorithm with its iteration-varying optimal Kalman gains can achieve both fast initial convergence and good noise rejection by (optimally) adapting the learning update rule over the course of an experiment. We conclude that the clear separation of disturbance estimation and input update of the K-ILC algorithm provides an intuitive architecture to design learning schemes that achieve both low noise sensitivity and fast convergence. To benchmark the algorithms we use a simulation of a single-input, single-output mass-spring-damper system.}
}
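
The sketch below illustrates the K-ILC idea in simplified lifted-system notation: an iteration-domain Kalman filter updates the disturbance estimate between trials, followed by a quadratic-criterion input update. The lifted model, noise covariances, and weights are placeholders, not values from the paper.

# Sketch of a Kalman-enhanced ILC iteration in lifted form: y_j = F u_j + d + noise,
# with an iteration-constant disturbance d (illustrative, not the paper's code).
import numpy as np

n = 20
F = 0.1 * np.tril(np.ones((n, n)))                        # placeholder lifted dynamics
y_ref = np.sin(np.linspace(0, np.pi, n))                  # desired output
Q, R = 1e-4 * np.eye(n), 1e-2 * np.eye(n)                 # assumed noise covariances
W, S = np.eye(n), 1e-3 * np.eye(n)                        # tracking and input weights

d_hat, P, u = np.zeros(n), np.eye(n), np.zeros(n)
true_d = 0.2 * np.cos(np.linspace(0, 2, n))

for j in range(10):
    y = F @ u + true_d + 0.01 * np.random.randn(n)        # one (simulated) trial
    # Iteration-domain Kalman update of the disturbance estimate.
    P = P + Q
    K = P @ np.linalg.inv(P + R)
    d_hat = d_hat + K @ (y - F @ u - d_hat)
    P = (np.eye(n) - K) @ P
    # Norm-optimal (quadratic-criterion) input update given the new estimate.
    u = np.linalg.solve(F.T @ W @ F + S, F.T @ W @ (y_ref - d_hat))
    print(f"iteration {j}: tracking error {np.linalg.norm(y_ref - (F @ u + true_d)):.4f}")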

[DOI] Learning-based nonlinear model predictive control to improve vision-based mobile robot path-tracking in challenging outdoor environments
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2014, pp. 4029-4036.
[View BibTeX] [View Abstract] [Download PDF] [View Video]

This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm for an autonomous mobile robot to reduce path-tracking errors over repeated traverses along a reference path. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous traversals as a function of system state, input and other relevant variables. Modelling the disturbance as a GP enables interpolation and extrapolation of learned disturbances, a key feature of this algorithm. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 1.8 km of travel by a four-wheeled, 50 kg robot travelling through challenging terrain (including steep, uneven hills) and by a six-wheeled, 160 kg robot learning disturbances caused by unmodelled dynamics at speeds ranging from 0.35 m/s to 1.0 m/s. The speed is scheduled to balance trial time, path-tracking errors, and localization reliability based on previous experience. The results show that the system can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.

@INPROCEEDINGS{ostafew-icra14,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Learning-based nonlinear model predictive control to improve vision-based mobile robot path-tracking in challenging outdoor environments},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {4029-4036},
year = {2014},
doi = {10.1109/ICRA.2014.6907444},
urlvideo = {https://youtu.be/MwVElAn95-M?list=PLC12E387419CEAFF2},
abstract = {This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm for an autonomous mobile robot to reduce path-tracking errors over repeated traverses along a reference path. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous traversals as a function of system state, input and other relevant variables. Modelling the disturbance as a GP enables interpolation and extrapolation of learned disturbances, a key feature of this algorithm. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 1.8 km of travel by a four-wheeled, 50 kg robot travelling through challenging terrain (including steep, uneven hills) and by a six-wheeled, 160 kg robot learning disturbances caused by unmodelled dynamics at speeds ranging from 0.35 m/s to 1.0 m/s. The speed is scheduled to balance trial time, path-tracking errors, and localization reliability based on previous experience. The results show that the system can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.}
}
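
As a rough illustration of how a learned disturbance term can enter the prediction model of such a controller, the sketch below adds a stand-in for a GP mean to a simple unicycle-style prior model and rolls the corrected model over a horizon. The model, the disturbance function, and all parameters are assumptions; this is not the LB-NMPC implementation.

# Sketch: correcting a simple a priori vehicle model with a learned disturbance
# term inside the prediction (illustrative; not the LB-NMPC implementation).
import numpy as np

def nominal_step(x, u, dt=0.1):
    """Unicycle-style prior model: state [px, py, yaw], input [v, omega]."""
    px, py, yaw = x
    v, om = u
    return np.array([px + dt * v * np.cos(yaw),
                     py + dt * v * np.sin(yaw),
                     yaw + dt * om])

def learned_disturbance(x, u):
    """Stand-in for a GP mean trained on (state, input) -> model-error pairs
    collected on previous traverses; here just a fixed placeholder."""
    v = u[0]
    return np.array([0.0, -0.02 * v, 0.005 * v])          # e.g. systematic drift/slip

def predict(x, u_sequence):
    """Roll the corrected model forward over a prediction horizon."""
    trajectory = [x]
    for u in u_sequence:
        x = nominal_step(x, u) + learned_disturbance(x, u)
        trajectory.append(x)
    return np.array(trajectory)

horizon = [np.array([1.0, 0.1])] * 10
print(predict(np.zeros(3), horizon)[-1])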

[DOI] Speed daemon: experience-based mobile robot speed scheduling
C. J. Ostafew, A. P. Schoellig, T. D. Barfoot, and J. Collier
in Proc. of the International Conference on Computer and Robot Vision (CRV), 2014, pp. 56-62. Best Robotics Paper Award.
[View BibTeX] [View Abstract] [Download PDF] [View Video]

A time-optimal speed schedule results in a mobile robot driving along a planned path at or near the limits of the robot’s capability. However, deriving models to predict the effect of increased speed can be very difficult. In this paper, we present a speed scheduler that uses previous experience, instead of complex models, to generate time-optimal speed schedules. The algorithm is designed for a vision-based, path-repeating mobile robot and uses experience to ensure reliable localization, low path-tracking errors, and realizable control inputs while maximizing the speed along the path. To our knowledge, this is the first speed scheduler to incorporate experience from previous path traversals in order to address system constraints. The proposed speed scheduler was tested in over 4 km of path traversals in outdoor terrain using a large Ackermann-steered robot travelling between 0.5 m/s and 2.0 m/s. The approach to speed scheduling is shown to generate fast speed schedules while remaining within the limits of the robot’s capability.

@INPROCEEDINGS{ostafew-crv14,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot and J. Collier},
title = {Speed daemon: experience-based mobile robot speed scheduling},
booktitle = {{Proc. of the International Conference on Computer and Robot Vision (CRV)}},
pages = {56-62},
year = {2014},
doi = {10.1109/CRV.2014.16},
urlvideo = {https://youtu.be/Pu3_F6k6Fa4?list=PLC12E387419CEAFF2},
abstract = {A time-optimal speed schedule results in a mobile robot driving along a planned path at or near the limits of the robot's capability. However, deriving models to predict the effect of increased speed can be very difficult. In this paper, we present a speed scheduler that uses previous experience, instead of complex models, to generate time-optimal speed schedules. The algorithm is designed for a vision-based, path-repeating mobile robot and uses experience to ensure reliable localization, low path-tracking errors, and realizable control inputs while maximizing the speed along the path. To our knowledge, this is the first speed scheduler to incorporate experience from previous path traversals in order to address system constraints. The proposed speed scheduler was tested in over 4 km of path traversals in outdoor terrain using a large Ackermann-steered robot travelling between 0.5 m/s and 2.0 m/s. The approach to speed scheduling is shown to generate fast speed schedules while remaining within the limits of the robot's capability.},
note = {Best Robotics Paper Award}
}

[DOI] A proof-of-concept demonstration of visual teach and repeat on a quadrocopter using an altitude sensor and a monocular camera
A. Pfrunder, A. P. Schoellig, and T. D. Barfoot
in Proc. of the International Conference on Computer and Robot Vision (CRV), 2014, pp. 238-245.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides]

This paper applies an existing vision-based navigation algorithm to a micro aerial vehicle (MAV). The algorithm has previously been used for long-range navigation of ground robots based on on-board 3D vision sensors such as a stereo or Kinect cameras. A teach-and-repeat operational strategy enables a robot to autonomously repeat a manually taught route without relying on an external positioning system such as GPS. For MAVs we show that a monocular downward looking camera combined with an altitude sensor can be used as 3D vision sensor replacing other resource-expensive 3D vision solutions. The paper also includes a simple path tracking controller that uses feedback from the visual and inertial sensors to guide the vehicle along a straight and level path. Preliminary experimental results demonstrate reliable, accurate and fully autonomous flight of an 8-m-long (straight and level) route, which was taught with the quadrocopter fixed to a cart. Finally, we present the successful flight of a more complex, 16-m-long route.

@INPROCEEDINGS{pfrunder-crv14,
author = {Andreas Pfrunder and Angela P. Schoellig and Timothy D. Barfoot},
title = {A proof-of-concept demonstration of visual teach and repeat on a quadrocopter using an altitude sensor and a monocular camera},
booktitle = {{Proc. of the International Conference on Computer and Robot Vision (CRV)}},
pages = {238-245},
year = {2014},
doi = {10.1109/CRV.2014.40},
urlvideo = {https://youtu.be/BRDvK4xD8ZY?list=PLuLKX4lDsLIaJEVTsuTAVdDJDx0xmzxXr},
urlslides = {../../wp-content/papercite-data/slides/pfrunder-crv14-slides.pdf},
abstract = {This paper applies an existing vision-based navigation algorithm to a micro aerial vehicle (MAV). The algorithm has previously been used for long-range navigation of ground robots based on on-board 3D vision sensors such as a stereo or Kinect cameras. A teach-and-repeat operational strategy enables a robot to autonomously repeat a manually taught route without relying on an external positioning system such as GPS. For MAVs we show that a monocular downward looking camera combined with an altitude sensor can be used as 3D vision sensor replacing other resource-expensive 3D vision solutions. The paper also includes a simple path tracking controller that uses feedback from the visual and inertial sensors to guide the vehicle along a straight and level path. Preliminary experimental results demonstrate reliable, accurate and fully autonomous flight of an 8-m-long (straight and level) route, which was taught with the quadrocopter fixed to a cart. Finally, we present the successful flight of a more complex, 16-m-long route.}
}

2013

[DOI] Dance of the flying machines: methods for designing and executing an aerial dance choreography
F. Augugliaro, A. P. Schoellig, and R. D’Andrea
IEEE Robotics & Automation Magazine, vol. 20, iss. 4, pp. 96-104, 2013.
IEEE Robotics & Automation Magazine, vol. 20, iss. 4, pp. 96-104, 2013.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides]
Imagine a troupe of dancers flying together across a big open stage, their movement choreographed to the rhythm of the music. Their performance is both coordinated and skilled; the dancers are well rehearsed, and the choreography well suited to their abilities. They are no ordinary dancers, however, and this is not an ordinary stage. The performers are quadrocopters, and the stage is the ETH Zurich Flying Machine Arena, a state-of-the-art mobile testbed for aerial motion control research.

@ARTICLE{augugliaro-ram13,
author = {Federico Augugliaro and Angela P. Schoellig and Raffaello D'Andrea},
title = {Dance of the Flying Machines: Methods for Designing and Executing an Aerial Dance Choreography},
journal = {{IEEE Robotics \& Automation Magazine}},
volume = {20},
number = {4},
pages = {96-104},
year = {2013},
doi = {10.1109/MRA.2013.2275693},
urlvideo={http://youtu.be/NRL_1ozDQCA?t=21s},
urlslides={../../wp-content/papercite-data/slides/augugliaro-ram13-slides.pdf},
abstract = {Imagine a troupe of dancers flying together across a big open stage, their movement choreographed to the rhythm of the music. Their performance is both coordinated and skilled; the dancers are well rehearsed, and the choreography well suited to their abilities. They are no ordinary dancers, however, and this is not an ordinary stage. The performers are quadrocopters, and the stage is the ETH Zurich Flying Machine Arena, a state-of-the-art mobile testbed for aerial motion control research.}
}

[DOI] Visual teach and repeat, repeat, repeat: iterative learning control to improve mobile robot path tracking in challenging outdoor environments
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013, pp. 176-181.
[View BibTeX] [View Abstract] [Download PDF] [View Video]

This paper presents a path-repeating, mobile robot controller that combines a feedforward, proportional Iterative Learning Control (ILC) algorithm with a feedback-linearized path-tracking controller to reduce path-tracking errors over repeated traverses along a reference path. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied, extreme environments. The paper presents experimental results including over 600 m of travel by a four-wheeled, 50 kg robot travelling through challenging terrain including steep hills and sandy turns and by a six-wheeled, 160 kg robot at gradually-increased speeds up to three times faster than the nominal, safe speed. In the absence of a global localization system, ILC is demonstrated to reduce path-tracking errors caused by unmodelled robot dynamics and terrain challenges.

@INPROCEEDINGS{ostafew-iros13,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Visual teach and repeat, repeat, repeat: Iterative learning control to improve mobile robot path tracking in challenging outdoor environments},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {176-181},
year = {2013},
doi = {10.1109/IROS.2013.6696350},
urlvideo = {https://youtu.be/08_d1HSPADA?list=PLC12E387419CEAFF2},
abstract = {This paper presents a path-repeating, mobile robot controller that combines a feedforward, proportional Iterative Learning Control (ILC) algorithm with a feedback-linearized path-tracking controller to reduce path-tracking errors over repeated traverses along a reference path. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied, extreme environments. The paper presents experimental results including over 600 m of travel by a four-wheeled, 50 kg robot travelling through challenging terrain including steep hills and sandy turns and by a six-wheeled, 160 kg robot at gradually-increased speeds up to three times faster than the nominal, safe speed. In the absence of a global localization system, ILC is demonstrated to reduce path-tracking errors caused by unmodelled robot dynamics and terrain challenges.}
}
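
A minimal sketch of a proportional, feedforward ILC update over repeated traverses is shown below; the error model, learning gain, and look-ahead shift are assumptions and the traversal is simulated, so this illustrates the update rule rather than the controller from the paper.

# Sketch of a proportional, feedforward ILC update over repeated traverses
# (simplified; the gain, look-ahead shift and error model are assumptions).
import numpy as np

num_points = 100                         # path discretized by arc length
u_ff = np.zeros(num_points)              # learned feedforward correction per path point
gamma = 0.5                              # learning gain
shift = 2                                # look-ahead (anticipation) in path indices

def traverse(u_ff):
    """Placeholder for driving the path once and logging the lateral error."""
    systematic = 0.3 * np.sin(np.linspace(0, 4 * np.pi, num_points))
    return systematic - u_ff + 0.01 * np.random.randn(num_points)

for trial in range(8):
    e = traverse(u_ff)
    # Update the correction at each point from the (shifted) error of this trial.
    u_ff[:num_points - shift] += gamma * e[shift:]
    print(f"trial {trial}: RMS lateral error {np.sqrt(np.mean(e ** 2)):.3f} m")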

[DOI] Improving tracking performance by learning from past data
A. P. Schoellig
PhD Thesis, Diss. ETH No. 20593, Institute for Dynamic Systems and Control, ETH Zurich, Switzerland, 2013. Awards: ETH Medal, Dimitris N. Chorafas Foundation Prize.
[View BibTeX] [Download Abstract] [Download PDF] [View Video] [View 2nd Video] [Download Slides]

@PHDTHESIS{schoellig-eth13,
author = {Angela P. Schoellig},
title = {Improving tracking performance by learning from past data},
school = {Diss. ETH No. 20593, Institute for Dynamic Systems and Control, ETH Zurich},
doi = {10.3929/ethz-a-009758916},
year = {2013},
address = {Switzerland},
urlabstract = {../../wp-content/papercite-data/pdf/schoellig-eth13-abstract.pdf},
urlslides = {../../wp-content/papercite-data/slides/schoellig-eth13-slides.pdf},
urlvideo = {https://youtu.be/zHTCsSkmADo?list=PLC12E387419CEAFF2},
urlvideo2 = {https://youtu.be/7r281vgfotg?list=PLD6AAACCBFFE64AC5},
note = {Awards: ETH Medal, Dimitris N. Chorafas Foundation Prize}
}

2012

[DOI] Optimization-based iterative learning for precise quadrocopter trajectory tracking
A. P. Schoellig, F. L. Mueller, and R. D’Andrea
Autonomous Robots, vol. 33, iss. 1-2, pp. 103-127, 2012.
[View BibTeX] [View Abstract] [Download PDF] [View Video]
Current control systems regulate the behavior of dynamic systems by reacting to noise and unexpected disturbances as they occur. To improve the performance of such control systems, experience from iterative executions can be used to anticipate recurring disturbances and proactively compensate for them. This paper presents an algorithm that exploits data from previous repetitions in order to learn to precisely follow a predefined trajectory. We adapt the feed-forward input signal to the system with the goal of achieving high tracking performance – even under the presence of model errors and other recurring disturbances. The approach is based on a dynamics model that captures the essential features of the system and that explicitly takes system input and state constraints into account. We combine traditional optimal filtering methods with state-of-the-art optimization techniques in order to obtain an effective and computationally efficient learning strategy that updates the feed-forward input signal according to a customizable learning objective. It is possible to define a termination condition that stops an execution early if the deviation from the nominal trajectory exceeds a given bound. This allows for a safe learning that gradually extends the time horizon of the trajectory. We developed a framework for generating arbitrary flight trajectories and for applying the algorithm to highly maneuverable autonomous quadrotor vehicles in the ETH Flying Machine Arena testbed. Experimental results are discussed for selected trajectories and different learning algorithm parameters.

@ARTICLE{schoellig-auro12,
author = {Angela P. Schoellig and Fabian L. Mueller and Raffaello D'Andrea},
title = {Optimization-based iterative learning for precise quadrocopter trajectory tracking},
journal = {{Autonomous Robots}},
volume = {33},
number = {1-2},
pages = {103-127},
year = {2012},
doi = {10.1007/s10514-012-9283-2},
urlvideo={http://youtu.be/goVuP5TJIUU?list=PLC12E387419CEAFF2},
abstract = {Current control systems regulate the behavior of dynamic systems by reacting to noise and unexpected disturbances as they occur. To improve the performance of such control systems, experience from iterative executions can be used to anticipate recurring disturbances and proactively compensate for them. This paper presents an algorithm that exploits data from previous repetitions in order to learn to precisely follow a predefined trajectory. We adapt the feed-forward input signal to the system with the goal of achieving high tracking performance - even under the presence of model errors and other recurring disturbances. The approach is based on a dynamics model that captures the essential features of the system and that explicitly takes system input and state constraints into account. We combine traditional optimal filtering methods with state-of-the-art optimization techniques in order to obtain an effective and computationally efficient learning strategy that updates the feed-forward input signal according to a customizable learning objective. It is possible to define a termination condition that stops an execution early if the deviation from the nominal trajectory exceeds a given bound. This allows for a safe learning that gradually extends the time horizon of the trajectory. We developed a framework for generating arbitrary flight trajectories and for applying the algorithm to highly maneuverable autonomous quadrotor vehicles in the ETH Flying Machine Arena testbed. Experimental results are discussed for selected trajectories and different learning algorithm parameters.}
}
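
The sketch below shows what a single constrained, optimization-based input update can look like when posed as a convex program; the lifted model, disturbance estimate, weights, and the use of cvxpy are placeholders chosen for illustration and do not reproduce the paper's formulation.

# Sketch of one constrained, optimization-based ILC input update as a convex program
# (illustrative; the lifted model, estimate, weights and limits are placeholders).
import numpy as np
import cvxpy as cp

n = 30
F = 0.05 * np.tril(np.ones((n, n)))            # placeholder lifted input-to-output map
y_des = np.linspace(0.0, 1.0, n)               # desired trajectory samples
d_hat = 0.1 * np.sin(np.linspace(0, 3, n))     # disturbance estimate from past trials
u_max = 0.8                                    # input constraint

u = cp.Variable(n)
tracking = cp.sum_squares(F @ u + d_hat - y_des)
effort = 1e-3 * cp.sum_squares(u)
problem = cp.Problem(cp.Minimize(tracking + effort), [cp.abs(u) <= u_max])
problem.solve()
print("predicted tracking error:", np.linalg.norm(F @ u.value + d_hat - y_des))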

[DOI] Limited benefit of joint estimation in multi-agent iterative learning
A. P. Schoellig, J. Alonso-Mora, and R. D’Andrea
Asian Journal of Control, vol. 14, iss. 3, pp. 613-623, 2012.
[View BibTeX] [View Abstract] [Download PDF] [Download Additional Material] [Download Slides]

This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual’s learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent’s disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. When the agents are identical and noise comes from measurement only, joint estimation yields a noticeable improvement in performance. However, when process noise is encountered or when the agents have an individual disturbance component, the benefit of joint estimation is negligible.

@ARTICLE{schoellig-ajc12,
author = {Angela P. Schoellig and Javier Alonso-Mora and Raffaello D'Andrea},
title = {Limited benefit of joint estimation in multi-agent iterative learning},
journal = {{Asian Journal of Control}},
volume = {14},
number = {3},
pages = {613-623},
year = {2012},
doi = {10.1002/asjc.398},
urldata={../../wp-content/papercite-data/data/schoellig-ajc12-files.zip},
urlslides={../../wp-content/papercite-data/slides/schoellig-ajc12-slides.pdf},
abstract = {This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual's learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent's disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. When the agents are identical and noise comes from measurement only, joint estimation yields a noticeable improvement in performance. However, when process noise is encountered or when the agents have an individual disturbance component, the benefit of joint estimation is negligible.}
}

[DOI] Generation of collision-free trajectories for a quadrocopter fleet: a sequential convex programming approach
F. Augugliaro, A. P. Schoellig, and R. D’Andrea
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 1917-1922.
[View BibTeX] [View Abstract] [Download PDF] [View Video]

This paper presents an algorithm that generates collision-free trajectories in three dimensions for multiple vehicles within seconds. The problem is cast as a non-convex optimization problem, which is iteratively solved using sequential convex programming that approximates non-convex constraints by using convex ones. The method generates trajectories that account for simple dynamics constraints and is thus independent of the vehicle’s type. An extensive a posteriori vehicle-specific feasibility check is included in the algorithm. The algorithm is applied to a quadrocopter fleet. Experimental results are shown.

@INPROCEEDINGS{augugliaro-iros12,
author = {Federico Augugliaro and Angela P. Schoellig and Raffaello D'Andrea},
title = {Generation of collision-free trajectories for a quadrocopter fleet: A sequential convex programming approach},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {1917-1922},
year = {2012},
doi = {10.1109/IROS.2012.6385823},
urlvideo = {https://youtu.be/wwK7WvvUvlI?list=PLD6AAACCBFFE64AC5},
abstract = {This paper presents an algorithm that generates collision-free trajectories in three dimensions for multiple vehicles within seconds. The problem is cast as a non-convex optimization problem, which is iteratively solved using sequential convex programming that approximates non-convex constraints by using convex ones. The method generates trajectories that account for simple dynamics constraints and is thus independent of the vehicle's type. An extensive a posteriori vehicle-specific feasibility check is included in the algorithm. The algorithm is applied to a quadrocopter fleet. Experimental results are shown.}
}
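
To make the convexification step more tangible, the sketch below performs one sequential-convex-programming iteration for two vehicles on crossing straight-line paths: the non-convex minimum-separation constraint is replaced by its linearization around the previous iterate. The discretization, dynamics, cost, and the use of cvxpy are assumptions, not the paper's formulation.

# Sketch of one sequential-convex-programming iteration for two vehicles on crossing
# paths; the non-convex separation constraint is linearized around the previous iterate
# (illustrative; discretization, dynamics and cost are placeholders).
import numpy as np
import cvxpy as cp

T, r_min = 20, 0.5                                       # time steps, minimum separation
p_prev = [np.linspace([0.0, 0.0], [2.0, 2.0], T),        # previous-iterate trajectories
          np.linspace([2.0, 0.0], [0.0, 2.0], T)]

p = [cp.Variable((T, 2)) for _ in range(2)]
cost, cons = 0, []
for i, (start, goal) in enumerate([([0.0, 0.0], [2.0, 2.0]), ([2.0, 0.0], [0.0, 2.0])]):
    cons += [p[i][0] == np.array(start), p[i][T - 1] == np.array(goal)]
    cost += cp.sum_squares(cp.diff(p[i], axis=0))        # smoothness / effort proxy
for k in range(T):
    diff = p_prev[0][k] - p_prev[1][k]
    nrm = np.linalg.norm(diff)
    if nrm < 1e-6:
        continue                                         # degenerate direction; skip
    # Supporting-plane approximation of the constraint ||p0 - p1|| >= r_min.
    cons += [(diff / nrm) @ (p[0][k] - p[1][k]) >= r_min]
cp.Problem(cp.Minimize(cost), cons).solve()
print("min separation:",
      min(np.linalg.norm(p[0].value[k] - p[1].value[k]) for k in range(T)))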

[DOI] Iterative learning of feed-forward corrections for high-performance tracking
F. L. Mueller, A. P. Schoellig, and R. D’Andrea
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 3276-3281.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides]

We revisit a recently developed iterative learning algorithm that enables systems to learn from a repeated operation with the goal of achieving high tracking performance of a given trajectory. The learning scheme is based on a coarse dynamics model of the system and uses past measurements to iteratively adapt the feed-forward input signal to the system. The novelty of this work is an identification routine that uses a numerical simulation of the system dynamics to extract the required model information. This allows the learning algorithm to be applied to any dynamic system for which a dynamics simulation is available (including systems with underlying feedback loops). The proposed learning algorithm is applied to a quadrocopter system that is guided by a trajectory-following controller. With the identification routine, we are able to extend our previous learning results to three-dimensional quadrocopter motions and achieve significantly higher tracking accuracy due to the underlying feedback control, which accounts for non-repetitive noise.

@INPROCEEDINGS{mueller-iros12,
author = {Fabian L. Mueller and Angela P. Schoellig and Raffaello D'Andrea},
title = {Iterative learning of feed-forward corrections for high-performance tracking},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {3276-3281},
year = {2012},
doi = {10.1109/IROS.2012.6385647},
urlvideo = {https://youtu.be/zHTCsSkmADo?list=PLC12E387419CEAFF2},
urlslides = {../../wp-content/papercite-data/slides/mueller-iros12-slides.pdf},
abstract = {We revisit a recently developed iterative learning algorithm that enables systems to learn from a repeated operation with the goal of achieving high tracking performance of a given trajectory. The learning scheme is based on a coarse dynamics model of the system and uses past measurements to iteratively adapt the feed-forward input signal to the system. The novelty of this work is an identification routine that uses a numerical simulation of the system dynamics to extract the required model information. This allows the learning algorithm to be applied to any dynamic system for which a dynamics simulation is available (including systems with underlying feedback loops). The proposed learning algorithm is applied to a quadrocopter system that is guided by a trajectory-following controller. With the identification routine, we are able to extend our previous learning results to three-dimensional quadrocopter motions and achieve significantly higher tracking accuracy due to the underlying feedback control, which accounts for non-repetitive noise.}
}
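
The identification idea, reduced to its core, is sketched below: the lifted input-to-output map required by the learning update is extracted from a black-box simulation by finite differences. The stand-in simulator and all numbers are placeholders, not the routine from the paper.

# Sketch: extracting the lifted input-to-output map needed by the learning update
# from a black-box dynamics simulation via finite differences (illustrative only).
import numpy as np

n = 15

def simulate(u_ff):
    """Placeholder for a closed-loop dynamics simulation returning the sampled
    output trajectory for a given feedforward input sequence."""
    A = 0.08 * np.tril(np.ones((n, n)))
    return A @ u_ff + 0.02 * np.tanh(u_ff)               # mildly nonlinear stand-in

u0 = np.zeros(n)
y0 = simulate(u0)
eps = 1e-4
F = np.zeros((n, n))
for k in range(n):
    du = np.zeros(n)
    du[k] = eps
    F[:, k] = (simulate(u0 + du) - y0) / eps             # column-wise Jacobian estimate

# F can now be used by the same optimization-based update as an analytic model would be.
print("identified lifted model condition number:", np.linalg.cond(F))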

[DOI] Feed-forward parameter identification for precise periodic quadrocopter motions
A. P. Schoellig, C. Wiltsche, and R. D’Andrea
in Proc. of the American Control Conference (ACC), 2012, pp. 4313-4318.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides]

This paper presents an approach for precisely tracking periodic trajectories with a quadrocopter. In order to improve temporal and spatial tracking performance, we propose a feed-forward strategy that adapts the motion parameters sent to the vehicle controller. The motion parameters are either adjusted on the fly or, in order to avoid initial transients, identified prior to the flight performance. We outline an identification scheme that tunes parameters for a large class of periodic motions, and requires only a small number of identification experiments prior to flight. This reduced identification is based on analysis and experiments showing that the quadrocopter’s closed-loop dynamics can be approximated by three directionally decoupled linear systems. We show the effectiveness of this approach by performing a sequence of periodic motions on real quadrocopters using the tuned parameters obtained by the reduced identification.

@INPROCEEDINGS{schoellig-acc12,
author = {Angela P. Schoellig and Clemens Wiltsche and Raffaello D'Andrea},
title = {Feed-forward parameter identification for precise periodic quadrocopter motions},
booktitle = {{Proc. of the American Control Conference (ACC)}},
pages = {4313-4318},
year = {2012},
doi = {10.1109/ACC.2012.6315248},
urlvideo = {http://tiny.cc/MusicInMotion},
urlslides = {../../wp-content/papercite-data/slides/schoellig-acc12-slides.pdf},
abstract = {This paper presents an approach for precisely tracking periodic trajectories with a quadrocopter. In order to improve temporal and spatial tracking performance, we propose a feed-forward strategy that adapts the motion parameters sent to the vehicle controller. The motion parameters are either adjusted on the fly or, in order to avoid initial transients, identified prior to the flight performance. We outline an identification scheme that tunes parameters for a large class of periodic motions, and requires only a small number of identification experiments prior to flight. This reduced identification is based on analysis and experiments showing that the quadrocopter's closed-loop dynamics can be approximated by three directionally decoupled linear systems. We show the effectiveness of this approach by performing a sequence of periodic motions on real quadrocopters using the tuned parameters obtained by the reduced identification.}
}
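
As an illustration of the kind of per-axis identification the abstract describes, the sketch below estimates the gain and phase lag of one assumed linear, decoupled axis at a single frequency via least squares on logged data and derives a feed-forward amplitude and phase correction. All signals and numbers are hypothetical, not the paper's identification scheme.

# Sketch: estimating the gain and phase lag of one (assumed decoupled, linear) axis at
# a single frequency from logged data, then pre-compensating the commanded parameters
# (illustrative; signals and numbers are hypothetical).
import numpy as np

f = 0.8                                                  # motion frequency [Hz]
t = np.arange(0.0, 10.0, 0.02)
a_cmd = 1.0
# Hypothetical logged response: attenuated and lagging relative to the command.
y = 0.8 * a_cmd * np.sin(2 * np.pi * f * t - 0.4) + 0.01 * np.random.randn(t.size)

# Least-squares fit of y ~ c1*sin(wt) + c2*cos(wt) at the known frequency.
basis = np.column_stack([np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)])
c1, c2 = np.linalg.lstsq(basis, y, rcond=None)[0]
gain = np.hypot(c1, c2) / a_cmd
phase_lag = -np.arctan2(c2, c1)

# Feed-forward correction: command a larger amplitude and an advanced phase.
print(f"gain {gain:.2f}, lag {phase_lag:.2f} rad -> "
      f"command amplitude {a_cmd / gain:.2f}, phase advance {phase_lag:.2f} rad")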

An aerial robotics demonstration for controls research at the ETH Flying Machine Arena
R. Ritz, M. W. Müller, F. Augugliaro, M. Hehn, S. Lupashin, A. P. Schoellig, and R. D’Andrea
Swiss Society for Automatic Control Bulletin, 2012.
[View BibTeX] [Download PDF]

@MISC{ritz-sga12,
author = {Robin Ritz and Mark W. M{\"u}ller and Federico Augugliaro and Markus Hehn and Sergei Lupashin and Angela P. Schoellig and Raffaello D'Andrea},
title = {An aerial robotics demonstration for controls research at the {ETH Flying Machine Arena}},
year = {2012},
number = {463},
pages = {2-15},
howpublished = {Swiss Society for Automatic Control Bulletin}
}

Quadrocopter slalom learning
A. P. Schoellig, F. L. Mueller, and R. D’Andrea
Video Submission, AI and Robotics Multimedia Fair, Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), 2012.
[View BibTeX] [View Video]

@MISC{schoellig-aaai12,
author = {Angela P. Schoellig and Fabian L. Mueller and Raffaello D'Andrea},
title = {Quadrocopter Slalom Learning},
howpublished = {Video Submission, AI and Robotics Multimedia Fair, Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI)},
urlvideo = {https://youtu.be/zHTCsSkmADo?list=PLC12E387419CEAFF2},
year = {2012},
}

2011

[DOI] Sensitivity of joint estimation in multi-agent iterative learning control
A. P. Schoellig and R. D’Andrea
in Proc. of the IFAC (International Federation of Automatic Control) World Congress, 2011, pp. 1204-1212.
[View BibTeX] [View Abstract] [Download PDF] [Download Additional Material] [Download Slides]
We consider a group of agents that simultaneously learn the same task, and revisit a previously developed algorithm, where agents share their information and learn jointly. We have already shown that, as compared to an independent learning model that disregards the information of the other agents, and when assuming similarity between the agents, a joint algorithm improves the learning performance of an individual agent. We now revisit the joint learning algorithm to determine its sensitivity to the underlying assumption of similarity between agents. We note that an incorrect assumption about the agents’ degree of similarity degrades the performance of the joint learning scheme. The degradation is particularly acute if we assume that the agents are more similar than they are in reality; in this case, a joint learning scheme can result in a poorer performance than the independent learning algorithm. In the worst case (when we assume that the agents are identical, but they are, in reality, not) the joint learning does not even converge to the correct value. We conclude that, when applying the joint algorithm, it is crucial not to overestimate the similarity of the agents; otherwise, a learning scheme that is independent of the similarity assumption is preferable.

@INPROCEEDINGS{schoellig-ifac11,
author = {Angela P. Schoellig and Raffaello D'Andrea},
title = {Sensitivity of joint estimation in multi-agent iterative learning control},
booktitle = {{Proc. of the IFAC (International Federation of Automatic Control) World Congress}},
pages = {1204-1212},
year = {2011},
doi = {10.3182/20110828-6-IT-1002.03687},
urlslides = {../../wp-content/papercite-data/slides/schoellig-ifac11-slides.pdf},
urldata = {../../wp-content/papercite-data/data/schoellig-ifac11-files.zip},
abstract = {We consider a group of agents that simultaneously learn the same task, and revisit a previously developed algorithm, where agents share their information and learn jointly. We have already shown that, as compared to an independent learning model that disregards the information of the other agents, and when assuming similarity between the agents, a joint algorithm improves the learning performance of an individual agent. We now revisit the joint learning algorithm to determine its sensitivity to the underlying assumption of similarity between agents. We note that an incorrect assumption about the agents' degree of similarity degrades the performance of the joint learning scheme. The degradation is particularly acute if we assume that the agents are more similar than they are in reality; in this case, a joint learning scheme can result in a poorer performance than the independent learning algorithm. In the worst case (when we assume that the agents are identical, but they are, in reality, not) the joint learning does not even converge to the correct value. We conclude that, when applying the joint algorithm, it is crucial not to overestimate the similarity of the agents; otherwise, a learning scheme that is independent of the similarity assumption is preferable.}
}

[DOI] Feasibility of motion primitives for choreographed quadrocopter flight
A. P. Schoellig, M. Hehn, S. Lupashin, and R. D’Andrea
in Proc. of the American Control Conference (ACC), 2011, pp. 3843-3849.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Additional Material] [Download Slides]

This paper describes a method for checking the feasibility of quadrocopter motions. The approach, meant as a validation tool for preprogrammed quadrocopter performances, is based on first principles models and ensures that a desired trajectory respects both vehicle dynamics and motor thrust limits. We apply this method towards the eventual goal of using parameterized motion primitives for expressive quadrocopter choreographies. First, we show how a large class of motion primitives can be formulated as truncated Fourier series. We then show how the feasibility check can be applied to such motions by deriving explicit parameter constraints for two particular parameterized primitives. The predicted feasibility constraints are compared against experimental results from quadrocopters in the ETH Flying Machine Arena.

@INPROCEEDINGS{schoellig-acc11,
author = {Angela P. Schoellig and Markus Hehn and Sergei Lupashin and Raffaello D'Andrea},
title = {Feasibility of motion primitives for choreographed quadrocopter flight},
booktitle = {{Proc. of the American Control Conference (ACC)}},
pages = {3843-3849},
year = {2011},
doi = {10.1109/ACC.2011.5991482},
urlvideo = {https://www.youtube.com/playlist?list=PLD6AAACCBFFE64AC5},
urlslides = {../../wp-content/papercite-data/slides/schoellig-acc11-slides.pdf},
urldata = {../../wp-content/papercite-data/data/schoellig-acc11-files.zip},
abstract = {This paper describes a method for checking the feasibility of quadrocopter motions. The approach, meant as a validation tool for preprogrammed quadrocopter performances, is based on first principles models and ensures that a desired trajectory respects both vehicle dynamics and motor thrust limits. We apply this method towards the eventual goal of using parameterized motion primitives for expressive quadrocopter choreographies. First, we show how a large class of motion primitives can be formulated as truncated Fourier series. We then show how the feasibility check can be applied to such motions by deriving explicit parameter constraints for two particular parameterized primitives. The predicted feasibility constraints are compared against experimental results from quadrocopters in the ETH Flying Machine Arena.}
}
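
A toy version of such a feasibility check is sketched below for a single-harmonic side-to-side primitive at constant height: the required acceleration is compared against assumed collective-thrust limits of a point-mass model. The limits, the primitive, and the planar model are illustrative assumptions, not the paper's first-principles model.

# Sketch of a feasibility check for a single-harmonic side-to-side primitive at
# constant height (illustrative; thrust limits and the point-mass model are assumed).
import numpy as np

g = 9.81
a_min, a_max = 0.5 * g, 2.0 * g                # assumed collective thrust limits [m/s^2]

def feasible(amplitude, freq_hz):
    """Check whether x(t) = amplitude * sin(2*pi*f*t) respects the thrust limits."""
    w = 2 * np.pi * freq_hz
    t = np.linspace(0.0, 1.0 / freq_hz, 200)
    ax = -amplitude * w ** 2 * np.sin(w * t)   # required horizontal acceleration
    thrust = np.sqrt(ax ** 2 + g ** 2)         # thrust must supply gravity plus ax
    return bool(np.all((thrust >= a_min) & (thrust <= a_max)))

for amp in (0.1, 0.3, 1.0):
    print(f"amplitude {amp} m at 1 Hz feasible:", feasible(amp, 1.0))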

[DOI] The Flying Machine Arena as of 2010
S. Lupashin, A. P. Schoellig, M. Hehn, and R. D’Andrea
Video Submission, in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2011.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [More Information]

The Flying Machine Arena (FMA) is an indoor research space built specifically for the study of autonomous systems and aerial robotics. In this video, we give an overview of this testbed and some of its capabilities. We show the FMA infrastructure and hardware, which includes a fleet of quadrocopters and a motion capture system for vehicle localization. The physical components of the FMA are complemented by specialized software tools and components that facilitate the use of the space and provide a unified framework for communication and control. The flexibility and modularity of the experimental platform is highlighted by various research projects and demonstrations.

@MISC{lupashin-icra11,
author = {Sergei Lupashin and Angela P. Schoellig and Markus Hehn and Raffaello D'Andrea},
title = {The {Flying Machine Arena} as of 2010},
howpublished = {Video Submission, in Proc. of the IEEE International Conference on Robotics and Automation (ICRA)},
year = {2011},
pages = {2970-2971},
doi = {10.1109/ICRA.2011.5980308},
urlvideo = {https://youtu.be/pcgvWhu8Arc?list=PLuLKX4lDsLIaVjdGsZxNBKLcogBnVVFQr},
urllink = {http://www.flyingmachinearena.org},
abstract = {The Flying Machine Arena (FMA) is an indoor research space built specifically for the study of autonomous systems and aerial robotics. In this video, we give an overview of this testbed and some of its capabilities. We show the FMA infrastructure and hardware, which includes a fleet of quadrocopters and a motion capture system for vehicle localization. The physical components of the FMA are complemented by specialized software tools and components that facilitate the use of the space and provide a unified framework for communication and control. The flexibility and modularity of the experimental platform is highlighted by various research projects and demonstrations.},
}

2010

[DOI] A simple learning strategy for high-speed quadrocopter multi-flips
S. Lupashin, A. P. Schoellig, M. Sherback, and R. D’Andrea
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2010, pp. 1642-1648.
[View BibTeX] [View Abstract] [Download PDF] [View Video]
We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first-principles model. We start by formulating an N-flip maneuver as a five-step primitive with five adjustable parameters. Optimization using a low-order first-principles 2D vertical plane model of the quadrocopter yields an initial set of parameters and a corrective matrix. The maneuver is then repeatedly performed with the vehicle. At each iteration the state error at the end of the primitive is used to update the maneuver parameters via a gradient adjustment. The method is demonstrated at the ETH Zurich Flying Machine Arena testbed on quadrotor helicopters performing and improving on flips, double flips and triple flips.

@INPROCEEDINGS{lupashin-icra10,
author = {Sergei Lupashin and Angela P. Schoellig and Michael Sherback and Raffaello D'Andrea},
title = {A simple learning strategy for high-speed quadrocopter multi-flips},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {1642-1648},
year = {2010},
doi = {10.1109/ROBOT.2010.5509452},
urlvideo = {https://youtu.be/bWExDW9J9sA?list=PLC12E387419CEAFF2},
abstract = {We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first-principles model. We start by formulating an N-flip maneuver as a five-step primitive with five adjustable parameters. Optimization using a low-order first-principles 2D vertical plane model of the quadrocopter yields an initial set of parameters and a corrective matrix. The maneuver is then repeatedly performed with the vehicle. At each iteration the state error at the end of the primitive is used to update the maneuver parameters via a gradient adjustment. The method is demonstrated at the ETH Zurich Flying Machine Arena testbed on quadrotor helicopters performing and improving on flips, double flips and triple flips.}
}
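
The iterative correction idea can be summarized in a few lines, as sketched below: after each (here simulated) trial, the error observed at the end of the maneuver is mapped through a corrective matrix into an adjustment of the maneuver parameters. The primitive, the corrective matrix, and the error model are placeholders, not the paper's setup.

# Sketch of the iterative parameter correction for a parameterized maneuver
# (illustrative; the primitive, corrective matrix and error model are placeholders).
import numpy as np

params = np.array([0.5, 0.3, 1.2, 0.3, 0.5])             # five adjustable parameters
K = 0.5 * np.eye(5)                                      # corrective matrix (placeholder)

def final_state_error(p):
    """Stand-in for flying the maneuver and measuring the error at its end."""
    p_star = np.array([0.55, 0.28, 1.25, 0.33, 0.48])
    return p - p_star + 0.005 * np.random.randn(5)

for trial in range(6):
    e = final_state_error(params)
    params = params - K @ e                              # gradient-style correction
    print(f"trial {trial}: |error| = {np.linalg.norm(e):.3f}")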

[DOI] Independent vs. joint estimation in multi-agent iterative learning control
A. P. Schoellig, J. Alonso-Mora, and R. D’Andrea
in Proc. of the IEEE Conference on Decision and Control (CDC), 2010, pp. 6949-6954.
[View BibTeX] [View Abstract] [Download PDF] [Download Additional Material] [Download Slides]

This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. The agents improve their performance by using the knowledge gained from previous executions. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual’s learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent’s disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. We analytically derive an upper bound of the performance improvement due to joint estimation. Results are obtained for two limiting cases: (i) pure process noise, and (ii) pure measurement noise. The benefits of information sharing are negligible in (i). For (ii), a performance improvement is observed when a high similarity between the agents is guaranteed.

@INPROCEEDINGS{schoellig-cdc10,
author = {Angela P. Schoellig and Javier Alonso-Mora and Raffaello D'Andrea},
title = {Independent vs. joint estimation in multi-agent iterative learning control},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
pages = {6949-6954},
year = {2010},
doi = {10.1109/CDC.2010.5717888},
urlslides = {../../wp-content/papercite-data/slides/schoellig-cdc10-slides.pdf},
urldata = {../../wp-content/papercite-data/data/schoellig-cdc10-files.zip},
abstract = {This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. The agents improve their performance by using the knowledge gained from previous executions. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual's learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent's disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. We analytically derive an upper bound of the performance improvement due to joint estimation. Results are obtained for two limiting cases: (i) pure process noise, and (ii) pure measurement noise. The benefits of information sharing are negligible in (i). For (ii), a performance improvement is observed when a high similarity between the agents is guaranteed.}
}

A platform for dance performances with multiple quadrocopters
A. P. Schoellig, F. Augugliaro, and R. D’Andrea
in Proc. of the Workshop on Robots and Musical Expressions at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2010, pp. 1-8.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [View 2nd Video] [Download Slides]

This paper presents a platform for rhythmic flight with multiple quadrocopters. We envision an expressive multimedia dance performance that is automatically composed and controlled, given a random piece of music. Results in this paper prove the feasibility of audio-motion synchronization when precisely timing the side-to-side motion of a quadrocopter to the beat of the music. An illustration of the indoor flight space and the vehicles shows the characteristics and capabilities of the experimental setup. Prospective features of the platform are outlined and key challenges are emphasized. The paper concludes with a proof-of-concept demonstration showing three vehicles synchronizing their side-to-side motion to the music beat. Moreover, a dance performance to a remix of the sound track ‘Pirates of the Caribbean’ gives a first impression of the novel musical experience. Future steps include an appropriate multiscale music analysis and the development of algorithms for the automated generation of choreography based on a database of motion primitives.

@INPROCEEDINGS{schoellig-iros10,
author = {Angela P. Schoellig and Federico Augugliaro and Raffaello D'Andrea},
title = {A platform for dance performances with multiple quadrocopters},
booktitle = {{Proc. of the Workshop on Robots and Musical Expressions at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {1-8},
year = {2010},
urlvideo = {https://youtu.be/aaaGJKnJdrg?list=PLD6AAACCBFFE64AC5},
urlvideo2 = {https://www.youtube.com/playlist?list=PLD6AAACCBFFE64AC5},
urlslides = {../../wp-content/papercite-data/slides/schoellig-iros10-slides.pdf},
abstract = {This paper presents a platform for rhythmic flight with multiple quadrocopters. We envision an expressive multimedia dance performance that is automatically composed and controlled, given a random piece of music. Results in this paper prove the feasibility of audio-motion synchronization when precisely timing the side-to-side motion of a quadrocopter to the beat of the music. An illustration of the indoor flight space and the vehicles shows the characteristics and capabilities of the experimental setup. Prospective features of the platform are outlined and key challenges are emphasized. The paper concludes with a proof-of-concept demonstration showing three vehicles synchronizing their side-to-side motion to the music beat. Moreover, a dance performance to a remix of the sound track 'Pirates of the Caribbean' gives a first impression of the novel musical experience. Future steps include an appropriate multiscale music analysis and the development of algorithms for the automated generation of choreography based on a database of motion primitives.}
}

[DOI] Synchronizing the motion of a quadrocopter to music
A. P. Schoellig, F. Augugliaro, and R. D’Andrea
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2010, pp. 3355-3360.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides]

This paper presents a quadrocopter flying in rhythm to music. The quadrocopter performs a periodic side-to-side motion in time to a musical beat. Underlying controllers are designed that stabilize the vehicle and produce a swinging motion. Synchronization is then achieved by using concepts from phase-locked loops. A phase comparator combined with a correction algorithm eliminate the phase error between the music reference and the actual quadrocopter motion. Experimental results show fast and effective synchronization that is robust to sudden changes in the reference amplitude and frequency. Changes in frequency and amplitude are tracked precisely when adding an additional feedforward component, based on an experimentally determined look-up table.

@INPROCEEDINGS{schoellig-icra10,
author = {Angela P. Schoellig and Federico Augugliaro and Raffaello D'Andrea},
title = {Synchronizing the motion of a quadrocopter to music},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {3355-3360},
year = {2010},
doi = {10.1109/ROBOT.2010.5509755},
urlslides = {../../wp-content/papercite-data/slides/schoellig-icra10-slides.pdf},
urlvideo = {https://youtu.be/Kx4DtXv_bPo?list=PLD6AAACCBFFE64AC5},
abstract = {This paper presents a quadrocopter flying in rhythm to music. The quadrocopter performs a periodic side-to-side motion in time to a musical beat. Underlying controllers are designed that stabilize the vehicle and produce a swinging motion. Synchronization is then achieved by using concepts from phase-locked loops. A phase comparator combined with a correction algorithm eliminate the phase error between the music reference and the actual quadrocopter motion. Experimental results show fast and effective synchronization that is robust to sudden changes in the reference amplitude and frequency. Changes in frequency and amplitude are tracked precisely when adding an additional feedforward component, based on an experimentally determined look-up table.}
}
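
The synchronization mechanism can be illustrated with a small phase-locked-loop simulation: a phase comparator wraps the error between the beat phase and the motion phase, and a correction term adjusts the motion frequency and phase until the error vanishes. The gains, frequencies, and discrete-time loop below are assumptions for illustration, not the paper's controller.

# Sketch of the phase-locked-loop idea: a phase comparator wraps the error between the
# beat phase and the motion phase, and a correction adjusts frequency and phase
# (illustrative; gains, frequencies and the discrete-time loop are assumptions).
import numpy as np

dt = 0.01
f_beat, f_motion = 2.0, 1.9                    # beat and initial motion frequency [Hz]
phi_beat, phi_motion = 0.0, 0.5                # initial phases [rad]
kp, ki = 1.5, 0.8                              # correction gains

for step in range(1500):
    phi_beat += 2 * np.pi * f_beat * dt
    # Phase comparator: wrap the phase error to (-pi, pi].
    err = np.angle(np.exp(1j * (phi_beat - phi_motion)))
    # Correction: adjust both the motion frequency and its instantaneous phase.
    f_motion += ki * err * dt
    phi_motion += 2 * np.pi * f_motion * dt + kp * err * dt
    if step % 300 == 0:
        print(f"t = {step * dt:4.1f} s, phase error = {err:+.3f} rad")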

2009

Optimization-based iterative learning control for trajectory tracking
A. P. Schoellig and R. D’Andrea
in Proc. of the European Control Conference (ECC), 2009, pp. 1505-1510.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides] [More Information]
In this paper, an optimization-based iterative learning control approach is presented. Given a desired trajectory to be followed, the proposed learning algorithm improves the system performance from trial to trial by exploiting the experience gained from previous repetitions. Taking advantage of the a-priori knowledge about the systems dominating dynamics, a data-based update rule is derived which adapts the feedforward input signal after each trial. By combining traditional model-based optimal filtering methods with state-of-the-art optimization techniques such as convex programming, an effective and computationally highly efficient learning strategy is obtained. Moreover, the derived formalism allows for the direct treatment of input and state constraints. Different (nonlinear) performance objectives can be specified defining the overall learning behavior. Finally, the proposed algorithm is successfully applied to the benchmark problem of swinging up a pendulum using open-loop control only.

@INPROCEEDINGS{schoellig-ecc09,
author = {Angela P. Schoellig and Raffaello D'Andrea},
title = {Optimization-based iterative learning control for trajectory tracking},
booktitle = {{Proc. of the European Control Conference (ECC)}},
pages = {1505-1510},
year = {2009},
urlslides = {../../wp-content/papercite-data/slides/schoellig-ecc09-slides.pdf},
urllink = {http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=7074619},
urlvideo = {https://youtu.be/W2gCn6aAwz4?list=PLC12E387419CEAFF2},
abstract = {In this paper, an optimization-based iterative learning control approach is presented. Given a desired trajectory to be followed, the proposed learning algorithm improves the system performance from trial to trial by exploiting the experience gained from previous repetitions. Taking advantage of the a-priori knowledge about the systems dominating dynamics, a data-based update rule is derived which adapts the feedforward input signal after each trial. By combining traditional model-based optimal filtering methods with state-of-the-art optimization techniques such as convex programming, an effective and computationally highly efficient learning strategy is obtained. Moreover, the derived formalism allows for the direct treatment of input and state constraints. Different (nonlinear) performance objectives can be specified defining the overall learning behavior. Finally, the proposed algorithm is successfully applied to the benchmark problem of swinging up a pendulum using open-loop control only.}
}

2008

Verification of the performance of selected subsystems for the LISA mission (in German)
P. F. Gath, D. Weise, T. Heinrich, A. P. Schoellig, and S. Otte
in Proc. of the German Aerospace Congress, German Society for Aeronautics and Astronautics (DGLR), 2008.
[View BibTeX] [View Abstract] [Download PDF]
As part of the investigation into the system performance of alternative payload concepts for the LISA mission (Laser Interferometer Space Antenna), individual payload subsystems are currently being evaluated at Astrium with respect to their performance. This is done both through theoretical studies based on simulations and through experimental laboratory investigations.

@INPROCEEDINGS{gath-gac08,
author = {Peter F. Gath and Dennis Weise and Thomas Heinrich and Angela P. Schoellig and S. Otte},
title = {Verification of the performance of selected subsystems for the {LISA} mission {(in German)}},
booktitle = {{Proc. of the German Aerospace Congress, German Society for Aeronautics and Astronautics (DGLR)}},
year = {2008},
abstract = {Im Rahmen der Untersuchung zur Systemleistung alternativer Nutzlastkonzepte fuer die LISA Mission (Laser Interferometer Space Antenna) werden bei Astrium derzeit einzelne Untersysteme der Nutzlast auf ihre Leistungsfaehigkeit hin ueberprueft. Dies geschieht sowohl durch theoretische Untersuchungen im Rahmen von Simulationen als auch durch experimentelle Laboruntersuchungen.}
}

Learning through experience — Optimizing performance by repetition
A. P. Schoellig and R. D’Andrea
Abstract and Poster, in Proc. of the Robotics Challenges for Machine Learning Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2008.
[View BibTeX] [View Abstract] [Download PDF] [View Video] [Download Slides] [More Information]

The goal of our research is to develop a strategy which enables a system, executing the same task multiple times, to use the knowledge of the previous trials to learn more about its own dynamics and enhance its future performance. Our approach, which falls in the field of iterative learning control, combines methods from two areas: traditional model-based estimation and control, and purely data-based learning.

@MISC{schoellig-iros08,
author = {Angela P. Schoellig and Raffaello D'Andrea},
title = {Learning through experience -- {O}ptimizing performance by repetition},
howpublished = {Abstract and Poster, in Proc. of the Robotics Challenges for Machine Learning Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year = {2008},
urlvideo = {https://youtu.be/W2gCn6aAwz4?list=PLC12E387419CEAFF2},
urlslides = {../../wp-content/papercite-data/slides/schoellig-iros08-slides.pdf},
urllink = {http://www.learning-robots.de/pmwiki.php/TC/IROS2008},
abstract = {The goal of our research is to develop a strategy which enables a system, executing the same task multiple times, to use the knowledge of the previous trials to learn more about its own dynamics and enhance its future performance. Our approach, which falls in the field of iterative learning control, combines methods from both areas, traditional model-based estimation and control and purely data-based learning.},
}

2007

[DOI] A hybrid Bellman equation for bimodal systems
P. Caines, M. Egerstedt, R. Malhame, and A. P. Schoellig
in Hybrid Systems: Computation and Control, A. Bemporad, A. Bicchi, and G. Buttazzo, Eds., Springer Berlin Heidelberg, 2007, vol. 4416, pp. 656-659.
[View BibTeX] [View Abstract] [Download PDF]
In this paper we present a dynamic programming formulation of a hybrid optimal control problem for bimodal systems with regional dynamics. In particular, based on optimality-zone computations, a framework is presented in which the resulting hybrid Bellman equation guides the design of optimal control programs with, at most, N discrete transitions.

@INCOLLECTION{caines-springer07,
author={Peter Caines and Magnus Egerstedt and Roland Malhame and Angela P. Schoellig},
title={A Hybrid {Bellman} Equation for Bimodal Systems},
booktitle={{Hybrid Systems: Computation and Control}},
editor={Bemporad, Alberto and Bicchi, Antonio and Buttazzo, Giorgio},
publisher={Springer Berlin Heidelberg},
pages={656-659},
year={2007},
volume={4416},
series={Lecture Notes in Computer Science},
doi={10.1007/978-3-540-71493-4_54},
abstract = {In this paper we present a dynamic programming formulation of a hybrid optimal control problem for bimodal systems with regional dynamics. In particular, based on optimality-zone computations, a framework is presented in which the resulting hybrid Bellman equation guides the design of optimal control programs with, at most, N discrete transitions.}
}

[DOI] A hybrid Bellman equation for systems with regional dynamics
A. P. Schoellig, P. E. Caines, M. Egerstedt, and R. P. Malhamé
in Proc. of the IEEE Conference on Decision and Control (CDC), 2007, pp. 3393-3398.
[View BibTeX] [View Abstract] [Download PDF]

In this paper, we study hybrid systems with regional dynamics, i.e., systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, we focus our attention on the optimal control problem associated with such systems, and we present a Hybrid Bellman Equation for such systems that provides a characterization of global optimality, given an upper bound on the number of switches. Not surprisingly, the solution will be hybrid in nature in that it will depend not only on the continuous control signals, but also on discrete decisions as to what domains the system should go through in the first place. A number of examples are presented to highlight the operation of the proposed approach.

@INPROCEEDINGS{schoellig-cdc07,
author = {Angela P. Schoellig and Peter E. Caines and Magnus Egerstedt and Roland P. Malham\'e},
title = {A hybrid {B}ellman equation for systems with regional dynamics},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
pages = {3393-3398},
year = {2007},
doi = {10.1109/CDC.2007.4434952},
abstract = {In this paper, we study hybrid systems with regional dynamics, i.e., systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, we focus our attention on the optimal control problem associated with such systems, and we present a Hybrid Bellman Equation for such systems that provide a characterization of global optimality, given an upper bound on the number of switches. Not surprisingly, the solution will be hybrid in nature in that it will depend on not only the continuous control signals, but also on discrete decisions as to what domains the system should go through in the first place. A number of examples are presented to highlight the operation of the proposed approach.}
}
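
As a rough illustration of the recursion behind such a hybrid Bellman equation (the notation below is ours, not quoted from the paper), one can write a value function V_k(x,q) over the continuous state x, the active region q, and at most k remaining transitions:

% Illustrative notation: \ell_q is the running cost in region q, \phi the
% terminal cost, S_{q,q'} the switching surface between regions q and q',
% and c_q(x,z) the optimal cost of steering from x to z within region q.
\begin{align*}
V_0(x,q) &= \min_{u(\cdot)} \int_t^T \ell_q\bigl(x(s),u(s)\bigr)\,\mathrm{d}s + \phi\bigl(x(T)\bigr),\\
V_k(x,q) &= \min\Bigl\{\, V_{k-1}(x,q),\;
  \min_{q'\neq q}\;\min_{z\in S_{q,q'}} \bigl[\, c_q(x,z) + V_{k-1}(z,q') \,\bigr] \Bigr\}.
\end{align*}

The inner minimization over boundary points z is where the discrete choice of region sequence enters; the papers above make this recursion and its optimality guarantees precise.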

Topology-dependent stability of a network of dynamical systems with communication delays
A. P. Schoellig, U. Münz, and F. Allgöwer
in Proc. of the European Control Conference (ECC), 2007, pp. 1197-1202.
[View BibTeX] [View Abstract] [Download PDF] [More Information]

In this paper, we analyze the stability of a network of first-order linear time-invariant systems with constant, identical communication delays. We investigate the influence of both system parameters and network characteristics on stability. In particular, a non-conservative stability bound on the delay is given such that the network is asymptotically stable for any delay smaller than this bound. We show how the network topology changes the stability bound. As an example, we use these results to answer the question of whether a symmetric or skew-symmetric interconnection is preferable for a given set of subsystems.

@INPROCEEDINGS{schoellig-ecc07,
author = {Angela P. Schoellig and Ulrich M\"unz and Frank Allg\"ower},
title = {Topology-Dependent Stability of a Network of Dynamical Systems with Communication Delays},
booktitle = {{Proc. of the European Control Conference (ECC)}},
pages = {1197-1202},
year = {2007},
urllink = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=7068977},
abstract = {In this paper, we analyze the stability of a network of first-order linear time-invariant systems with constant, identical communication delays. We investigate the influence of both system parameters and network characteristics on stability. In particular, a non-conservative stability bound for the delay is given such that the network is asymptotically stable for any delay smaller than this bound. We show how the network topology changes the stability bound. Exemplarily, we use these results to answer the question if a symmetric or skew-symmetric interconnection is preferable for a given set of subsystems.}
}
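
To illustrate how topology enters such delay bounds (this is a generic, related example and not the bound derived in the paper), consider identical first-order subsystems coupled through a matrix K whose entries encode the topology:

% Illustrative model: \dot{x}_i(t) = a\,x_i(t) + \sum_j k_{ij}\,x_j(t-\tau).
% Diagonalizing K decouples the network into scalar delay equations
%   \dot{\xi}_i(t) = a\,\xi_i(t) + \lambda_i(K)\,\xi_i(t-\tau),
% so the admissible delay is governed by the eigenvalues of K, i.e., by the
% topology. For the special case a = 0 and K = -L with L a symmetric graph
% Laplacian (delayed consensus dynamics), a well-known bound is
\[
  \tau \;<\; \frac{\pi}{2\,\lambda_{\max}(L)}.
\]

The paper's non-conservative bound addresses the general first-order case, including the comparison between symmetric and skew-symmetric interconnections mentioned in the abstract.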

Optimal control of hybrid systems with regional dynamics
A. P. Schoellig
Master's Thesis, Georgia Institute of Technology, USA, 2007.
[View BibTeX] [View Abstract] [Download PDF] [Download Slides] [More Information]

In this work, hybrid systems with regional dynamics are considered. These are systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, the attention is focused on the optimal control problem associated with such systems. More precisely, given a specific cost function, the goal is to determine the optimal path of going from a given starting point to a fixed final state during an a priori specified time horizon. The key characteristic of the approach presented in this thesis is a hierarchical decomposition of the hybrid optimal control problem, yielding a framework which allows a solution on different levels of control. On the highest level of abstraction, the regional structure of the state space is taken into account, and a discrete representation of the connections between the different regions provides global accessibility relations between regions. These are used on a lower level of control to formulate the main theorem of this work, namely, the Hybrid Bellman Equation for multimodal systems, which provides a characterization of global optimality, given an upper bound on the number of transitions along a hybrid trajectory. Not surprisingly, the optimal solution is hybrid in nature, in that it depends not only on the continuous control signals, but also on discrete decisions as to what domains the system's continuous state should go through in the first place. The main benefit of the proposed approach lies in the fact that a hierarchical Dynamic Programming algorithm provides both a theoretical characterization of the hybrid solution's structural composition and, from a more application-driven point of view, a numerically implementable calculation rule yielding globally optimal solutions in a regional dynamics framework. The operation of the recursive algorithm is highlighted by the consideration of numerous examples, among them a heterogeneous multi-agent problem.

@MASTERSTHESIS{schoellig-gatech07,
author = {Angela P. Schoellig},
title = {Optimal control of hybrid systems with regional dynamics},
school = {Georgia Institute of Technology},
urllink = {http://hdl.handle.net/1853/19874},
urlslides = {../../wp-content/papercite-data/slides/schoellig-gatech07-slides.pdf},
year = {2007},
address = {USA},
abstract = {In this work, hybrid systems with regional dynamics are considered. These are systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, the attention is focused on the optimal control problem associated with such systems. More precisely, given a specific cost function, the goal is to determine the optimal path of going from a given starting point to a fixed final state during an a priori specified time horizon. The key characteristic of the approach presented in this thesis is a hierarchical decomposition of the hybrid optimal control problem, yielding to a framework which allows a solution on different levels of control. On the highest level of abstraction, the regional structure of the state space is taken into account and a discrete representation of the connections between the different regions provides global accessibility relations between regions. These are used on a lower level of control to formulate the main theorem of this work, namely, the Hybrid Bellman Equation for multimodal systems, which, in fact, provides a characterization of global optimality, given an upper bound on the number of transitions along a hybrid trajectory. Not surprisingly, the optimal solution is hybrid in nature, in that it depends on not only the continuous control signals, but also on discrete decisions as to what domains the system's continuous state should go through in the first place. The main benefit with the proposed approach lies in the fact that a hierarchical Dynamic Programming algorithm can be used to representing both a theoretical characterization of the hybrid solution's structural composition and, from a more application-driven point of view, a numerically implementable calculation rule yielding to globally optimal solutions in a regional dynamics framework. The operation of the recursive algorithm is highlighted by the consideration of numerous examples, among them, a heterogeneous multi-agent problem.},
}
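
A toy sketch of the hierarchical decomposition described in the thesis abstract; the region graph and the crossing-cost oracle below are made-up placeholders, with the within-region optimal control problem abstracted behind a single cost function.

# Hybrid-Bellman-style search over region sequences with at most k
# discrete transitions; the low-level problem of steering the continuous
# state across a region is hidden behind region_crossing_cost().

REGION_GRAPH = {            # placeholder region-adjacency structure
    "A": ["B", "C"],
    "B": ["C"],
    "C": [],
}

def region_crossing_cost(region, next_region):
    # Placeholder for the low-level optimal control problem: the cost of
    # steering across `region` to the switching surface shared with
    # `next_region` (or to the goal state when next_region is None).
    return 1.0

def best_cost(region, goal_region, k):
    # Best achievable cost from `region` to the goal using at most k
    # discrete transitions.
    if region == goal_region:
        return region_crossing_cost(region, None)
    if k == 0:
        return float("inf")
    return min(
        (region_crossing_cost(region, nxt) + best_cost(nxt, goal_region, k - 1)
         for nxt in REGION_GRAPH[region]),
        default=float("inf"),
    )

print(best_cost("A", "C", k=2))   # e.g. A -> C directly, or A -> B -> C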

2006

Stability of a network of dynamical systems with communication delays (in German)
A. P. Schoellig
Semester Project, University of Stuttgart, Germany, 2006.
[View BibTeX] [Download PDF] [Download Slides]
@MASTERSTHESIS{schoellig-stuttgart06,
author = {Angela P. Schoellig},
title = {Stability of a network of dynamical systems with communication delays {(in German)}},
school = {University of Stuttgart},
type = {Semester Project},
urlslides = {../../wp-content/papercite-data/slides/schoellig-stuttgart06-slides.pdf},
year = {2006},
address = {Germany}
}
