Publications | Dynamic Systems Lab | Prof. Angela Schoellig

Publications


A BibTeX file that includes all references can be found here. You can also follow our publication updates via Google Scholar.

2026

SplatXtRact: tractable Gaussian splatting via open world region-of-interest extraction and refinement
H. Schieber, C. Kleinbeck, A. P. Schoellig, S. Leutenegger, and D. Roth
IEEE Robotics and Automation Letters (RA-L), 2026. Accepted.

We present a task-conditioned refinement for 3D Gaussian Splatting (GS) that enables robots or human operators to selectively extract task-relevant regions of a learned scene. Given a pre-trained GS map, our approach supports local region-of-interest (ROI) refinement, preserving a global map consistency while meeting close to real-time constraints required for interactive robotic perception. The framework decouples semantic ROI selection from initial GS optimization, allowing flexible integration with external and novel perception models. We evaluate our approach on indoor and outdoor data (TUM RGB-D, MipNeRF360), demonstrating a higher novel view synthesis quality compared to the state-of-the-art, reduced artifacts, and bounded latency suitable for human-in-the-loop operation.

@ARTICLE{schieber-ral26,
author={Hannah Schieber and Constantin Kleinbeck and Angela P. Schoellig and Stefan Leutenegger and Daniel Roth},
journal = {{IEEE Robotics and Automation Letters (RA-L)}},
title={{SplatXtRact}: Tractable Gaussian Splatting via Open World Region-of-Interest Extraction and Refinement},
year={2026},
note={Accepted},
doi={10.1109/LRA.2026.3699257},
abstract = {We present a task-conditioned refinement for 3D Gaussian Splatting (GS) that enables robots or human operators to selectively extract task-relevant regions of a learned scene. Given a pre-trained GS map, our approach supports local region-of-interest (ROI) refinement, preserving a global map consistency while meeting close to real-time constraints required for interactive robotic perception. The framework decouples semantic ROI selection from initial GS optimization, allowing flexible integration with external and novel perception models. We evaluate our approach on indoor and outdoor data (TUM RGB-D, MipNeRF360), demonstrating a higher novel view synthesis quality compared to the state-of-the-art, reduced artifacts, and bounded latency suitable for human-in-the-loop operation.},
}

CLARE: continual learning for vision-language-action models via autonomous adapter routing and expansion
R. Römer*, Y. Zhang*, and A. P. Schoellig
IEEE Robotics and Automation Letters (RA-L), 2026. Accepted.

To teach robots complex manipulation tasks, a common approach is to fine-tune a pre-trained vision-language-action model (VLA) on task-specific data. However, since this recipe updates existing representations, it is unsuitable for long-term operation in the real world, where robots must continually adapt to new tasks and environments while retaining the knowledge they have already acquired. Existing continual learning methods for robotics commonly require storing previous data (exemplars), struggle with long task sequences, or rely on task identifiers for deployment. To address these limitations, we propose CLARE, a general, parameter-efficient framework for exemplar-free continual learning with VLAs. CLARE introduces lightweight modular adapters into selected VLA modules and autonomously expands the model only where necessary when learning a new task, guided by layer-wise feature similarity. During deployment, an autoencoder-based routing mechanism dynamically activates the most relevant adapters without requiring task labels. Through extensive experiments on the LIBERO benchmark and five real-world tasks, we show that CLARE achieves high performance on new tasks without catastrophic forgetting of earlier tasks, significantly outperforming even exemplar-based methods. Code, data, and videos are available at our website: https://tum-lsy.github.io/clare/.

@ARTICLE{roemer-ral26,
author={Ralf Römer* and Yi Zhang* and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters (RA-L)}},
title={{CLARE}: Continual Learning for Vision-Language-Action Models via Autonomous Adapter Routing and Expansion},
year={2026},
note={Accepted},
urllink = {https://tum-lsy.github.io/clare/},
urlcode = {https://github.com/utiasDSL/clare},
urllink2 = {https://arxiv.org/abs/2601.09512},
doi={10.1109/LRA.2026.3693992},
abstract = {To teach robots complex manipulation tasks, a common approach is to fine-tune a pre-trained vision-language-action model (VLA) on task-specific data. However, since this recipe updates existing representations, it is unsuitable for long-term operation in the real world, where robots must continually adapt to new tasks and environments while retaining the knowledge they have already acquired. Existing continual learning methods for robotics commonly require storing previous data (exemplars), struggle with long task sequences, or rely on task identifiers for deployment. To address these limitations, we propose CLARE, a general, parameter-efficient framework for exemplar-free continual learning with VLAs. CLARE introduces lightweight modular adapters into selected VLA modules and autonomously expands the model only where necessary when learning a new task, guided by layer-wise feature similarity. During deployment, an autoencoder-based routing mechanism dynamically activates the most relevant adapters without requiring task labels. Through extensive experiments on the LIBERO benchmark and five real-world tasks, we show that CLARE achieves high performance on new tasks without catastrophic forgetting of earlier tasks, significantly outperforming even exemplar-based methods. Code, data, and videos are available at our website: https://tum-lsy.github.io/clare/.},
}

CRISP – compliant ROS2 controllers for learning-based manipulation policies and teleoperation
D. S. J. Pro, O. Hausdörfer, R. Römer, M. Dösch, M. Schuck, and A. P. Schöllig
IEEE Robotics and Automation Practice (RA-P), 2026.

Learning-based controllers, such as diffusion policies and vision-language action models, often generate low-frequency or discontinuous robot state changes. Achieving smooth reference tracking requires a low-level controller that converts high-level target commands into joint torques, enabling compliant behavior during contact interactions. We present CRISP, a lightweight C++ implementation of compliant Cartesian and joint-space controllers for the ROS2 control standard, designed for seamless integration with high-level learning-based policies as well as teleoperation. The controllers are compatible with any manipulator that exposes a joint-torque interface. Through our Python and Gymnasium interfaces, CRISP provides a unified pipeline for recording data from hardware and simulation and deploying high-level learning-based policies seamlessly, facilitating rapid experimentation. The system has been validated on hardware with the Franka Robotics FR3 and in simulation with the Kuka IIWA14 and Kinova Gen3. Designed for rapid integration, flexible deployment, and real-time performance, our implementation provides a unified pipeline for data collection and policy execution, lowering the barrier to applying learning-based methods on ROS2-compatible manipulators.

@ARTICLE{pro-rap26,
title = {{CRISP} -- Compliant ROS2 Controllers for Learning-Based Manipulation Policies and Teleoperation},
author={Daniel San José Pro and Oliver Hausdörfer and Ralf Römer and Maximilian Dösch and Martin Schuck and Angela P. Schöllig},
journal={{IEEE Robotics and Automation Practice (RA-P)}},
year={2026},
doi={10.1109/RAP.2026.3690413},
urlcode = {https://github.com/utiasDSL/crisp_controllers},
urllink = {https://utiasdsl.github.io/crisp_controllers/},
abstract={Learning-based controllers, such as diffusion policies and vision-language action models, often generate low-frequency or discontinuous robot state changes. Achieving smooth reference tracking requires a low-level controller that converts high-level target commands into joint torques, enabling compliant behavior during contact interactions. We present CRISP, a lightweight C++ implementation of compliant Cartesian and joint-space controllers for the ROS2 control standard, designed for seamless integration with high-level learning-based policies as well as teleoperation. The controllers are compatible with any manipulator that exposes a joint-torque interface. Through our Python and Gymnasium interfaces, CRISP provides a unified pipeline for recording data from hardware and simulation and deploying high-level learning-based policies seamlessly, facilitating rapid experimentation. The system has been validated on hardware with the Franka Robotics FR3 and in simulation with the Kuka IIWA14 and Kinova Gen3. Designed for rapid integration, flexible deployment, and real-time performance, our implementation provides a unified pipeline for data collection and policy execution, lowering the barrier to applying learning-based methods on ROS2-compatible manipulators.}
}

Preventing inactive CBF safety filters caused by incorrect relative degree assumptions
L. Brunke, S. Zhou, and A. P. Schoellig
IEEE Transactions on Automatic Control, vol. 71, iss. 1, pp. 700-707, 2026.

Control barrier function (CBF) safety filters emerged as a popular framework to certify and modify potentially unsafe control inputs, for example, provided by a reinforcement learning agent or a nonexpert user. Typical CBF safety filter designs assume that the system has a uniform relative degree. This assumption is restrictive and is frequently overlooked in practice. When violated, the assumption can cause the safety filter to become inactive, allowing large and possibly unsafe control inputs to be applied to the system. In discrete-time implementations, the inactivity issue is often manifested as chattering close to the safety boundary and/or constraint violations. In this work, we provide an in-depth discussion on the safety filter inactivity issue, propose a mitigation strategy based on multiple CBFs, and derive an upper bound on the sampling time for safety under sampled-data control. The effectiveness of our proposed method is validated through both simulation and quadrotor experiments.

@ARTICLE{brunke-ral26,
title = {Preventing Inactive CBF Safety Filters Caused by Incorrect Relative Degree Assumptions},
author={Lukas Brunke and Siqi Zhou and Angela P. Schoellig},
journal={{IEEE Transactions on Automatic Control}},
year={2026},
volume={71},
number={1},
pages={700-707},
keywords={Safety;Filters;Control systems;Robots;Quadrotors;Prevention and mitigation;Vectors;Trajectory;Sampled data systems;Upper bound;Control barrier function (CBF);nonlinear control;safety;sampled-data control},
doi={10.1109/TAC.2025.3608258},
urlcode = {https://github.com/lukasbrunke/multi-cbf},
urllink = {https://arxiv.org/abs/2409.11171},
urlvideo = {https://www.youtube.com/watch?v=V6XuLyLdVqo},
abstract={Control barrier function (CBF) safety filters emerged as a popular framework to certify and modify potentially unsafe control inputs, for example, provided by a reinforcement learning agent or a nonexpert user. Typical CBF safety filter designs assume that the system has a uniform relative degree. This assumption is restrictive and is frequently overlooked in practice. When violated, the assumption can cause the safety filter to become inactive, allowing large and possibly unsafe control inputs to be applied to the system. In discrete-time implementations, the inactivity issue is often manifested as chattering close to the safety boundary and/or constraint violations. In this work, we provide an in-depth discussion on the safety filter inactivity issue, propose a mitigation strategy based on multiple CBFs, and derive an upper bound on the sampling time for safety under sampled-data control. The effectiveness of our proposed method is validated through both simulation and quadrotor experiments.}
}

Where did I leave my glasses? open-vocabulary semantic exploration in real-world semi-static environments
B. Bogenberger, O. Harrison, O. Dahanaggamaarachchi, L. Brunke, J. Qian, S. Zhou, and A. P. Schoellig
IEEE Robotics and Automation Letters (RA-L), vol. 11, iss. 3, pp. 3342-3349, 2026.

Robots deployed in real-world environments, such as homes, must not only navigate safely but also understand their surroundings and adapt to changes in the environment. To perform tasks efficiently, they must build and maintain a semantic map that accurately reflects the current state of the environment. Existing research on semantic exploration largely focuses on static scenes without persistent object-level instance tracking. In this work, we propose an open-vocabulary, semantic exploration system for semi-static environments. Our system maintains a consistent map by building a probabilistic model of object instance stationarity, systematically tracking semi-static changes, and actively exploring areas that have not been visited for an extended period. In addition to active map maintenance, our approach leverages the map’s semantic richness with large language model (LLM)-based reasoning for open-vocabulary object-goal navigation. This enables the robot to search more efficiently by prioritizing contextually relevant areas. We compare our approach against state-of-the-art baselines using publicly available object navigation and mapping datasets, and we further demonstrate real-world transferability in three real-world environments. Our approach outperforms the compared baselines in both success rate and search efficiency for object-navigation tasks and can more reliably handle changes in mapping semi-static environments. In real-world experiments, our system detects 95% of map changes on average, improving efficiency by more than 29% as compared to random and patrol strategies.

@ARTICLE{bogenberger-ral26,
title = {Where Did I Leave My Glasses? Open-Vocabulary Semantic Exploration in Real-World Semi-Static Environments},
author={Benjamin Bogenberger and Oliver Harrison and Orrin Dahanaggamaarachchi and Lukas Brunke and Jingxing Qian and Siqi Zhou and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters (RA-L)}},
year={2026},
volume={11},
number={3},
pages={3342-3349},
doi={10.1109/LRA.2026.3656790},
urllink = {https://utiasdsl.github.io/semi-static-semantic-exploration/},
urlcode = {https://github.com/utiasDSL/perceive_semantix_release},
urlvideo = {https://www.youtube.com/watch?v=nUouwHUZPWQ},
abstract={Robots deployed in real-world environments, such as homes, must not only navigate safely but also understand their surroundings and adapt to changes in the environment. To perform tasks efficiently, they must build and maintain a semantic map that accurately reflects the current state of the environment. Existing research on semantic exploration largely focuses on static scenes without persistent object-level instance tracking. In this work, we propose an open-vocabulary, semantic exploration system for semi-static environments. Our system maintains a consistent map by building a probabilistic model of object instance stationarity, systematically tracking semi-static changes, and actively exploring areas that have not been visited for an extended period. In addition to active map maintenance, our approach leverages the map's semantic richness with large language model (LLM)-based reasoning for open-vocabulary object-goal navigation. This enables the robot to search more efficiently by prioritizing contextually relevant areas. We compare our approach against state-of-the-art baselines using publicly available object navigation and mapping datasets, and we further demonstrate real-world transferability in three real-world environments. Our approach outperforms the compared baselines in both success rate and search efficiency for object-navigation tasks and can more reliably handle changes in mapping semi-static environments. In real-world experiments, our system detects 95% of map changes on average, improving efficiency by more than 29% as compared to random and patrol strategies.}
}

Self-supervised learning for object pose estimation through active real sample capture
A. Li and A. P. Schoellig
IEEE Robotics and Automation Letters (RA-L), vol. 11, iss. 2, pp. 1954-1961, 2026.

6D Object pose estimation is a fundamental component in robotics enabling efficient interaction with the environment. In industrial bin-picking tasks, this problem becomes especially challenging due to difficult object poses, complex occlusions, and inter-object ambiguities. In this work, we propose a novel self-supervised method that automatically collects, labels, and fine-tunes on real images using an eye-in-hand camera setup. We leverage the mobile camera to first obtain reliable ground-truth estimates through multi-view pose estimation, allowing us to subsequently reposition the camera to capture and label real ‘hard case’ samples from the estimated scene. This process enables closure of the sim-to-real gap through large quantities of targeted real training data, generated by comparing differences in model performance between real and synthetically reconstructed scenes and informing the mobile camera on specific poses or areas for data capture. We surpass state-of-the-art performance on a challenging bin-picking benchmark: five out of seven objects surpass a 95% correct detection rate, compared to only one out of seven for previous methods.

@ARTICLE{li-ral16,
author={Alan Li and Angela P. Schoellig},
journal={{IEEE Robotics and Automation Letters (RA-L)}},
title={Self-Supervised Learning for Object Pose Estimation Through Active Real Sample Capture},
year={2026},
volume={11},
number={2},
pages={1954-1961},
doi={10.1109/LRA.2025.3643269},
abstract={6D Object pose estimation is a fundamental component in robotics enabling efficient interaction with the environment. In industrial bin-picking tasks, this problem becomes especially challenging due to difficult object poses, complex occlusions, and inter-object ambiguities. In this work, we propose a novel self-supervised method that automatically collects, labels, and fine-tunes on real images using an eye-in-hand camera setup. We leverage the mobile camera to first obtain reliable ground-truth estimates through multi-view pose estimation, allowing us to subsequently reposition the camera to capture and label real ‘hard case’ samples from the estimated scene. This process enables closure of the sim-to-real gap through large quantities of targeted real training data, generated by comparing differences in model performance between real and synthetically reconstructed scenes and informing the mobile camera on specific poses or areas for data capture. We surpass state-of-the-art performance on a challenging bin-picking benchmark: five out of seven objects surpass a 95% correct detection rate, compared to only one out of seven for previous methods.}
}

Decentralized and fully onboard: range-aided cooperative localization and navigation on micro aerial vehicles
A. Goudar and A. P. Schoellig
IEEE Robotics and Automation Letters (RA-L), vol. 11, iss. 1, pp. 954-961, 2026.

Controlling a team of robots in a coordinated manner is challenging because centralized approaches (where all computation is performed on a central machine) scale poorly, and globally referenced external localization systems may not always be available. In this work, we consider the problem of range-aided decentralized localization and formation control. In such a setting, each robot estimates its relative pose by combining data only from onboard odometry sensors and distance measurements to other robots in the team. Additionally, each robot calculates the control inputs necessary to collaboratively navigate an environment to accomplish a specific task, for example, moving in a desired formation while monitoring an area. We present a block coordinate descent approach to localization that does not require strict coordination between the robots. We present a novel formulation for formation control as inference on factor graphs that takes into account the state estimation uncertainty and can be solved efficiently. Our approach to range-aided localization and formation-based navigation is completely decentralized, does not require specialized trajectories to maintain formation, and achieves decimeter-level positioning and formation control accuracy. We demonstrate our approach through multiple real experiments involving formation flights in diverse indoor and outdoor environments.

@ARTICLE{goudar-ral2026,
author={Abhishek Goudar and Angela P. Schoellig},
journal={{IEEE Robotics and Automation Letters (RA-L)}},
title={Decentralized and Fully Onboard: Range-Aided Cooperative Localization and Navigation on Micro Aerial Vehicles},
year={2026},
volume={11},
number={1},
pages={954-961},
doi={10.1109/LRA.2025.3630870},
abstract={Controlling a team of robots in a coordinated manner is challenging because centralized approaches (where all computation is performed on a central machine) scale poorly, and globally referenced external localization systems may not always be available. In this work, we consider the problem of range-aided decentralized localization and formation control. In such a setting, each robot estimates its relative pose by combining data only from onboard odometry sensors and distance measurements to other robots in the team. Additionally, each robot calculates the control inputs necessary to collaboratively navigate an environment to accomplish a specific task, for example, moving in a desired formation while monitoring an area. We present a block coordinate descent approach to localization that does not require strict coordination between the robots. We present a novel formulation for formation control as inference on factor graphs that takes into account the state estimation uncertainty and can be solved efficiently. Our approach to range-aided localization and formation-based navigation is completely decentralized, does not require specialized trajectories to maintain formation, and achieves decimeter-level positioning and formation control accuracy. We demonstrate our approach through multiple real experiments involving formation flights in diverse indoor and outdoor environments.},
urllink = {https://arxiv.org/abs/2602.16594}
}

Robotic nonprehensile object transportation with a hanging tray
A. Heins and A. P. Schoellig
in Proc. of the IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), 2026. Accepted.

We consider the nonprehensile object transportation task known as the waiter’s problem, in which a robot must move an object balanced on a tray from one location to another. In contrast to prior works on the robotic waiter’s problem, which make the robot tilt a tray rigidly held by its end effector (EE), we use a tray suspended from the EE by ropes, such that it behaves like a three-dimensional pendulum. Some prior works have actuated the robot so that the EE simulates the behavior of a pendulum, because pendular motion reduces the shear forces acting on the transported objects, minimizing the sliding of rigid objects and sloshing in containers of liquid. In contrast, our use of a real hanging tray allows us to obtain the benefits of pendular motion while only actuating a 3 degree-of-freedom (DOF) mobile base, rather than requiring a full 6-DOF manipulator arm. Our experiments in simulation and on real hardware show that the hanging tray substantially reduces both sliding and sloshing compared to a static, rigidly-grasped tray. Furthermore, we integrate the hanging tray into an interactive robot waiter demonstration, which uses computer vision to identify people with a raised hand and visual servoing to steer toward them and allow them to access the tray.

@inproceedings{heins-aim26,
author={Adam Heins and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM)}},
title={Robotic Nonprehensile Object Transportation with a Hanging Tray},
year={2026},
note={Accepted},
urllink = {https://arxiv.org/abs/2606.10039},
abstract = {We consider the nonprehensile object transportation task known as the waiter's problem, in which a robot must move an object balanced on a tray from one location to another. In contrast to prior works on the robotic waiter's problem, which make the robot tilt a tray rigidly held by its end effector (EE), we use a tray suspended from the EE by ropes, such that it behaves like a three-dimensional pendulum. Some prior works have actuated the robot so that the EE simulates the behavior of a pendulum, because pendular motion reduces the shear forces acting on the transported objects, minimizing the sliding of rigid objects and sloshing in containers of liquid. In contrast, our use of a real hanging tray allows us to obtain the benefits of pendular motion while only actuating a 3 degree-of-freedom (DOF) mobile base, rather than requiring a full 6-DOF manipulator arm. Our experiments in simulation and on real hardware show that the hanging tray substantially reduces both sliding and sloshing compared to a static, rigidly-grasped tray. Furthermore, we integrate the hanging tray into an interactive robot waiter demonstration, which uses computer vision to identify people with a raised hand and visual servoing to steer toward them and allow them to access the tray.},
}

Exploiting differential flatness for efficient learning-based model predictive control of constrained multi-input control affine systems
T. A. Farger*, A. W. Hall*, and A. P. Schoellig
in Proc. of the European Control Conference (ECC), 2026. Accepted.

Learning-based control techniques use data from past trajectories to control systems with uncertain dynamics. However, learning-based controllers are often computationally inefficient, limiting their practicality. To address this limitation, we propose a learning-based controller that exploits differential flatness, a property of many robotic systems. Recent research on using flatness for learning-based control either is limited in that it (i) ignores input constraints, (ii) applies only to single-input systems, or (iii) is tailored to specific platforms. In contrast, our approach uses a system extension and block-diagonal cost formulation to control general multi-input, nonlinear, affine systems. Furthermore, it satisfies input and half-space flat state constraints and guarantees probabilistic Lyapunov decrease using only two sequential convex optimizations. We show that our approach performs similarly to, but is multiple times more efficient than, a Gaussian process model predictive controller in simulation, and achieves competitive tracking in real hardware experiments.

@inproceedings{farger-ecc26,
author={Tobias A. Farger* and Adam W. Hall* and Angela P. Schoellig},
booktitle = {{Proc. of the European Control Conference (ECC)}},
title={Exploiting Differential Flatness for Efficient Learning-based Model Predictive Control of Constrained Multi-Input Control Affine Systems},
year={2026},
note={Accepted},
urllink = {https://arxiv.org/abs/2604.24706},
abstract = {Learning-based control techniques use data from past trajectories to control systems with uncertain dynamics. However, learning-based controllers are often computationally inefficient, limiting their practicality. To address this limitation, we propose a learning-based controller that exploits differential flatness, a property of many robotic systems. Recent research on using flatness for learning-based control either is limited in that it (i) ignores input constraints, (ii) applies only to single-input systems, or (iii) is tailored to specific platforms. In contrast, our approach uses a system extension and block-diagonal cost formulation to control general multi-input, nonlinear, affine systems. Furthermore, it satisfies input and half-space flat state constraints and guarantees probabilistic Lyapunov decrease using only two sequential convex optimizations. We show that our approach performs similarly to, but is multiple times more efficient than, a Gaussian process model predictive controller in simulation, and achieves competitive tracking in real hardware experiments.},
}

SM2ITH: safe mobile manipulation with interactive human prediction via task-hierarchical bilevel model predictive control
F. D’Orazio*, S. Samavi*, X. Du*, S. Zhou, G. Oriolo, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2026. Accepted.

Mobile manipulators are designed to perform complex sequences of navigation and manipulation tasks in human-centered environments. While recent optimization-based methods such as Hierarchical Task Model Predictive Control (HTMPC) enable efficient multitask execution with strict task priorities, they have so far been applied mainly to static or structured scenarios. Extending these approaches to dynamic human-centered environments requires predictive models that capture how humans react to the actions of the robot. This work introduces Safe Mobile Manipulation with Interactive Human Prediction via Task-Hierarchical Bilevel Model Predictive Control (SM^2ITH), a unified framework that combines HTMPC with interactive human motion prediction through bilevel optimization that jointly accounts for robot and human dynamics. The framework is validated on two different mobile manipulators, the Stretch 3 and the Ridgeback-UR10, across three experimental settings: (i) delivery tasks with different navigation and manipulation priorities, (ii) sequential pick-and-place tasks with different human motion prediction models, and (iii) interactions involving adversarial human behavior. Our results highlight how interactive prediction enables safe and efficient coordination, outperforming baselines that rely on weighted objectives or open-loop human models.

@inproceedings{dorazio-icra26,
author={Francesco D'Orazio* and Sepehr Samavi* and Xintong Du* and Siqi Zhou and Giuseppe Oriolo and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={{SM2ITH}: Safe Mobile Manipulation with Interactive Human Prediction via Task-Hierarchical Bilevel Model Predictive Control},
year={2026},
note={Accepted},
urllink = {https://arxiv.org/abs/2511.17798},
abstract = {Mobile manipulators are designed to perform complex sequences of navigation and manipulation tasks in human-centered environments. While recent optimization-based methods such as Hierarchical Task Model Predictive Control (HTMPC) enable efficient multitask execution with strict task priorities, they have so far been applied mainly to static or structured scenarios. Extending these approaches to dynamic human-centered environments requires predictive models that capture how humans react to the actions of the robot. This work introduces Safe Mobile Manipulation with Interactive Human Prediction via Task-Hierarchical Bilevel Model Predictive Control (SM^2ITH), a unified framework that combines HTMPC with interactive human motion prediction through bilevel optimization that jointly accounts for robot and human dynamics. The framework is validated on two different mobile manipulators, the Stretch 3 and the Ridgeback-UR10, across three experimental settings: (i) delivery tasks with different navigation and manipulation priorities, (ii) sequential pick-and-place tasks with different human motion prediction models, and (iii) interactions involving adversarial human behavior. Our results highlight how interactive prediction enables safe and efficient coordination, outperforming baselines that rely on weighted objectives or open-loop human models.},
}

On flight path reconstruction in the absence of angle of attack and sideslip measurements
G. J. Moszczynski, A. Goudar, A. P. Schoellig, P. R. Grant, and V. Myrand-Lapierre
in Proc. of the AIAA Science and Technology Forum and Exposition (AIAA SciTech), 2026. Accepted.

To address increasing demand for low-cost flight testing methodologies, this work was dedicated to establishing the feasibility and achievable performance of flight path reconstruction (FPR) methods that do not require flow angle measurements. Because the performance of such methods is highly coupled to the motion of the aircraft, a focus of this work was the establishment of the trajectory features that afford estimation performance comparable to conventional FPR methods. To elucidate such trajectory requirements, a trajectory information analysis based on the singular value of the linear time-varying observability Gramian was presented. Application of this analysis to a simulated maneuver revealed that flow angle measurement-free FPR methods may perform comparably with conventional FPR formulations when flight tests are designed with sufficient heading angle variation. These findings were further established through a comparison of the FPR trajectory estimation performance obtained using maximum a posteriori estimator implementations of both schemes. Losses in sideslip angle performance were identified as the most significant consequence of omitting the flow angle measurements, though acceptable performance was demonstrated under conditions of sufficient heading angle variation.

@inproceedings{moszczynski-aiaa26,
author={Gregory J. Moszczynski and Abhishek Goudar and Angela P. Schoellig and Peter R. Grant and Vincent Myrand-Lapierre},
booktitle = {{Proc. of the AIAA Science and Technology Forum and Exposition (AIAA SciTech)}},
title={On Flight Path Reconstruction In the Absence of Angle of Attack and Sideslip Measurements},
year={2026},
note={Accepted},
doi={10.2514/6.2026-2659},
abstract = {To address increasing demand for low-cost flight testing methodologies, this work was dedicated to establishing the feasibility and achievable performance of flight path reconstruction (FPR) methods that do not require flow angle measurements. Because the performance of such methods is highly coupled to the motion of the aircraft, a focus of this work was the establishment of the trajectory features that afford estimation performance comparable to conventional FPR methods. To elucidate such trajectory requirements, a trajectory information analysis based on the singular value of the linear time-varying observability Gramian was presented. Application of this analysis to a simulated maneuver revealed that flow angle measurement-free FPR methods may perform comparably with conventional FPR formulations when flight tests are designed with sufficient heading angle variation. These findings were further established through a comparison of the FPR trajectory estimation performance obtained using maximum a posteriori estimator implementations of both schemes. Losses in sideslip angle performance were identified as the most significant consequence of omitting the flow angle measurements, though acceptable performance was demonstrated under conditions of sufficient heading angle variation.},
}

A primer on SO(3) action representations in deep reinforcement learning
M. Schuck, S. Samy, and A. P. Schoellig
in Proc. of the International Conference on Learning Representations (ICLR), 2026. Accepted.

Many robotic control tasks require policies to act on orientations, yet the geometry of SO(3) makes this nontrivial. Because SO(3) admits no global, smooth, minimal parameterization, common representations such as Euler angles, quaternions, rotation matrices, and Lie algebra coordinates introduce distinct constraints and failure modes. While these trade-offs are well studied for supervised learning, their implications for actions in reinforcement learning remain unclear. We systematically evaluate SO(3) action representations across three standard continuous control algorithms, PPO, SAC, and TD3, under dense and sparse rewards. We compare how representations shape exploration, interact with entropy regularization, and affect training stability through empirical studies and analyze the implications of different projections for obtaining valid rotations from Euclidean network outputs. Across a suite of robotics benchmarks, we quantify the practical impact of these choices and distill simple, implementation-ready guidelines for selecting and using rotation actions. Our results highlight that representation-induced geometry strongly influences exploration and optimization and show that representing actions as tangent vectors in the local frame yields the most reliable results across algorithms.

@inproceedings{schuck-iclr26,
author={Martin Schuck and Sherif Samy and Angela P. Schoellig},
booktitle = {{Proc. of the International Conference on Learning Representations (ICLR)}},
title={A Primer on SO(3) Action Representations in Deep Reinforcement Learning},
year={2026},
note={Accepted},
urllink = {https://arxiv.org/abs/2510.11103},
abstract = {Many robotic control tasks require policies to act on orientations, yet the geometry of SO(3) makes this nontrivial. Because SO(3) admits no global, smooth, minimal parameterization, common representations such as Euler angles, quaternions, rotation matrices, and Lie algebra coordinates introduce distinct constraints and failure modes. While these trade-offs are well studied for supervised learning, their implications for actions in reinforcement learning remain unclear. We systematically evaluate SO(3) action representations across three standard continuous control algorithms, PPO, SAC, and TD3, under dense and sparse rewards. We compare how representations shape exploration, interact with entropy regularization, and affect training stability through empirical studies and analyze the implications of different projections for obtaining valid rotations from Euclidean network outputs. Across a suite of robotics benchmarks, we quantify the practical impact of these choices and distill simple, implementation-ready guidelines for selecting and using rotation actions. Our results highlight that representation-induced geometry strongly influences exploration and optimization and show that representing actions as tangent vectors in the local frame yields the most reliable results across algorithms.},
}

From demonstrations to safe deployment: path-consistent safety filtering for diffusion policies
R. Römer*, J. Balletshofer*, J. Thumm, M. Pavone, A. P. Schoellig, and M. Althoff
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2026. Accepted.

Diffusion policies (DPs) achieve state-of-the-art performance on complex manipulation tasks by learning from large-scale demonstration datasets, often spanning multiple embodiments and environments. However, they cannot guarantee safe behavior, so external safety mechanisms are needed. These, however, alter actions in ways unseen during training, causing unpredictable behavior and performance degradation. To address these problems, we propose path-consistent safety filtering (PACS) for DPs. Our approach performs path-consistent braking on a trajectory computed from the sequence of generated actions. In this way, we keep execution consistent with the policy’s training distribution, maintaining the learned, task-completing behavior. To enable a real-time deployment and handle uncertainties, we verify safety using set-based reachability analysis. Our experimental evaluation in simulation and on three challenging real-world human-robot interaction tasks shows that PACS (a) provides formal safety guarantees in dynamic environments, (b) preserves task success rates, and (c) outperforms reactive safety approaches, such as control barrier functions, by up to 68% in terms of task success. Videos are available at our project website: \url{https://tum-lsy.github.io/pacs/}.

@inproceedings{roemer-icra26,
author={Ralf Römer* and Julian Balletshofer* and Jakob Thumm and Marco Pavone and Angela P. Schoellig and Matthias Althoff},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={From Demonstrations to Safe Deployment: Path-Consistent Safety Filtering for Diffusion Policies},
year={2026},
note={Accepted},
urllink = {https://tum-lsy.github.io/pacs/},
abstract = {Diffusion policies (DPs) achieve state-of-the-art performance on complex manipulation tasks by learning from large-scale demonstration datasets, often spanning multiple embodiments and environments. However, they cannot guarantee safe behavior, so external safety mechanisms are needed. These, however, alter actions in ways unseen during training, causing unpredictable behavior and performance degradation. To address these problems, we propose path-consistent safety filtering (PACS) for DPs. Our approach performs path-consistent braking on a trajectory computed from the sequence of generated actions. In this way, we keep execution consistent with the policy's training distribution, maintaining the learned, task-completing behavior. To enable a real-time deployment and handle uncertainties, we verify safety using set-based reachability analysis. Our experimental evaluation in simulation and on three challenging real-world human-robot interaction tasks shows that PACS (a) provides formal safety guarantees in dynamic environments, (b) preserves task success rates, and (c) outperforms reactive safety approaches, such as control barrier functions, by up to 68% in terms of task success. Videos are available at our project website: \url{https://tum-lsy.github.io/pacs/}.},
}

Uncertainty quantification for flow-based vision-language-action models
R. Römer, M. Seeliger, S. Liu, B. Sturgis, M. Bagatella, D. Marta, A. Krause, and A. P. Schoellig
Technical Report, arXiv Preprint, 2026.

Vision-language-action models (VLAs) combine vision-language backbones with expressive generative action heads trained via flow matching on large-scale robotic datasets. Despite their strong empirical performance in robotic manipulation, VLAs lack mechanisms to quantify confidence in their predictions and to detect when their actions may be unreliable. This presents a critical limitation for real-world deployment in non-stationary environments, where models inevitably encounter scenarios outside their pretraining distribution and may fail without warning. To address this, we derive an efficient method for quantifying epistemic uncertainty in flow-matching models by leveraging velocity-field disagreement (VFD) across a small ensemble. We successfully use this uncertainty estimate for failure detection during deployment and active fine-tuning of flow-based VLAs. To this end, we propose SAVE, a framework for uncertainty-guided active multitask fine-tuning that reduces the number of costly expert demonstrations required to adapt VLAs to new tasks. Through extensive experiments on the LIBERO benchmark, we demonstrate that VFD yields better-calibrated uncertainty estimates predictive of downstream performance, that VFD achieves strong performance in detecting failures, and that uncertainty-guided data acquisition with SAVE requires at least 22% fewer samples than baselines. In summary, our work shows that quantifying epistemic uncertainty in flow-based VLAs improves both failure awareness and adaptation. Project website: \url{https://tum-lsy.github.io/uq_vla/}.

@TECHREPORT{roemer-arxiv26,
author = {Ralf Römer and Maximilian Seeliger and Saida Liu and Ben Sturgis and Marco Bagatella and Daniel Marta and Andreas Krause and Angela P. Schoellig},
institution = {arXiv Preprint},
title = {Uncertainty Quantification for Flow-Based Vision-Language-Action Models},
year = {2026},
urllink = {https://tum-lsy.github.io/uq_vla/},
abstract = {Vision-language-action models (VLAs) combine vision-language backbones with expressive generative action heads trained via flow matching on large-scale robotic datasets. Despite their strong empirical performance in robotic manipulation, VLAs lack mechanisms to quantify confidence in their predictions and to detect when their actions may be unreliable. This presents a critical limitation for real-world deployment in non-stationary environments, where models inevitably encounter scenarios outside their pretraining distribution and may fail without warning. To address this, we derive an efficient method for quantifying epistemic uncertainty in flow-matching models by leveraging velocity-field disagreement (VFD) across a small ensemble. We successfully use this uncertainty estimate for failure detection during deployment and active fine-tuning of flow-based VLAs. To this end, we propose SAVE, a framework for uncertainty-guided active multitask fine-tuning that reduces the number of costly expert demonstrations required to adapt VLAs to new tasks. Through extensive experiments on the LIBERO benchmark, we demonstrate that VFD yields better-calibrated uncertainty estimates predictive of downstream performance, that VFD achieves strong performance in detecting failures, and that uncertainty-guided data acquisition with SAVE requires at least 22% fewer samples than baselines. In summary, our work shows that quantifying epistemic uncertainty in flow-based VLAs improves both failure awareness and adaptation. Project website: \url{https://tum-lsy.github.io/uq_vla/}.}
}

Crazyflow: an accurate, GPU-accelerated, differentiable drone simulator in JAX
M. Schuck*, M. P. Rath*, Y. Hua, A. Goudar, S. Zhou, and A. P. Schoellig
Technical Report, arXiv Preprint, 2026.

High-quality, large-scale synthetic data from simulations is becoming a cornerstone for pushing the capabilities of robot algorithms. While aerial robotics simulators have evolved to support specialized needs such as fidelity, differentiability, and swarms independently, a unified platform that can synthesize data across all these domains is missing. In this work, we propose Crazyflow, a simulator designed to push the limits of aerial-robotics algorithm development, from model-based to data-driven methods, gradient-based to sampling-based approaches, and single-agent to multi-agent systems. Compared to existing state-of-the-art drone simulators, it achieves speeds more than an order of magnitude faster for a single drone and can simulate thousands of swarms of 4000 drones each. Real-world experiments show Crazyflow supports both analytical-gradient-based policy learning, achieving sub-centimeter trajectory tracking accuracy without domain randomization, and sampling-based obstacle avoidance at speeds exceeding half a billion steps per second. Breaking the traditional train-then-deploy paradigm, we show that its unprecedented speed even enables in-flight reinforcement learning; we demonstrate this by throwing a physical drone into the air and training a recovery policy from scratch in 0.38 seconds, successfully stabilizing the drone. Crazyflow supports multiple levels of simulation abstraction, is directly compatible with all open-source Crazyflie models, and enables rapid reconfiguration across custom drone platforms and applications by providing a light-weight system identification pipeline. By pushing accuracy, speed, and differentiability simultaneously, Crazyflow serves as an open-source resource for synthetic data generation, with emerging capabilities for large-scale parallelization for online, in-execution learning and optimization, opening the door to novel algorithm development.

@TECHREPORT{schuck-arxiv26,
author = {Martin Schuck* and Marcel P. Rath* and Yufei Hua and Abhishek Goudar and SiQi Zhou and Angela P. Schoellig},
institution = {arXiv Preprint},
title = {{Crazyflow}: An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX},
year = {2026},
urllink = {https://learnsyslab.github.io/crazyflow/},
urlcode = {https://github.com/learnsyslab/crazyflow},
abstract = {High-quality, large-scale synthetic data from simulations is becoming a cornerstone for pushing the capabilities of robot algorithms. While aerial robotics simulators have evolved to support specialized needs such as fidelity, differentiability, and swarms independently, a unified platform that can synthesize data across all these domains is missing. In this work, we propose Crazyflow, a simulator designed to push the limits of aerial-robotics algorithm development, from model-based to data-driven methods, gradient-based to sampling-based approaches, and single-agent to multi-agent systems. Compared to existing state-of-the-art drone simulators, it achieves speeds more than an order of magnitude faster for a single drone and can simulate thousands of swarms of 4000 drones each. Real-world experiments show Crazyflow supports both analytical-gradient-based policy learning, achieving sub-centimeter trajectory tracking accuracy without domain randomization, and sampling-based obstacle avoidance at speeds exceeding half a billion steps per second. Breaking the traditional train-then-deploy paradigm, we show that its unprecedented speed even enables in-flight reinforcement learning; we demonstrate this by throwing a physical drone into the air and training a recovery policy from scratch in 0.38 seconds, successfully stabilizing the drone. Crazyflow supports multiple levels of simulation abstraction, is directly compatible with all open-source Crazyflie models, and enables rapid reconfiguration across custom drone platforms and applications by providing a light-weight system identification pipeline. By pushing accuracy, speed, and differentiability simultaneously, Crazyflow serves as an open-source resource for synthetic data generation, with emerging capabilities for large-scale parallelization for online, in-execution learning and optimization, opening the door to novel algorithm development.}
}

Perceptive hierarchical-task MPC for sequential mobile manipulation in unstructured semi-static environments
X. Du*, J. Qian*, S. Zhou, and A. P. Schoellig
Technical Report, arXiv Preprint, 2026.

As compared to typical mobile manipulation tasks, sequential mobile manipulation poses a unique challenge – as the robot operates over extended periods, successful task completion is not solely dependent on consistent motion generation but also on the robot’s awareness and adaptivity to changes in the operating environment. While existing motion planners can generate whole-body trajectories to complete sequential tasks, they typically assume that the environment remains static and rely on precomputed maps. This assumption often breaks down during long-term operations, where semi-static changes such as object removal, introduction, or shifts are common. In this work, we propose a novel perceptive hierarchical-task model predictive control (HTMPC) framework for efficient sequential mobile manipulation in unstructured, changing environments. To tackle the challenge, we leverage a Bayesian inference framework to explicitly model object-level changes and thereby maintain a temporally accurate representation of the 3D environment; this up-to-date representation is embedded in a lexicographic optimization framework to enable efficient execution of sequential tasks. We validate our perceptive HTMPC approach through both simulated and real-robot experiments. In contrast to baseline methods, our approach systematically accounts for moved and phantom obstacles, successfully completing sequential tasks with higher efficiency and reactivity, without relying on prior maps or external infrastructure.

@TECHREPORT{du-arxiv26,
author={Xintong Du* and Jingxing Qian* and Siqi Zhou and Angela P. Schoellig},
institution = {arXiv Preprint},
title={Perceptive Hierarchical-Task MPC for Sequential Mobile Manipulation in Unstructured Semi-Static Environments},
year={2026},
urllink = {https://arxiv.org/abs/2603.10227},
abstract = {As compared to typical mobile manipulation tasks, sequential mobile manipulation poses a unique challenge -- as the robot operates over extended periods, successful task completion is not solely dependent on consistent motion generation but also on the robot's awareness and adaptivity to changes in the operating environment. While existing motion planners can generate whole-body trajectories to complete sequential tasks, they typically assume that the environment remains static and rely on precomputed maps. This assumption often breaks down during long-term operations, where semi-static changes such as object removal, introduction, or shifts are common. In this work, we propose a novel perceptive hierarchical-task model predictive control (HTMPC) framework for efficient sequential mobile manipulation in unstructured, changing environments. To tackle the challenge, we leverage a Bayesian inference framework to explicitly model object-level changes and thereby maintain a temporally accurate representation of the 3D environment; this up-to-date representation is embedded in a lexicographic optimization framework to enable efficient execution of sequential tasks. We validate our perceptive HTMPC approach through both simulated and real-robot experiments. In contrast to baseline methods, our approach systematically accounts for moved and phantom obstacles, successfully completing sequential tasks with higher efficiency and reactivity, without relying on prior maps or external infrastructure.},
}

SQ-CBF: signed distance functions for numerically stable superquadric-based safety filtering
H. Zhao*, L. Brunke*, O. Lagerquist, S. Zhou, and A. P. Schoellig
Technical Report, arXiv Preprint, 2026. Under review.

Ensuring safe robot operation in cluttered and dynamic environments remains a fundamental challenge. While control barrier functions provide an effective framework for real-time safety filtering, their performance critically depends on the underlying geometric representation, which is often simplified, leading to either overly conservative behavior or insufficient collision coverage. Superquadrics offer an expressive way to model complex shapes using a few primitives and are increasingly used for robot safety. To integrate this representation into collision avoidance, most existing approaches directly use their implicit functions as barrier candidates. However, we identify a critical but overlooked issue in this practice: the gradients of the implicit SQ function can become severely ill-conditioned, potentially rendering the optimization infeasible and undermining reliable real-time safety filtering. To address this issue, we formulate an SQ-based safety filtering framework that uses signed distance functions as barrier candidates. Since analytical SDFs are unavailable for general SQs, we compute distances using the efficient Gilbert-Johnson-Keerthi algorithm and obtain gradients via randomized smoothing. Extensive simulation and real-world experiments demonstrate consistent collision-free manipulation in cluttered and unstructured scenes, showing robustness to challenging geometries, sensing noise, and dynamic disturbances, while improving task efficiency in teleoperation tasks. These results highlight a pathway toward safety filters that remain precise and reliable under the geometric complexity of real-world environments.

@TECHREPORT{zhao-arxiv26,
author={Haocheng Zhao* and Lukas Brunke* and Oliver Lagerquist and Siqi Zhou and Angela P. Schoellig},
institution = {arXiv Preprint},
title={{SQ-CBF}: Signed Distance Functions for Numerically Stable Superquadric-Based Safety Filtering},
year={2026},
note={Under review},
urllink = {https://arxiv.org/abs/2602.11049},
urlvideo = {http://tiny.cc/sq-cbf},
abstract = {Ensuring safe robot operation in cluttered and dynamic environments remains a fundamental challenge. While control barrier functions provide an effective framework for real-time safety filtering, their performance critically depends on the underlying geometric representation, which is often simplified, leading to either overly conservative behavior or insufficient collision coverage. Superquadrics offer an expressive way to model complex shapes using a few primitives and are increasingly used for robot safety. To integrate this representation into collision avoidance, most existing approaches directly use their implicit functions as barrier candidates. However, we identify a critical but overlooked issue in this practice: the gradients of the implicit SQ function can become severely ill-conditioned, potentially rendering the optimization infeasible and undermining reliable real-time safety filtering. To address this issue, we formulate an SQ-based safety filtering framework that uses signed distance functions as barrier candidates. Since analytical SDFs are unavailable for general SQs, we compute distances using the efficient Gilbert-Johnson-Keerthi algorithm and obtain gradients via randomized smoothing. Extensive simulation and real-world experiments demonstrate consistent collision-free manipulation in cluttered and unstructured scenes, showing robustness to challenging geometries, sensing noise, and dynamic disturbances, while improving task efficiency in teleoperation tasks. These results highlight a pathway toward safety filters that remain precise and reliable under the geometric complexity of real-world environments.},
}

LiftNav: path planning via semantic lifting in TSDF-guided Gaussian splatting
H. Schieber*, D. Frischmann*, V. Schaack, A. P. Schoellig, and D. Roth
Abstract and Poster, in the Workshop on Semantics for Reliable Robot Autonomy (SRRA) at the IEEE International Conference on Robotics and Automation (ICRA), 2026.

Autonomous robots in unknown indoor environments require both reliable collision avoidance and object-level understanding. Classical representations such as TSDF support safe planning but lack semantics, while photorealistic methods like Gaussian Splatting (GS) provide rich appearance yet suffer from soft geometry, limiting precise obstacle avoidance. We present LiftNav, a hybrid navigation framework built on GSFusion’s TSDF+GS dual map, augmented with a real-time pipeline of YOLO-based detection, TSDF-based 3D lifting, and B-spline trajectory optimization. This design enables flexible semantic navigation without dense 3D embeddings. We further introduce a hinge-loss-based collision penalty that improves trajectory smoothness and safety. We evaluate our approach in a simulation using the Replica dataset. Compared against a state-of-the-art radiance field baseline we show a 100% feasibility rate and shorter trajectories.

@MISC{schieber-icraw26,
author = {Hannah Schieber* and Dominik Frischmann* and Victor Schaack and Angela P. Schoellig and Daniel Roth},
title = {LiftNav: Path Planning via Semantic Lifting in TSDF-Guided Gaussian Splatting},
year = {2026},
howpublished = {Abstract and Poster, in the Workshop on Semantics for Reliable Robot Autonomy (SRRA) at the IEEE International Conference on Robotics and Automation (ICRA)},
urllink = {https://arxiv.org/abs/2605.31376},
abstract = {Autonomous robots in unknown indoor environments require both reliable collision avoidance and object-level understanding. Classical representations such as TSDF support safe planning but lack semantics, while photorealistic methods like Gaussian Splatting (GS) provide rich appearance yet suffer from soft geometry, limiting precise obstacle avoidance. We present LiftNav, a hybrid navigation framework built on GSFusion's TSDF+GS dual map, augmented with a real-time pipeline of YOLO-based detection, TSDF-based 3D lifting, and B-spline trajectory optimization. This design enables flexible semantic navigation without dense 3D embeddings. We further introduce a hinge-loss-based collision penalty that improves trajectory smoothness and safety. We evaluate our approach in a simulation using the Replica dataset. Compared against a state-of-the-art radiance field baseline we show a 100% feasibility rate and shorter trajectories.}
}

2025

SwarmGPT: combining large language models with safe motion planning for drone swarm choreography
M. Schuck, D. O. Dahanaggamaarachchi, B. Sprenger, V. Vyas, S. Zhou, and A. P. Schoellig
IEEE Robotics and Automation Letters (RA-L), vol. 10, iss. 11, pp. 12237-12244, 2025.

Drone swarm performances—synchronized, expressive aerial displays set to music—have emerged as a captivating application of modern robotics. Yet designing smooth, safe choreographies remains a complex task requiring expert knowledge. We present SwarmGPT, a language-based choreographer that leverages the reasoning power of large language models (LLMs) to streamline drone performance design. The LLM is augmented by a safety filter that ensures deployability by making minimal corrections when safety or feasibility constraints are violated. By decoupling high-level choreographic design from low-level motion planning, our system enables non-experts to iteratively refine choreographies using natural language without worrying about collisions or actuator limits. We validate our approach through simulations with swarms up to 200 drones and real-world experiments with up to 20 drones performing choreographies to diverse types of songs, demonstrating scalable, synchronized, and safe performances. Beyond entertainment, this work offers a blueprint for integrating foundation models into safety-critical swarm robotics applications.

@ARTICLE{schuck-ral25,
author = {Martin Schuck and Dinushka Orrin Dahanaggamaarachchi and Ben Sprenger and Vedant Vyas and Siqi Zhou and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters (RA-L)}},
title={{SwarmGPT}: Combining Large Language Models With Safe Motion Planning for Drone Swarm Choreography},
year={2025},
volume={10},
number={11},
pages={12237-12244},
doi={10.1109/LRA.2025.3619745},
abstract={Drone swarm performances—synchronized, expressive aerial displays set to music—have emerged as a captivating application of modern robotics. Yet designing smooth, safe choreographies remains a complex task requiring expert knowledge. We present SwarmGPT, a language-based choreographer that leverages the reasoning power of large language models (LLMs) to streamline drone performance design. The LLM is augmented by a safety filter that ensures deployability by making minimal corrections when safety or feasibility constraints are violated. By decoupling high-level choreographic design from low-level motion planning, our system enables non-experts to iteratively refine choreographies using natural language without worrying about collisions or actuator limits. We validate our approach through simulations with swarms up to 200 drones and real-world experiments with up to 20 drones performing choreographies to diverse types of songs, demonstrating scalable, synchronized, and safe performances. Beyond entertainment, this work offers a blueprint for integrating foundation models into safety-critical swarm robotics applications.}
}

Sensor query schedule and sensor noise covariances for accuracy-constrained trajectory estimation
A. Goudar and A. P. Schoellig
IEEE Robotics and Automation Letters (RA-L), vol. 10, iss. 7, pp. 6983-6990, 2025.

@ARTICLE{goudar-ral25,
title = {Sensor Query Schedule and Sensor Noise Covariances for Accuracy-Constrained Trajectory Estimation},
author = {Abhishek Goudar and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters (RA-L)}},
year={2025},
volume={10},
number={7},
pages={6983--6990},
doi={10.1109/LRA.2025.3570948},
urllink = {https://arxiv.org/abs/2602.16598},
}

SICNav-Diffusion: safe and interactive crowd navigation with diffusion trajectory predictions
S. Samavi, A. Lem, F. Sato, S. Chen, Q. Gu, K. Yano, A. P. Schoellig, and F. Shkurti
IEEE Robotics and Automation Letters (RA-L), vol. 10, iss. 9, pp. 8738–8745, 2025.

To navigate crowds without collisions, robots must interact with humans by forecasting their future motion and reacting accordingly. While learning-based prediction models have shown success in generating likely human trajectory predictions, integrating these stochastic models into a robot controller presents several challenges. The controller needs to account for interactive coupling between planned robot motion and human predictions while ensuring both predictions and robot actions are safe (i.e. collision-free). To address these challenges, we present a receding horizon crowd navigation method for single-robot multi-human environments. We first propose a diffusion model to generate joint trajectory predictions for all humans in the scene. We then incorporate these multi-modal predictions into a SICNav Bilevel MPC problem that simultaneously solves for a robot plan (upper-level) and acts as a safety filter to refine the predictions for non-collision (lower-level). Combining planning and prediction refinement into one bilevel problem ensures that the robot plan and human predictions are coupled. We validate the open-loop trajectory prediction performance of our diffusion model on the commonly used ETH/UCY benchmark and evaluate the closed-loop performance of our robot navigation method in simulation and extensive real-robot experiments demonstrating safe, efficient, and reactive robot motion.

@ARTICLE{samavi-ral25,
title = {{SICNav-Diffusion}: Safe and Interactive Crowd Navigation with Diffusion Trajectory Predictions},
author = {Sepehr Samavi and Anthony Lem and Fumiaki Sato and Sirui Chen and Qiao Gu and Keijiro Yano and Angela P. Schoellig and Florian Shkurti},
journal = {{IEEE Robotics and Automation Letters (RA-L)}},
year={2025},
volume={10},
number={9},
pages={8738--8745},
doi={10.1109/LRA.2025.3585713},
urllink = {https://sepehr.fyi/projects/sicnav_diffusion},
urlcode = {https://github.com/sepsamavi/safe-interactive-crowdnav},
urlvideo = {tiny.cc/sicnav_diffusion},
abstract={To navigate crowds without collisions, robots must interact with humans by forecasting their future motion and reacting accordingly. While learning-based prediction models have shown success in generating likely human trajectory predictions, integrating these stochastic models into a robot controller presents several challenges. The controller needs to account for interactive coupling between planned robot motion and human predictions while ensuring both predictions and robot actions are safe (i.e. collision-free). To address these challenges, we present a receding horizon crowd navigation method for single-robot multi-human environments. We first propose a diffusion model to generate joint trajectory predictions for all humans in the scene. We then incorporate these multi-modal predictions into a SICNav Bilevel MPC problem that simultaneously solves for a robot plan (upper-level) and acts as a safety filter to refine the predictions for non-collision (lower-level). Combining planning and prediction refinement into one bilevel problem ensures that the robot plan and human predictions are coupled. We validate the open-loop trajectory prediction performance of our diffusion model on the commonly used ETH/UCY benchmark and evaluate the closed-loop performance of our robot navigation method in simulation and extensive real-robot experiments demonstrating safe, efficient, and reactive robot motion.}
}

Advancing reproducibility, benchmarks, and education with remote sim2real
S. Teetaert, W. Zhao, A. Loquercio, S. Zhou, L. Brunke, M. Schuck, W. Hönig, J. Panerati, and A. P. Schoellig
IEEE Robotics and Automation Magazine, vol. 32, iss. 1, pp. 117–123, 2025.

Shared and repeatable benchmark problems have historically been a fundamental driver of progress for scientific communities. In academic conferences, benchmarks, replication tracks, and competitions offer the opportunity to researchers, often with different origins, backgrounds, and levels of seniority, to quantitatively compare their ideas. In robotics, a hot and challenging topic is sim2real—porting approaches that work well in simulation to real robot hardware. Hence, this article motivates and describes an aerial sim2real hardware-software framework that we created and used in (i) a competition that ran during the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems and (ii) two graduate courses, “Development of Unmanned Aerial Systems (AER1217)” at the University of Toronto during Winter 2023, and “Autonomous Drone Racing” at Technical University of Munich. (In our case, creating a hybrid competition with both simulation and real robot components was also dictated by the uncertainties around travel and logistics in the post COVID-19 world.) This article describes the task specification, the details of the software infrastructure supporting both simulation and real-life experiments, and the sim2real performance of the solution of the top-placed teams in the competition, as well as the lessons learned by participants and organizers.

@ARTICLE{teetaert-ram25,
title = {Advancing Reproducibility, Benchmarks, and Education with Remote Sim2real},
author = {Spencer Teetaert and Wenda Zhao and Antonio Loquercio and Siqi Zhou and Lukas Brunke and Martin Schuck and Wolfgang H{\"o}nig and Jacopo Panerati and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Magazine}},
number = {1},
volume = {32},
pages = {117--123},
year = {2025},
abstract = {Shared and repeatable benchmark problems have historically been a fundamental driver of progress for scientific communities. In academic conferences, benchmarks, replication tracks, and competitions offer the opportunity to researchers, often with different origins, backgrounds, and levels of seniority, to quantitatively compare their ideas. In robotics, a hot and challenging topic is sim2real—porting approaches that work well in simulation to real robot hardware. Hence, this article motivates and describes an aerial sim2real hardware-software framework that we created and used in (i) a competition that ran during the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems and (ii) two graduate courses, "Development of Unmanned Aerial Systems (AER1217)" at the University of Toronto during Winter 2023, and "Autonomous Drone Racing" at Technical University
of Munich. (In our case, creating a hybrid competition with both simulation and real robot components was also dictated by the uncertainties around travel and logistics in the post COVID-19 world.) This article describes the task specification, the details of the software infrastructure supporting both simulation and real-life experiments, and the sim2real performance of the solution of the top-placed teams in the competition, as well as the lessons learned by participants and organizers.}
}

Targeted hard sample synthesis based on estimated pose and occlusion error for improved object pose estimation
A. Li and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 10, iss. 2, pp. 1281–1288, 2025.

6D Object pose estimation is a fundamental component in robotics enabling efficient interaction with the environment. It is particularly challenging in bin-picking applications, where objects may be textureless and in difficult poses, and occlusion between objects of the same type may cause confusion even in well-trained models. We propose a novel method of hard example synthesis that is model-agnostic, using existing simulators and the modeling of pose error in both the camera-to-object viewsphere and occlusion space. Through evaluation of the model performance with respect to the distribution of object poses and occlusions, we discover regions of high error and generate realistic training samples to specifically target these regions. With our training approach, we demonstrate an improvement in correct detection rate of up to 20% across several ROBI-dataset objects using state-of-the-art pose estimation models.

@ARTICLE{li-ral25,
title = {Targeted Hard Sample Synthesis Based on Estimated Pose and Occlusion Error for Improved Object Pose Estimation},
author = {Alan Li and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
number = {2},
volume = {10},
pages = {1281--1288},
year = {2025},
abstract = {6D Object pose estimation is a fundamental component in robotics enabling efficient interaction with the environment. It is particularly challenging in bin-picking applications, where objects may be textureless and in difficult poses, and occlusion between objects of the same type may cause confusion even in well-trained models. We propose a novel method of hard example synthesis that is model-agnostic, using existing simulators and the modeling of pose error in both the camera-to-object viewsphere and occlusion space. Through evaluation of the model performance with respect to the distribution of object poses and occlusions, we discover regions of high error and generate realistic training samples to specifically target these regions. With our training approach, we demonstrate an improvement in correct detection rate of up to 20% across several ROBI-dataset objects using state-of-the-art pose estimation models.}
}

IMU as an input vs. a measurement of the state in inertial-aided state estimation
K. Burnett, A. P. Schoellig, and T. D. Barfoot
Robotica, 2025. In press.

Treating IMU measurements as inputs to a motion model and then preintegrating these measurements has almost become a de facto standard in many robotics applications. However, this approach has a few shortcomings. First, it conflates the IMU measurement noise with the underlying process noise. Second, it is unclear how the state will be propagated in the case of IMU measurement dropout. Third, it does not lend itself well to dealing with multiple high-rate sensors such as a lidar and an IMU or multiple asynchronous IMUs. In this paper, we compare treating an IMU as an input to a motion model against treating it as a measurement of the state in a continuous-time state estimation framework. We methodically compare the performance of these two approaches on a 1D simulation and show that they perform identically, assuming that each method’s hyperparameters have been tuned on a training set. We also provide results for our continuous-time lidar-inertial odometry in simulation and on the Newer College Dataset. In simulation, our approach exceeds the performance of an IMU-as-input baseline during highly aggressive motion. On the Newer College Dataset, we demonstrate state-of-the-art results. These results show that continuous-time techniques and the treatment of the IMU as a measurement of the state are promising areas of further research.

@ARTICLE{burnett-robotica25,
title = {{IMU} as an Input vs. a Measurement of the State in Inertial-Aided State Estimation},
author = {Keenan Burnett and Angela P. Schoellig and Timothy D. Barfoot},
journal = {{Robotica}},
year = {2025},
note = {In press},
abstract = {Treating IMU measurements as inputs to a motion model and then preintegrating these measurements has almost become a de facto standard in many robotics applications. However, this approach has a few shortcomings. First, it conflates the IMU measurement noise with the underlying process noise. Second, it is unclear how the state will be propagated in the case of IMU measurement dropout. Third, it does not lend itself well to dealing with multiple high-rate sensors such as a lidar and an IMU or multiple asynchronous IMUs. In this paper, we compare treating an IMU as an input to a motion model against treating it as a measurement of the state in a continuous-time state estimation framework. We methodically compare the performance of these two approaches on a 1D simulation and show that they perform identically, assuming that each method's hyperparameters have been tuned on a training set. We also provide results for our continuous-time lidar-inertial odometry in simulation and on the Newer College Dataset. In simulation, our approach exceeds the performance of an IMU-as-input baseline during highly aggressive motion. On the Newer College Dataset, we demonstrate state-of-the-art results. These results show that continuous-time techniques and the treatment of the IMU as a measurement of the state are promising areas of further research.}
}

Semantically safe robot manipulation: from semantic scene understanding to motion safeguards
L. Brunke, Y. Zhang, R. Römer, J. Naimer, N. Staykov, S. Zhou, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 10, iss. 5, pp. 4810–4817, 2025.

Ensuring safe interactions in human-centric environments requires robots to understand and adhere to constraints recognized by humans as “common sense” (e.g., “moving a cup of water above a laptop is unsafe as the water may spill” or “rotating a cup of water is unsafe as it can lead to pouring its content”). Recent advances in computer vision and machine learning have enabled robots to acquire a semantic understanding of and reason about their operating environments. While extensive literature on safe robot decision-making exists, semantic understanding is rarely integrated into these formulations. In this work, we propose a semantic safety filter framework to certify robot inputs with respect to semantically defined constraints (e.g., unsafe spatial relationships, behaviors, and poses) and geometrically defined constraints (e.g., environment-collision and self-collision constraints). In our proposed approach, given perception inputs, we build a semantic map of the 3D environment and leverage the contextual reasoning capabilities of large language models to infer semantically unsafe conditions. These semantically unsafe conditions are then mapped to safe actions through a control barrier certification formulation. We demonstrate the proposed semantic safety filter in teleoperated manipulation tasks and with learned diffusion policies applied in a real-world kitchen environment that further showcases its effectiveness in addressing practical semantic safety constraints. Together, these experiments highlight our approach’s capability to integrate semantics into safety certification, enabling safe robot operation beyond traditional collision avoidance.

@ARTICLE{brunke-ral25,
title = {Semantically Safe Robot Manipulation: From Semantic Scene Understanding to Motion Safeguards},
author = {Lukas Brunke and Yanni Zhang and Ralf R{\"o}mer and Jack Naimer and Nikola Staykov and Siqi Zhou and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
number = {5},
volume = {10},
pages = {4810--4817},
year = {2025},
abstract = {Ensuring safe interactions in human-centric environments requires robots to understand and adhere to constraints recognized by humans as "common sense" (e.g., "moving a cup of water above a laptop is unsafe as the water may spill" or "rotating a cup of water is unsafe as it can lead to pouring its content"). Recent advances in computer vision and machine learning have enabled robots to acquire a semantic understanding of and reason about their operating environments. While extensive literature on safe robot decision-making exists, semantic understanding is rarely integrated into these formulations. In this work, we propose a semantic safety filter framework to certify robot inputs with respect to semantically defined constraints (e.g., unsafe spatial relationships, behaviors, and poses) and geometrically defined constraints (e.g., environment-collision and self-collision constraints). In our proposed approach, given perception inputs, we build a semantic map of the 3D environment and leverage the contextual reasoning capabilities of large language models to infer semantically unsafe conditions. These semantically unsafe conditions are then mapped to safe actions through a control barrier certification formulation. We demonstrate the proposed semantic safety filter in teleoperated manipulation tasks and with learned diffusion policies applied in a real-world kitchen environment that further showcases its effectiveness in addressing practical semantic safety constraints. Together, these experiments highlight our approach's capability to integrate semantics into safety certification, enabling safe robot operation beyond traditional collision avoidance.}
}

Safe multi-agent reinforcement learning for behavior-based cooperative navigation
M. Dawood, S. Pan, N. Dengler, S. Zhou, A. P. Schoellig, and M. Bennewitz
IEEE Robotics and Automation Letters, 2025. Accepted.

In this paper, we address the problem of behavior-based cooperative navigation of mobile robots using safe multi-agent reinforcement learning (MARL). Our work is the first to focus on cooperative navigation without individual reference targets for the robots, using a single target for the formation’s centroid. This eliminates the complexities involved in having several path planners to control a team of robots. To ensure safety, our MARL framework uses model predictive control (MPC) to prevent actions that could lead to collisions during training and execution. We demonstrate the effectiveness of our method in simulation and on real robots, achieving safe behavior-based cooperative navigation without using individual reference targets, with zero collisions, and faster target reaching compared to baselines. Finally, we study the impact of MPC safety filters on the learning process, revealing that we achieve faster convergence during training and we show that our approach can be safely deployed on real robots, even during early stages of the training.

@ARTICLE{dawood-ral25,
title = {Safe Multi-Agent Reinforcement Learning for Behavior-Based Cooperative Navigation},
author = {Murad Dawood and Sicong Pan and Nils Dengler and Siqi Zhou and Angela P. Schoellig and Maren Bennewitz},
journal = {{IEEE Robotics and Automation Letters}},
year = {2025},
note = {Accepted},
abstract = {In this paper, we address the problem of behavior-based cooperative navigation of mobile robots using safe multi-agent reinforcement learning (MARL). Our work is the first to focus on cooperative navigation without individual reference targets for the robots, using a single target for the formation's centroid. This eliminates the complexities involved in having several path planners to control a team of robots. To ensure safety, our MARL framework uses model predictive control (MPC) to prevent actions that could lead to collisions during training and execution. We demonstrate the effectiveness of our method in simulation and on real robots, achieving safe behavior-based cooperative navigation without using individual reference targets, with zero collisions, and faster target reaching compared to baselines. Finally, we study the impact of MPC safety filters on the learning process, revealing that we achieve faster convergence during training and we show that our approach can be safely deployed on real robots, even during early stages of the training.}
}

Robust nonprehensile object transportation with uncertain inertial parameters
A. Heins and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 10, iss. 5, pp. 4492–4499, 2025.

We consider the nonprehensile object transportation task known as the waiter’s problem (in which a robot must move an object on a tray from one location to another) when the transported object has uncertain inertial parameters. In contrast to existing approaches that completely ignore uncertainty in the inertia matrix or which only consider small parameter errors, we are interested in pushing the limits of the amount of inertial parameter uncertainty that can be handled. We first show how constraints that are robust to inertial parameter uncertainty can be incorporated into an optimization-based motion planning framework to transport objects while moving quickly. Next, we develop necessary conditions for the inertial parameters to be realizable on a bounding shape based on moment relaxations, allowing us to verify whether a trajectory will violate the constraints for any realizable inertial parameters. Finally, we demonstrate our approach on a mobile manipulator in simulations and real hardware experiments: our proposed robust constraints consistently successfully transport a 56 cm tall object with substantial inertial parameter uncertainty in the real world, while the baseline approaches drop the object while transporting it.

@ARTICLE{heins-ral25,
title = {Robust Nonprehensile Object Transportation with Uncertain Inertial Parameters},
author = {Adam Heins and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
number = {5},
volume = {10},
pages = {4492--4499},
doi = {10.1109/LRA.2025.3551067},
year = {2025},
abstract = {We consider the nonprehensile object transportation task known as the waiter's problem---in which a robot must move an object on a tray from one location to another---when the transported object has uncertain inertial parameters. In contrast to existing approaches that completely ignore uncertainty in the inertia matrix or which only consider small parameter errors, we are interested in pushing the limits of the amount of inertial parameter uncertainty that can be handled. We first show how constraints that are robust to inertial parameter uncertainty can be incorporated into an optimization-based motion planning framework to transport objects while moving quickly. Next, we develop necessary conditions for the inertial parameters to be realizable on a bounding shape based on moment relaxations, allowing us to verify whether a trajectory will violate the constraints for any realizable inertial parameters. Finally, we demonstrate our approach on a mobile manipulator in simulations and real hardware experiments: our proposed robust constraints consistently successfully transport a 56 cm tall object with substantial inertial parameter uncertainty in the real world, while the baseline approaches drop the object while transporting it.}
}

Safety filtering while training: improving the performance and sample efficiency of reinforcement learning agents
F. Pizarro Bejarano, L. Brunke, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 10, iss. 1, pp. 788–795, 2025.

Reinforcement learning (RL) controllers are flexible and performant but rarely guarantee safety. Safety filters impart hard safety guarantees to RL controllers while maintaining flexibility. However, safety filters can cause undesired behaviours due to the separation between the controller and the safety filter, often degrading performance and robustness. In this letter, we analyze several modifications to incorporating the safety filter in training RL controllers rather than solely applying it during evaluation. The modifications allow the RL controller to learn to account for the safety filter. This letter presents a comprehensive analysis of training RL with safety filters, featuring simulated and real-world experiments with a Crazyflie 2.0 drone. We examine how various training modifications and hyperparameters impact performance, sample efficiency, safety, and chattering. Our findings serve as a guide for practitioners and researchers focused on safety filters and safe RL.

@ARTICLE{pizarro-ral25,
title = {Safety Filtering While Training: Improving the Performance and Sample Efficiency of Reinforcement Learning Agents},
author = {Federico {Pizarro Bejarano} and Lukas Brunke and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
volume = {10},
number = {1},
pages = {788--795},
doi = {10.1109/LRA.2024.3512374},
year = {2025},
abstract = {Reinforcement learning (RL) controllers are flexible and performant but rarely guarantee safety. Safety filters impart hard safety guarantees to RL controllers while maintaining flexibility. However, safety filters can cause undesired behaviours due to the separation between the controller and the safety filter, often degrading performance and robustness. In this letter, we analyze several modifications to incorporating the safety filter in training RL controllers rather than solely applying it during evaluation. The modifications allow the RL controller to learn to account for the safety filter. This letter presents a comprehensive analysis of training RL with safety filters, featuring simulated and real-world experiments with a Crazyflie 2.0 drone. We examine how various training modifications and hyperparameters impact performance, sample efficiency, safety, and chattering. Our findings serve as a guide for practitioners and researchers focused on safety filters and safe RL.}
}

Failure prediction at runtime for generative robot policies
R. Römer*, A. Kobras*, L. Worbis, and A. P. Schoellig
in Proc. of the Conference on Neural Information Processing Systems (NeurIPS), 2025. Accepted.

We propose FIPER, a general framework for Failure Prediction at Runtime for generative imitation learning policies without requiring failure data. FIPER combines (i) random network distillation (RND) for out-of-distribution (OOD) detection in the policy’s observation embedding space and (ii) a novel action chunk entropy (ACE) score for quantifying uncertainty in the conditional action distribution. Both failure prediction scores are calibrated on a small set of successful rollouts using conformal prediction and aggregated over short time windows. A failure alarm is triggered when both indicators exceed their respective thresholds. We evaluate FIPER across five simulation and real-world environments involving diverse failure modes. Results show that FIPER better distinguishes actual failures from benign OOD situations and predicts failures more accurately and earlier than existing approaches. We thus consider FIPER an important step towards more interpretable and safer generative robot policies.

@inproceedings{roemer-neurips25,
author={Ralf R{\"o}mer* and Adrian Kobras* and Luca Worbis and Angela P. Schoellig},
booktitle = {{Proc. of the Conference on Neural Information Processing Systems (NeurIPS)}},
title={Failure Prediction at Runtime for Generative Robot Policies},
year={2025},
note={Accepted},
urllink = {https://tum-lsy.github.io/fiper_website/},
urlcode={https://github.com/utiasDSL/fiper},
urlvideo={https://www.youtube.com/watch?v=BTNQlxf57jU},
abstract = {We propose FIPER, a general framework for Failure Prediction at Runtime for generative imitation learning policies without requiring failure data. FIPER combines (i) random network distillation (RND) for out-of-distribution (OOD) detection in the policy’s observation embedding space and (ii) a novel action chunk entropy (ACE) score for quantifying uncertainty in the conditional action distribution. Both failure prediction scores are calibrated on a small set of successful rollouts using conformal prediction and aggregated over short time windows. A failure alarm is triggered when both indicators exceed their respective thresholds. We evaluate FIPER across five simulation and real-world environments involving diverse failure modes. Results show that FIPER better distinguishes actual failures from benign OOD situations and predicts failures more accurately and earlier than existing approaches. We thus consider FIPER an important step towards more interpretable and safer generative robot policies.},
}

Fine-tuning of neural network approximate MPC without retraining via Bayesian optimization
H. Hose, P. Brunzema, A. von Rohr, A. Gräfe, A. P. Schoellig, and S. Trimpe
in Proc. of the International Conference on Robot Intelligence Technology and Applications (RITA), 2025. Accepted.

Approximate model-predictive control (AMPC) aims to imitate an MPC’s behavior with a neural network, removing the need to solve an expensive optimization problem at runtime. However, during deployment, the parameters of the underlying MPC must usually be fine-tuned. This often renders AMPC impractical as it requires repeatedly generating a new dataset and retraining the neural network. Recent work addresses this problem by adapting AMPC without retraining using approximated sensitivities of the MPC’s optimization problem. Currently, this adaption must be done by hand, which is labor-intensive and can be unintuitive for high-dimensional systems. To solve this issue, we propose using Bayesian optimization to tune the parameters of AMPC policies based on experimental data. By combining model-based control with direct and local learning, our approach achieves superior performance to nominal AMPC on hardware, with minimal experimentation. This allows automatic and data-efficient adaptation of AMPC to new system instances and fine-tuning to cost functions that are difficult to directly implement in MPC. We demonstrate the proposed method in hardware experiments for the swing-up maneuver on an inverted cartpole and yaw control of an under-actuated balancing unicycle robot, a challenging control problem.

@inproceedings{hose-rita25,
author={Henrik Hose and Paul Brunzema and Alexander von Rohr and Alexander Gr{\"a}fe and Angela P. Schoellig and Sebastian Trimpe},
booktitle = {{Proc. of the International Conference on Robot Intelligence Technology and Applications (RITA)}},
title={Fine-Tuning of Neural Network Approximate {MPC} without Retraining via {Bayesian} Optimization},
year={2025},
note={Accepted},
urllink = {https://arxiv.org/abs/2512.14350},
urlcode={https://github.com/hshose/BO-parameter-adaptive-AMPC},
urlvideo={https://youtu.be/EhMNIMqVKZk},
abstract = {Approximate model-predictive control (AMPC) aims to imitate an MPC's behavior with a neural network, removing the need to solve an expensive optimization problem at runtime. However, during deployment, the parameters of the underlying MPC must usually be fine-tuned. This often renders AMPC impractical as it requires repeatedly generating a new dataset and retraining the neural network. Recent work addresses this problem by adapting AMPC without retraining using approximated sensitivities of the MPC's optimization problem. Currently, this adaption must be done by hand, which is labor-intensive and can be unintuitive for high-dimensional systems. To solve this issue, we propose using Bayesian optimization to tune the parameters of AMPC policies based on experimental data. By combining model-based control with direct and local learning, our approach achieves superior performance to nominal AMPC on hardware, with minimal experimentation. This allows automatic and data-efficient adaptation of AMPC to new system instances and fine-tuning to cost functions that are difficult to directly implement in MPC. We demonstrate the proposed method in hardware experiments for the swing-up maneuver on an inverted cartpole and yaw control of an under-actuated balancing unicycle robot, a challenging control problem. },
}

Improving drone racing performance through iterative learning mpc
H. Zhao, N. Schlüter, L. Brunke, and A. P. Schoellig
in 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025, pp. 6667-6674.

Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control (LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations: (1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence, (2) a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and (3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.

@inproceedings{zhao-iros25,
author={Haocheng Zhao and Niklas Schl{\"u}ter and Lukas Brunke and Angela P. Schoellig},
booktitle={{2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
title={Improving Drone Racing Performance Through Iterative Learning MPC},
year={2025},
pages={6667-6674},
urllink={https://arxiv.org/abs/2508.01103},
urlvideo={https://www.youtube.com/watch?v=ZIVtAFOpe-Y},
abstract={Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control (LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations: (1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence, (2) a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and (3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85\%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05\% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.},
doi={10.1109/IROS60139.2025.11246700}
}

Reinforcement learning with lie group orientations for robotics
M. Schuck, J. Brüdigam, S. Hirche, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2025, pp. 14369-14376. Accepted.

Handling orientations of robots and objects is a crucial aspect of many applications. Yet, ever so often, there is a lack of mathematical correctness when dealing with orientations, especially in learning pipelines involving, for example, artificial neural networks. In this paper, we investigate reinforcement learning with orientations and propose a simple modification of the network’s input and output that adheres to the Lie group structure of orientations. As a result, we obtain an easy and efficient implementation that is directly usable with existing learning libraries and achieves significantly better performance than other common orientation representations. We briefly introduce Lie theory specifically for orientations in robotics to motivate and outline our approach. Subsequently, a thorough empirical evaluation of different combinations of orientation representations for states and actions demonstrates the superior performance of our proposed approach in different scenarios, including: direct orientation control, end effector orientation control, and pick-and-place tasks.

@inproceedings{schuck-icra25,
author={Martin Schuck and Jan Br{\"u}digam and Sandra Hirche and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={Reinforcement Learning with Lie Group Orientations for Robotics},
year={2025},
note={Accepted},
doi={10.1109/ICRA55743.2025.11128743},
pages={14369-14376},
abstract = {Handling orientations of robots and objects is a crucial aspect of many applications. Yet, ever so often, there is a lack of mathematical correctness when dealing with orientations, especially in learning pipelines involving, for example, artificial neural networks. In this paper, we investigate reinforcement learning with orientations and propose a simple modification of the network's input and output that adheres to the Lie group structure of orientations. As a result, we obtain an easy and efficient implementation that is directly usable with existing learning libraries and achieves significantly better performance than other common orientation representations. We briefly introduce Lie theory specifically for orientations in robotics to motivate and outline our approach. Subsequently, a thorough empirical evaluation of different combinations of orientation representations for states and actions demonstrates the superior performance of our proposed approach in different scenarios, including: direct orientation control, end effector orientation control, and pick-and-place tasks.}
}

Flying through moving gates without full state estimation
R. Römer, T. Emmert, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2025. Accepted.

Autonomous drone racing requires powerful perception, planning, and control and has become a benchmark and test field for autonomous, agile flight. Existing work usually assumes static race tracks with known maps, which enables offline planning of time-optimal trajectories, performing localization to the gates to reduce the drift in visual-inertial odometry (VIO) for state estimation or training learning-based methods for the particular race track and operating environment. In contrast, many real-world tasks like disaster response or delivery need to be performed in unknown and dynamic environments. To make drone racing more robust against unseen environments and moving gates, we propose a control algorithm that operates without a race track map or VIO, relying solely on monocular measurements of the line of sight to the gates. For this purpose, we adopt the law of proportional navigation (PN) to accurately fly through the gates despite gate motions or wind. We formulate the PN-informed vision-based control problem for drone racing as a constrained optimization problem and derive a closed-form optimal solution. Through simulations and real-world experiments, we demonstrate that our algorithm can navigate through moving gates at high speeds while being robust to different gate movements, model errors, wind, and delays.

@inproceedings{romer-icra25,
author={Ralf R{\"o}mer and Tim Emmert and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={Flying through Moving Gates without Full State Estimation},
year={2025},
note={Accepted},
abstract = {Autonomous drone racing requires powerful perception, planning, and control and has become a benchmark and test field for autonomous, agile flight. Existing work usually assumes static race tracks with known maps, which enables offline planning of time-optimal trajectories, performing localization to the gates to reduce the drift in visual-inertial odometry (VIO) for state estimation or training learning-based methods for the particular race track and operating environment. In contrast, many real-world tasks like disaster response or delivery need to be performed in unknown and dynamic environments. To make drone racing more robust against unseen environments and moving gates, we propose a control algorithm that operates without a race track map or VIO, relying solely on monocular measurements of the line of sight to the gates. For this purpose, we adopt the law of proportional navigation (PN) to accurately fly through the gates despite gate motions or wind. We formulate the PN-informed vision-based control problem for drone racing as a constrained optimization problem and derive a closed-form optimal solution. Through simulations and real-world experiments, we demonstrate that our algorithm can navigate through moving gates at high speeds while being robust to different gate movements, model errors, wind, and delays.}
}

ProDapt: proprioceptive adaptation using long-term memory diffusion
F. P. Bejarano, B. Jones, D. P. Moreno, J. Bowkett, P. G. Backes, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2025. Accepted.

Diffusion models have revolutionized imitation learning, allowing robots to replicate complex behaviours. However, diffusion often relies on cameras and other exteroceptive sensors to observe the environment and lacks long-term memory. In space, military, and underwater applications, robots must be highly robust to failures in exteroceptive sensors, operating using only proprioceptive information. In this paper, we propose ProDapt, a method of incorporating long-term memory of previous contacts between the robot and the environment in the diffusion process, allowing it to complete tasks using only proprioceptive data. This is achieved by identifying “keypoints”, essential past observations maintained as inputs to the policy. We test our approach using a UR10e robotic arm in both simulation and real experiments and demonstrate the necessity of this long-term memory for task completion.

@inproceedings{pizarro-icra25,
author={Federico Pizarro Bejarano and Bryson Jones and Daniel Pastor Moreno and Joseph Bowkett and Paul G. Backes and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={{ProDapt}: Proprioceptive Adaptation using Long-term Memory Diffusion},
year={2025},
note={Accepted},
abstract = {Diffusion models have revolutionized imitation learning, allowing robots to replicate complex behaviours. However, diffusion often relies on cameras and other exteroceptive sensors to observe the environment and lacks long-term memory. In space, military, and underwater applications, robots must be highly robust to failures in exteroceptive sensors, operating using only proprioceptive information. In this paper, we propose ProDapt, a method of incorporating long-term memory of previous contacts between the robot and the environment in the diffusion process, allowing it to complete tasks using only proprioceptive data. This is achieved by identifying "keypoints", essential past observations maintained as inputs to the policy. We test our approach using a UR10e robotic arm in both simulation and real experiments and demonstrate the necessity of this long-term memory for task completion.}
}

Diffusion predictive control with constraints
R. Römer, A. von Rohr, and A. P. Schoellig
in Proc. of the Conference on Learning for Dynamics and Control (L4DC), 2025. Accepted.

Diffusion models have recently gained popularity for policy learning in robotics due to their ability to capture high-dimensional and multimodal distributions. However, diffusion policies are inherently stochastic and typically trained offline, limiting their ability to handle unseen and dynamic conditions where novel constraints not represented in the training data must be satisfied. To overcome this limitation, we propose diffusion predictive control with constraints (DPCC), an algorithm for diffusion-based control with explicit state and action constraints that can deviate from those in the training data. DPCC uses constraint tightening and incorporates model-based projections into the denoising process of a trained trajectory diffusion model. This allows us to generate constraint-satisfying, dynamically feasible, and goal-reaching trajectories for predictive control. We show through simulations of a robot manipulator that DPCC outperforms existing methods in satisfying novel test-time constraints while maintaining performance on the learned control task.

@inproceedings{romer-l4dc25,
author={Ralf R{\"o}mer and Alexander von Rohr and Angela P. Schoellig},
booktitle = {{Proc. of the Conference on Learning for Dynamics and Control (L4DC)}},
title={Diffusion Predictive Control with Constraints},
year={2025},
note={Accepted},
abstract = {Diffusion models have recently gained popularity for policy learning in robotics due to their ability to capture high-dimensional and multimodal distributions. However, diffusion policies are inherently stochastic and typically trained offline, limiting their ability to handle unseen and dynamic conditions where novel constraints not represented in the training data must be satisfied. To overcome this limitation, we propose diffusion predictive control with constraints (DPCC), an algorithm for diffusion-based control with explicit state and action constraints that can deviate from those in the training data. DPCC uses constraint tightening and incorporates model-based projections into the denoising process of a trained trajectory diffusion model. This allows us to generate constraint-satisfying, dynamically feasible, and goal-reaching trajectories for predictive control. We show through simulations of a robot manipulator that DPCC outperforms existing methods in satisfying novel test-time constraints while maintaining performance on the learned control task.}
}

CoRe-GS: coarse-to-refined gaussian splatting with semantic object focus
H. Schieber, D. Frischmann, V. Schaack, S. Boche, A. Schoellig, S. Leutenegger, and D. Roth
Technical Report, arXiv Preprint, 2025. Under review.

Mobile reconstruction has the potential to support time-critical tasks such as tele-guidance and disaster response, where operators must quickly gain an accurate understanding of the environment. Full high-fidelity scene reconstruction is computationally expensive and often unnecessary when only specific points of interest (POIs) matter for timely decision making. We address this challenge with CoRe-GS, a semantic POI-focused extension of Gaussian Splatting (GS). Instead of optimizing every scene element uniformly, CoRe-GS first produces a fast segmentation-ready GS representation and then selectively refines splats belonging to semantically relevant POIs detected during data acquisition. This targeted refinement reduces training time to 25% compared to full semantic GS while improving novel view synthesis quality in the areas that matter most. We validate CoRe-GS on both real-world (SCRREAM) and synthetic (NeRDS 360) datasets, demonstrating that prioritizing POIs enables faster and higher-quality mobile reconstruction tailored to operational needs.

@TECHREPORT{schieber-arxiv25,
author={Hannah Schieber and Dominik Frischmann and Victor Schaack and Simon Boche and Angela Schoellig and Stefan Leutenegger and Daniel Roth},
institution = {arXiv Preprint},
title={{CoRe-GS}: Coarse-to-Refined Gaussian Splatting with Semantic Object Focus},
year={2025},
note={Under review},
urllink = {https://hannahhaensen.github.io/core-gs/},
abstract = {Mobile reconstruction has the potential to support time-critical tasks such as tele-guidance and disaster response, where operators must quickly gain an accurate understanding of the environment. Full high-fidelity scene reconstruction is computationally expensive and often unnecessary when only specific points of interest (POIs) matter for timely decision making. We address this challenge with CoRe-GS, a semantic POI-focused extension of Gaussian Splatting (GS). Instead of optimizing every scene element uniformly, CoRe-GS first produces a fast segmentation-ready GS representation and then selectively refines splats belonging to semantically relevant POIs detected during data acquisition. This targeted refinement reduces training time to 25\% compared to full semantic GS while improving novel view synthesis quality in the areas that matter most. We validate CoRe-GS on both real-world (SCRREAM) and synthetic (NeRDS 360) datasets, demonstrating that prioritizing POIs enables faster and higher-quality mobile reconstruction tailored to operational needs.},
}

ARMimic: learning robotic manipulation from passive human demonstrations in augmented reality
R. Walia, Y. Wang, R. Römer, M. Nishio, A. P. Schoellig, and J. Ota
Technical Report, arXiv Preprint, 2025. Under review.

Imitation learning is a powerful paradigm for robot skill acquisition, yet conventional demonstration methods–such as kinesthetic teaching and teleoperation–are cumbersome, hardware-heavy, and disruptive to workflows. Recently, passive observation using extended reality (XR) headsets has shown promise for egocentric demonstration collection, yet current approaches require additional hardware, complex calibration, or constrained recording conditions that limit scalability and usability. We present ARMimic, a novel framework that overcomes these limitations with a lightweight and hardware-minimal setup for scalable, robot-free data collection using only a consumer XR headset and a stationary workplace camera. ARMimic integrates egocentric hand tracking, augmented reality (AR) robot overlays, and real-time depth sensing to ensure collision-aware, kinematically feasible demonstrations. A unified imitation learning pipeline is at the core of our method, treating both human and virtual robot trajectories as interchangeable, which enables policies that generalize across different embodiments and environments. We validate ARMimic on two manipulation tasks, including challenging long-horizon bowl stacking. In our experiments, ARMimic reduces demonstration time by 50% compared to teleoperation and improves task success by 11% over ACT, a state-of-the-art baseline trained on teleoperated data. Our results demonstrate that ARMimic enables safe, seamless, and in-the-wild data collection, offering great potential for scalable robot learning in diverse real-world settings.

@TECHREPORT{walia-arxiv25,
author={Rohan Walia and Yusheng Wang and Ralf R{\"o}mer and Masahiro Nishio and Angela P. Schoellig and Jun Ota},
institution = {arXiv Preprint},
title={{ARMimic}: Learning Robotic Manipulation from Passive Human Demonstrations in Augmented Reality},
year={2025},
note={Under review},
urllink = {https://arxiv.org/abs/2509.22914},
abstract = {Imitation learning is a powerful paradigm for robot skill acquisition, yet conventional demonstration methods--such as kinesthetic teaching and teleoperation--are cumbersome, hardware-heavy, and disruptive to workflows. Recently, passive observation using extended reality (XR) headsets has shown promise for egocentric demonstration collection, yet current approaches require additional hardware, complex calibration, or constrained recording conditions that limit scalability and usability. We present ARMimic, a novel framework that overcomes these limitations with a lightweight and hardware-minimal setup for scalable, robot-free data collection using only a consumer XR headset and a stationary workplace camera. ARMimic integrates egocentric hand tracking, augmented reality (AR) robot overlays, and real-time depth sensing to ensure collision-aware, kinematically feasible demonstrations. A unified imitation learning pipeline is at the core of our method, treating both human and virtual robot trajectories as interchangeable, which enables policies that generalize across different embodiments and environments. We validate ARMimic on two manipulation tasks, including challenging long-horizon bowl stacking. In our experiments, ARMimic reduces demonstration time by 50\% compared to teleoperation and improves task success by 11\% over ACT, a state-of-the-art baseline trained on teleoperated data. Our results demonstrate that ARMimic enables safe, seamless, and in-the-wild data collection, offering great potential for scalable robot learning in diverse real-world settings.},
}

Deploying sicnav in the field: safe and interactive crowd navigation using mpc and bilevel optimization
S. Samavi, G. Bhutani, F. Shkurti, and A. P. Schoellig
Abstract and Poster, in the Workshop on Field Robotics at the IEEE International Conference on Robotics and Automation (ICRA), 2025.

Safe and efficient navigation in crowded environments remains a critical challenge for robots that provide a variety of service tasks such as food delivery or autonomous wheelchair mobility. Classical robot crowd navigation methods decouple human motion prediction from robot motion planning, which neglects the closed-loop interactions between humans and robots. This lack of a model for human reactions to the robot plan (e.g. moving out of the way) can cause the robot to get stuck. Our proposed Safe and Interactive Crowd Navigation (SICNav) method is a bilevel Model Predictive Control (MPC) framework that combines prediction and planning into one optimization problem, explicitly modeling interactions among agents. In this paper, we present a systems overview of the crowd navigation platform we use to deploy SICNav in previously unseen indoor and outdoor environments. We provide a preliminary analysis of the system’s operation over the course of nearly 7 km of autonomous navigation over two hours in both indoor and outdoor environments.

@MISC{samavi-icra25,
author = {Sepehr Samavi and Garvish Bhutani and Florian Shkurti and Angela P. Schoellig},
title = {Deploying SICNav in the Field: Safe and Interactive Crowd Navigation using MPC and Bilevel Optimization},
year = {2025},
howpublished = {Abstract and Poster, in the Workshop on Field Robotics at the IEEE International Conference on Robotics and Automation (ICRA)},
urllink = {https://arxiv.org/abs/2506.08851},
abstract = {Safe and efficient navigation in crowded environments remains a critical challenge for robots that provide a variety of service tasks such as food delivery or autonomous wheelchair mobility. Classical robot crowd navigation methods decouple human motion prediction from robot motion planning, which neglects the closed-loop interactions between humans and robots. This lack of a model for human reactions to the robot plan (e.g. moving out of the way) can cause the robot to get stuck. Our proposed Safe and Interactive Crowd Navigation (SICNav) method is a bilevel Model Predictive Control (MPC) framework that combines prediction and planning into one optimization problem, explicitly modeling interactions among agents. In this paper, we present a systems overview of the crowd navigation platform we use to deploy SICNav in previously unseen indoor and outdoor environments. We provide a preliminary analysis of the system's operation over the course of nearly 7 km of autonomous navigation over two hours in both indoor and outdoor environments.}
}

Scipy.spatial.transform: differentiable framework-agnostic 3d transformations in python
M. Schuck, A. von Rohr, and A. P. Schoellig
Abstract and Poster, in the Workshop on Differentiable Systems and Scientific Machine Learning at EurIPS, 2025.

Three-dimensional rigid-body transforms, i.e. rotations and translations, are central to modern differentiable machine learning pipelines in robotics, vision, and simulation. However, numerically robust and mathematically correct implementations, particularly on SO(3), are error-prone due to issues such as axis conventions, normalizations, composition consistency and subtle errors that only appear in edge cases. SciPy’s spatial.transform module is a rigorously tested Python implementation. However, it historically only supported NumPy, limiting adoption in GPU-accelerated and autodiff-based workflows. We present a complete overhaul of SciPy’s spatial.transform functionality that makes it compatible with any array library implementing the Python array API, including JAX, PyTorch, and CuPy. The revised implementation preserves the established SciPy interface while enabling GPU/TPU execution, JIT compilation, vectorized batching, and differentiation via native autodiff of the chosen backend. We demonstrate how this foundation supports differentiable scientific computing through two case studies: (i) scalability of 3D transforms and rotations and (ii) a JAX drone simulation that leverages SciPy’s Rotation for accurate integration of rotational dynamics. Our contributions have been merged into SciPy main and will ship in the next release, providing a framework-agnostic, production-grade basis for 3D spatial math in differentiable systems and ML.

@MISC{schuck-eurips25,
author = {Martin Schuck and Alexander von Rohr and Angela P. Schoellig},
title = {scipy.spatial.transform: Differentiable Framework-Agnostic 3D Transformations in Python},
year = {2025},
howpublished = {Abstract and Poster, in the Workshop on Differentiable Systems and Scientific Machine Learning at EurIPS},
urllink = {https://arxiv.org/abs/2511.18157},
abstract = {Three-dimensional rigid-body transforms, i.e. rotations and translations, are central to modern differentiable machine learning pipelines in robotics, vision, and simulation. However, numerically robust and mathematically correct implementations, particularly on SO(3), are error-prone due to issues such as axis conventions, normalizations, composition consistency and subtle errors that only appear in edge cases. SciPy's spatial.transform module is a rigorously tested Python implementation. However, it historically only supported NumPy, limiting adoption in GPU-accelerated and autodiff-based workflows. We present a complete overhaul of SciPy's spatial.transform functionality that makes it compatible with any array library implementing the Python array API, including JAX, PyTorch, and CuPy. The revised implementation preserves the established SciPy interface while enabling GPU/TPU execution, JIT compilation, vectorized batching, and differentiation via native autodiff of the chosen backend. We demonstrate how this foundation supports differentiable scientific computing through two case studies: (i) scalability of 3D transforms and rotations and (ii) a JAX drone simulation that leverages SciPy's Rotation for accurate integration of rotational dynamics. Our contributions have been merged into SciPy main and will ship in the next release, providing a framework-agnostic, production-grade basis for 3D spatial math in differentiable systems and ML.}
}

2024

Continuous-time radar-inertial and lidar-inertial odometry using a Gaussian process motion prior
K. Burnett, A. P. Schoellig, and T. D. Barfoot
IEEE Transactions on Robotics, vol. 41, p. 1059–1076, 2024.

In this work, we demonstrate continuous-time radar-inertial and lidar-inertial odometry using a Gaussian process motion prior. Using a sparse prior, we demonstrate improved computational complexity during preintegration and interpolation. We use a white-noise-on-acceleration motion prior and treat the gyroscope as a direct measurement of the state while preintegrating accelerometer measurements to form relative velocity factors. Our odometry is implemented using sliding-window batch trajectory estimation. To our knowledge, our work is the first to demonstrate radar-inertial odometry with a spinning mechanical radar using both gyroscope and accelerometer measurements. We improve the performance of our radar odometry by 43% by incorporating an inertial measurement unit. Our approach is efficient and we demonstrate real-time performance.

@ARTICLE{burnett-tro24,
title = {Continuous-Time Radar-Inertial and Lidar-Inertial Odometry using a {Gaussian} Process Motion Prior},
author = {Keenan Burnett and Angela P. Schoellig and Timothy D. Barfoot},
journal = {{IEEE Transactions on Robotics}},
volume = {41},
pages = {1059--1076},
doi = {10.1109/TRO.2024.3521856},
year = {2024},
abstract = {In this work, we demonstrate continuous-time radar-inertial and lidar-inertial odometry using a Gaussian process motion prior. Using a sparse prior, we demonstrate improved computational complexity during preintegration and interpolation. We use a white-noise-on-acceleration motion prior and treat the gyroscope as a direct measurement of the state while preintegrating accelerometer measurements to form relative velocity factors. Our odometry is implemented using sliding-window batch trajectory estimation. To our knowledge, our work is the first to demonstrate radar-inertial odometry with a spinning mechanical radar using both gyroscope and accelerometer measurements. We improve the performance of our radar odometry by 43\% by incorporating an inertial measurement unit. Our approach is efficient and we demonstrate real-time performance.}
}

Ultra-wideband time difference of arrival indoor localization: from sensor placement to system evaluation
W. Zhao, A. Goudar, M. Tang, and A. P. Schoellig
IEEE Robotics and Automation Magazine, 2024. Under review.

Wireless indoor localization has attracted significant research interest due to its high accuracy, low cost, lightweight design, and low power consumption. Specifically, ultra-wideband (UWB) time difference of arrival (TDOA)-based localization has emerged as a scalable positioning solution for mobile robots, consumer electronics, and wearable devices, featuring good accuracy and reliability. While UWB TDOA-based localization systems rely on the deployment of UWB radio sensors as positioning landmarks, existing works often assume these placements are predetermined or study the sensor placement problem alone without evaluating it in practical scenarios. In this article, we bridge this gap by approaching the UWB TDOA localization from a system-level perspective, integrating sensor placement as a key component and conducting practical evaluation in real-world scenarios. Through extensive real-world experiments, we demonstrate the accuracy and robustness of our localization system, comparing its performance to the theoretical lower bounds. Using a challenging multi-room environment as a case study, we illustrate the full system construction process, from sensor placement optimization to real-world deployment. Our evaluation, comprising a cumulative total of 39 minutes of real-world experiments involving up to five agents and covering 2608 meters across four distinct scenarios, provides valuable insights and guidelines for constructing UWB TDOA localization systems.

@ARTICLE{zhao-ram24,
title = {Ultra-wideband Time Difference of Arrival Indoor Localization: From
Sensor Placement to System Evaluation},
author = {Wenda Zhao and Abhishek Goudar and Mingliang Tang and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Magazine}},
note = {Under review},
year = {2024},
urllink = {https://arxiv.org/abs/2412.12427},
abstract = {Wireless indoor localization has attracted significant research
interest due to its high accuracy, low cost, lightweight design,
and low power consumption. Specifically, ultra-wideband (UWB)
time difference of arrival (TDOA)-based localization has emerged
as a scalable positioning solution for mobile robots, consumer
electronics, and wearable devices, featuring good accuracy and
reliability. While UWB TDOA-based localization systems rely on
the deployment of UWB radio sensors as positioning landmarks,
existing works often assume these placements are predetermined or
study the sensor placement problem alone without evaluating it in
practical scenarios. In this article, we bridge this gap by
approaching the UWB TDOA localization from a system-level
perspective, integrating sensor placement as a key component and
conducting practical evaluation in real-world scenarios. Through
extensive real-world experiments, we demonstrate the accuracy and
robustness of our localization system, comparing its performance
to the theoretical lower bounds. Using a challenging multi-room
environment as a case study, we illustrate the full system
construction process, from sensor placement optimization to
real-world deployment. Our evaluation, comprising a cumulative
total of 39 minutes of real-world experiments involving up to
five agents and covering 2608 meters across four distinct
scenarios, provides valuable insights and guidelines for
constructing UWB TDOA localization systems.}
}

Force push: robust single-point pushing with force feedback
A. Heins and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 9, iss. 8, p. 6856–6863, 2024.

We present a controller for quasistatic robotic planar pushing with single-point contact using only force feedback to sense the pushed object. We consider an omnidirectional mobile robot pushing an object (the “slider”) along a given path, where the robot is equipped with a force-torque sensor to measure the force at the contact point with the slider. The geometric, inertial, and frictional parameters of the slider are not known to the controller, nor are measurements of the slider’s pose. We assume that the robot can be localized so that the global position of the contact point is always known and that the approximate initial position of the slider is provided. Simulations and real-world experiments show that our controller yields pushes that are robust to a wide range of slider parameters and state perturbations along both straight and curved paths. Furthermore, we use an admittance controller to adjust the pushing velocity based on the measured force when the slider contacts obstacles like walls.

@ARTICLE{heins-ral24,
title = {Force Push: Robust Single-Point Pushing With Force Feedback},
author = {Adam Heins and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
volume = {9},
number = {8},
pages = {6856--6863},
doi = {10.1109/LRA.2024.3414180},
year = {2024},
abstract = {We present a controller for quasistatic robotic planar pushing with single-point contact using only force feedback to sense the pushed object. We consider an omnidirectional mobile robot pushing an object (the "slider") along a given path, where the robot is equipped with a force-torque sensor to measure the force at the contact point with the slider. The geometric, inertial, and frictional parameters of the slider are not known to the controller, nor are measurements of the slider's pose. We assume that the robot can be localized so that the global position of the contact point is always known and that the approximate initial position of the slider is provided. Simulations and real-world experiments show that our controller yields pushes that are robust to a wide range of slider parameters and state perturbations along both straight and curved paths. Furthermore, we use an admittance controller to adjust the pushing velocity based on the measured force when the slider contacts obstacles like walls.}
}

SICNav: safe and interactive crowd navigation using model predictive control and bilevel optimization
S. Samavi, J. R. Han, F. Shkurti, and A. P. Schoellig
IEEE Transactions on Robotics, vol. 41, p. 801–818, 2024.

Robots need to predict and react to human motions to navigate through a crowd without collisions. Many existing methods decouple prediction from planning, which does not account for the interaction between robot and human motions and can lead to the robot getting stuck. We propose SICNav, a Model Predictive Control (MPC) method that jointly solves for robot motion and predicted crowd motion in closed-loop. We model each human in the crowd to be following an Optimal Reciprocal Collision Avoidance (ORCA) scheme and embed that model as a constraint in the robot’s local planner, resulting in a bilevel nonlinear MPC optimization problem. We use a KKT-reformulation to cast the bilevel problem as a single level and use a nonlinear solver to optimize. Our MPC method can influence pedestrian motion while explicitly satisfying safety constraints in a single-robot multi-human environment. We analyze the performance of SICNav in two simulation environments and indoor experiments with a real robot to demonstrate safe robot motion that can influence the surrounding humans. We also validate the trajectory forecasting performance of ORCA on a human trajectory dataset. Code: https://github.com/sepsamavi/safe-interactive-crowdnav.git.

@ARTICLE{samavi-tro23,
title = {{SICNav}: Safe and Interactive Crowd Navigation using Model Predictive Control and Bilevel Optimization},
author = {Sepehr Samavi and James R. Han and Florian Shkurti and Angela P. Schoellig},
journal = {{IEEE Transactions on Robotics}},
volume = {41},
pages = {801--818},
year = {2024},
urllink = {https://arxiv.org/abs/2310.10982},
doi = {10.1109/TRO.2024.3484634},
abstract = {Robots need to predict and react to human motions to navigate through a crowd without collisions. Many existing methods decouple prediction from planning, which does not account for the interaction between robot and human motions and can lead to the robot getting stuck. We propose SICNav, a Model Predictive Control (MPC) method that jointly solves for robot motion and predicted crowd motion in closed-loop. We model each human in the crowd to be following an Optimal Reciprocal Collision Avoidance (ORCA) scheme and embed that model as a constraint in the robot's local planner, resulting in a bilevel nonlinear MPC optimization problem. We use a KKT-reformulation to cast the bilevel problem as a single level and use a nonlinear solver to optimize. Our MPC method can influence pedestrian motion while explicitly satisfying safety constraints in a single-robot multi-human environment. We analyze the performance of SICNav in two simulation environments and indoor experiments with a real robot to demonstrate safe robot motion that can influence the surrounding humans. We also validate the trajectory forecasting performance of ORCA on a human trajectory dataset. Code: https://github.com/sepsamavi/safe-interactive-crowdnav.git.}
}

What is the impact of releasing code with publications? statistics from the machine learning, robotics, and control communities
S. Zhou, L. Brunke, A. Tao, A. W. Hall, F. Pizarro Bejarano, J. Panerati, and A. P. Schoellig
IEEE Control Systems Magazine, 2024. Accepted.

Open-sourcing research publications is a key enabler for the reproducibility of studies and the collective scientific progress of a research community. As all fields of science develop more advanced algorithms, we become more dependent on complex computational toolboxes – sharing research ideas solely through equations and proofs is no longer sufficient to communicate scientific developments. Over the past years, several efforts have highlighted the importance and challenges of transparent and reproducible research; code sharing is one of the key necessities in such efforts. In this article, we study the impact of code release on scientific research and present statistics from three research communities: machine learning, robotics, and control. We found that, over a six-year period (2016-2021), the percentages of papers with code at major machine learning, robotics, and control conferences have at least doubled. Moreover, high-impact papers were generally supported by open-source codes. As an example, the top 1% of most cited papers at the Conference on Neural Information Processing Systems (NeurIPS) consistently included open-source codes. In addition, our analysis shows that popular code repositories generally come with high paper citations, which further highlights the coupling between code sharing and the impact of scientific research. While the trends are encouraging, we would like to continue to promote and increase our efforts toward transparent, reproducible research that accelerates innovation – releasing code with our papers is a clear first step.

@ARTICLE{zhou-mcs24,
author={Zhou, Siqi and Brunke, Lukas and Tao, Allen and Hall, Adam W. and Pizarro Bejarano, Federico and Panerati, Jacopo and Schoellig, Angela P.},
journal={{IEEE Control Systems Magazine}},
title={What is the Impact of Releasing Code with Publications? Statistics from the Machine Learning, Robotics, and Control Communities},
year={2024},
note={Accepted},
abstract={Open-sourcing research publications is a key enabler for the reproducibility of studies and the collective scientific progress of a research community. As all fields of science develop more advanced algorithms, we become more dependent on complex computational toolboxes -- sharing research ideas solely through equations and proofs is no longer sufficient to communicate scientific developments. Over the past years, several efforts have highlighted the importance and challenges of transparent and reproducible research; code sharing is one of the key necessities in such efforts. In this article, we study the impact of code release on scientific research and present statistics from three research communities: machine learning, robotics, and control. We found that, over a six-year period (2016-2021), the percentages of papers with code at major machine learning, robotics, and control conferences have at least doubled. Moreover, high-impact papers were generally supported by open-source codes. As an example, the top 1\% of most cited papers at the Conference on Neural Information Processing Systems (NeurIPS) consistently included open-source codes. In addition, our analysis shows that popular code repositories generally come with high paper citations, which further highlights the coupling between code sharing and the impact of scientific research. While the trends are encouraging, we would like to continue to promote and increase our efforts toward transparent, reproducible research that accelerates innovation -- releasing code with our papers is a clear first step.}
}

Optimal initialization strategies for range-only trajectory estimation
A. Goudar, F. Dümbgen, T. D. Barfoot, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 9, iss. 3, p. 2160–2167, 2024.

Range-only (RO) pose estimation involves determining a robot’s pose over time by measuring the distance between multiple devices on the robot, known as tags, and devices installed in the environment, known as anchors. The non-convex nature of the range measurement model results in a cost function with possible local minima. In the absence of a good initial guess, commonly used iterative solvers can get stuck in these local minima resulting in poor trajectory estimation accuracy. In this letter, we propose convex relaxations to the original non-convex problem based on semidefinite programs (SDPs). Specifically, we formulate computationally tractable SDP relaxations to obtain accurate initial pose and trajectory estimates for RO trajectory estimation under static and dynamic (i.e., constant-velocity motion) conditions. Through simulation and hardware experiments, we demonstrate that our proposed approaches estimate the initial pose and initial trajectories accurately compared to iterative local solvers. Additionally, the proposed relaxations recover global minima under moderate range measurement noise levels.

@ARTICLE{goudar-ral24b,
author={Abhishek Goudar and Frederike D{\"u}mbgen and Timothy D. Barfoot and Angela P. Schoellig},
journal={{IEEE Robotics and Automation Letters}},
title={Optimal Initialization Strategies for Range-Only Trajectory Estimation},
year={2024},
volume={9},
number={3},
pages={2160--2167},
doi={10.1109/LRA.2024.3354623},
abstract={Range-only (RO) pose estimation involves determining a robot's pose over time by measuring the distance between multiple devices on the robot, known as tags, and devices installed in the environment, known as anchors. The non-convex nature of the range measurement model results in a cost function with possible local minima. In the absence of a good initial guess, commonly used iterative solvers can get stuck in these local minima resulting in poor trajectory estimation accuracy. In this letter, we propose convex relaxations to the original non-convex problem based on semidefinite programs (SDPs). Specifically, we formulate computationally tractable SDP relaxations to obtain accurate initial pose and trajectory estimates for RO trajectory estimation under static and dynamic (i.e., constant-velocity motion) conditions. Through simulation and hardware experiments, we demonstrate that our proposed approaches estimate the initial pose and initial trajectories accurately compared to iterative local solvers. Additionally, the proposed relaxations recover global minima under moderate range measurement noise levels.}
}

Optimized control invariance conditions for uncertain input-constrained nonlinear control systems
L. Brunke, S. Zhou, M. Che, and A. P. Schoellig
IEEE Control Systems Letters, vol. 8, p. 157–162, 2024.

Providing safety guarantees for learning-based controllers is important for real-world applications. One approach to realizing safety for arbitrary control policies is safety filtering. If necessary, the filter modifies control inputs to ensure that the trajectories of a closed-loop system stay within a given state constraint set for all future time, referred to as the set being positive invariant or the system being safe. Under the assumption of fully known dynamics, safety can be certified using control barrier functions (CBFs). However, the dynamics model is often either unknown or only partially known in practice. Learning-based methods have been proposed to approximate the CBF condition for unknown or uncertain systems from data; however, these techniques do not account for input constraints and, as a result, may not yield a valid CBF condition to render the safe set invariant. In this letter, we study conditions that guarantee control invariance of the system under input constraints and propose an optimization problem to reduce the conservativeness of CBF-based safety filters. Building on these theoretical insights, we further develop a probabilistic learning approach that allows us to build a safety filter that guarantees safety for uncertain, input-constrained systems with high probability. We demonstrate the efficacy of our proposed approach in simulation and real-world experiments on a quadrotor and show that we can achieve safe closed-loop behavior for a learned system while satisfying state and input constraints.

@article{brunke-lcss24,
author={Lukas Brunke and Siqi Zhou and Mingxuan Che and Angela P. Schoellig},
journal={{IEEE Control Systems Letters}},
title={Optimized Control Invariance Conditions for Uncertain Input-Constrained Nonlinear Control Systems},
year={2024},
volume={8},
pages={157--162},
doi={10.1109/LCSYS.2023.3344138},
abstract={Providing safety guarantees for learning-based controllers is important for real-world applications. One approach to realizing safety for arbitrary control policies is safety filtering. If necessary, the filter modifies control inputs to ensure that the trajectories of a closed-loop system stay within a given state constraint set for all future time, referred to as the set being positive invariant or the system being safe. Under the assumption of fully known dynamics, safety can be certified using control barrier functions (CBFs). However, the dynamics model is often either unknown or only partially known in practice. Learning-based methods have been proposed to approximate the CBF condition for unknown or uncertain systems from data; however, these techniques do not account for input constraints and, as a result, may not yield a valid CBF condition to render the safe set invariant. In this letter, we study conditions that guarantee control invariance of the system under input constraints and propose an optimization problem to reduce the conservativeness of CBF-based safety filters. Building on these theoretical insights, we further develop a probabilistic learning approach that allows us to build a safety filter that guarantees safety for uncertain, input-constrained systems with high probability. We demonstrate the efficacy of our proposed approach in simulation and real-world experiments on a quadrotor and show that we can achieve safe closed-loop behavior for a learned system while satisfying state and input constraints.}
}

Hierarchical task model predictive control for sequential mobile manipulation tasks
X. Du, S. Zhou, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 9, iss. 2, p. 1270–1277, 2024.

Mobile manipulators are envisioned to serve more complex roles in people’s everyday lives. With recent breakthroughs in large language models, task planners have become better at translating human verbal instructions into a sequence of tasks. However, there is still a need for a decision-making algorithm that can seamlessly interface with the high-level task planner to carry out the sequence of tasks efficiently. In this work, building on the idea of nonlinear lexicographic optimization, we propose a novel Hierarchical-Task Model Predictive Control framework that is able to complete sequential tasks with improved performance and reactivity by effectively leveraging the robot’s redundancy. Compared to the state-of-the-art task-prioritized inverse kinematic control method, our approach has improved hierarchical trajectory tracking performance by 42% on average when facing task changes, robot singularity, and reference variations. Compared to a typical single-task architecture, our proposed hierarchical task control architecture enables the robot to traverse a shorter path in task space and achieves an execution time 2.3 times faster when executing a sequence of delivery tasks. We demonstrated the results with real-world experiments on a 9 degrees of freedom mobile manipulator.

@article{du-ral24,
author={Xintong Du and Siqi Zhou and Angela P. Schoellig},
journal={{IEEE Robotics and Automation Letters}},
title={Hierarchical Task Model Predictive Control for Sequential Mobile Manipulation Tasks},
year={2024},
volume={9},
number={2},
pages={1270--1277},
doi={10.1109/LRA.2023.3342671},
abstract={Mobile manipulators are envisioned to serve more complex roles in people's everyday lives. With recent breakthroughs in large language models, task planners have become better at translating human verbal instructions into a sequence of tasks. However, there is still a need for a decision-making algorithm that can seamlessly interface with the high-level task planner to carry out the sequence of tasks efficiently. In this work, building on the idea of nonlinear lexicographic optimization, we propose a novel Hierarchical-Task Model Predictive Control framework that is able to complete sequential tasks with improved performance and reactivity by effectively leveraging the robot's redundancy. Compared to the state-of-the-art task-prioritized inverse kinematic control method, our approach has improved hierarchical trajectory tracking performance by 42\% on average when facing task changes, robot singularity, and reference variations. Compared to a typical single-task architecture, our proposed hierarchical task control architecture enables the robot to traverse a shorter path in task space and achieves an execution time 2.3 times faster when executing a sequence of delivery tasks. We demonstrated the results with real-world experiments on a 9 degrees of freedom mobile manipulator.}
}

Range-visual-inertial sensor fusion for micro aerial vehicle localization and navigation
A. Goudar, W. Zhao, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 9, iss. 1, p. 683–690, 2024.

We propose a fixed-lag smoother-based sensor fusion architecture to leverage the complementary benefits of range-based sensors and visual-inertial odometry (VIO) for localization. We use two fixed-lag smoothers (FLS) to decouple accurate state estimation and high-rate pose generation for closed-loop control. The first FLS combines ultrawideband (UWB)-based range measurements and VIO to estimate the robot trajectory and any systematic biases that affect the range measurements in cluttered environments. The second FLS estimates smooth corrections to VIO to generate pose estimates at a high rate for online control. The proposed method is lightweight and can run on a computationally constrained micro-aerial vehicle (MAV). We validate our approach through closed-loop flight tests involving dynamic trajectories in multiple real-world cluttered indoor environments. Our method achieves decimeter-to-sub-decimeter-level positioning accuracy using off-the-shelf sensors and decimeter-level tracking accuracy with minimally-tuned open-source controllers.

@article{goudar-ral24,
author={Abhishek Goudar and Wenda Zhao and Angela P. Schoellig},
journal={{IEEE Robotics and Automation Letters}},
title={Range-Visual-Inertial Sensor Fusion for Micro Aerial Vehicle Localization and Navigation},
year={2024},
volume={9},
number={1},
pages={683--690},
doi={10.1109/LRA.2023.3335772},
abstract={We propose a fixed-lag smoother-based sensor fusion architecture to leverage the complementary benefits of range-based sensors and visual-inertial odometry (VIO) for localization. We use two fixed-lag smoothers (FLS) to decouple accurate state estimation and high-rate pose generation for closed-loop control. The first FLS combines ultrawideband (UWB)-based range measurements and VIO to estimate the robot trajectory and any systematic biases that affect the range measurements in cluttered environments. The second FLS estimates smooth corrections to VIO to generate pose estimates at a high rate for online control. The proposed method is lightweight and can run on a computationally constrained micro-aerial vehicle (MAV). We validate our approach through closed-loop flight tests involving dynamic trajectories in multiple real-world cluttered indoor environments. Our method achieves decimeter-to-sub-decimeter-level positioning accuracy using off-the-shelf sensors and decimeter-level tracking accuracy with minimally-tuned open-source controllers.}
}

UTIL: an ultra-wideband time-difference-of-arrival indoor localization dataset
W. Zhao, A. Goudar, X. Qiao, and A. P. Schoellig
International Journal of Robotics Research, 2024. In press.

Ultra-wideband (UWB) time-difference-of-arrival (TDOA)-based localization has emerged as a promising, low-cost, and scalable indoor localization solution, which is especially suited for multi-robot applications. However, there is a lack of public datasets to study and benchmark UWB TDOA positioning technology in cluttered indoor environments. We fill in this gap by presenting a comprehensive dataset using Decawave’s DWM1000 UWB modules. To characterize the UWB TDOA measurement performance under various line-of-sight (LOS) and non-line-of-sight (NLOS) conditions, we collected signal-to-noise ratio (SNR), power difference values, and raw UWB TDOA measurements during the identification experiments. We also conducted a cumulative total of around 150 min of real-world flight experiments on a customized quadrotor platform to benchmark the UWB TDOA localization performance for mobile robots. The quadrotor was commanded to fly with an average speed of 0.45 m/s in both obstacle-free and cluttered environments using four different UWB anchor constellations. Raw sensor data including UWB TDOA, inertial measurement unit (IMU), optical flow, time-of-flight (ToF) laser altitude, and millimeter-accurate ground truth robot poses were collected during the flights. The dataset and development kit are available at https://utiasdsl.github.io/util-uwb-dataset/.

@article{zhao-ijrr24,
author = {Wenda Zhao and Abhishek Goudar and Xinyuan Qiao and Angela P. Schoellig},
title = {{UTIL}: An ultra-wideband time-difference-of-arrival indoor localization dataset},
journal = {{International Journal of Robotics Research}},
year = {2024},
note = {In press},
urllink = {https://journals.sagepub.com/doi/full/10.1177/02783649241230640},
urlcode = {https://github.com/utiasDSL/util-uwb-dataset},
doi = {10.1177/02783649241230640},
abstract = {Ultra-wideband (UWB) time-difference-of-arrival (TDOA)-based localization has emerged as a promising, low-cost, and scalable indoor localization solution, which is especially suited for multi-robot applications. However, there is a lack of public datasets to study and benchmark UWB TDOA positioning technology in cluttered indoor environments. We fill in this gap by presenting a comprehensive dataset using Decawave's DWM1000 UWB modules. To characterize the UWB TDOA measurement performance under various line-of-sight (LOS) and non-line-of-sight (NLOS) conditions, we collected signal-to-noise ratio (SNR), power difference values, and raw UWB TDOA measurements during the identification experiments. We also conducted a cumulative total of around 150 min of real-world flight experiments on a customized quadrotor platform to benchmark the UWB TDOA localization performance for mobile robots. The quadrotor was commanded to fly with an average speed of 0.45 m/s in both obstacle-free and cluttered environments using four different UWB anchor constellations. Raw sensor data including UWB TDOA, inertial measurement unit (IMU), optical flow, time-of-flight (ToF) laser altitude, and millimeter-accurate ground truth robot poses were collected during the flights. The dataset and development kit are available at https://utiasdsl.github.io/util-uwb-dataset/.}
}

Closing the perception-action loop for semantically safe navigation in semi-static environments
J. Qian, S. Zhou, N. J. Ren, V. Chatrath, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2024. Accepted.

Autonomous robots navigating in changing environments demand adaptive navigation strategies for safe long-term operation. While many modern control paradigms offer theoretical guarantees, they often assume known extrinsic safety constraints, overlooking challenges when deployed in real-world environments where objects can appear, disappear, and shift over time. In this paper, we present a closed-loop perception-action pipeline that bridges this gap. Our system encodes an online-constructed dense map, along with object-level semantic and consistency estimates into a control barrier function (CBF) to regulate safe regions in the scene. A model predictive controller (MPC) leverages the CBF-based safety constraints to adapt its navigation behaviour, which is particularly crucial when potential scene changes occur. We test the system in simulations and real-world experiments to demonstrate the impact of semantic information and scene change handling on robot behavior, validating the practicality of our approach.

@inproceedings{qian-icra24,
author={Jingxing Qian and Siqi Zhou and Nicholas Jianrui Ren and Veronica Chatrath and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={Closing the Perception-Action Loop for Semantically Safe Navigation in Semi-Static Environments},
year={2024},
note={Accepted},
abstract = {Autonomous robots navigating in changing environments demand adaptive navigation strategies for safe long-term operation. While many modern control paradigms offer theoretical guarantees, they often assume known extrinsic safety constraints, overlooking challenges when deployed in real-world environments where objects can appear, disappear, and shift over time. In this paper, we present a closed-loop perception-action pipeline that bridges this gap. Our system encodes an online-constructed dense map, along with object-level semantic and consistency estimates into a control barrier function (CBF) to regulate safe regions in the scene. A model predictive controller (MPC) leverages the CBF-based safety constraints to adapt its navigation behaviour, which is particularly crucial when potential scene changes occur. We test the system in simulations and real-world experiments to demonstrate the impact of semantic information and scene change handling on robot behavior, validating the practicality of our approach.}
}

Uncertainty-aware 3D object-level mapping with deep shape priors
Z. Liao, J. Yang, J. Qian, A. P. Schoellig, and S. L. Waslander
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2024. Accepted.

3D object-level mapping is a fundamental problem in robotics, which is especially challenging when object CAD models are unavailable during inference. In this work, we propose a framework that can reconstruct high-quality object-level maps for unknown objects. Our approach takes multiple RGB-D images as input and outputs dense 3D shapes and 9-DoF poses (including 3 scale parameters) for detected objects. The core idea of our approach is to leverage a learnt generative model for shape categories as a prior and to formulate a probabilistic, uncertainty-aware optimization framework for 3D reconstruction. We derive a probabilistic formulation that propagates shape and pose uncertainty through two novel loss functions. Unlike current state-of-the-art approaches, we explicitly model the uncertainty of the object shapes and poses during our optimization, resulting in a high-quality object-level mapping system. Moreover, the resulting shape and pose uncertainties, which we demonstrate can accurately reflect the true errors of our object maps, can also be useful for downstream robotics tasks such as active vision. We perform extensive evaluations on indoor and outdoor real-world datasets, achieving substantial improvements over state-of-the-art methods. Our code will be available at https://github.com/TRAILab/UncertainShapePose

@inproceedings{liao-icra24,
author={Ziwei Liao and Jun Yang and Jingxing Qian and Angela P. Schoellig and Steven L. Waslander},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={Uncertainty-aware {3D} Object-Level Mapping with Deep Shape Priors},
year={2024},
note={Accepted},
abstract = {3D object-level mapping is a fundamental problem in robotics, which is especially challenging when object CAD models are unavailable during inference. In this work, we propose a framework that can reconstruct high-quality object-level maps for unknown objects. Our approach takes multiple RGB-D images as input and outputs dense 3D shapes and 9-DoF poses (including 3 scale parameters) for detected objects. The core idea of our approach is to leverage a learnt generative model for shape categories as a prior and to formulate a probabilistic, uncertainty-aware optimization framework for 3D reconstruction. We derive a probabilistic formulation that propagates shape and pose uncertainty through two novel loss functions. Unlike current state-of-the-art approaches, we explicitly model the uncertainty of the object shapes and poses during our optimization, resulting in a high-quality object-level mapping system. Moreover, the resulting shape and pose uncertainties, which we demonstrate can accurately reflect the true errors of our object maps, can also be useful for downstream robotics tasks such as active vision. We perform extensive evaluations on indoor and outdoor real-world datasets, achieving substantial improvements over state-of-the-art methods. Our code will be available at https://github.com/TRAILab/UncertainShapePose}
}

Practical considerations for discrete-time implementations of continuous-time control barrier function-based safety filters
L. Brunke, S. Zhou, M. Che, and A. P. Schoellig
in Proc. of the American Control Conference (ACC), 2024. Accepted.

Safety filters based on control barrier functions (CBFs) have become a popular method to guarantee safety for uncertified control policies, e.g., as resulting from reinforcement learning. Here, safety is defined as staying in a pre-defined set, the safe set, that adheres to the system’s state constraints, e.g., as given by lane boundaries for a self-driving vehicle. In this paper, we examine one commonly overlooked problem that arises in practical implementations of continuous-time CBF-based safety filters. In particular, we look at the issues caused by discrete-time implementations of the continuous-time CBF-based safety filter, especially for cases where the magnitude of the Lie derivative of the CBF with respect to the control input is zero or close to zero. When overlooked, this filter can result in undesirable chattering effects or constraint violations. In this work, we propose three mitigation strategies that allow us to use a continuous-time safety filter in a discrete-time implementation with a local relative degree. Using these strategies in augmented CBF-based safety filters, we achieve safety for all states in the safe set by either using an additional penalty term in the safety filtering objective or modifying the CBF such that those undesired states are not encountered during closed-loop operation. We demonstrate the presented issue and validate our three proposed mitigation strategies in simulation and on a real-world quadrotor.

@inproceedings{brunke-acc24,
author={Lukas Brunke and Siqi Zhou and Mingxuan Che and Angela P. Schoellig},
booktitle = {{Proc. of the American Control Conference (ACC)}},
title={Practical Considerations for Discrete-Time Implementations of Continuous-Time Control Barrier Function-Based Safety Filters},
year={2024},
note={Accepted},
abstract = {Safety filters based on control barrier functions (CBFs) have become a popular method to guarantee safety for uncertified control policies, e.g., as resulting from reinforcement learning. Here, safety is defined as staying in a pre-defined set, the safe set, that adheres to the system's state constraints, e.g., as given by lane boundaries for a self-driving vehicle. In this paper, we examine one commonly overlooked problem that arises in practical implementations of continuous-time CBF-based safety filters. In particular, we look at the issues caused by discrete-time implementations of the continuous-time CBF-based safety filter, especially for cases where the magnitude of the Lie derivative of the CBF with respect to the control input is zero or close to zero. When overlooked, this filter can result in undesirable chattering effects or constraint violations. In this work, we propose three mitigation strategies that allow us to use a continuous-time safety filter in a discrete-time implementation with a local relative degree. Using these strategies in augmented CBF-based safety filters, we achieve safety for all states in the safe set by either using an additional penalty term in the safety filtering objective or modifying the CBF such that those undesired states are not encountered during closed-loop operation. We demonstrate the presented issue and validate our three proposed mitigation strategies in simulation and on a real-world quadrotor.}
}

Control-barrier-aided teleoperation with visual-inertial SLAM for safe MAV navigation in complex environments
S. Zhou, S. Papatheodorou, S. Leutenegger, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2024. Accepted.

In this paper, we consider a Micro Aerial Vehicle (MAV) system teleoperated by a non-expert and introduce a perceptive safety filter that leverages Control Barrier Functions (CBFs) in conjunction with Visual-Inertial Simultaneous Localization and Mapping (VI-SLAM) and dense 3D occupancy mapping to guarantee safe navigation in complex and unstructured environments. Our system relies solely on onboard IMU measurements, stereo infrared images, and depth images and autonomously corrects teleoperated inputs when they are deemed unsafe. We define a point in 3D space as unsafe if it satisfies either of two conditions: (i) it is occupied by an obstacle, or (ii) it remains unmapped. At each time step, an occupancy map of the environment is updated by the VI-SLAM by fusing the onboard measurements, and a CBF is constructed to parameterize the (un)safe region in the 3D space. Given the CBF and state feedback from the VI-SLAM module, a safety filter computes a certified reference that best matches the teleoperation input while satisfying the safety constraint encoded by the CBF. In contrast to existing perception-based safe control frameworks, we directly close the perception-action loop and demonstrate the full capability of safe control in combination with real-time VI-SLAM without any external infrastructure or prior knowledge of the environment. We verify the efficacy of the perceptive safety filter in real-time MAV experiments using exclusively onboard sensing and computation and show that the teleoperated MAV is able to safely navigate through unknown environments despite arbitrary inputs sent by the teleoperator.

@inproceedings{zhou-icra24,
author={Siqi Zhou and Sotiris Papatheodorou and Stefan Leutenegger and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={Control-Barrier-Aided Teleoperation with Visual-Inertial {SLAM} for Safe {MAV} Navigation in Complex Environments},
year={2024},
note={Accepted},
abstract = {In this paper, we consider a Micro Aerial Vehicle (MAV) system teleoperated by a non-expert and introduce a perceptive safety filter that leverages Control Barrier Functions (CBFs) in conjunction with Visual-Inertial Simultaneous Localization and Mapping (VI-SLAM) and dense 3D occupancy mapping to guarantee safe navigation in complex and unstructured environments. Our system relies solely on onboard IMU measurements, stereo infrared images, and depth images and autonomously corrects teleoperated inputs when they are deemed unsafe. We define a point in 3D space as unsafe if it satisfies either of two conditions: (i) it is occupied by an obstacle, or (ii) it remains unmapped. At each time step, an occupancy map of the environment is updated by the VI-SLAM by fusing the onboard measurements, and a CBF is constructed to parameterize the (un)safe region in the 3D space. Given the CBF and state feedback from the VI-SLAM module, a safety filter computes a certified reference that best matches the teleoperation input while satisfying the safety constraint encoded by the CBF. In contrast to existing perception-based safe control frameworks, we directly close the perception-action loop and demonstrate the full capability of safe control in combination with real-time VI-SLAM without any external infrastructure or prior knowledge of the environment. We verify the efficacy of the perceptive safety filter in real-time MAV experiments using exclusively onboard sensing and computation and show that the teleoperated MAV is able to safely navigate through unknown environments despite arbitrary inputs sent by the teleoperator.}
}

AMSwarmX: safe swarm coordination in complex environments via implicit non-convex decomposition of the obstacle-free space
V. K. Adajania, S. Zhou, A. K. Singh, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2024. Accepted.

Quadrotor motion planning in complex environments leverages the concept of safe flight corridor (SFC) to facilitate static obstacle avoidance. Typically, SFCs are constructed through convex decomposition of the environment’s free space into cuboids, convex polyhedra, or spheres. However, when dealing with a quadrotor swarm, such SFCs can be overly conservative, substantially limiting the available free space for quadrotors to coordinate. This paper presents an Alternating Minimization-based approach that does not require building a conservative free-space approximation. Instead, both static and dynamic collision constraints are treated in a unified manner. Dynamic collisions are handled based on shared position trajectories of the quadrotors. Static obstacle avoidance is coupled with distance queries from the Octomap, providing an implicit non-convex decomposition of free space. As a result, our approach is scalable to arbitrarily complex environments. Through extensive comparisons in simulation, we demonstrate a 60% improvement in success rate, an average 1.8× reduction in mission completion time, and an average 23× reduction in per-agent computation time compared to SFC-based approaches. We also experimentally validated our approach using a Crazyflie quadrotor swarm of up to 12 quadrotors in obstacle-rich environments. The code, supplementary materials, and videos are released for reference.

@inproceedings{adajania-icra24,
author={Vivek K. Adajania and Siqi Zhou and Arun Kumar Singh and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={{AMSwarmX}: Safe Swarm Coordination in CompleX Environments via Implicit Non-Convex Decomposition of the Obstacle-Free Space},
year={2024},
note={Accepted},
abstract = {Quadrotor motion planning in complex environments leverages the concept of safe flight corridor (SFC) to facilitate static obstacle avoidance. Typically, SFCs are constructed through convex decomposition of the environment's free space into cuboids, convex polyhedra, or spheres. However, when dealing with a quadrotor swarm, such SFCs can be overly conservative, substantially limiting the available free space for quadrotors to coordinate. This paper presents an Alternating Minimization-based approach that does not require building a conservative free-space approximation. Instead, both static and dynamic collision constraints are treated in a unified manner. Dynamic collisions are handled based on shared position trajectories of the quadrotors. Static obstacle avoidance is coupled with distance queries from the Octomap, providing an implicit non-convex decomposition of free space. As a result, our approach is scalable to arbitrarily complex environments. Through extensive comparisons in simulation, we demonstrate a 60\% improvement in success rate, an average 1.8× reduction in mission completion time, and an average 23× reduction in per-agent computation time compared to SFC-based approaches. We also experimentally validated our approach using a Crazyflie quadrotor swarm of up to 12 quadrotors in obstacle-rich environments. The code, supplementary materials, and videos are released for reference.}
}

Is data all that matters? The role of control frequency for learning-based sampled-data control of uncertain systems
R. Römer, L. Brunke, S. Zhou, and A. P. Schoellig
in Proc. of the American Control Conference (ACC), 2024. Accepted.

Learning models or control policies from data has become a powerful tool to improve the performance of uncertain systems. While a strong focus has been placed on increasing the amount and quality of data to improve performance, data can never fully eliminate uncertainty, making feedback necessary to ensure stability and performance. We show that the control frequency at which the input is recalculated is a crucial design parameter, yet it has hardly been considered before. We address this gap by combining probabilistic model learning and sampled-data control. We use Gaussian processes (GPs) to learn a continuous-time model and compute a corresponding discrete-time controller. The result is an uncertain sampled-data control system, for which we derive robust stability conditions. We formulate semidefinite programs to compute the minimum control frequency required for stability and to optimize performance. As a result, our approach enables us to study the effect of both control frequency and data on stability and closed-loop performance. We show in numerical simulations of a quadrotor that performance can be improved by increasing either the amount of data or the control frequency, and that we can trade off one for the other. For example, by increasing the control frequency by 33%, we can reduce the number of data points by half while still achieving similar performance.

@inproceedings{roemer-acc24,
author={Ralf R{\"o}mer and Lukas Brunke and Siqi Zhou and Angela P. Schoellig},
booktitle = {{Proc. of the American Control Conference (ACC)}},
title={Is Data All That Matters? {The} Role of Control Frequency for Learning-Based Sampled-Data Control of Uncertain Systems},
year={2024},
note={Accepted},
urllink={https://arxiv.org/abs/2403.09504},
urlcode={https://github.com/ralfroemer99/lb_sd},
abstract = {Learning models or control policies from data has become a powerful tool to improve the performance of uncertain systems. While a strong focus has been placed on increasing the amount and quality of data to improve performance, data can never fully eliminate uncertainty, making feedback necessary to ensure stability and performance. We show that the control frequency at which the input is recalculated is a crucial design parameter, yet it has hardly been considered before. We address this gap by combining probabilistic model learning and sampled-data control. We use Gaussian processes (GPs) to learn a continuous-time model and compute a corresponding discrete-time controller. The result is an uncertain sampled-data control system, for which we derive robust stability conditions. We formulate semidefinite programs to compute the minimum control frequency required for stability and to optimize performance. As a result, our approach enables us to study the effect of both control frequency and data on stability and closed-loop performance. We show in numerical simulations of a quadrotor that performance can be improved by increasing either the amount of data or the control frequency, and that we can trade off one for the other. For example, by increasing the control frequency by 33\%, we can reduce the number of data points by half while still achieving similar performance.}
}

Fine-tuning of neural network approximate MPC without retraining via bayesian optimization
H. Hose, P. Brunzema, A. von Rohr, A. Gräfe, A. P. Schoellig, and S. Trimpe
Abstract and Poster, in the Workshop on Safe and Robust Robot Learning for Operation in the Real World at the Conference on Robot Learning (CoRL), 2024.

Approximate model-predictive control (AMPC) aims to imitate an MPC’s behavior with a neural network, removing the need to solve an expensive optimization problem at runtime. However, during deployment, the parameters of the underlying MPC must usually be fine-tuned. This often renders AMPC impractical due to the need to repeatedly generate a new dataset and retrain the neural network. Recent work addresses this problem by adapting AMPC without retraining using approximated sensitivities of the MPC’s optimization problem. However, currently, this adaption must be done by hand, which is labor-intensive and can be unintuitive for high-dimensional systems. To solve this issue, we propose using Bayesian optimization to tune the parameters of AMPC policies based on experimental data. By combining model-based control with direct and local learning, our approach achieves superior performance to nominal AMPC on hardware, with minimal experimentation. This allows automatic and data-efficient adaptation of AMPC to new system instances and fine-tuning to cost functions that are difficult to implement in MPC. We demonstrate the proposed method in hardware experiments for the swing-up maneuver of a cartpole and yaw control of an under-actuated balancing unicycle robot, a challenging control problem.

@MISC{hose-corl24,
author = {Henrik Hose and Paul Brunzema and Alexander von Rohr and Alexander Gr{\"a}fe and Angela P. Schoellig and Sebastian Trimpe},
title = {Fine-Tuning of Neural Network Approximate {MPC} without Retraining via Bayesian Optimization},
year = {2024},
howpublished = {Abstract and Poster, in the Workshop on Safe and Robust Robot Learning for Operation in the Real World at the Conference on Robot Learning (CoRL)},
abstract = {Approximate model-predictive control (AMPC) aims to imitate an MPC's behavior with a neural network, removing the need to solve an expensive optimization problem at runtime. However, during deployment, the parameters of the underlying MPC must usually be fine-tuned. This often renders AMPC impractical due to the need to repeatedly generate a new dataset and retrain the neural network. Recent work addresses this problem by adapting AMPC without retraining using approximated sensitivities of the MPC's optimization problem. However, currently, this adaption must be done by hand, which is labor-intensive and can be unintuitive for high-dimensional systems. To solve this issue, we propose using Bayesian optimization to tune the parameters of AMPC policies based on experimental data. By combining model-based control with direct and local learning, our approach achieves superior performance to nominal AMPC on hardware, with minimal experimentation. This allows automatic and data-efficient adaptation of AMPC to new system instances and fine-tuning to cost functions that are difficult to implement in MPC. We demonstrate the proposed method in hardware experiments for the swing-up maneuver of a cartpole and yaw control of an under-actuated balancing unicycle robot, a challenging control problem.}
}

Safe offline reinforcement learning using trajectory-level diffusion models
R. Römer, L. Brunke, M. Schuck, and A. P. Schoellig
Abstract and Poster, in the Back to the Future: Robot Learning Going Real Workshop at the IEEE International Conference on Robotics and Automation (ICRA), 2024.

Despite its success in controlling robotic systems, reinforcement learning (RL) suffers from several issues that hinder its widespread adoption in real-world scenarios. Recently, diffusion models have emerged as a powerful tool to address some of the longstanding challenges in offline and model-based RL, improving long-horizon planning and facilitating multitask generalization. However, these algorithms are unsuitable for operating in unseen and dynamic environments where novel and time-varying constraints not represented in the training data may arise. To address this issue, we propose incorporating a projection scheme into diffusion-based trajectory generation. Our approach uses the iterative nature of diffusion models and alternates the conditional backward diffusion process with a projection of the noisy trajectory onto the constraint set. As a result, we can generate trajectories that are both safe and dynamically feasible while still achieving high reward. We evaluate our approach for goal-conditioned offline RL for two simulated robotic systems navigating in environments with static and dynamic obstacles, representing novel test-time constraints. We show that our method can satisfy these constraints in closed loop, greatly increasing the success rate of reaching the goal.

@MISC{romer-icra24,
author = {Ralf R{\"o}mer and Lukas Brunke and Martin Schuck and Angela P. Schoellig},
title = {Safe Offline Reinforcement Learning using Trajectory-Level Diffusion Models},
year = {2024},
howpublished = {Abstract and Poster, in the Back to the Future: Robot Learning Going Real Workshop at the IEEE International Conference on Robotics and Automation (ICRA)},
abstract = {Despite its success in controlling robotic systems, reinforcement learning (RL) suffers from several issues that hinder its widespread adoption in real-world scenarios. Recently, diffusion models have emerged as a powerful tool to address some of the longstanding challenges in offline and model-based RL, improving long-horizon planning and facilitating multitask generalization. However, these algorithms are unsuitable for operating in unseen and dynamic environments where novel and time-varying constraints not represented in the training data may arise. To address this issue, we propose incorporating a projection scheme into diffusion-based trajectory generation. Our approach uses the iterative nature of diffusion models and alternates the conditional backward diffusion process with a projection of the noisy trajectory onto the constraint set. As a result, we can generate trajectories that are both safe and dynamically feasible while still achieving high reward. We evaluate our approach for goal-conditioned offline RL for two simulated robotic systems navigating in environments with static and dynamic obstacles, representing novel test-time constraints. We show that our method can satisfy these constraints in closed loop, greatly increasing the success rate of reaching the goal.},
}

2023

Perception-aware tag placement planning for robust localization of UAVs in indoor construction environments
N. Kayhani, A. Schoellig, and B. McCabe
Journal of Computing in Civil Engineering, vol. 37, iss. 2, p. 04022060, 2023.

Tag-based visual-inertial localization is a lightweight method for enabling autonomous data collection missions of low-cost unmanned aerial vehicles (UAVs) in indoor construction environments. However, finding the optimal tag configuration (i.e., number, size, and location) on dynamic construction sites remains challenging. This work proposes a perception-aware genetic algorithm-based tag placement planner (PGA-TaPP) to determine the optimal tag configuration using four-dimensional (4D) building information models (BIM), considering the project progress, safety requirements, and UAV’s localizability. The proposed method provides a 4D plan for tag placement by maximizing the localizability in user-specified regions of interest (ROIs) while limiting the installation costs. Localizability is quantified using the Fisher information matrix (FIM) and encapsulated in navigable grids. The experimental results show the effectiveness of our method in finding an optimal 4D tag placement plan for the robust localization of UAVs on under-construction indoor sites.

@article{kayani-jcce23,
author = {Navid Kayhani and Angela Schoellig and Brenda McCabe},
title = {Perception-Aware Tag Placement Planning for Robust Localization of {UAVs} in Indoor Construction Environments},
journal = {{Journal of Computing in Civil Engineering}},
volume = {37},
number = {2},
pages = {04022060},
year = {2023},
doi = {10.1061/JCCEE5.CPENG-5068},
urllink = {https://ascelibrary.org/doi/abs/10.1061/JCCEE5.CPENG-5068},
abstract = {Tag-based visual-inertial localization is a lightweight method for enabling autonomous data collection missions of low-cost unmanned aerial vehicles (UAVs) in indoor construction environments. However, finding the optimal tag configuration (i.e., number, size, and location) on dynamic construction sites remains challenging. This work proposes a perception-aware genetic algorithm-based tag placement planner (PGA-TaPP) to determine the optimal tag configuration using four-dimensional (4D) building information models (BIM), considering the project progress, safety requirements, and UAV’s localizability. The proposed method provides a 4D plan for tag placement by maximizing the localizability in user-specified regions of interest (ROIs) while limiting the installation costs. Localizability is quantified using the Fisher information matrix (FIM) and encapsulated in navigable grids. The experimental results show the effectiveness of our method in finding an optimal 4D tag placement plan for the robust localization of UAVs on under-construction indoor sites.}
}

Keep it upright: model predictive control for nonprehensile object transportation with obstacle avoidance on a mobile manipulator
A. Heins and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 8, iss. 12, p. 7986–7993, 2023.

We consider a nonprehensile manipulation task in which a mobile manipulator must balance objects on its end effector without grasping them—known as the waiter’s problem—and move to a desired location while avoiding static and dynamic obstacles. In contrast to existing approaches, our focus is on fast online planning in response to new and changing environments. Our main contribution is a whole-body constrained model predictive controller (MPC) for a mobile manipulator that balances objects and avoids collisions. Furthermore, we propose planning using the minimum statically-feasible friction coefficients, which provides robustness to frictional uncertainty and other force disturbances while also substantially reducing the compute time required to update the MPC policy. Simulations and hardware experiments on a velocity-controlled mobile manipulator with up to seven balanced objects, stacked objects, and various obstacles show that our approach can handle a variety of conditions that have not been previously demonstrated, with end effector speeds and accelerations up to 2.0 m/s and 7.9 m/s^2, respectively. Notably, we demonstrate a projectile avoidance task in which the robot avoids a thrown ball while balancing a tall bottle.

@article{heins-ral23,
author = {Adam Heins and Angela P. Schoellig},
title = {Keep It Upright: Model Predictive Control for Nonprehensile Object Transportation With Obstacle Avoidance on a Mobile Manipulator},
journal = {{IEEE Robotics and Automation Letters}},
year = {2023},
volume = {8},
number = {12},
pages = {7986--7993},
urllink = {https://doi.org/10.1109/LRA.2023.3324520},
urlvideo = {http://tiny.cc/keep-it-upright},
urlcode = {https://github.com/utiasDSL/upright},
doi = {10.1109/LRA.2023.3324520},
abstract = {We consider a nonprehensile manipulation task in which a mobile manipulator must balance objects on its end effector without grasping them—known as the waiter's problem—and move to a desired location while avoiding static and dynamic obstacles. In contrast to existing approaches, our focus is on fast online planning in response to new and changing environments. Our main contribution is a whole-body constrained model predictive controller (MPC) for a mobile manipulator that balances objects and avoids collisions. Furthermore, we propose planning using the minimum statically-feasible friction coefficients, which provides robustness to frictional uncertainty and other force disturbances while also substantially reducing the compute time required to update the MPC policy. Simulations and hardware experiments on a velocity-controlled mobile manipulator with up to seven balanced objects, stacked objects, and various obstacles show that our approach can handle a variety of conditions that have not been previously demonstrated, with end effector speeds and accelerations up to 2.0 m/s and 7.9 m/s^2, respectively. Notably, we demonstrate a projectile avoidance task in which the robot avoids a thrown ball while balancing a tall bottle.}
}

Differentially flat learning-based model predictive control using a stability, state, and input constraining safety filter
A. W. Hall, M. Greeff, and A. P. Schoellig
IEEE Control Systems Letters, vol. 7, p. 2191–2196, 2023.

Learning-based optimal control algorithms control unknown systems using past trajectory data and a learned model of the system dynamics. These controllers use either a linear approximation of the learned dynamics, trading performance for faster computation, or nonlinear optimization methods, which typically perform better but can limit real-time applicability. In this letter, we present a novel nonlinear controller that exploits differential flatness to achieve similar performance to state-of-the-art learning-based controllers but with significantly less computational effort. Differential flatness is a property of dynamical systems whereby nonlinear systems can be exactly linearized through a nonlinear input mapping. Here, the nonlinear transformation is learned as a Gaussian process and is used in a safety filter that guarantees, with high probability, stability as well as input and flat state constraint satisfaction. This safety filter is then used to refine inputs from a flat model predictive controller to perform constrained nonlinear learning-based optimal control through two successive convex optimizations. We compare our method to state-of-the-art learning-based control strategies and achieve similar performance, but with significantly better computational efficiency, while also respecting flat state and input constraints, and guaranteeing stability.

@article{hall-lcss23,
title = {Differentially Flat Learning-Based Model Predictive Control Using a Stability, State, and Input Constraining Safety Filter},
author = {Adam W. Hall and Melissa Greeff and Angela P. Schoellig},
journal = {{IEEE Control Systems Letters}},
year = {2023},
volume = {7},
pages={2191--2196},
doi={10.1109/LCSYS.2023.3285616},
urllink = {https://ieeexplore.ieee.org/abstract/document/10149384},
abstract = {Learning-based optimal control algorithms control unknown systems using past trajectory data and a learned model of the system dynamics. These controllers use either a linear approximation of the learned dynamics, trading performance for faster computation, or nonlinear optimization methods, which typically perform better but can limit real-time applicability. In this letter, we present a novel nonlinear controller that exploits differential flatness to achieve similar performance to state-of-the-art learning-based controllers but with significantly less computational effort. Differential flatness is a property of dynamical systems whereby nonlinear systems can be exactly linearized through a nonlinear input mapping. Here, the nonlinear transformation is learned as a Gaussian process and is used in a safety filter that guarantees, with high probability, stability as well as input and flat state constraint satisfaction. This safety filter is then used to refine inputs from a flat model predictive controller to perform constrained nonlinear learning-based optimal control through two successive convex optimizations. We compare our method to state-of-the-art learning-based control strategies and achieve similar performance, but with significantly better computational efficiency, while also respecting flat state and input constraints, and guaranteeing stability.}
}

Boreas: a multi-season autonomous driving dataset
K. Burnett, D. J. Yoon, Y. Wu, A. Z. Li, H. Zhang, S. Lu, J. Qian, W. Tseng, A. Lambert, K. Y. K. Leung, A. P. Schoellig, and T. D. Barfoot
The International Journal of Robotics Research, vol. 42, iss. 1-2, p. 33–42, 2023.

The Boreas dataset was collected by driving a repeated route over the course of one year, resulting in stark seasonal variations and adverse weather conditions such as rain and falling snow. In total, the Boreas dataset contains over 350 km of driving data featuring a 128-channel Velodyne Alpha-Prime lidar, a 360-degree Navtech CIR304-H scanning radar, a 5 MP FLIR Blackfly S camera, and centimetre-accurate post-processed ground truth poses. At launch, our dataset will support live leaderboards for odometry, metric localization, and 3D object detection. The dataset and development kit are available at: https://www.boreas.utias.utoronto.ca/

@article{burnett-ijrr22,
author = {Keenan Burnett and David J. Yoon and Yuchen Wu and Andrew Zou Li and Haowei Zhang and Shichen Lu and Jingxing Qian and Wei-Kang Tseng and Andrew Lambert and Keith Y.K. Leung and Angela P. Schoellig and Timothy D. Barfoot},
title = {Boreas: A Multi-Season Autonomous Driving Dataset},
journal = {{The International Journal of Robotics Research}},
volume = {42},
number = {1-2},
pages = {33--42},
year = {2023},
doi = {10.1177/02783649231160195},
abstract = {The Boreas dataset was collected by driving a repeated route over the course of one year, resulting in stark seasonal variations and adverse weather conditions such as rain and falling snow. In total, the Boreas dataset contains over 350km of driving data featuring a 128-channel Velodyne Alpha-Prime lidar, a 360 degree Navtech CIR304-H scanning radar, a 5MP FLIR Blackfly S camera, and centimetre-accurate post-processed ground truth poses. At launch, our dataset will support live leaderboards for odometry, metric localization, and 3D object detection. The dataset and development kit are available at: https://www.boreas.utias.utoronto.ca/}
}

POV-SLAM: probabilistic object-aware variational SLAM in semi-static environments
J. Qian, V. Chatrath, J. Servos, A. Mavrinac, W. Burgard, S. L. Waslander, and A. P. Schoellig
in Proc. of Robotics: Science and Systems, 2023.

Simultaneous localization and mapping (SLAM) in slowly varying scenes is important for long-term robot task completion in GPS-denied environments. Failing to detect scene changes may lead to inaccurate maps and, ultimately, lost robots. Classical SLAM algorithms assume static scenes, and recent works take dynamics into account, but require scene changes to be observed in consecutive frames. Semi-static scenes, wherein objects appear, disappear, or move slowly over time, are often overlooked, yet are critical for long-term operation. We propose an object-aware, factor-graph SLAM framework that tracks and reconstructs semi-static object-level changes. Our novel variational expectation-maximization strategy is used to optimize factor graphs involving a Gaussian-Uniform bimodal measurement likelihood for potentially-changing objects. We evaluate our approach alongside the state-of-the-art SLAM solutions in simulation and on our novel real-world SLAM dataset captured in a warehouse over four months. Our method improves the robustness of localization in the presence of semi-static changes, providing object-level reasoning about the scene.

@inproceedings{qian-rss23,
author = {Jingxing Qian and Veronica Chatrath and James Servos and Aaron Mavrinac and Wolfram Burgard and Steven L. Waslander and Angela P. Schoellig},
title = {{POV-SLAM}: Probabilistic Object-Aware Variational {SLAM} in Semi-Static Environments},
booktitle = {{Proc. of Robotics: Science and Systems}},
year = {2023},
doi = {10.15607/RSS.2023.XIX.069},
abstract = {Simultaneous localization and mapping (SLAM) in slowly varying scenes is important for long-term robot task completion in GPS-denied environments. Failing to detect scene changes may lead to inaccurate maps and, ultimately, lost robots. Classical SLAM algorithms assume static scenes, and recent works take dynamics into account, but require scene changes to be observed in consecutive frames. Semi-static scenes, wherein objects appear, disappear, or move slowly over time, are often overlooked, yet are critical for long-term operation. We propose an object-aware, factor-graph SLAM framework that tracks and reconstructs semi-static object-level changes. Our novel variational expectation-maximization strategy is used to optimize factor graphs involving a Gaussian-Uniform bimodal measurement likelihood for potentially-changing objects. We evaluate our approach alongside the state-of-the-art SLAM solutions in simulation and on our novel real-world SLAM dataset captured in a warehouse over four months. Our method improves the robustness of localization in the presence of semi-static changes, providing object-level reasoning about the scene.}
}

Multi-step model predictive safety filters: reducing chattering by increasing the prediction horizon
F. Pizarro Bejarano, L. Brunke, and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2023, p. 4723–4730.

Learning-based controllers have demonstrated superior performance compared to classical controllers in various tasks. However, providing safety guarantees is not trivial. Safety, the satisfaction of state and input constraints, can be guaranteed by augmenting the learned control policy with a safety filter. Model predictive safety filters (MPSFs) are a common safety filtering approach based on model predictive control (MPC). MPSFs seek to guarantee safety while minimizing the difference between the proposed and applied inputs in the immediate next time step. This limited foresight can lead to jerky motions and undesired oscillations close to constraint boundaries, known as chattering. In this paper, we reduce chattering by considering input corrections over a longer horizon. Under the assumption of bounded model uncertainties, we prove recursive feasibility using techniques from robust MPC. We verified the proposed approach in both extensive simulation and quadrotor experiments. In experiments with a Crazyflie 2.0 drone, we show that, in addition to preserving the desired safety guarantees, the proposed MPSF reduces chattering by more than a factor of 4 compared to previous MPSF formulations.

@inproceedings{pizarro-cdc23,
author={Pizarro Bejarano, Federico and Brunke, Lukas and Schoellig, Angela P.},
booktitle={{Proc. of the IEEE Conference on Decision and Control (CDC)}},
title={Multi-Step Model Predictive Safety Filters: Reducing Chattering by Increasing the Prediction Horizon},
year={2023},
pages={4723--4730},
doi={10.1109/CDC49753.2023.10383734},
abstract={Learning-based controllers have demonstrated superior performance compared to classical controllers in various tasks. However, providing safety guarantees is not trivial. Safety, the satisfaction of state and input constraints, can be guaranteed by augmenting the learned control policy with a safety filter. Model predictive safety filters (MPSFs) are a common safety filtering approach based on model predictive control (MPC). MPSFs seek to guarantee safety while minimizing the difference between the proposed and applied inputs in the immediate next time step. This limited foresight can lead to jerky motions and undesired oscillations close to constraint boundaries, known as chattering. In this paper, we reduce chattering by considering input corrections over a longer horizon. Under the assumption of bounded model uncertainties, we prove recursive feasibility using techniques from robust MPC. We verified the proposed approach in both extensive simulation and quadrotor experiments. In experiments with a Crazyflie 2.0 drone, we show that, in addition to preserving the desired safety guarantees, the proposed MPSF reduces chattering by more than a factor of 4 compared to previous MPSF formulations.}
}

Does unpredictability influence driving behavior?
S. Samavi, F. Shkurti, and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023, p. 1720–1727.

In this paper, we investigate the effect of the unpredictability of surrounding cars on an ego-car performing a driving maneuver. We use Maximum Entropy Inverse Reinforcement Learning to model reward functions for an ego-car conducting a lane change in a highway setting. We define a new feature based on the unpredictability of surrounding cars and use it in the reward function. We learn two reward functions from human data: a baseline and one that incorporates our defined unpredictability feature, then compare their performance with a quantitative and qualitative evaluation. Our evaluation demonstrates that incorporating the unpredictability feature leads to a better fit of human-generated test data. These results encourage further investigation of the effect of unpredictability on driving behavior.

@inproceedings{samavi-iros23,
author={Sepehr Samavi and Florian Shkurti and Angela P. Schoellig},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
title={Does Unpredictability Influence Driving Behavior?},
year={2023},
pages={1720--1727},
doi={10.1109/IROS55552.2023.10342534},
abstract={In this paper, we investigate the effect of the unpredictability of surrounding cars on an ego-car performing a driving maneuver. We use Maximum Entropy Inverse Reinforcement Learning to model reward functions for an ego-car conducting a lane change in a highway setting. We define a new feature based on the unpredictability of surrounding cars and use it in the reward function. We learn two reward functions from human data: a baseline and one that incorporates our defined unpredictability feature, then compare their performance with a quantitative and qualitative evaluation. Our evaluation demonstrates that incorporating the unpredictability feature leads to a better fit of human-generated test data. These results encourage further investigation of the effect of unpredictability on driving behavior.}
}

Multi-view keypoints for reliable 6D object pose estimation
A. Li and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2023, p. 6988–6994.

6D object pose estimation is a fundamental component in robotics enabling efficient interaction with the environment. 6D pose estimation is particularly challenging in bin-picking applications, where many objects are low-feature and reflective, and self-occlusion between objects of the same type is common. We propose a novel multi-view approach leveraging known camera transformations from an eye-in-hand setup to combine heatmap and keypoint estimates into a probability density map over 3D space. The result is a robust approach that is scalable in the number of views. It relies on a confidence score composed of keypoint probabilities and point-cloud alignment error, which allows reliable rejection of false positives. We demonstrate an average pose estimation error of approximately 0.5 mm and 2 degrees across a variety of difficult low-feature and reflective objects in the ROBI dataset, while also surpassing the state-of-the-art correct detection rate, measured using the 10% object diameter threshold on ADD error.

@inproceedings{li-icra23,
author={Alan Li and Angela P. Schoellig},
booktitle={{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={Multi-View Keypoints for Reliable {6D} Object Pose Estimation},
year={2023},
pages={6988--6994},
doi={10.1109/ICRA48891.2023.10160354},
abstract={6D object pose estimation is a fundamental component in robotics enabling efficient interaction with the environment. 6D pose estimation is particularly challenging in bin-picking applications, where many objects are low-feature and reflective, and self-occlusion between objects of the same type is common. We propose a novel multi-view approach leveraging known camera transformations from an eye-in-hand setup to combine heatmap and keypoint estimates into a probability density map over 3D space. The result is a robust approach that is scalable in the number of views. It relies on a confidence score composed of keypoint probabilities and point-cloud alignment error, which allows reliable rejection of false positives. We demonstrate an average pose estimation error of approximately 0.5 mm and 2 degrees across a variety of difficult low-feature and reflective objects in the ROBI dataset, while also surpassing the state-of-the-art correct detection rate, measured using the 10\% object diameter threshold on ADD error.}
}

Uncertainty-aware Gaussian mixture model for UWB time difference of arrival localization in cluttered environments
W. Zhao, A. Goudar, M. Tang, X. Qiao, and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023, p. 5266–5273.

Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization has emerged as a low-cost and scalable indoor positioning solution. However, in cluttered environments, the performance of UWB TDOA-based localization deteriorates due to the biased and non-Gaussian noise distributions induced by obstacles. In this work, we present a bi-level optimization-based joint localization and noise model learning algorithm to address this problem. In particular, we use a Gaussian mixture model (GMM) to approximate the measurement noise distribution. We explicitly incorporate the estimated state’s uncertainty into the GMM noise model learning, referred to as uncertainty-aware GMM, to improve both noise modeling and localization performance. We first evaluate the GMM noise model learning and localization performance in numerous simulation scenarios. We then demonstrate the effectiveness of our algorithm in extensive real-world experiments using two different cluttered environments. We show that our algorithm provides accurate position estimates with low-cost UWB sensors, no prior knowledge about the obstacles in the space, and a significant amount of UWB radios occluded.

@INPROCEEDINGS{zhao-iros23,
author={Wenda Zhao and Abhishek Goudar and Mingliang Tang and Xinyuan Qiao and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
title={Uncertainty-Aware {Gaussian} Mixture Model for {UWB} Time Difference of Arrival Localization in Cluttered Environments},
year={2023},
pages={5266--5273},
doi={10.1109/IROS55552.2023.10342365},
urllink={https://ieeexplore.ieee.org/document/10342365},
abstract = {Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization has emerged as a low-cost and scalable indoor positioning solution. However, in cluttered environments, the performance of UWB TDOA-based localization deteriorates due to the biased and non-Gaussian noise distributions induced by obstacles. In this work, we present a bi-level optimization-based joint localization and noise model learning algorithm to address this problem. In particular, we use a Gaussian mixture model (GMM) to approximate the measurement noise distribution. We explicitly incorporate the estimated state’s uncertainty into the GMM noise model learning, referred to as uncertainty-aware GMM, to improve both noise modeling and localization performance. We first evaluate the GMM noise model learning and localization performance in numerous simulation scenarios. We then demonstrate the effectiveness of our algorithm in extensive real-world experiments using two different cluttered environments. We show that our algorithm provides accurate position estimates with low-cost UWB sensors, no prior knowledge about the obstacles in the space, and a significant amount of UWB radios occluded.}
}

AMSwarm: an alternating minimization approach for safe motion planning of quadrotor swarms in cluttered environments
V. K. Adajania, S. Zhou, A. K. Singh, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2023, p. 1421–1427.

This paper presents a scalable online algorithm to generate safe and kinematically feasible trajectories for quadrotor swarms. Existing approaches rely on linearizing Euclidean distance-based collision constraints and on axis-wise decoupling of kinematic constraints to reduce the trajectory optimization problem for each quadrotor to a quadratic program (QP). This conservative approximation often fails to find a solution in cluttered environments. We present a novel alternative that handles collision constraints without linearization and kinematic constraints in their quadratic form while still retaining the QP form. We achieve this by reformulating the constraints in a polar form and applying an Alternating Minimization algorithm to the resulting problem. Through extensive simulation results, we demonstrate that, as compared to Sequential Convex Programming (SCP) baselines, our approach achieves on average a 72% improvement in success rate, a 36% reduction in mission time, and a 42 times faster per-agent computation time. We also show that collision constraints derived from discrete-time barrier functions (BF) can be incorporated, leading to different safety behaviours without significant computational overhead. Moreover, our optimizer outperforms the state-of-the-art optimal control solver ACADO in handling BF constraints with a 31 times faster per-agent computation time and a 44% reduction in mission time on average. We experimentally validated our approach on a Crazyflie quadrotor swarm of up to 12 quadrotors. The code with supplementary material and video are released for reference.

@inproceedings{adajania-icra23,
author={Vivek K. Adajania and Siqi Zhou and Arun Kumar Singh and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
title={{AMSwarm}: An Alternating Minimization Approach for Safe Motion Planning of Quadrotor Swarms in Cluttered Environments},
year={2023},
pages={1421--1427},
doi={10.1109/ICRA48891.2023.10161063},
urlvideo={http://tiny.cc/AMSwarmVideo},
urlcode={https://github.com/utiasDSL/AMSwarm},
abstract = {This paper presents a scalable online algorithm to generate safe and kinematically feasible trajectories for quadrotor swarms. Existing approaches rely on linearizing Euclidean distance-based collision constraints and on axis-wise decoupling of kinematic constraints to reduce the trajectory optimization problem for each quadrotor to a quadratic program (QP). This conservative approximation often fails to find a solution in cluttered environments. We present a novel alternative that handles collision constraints without linearization and kinematic constraints in their quadratic form while still retaining the QP form. We achieve this by reformulating the constraints in a polar form and applying an Alternating Minimization algorithm to the resulting problem. Through extensive simulation results, we demonstrate that, as compared to Sequential Convex Programming (SCP) baselines, our approach achieves on average a 72\% improvement in success rate, a 36\% reduction in mission time, and a 42 times faster per-agent computation time. We also show that collision constraints derived from discrete-time barrier functions (BF) can be incorporated, leading to different safety behaviours without significant computational overhead. Moreover, our optimizer outperforms the state-of-the-art optimal control solver ACADO in handling BF constraints with a 31 times faster per-agent computation time and a 44\% reduction in mission time on average. We experimentally validated our approach on a Crazyflie quadrotor swarm of up to 12 quadrotors. The code with supplementary material and video are released for reference.}
}

Continuous-time range-only pose estimation
A. Goudar, T. D. Barfoot, and A. P. Schoellig
in Proc. of the Conference on Robots and Vision (CRV), 2023, p. 29–36.

Range-only (RO) localization involves determining the position of a mobile robot by measuring the distance to specific anchors. RO localization is challenging since the measurements are low-dimensional and a single range sensor does not have enough information to estimate the full pose of the robot. As such, range sensors are typically coupled with other sensing modalities such as wheel encoders or inertial measurement units (IMUs) to estimate the full pose. In this work, we propose a continuous-time Gaussian process (GP)-based trajectory estimation method to estimate the full pose of a robot using only range measurements from multiple range sensors. Results from simulation and real experiments show that our proposed method, using off-the-shelf range sensors, is able to achieve comparable performance and in some cases outperform alternative state-of-the-art sensor-fusion methods that use additional sensing modalities.

@INPROCEEDINGS{goudar-crv23,
author={Abhishek Goudar and Timothy D. Barfoot and Angela P. Schoellig},
booktitle = {{Proc. of the Conference on Robots and Vision (CRV)}},
title={Continuous-Time Range-Only Pose Estimation},
year={2023},
pages={29--36},
doi={10.1109/CRV60082.2023.00012},
urllink = {https://arxiv.org/abs/2304.09043},
abstract = {Range-only (RO) localization involves determining the position of a mobile robot by measuring the distance to specific anchors. RO localization is challenging since the measurements are low-dimensional and a single range sensor does not have enough information to estimate the full pose of the robot. As such, range sensors are typically coupled with other sensing modalities such as wheel encoders or inertial measurement units (IMUs) to estimate the full pose. In this work, we propose a continuous-time Gaussian process (GP)-based trajectory estimation method to estimate the full pose of a robot using only range measurements from multiple range sensors. Results from simulation and real experiments show that our proposed method, using off-the-shelf range sensors, is able to achieve comparable performance and in some cases outperform alternative state-of-the-art sensor-fusion methods that use additional sensing modalities.}
}

Swarm-GPT: combining large language models with safe motion planning for robot choreography design
A. Jiao, T. P. Patel, S. Khurana, A. Korol, L. Brunke, V. K. Adajania, U. Culha, S. Zhou, and A. P. Schoellig
Extended Abstract in the 6th Robot Learning Workshop at the Conference on Neural Information Processing Systems (NeurIPS), 2023.

This paper presents Swarm-GPT, a system that integrates large language models (LLMs) with safe swarm motion planning – offering an automated and novel approach to deployable drone swarm choreography. Swarm-GPT enables users to automatically generate synchronized drone performances through natural language instructions. With an emphasis on safety and creativity, Swarm-GPT addresses a critical gap in the field of drone choreography by integrating the creative power of generative models with the effectiveness and safety of model-based planning algorithms. This goal is achieved by prompting the LLM to generate a unique set of waypoints based on extracted audio data. A trajectory planner processes these waypoints to guarantee collision-free and feasible motion. Results can be viewed in simulation prior to execution and modified through dynamic re-prompting. Sim-to-real transfer experiments demonstrate Swarm-GPT’s ability to accurately replicate simulated drone trajectories, with a mean sim-to-real root mean square error (RMSE) of 28.7 mm. To date, Swarm-GPT has been successfully showcased at three live events, exemplifying safe real-world deployment of pre-trained models.

@MISC{jiao-neurips23,
author = {Aoran Jiao and Tanmay P. Patel and Sanjmi Khurana and Anna-Mariya Korol and Lukas Brunke and Vivek K. Adajania and Utku Culha and Siqi Zhou and Angela P. Schoellig},
title = {{Swarm-GPT}: Combining Large Language Models with Safe Motion Planning for Robot Choreography Design},
year = {2023},
howpublished = {Extended Abstract in the 6th Robot Learning Workshop at the Conference on Neural Information Processing Systems (NeurIPS)},
urllink = {https://arxiv.org/abs/2312.01059},
abstract = {This paper presents Swarm-GPT, a system that integrates large language models (LLMs) with safe swarm motion planning - offering an automated and novel approach to deployable drone swarm choreography. Swarm-GPT enables users to automatically generate synchronized drone performances through natural language instructions. With an emphasis on safety and creativity, Swarm-GPT addresses a critical gap in the field of drone choreography by integrating the creative power of generative models with the effectiveness and safety of model-based planning algorithms. This goal is achieved by prompting the LLM to generate a unique set of waypoints based on extracted audio data. A trajectory planner processes these waypoints to guarantee collision-free and feasible motion. Results can be viewed in simulation prior to execution and modified through dynamic re-prompting. Sim-to-real transfer experiments demonstrate Swarm-GPT's ability to accurately replicate simulated drone trajectories, with a mean sim-to-real root mean square error (RMSE) of 28.7 mm. To date, Swarm-GPT has been successfully showcased at three live events, exemplifying safe real-world deployment of pre-trained models.},
}

Robust single-point pushing with force feedback
A. Heins and A. P. Schoellig
Short Paper in the Embracing Contacts Workshop at the IEEE International Conference on Robotics and Automation (ICRA), 2023.

We present the first controller for quasistatic robotic planar pushing with single-point contact using only force feedback. We consider a mobile robot equipped with a force-torque sensor to measure the force at the contact point with the pushed object (the “slider”). The parameters of the slider are not known to the controller, nor is feedback on the slider’s pose. We assume that the global position of the contact point is always known and that the approximate initial position of the slider is provided. We focus specifically on the case when it is desired to push the slider along a straight line. Simulations and real-world experiments show that our controller yields stable pushes that are robust to a wide range of slider parameters and state perturbations.

@MISC{heins-icra23,
author = {Adam Heins and Angela P. Schoellig},
title = {Robust Single-Point Pushing with Force Feedback},
year = {2023},
howpublished = {Short Paper in the Embracing Contacts Workshop at the IEEE International Conference on Robotics and Automation (ICRA)},
urllink = {https://arxiv.org/abs/2305.11048},
abstract = {We present the first controller for quasistatic robotic planar pushing with single-point contact using only force feedback. We consider a mobile robot equipped with a force-torque sensor to measure the force at the contact point with the pushed object (the "slider"). The parameters of the slider are not known to the controller, nor is feedback on the slider's pose. We assume that the global position of the contact point is always known and that the approximate initial position of the slider is provided. We focus specifically on the case when it is desired to push the slider along a straight line. Simulations and real-world experiments show that our controller yields stable pushes that are robust to a wide range of slider parameters and state perturbations.},
}

2022

Safe-control-gym: a unified benchmark suite for safe learning-based control and reinforcement learning in robotics
Z. Yuan, A. W. Hall, S. Zhou, L. Brunke, M. Greeff, J. Panerati, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 7, iss. 4, p. 11142–11149, 2022.

In recent years, both reinforcement learning and learning-based control—as well as the study of their safety, which is crucial for deployment in real-world robots—have gained significant traction. However, to adequately gauge the progress and applicability of new results, we need the tools to equitably compare the approaches proposed by the controls and reinforcement learning communities. Here, we propose a new open-source benchmark suite, called safe-control-gym, supporting both model-based and data-based control techniques. We provide implementations for three dynamic systems—the cart-pole, the 1D, and 2D quadrotor—and two control tasks—stabilization and trajectory tracking. We propose to extend OpenAI’s Gym API—the de facto standard in reinforcement learning research—with (i) the ability to specify (and query) symbolic dynamics and (ii) constraints, and (iii) (repeatably) inject simulated disturbances in the control inputs, state measurements, and inertial properties. To demonstrate our proposal and in an attempt to bring research communities closer together, we show how to use safe-control-gym to quantitatively compare the control performance, data efficiency, and safety of multiple approaches from the fields of traditional control, learning-based control, and reinforcement learning.

@article{yuan-ral22,
author={Zhaocong Yuan and Adam W. Hall and Siqi Zhou and Lukas Brunke and Melissa Greeff and Jacopo Panerati and Angela P. Schoellig},
title={safe-control-gym: a Unified Benchmark Suite for Safe Learning-based Control and Reinforcement Learning in Robotics},
journal = {{IEEE Robotics and Automation Letters}},
year = {2022},
volume={7},
number={4},
pages={11142--11149},
urllink = {https://ieeexplore.ieee.org/abstract/document/9849119/},
doi = {10.1109/LRA.2022.3196132},
abstract = {In recent years, both reinforcement learning and learning-based control—as well as the study of their safety, which is crucial for deployment in real-world robots—have gained significant traction. However, to adequately gauge the progress and applicability of new results, we need the tools to equitably compare the approaches proposed by the controls and reinforcement learning communities. Here, we propose a new open-source benchmark suite, called safe-control-gym, supporting both model-based and data-based control techniques. We provide implementations for three dynamic systems—the cart-pole, the 1D, and 2D quadrotor—and two control tasks—stabilization and trajectory tracking. We propose to extend OpenAI’s Gym API—the de facto standard in reinforcement learning research—with (i) the ability to specify (and query) symbolic dynamics and (ii) constraints, and (iii) (repeatably) inject simulated disturbances in the control inputs, state measurements, and inertial properties. To demonstrate our proposal and in an attempt to bring research communities closer together, we show how to use safe-control-gym to quantitatively compare the control performance, data efficiency, and safety of multiple approaches from the fields of traditional control, learning-based control, and reinforcement learning.}
}

Are we ready for radar to replace lidar in all-weather mapping and localization?
K. Burnett, Y. Wu, D. J. Yoon, A. P. Schoellig, and T. D. Barfoot
IEEE Robotics and Automation Letters, vol. 7, iss. 4, p. 10328–10335, 2022.

We present an extensive comparison between three topometric localization systems: radar-only, lidar-only, and a cross-modal radar-to-lidar system across varying seasonal and weather conditions using the Boreas dataset. Contrary to our expectations, our experiments showed that our lidar-only pipeline achieved the best localization accuracy even during a snowstorm. Our results seem to suggest that the sensitivity of lidar localization to moderate precipitation has been exaggerated in prior works. However, our radar-only pipeline was able to achieve competitive accuracy with a much smaller map. Furthermore, radar localization and radar sensors still have room to improve and may yet prove valuable in extreme weather or as a redundant backup system. Code for this project can be found at: https://github.com/utiasASRL/vtr3

@article{burnett-ral22,
author={Keenan Burnett and Yuchen Wu and David J. Yoon and Angela P. Schoellig and Timothy D. Barfoot},
title={Are We Ready for Radar to Replace Lidar in All-Weather Mapping and Localization?},
journal = {{IEEE Robotics and Automation Letters}},
year = {2022},
volume = {7},
number = {4},
pages = {10328--10335},
doi = {10.1109/LRA.2022.3192885},
urllink = {https://ieeexplore.ieee.org/abstract/document/9835037/},
abstract = {We present an extensive comparison between three topometric localization systems: radar-only, lidar-only, and a cross-modal radar-to-lidar system across varying seasonal and weather conditions using the Boreas dataset. Contrary to our expectations, our experiments showed that our lidar-only pipeline achieved the best localization accuracy even during a snowstorm. Our results seem to suggest that the sensitivity of lidar localization to moderate precipitation has been exaggerated in prior works. However, our radar-only pipeline was able to achieve competitive accuracy with a much smaller map. Furthermore, radar localization and radar sensors still have room to improve and may yet prove valuable in extreme weather or as a redundant backup system. Code for this project can be found at: https://github.com/utiasASRL/vtr3}
}

Min-max vertex cycle covers with connectivity constraints for multi-robot patrolling
J. Scherer, A. P. Schoellig, and B. Rinner
IEEE Robotics and Automation Letters, vol. 7, iss. 4, p. 10152–10159, 2022.

We consider a multi-robot patrolling scenario with intermittent connectivity constraints, ensuring that robots’ data finally arrive at a base station. In particular, each robot traverses a closed tour periodically and meets with the robots on neighboring tours to exchange data. We model the problem as a variant of the min-max vertex cycle cover problem (MMCCP), which is the problem of covering all vertices with a given number of disjoint tours such that the longest tour length is minimal. In this work, we introduce the minimum idleness connectivity-constrained multi-robot patrolling problem, show that it is NP-hard, and model it as a mixed-integer linear program (MILP). The computational complexity of exactly solving this problem restrains practical applications, and therefore we develop approximate algorithms taking a solution for MMCCP as input. Our simulation experiments on 10 vertices and up to 3 robots compare the results of different solution approaches (including solving the MILP formulation) and show that our greedy algorithm can obtain an objective value close to the one of the MILP formulations but requires much less computation time. Experiments on instances with up to 100 vertices and 10 robots indicate that the greedy approximation algorithm tries to keep the length of the longest tour small by extending smaller tours for data exchange.

@article{scherer-ral22,
author={J{\"u}rgen Scherer and Angela P. Schoellig and Bernhard Rinner},
title={Min-Max Vertex Cycle Covers With Connectivity Constraints for Multi-Robot Patrolling},
journal = {{IEEE Robotics and Automation Letters}},
year = {2022},
volume = {7},
number = {4},
pages = {10152--10159},
doi = {10.1109/LRA.2022.3193242},
urllink = {https://ieeexplore.ieee.org/abstract/document/9837406/},
abstract = {We consider a multi-robot patrolling scenario with intermittent connectivity constraints, ensuring that robots' data finally arrive at a base station. In particular, each robot traverses a closed tour periodically and meets with the robots on neighboring tours to exchange data. We model the problem as a variant of the min-max vertex cycle cover problem (MMCCP), which is the problem of covering all vertices with a given number of disjoint tours such that the longest tour length is minimal. In this work, we introduce the minimum idleness connectivity-constrained multi-robot patrolling problem, show that it is NP-hard, and model it as a mixed-integer linear program (MILP). The computational complexity of exactly solving this problem restrains practical applications, and therefore we develop approximate algorithms taking a solution for MMCCP as input. Our simulation experiments on 10 vertices and up to 3 robots compare the results of different solution approaches (including solving the MILP formulation) and show that our greedy algorithm can obtain an objective value close to the one of the MILP formulations but requires much less computation time. Experiments on instances with up to 100 vertices and 10 robots indicate that the greedy approximation algorithm tries to keep the length of the longest tour small by extending smaller tours for data exchange.}
}

Finding the right place: sensor placement for UWB time difference of arrival localization in cluttered indoor environments
W. Zhao, A. Goudar, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 7, iss. 3, p. 6075–6082, 2022.

Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization has recently emerged as a promising indoor positioning solution. However, in cluttered environments, both the UWB radio positions and the obstacle-induced non-line-of-sight (NLOS) measurement biases significantly impact the quality of the position estimate. Consequently, the placement of the UWB radios must be carefully designed to provide satisfactory localization accuracy for a region of interest. In this work, we propose a novel algorithm that optimizes the UWB radio positions for a pre-defined region of interest in the presence of obstacles. The mean-squared error (MSE) metric is used to formulate an optimization problem that balances the influence of the geometry of the radio positions and the NLOS effects. We further apply the proposed algorithm to compute a minimal number of UWB radios required for a desired localization accuracy and their corresponding positions. In a real-world cluttered environment, we show that the designed UWB radio placements provide 47% and 76% localization root-mean-squared error (RMSE) reduction in 2D and 3D experiments, respectively, when compared against trivial placements.

@article{zhao-ral22-arxiv,
author = {Wenda Zhao and Abhishek Goudar and Angela P. Schoellig},
title = {Finding the Right Place: Sensor Placement for {UWB} Time Difference of Arrival Localization in Cluttered Indoor Environments},
journal = {{IEEE Robotics and Automation Letters}},
year={2022},
volume={7},
number={3},
pages={6075--6082},
doi={10.1109/LRA.2022.3165181},
abstract = {Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization has recently emerged as a promising indoor positioning solution. However, in cluttered environments, both the UWB radio positions and the obstacle-induced non-line-of-sight (NLOS) measurement biases significantly impact the quality of the position estimate. Consequently, the placement of the UWB radios must be carefully designed to provide satisfactory localization accuracy for a region of interest. In this work, we propose a novel algorithm that optimizes the UWB radio positions for a pre-defined region of interest in the presence of obstacles. The mean-squared error (MSE) metric is used to formulate an optimization problem that balances the influence of the geometry of the radio positions and the NLOS effects. We further apply the proposed algorithm to compute a minimal number of UWB radios required for a desired localization accuracy and their corresponding positions. In a real-world cluttered environment, we show that the designed UWB radio placements provide 47\% and 76\% localization root-mean-squared error (RMSE) reduction in 2D and 3D experiments, respectively, when compared against trivial placements.}
}

Safe learning in robotics: from learning-based control to safe reinforcement learning
L. Brunke, M. Greeff, A. W. Hall, Z. Yuan, S. Zhou, J. Panerati, and A. P. Schoellig
Annual Review of Control, Robotics, and Autonomous Systems, vol. 5, iss. 1, 2022.

The last half decade has seen a steep rise in the number of contributions on safe learning methods for real-world robotic deployments from both the control and reinforcement learning communities. This article provides a concise but holistic review of the recent advances made in using machine learning to achieve safe decision-making under uncertainties, with a focus on unifying the language and frameworks used in control theory and reinforcement learning research. It includes learning-based control approaches that safely improve performance by learning the uncertain dynamics, reinforcement learning approaches that encourage safety or robustness, and methods that can formally certify the safety of a learned control policy. As data- and learning-based robot control methods continue to gain traction, researchers must understand when and how to best leverage them in real-world scenarios where safety is imperative, such as when operating in close proximity to humans. We highlight some of the open challenges that will drive the field of robot learning in the coming years, and emphasize the need for realistic physics-based benchmarks to facilitate fair comparisons between control and reinforcement learning approaches.

@article{dsl-annurev22,
author = {Lukas Brunke and Melissa Greeff and Adam W. Hall and Zhaocong Yuan and Siqi Zhou and Jacopo Panerati and Angela P. Schoellig},
title = {Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning},
journal = {{Annual Review of Control, Robotics, and Autonomous Systems}},
volume = {5},
number = {1},
year = {2022},
doi = {10.1146/annurev-control-042920-020211},
URL = {https://doi.org/10.1146/annurev-control-042920-020211},
abstract = {The last half decade has seen a steep rise in the number of contributions on safe learning methods for real-world robotic deployments from both the control and reinforcement learning communities. This article provides a concise but holistic review of the recent advances made in using machine learning to achieve safe decision-making under uncertainties, with a focus on unifying the language and frameworks used in control theory and reinforcement learning research. It includes learning-based control approaches that safely improve performance by learning the uncertain dynamics, reinforcement learning approaches that encourage safety or robustness, and methods that can formally certify the safety of a learned control policy. As data- and learning-based robot control methods continue to gain traction, researchers must understand when and how to best leverage them in real-world scenarios where safety is imperative, such as when operating in close proximity to humans. We highlight some of the open challenges that will drive the field of robot learning in the coming years, and emphasize the need for realistic physics-based benchmarks to facilitate fair comparisons between control and reinforcement learning approaches.}
}

Fly out the window: exploiting discrete-time flatness for fast vision-based multirotor flight
M. Greeff, S. Zhou, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 7, iss. 2, p. 5023–5030, 2022.

Recent work has demonstrated fast, agile flight using only vision as a position sensor and no GPS. Current feedback controllers for fast vision-based flight typically rely on a full-state estimate, including position, velocity and acceleration. An accurate full-state estimate is often challenging to obtain due to noisy IMU measurements, infrequent position updates from the vision system, and an imperfect motion model used to obtain high-rate state estimates required by the controller. In this work, we present an alternative control design that bypasses the need for a full-state estimate by exploiting discrete-time flatness, a structural property of the underlying vehicle dynamics. First, we show that the Euler discretization of the multirotor dynamics is discrete-time flat. This allows us to design a predictive controller using only a window of inputs and outputs, the latter consisting of position and yaw estimates. We highlight in simulation that our approach outperforms controllers that rely on an incorrect full-state estimate. We perform extensive outdoor multirotor flight experiments and demonstrate reliable vision-based navigation. In these experiments, our discrete-time flatness-based controller achieves speeds up to 10 m/s and significantly outperforms similar controllers that hinge on full-state estimation, achieving up to 80% path error reduction.

@article{greeff-ral22,
author = {Melissa Greeff and SiQi Zhou and Angela P. Schoellig},
title = {Fly Out The Window: Exploiting Discrete-Time Flatness for Fast Vision-Based Multirotor Flight},
journal = {{IEEE Robotics and Automation Letters}},
year = {2022},
volume = {7},
number = {2},
pages = {5023--5030},
doi = {10.1109/LRA.2022.3154008},
urllink = {https://ieeexplore.ieee.org/document/9720919},
abstract = {Recent work has demonstrated fast, agile flight using only vision as a position sensor and no GPS. Current feedback controllers for fast vision-based flight typically rely on a full-state estimate, including position, velocity and acceleration. An accurate full-state estimate is often challenging to obtain due to noisy IMU measurements, infrequent position updates from the vision system, and an imperfect motion model used to obtain high-rate state estimates required by the controller. In this work, we present an alternative control design that bypasses the need for a full-state estimate by exploiting discrete-time flatness, a structural property of the underlying vehicle dynamics. First, we show that the Euler discretization of the multirotor dynamics is discrete-time flat. This allows us to design a predictive controller using only a window of inputs and outputs, the latter consisting of position and yaw estimates. We highlight in simulation that our approach outperforms controllers that rely on an incorrect full-state estimate. We perform extensive outdoor multirotor flight experiments and demonstrate reliable vision-based navigation. In these experiments, our discrete-time flatness-based controller achieves speeds up to 10 m/s and significantly outperforms similar controllers that hinge on full-state estimation, achieving up to 80\% path error reduction.}
}

Bridging the model-reality gap with Lipschitz network adaptation
S. Zhou, K. Pereida, W. Zhao, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 7, iss. 1, p. 642–649, 2022.

As robots venture into the real world, they are subject to unmodeled dynamics and disturbances. Traditional model-based control approaches have been proven successful in relatively static and known operating environments. However, when an accurate model of the robot is not available, model-based design can lead to suboptimal and even unsafe behaviour. In this work, we propose a method that bridges the model-reality gap and enables the application of model-based approaches even if dynamic uncertainties are present. In particular, we present a learning-based model reference adaptation approach that makes a robot system, with possibly uncertain dynamics, behave as a predefined reference model. In turn, the reference model can be used for model-based controller design. In contrast to typical model reference adaptation control approaches, we leverage the representative power of neural networks to capture highly nonlinear dynamics uncertainties and guarantee stability by encoding a certifying Lipschitz condition in the architectural design of a special type of neural network called the Lipschitz network. Our approach applies to a general class of nonlinear control-affine systems even when our prior knowledge about the true robot system is limited. We demonstrate our approach in flying inverted pendulum experiments, where an off-the-shelf quadrotor is challenged to balance an inverted pendulum while hovering or tracking circular trajectories.

@article{zhou-ral22,
author = {Siqi Zhou and Karime Pereida and Wenda Zhao and Angela P. Schoellig},
title = {Bridging the model-reality gap with {Lipschitz} network adaptation},
journal = {{IEEE Robotics and Automation Letters}},
year = {2022},
volume = {7},
number = {1},
pages = {642--649},
doi = {10.1109/LRA.2021.3131698},
urllink = {https://arxiv.org/abs/2112.03756},
urlvideo = {http://tiny.cc/lipnet-pendulum},
abstract = {As robots venture into the real world, they are subject to unmodeled dynamics and disturbances. Traditional model-based control approaches have been proven successful in relatively static and known operating environments. However, when an accurate model of the robot is not available, model-based design can lead to suboptimal and even unsafe behaviour. In this work, we propose a method that bridges the model-reality gap and enables the application of model-based approaches even if dynamic uncertainties are present. In particular, we present a learning-based model reference adaptation approach that makes a robot system, with possibly uncertain dynamics, behave as a predefined reference model. In turn, the reference model can be used for model-based controller design. In contrast to typical model reference adaptation control approaches, we leverage the representative power of neural networks to capture highly nonlinear dynamics uncertainties and guarantee stability by encoding a certifying Lipschitz condition in the architectural design of a special type of neural network called the Lipschitz network. Our approach applies to a general class of nonlinear control-affine systems even when our prior knowledge about the true robot system is limited. We demonstrate our approach in flying inverted pendulum experiments, where an off-the-shelf quadrotor is challenged to balance an inverted pendulum while hovering or tracking circular trajectories.}
}

Tag-based visual-inertial localization of unmanned aerial vehicles in indoor construction environments using an on-manifold extended Kalman filter
N. Kayhani, W. Zhao, B. McCabe, and A. P. Schoellig
Automation in Construction, vol. 135, p. 104112, 2022.

Automated visual data collection using autonomous unmanned aerial vehicles (UAVs) can improve the accessibility and accuracy of the frequent data required for indoor construction inspections and tracking. However, robust localization, as a critical enabler for autonomy, is challenging in ever-changing, cluttered, GPS-denied indoor construction environments. Rapid alterations and repetitive low-texture areas on indoor construction sites jeopardize the reliability of typical vision-based solutions. This research proposes a tag-based visual-inertial localization method for off-the-shelf UAVs with only a camera and an inertial measurement unit (IMU). Given that tag locations are known in the BIM, the proposed method estimates the UAV’s global pose by fusing inertial data and tag measurements using an on-manifold extended Kalman filter (EKF). The root-mean-square error (RMSE) achieved in our experiments in laboratory and simulation, being as low as 2–5 cm, indicates the potential of deploying the proposed method for autonomous navigation of low-cost UAVs in indoor construction environments.

@article{kayhani-autocon22,
author = {Navid Kayhani and Wenda Zhao and Brenda McCabe and Angela P. Schoellig},
title = {Tag-based visual-inertial localization of unmanned aerial vehicles in indoor construction environments using an on-manifold extended {Kalman} filter},
journal = {{Automation in Construction}},
year = {2022},
volume = {135},
pages = {104112},
issn = {0926-5805},
doi = {10.1016/j.autcon.2021.104112},
url = {https://www.sciencedirect.com/science/article/pii/S092658052100563X},
keywords = {Indoor localization, Unmanned aerial vehicle, Extended Kalman filter, SE(3), On-manifold state estimation, Autonomous navigation, Building information model, Construction robotics, AprilTag},
abstract = {Automated visual data collection using autonomous unmanned aerial vehicles (UAVs) can improve the accessibility and accuracy of the frequent data required for indoor construction inspections and tracking. However, robust localization, as a critical enabler for autonomy, is challenging in ever-changing, cluttered, GPS-denied indoor construction environments. Rapid alterations and repetitive low-texture areas on indoor construction sites jeopardize the reliability of typical vision-based solutions. This research proposes a tag-based visual-inertial localization method for off-the-shelf UAVs with only a camera and an inertial measurement unit (IMU). Given that tag locations are known in the BIM, the proposed method estimates the UAV's global pose by fusing inertial data and tag measurements using an on-manifold extended Kalman filter (EKF). The root-mean-square error (RMSE) achieved in our experiments in laboratory and simulation, being as low as 2--5 cm, indicates the potential of deploying the proposed method for autonomous navigation of low-cost UAVs in indoor construction environments.}
}

Robust predictive output-feedback safety filter for uncertain nonlinear control systems
L. Brunke, S. Zhou, and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2022, p. 3051–3058.

@INPROCEEDINGS{brunke-cdc22,
author={Lukas Brunke and Siqi Zhou and Angela P. Schoellig},
booktitle={{Proc. of the IEEE Conference on Decision and Control (CDC)}},
title={Robust Predictive Output-Feedback Safety Filter for Uncertain Nonlinear Control Systems},
year={2022},
pages={3051--3058},
doi={10.1109/CDC51059.2022.9992834},
abstract={In real-world applications, we often require reliable decision making under dynamics uncertainties using noisy high-dimensional sensory data. Recently, we have seen an increasing number of learning-based control algorithms developed to address the challenge of decision making under dynamics uncertainties. These algorithms often make assumptions about the underlying unknown dynamics and, as a result, can provide safety guarantees. This is more challenging for other widely used learning-based decision making algorithms such as reinforcement learning. Furthermore, the majority of existing approaches assume access to state measurements, which can be restrictive in practice. In this paper, inspired by the literature on safety filters and robust output-feedback control, we present a robust predictive output-feedback safety filter (RPOF-SF) framework that provides safety certification to an arbitrary controller applied to an uncertain nonlinear control system. The proposed RPOF-SF combines a robustly stable observer that estimates the system state from noisy measurement data and a predictive safety filter that renders an arbitrary controller safe by (possibly) minimally modifying the controller input to guarantee safety. We show in theory that the proposed RPOF-SF guarantees constraint satisfaction despite disturbances applied to the system. We demonstrate the efficacy of the proposed RPOF-SF algorithm using an uncertain mass-spring-damper system.}
}

Gaussian variational inference with covariance constraints applied to range-only localization
A. Goudar, W. Zhao, T. D. Barfoot, and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022, p. 2872–2879.

Accurate and reliable state estimation is becoming increasingly important as robots venture into the real world. Gaussian variational inference (GVI) is a promising alternative for nonlinear state estimation, which estimates a full probability density for the posterior instead of a point estimate as in maximum a posteriori (MAP)-based approaches. GVI works by optimizing for the parameters of a multivariate Gaussian (MVG) that best agree with the observed data. However, such an optimization procedure must ensure the parameter constraints of a MVG are satisfied; in particular, the inverse covariance matrix must be positive definite. In this work, we propose a tractable algorithm for performing state estimation using GVI that guarantees that the inverse covariance matrix remains positive definite and is well-conditioned throughout the optimization procedure. We evaluate our method extensively in both simulation and real-world experiments for range-only localization. Our results show GVI is consistent on this problem, while MAP is over-confident.

@INPROCEEDINGS{goudar-iros22,
author={Abhishek Goudar and Wenda Zhao and Timothy D. Barfoot and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
title={Gaussian Variational Inference with Covariance Constraints Applied to Range-only Localization},
year={2022},
pages={2872--2879},
doi={10.1109/IROS47612.2022.9981520},
abstract = {Accurate and reliable state estimation is becoming increasingly important as robots venture into the real world. Gaussian variational inference (GVI) is a promising alternative for nonlinear state estimation, which estimates a full probability density for the posterior instead of a point estimate as in maximum a posteriori (MAP)-based approaches. GVI works by optimizing for the parameters of a multivariate Gaussian (MVG) that best agree with the observed data. However, such an optimization procedure must ensure the parameter constraints of a MVG are satisfied; in particular, the inverse covariance matrix must be positive definite. In this work, we propose a tractable algorithm for performing state estimation using GVI that guarantees that the inverse covariance matrix remains positive definite and is well-conditioned throughout the optimization procedure. We evaluate our method extensively in both simulation and real-world experiments for range-only localization. Our results show GVI is consistent on this problem, while MAP is over-confident.}
}

POCD: probabilistic object-level change detection and volumetric mapping in semi-static scenes
J. Qian, V. Chatrath, J. Yang, J. Servos, A. Schoellig, and S. L. Waslander
in Proc. of Robotics: Science and Systems (RSS), 2022.

Maintaining an up-to-date map to reflect recent changes in the scene is very important, particularly in situations involving repeated traversals by a robot operating in an environment over an extended period. Undetected changes may cause a deterioration in map quality, leading to poor localization, inefficient operations, and lost robots. Volumetric methods, such as truncated signed distance functions (TSDFs), have quickly gained traction due to their real-time production of a dense and detailed map, though map updating in scenes that change over time remains a challenge. We propose a framework that introduces a novel probabilistic object state representation to track object pose changes in semi-static scenes. The representation jointly models a stationarity score and a TSDF change measure for each object. A Bayesian update rule that incorporates both geometric and semantic information is derived to achieve consistent online map maintenance. To extensively evaluate our approach alongside the state-of-the-art, we release a novel real-world dataset in a warehouse environment. We also evaluate on the public ToyCar dataset. Our method outperforms state-of-the-art methods on the reconstruction quality of semi-static environments.

@INPROCEEDINGS{qian-rss22,
author={Jingxing Qian and Veronica Chatrath and Jun Yang and James Servos and Angela Schoellig and Steven L. Waslander},
booktitle={{Proc. of Robotics: Science and Systems (RSS)}},
title={{POCD}: Probabilistic Object-Level Change Detection and Volumetric Mapping in Semi-Static Scenes},
year={2022},
doi={10.15607/RSS.2022.XVIII.013},
abstract={Maintaining an up-to-date map to reflect recent changes in the scene is very important, particularly in situations involving repeated traversals by a robot operating in an environment over an extended period. Undetected changes may cause a deterioration in map quality, leading to poor localization, inefficient operations, and lost robots. Volumetric methods, such as truncated signed distance functions (TSDFs), have quickly gained traction due to their real-time production of a dense and detailed map, though map updating in scenes that change over time remains a challenge. We propose a framework that introduces a novel probabilistic object state representation to track object pose changes in semi-static scenes. The representation jointly models a stationarity score and a TSDF change measure for each object. A Bayesian update rule that incorporates both geometric and semantic information is derived to achieve consistent online map maintenance. To extensively evaluate our approach alongside the state-of-the-art, we release a novel real-world dataset in a warehouse environment. We also evaluate on the public ToyCar dataset. Our method outperforms state-of-the-art methods on the reconstruction quality of semi-static environments.}
}

Fusion of machine learning and MPC under uncertainty: what advances are on the horizon?
A. Mesbah, K. P. Wabersich, A. P. Schoellig, M. N. Zeilinger, S. Lucia, T. A. Badgwell, and J. A. Paulson
in Proc. of the American Control Conference (ACC), 2022, p. 342–357.

This paper provides an overview of the recent research efforts on the integration of machine learning and model predictive control under uncertainty. The paper is organized as a collection of four major categories: learning models from system data and prior knowledge; learning control policy parameters from closed-loop performance data; learning efficient approximations of iterative online optimization from policy data; and learning optimal cost-to-go representations from closed-loop performance data. In addition to reviewing the relevant literature, the paper also offers perspectives for future research in each of these areas.

@INPROCEEDINGS{mesbah-acc22,
author={Ali Mesbah and Kim P. Wabersich and Angela P. Schoellig and Melanie N. Zeilinger and Sergio Lucia and Thomas A. Badgwell and Joel A. Paulson},
booktitle={{Proc. of the American Control Conference (ACC)}},
title={Fusion of Machine Learning and {MPC} under Uncertainty: What Advances Are on the Horizon?},
year={2022},
pages={342--357},
doi={10.23919/ACC53348.2022.9867643},
abstract = {This paper provides an overview of the recent research efforts on the integration of machine learning and model predictive control under uncertainty. The paper is organized as a collection of four major categories: learning models from system data and prior knowledge; learning control policy parameters from closed-loop performance data; learning efficient approximations of iterative online optimization from policy data; and learning optimal cost-to-go representations from closed-loop performance data. In addition to reviewing the relevant literature, the paper also offers perspectives for future research in each of these areas.}
}

Barrier Bayesian linear regression: online learning of control barrier conditions for safety-critical control of uncertain systems
L. Brunke, S. Zhou, and A. P. Schoellig
in Proc. of the Learning for Dynamics and Control Conference (L4DC), 2022, p. 881–892.

In this work, we consider the problem of designing a safety filter for a nonlinear uncertain control system. Our goal is to augment an arbitrary controller with a safety filter such that the overall closed-loop system is guaranteed to stay within a given state constraint set, referred to as being safe. For systems with known dynamics, control barrier functions (CBFs) provide a scalar condition for determining if a system is safe. For uncertain systems, robust or adaptive CBF certification approaches have been proposed. However, these approaches can be conservative or require the system to have a particular parametric structure. For more generic uncertain systems, machine learning approaches have been used to approximate the CBF condition. These works typically assume that the learning module is sufficiently trained prior to deployment. Safety during learning is not guaranteed. We propose a barrier Bayesian linear regression (BBLR) approach that guarantees safe online learning of the CBF condition for the true, uncertain system. We assume that the error between the nominal system and the true system is bounded and exploit the structure of the CBF condition. We show that our approach can safely expand the set of certifiable control inputs despite system and learning uncertainties. The effectiveness of our approach is demonstrated in simulation using a two-dimensional pendulum stabilization task.

@INPROCEEDINGS{brunke-l4dc22,
author={Lukas Brunke and Siqi Zhou and Angela P. Schoellig},
booktitle ={{Proc. of the Learning for Dynamics and Control Conference (L4DC)}},
title ={Barrier {Bayesian} Linear Regression: Online Learning of Control Barrier Conditions for Safety-Critical Control of Uncertain Systems},
year={2022},
pages ={881--892},
urllink = {https://proceedings.mlr.press/v168/brunke22a.html},
abstract = {In this work, we consider the problem of designing a safety filter for a nonlinear uncertain control system. Our goal is to augment an arbitrary controller with a safety filter such that the overall closed-loop system is guaranteed to stay within a given state constraint set, referred to as being safe. For systems with known dynamics, control barrier functions (CBFs) provide a scalar condition for determining if a system is safe. For uncertain systems, robust or adaptive CBF certification approaches have been proposed. However, these approaches can be conservative or require the system to have a particular parametric structure. For more generic uncertain systems, machine learning approaches have been used to approximate the CBF condition. These works typically assume that the learning module is sufficiently trained prior to deployment. Safety during learning is not guaranteed. We propose a barrier Bayesian linear regression (BBLR) approach that guarantees safe online learning of the CBF condition for the true, uncertain system. We assume that the error between the nominal system and the true system is bounded and exploit the structure of the CBF condition. We show that our approach can safely expand the set of certifiable control inputs despite system and learning uncertainties. The effectiveness of our approach is demonstrated in simulation using a two-dimensional pendulum stabilization task.}
}

Stochastic modeling of tag installation error for robust on-manifold tag-based visual-inertial localization
N. Kayhani, B. McCabe, and A. P. Schoellig
in Proc. of the Canadian Society of Civil Engineering Annual Conference (CSCE), 2022. Accepted.

Autonomous mobile robots, including unmanned aerial vehicles (UAVs), have received significant attention for their applications in construction. These platforms have great potential to automate and enhance the quality and frequency of the required data for many tasks such as construction schedule updating, inspections, and monitoring. Robust localization is a critical enabler for reliable deployments of autonomous robotic platforms. Automated robotic solutions rely mainly on the global positioning system (GPS) for outdoor localization. However, GPS signals are denied indoors, and pre-built environment maps are often used for indoor localization. This entails generating high-quality maps by teleoperating the mobile robot in the environment. Not only is this approach time-consuming and tedious, but it also is unreliable in indoor construction settings. Layout changes with construction progress, requiring frequent mapping sessions to support autonomous missions. Moreover, the effectiveness of vision-based solutions relying on visual features is highly impacted in low-texture and repetitive areas on site. To address these challenges, we previously proposed a low-cost, lightweight tag-based visual-inertial localization method using AprilTags. Tags, in this method, are paper-printable landmarks with known sizes and locations, representing the environment’s quasi-map. Since tag placement/replacement is a manual process, it is subject to human errors. In this work, we study the impact of human error in the manual tag installation process and propose a stochastic approach to account for this uncertainty using Lie group theory. Employing Monte Carlo simulation, we experimentally show that the proposed stochastic model incorporated in our on-manifold formulation improves the robustness and accuracy of tag-based localization against inevitable imperfections in manual tag installation on site.

@INPROCEEDINGS{kayhani-csce22,
author={Navid Kayhani and Brenda McCabe and Angela P. Schoellig},
booktitle={{Proc. of the Canadian Society of Civil Engineering Annual Conference (CSCE)}},
title={Stochastic modeling of tag installation error for robust on-manifold tag-based visual-inertial localization},
year={2022},
note={Accepted},
urlvideo = {https://youtu.be/2frTKgOwbf4},
abstract = {Autonomous mobile robots, including unmanned aerial vehicles (UAVs), have received significant attention for their applications in construction. These platforms have great potential to automate and enhance the quality and frequency of the required data for many tasks such as construction schedule updating, inspections, and monitoring. Robust localization is a critical enabler for reliable deployments of autonomous robotic platforms. Automated robotic solutions rely mainly on the global positioning system (GPS) for outdoor localization. However, GPS signals are denied indoors, and pre-built environment maps are often used for indoor localization. This entails generating high-quality maps by teleoperating the mobile robot in the environment. Not only is this approach time-consuming and tedious, but it also is unreliable in indoor construction settings. Layout changes with construction progress, requiring frequent mapping sessions to support autonomous missions. Moreover, the effectiveness of vision-based solutions relying on visual features is highly impacted in low-texture and repetitive areas on site. To address these challenges, we previously proposed a low-cost, lightweight tag-based visual-inertial localization method using AprilTags. Tags, in this method, are paper-printable landmarks with known sizes and locations, representing the environment’s quasi-map. Since tag placement/replacement is a manual process, it is subject to human errors. In this work, we study the impact of human error in the manual tag installation process and propose a stochastic approach to account for this uncertainty using Lie group theory. Employing Monte Carlo simulation, we experimentally show that the proposed stochastic model incorporated in our on-manifold formulation improves the robustness and accuracy of tag-based localization against inevitable imperfections in manual tag installation on site.}
}

2021

Zeus: a system description of the two-time winner of the collegiate SAE autodrive competition
K. Burnett, J. Qian, X. Du, L. Liu, D. J. Yoon, T. Shen, S. Sun, S. Samavi, M. J. Sorocky, M. Bianchi, K. Zhang, A. Arkhangorodsky, Q. Sykora, S. Lu, Y. Huang, A. P. Schoellig, and T. D. Barfoot
Journal of Field Robotics, vol. 38, iss. 1, pp. 139–166, 2021.

The SAE AutoDrive Challenge is a 3-year collegiate competition to develop a self-driving car by 2020. The second year of the competition was held in June 2019 at MCity, a mock town built for self-driving car testing at the University of Michigan. Teams were required to autonomously navigate a series of intersections while handling pedestrians, traffic lights, and traffic signs. Zeus is aUToronto’s winning entry in the AutoDrive Challenge. This article describes the system design and development of Zeus as well as many of the lessons learned along the way. This includes details on the team’s organizational structure, sensor suite, software components, and performance at the Year 2 competition. With a team of mostly undergraduates and minimal resources, aUToronto has made progress toward a functioning self-driving vehicle, in just 2 years. This article may prove valuable to researchers looking to develop their own self-driving platform.

@ARTICLE{burnett-jfr21,
author = {Keenan Burnett and Jingxing Qian and Xintong Du and Linqiao Liu and David J. Yoon and Tianchang Shen and Susan Sun and Sepehr Samavi and Michael J. Sorocky and Mollie Bianchi and Kaicheng Zhang and Arkady Arkhangorodsky and Quinlan Sykora and Shichen Lu and Yizhou Huang and Angela P. Schoellig and Timothy D. Barfoot},
title = {Zeus: A system description of the two-time winner of the collegiate {SAE} autodrive competition},
year = {2021},
journal = {{Journal of Field Robotics}},
volume = {38},
number = {1},
pages = {139--166},
doi = {10.1002/rob.21958},
abstract = {The SAE AutoDrive Challenge is a 3-year collegiate competition to develop a self-driving car by 2020. The second year of the competition was held in June 2019 at MCity, a mock town built for self-driving car testing at the University of Michigan. Teams were required to autonomously navigate a series of intersections while handling pedestrians, traffic lights, and traffic signs. Zeus is aUToronto's winning entry in the AutoDrive Challenge. This article describes the system design and development of Zeus as well as many of the lessons learned along the way. This includes details on the team's organizational structure, sensor suite, software components, and performance at the Year 2 competition. With a team of mostly undergraduates and minimal resources, aUToronto has made progress toward a functioning self-driving vehicle, in just 2 years. This article may prove valuable to researchers looking to develop their own self-driving platform.}
}

Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics
F. Berkenkamp, A. Krause, and A. P. Schoellig
Machine Learning, 2021.

Selecting the right tuning parameters for algorithms is a prevalent problem in machine learning that can significantly affect the performance of algorithms. Data-efficient optimization algorithms, such as Bayesian optimization, have been used to automate this process. During experiments on real-world systems such as robotic platforms these methods can evaluate unsafe parameters that lead to safety-critical system failures and can destroy the system. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is often not desirable in practice, since they are often opposing objectives. In this paper, we present a generalized algorithm that allows for multiple safety constraints separate from the objective. Given an initial set of safe parameters, the algorithm maximizes performance but only evaluates parameters that satisfy safety for all constraints with high probability. To this end, it carefully explores the parameter space by exploiting regularity assumptions in terms of a Gaussian process prior. Moreover, we show how context variables can be used to safely transfer knowledge to new situations and tasks. We provide a theoretical analysis and demonstrate that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters in experiments on a quadrotor vehicle.

@article{berkenkamp-ml21,
author = {Felix Berkenkamp and Andreas Krause and Angela P. Schoellig},
title = {Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics},
journal = {{Machine Learning}},
year = {2021},
doi = {10.1007/s10994-021-06019-1},
abstract = {Selecting the right tuning parameters for algorithms is a prevalent problem in machine learning that can significantly affect the performance of algorithms. Data-efficient optimization algorithms, such as Bayesian optimization, have been used to automate this process. During experiments on real-world systems such as robotic platforms these methods can evaluate unsafe parameters that lead to safety-critical system failures and can destroy the system. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is often not desirable in practice, since they are often opposing objectives. In this paper, we present a generalized algorithm that allows for multiple safety constraints separate from the objective. Given an initial set of safe parameters, the algorithm maximizes performance but only evaluates parameters that satisfy safety for all constraints with high probability. To this end, it carefully explores the parameter space by exploiting regularity assumptions in terms of a Gaussian process prior. Moreover, we show how context variables can be used to safely transfer knowledge to new situations and tasks. We provide a theoretical analysis and demonstrate that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters in experiments on a quadrotor vehicle.}
}

Robust adaptive model predictive control for guaranteed fast and accurate stabilization in the presence of model errors
K. Pereida, L. Brunke, and A. P. Schoellig
International Journal of Robust and Nonlinear Control, vol. 31, iss. 18, pp. 8750–8784, 2021.

Numerous control applications, including robotic systems such as unmanned aerial vehicles or assistive robots, are expected to guarantee high performance despite being deployed in unknown and dynamic environments where they are subject to disturbances, unmodeled dynamics, and parametric uncertainties. The fast feedback of adaptive controllers makes them an effective approach for compensating for disturbances and unmodeled dynamics, but adaptive controllers seldom achieve high performance, nor do they guarantee state and input constraint satisfaction. In this article we propose a robust adaptive model predictive controller for guaranteed fast and accurate stabilization in the presence of model uncertainties. The proposed approach combines robust model predictive control (RMPC) with an underlying discrete-time L1 adaptive controller. We refer to this combined controller as an RMPC-L1 controller. The adaptive controller forces the system to behave close to a linear reference model despite the presence of parametric uncertainties. However, the true dynamics of the adaptive controlled system may deviate from the linear reference model. In this work we prove that this deviation is bounded and use it as the modeling error of the linear reference model. We combine adaptive control with an RMPC that leverages the linear reference model and the modeling error. We prove stability and recursive feasibility of the proposed RMPC-L1. Furthermore, we validate the feasibility, performance, and accuracy of the proposed RMPC-L1 on a stabilization task in a numerical experiment. We demonstrate that the proposed RMPC-L1 outperforms adaptive control, robust MPC, and other baseline controllers in all metrics.

@article{pereida-ijrnc21,
author = {Karime Pereida and Lukas Brunke and Angela P. Schoellig},
title = {Robust adaptive model predictive control for guaranteed fast and accurate stabilization in the presence of model errors},
journal = {{International Journal of Robust and Nonlinear Control}},
year = {2021},
volume = {31},
number = {18},
pages = {8750--8784},
doi = {10.1002/rnc.5712},
urllink = {https://onlinelibrary.wiley.com/doi/abs/10.1002/rnc.5712},
abstract = {Numerous control applications, including robotic systems such as unmanned aerial vehicles or assistive robots, are expected to guarantee high performance despite being deployed in unknown and dynamic environments where they are subject to disturbances, unmodeled dynamics, and parametric uncertainties. The fast feedback of adaptive controllers makes them an effective approach for compensating for disturbances and unmodeled dynamics, but adaptive controllers seldom achieve high performance, nor do they guarantee state and input constraint satisfaction. In this article we propose a robust adaptive model predictive controller for guaranteed fast and accurate stabilization in the presence of model uncertainties. The proposed approach combines robust model predictive control (RMPC) with an underlying discrete-time $\mathcal{L}_1$ adaptive controller. We refer to this combined controller as an RMPC-$\mathcal{L}_1$ controller. The adaptive controller forces the system to behave close to a linear reference model despite the presence of parametric uncertainties. However, the true dynamics of the adaptive controlled system may deviate from the linear reference model. In this work we prove that this deviation is bounded and use it as the modeling error of the linear reference model. We combine adaptive control with an RMPC that leverages the linear reference model and the modeling error. We prove stability and recursive feasibility of the proposed RMPC-$\mathcal{L}_1$. Furthermore, we validate the feasibility, performance, and accuracy of the proposed RMPC-$\mathcal{L}_1$ on a stabilization task in a numerical experiment. We demonstrate that the proposed RMPC-$\mathcal{L}_1$ outperforms adaptive control, robust MPC, and other baseline controllers in all metrics.}
}

A deep learning approach for rock fragmentation analysis
T. Bamford, K. Esmaeili, and A. P. Schoellig
International Journal of Rock Mechanics and Mining Sciences, vol. 145, p. 104839, 2021.

In mining operations, blast-induced rock fragmentation affects the productivity and efficiency of downstream operations including digging, hauling, crushing, and grinding. Continuous measurement of rock fragmentation is essential for optimizing blast design. Current methods of rock fragmentation analysis rely on either physical screening of blasted rock material or image analysis of the blasted muckpiles; both are time-consuming. This study aims to present and evaluate the measurement of rock fragmentation using deep learning strategies. A deep neural network (DNN) architecture was used to predict characteristic sizes of rock fragments from a 2D image of a muckpile. The data set used for training the DNN model is composed of 61,853 labelled images of blasted rock fragments. An exclusive data set of 1,263 labelled images was used to test the DNN model. The percent error for coarse characteristic size prediction ranges within ±25% when evaluated using the test set. Model validation on orthomosaics for two muckpiles shows that the deep learning method achieves a good accuracy (lower mean percent error) compared to manual image labelling. Validation on screened piles shows that the DNN model prediction is similar to manual labelling accuracy when compared with sieving analysis.

@article{bamford-ijrmms21,
title = {A deep learning approach for rock fragmentation analysis},
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
journal = {{International Journal of Rock Mechanics and Mining Sciences}},
year = {2021},
volume = {145},
doi = {10.1016/j.ijrmms.2021.104839},
pages = {104839},
urllink = {https://www.sciencedirect.com/science/article/pii/S1365160921002239},
abstract = {In mining operations, blast-induced rock fragmentation affects the productivity and efficiency of downstream operations including digging, hauling, crushing, and grinding. Continuous measurement of rock fragmentation is essential for optimizing blast design. Current methods of rock fragmentation analysis rely on either physical screening of blasted rock material or image analysis of the blasted muckpiles; both are time-consuming. This study aims to present and evaluate the measurement of rock fragmentation using deep learning strategies. A deep neural network (DNN) architecture was used to predict characteristic sizes of rock fragments from a 2D image of a muckpile. The data set used for training the DNN model is composed of 61,853 labelled images of blasted rock fragments. An exclusive data set of 1,263 labelled images was used to test the DNN model. The percent error for coarse characteristic size prediction ranges within $\pm$25\% when evaluated using the test set. Model validation on orthomosaics for two muckpiles shows that the deep learning method achieves a good accuracy (lower mean percent error) compared to manual image labelling. Validation on screened piles shows that the DNN model prediction is similar to manual labelling accuracy when compared with sieving analysis.},
}

Meta learning with paired forward and inverse models for efficient receding horizon control
C. D. McKinnon and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 6, iss. 2, pp. 3240–3247, 2021.

This paper presents a model-learning method for Stochastic Model Predictive Control (SMPC) that is both accurate and computationally efficient. We assume that the control input affects the robot dynamics through an unknown (but invertible) nonlinear function. By learning this unknown function and its inverse, we can use the value of the function as a new control input (which we call the input feature) that is optimised by SMPC in place of the original control input. This removes the need to evaluate a function approximator for the unknown function during optimisation in SMPC (where it would be evaluated many times), reducing the computational cost. The learned inverse is evaluated only once at each sampling time to convert the optimal input feature from SMPC to a control input to apply to the system. We assume that the remaining unknown dynamics can be accurately represented as a model that is linear in a set of coefficients, which enables fast adaptation to new conditions. We demonstrate our approach in experiments on a large ground robot using a stereo camera for localisation.

@article{mckinnon-ral21,
title = {Meta Learning With Paired Forward and Inverse Models for Efficient Receding Horizon Control},
author = {Christopher D. McKinnon and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2021},
volume = {6},
number = {2},
pages = {3240--3247},
doi = {10.1109/LRA.2021.3063957},
urllink = {https://ieeexplore.ieee.org/document/9369887},
abstract = {This paper presents a model-learning method for Stochastic Model Predictive Control (SMPC) that is both accurate and computationally efficient. We assume that the control input affects the robot dynamics through an unknown (but invertible) nonlinear function. By learning this unknown function and its inverse, we can use the value of the function as a new control input (which we call the input feature) that is optimised by SMPC in place of the original control input. This removes the need to evaluate a function approximator for the unknown function during optimisation in SMPC (where it would be evaluated many times), reducing the computational cost. The learned inverse is evaluated only once at each sampling time to convert the optimal input feature from SMPC to a control input to apply to the system. We assume that the remaining unknown dynamics can be accurately represented as a model that is linear in a set of coefficients, which enables fast adaptation to new conditions. We demonstrate our approach in experiments on a large ground robot using a stereo camera for localisation.}
}

Do we need to compensate for motion distortion and Doppler effects in spinning radar navigation?
K. Burnett, A. P. Schoellig, and T. D. Barfoot
IEEE Robotics and Automation Letters, vol. 6, iss. 2, pp. 771–778, 2021.

In order to tackle the challenge of unfavorable weather conditions such as rain and snow, radar is being revisited as a parallel sensing modality to vision and lidar. Recent works have made tremendous progress in applying spinning radar to odometry and place recognition. However, these works have so far ignored the impact of motion distortion and Doppler effects on spinning-radar-based navigation, which may be significant in the self-driving car domain where speeds can be high. In this work, we demonstrate the effect of these distortions on radar odometry using the Oxford Radar RobotCar Dataset and metric localization using our own data-taking platform. We revisit a lightweight estimator that can recover the motion between a pair of radar scans while accounting for both effects. Our conclusion is that both motion distortion and the Doppler effect are significant in different aspects of spinning radar navigation, with the former more prominent than the latter. Code for this project can be found at: https://github.com/keenan-burnett/yeti_radar_odometry.

@article{burnett-ral21,
title = {Do We Need to Compensate for Motion Distortion and {Doppler} Effects in Spinning Radar Navigation?},
author = {Keenan Burnett and Angela P. Schoellig and Timothy D. Barfoot},
journal = {{IEEE Robotics and Automation Letters}},
year = {2021},
volume = {6},
number = {2},
pages = {771--778},
doi = {10.1109/LRA.2021.3052439},
urllink = {https://ieeexplore.ieee.org/document/9327473},
abstract = {In order to tackle the challenge of unfavorable weather conditions such as rain and snow, radar is being revisited as a parallel sensing modality to vision and lidar. Recent works have made tremendous progress in applying spinning radar to odometry and place recognition. However, these works have so far ignored the impact of motion distortion and Doppler effects on spinning-radar-based navigation, which may be significant in the self-driving car domain where speeds can be high. In this work, we demonstrate the effect of these distortions on radar odometry using the Oxford Radar RobotCar Dataset and metric localization using our own data-taking platform. We revisit a lightweight estimator that can recover the motion between a pair of radar scans while accounting for both effects. Our conclusion is that both motion distortion and the Doppler effect are significant in different aspects of spinning radar navigation, with the former more prominent than the latter. Code for this project can be found at: https://github.com/keenan-burnett/yeti_radar_odometry.}
}

Learning-based bias correction for time difference of arrival ultra-wideband localization of resource-constrained mobile robots
W. Zhao, J. Panerati, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 6, iss. 2, pp. 3639–3646, 2021.

Accurate indoor localization is a crucial enabling technology for many robotics applications, from warehouse management to monitoring tasks. Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization is a promising lightweight, low-cost solution that can scale to a large number of devices, making it especially suited for resource-constrained multi-robot applications. However, the localization accuracy of standard, commercially available UWB radios is often insufficient due to significant measurement bias and outliers. In this letter, we address these issues by proposing a robust UWB TDOA localization framework comprising (i) learning-based bias correction and (ii) M-estimation-based robust filtering to handle outliers. The key properties of our approach are that (i) the learned biases generalize to different UWB anchor setups and (ii) the approach is computationally efficient enough to run on resource-constrained hardware. We demonstrate our approach on a Crazyflie nano-quadcopter. Experimental results show that the proposed localization framework, relying only on the onboard IMU and UWB, provides an average of 42.08% localization error reduction (in three different anchor setups) compared to the baseline approach without bias compensation. We also show autonomous trajectory tracking on a quadcopter using our UWB TDOA localization approach.

@article{zhao-ral21,
title = {Learning-based Bias Correction for Time Difference of Arrival Ultra-wideband Localization of Resource-constrained Mobile Robots},
author = {Wenda Zhao and Jacopo Panerati and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2021},
volume = {6},
number = {2},
pages = {3639--3646},
doi = {10.1109/LRA.2021.3064199},
urlvideo = {https://youtu.be/J32mrDN5ws4},
urllink = {https://ieeexplore.ieee.org/document/9372785},
abstract = {Accurate indoor localization is a crucial enabling technology for many robotics applications, from warehouse management to monitoring tasks. Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization is a promising lightweight, low-cost solution that can scale to a large number of devices, making it especially suited for resource-constrained multi-robot applications. However, the localization accuracy of standard, commercially available UWB radios is often insufficient due to significant measurement bias and outliers. In this letter, we address these issues by proposing a robust UWB TDOA localization framework comprising (i) learning-based bias correction and (ii) M-estimation-based robust filtering to handle outliers. The key properties of our approach are that (i) the learned biases generalize to different UWB anchor setups and (ii) the approach is computationally efficient enough to run on resource-constrained hardware. We demonstrate our approach on a Crazyflie nano-quadcopter. Experimental results show that the proposed localization framework, relying only on the onboard IMU and UWB, provides an average of 42.08\% localization error reduction (in three different anchor setups) compared to the baseline approach without bias compensation. We also show autonomous trajectory tracking on a quadcopter using our UWB TDOA localization approach.}
}

Exploiting differential flatness for robust learning-based tracking control using Gaussian processes
M. Greeff and A. P. Schoellig
IEEE Control Systems Letters, vol. 5, iss. 4, pp. 1121–1126, 2021.

Learning-based control has been shown to outperform conventional model-based techniques in the presence of model uncertainties and systematic disturbances. However, most state-of-the-art learning-based nonlinear trajectory tracking controllers still lack any formal guarantees. In this letter, we exploit the property of differential flatness to design an online, robust learning-based controller to achieve both high tracking performance and probabilistically guarantee a uniform ultimate bound on the tracking error. A common control approach for differentially flat systems is to try to linearize the system by using a feedback (FB) linearization controller designed based on a nominal system model. Performance and safety are limited by the mismatch between the nominal model and the actual system. Our proposed approach uses a nonparametric Gaussian Process (GP) to both improve FB linearization and quantify, probabilistically, the uncertainty in our FB linearization. We use this probabilistic bound in a robust linear quadratic regulator (LQR) framework. Through simulation, we highlight that our proposed approach significantly outperforms alternative learning-based strategies that use differential flatness.

@article{greeff-lcss21,
title = {Exploiting Differential Flatness for Robust Learning-Based Tracking Control using {Gaussian} Processes},
author = {Melissa Greeff and Angela P. Schoellig},
journal = {{IEEE Control Systems Letters}},
year = {2021},
volume = {5},
number = {4},
pages = {1121--1126},
doi = {10.1109/LCSYS.2020.3009177},
urllink = {https://ieeexplore.ieee.org/document/9140024},
urlvideo = {https://youtu.be/ZFzZkKjQ3qw},
abstract = {Learning-based control has been shown to outperform conventional model-based techniques in the presence of model uncertainties and systematic disturbances. However, most state-of-the-art learning-based nonlinear trajectory tracking controllers still lack any formal guarantees. In this letter, we exploit the property of differential flatness to design an online, robust learning-based controller to achieve both high tracking performance and probabilistically guarantee a uniform ultimate bound on the tracking error. A common control approach for differentially flat systems is to try to linearize the system by using a feedback (FB) linearization controller designed based on a nominal system model. Performance and safety are limited by the mismatch between the nominal model and the actual system. Our proposed approach uses a nonparametric Gaussian Process (GP) to both improve FB linearization and quantify, probabilistically, the uncertainty in our FB linearization. We use this probabilistic bound in a robust linear quadratic regulator (LQR) framework. Through simulation, we highlight that our proposed approach significantly outperforms alternative learning-based strategies that use differential flatness.}
}

Self-calibration of the offset between GPS and semantic map frames for robust localization
W. Tseng, A. P. Schoellig, and T. D. Barfoot
in Proc. of the Conference on Robots and Vision (CRV), 2021, p. 173–180.

@INPROCEEDINGS{tseng-crv21,
author={Wei-Kang Tseng and Angela P. Schoellig and Timothy D. Barfoot},
booktitle={{Proc. of the Conference on Robots and Vision (CRV)}},
title={Self-Calibration of the Offset Between {GPS} and Semantic Map Frames for Robust Localization},
year={2021},
pages={173--180},
urllink={https://ieeexplore.ieee.org/abstract/document/9469506},
doi={10.1109/CRV52889.2021.00031},
abstract = {In self-driving, standalone GPS is generally considered to have insufficient positioning accuracy to stay in lane. Instead, many turn to LIDAR localization, but this comes at the expense of building LIDAR maps that can be costly to maintain. Another possibility is to use semantic cues such as lane lines and traffic lights to achieve localization, but these are usually not continuously visible. This issue can be remedied by combining semantic cues with GPS to fill in the gaps. However, due to elapsed time between mapping and localization, the live GPS frame can be offset from the semantic map frame, requiring calibration. In this paper, we propose a robust semantic localization algorithm that self-calibrates for the offset between the live GPS and semantic map frames by exploiting common semantic cues, including traffic lights and lane markings. We formulate the problem using a modified Iterated Extended Kalman Filter, which incorporates GPS and camera images for semantic cue detection via Convolutional Neural Networks. Experimental results show that our proposed algorithm achieves decimetre-level accuracy comparable to typical LIDAR localization performance and is robust against sparse semantic features and frequent GPS dropouts.}
}
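The self-calibration idea in the abstract above can be illustrated with a much simpler estimator than the one the paper uses. The sketch below assumes a one-dimensional, constant offset between the GPS and map frames and applies a scalar linear Kalman filter, not the modified Iterated Extended Kalman Filter from the paper. The function name `estimate_offset`, the noise levels, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import random


def estimate_offset(gps_positions, map_positions, meas_var,
                    prior_mean=0.0, prior_var=100.0):
    """Scalar Kalman filter for a constant offset b.

    Assumed measurement model (illustrative only): z = gps - map = b + noise,
    where the noise has variance meas_var.
    """
    mean, var = prior_mean, prior_var
    for gps, ref in zip(gps_positions, map_positions):
        z = gps - ref                    # offset observed at one semantic cue
        gain = var / (var + meas_var)    # Kalman gain
        mean = mean + gain * (z - mean)  # measurement update of the estimate
        var = (1.0 - gain) * var         # posterior variance shrinks each step
    return mean, var


# Synthetic data: map-frame positions and GPS readings shifted by a fixed offset.
random.seed(0)
true_offset = 1.5
map_positions = [float(i) for i in range(200)]
gps_positions = [p + true_offset + random.gauss(0.0, 0.5) for p in map_positions]

offset_hat, offset_var = estimate_offset(gps_positions, map_positions, meas_var=0.25)
print(f"estimated offset: {offset_hat:.2f} m (variance {offset_var:.5f})")
```

Because the offset is modeled as constant, no prediction step is needed; a slowly drifting offset would call for adding process noise to `var` before each update.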

Radar odometry combining probabilistic estimation and unsupervised feature learning
K. Burnett, D. J. Yoon, A. P. Schoellig, and T. D. Barfoot
in Proc. of Robotics: Science and Systems (RSS), 2021.

This paper presents a radar odometry method that combines probabilistic trajectory estimation and deep learned features without needing groundtruth pose information. The feature network is trained unsupervised, using only the on-board radar data. With its theoretical foundation based on a data likelihood objective, our method leverages a deep network for processing rich radar data, and a non-differentiable classic estimator for probabilistic inference. We provide extensive experimental results on both the publicly available Oxford Radar RobotCar Dataset and an additional 100 km of driving collected in an urban setting. Our sliding-window implementation of radar odometry outperforms most hand-crafted methods and approaches the current state of the art without requiring a groundtruth trajectory for training. We also demonstrate the effectiveness of radar odometry under adverse weather conditions. Code for this project can be found at: https://github.com/utiasASRL/hero_radar_odometry

@INPROCEEDINGS{burnett-rss21,
author={Keenan Burnett and David J. Yoon and Angela P. Schoellig and Timothy D. Barfoot},
booktitle={{Proc. of Robotics: Science and Systems (RSS)}},
title={Radar Odometry Combining Probabilistic Estimation and Unsupervised Feature Learning},
year={2021},
urllink = {http://www.roboticsproceedings.org/rss17/p029.pdf},
urlvideo = {https://youtu.be/DunPo1hdRbM},
urldata = {https://github.com/utiasASRL/hero_radar_odometry},
urlcode = {https://github.com/utiasASRL/hero_radar_odometry},
doi = {10.15607/RSS.2021.XVII.029},
abstract = {This paper presents a radar odometry method that combines probabilistic trajectory estimation and deep learned features without needing groundtruth pose information. The feature network is trained unsupervised, using only the on-board radar data. With its theoretical foundation based on a data likelihood objective, our method leverages a deep network for processing rich radar data, and a non-differentiable classic estimator for probabilistic inference. We provide extensive experimental results on both the publicly available Oxford Radar RobotCar Dataset and an additional 100 km of driving collected in an urban setting. Our sliding-window implementation of radar odometry outperforms most hand-crafted methods and approaches the current state of the art without requiring a groundtruth trajectory for training. We also demonstrate the effectiveness of radar odometry under adverse weather conditions. Code for this project can be found at: https://github.com/utiasASRL/hero_radar_odometry}
}

RLO-MPC: robust learning-based output feedback MPC for improving the performance of uncertain systems in iterative tasks
L. Brunke, S. Zhou, and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2021, p. 2183–2190.

In this work we address the problem of performing a repetitive task when we have uncertain observations and dynamics. We formulate this problem as an iterative infinite horizon optimal control problem with output feedback. Previously, this problem was solved for linear time-invariant (LTI) systems for the case when noisy full-state measurements are available using a robust iterative learning control framework, which we refer to as robust learning-based model predictive control (RL-MPC). However, this work does not apply to the case when only noisy observations of part of the state are available. This limits the applicability of current approaches in practice: First, in practical applications we typically do not have access to the full state. Second, uncertainties in the observations, when not accounted for, can lead to instability and constraint violations. To overcome these limitations, we propose a combination of RL-MPC with robust output feedback model predictive control, named robust learning-based output feedback model predictive control (RLO-MPC). We show recursive feasibility and stability, and prove theoretical guarantees on the performance over iterations. We validate the proposed approach with a numerical example in simulation and a quadrotor stabilization task in experiments.

@INPROCEEDINGS{brunke-cdc21,
author={Lukas Brunke and Siqi Zhou and Angela P. Schoellig},
booktitle={{Proc. of the IEEE Conference on Decision and Control (CDC)}},
title={{RLO-MPC}: Robust Learning-Based Output Feedback {MPC} for Improving the Performance of Uncertain Systems in Iterative Tasks},
year={2021},
pages={2183--2190},
urlvideo = {https://youtu.be/xJ8xFKp3cAo},
doi={10.1109/CDC45484.2021.9682940},
abstract = {In this work we address the problem of performing a repetitive task when we have uncertain observations and dynamics. We formulate this problem as an iterative infinite horizon optimal control problem with output feedback. Previously, this problem was solved for linear time-invariant (LTI) systems for the case when noisy full-state measurements are available using a robust iterative learning control framework, which we refer to as robust learning-based model predictive control (RL-MPC). However, this work does not apply to the case when only noisy observations of part of the state are available. This limits the applicability of current approaches in practice: First, in practical applications we typically do not have access to the full state. Second, uncertainties in the observations, when not accounted for, can lead to instability and constraint violations. To overcome these limitations, we propose a combination of RL-MPC with robust output feedback model predictive control, named robust learning-based output feedback model predictive control (RLO-MPC). We show recursive feasibility and stability, and prove theoretical guarantees on the performance over iterations. We validate the proposed approach with a numerical example in simulation and a quadrotor stabilization task in experiments.}
}

Learning a stability filter for uncertain differentially flat systems using Gaussian processes
M. Greeff, A. W. Hall, and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2021, p. 789–794.

Many physical system models exhibit a structural property known as differential flatness. Intuitively, differential flatness allows us to separate the system’s nonlinear dynamics into a linear dynamics component and a nonlinear term. In this work, we exploit this structure and propose using a nonparametric Gaussian Process (GP) to learn the unknown nonlinear term. We use this GP in an optimization problem to optimize for an input that is most likely to feedback linearize the system (i.e., cancel this nonlinear term). This optimization is subject to input constraints and a stability filter, described by an uncertain Control Lyapunov Function (CLF), which probabilistically guarantees exponential trajectory tracking when possible. Furthermore, for systems that are control-affine, we choose to express this structure in the selection of the kernel for the GP. By exploiting this selection, we show that the optimization problem is not only convex but can be efficiently solved as a second-order cone program. We compare our approach to related works in simulation and show that we can achieve similar performance at much lower computational cost.

@INPROCEEDINGS{greeff-cdc21,
author = {Melissa Greeff and Adam W. Hall and Angela P. Schoellig},
title = {Learning a Stability Filter for Uncertain Differentially Flat Systems using {Gaussian} Processes},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2021},
pages = {789--794},
doi = {10.1109/CDC45484.2021.9683661},
abstract = {Many physical system models exhibit a structural property known as differential flatness. Intuitively, differential flatness allows us to separate the system’s nonlinear dynamics into a linear dynamics component and a nonlinear term. In this work, we exploit this structure and propose using a nonparametric Gaussian Process (GP) to learn the unknown nonlinear term. We use this GP in an optimization problem to optimize for an input that is most likely to feedback linearize the system (i.e., cancel this nonlinear term). This optimization is subject to input constraints and a stability filter, described by an uncertain Control Lyapunov Function (CLF), which probabilistically guarantees exponential trajectory tracking when possible. Furthermore, for systems that are control-affine, we choose to express this structure in the selection of the kernel for the GP. By exploiting this selection, we show that the optimization problem is not only convex but can be efficiently solved as a second-order cone program. We compare our approach to related works in simulation and show that we can achieve similar performance at much lower computational cost.},
}

Online spatio-temporal calibration of tightly-coupled ultrawideband-aided inertial localization
A. Goudar and A. P. Schoellig
in Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2021, p. 1161–1168.

The combination of ultrawideband (UWB) radios and inertial measurement units (IMU) can provide accurate positioning in environments where the Global Positioning System (GPS) service is either unavailable or has unsatisfactory performance. The two sensors, IMU and UWB radio, are often not co-located on a moving system. The UWB radio is typically located at the extremities of the system to ensure reliable communication, whereas the IMUs are located closer to its center of gravity. Furthermore, without hardware or software synchronization, data from heterogeneous sensors can arrive at different time instants resulting in temporal offsets. If uncalibrated, these spatial and temporal offsets can degrade the positioning performance. In this paper, using observability and identifiability criteria, we derive the conditions required for successfully calibrating the spatial and the temporal offset parameters of a tightly-coupled UWB-IMU system. We also present an online method for jointly calibrating these offsets. The results show that our calibration approach results in improved positioning accuracy while simultaneously estimating (i) the spatial offset parameters to millimeter precision and (ii) the temporal offset parameter to millisecond precision.

@INPROCEEDINGS{goudar-iros21,
author = {Abhishek Goudar and Angela P. Schoellig},
title = {Online Spatio-temporal Calibration of Tightly-coupled Ultrawideband-aided Inertial Localization},
booktitle = {{Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS)}},
year = {2021},
pages = {1161--1168},
doi = {10.1109/IROS51168.2021.9636625},
abstract = {The combination of ultrawideband (UWB) radios and inertial measurement units (IMU) can provide accurate positioning in environments where the Global Positioning System (GPS) service is either unavailable or has unsatisfactory performance. The two sensors, IMU and UWB radio, are often not co-located on a moving system. The UWB radio is typically located at the extremities of the system to ensure reliable communication, whereas the IMUs are located closer to its center of gravity. Furthermore, without hardware or software synchronization, data from heterogeneous sensors can arrive at different time instants resulting in temporal offsets. If uncalibrated, these spatial and temporal offsets can degrade the positioning performance. In this paper, using observability and identifiability criteria, we derive the conditions required for successfully calibrating the spatial and the temporal offset parameters of a tightly-coupled UWB-IMU system. We also present an online method for jointly calibrating these offsets. The results show that our calibration approach results in improved positioning accuracy while simultaneously estimating (i) the spatial offset parameters to millimeter precision and (ii) the temporal offset parameter to millisecond precision.},
}

Mobile manipulation in unknown environments with differential inverse kinematics control
A. Heins, M. Jakob, and A. P. Schoellig
in Proc. of the Conference on Robots and Vision (CRV), 2021, p. 64–71.

Mobile manipulators combine the large workspace of mobile robots with the interactive capabilities of manipulator arms, making them useful in a variety of domains including construction and assistive care. We propose a differential inverse kinematics whole-body control approach for position-controlled industrial mobile manipulators. Our controller is capable of task-space trajectory tracking, force regulation, obstacle and singularity avoidance, and pushing an object toward a goal location, with limited sensing and knowledge of the environment. We evaluate the proposed approach through extensive experiments on a 9 degree-of-freedom omnidirectional mobile manipulator. A video demonstrating many of the experiments can be found at http://tiny.cc/crv21-mm.

@INPROCEEDINGS{heins-crv21,
author = {Adam Heins and Michael Jakob and Angela P. Schoellig},
title = {Mobile Manipulation in Unknown Environments with Differential Inverse Kinematics Control},
booktitle = {{Proc. of the Conference on Robots and Vision (CRV)}},
year = {2021},
pages = {64--71},
urlvideo = {http://tiny.cc/crv21-mm},
abstract = {Mobile manipulators combine the large workspace of mobile robots with the interactive capabilities of manipulator arms, making them useful in a variety of domains including construction and assistive care. We propose a differential inverse kinematics whole-body control approach for position-controlled industrial mobile manipulators. Our controller is capable of task-space trajectory tracking, force regulation, obstacle and singularity avoidance, and pushing an object toward a goal location, with limited sensing and knowledge of the environment. We evaluate the proposed approach through extensive experiments on a 9 degree-of-freedom omnidirectional mobile manipulator. A video demonstrating many of the experiments can be found at http://tiny.cc/crv21-mm.},
}

Learning to fly—a Gym environment with PyBullet physics for reinforcement learning of multi-agent quadcopter control
J. Panerati, H. Zheng, S. Zhou, J. Xu, A. Prorok, and A. P. Schoellig
in Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2021, p. 7512–7519.

Robotic simulators are crucial for academic research and education as well as the development of safety-critical applications. Reinforcement learning environments (simple simulations coupled with a problem specification in the form of a reward function) are also important to standardize the development (and benchmarking) of learning algorithms. Yet, full-scale simulators typically lack portability and parallelizability. Vice versa, many reinforcement learning environments trade off realism for high sample throughputs in toy-like problems. While public data sets have greatly benefited deep learning and computer vision, we still lack the software tools to simultaneously develop, and fairly compare, control theory and reinforcement learning approaches. In this paper, we propose an open-source OpenAI Gym-like environment for multiple quadcopters based on the Bullet physics engine. Its multi-agent and vision-based reinforcement learning interfaces, as well as the support of realistic collisions and aerodynamic effects, make it, to the best of our knowledge, a first of its kind. We demonstrate its use through several examples, either for control (trajectory tracking with PID control, multi-robot flight with downwash, etc.) or reinforcement learning (single and multi-agent stabilization tasks), hoping to inspire future research that combines control theory and machine learning.

@INPROCEEDINGS{panerati-iros21,
author = {Jacopo Panerati and Hehui Zheng and SiQi Zhou and James Xu and Amanda Prorok and Angela P. Schoellig},
title = {Learning to Fly—a {Gym} Environment with {PyBullet} Physics for
Reinforcement Learning of Multi-agent Quadcopter Control},
booktitle = {{Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS)}},
year = {2021},
pages = {7512--7519},
doi = {10.1109/IROS51168.2021.9635857},
urlvideo = {https://www.youtube.com/watch?v=-zyrmneaz88},
urllink = {https://arxiv.org/abs/2103.02142},
abstract = {Robotic simulators are crucial for academic research and education as well as the development of safety-critical applications. Reinforcement learning environments (simple simulations coupled with a problem specification in the form of a reward function) are also important to standardize the development (and benchmarking) of learning algorithms. Yet, full-scale simulators typically lack portability and parallelizability. Vice versa, many reinforcement learning environments trade off realism for high sample throughputs in toy-like problems. While public data sets have greatly benefited deep learning and computer vision, we still lack the software tools to simultaneously develop, and fairly compare, control theory and reinforcement learning approaches. In this paper, we propose an open-source OpenAI Gym-like environment for multiple quadcopters based on the Bullet physics engine. Its multi-agent and vision-based reinforcement learning interfaces, as well as the support of realistic collisions and aerodynamic effects, make it, to the best of our knowledge, a first of its kind. We demonstrate its use through several examples, either for control (trajectory tracking with PID control, multi-robot flight with downwash, etc.) or reinforcement learning (single and multi-agent stabilization tasks), hoping to inspire future research that combines control theory and machine learning.},
}
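The abstract above mentions trajectory tracking with PID control as one of the demonstrated use cases. As a rough, self-contained illustration of that control idea, the sketch below regulates the altitude of a one-dimensional point mass with a PID loop plus a hover feedforward term. It does not use the paper's simulator or its API; the `PID` class, the `simulate_altitude` function, the gains, and the unit mass are all hypothetical choices made for this example.

```python
class PID:
    """Textbook PID controller with a fixed time step (illustrative only)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, error):
        self.integral += error * self.dt
        # No derivative kick on the first call, when there is no previous error.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_altitude(target=1.0, mass=1.0, g=9.81, dt=0.01, duration=20.0):
    """Simulate vertical point-mass dynamics z'' = thrust / mass - g."""
    pid = PID(kp=4.0, ki=1.0, kd=3.0, dt=dt)
    z, vz = 0.0, 0.0
    for _ in range(int(duration / dt)):
        # Hover feedforward (mass * g) plus PID correction on the altitude error.
        thrust = mass * g + pid.step(target - z)
        az = thrust / mass - g
        vz += az * dt   # semi-implicit Euler: update velocity first,
        z += vz * dt    # then position with the new velocity
    return z


final_z = simulate_altitude()
print(f"altitude after 20 s: {final_z:.3f} m")
```

The gains satisfy the continuous-time stability condition for this loop (kd * kp > ki with unit mass), so the altitude settles at the target; a real quadcopter would additionally need thrust limits and attitude control.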

2020

Haul road monitoring in open pit mines using unmanned aerial vehicles: a case study at Bald Mountain mine site
F. Medinac, T. Bamford, M. Hart, M. Kowalczyk, and K. Esmaeili
Mining, Metallurgy & Exploration, vol. 37, p. 1877–1883, 2020.

Improved haul road conditions can positively impact mine operations resulting in increased safety, productivity gains, increased tire life, and lower maintenance costs. For these reasons, a monitoring program is required to ensure the operational efficiency of the haul roads. Currently, at Bald Mountain mine, monthly site severity studies, ad hoc inspections by frontline supervisors, or operator feedback reporting is used to assess road conditions. These methods are subjective and provide low temporal resolution data. This case study presents novel unmanned aerial vehicle (UAV) technologies, applied on a critical section of haul road at Bald Mountain, to showcase the potential for monitoring haul roads. The results show that orthophotos and digital elevation models can be used to assess the road smoothness condition and to check the road design compliance. Moreover, the aerial mapping allows detection of surface water, rock spillage, and potholes on the road that can be quickly repaired/removed by the dedicated road maintenance team.

@article{medinac-mme20,
title = {Haul road monitoring in open pit mines using unmanned aerial vehicles: A case study at {Bald Mountain} mine site},
author = {Filip Medinac and Thomas Bamford and Matthew Hart and Michal Kowalczyk and Kamran Esmaeili},
journal = {{Mining, Metallurgy \& Exploration}},
year = {2020},
volume = {37},
pages = {1877--1883},
doi = {10.1007/s42461-020-00291-w},
urllink = {https://rdcu.be/cbOuJ},
abstract = {Improved haul road conditions can positively impact mine operations resulting in increased safety, productivity gains, increased tire life, and lower maintenance costs. For these reasons, a monitoring program is required to ensure the operational efficiency of the haul roads. Currently, at Bald Mountain mine, monthly site severity studies, ad hoc inspections by frontline supervisors, or operator feedback reporting is used to assess road conditions. These methods are subjective and provide low temporal resolution data. This case study presents novel unmanned aerial vehicle (UAV) technologies, applied on a critical section of haul road at Bald Mountain, to showcase the potential for monitoring haul roads. The results show that orthophotos and digital elevation models can be used to assess the road smoothness condition and to check the road design compliance. Moreover, the aerial mapping allows detection of surface water, rock spillage, and potholes on the road that can be quickly repaired/removed by the dedicated road maintenance team.},
}

Deep neural networks as add-on modules for enhancing robot performance in impromptu trajectory tracking
S. Zhou, M. K. Helwa, and A. P. Schoellig
The International Journal of Robotics Research, p. 1–22, 2020.

High-accuracy trajectory tracking is critical to many robotic applications, including search and rescue, advanced manufacturing, and industrial inspection, to name a few. Yet the unmodeled dynamics and parametric uncertainties of operating in such complex environments make it difficult to design controllers that are capable of accurately tracking arbitrary, feasible trajectories from the first attempt (i.e., impromptu trajectory tracking). This article proposes a platform-independent, learning-based “add-on” module to enhance the tracking performance of black-box control systems in impromptu tracking tasks. Our approach is to pre-cascade a deep neural network (DNN) to a stabilized baseline control system, in order to establish an identity mapping from the desired output to the actual output. Previous research involving quadrotors showed that, for 30 arbitrary hand-drawn trajectories, the DNN-enhancement control architecture reduces tracking errors by 43% on average, as compared with the baseline controller. In this article, we provide a platform-independent formulation and practical design guidelines for the DNN-enhancement approach. In particular, we: (1) characterize the underlying function of the DNN module; (2) identify necessary conditions for the approach to be effective; (3) provide theoretical insights into the stability of the overall DNN-enhancement control architecture; (4) derive a condition that supports data-efficient training of the DNN module; and (5) compare the novel theory-driven DNN design with the prior trial-and-error design using detailed quadrotor experiments. We show that, as compared with the prior trial-and-error design, the novel theory-driven design allows us to reduce the input dimension of the DNN by two thirds while achieving similar tracking performance.

@article{zhou-ijrr20,
title = {Deep neural networks as add-on modules for enhancing robot performance in impromptu trajectory tracking},
author = {Siqi Zhou and Mohamed K. Helwa and Angela P. Schoellig},
journal = {{The International Journal of Robotics Research}},
year = {2020},
volume = {0},
number = {0},
pages = {1--22},
doi = {10.1177/0278364920953902},
urllink = {https://doi.org/10.1177/0278364920953902},
urlvideo = {https://youtu.be/K-DrZGFvpN4},
abstract = {High-accuracy trajectory tracking is critical to many robotic applications, including search and rescue, advanced manufacturing, and industrial inspection, to name a few. Yet the unmodeled dynamics and parametric uncertainties of operating in such complex environments make it difficult to design controllers that are capable of accurately tracking arbitrary, feasible trajectories from the first attempt (i.e., impromptu trajectory tracking). This article proposes a platform-independent, learning-based ``add-on'' module to enhance the tracking performance of black-box control systems in impromptu tracking tasks. Our approach is to pre-cascade a deep neural network (DNN) to a stabilized baseline control system, in order to establish an identity mapping from the desired output to the actual output. Previous research involving quadrotors showed that, for 30 arbitrary hand-drawn trajectories, the DNN-enhancement control architecture reduces tracking errors by 43\% on average, as compared with the baseline controller. In this article, we provide a platform-independent formulation and practical design guidelines for the DNN-enhancement approach. In particular, we: (1) characterize the underlying function of the DNN module; (2) identify necessary conditions for the approach to be effective; (3) provide theoretical insights into the stability of the overall DNN-enhancement control architecture; (4) derive a condition that supports data-efficient training of the DNN module; and (5) compare the novel theory-driven DNN design with the prior trial-and-error design using detailed quadrotor experiments. We show that, as compared with the prior trial-and-error design, the novel theory-driven design allows us to reduce the input dimension of the DNN by two thirds while achieving similar tracking performance.}
}

Continuous monitoring and improvement of the blasting process in open pit mines using unmanned aerial vehicle techniques
T. Bamford, F. Medinac, and K. Esmaeili
Remote Sensing, vol. 12, iss. 17, p. 2801, 2020.

The current techniques used for monitoring the blasting process in open pit mines are manual, intermittent and inefficient and can expose technical manpower to hazardous conditions. This study presents the application of unmanned aerial vehicle (UAV) systems for monitoring and improving the blasting process in open pit mines. Field experiments were conducted in different open pit mines to assess rock fragmentation, blast-induced damage on final pit walls, blast dynamics and the accuracy of blastholes including production and pre-split holes. The UAV-based monitoring was done in three different stages, including pre-blasting, blasting and post-blasting. In the pre-blasting stage, pit walls were mapped to collect structural data to predict in situ block size distribution and to develop as-built pit wall digital elevation models (DEM) to assess blast-induced damage. This was followed by mapping the production blasthole patterns implemented in the mine to investigate drillhole alignment. To monitor the blasting process, a high-speed camera was mounted on the UAV to investigate blast initiation, sequencing, misfired holes and stemming ejection. In the post-blast stage, the blasted rock pile (muck pile) was monitored to estimate fragmentation and assess muck pile configuration, heave and throw. The collected aerial data provide detailed information and high spatial and temporal resolution on the quality of the blasting process and significant opportunities for process improvement. The current challenges with regard to the application of UAVs for blasting process monitoring are discussed, and recommendations for obtaining the most value out of a UAV application are provided.

@article{bamford-rs20,
title = {Continuous Monitoring and Improvement of the Blasting Process in Open Pit Mines Using Unmanned Aerial Vehicle Techniques},
author = {Thomas Bamford and Filip Medinac and Kamran Esmaeili},
journal = {{Remote Sensing}},
year = {2020},
volume = {12},
number = {17},
doi = {10.3390/rs12172801},
pages = {2801},
urllink = {https://www.mdpi.com/2072-4292/12/17/2801},
abstract = {The current techniques used for monitoring the blasting process in open pit mines are manual, intermittent and inefficient and can expose technical manpower to hazardous conditions. This study presents the application of unmanned aerial vehicle (UAV) systems for monitoring and improving the blasting process in open pit mines. Field experiments were conducted in different open pit mines to assess rock fragmentation, blast-induced damage on final pit walls, blast dynamics and the accuracy of blastholes including production and pre-split holes. The UAV-based monitoring was done in three different stages, including pre-blasting, blasting and post-blasting. In the pre-blasting stage, pit walls were mapped to collect structural data to predict in situ block size distribution and to develop as-built pit wall digital elevation models (DEM) to assess blast-induced damage. This was followed by mapping the production blasthole patterns implemented in the mine to investigate drillhole alignment. To monitor the blasting process, a high-speed camera was mounted on the UAV to investigate blast initiation, sequencing, misfired holes and stemming ejection. In the post-blast stage, the blasted rock pile (muck pile) was monitored to estimate fragmentation and assess muck pile configuration, heave and throw. The collected aerial data provide detailed information and high spatial and temporal resolution on the quality of the blasting process and significant opportunities for process improvement. The current challenges with regard to the application of UAVs for blasting process monitoring are discussed, and recommendations for obtaining the most value out of a UAV application are provided.},
}

To share or not to share? performance guarantees and the asymmetric nature of cross-robot experience transfer
M. J. Sorocky, S. Zhou, and A. P. Schoellig
IEEE Control Systems Letters, vol. 5, iss. 3, p. 923–928, 2020.

In the robotics literature, experience transfer has been proposed in different learning-based control frameworks to minimize the costs and risks associated with training robots. While various works have shown the feasibility of transferring prior experience from a source robot to improve or accelerate the learning of a target robot, there are usually no guarantees that experience transfer improves the performance of the target robot. In practice, the efficacy of transferring experience is often not known until it is tested on physical robots. This trial-and-error approach can be extremely unsafe and inefficient. Building on our previous work, in this paper we consider an inverse module transfer learning framework, where the inverse module of a source robot system is transferred to a target robot system to improve its tracking performance on arbitrary trajectories. We derive a theoretical bound on the tracking error when a source inverse module is transferred to the target robot and propose a Bayesian-optimization-based algorithm to estimate this bound from data. We further highlight the asymmetric nature of cross-robot experience transfer that has often been neglected in the literature. We demonstrate our approach in quadrotor experiments and show that we can guarantee positive transfer on the target robot for tracking random periodic trajectories.

@article{sorocky-lcss20,
title = {To Share or Not to Share? Performance Guarantees and the Asymmetric Nature of Cross-Robot Experience Transfer},
author = {Michael J. Sorocky and Siqi Zhou and Angela P. Schoellig},
journal = {{IEEE Control Systems Letters}},
year = {2020},
volume = {5},
number = {3},
pages = {923--928},
doi = {10.1109/LCSYS.2020.3005886},
urllink = {https://ieeexplore.ieee.org/document/9129781},
urlvideo = {https://www.youtube.com/watch?v=fPWNhIMcMqM},
urlvideo2 = {https://youtu.be/wVAxJO-pejQ},
abstract = {In the robotics literature, experience transfer has been proposed in different learning-based control frameworks to minimize the costs and risks associated with training robots. While various works have shown the feasibility of transferring prior experience from a source robot to improve or accelerate the learning of a target robot, there are usually no guarantees that experience transfer improves the performance of the target robot. In practice, the efficacy of transferring experience is often not known until it is tested on physical robots. This trial-and-error approach can be extremely unsafe and inefficient. Building on our previous work, in this paper we consider an inverse module transfer learning framework, where the inverse module of a source robot system is transferred to a target robot system to improve its tracking performance on arbitrary trajectories. We derive a theoretical bound on the tracking error when a source inverse module is transferred to the target robot and propose a Bayesian-optimization-based algorithm to estimate this bound from data. We further highlight the asymmetric nature of cross-robot experience transfer that has often been neglected in the literature. We demonstrate our approach in quadrotor experiments and show that we can guarantee positive transfer on the target robot for tracking random periodic trajectories.}
}

Variational inference with parameter learning applied to vehicle trajectory estimation
J. N. Wong, D. J. Yoon, A. P. Schoellig, and T. D. Barfoot
IEEE Robotics and Automation Letters, vol. 5, iss. 4, p. 5291–5298, 2020.

We present parameter learning in a Gaussian variational inference setting using only noisy measurements (i.e., no groundtruth). This is demonstrated in the context of vehicle trajectory estimation, although the method we propose is general. The letter extends the Exactly Sparse Gaussian Variational Inference (ESGVI) framework, which has previously been used for large-scale nonlinear batch state estimation. Our contribution is to additionally learn parameters of our system models (which may be difficult to choose in practice) within the ESGVI framework. In this letter, we learn the covariances for the motion and sensor models used within vehicle trajectory estimation. Specifically, we learn the parameters of a white-noise-on-acceleration motion model and the parameters of an Inverse-Wishart prior over measurement covariances for our sensor model. We demonstrate our technique using a 36 km dataset consisting of a car using lidar to localize against a high-definition map; we learn the parameters on a training section of the data and then show that we achieve high-quality state estimates on a test section, even in the presence of outliers. Lastly, we show that our framework can be used to solve pose graph optimization even with many false loop closures.

@article{wong-ral20b,
title = {Variational Inference with Parameter Learning Applied to Vehicle Trajectory Estimation},
author = {Jeremy N. Wong and David J. Yoon and Angela P. Schoellig and Timothy D. Barfoot},
journal = {{IEEE Robotics and Automation Letters}},
year = {2020},
volume = {5},
number = {4},
pages = {5291--5298},
doi = {10.1109/LRA.2020.3007381},
urllink = {https://ieeexplore.ieee.org/document/9134886},
urlvideo = {https://youtu.be/WTj7Cl0wXFo},
abstract = {We present parameter learning in a Gaussian variational inference setting using only noisy measurements (i.e., no groundtruth). This is demonstrated in the context of vehicle trajectory estimation, although the method we propose is general. The letter extends the Exactly Sparse Gaussian Variational Inference (ESGVI) framework, which has previously been used for large-scale nonlinear batch state estimation. Our contribution is to additionally learn parameters of our system models (which may be difficult to choose in practice) within the ESGVI framework. In this letter, we learn the covariances for the motion and sensor models used within vehicle trajectory estimation. Specifically, we learn the parameters of a white-noise-on-acceleration motion model and the parameters of an Inverse-Wishart prior over measurement covariances for our sensor model. We demonstrate our technique using a 36 km dataset consisting of a car using lidar to localize against a high-definition map; we learn the parameters on a training section of the data and then show that we achieve high-quality state estimates on a test section, even in the presence of outliers. Lastly, we show that our framework can be used to solve pose graph optimization even with many false loop closures.}
}

Online trajectory generation with distributed model predictive control for multi-robot motion planning
C. E. Luis, M. Vukosavljev, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 5, iss. 2, p. 604–611, 2020.

We present a distributed model predictive control (DMPC) algorithm to generate trajectories in real-time for multiple robots. We adopted the on-demand collision avoidance method presented in previous work to efficiently compute non-colliding trajectories in transition tasks. An event-triggered replanning strategy is proposed to account for disturbances. Our simulation results show that the proposed collision avoidance method can reduce, on average, around 50\% of the travel time required to complete a multi-agent point-to-point transition when compared to the well-studied Buffered Voronoi Cells (BVC) approach. Additionally, it shows a higher success rate in transition tasks with a high density of agents, with more than 90\% success rate with 30 palm-sized quadrotor agents in an 18 m^3 arena. The approach was experimentally validated with a swarm of up to 20 drones flying in close proximity.

@article{luis-ral20,
title = {Online Trajectory Generation with Distributed Model Predictive Control for Multi-Robot Motion Planning},
author = {Carlos E. Luis and Marijan Vukosavljev and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2020},
volume = {5},
number = {2},
pages = {604--611},
doi = {10.1109/LRA.2020.2964159},
urlvideo = {https://www.youtube.com/watch?v=N4rWiraIU2k},
urllink = {https://arxiv.org/pdf/1909.05150.pdf},
abstract = {We present a distributed model predictive control (DMPC) algorithm to generate trajectories in real-time for multiple robots. We adopted the on-demand collision avoidance method presented in previous work to efficiently compute non-colliding trajectories in transition tasks. An event-triggered replanning strategy is proposed to account for disturbances. Our simulation results show that the proposed collision avoidance method can reduce, on average, around 50\% of the travel time required to complete a multi-agent point-to-point transition when compared to the well-studied Buffered Voronoi Cells (BVC) approach. Additionally, it shows a higher success rate in transition tasks with a high density of agents, with more than 90\% success rate with 30 palm-sized quadrotor agents in an 18 m^3 arena. The approach was experimentally validated with a swarm of up to 20 drones flying in close proximity.}
}

A data-driven motion prior for continuous-time trajectory estimation on SE(3)
J. N. Wong, D. J. Yoon, A. P. Schoellig, and T. D. Barfoot
IEEE Robotics and Automation Letters, vol. 5, iss. 2, p. 1429–1436, 2020.

Simultaneous trajectory estimation and mapping (STEAM) is a method for continuous-time trajectory estimation in which the trajectory is represented as a Gaussian Process (GP). Previous formulations of STEAM used a GP prior that assumed either white-noise-on-acceleration (WNOA) or white-noise-on-jerk (WNOJ). However, previous work did not provide a principled way to choose the continuous-time motion prior or its parameters on a real robotic system. This paper derives a novel data-driven motion prior where ground truth trajectories of a moving robot are used to train a motion prior that better represents the robot’s motion. In this approach, we use a prior where latent accelerations are represented as a GP with a Matérn covariance function and draw a connection to the Singer acceleration model. We then formulate a variation of STEAM using this new prior. We train the WNOA, WNOJ, and our new latent-force prior and evaluate their performance in the context of both lidar localization and lidar odometry of a car driving along a 20 km route, where we show improved state estimates compared to the two previous formulations.

@article{wong-ral20,
title = {A Data-Driven Motion Prior for Continuous-Time Trajectory Estimation on {SE(3)}},
author = {Jeremy N. Wong and David J. Yoon and Angela P. Schoellig and Timothy D. Barfoot},
journal = {{IEEE Robotics and Automation Letters}},
year = {2020},
volume = {5},
number = {2},
pages = {1429--1436},
doi = {10.1109/LRA.2020.2969153},
urlvideo = {https://youtu.be/xUGl3w6meZg},
abstract = {Simultaneous trajectory estimation and mapping (STEAM) is a method for continuous-time trajectory estimation in which the trajectory is represented as a Gaussian Process (GP). Previous formulations of STEAM used a GP prior that assumed either white-noise-on-acceleration (WNOA) or white-noise-on-jerk (WNOJ). However, previous work did not provide a principled way to choose the continuous-time motion prior or its parameters on a real robotic system. This paper derives a novel data-driven motion prior where ground truth trajectories of a moving robot are used to train a motion prior that better represents the robot's motion. In this approach, we use a prior where latent accelerations are represented as a GP with a Mat\'{e}rn covariance function and draw a connection to the Singer acceleration model. We then formulate a variation of STEAM using this new prior. We train the WNOA, WNOJ, and our new latent-force prior and evaluate their performance in the context of both lidar localization and lidar odometry of a car driving along a 20 km route, where we show improved state estimates compared to the two previous formulations.}
}

Tag-based indoor localization of UAVs in construction environments: opportunities and challenges in practice
N. Kayhani, B. McCabe, A. Abdelaal, A. Heins, and A. P. Schoellig
in Proc. of the Construction Research Congress, 2020, p. 226–235.

Automated visual inspection and progress monitoring of construction projects using different robotic platforms have recently attracted scholars’ attention. Unmanned/unoccupied aerial vehicles (UAVs), however, are more and more being used for this purpose because of their maneuverability and perspective capabilities. Although a multi-sensor autonomous UAV can enhance the collection of informative data in constantly-evolving construction environments, autonomous flight and navigation of UAVs are challenging in indoor environments where the global positioning system (GPS) might be denied or unreliable. In such continually changing environments, the limited external infrastructure and the existence of unknown obstacles are two key challenges that need to be addressed. On the other hand, construction indoor environments are not fully unknown, as a progressively updating building information model (BIM) provides valuable prior knowledge about the GPS-denied environment. This fact can potentially create unique opportunities to facilitate the indoor navigation process in construction projects. The authors have previously shown the potentials of AprilTag fiducial markers for localization of a camera-equipped UAV in various controlled experimental setups in the laboratory. In this paper, we investigate the opportunities and challenges of using tag-based localization techniques in real-world construction environments.

@INPROCEEDINGS{kayhani-crc20,
author={Navid Kayhani and Brenda McCabe and Ahmed Abdelaal and Adam Heins and Angela P. Schoellig},
booktitle={{Proc. of the Construction Research Congress}},
title={Tag-based Indoor Localization of {UAV}s in Construction Environments: Opportunities and Challenges in Practice},
year={2020},
pages={226--235},
doi={10.1061/9780784482865.025},
urllink = {https://ascelibrary.org/doi/epdf/10.1061/9780784482865.025},
abstract = {Automated visual inspection and progress monitoring of construction projects using different robotic platforms have recently attracted scholars' attention. Unmanned/unoccupied aerial vehicles (UAVs), however, are more and more being used for this purpose because of their maneuverability and perspective capabilities. Although a multi-sensor autonomous UAV can enhance the collection of informative data in constantly-evolving construction environments, autonomous flight and navigation of UAVs are challenging in indoor environments where the global positioning system (GPS) might be denied or unreliable. In such continually changing environments, the limited external infrastructure and the existence of unknown obstacles are two key challenges that need to be addressed. On the other hand, construction indoor environments are not fully unknown, as a progressively updating building information model (BIM) provides valuable prior knowledge about the GPS-denied environment. This fact can potentially create unique opportunities to facilitate the indoor navigation process in construction projects. The authors have previously shown the potentials of AprilTag fiducial markers for localization of a camera-equipped UAV in various controlled experimental setups in the laboratory. In this paper, we investigate the opportunities and challenges of using tag-based localization techniques in real-world construction environments.}
}

Catch the ball: accurate high-speed motions for mobile manipulators via inverse dynamics learning
K. Dong, K. Pereida, F. Shkurti, and A. P. Schoellig
in Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2020, p. 6718–6725.

Mobile manipulators consist of a mobile platform equipped with one or more robot arms and are of interest for a wide array of challenging tasks because of their extended workspace and dexterity. Typically, mobile manipulators are deployed in slow-motion collaborative robot scenarios. In this paper, we consider scenarios where accurate high-speed motions are required. We introduce a framework for this regime of tasks including two main components: (i) a bi-level motion optimization algorithm for real-time trajectory generation, which relies on Sequential Quadratic Programming (SQP) and Quadratic Programming (QP), respectively; and (ii) a learning-based controller optimized for precise tracking of high-speed motions via a learned inverse dynamics model. We evaluate our framework with a mobile manipulator platform through numerous high-speed ball catching experiments, where we show a success rate of 85.33\%. To the best of our knowledge, this success rate exceeds the reported performance of existing related systems and sets a new state of the art.

@INPROCEEDINGS{dong-iros20,
author = {Ke Dong and Karime Pereida and Florian Shkurti and Angela P. Schoellig},
title = {Catch the Ball: Accurate High-Speed Motions for Mobile Manipulators via Inverse Dynamics Learning},
booktitle = {{Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS)}},
year = {2020},
pages = {6718--6725},
urlvideo = {https://www.youtube.com/watch?v=4uCvzurthS4},
urlvideo2 = {https://youtu.be/LlWN3cGUIbk},
urllink = {https://arxiv.org/abs/1811.01273},
abstract = {Mobile manipulators consist of a mobile platform equipped with one or more robot arms and are of interest for a wide array of challenging tasks because of their extended workspace and dexterity. Typically, mobile manipulators are deployed in slow-motion collaborative robot scenarios. In this paper, we consider scenarios where accurate high-speed motions are required. We introduce a framework for this regime of tasks including two main components: (i) a bi-level motion optimization algorithm for real-time trajectory generation, which relies on Sequential Quadratic Programming (SQP) and Quadratic Programming (QP), respectively; and (ii) a learning-based controller optimized for precise tracking of high-speed motions via a learned inverse dynamics model. We evaluate our framework with a mobile manipulator platform through numerous high-speed ball catching experiments, where we show a success rate of 85.33\%. To the best of our knowledge, this success rate exceeds the reported performance of existing related systems and sets a new state of the art.},
}

Optimal geometry for ultra-wideband localization using Bayesian optimization
W. Zhao, M. Vukosavljev, and A. P. Schoellig
in Proc. of the International Federation of Automatic Control (IFAC) World Congress, 2020, p. 15481–15488.

This paper introduces a novel algorithm to find a geometric configuration of ultra-wideband sources in order to provide optimal position estimation performance with Time-Difference-of-Arrival measurements. Different from existing works, we aim to achieve the best localization performance for a user-defined region of interest instead of a single target point. We employ an analysis based on the Cramer-Rao lower bound and dilution of precision to formulate an optimization problem. A Bayesian optimization-based algorithm is proposed to find an optimal geometry that achieves the smallest estimation variance upper bound while ensuring source placement constraints. The approach is validated through simulation and experimental results in 2D scenarios, showing an improvement over a naive source placement.

@INPROCEEDINGS{zhao-ifac20,
author = {Wenda Zhao and Marijan Vukosavljev and Angela P. Schoellig},
title = {Optimal Geometry for Ultra-wideband Localization using {Bayesian} Optimization},
booktitle = {{Proc. of the International Federation of Automatic Control (IFAC) World Congress}},
year = {2020},
volume = {53},
number = {2},
pages = {15481--15488},
urlvideo = {https://youtu.be/5mqKOfWpEWc},
abstract = {This paper introduces a novel algorithm to find a geometric configuration of ultra-wideband sources in order to provide optimal position estimation performance with Time-Difference-of-Arrival measurements. Different from existing works, we aim to achieve the best localization performance for a user-defined region of interest instead of a single target point. We employ an analysis based on the Cramer-Rao lower bound and dilution of precision to formulate an optimization problem. A Bayesian optimization-based algorithm is proposed to find an optimal geometry that achieves the smallest estimation variance upper bound while ensuring source placement constraints. The approach is validated through simulation and experimental results in 2D scenarios, showing an improvement over a naive source placement.},
}

A perception-aware flatness-based model predictive controller for fast vision-based multirotor flight
M. Greeff, T. D. Barfoot, and A. P. Schoellig
in Proc. of the International Federation of Automatic Control (IFAC) World Congress, 2020, p. 9412–9419.

Despite the push toward fast, reliable vision-based multirotor flight, most vision-based navigation systems still rely on controllers that are perception-agnostic. Given that these controllers ignore their effect on the system’s localisation capabilities, they can produce an action that allows vision-based localisation (and consequently navigation) to fail. In this paper, we present a perception-aware flatness-based model predictive controller (MPC) that accounts for its effect on visual localisation. To achieve perception awareness, we first develop a simple geometric model that uses over 12 km of flight data from two different environments (urban and rural) to associate visual landmarks with a probability of being successfully matched. In order to ensure localisation, we integrate this model as a chance constraint in our MPC such that we are probabilistically guaranteed that the number of successfully matched visual landmarks exceeds a minimum threshold. We show how to simplify the chance constraint to a nonlinear, deterministic constraint on the position of the multirotor. With desired speeds of 10 m/s, we demonstrate in simulation (based on real-world perception data) how our proposed perception-aware MPC is able to achieve faster flight while guaranteeing localisation compared to similar perception-agnostic controllers. We illustrate how our perception-aware MPC adapts the path constraint along the path based on the perception model by accounting for camera orientation, path error and location of the visual landmarks. The result is that repeating the same geometric path but with the camera facing in opposite directions can lead to different optimal paths flown.

@INPROCEEDINGS{greeff-ifac20,
author = {Melissa Greeff and Timothy D. Barfoot and Angela P. Schoellig},
title = {A Perception-Aware Flatness-Based Model Predictive Controller for Fast Vision-Based Multirotor Flight},
booktitle = {{Proc. of the International Federation of Automatic Control (IFAC) World Congress}},
year = {2020},
volume = {53},
number = {2},
pages = {9412--9419},
urlvideo = {https://youtu.be/aBEce5aWfvk},
abstract = {Despite the push toward fast, reliable vision-based multirotor flight, most vision-based navigation systems still rely on controllers that are perception-agnostic. Given that these controllers ignore their effect on the system’s localisation capabilities, they can produce an action that allows vision-based localisation (and consequently navigation) to fail. In this paper, we present a perception-aware flatness-based model predictive controller (MPC) that accounts for its effect on visual localisation. To achieve perception awareness, we first develop a simple geometric model that uses over 12 km of flight data from two different environments (urban and rural) to associate visual landmarks with a probability of being successfully matched. In order to ensure localisation, we integrate this model as a chance constraint in our MPC such that we are probabilistically guaranteed that the number of successfully matched visual landmarks exceeds a minimum threshold. We show how to simplify the chance constraint to a nonlinear, deterministic constraint on the position of the multirotor. With desired speeds of 10 m/s, we demonstrate in simulation (based on real-world perception data) how our proposed perception-aware MPC is able to achieve faster flight while guaranteeing localisation compared to similar perception-agnostic controllers. We illustrate how our perception-aware MPC adapts the path constraint along the path based on the perception model by accounting for camera orientation, path error and location of the visual landmarks. The result is that repeating the same geometric path but with the camera facing in opposite directions can lead to different optimal paths flown.},
}

Visual localization with Google Earth images for robust global pose estimation of UAVs
B. Patel, T. D. Barfoot, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2020, p. 6491–6497.

We estimate the global pose of a multirotor UAV by visually localizing images captured during a flight with Google Earth images pre-rendered from known poses. We metrically localize real images with georeferenced rendered images using a dense mutual information technique to allow accurate global pose estimation in outdoor GPS-denied environments. We show the ability to consistently localize throughout a sunny summer day despite major lighting changes while demonstrating that a typical feature-based localizer struggles under the same conditions. Successful image registrations are used as measurements in a filtering framework to apply corrections to the pose estimated by a gimballed visual odometry pipeline. We achieve less than 1 metre and 1 degree RMSE on a 303 metre flight and less than 3 metres and 3 degrees RMSE on six 1132 metre flights as low as 36 metres above ground level conducted at different times of the day from sunrise to sunset.

@INPROCEEDINGS{patel-icra20,
title = {Visual Localization with {Google Earth} Images for Robust Global Pose Estimation of {UAV}s},
author = {Bhavit Patel and Timothy D. Barfoot and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2020},
pages = {6491--6497},
urlvideo = {https://tiny.cc/GElocalization},
abstract = {We estimate the global pose of a multirotor UAV by visually localizing images captured during a flight with Google Earth images pre-rendered from known poses. We metrically localize real images with georeferenced rendered images using a dense mutual information technique to allow accurate global pose estimation in outdoor GPS-denied environments. We show the ability to consistently localize throughout a sunny summer day despite major lighting changes while demonstrating that a typical feature-based localizer struggles under the same conditions. Successful image registrations are used as measurements in a filtering framework to apply corrections to the pose estimated by a gimballed visual odometry pipeline. We achieve less than 1 metre and 1 degree RMSE on a 303 metre flight and less than 3 metres and 3 degrees RMSE on six 1132 metre flights as low as 36 metres above ground level conducted at different times of the day from sunrise to sunset.}
}

Context-aware cost shaping to reduce the impact of model error in safe, receding horizon control
C. D. McKinnon and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2020, p. 2386–2392.

This paper presents a method to enable a robot using stochastic Model Predictive Control (MPC) to achieve high performance on a repetitive path-following task. In particular, we consider the case where the accuracy of the model for robot dynamics varies significantly over the path–motivated by the fact that the models used in MPC must be computationally efficient, which limits their expressive power. Our approach is based on correcting the cost predicted using a simple learned dynamics model over the MPC horizon. This discourages the controller from taking actions that lead to higher cost than would have been predicted using the dynamics model. In addition, stochastic MPC provides a quantitative measure of safety by limiting the probability of violating state and input constraints over the prediction horizon. Our approach is unique in that it combines both online model learning and cost learning over the prediction horizon and is geared towards operating a robot in changing conditions. We demonstrate our algorithm in simulation and experiment on a ground robot that uses a stereo camera for localization.

@INPROCEEDINGS{mckinnon-icra20,
title = {Context-aware Cost Shaping to Reduce the Impact of Model Error in Safe, Receding Horizon Control},
author = {Christopher D. McKinnon and Angela P. Schoellig},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2020},
pages = {2386--2392},
doi = {10.1109/ICRA40945.2020.9197521},
urlvideo = {https://youtu.be/xrgcO2-A9bo},
abstract = {This paper presents a method to enable a robot using stochastic Model Predictive Control (MPC) to achieve high performance on a repetitive path-following task. In particular, we consider the case where the accuracy of the model for robot dynamics varies significantly over the path–motivated by the fact that the models used in MPC must be computationally efficient, which limits their expressive power. Our approach is based on correcting the cost predicted using a simple learned dynamics model over the MPC horizon. This discourages the controller from taking actions that lead to higher cost than would have been predicted using the dynamics model. In addition, stochastic MPC provides a quantitative measure of safety by limiting the probability of violating state and input constraints over the prediction horizon. Our approach is unique in that it combines both online model learning and cost learning over the prediction horizon and is geared towards operating a robot in changing conditions. We demonstrate our algorithm in simulation and experiment on a ground robot that uses a stereo camera for localization.}
}

Experience selection using dynamics similarity for efficient multi-source transfer learning between robots
M. J. Sorocky, S. Zhou, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2020, p. 2739–2745.

In the robotics literature, different knowledge transfer approaches have been proposed to leverage the experience from a source task or robot—real or virtual—to accelerate the learning process on a new task or robot. A commonly made but infrequently examined assumption is that incorporating experience from a source task or robot will be beneficial. For practical applications, inappropriate knowledge transfer can result in negative transfer or unsafe behaviour. In this work, inspired by a system gap metric from robust control theory, the nu-gap, we present a data-efficient algorithm for estimating the similarity between pairs of robot systems. In a multi-source inter-robot transfer learning setup, we show that this similarity metric allows us to predict relative transfer performance and thus informatively select experiences from a source robot before knowledge transfer. We demonstrate our approach with quadrotor experiments, where we transfer an inverse dynamics model from a real or virtual source quadrotor to enhance the tracking performance of a target quadrotor on arbitrary hand-drawn trajectories. We show that selecting experiences based on the proposed similarity metric effectively facilitates the learning of the target quadrotor, improving performance by 62\% compared to a poorly selected experience.

@INPROCEEDINGS{sorocky-icra20,
author = {Michael J. Sorocky and Siqi Zhou and Angela P. Schoellig},
title = {Experience Selection Using Dynamics Similarity for Efficient Multi-Source Transfer Learning Between Robots},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2020},
pages = {2739--2745},
urllink = {https://ieeexplore.ieee.org/document/9196744},
urlvideo = {https://youtu.be/8m3mOkljujM},
abstract = {In the robotics literature, different knowledge transfer approaches have been proposed to leverage the experience from a source task or robot—real or virtual—to accelerate the learning process on a new task or robot. A commonly made but infrequently examined assumption is that incorporating experience from a source task or robot will be beneficial. For practical applications, inappropriate knowledge transfer can result in negative transfer or unsafe behaviour. In this work, inspired by a system gap metric from robust control theory, the nu-gap, we present a data-efficient algorithm for estimating the similarity between pairs of robot systems. In a multi-source inter-robot transfer learning setup, we show that this similarity metric allows us to predict relative transfer performance and thus informatively select experiences from a source robot before knowledge transfer. We demonstrate our approach with quadrotor experiments, where we transfer an inverse dynamics model from a real or virtual source quadrotor to enhance the tracking performance of a target quadrotor on arbitrary hand-drawn trajectories. We show that selecting experiences based on the proposed similarity metric effectively facilitates the learning of the target quadrotor, improving performance by 62\% compared to a poorly selected experience.},
}

2019

Distributed iterative learning control for multi-agent systems
A. Hock and A. P. Schoellig
Autonomous Robots, vol. 43, iss. 8, p. 1989–2010, 2019.

The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle only has access to the information of its neighbors. The desired trajectory is only available to one (or few) vehicle(s). We present a distributed iterative learning control (ILC) approach where each vehicle learns from the experience of its own and its neighbors’ previous task repetitions and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove convergence of the learning scheme for any linear, causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function, which only depends on the tracking error derivative (D-type ILC). This extension provides more degrees of freedom in the ILC design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows the use of an additional consensus feedback controller to compensate for non-repetitive disturbances. Possible robustness extensions for the ILC algorithm are discussed, the so-called Q-filter and a Kalman filter for disturbance estimation. Finally, this is the first work to show distributed ILC in experiment. With a team of two quadrotors, the practical applicability of the proposed distributed multi-agent ILC approach is attested and the benefits of the theoretic extension are analyzed. In a second experimental setup with a team of four quadrotors, we evaluate the impact of different communication graph structures on the learning performance. The results indicate that there is a trade-off between fast learning convergence and formation synchronicity, especially during the first iterations.

@article{hock-auro19,
title = {Distributed iterative learning control for multi-agent systems},
author = {Andreas Hock and Angela P. Schoellig},
journal = {{Autonomous Robots}},
year = {2019},
volume = {43},
number = {8},
pages = {1989--2010},
doi = {10.1007/s10514-019-09845-4},
abstract = {The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle only has access to the information of its neighbors. The desired trajectory is only available to one (or few) vehicle(s). We present a distributed iterative learning control {(ILC)} approach where each vehicle learns from the experience of its own and its neighbors’ previous task repetitions and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove convergence of the learning scheme for any linear, causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function, which only depends on the tracking error derivative {(D-type ILC)}. This extension provides more degrees of freedom in the {ILC} design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows the use of an additional consensus feedback controller to compensate for non-repetitive disturbances. Possible robustness extensions for the {ILC} algorithm are discussed, the so-called {Q-filter} and a {Kalman} filter for disturbance estimation. Finally, this is the first work to show distributed {ILC} in experiment. With a team of two quadrotors, the practical applicability of the proposed distributed multi-agent {ILC} approach is attested and the benefits of the theoretic extension are analyzed. In a second experimental setup with a team of four quadrotors, we evaluate the impact of different communication graph structures on the learning performance. The results indicate that there is a trade-off between fast learning convergence and formation synchronicity, especially during the first iterations.}
}

A modular framework for motion planning using safe-by-design motion primitives
M. Vukosavljev, Z. Kroeze, A. P. Schoellig, and M. E. Broucke
IEEE Transactions on Robotics, vol. 35, iss. 5, p. 1233–1252, 2019.

In this paper, we present a modular framework for solving a motion planning problem among a group of robots. The proposed framework utilizes a finite set of low-level motion primitives to generate motions in a gridded workspace. The constraints on allowable sequences of motion primitives are formalized through a maneuver automaton. At the high level, a control policy determines which motion primitive is executed in each box of the gridded workspace. We state general conditions on motion primitives to obtain provably correct behavior so that a library of safe-by-design motion primitives can be designed. The overall framework yields a highly robust design by utilizing feedback strategies at both the low and high levels. We provide specific designs for motion primitives and control policies suitable for multirobot motion planning; the modularity of our approach enables one to independently customize the designs of each of these components. Our approach is experimentally validated on a group of quadrocopters.

@article{vukosavljev-tro19,
title = {A modular framework for motion planning using safe-by-design motion primitives},
author = {Marijan Vukosavljev and Zachary Kroeze and Angela P. Schoellig and Mireille E. Broucke},
journal = {{IEEE Transactions on Robotics}},
year = {2019},
volume = {35},
number = {5},
pages = {1233--1252},
doi = {10.1109/TRO.2019.2923335},
urlvideo = {http://tiny.cc/modular-3alg},
urllink = {https://arxiv.org/abs/1905.00495},
abstract = {In this paper, we present a modular framework for solving a motion planning problem among a group of robots. The proposed framework utilizes a finite set of low-level motion primitives to generate motions in a gridded workspace. The constraints on allowable sequences of motion primitives are formalized through a maneuver automaton. At the high level, a control policy determines which motion primitive is executed in each box of the gridded workspace. We state general conditions on motion primitives to obtain provably correct behavior so that a library of safe-by-design motion primitives can be designed. The overall framework yields a highly robust design by utilizing feedback strategies at both the low and high levels. We provide specific designs for motion primitives and control policies suitable for multirobot motion planning; the modularity of our approach enables one to independently customize the designs of each of these components. Our approach is experimentally validated on a group of quadrocopters.}
}

There’s no place like home: visual teach and repeat for emergency return of multirotor UAVs during GPS failure
M. Warren, M. Greeff, B. Patel, J. Collier, A. P. Schoellig, and T. D. Barfoot
IEEE Robotics and Automation Letters, vol. 4, iss. 1, p. 161–168, 2019.

Redundant navigation systems are critical for safe operation of UAVs in high-risk environments. Since most commercial UAVs almost wholly rely on GPS, jamming, interference and multi-pathing are real concerns that usually limit their operations to low-risk environments and VLOS. This paper presents a vision-based route-following system for the autonomous, safe return of UAVs under primary navigation failure such as GPS jamming. Using a Visual Teach and Repeat framework to build a visual map of the environment during an outbound flight, we show the autonomous return of the UAV by visually localising the live view to this map when a simulated GPS failure occurs, controlling the vehicle to follow the safe outbound path back to the launch point. Using gimbal-stabilised stereo vision alone, without reliance on external infrastructure or inertial sensing, Visual Odometry and localisation are achieved at altitudes of 5-25 m and flight speeds up to 55 km/h. We examine the performance of the visual localisation algorithm under a variety of conditions and also demonstrate closed-loop autonomy along a complicated 450 m path.

@article{warren-ral19,
title = {There's No Place Like Home: Visual Teach and Repeat for Emergency Return of Multirotor {UAV}s During {GPS} Failure},
author = {Michael Warren and Melissa Greeff and Bhavit Patel and Jack Collier and Angela P. Schoellig and Timothy D. Barfoot},
journal = {{IEEE Robotics and Automation Letters}},
year = {2019},
volume = {4},
number = {1},
pages = {161--168},
doi = {10.1109/LRA.2018.2883408},
urllink = {https://arxiv.org/abs/1809.05757},
urlvideo = {https://youtu.be/oJaQ4ZbvsFw},
abstract = {Redundant navigation systems are critical for safe operation of UAVs in high-risk environments. Since most commercial UAVs almost wholly rely on GPS, jamming, interference and multi-pathing are real concerns that usually limit their operations to low-risk environments and VLOS. This paper presents a vision-based route-following system for the autonomous, safe return of UAVs under primary navigation failure such as GPS jamming. Using a Visual Teach and Repeat framework to build a visual map of the environment during an outbound flight, we show the autonomous return of the UAV by visually localising the live view to this map when a simulated GPS failure occurs, controlling the vehicle to follow the safe outbound path back to the launch point. Using gimbal-stabilised stereo vision alone, without reliance on external infrastructure or inertial sensing, Visual Odometry and localisation are achieved at altitudes of 5-25 m and flight speeds up to 55 km/h. We examine the performance of the visual localisation algorithm under a variety of conditions and also demonstrate closed-loop autonomy along a complicated 450 m path.}
}

Provably robust learning-based approach for high-accuracy tracking control of Lagrangian systems
M. K. Helwa, A. Heins, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 4, iss. 2, p. 1587–1594, 2019.

Lagrangian systems represent a wide range of robotic systems, including manipulators, wheeled and legged robots, and quadrotors. Inverse dynamics control and feed-forward linearization techniques are typically used to convert the complex nonlinear dynamics of Lagrangian systems to a set of decoupled double integrators, and then a standard, outer-loop controller can be used to calculate the commanded acceleration for the linearized system. However, these methods typically depend on having a very accurate system model, which is often not available in practice. While this challenge has been addressed in the literature using different learning approaches, most of these approaches do not provide safety guarantees in terms of stability of the learning-based control system. In this paper, we provide a novel, learning-based control approach based on Gaussian processes (GPs) that ensures both stability of the closed-loop system and high-accuracy tracking. We use GPs to approximate the error between the commanded acceleration and the actual acceleration of the system, and then use the predicted mean and variance of the GP to calculate an upper bound on the uncertainty of the linearized model. This uncertainty bound is then used in a robust, outer-loop controller to ensure stability of the overall system. Moreover, we show that the tracking error converges to a ball with a radius that can be made arbitrarily small. Furthermore, we verify the effectiveness of our approach via simulations on a 2 degree-of-freedom (DOF) planar manipulator and experimentally on a 6 DOF industrial manipulator.

@article{helwa-ral19,
title = {Provably Robust Learning-Based Approach for High-Accuracy Tracking Control of {L}agrangian Systems},
author = {Mohamed K. Helwa and Adam Heins and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2019},
volume = {4},
number = {2},
pages = {1587--1594},
doi = {10.1109/LRA.2019.2896728},
urllink = {https://arxiv.org/pdf/1804.01031.pdf},
urlvideo = {https://youtu.be/CBmZ4F79gmI},
abstract = {Lagrangian systems represent a wide range of robotic systems, including manipulators, wheeled and legged robots, and quadrotors. Inverse dynamics control and feed-forward linearization techniques are typically used to convert the complex nonlinear dynamics of Lagrangian systems to a set of decoupled double integrators, and then a standard, outer-loop controller can be used to calculate the commanded acceleration for the linearized system. However, these methods typically depend on having a very accurate system model, which is often not available in practice. While this challenge has been addressed in the literature using different learning approaches, most of these approaches do not provide safety guarantees in terms of stability of the learning-based control system. In this paper, we provide a novel, learning-based control approach based on Gaussian processes (GPs) that ensures both stability of the closed-loop system and high-accuracy tracking. We use GPs to approximate the error between the commanded acceleration and the actual acceleration of the system, and then use the predicted mean and variance of the GP to calculate an upper bound on the uncertainty of the linearized model. This uncertainty bound is then used in a robust, outer-loop controller to ensure stability of the overall system. Moreover, we show that the tracking error converges to a ball with a radius that can be made arbitrarily small. Furthermore, we verify the effectiveness of our approach via simulations on a 2 degree-of-freedom (DOF) planar manipulator and experimentally on a 6 DOF industrial manipulator.}
}

Trajectory generation for multiagent point-to-point transitions via distributed model predictive control
C. E. Luis and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 4, iss. 2, p. 375–382, 2019.

This paper introduces a novel algorithm for multiagent offline trajectory generation based on distributed model predictive control (DMPC). By predicting future states and sharing this information with their neighbours, the agents are able to detect and avoid collisions while moving towards their goals. The proposed algorithm computes transition trajectories for dozens of vehicles in a few seconds. It reduces the computation time by more than 85% compared to previous optimization approaches based on sequential convex programming (SCP), while causing only a small impact on the optimality of the plans. We replaced the previous compatibility constraints in DMPC, which limit the motion of the agents in order to avoid collisions, by relaxing the collision constraints and enforcing them only when required. The approach was validated both through extensive simulations for a wide range of randomly generated transitions and with teams of up to 25 quadrotors flying in confined indoor spaces.

@article{luis-ral19,
title = {Trajectory Generation for Multiagent Point-To-Point Transitions via Distributed Model Predictive Control},
author = {Carlos E. Luis and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2019},
volume = {4},
number = {2},
pages = {375--382},
urllink = {https://arxiv.org/abs/1809.04230},
urlvideo = {https://youtu.be/ZN2e7h-kkpw},
abstract = {This paper introduces a novel algorithm for multiagent offline trajectory generation based on distributed model predictive control (DMPC). By predicting future states and sharing this information with their neighbours, the agents are able to detect and avoid collisions while moving towards their goals. The proposed algorithm computes transition trajectories for dozens of vehicles in a few seconds. It reduces the computation time by more than 85\% compared to previous optimization approaches based on sequential convex programming (SCP), while causing only a small impact on the optimality of the plans. We replaced the previous compatibility constraints in DMPC, which limit the motion of the agents in order to avoid collisions, by relaxing the collision constraints and enforcing them only when required. The approach was validated both through extensive simulations for a wide range of randomly generated transitions and with teams of up to 25 quadrotors flying in confined indoor spaces.}
}

Learn fast, forget slow: safe predictive control for systems with locally linear actuator dynamics performing repetitive tasks
C. D. McKinnon and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 4, iss. 2, p. 2180–2187, 2019.

We present a control method for improved repetitive path following for a ground vehicle that is geared towards long-term operation where the operating conditions can change over time and are initially unknown. We use weighted Bayesian Linear Regression to model the unknown actuator dynamics, and show how this simple model is more accurate in both its estimate of the mean behaviour and model uncertainty than Gaussian Process Regression and generalizes to novel operating conditions with little or no tuning. In addition, it allows us to use fast adaptation and long-term learning in one, unified framework, to adapt quickly to new operating conditions and learn repetitive model errors over time. This comes with the added benefit of lower computational cost, longer look-ahead, and easier optimization when the model is used in a robust, Model Predictive controller (MPC). In order to fully capitalize on the long prediction horizons that are possible with this new approach, we use Tube MPC to reduce predicted uncertainty growth. We demonstrate the effectiveness of our approach in experiment on a 900 kg ground robot showing results over 2.7 km of driving with both physical and artificial changes to the robot’s dynamics. All of our experiments are conducted using a stereo camera for localization.

@article{mckinnon-ral19,
title={Learn Fast, Forget Slow: Safe Predictive Control for Systems with Locally Linear Actuator Dynamics Performing Repetitive Tasks},
author={Christopher D. McKinnon and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2019},
volume = {4},
number = {2},
pages = {2180--2187},
urllink={https://arxiv.org/abs/1810.06681},
urlvideo={https://youtu.be/fLNMtYabuU4},
abstract = {We present a control method for improved repetitive path following for a ground vehicle that is geared towards long-term operation where the operating conditions can change over time and are initially unknown. We use weighted Bayesian Linear Regression to model the unknown actuator dynamics, and show how this simple model is more accurate in both its estimate of the mean behaviour and model uncertainty than Gaussian Process Regression and generalizes to novel operating conditions with little or no tuning. In addition, it allows us to use fast adaptation and long-term learning in one, unified framework, to adapt quickly to new operating conditions and learn repetitive model errors over time. This comes with the added benefit of lower computational cost, longer look-ahead, and easier optimization when the model is used in a robust, Model Predictive controller (MPC). In order to fully capitalize on the long prediction horizons that are possible with this new approach, we use Tube MPC to reduce predicted uncertainty growth. We demonstrate the effectiveness of our approach in experiment on a 900 kg ground robot showing results over 2.7 km of driving with both physical and artificial changes to the robot's dynamics. All of our experiments are conducted using a stereo camera for localization.}
}

Transfer learning for high-precision trajectory tracking through L1 adaptive feedback and iterative learning
K. Pereida, D. Kooijman, R. R. P. R. Duivenvoorden, and A. P. Schoellig
International Journal of Adaptive Control and Signal Processing, vol. 33, iss. 2, p. 388–409, 2019.

Robust and adaptive control strategies are needed when robots or automated systems are introduced to unknown and dynamic environments where they are required to cope with disturbances, unmodeled dynamics, and parametric uncertainties. In this paper, we demonstrate the capabilities of a combined L1 adaptive control and iterative learning control (ILC) framework to achieve high-precision trajectory tracking in the presence of unknown and changing disturbances. The L1 adaptive controller makes the system behave close to a reference model; however, it does not guarantee that perfect trajectory tracking is achieved, while ILC improves trajectory tracking performance based on previous iterations. The combined framework in this paper uses L1 adaptive control as an underlying controller that achieves a robust and repeatable behavior, while the ILC acts as a high-level adaptation scheme that mainly compensates for systematic tracking errors. We illustrate that this framework enables transfer learning between dynamically different systems, where learned experience of one system can be shown to be beneficial for another different system. Experimental results with two different quadrotors show the superior performance of the combined L1-ILC framework compared with approaches using ILC with an underlying proportional-derivative controller or proportional-integral-derivative controller. Results highlight that our L1-ILC framework can achieve high-precision trajectory tracking when unknown and changing disturbances are present and can achieve transfer of learned experience between dynamically different systems. Moreover, our approach is able to achieve precise trajectory tracking in the first attempt when the initial input is generated based on the reference model of the adaptive controller.

@ARTICLE{pereida-acsp18,
title={Transfer Learning for High-Precision Trajectory Tracking Through {L1} Adaptive Feedback and Iterative Learning},
author={Karime Pereida and Dave Kooijman and Rikky R. P. R. Duivenvoorden and Angela P. Schoellig},
journal={{International Journal of Adaptive Control and Signal Processing}},
year={2019},
volume = {33},
number = {2},
pages = {388--409},
doi={10.1002/acs.2887},
urllink={https://onlinelibrary.wiley.com/doi/abs/10.1002/acs.2887},
abstract={Robust and adaptive control strategies are needed when robots or automated systems are introduced to unknown and dynamic environments where they are required to cope with disturbances, unmodeled dynamics, and parametric uncertainties. In this paper, we demonstrate the capabilities of a combined L1 adaptive control and iterative learning control (ILC) framework to achieve high-precision trajectory tracking in the presence of unknown and changing disturbances. The L1 adaptive controller makes the system behave close to a reference model; however, it does not guarantee that perfect trajectory tracking is achieved, while ILC improves trajectory tracking performance based on previous iterations. The combined framework in this paper uses L1 adaptive control as an underlying controller that achieves a robust and repeatable behavior, while the ILC acts as a high-level adaptation scheme that mainly compensates for systematic tracking errors. We illustrate that this framework enables transfer learning between dynamically different systems, where learned experience of one system can be shown to be beneficial for another different system. Experimental results with two different quadrotors show the superior performance of the combined L1-ILC framework compared with approaches using ILC with an underlying proportional-derivative controller or proportional-integral-derivative controller. Results highlight that our L1-ILC framework can achieve high-precision trajectory tracking when unknown and changing disturbances are present and can achieve transfer of learned experience between dynamically different systems. Moreover, our approach is able to achieve precise trajectory tracking in the first attempt when the initial input is generated based on the reference model of the adaptive controller.},
}

Active training trajectory generation for inverse dynamics model learning with deep neural networks
S. Zhou and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2019, p. 1784–1790.

Inverse dynamics models have been used in robot control algorithms to realize a desired motion or to enhance a robot’s performance. As robot dynamics and their operating environments become more complex, there is a growing trend of learning uncertain or unknown dynamics from data. While techniques such as deep neural networks (DNNs) have been successfully used to learn inverse dynamics, it is usually implicitly assumed that the learning modules are trained on sufficiently rich datasets. In practical implementations, this assumption typically results in a trial-and-error training process, which can be inefficient or unsafe for robot applications. In this paper, we present an active trajectory generation framework that allows us to systematically design informative trajectories for training DNN inverse dynamics modules. In particular, we introduce an episode-based algorithm that integrates a spline trajectory optimization approach with DNN active learning for efficient data collection. We consider different DNN uncertainty estimation techniques and active learning heuristics in our work and illustrate the proposed active training trajectory generation approach in simulation. We show that the proposed active training trajectory generation outperforms ad hoc, intuitive training approaches.

@INPROCEEDINGS{zhou-cdc19,
author = {Siqi Zhou and Angela P. Schoellig},
title = {Active Training Trajectory Generation for Inverse Dynamics Model Learning with Deep Neural Networks},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2019},
pages = {1784--1790},
abstract = {Inverse dynamics models have been used in robot control algorithms to realize a desired motion or to enhance a robot’s performance. As robot dynamics and their operating environments become more complex, there is a growing trend of learning uncertain or unknown dynamics from data. While techniques such as deep neural networks (DNNs) have been successfully used to learn inverse dynamics, it is usually implicitly assumed that the learning modules are trained on sufficiently rich datasets. In practical implementations, this assumption typically results in a trial-and-error training process, which can be inefficient or unsafe for robot applications. In this paper, we present an active trajectory generation framework that allows us to systematically design informative trajectories for training DNN inverse dynamics modules. In particular, we introduce an episode-based algorithm that integrates a spline trajectory optimization approach with DNN active learning for efficient data collection. We consider different DNN uncertainty estimation techniques and active learning heuristics in our work and illustrate the proposed active training trajectory generation approach in simulation. We show that the proposed active training trajectory generation outperforms ad hoc, intuitive training approaches.},
}

Trajectory tracking for quadrotors with attitude control on $\mathcal{S}^2\times \mathcal{S}^1$
D. Kooijman, A. P. Schoellig, and D. J. Antunes
in Proc. of the European Control Conference (ECC), 2019, p. 4002–4009.

The control of a quadrotor is typically split into two subsequent problems: finding desired accelerations to control its position, and controlling its attitude and the total thrust to track these accelerations and to track a yaw angle reference. While the thrust vector, generating accelerations, and the angle of rotation about the thrust vector, determining the yaw angle, can be controlled independently, most attitude control strategies in the literature, relying on representations in terms of quaternions, rotation matrices or Euler angles, result in an unnecessary coupling between the control of the thrust vector and of the angle about this vector. This leads, for instance, to undesired position tracking errors due to yaw tracking errors. In this paper we propose to tackle the attitude control problem using an attitude representation in the Cartesian product of the 2-sphere and the 1-sphere, denoted by $\mathcal{S}^2\times \mathcal{S}^1$. We propose a non-linear tracking control law on $\mathcal{S}^2\times \mathcal{S}^1$ that decouples the control of the thrust vector and of the angle of rotation about the thrust vector, and guarantees almost global asymptotic stability. Simulation results highlight the advantages of the proposed approach over previous approaches.

@INPROCEEDINGS{kooijman-ecc19,
author = {Dave Kooijman and Angela P. Schoellig and Duarte J. Antunes},
title = {Trajectory Tracking for Quadrotors with Attitude Control on {$\mathcal{S}^2\times \mathcal{S}^1$}},
booktitle = {{Proc. of the European Control Conference (ECC)}},
year = {2019},
pages = {4002--4009},
abstract = {The control of a quadrotor is typically split into two subsequent problems: finding desired accelerations to control its position, and controlling its attitude and the total thrust to track these accelerations and to track a yaw angle reference. While the thrust vector, generating accelerations, and the angle of rotation about the thrust vector, determining the yaw angle, can be controlled independently, most attitude control strategies in the literature, relying on representations in terms of quaternions, rotation matrices or Euler angles, result in an unnecessary coupling between the control of the thrust vector and of the angle about this vector. This leads, for instance, to undesired position tracking errors due to yaw tracking errors. In this paper we propose to tackle the attitude control problem using an attitude representation in the Cartesian product of the 2-sphere and the 1-sphere, denoted by $\mathcal{S}^2\times \mathcal{S}^1$. We propose a non-linear tracking control law on $\mathcal{S}^2\times \mathcal{S}^1$ that decouples the control of the thrust vector and of the angle of rotation about the thrust vector, and guarantees almost global asymptotic stability. Simulation results highlight the advantages of the proposed approach over previous approaches.},
}

aUToTrack: a lightweight object detection and tracking system for the SAE AutoDrive challenge
K. Burnett, S. Samavi, S. Waslander, T. D. Barfoot, and A. P. Schoellig
in Proc. of the Conference on Computer and Robot Vision (CRV), 2019, p. 209–216. Best poster presentation award.

The University of Toronto is one of eight teams competing in the SAE AutoDrive Challenge – a competition to develop a self-driving car by 2020. After placing first at the Year 1 challenge [1], we are headed to MCity in June 2019 for the second challenge. There, we will interact with pedestrians, cyclists, and cars. For safe operation, it is critical to have an accurate estimate of the position of all objects surrounding the vehicle. The contributions of this work are twofold: First, we present a new object detection and tracking dataset (UofTPed50), which uses GPS to ground truth the position and velocity of a pedestrian. To our knowledge, a dataset of this type for pedestrians has not been shown in the literature before. Second, we present a lightweight object detection and tracking system (aUToTrack) that uses vision, LIDAR, and GPS/IMU positioning to achieve state-of-the-art performance on the KITTI Object Tracking benchmark. We show that aUToTrack accurately estimates the position and velocity of pedestrians, in real-time, using CPUs only. aUToTrack has been tested in closed-loop experiments on a real self-driving car (seen in Figure 1), and we demonstrate its performance on our dataset.

@INPROCEEDINGS{burnett-crv19,
author = {Keenan Burnett and Sepehr Samavi and Steven Waslander and Timothy D. Barfoot and Angela P. Schoellig},
title = {{aUToTrack:} A lightweight object detection and tracking system for the {SAE} {AutoDrive} Challenge},
booktitle = {{Proc. of the Conference on Computer and Robot Vision (CRV)}},
year = {2019},
pages = {209--216},
note = {Best poster presentation award},
urlvideo = {https://youtu.be/FLCgcgzNo80},
abstract = {The University of Toronto is one of eight teams competing in the SAE AutoDrive Challenge – a competition to develop a self-driving car by 2020. After placing first at the Year 1 challenge [1], we are headed to MCity in June 2019 for the second challenge. There, we will interact with pedestrians, cyclists, and cars. For safe operation, it is critical to have an accurate estimate of the position of all objects surrounding the vehicle. The contributions of this work are twofold: First, we present a new object detection and tracking dataset (UofTPed50), which uses GPS to ground truth the position and velocity of a pedestrian. To our knowledge, a dataset of this type for pedestrians has not been shown in the literature before. Second, we present a lightweight object detection and tracking system (aUToTrack) that uses vision, LIDAR, and GPS/IMU positioning to achieve state-of-the-art performance on the KITTI Object Tracking benchmark. We show that aUToTrack accurately estimates the position and velocity of pedestrians, in real-time, using CPUs only. aUToTrack has been tested in closed-loop experiments on a real self-driving car (seen in Figure 1), and we demonstrate its performance on our dataset.},
}

Learning probabilistic models for safe predictive control in unknown environments
C. D. McKinnon and A. P. Schoellig
in Proc. of the European Control Conference (ECC), 2019, p. 2472–2479.

Researchers rely increasingly on tools from machine learning to improve the performance of control algorithms on real world tasks and enable robots to operate for long periods of time without intervention. Many of these algorithms require a model for the dynamics of the robot. In particular, researchers designing methods for safe learning control often rely on an upper bound on model error to make guarantees about the worst-case closed-loop performance of their algorithm. There are different options for how to learn such a model of the robot dynamics. We study probabilistic models for use in the context of stochastic model predictive control. Two popular choices for learning the robot dynamics are Gaussian Process (GP) regression and various forms of local linear regression. In this paper, we present a study comparing GPs with a particular form of local linear regression for learning robot dynamics with the aim of guaranteeing safety when a robot operates in novel conditions. We show results based on experimental data from a 900 kg ground robot using vision for localisation.

@INPROCEEDINGS{mckinnon-ecc19,
author = {Christopher D. McKinnon and Angela P. Schoellig},
title = {Learning Probabilistic Models for Safe Predictive Control in Unknown Environments},
booktitle = {{Proc. of the European Control Conference (ECC)}},
year = {2019},
pages = {2472--2479},
abstract = {Researchers rely increasingly on tools from machine learning to improve the performance of control algorithms on real world tasks and enable robots to operate for long periods of time without intervention. Many of these algorithms require a model for the dynamics of the robot. In particular, researchers designing methods for safe learning control often rely on an upper bound on model error to make guarantees about the worst-case closed-loop performance of their algorithm. There are different options for how to learn such a model of the robot dynamics. We study probabilistic models for use in the context of stochastic model predictive control. Two popular choices for learning the robot dynamics are Gaussian Process (GP) regression and various forms of local linear regression. In this paper, we present a study comparing GPs with a particular form of local linear regression for learning robot dynamics with the aim of guaranteeing safety when a robot operates in novel conditions. We show results based on experimental data from a 900 kg ground robot using vision for localisation.},
}

Improved tag-based indoor localization of UAVs using extended Kalman filter
N. Kayhani, A. Heins, W. Zhao, M. Nahangi, B. McCabe, and A. P. Schoellig
in Proc. of the International Symposium on Automation and Robotics in Construction (ISARC), 2019, p. 624–631.

Indoor localization and navigation of unmanned aerial vehicles (UAVs) is a critical function for autonomous flight and automated visual inspection of construction elements in continuously changing construction environments. The key challenge for indoor localization and navigation is that the global positioning system (GPS) signal is not sufficiently reliable for state estimation. Having used the AprilTag markers for indoor localization, we showed a proof-of-concept that a camera-equipped UAV can be localized in a GPS-denied environment; however, the accuracy of the localization was inadequate in some situations. This study presents the implementation and performance assessment of an Extended Kalman Filter (EKF) for improving the estimation process of a previously developed indoor localization framework using AprilTag markers. An experimental setup is used to assess the performance of the updated estimation process in comparison to the previous state estimation method and the ground truth data. Results show that the state estimation and indoor localization are improved substantially using the EKF. To have a more robust estimation, we extract and fuse data from multiple tags. The framework can now be tested in real-world environments given that our continuous localization is sufficiently robust and reliable.

@INPROCEEDINGS{kayhani-isarc19,
author = {Navid Kayhani and Adam Heins and Wenda Zhao and Mohammad Nahangi and Brenda McCabe and Angela P. Schoellig},
title = {Improved Tag-based Indoor Localization of {UAV}s Using Extended {Kalman} Filter},
booktitle = {{Proc. of the International Symposium on Automation and Robotics in Construction (ISARC)}},
year = {2019},
pages = {624--631},
doi = {10.22260/ISARC2019/0083},
abstract = {Indoor localization and navigation of unmanned aerial vehicles (UAVs) is a critical function for autonomous flight and automated visual inspection of construction elements in continuously changing construction environments. The key challenge for indoor localization and navigation is that the global positioning system (GPS) signal is not sufficiently reliable for state estimation. Having used the AprilTag markers for indoor localization, we showed a proof-of-concept that a camera-equipped UAV can be localized in a GPS-denied environment; however, the accuracy of the localization was inadequate in some situations. This study presents the implementation and performance assessment of an Extended Kalman Filter (EKF) for improving the estimation process of a previously developed indoor localization framework using AprilTag markers. An experimental setup is used to assess the performance of the updated estimation process in comparison to the previous state estimation method and the ground truth data. Results show that the state estimation and indoor localization are improved substantially using the EKF. To have a more robust estimation, we extract and fuse data from multiple tags. The framework can now be tested in real-world environments given that our continuous localization is sufficiently robust and reliable.},
}

Knowledge transfer between robots with similar dynamics for high-accuracy impromptu trajectory tracking
S. Zhou, A. Sarabakha, E. Kayacan, M. K. Helwa, and A. P. Schoellig
in Proc. of the European Control Conference (ECC), 2019, p. 1–8.

In this paper, we propose an online learning approach that enables the inverse dynamics model learned for a source robot to be transferred to a target robot (e.g., from one quadrotor to another quadrotor with different mass or aerodynamic properties). The goal is to leverage knowledge from the source robot such that the target robot achieves high-accuracy trajectory tracking on arbitrary trajectories from the first attempt with minimal data recollection and training. Most existing approaches for multi-robot knowledge transfer are based on post-analysis of datasets collected from both robots. In this work, we study the feasibility of impromptu transfer of models across robots by learning an error prediction module online. In particular, we analytically derive the form of the mapping to be learned by the online module for exact tracking, propose an approach for characterizing similarity between robots, and use these results to analyze the stability of the overall system. The proposed approach is illustrated in simulation and verified experimentally on two different quadrotors performing impromptu trajectory tracking tasks, where the quadrotors are required to accurately track arbitrary hand-drawn trajectories from the first attempt.

@INPROCEEDINGS{zhou-ecc19,
author = {Siqi Zhou and Andriy Sarabakha and Erdal Kayacan and Mohamed K. Helwa and Angela P. Schoellig},
title = {Knowledge Transfer Between Robots with Similar Dynamics for High-Accuracy Impromptu Trajectory Tracking},
booktitle = {{Proc. of the European Control Conference (ECC)}},
year = {2019},
pages = {1--8},
urlvideo = {https://youtu.be/Pj_irRLHsD8},
abstract = {In this paper, we propose an online learning approach that enables the inverse dynamics model learned for a source robot to be transferred to a target robot (e.g., from one quadrotor to another quadrotor with different mass or aerodynamic properties). The goal is to leverage knowledge from the source robot such that the target robot achieves high-accuracy trajectory tracking on arbitrary trajectories from the first attempt with minimal data recollection and training. Most existing approaches for multi-robot knowledge transfer are based on post-analysis of datasets collected from both robots. In this work, we study the feasibility of impromptu transfer of models across robots by learning an error prediction module online. In particular, we analytically derive the form of the mapping to be learned by the online module for exact tracking, propose an approach for characterizing similarity between robots, and use these results to analyze the stability of the overall system. The proposed approach is illustrated in simulation and verified experimentally on two different quadrotors performing impromptu trajectory tracking tasks, where the quadrotors are required to accurately track arbitrary hand-drawn trajectories from the first attempt.},
}

Building a winning self-driving car in six months
K. Burnett, A. Schimpe, S. Samavi, M. Gridseth, C. W. Liu, Q. Li, Z. Kroeze, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2019, p. 9583–9589.

The SAE AutoDrive Challenge is a three-year competition to develop a Level 4 autonomous vehicle by 2020. The first set of challenges were held in April of 2018 in Yuma, Arizona. Our team (aUToronto/Zeus) placed first. In this paper, we describe our complete system architecture and specialized algorithms that enabled us to win. We show that it is possible to develop a vehicle with basic autonomy features in just six months relying on simple, robust algorithms. We do not make use of a prior map. Instead, we have developed a multi-sensor visual localization solution. All of our algorithms run in real-time using CPUs only. We also highlight the closed-loop performance of our system in detail in several experiments.

@INPROCEEDINGS{burnett-icra19,
author = {Keenan Burnett and Andreas Schimpe and Sepehr Samavi and Mona Gridseth and Chengzhi Winston Liu and Qiyang Li and Zachary Kroeze and Angela P. Schoellig},
title = {Building a Winning Self-Driving Car in Six Months},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2019},
pages = {9583--9589},
urlvideo = {http://tiny.cc/zeus-y1},
urllink = {https://arxiv.org/abs/1811.01273},
abstract = {The SAE AutoDrive Challenge is a three-year competition to develop a Level 4 autonomous vehicle by 2020. The first set of challenges were held in April of 2018 in Yuma, Arizona. Our team (aUToronto/Zeus) placed first. In this paper, we describe our complete system architecture and specialized algorithms that enabled us to win. We show that it is possible to develop a vehicle with basic autonomy features in just six months relying on simple, robust algorithms. We do not make use of a prior map. Instead, we have developed a multi-sensor visual localization solution. All of our algorithms run in real-time using CPUs only. We also highlight the closed-loop performance of our system in detail in several experiments.},
}

Point me in the right direction: improving visual localization on UAVs with active gimballed camera pointing
B. Patel, M. Warren, and A. P. Schoellig
in Proc. of the Conference on Computer and Robot Vision (CRV), 2019, p. 105–112. Best paper award, robot vision.

Robust autonomous navigation of multirotor UAVs in GPS-denied environments is critical to enable their safe operation in many applications such as surveillance and reconnaissance, inspection, and delivery services. In this paper, we use a gimballed stereo camera for localization and demonstrate how the localization performance and robustness can be improved by actively controlling the camera’s viewpoint. For an autonomous route-following task based on a recorded map, multiple gimbal pointing strategies are compared: off-the-shelf passive stabilization, active stabilization, minimization of viewpoint orientation error, and pointing the camera optical axis at the centroid of previously observed landmarks. We demonstrate improved localization performance using an active gimbal-stabilized camera in multiple outdoor flight experiments on routes up to 315 m, and with 6-25 m altitude variations. Scenarios are shown where a static camera frequently fails to localize while a gimballed camera attenuates perspective errors to retain localization. We demonstrate that our orientation matching and centroid pointing strategies provide the best performance; enabling localization despite increasing velocity discrepancies between the map-generation flight and the live flight from 3-9 m/s, and 8 m path offsets.

@INPROCEEDINGS{patel-crv19,
author = {Bhavit Patel and Michael Warren and Angela P. Schoellig},
title = {Point Me In The Right Direction: Improving Visual Localization on {UAV}s with Active Gimballed Camera Pointing},
booktitle = {{Proc. of the Conference on Computer and Robot Vision (CRV)}},
year = {2019},
pages = {105--112},
note = {Best paper award, robot vision},
abstract = {Robust autonomous navigation of multirotor UAVs in GPS-denied environments is critical to enable their safe operation in many applications such as surveillance and reconnaissance, inspection, and delivery services. In this paper, we use a gimballed stereo camera for localization and demonstrate how the localization performance and robustness can be improved by actively controlling the camera’s viewpoint. For an autonomous route-following task based on a recorded map, multiple gimbal pointing strategies are compared: off-the-shelf passive stabilization, active stabilization, minimization of viewpoint orientation error, and pointing the camera optical axis at the centroid of previously observed landmarks. We demonstrate improved localization performance using an active gimbal-stabilized camera in multiple outdoor flight experiments on routes up to 315 m, and with 6-25 m altitude variations. Scenarios are shown where a static camera frequently fails to localize while a gimballed camera attenuates perspective errors to retain localization. We demonstrate that our orientation matching and centroid pointing strategies provide the best performance; enabling localization despite increasing velocity discrepancies between the map-generation flight and the live flight from 3-9 m/s, and 8 m path offsets.},
}

Fast and in sync: periodic swarm patterns for quadrotors
X. Du, C. E. Luis, M. Vukosavljev, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2019, p. 9143–9149.

This paper aims to design quadrotor swarm performances, where the swarm acts as an integrated, coordinated unit embodying moving and deforming objects. We divide the task of creating a choreography into three basic steps: designing swarm motion primitives, transitioning between those movements, and synchronizing the motion of the drones. The result is a flexible framework for designing choreographies comprised of a wide variety of motions. The motion primitives can be intuitively designed using few parameters, providing a rich library for choreography design. Moreover, we combine and adapt existing goal assignment and trajectory generation algorithms to maximize the smoothness of the transitions between motion primitives. Finally, we propose a correction algorithm to compensate for motion delays and synchronize the motion of the drones to a desired periodic motion pattern. The proposed methodology was validated experimentally by generating and executing choreographies on a swarm of 25 quadrotors.

@INPROCEEDINGS{du-icra19,
author = {Xintong Du and Carlos E. Luis and Marijan Vukosavljev and Angela P. Schoellig},
title = {Fast and in sync: periodic swarm patterns for quadrotors},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2019},
pages = {9143--9149},
urllink = {https://arxiv.org/abs/1810.03572},
urlvideo = {https://drive.google.com/file/d/1D9CTpYSjdFHNjiYsFWeOI3--Ve1-3Bfp/view},
abstract = {This paper aims to design quadrotor swarm performances, where the swarm acts as an integrated, coordinated unit embodying moving and deforming objects. We divide the task of creating a choreography into three basic steps: designing swarm motion primitives, transitioning between those movements, and synchronizing the motion of the drones. The result is a flexible framework for designing choreographies comprised of a wide variety of motions. The motion primitives can be intuitively designed using few parameters, providing a rich library for choreography design. Moreover, we combine and adapt existing goal assignment and trajectory generation algorithms to maximize the smoothness of the transitions between motion primitives. Finally, we propose a correction algorithm to compensate for motion delays and synchronize the motion of the drones to a desired periodic motion pattern. The proposed methodology was validated experimentally by generating and executing choreographies on a swarm of 25 quadrotors.},
}

Hierarchically consistent motion primitives for quadrotor coordination
M. Vukosavljev, A. P. Schoellig, and M. E. Broucke
Technical Report, arXiv, 2019.

We present a hierarchical framework for motion planning of a large collection of agents. The proposed framework starts from low level motion primitives over a gridded workspace and provides a set of rules for constructing higher level motion primitives. Our hierarchical approach is highly scalable and robust making it an ideal tool for planning for multi-agent systems. Results are demonstrated experimentally on a collection of quadrotors that must navigate a cluttered environment while maintaining a formation.

@TECHREPORT{vukosavljev-report19,
author = {Marijan Vukosavljev and Angela P. Schoellig and Mireille E. Broucke},
title = {Hierarchically consistent motion primitives for quadrotor coordination},
year = {2019},
institution = {arXiv},
urlvideo = {http://tiny.cc/hier-moprim},
urllink = {https://arxiv.org/abs/1809.05757},
abstract = {We present a hierarchical framework for motion planning of a large collection of agents. The proposed framework starts from low level motion primitives over a gridded workspace and provides a set of rules for constructing higher level motion primitives. Our hierarchical approach is highly scalable and robust making it an ideal tool for planning for multi-agent systems. Results are demonstrated experimentally on a collection of quadrotors that must navigate a cluttered environment while maintaining a formation.},
}

Robust adaptive model predictive control for high-accuracy trajectory tracking in changing conditions
K. Pereida and A. P. Schoellig
Short Paper and Presentation, in Proc. of the Algorithms and Architectures for Learning in-the-Loop Systems in Autonomous Flight Workshop at IEEE International Conference on Robotics and Automation (ICRA), 2019.

Robots are being deployed in unknown and dynamic environments where they are required to handle disturbances, unmodeled dynamics, and parametric uncertainties. Sophisticated control strategies can guarantee high performance in these changing environments. In this work, we propose a novel robust adaptive model predictive controller that combines robust model predictive control (MPC) with an underlying $\mathcal{L}_1$ adaptive controller to improve trajectory tracking of a system subject to unknown and changing disturbances. The $\mathcal{L}_1$ adaptive controller forces the system to behave close to a specified linear reference model. The controlled system may still deviate from the reference model, but this deviation is shown to be upper bounded. An outer-loop robust MPC uses this upper bound, the linear reference model and system constraints to calculate the optimal reference input that minimizes the given cost function. The proposed robust adaptive MPC is able to achieve high-accuracy trajectory tracking even in the presence of unknown disturbances. We show preliminary experimental results of an adaptive MPC on a quadrotor. The adaptive MPC has a lower trajectory tracking error compared to a predictive, non-adaptive approach, even when wind disturbances are applied.

@MISC{pereida-icra19a,
author = {Karime Pereida and Angela P. Schoellig},
title = {Robust Adaptive Model Predictive Control for High-Accuracy Trajectory Tracking in Changing Conditions},
year = {2019},
howpublished = {Short Paper and Presentation, in Proc. of the Algorithms and Architectures for Learning in-the-Loop Systems in Autonomous Flight Workshop at IEEE International Conference on Robotics and Automation (ICRA)},
urlvideo = {https://youtu.be/xuyLst5mkEE},
abstract = {Robots are being deployed in unknown and dynamic environments where they are required to handle disturbances, unmodeled dynamics, and parametric uncertainties. Sophisticated control strategies can guarantee high performance in these changing environments. In this work, we propose a novel robust adaptive model predictive controller that combines robust model predictive control (MPC) with an underlying $\mathcal{L}_1$ adaptive controller to improve trajectory tracking of a system subject to unknown and changing disturbances. The $\mathcal{L}_1$ adaptive controller forces the system to behave close to a specified linear reference model. The controlled system may still deviate from the reference model, but this deviation is shown to be upper bounded. An outer-loop robust MPC uses this upper bound, the linear reference model and system constraints to calculate the optimal reference input that minimizes the given cost function. The proposed robust adaptive MPC is able to achieve high-accuracy trajectory tracking even in the presence of unknown disturbances. We show preliminary experimental results of an adaptive MPC on a quadrotor. The adaptive MPC has a lower trajectory tracking error compared to a predictive, non-adaptive approach, even when wind disturbances are applied.},
}

Diversity in robotics: from diverse teams to diverse impact
K. Pereida and M. Greeff
Short Paper and Presentation, in Proc. of the Debates on the Future of Robotics Research Workshop at the IEEE International Conference on Robotics and Automation (ICRA), 2019.

Roboticists develop technologies that are used by people worldwide, consequently impacting many aspects of human life – from healthcare and law enforcement to autonomous transportation. The development of these technologies involves design and innovation – both of which rely on personal choice and experience. Hence, personal biases, whether intentionally or unintentionally, tend to be embedded in the final product designs. Homogeneous teams of designers and engineers are more likely to develop products that overlook the needs of a given part of the population – even missing gaps for potential technological innovation. In this talk we emphasize some of the negative impacts a lack of diversity has on robotic innovation by highlighting examples of embedded biases within certain technologies and providing some evidence that this is linked to a lack of diverse teams. If our aim as a community is to increase research capacity, creativity, and broaden the impact of robotics, making it a more diverse field must be a goal.

@MISC{pereida-icra19b,
author = {Karime Pereida and Melissa Greeff},
title = {Diversity in Robotics: From Diverse Teams to Diverse Impact},
year = {2019},
howpublished = {Short Paper and Presentation, in Proc. of the Debates on the Future of Robotics Research Workshop at the IEEE International Conference on Robotics and Automation (ICRA)},
abstract = {Roboticists develop technologies that are used by people worldwide, consequently impacting many aspects of human life -- from healthcare and law enforcement to autonomous transportation. The development of these technologies involves design and innovation -- both of which rely on personal choice and experience. Hence, personal biases, whether intentionally or unintentionally, tend to be embedded in the final product designs. Homogeneous teams of designers and engineers are more likely to develop products that overlook the needs of a given part of the population -- even missing gaps for potential technological innovation. In this talk we emphasize some of the negative impacts a lack of diversity has on robotic innovation by highlighting examples of embedded biases within certain technologies and providing some evidence that this is linked to a lack of diverse teams. If our aim as a community is to increase research capacity, creativity, and broaden the impact of robotics, making it a more diverse field must be a goal.},
}

Data-efficient multi-robot, multi-task transfer learning for trajectory tracking
K. Pereida, M. K. Helwa, and A. P. Schoellig
Abstract and Poster, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA), 2019.

Learning can significantly improve the performance of robots in uncertain and changing environments; however, typical learning approaches need to start a new learning process for each new task or robot as transferring knowledge is cumbersome or not possible. In this work, we introduce a multi-robot, multi-task transfer learning framework that allows a system to complete a task by learning from a few demonstrations of another task executed on a different system. We focus on the trajectory tracking problem where each trajectory represents a different task. The proposed learning control architecture has two stages: (i) a multi-robot transfer learning framework that combines $\mathcal{L}_1$ adaptive control and iterative learning control, where the key idea is that the adaptive controller forces dynamically different systems to behave as a specified reference model; and (ii) a multi-task transfer learning framework that uses theoretical control results (e.g., the concept of vector relative degree) to learn a map from desired trajectories to the inputs that make the system track these trajectories with high accuracy. This map is used to calculate the inputs for a new, unseen trajectory. We conduct experiments on two different quadrotor platforms and six different trajectories where we show that using information from tracking a single trajectory learned by one quadrotor reduces, on average, the first-iteration tracking error on another quadrotor by 74%.

@MISC{pereida-icra19c,
author = {Karime Pereida and Mohamed K. Helwa and Angela P. Schoellig},
title = {Data-Efficient Multi-Robot, Multi-Task Transfer Learning for Trajectory Tracking},
year = {2019},
howpublished = {Abstract and Poster, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA)},
abstract = {Learning can significantly improve the performance of robots in uncertain and changing environments; however, typical learning approaches need to start a new learning process for each new task or robot as transferring knowledge is cumbersome or not possible. In this work, we introduce a multi-robot, multi-task transfer learning framework that allows a system to complete a task by learning from a few demonstrations of another task executed on a different system. We focus on the trajectory tracking problem where each trajectory represents a different task. The proposed learning control architecture has two stages: (i) a \emph{multi-robot} transfer learning framework that combines $\mathcal{L}_1$ adaptive control and iterative learning control, where the key idea is that the adaptive controller forces dynamically different systems to behave as a specified reference model; and (ii) a \emph{multi-task} transfer learning framework that uses theoretical control results (e.g., the concept of vector relative degree) to learn a map from desired trajectories to the inputs that make the system track these trajectories with high accuracy. This map is used to calculate the inputs for a new, unseen trajectory. We conduct experiments on two different quadrotor platforms and six different trajectories where we show that using information from tracking a single trajectory learned by one quadrotor reduces, on average, the first-iteration tracking error on another quadrotor by 74\%.},
}

Towards scalable online trajectory generation for multi-robot systems
C. E. Luis, M. Vukosavljev, and A. P. Schoellig
Abstract and Poster, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA), 2019.

We present a distributed model predictive control (DMPC) algorithm to generate trajectories in real-time for multiple robots, taking into account their trajectory tracking dynamics and actuation limits. An event-triggered replanning strategy is proposed to account for disturbances in the system. We adopted the on-demand collision avoidance method presented in previous work to efficiently compute non-colliding trajectories in transition tasks. Preliminary results in simulation show a higher success rate than previous online methods based on Buffered Voronoi Cells (BVC), while maintaining computational tractability for real-time operation.

@MISC{luis-icra19,
author = {Carlos E. Luis and Marijan Vukosavljev and Angela P. Schoellig},
title = {Towards Scalable Online Trajectory Generation for Multi-robot Systems},
year = {2019},
howpublished = {Abstract and Poster, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA)},
abstract = {We present a distributed model predictive control (DMPC) algorithm to generate trajectories in real-time for multiple robots, taking into account their trajectory tracking dynamics and actuation limits. An event-triggered replanning strategy is proposed to account for disturbances in the system. We adopted the on-demand collision avoidance method presented in previous work to efficiently compute non-colliding trajectories in transition tasks. Preliminary results in simulation show a higher success rate than previous online methods based on Buffered Voronoi Cells (BVC), while maintaining computational tractability for real-time operation.},
}

Knowledge transfer between robots with online learning for enhancing robot performance in impromptu trajectory tracking
S. Zhou, A. Sarabakha, E. Kayacan, M. K. Helwa, and A. P. Schoellig
Abstract and Presentation, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA), 2019.

As robot dynamics become more complex, learning from data is emerging as an alternative for obtaining accurate dynamic models to assist control system designs or to enhance robot performance. Though effective, common model learning techniques rely on rich datasets collected from the robots, and the learned experience is often platform-specific. In this work, we propose an online learning approach for transferring deep neural network (DNN) inverse dynamics models across two robots and analyze the role of dynamic similarity in the transfer problem. We demonstrate our proposed knowledge transfer approach with two different quadrotors on impromptu trajectory tracking tasks, in which the quadrotors are required to track arbitrary hand-drawn trajectories accurately from the first attempt. With this work, we illustrate that (i) we can relate the transferability of DNN inverse models to the robot dynamic properties, and (ii) when the transfer is feasible, we can significantly reduce data recollections that would be otherwise costly or risky for robot applications. Given a heterogeneous robot team, we envision having to train only one of the agents to allow the whole team to achieve higher performance.

@MISC{zhou-icra19,
author = {Siqi Zhou and Andriy Sarabakha and Erdal Kayacan and Mohamed K. Helwa and Angela P. Schoellig},
title = {Knowledge Transfer Between Robots with Online Learning for Enhancing Robot Performance in Impromptu Trajectory Tracking},
year = {2019},
howpublished = {Abstract and Presentation, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA)},
abstract = {As robot dynamics become more complex, learning from data is emerging as an alternative for obtaining accurate dynamic models to assist control system designs or to enhance robot performance. Though effective, common model learning techniques rely on rich datasets collected from the robots, and the learned experience is often platform-specific. In this work, we propose an online learning approach for transferring deep neural network (DNN) inverse dynamics models across two robots and analyze the role of dynamic similarity in the transfer problem. We demonstrate our proposed knowledge transfer approach with two different quadrotors on impromptu trajectory tracking tasks, in which the quadrotors are required to track arbitrary hand-drawn trajectories accurately from the first attempt. With this work, we illustrate that (i) we can relate the transferability of DNN inverse models to the robot dynamic properties, and (ii) when the transfer is feasible, we can significantly reduce data recollections that would be otherwise costly or risky for robot applications. Given a heterogeneous robot team, we envision having to train only one of the agents to allow the whole team to achieve higher performance.}
}

2018

Data-efficient multi-robot, multi-task transfer learning for trajectory tracking
K. Pereida, M. K. Helwa, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 3, iss. 2, p. 1260–1267, 2018.

Transfer learning has the potential to reduce the burden of data collection and to decrease the unavoidable risks of the training phase. In this paper, we introduce a multi-robot, multi-task transfer learning framework that allows a system to complete a task by learning from a few demonstrations of another task executed on another system. We focus on the trajectory tracking problem where each trajectory represents a different task, since many robotic tasks can be described as a trajectory tracking problem. The proposed multi-robot transfer learning framework is based on a combined L1 adaptive control and iterative learning control approach. The key idea is that the adaptive controller forces dynamically different systems to behave as a specified reference model. The proposed multi-task transfer learning framework uses theoretical control results (e.g., the concept of vector relative degree) to learn a map from desired trajectories to the inputs that make the system track these trajectories with high accuracy. This map is used to calculate the inputs for a new, unseen trajectory. Experimental results using two different quadrotor platforms and six different trajectories show that, on average, the proposed framework reduces the first-iteration tracking error by 74% when information from tracking a different, single trajectory on a different quadrotor is utilized.

@article{pereida-ral18,
title = {Data-Efficient Multi-Robot, Multi-Task Transfer Learning for Trajectory Tracking},
author = {Karime Pereida and Mohamed K. Helwa and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2018},
volume = {3},
number = {2},
doi = {10.1109/LRA.2018.2795653},
pages = {1260--1267},
urllink = {http://ieeexplore.ieee.org/abstract/document/8264705/},
abstract = {Transfer learning has the potential to reduce the burden of data collection and to decrease the unavoidable risks of the training phase. In this paper, we introduce a multi-robot, multi-task transfer learning framework that allows a system to complete a task by learning from a few demonstrations of another task executed on another system. We focus on the trajectory tracking problem where each trajectory represents a different task, since many robotic tasks can be described as a trajectory tracking problem. The proposed multi-robot transfer learning framework is based on a combined L1 adaptive control and iterative learning control approach. The key idea is that the adaptive controller forces dynamically different systems to behave as a specified reference model. The proposed multi-task transfer learning framework uses theoretical control results (e.g., the concept of vector relative degree) to learn a map from desired trajectories to the inputs that make the system track these trajectories with high accuracy. This map is used to calculate the inputs for a new, unseen trajectory. Experimental results using two different quadrotor platforms and six different trajectories show that, on average, the proposed framework reduces the first-iteration tracking error by 74\% when information from tracking a different, single trajectory on a different quadrotor is utilized.},
}

The regular indefinite linear quadratic optimal control problem: stabilizable case
M. Vukosavljev, A. P. Schoellig, and M. E. Broucke
SIAM Journal on Control and Optimization, vol. 56, iss. 1, pp. 496–516, 2018.

This paper addresses an open problem in the area of linear quadratic optimal control. We consider the regular, infinite-horizon, stability-modulo-a-subspace, indefinite linear quadratic problem under the assumption that the dynamics are stabilizable. Our result generalizes previous works dealing with the same problem in the case of controllable dynamics. We explicitly characterize the unique solution of the algebraic Riccati equation that gives the optimal cost and optimal feedback control, as well as necessary and sufficient conditions for the existence of optimal controls.

@article{vukosavljev-sicon18,
title = {The regular indefinite linear quadratic optimal control problem: stabilizable case},
author = {Marijan Vukosavljev and Angela P. Schoellig and Mireille E. Broucke},
journal = {{SIAM Journal on Control and Optimization}},
year = {2018},
volume = {56},
number = {1},
pages = {496--516},
doi = {10.1137/17M1143137},
urllink = {https://arxiv.org/abs/1905.00509},
abstract = {This paper addresses an open problem in the area of linear quadratic optimal control. We consider the regular, infinite-horizon, stability-modulo-a-subspace, indefinite linear quadratic problem under the assumption that the dynamics are stabilizable. Our result generalizes previous works dealing with the same problem in the case of controllable dynamics. We explicitly characterize the unique solution of the algebraic Riccati equation that gives the optimal cost and optimal feedback control, as well as necessary and sufficient conditions for the existence of optimal controls.},
}

On the construction of safe controllable regions for affine systems with applications to robotics
M. K. Helwa and A. P. Schoellig
Automatica, vol. 98, p. 323–330, 2018.

This paper studies the problem of constructing in-block controllable (IBC) regions for affine systems. That is, we are concerned with constructing regions in the state space of affine systems such that all the states in the interior of the region are mutually accessible within the region’s interior by applying uniformly bounded inputs. We first show that existing results for checking in-block controllability on given polytopic regions cannot be easily extended to address the question of constructing IBC regions. We then explore the geometry of the problem to provide a computationally efficient algorithm for constructing IBC regions. We also prove the soundness of the algorithm. We then use the proposed algorithm to construct safe speed profiles for robotic systems. As a case study, we present several experimental results on unmanned aerial vehicles (UAVs) to verify the effectiveness of the proposed algorithm; these results include using the proposed algorithm for real-time collision avoidance for UAVs.

@article{helwa-auto18,
title = {On the construction of safe controllable regions for affine systems with applications to robotics},
author = {Mohamed K. Helwa and Angela P. Schoellig},
journal = {{Automatica}},
volume = {98},
pages = {323--330},
doi = {10.1016/j.automatica.2018.09.019},
year = {2018},
urllink = {http://www.sciencedirect.com/science/article/pii/S0005109818304448},
abstract = {This paper studies the problem of constructing in-block controllable (IBC) regions for affine systems. That is, we are concerned with constructing regions in the state space of affine systems such that all the states in the interior of the region are mutually accessible within the region’s interior by applying uniformly bounded inputs. We first show that existing results for checking in-block controllability on given polytopic regions cannot be easily extended to address the question of constructing IBC regions. We then explore the geometry of the problem to provide a computationally efficient algorithm for constructing IBC regions. We also prove the soundness of the algorithm. We then use the proposed algorithm to construct safe speed profiles for robotic systems. As a case study, we present several experimental results on unmanned aerial vehicles (UAVs) to verify the effectiveness of the proposed algorithm; these results include using the proposed algorithm for real-time collision avoidance for UAVs.},
}

An inversion-based learning approach for improving impromptu trajectory tracking of robots with non-minimum phase dynamics
S. Zhou, M. K. Helwa, and A. P. Schoellig
IEEE Robotics and Automation Letters, vol. 3, iss. 3, p. 1663–1670, 2018.

This letter presents a learning-based approach for impromptu trajectory tracking for non-minimum phase systems, i.e., systems with unstable inverse dynamics. Inversion-based feedforward approaches are commonly used for improving tracking performance; however, these approaches are not directly applicable to non-minimum phase systems due to their inherent instability. In order to resolve the instability issue, existing methods have assumed that the system model is known and used preactuation or inverse approximation techniques. In this work, we propose an approach for learning a stable, approximate inverse of a non-minimum phase baseline system directly from its input–output data. Through theoretical discussions, simulations, and experiments on two different platforms, we show the stability of our proposed approach and its effectiveness for high-accuracy, impromptu tracking. Our approach also shows that including more information in the training, as is commonly assumed to be useful, does not lead to better performance but may trigger instability and impact the effectiveness of the overall approach.

@article{zhou-ral18,
title = {An Inversion-Based Learning Approach for Improving Impromptu Trajectory Tracking of Robots With Non-Minimum Phase Dynamics},
author = {SiQi Zhou and Mohamed K. Helwa and Angela P. Schoellig},
journal = {{IEEE Robotics and Automation Letters}},
year = {2018},
volume = {3},
number = {3},
doi = {10.1109/LRA.2018.2801471},
pages = {1663--1670},
urllink = {https://arxiv.org/pdf/1709.04407.pdf},
abstract = {This letter presents a learning-based approach for impromptu trajectory tracking for non-minimum phase systems, i.e., systems with unstable inverse dynamics. Inversion-based feedforward approaches are commonly used for improving tracking performance; however, these approaches are not directly applicable to non-minimum phase systems due to their inherent instability. In order to resolve the instability issue, existing methods have assumed that the system model is known and used preactuation or inverse approximation techniques. In this work, we propose an approach for learning a stable, approximate inverse of a non-minimum phase baseline system directly from its input–output data. Through theoretical discussions, simulations, and experiments on two different platforms, we show the stability of our proposed approach and its effectiveness for high-accuracy, impromptu tracking. Our approach also shows that including more information in the training, as is commonly assumed to be useful, does not lead to better performance but may trigger instability and impact the effectiveness of the overall approach.},
}

Level-headed: gimbal-stabilised visual teach & repeat for improved high-speed path-following
M. Warren, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2018.

Operating in rough, unstructured terrain is an essential requirement for any truly field-deployable ground robot. Search-and-rescue, border patrol and agricultural work all require operation in environments with little established infrastructure for easy navigation. This presents challenges for sensor-based navigation such as vision, where erratic motion and feature-poor environments test feature tracking and hinder the performance of repeat matching of point features. For vision-based route-following methods such as Visual Teach and Repeat (VT&R), maintaining similar visual perspective of salient point features is critical for reliable odometry and accurate localisation over long periods. In this paper, we investigate a potential solution to these challenges by integrating a gimballed camera with VT&R on a Grizzly Robotic Utility Vehicle (RUV) for testing at high speeds and in visually challenging environments. We investigate the benefits and drawbacks of using an actively gimballed camera to attenuate image motion and control viewpoint. We compare the use of a gimballed camera to our traditional fixed stereo configuration and demonstrate cases of improved performance in Visual Odometry (VO), localisation, and path following in several sets of outdoor experiments.

@INPROCEEDINGS{warren-icra18,
author = {Michael Warren and Angela P. Schoellig and Tim D. Barfoot},
title = {Level-headed: gimbal-stabilised visual teach \& repeat for improved high-speed path-following},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year={2018},
abstract = {Operating in rough, unstructured terrain is an essential requirement for any truly field-deployable ground robot. Search-and-rescue, border patrol and agricultural work all require operation in environments with little established infrastructure for easy navigation. This presents challenges for sensor-based navigation such as vision, where erratic motion and feature-poor environments test feature tracking and hinder the performance of repeat matching of point features. For vision-based route-following methods such as Visual Teach and Repeat (VT\&R), maintaining similar visual perspective of salient point features is critical for reliable odometry and accurate localisation over long periods. In this paper, we investigate a potential solution to these challenges by integrating a gimballed camera with VT\&R on a Grizzly Robotic Utility Vehicle (RUV) for testing at high speeds and in visually challenging environments. We investigate the benefits and drawbacks of using an actively gimballed camera to attenuate image motion and control viewpoint. We compare the use of a gimballed camera to our traditional fixed stereo configuration and demonstrate cases of improved performance in Visual Odometry (VO), localisation, and path following in several sets of outdoor experiments.},
}

Pre- and post-blast rock block size analysis using UAV-Lidar based data and discrete fracture network
F. Medinac, T. Bamford, K. Esmaeili, and A. P. Schoellig
in Proc. of the 2nd International Discrete Fracture Network Engineering (DFNE), 2018.

@INPROCEEDINGS{medinac-dfne18,
author = {Filip Medinac and Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title = {Pre- and post-blast rock block size analysis using {UAV-Lidar} based data and discrete fracture network},
booktitle = {{Proc. of the 2nd International Discrete Fracture Network Engineering (DFNE)}},
year = {2018},
abstract = {Drilling and blasting is one of the key processes in open pit mining, required to reduce in-situ rock block size to rock fragments that can be handled by mine equipment. It is a significant cost driver of any mining operation which can influence the downstream mining processes. In-situ rock block size influences the muck pile size distribution after blast, and the amount of drilling and explosive required to achieve a desired distribution. Thus, continuous measurement of pre- and post-blast rock block size distribution is essential for the optimization of the rock fragmentation process. This paper presents the results of a case study in an open pit mine where an Unmanned Aerial Vehicle (UAV) was used for mapping of the pit walls before blast. Pit wall mapping and aerial data was used as input to generate a 3D Discrete Fracture Network (DFN) model of the rock mass and to estimate the in-situ block size distribution. Data collected by the UAV was also used to estimate the post-blast rock fragment size distribution. The knowledge of in-situ and blasted rock size distributions can be related to assess blast performance. This knowledge will provide feedback to production engineers to adjust the blast design.},
}

Adaptive model predictive control for high-accuracy trajectory tracking in changing conditions
K. Pereida and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, p. 7831–7837.

Robots and automated systems are increasingly being introduced to unknown and dynamic environments where they are required to handle disturbances, unmodeled dynamics, and parametric uncertainties. Robust and adaptive control strategies are required to achieve high performance in these dynamic environments. In this paper, we propose a novel adaptive model predictive controller that combines model predictive control (MPC) with an underlying L_1 adaptive controller to improve trajectory tracking of a system subject to unknown and changing disturbances. The L_1 adaptive controller forces the system to behave in a predefined way, as specified by a reference model. A higher-level model predictive controller then uses this reference model to calculate the optimal reference input based on a cost function, while taking into account input and state constraints. We focus on the experimental validation of the proposed approach and demonstrate its effectiveness in experiments on a quadrotor. We show that the proposed approach has a lower trajectory tracking error compared to non-predictive, adaptive approaches and a predictive, non-adaptive approach, even when external wind disturbances are applied.

@INPROCEEDINGS{pereida-iros18,
author={Karime Pereida and Angela P. Schoellig},
title={Adaptive Model Predictive Control for High-Accuracy Trajectory Tracking in Changing Conditions},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year={2018},
pages={7831--7837},
urllink={https://arxiv.org/abs/1807.05290},
abstract={Robots and automated systems are increasingly being introduced to unknown and dynamic environments where they are required to handle disturbances, unmodeled dynamics, and parametric uncertainties. Robust and adaptive control strategies are required to achieve high performance in these dynamic environments. In this paper, we propose a novel adaptive model predictive controller that combines model predictive control (MPC) with an underlying $\mathcal{L}_1$ adaptive controller to improve trajectory tracking of a system subject to unknown and changing disturbances. The $\mathcal{L}_1$ adaptive controller forces the system to behave in a predefined way, as specified by a reference model. A higher-level model predictive controller then uses this reference model to calculate the optimal reference input based on a cost function, while taking into account input and state constraints. We focus on the experimental validation of the proposed approach and demonstrate its effectiveness in experiments on a quadrotor. We show that the proposed approach has a lower trajectory tracking error compared to non-predictive, adaptive approaches and a predictive, non-adaptive approach, even when external wind disturbances are applied.},
}

Flatness-based model predictive control for quadrotor trajectory tracking
M. Greeff and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, p. 6740–6745.

The use of model predictive control for quadrotor applications requires balancing trajectory tracking performance and constraint satisfaction with fast computational demands. This paper proposes a Flatness-based Model Predictive Control (FMPC) approach that can be applied to quadrotors, and more generally, differentially flat nonlinear systems. Our proposed FMPC couples feedback model predictive control with feedforward linearization. The proposed approach has the computational advantage that, similar to linear model predictive control, it only requires solving a convex quadratic program instead of a nonlinear program. However, unlike linear model predictive control, we still account for the nonlinearity in the model through the use of an inverse term. In simulation, we demonstrate improved robustness over approaches that couple model predictive control with feedback linearization. In experiments using quadrotor vehicles, we demonstrate improved trajectory tracking compared to classical linear and nonlinear model predictive controllers.

@INPROCEEDINGS{greeff-iros18,
author={Melissa Greeff and Angela P. Schoellig},
title={Flatness-based Model Predictive Control for Quadrotor Trajectory Tracking},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year={2018},
urllink={http://www.dynsyslab.org/wp-content/papercite-data/pdf/greeff-iros18.pdf},
urldata = {../../wp-content/papercite-data/data/greeff-icra18-supplementary.pdf},
pages={6740--6745},
abstract={The use of model predictive control for quadrotor applications requires balancing trajectory tracking performance and constraint satisfaction with fast computational demands. This paper proposes a Flatness-based Model Predictive Control (FMPC) approach that can be applied to quadrotors, and more generally, differentially flat nonlinear systems. Our proposed FMPC couples feedback model predictive control with feedforward linearization. The proposed approach has the computational advantage that, similar to linear model predictive control, it only requires solving a convex quadratic program instead of a nonlinear program. However, unlike linear model predictive control, we still account for the nonlinearity in the model through the use of an inverse term. In simulation, we demonstrate improved robustness over approaches that couple model predictive control with feedback linearization. In experiments using quadrotor vehicles, we demonstrate improved trajectory tracking compared to classical linear and nonlinear model predictive controllers.},
}

Hybrid model predictive control for crosswind stabilization of hybrid airships
J. F. M. Foerster, M. K. Helwa, X. Du, and A. P. Schoellig
in Proc. of the International Symposium on Experimental Robotics (ISER), 2018, pp. 499–510.

Hybrid airships are heavier-than-air vehicles that generate a majority of the lift using buoyancy. The resulting high energy efficiency during operation and short take-off and landing distances make this vehicle class well suited for a number of logistics applications. However, the range of safe operating conditions can be limited due to a high susceptibility to crosswinds during taxiing, take-off and landing. The goal of this work is to design and implement an automated counter-gust system (CGS) that stabilizes a hybrid airship against wind disturbances during ground operations by controlling thrusters that are mounted to the wingtips. The CGS controller should compute optimal control inputs, run autonomously without pilot intervention, be computationally efficient to run on onboard hardware, and be flexible regarding adaptation to future aircraft.

@INPROCEEDINGS{foerster-iser18,
author={Julian F. M. Foerster and Mohamed K. Helwa and Xintong Du and Angela P. Schoellig},
title={Hybrid Model Predictive Control for Crosswind Stabilization of Hybrid Airships},
booktitle={{Proc. of the International Symposium on Experimental Robotics (ISER)}},
year={2018},
pages={499--510},
doi={10.1007/978-3-030-33950-0_43},
urllink={https://link.springer.com/chapter/10.1007/978-3-030-33950-0_43},
abstract={Hybrid airships are heavier-than-air vehicles that generate a majority of the lift using buoyancy. The resulting high energy efficiency during operation and short take-off and landing distances make this vehicle class well suited for a number of logistics applications. However, the range of safe operating conditions can be limited due to a high susceptibility to crosswinds during taxiing, take-off and landing. The goal of this work is to design and implement an automated counter-gust system (CGS) that stabilizes a hybrid airship against wind disturbances during ground operations by controlling thrusters that are mounted to the wingtips. The CGS controller should compute optimal control inputs, run autonomously without pilot intervention, be computationally efficient to run on onboard hardware, and be flexible regarding adaptation to future aircraft.},
}

Automated localization of UAVs in GPS-denied indoor construction environments using fiducial markers
M. Nahangi, A. Heins, B. McCabe, and A. P. Schoellig
in Proc. International Symposium on Automation and Robotics in Construction (ISARC), 2018, p. 88–94.

Unmanned Aerial Vehicles (UAVs) have opened a wide range of opportunities and applications in different sectors including construction. Such applications include: 3D mapping from 2D images and video footage, automated site inspection, and performance monitoring. All of the above-mentioned applications perform well outdoors where GPS is quite reliable for localization and navigation of UAVs. Indoor localization and consequently indoor navigation have remained relatively untapped, because GPS is not sufficiently reliable and accurate in indoor environments. This paper presents a method for localization of aerial vehicles in GPS-denied indoor construction environments. The proposed method employs AprilTags that are linked to previously known coordinates in the 3D building information model (BIM). Using cameras on-board the UAV and extracting the transformation from the tag to the camera’s frame, the UAV can be localized on the site. It can then use the previously computed information for navigation between critical locations on construction sites. We use an experimental setup to verify and validate the proposed method by comparing with an indoor localization system as the ground truth. Results show that the proposed method is sufficiently accurate to perform indoor navigation. Moreover, the method does not intensify the complexity of the construction execution as the tags are simply printed and placed on available surfaces at the construction site.

@INPROCEEDINGS{nahangi-isarc18,
author={Mohammad Nahangi and Adam Heins and Brenda McCabe and Angela P. Schoellig},
title={Automated Localization of {UAV}s in {GPS}-Denied Indoor Construction Environments Using Fiducial Markers},
booktitle = {{Proc. International Symposium on Automation and Robotics in Construction (ISARC)}},
year = {2018},
pages={88--94},
abstract = {Unmanned Aerial Vehicles (UAVs) have opened a wide range of opportunities and applications in different sectors including construction. Such applications include: 3D mapping from 2D images and video footage, automated site inspection, and performance monitoring. All of the above-mentioned applications perform well outdoors where GPS is quite reliable for localization and navigation of UAVs. Indoor localization and consequently indoor navigation have remained relatively untapped, because GPS is not sufficiently reliable and accurate in indoor environments. This paper presents a method for localization of aerial vehicles in GPS-denied indoor construction environments. The proposed method employs AprilTags that are linked to previously known coordinates in the 3D building information model (BIM). Using cameras on-board the UAV and extracting the transformation from the tag to the camera’s frame, the UAV can be localized on the site. It can then use the previously computed information for navigation between critical locations on construction sites. We use an experimental setup to verify and validate the proposed method by comparing with an indoor localization system as the ground truth. Results show that the proposed method is sufficiently accurate to perform indoor navigation. Moreover, the method does not intensify the complexity of the construction execution as the tags are simply printed and placed on available surfaces at the construction site.},
}

Experience-based model selection to enable long-term, safe control for repetitive tasks under changing conditions
C. D. McKinnon and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, p. 2977–2984.

Learning has propelled the cutting edge of performance in robotic control to new heights, allowing robots to operate with high performance in conditions that were previously unimaginable. The majority of the work, however, assumes that the unknown parts are static or slowly changing. This limits them to static or slowly changing environments. However, in the real world, a robot may experience various unknown conditions. This paper presents a method to extend an existing single mode GP-based safe learning controller to learn an increasing number of non-linear models for the robot dynamics. We show that this approach enables a robot to re-use past experience from a large number of previously visited operating conditions, and to safely adapt when a new and distinct operating condition is encountered. This allows the robot to achieve safety and high performance in a large number of operating conditions that do not have to be specified ahead of time. Our approach runs independently from the controller, imposing no additional computation time on the control loop regardless of the number of previous operating conditions considered. We demonstrate the effectiveness of our approach in experiments on a 900 kg ground robot with both physical and artificial changes to its dynamics. All of our experiments are conducted using vision for localization.

@inproceedings{mckinnon-iros18,
author = {Christopher D. McKinnon and Angela P. Schoellig},
title = {Experience-Based Model Selection to Enable Long-Term, Safe Control for Repetitive Tasks Under Changing Conditions},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year = {2018},
urllink = {https://arxiv.org/abs/1803.04065},
urlslides = {../../wp-content/papercite-data/slides/mckinnon-iros18-slides.pdf},
pages = {2977--2984},
abstract = {Learning has propelled the cutting edge of performance in robotic control to new heights, allowing robots to operate with high performance in conditions that were previously unimaginable. The majority of the work, however, assumes that the unknown parts are static or slowly changing. This limits them to static or slowly changing environments. However, in the real world, a robot may experience various unknown conditions. This paper presents a method to extend an existing single mode GP-based safe learning controller to learn an increasing number of non-linear models for the robot dynamics. We show that this approach enables a robot to re-use past experience from a large number of previously visited operating conditions, and to safely adapt when a new and distinct operating condition is encountered. This allows the robot to achieve safety and high performance in a large number of operating conditions that do not have to be specified ahead of time. Our approach runs independently from the controller, imposing no additional computation time on the control loop regardless of the number of previous operating conditions considered. We demonstrate the effectiveness of our approach in experiments on a 900\,kg ground robot with both physical and artificial changes to its dynamics. All of our experiments are conducted using vision for localization.},
}

Evaluation of UAV system accuracy for automated fragmentation measurement
T. Bamford, K. Esmaeili, and A. P. Schoellig
in Proc. of the 12th International Symposium on Rock Fragmentation by Blasting (FRAGBLAST), 2018, p. 715–730.

The current practice of collecting rock fragmentation data is highly manual and provides data with low temporal and spatial resolution. Unmanned Aerial Vehicle (UAV) technology can increase both temporal and spatial data resolution without exposing technicians to hazardous conditions. Our previous work using UAV technology to acquire real-time rock fragmentation data has shown comparable quality results to sieving in a lab environment. However, when applied to a mining environment, it is essential to quantify the accuracy of scale estimation and rock size distribution by considering various sources of uncertainties such as the UAV GPS, which provides noisy measurements. In the current paper, we investigate the accuracy of application of UAVs to collect photographic data for fragmentation analysis. This is done by evaluating the accuracy of the 3D model generated using the UAV system, estimated image scale, and the measured rock size distribution. This paper also investigates the impact of flight altitude on the measured rock size distribution.

@inproceedings{bamford-fragblast12,
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title = {Evaluation of {UAV} system accuracy for automated fragmentation measurement},
booktitle = {{Proc. of the 12th International Symposium on Rock Fragmentation by Blasting (FRAGBLAST)}},
year = {2018},
pages = {715--730},
abstract = {The current practice of collecting rock fragmentation data is highly manual and provides data with low temporal and spatial resolution. Unmanned Aerial Vehicle (UAV) technology can increase both temporal and spatial data resolution without exposing technicians to hazardous conditions. Our previous work using UAV technology to acquire real-time rock fragmentation data has shown comparable quality results to sieving in a lab environment. However, when applied to a mining environment, it is essential to quantify the accuracy of scale estimation and rock size distribution by considering various sources of uncertainties such as the UAV GPS, which provides noisy measurements. In the current paper, we investigate the accuracy of application of UAVs to collect photographic data for fragmentation analysis. This is done by evaluating the accuracy of the 3D model generated using the UAV system, estimated image scale, and the measured rock size distribution. This paper also investigates the impact of flight altitude on the measured rock size distribution.},
}

Learning of coordination policies for robotic swarms
Q. Li, X. Du, Y. Huang, Q. Sykora, and A. P. Schoellig
Technical Report, arXiv, 2018.

Inspired by biological swarms, robotic swarms are envisioned to solve real-world problems that are difficult for individual agents. Biological swarms can achieve collective intelligence based on local interactions and simple rules; however, designing effective distributed policies for large-scale robotic swarms to achieve a global objective can be challenging. Although it is often possible to design an optimal centralized strategy for smaller numbers of agents, those methods can fail as the number of agents increases. Motivated by the growing success of machine learning, we develop a deep learning approach that learns distributed coordination policies from centralized policies. In contrast to traditional distributed control approaches, which are usually based on human-designed policies for relatively simple tasks, this learning-based approach can be adapted to more difficult tasks. We demonstrate the efficacy of our proposed approach on two different tasks, the well-known rendezvous problem and a more difficult particle assignment problem. For the latter, no known distributed policy exists. From extensive simulations, it is shown that the performance of the learned coordination policies is comparable to the centralized policies, surpassing state-of-the-art distributed policies. Thereby, our proposed approach provides a promising alternative for real-world coordination problems that would be otherwise computationally expensive to solve or intangible to explore.

@TECHREPORT{li-icra18,
title = {Learning of Coordination Policies for Robotic Swarms},
institution = {arXiv},
author = {Qiyang Li and Xintong Du and Yizhou Huang and Quinlan Sykora and Angela P. Schoellig},
year = {2018},
urllink = {https://arxiv.org/pdf/1709.06620.pdf},
abstract = {Inspired by biological swarms, robotic swarms are envisioned to solve real-world problems that are difficult for individual agents. Biological swarms can achieve collective intelligence based on local interactions and simple rules; however, designing effective distributed policies for large-scale robotic swarms to achieve a global objective can be challenging. Although it is often possible to design an optimal centralized strategy for smaller numbers of agents, those methods can fail as the number of agents increases. Motivated by the growing success of machine learning, we develop a deep learning approach that learns distributed coordination policies from centralized policies. In contrast to traditional distributed control approaches, which are usually based on human-designed policies for relatively simple tasks, this learning-based approach can be adapted to more difficult tasks. We demonstrate the efficacy of our proposed approach on two different tasks, the well-known rendezvous problem and a more difficult particle assignment problem. For the latter, no known distributed policy exists. From extensive simulations, it is shown that the performance of the learned coordination policies is comparable to the centralized policies, surpassing state-of-the-art distributed policies. Thereby, our proposed approach provides a promising alternative for real-world coordination problems that would be otherwise computationally expensive to solve or intangible to explore.},
}

2017

A real-time analysis of post-blast rock fragmentation using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
International Journal of Mining, Reclamation and Environment, vol. 31, iss. 6, p. 439–456, 2017.

The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments improves the quality of the image data and automates the data collection process. This work presents the results of laboratory-scale experiments using a UAV. The goal is to highlight the benefits of aerial fragmentation analysis in terms of both prediction accuracy and time effort. The pile was manually photographed and the results of the manual method were compared to the UAV method.

@article{bamford-ijmre17,
title = {A Real-Time Analysis of Post-Blast Rock Fragmentation Using {UAV} Technology},
author = {Bamford, Thomas and Esmaeili, Kamran and Schoellig, Angela P.},
journal = {{International Journal of Mining, Reclamation and Environment}},
year = {2017},
volume = {31},
number = {6},
doi = {10.1080/17480930.2017.1339170},
pages = {439--456},
publisher = {Taylor \& Francis},
urlvideo = {https://youtu.be/q0syk6J_JHY},
abstract = {The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments improves the quality of the image data and automates the data collection process. This work presents the results of laboratory-scale experiments using a UAV. The goal is to highlight the benefits of aerial fragmentation analysis in terms of both prediction accuracy and time effort. The pile was manually photographed and the results of the manual method were compared to the UAV method.},
}

Optimizing a drone network to deliver automated external defibrillators
J. J. Boutilier, S. C. Brooks, A. Janmohamed, A. Byers, J. E. Buick, C. Zhan, A. P. Schoellig, S. Cheskes, L. J. Morrison, and T. C. Y. Chan
Circulation, 2017. In press.

BACKGROUND Public access defibrillation programs can improve survival after out-of-hospital cardiac arrest (OHCA), but automated external defibrillators (AEDs) are rarely available for bystander use at the scene. Drones are an emerging technology that can deliver an AED to the scene of an OHCA for bystander use. We hypothesize that a drone network designed with the aid of a mathematical model combining both optimization and queuing can reduce the time to AED arrival. METHODS We applied our model to 53,702 OHCAs that occurred in the eight regions of the Toronto Regional RescuNET between January 1st 2006 and December 31st 2014. Our primary analysis quantified the drone network size required to deliver an AED one, two, or three minutes faster than historical median 911 response times for each region independently. A secondary analysis quantified the reduction in drone resources required if RescuNET was treated as one large coordinated region. RESULTS The region-specific analysis determined that 81 bases and 100 drones would be required to deliver an AED ahead of median 911 response times by three minutes. In the most urban region, the 90th percentile of the AED arrival time was reduced by 6 minutes and 43 seconds relative to historical 911 response times in the region. In the most rural region, the 90th percentile was reduced by 10 minutes and 34 seconds. A single coordinated drone network across all regions required 39.5% fewer bases and 30.0% fewer drones to achieve similar AED delivery times. CONCLUSIONS An optimized drone network designed with the aid of a novel mathematical model can substantially reduce the AED delivery time to an OHCA event.

@article{boutilier-circ17,
title={Optimizing a Drone Network to Deliver Automated External Defibrillators},
author = {Boutilier, Justin J. and Brooks, Steven C. and Janmohamed, Alyf and Byers, Adam and Buick, Jason E. and Zhan, Cathy and Schoellig, Angela P. and Cheskes, Sheldon and Morrison, Laurie J. and Chan, Timothy C. Y.},
journal={Circulation},
year={2017},
doi = {10.1161/CIRCULATIONAHA.116.026318},
publisher = {American Heart Association, Inc.},
note = {In press},
abstract = {BACKGROUND Public access defibrillation programs can improve survival after out-of-hospital cardiac arrest (OHCA), but automated external defibrillators (AEDs) are rarely available for bystander use at the scene. Drones are an emerging technology that can deliver an AED to the scene of an OHCA for bystander use. We hypothesize that a drone network designed with the aid of a mathematical model combining both optimization and queuing can reduce the time to AED arrival. METHODS We applied our model to 53,702 OHCAs that occurred in the eight regions of the Toronto Regional RescuNET between January 1st 2006 and December 31st 2014. Our primary analysis quantified the drone network size required to deliver an AED one, two, or three minutes faster than historical median 911 response times for each region independently. A secondary analysis quantified the reduction in drone resources required if RescuNET was treated as one large coordinated region. RESULTS The region-specific analysis determined that 81 bases and 100 drones would be required to deliver an AED ahead of median 911 response times by three minutes. In the most urban region, the 90th percentile of the AED arrival time was reduced by 6 minutes and 43 seconds relative to historical 911 response times in the region. In the most rural region, the 90th percentile was reduced by 10 minutes and 34 seconds. A single coordinated drone network across all regions required 39.5% fewer bases and 30.0% fewer drones to achieve similar AED delivery times. CONCLUSIONS An optimized drone network designed with the aid of a novel mathematical model can substantially reduce the AED delivery time to an OHCA event.},
}

Safe model-based reinforcement learning with stability guarantees
F. Berkenkamp, M. Turchetta, A. P. Schoellig, and A. Krause
in Proc. of Neural Information Processing Systems (NIPS), 2017, p. 908–918.

Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety in terms of stability guarantees. Specifically, we extend control theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.

@INPROCEEDINGS{berkenkamp-nips17,
title = {Safe model-based reinforcement learning with stability guarantees},
booktitle = {{Proc. of Neural Information Processing Systems (NIPS)}},
author = {Felix Berkenkamp and Matteo Turchetta and Angela P. Schoellig and Andreas Krause},
year = {2017},
urllink = {https://arxiv.org/abs/1705.08551},
pages = {908--918},
abstract = {Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety in terms of stability guarantees. Specifically, we extend control theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.},
}

Towards visual teach & repeat for GPS-denied flight of a fixed-wing UAV
M. Warren, M. Paton, K. MacTavish, A. P. Schoellig, and T. D. Barfoot
in Proc. of the 11th Conference on Field and Service Robotics (FSR), 2017, p. 481–498.

Most consumer and industrial Unmanned Aerial Vehicles (UAVs) rely on combining Global Navigation Satellite Systems (GNSS) with barometric and inertial sensors for outdoor operation. As a consequence, these vehicles are prone to a variety of potential navigation failures such as jamming and environmental interference. This usually limits their legal activities to locations of low population density within line-of-sight of a human pilot to reduce risk of injury and damage. Autonomous route-following methods such as Visual Teach & Repeat (VT&R) have enabled long-range navigational autonomy for ground robots without the need for reliance on external infrastructure or an accurate global position estimate. In this paper, we demonstrate the localisation component of VT&R outdoors on a fixed-wing UAV as a method of backup navigation in case of primary sensor failure. We modify the localisation engine of VT&R to work with a single downward facing camera on a UAV to enable safe navigation under the guidance of vision alone. We evaluate the method using visual data from the UAV flying a 1200 m trajectory (at altitude of 80 m) several times during a multi-day period, covering a total distance of 10.8 km using the algorithm. We examine the localisation performance for both small (single flight) and large (inter-day) temporal differences from teach to repeat. Through these experiments, we demonstrate the ability to successfully localise the aircraft on a self-taught route using vision alone without the need for additional sensing or infrastructure.

@INPROCEEDINGS{warren-fsr17,
author={Michael Warren and Michael Paton and Kirk MacTavish and Angela P. Schoellig and Tim D. Barfoot},
title={Towards visual teach \& repeat for {GPS}-denied flight of a fixed-wing {UAV}},
booktitle={{Proc. of the 11th Conference on Field and Service Robotics (FSR)}},
year={2017},
pages={481--498},
urllink={https://link.springer.com/chapter/10.1007/978-3-319-67361-5_31},
abstract={Most consumer and industrial Unmanned Aerial Vehicles (UAVs) rely on combining Global Navigation Satellite Systems (GNSS) with barometric and inertial sensors for outdoor operation. As a consequence, these vehicles are prone to a variety of potential navigation failures such as jamming and environmental interference. This usually limits their legal activities to locations of low population density within line-of-sight of a human pilot to reduce risk of injury and damage. Autonomous route-following methods such as Visual Teach \& Repeat (VT\&R) have enabled long-range navigational autonomy for ground robots without the need for reliance on external infrastructure or an accurate global position estimate. In this paper, we demonstrate the localisation component of VT\&R outdoors on a fixed-wing UAV as a method of backup navigation in case of primary sensor failure. We modify the localisation engine of VT\&R to work with a single downward facing camera on a UAV to enable safe navigation under the guidance of vision alone. We evaluate the method using visual data from the UAV flying a 1200 m trajectory (at altitude of 80 m) several times during a multi-day period, covering a total distance of 10.8 km using the algorithm. We examine the localisation performance for both small (single flight) and large (inter-day) temporal differences from teach to repeat. Through these experiments, we demonstrate the ability to successfully localise the aircraft on a self-taught route using vision alone without the need for additional sensing or infrastructure.},
}

Multi-robot transfer learning: a dynamical system perspective
M. K. Helwa and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, p. 4702–4708.

Multi-robot transfer learning allows a robot to use data generated by a second, similar robot to improve its own behavior. The potential advantages are reducing the time of training and the unavoidable risks that exist during the training phase. Transfer learning algorithms aim to find an optimal transfer map between different robots. In this paper, we investigate, through a theoretical study of single-input single-output (SISO) systems, the properties of such optimal transfer maps. We first show that the optimal transfer learning map is, in general, a dynamic system. The main contribution of the paper is to provide an algorithm for determining the properties of this optimal dynamic map including its order and regressors (i.e., the variables it depends on). The proposed algorithm does not require detailed knowledge of the robots’ dynamics, but relies on basic system properties easily obtainable through simple experimental tests. We validate the proposed algorithm experimentally through an example of transfer learning between two different quadrotor platforms. Experimental results show that an optimal dynamic map, with correct properties obtained from our proposed algorithm, achieves a 60-70% reduction of transfer learning error compared to the cases when the data is directly transferred or transferred using an optimal static map.

@INPROCEEDINGS{helwa-iros17,
author={Mohamed K. Helwa and Angela P. Schoellig},
title={Multi-Robot Transfer Learning: A Dynamical System Perspective},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year={2017},
pages={4702--4708},
doi={10.1109/IROS.2017.8206342},
urllink={https://arxiv.org/abs/1707.08689},
abstract={Multi-robot transfer learning allows a robot to use data generated by a second, similar robot to improve its own behavior. The potential advantages are reducing the time of training and the unavoidable risks that exist during the training phase. Transfer learning algorithms aim to find an optimal transfer map between different robots. In this paper, we investigate, through a theoretical study of single-input single-output (SISO) systems, the properties of such optimal transfer maps. We first show that the optimal transfer learning map is, in general, a dynamic system. The main contribution of the paper is to provide an algorithm for determining the properties of this optimal dynamic map including its order and regressors (i.e., the variables it depends on). The proposed algorithm does not require detailed knowledge of the robots’ dynamics, but relies on basic system properties easily obtainable through simple experimental tests. We validate the proposed algorithm experimentally through an example of transfer learning between two different quadrotor platforms. Experimental results show that an optimal dynamic map, with correct properties obtained from our proposed algorithm, achieves a 60-70% reduction of transfer learning error compared to the cases when the data is directly transferred or transferred using an optimal static map.},
}

Aerial rock fragmentation analysis in low-light condition using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
in Proc. of Application of Computers and Operations Research in the Mining Industry (APCOM), 2017, p. 4-1–4-8.

In recent years, Unmanned Aerial Vehicle (UAV) technology has been introduced into the mining industry to conduct terrain surveying. This work investigates the application of UAVs with artificial lighting for measurement of rock fragmentation under poor lighting conditions, representing night shifts in surface mines or working conditions in underground mines. The study relies on indoor and outdoor experiments for rock fragmentation analysis using a quadrotor UAV. Comparison of the rock size distributions in both cases shows that adequate artificial lighting enables similar accuracy to ideal lighting conditions.

@INPROCEEDINGS{bamford-apcom17,
author={Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title={Aerial Rock Fragmentation Analysis in Low-Light Condition Using {UAV} Technology},
booktitle={{Proc. of Application of Computers and Operations Research in the Mining Industry (APCOM)}},
year={2017},
pages = {4-1--4-8},
urlslides={../../wp-content/papercite-data/slides/bamford-apcom17-slides.pdf},
abstract={In recent years, Unmanned Aerial Vehicle (UAV) technology has been introduced into the mining industry to conduct terrain surveying. This work investigates the application of UAVs with artificial lighting for measurement of rock fragmentation under poor lighting conditions, representing night shifts in surface mines or working conditions in underground mines. The study relies on indoor and outdoor experiments for rock fragmentation analysis using a quadrotor UAV. Comparison of the rock size distributions in both cases shows that adequate artificial lighting enables similar accuracy to ideal lighting conditions.},
}

A framework for multi-vehicle navigation using feedback-based motion primitives
M. Vukosavljev, Z. Kroeze, M. E. Broucke, and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, p. 223–229.

We present a hybrid control framework for solving a motion planning problem among a collection of heterogeneous agents. The proposed approach utilizes a finite set of low-level motion primitives, each based on a piecewise affine feedback control, to generate complex motions in a gridded workspace. The constraints on allowable sequences of successive motion primitives are formalized through a maneuver automaton. At the higher level, a control policy generated by a shortest path non-deterministic algorithm determines which motion primitive is executed in each box of the gridded workspace. The overall framework yields a highly robust control design on both the low and high levels. We experimentally demonstrate the efficacy and robustness of this framework for multiple quadrocopters maneuvering in a 2D or 3D workspace.

@INPROCEEDINGS{vukosavljev-iros17,
author={Marijan Vukosavljev and Zachary Kroeze and Mireille E. Broucke and Angela P. Schoellig},
title={A Framework for Multi-Vehicle Navigation Using Feedback-Based Motion Primitives},
booktitle={{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year={2017},
pages={223--229},
doi={10.1109/IROS.2017.8202161},
urllink={https://arxiv.org/abs/1707.06988},
urlvideo={https://www.youtube.com/watch?v=qhDQyvYNVEc},
urlslides = {../../wp-content/papercite-data/slides/vukosavljev-iros17-slides.pdf},
abstract={We present a hybrid control framework for solving a motion planning problem among a collection of heterogeneous agents. The proposed approach utilizes a finite set of low-level motion primitives, each based on a piecewise affine feedback control, to generate complex motions in a gridded workspace. The constraints on allowable sequences of successive motion primitives are formalized through a maneuver automaton. At the higher level, a control policy generated by a shortest path non-deterministic algorithm determines which motion primitive is executed in each box of the gridded workspace. The overall framework yields a highly robust control design on both the low and high levels. We experimentally demonstrate the efficacy and robustness of this framework for multiple quadrocopters maneuvering in a 2D or 3D workspace.},
}

Design of deep neural networks as add-on blocks for improving impromptu trajectory tracking
S. Zhou, M. K. Helwa, and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2017, p. 5201–5207.

This paper introduces deep neural networks (DNNs) as add-on blocks to baseline feedback control systems to enhance tracking performance of arbitrary desired trajectories. The DNNs are trained to adapt the reference signals to the feedback control loop. The goal is to achieve a unity map between the desired and the actual outputs. In previous work, the efficacy of this approach was demonstrated on quadrotors; on 30 unseen test trajectories, the proposed DNN approach achieved an average impromptu tracking error reduction of 43% as compared to the baseline feedback controller. Motivated by these results, this work aims to provide platform-independent design guidelines for the proposed DNN-enhanced control architecture. In particular, we provide specific guidelines for the DNN feature selection, derive conditions for when the proposed approach is effective, and show in which cases the training efficiency can be further increased.

@INPROCEEDINGS{zhou-cdc17,
author={SiQi Zhou and Mohamed K. Helwa and Angela P. Schoellig},
title={Design of Deep Neural Networks as Add-on Blocks for Improving Impromptu Trajectory Tracking},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2017},
pages={5201--5207},
urllink = {https://arxiv.org/pdf/1705.10932.pdf},
abstract = {This paper introduces deep neural networks (DNNs) as add-on blocks to baseline feedback control systems to enhance tracking performance of arbitrary desired trajectories. The DNNs are trained to adapt the reference signals to the feedback control loop. The goal is to achieve a unity map between the desired and the actual outputs. In previous work, the efficacy of this approach was demonstrated on quadrotors; on 30 unseen test trajectories, the proposed DNN approach achieved an average impromptu tracking error reduction of 43% as compared to the baseline feedback controller. Motivated by these results, this work aims to provide platform-independent design guidelines for the proposed DNN-enhanced control architecture. In particular, we provide specific guidelines for the DNN feature selection, derive conditions for when the proposed approach is effective, and show in which cases the training efficiency can be further increased.}
}

Constrained Bayesian optimization with particle swarms for safe adaptive controller tuning
R. R. P. R. Duivenvoorden, F. Berkenkamp, N. Carion, A. Krause, and A. P. Schoellig
in Proc. of the IFAC (International Federation of Automatic Control) World Congress, 2017, p. 12306–12313.

Tuning controller parameters is a recurring and time-consuming problem in control. This is especially true in the field of adaptive control, where good performance is typically only achieved after significant tuning. Recently, it has been shown that constrained Bayesian optimization is a promising approach to automate the tuning process without risking system failures during the optimization process. However, this approach is computationally too expensive for tuning more than a couple of parameters. In this paper, we provide a heuristic in order to efficiently perform constrained Bayesian optimization in high-dimensional parameter spaces by using an adaptive discretization based on particle swarms. We apply the method to the tuning problem of an L1 adaptive controller on a quadrotor vehicle and show that we can reliably and automatically tune parameters in experiments.

@INPROCEEDINGS{duivenvoorden-ifac17,
author = {Rikky R.P.R. Duivenvoorden and Felix Berkenkamp and Nicolas Carion and Andreas Krause and Angela P. Schoellig},
title = {Constrained {B}ayesian Optimization with Particle Swarms for Safe Adaptive Controller Tuning},
booktitle = {{Proc. of the IFAC (International Federation of Automatic Control) World Congress}},
year = {2017},
pages = {12306--12313},
abstract = {Tuning controller parameters is a recurring and time-consuming problem in control. This is especially true in the field of adaptive control, where good performance is typically only achieved after significant tuning. Recently, it has been shown that constrained Bayesian optimization is a promising approach to automate the tuning process without risking system failures during the optimization process. However, this approach is computationally too expensive for tuning more than a couple of parameters. In this paper, we provide a heuristic in order to efficiently perform constrained Bayesian optimization in high-dimensional parameter spaces by using an adaptive discretization based on particle swarms. We apply the method to the tuning problem of an L1 adaptive controller on a quadrotor vehicle and show that we can reliably and automatically tune parameters in experiments.},
}

Learning multimodal models for robot dynamics online with a mixture of Gaussian process experts
C. D. McKinnon and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, p. 322–328.

For decades, robots have been essential allies alongside humans in controlled industrial environments like heavy manufacturing facilities. However, without the guidance of a trusted human operator to shepherd a robot safely through a wide range of conditions, they have been barred from the complex, ever changing environments that we live in from day to day. Safe learning control has emerged as a promising way to start bridging algorithms based on first principles to complex real-world scenarios by using data to adapt, and improve performance over time. Safe learning methods rely on a good estimate of the robot dynamics and of the bounds on modelling error in order to be effective. Current methods focus on either a single adaptive model, or a fixed, known set of models for the robot dynamics. This limits them to static or slowly changing environments. This paper presents a method using Gaussian Processes in a Dirichlet Process mixture model to learn an increasing number of non-linear models for the robot dynamics. We show that this approach enables a robot to re-use past experience from an arbitrary number of previously visited operating conditions, and to automatically learn a new model when a new and distinct operating condition is encountered. This approach improves the robustness of existing Gaussian Process-based models to large changes in dynamics that do not have to be specified ahead of time.

@INPROCEEDINGS{mckinnon-icra17,
author = {Christopher D. McKinnon and Angela P. Schoellig},
title = {Learning multimodal models for robot dynamics online with a mixture of {G}aussian process experts},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2017},
pages = {322--328},
doi = {10.1109/ICRA.2017.7989041},
abstract = {For decades, robots have been essential allies alongside humans in controlled industrial environments like heavy manufacturing facilities. However, without the guidance of a trusted human operator to shepherd a robot safely through a wide range of conditions, they have been barred from the complex, ever changing environments that we live in from day to day. Safe learning control has emerged as a promising way to start bridging algorithms based on first principles to complex real-world scenarios by using data to adapt, and improve performance over time. Safe learning methods rely on a good estimate of the robot dynamics and of the bounds on modelling error in order to be effective. Current methods focus on either a single adaptive model, or a fixed, known set of models for the robot dynamics. This limits them to static or slowly changing environments. This paper presents a method using Gaussian Processes in a Dirichlet Process mixture model to learn an increasing number of non-linear models for the robot dynamics. We show that this approach enables a robot to re-use past experience from an arbitrary number of previously visited operating conditions, and to automatically learn a new model when a new and distinct operating condition is encountered. This approach improves the robustness of existing Gaussian Process-based models to large changes in dynamics that do not have to be specified ahead of time.},
}

High-precision trajectory tracking in changing environments through L1 adaptive feedback and iterative learning
K. Pereida, R. R. P. R. Duivenvoorden, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, p. 344–350.

As robots and other automated systems are introduced to unknown and dynamic environments, robust and adaptive control strategies are required to cope with disturbances, unmodeled dynamics and parametric uncertainties. In this paper, we propose and provide theoretical proofs of a combined L1 adaptive feedback and iterative learning control (ILC) framework to improve trajectory tracking of a system subject to unknown and changing disturbances. The L1 adaptive controller forces the system to behave in a repeatable, predefined way, even in the presence of unknown and changing disturbances; however, this does not imply that perfect trajectory tracking is achieved. ILC improves the tracking performance based on experience from previous executions. The performance of ILC is limited by the robustness and repeatability of the underlying system, which, in this approach, is handled by the L1 adaptive controller. In particular, we are able to generalize learned trajectories across different system configurations because the L1 adaptive controller handles the underlying changes in the system. We demonstrate the improved trajectory tracking performance and generalization capabilities of the combined method compared to pure ILC in experiments with a quadrotor subject to unknown, dynamic disturbances. This is the first work to show L1 adaptive control combined with ILC in experiment.

@INPROCEEDINGS{pereida-icra17,
author = {Karime Pereida and Rikky R. P. R. Duivenvoorden and Angela P. Schoellig},
title = {High-precision trajectory tracking in changing environments through {L1} adaptive feedback and iterative learning},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2017},
pages = {344--350},
doi = {10.1109/ICRA.2017.7989044},
urllink = {http://ieeexplore.ieee.org/abstract/document/7989044/},
urlslides = {../../wp-content/papercite-data/slides/pereida-icra17-slides.pdf},
abstract = {As robots and other automated systems are introduced to unknown and dynamic environments, robust and adaptive control strategies are required to cope with disturbances, unmodeled dynamics and parametric uncertainties. In this paper, we propose and provide theoretical proofs of a combined L1 adaptive feedback and iterative learning control (ILC) framework to improve trajectory tracking of a system subject to unknown and changing disturbances. The L1 adaptive controller forces the system to behave in a repeatable, predefined way, even in the presence of unknown and changing disturbances; however, this does not imply that perfect trajectory tracking is achieved. ILC improves the tracking performance based on experience from previous executions. The performance of ILC is limited by the robustness and repeatability of the underlying system, which, in this approach, is handled by the L1 adaptive controller. In particular, we are able to generalize learned trajectories across different system configurations because the L1 adaptive controller handles the underlying changes in the system. We demonstrate the improved trajectory tracking performance and generalization capabilities of the combined method compared to pure ILC in experiments with a quadrotor subject to unknown, dynamic disturbances. This is the first work to show L1 adaptive control combined with ILC in experiment.},
}

Deep neural networks for improved, impromptu trajectory tracking of quadrotors
Q. Li, J. Qian, Z. Zhu, X. Bao, M. K. Helwa, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, p. 5183–5189.

Trajectory tracking control for quadrotors is important for applications ranging from surveying and inspection, to film making. However, designing and tuning classical controllers, such as proportional-integral-derivative (PID) controllers, to achieve high tracking precision can be time-consuming and difficult, due to hidden dynamics and other non-idealities. The Deep Neural Network (DNN), with its superior capability of approximating abstract, nonlinear functions, proposes a novel approach for enhancing trajectory tracking control. This paper presents a DNN-based algorithm as an add-on module that improves the tracking performance of a classical feedback controller. Given a desired trajectory, the DNNs provide a tailored reference input to the controller based on their gained experience. The input aims to achieve a unity map between the desired and the output trajectory. The motivation for this work is an interactive “fly-as-you-draw” application, in which a user draws a trajectory on a mobile device, and a quadrotor instantly flies that trajectory with the DNN-enhanced control system. Experimental results demonstrate that the proposed approach improves the tracking precision for user-drawn trajectories after the DNNs are trained on selected periodic trajectories, suggesting the method’s potential in real-world applications. Tracking errors are reduced by around 40-50% for both training and testing trajectories from users, highlighting the DNNs’ capability of generalizing knowledge.

@INPROCEEDINGS{li-icra17,
author = {Qiyang Li and Jingxing Qian and Zining Zhu and Xuchan Bao and Mohamed K. Helwa and Angela P. Schoellig},
title = {Deep neural networks for improved, impromptu trajectory tracking of quadrotors},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2017},
pages = {5183--5189},
doi = {10.1109/ICRA.2017.7989607},
urllink = {https://arxiv.org/abs/1610.06283},
urlvideo = {https://youtu.be/r1WnMUZy9-Y},
abstract = {Trajectory tracking control for quadrotors is important for applications ranging from surveying and inspection, to film making. However, designing and tuning classical controllers, such as proportional-integral-derivative (PID) controllers, to achieve high tracking precision can be time-consuming and difficult, due to hidden dynamics and other non-idealities. The Deep Neural Network (DNN), with its superior capability of approximating abstract, nonlinear functions, proposes a novel approach for enhancing trajectory tracking control. This paper presents a DNN-based algorithm as an add-on module that improves the tracking performance of a classical feedback controller. Given a desired trajectory, the DNNs provide a tailored reference input to the controller based on their gained experience. The input aims to achieve a unity map between the desired and the output trajectory. The motivation for this work is an interactive “fly-as-you-draw” application, in which a user draws a trajectory on a mobile device, and a quadrotor instantly flies that trajectory with the DNN-enhanced control system. Experimental results demonstrate that the proposed approach improves the tracking precision for user-drawn trajectories after the DNNs are trained on selected periodic trajectories, suggesting the method’s potential in real-world applications. Tracking errors are reduced by around 40-50% for both training and testing trajectories from users, highlighting the DNNs’ capability of generalizing knowledge.},
}

Virtual vs. real: trading off simulations and physical experiments in reinforcement learning with Bayesian optimization
A. Marco, F. Berkenkamp, P. Hennig, A. P. Schoellig, A. Krause, S. Schaal, and S. Trimpe
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2017, p. 1557–1563.

In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.

@INPROCEEDINGS{marco-icra17,
author = {Alonso Marco and Felix Berkenkamp and Philipp Hennig and Angela P. Schoellig and Andreas Krause and Stefan Schaal and Sebastian Trimpe},
title = {Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with {B}ayesian Optimization},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
month = {may},
year = {2017},
pages = {1557--1563},
doi = {10.1109/ICRA.2017.7989186},
urllink = {https://arxiv.org/abs/1703.01250},
abstract = {In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.}
}

Point-cloud-based aerial fragmentation analysis for application in the minerals industry
T. Bamford, K. Esmaeili, and A. P. Schoellig
Technical Report, arXiv, 2017.

This work investigates the application of Unmanned Aerial Vehicle (UAV) technology for measurement of rock fragmentation without placement of scale objects in the scene to determine image scale. Commonly practiced image-based rock fragmentation analysis requires a technician to walk to a rock pile, place a scale object of known size in the area of interest, and capture individual 2D images. Our previous work has used UAV technology for the first time to acquire real-time rock fragmentation data and has shown comparable quality of results; however, it still required the (potentially dangerous) placement of scale objects, and continued to make the assumption that the rock pile surface is planar and that the scale objects lie on the surface plane. This work improves our UAV-based approach to enable rock fragmentation measurement without placement of scale objects and without the assumption of planarity. This is achieved by first generating a point cloud of the rock pile from 2D images, taking into account intrinsic and extrinsic camera parameters, and then taking 2D images for fragmentation analysis. This work represents an important step towards automating post-blast rock fragmentation analysis. In experiments, a rock pile with known size distribution was photographed by the UAV with and without using scale objects. For fragmentation analysis without scale objects, a point cloud of the rock pile was generated and used to compute image scale. Comparison of the rock size distributions show that this point-cloud-based method enables producing measurements with better or comparable accuracy (within 10% of the ground truth) to the manual method with scale objects.

@TECHREPORT{bamford-iros17,
title = {Point-cloud-based aerial fragmentation analysis for application in the minerals industry},
institution = {arXiv},
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
year = {2017},
urllink = {https://arxiv.org/abs/1703.01945},
abstract = {This work investigates the application of Unmanned Aerial Vehicle (UAV) technology for measurement of rock fragmentation without placement of scale objects in the scene to determine image scale. Commonly practiced image-based rock fragmentation analysis requires a technician to walk to a rock pile, place a scale object of known size in the area of interest, and capture individual 2D images. Our previous work has used UAV technology for the first time to acquire real-time rock fragmentation data and has shown comparable quality of results; however, it still required the (potentially dangerous) placement of scale objects, and continued to make the assumption that the rock pile surface is planar and that the scale objects lie on the surface plane. This work improves our UAV-based approach to enable rock fragmentation measurement without placement of scale objects and without the assumption of planarity. This is achieved by first generating a point cloud of the rock pile from 2D images, taking into account intrinsic and extrinsic camera parameters, and then taking 2D images for fragmentation analysis. This work represents an important step towards automating post-blast rock fragmentation analysis. In experiments, a rock pile with known size distribution was photographed by the UAV with and without using scale objects. For fragmentation analysis without scale objects, a point cloud of the rock pile was generated and used to compute image scale. Comparison of the rock size distributions show that this point-cloud-based method enables producing measurements with better or comparable accuracy (within 10% of the ground truth) to the manual method with scale objects.},
}

2016

Robust constrained learning-based NMPC enabling reliable mobile robot path tracking
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
International Journal of Robotics Research, vol. 35, iss. 13, pp. 1547-1563, 2016.

This paper presents a Robust Constrained Learning-based Nonlinear Model Predictive Control (RC-LB-NMPC) algorithm for path-tracking in off-road terrain. For mobile robots, constraints may represent solid obstacles or localization limits. As a result, constraint satisfaction is required for safety. Constraint satisfaction is typically guaranteed through the use of accurate, a priori models or robust control. However, accurate models are generally not available for off-road operation. Furthermore, robust controllers are often conservative, since model uncertainty is not updated online. In this work our goal is to use learning to generate low-uncertainty, non-parametric models in situ. Based on these models, the predictive controller computes both linear and angular velocities in real-time, such that the robot drives at or near its capabilities while respecting path and localization constraints. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, off-road environments. The paper presents experimental results, including over 5 km of travel by a 900 kg skid-steered robot at speeds of up to 2.0 m/s. The result is a robust, learning controller that provides safe, conservative control during initial trials when model uncertainty is high and converges to high-performance, optimal control during later trials when model uncertainty is reduced with experience.

@ARTICLE{ostafew-ijrr16,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Robust Constrained Learning-Based {NMPC} Enabling Reliable Mobile Robot Path Tracking},
year = {2016},
journal = {{International Journal of Robotics Research}},
volume = {35},
number = {13},
pages = {1547--1563},
doi = {10.1177/0278364916645661},
url = {http://dx.doi.org/10.1177/0278364916645661},
eprint = {http://dx.doi.org/10.1177/0278364916645661},
urlvideo = {https://youtu.be/3xRNmNv5Efk},
abstract = {This paper presents a Robust Constrained Learning-based Nonlinear Model Predictive Control (RC-LB-NMPC) algorithm for path-tracking in off-road terrain. For mobile robots, constraints may represent solid obstacles or localization limits. As a result, constraint satisfaction is required for safety. Constraint satisfaction is typically guaranteed through the use of accurate, a priori models or robust control. However, accurate models are generally not available for off-road operation. Furthermore, robust controllers are often conservative, since model uncertainty is not updated online. In this work our goal is to use learning to generate low-uncertainty, non-parametric models in situ. Based on these models, the predictive controller computes both linear and angular velocities in real-time, such that the robot drives at or near its capabilities while respecting path and localization constraints. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, off-road environments. The paper presents experimental results, including over 5 km of travel by a 900 kg skid-steered robot at speeds of up to 2.0 m/s. The result is a robust, learning controller that provides safe, conservative control during initial trials when model uncertainty is high and converges to high-performance, optimal control during later trials when model uncertainty is reduced with experience.},
}

Learning-based nonlinear model predictive control to improve vision-based mobile robot path tracking
C. J. Ostafew, J. Collier, A. P. Schoellig, and T. D. Barfoot
Journal of Field Robotics, vol. 33, iss. 1, pp. 133-152, 2016.

This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm to achieve high-performance path tracking in challenging off-road terrain through learning. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) as a function of system state, input, and other relevant variables. The GP is updated based on experience collected during previous trials. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 3.0 km of travel by three significantly different robot platforms with masses ranging from 50 kg to 600 kg and at speeds ranging from 0.35 m/s to 1.2 m/s. Planned speeds are generated by a novel experience-based speed scheduler that balances overall travel time, path-tracking errors, and localization reliability. The results show that the controller can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.

@ARTICLE{ostafew-jfr16,
author = {Chris J. Ostafew and Jack Collier and Angela P. Schoellig and Timothy D. Barfoot},
title = {Learning-based nonlinear model predictive control to improve vision-based mobile robot path tracking},
year = {2016},
journal = {{Journal of Field Robotics}},
volume = {33},
number = {1},
pages = {133--152},
doi = {10.1002/rob.21587},
urlvideo={https://youtu.be/lxm-2A6yOY0?list=PLC12E387419CEAFF2},
urlvideo2={https://youtu.be/M9xhkHCzpMo?list=PL0F1AD87C0266A961},
urlvideo3={http://youtu.be/MwVElAn95-M?list=PLC0E5EB919968E507},
urlvideo4={http://youtu.be/Pu3_F6k6Fa4?list=PLC0E5EB919968E507},
abstract = {This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm to achieve high-performance path tracking in challenging off-road terrain through learning. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) as a function of system state, input, and other relevant variables. The GP is updated based on experience collected during previous trials. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 3.0 km of travel by three significantly different robot platforms with masses ranging from 50 kg to 600 kg and at speeds ranging from 0.35 m/s to 1.2 m/s. Planned speeds are generated by a novel experience-based speed scheduler that balances overall travel time, path-tracking errors, and localization reliability. The results show that the controller can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.}
}

Distributed iterative learning control for a team of quadrotors
A. Hock and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2016, pp. 4640-4646.

The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle has only access to the information of its neighbors. The desired trajectory is only available to one (or few) vehicles. We present a distributed iterative learning control (ILC) approach where each vehicle learns from the experience of its own and its neighbors’ previous task repetitions, and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove stability for any causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function that only depends on the tracking error derivative (D-type ILC). Our extension provides more degrees of freedom in the ILC design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows us to use an additional consensus feedback controller to compensate for non-repetitive disturbances. Experiments with two quadrotors attest the effectiveness of the proposed distributed multi-agent ILC approach. This is the first work to show distributed ILC in experiment.

@INPROCEEDINGS{hock-cdc16,
author = {Andreas Hock and Angela P. Schoellig},
title = {Distributed iterative learning control for a team of quadrotors},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2016},
pages = {4640--4646},
doi = {10.1109/CDC.2016.7798976},
urllink = {http://arxiv.org/abs/1603.05933},
urlvideo = {https://youtu.be/Qw598DRw6-Q},
urlvideo2 = {https://youtu.be/JppRu26eZgI},
urlslides = {../../wp-content/papercite-data/slides/hock-cdc16-slides.pdf},
abstract = {The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle has only access to the information of its neighbors. The desired trajectory is only available to one (or few) vehicles. We present a distributed iterative learning control (ILC) approach where each vehicle learns from the experience of its own and its neighbors’ previous task repetitions, and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove stability for any causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function that only depends on the tracking error derivative (D-type ILC). Our extension provides more degrees of freedom in the ILC design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows us to use an additional consensus feedback controller to compensate for non-repetitive disturbances. Experiments with two quadrotors attest the effectiveness of the proposed distributed multi-agent ILC approach. This is the first work to show distributed ILC in experiment.},
}

On the construction of safe controllable regions for affine systems with applications to robotics
M. K. Helwa and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2016, pp. 3000-3005.

This paper studies the problem of constructing in-block controllable (IBC) regions for affine systems. That is, we are concerned with constructing regions in the state space of affine systems such that all the states in the interior of the region are mutually accessible through the region’s interior by applying uniformly bounded inputs. We first show that existing results for checking in-block controllability on given polytopic regions cannot be easily extended to address the question of constructing IBC regions. We then explore the geometry of the problem to provide a computationally efficient algorithm for constructing IBC regions. We also prove the soundness of the algorithm. Finally, we use the proposed algorithm to construct safe speed profiles for fully-actuated robots and for ground robots modeled as unicycles with acceleration limits.

@INPROCEEDINGS{helwa-cdc16,
author = {Mohamed K. Helwa and Angela P. Schoellig},
title = {On the construction of safe controllable regions for affine systems with applications to robotics},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2016},
pages = {3000--3005},
doi = {10.1109/CDC.2016.7798717},
urllink = {https://arxiv.org/abs/1610.01243},
urlslides = {../../wp-content/papercite-data/slides/helwa-cdc16-slides.pdf},
urlvideo = {https://youtu.be/s_N7zTtCjd0},
abstract = {This paper studies the problem of constructing in-block controllable (IBC) regions for affine systems. That is, we are concerned with constructing regions in the state space of affine systems such that all the states in the interior of the region are mutually accessible through the region’s interior by applying uniformly bounded inputs. We first show that existing results for checking in-block controllability on given polytopic regions cannot be easily extended to address the question of constructing IBC regions. We then explore the geometry of the problem to provide a computationally efficient algorithm for constructing IBC regions. We also prove the soundness of the algorithm. Finally, we use the proposed algorithm to construct safe speed profiles for fully-actuated robots and for ground robots modeled as unicycles with acceleration limits.},
}

Safe learning of regions of attraction for uncertain, nonlinear systems with Gaussian processes
F. Berkenkamp, R. Moriconi, A. P. Schoellig, and A. Krause
in Proc. of the IEEE Conference on Decision and Control (CDC), 2016, pp. 4661-4666.

Control theory can provide useful insights into the properties of controlled, dynamic systems. One important property of nonlinear systems is the region of attraction (ROA), a safe subset of the state space in which a given controller renders an equilibrium point asymptotically stable. The ROA is typically estimated based on a model of the system. However, since models are only an approximation of the real world, the resulting estimated safe region can contain states outside the ROA of the real system. This is not acceptable in safety-critical applications. In this paper, we consider an approach that learns the ROA from experiments on a real system, without ever leaving the true ROA and, thus, without risking safety-critical failures. Based on regularity assumptions on the model errors in terms of a Gaussian process prior, we use an underlying Lyapunov function in order to determine a region in which an equilibrium point is asymptotically stable with high probability. Moreover, we provide an algorithm to actively and safely explore the state space in order to expand the ROA estimate. We demonstrate the effectiveness of this method in simulation.

@INPROCEEDINGS{berkenkamp-cdc16,
author = {Felix Berkenkamp and Riccardo Moriconi and Angela P. Schoellig and Andreas Krause},
title = {Safe learning of regions of attraction for uncertain, nonlinear systems with {G}aussian processes},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
year = {2016},
pages = {4661--4666},
doi = {10.1109/CDC.2016.7798979},
urllink = {http://arxiv.org/abs/1603.04915},
urlvideo = {https://youtu.be/bSv-pNOWn7c},
urlslides={../../wp-content/papercite-data/slides/berkenkamp-cdc16-slides.pdf},
urlcode = {https://github.com/befelix/lyapunov-learning},
urlcode2 = {http://berkenkamp.me/jupyter/lyapunov},
abstract = {Control theory can provide useful insights into the properties of controlled, dynamic systems. One important property of nonlinear systems is the region of attraction (ROA), a safe subset of the state space in which a given controller renders an equilibrium point asymptotically stable. The ROA is typically estimated based on a model of the system. However, since models are only an approximation of the real world, the resulting estimated safe region can contain states outside the ROA of the real system. This is not acceptable in safety-critical applications. In this paper, we consider an approach that learns the ROA from experiments on a real system, without ever leaving the true ROA and, thus, without risking safety-critical failures. Based on regularity assumptions on the model errors in terms of a Gaussian process prior, we use an underlying Lyapunov function in order to determine a region in which an equilibrium point is asymptotically stable with high probability. Moreover, we provide an algorithm to actively and safely explore the state space in order to expand the ROA estimate. We demonstrate the effectiveness of this method in simulation.}
}

A real-time analysis of rock fragmentation using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
in Proc. of the International Conference on Computer Applications in the Minerals Industries (CAMI), 2016.

Accurate measurement of blast-induced rock fragmentation is of great importance for many mining operations. The post-blast rock size distribution can significantly influence the efficiency of all the downstream mining and comminution processes. Image analysis methods are one of the most common methods used to measure rock fragment size distribution in mines regardless of criticism for lack of accuracy to measure fine particles and other perceived deficiencies. The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments can not only improve the quality of the image data but also automate the data collection process. Ultimately, real-time acquisition of high temporal- and spatial-resolution data based on UAV technology will provide a broad range of opportunities for both improving blast design without interrupting the production process and reducing the cost of the human operator.

@INPROCEEDINGS{bamford-cami16,
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title = {A real-time analysis of rock fragmentation using {UAV} technology},
booktitle = {{Proc. of the International Conference on Computer Applications in the Minerals Industries (CAMI)}},
year = {2016},
urllink = {http://arxiv.org/abs/1607.04243},
urlvideo = {https://youtu.be/q0syk6J_JHY},
urlslides={../../wp-content/papercite-data/slides/bamford-cami16-slides.pdf},
abstract = {Accurate measurement of blast-induced rock fragmentation is of great importance for many mining operations. The post-blast rock size distribution can significantly influence the efficiency of all the downstream mining and comminution processes. Image analysis methods are one of the most common methods used to measure rock fragment size distribution in mines regardless of criticism for lack of accuracy to measure fine particles and other perceived deficiencies. The current practice of collecting rock fragmentation data for image analysis is highly manual and provides data with low temporal and spatial resolution. Using Unmanned Aerial Vehicles (UAVs) for collecting images of rock fragments can not only improve the quality of the image data but also automate the data collection process. Ultimately, real-time acquisition of high temporal- and spatial-resolution data based on UAV technology will provide a broad range of opportunities for both improving blast design without interrupting the production process and reducing the cost of the human operator.},
}

Unscented external force estimation for quadrotors and experiments
C. D. McKinnon and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016, pp. 5651-5657.

In this paper, we describe an algorithm, based on the well-known Unscented Quaternion Estimator, to estimate external forces and torques acting on a quadrotor. This formulation uses a non-linear model for the quadrotor dynamics, naturally incorporates process and measurement noise, requires only a few parameters to be tuned manually, and uses singularity-free unit quaternions to represent attitude. We demonstrate in simulation that the proposed algorithm can outperform existing methods. We then highlight how our approach can be used to generate force and torque profiles from experimental data, and how this information can later be used for controller design. Finally, we show how the resulting controllers enable a quadrotor to stay in the wind field of a moving fan.

@INPROCEEDINGS{mckinnon-iros16,
author = {Christopher D. McKinnon and Angela P. Schoellig},
title = {Unscented external force estimation for quadrotors and experiments},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year = {2016},
pages = {5651--5657},
doi = {10.1109/IROS.2016.7759831},
urllink = {http://arxiv.org/abs/1603.02772},
urlvideo = {https://youtu.be/YFA3kHabY38},
abstract = {In this paper, we describe an algorithm, based on the well-known Unscented Quaternion Estimator, to estimate external forces and torques acting on a quadrotor. This formulation uses a non-linear model for the quadrotor dynamics, naturally incorporates process and measurement noise, requires only a few parameters to be tuned manually, and uses singularity-free unit quaternions to represent attitude. We demonstrate in simulation that the proposed algorithm can outperform existing methods. We then highlight how our approach can be used to generate force and torque profiles from experimental data, and how this information can later be used for controller design. Finally, we show how the resulting controllers enable a quadrotor to stay in the wind field of a moving fan.},
}

Safe and robust quadrotor maneuvers based on reach control
M. Vukosavljev, I. Jansen, M. E. Broucke, and A. P. Schoellig
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 5677-5682.

In this paper, we investigate the synthesis of piecewise affine feedback controllers to execute safe and robust quadrocopter maneuvers. The methodology is based on formulating the problem as a reach control problem on a polytopic state space. Reach control has so far only been developed in theory and has not been tested experimentally in a real system before. We demonstrate that these theoretical tools can achieve aggressive, albeit safe and robust, quadrocopter maneuvers without the need for a predefined open-loop reference trajectory. In a proof-of-concept demonstration, the reach controller is implemented in one translational direction while the other degrees of freedom are stabilized by separate controllers. Experimental results on a quadrocopter show the effectiveness and robustness of this control approach.

@INPROCEEDINGS{vukosavljev-icra16,
author = {Marijan Vukosavljev and Ivo Jansen and Mireille E. Broucke and Angela P. Schoellig},
title = {Safe and robust quadrotor maneuvers based on reach control},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2016},
pages = {5677--5682},
doi = {10.1109/ICRA.2016.7487789},
urllink = {https://arxiv.org/abs/1610.02385},
urlvideo={https://youtu.be/l4vdxdmd2xc},
urlslides={../../wp-content/papercite-data/slides/vukosavljev-icra16-slides.pdf},
urlslides2={../../wp-content/papercite-data/slides/vukosavljev-icra16-slides2.pdf},
abstract = {In this paper, we investigate the synthesis of piecewise affine feedback controllers to execute safe and robust quadrocopter maneuvers. The methodology is based on formulating the problem as a reach control problem on a polytopic state space. Reach control has so far only been developed in theory and has not been tested experimentally in a real system before. We demonstrate that these theoretical tools can achieve aggressive, albeit safe and robust, quadrocopter maneuvers without the need for a predefined open-loop reference trajectory. In a proof-of-concept demonstration, the reach controller is implemented in one translational direction while the other degrees of freedom are stabilized by separate controllers. Experimental results on a quadrocopter show the effectiveness and robustness of this control approach.}
}

Safe controller optimization for quadrotors with Gaussian processes
F. Berkenkamp, A. P. Schoellig, and A. Krause
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 491–496.

One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to obtain an initial controller, but ultimately the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning step, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters on the real system, safety-critical system failures may happen. In this paper, we overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing safety. It models the underlying performance measure as a Gaussian process and only explores new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.

@INPROCEEDINGS{berkenkamp-icra16,
author = {Felix Berkenkamp and Angela P. Schoellig and Andreas Krause},
title = {Safe controller optimization for quadrotors with {G}aussian processes},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
year = {2016},
month = {May},
pages = {491--496},
doi = {10.1109/ICRA.2016.7487170},
urllink = {http://arxiv.org/abs/1509.01066},
urlvideo = {https://www.youtube.com/watch?v=GiqNQdzc5TI},
urlvideo2 = {https://www.youtube.com/watch?v=IYi8qMnt0yU},
urlcode = {https://github.com/befelix/SafeOpt},
abstract = {One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to obtain an initial controller, but ultimately the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning step, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters on the real system, safety-critical system failures may happen. In this paper, we overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing safety. It models the underlying performance measure as a Gaussian process and only explores new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.},
}

A preliminary study of transfer learning between unicycle robots
K. V. Raimalwala, B. A. Francis, and A. P. Schoellig
in Proc. of the AAAI Spring Symposium Series, 2016, pp. 53–59.

Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. The goal of this work is to understand in which cases a simple, alignment-based transfer of data is beneficial. A scalar, linear, time-invariant (LTI) transformation is applied to the output from a source system to align with the output from a target system. In a theoretic study, we have already shown that for linear, single-input, single-output systems, the upper bound of the transformation error depends on the dynamic properties of the source and target system, and is small for systems with similar response times. We now consider two nonlinear, unicycle robots. Based on our previous work, we derive analytic error bounds for the linearized robot models. We then provide simulations of the nonlinear robot models and experiments with a Pioneer 3-AT robot that confirm the theoretical findings. As a result, key characteristics of alignment-based transfer learning observed in our theoretic study prove to be also true for real, nonlinear unicycle robots.

@INPROCEEDINGS{raimalwala-aaai16,
author = {Kaizad V. Raimalwala and Bruce A. Francis and Angela P. Schoellig},
title = {A preliminary study of transfer learning between unicycle robots},
booktitle = {{Proc. of the AAAI Spring Symposium Series}},
year = {2016},
pages = {53--59},
abstract = {Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. The goal of this work is to understand in which cases a simple, alignment-based transfer of data is beneficial. A scalar, linear, time-invariant (LTI) transformation is applied to the output from a source system to align with the output from a target system. In a theoretic study, we have already shown that for linear, single-input, single-output systems, the upper bound of the transformation error depends on the dynamic properties of the source and target system, and is small for systems with similar response times. We now consider two nonlinear, unicycle robots. Based on our previous work, we derive analytic error bounds for the linearized robot models. We then provide simulations of the nonlinear robot models and experiments with a Pioneer 3-AT robot that confirm the theoretical findings. As a result, key characteristics of alignment-based transfer learning observed in our theoretic study prove to be also true for real, nonlinear unicycle robots.},
}

Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics
F. Berkenkamp, A. Krause, and A. P. Schoellig
Technical Report, arXiv, 2016.

Robotic algorithms typically depend on various parameters, the choice of which significantly affects the robot’s performance. While an initial guess for the parameters may be obtained from dynamic models of the robot, parameters are usually tuned manually on the real system to achieve the best performance. Optimization algorithms, such as Bayesian optimization, have been used to automate this process. However, these methods may evaluate unsafe parameters during the optimization process that lead to safety-critical system failures. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is often not desirable in robotics. For example, high-gain controllers might achieve low average tracking error (performance), but can overshoot and violate input constraints. In this paper, we present a generalized algorithm that allows for multiple safety constraints separate from the objective. Given an initial set of safe parameters, the algorithm maximizes performance but only evaluates parameters that satisfy safety for all constraints with high probability. To this end, it carefully explores the parameter space by exploiting regularity assumptions in terms of a Gaussian process prior. Moreover, we show how context variables can be used to safely transfer knowledge to new situations and tasks. We provide a theoretical analysis and demonstrate that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters in experiments on a quadrotor vehicle.

@TECHREPORT{berkenkamp-tr16,
title = {Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics},
institution = {arXiv},
author = {Berkenkamp, Felix and Krause, Andreas and Schoellig, Angela P.},
year = {2016},
urllink = {http://arxiv.org/abs/1602.04450},
urlvideo = {https://youtu.be/GiqNQdzc5TI},
urlcode = {https://github.com/befelix/SafeOpt},
abstract = {Robotic algorithms typically depend on various parameters, the choice of which significantly affects the robot's performance. While an initial guess for the parameters may be obtained from dynamic models of the robot, parameters are usually tuned manually on the real system to achieve the best performance. Optimization algorithms, such as Bayesian optimization, have been used to automate this process. However, these methods may evaluate unsafe parameters during the optimization process that lead to safety-critical system failures. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is often not desirable in robotics. For example, high-gain controllers might achieve low average tracking error (performance), but can overshoot and violate input constraints. In this paper, we present a generalized algorithm that allows for multiple safety constraints separate from the objective. Given an initial set of safe parameters, the algorithm maximizes performance but only evaluates parameters that satisfy safety for all constraints with high probability. To this end, it carefully explores the parameter space by exploiting regularity assumptions in terms of a Gaussian process prior. Moreover, we show how context variables can be used to safely transfer knowledge to new situations and tasks. We provide a theoretical analysis and demonstrate that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters in experiments on a quadrotor vehicle.}
}

Rock fragmentation analysis using UAV technology
T. Bamford, K. Esmaeili, and A. P. Schoellig
Professional Magazine Article, Ontario Professional Surveyor (OPS) Magazine, Assn. of Ontario Land Surveyors, 2016.

@MISC{bamford-ops16,
author = {Thomas Bamford and Kamran Esmaeili and Angela P. Schoellig},
title = {Rock fragmentation analysis using {UAV} technology},
year = {2016},
volume = {59},
number = {4},
pages = {14--16},
howpublished = {Professional Magazine Article, Ontario Professional Surveyor (OPS) Magazine, Assn. of Ontario Land Surveyors},
urllink = {http://publications.aols.org/OPS-Magazine/2016Fall/},
}

Quantifying the value of drone-delivered AEDs in cardiac arrest response
J. J. Boutilier, S. C. Brooks, A. Janmohamed, A. Byers, C. Zhan, J. E. Buick, A. P. Schoellig, L. J. Morrison, S. Cheskes, and T. C. Y. Chan
Abstract and Oral Presentation, in American Heart Association (AHA) Resuscitation Science Symposium, 2016.

@MISC{boutilier-aha16,
author = {J. J. Boutilier and S. C. Brooks and A. Janmohamed and A. Byers and C. Zhan and J. E. Buick and A. P. Schoellig and L. J. Morrison and S. Cheskes and T.C.Y. Chan},
title = {Quantifying the value of drone-delivered {AEDs} in cardiac arrest response},
howpublished = {Abstract and Oral Presentation, in American Heart Association (AHA) Resuscitation Science Symposium},
year = {2016},
}

Safe automatic controller tuning for quadrotors
F. Berkenkamp, A. Krause, and A. P. Schoellig
Video Submission, Assn. for the Advancement of Artificial Intelligence (AAAI) AI Video Competition, 2016.

@MISC{berkenkamp-aaai16,
author = {Felix Berkenkamp and Andreas Krause and Angela P. Schoellig},
title = {Safe automatic controller tuning for quadrotors},
howpublished = {Video Submission, Assn. for the Advancement of Artificial Intelligence (AAAI) AI Video Competition},
urlvideo = {https://youtu.be/7ZkZlxXHgTY?list=PLuOoXrWK6Kz5ySULxGMtAUdZEg9SkXDoq},
year = {2016},
}

Data-driven interaction for quadrotors based on external forces
C. McKinnon and A. P. Schoellig
Video Submission, Assn. for the Advancement of Artificial Intelligence (AAAI) AI Video Competition, 2016.

@MISC{mckinnon-aaai16,
author = {Chris McKinnon and Angela P. Schoellig},
title = {Data-driven interaction for quadrotors based on external forces},
howpublished = {Video Submission, Assn. for the Advancement of Artificial Intelligence (AAAI) AI Video Competition},
urlvideo = {https://youtu.be/x0RL7Jh6F9s?list=PLuOoXrWK6Kz5ySULxGMtAUdZEg9SkXDoq},
year = {2016},
}

2015

An upper bound on the error of alignment-based transfer learning between two linear, time-invariant, scalar systems
K. V. Raimalwala, B. A. Francis, and A. P. Schoellig
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015, pp. 5253–5258.

Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. This paper studies a simplified TL scenario with the goal of understanding in which cases a simple, alignment-based transfer of data is possible and beneficial. Two linear, time-invariant (LTI), single-input, single-output systems are tasked to follow the same reference signal. A scalar, LTI transformation is applied to the output from a source system to align with the output from a target system. An upper bound on the 2-norm of the transformation error is derived for a large set of reference signals and is minimized with respect to the transformation scalar. Analysis shows that the minimized error bound is reduced for systems with poles that lie close to each other (that is, for systems with similar response times). This criterion is relaxed for systems with poles that have a larger negative real part (that is, for stable systems with fast response), meaning that poles can be further apart for the same minimized error bound. Additionally, numerical results show that using the reference signal as input to the transformation reduces the minimized bound further.

@INPROCEEDINGS{raimalwala-iros15,
author = {Kaizad V. Raimalwala and Bruce A. Francis and Angela P. Schoellig},
title = {An upper bound on the error of alignment-based transfer learning between two linear, time-invariant, scalar systems},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {5253--5258},
year = {2015},
doi = {10.1109/IROS.2015.7354118},
urllink = {http://hdl.handle.net/1807/69365},
note = {},
abstract = {Methods from machine learning have successfully been used to improve the performance of control systems in cases when accurate models of the system or the environment are not available. These methods require the use of data generated from physical trials. Transfer Learning (TL) allows for this data to come from a different, similar system. This paper studies a simplified TL scenario with the goal of understanding in which cases a simple, alignment-based transfer of data is possible and beneficial. Two linear, time-invariant (LTI), single-input, single-output systems are tasked to follow the same reference signal. A scalar, LTI transformation is applied to the output from a source system to align with the output from a target system. An upper bound on the 2-norm of the transformation error is derived for a large set of reference signals and is minimized with respect to the transformation scalar. Analysis shows that the minimized error bound is reduced for systems with poles that lie close to each other (that is, for systems with similar response times). This criterion is relaxed for systems with poles that have a larger negative real part (that is, for stable systems with fast response), meaning that poles can be further apart for the same minimized error bound. Additionally, numerical results show that using the reference signal as input to the transformation reduces the minimized bound further.}
}

Safe and robust learning control with Gaussian processes
F. Berkenkamp and A. P. Schoellig
in Proc. of the European Control Conference (ECC), 2015, pp. 2501–2506.

This paper introduces a learning-based robust control algorithm that provides robust stability and performance guarantees during learning. The approach uses Gaussian process (GP) regression based on data gathered during operation to update an initial model of the system and to gradually decrease the uncertainty related to this model. Embedding this data-based update scheme in a robust control framework guarantees stability during the learning process. Traditional robust control approaches have not considered online adaptation of the model and its uncertainty before. As a result, their controllers do not improve performance during operation. Typical machine learning algorithms that have achieved similar high-performance behavior by adapting the model and controller online do not provide the guarantees presented in this paper. In particular, this paper considers a stabilization task, linearizes the nonlinear, GP-based model around a desired operating point, and solves a convex optimization problem to obtain a linear robust controller. The resulting performance improvements due to the learning-based controller are demonstrated in experiments on a quadrotor vehicle.

@INPROCEEDINGS{berkenkamp-ecc15,
author = {Felix Berkenkamp and Angela P. Schoellig},
title = {Safe and robust learning control with {G}aussian processes},
booktitle = {{Proc. of the European Control Conference (ECC)}},
pages = {2501--2506},
year = {2015},
doi = {10.1109/ECC.2015.7330913},
urlvideo={https://youtu.be/YqhLnCm0KXY?list=PLC12E387419CEAFF2},
urlslides={../../wp-content/papercite-data/slides/berkenkamp-ecc15-slides.pdf},
abstract = {This paper introduces a learning-based robust control algorithm that provides robust stability and performance guarantees during learning. The approach uses Gaussian process (GP) regression based on data gathered during operation to update an initial model of the system and to gradually decrease the uncertainty related to this model. Embedding this data-based update scheme in a robust control framework guarantees stability during the learning process. Traditional robust control approaches have not considered online adaptation of the model and its uncertainty before. As a result, their controllers do not improve performance during operation. Typical machine learning algorithms that have achieved similar high-performance behavior by adapting the model and controller online do not provide the guarantees presented in this paper. In particular, this paper considers a stabilization task, linearizes the nonlinear, GP-based model around a desired operating point, and solves a convex optimization problem to obtain a linear robust controller. The resulting performance improvements due to the learning-based controller are demonstrated in experiments on a quadrotor vehicle.}
}

Conservative to confident: treating uncertainty robustly within learning-based control
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2015, pp. 421–427.

Robust control maintains stability and performance for a fixed amount of model uncertainty but can be conservative since the model is not updated online. Learning-based control, on the other hand, uses data to improve the model over time but is not typically guaranteed to be robust throughout the process. This paper proposes a novel combination of both ideas: a robust Min-Max Learning-Based Nonlinear Model Predictive Control (MM-LB-NMPC) algorithm. Based on an existing LB-NMPC algorithm, we present an efficient and robust extension, altering the NMPC performance objective to optimize for the worst-case scenario. The algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous trials as a function of system state, input, and other relevant variables. Nominal state sequences are predicted using an Unscented Transform and worst-case scenarios are defined as sequences bounding the 3σ confidence region. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results from testing on a 50 kg skid-steered robot executing a path-tracking task. The results show reductions in maximum lateral and heading path-tracking errors by up to 30% and a clear transition from robust control when the model uncertainty is high to optimal control when model uncertainty is reduced.

@INPROCEEDINGS{ostafew-icra15,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Conservative to confident: treating uncertainty robustly within learning-based control},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {421--427},
year = {2015},
doi = {10.1109/ICRA.2015.7139033},
note = {},
abstract = {Robust control maintains stability and performance for a fixed amount of model uncertainty but can be conservative since the model is not updated online. Learning-based control, on the other hand, uses data to improve the model over time but is not typically guaranteed to be robust throughout the process. This paper proposes a novel combination of both ideas: a robust Min-Max Learning-Based Nonlinear Model Predictive Control (MM-LB-NMPC) algorithm. Based on an existing LB-NMPC algorithm, we present an efficient and robust extension, altering the NMPC performance objective to optimize for the worst-case scenario. The algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous trials as a function of system state, input, and other relevant variables. Nominal state sequences are predicted using an Unscented Transform and worst-case scenarios are defined as sequences bounding the 3σ confidence region. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results from testing on a 50 kg skid-steered robot executing a path-tracking task. The results show reductions in maximum lateral and heading path-tracking errors by up to 30% and a clear transition from robust control when the model uncertainty is high to optimal control when model uncertainty is reduced.}
}

A flying drum machine
X. Wang, N. Dalal, T. Laidlow, and A. P. Schoellig
Technical Report, 2015.

This paper proposes the use of a quadrotor aerial vehicle as a musical instrument. Using the idea of interactions based on physical contact, a system is developed that enables humans to engage in artistic expression with a flying robot and produce music. A robotic user interface that uses physical interactions was created for a quadcopter. The interactive quadcopter was then programmed to drive playback of drum sounds in real-time in response to physical interaction. An intuitive mapping was developed between machine movement and art/creative composition. Challenges arose in meeting real-time latency requirements mainly due to delays in input detection. They were overcome through the development of a quick input detection method, which relies on accurate yet fast digital filtering. Successful experiments were conducted with a professional musician who used the interface to compose complex rhythm patterns. A video accompanying this paper demonstrates his performance.

@TECHREPORT{wang-tr15,
author = {Xingbo Wang and Natasha Dalal and Tristan Laidlow and Angela P. Schoellig},
title = {A Flying Drum Machine},
year = {2015},
urlvideo={https://youtu.be/d5zG-BWB7lE?list=PLD6AAACCBFFE64AC5},
abstract = {This paper proposes the use of a quadrotor aerial vehicle as a musical instrument. Using the idea of interactions based on physical contact, a system is developed that enables humans to engage in artistic expression with a flying robot and produce music. A robotic user interface that uses physical interactions was created for a quadcopter. The interactive quadcopter was then programmed to drive playback of drum sounds in real-time in response to physical interaction. An intuitive mapping was developed between machine movement and art/creative composition. Challenges arose in meeting real-time latency requirements mainly due to delays in input detection. They were overcome through the development of a quick input detection method, which relies on accurate yet fast digital filtering. Successful experiments were conducted with a professional musician who used the interface to compose complex rhythm patterns. A video accompanying this paper demonstrates his performance.}
}

Safe controller optimization for quadrotors with Gaussian processes
F. Berkenkamp, A. P. Schoellig, and A. Krause
Abstract and Presentation in Proc. of the Second Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015.

One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to design an initial controller, but ultimately, the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters, safety-critical system failures may happen. We overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SAFEOPT, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SAFEOPT automatically optimizes the parameters of a control law while guaranteeing system safety and stability. It achieves this by modeling the underlying performance measure as a Gaussian process and only exploring new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.

@MISC{berkenkamp-iros15,
author = {Felix Berkenkamp and Angela P. Schoellig and Andreas Krause},
title = {Safe controller optimization for quadrotors with {G}aussian processes},
howpublished = {Abstract and Presentation in Proc. of the Second Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year = {2015},
abstract = {One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to design an initial controller, but ultimately, the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters, safety-critical system failures may happen. We overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SAFEOPT, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SAFEOPT automatically optimizes the parameters of a control law while guaranteeing system safety and stability. It achieves this by modeling the underlying performance measure as a Gaussian process and only exploring new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.},
}

2014

Application-driven design of aerial communication networks
T. Andre, K. A. Hummel, A. P. Schoellig, E. Yanmaz, M. Asadpour, C. Bettstetter, P. Grippa, H. Hellwagner, S. Sand, and S. Zhang
IEEE Communications Magazine, vol. 52, iss. 5, pp. 129-137, 2014. Authors 1 to 4 contributed equally.

Networks of micro aerial vehicles (MAVs) equipped with various sensors are increasingly used for civil applications, such as monitoring, surveillance, and disaster management. In this article, we discuss the communication requirements raised by applications in MAV networks. We propose a novel system representation that can be used to specify different application demands. To this end, we extract key functionalities expected in an MAV network. We map these functionalities into building blocks to characterize the expected communication needs. Based on insights from our own and related real-world experiments, we discuss the capabilities of existing communications technologies and their limitations to implement the proposed building blocks. Our findings indicate that while certain requirements of MAV applications are met with available technologies, further research and development is needed to address the scalability, heterogeneity, safety, quality of service, and security aspects of multi-MAV systems.

@ARTICLE{andre-com14,
author = {Torsten Andre and Karin A. Hummel and Angela P. Schoellig and Evsen Yanmaz and Mahdi Asadpour and Christian Bettstetter and Pasquale Grippa and Hermann Hellwagner and Stephan Sand and Siwei Zhang},
title = {Application-driven design of aerial communication networks},
journal = {{IEEE Communications Magazine}},
note={Authors 1 to 4 contributed equally},
volume = {52},
number = {5},
pages = {129-137},
year = {2014},
doi = {10.1109/MCOM.2014.6815903},
urllink = {http://nes.aau.at/?p=1176},
abstract = {Networks of micro aerial vehicles (MAVs) equipped with various sensors are increasingly used for civil applications, such as monitoring, surveillance, and disaster management. In this article, we discuss the communication requirements raised by applications in MAV networks. We propose a novel system representation that can be used to specify different application demands. To this end, we extract key functionalities expected in an MAV network. We map these functionalities into building blocks to characterize the expected communication needs. Based on insights from our own and related real-world experiments, we discuss the capabilities of existing communications technologies and their limitations to implement the proposed building blocks. Our findings indicate that while certain requirements of MAV applications are met with available technologies, further research and development is needed to address the scalability, heterogeneity, safety, quality of service, and security aspects of multi-MAV systems.}
}

A platform for aerial robotics research and demonstration: The Flying Machine Arena
S. Lupashin, M. Hehn, M. W. Mueller, A. P. Schoellig, and R. D’Andrea
Mechatronics, vol. 24, iss. 1, pp. 41-54, 2014.

The Flying Machine Arena is a platform for experiments and demonstrations with fleets of small flying vehicles. It utilizes a distributed, modular architecture linked by robust communication layers. An estimation and control framework along with built-in system protection components enable prototyping of new control systems concepts and implementation of novel demonstrations. More recently, a mobile version has been featured at several eminent public events. We describe the architecture of the Arena from the viewpoint of system robustness and its capability as a dual-purpose research and demonstration platform.

@ARTICLE{lupashin-mech14,
author = {Sergei Lupashin and Markus Hehn and Mark W. Mueller and Angela P. Schoellig and Raffaello D'Andrea},
title = {A platform for aerial robotics research and demonstration: {The Flying Machine Arena}},
journal = {{Mechatronics}},
volume = {24},
number = {1},
pages = {41-54},
year = {2014},
doi = {10.1016/j.mechatronics.2013.11.006},
urllink = {http://flyingmachinearena.org/},
urlvideo={https://youtu.be/pcgvWhu8Arc?list=PLuLKX4lDsLIaVjdGsZxNBKLcogBnVVFQr},
abstract = {The Flying Machine Arena is a platform for experiments and demonstrations with fleets of small flying vehicles. It utilizes a distributed, modular architecture linked by robust communication layers. An estimation and control framework along with built-in system protection components enable prototyping of new control systems concepts and implementation of novel demonstrations. More recently, a mobile version has been featured at several eminent public events. We describe the architecture of the Arena from the viewpoint of system robustness and its capability as a dual-purpose research and demonstration platform.}
}

So you think you can dance? Rhythmic flight performances with quadrocopters
A. P. Schoellig, H. Siegel, F. Augugliaro, and R. D’Andrea
in Controls and Art, A. LaViers and M. Egerstedt, Eds., Springer International Publishing, 2014, pp. 73-105.

This chapter presents a set of algorithms that enable quadrotor vehicles to “fly with the music”; that is, to perform rhythmic motions that are aligned with the beat of a given music piece.

@INCOLLECTION{schoellig-springer14,
author = {Angela P. Schoellig and Hallie Siegel and Federico Augugliaro and Raffaello D'Andrea},
title = {So you think you can dance? {Rhythmic} flight performances with quadrocopters},
booktitle = {{Controls and Art}},
editor = {Amy LaViers and Magnus Egerstedt},
publisher = {Springer International Publishing},
pages = {73-105},
year = {2014},
doi = {10.1007/978-3-319-03904-6_4},
urldata={../../wp-content/papercite-data/data/schoellig-springer14-files.zip},
urlslides={../../wp-content/papercite-data/slides/schoellig-springer14-slides.pdf},
urllink = {http://www.tiny.cc/MusicInMotionSite},
urlvideo={https://www.youtube.com/playlist?list=PLD6AAACCBFFE64AC5},
abstract = {This chapter presents a set of algorithms that enable quadrotor vehicles to "fly with the music"; that is, to perform rhythmic motions that are aligned with the beat of a given music piece.}
}

Learning-based robust control: guaranteeing stability while improving performance
F. Berkenkamp and A. P. Schoellig
in Proc. of the Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2014.

To control dynamic systems, modern control theory relies on accurate mathematical models that describe the system behavior. Machine learning methods have proven to be an effective method to compensate for initial errors in these models and to achieve high-performance maneuvers by adapting the system model and control online. However, these methods usually do not guarantee stability during the learning process. On the other hand, the control community has traditionally accounted for model uncertainties by designing robust controllers. Robust controllers use a mathematical description of the uncertainty in the dynamic system derived prior to operation and guarantee robust stability for all uncertainties. Unlike machine learning methods, robust control does not improve the control performance by adapting the model online. This paper combines machine learning and robust control theory for the first time with the goal of improving control performance while guaranteeing stability. Data gathered during operation is used to reduce the uncertainty in the model and to learn systematic errors. Specifically, a nonlinear, nonparametric model of the unknown dynamics is learned with a Gaussian Process. This model is used for the computation of a linear robust controller, which guarantees stability around an operating point for all uncertainties. As a result, the robust controller improves its performance online while guaranteeing robust stability. A simulation example illustrates the performance improvements due to the learning-based robust controller.

@INPROCEEDINGS{berkenkamp-iros14,
author = {Felix Berkenkamp and Angela P. Schoellig},
title = {Learning-based robust control: guaranteeing stability while improving performance},
booktitle = {{Proc. of the Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
year = {2014},
urllink = {http://www.cs.unm.edu/amprg/mlpc14Workshop/},
urlvideo={https://youtu.be/YqhLnCm0KXY?list=PLC12E387419CEAFF2},
urlslides={../../wp-content/papercite-data/slides/berkenkamp-iros14-slides.pdf},
abstract = {To control dynamic systems, modern control theory relies on accurate mathematical models that describe the system behavior. Machine learning methods have proven to be an effective method to compensate for initial errors in these models and to achieve high-performance maneuvers by adapting the system model and control online. However, these methods usually do not guarantee stability during the learning process. On the other hand, the control community has traditionally accounted for model uncertainties by designing robust controllers. Robust controllers use a mathematical description of the uncertainty in the dynamic system derived prior to operation and guarantee robust stability for all uncertainties. Unlike machine learning methods, robust control does not improve the control performance by adapting the model online. This paper combines machine learning and robust control theory for the first time with the goal of improving control performance while guaranteeing stability. Data gathered during operation is used to reduce the uncertainty in the model and to learn systematic errors. Specifically, a nonlinear, nonparametric model of the unknown dynamics is learned with a Gaussian Process. This model is used for the computation of a linear robust controller, which guarantees stability around an operating point for all uncertainties. As a result, the robust controller improves its performance online while guaranteeing robust stability. A simulation example illustrates the performance improvements due to the learning-based robust controller.}
}

Design of norm-optimal iterative learning controllers: the effect of an iteration-domain Kalman filter for disturbance estimation
N. Degen and A. P. Schoellig
in Proc. of the IEEE Conference on Decision and Control (CDC), 2014, pp. 3590-3596.

Iterative learning control (ILC) has proven to be an effective method for improving the performance of repetitive control tasks. This paper revisits two optimization-based ILC algorithms: (i) the widely used quadratic-criterion ILC law (QILC) and (ii) an estimation-based ILC law using an iteration-domain Kalman filter (K-ILC). The goal of this paper is to analytically compare both algorithms and to highlight the advantages of the Kalman-filter-enhanced algorithm. We first show that for an iteration-constant estimation gain and an appropriate choice of learning parameters both algorithms are identical. We then show that the estimation-enhanced algorithm with its iteration-varying optimal Kalman gains can achieve both fast initial convergence and good noise rejection by (optimally) adapting the learning update rule over the course of an experiment. We conclude that the clear separation of disturbance estimation and input update of the K-ILC algorithm provides an intuitive architecture to design learning schemes that achieve both low noise sensitivity and fast convergence. To benchmark the algorithms we use a simulation of a single-input, single-output mass-spring-damper system.

@INPROCEEDINGS{degen-cdc14,
author = {Nicolas Degen and Angela P. Schoellig},
title = {Design of norm-optimal iterative learning controllers: the effect of an iteration-domain {K}alman filter for disturbance estimation},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
pages = {3590-3596},
year = {2014},
doi = {10.1109/CDC.2014.7039947},
urlslides = {../../wp-content/papercite-data/slides/degen-cdc14-slides.pdf},
abstract = {Iterative learning control (ILC) has proven to be an effective method for improving the performance of repetitive control tasks. This paper revisits two optimization-based ILC algorithms: (i) the widely used quadratic-criterion ILC law (QILC) and (ii) an estimation-based ILC law using an iteration-domain Kalman filter (K-ILC). The goal of this paper is to analytically compare both algorithms and to highlight the advantages of the Kalman-filter-enhanced algorithm. We first show that for an iteration-constant estimation gain and an appropriate choice of learning parameters both algorithms are identical. We then show that the estimation-enhanced algorithm with its iteration-varying optimal Kalman gains can achieve both fast initial convergence and good noise rejection by (optimally) adapting the learning update rule over the course of an experiment. We conclude that the clear separation of disturbance estimation and input update of the K-ILC algorithm provides an intuitive architecture to design learning schemes that achieve both low noise sensitivity and fast convergence. To benchmark the algorithms we use a simulation of a single-input, single-output mass-spring-damper system.}
}

Learning-based nonlinear model predictive control to improve vision-based mobile robot path-tracking in challenging outdoor environments
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2014, pp. 4029-4036.

This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm for an autonomous mobile robot to reduce path-tracking errors over repeated traverses along a reference path. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous traversals as a function of system state, input and other relevant variables. Modelling the disturbance as a GP enables interpolation and extrapolation of learned disturbances, a key feature of this algorithm. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 1.8 km of travel by a four-wheeled, 50 kg robot travelling through challenging terrain (including steep, uneven hills) and by a six-wheeled, 160 kg robot learning disturbances caused by unmodelled dynamics at speeds ranging from 0.35 m/s to 1.0 m/s. The speed is scheduled to balance trial time, path-tracking errors, and localization reliability based on previous experience. The results show that the system can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.

@INPROCEEDINGS{ostafew-icra14,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Learning-based nonlinear model predictive control to improve vision-based mobile robot path-tracking in challenging outdoor environments},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {4029-4036},
year = {2014},
doi = {10.1109/ICRA.2014.6907444},
urlvideo = {https://youtu.be/MwVElAn95-M?list=PLC12E387419CEAFF2},
abstract = {This paper presents a Learning-based Nonlinear Model Predictive Control (LB-NMPC) algorithm for an autonomous mobile robot to reduce path-tracking errors over repeated traverses along a reference path. The LB-NMPC algorithm uses a simple a priori vehicle model and a learned disturbance model. Disturbances are modelled as a Gaussian Process (GP) based on experience collected during previous traversals as a function of system state, input and other relevant variables. Modelling the disturbance as a GP enables interpolation and extrapolation of learned disturbances, a key feature of this algorithm. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied environments. The paper presents experimental results including over 1.8 km of travel by a four-wheeled, 50 kg robot travelling through challenging terrain (including steep, uneven hills) and by a six-wheeled, 160 kg robot learning disturbances caused by unmodelled dynamics at speeds ranging from 0.35 m/s to 1.0 m/s. The speed is scheduled to balance trial time, path-tracking errors, and localization reliability based on previous experience. The results show that the system can start from a generic a priori vehicle model and subsequently learn to reduce vehicle- and trajectory-specific path-tracking errors based on experience.}
}

Speed daemon: experience-based mobile robot speed scheduling
C. J. Ostafew, A. P. Schoellig, T. D. Barfoot, and J. Collier
in Proc. of the International Conference on Computer and Robot Vision (CRV), 2014, pp. 56-62. Best Robotics Paper Award.

A time-optimal speed schedule results in a mobile robot driving along a planned path at or near the limits of the robot’s capability. However, deriving models to predict the effect of increased speed can be very difficult. In this paper, we present a speed scheduler that uses previous experience, instead of complex models, to generate time-optimal speed schedules. The algorithm is designed for a vision-based, path-repeating mobile robot and uses experience to ensure reliable localization, low path-tracking errors, and realizable control inputs while maximizing the speed along the path. To our knowledge, this is the first speed scheduler to incorporate experience from previous path traversals in order to address system constraints. The proposed speed scheduler was tested in over 4 km of path traversals in outdoor terrain using a large Ackermann-steered robot travelling between 0.5 m/s and 2.0 m/s. The approach to speed scheduling is shown to generate fast speed schedules while remaining within the limits of the robot’s capability.

@INPROCEEDINGS{ostafew-crv14,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot and J. Collier},
title = {Speed daemon: experience-based mobile robot speed scheduling},
booktitle = {{Proc. of the International Conference on Computer and Robot Vision (CRV)}},
pages = {56-62},
year = {2014},
doi = {10.1109/CRV.2014.16},
urlvideo = {https://youtu.be/Pu3_F6k6Fa4?list=PLC12E387419CEAFF2},
abstract = {A time-optimal speed schedule results in a mobile robot driving along a planned path at or near the limits of the robot's capability. However, deriving models to predict the effect of increased speed can be very difficult. In this paper, we present a speed scheduler that uses previous experience, instead of complex models, to generate time-optimal speed schedules. The algorithm is designed for a vision-based, path-repeating mobile robot and uses experience to ensure reliable localization, low path-tracking errors, and realizable control inputs while maximizing the speed along the path. To our knowledge, this is the first speed scheduler to incorporate experience from previous path traversals in order to address system constraints. The proposed speed scheduler was tested in over 4 km of path traversals in outdoor terrain using a large Ackermann-steered robot travelling between 0.5 m/s and 2.0 m/s. The approach to speed scheduling is shown to generate fast speed schedules while remaining within the limits of the robot's capability.},
note = {Best Robotics Paper Award}
}

A proof-of-concept demonstration of visual teach and repeat on a quadrocopter using an altitude sensor and a monocular camera
A. Pfrunder, A. P. Schoellig, and T. D. Barfoot
in Proc. of the International Conference on Computer and Robot Vision (CRV), 2014, pp. 238-245.

This paper applies an existing vision-based navigation algorithm to a micro aerial vehicle (MAV). The algorithm has previously been used for long-range navigation of ground robots based on on-board 3D vision sensors such as stereo or Kinect cameras. A teach-and-repeat operational strategy enables a robot to autonomously repeat a manually taught route without relying on an external positioning system such as GPS. For MAVs, we show that a monocular downward-looking camera combined with an altitude sensor can be used as a 3D vision sensor, replacing other resource-expensive 3D vision solutions. The paper also includes a simple path tracking controller that uses feedback from the visual and inertial sensors to guide the vehicle along a straight and level path. Preliminary experimental results demonstrate reliable, accurate and fully autonomous flight of an 8-m-long (straight and level) route, which was taught with the quadrocopter fixed to a cart. Finally, we present the successful flight of a more complex, 16-m-long route.

@INPROCEEDINGS{pfrunder-crv14,
author = {Andreas Pfrunder and Angela P. Schoellig and Timothy D. Barfoot},
title = {A proof-of-concept demonstration of visual teach and repeat on a quadrocopter using an altitude sensor and a monocular camera},
booktitle = {{Proc. of the International Conference on Computer and Robot Vision (CRV)}},
pages = {238-245},
year = {2014},
doi = {10.1109/CRV.2014.40},
urlvideo = {https://youtu.be/BRDvK4xD8ZY?list=PLuLKX4lDsLIaJEVTsuTAVdDJDx0xmzxXr},
urlslides = {../../wp-content/papercite-data/slides/pfrunder-crv14-slides.pdf},
abstract = {This paper applies an existing vision-based navigation algorithm to a micro aerial vehicle (MAV). The algorithm has previously been used for long-range navigation of ground robots based on on-board 3D vision sensors such as stereo or Kinect cameras. A teach-and-repeat operational strategy enables a robot to autonomously repeat a manually taught route without relying on an external positioning system such as GPS. For MAVs, we show that a monocular downward-looking camera combined with an altitude sensor can be used as a 3D vision sensor, replacing other resource-expensive 3D vision solutions. The paper also includes a simple path tracking controller that uses feedback from the visual and inertial sensors to guide the vehicle along a straight and level path. Preliminary experimental results demonstrate reliable, accurate and fully autonomous flight of an 8-m-long (straight and level) route, which was taught with the quadrocopter fixed to a cart. Finally, we present the successful flight of a more complex, 16-m-long route.}
}

2013

Dance of the flying machines: methods for designing and executing an aerial dance choreography
F. Augugliaro, A. P. Schoellig, and R. D’Andrea
IEEE Robotics & Automation Magazine, vol. 20, iss. 4, pp. 96-104, 2013.

Imagine a troupe of dancers flying together across a big open stage, their movement choreographed to the rhythm of the music. Their performance is both coordinated and skilled; the dancers are well rehearsed, and the choreography well suited to their abilities. They are no ordinary dancers, however, and this is not an ordinary stage. The performers are quadrocopters, and the stage is the ETH Zurich Flying Machine Arena, a state-of-the-art mobile testbed for aerial motion control research.

@ARTICLE{augugliaro-ram13,
author = {Federico Augugliaro and Angela P. Schoellig and Raffaello D'Andrea},
title = {Dance of the Flying Machines: Methods for Designing and Executing an Aerial Dance Choreography},
journal = {{IEEE Robotics \& Automation Magazine}},
volume = {20},
number = {4},
pages = {96-104},
year = {2013},
doi = {10.1109/MRA.2013.2275693},
urlvideo={http://youtu.be/NRL_1ozDQCA?t=21s},
urlslides={../../wp-content/papercite-data/slides/augugliaro-ram13-slides.pdf},
abstract = {Imagine a troupe of dancers flying together across a big open stage, their movement choreographed to the rhythm of the music. Their performance is both coordinated and skilled; the dancers are well rehearsed, and the choreography well suited to their abilities. They are no ordinary dancers, however, and this is not an ordinary stage. The performers are quadrocopters, and the stage is the ETH Zurich Flying Machine Arena, a state-of-the-art mobile testbed for aerial motion control research.}
}

Visual teach and repeat, repeat, repeat: iterative learning control to improve mobile robot path tracking in challenging outdoor environments
C. J. Ostafew, A. P. Schoellig, and T. D. Barfoot
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013, pp. 176-181.

This paper presents a path-repeating, mobile robot controller that combines a feedforward, proportional Iterative Learning Control (ILC) algorithm with a feedback-linearized path-tracking controller to reduce path-tracking errors over repeated traverses along a reference path. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied, extreme environments. The paper presents experimental results including over 600 m of travel by a four-wheeled, 50 kg robot travelling through challenging terrain including steep hills and sandy turns and by a six-wheeled, 160 kg robot at gradually-increased speeds up to three times faster than the nominal, safe speed. In the absence of a global localization system, ILC is demonstrated to reduce path-tracking errors caused by unmodelled robot dynamics and terrain challenges.

@INPROCEEDINGS{ostafew-iros13,
author = {Chris J. Ostafew and Angela P. Schoellig and Timothy D. Barfoot},
title = {Visual teach and repeat, repeat, repeat: Iterative learning control to improve mobile robot path tracking in challenging outdoor environments},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {176-181},
year = {2013},
doi = {10.1109/IROS.2013.6696350},
urlvideo = {https://youtu.be/08_d1HSPADA?list=PLC12E387419CEAFF2},
abstract = {This paper presents a path-repeating, mobile robot controller that combines a feedforward, proportional Iterative Learning Control (ILC) algorithm with a feedback-linearized path-tracking controller to reduce path-tracking errors over repeated traverses along a reference path. Localization for the controller is provided by an on-board, vision-based mapping and navigation system enabling operation in large-scale, GPS-denied, extreme environments. The paper presents experimental results including over 600 m of travel by a four-wheeled, 50 kg robot travelling through challenging terrain including steep hills and sandy turns and by a six-wheeled, 160 kg robot at gradually-increased speeds up to three times faster than the nominal, safe speed. In the absence of a global localization system, ILC is demonstrated to reduce path-tracking errors caused by unmodelled robot dynamics and terrain challenges.}
}

Improving tracking performance by learning from past data
A. P. Schoellig
PhD Thesis, Diss. ETH No. 20593, Institute for Dynamic Systems and Control, ETH Zurich, Switzerland, 2013. Awards: ETH Medal, Dimitris N. Chorafas Foundation Prize.

@PHDTHESIS{schoellig-eth13,
author = {Angela P. Schoellig},
title = {Improving tracking performance by learning from past data},
school = {Diss. ETH No. 20593, Institute for Dynamic Systems and Control, ETH Zurich},
doi = {10.3929/ethz-a-009758916},
year = {2013},
address = {Switzerland},
urlabstract = {../../wp-content/papercite-data/pdf/schoellig-eth13-abstract.pdf},
urlslides = {../../wp-content/papercite-data/slides/schoellig-eth13-slides.pdf},
urlvideo = {https://youtu.be/zHTCsSkmADo?list=PLC12E387419CEAFF2},
urlvideo2 = {https://youtu.be/7r281vgfotg?list=PLD6AAACCBFFE64AC5},
note = {Awards: ETH Medal, Dimitris N. Chorafas Foundation Prize}
}

2012

Optimization-based iterative learning for precise quadrocopter trajectory tracking
A. P. Schoellig, F. L. Mueller, and R. D’Andrea
Autonomous Robots, vol. 33, iss. 1-2, pp. 103-127, 2012.

Current control systems regulate the behavior of dynamic systems by reacting to noise and unexpected disturbances as they occur. To improve the performance of such control systems, experience from iterative executions can be used to anticipate recurring disturbances and proactively compensate for them. This paper presents an algorithm that exploits data from previous repetitions in order to learn to precisely follow a predefined trajectory. We adapt the feed-forward input signal to the system with the goal of achieving high tracking performance – even under the presence of model errors and other recurring disturbances. The approach is based on a dynamics model that captures the essential features of the system and that explicitly takes system input and state constraints into account. We combine traditional optimal filtering methods with state-of-the-art optimization techniques in order to obtain an effective and computationally efficient learning strategy that updates the feed-forward input signal according to a customizable learning objective. It is possible to define a termination condition that stops an execution early if the deviation from the nominal trajectory exceeds a given bound. This allows for a safe learning that gradually extends the time horizon of the trajectory. We developed a framework for generating arbitrary flight trajectories and for applying the algorithm to highly maneuverable autonomous quadrotor vehicles in the ETH Flying Machine Arena testbed. Experimental results are discussed for selected trajectories and different learning algorithm parameters.

@ARTICLE{schoellig-auro12,
author = {Angela P. Schoellig and Fabian L. Mueller and Raffaello D'Andrea},
title = {Optimization-based iterative learning for precise quadrocopter trajectory tracking},
journal = {{Autonomous Robots}},
volume = {33},
number = {1-2},
pages = {103-127},
year = {2012},
doi = {10.1007/s10514-012-9283-2},
urlvideo={http://youtu.be/goVuP5TJIUU?list=PLC12E387419CEAFF2},
abstract = {Current control systems regulate the behavior of dynamic systems by reacting to noise and unexpected disturbances as they occur. To improve the performance of such control systems, experience from iterative executions can be used to anticipate recurring disturbances and proactively compensate for them. This paper presents an algorithm that exploits data from previous repetitions in order to learn to precisely follow a predefined trajectory. We adapt the feed-forward input signal to the system with the goal of achieving high tracking performance - even under the presence of model errors and other recurring disturbances. The approach is based on a dynamics model that captures the essential features of the system and that explicitly takes system input and state constraints into account. We combine traditional optimal filtering methods with state-of-the-art optimization techniques in order to obtain an effective and computationally efficient learning strategy that updates the feed-forward input signal according to a customizable learning objective. It is possible to define a termination condition that stops an execution early if the deviation from the nominal trajectory exceeds a given bound. This allows for a safe learning that gradually extends the time horizon of the trajectory. We developed a framework for generating arbitrary flight trajectories and for applying the algorithm to highly maneuverable autonomous quadrotor vehicles in the ETH Flying Machine Arena testbed. Experimental results are discussed for selected trajectories and different learning algorithm parameters.}
}

Limited benefit of joint estimation in multi-agent iterative learning
A. P. Schoellig, J. Alonso-Mora, and R. D’Andrea
Asian Journal of Control, vol. 14, iss. 3, pp. 613-623, 2012.

This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual’s learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent’s disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. When the agents are identical and noise comes from measurement only, joint estimation yields a noticeable improvement in performance. However, when process noise is encountered or when the agents have an individual disturbance component, the benefit of joint estimation is negligible.

@ARTICLE{schoellig-ajc12,
author = {Angela P. Schoellig and Javier Alonso-Mora and Raffaello D'Andrea},
title = {Limited benefit of joint estimation in multi-agent iterative learning},
journal = {{Asian Journal of Control}},
volume = {14},
number = {3},
pages = {613-623},
year = {2012},
doi = {10.1002/asjc.398},
urldata={../../wp-content/papercite-data/data/schoellig-ajc12-files.zip},
urlslides={../../wp-content/papercite-data/slides/schoellig-ajc12-slides.pdf},
abstract = {This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual's learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent's disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. When the agents are identical and noise comes from measurement only, joint estimation yields a noticeable improvement in performance. However, when process noise is encountered or when the agents have an individual disturbance component, the benefit of joint estimation is negligible.}
}

Generation of collision-free trajectories for a quadrocopter fleet: a sequential convex programming approach
F. Augugliaro, A. P. Schoellig, and R. D’Andrea
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 1917-1922.

This paper presents an algorithm that generates collision-free trajectories in three dimensions for multiple vehicles within seconds. The problem is cast as a non-convex optimization problem, which is iteratively solved using sequential convex programming that approximates non-convex constraints by using convex ones. The method generates trajectories that account for simple dynamics constraints and is thus independent of the vehicle’s type. An extensive a posteriori vehicle-specific feasibility check is included in the algorithm. The algorithm is applied to a quadrocopter fleet. Experimental results are shown.

@INPROCEEDINGS{augugliaro-iros12,
author = {Federico Augugliaro and Angela P. Schoellig and Raffaello D'Andrea},
title = {Generation of collision-free trajectories for a quadrocopter fleet: A sequential convex programming approach},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {1917-1922},
year = {2012},
doi = {10.1109/IROS.2012.6385823},
urlvideo = {https://youtu.be/wwK7WvvUvlI?list=PLD6AAACCBFFE64AC5},
abstract = {This paper presents an algorithm that generates collision-free trajectories in three dimensions for multiple vehicles within seconds. The problem is cast as a non-convex optimization problem, which is iteratively solved using sequential convex programming that approximates non-convex constraints by using convex ones. The method generates trajectories that account for simple dynamics constraints and is thus independent of the vehicle's type. An extensive a posteriori vehicle-specific feasibility check is included in the algorithm. The algorithm is applied to a quadrocopter fleet. Experimental results are shown.}
}

Iterative learning of feed-forward corrections for high-performance tracking
F. L. Mueller, A. P. Schoellig, and R. D’Andrea
in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 3276-3281.

We revisit a recently developed iterative learning algorithm that enables systems to learn from a repeated operation with the goal of achieving high tracking performance of a given trajectory. The learning scheme is based on a coarse dynamics model of the system and uses past measurements to iteratively adapt the feed-forward input signal to the system. The novelty of this work is an identification routine that uses a numerical simulation of the system dynamics to extract the required model information. This allows the learning algorithm to be applied to any dynamic system for which a dynamics simulation is available (including systems with underlying feedback loops). The proposed learning algorithm is applied to a quadrocopter system that is guided by a trajectory-following controller. With the identification routine, we are able to extend our previous learning results to three-dimensional quadrocopter motions and achieve significantly higher tracking accuracy due to the underlying feedback control, which accounts for non-repetitive noise.

@INPROCEEDINGS{mueller-iros12,
author = {Fabian L. Mueller and Angela P. Schoellig and Raffaello D'Andrea},
title = {Iterative learning of feed-forward corrections for high-performance tracking},
booktitle = {{Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {3276-3281},
year = {2012},
doi = {10.1109/IROS.2012.6385647},
urlvideo = {https://youtu.be/zHTCsSkmADo?list=PLC12E387419CEAFF2},
urlslides = {../../wp-content/papercite-data/slides/mueller-iros12-slides.pdf},
abstract = {We revisit a recently developed iterative learning algorithm that enables systems to learn from a repeated operation with the goal of achieving high tracking performance of a given trajectory. The learning scheme is based on a coarse dynamics model of the system and uses past measurements to iteratively adapt the feed-forward input signal to the system. The novelty of this work is an identification routine that uses a numerical simulation of the system dynamics to extract the required model information. This allows the learning algorithm to be applied to any dynamic system for which a dynamics simulation is available (including systems with underlying feedback loops). The proposed learning algorithm is applied to a quadrocopter system that is guided by a trajectory-following controller. With the identification routine, we are able to extend our previous learning results to three-dimensional quadrocopter motions and achieve significantly higher tracking accuracy due to the underlying feedback control, which accounts for non-repetitive noise.}
}

Feed-forward parameter identification for precise periodic quadrocopter motions
A. P. Schoellig, C. Wiltsche, and R. D’Andrea
in Proc. of the American Control Conference (ACC), 2012, pp. 4313-4318.

This paper presents an approach for precisely tracking periodic trajectories with a quadrocopter. In order to improve temporal and spatial tracking performance, we propose a feed-forward strategy that adapts the motion parameters sent to the vehicle controller. The motion parameters are either adjusted on the fly or, in order to avoid initial transients, identified prior to the flight performance. We outline an identification scheme that tunes parameters for a large class of periodic motions, and requires only a small number of identification experiments prior to flight. This reduced identification is based on analysis and experiments showing that the quadrocopter’s closed-loop dynamics can be approximated by three directionally decoupled linear systems. We show the effectiveness of this approach by performing a sequence of periodic motions on real quadrocopters using the tuned parameters obtained by the reduced identification.

@INPROCEEDINGS{schoellig-acc12,
author = {Angela P. Schoellig and Clemens Wiltsche and Raffaello D'Andrea},
title = {Feed-forward parameter identification for precise periodic quadrocopter motions},
booktitle = {{Proc. of the American Control Conference (ACC)}},
pages = {4313-4318},
year = {2012},
doi = {10.1109/ACC.2012.6315248},
urlvideo = {http://tiny.cc/MusicInMotion},
urlslides = {../../wp-content/papercite-data/slides/schoellig-acc12-slides.pdf},
abstract = {This paper presents an approach for precisely tracking periodic trajectories with a quadrocopter. In order to improve temporal and spatial tracking performance, we propose a feed-forward strategy that adapts the motion parameters sent to the vehicle controller. The motion parameters are either adjusted on the fly or, in order to avoid initial transients, identified prior to the flight performance. We outline an identification scheme that tunes parameters for a large class of periodic motions, and requires only a small number of identification experiments prior to flight. This reduced identification is based on analysis and experiments showing that the quadrocopter's closed-loop dynamics can be approximated by three directionally decoupled linear systems. We show the effectiveness of this approach by performing a sequence of periodic motions on real quadrocopters using the tuned parameters obtained by the reduced identification.}
}

An aerial robotics demonstration for controls research at the ETH Flying Machine Arena
R. Ritz, M. W. Müller, F. Augugliaro, M. Hehn, S. Lupashin, A. P. Schoellig, and R. D’Andrea
Swiss Society for Automatic Control Bulletin, iss. 463, pp. 2-15, 2012.

@MISC{ritz-sga12,
author = {Robin Ritz and Mark W. M{\"u}ller and Federico Augugliaro and Markus Hehn and Sergei Lupashin and Angela P. Schoellig and Raffaello D'Andrea},
title = {An aerial robotics demonstration for controls research at the {ETH Flying Machine Arena}},
year = {2012},
number = {463},
pages = {2-15},
howpublished = {Swiss Society for Automatic Control Bulletin}
}

Quadrocopter slalom learning
A. P. Schoellig, F. L. Mueller, and R. D’Andrea
Video Submission, AI and Robotics Multimedia Fair, Conference on Artificial Intelligence (AI), Assn. for the Advancement of Artificial Intelligence (AAAI), 2012.

@MISC{schoellig-aaai12,
author = {Angela P. Schoellig and Fabian L. Mueller and Raffaello D'Andrea},
title = {Quadrocopter Slalom Learning},
howpublished = {Video Submission, AI and Robotics Multimedia Fair, Conference on Artificial Intelligence (AI), Assn. for the Advancement of Artificial Intelligence (AAAI)},
urlvideo = {https://youtu.be/zHTCsSkmADo?list=PLC12E387419CEAFF2},
year = {2012},
}

2011

Sensitivity of joint estimation in multi-agent iterative learning control
A. P. Schoellig and R. D’Andrea
in Proc. of the IFAC (International Federation of Automatic Control) World Congress, 2011, pp. 1204-1212.

We consider a group of agents that simultaneously learn the same task, and revisit a previously developed algorithm, where agents share their information and learn jointly. We have already shown that, as compared to an independent learning model that disregards the information of the other agents, and when assuming similarity between the agents, a joint algorithm improves the learning performance of an individual agent. We now revisit the joint learning algorithm to determine its sensitivity to the underlying assumption of similarity between agents. We note that an incorrect assumption about the agents’ degree of similarity degrades the performance of the joint learning scheme. The degradation is particularly acute if we assume that the agents are more similar than they are in reality; in this case, a joint learning scheme can result in a poorer performance than the independent learning algorithm. In the worst case (when we assume that the agents are identical, but they are, in reality, not) the joint learning does not even converge to the correct value. We conclude that, when applying the joint algorithm, it is crucial not to overestimate the similarity of the agents; otherwise, a learning scheme that is independent of the similarity assumption is preferable.

@INPROCEEDINGS{schoellig-ifac11,
author = {Angela P. Schoellig and Raffaello D'Andrea},
title = {Sensitivity of joint estimation in multi-agent iterative learning control},
booktitle = {{Proc. of the IFAC (International Federation of Automatic Control) World Congress}},
pages = {1204-1212},
year = {2011},
doi = {10.3182/20110828-6-IT-1002.03687},
urlslides = {../../wp-content/papercite-data/slides/schoellig-ifac11-slides.pdf},
urldata = {../../wp-content/papercite-data/data/schoellig-ifac11-files.zip},
abstract = {We consider a group of agents that simultaneously learn the same task, and revisit a previously developed algorithm, where agents share their information and learn jointly. We have already shown that, as compared to an independent learning model that disregards the information of the other agents, and when assuming similarity between the agents, a joint algorithm improves the learning performance of an individual agent. We now revisit the joint learning algorithm to determine its sensitivity to the underlying assumption of similarity between agents. We note that an incorrect assumption about the agents' degree of similarity degrades the performance of the joint learning scheme. The degradation is particularly acute if we assume that the agents are more similar than they are in reality; in this case, a joint learning scheme can result in a poorer performance than the independent learning algorithm. In the worst case (when we assume that the agents are identical, but they are, in reality, not) the joint learning does not even converge to the correct value. We conclude that, when applying the joint algorithm, it is crucial not to overestimate the similarity of the agents; otherwise, a learning scheme that is independent of the similarity assumption is preferable.}
}

Feasibility of motion primitives for choreographed quadrocopter flight
A. P. Schoellig, M. Hehn, S. Lupashin, and R. D’Andrea
in Proc. of the American Control Conference (ACC), 2011, pp. 3843-3849.

This paper describes a method for checking the feasibility of quadrocopter motions. The approach, meant as a validation tool for preprogrammed quadrocopter performances, is based on first principles models and ensures that a desired trajectory respects both vehicle dynamics and motor thrust limits. We apply this method towards the eventual goal of using parameterized motion primitives for expressive quadrocopter choreographies. First, we show how a large class of motion primitives can be formulated as truncated Fourier series. We then show how the feasibility check can be applied to such motions by deriving explicit parameter constraints for two particular parameterized primitives. The predicted feasibility constraints are compared against experimental results from quadrocopters in the ETH Flying Machine Arena.

@INPROCEEDINGS{schoellig-acc11,
author = {Angela P. Schoellig and Markus Hehn and Sergei Lupashin and Raffaello D'Andrea},
title = {Feasibility of motion primitives for choreographed quadrocopter flight},
booktitle = {{Proc. of the American Control Conference (ACC)}},
pages = {3843-3849},
year = {2011},
doi = {10.1109/ACC.2011.5991482},
urlvideo = {https://www.youtube.com/playlist?list=PLD6AAACCBFFE64AC5},
urlslides = {../../wp-content/papercite-data/slides/schoellig-acc11-slides.pdf},
urldata = {../../wp-content/papercite-data/data/schoellig-acc11-files.zip},
abstract = {This paper describes a method for checking the feasibility of quadrocopter motions. The approach, meant as a validation tool for preprogrammed quadrocopter performances, is based on first principles models and ensures that a desired trajectory respects both vehicle dynamics and motor thrust limits. We apply this method towards the eventual goal of using parameterized motion primitives for expressive quadrocopter choreographies. First, we show how a large class of motion primitives can be formulated as truncated Fourier series. We then show how the feasibility check can be applied to such motions by deriving explicit parameter constraints for two particular parameterized primitives. The predicted feasibility constraints are compared against experimental results from quadrocopters in the ETH Flying Machine Arena.}
}

The Flying Machine Arena as of 2010
S. Lupashin, A. P. Schoellig, M. Hehn, and R. D’Andrea
Video Submission, in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2011, pp. 2970-2971.

The Flying Machine Arena (FMA) is an indoor research space built specifically for the study of autonomous systems and aerial robotics. In this video, we give an overview of this testbed and some of its capabilities. We show the FMA infrastructure and hardware, which includes a fleet of quadrocopters and a motion capture system for vehicle localization. The physical components of the FMA are complemented by specialized software tools and components that facilitate the use of the space and provide a unified framework for communication and control. The flexibility and modularity of the experimental platform are highlighted by various research projects and demonstrations.

@MISC{lupashin-icra11,
author = {Sergei Lupashin and Angela P. Schoellig and Markus Hehn and Raffaello D'Andrea},
title = {The {Flying Machine Arena} as of 2010},
howpublished = {Video Submission, in Proc. of the IEEE International Conference on Robotics and Automation (ICRA)},
year = {2011},
pages = {2970-2971},
doi = {10.1109/ICRA.2011.5980308},
urlvideo = {https://youtu.be/pcgvWhu8Arc?list=PLuLKX4lDsLIaVjdGsZxNBKLcogBnVVFQr},
urllink = {http://www.flyingmachinearena.org},
abstract = {The Flying Machine Arena (FMA) is an indoor research space built specifically for the study of autonomous systems and aerial robotics. In this video, we give an overview of this testbed and some of its capabilities. We show the FMA infrastructure and hardware, which includes a fleet of quadrocopters and a motion capture system for vehicle localization. The physical components of the FMA are complemented by specialized software tools and components that facilitate the use of the space and provide a unified framework for communication and control. The flexibility and modularity of the experimental platform are highlighted by various research projects and demonstrations.},
}

2010

A simple learning strategy for high-speed quadrocopter multi-flips
S. Lupashin, A. P. Schoellig, M. Sherback, and R. D’Andrea
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2010, pp. 1642-1648.

We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first-principles model. We start by formulating an N-flip maneuver as a five-step primitive with five adjustable parameters. Optimization using a low-order first-principles 2D vertical plane model of the quadrocopter yields an initial set of parameters and a corrective matrix. The maneuver is then repeatedly performed with the vehicle. At each iteration the state error at the end of the primitive is used to update the maneuver parameters via a gradient adjustment. The method is demonstrated at the ETH Zurich Flying Machine Arena testbed on quadrotor helicopters performing and improving on flips, double flips and triple flips.

@INPROCEEDINGS{lupashin-icra10,
author = {Sergei Lupashin and Angela P. Schoellig and Michael Sherback and Raffaello D'Andrea},
title = {A simple learning strategy for high-speed quadrocopter multi-flips},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {1642-1648},
year = {2010},
doi = {10.1109/ROBOT.2010.5509452},
urlvideo = {https://youtu.be/bWExDW9J9sA?list=PLC12E387419CEAFF2},
abstract = {We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first-principles model. We start by formulating an N-flip maneuver as a five-step primitive with five adjustable parameters. Optimization using a low-order first-principles 2D vertical plane model of the quadrocopter yields an initial set of parameters and a corrective matrix. The maneuver is then repeatedly performed with the vehicle. At each iteration the state error at the end of the primitive is used to update the maneuver parameters via a gradient adjustment. The method is demonstrated at the ETH Zurich Flying Machine Arena testbed on quadrotor helicopters performing and improving on flips, double flips and triple flips.}
}

Independent vs. joint estimation in multi-agent iterative learning control
A. P. Schoellig, J. Alonso-Mora, and R. D’Andrea
in Proc. of the IEEE Conference on Decision and Control (CDC), 2010, pp. 6949-6954.

This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. The agents improve their performance by using the knowledge gained from previous executions. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual’s learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent’s disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. We analytically derive an upper bound of the performance improvement due to joint estimation. Results are obtained for two limiting cases: (i) pure process noise, and (ii) pure measurement noise. The benefits of information sharing are negligible in (i). For (ii), a performance improvement is observed when a high similarity between the agents is guaranteed.

@INPROCEEDINGS{schoellig-cdc10,
author = {Angela P. Schoellig and Javier Alonso-Mora and Raffaello D'Andrea},
title = {Independent vs. joint estimation in multi-agent iterative learning control},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
pages = {6949-6954},
year = {2010},
doi = {10.1109/CDC.2010.5717888},
urlslides = {../../wp-content/papercite-data/slides/schoellig-cdc10-slides.pdf},
urldata = {../../wp-content/papercite-data/data/schoellig-cdc10-files.zip},
abstract = {This paper studies iterative learning control (ILC) in a multi-agent framework, wherein a group of agents simultaneously and repeatedly perform the same task. The agents improve their performance by using the knowledge gained from previous executions. Assuming similarity between the agents, we investigate whether exchanging information between the agents improves an individual's learning performance. That is, does an individual agent benefit from the experience of the other agents? We consider the multi-agent iterative learning problem as a two-step process of: first, estimating the repetitive disturbance of each agent; and second, correcting for it. We present a comparison of an agent's disturbance estimate in the case of (I) independent estimation, where each agent has access only to its own measurement, and (II) joint estimation, where information of all agents is globally accessible. We analytically derive an upper bound of the performance improvement due to joint estimation. Results are obtained for two limiting cases: (i) pure process noise, and (ii) pure measurement noise. The benefits of information sharing are negligible in (i). For (ii), a performance improvement is observed when a high similarity between the agents is guaranteed.}
}

A platform for dance performances with multiple quadrocopters
A. P. Schoellig, F. Augugliaro, and R. D’Andrea
in Proc. of the Workshop on Robots and Musical Expressions at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2010, pp. 1-8.

This paper presents a platform for rhythmic flight with multiple quadrocopters. We envision an expressive multimedia dance performance that is automatically composed and controlled, given a random piece of music. Results in this paper prove the feasibility of audio-motion synchronization when precisely timing the side-to-side motion of a quadrocopter to the beat of the music. An illustration of the indoor flight space and the vehicles shows the characteristics and capabilities of the experimental setup. Prospective features of the platform are outlined and key challenges are emphasized. The paper concludes with a proof-of-concept demonstration showing three vehicles synchronizing their side-to-side motion to the music beat. Moreover, a dance performance to a remix of the sound track ‘Pirates of the Caribbean’ gives a first impression of the novel musical experience. Future steps include an appropriate multiscale music analysis and the development of algorithms for the automated generation of choreography based on a database of motion primitives.

@INPROCEEDINGS{schoellig-iros10,
author = {Angela P. Schoellig and Federico Augugliaro and Raffaello D'Andrea},
title = {A platform for dance performances with multiple quadrocopters},
booktitle = {{Proc. of the Workshop on Robots and Musical Expressions at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
pages = {1-8},
year = {2010},
urlvideo = {https://youtu.be/aaaGJKnJdrg?list=PLD6AAACCBFFE64AC5},
urlvideo2 = {https://www.youtube.com/playlist?list=PLD6AAACCBFFE64AC5},
urlslides = {../../wp-content/papercite-data/slides/schoellig-iros10-slides.pdf},
abstract = {This paper presents a platform for rhythmic flight with multiple quadrocopters. We envision an expressive multimedia dance performance that is automatically composed and controlled, given a random piece of music. Results in this paper prove the feasibility of audio-motion synchronization when precisely timing the side-to-side motion of a quadrocopter to the beat of the music. An illustration of the indoor flight space and the vehicles shows the characteristics and capabilities of the experimental setup. Prospective features of the platform are outlined and key challenges are emphasized. The paper concludes with a proof-of-concept demonstration showing three vehicles synchronizing their side-to-side motion to the music beat. Moreover, a dance performance to a remix of the sound track 'Pirates of the Caribbean' gives a first impression of the novel musical experience. Future steps include an appropriate multiscale music analysis and the development of algorithms for the automated generation of choreography based on a database of motion primitives.}
}

Synchronizing the motion of a quadrocopter to music
A. P. Schoellig, F. Augugliaro, and R. D’Andrea
in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2010, pp. 3355-3360.

This paper presents a quadrocopter flying in rhythm to music. The quadrocopter performs a periodic side-to-side motion in time to a musical beat. Underlying controllers are designed that stabilize the vehicle and produce a swinging motion. Synchronization is then achieved by using concepts from phase-locked loops. A phase comparator combined with a correction algorithm eliminates the phase error between the music reference and the actual quadrocopter motion. Experimental results show fast and effective synchronization that is robust to sudden changes in the reference amplitude and frequency. Changes in frequency and amplitude are tracked precisely when adding an additional feedforward component, based on an experimentally determined look-up table.

@INPROCEEDINGS{schoellig-icra10,
author = {Angela P. Schoellig and Federico Augugliaro and Raffaello D'Andrea},
title = {Synchronizing the motion of a quadrocopter to music},
booktitle = {{Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}},
pages = {3355-3360},
year = {2010},
doi = {10.1109/ROBOT.2010.5509755},
urlslides = {../../wp-content/papercite-data/slides/schoellig-icra10-slides.pdf},
urlvideo = {https://youtu.be/Kx4DtXv_bPo?list=PLD6AAACCBFFE64AC5},
abstract = {This paper presents a quadrocopter flying in rhythm to music. The quadrocopter performs a periodic side-to-side motion in time to a musical beat. Underlying controllers are designed that stabilize the vehicle and produce a swinging motion. Synchronization is then achieved by using concepts from phase-locked loops. A phase comparator combined with a correction algorithm eliminates the phase error between the music reference and the actual quadrocopter motion. Experimental results show fast and effective synchronization that is robust to sudden changes in the reference amplitude and frequency. Changes in frequency and amplitude are tracked precisely when adding an additional feedforward component, based on an experimentally determined look-up table.}
}

2009

Optimization-based iterative learning control for trajectory tracking
A. P. Schoellig and R. D’Andrea
in Proc. of the European Control Conference (ECC), 2009, pp. 1505-1510.

In this paper, an optimization-based iterative learning control approach is presented. Given a desired trajectory to be followed, the proposed learning algorithm improves the system performance from trial to trial by exploiting the experience gained from previous repetitions. Taking advantage of the a-priori knowledge about the system's dominating dynamics, a data-based update rule is derived which adapts the feedforward input signal after each trial. By combining traditional model-based optimal filtering methods with state-of-the-art optimization techniques such as convex programming, an effective and computationally highly efficient learning strategy is obtained. Moreover, the derived formalism allows for the direct treatment of input and state constraints. Different (nonlinear) performance objectives can be specified defining the overall learning behavior. Finally, the proposed algorithm is successfully applied to the benchmark problem of swinging up a pendulum using open-loop control only.

@INPROCEEDINGS{schoellig-ecc09,
author = {Angela P. Schoellig and Raffaello D'Andrea},
title = {Optimization-based iterative learning control for trajectory tracking},
booktitle = {{Proc. of the European Control Conference (ECC)}},
pages = {1505-1510},
year = {2009},
urlslides = {../../wp-content/papercite-data/slides/schoellig-ecc09-slides.pdf},
urllink = {http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=7074619},
urlvideo = {https://youtu.be/W2gCn6aAwz4?list=PLC12E387419CEAFF2},
abstract = {In this paper, an optimization-based iterative learning control approach is presented. Given a desired trajectory to be followed, the proposed learning algorithm improves the system performance from trial to trial by exploiting the experience gained from previous repetitions. Taking advantage of the a-priori knowledge about the system's dominating dynamics, a data-based update rule is derived which adapts the feedforward input signal after each trial. By combining traditional model-based optimal filtering methods with state-of-the-art optimization techniques such as convex programming, an effective and computationally highly efficient learning strategy is obtained. Moreover, the derived formalism allows for the direct treatment of input and state constraints. Different (nonlinear) performance objectives can be specified defining the overall learning behavior. Finally, the proposed algorithm is successfully applied to the benchmark problem of swinging up a pendulum using open-loop control only.}
}

2008

Verification of the performance of selected subsystems for the LISA mission (in German)
P. F. Gath, D. Weise, T. Heinrich, A. P. Schoellig, and S. Otte
in Proc. of the German Aerospace Congress, German Society for Aeronautics and Astronautics (DGLR), 2008.

As part of the investigation into the system performance of alternative payload concepts for the LISA mission (Laser Interferometer Space Antenna), individual payload subsystems are currently being examined at Astrium with respect to their performance. This is done both through theoretical studies based on simulations and through experimental laboratory investigations.

@INPROCEEDINGS{gath-gac08,
author = {Peter F. Gath and Dennis Weise and Thomas Heinrich and Angela P. Schoellig and S. Otte},
title = {Verification of the performance of selected subsystems for the {LISA} mission {(in German)}},
booktitle = {{Proc. of the German Aerospace Congress, German Society for Aeronautics and Astronautics (DGLR)}},
year = {2008},
abstract = {As part of the investigation into the system performance of alternative payload concepts for the LISA mission (Laser Interferometer Space Antenna), individual payload subsystems are currently being examined at Astrium with respect to their performance. This is done both through theoretical studies based on simulations and through experimental laboratory investigations.}
}

Learning through experience – Optimizing performance by repetition
A. P. Schoellig and R. D’Andrea
Abstract and Poster, in Proc. of the Robotics Challenges for Machine Learning Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2008.

The goal of our research is to develop a strategy which enables a system, executing the same task multiple times, to use the knowledge of the previous trials to learn more about its own dynamics and enhance its future performance. Our approach, which falls in the field of iterative learning control, combines methods from both areas, traditional model-based estimation and control and purely data-based learning.

@MISC{schoellig-iros08,
author = {Angela P. Schoellig and Raffaello D'Andrea},
title = {Learning through experience -- {O}ptimizing performance by repetition},
howpublished = {Abstract and Poster, in Proc. of the Robotics Challenges for Machine Learning Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year = {2008},
urlvideo = {https://youtu.be/W2gCn6aAwz4?list=PLC12E387419CEAFF2},
urlslides = {../../wp-content/papercite-data/slides/schoellig-iros08-slides.pdf},
urllink = {http://www.learning-robots.de/pmwiki.php/TC/IROS2008},
abstract = {The goal of our research is to develop a strategy which enables a system, executing the same task multiple times, to use the knowledge of the previous trials to learn more about its own dynamics and enhance its future performance. Our approach, which falls in the field of iterative learning control, combines methods from both areas, traditional model-based estimation and control and purely data-based learning.},
}

2007

A hybrid Bellman equation for bimodal systems
P. Caines, M. Egerstedt, R. Malhame, and A. P. Schoellig
in Hybrid Systems: Computation and Control, A. Bemporad, A. Bicchi, and G. Buttazzo, Eds., Springer Berlin Heidelberg, 2007, vol. 4416, pp. 656-659.

In this paper we present a dynamic programming formulation of a hybrid optimal control problem for bimodal systems with regional dynamics. In particular, based on optimality-zone computations, a framework is presented in which the resulting hybrid Bellman equation guides the design of optimal control programs with, at most, N discrete transitions.

@INCOLLECTION{caines-springer07,
author={Peter Caines and Magnus Egerstedt and Roland Malhame and Angela P. Schoellig},
title={A Hybrid {Bellman} Equation for Bimodal Systems},
booktitle={{Hybrid Systems: Computation and Control}},
editor={Bemporad, Alberto and Bicchi, Antonio and Buttazzo, Giorgio},
publisher={Springer Berlin Heidelberg},
pages={656-659},
year={2007},
volume={4416},
series={Lecture Notes in Computer Science},
doi={10.1007/978-3-540-71493-4_54},
abstract = {In this paper we present a dynamic programming formulation of a hybrid optimal control problem for bimodal systems with regional dynamics. In particular, based on optimality-zone computations, a framework is presented in which the resulting hybrid Bellman equation guides the design of optimal control programs with, at most, N discrete transitions.}
}

A hybrid Bellman equation for systems with regional dynamics
A. P. Schoellig, P. E. Caines, M. Egerstedt, and R. P. Malhamé
in Proc. of the IEEE Conference on Decision and Control (CDC), 2007, pp. 3393-3398.

In this paper, we study hybrid systems with regional dynamics, i.e., systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, we focus our attention on the optimal control problem associated with such systems, and we present a Hybrid Bellman Equation for such systems that provides a characterization of global optimality, given an upper bound on the number of switches. Not surprisingly, the solution will be hybrid in nature in that it will depend on not only the continuous control signals, but also on discrete decisions as to what domains the system should go through in the first place. A number of examples are presented to highlight the operation of the proposed approach.

@INPROCEEDINGS{schoellig-cdc07,
author = {Angela P. Schoellig and Peter E. Caines and Magnus Egerstedt and Roland P. Malham\'e},
title = {A hybrid {B}ellman equation for systems with regional dynamics},
booktitle = {{Proc. of the IEEE Conference on Decision and Control (CDC)}},
pages = {3393-3398},
year = {2007},
doi = {10.1109/CDC.2007.4434952},
abstract = {In this paper, we study hybrid systems with regional dynamics, i.e., systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, we focus our attention on the optimal control problem associated with such systems, and we present a Hybrid Bellman Equation for such systems that provides a characterization of global optimality, given an upper bound on the number of switches. Not surprisingly, the solution will be hybrid in nature in that it will depend on not only the continuous control signals, but also on discrete decisions as to what domains the system should go through in the first place. A number of examples are presented to highlight the operation of the proposed approach.}
}

Topology-dependent stability of a network of dynamical systems with communication delays
A. P. Schoellig, U. Münz, and F. Allgöwer
in Proc. of the European Control Conference (ECC), 2007, pp. 1197-1202.

In this paper, we analyze the stability of a network of first-order linear time-invariant systems with constant, identical communication delays. We investigate the influence of both system parameters and network characteristics on stability. In particular, a non-conservative stability bound for the delay is given such that the network is asymptotically stable for any delay smaller than this bound. We show how the network topology changes the stability bound. As an example, we use these results to determine whether a symmetric or skew-symmetric interconnection is preferable for a given set of subsystems.

@INPROCEEDINGS{schoellig-ecc07,
author = {Angela P. Schoellig and Ulrich M\"unz and Frank Allg\"ower},
title = {Topology-Dependent Stability of a Network of Dynamical Systems with Communication Delays},
booktitle = {{Proc. of the European Control Conference (ECC)}},
pages = {1197-1202},
year = {2007},
urllink = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=7068977},
abstract = {In this paper, we analyze the stability of a network of first-order linear time-invariant systems with constant, identical communication delays. We investigate the influence of both system parameters and network characteristics on stability. In particular, a non-conservative stability bound for the delay is given such that the network is asymptotically stable for any delay smaller than this bound. We show how the network topology changes the stability bound. As an example, we use these results to determine whether a symmetric or skew-symmetric interconnection is preferable for a given set of subsystems.}
}

Optimal control of hybrid systems with regional dynamics
A. P. Schoellig
Master Thesis, Georgia Institute of Technology, USA, 2007.

In this work, hybrid systems with regional dynamics are considered. These are systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, the attention is focused on the optimal control problem associated with such systems. More precisely, given a specific cost function, the goal is to determine the optimal path of going from a given starting point to a fixed final state during an a priori specified time horizon. The key characteristic of the approach presented in this thesis is a hierarchical decomposition of the hybrid optimal control problem, yielding a framework that allows a solution on different levels of control. On the highest level of abstraction, the regional structure of the state space is taken into account and a discrete representation of the connections between the different regions provides global accessibility relations between regions. These are used on a lower level of control to formulate the main theorem of this work, namely, the Hybrid Bellman Equation for multimodal systems, which, in fact, provides a characterization of global optimality, given an upper bound on the number of transitions along a hybrid trajectory. Not surprisingly, the optimal solution is hybrid in nature, in that it depends on not only the continuous control signals, but also on discrete decisions as to what domains the system’s continuous state should go through in the first place. The main benefit of the proposed approach lies in the fact that a hierarchical Dynamic Programming algorithm can be used, representing both a theoretical characterization of the hybrid solution’s structural composition and, from a more application-driven point of view, a numerically implementable calculation rule yielding globally optimal solutions in a regional dynamics framework.
The operation of the recursive algorithm is highlighted by the consideration of numerous examples, among them, a heterogeneous multi-agent problem.

@MASTERSTHESIS{schoellig-gatech07,
author = {Angela P. Schoellig},
title = {Optimal control of hybrid systems with regional dynamics},
school = {Georgia Institute of Technology},
urllink = {http://hdl.handle.net/1853/19874},
urlslides = {../../wp-content/papercite-data/slides/schoellig-gatech07-slides.pdf},
year = {2007},
address = {USA},
abstract = {In this work, hybrid systems with regional dynamics are considered. These are systems where transitions between different dynamical regimes occur as the continuous state of the system reaches given switching surfaces. In particular, the attention is focused on the optimal control problem associated with such systems. More precisely, given a specific cost function, the goal is to determine the optimal path of going from a given starting point to a fixed final state during an a priori specified time horizon. The key characteristic of the approach presented in this thesis is a hierarchical decomposition of the hybrid optimal control problem, yielding a framework that allows a solution on different levels of control. On the highest level of abstraction, the regional structure of the state space is taken into account and a discrete representation of the connections between the different regions provides global accessibility relations between regions. These are used on a lower level of control to formulate the main theorem of this work, namely, the Hybrid Bellman Equation for multimodal systems, which, in fact, provides a characterization of global optimality, given an upper bound on the number of transitions along a hybrid trajectory. Not surprisingly, the optimal solution is hybrid in nature, in that it depends on not only the continuous control signals, but also on discrete decisions as to what domains the system's continuous state should go through in the first place. The main benefit of the proposed approach lies in the fact that a hierarchical Dynamic Programming algorithm can be used, representing both a theoretical characterization of the hybrid solution's structural composition and, from a more application-driven point of view, a numerically implementable calculation rule yielding globally optimal solutions in a regional dynamics framework.
The operation of the recursive algorithm is highlighted by the consideration of numerous examples, among them, a heterogeneous multi-agent problem.},
}

2006

Stability of a network of dynamical systems with communication delays (in German)
A. P. Schoellig
Semester Project, University of Stuttgart, Germany, 2006.

@MASTERSTHESIS{schoellig-stuttgart06,
author = {Angela P. Schoellig},
title = {Stability of a network of dynamical systems with communication delays {(in German)}},
school = {University of Stuttgart},
type = {Semester Project},
urlslides = {../../wp-content/papercite-data/slides/schoellig-stuttgart06-slides.pdf},
year = {2006},
address = {Germany}
}
