RSS’24 Workshop on Semantics for Robotics: From Environment Understanding and Reasoning to Safe Interaction

The workshop will take place at TU Delft on July 15, 2024.


For robots to safely interact with people and the real world, they need the capability to not only perceive but also understand their surroundings in a semantically meaningful way (i.e., understanding implications or pertinent properties associated with the objects in the scene). Advanced perception methods coupled with learning algorithms have made significant progress in enabling semantic understanding. Recent breakthroughs in foundation models have further exposed opportunities for robots to contextually reason about their operating environments. Semantics is ingrained in every aspect of robotics, from perception to action; reliably exploiting semantic information in embodied systems requires tightly coupled perception, learning, and control algorithm design (e.g., a robot in a warehouse must recognize objects on the floor and reason whether it is safe to run over them). By organizing this workshop, we hope to foster discussions on innovative approaches that harness semantic understanding for the design and deployment of intelligent embodied systems. We aim to facilitate an interdisciplinary exchange between researchers in robot learning, perception, mapping, and control to identify the opportunities and pressing challenges when incorporating semantics into robotic applications.

Our workshop comprises two general themes with invited talks from experts in each:

Theme A – Environment Understanding and Reasoning: In this theme, we aim to provide an overview of the recent advances in 3D spatial understanding and reasoning in robotics. The invited talks will cover state-of-the-art methods enabling robots to derive semantically meaningful information from diverse sensor modalities and natural language instructions for effective downstream decision-making. The panel discussions will delve into opportunities and challenges related to geometric and semantic representations, uncertainty-aware perception, data-driven learning methods, and (safety) evaluation in practical applications.

Theme B – Safe Interaction with the World: In this theme, we will highlight seminal approaches to planning and control in complex, interactive scenarios. The invited talks will encompass control-theoretic frameworks for safe learning-based decision-making under uncertainty, safe and compliant human-robot interaction, and high-level language instructions for non-expert robot operation. The panel discussion will explore strategies for incorporating semantic understanding into planning and control algorithms to enable robot interaction in complex environments and enhance safe operation in real-world deployment.


Discussion Topics

Our workshop has two general themes. The first theme is “Environment Understanding and Reasoning,” where speakers will provide an overview of the recent developments in 3D scene understanding, semantic mapping and localization, and language-conditioned contextual reasoning. The second theme, “Safe Interaction with the World,” will focus on downstream motion planning and control frameworks for safe interaction with the perceived world. A preliminary set of discussion questions is listed below.

Theme A: Environment Understanding and Reasoning

  • How do we efficiently represent the robot operating environment to facilitate the downstream planning and control tasks? What are the advantages and limitations of different representations (e.g., dense metric maps and scene graphs)?
  • How do we characterize and account for uncertainties from perception, especially in dynamic or changing scenes?
  • How do we meaningfully fuse high-dimensional sensor data (camera, lidar, radar) with foundation models for contextual reasoning in robotics?
  • World models have shown promising results in computer vision (e.g., video generation). What are the current challenges and opportunities in robotics?
  • How do ethical considerations come into play when designing robots with advanced semantic understanding for real-world applications?

Theme B: Safe Interaction with the World

  • How do we incorporate semantic understanding into a robot decision-making pipeline to enable safe interaction?
  • How do we propagate uncertainties from perception efficiently into planning and control to guarantee safety during deployment?
  • How do we map a high-level understanding of the environment to constraints and objectives that are compatible with current planning and control-theoretic frameworks? Or do we need to rethink planning and control in the age of generative models?
  • What dimensions of safety should be considered in real-world interactive scenarios, and how should safety be measured and benchmarked?
  • How can interdisciplinary collaboration between researchers in robot learning, perception, and control enhance the development of intelligent embodied systems?


Program

Below is a tentative program of the workshop. Times are in CEST.

Morning Session

08:45 – 09:00: Opening Remarks
09:00 – 10:00: Part 1: Invited Talks
10:00 – 10:30: Coffee Break and Poster Session
10:30 – 11:10: Part 2: Invited Talks
11:10 – 11:45: Morning Panel
11:45 – 12:30: Spotlight Talks
12:30 – 14:00: Lunch Break

Afternoon Session

14:00 – 15:30: Part 3: Invited Talks
15:30 – 16:00: Coffee Break and Poster Session
16:00 – 16:20: Part 4: Invited Talks
16:20 – 16:55: Afternoon Panel
16:55 – 17:00: Concluding Remarks

We will have an optional post-workshop social event; further details will be shared on the workshop day.

Call for Papers

We invite researchers from different disciplines to share novel ideas on topics pertinent to the workshop themes, which include but are not limited to:

  • Spatial perception methods incorporating semantic, geometric and multi-modal information into 3D mapping and state estimation algorithms
  • Efficient 3D object and environment representations from multi-modal sensor inputs
  • Uncertainty estimation for robust 3D perception
  • Contextual reasoning of the 3D environments (e.g., object relations, affordance, traversability)
  • Safe and risk-aware robot motion planning and control under geometric and/or semantic uncertainties
  • Robot skill acquisition and learning leveraging semantic information
  • Multi-agent collaboration through semantic information
  • Demonstration or position papers on foundation-model-based perception and decision-making methods

The review process will be single-blind. Accepted papers will be published on the workshop webpage and will be presented as a spotlight talk or as a poster. If you have any questions, please contact us at

Paper Format

Suggested Length: minimum 2 and maximum 4 pages excluding references
Style Template: RSS Paper Format

Important Dates

Initial Submission: May 15, 2024 (11:59 pm AoE)
Author Notification: May 31, 2024
Camera Ready Submission: June 15, 2024 (11:59 pm AoE)
Workshop Date: July 15, 2024

CMT Submission Link

Confirmed Invited Speakers

Theme A: Environment Understanding and Reasoning

Prof. Michael Milford
Queensland University of Technology (QUT)

Prof. Luca Carlone
Massachusetts Institute of Technology (MIT)

Prof. Angela Dai
Technical University of Munich (TUM)

Dr. Oier Mees
University of California, Berkeley (UCB)

Dr. Masha Itkina
Toyota Research Institute (TRI)

Theme B: Safe Interaction with the World

Prof. Marco Pavone
Stanford University and Nvidia

Prof. Andrea Bajcsy
Carnegie Mellon University (CMU)

Prof. Koushil Sreenath
University of California, Berkeley (UCB)

Prof. Florian Shkurti
University of Toronto (UofT)

Dr. Manuel Keppler
German Aerospace Center (DLR)


University of Toronto Institute for Aerospace Studies