Research

Robot Learning from Demonstration Using Elastic Maps

Learning from Demonstration (LfD) is a popular method of reproducing and generalizing robot skills from human-provided demonstrations. In this paper, we propose a novel optimization-based LfD method that encodes demonstrations as elastic maps. An elastic map is a graph of nodes connected through a mesh of springs. We build a skill model by fitting an elastic map to the set of demonstrations. The formulated optimization problem in our approach includes three objectives with natural and physical interpretations. The main term penalizes the mean squared error between the map and the demonstrations in Cartesian coordinates. The second term penalizes a non-equidistant distribution of nodes, yielding an optimal total trajectory length. The third term rewards smoothness while penalizing nonlinearity. These quadratic objectives form a convex problem that can be solved efficiently with local optimizers. We examine nine methods for constructing and weighting elastic maps and study their performance in robotic tasks. We also evaluate the proposed method in several simulated and real-world experiments using a UR5e manipulator arm, and compare it to other LfD approaches to demonstrate its benefits and flexibility across a variety of metrics.
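The three quadratic terms can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes one map node per (resampled) demonstration point, and the function name and the weights `lam` and `mu` are illustrative.

```python
import numpy as np

def fit_elastic_map(demo, lam=0.1, mu=1.0):
    """Fit an elastic map to one demonstration trajectory.

    demo: (N, d) array of Cartesian points; lam and mu weight the
    stretching and bending terms of the quadratic objective.
    """
    N, _ = demo.shape
    # Data term: nodes attracted to the demonstration points.
    A = np.eye(N)
    # Stretching term: first differences encourage equidistant nodes.
    E = np.diff(np.eye(N), axis=0)        # (N-1, N)
    # Bending term: second differences reward smoothness.
    R = np.diff(np.eye(N), n=2, axis=0)   # (N-2, N)
    # Normal equations of the convex quadratic objective:
    # min ||Y - demo||^2 + lam*||E Y||^2 + mu*||R Y||^2
    H = A + lam * (E.T @ E) + mu * (R.T @ R)
    return np.linalg.solve(H, demo)
```

Because all three terms are quadratic, the minimizer comes from a single linear solve, which is what makes the model cheap to fit.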

Brendan Hertel, Matthew Pelland, S. Reza Ahmadzadeh, "Robot Learning from Demonstration Using Elastic Maps," In Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, pp. 7407--7413, Oct. 23-27, 2022. [arXiv][IEEE][video] 

Methods for Combining and Representing Non-Contextual Autonomy Scores for Unmanned Aerial Systems

Measuring an overall autonomy score for a robotic system requires combining a set of relevant aspects and features of the system that might be measured in different units, be qualitative, and/or be discordant. In this paper, we build upon an existing non-contextual autonomy framework that measures and combines the Autonomy Level and the Component Performance of a system into an overall autonomy score. We examine several methods of combining features, showing how some methods yield different rankings from the same data. We discuss resolving this issue by employing the weighted product method. Furthermore, we introduce two new means of representing relative and absolute autonomy scores, namely, the autonomy coordinate system and the autonomy distance, which represents the overall autonomy of a system. We apply our method to a set of seven Unmanned Aerial Systems (UAS) and obtain their absolute autonomy scores as well as their relative scores with respect to the best system.
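The weighted product method at the center of the ranking discussion is easy to state. A sketch with illustrative feature values; the actual framework combines Autonomy Level and Component Performance features.

```python
def weighted_product_score(features, weights):
    """Weighted product method (WPM): score = prod(f_k ** w_k).

    Scaling any single feature by a constant scales every system's
    score by the same factor, so rankings are invariant to the units
    each feature happens to be measured in.
    """
    score = 1.0
    for f, w in zip(features, weights):
        score *= f ** w
    return score
```

A weighted sum, by contrast, can flip a ranking when one feature is rescaled; the product form is what makes mixed-unit features safe to combine.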

Brendan Hertel, Ryan Donald, Christian Dumas, S. Reza Ahmadzadeh, "Methods for Combining and Representing Non-Contextual Autonomy Scores for Unmanned Aerial Systems," In Proc. 8th International Conference on Automation, Robotics, and Applications (ICARA), Prague, Czech Republic, pp. 135--139, Feb. 18-20, 2022. [IEEE][arXiv] 

Similarity-Aware Skill Reproduction Based on Multi-Representational Learning from Demonstration

Learning from Demonstration (LfD) algorithms enable humans to teach new skills to robots through demonstrations. The learned skills can be robustly reproduced from identical or nearby boundary conditions (e.g., initial point). However, when generalizing a learned skill over boundary conditions with higher variance, the similarity of the reproductions changes from one boundary condition to another, and a single LfD representation cannot preserve a consistent similarity across a generalization region. We propose a novel similarity-aware framework including multiple LfD representations and a similarity metric that can improve skill generalization by finding reproductions with the highest similarity values for a given boundary condition. Given a demonstration of the skill, our framework constructs a similarity region around a point of interest (e.g., initial point) by evaluating individual LfD representations using the similarity metric. Any point within this region corresponds to a representation that reproduces the skill with the greatest similarity. We validate our multi-representational framework in three simulated and four sets of real-world experiments using a physical 6-DOF robot. We also evaluate 11 different similarity metrics and categorize them according to their biases in 286 simulated experiments.
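The selection loop is simple once the representations and the metric exist. In this sketch the two "representations" and the metric are toy stand-ins (a rigid shift and a linear blend, scored by negative mean squared error), not the LfD models or the 11 metrics studied in the paper.

```python
import numpy as np

def reproduce_best(demo, representations, new_start, similarity):
    """Return the reproduction from new_start that is most similar to
    the demonstration, and the index of the winning representation."""
    repros = [rep(demo, new_start) for rep in representations]
    scores = [similarity(demo, r) for r in repros]
    best = int(np.argmax(scores))
    return repros[best], best

# Toy stand-in representations (hypothetical, for illustration only):
def shift_repro(demo, start):
    """Rigidly translate the whole demo to the new start."""
    return demo + (start - demo[0])

def blend_repro(demo, start):
    """Blend the start offset to zero along the trajectory."""
    w = np.linspace(1.0, 0.0, len(demo))[:, None]
    return demo + w * (start - demo[0])

def neg_mse(a, b):
    """Similarity as negative mean squared error."""
    return -np.mean((a - b) ** 2)
```

For a small start offset the blend deviates from the demonstration less than the rigid shift, so it wins; under a different boundary condition or metric the choice can flip, which is the motivation for keeping multiple representations.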

Brendan Hertel, S. Reza Ahmadzadeh, "Similarity-aware Skill Reproduction based on Multi-representational Learning from Demonstration," In Proc. International Conference on Advanced Robotics (ICAR 2021), Ljubljana, Slovenia, pp. 652--657, Dec. 6-10, 2021. [IEEE][arXiv][video] 

Learning from Successful and Failed Demonstrations via Optimization

Learning from Demonstration (LfD) is a popular approach that allows humans to teach robots new skills by showing the correct way(s) of performing the desired skill. Human-provided demonstrations, however, are not always optimal, and the teacher usually addresses this issue by discarding or replacing sub-optimal (noisy or faulty) demonstrations. We propose a novel LfD representation that learns from both successful and failed demonstrations of a skill. Our approach encodes the two subsets of captured demonstrations (labeled by the teacher) into a statistical skill model, constructs a set of quadratic costs, and finds an optimal reproduction of the skill under novel problem conditions (i.e., constraints). The optimal reproduction balances convergence towards successful examples and divergence from failed examples. We evaluate our approach through several 2D and 3D real-world experiments using a UR5e manipulator arm and also show that it can reproduce a skill from only failed demonstrations. The benefits of exploiting both failed and successful demonstrations are shown through comparison with two existing LfD approaches. We also compare our approach against an existing skill refinement method and show its capabilities in a multi-coordinate setting.
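The convergence/divergence balance can be seen in a one-point version of the quadratic cost. A simplified sketch, assuming a single mean for each subset and an illustrative weight `beta`; the paper's model is richer (statistical encoding, constraints, multiple coordinates).

```python
import numpy as np

def reproduce(success_mean, fail_mean, beta=0.5):
    """Minimize J(y) = ||y - s||^2 - beta * ||y - f||^2 pointwise.

    For 0 <= beta < 1 the cost stays convex, and setting the gradient
    to zero gives the closed form y = (s - beta * f) / (1 - beta):
    the reproduction is attracted to the successful mean and pushed
    away from the failed mean.
    """
    assert 0 <= beta < 1, "beta must stay below 1 for convexity"
    return (success_mean - beta * fail_mean) / (1.0 - beta)
```

With `beta = 0` the reproduction simply tracks the successful demonstrations; as `beta` grows, it is repelled further from the failed ones.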

Reference:

Brendan Hertel, S. Reza Ahmadzadeh, "Learning from Successful and Failed Demonstrations via Optimization," In Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021), Prague, Czech Republic, pp. 7784--7789, Sept. 27 - Oct. 1, 2021. [IEEE][pdf][arXiv][video][talk][lightning talk] 

Reward-Sharing Relational Networks in Multi-Agent Reinforcement Learning as a Framework for Emergent Behavior

In this work, we integrate ‘social’ interactions into the MARL setup through a user-defined relational network and examine the effects of agent-agent relations on the rise of emergent behaviors. Leveraging insights from sociology and neuroscience, our proposed framework models agent relationships using the notion of Reward-Sharing Relational Networks (RSRN), where network edge weights act as a measure of how much one agent is invested in the success of (or ‘cares about’) another. We construct relational rewards as a function of the RSRN interaction weights to collectively train the multi-agent system via a multi-agent reinforcement learning algorithm. The performance of the system is tested for a 3-agent scenario with different relational network structures (e.g., self-interested, communitarian, and authoritarian networks). Our results indicate that reward-sharing relational networks can significantly influence learned behaviors. We posit that RSRN can act as a framework where different relational networks produce distinct emergent behaviors, often analogous to the intuited sociological understanding of such networks.
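The reward-sharing step itself is one matrix-vector product. A sketch with the three networks named in the abstract; the authoritarian encoding below is one plausible choice (every agent invested only in the leader, agent 0), not necessarily the paper's exact matrices.

```python
import numpy as np

def relational_rewards(W, r):
    """Mix individual rewards through the relational network.

    Agent i's training signal is sum_j W[i, j] * r[j], so edge weight
    W[i, j] measures how much agent i 'cares about' agent j's success.
    """
    return np.asarray(W) @ np.asarray(r)

# Example 3-agent relational networks:
self_interested = np.eye(3)                  # each agent only values itself
communitarian = np.full((3, 3), 1.0 / 3.0)   # everyone shares equally
authoritarian = np.array([[1.0, 0.0, 0.0],   # leader values only itself
                          [1.0, 0.0, 0.0],   # followers value only the leader
                          [1.0, 0.0, 0.0]])
```

Training each agent on its mixed reward rather than its raw one is what lets different network structures drive different emergent behaviors.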

Reference:

H. Haeri, S. R. Ahmadzadeh, K. Jerath, "Reward-Sharing Relational Networks in Multi-Agent Reinforcement Learning as a Framework for Emergent Behavior," in Adaptive and Learning Agents (ALA) Workshop at AAMAS, London, UK, May 3-4, 2021. 

[pdf][ALA][Website]

Benchmark for Skill Learning from Demonstration: Impact of User Experience, Task Complexity, and Start Configuration on Performance

In this work, we contribute a large-scale study benchmarking the performance of multiple motion-based learning from demonstration approaches. Given the number and diversity of existing methods, it is critical that comprehensive empirical studies be performed comparing the relative strengths of these learning techniques. In particular, we evaluate four different approaches based on properties an end user may desire for real-world tasks. To perform this evaluation, we collected data from nine participants, across four different manipulation tasks with varying starting conditions. The resulting demonstrations were used to train 180 task models and evaluated on 720 task reproductions on a physical robot. Our results detail how i) complexity of the task, ii) the expertise of the human demonstrator, and iii) the starting configuration of the robot affect task performance. The collected dataset of demonstrations, robot executions, and evaluations is being made publicly available. Research insights and guidelines are also provided to guide future research and deployment choices about these approaches.

Reference:

M. A. Rana, D. Chen, S. R. Ahmadzadeh, J. Williams, V. Chu, S. Chernova, "Benchmark for Skill Learning from Demonstration: Impact of User Experience, Task Complexity, and Start Configuration on Performance," In Proc. IEEE Intl Conf. on Robotics and Automation (ICRA 2020), Paris, France, pp. 7561--7567, 31 May - 4 June, 2020. 

[pdf][Website]

Towards Mobile Multi-Task Manipulation in a Confined and Integrated Environment with Irregular Objects

The FetchIt! Mobile Manipulation Challenge, held at the IEEE International Conference on Robotics and Automation (ICRA) in May 2019, offered an environment with complex and integrated task sets, irregular objects, confined space, and machining, introducing new challenges in the mobile manipulation domain. Here we describe our efforts to address these challenges by demonstrating the assembly of a kit of mechanical parts in a caddy. In addition to implementation details, we examine the issues in this task set extensively, and we discuss our software architecture in the hope of providing a base for other researchers. To evaluate performance and consistency, we conducted 20 full runs, then examined failure cases with possible solutions. We conclude by identifying future research directions to address the open challenges.

Reference:

Z. Han, J. Allspaw, G. LeMasurier, J. Parrillo, D. Giger, S. R. Ahmadzadeh, H. Yanco, "Towards Mobile Multi-Task Manipulation in a Confined and Integrated Environment with Irregular Objects," In Proc. IEEE Intl Conf. on Robotics and Automation (ICRA 2020), Paris, France, pp. 11025--11031, 31 May - 4 June, 2020.

[pdf]

Trajectory Learning using Generalized Cylinders

In this article, we introduce Trajectory Learning using Generalized Cylinders (TLGC), a novel trajectory-based skill learning approach from human demonstrations. To model a demonstrated skill, TLGC uses a Generalized Cylinder, a geometric representation composed of an arbitrary space curve called the spine and a surface with smoothly varying cross-sections. Our approach is the first application of Generalized Cylinders to manipulation, and its geometric representation offers several key features: it identifies and extracts the implicit characteristics and boundaries of the skill by encoding the demonstration space; it supports the generation of multiple skill reproductions that maintain those characteristics; the constructed model can generalize the skill to unforeseen situations through trajectory editing techniques; and it allows for obstacle avoidance and interactive human refinement of the model through kinesthetic correction. We validate our approach through a set of real-world experiments with both a Jaco 6-DOF and a Sawyer 7-DOF robotic arm.
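The geometric idea can be sketched with circular cross-sections: a spine through the demonstrations plus a radius profile that bounds them. An illustration only, assuming time-aligned, equal-length demonstrations; TLGC supports general smoothly varying cross-sections, and the helper names are ours.

```python
import numpy as np

def generalized_cylinder(demos):
    """Build a simple generalized cylinder from aligned demonstrations.

    Spine = pointwise mean of the demos; cross-section radius at each
    index = maximum pointwise deviation of any demo from the spine.
    """
    D = np.stack(demos)                                     # (K, T, d)
    spine = D.mean(axis=0)                                  # (T, d)
    radii = np.linalg.norm(D - spine, axis=2).max(axis=0)   # (T,)
    return spine, radii

def inside(traj, spine, radii):
    """True if a reproduction stays within the encoded demo space."""
    dist = np.linalg.norm(traj - spine, axis=1)
    return bool(np.all(dist <= radii + 1e-9))
```

Any trajectory that stays inside the cylinder preserves the demonstrated boundaries, which is what makes reproduction, generalization, and obstacle avoidance expressible as trajectory edits within the volume.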

Reference:

S. R. Ahmadzadeh, S. Chernova, "Trajectory-based Skill Learning using Generalized Cylinders," Frontiers in Robotics and AI, section Human-Robot Interaction, vol. 5, 2018.

[pdf]

Skill Learning via Cost Balancing

We propose a learning framework, named Multi-Coordinate Cost Balancing (MCCB), to address the problem of acquiring point-to-point movement skills from demonstrations. MCCB encodes demonstrations simultaneously in multiple differential coordinates that specify local geometric properties. MCCB generates reproductions by solving a convex optimization problem with a multi-coordinate cost function and linear constraints on the reproductions, such as initial, target, and via points. Further, since the relative importance of each coordinate system in the cost function might be unknown for a given skill, MCCB learns optimal weighting factors that balance the cost function. We demonstrate the effectiveness of MCCB via detailed experiments conducted on one handwriting dataset and three complex skill datasets.
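With the weights fixed, the reproduction step is an equality-constrained quadratic program. A sketch assuming three differential coordinates (identity, tangent, Laplacian) and hand-picked weights; learning the balancing weights automatically is the part MCCB adds and is omitted here.

```python
import numpy as np

def mccb_reproduce(demo, weights, start, goal):
    """Reproduce a trajectory by balancing costs in multiple
    differential coordinates, with endpoint equality constraints.

    demo: (N, d) trajectory; weights: one per coordinate system.
    """
    N, _ = demo.shape
    I = np.eye(N)
    # Differential coordinate operators: identity, tangent, Laplacian.
    coords = [I, np.diff(I, axis=0), np.diff(I, n=2, axis=0)]
    # Quadratic cost: sum_k w_k * ||L_k y - L_k demo||^2
    H = sum(w * (L.T @ L) for w, L in zip(weights, coords))
    b = sum(w * (L.T @ (L @ demo)) for w, L in zip(weights, coords))
    # Linear constraints y[0] = start, y[-1] = goal, solved via KKT.
    C = np.zeros((2, N))
    C[0, 0] = 1.0
    C[1, -1] = 1.0
    K = np.block([[H, C.T], [C, np.zeros((2, 2))]])
    rhs = np.vstack([b, np.vstack([start, goal])])
    sol = np.linalg.solve(K, rhs)
    return sol[:N]
```

Because the cost is convex quadratic and the constraints are linear, a single KKT solve yields the reproduction; via points would simply add rows to `C`.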

Reference:

H. Ravichandar*, S. R. Ahmadzadeh*, M. A. Rana, S. Chernova, "Skill Acquisition via Automated Multi-Coordinate Cost Balancing," In Proc. IEEE Intl Conf. on Robotics and Automation, (ICRA 2019), Montreal, Canada, pp. 7776--7782, 20-24 May, 2019.

[pdf][arXiv]
