Research
Collaborative Adaptation for Recovery from Unforeseen Malfunctions in Discrete and Continuous MARL Domains
Cooperative multi-agent reinforcement learning methods often struggle to recover from unforeseen malfunctions once agents have converged on a cooperation strategy. To overcome this limitation, we present the Collaborative Adaptation (CA) framework, highlighting its unique capability to operate in both continuous and discrete domains. Our framework enhances the adaptability of agents to unexpected failures by integrating inter-agent relationships into their learning processes, thereby accelerating recovery from malfunctions. We evaluated our framework's performance through experiments in both discrete and continuous environments. Empirical results reveal that in scenarios involving unforeseen malfunctions, state-of-the-art algorithms often converge on sub-optimal solutions, whereas the proposed CA framework mitigates the impact of the malfunction and recovers more effectively.
Reference:
Yasin Findik, Hunter Hasenfus, Reza Azadeh, "Collaborative Adaptation for Recovery from Unforeseen Malfunctions in Discrete and Continuous MARL Domains," In Proc. 63rd IEEE Conference on Decision and Control (CDC), Milan, Italy, pp. xxxx--xxxx, Dec. 16-19, 2024.
[pdf][arXiv]
Relational Weight Optimization for Enhancing Team Performance in Multi-Agent Multi-Armed Bandits
We introduce an approach to improve team performance in a Multi-Agent Multi-Armed Bandit (MAMAB) framework using the Fastest Mixing Markov Chain (FMMC) and Fastest Distributed Linear Averaging (FDLA) optimization algorithms. The multi-agent team is represented by a fixed relational network and simulated using the Coop-UCB2 algorithm. The edge weights of the communication network directly impact the time taken to reach distributed consensus. Our goal is to shrink the timescale on which consensus convergence occurs, thereby improving team performance and maximizing reward. Through our experiments, we show that convergence to team consensus occurs slightly faster in large constrained networks.
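As a rough illustration of the edge-weight optimization involved, the sketch below solves the standard FDLA problem (minimizing the spectral norm of W - (1/n)11^T over symmetric weight matrices supported on the communication graph) with CVXPY. The four-node ring graph and all variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np
import cvxpy as cp

# Hypothetical 4-node team graph given as an edge list (not from the paper).
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

W = cp.Variable((n, n), symmetric=True)
J = np.ones((n, n)) / n

constraints = [W @ np.ones(n) == np.ones(n)]           # rows sum to 1 (averaging step)
allowed = set(edges) | {(j, i) for i, j in edges}
for i in range(n):
    for j in range(n):
        if i != j and (i, j) not in allowed:
            constraints.append(W[i, j] == 0)            # respect the graph sparsity pattern

# FDLA: minimize the spectral norm of W - (1/n)11^T, which bounds the
# asymptotic convergence rate of the linear averaging iteration x <- W x.
prob = cp.Problem(cp.Minimize(cp.norm(W - J, 2)), constraints)
prob.solve()
print("optimal spectral norm:", prob.value)
print("edge weights:\n", np.round(W.value, 3))
```

In this formulation, smaller spectral norm means faster mixing toward consensus, which is the quantity the paper ties to team performance.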
Reference:
Monish Reddy Kotturu, Saniya Vahedian Movahed, Kshitij Jerath, Paul Robinette, Reza Azadeh, "Relational Weight Optimization for Enhancing Team Performance in Multi-Agent Multi-Armed Bandits," In Proc. 4th Modeling, Estimation and Control Conference (MECC) 2024, Chicago, IL, USA, pp. xx--xx, Oct. 27-30, 2024.
[pdf]
Graph Attention Inference of Network Topology in Multi-Agent Systems
Accurately predicting the states of multi-agent systems and identifying their underlying graph structures remains a difficult challenge. Our work introduces a novel machine-learning-based solution that leverages the attention mechanism to predict future states of multi-agent systems by learning node representations. The graph structure is then inferred from the strength of the attention values. This approach is applied to both linear consensus dynamics and the non-linear dynamics of Kuramoto oscillators, learning the graph by learning good agent representations. Our results demonstrate that the presented data-driven graph attention machine learning model can identify the network topology in multi-agent systems with unknown dynamics, as evidenced by the F1 scores achieved in link prediction.
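A minimal, self-contained sketch of the general idea follows (the architecture, dimensions, and training data below are our own assumptions, not the paper's model): a single dot-product attention layer is trained to predict each agent's next state from all current states, and the learned attention matrix is thresholded to propose an adjacency matrix.

```python
import torch
import torch.nn as nn

class AttentionDynamicsModel(nn.Module):
    """Predict next agent states with dot-product attention over current states."""
    def __init__(self, state_dim: int, hidden_dim: int = 32):
        super().__init__()
        self.q = nn.Linear(state_dim, hidden_dim)
        self.k = nn.Linear(state_dim, hidden_dim)
        self.v = nn.Linear(state_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, state_dim)

    def forward(self, x):                          # x: (batch, n_agents, state_dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(1, 2) / (q.shape[-1] ** 0.5)
        attn = torch.softmax(scores, dim=-1)       # (batch, n_agents, n_agents)
        return self.out(attn @ v), attn

# Toy training loop on hypothetical (x_t, x_{t+1}) trajectory pairs.
n_agents, state_dim = 5, 2
model = AttentionDynamicsModel(state_dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_t = torch.randn(256, n_agents, state_dim)        # placeholder data, not real dynamics
x_next = torch.randn(256, n_agents, state_dim)     # placeholder data, not real dynamics

for _ in range(100):
    pred, _ = model(x_t)
    loss = torch.nn.functional.mse_loss(pred, x_next)
    opt.zero_grad(); loss.backward(); opt.step()

# Infer a candidate adjacency matrix from averaged attention strengths.
with torch.no_grad():
    _, attn = model(x_t)
adjacency = (attn.mean(dim=0) > attn.mean()).int()  # crude above-average threshold
```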
Reference:
Akshay Kolli, Reza Azadeh, Kshitij Jerath, "Graph Attention Inference of Network Topology in Multi-Agent Systems," In Proc. 4th Modeling, Estimation and Control Conference (MECC) 2024, Chicago, IL, USA, pp. xx--xx, Oct. 27-30, 2024.
[pdf]
Comparing a 2D Keyboard and Mouse Interface to Virtual Reality for Human-in-the-Loop Robot Planning for Mobile Manipulation
Human-in-the-loop robot teleoperation interfaces enable operators to control robots to complete complex tasks, as seen by the success of teams in the DARPA Robotics Challenge (DRC). In this work, we compare two human-in-the-loop planning interfaces, a 2D keyboard and mouse (KBM) interface modeled after those used in the DRC and a 3D virtual reality (VR) interface, for teleoperating a robot to perform navigation and manipulation tasks. In our study, we investigated operator performance and cognitive workload while using each interface, as well as its perceived usability. We found that participants had better performance in both task types when using the KBM interface; however, they experienced fewer collisions between the robot and the world when using the VR interface. Given these findings, we recommend utilizing a KBM interface in low-risk situations where task performance is the primary factor. In high-risk scenarios, where collisions can be detrimental, we recommend using VR. With this work, we aim to contribute to building effective and intuitive interfaces for human-in-the-loop planning to allow robots to complete complex tasks in challenging environments.
Reference:
Gregory LeMasurier, James Tukpah, Murphy Wonsick, Jordan Allspaw, Brendan Hertel, Jacob Epstein, Reza Azadeh, Taskin Padir, Holly Yanco, Elizabeth Phillips, "Comparing a 2D Keyboard and Mouse Interface to Virtual Reality for Human-in-the-Loop Robot Planning for Mobile Manipulation," In Proc. 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), Pasadena, CA, USA, pp. xx--xx, Aug. 26-30, 2024.
[pdf]
Do Humans Have Different Expectations Regarding Humans and Robots' Morality?
The growing implementation of robots in societal contexts necessitates a deeper exploration of the dynamics of trust between humans and robots. This exploration should expand beyond traditional viewpoints that primarily emphasize the influence of robot performance. In the burgeoning area of social robotics, fine-tuning a robot's personality traits is increasingly recognized as a crucial element in shaping users' experiences during human-robot interaction (HRI). Research in this field has led to the creation of trust scales that encompass various trust dimensions in HRI. These scales include aspects related to performance as well as moral dimensions. Our previous study revealed that breaches of moral trust by robots impact human trust more negatively than performance trust breaches, and that humans take retaliatory approaches in response to morality breaches by robots. In the present study, our main aim was to explore whether trust loss and retaliation tendencies differ based on the identity of the teammate following violations of these different trust aspects. Through multiple versions of an online search task, we examined our research questions and found that breaches of morality by robotic teammates cause a significantly higher trust loss in humans compared to human teammates. These findings highlight the importance of a robot's morality in determining how humans view a robot's trustworthiness. For effective robot design, robots must meet ethical and moral standards that are higher than those expected from humans.
Reference:
Zahra Rezaei Khavas, Monish Reddy Kotturu, Reza Azadeh, Paul Robinette, "Do Humans Have Different Expectations Regarding Humans and Robots' Morality?," In Proc. 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), Pasadena, CA, USA, pp. xx--xx, Aug. 26-30, 2024.
[pdf]
Relational Q-Functionals: Multi-Agent Learning to Recover from Unforeseen Robot Malfunctions in Continuous Action Domains
Cooperative multi-agent learning methods are essential in developing effective cooperation strategies in multi-agent domains. In robotics, these methods extend beyond multi-robot scenarios to single-robot systems, where they enable coordination among different robot modules (e.g., robot legs or joints). However, current methods often struggle to quickly adapt to unforeseen failures, such as a malfunctioning robot leg, especially after the algorithm has converged to a strategy. To overcome this, we introduce the Relational Q-Functionals (RQF) framework. RQF leverages a relational network, representing agents' relationships, to enhance adaptability, providing resilience against malfunction(s). Our algorithm also efficiently handles continuous state-action domains, making it adept for robotic learning tasks. Our empirical results show that RQF enables agents to use these relationships effectively to facilitate cooperation and recover from an unexpected malfunction in single-robot systems with multiple interacting modules. Thus, our approach offers promising applications in multi-agent systems, particularly in scenarios with unforeseen malfunctions.
Reference:
Yasin Findik, Paul Robinette, Kshitij Jerath, Reza Azadeh, "Relational Q-Functionals: Multi-Agent Learning to Recover from Unforeseen Robot Malfunctions in Continuous Action Domains," In Proc. 21st International Conference on Ubiquitous Robots (UR), New York, USA, pp. 251--256, June 24-27, 2024.
An Adaptive Framework for Manipulator Skill Reproduction in Dynamic Environments
Robot skill learning and execution in uncertain and dynamic environments is a challenging task. This paper proposes an adaptive framework that combines Learning from Demonstration (LfD), environment state prediction, and high-level decision making. Proactive adaptation prevents the need for reactive adaptation, which lags behind changes in the environment rather than anticipating them. We propose a novel LfD representation, Elastic-Laplacian Trajectory Editing (ELTE), which continuously adapts the trajectory shape to predictions of future states. Then, a high-level reactive system using an Unscented Kalman Filter (UKF) and Hidden Markov Model (HMM) prevents unsafe execution in the current state of the dynamic environment based on a discrete set of decisions. We first validate our LfD representation in simulation, then experimentally assess the entire framework using a legged mobile manipulator in 36 real-world scenarios. We show the effectiveness of the proposed framework under different dynamic changes in the environment. Our results show that the proposed framework produces robust and stable adaptive behaviors.
Reference:
Ryan Donald, Brendan Hertel, Stephen Misenti, Yan Gu, Reza Azadeh, "An Adaptive Framework for Manipulator Skill Reproduction in Dynamic Environments," In Proc. 21st International Conference on Ubiquitous Robots (UR), New York, USA, pp. 498--503, June 24-27, 2024.
Investigating the Generalizability of Assistive Robots Models over Various Tasks
In the domain of assistive robotics, the significance of effective modeling is well acknowledged. Prior research has primarily focused on enhancing model accuracy or has involved the collection of extensive, often impractical amounts of data. While improving individual model accuracy is beneficial, it necessitates constant remodeling for each new task and user interaction. In this paper, we investigate the generalizability of different modeling methods. We focus on constructing the dynamic model of an assistive exoskeleton using six data-driven regression algorithms. Six tasks are considered in our experiments: horizontal, vertical, diagonal from the left leg to the right eye and the opposite, as well as eating and pushing. We constructed thirty-six unique models by applying the different regression methods to data gathered from each task. Each trained model's performance was evaluated in a cross-validation scenario, utilizing five folds for each dataset. Each trained model was then tested on the tasks it was not trained on. Finally, the models in our study are assessed in terms of generalizability. Results show the superior generalizability of the model trained on the horizontal task, as well as of decision-tree-based algorithms.
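The cross-task evaluation described above can be sketched roughly as follows; the toy data, the single decision-tree regressor, and the R2 scoring stand in for the exoskeleton datasets and the six regression methods used in the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score
from sklearn.metrics import r2_score

# Placeholder per-task datasets (features -> torque-like target); not the paper's data.
rng = np.random.default_rng(0)
tasks = {name: (rng.normal(size=(200, 4)), rng.normal(size=200))
         for name in ["horizontal", "vertical", "eating"]}

for train_name, (X_tr, y_tr) in tasks.items():
    model = DecisionTreeRegressor(max_depth=5)
    # 5-fold cross-validation on the training task itself.
    cv = cross_val_score(model, X_tr, y_tr, cv=5, scoring="r2")
    print(f"{train_name} 5-fold R2: {cv.mean():.3f}")
    model.fit(X_tr, y_tr)
    # Generalizability: test the trained model on every *other* task.
    for test_name, (X_te, y_te) in tasks.items():
        if test_name != train_name:
            print(train_name, "->", test_name, round(r2_score(y_te, model.predict(X_te)), 3))
```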
Reference:
Hamid Osooli, Christopher Coco, Jonathan Spanos, Amin Majdi, Reza Azadeh, "Investigating the Generalizability of Assistive Robots Models over Various Tasks," In Proc. 21st International Conference on Ubiquitous Robots (UR), New York, USA, pp. 227--232, June 24-27, 2024.
[pdf][arXiv]
Design of Fuzzy Logic Parameter Tuners for Upper-Limb Assistive Robots
Assistive exoskeleton robots are helping restore function to people suffering from underlying medical conditions. These robots require precise tuning of hyper-parameters to feel natural to the user. The device hyper-parameters often need to be re-tuned from task to task, which can be tedious and requires expert knowledge. To address this issue, we develop a set of fuzzy logic controllers that can dynamically tune the robot's gain parameters, adapting its sensitivity to the user's intention as determined from muscle activation. The designed fuzzy controllers benefit from a set of expert-defined rules and do not rely on extensive amounts of training data. We evaluate the designed controllers on three different tasks and compare our results against the manually tuned system. Our preliminary results show that our controllers reduce the amount of fighting between the device and the human, measured using a set of pressure sensors.
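A minimal, hand-rolled sketch of a rule-based fuzzy gain tuner in the spirit described above; the membership functions, rules, and gain values are illustrative assumptions, not the paper's design.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with peak at b."""
    return float(np.clip(min((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0, 1.0))

def fuzzy_gain(activation: float) -> float:
    """Map a normalized muscle-activation level in [0, 1] to a robot gain."""
    # Fuzzify the input into three illustrative sets (low / medium / high activation).
    low = tri(activation, -0.5, 0.0, 0.5)
    med = tri(activation, 0.0, 0.5, 1.0)
    high = tri(activation, 0.5, 1.0, 1.5)
    # Expert-style rules: low activation -> gentle gain, high activation -> responsive gain.
    gains = np.array([0.2, 0.6, 1.0])      # illustrative gain levels, not tuned values
    weights = np.array([low, med, high])
    # Defuzzify with a weighted average (centroid of singleton outputs).
    return float(np.dot(weights, gains) / (weights.sum() + 1e-9))

print(fuzzy_gain(0.1), fuzzy_gain(0.8))
```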
Reference:
Christopher Coco, Jonathan Spanos, Hamid Osooli, Reza Azadeh, "Design of Fuzzy Logic Parameter Tuners for Upper-Limb Assistive Robots," WIP paper at 21st International Conference on Ubiquitous Robots (UR), New York, USA, pp. 386--389, June 24-27, 2024.
A Framework for Learning and Reusing Robotic Skills
In this paper, we present our work in progress towards creating a library of motion primitives. This library facilitates easier and more intuitive learning and reusing of robotic skills. Users can teach robots complex skills through Learning from Demonstration; the demonstrated skills are automatically segmented into primitives and stored in clusters of similar skills. We propose a novel multimodal segmentation method as well as a novel trajectory clustering method. Then, when needed for reuse, we transform the primitives into new environments using trajectory editing. We present simulated results for our framework with demonstrations taken on real-world robots.
Reference:
Brendan Hertel, Nhu Tran, Meriem Elkoudi, Reza Azadeh, "A Framework for Learning and Reusing Robotic Skills," WIP paper at 21st International Conference on Ubiquitous Robots (UR), New York, USA, pp. 801--804, June 24-27, 2024.
[pdf][arXiv]
Mixed Q-Functionals: Advancing Value-Based Methods in Cooperative MARL with Continuous Action Domains
Tackling multi-agent learning problems efficiently is a challenging task in continuous action domains. While value-based algorithms excel in sample efficiency when applied to discrete action domains, they are usually inefficient when dealing with continuous actions. Policy-based algorithms, on the other hand, attempt to address this challenge by leveraging critic networks for guiding the learning process and stabilizing the gradient estimation. Limitations in estimating the true return, together with a tendency to fall into local optima, leave these methods with inefficient and often sub-optimal policies. In this paper, we diverge from the trend of further enhancing critic networks, and focus on improving the effectiveness of value-based methods in multi-agent continuous domains by concurrently evaluating numerous actions. We propose a novel multi-agent value-based algorithm, Mixed Q-Functionals (MQF), inspired by the idea of Q-Functionals, that enables agents to transform their states into basis functions. Our algorithm fosters collaboration among agents by mixing their action-values. We evaluate the efficacy of our algorithm in six cooperative multi-agent scenarios. Our empirical findings reveal that MQF outperforms four variants of Deep Deterministic Policy Gradient through rapid action evaluation and increased sample efficiency.
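A rough sketch of the Q-functional idea described above (the dimensions, polynomial basis, and simple sum-mixing are illustrative assumptions): each agent's network maps a state to coefficients over action basis functions, so many candidate actions can be scored with one matrix product, and per-agent values are mixed into a team value.

```python
import torch
import torch.nn as nn

def action_basis(actions: torch.Tensor, order: int = 2) -> torch.Tensor:
    """Simple polynomial basis over actions: (n_actions, action_dim) -> (n_actions, n_basis)."""
    feats = [torch.ones(actions.shape[0], 1)] + [actions ** p for p in range(1, order + 1)]
    return torch.cat(feats, dim=-1)

class QFunctional(nn.Module):
    """Maps a state to coefficients over the action basis; Q(s, a) = coeffs(s) . phi(a)."""
    def __init__(self, state_dim: int, action_dim: int, order: int = 2):
        super().__init__()
        n_basis = 1 + order * action_dim
        self.coeffs = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_basis))

    def forward(self, state: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
        # Score many sampled actions at once with a single matrix-vector product.
        return action_basis(actions) @ self.coeffs(state)      # (n_actions,)

# Two cooperating agents evaluate a shared pool of sampled continuous actions.
state_dim, action_dim, n_actions = 6, 2, 128
agents = [QFunctional(state_dim, action_dim) for _ in range(2)]
state = torch.randn(state_dim)
sampled = torch.rand(n_actions, action_dim) * 2 - 1            # candidate actions in [-1, 1]

per_agent_q = torch.stack([agent(state, sampled) for agent in agents])   # (2, n_actions)
team_q = per_agent_q.sum(dim=0)            # illustrative mixing of action-values
best_action = sampled[team_q.argmax()]
```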
Reference:
[pdf][arXiv]
Impact of relational networks in multi-agent learning: A value-based factorization view
Effective coordination and cooperation among agents are crucial for accomplishing individual or shared objectives in multi-agent systems. In many real-world multi-agent systems, agents possess varying abilities and constraints, making it necessary to prioritize agents based on their specific properties to ensure successful coordination and cooperation within the team. However, most existing cooperative multi-agent algorithms do not take these individual differences into account, and lack an effective mechanism to guide coordination strategies. We propose a novel multi-agent learning approach that incorporates relationship awareness into value-based factorization methods. Given a relational network, our approach utilizes inter-agent relationships to discover new team behaviors by prioritizing certain agents over others, accounting for differences between them in cooperative tasks. We evaluated the effectiveness of our proposed approach by conducting fifteen experiments in two different environments. The results demonstrate that our proposed algorithm can influence and shape team behavior, guide cooperation strategies, and expedite agent learning. Therefore, our approach shows promise for use in multi-agent systems, especially when agents have diverse properties.
Reference:
Yasin Findik, Paul Robinette, Kshitij Jerath, S. Reza Ahmadzadeh, "Impact of Relational Networks in Multi-Agent Learning: A Value-Based Factorization View," In Proc. 62nd IEEE Conference on Decision and Control (CDC), Marina Bay Sands, Singapore, pp. 4447--4454, Dec. 13-15, 2023.
Influence of team interactions on multi-robot cooperation: A relational network perspective
Relational networks within a team play a critical role in the performance of many real-world multi-robot systems. To successfully accomplish tasks that require cooperation and coordination, different agents (e.g., robots) necessitate different priorities based on their positioning within the team. Yet, many of the existing multi-robot cooperation algorithms regard agents as interchangeable and lack a mechanism to guide the type of cooperation strategy the agents should exhibit. To account for the team structure in cooperative tasks, we propose a novel algorithm that uses a relational network comprising inter-agent relationships to prioritize certain agents over others. Through appropriate design of the team’s relational network, we can guide the cooperation strategy, resulting in the emergence of new behaviors that accomplish the specified task. We conducted six experiments in a multi-robot setting with a cooperative task. Our results demonstrate that the proposed method can effectively influence the type of solution that the algorithm converges to by specifying the relationships between the agents, making it a promising approach for tasks that require cooperation among agents with a specified team structure.
Reference:
Yasin Findik, Hamid Osooli, Paul Robinette, Kshitij Jerath, S. Reza Ahmadzadeh, "Influence of Team Interactions on Multi-Robot Cooperation: A Relational Network Perspective," In Proc. International Symposium on Multi-Robot and Multi-Agent Systems (MRS), Boston, MA, pp. 50--56, Dec. 4-5, 2023.
Collaborative Adaptation: Learning to Recover from Unforeseen Malfunctions in Multi-Robot Teams
Cooperative multi-agent reinforcement learning (MARL) approaches tackle the challenge of finding effective multi-agent cooperation strategies for accomplishing individual or shared objectives in multi-agent teams. In real-world scenarios, however, agents may encounter unforeseen failures due to constraints like battery depletion or mechanical issues. Existing state-of-the-art methods in MARL often recover slowly -- if at all -- from such malfunctions once agents have already converged on a cooperation strategy. To address this gap, we present the Collaborative Adaptation (CA) framework. CA introduces a mechanism that guides collaboration and accelerates adaptation from unforeseen failures by leveraging inter-agent relationships. Our findings demonstrate that CA enables agents to act on the knowledge of inter-agent relations, recovering from unforeseen agent failures and selecting appropriate cooperative strategies.
Reference:
Yasin Findik, Paul Robinette, Kshitij Jerath, S. Reza Ahmadzadeh, "Collaborative Adaptation: Learning to Recover from Unforeseen Malfunctions in Multi-Robot Teams," In MADGames workshop at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, pp. 1--6, Oct. 1-5, 2023.
A Multi-Robot Task Assignment Framework for Search and Rescue with Heterogeneous Teams
In post-disaster scenarios, efficient search and rescue operations involve collaborative efforts between robots and humans. Existing planning approaches focus on specific aspects but overlook crucial elements like information gathering, task assignment, and planning. Furthermore, previous methods that consider robot capabilities and victim requirements suffer from high time complexity due to repetitive planning steps. To overcome these challenges, we introduce a comprehensive framework: the Multi-Stage Multi-Robot Task Assignment. This framework integrates scouting, task assignment, and path-planning stages, optimizing task allocation based on robot capabilities, victim requirements, and past robot performance. Our iterative approach ensures objective fulfillment within the problem constraints. Evaluation across four maps, compared with a state-of-the-art baseline, demonstrates our algorithm's superiority with a remarkable 97 percent performance increase. Our code is open-sourced to enable result replication.
Reference:
Hamid Osooli, Paul Robinette, Kshitij Jerath, S. Reza Ahmadzadeh, "A Multi-Robot Task Assignment Framework for Search and Rescue with Heterogeneous Teams," In Advances in Multi-Agent Learning - Coordination, Communication, and Control Workshop at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, pp. 1--5, Oct. 1-5, 2023.
[pdf][arXiv]
Design and Evaluation of a Bioinspired Tendon-Driven 3D-Printed Robotic Eye with Active Vision Capabilities
The field of robotics has seen significant advancements in recent years, particularly in the development of humanoid robots. One area of research that has yet to be fully explored is the design of robotic eyes. In this paper, we propose a computer-aided 3D design scheme for a robotic eye that incorporates realistic appearance, natural movements, and efficient actuation. The proposed design utilizes a tendon-driven actuation mechanism, which offers a broad range of motion capabilities. The use of the minimum number of servos for actuation, one for each agonist-antagonist pair of muscles, makes the proposed design highly efficient. Compared to existing designs in the same class, our robotic eye offers aesthetic and realistic features. We evaluate the robot's performance using a vision-based controller, which demonstrates the effectiveness of the proposed design in achieving natural movement and efficient actuation. The experiment code, toolbox, and printable 3D sketches of our design have been open-sourced.
Reference:
Hamid Osooli, Mohsen I. Rahaghi, S. Reza Ahmadzadeh, "Design and Evaluation of a Bioinspired Tendon-Driven 3D-Printed Robotic Eye with Active Vision Capabilities," In Proc. 20th International Conference on Ubiquitous Robots (UR), Honolulu, Hawaii, pp. 747--752, Jun. 25-28, 2023.
Confidence-Based Skill Reproduction Through Perturbation Analysis
Several methods exist for teaching robots, with one of the most prominent being Learning from Demonstration (LfD). Many LfD representations can be formulated as constrained optimization problems. We propose a novel convex formulation of the LfD problem represented as elastic maps, which models reproductions as a series of connected springs. Relying on the properties of strong duality and perturbation analysis of the constrained optimization problem, we create a confidence metric. Our method allows the demonstrated skill to be reproduced with varying confidence levels, yielding different degrees of smoothness and flexibility. Our confidence-based method provides reproductions of the skill that perform better for a given set of constraints. By analyzing the constraints, our method can also remove unnecessary constraints. We validate our approach using several simulated and real-world experiments using a Jaco2 7DOF manipulator arm.
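For background on the duality-based reasoning mentioned above: for a convex problem whose i-th constraint is relaxed by an amount u_i, strong duality yields the standard sensitivity bound below. This is the kind of quantity a perturbation-based confidence measure can build on; the connection drawn here is a paraphrase of the general idea, not the paper's exact metric.

```latex
% Standard perturbation inequality for a convex problem with strong duality:
% p^\star(u) is the optimal value when the i-th constraint is relaxed by u_i,
% and \lambda^\star are the optimal dual variables of the unperturbed problem.
\[
  p^\star(u) \;\ge\; p^\star(0) - {\lambda^\star}^{\top} u ,
\]
% so large dual variables flag constraints whose perturbation strongly affects
% the reproduction cost, while near-zero duals mark constraints that could be
% removed with little effect.
```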
Reference:
Brendan Hertel, S. Reza Ahmadzadeh, "Confidence-Based Skill Reproduction Through Perturbation Analysis," In Proc. 20th International Conference on Ubiquitous Robots (UR), Honolulu, Hawaii, pp. 165--170, Jun. 25-28, 2023.
Contextual Autonomy Evaluation of Unmanned Aerial Vehicles in Subterranean Environments
In this paper, we focus on the evaluation of contextual autonomy for robots. More specifically, we propose a fuzzy framework for calculating the autonomy score of a small Unmanned Aerial System (sUAS) performing a task while considering task complexity and environmental factors. Our framework is a cascaded Fuzzy Inference System (cFIS) composed of three FIS modules that represent different contextual autonomy capabilities. We performed several experiments to test our framework in various contexts, such as endurance time, navigation, takeoff/landing, and room clearing, with seven different sUAS. We introduce a predictive measure which improves upon previous predictive measures, allowing previous real-world task performance to be used in predicting future mission performance.
Reference:
Ryan Donald, Peter Gavriel, Adam Norton and S. Reza Ahmadzadeh, "Contextual Autonomy Evaluation of Unmanned Aerial Vehicles in Subterranean Environments," In Proc. 9th International Conference on Automation, Robotics, and Applications (ICARA), Abu Dhabi, United Arab Emirates, pp. 202--207, Feb. 10-12, 2023.
Robot Learning from Demonstration Using Elastic Maps
Learning from Demonstration (LfD) is a popular method of reproducing and generalizing robot skills from human-provided demonstrations. In this paper, we propose a novel optimization-based LfD method that encodes demonstrations as elastic maps. An elastic map is a graph of nodes connected through a mesh of springs. We build a skill model by fitting an elastic map to the set of demonstrations. The formulated optimization problem in our approach includes three objectives with natural and physical interpretations. The main term rewards the mean squared error in the Cartesian coordinate. The second term penalizes the non-equidistant distribution of points resulting in the optimum total length of the trajectory. The third term rewards smoothness while penalizing nonlinearity. These quadratic objectives form a convex problem that can be solved efficiently with local optimizers. We examine nine methods for constructing and weighting the elastic maps and study their performance in robotic tasks. We also evaluate the proposed method in several simulated and real-world experiments using a UR5e manipulator arm, and compare it to other LfD approaches to demonstrate its benefits and flexibility across a variety of metrics.
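The three quadratic objectives described above can be written schematically as an elastic-map energy of the following form, where x_i are the model nodes and d_i the matched demonstration points; the notation and weights are illustrative, and the paper should be consulted for the exact formulation.

```latex
\[
  E(X) \;=\;
  \underbrace{\sum_{i} \lVert x_i - \bar{d}_i \rVert^2}_{\text{fit to demonstrations}}
  \;+\; \lambda \underbrace{\sum_{i} \lVert x_{i+1} - x_i \rVert^2}_{\text{stretching (equidistance)}}
  \;+\; \mu \underbrace{\sum_{i} \lVert x_{i+1} - 2 x_i + x_{i-1} \rVert^2}_{\text{bending (smoothness)}} ,
\]
% a sum of convex quadratics, so the optimum can be found efficiently with
% standard least-squares / local solvers, consistent with the abstract's claim.
```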
Reference:
Brendan Hertel, Matthew Pelland, S. Reza Ahmadzadeh, "Robot Learning from Demonstration Using Elastic Maps," In Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, pp. 7407--7413, Oct. 23-27, 2022.
Methods for Combining and Representing Non-Contextual Autonomy Scores for Unmanned Aerial Systems
Measuring an overall autonomy score for a robotic system requires combining a set of relevant aspects and features of the system that might be measured in different units, be qualitative, and/or be discordant. In this paper, we build upon an existing non-contextual autonomy framework that measures and combines the Autonomy Level and the Component Performance of a system into an overall autonomy score. We examine several methods of combining features, showing how some methods produce different rankings from the same data. We discuss resolving this issue by employing the weighted product method. Furthermore, we introduce two new means for representing relative and absolute autonomy scores, namely the autonomy coordinate system and the autonomy distance, which represents the overall autonomy of a system. We apply our method to a set of seven Unmanned Aerial Systems (UAS) and obtain their absolute autonomy scores as well as their relative scores with respect to the best system.
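A minimal sketch of the weighted product combination discussed above; the feature names, normalized values, and weights are made up for illustration. Each normalized feature is raised to its weight and the factors are multiplied, so scores from different units are combined as ratios rather than sums.

```python
import numpy as np

def weighted_product(features: dict, weights: dict) -> float:
    """Combine normalized feature scores in (0, 1] into one autonomy score."""
    score = 1.0
    for name, value in features.items():
        score *= value ** weights[name]     # each factor contributes multiplicatively
    return score

# Hypothetical component scores for one UAS (not from the paper).
features = {"perception": 0.8, "planning": 0.6, "endurance": 0.9}
weights = {"perception": 0.4, "planning": 0.4, "endurance": 0.2}
print(round(weighted_product(features, weights), 3))
```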
Reference:
Brendan Hertel, Ryan Donald, Christian Dumas, S. Reza Ahmadzadeh, "Methods for Combining and Representing Non-Contextual Autonomy Scores for Unmanned Aerial Systems," In Proc. 8th International Conference on Automation, Robotics, and Applications (ICARA), Prague, Czech Republic, pp. 135--139, Feb. 18-20, 2022.
Similarity-aware skill reproduction based on multi-representational learning from demonstration
Learning from Demonstration (LfD) algorithms enable humans to teach new skills to robots through demonstrations. The learned skills can be robustly reproduced from identical or nearby boundary conditions (e.g., initial point). However, when generalizing a learned skill over boundary conditions with higher variance, the similarity of the reproductions changes from one boundary condition to another, and a single LfD representation cannot preserve a consistent similarity across a generalization region. We propose a novel similarity-aware framework including multiple LfD representations and a similarity metric that can improve skill generalization by finding reproductions with the highest similarity values for a given boundary condition. Given a demonstration of the skill, our framework constructs a similarity region around a point of interest (e.g., initial point) by evaluating individual LfD representations using the similarity metric. Any point within this volume corresponds to a representation that reproduces the skill with the greatest similarity. We validate our multi-representational framework in three simulated and four sets of real-world experiments using a physical 6-DOF robot. We also evaluate 11 different similarity metrics and categorize them according to their biases in 286 simulated experiments.
Reference:
Brendan Hertel, S. Reza Ahmadzadeh, "Similarity-aware Skill Reproduction based on Multi-representational Learning from Demonstration," In Proc. International Conference on Advanced Robotics (ICAR 2021), Ljubljana, Slovenia, pp. 652--657, Dec. 6-10, 2021.
Learning from Successful and Failed Demonstrations via Optimization
Learning from Demonstration (LfD) is a popular approach that allows humans to teach robots new skills by showing the correct way(s) of performing the desired skill. Human-provided demonstrations, however, are not always optimal and the teacher usually addresses this issue by discarding or replacing sub-optimal (noisy or faulty) demonstrations. We propose a novel LfD representation that learns from both successful and failed demonstrations of a skill. Our approach encodes the two subsets of captured demonstrations (labeled by the teacher) into a statistical skill model, constructs a set of quadratic costs, and finds an optimal reproduction of the skill under novel problem conditions (i.e., constraints). The optimal reproduction balances convergence towards successful examples and divergence from failed examples. We evaluate our approach through several 2D and 3D real-world experiments using a UR5e manipulator arm and also show that it can reproduce a skill from only failed demonstrations. The benefits of exploiting both failed and successful demonstrations are shown through comparison with two existing LfD approaches. We also compare our approach against an existing skill refinement method and show its capabilities in a multi-coordinate setting.
Reference:
Brendan Hertel, S. Reza Ahmadzadeh, "Learning from Successful and Failed Demonstrations via Optimization," In Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021), Prague, Czech Republic, pp. 7784--7789, Sept. 27 - Oct. 1, 2021.
Reward-Sharing Relational Networks in Multi-Agent Reinforcement Learning as a Framework for Emergent Behavior
In this work, we integrate ‘social’ interactions into the MARL setup through a user-defined relational network and examine the effects of agent-agent relations on the rise of emergent behaviors. Leveraging insights from sociology and neuroscience, our proposed framework models agent relationships using the notion of Reward-Sharing Relational Networks (RSRN), where network edge weights act as a measure of how much one agent is invested in the success of (or ‘cares about’) another. We construct relational rewards as a function of the RSRN interaction weights to collectively train the multi-agent system via a multi-agent reinforcement learning algorithm. The performance of the system is tested for a 3-agent scenario with different relational network structures (e.g., self-interested, communitarian, and authoritarian networks). Our results indicate that reward-sharing relational networks can significantly influence learned behaviors. We posit that RSRN can act as a framework where different relational networks produce distinct emergent behaviors, often analogous to the intuited sociological understanding of such networks.
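A minimal sketch of the reward-sharing step described above; the 3-agent network below is a made-up example, and the simple weighted sum is one natural instance of "relational rewards as a function of the RSRN interaction weights," not necessarily the paper's exact construction.

```python
import numpy as np

# Rows: how much each agent "cares about" itself and the others (illustrative).
# Example: a communitarian-style network where everyone weighs everyone equally.
W = np.array([[1/3, 1/3, 1/3],
              [1/3, 1/3, 1/3],
              [1/3, 1/3, 1/3]])

individual_rewards = np.array([1.0, 0.0, -0.5])   # per-agent environment rewards

# Relational rewards used for training: r_i' = sum_j W[i, j] * r_j
relational_rewards = W @ individual_rewards
print(relational_rewards)
```

Swapping W for a self-interested (identity) or authoritarian (one dominant row) matrix changes what each agent is trained to optimize, which is the lever the paper studies.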
Reference:
H. Haeri, S. R. Ahmadzadeh, K. Jerath, "Reward-Sharing Relational Networks in Multi-Agent Reinforcement Learning as a Framework for Emergent Behavior," in Adaptive and Learning Agents (ALA) Workshop at AAMAS, London, UK, May 3-4, 2021.
Benchmark for Skill Learning from Demonstration: Impact of User Experience, Task Complexity, and Start Configuration on Performance
In this work, we contribute a large-scale study benchmarking the performance of multiple motion-based learning from demonstration approaches. Given the number and diversity of existing methods, it is critical that comprehensive empirical studies be performed comparing the relative strengths of these learning techniques. In particular, we evaluate four different approaches based on properties an end user may desire for real-world tasks. To perform this evaluation, we collected data from nine participants, across four different manipulation tasks with varying starting conditions. The resulting demonstrations were used to train 180 task models and evaluated on 720 task reproductions on a physical robot. Our results detail how i) complexity of the task, ii) the expertise of the human demonstrator, and iii) the starting configuration of the robot affect task performance. The collected dataset of demonstrations, robot executions, and evaluations are being made publicly available. Research insights and guidelines are also provided to guide future research and deployment choices about these approaches.
Reference:
M. A. Rana, D. Chen, S. R. Ahmadzadeh, J. Williams, V. Chu, S. Chernova, "Benchmark for Skill Learning from Demonstration: Impact of User Experience, Task Complexity, and Start Configuration on Performance," In Proc. IEEE Intl Conf. on Robotics and Automation (ICRA 2020), Paris, France, pp. 7561--7567, 31 May - 4 June, 2020.
Towards Mobile Multi-Task Manipulation in a Confined and Integrated Environment with Irregular Objects
The FetchIt! Mobile Manipulation Challenge, held at the IEEE International Conference on Robotics and Automation (ICRA) in May 2019, offered an environment with complex and integrated task sets, irregular objects, confined space, and machining, introducing new challenges in the mobile manipulation domain. Here we describe our efforts to address these challenges by demonstrating the assembly of a kit of mechanical parts in a caddy. In addition to implementation details, we examine the issues in this task set extensively, and we discuss our software architecture in the hope of providing a base for other researchers. To evaluate performance and consistency, we conducted 20 full runs, then examined failure cases with possible solutions. We conclude by identifying future research directions to address the open challenges.
Reference:
Z. Han, J. Allspaw, G. LeMasurier, J. Parrillo, D. Giger, S. R. Ahmadzadeh, H. Yanco, "Towards Mobile Multi-Task Manipulation in a Confined and Integrated Environment with Irregular Objects," In Proc. IEEE Intl Conf. on Robotics and Automation (ICRA 2020), Paris, France, pp. 11025--11031, 31 May - 4 June, 2020.
[pdf]
Trajectory-based Skill Learning using Generalized Cylinders
In this article, we introduce Trajectory Learning using Generalized Cylinders (TLGC), a novel trajectory-based skill learning approach from human demonstrations. To model a demonstrated skill, TLGC uses a Generalized Cylinder, a geometric representation composed of an arbitrary space curve called the spine and a surface with smoothly varying cross-sections. Our approach is the first application of Generalized Cylinders to manipulation, and its geometric representation offers several key features: it identifies and extracts the implicit characteristics and boundaries of the skill by encoding the demonstration space; it supports the generation of multiple skill reproductions maintaining those characteristics; the constructed model can generalize the skill to unforeseen situations through trajectory editing techniques; and it allows for obstacle avoidance and interactive human refinement of the resulting model through kinesthetic correction. We validate our approach through a set of real-world experiments with both a Jaco 6-DOF and a Sawyer 7-DOF robotic arm.
Reference:
S. R. Ahmadzadeh, S. Chernova, "Trajectory-based Skill Learning using Generalized Cylinders," Frontiers in Robotics and AI, section Human-Robot Interaction, vol. 5, 2018.
[pdf]
Skill Acquisition via Automated Multi-Coordinate Cost Balancing
We propose a learning framework, named Multi-Coordinate Cost Balancing (MCCB), to address the problem of acquiring point-to-point movement skills from demonstrations. MCCB encodes demonstrations simultaneously in multiple differential coordinates that specify local geometric properties. MCCB generates reproductions by solving a convex optimization problem with a multi-coordinate cost function and linear constraints on the reproductions, such as initial, target, and via points. Further, since the relative importance of each coordinate system in the cost function might be unknown for a given skill, MCCB learns optimal weighting factors that balance the cost function. We demonstrate the effectiveness of MCCB via detailed experiments conducted on one handwriting dataset and three complex skill datasets.
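Schematically, the multi-coordinate objective described above has the following form, minimized subject to linear constraints such as fixed initial, target, and via points. The notation is ours: Phi_c denotes the map into the c-th differential coordinate, d the demonstration, and w_c the learned balancing weights.

```latex
\[
  \min_{x} \;\; \sum_{c \in \mathcal{C}} w_c \,
      \bigl\lVert \Phi_c(x) - \Phi_c(d) \bigr\rVert^2
  \quad \text{s.t.} \quad A x = b ,
\]
% a convex quadratic for linear coordinate maps and fixed weights w_c;
% MCCB additionally learns the weights so that no single coordinate
% dominates the reproduction for a given skill.
```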
Reference:
H. Ravichandar*, S. R. Ahmadzadeh*, M. A. Rana, S. Chernova, "Skill Acquisition via Automated Multi-Coordinate Cost Balancing," In Proc. IEEE Intl Conf. on Robotics and Automation (ICRA 2019), Montreal, Canada, pp. 7776--7782, 20-24 May, 2019.