Publications

* denotes equal contribution and joint lead authorship.
Blue - Conference Papers.
Red - Workshop and Doctoral Consortia Papers.
Orange - Journal Papers.


2023

  1. Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach for Adaptive Brain Stimulation
    Michelle Pan*, Mariah Schrum*, Vivek Myers, Erdem Biyik, and Anca Dragan,

    International Conference on Machine Learning

    — Adaptive brain stimulation can treat neurological conditions such as Parkinson’s disease and post-stroke motor deficits by influencing abnormal neural activity. Because of patient heterogeneity, each patient requires a unique stimulation policy to achieve optimal neural responses. Model-free reinforcement learning (MFRL) holds promise in learning effective policies for a variety of similar control tasks, but is limited in domains like brain stimulation by a need for numerous costly environment interactions. In this work we introduce Coprocessor Actor Critic, a novel, model-based reinforcement learning (MBRL) approach for learning neural coprocessor policies for brain stimulation. Our key insight is that coprocessor policy learning is a combination of learning how to act optimally in the world and learning how to induce optimal actions in the world through stimulation of an injured brain. We show that our approach overcomes the limitations of traditional MFRL methods in terms of sample efficiency and task success and outperforms baseline MBRL approaches in a neurologically realistic model of an injured brain. (A toy sketch of the two-stage coprocessor idea appears after this year’s list.)
  2. Mixed-Initiative Multiagent Apprenticeship Learning for Human Training of Robot Teams
    Esi Seraj, Jerry Yuyang Xiong, Mariah Schrum, and Matthew Gombolay,

    Conference on Neural Information Processing Systems (NeurIPS)

    — Extending recent advances in Learning from Demonstration (LfD) frameworks to multi-robot settings poses critical challenges, such as environment non-stationarity due to partial observability, which are detrimental to the applicability of existing methods. Although prior work has shown that enabling communication among agents of a robot team can alleviate such issues, creating inter-agent communication under existing Multi-Agent LfD (MA-LfD) frameworks requires the human expert to provide demonstrations for both environment actions and communication actions, which necessitates an efficient communication strategy over a known message space. To address this problem, we propose Mixed-Initiative Multi-Agent Apprenticeship Learning (MixTURE). MixTURE enables robot teams to learn, from human expert-generated data, a preferred policy to accomplish a collaborative task, while simultaneously learning emergent inter-agent communication to enhance team coordination. The key ingredient to MixTURE's success is automatically learning a communication policy, enhanced by a mutual-information-maximizing reverse model that rationalizes the underlying expert demonstrations without the need for human-generated communication data or an auxiliary reward function. MixTURE outperforms a variety of relevant baselines on diverse data generated by human experts in complex heterogeneous domains. MixTURE is the first MA-LfD framework to enable learning multi-robot collaborative policies directly from real human data, resulting in ~44% less human workload and ~46% higher usability scores.
  3. MAVERIC: A Data-Driven Approach to Personalized Autonomous Driving
    Mariah Schrum, Emily Sumner, Matthew Gombolay, and Andrew Best,

    IEEE Transactions on Robotics

    — Personalization of autonomous vehicles (AVs) may significantly increase trust, use, and acceptance. In particular, we hypothesize that the similarity of an AV’s driving style to the end-user’s driving style will have a major impact on the end-user’s willingness to use the AV. To investigate the impact of driving style on user acceptance, we 1) develop a data-driven approach to personalize driving style and 2) demonstrate that personalization significantly impacts attitudes towards AVs. Our approach learns a high-level model that tunes low-level controllers to ensure safe and personalized control of the AV. The key to our approach is learning an informative, personalized embedding that represents a user’s driving style. Our framework is capable of calibrating the level of aggression so as to optimize driving style based upon driver preference. Across two human-subject studies (n = 54), we first demonstrate that our approach mimics the driving styles of end-users and can tune attributes of style (e.g., aggressiveness). Second, we investigate the factors (e.g., trust, personality, etc.) that impact homophily, i.e., an individual’s preference for a driving style similar to their own. We find that our approach generates driving styles consistent with end-user styles (p < .001) and that participants rate our approach as more similar to their own level of aggressiveness (p = .002). We find that personality (p < .001), perceived similarity (p < .001), and high-velocity driving style (p = .0031) significantly modulate the effect of homophily. (A toy sketch of the embedding-to-controller mapping appears after this year’s list.)
  4. Privacy and Personalization: Transparency, Acceptance, and the Ethics of Personalized Robots
    Mariah Schrum and Matthew Gombolay,

    Conference on Human-Robot Interaction Social Robots Personalisation Workshop

    — To effectively support humans, machines must be capable of recognizing individual desires, abilities, and characteristics and adapt to account for differences across individuals. However, personalization does not come without a cost. In many domains, for robots to effectively personalize their behavior, the robot must solicit often private and intimate information about an end-user so as to optimize the interaction. Not all end-users may be comfortable sharing this information, especially if the end-user is not provided with insight into why the robot is requesting it. As HRI researchers, we have the responsibility of ensuring the robots we create do not infringe upon the privacy rights of end-users and that end-users are provided with the means to make informed decisions about the information they share with robots. While prior work has investigated willingness to share information in the context of consumerism, no prior work has investigated the impact of domain, type of requested information, or explanations on end-users’ comfort with and acceptance of a personalized robot. To gain a better understanding of these questions, we propose an experimental design in which we investigate the impact of domain, the nature of the personal information requested, and the role of explanations on robot transparency and end-user willingness to share information. The goal of this study is to provide guidance for HRI researchers conducting work in personalization by examining the factors that may impact transparency and acceptance of personalized robots.
  5. The Effect of Robot Skill Level and Communication in Rapid, Proximate Human-Robot Collaboration

    ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2023

    — As high-speed, agile robots become more commonplace, these robots will have the potential to better aid and collaborate with humans. However, due to the increased agility and functionality of these robots, close collaboration with humans can create safety concerns that alter team dynamics and degrade task performance. In this work, we aim to enable the deployment of safe and trustworthy agile robots that operate in proximity with humans. We do so by 1) proposing a novel human-robot doubles table tennis scenario to serve as a testbed for studying agile, proximate human-robot collaboration and 2) conducting a user study to understand how attributes of the robot (e.g., robot competency or capacity to communicate) impact team dynamics, perceived safety, and perceived trust, and how these latent factors affect human-robot collaboration (HRC) performance. We find that robot competency significantly increases perceived trust (p < .001), extending skill-to-trust assessments in prior studies to agile, proximate HRC. Furthermore, and interestingly, we find that when the robot vocalizes its intention to perform a task, it results in a significant decrease in team performance (p = .037) and perceived safety of the system (p = .009).
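
The Coprocessor Actor Critic abstract above describes splitting coprocessor policy learning into learning how to act optimally in the world and learning which stimulation induces those actions in an injured brain. Below is a minimal, hypothetical sketch of that decomposition; the linear brain model, the toy pi_world controller, and the candidate-search stimulation policy are all invented for illustration and are not the paper's implementation.

```python
# Two-stage coprocessor sketch: (1) pi_world proposes the action an intact
# agent would take; (2) the stimulation policy searches for a stimulation
# vector whose induced brain output best matches that action, using a
# learned brain model. All models here are hypothetical stand-ins.
import numpy as np

rng = np.random.default_rng(0)

W_brain = rng.normal(size=(2, 4))      # fabricated "learned" brain model

def brain_response(stim):              # model of the injured brain's output
    return W_brain @ stim

def pi_world(state):                   # policy for "how to act optimally"
    return -0.5 * state                # toy feedback controller

def stimulation_policy(state, n_candidates=256):
    """Pick the stimulation whose induced action best matches pi_world."""
    target = pi_world(state)
    candidates = rng.normal(size=(n_candidates, 4))
    induced = candidates @ W_brain.T
    errors = np.linalg.norm(induced - target, axis=1)
    return candidates[np.argmin(errors)]

state = np.array([1.0, -2.0])
stim = stimulation_policy(state)
print("chosen stimulation:", stim)
print("induced vs. target action:", brain_response(stim), pi_world(state))
```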
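
The MAVERIC abstract describes a high-level model that tunes low-level controllers from a personalized driving-style embedding. This hedged sketch shows one way such a split could look; style_to_params, the embedding weights, and the headway-based longitudinal controller are illustrative assumptions, not the paper's architecture.

```python
# Map a personalized style embedding z to low-level controller parameters
# (target speed, minimum headway), which a simple longitudinal controller
# then respects. All names and numbers are hypothetical.
import numpy as np

def style_to_params(z):
    """Hypothetical learned map from style embedding to controller gains."""
    aggression = 1.0 / (1.0 + np.exp(-z @ np.array([0.8, -0.3])))  # in (0, 1)
    return {
        "target_speed": 20.0 + 15.0 * aggression,   # m/s
        "min_headway": 2.5 - 1.5 * aggression,      # s
    }

def longitudinal_accel(v, gap, params, kp=0.5):
    """Track the personalized target speed without violating headway."""
    headway = gap / max(v, 1e-3)
    if headway < params["min_headway"]:
        return -3.0                                  # safety braking
    return kp * (params["target_speed"] - v)

z_cautious, z_aggressive = np.array([-2.0, 1.0]), np.array([3.0, -1.0])
for z in (z_cautious, z_aggressive):
    p = style_to_params(z)
    print(p, "accel:", round(longitudinal_accel(15.0, 40.0, p), 2))
```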

2022

  1. Explainable Artificial Intelligence: Evaluating the Objective and Subjective Impacts of xAI on Human-Agent Interaction
    Andrew Silva, Mariah Schrum, Erin Hedlund-Botti, Nakul Gopalan, and Matthew Gombolay.

    International Journal of Human-Computer Interaction, 2022

    Intelligent agents must be able to communicate intentions and explain their decision-making processes to build trust, foster confidence, and improve human-agent team dynamics. Recognizing this need, academia and industry are rapidly proposing new ideas, methods, and frameworks to aid in the design of more explainable AI. Yet, there remains no standardized metric or experimental protocol for benchmarking new methods, leaving researchers to rely on their own intuition or ad hoc methods for assessing new concepts. In this work, we present the first comprehensive (n = 286) user study testing a wide range of approaches for explainable machine learning, including feature importance, probability scores, decision trees, counterfactual reasoning, natural language explanations, and case-based reasoning, as well as a baseline condition with no explanations. We provide the first large-scale empirical evidence of the effects of explainability on human-agent teaming. Our results will help to guide the future of explainability research by highlighting the benefits of counterfactual explanations and the shortcomings of confidence scores for explainability. We also propose a novel questionnaire to measure explainability with human participants, inspired by relevant prior work and correlated with human-agent teaming metrics.
  2. Concerning Trends in Likert Scale Usage in Human-Robot Interaction: Towards Improving Best Practices
    Mariah Schrum*, Leng Ghuy*, Erin Hedlund-Botti, Manisha Natarajan, Michael Johnson, and Matthew Gombolay.

    Transactions on Human-Robot Interaction, 2022

    As robots become more prevalent, the importance of the field of human-robot interaction (HRI) grows accordingly. As such, we should endeavor to employ the best statistical practices in HRI research. Likert scales are commonly used metrics in HRI to measure perceptions and attitudes. Due to misinformation or honest mistakes, many HRI researchers do not adopt best practices when analyzing Likert data. We conduct a review of psychometric literature to determine the current standard for Likert scale design and analysis. Next, we conduct a survey of five years of the International Conference on Human-Robot Interaction (HRIc) (2016 through 2020) and report on incorrect statistical practices and design of Likert scales [1–3, 5, 7]. During these years, only 4 of the 144 papers applied proper statistical testing to correctly-designed Likert scales. We additionally conduct a survey of best practices across several venues and provide a comparative analysis to determine how Likert practices differ across the field of Human-Robot Interaction. We find that a venue’s impact score negatively correlates with the number of Likert-related errors and acceptance rate, and that the total number of papers accepted per venue positively correlates with the number of errors. We also find statistically significant differences between venues in the frequency of misnomer and design errors. Our analysis suggests there are areas for meaningful improvement in the design and testing of Likert scales. Based on our findings, we provide guidelines and a tutorial for researchers for developing and analyzing Likert scales and associated data. We also detail a list of recommendations to improve the accuracy of conclusions drawn from Likert data. (A short example of the recommended ordinal analysis appears after this year’s list.)
  3. Meta-Active Learning in Probabilistically Safe Optimization
    Mariah Schrum, Mark Connolly, Eric Cole, Mihir Ghetiya, Robert Gross, and Matthew Gombolay.

    International Conference on Intelligent Robots and Systems (IROS), 2022

    IEEE Robotics and Automation Letters, 2022

    When a robotic system is faced with uncertainty, the system must take calculated risks to gain information as efficiently as possible while ensuring system safety. The need to safely and efficiently gain information in the face of uncertainty spans domains from healthcare to search and rescue. To efficiently learn when data is scarce or difficult to label, active learning acquisition functions intelligently select a data point that, if the label were known, would most improve the estimate of the unknown model. Unfortunately, prior work in active learning suffers from an inability to accurately quantify information gain, generalize to new domains, and ensure safe operation. To overcome these limitations, we develop Safe MetAL, a probabilistically-safe, active learning algorithm which meta-learns an acquisition function for selecting sample-efficient data points in safety-critical domains. The key to our approach is a novel integration of meta-active learning and chance-constrained optimization. We (1) meta-learn an acquisition function based on sample history, (2) encode this acquisition function in a chance-constrained optimization framework, and (3) solve for an information-rich set of data points while enforcing probabilistic safety guarantees. We present state-of-the-art results in active learning of the model of a damaged UAV and in learning the optimal parameters for deep brain stimulation. Our approach achieves a 41% improvement in learning the optimal model and a 20% speedup in computation time compared to active and meta-learning approaches while ensuring safety of the system. (A toy sketch of the chance-constrained selection step appears after this year’s list.)
  4. Reciprocal MIND MELD: Improving Learning From Demonstration via Personalized, Reciprocal Teaching
    Mariah Schrum, Erin Hedlund-Botti, and Matthew Gombolay.

    Conference on Robot Learning, 2022

    Endowing robots with the ability to learn novel tasks via demonstrations will increase the accessibility of robots for non-expert, non-roboticists. However, research has shown that humans can be poor teachers, making it difficult for robots to effectively learn from humans. If the robot could instruct humans how to provide better demonstrations, then humans might be able to effectively teach a broader range of novel, out-of-distribution tasks. In this work, we introduce Reciprocal MIND MELD, a framework in which the robot learns the way in which a demonstrator is suboptimal and utilizes this information to provide feedback to the demonstrator to improve upon their demonstrations. We additionally develop an Embedding Predictor Network which learns to predict the demonstrator’s suboptimality online without the need for optimal labels. In a series of human-subject experiments in a driving simulator domain, we demonstrate that robotic feedback can effectively improve human demonstrations in two dimensions of suboptimality (p < .001) and that robotic feedback translates into better learning outcomes for a robotic agent on novel tasks (p = .045).
  5. MIND MELD: Personalized Meta-Learning for Robot-Centric Imitation Learning
    Mariah Schrum*, Erin Hedlund-Botti*, Nina Moorman, and Matthew Gombolay.

    Conference on Robot Learning (CoRL), 2021

    Conference on Human-Robot Interaction (HRI), 2022
    Best Technical Paper

    — Learning from demonstration (LfD) techniques seek to enable users without computer programming experience to teach robots novel tasks. There are generally two types of LfD: human- and robot-centric. While human-centric learning is intuitive, it suffers from performance degradation due to covariate shift. Robot-centric approaches, such as Dataset Aggregation (DAgger), address covariate shift but can struggle to learn from suboptimal human teachers. To create a more human-aware version of robot-centric LfD, we present Mutual Information-driven Meta-learning from Demonstration (MIND MELD). MIND MELD meta-learns a mapping from suboptimal and heterogeneous human feedback to optimal labels, thereby improving the learning signal for robot-centric LfD. The key to our approach is learning an informative, personalized embedding using mutual information maximization via variational inference. The embedding then informs a mapping from human-provided labels to optimal labels. We evaluate our framework in a human-subjects experiment, demonstrating that our approach improves corrective labels provided by human demonstrators. Our framework outperforms baselines in terms of ability to reach the goal (p < .001), average distance from the goal (p = .006), and various subjective ratings (p = .008). (A toy sketch of the label-correction idea appears after this year’s list.)
  6. Personalized meta-learning for domain agnostic learning from demonstration
    Mariah Schrum, Erin Hedlund-Botti, and Matthew Gombolay.

    HRI Pioneers Doctoral Consortium, 2022

    For robots to perform novel tasks in the real world, they must be capable of learning from heterogeneous, non-expert human teachers across various domains. Yet, novice human teachers often provide suboptimal demonstrations, making it difficult for robots to successfully learn. Therefore, to effectively learn from humans, we must develop learning methods that can account for teacher suboptimality and can do so across various robotic platforms. To this end, we introduce Mutual Information Driven Meta-Learning from Demonstration (MIND MELD) [12, 13], a personalized meta-learning framework which meta-learns a mapping from suboptimal human feedback to feedback closer to optimal, conditioned on a learned personalized embedding. In a human-subjects study, we demonstrate MIND MELD’s ability to improve upon suboptimal demonstrations and learn meaningful, personalized embeddings. We then propose Domain Agnostic MIND MELD, which learns to transfer the personalized embedding learned in one domain to a novel domain, thereby allowing robots to learn from suboptimal humans across disparate platforms (e.g., a self-driving car or an in-home robot).
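
One practice consistent with the Likert-scale guidelines in the Transactions on Human-Robot Interaction entry above: responses to a single Likert item are ordinal, so group comparisons are better served by a non-parametric test such as the Mann-Whitney U than by a t-test. A short example with fabricated 7-point responses:

```python
# Compare one Likert item across two conditions with a non-parametric test.
# The response data below are fabricated purely for illustration.
from scipy.stats import mannwhitneyu

condition_a = [5, 6, 7, 5, 6, 4, 7, 6]   # 7-point Likert item, group A
condition_b = [3, 4, 2, 5, 3, 4, 3, 2]   # 7-point Likert item, group B

stat, p = mannwhitneyu(condition_a, condition_b, alternative="two-sided")
print(f"U = {stat}, p = {p:.4f}")
```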
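
The Safe MetAL abstract above describes encoding an acquisition function in a chance-constrained optimization. This toy sketch shows only the selection step under a Gaussian safety posterior: candidates failing the probabilistic safety check are excluded before maximizing the acquisition. The uncertainty-sampling acquisition and all numbers are fabricated stand-ins for the paper's meta-learned function.

```python
# Chance-constrained query selection: keep candidates whose probability of
# being safe exceeds 1 - delta (via a normal quantile on a hypothetical
# Gaussian posterior), then pick the most informative remaining point.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
candidates = rng.uniform(-1, 1, size=(100, 2))

mu_safe = 1.0 - np.linalg.norm(candidates, axis=1)   # mean safety margin
sigma_safe = 0.05 + 0.3 * np.abs(candidates[:, 0])   # posterior std (toy)

delta = 0.05
z = norm.ppf(1 - delta)
safe_mask = (mu_safe - z * sigma_safe) >= 0.0        # P(margin >= 0) >= 1-delta

acquisition = sigma_safe.copy()                      # uncertainty-sampling stand-in
acquisition[~safe_mask] = -np.inf                    # enforce the chance constraint
best = int(np.argmax(acquisition))
print("query point:", candidates[best], "safe candidates:", int(safe_mask.sum()))
```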
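
The MIND MELD abstract above describes learning a mapping from suboptimal, heterogeneous human labels to optimal labels, conditioned on a personalized embedding. This deliberately simplified sketch substitutes a per-person affine fit for the embedding-conditioned network, purely to illustrate the calibrate-then-correct idea.

```python
# Given a few calibration pairs of (human label, ground-truth label) for one
# person, fit a person-specific correction, then apply it to new labels.
# A least-squares affine model stands in for MIND MELD's learned mapping.
import numpy as np

human_labels = np.array([0.8, 1.6, 2.5, 3.3])   # person systematically over-steers
true_labels  = np.array([0.5, 1.0, 1.5, 2.0])

A = np.stack([human_labels, np.ones_like(human_labels)], axis=1)
(a, b), *_ = np.linalg.lstsq(A, true_labels, rcond=None)

new_human_label = 2.0
print("corrected label:", a * new_human_label + b)
```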

2021

  1. Improving Robot-Centric Learning from Demonstration via Personalized Embeddings
    Mariah Schrum, Erin Hedlund-Botti*, and Matthew Gombolay.

    Artificial Intelligence for Human-Robot Interaction Symposium, 2021

    Learning from demonstration (LfD) techniques seek to enable novice users to teach robots novel tasks in the real world. However, prior work has shown that robot-centric LfD approaches, such as Dataset Aggregation (DAgger), do not perform well with human teachers. DAgger requires a human demonstrator to provide corrective feedback to the learner either in real-time, which can result in degraded performance due to suboptimal human labels, or in a post hoc manner which is time intensive and often not feasible. To address this problem, we present Mutual Information-driven Meta-learning from Demonstration (MIND MELD), which meta-learns a mapping from poor-quality human labels to predicted ground truth labels, thereby improving upon the performance of prior LfD approaches for DAgger-based training. The key to our approach for improving upon suboptimal feedback is mutual information maximization via variational inference. Our approach learns a meaningful, personalized embedding via variational inference which informs the mapping from human-provided labels to predicted ground truth labels. We demonstrate our framework in a synthetic domain and in a human-subjects experiment, illustrating that our approach improves upon the corrective labels provided by a human demonstrator by 63%.
  2. Four Years in Review: Statistical Practices of Likert Scales in Human-Robot Interaction Studies
    Mariah Schrum, Leng Ghuy*, Michael Johnson*, and Matthew Gombolay.

    Conference on Human-Robot Interaction, 2021

    As robots become more prevalent, the importance of the field of human-robot interaction (HRI) grows accordingly. As such, we should endeavor to employ the best statistical practices. Likert scales are commonly used metrics in HRI to measure perceptions and attitudes. Due to misinformation or honest mistakes, most HRI researchers do not adopt best practices when analyzing Likert data. We conduct a review of psychometric literature to determine the current standard for Likert scale design and analysis. Next, we conduct a survey of four years of the International Conference on Human-Robot Interaction (2016 through 2019) and report on incorrect statistical practices and design of Likert scales. During these years, only 3 of the 110 papers applied proper statistical testing to correctly-designed Likert scales. Our analysis suggests there are areas for meaningful improvement in the design and testing of Likert scales. Lastly, we provide recommendations to improve the accuracy of conclusions drawn from Likert data.
  3. Effects of Social Factors and Team Dynamics on Adoption of Collaborative Robot Autonomy
    Mariah Schrum*, Glen Neville*, Michael Johnson*, Nina Moorman, Karen Feigh, and Matthew Gombolay.

    Conference on Human-Robot Interaction, 2021

    As automation becomes more prevalent, the fear of job loss due to automation increases [22]. Workers may not be amenable to working with a robotic co-worker due to a negative perception of the technology. The attitudes of workers towards automation are influenced by a variety of complex and multi-faceted factors such as intention to use, perceived usefulness, and other external variables [15]. In an analog manufacturing environment, we explore how these various factors influence an individual’s willingness to work with a robot over a human co-worker in a collaborative Lego building task. We specifically explore how this willingness is affected by: 1) the level of social rapport established between the individual and his or her human co-worker, 2) the anthropomorphic qualities of the robot, and 3) factors including trust, fluency, and personality traits. Our results show that a participant’s willingness to work with automation decreased due to lower perceived team fluency (p = 0.045), rapport established between a participant and their co-worker (p = 0.003), the gender of the participant being male (p = 0.041), and a higher inherent trust in people (p = 0.018).

2020

  1. When Your Robot Breaks: Active Learning During Plant Failure
    Mariah Schrum and Matthew Gombolay.

    International Conference on Robotics and Automation (ICRA), 2020

    IEEE Robotics and Automation Letters (RA-L), 2020

    Detecting and adapting to catastrophic failures in robotic systems requires a robot to learn its new dynamics quickly and safely to best accomplish its goals. To address this challenging problem, we propose probabilistically-safe, online learning techniques to infer the altered dynamics of a robot at the moment a failure (e.g., physical damage) occurs. We combine model predictive control and active learning within a chance-constrained optimization framework to safely and efficiently learn the new plant model of the robot. We leverage a neural network for function approximation in learning the latent dynamics of the robot under failure conditions. Our framework generalizes to various damage conditions while being computationally lightweight to advance real-time deployment. We empirically validate within a virtual environment that we can regain control of a severely damaged aircraft in seconds and require only 0.1 seconds to find safe, information-rich trajectories, outperforming state-of-the-art approaches. (A toy sketch of the online model re-identification piece follows.)
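
The abstract above combines model predictive control with active learning to re-learn a plant model after damage. This minimal sketch covers only the online re-identification piece, with recursive least squares on a linearized plant standing in for the paper's neural-network dynamics learner; the dynamics matrices and noise levels are fabricated.

```python
# After a failure, re-estimate the (linearized) plant x' = A x + B u from
# streaming state/action data via recursive least squares, so a controller
# can re-plan against the updated model. Entirely illustrative.
import numpy as np

rng = np.random.default_rng(2)
A_true = np.array([[0.9, 0.2], [0.0, 0.7]])      # fabricated post-failure dynamics
B_true = np.array([[0.0], [0.5]])

theta = np.zeros((2, 3))                          # estimate of [A | B]
P = np.eye(3) * 100.0                             # RLS covariance

x = np.array([1.0, -1.0])
for _ in range(200):
    u = rng.normal(size=1)                        # exploratory input
    phi = np.concatenate([x, u])                  # regressor [x; u]
    x_next = A_true @ x + B_true @ u + 0.01 * rng.normal(size=2)
    # Recursive least squares update, one row of theta per state dimension.
    K = P @ phi / (1.0 + phi @ P @ phi)
    theta += np.outer(x_next - theta @ phi, K)
    P -= np.outer(K, phi @ P)
    x = x_next

print("estimated A:\n", np.round(theta[:, :2], 2))
```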