Construction of a POMDP Learning Model for Human-Robot Collaboration

Principal Investigator: Professor Lin

Other Contributor: Wei Zheng

AWaRE REU Researcher: Chase Brown, Bethel University

Project Description: In Human-Robot Collaboration, robots are expected to work alongside humans in warehouses, daily housekeeping, and other robot assistant applications in a safe, intelligent, and friendly manner. To achieve this goal, the robotic system should be able to understand the intentions of its human partners and to reason about their behavior and the state of the environment. The main idea is to combine a learning-based approach with traditional high-level task planning algorithms. The first step is to build a human model using data collected from a visual perception system such as stereo cameras. Based on the learned human model, robots can infer the intentions of human partners from data collected at run time. For example, in a handover task, the robot could track the skeleton of the human, collect data from several demonstrations, and then infer the human's intention. Once the robots understand the human's intention, they can collaborate with the human according to decisions made by the high-level task planning algorithms.

Finding: Collaboration between humans and robots in industrial and home environments is essential to the progress of modern robotics. Implementing a vector autoregressive partially observable Markov decision process (VAR-POMDP) allows humans and robots to collaborate on high-level objectives with precision and accuracy.

The undergraduate research experience in human-robot collaboration is tasked with implementing this model on a low-cost personal robot. A system for tracking a unique stochastic (human) model is integral to providing meaningful data to the robotic control system.

The VAR-POMDP learning model is recognized for its ability to accurately predict the stochastic actions of a human within a predefined environment. Unlike prior work, the proposed demonstration does not predefine human states or the transitions between them, in order to show the flexibility of the proposed VAR-POMDP model across many different situations.
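
To make the "vector autoregressive" part of the name concrete, the following is a minimal sketch of a first-order VAR fit, x_t = A x_{t-1} + c + noise, using ordinary least squares. It is a generic illustration, not the project's implementation: the function names (fit_var1, predict_next), the two-dimensional synthetic trajectory, and the assumption that observations within a single human state follow a VAR(1) process are all hypothetical.

```python
import numpy as np

def fit_var1(X):
    """Least-squares fit of x_t = A @ x_{t-1} + c for X of shape (T, d)."""
    # Regressors: previous observation plus a constant column for the offset c.
    past = np.hstack([X[:-1], np.ones((len(X) - 1, 1))])
    future = X[1:]
    coef, *_ = np.linalg.lstsq(past, future, rcond=None)
    A = coef[:-1].T  # (d, d) autoregressive matrix
    c = coef[-1]     # (d,) offset
    return A, c

def predict_next(A, c, x):
    """One-step-ahead prediction from the current observation x."""
    return A @ x + c

# Synthetic 2-D trajectory standing in for tracked skeleton features
# (assumed values, chosen only for illustration).
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [-0.2, 0.8]])
c_true = np.array([0.05, -0.02])
X = np.zeros((500, 2))
X[0] = [1.0, -1.0]
for t in range(1, len(X)):
    X[t] = A_true @ X[t - 1] + c_true + 0.01 * rng.standard_normal(2)

A_hat, c_hat = fit_var1(X)
x_pred = predict_next(A_hat, c_hat, X[-1])
```

In a model of this kind, one such (A, c) pair would be associated with each hidden human state, so that the observed motion helps identify which state the human is in.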

A Bayesian non-parametric learning model is used in the demonstration to define potential human states from observable data collected within the operating environment. The states defined by the Bayesian learning model are combined, as a Cartesian product, with the predefined discretized states of the robotic environment to create the set of all perceivable states. These states form the foundation for a state transition matrix, which is used to predict human behavior. The successful implementation of this demonstration supports the validity of the proposed VAR-POMDP model. Further application of this model in assisted living and driver assistance systems could drive advances in those fields.
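
The state construction described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's code: the state labels, the helper estimate_transition_matrix, the smoothing parameter alpha, and the observed sequence are all hypothetical. In the project the human states are learned from data rather than listed by hand, and the transition matrix here is estimated by simple smoothed counting rather than by the Bayesian procedure.

```python
from itertools import product
import numpy as np

# Placeholder labels: learned human states and discretized robot states.
human_states = ["reach", "hold", "retract"]
robot_states = ["cell_0", "cell_1"]

# Joint state space: every (human state, robot state) pair.
joint_states = list(product(human_states, robot_states))
index = {s: i for i, s in enumerate(joint_states)}

def estimate_transition_matrix(sequence, n_states, alpha=1.0):
    """Row-stochastic matrix from transition counts with additive smoothing."""
    counts = np.full((n_states, n_states), alpha, dtype=float)
    for prev, nxt in zip(sequence[:-1], sequence[1:]):
        counts[prev, nxt] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

# Made-up sequence of joint states, for illustration only.
observed = [
    ("reach", "cell_0"), ("reach", "cell_1"), ("hold", "cell_1"),
    ("retract", "cell_1"), ("retract", "cell_0"), ("reach", "cell_0"),
]
seq = [index[s] for s in observed]
T = estimate_transition_matrix(seq, len(joint_states))

# Most likely next joint state after ("reach", "cell_0").
likely_next = joint_states[int(np.argmax(T[index[("reach", "cell_0")]]))]
```

Each row of T is a probability distribution over next joint states, so predicting behavior from a given state amounts to reading off that row.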