KEYWORDS: Education and training, Machine learning, Decision making, Data modeling, Contrast transfer function, Evolutionary algorithms, Gallium nitride, Data acquisition, Data processing, Computing systems
We consider the problem of synthetically generating data that closely resembles human decisions made in an interactive human-AI system such as a computer game. We propose a novel algorithm that generates synthetic, human-like decision-making data starting from a very small set of decision-making data collected from humans. The algorithm integrates reward shaping with an imitation learning algorithm to generate the synthetic data. We validated the technique by using the synthetically generated data as a surrogate for human interaction data to solve three sequential decision-making tasks of increasing complexity within a small computer game-like setup. Empirical and statistical analyses of our results show that the synthetically generated data can substitute for the human data and perform the game-playing tasks almost indistinguishably, with very low divergence, from a human performing the same tasks.
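The abstract does not state which form of reward shaping is used. As a minimal sketch, assuming the common potential-based form, a shaped reward adds gamma * Phi(s') - Phi(s) to the environment reward. The grid-world states, the `potential` function, and the goal cell below are hypothetical and chosen only for illustration.

```python
def potential(state, goal):
    """Hypothetical potential: negative Manhattan distance to the goal cell."""
    return -(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))


def shaped_reward(reward, state, next_state, goal, gamma=0.99):
    """Potential-based shaping: r + gamma * Phi(s') - Phi(s)."""
    return reward + gamma * potential(next_state, goal) - potential(state, goal)


goal = (3, 3)
# A step toward the goal earns a positive shaping bonus, a step away a penalty.
toward = shaped_reward(0.0, (0, 0), (1, 0), goal)
away = shaped_reward(0.0, (1, 0), (0, 0), goal)
print(toward, away)
```

In a pipeline like the one described, a shaping term of this kind could be used to bias a learner toward states that the small set of human demonstrations visits; how the authors actually combine the two is not given in the abstract.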
Imitation learning has been shown to be a successful learning technique in scenarios where autonomous agents have to adapt their operation across diverse environments or domains. The main principle underlying imitation learning is to determine a state-to-action mapping, called a policy, from trajectories demonstrated by an expert. We consider the problem of imitation learning under adversarial settings, where the expert could be malicious and intermittently give incorrect demonstrations to misguide the learning agent. We propose a technique that uses temporally extended policies, called options, to make a learning agent robust against adversarial expert demonstrations. Experimental evaluation of our technique for a game-playing AI shows that a learning agent using our options-based technique resists deterioration in its task performance better than one using conventional reinforcement learning when an expert adversarially modifies the demonstrations, either randomly or strategically.
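The idea of determining a state-to-action mapping from expert trajectories can be illustrated with a minimal sketch of tabular behavioral cloning, which is a simpler technique than the options-based method the abstract proposes. The state and action names and the `clone_policy` helper are hypothetical.

```python
from collections import Counter, defaultdict


def clone_policy(trajectories):
    """Map each state to the action the expert chose most often in that state."""
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for state, action in trajectory:
            counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Three demonstrations; the second contains one deviating action in state "s1".
demos = [
    [("s0", "right"), ("s1", "right"), ("s2", "up")],
    [("s0", "right"), ("s1", "up"), ("s2", "up")],
    [("s0", "right"), ("s1", "right"), ("s2", "up")],
]
policy = clone_policy(demos)
print(policy)
```

Majority voting absorbs the single deviating action here, but it offers no protection once incorrect demonstrations become frequent or strategic, which is the setting the abstract targets.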
KEYWORDS: Robots, Land mines, Sensors, Stochastic processes, Metals, Detection and tracking algorithms, Computer simulations, Mobile robots, General packet radio service, Agriculture
Multi-robot systems comprising heterogeneous autonomous vehicles on land, in the air, and on water are being increasingly
used to assist or replace humans in different hazardous missions. Two crucial aspects of such multi-robot
systems are to: a) explore an initially unknown region of interest to discover tasks, and b) allocate and share
the discovered tasks between the robots in a coordinated manner using a multi-robot task allocation (MRTA)
algorithm. In this paper, we describe results from our research on multi-robot terrain coverage and MRTA
algorithms within an autonomous landmine detection scenario, done as part of the COMRADES project. Each
robot is equipped with a different type of landmine detection sensor and different sensors, even of the same type,
can have different degrees of accuracy. The landmine detection-related operations performed by each robot are
abstracted as tasks and multiple robots are required to complete a single task. First, we describe a distributed
and robust terrain coverage algorithm that employs Voronoi partitions to divide the area of interest among the
robots and then uses a single-robot coverage algorithm to explore each partition for potential landmines. Then,
we describe MRTA algorithms that use the location information of discovered potential landmines and employ
either a greedy strategy or an opportunistic strategy to allocate tasks among the robots while attempting to
minimize the time (energy) expended by the robots to perform the tasks. We report experimental results of our
algorithms using accurately simulated Corobot robots within the Webots simulator performing a multi-robot,
landmine detection operation.
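The two steps described above can be sketched as follows, under simplifying assumptions that are not from the paper: the terrain is a small grid, each cell goes to the nearest robot (a discrete Voronoi partition), and the greedy allocation sends the closest robots to each task using straight-line distance as a stand-in for time or energy. The robot names, positions, and the `robots_per_task` parameter are hypothetical.

```python
import math


def voronoi_partition(cells, robots):
    """Assign every cell to its nearest robot (discrete Voronoi partition)."""
    regions = {name: [] for name in robots}
    for cell in cells:
        nearest = min(robots, key=lambda name: math.dist(cell, robots[name]))
        regions[nearest].append(cell)
    return regions


def greedy_allocate(tasks, robots, robots_per_task=2):
    """For each task, pick the robots_per_task robots closest to its location."""
    allocation = {}
    for task_id, location in tasks.items():
        ranked = sorted(robots, key=lambda name: math.dist(location, robots[name]))
        allocation[task_id] = ranked[:robots_per_task]
    return allocation


robots = {"r1": (0, 0), "r2": (4, 0), "r3": (0, 4)}
cells = [(x, y) for x in range(5) for y in range(5)]
regions = voronoi_partition(cells, robots)

# A potential landmine found at (3, 1) becomes a task that needs two robots.
allocation = greedy_allocate({"mine_a": (3, 1)}, robots)
print({name: len(area) for name, area in regions.items()}, allocation)
```

This sketch ignores sensor types, robots that are already busy, and the opportunistic strategy, all of which the algorithms in the paper would need to account for.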
KEYWORDS: Sensors, Land mines, Robots, Infrared sensors, General packet radio service, Environmental sensing, Information fusion, Data fusion, Mining, Metals
We consider the problem of distributed sensor information fusion by multiple autonomous robots within the
context of landmine detection. We assume that different landmines can be composed of different types of material
and robots are equipped with different types of sensors, while each robot has only one type of landmine detection
sensor on it. We introduce a novel technique that uses a market-based information aggregation mechanism
called a prediction market. Each robot is provided with a software agent that uses sensory input of the robot
and performs calculations of the prediction market technique. The result of the agent's calculations is a 'belief'
representing the confidence of the agent in identifying the object as a landmine. The beliefs from different
robots are aggregated by the market mechanism and passed on to a decision maker agent. The decision maker
agent uses this aggregate belief information about a potential landmine and makes decisions about which other
robots should be deployed to its location, so that the landmine can be confirmed rapidly and accurately. Our
experimental results show that, for identical data distributions and settings, using our prediction market-based
information aggregation technique increases the accuracy of object classification favorably as compared to two
other commonly used techniques.
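The abstract does not specify the market mechanism. As a minimal sketch, assuming a logarithmic market scoring rule (LMSR) for the binary question "is this object a landmine?", each agent trades toward its own belief but is limited by a per-agent trade cap, and the final market price serves as the aggregate belief. The liquidity parameter `b`, the cap `max_trade`, and the belief values are hypothetical.

```python
import math


def lmsr_price(q_yes, q_no, b):
    """LMSR price of the 'yes' outcome given outstanding shares."""
    return 1.0 / (1.0 + math.exp((q_no - q_yes) / b))


def aggregate_beliefs(beliefs, b=10.0, max_trade=5.0):
    """Let each agent trade toward its belief (in (0, 1)), capped at max_trade shares."""
    q_yes, q_no = 0.0, 0.0
    for belief in beliefs:
        # Share gap (q_yes - q_no) at which the market price equals this belief.
        target_gap = b * math.log(belief / (1.0 - belief))
        trade = target_gap - (q_yes - q_no)
        trade = max(-max_trade, min(max_trade, trade))
        if trade >= 0:
            q_yes += trade   # buy 'yes' shares to push the price up
        else:
            q_no += -trade   # buy 'no' shares to push the price down
    return lmsr_price(q_yes, q_no, b)


# Beliefs from three robots carrying different sensor types.
price = aggregate_beliefs([0.9, 0.8, 0.3])
print(round(price, 4))
```

The trade cap keeps the last agent from setting the price on its own; a decision maker could then compare the aggregate price against a threshold before deploying further robots.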