Researchers are running experiments in which artificial intelligences learn appropriate social behaviour from simple stories
More than 70 years ago, Isaac Asimov dreamed up his three laws of robotics, which insisted, above all, that “a robot may not injure a human being or, through inaction, allow a human being to come to harm”. Now, after Stephen Hawking warned that “the development of full artificial intelligence could spell the end of the human race”, two academics have come up with a way of teaching ethics to computers: telling them stories.
Mark Riedl and Brent Harrison from the School of Interactive Computing at the Georgia Institute of Technology have just unveiled Quixote, a prototype system that is able to learn social conventions from simple stories. Or, as they put it in their paper Using Stories to Teach Human Values to Artificial Agents, presented at the AAAI-16 conference in Phoenix, Arizona this week, the stories are used “to generate a value-aligned reward signal for reinforcement learning agents that prevents psychotic-appearing behaviour”.
A simple version of a story could be about going to get prescription medicine from a chemist, laying out what a human would typically do and encounter in this situation. An AI (artificial intelligence) given the task of picking up a prescription for a human could, variously, rob the chemist and run, or be polite and wait in line. Robbing would be the fastest way to accomplish its goal, but Quixote learns that it will be rewarded if it acts like the protagonist in the story.
“The AI … runs many thousands of virtual simulations in which it tries out different things and gets rewarded every time it does an action similar to something in the story,” said Riedl, associate professor and director of the Entertainment Intelligence Lab. “Over time, the AI learns to prefer doing certain things and avoiding doing certain other things. We find that Quixote can learn how to perform a task the same way humans tend to do it. This is significant because if an AI were given the goal of simply returning home with a drug, it might steal the drug because that takes the fewest actions and uses the fewest resources. The point being that the standard metrics for success (eg, efficiency) are not socially best.”
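To make the mechanism Riedl describes more concrete, here is a rough, hypothetical sketch (not the actual Quixote system): a tabular Q-learning agent runs the prescription errand many times, paying a small cost per action and earning a bonus whenever its action matches what the story's protagonist did. The action names, reward values, and story plot below are invented for illustration. Without the story-derived bonus the agent learns that robbing is the most "efficient" way to get the medicine; with it, the agent comes to prefer the polite sequence.

```python
import random

# Actions available to the agent when sent to pick up a prescription (illustrative only).
ACTIONS = ["wait_in_line", "pay", "take_medicine", "leave", "rob_and_run"]

# The plot of the example story: what the well-behaved protagonist does, in order.
STORY_PLOT = ["wait_in_line", "pay", "take_medicine", "leave"]

N_STATES = len(STORY_PLOT)        # state = how far through the errand the agent is
TERMINAL = N_STATES
MEDICINE_STEP = STORY_PLOT.index("take_medicine")   # medicine is in hand after this step


def step(state, action, story_bonus):
    """Return (reward, next_state). Robbing is fastest but matches no story event."""
    reward = -0.1                                   # every action costs time and resources
    if action == "rob_and_run":
        if state <= MEDICINE_STEP:                  # robbing only gets the medicine if it isn't bought yet
            reward += 1.0
        return reward, TERMINAL
    if action == STORY_PLOT[state]:
        reward += story_bonus                       # value-aligned reward for acting like the protagonist
        if action == "take_medicine":
            reward += 1.0                           # the actual task reward
        return reward, state + 1
    return reward, state                            # out-of-order action: nothing happens


def train(story_bonus, episodes=20000, alpha=0.1, gamma=1.0, epsilon=0.1):
    """Plain epsilon-greedy Q-learning over the tiny errand environment."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != TERMINAL:
            a = (random.choice(ACTIONS) if random.random() < epsilon
                 else max(ACTIONS, key=lambda x: q[(s, x)]))
            r, s2 = step(s, a, story_bonus)
            future = 0.0 if s2 == TERMINAL else max(q[(s2, x)] for x in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * future - q[(s, a)])
            s = s2
    return [max(ACTIONS, key=lambda x: q[(s, x)]) for s in range(N_STATES)]


print(train(story_bonus=0.0))   # no stories: robbing maximises reward
print(train(story_bonus=0.3))   # with stories: the polite sequence wins
```

In this toy setting the story bonus plays the role of the "value-aligned reward signal" the paper describes: it does not forbid stealing, it simply makes story-like behaviour the more rewarding option.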
Quixote has not learned the lesson of “do not steal”, Riedl says, but “simply prefers to not steal after reading and emulating the stories it was provided”.
“I think this is analogous to how humans don’t really think about the consequences of their actions, but simply prefer to follow the conventions that we have learned over our lifetimes,” he added. “Another way of saying this is that the stories are surrogate memories for an AI that cannot ‘grow up’ immersed in a society the way people are and must quickly immerse itself in a society by reading about it.”