DeepMind, the London-based subsidiary of Alphabet, has created a system that can quickly master any game in the class that includes chess, Go, and Shogi, and do so without human guidance. The system, called AlphaZero, began its life last year by beating a DeepMind system that had been specialized just for Go. That earlier system had itself made history by beating one of the world's best Go players, but it needed human help to get through a months-long course of improvement.
AlphaZero trained itself in just 3 days.

[Image caption: AlphaZero, playing White against Stockfish, began by identifying four candidate moves. After 1,000 simulations, it rejected the moves marked in red; after another 100,000 simulations, it chose the move marked in green over the one marked in orange. AlphaZero went on to win, thanks in large part to having opened the diagonal for its bishop.]
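The process the caption describes, spending simulations on candidate moves and gradually concentrating them on the most promising one, is characteristic of Monte Carlo tree search. The sketch below is a minimal illustration of that idea only, not DeepMind's implementation: the candidate moves and their hidden win probabilities are invented for the example, and a UCB1 bandit over single-move playouts stands in for AlphaZero's neural-network-guided tree search.

```python
import math
import random

# Toy stand-in for a position evaluator: each candidate move has a hidden
# win probability, and one "simulation" is a single noisy playout result.
# (These moves and probabilities are assumptions made for illustration.)
HIDDEN_WIN_PROB = {"e4": 0.54, "d4": 0.52, "c4": 0.50, "Nf3": 0.49}

def simulate(move: str) -> float:
    """One playout: returns 1.0 for a win, 0.0 for a loss (Bernoulli draw)."""
    return 1.0 if random.random() < HIDDEN_WIN_PROB[move] else 0.0

def choose_move(candidates, n_simulations: int, c: float = 1.4) -> str:
    """Allocate simulations with UCB1 and return the most-visited move."""
    visits = {m: 0 for m in candidates}
    wins = {m: 0.0 for m in candidates}

    for t in range(1, n_simulations + 1):
        # Pick the move with the best upper confidence bound:
        # exploit high win rates, but keep exploring rarely tried moves.
        def ucb(m):
            if visits[m] == 0:
                return float("inf")
            return wins[m] / visits[m] + c * math.sqrt(math.log(t) / visits[m])

        move = max(candidates, key=ucb)
        wins[move] += simulate(move)
        visits[move] += 1

    # Commit to the move that attracted the most simulations.
    return max(candidates, key=lambda m: visits[m])

if __name__ == "__main__":
    random.seed(0)
    print(choose_move(list(HIDDEN_WIN_PROB), n_simulations=100_000))
```

After 100,000 simulations the weak candidates have been starved of playouts and the strongest one dominates the visit counts, which mirrors the red-versus-green pruning shown in the figure.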
The research, published today in the journal Science, was performed by a team led by DeepMind's David Silver. The paper was accompanied by a commentary by Murray Campbell, an AI researcher at the IBM Thomas J. Watson Research Center in Yorktown Heights, N.Y.

AlphaZero can crack any game that provides all the information that's relevant to decision-making; the new generation of games to which Campbell alludes does not. Poker furnishes a good example of such games of "imperfect" information: players can hold their cards close to their chests. Other examples include many multiplayer games, such as StarCraft II, Dota, and Minecraft.
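The distinction can be made concrete with a small sketch. The class names and fields below are invented for illustration: in a chess-like game, the observation handed to each player is the complete game state, whereas in a poker-like game, it omits the opponents' hidden cards, so many true states are consistent with what a player sees.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ChessObservation:
    # Perfect information: every decision-relevant fact is visible to both
    # sides, which is what lets self-play search evaluate positions directly.
    board: List[str]          # full piece placement
    side_to_move: str

@dataclass
class PokerObservation:
    # Imperfect information: a player sees their own cards and the public
    # state, but the opponents' hole cards stay hidden during play, so the
    # same observation corresponds to many possible true game states.
    own_hole_cards: List[str]
    community_cards: List[str] = field(default_factory=list)
    pot: int = 0
    opponent_hole_cards: Optional[List[str]] = None  # unknown while playing
```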
But such games may not pose a worthy challenge for long.
Source: ieee.org