Why DeepMind is sending AI humanoids to soccer camp

“It didn’t really work,” says Nicolas Heess, also a researcher at DeepMind and a co-author of the paper along with Lever. Due to the complexity of the problem, the wide range of options available, and the lack of prior knowledge of the task, the agents had no idea where to start – hence the squirming and twitching.

So Heess, Lever, and colleagues instead used neural probabilistic motor primitives (NPMP), a training method that nudges the AI model toward more human-like movement patterns, in the expectation that this underlying knowledge would help it solve the problem of moving around the virtual soccer field. “It basically redirects your motor control to realistic human behavior, realistic human movements,” says Lever. “And that’s learned from motion capture — in this case, human actors playing soccer.”

This “reconfigures the action space,” says Lever. The agents’ movements are already constrained by their human-like bodies and joints, which can only bend in certain ways, and exposure to data from real humans constrains them further, simplifying the problem. “It makes it more likely that useful things will be discovered through trial and error,” says Lever. NPMP accelerates the learning process. There has to be a “delicate balance” between teaching the AI to do things the way humans do them and giving it enough freedom to find its own solutions to problems, which may be more efficient than the ones we might come up with ourselves.
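One way to picture “reconfiguring the action space” is as a pretrained low-level decoder that turns a small latent “intent” vector into full joint torques, so the high-level policy explores a compact space where random trial and error tends to produce coordinated movement. The sketch below is illustrative only: the dimensions are invented, and the decoder is a fixed random linear map standing in for the motion-capture-trained network described in the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- not taken from the paper.
LATENT_DIM = 8     # compact "intent" space the high-level policy acts in
OBS_DIM = 32       # proprioceptive state (joint angles, velocities)
ACTION_DIM = 20    # raw joint torques for the humanoid body

# Stand-in for a decoder learned from motion capture: in NPMP, any latent
# the high-level policy picks decodes to a plausible, human-like movement.
W_latent = rng.normal(size=(ACTION_DIM, LATENT_DIM)) * 0.1
W_obs = rng.normal(size=(ACTION_DIM, OBS_DIM)) * 0.1

def decode(z, obs):
    """Map a low-dimensional intent plus body state to joint torques."""
    return np.tanh(W_latent @ z + W_obs @ obs)

# The high-level policy now searches an 8-D latent space instead of the
# 20-D torque space, which makes useful behavior easier to stumble upon.
obs = rng.normal(size=OBS_DIM)
z = rng.normal(size=LATENT_DIM)
action = decode(z, obs)
```

The point of the sketch is the shape of the interface: exploration happens in `z`, not in raw torques.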

Basic practice was followed by single-player drills: running, dribbling, and kicking the ball, mimicking the way people learn a new sport before diving into a full match. The rewards for reinforcement learning were things like successfully chasing a target without the ball, or dribbling the ball close to a target. This skills curriculum was a natural way to work toward increasingly complex tasks, says Lever.
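Drill rewards like these are typically shaped as progress signals: the agent scores points for closing the distance to a target, or for moving the ball closer to one. A minimal sketch, assuming simple distance-based shaping (the function names and formulas are illustrative, not taken from the paper):

```python
import numpy as np

def chase_reward(agent_pos, target_pos, prev_dist):
    """Reward progress toward a target (the 'chase a target' drill)."""
    dist = np.linalg.norm(target_pos - agent_pos)
    return prev_dist - dist  # positive when the agent closes the gap

def dribble_reward(ball_pos, target_pos, prev_ball_dist):
    """Reward moving the ball closer to a target (the 'dribble' drill)."""
    ball_dist = np.linalg.norm(target_pos - ball_pos)
    return prev_ball_dist - ball_dist

# Example: an agent at the origin, a target at (3, 4), previously 6 away.
# The agent is now 5 away, so it earns a reward of 6 - 5 = 1.
r = chase_reward(np.array([0.0, 0.0]), np.array([3.0, 4.0]), prev_dist=6.0)
```

Shaping on progress rather than only on task completion gives the agent a dense learning signal long before it can finish a whole drill.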

The goal was to encourage the agents to reuse skills they had learned outside the soccer context in a soccer environment, to generalize and switch flexibly between different movement strategies. The agents that mastered these exercises were assigned as teachers. In the same way that the AI was encouraged to mimic what it learned from human motion capture, it was also rewarded, at least initially, for not straying too far from the strategies the teacher agents used in specific scenarios. “This is actually a parameter of the algorithm that is optimized during training,” says Lever. “In principle, over time, they can reduce their dependence on the teachers.”

With their virtual players trained, it was time for some match action, starting with 2v2 and 3v3 games to maximize the experience the agents gained in each simulation round (and to mimic how young players start out with small-sided games in real life). The highlights – which you can watch here – have the chaotic energy of a dog chasing a ball in the park: the players don’t run so much as stumble forward, constantly on the verge of falling over. Goals come not from intricate passing moves but from hopeful shots upfield and rebounds off the back wall.
