Controlling nuclear fusion on Earth is hard, however. The problem is that atomic nuclei repel each other. Smashing them together inside a reactor can only be done at extremely high temperatures, often reaching hundreds of millions of degrees—hotter than the center of the sun. At these temperatures, matter is neither solid, liquid, nor gas. It enters a fourth state, known as plasma: a roiling, superheated soup of particles.
The task is to hold the plasma inside a reactor together long enough to extract energy from it. Inside stars, plasma is held together by gravity. On Earth, researchers use a variety of tricks, including lasers and magnets. In a magnet-based reactor, known as a tokamak, the plasma is trapped inside an electromagnetic cage, forcing it to hold its shape and stopping it from touching the reactor walls, which would cool the plasma and damage the reactor.
Controlling the plasma requires constant monitoring and manipulation of the magnetic field. The team trained its reinforcement-learning algorithm to do this inside a simulation. Once it had learned how to control, and change, the shape of the plasma inside a virtual reactor, the researchers gave it control of the magnets in the Variable Configuration Tokamak (TCV, from its French name, Tokamak à Configuration Variable), an experimental reactor in Lausanne, Switzerland. They found that the AI was able to control the real reactor without any additional fine-tuning. In total, the AI controlled the plasma for only two seconds, but this is as long as the TCV reactor can run before getting too hot.
Quick reactions
Ten thousand times a second, the trained neural network takes in 90 different measurements describing the shape and position of the plasma and adjusts the voltage in 19 magnets in response. This feedback loop is far faster than previous reinforcement-learning algorithms have had to deal with. To speed things up, the AI was split into two neural networks. A large network, called a critic, learned via trial and error how to control the reactor inside the simulation. The critic’s ability was then encoded in a smaller, faster network, called an actor, that runs on the reactor itself.
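To give a feel for the scale of that feedback loop, here is a minimal Python sketch of the run-time side only. The figures of 10,000 updates per second, 90 measurements, 19 magnets and a two-second run come from the reporting above; everything else is an assumption made for illustration, including the network size, the random untrained weights, the normalized voltage limit, and the hypothetical names make_actor, read_sensors and apply_voltages. It is not the researchers' controller and it does no training.

```python
import math
import random

CONTROL_RATE_HZ = 10_000  # from the article: ten thousand updates per second
N_MEASUREMENTS = 90       # from the article: plasma shape and position inputs
N_COILS = 19              # from the article: magnets whose voltage is adjusted
SHOT_SECONDS = 2.0        # from the article: length of the controlled run
HIDDEN_UNITS = 16         # assumption: illustrative size only
MAX_VOLTS = 1.0           # assumption: voltages normalized to [-1, 1]


def make_actor(seed=0):
    """Build a tiny two-layer network with random, untrained weights."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_MEASUREMENTS)]
          for _ in range(HIDDEN_UNITS)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN_UNITS)]
          for _ in range(N_COILS)]

    def actor(measurements):
        # Hidden layer: weighted sum of the measurements, squashed by tanh.
        hidden = [math.tanh(sum(w * x for w, x in zip(row, measurements)))
                  for row in w1]
        # Output layer: one bounded voltage command per magnet.
        return [MAX_VOLTS * math.tanh(sum(w * h for w, h in zip(row, hidden)))
                for row in w2]

    return actor


def steps_per_shot():
    """Number of control decisions in one two-second run."""
    return int(CONTROL_RATE_HZ * SHOT_SECONDS)


def run_control_loop(actor, read_sensors, apply_voltages, n_steps):
    """Measure, decide, act; repeated once per control step."""
    for step in range(n_steps):
        measurements = read_sensors(step)  # hypothetical sensor interface
        voltages = actor(measurements)
        apply_voltages(voltages)           # hypothetical magnet interface


if __name__ == "__main__":
    log = []
    fake_sensors = lambda step: [math.sin(0.01 * step + i)
                                 for i in range(N_MEASUREMENTS)]
    run_control_loop(make_actor(), fake_sensors, log.append, n_steps=5)
    print(steps_per_shot(), "decisions per run;", len(log[0]), "voltages per decision")
```

At that rate the network has a budget of 100 microseconds per decision, which is one way to see why a small, fast network is the one placed on the reactor.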
“It’s an incredibly powerful method,” says Jonathan Citrin at the Dutch Institute for Fundamental Energy Research, who was not involved in the work. “It’s an important first step in a very exciting direction.”