Only about 18 months have passed, but the release of ChatGPT back in November 2022 already feels like ancient AI history. Hundreds of billions of dollars, both public and private, are pouring into AI. Thousands of AI-powered products have been created, including the new GPT-4o just this week. Everyone from students to scientists now uses these large language models. Our world, and in particular the world of AI, has decidedly changed.
But the real prize of human-level AI—or artificial general intelligence (AGI)—has yet to be achieved. Such a breakthrough would mean an AI that can carry out most economically productive work, engage with others, do science, build and maintain social networks, conduct politics, and carry out modern warfare. The main constraint for all these tasks today is cognition. Removing this constraint would be world-changing. Yet many across the globe’s leading AI labs believe this technology could be a reality before the end of this decade.
That could be an enormous boon for humanity. But AI could also be extremely dangerous, especially if we cannot control it. Uncontrolled AI could hack its way into online systems that power so much of the world, and use them to achieve its goals. It could gain access to our social media accounts and create tailor-made manipulations for large numbers of people. Even worse, military personnel in charge of nuclear weapons could be manipulated by an AI to share their credentials, posing a huge threat to humanity.
It would be a constructive step to make it as hard as possible for any of that to happen by strengthening the world’s defenses against adverse online actors. But once AI can persuade humans, something it is already better at than we are, there is no known defense.
For these reasons, many AI safety researchers at AI labs such as OpenAI, Google DeepMind and Anthropic, and at safety-minded nonprofits, have given up on trying to limit the actions future AI can take. They are instead focusing on creating “aligned” or inherently safe AI. Aligned AI might get powerful enough to be able to exterminate humanity, but it should not want to do this.
There are big question marks about aligned AI. First, the technical part of alignment is an unsolved scientific problem. Recently, some of the best researchers working on aligning superhuman AI left OpenAI in dissatisfaction, a move that does not inspire confidence. Second, it is unclear what a superintelligent AI would be aligned to. If it were an academic value system, such as utilitarianism, we might quickly find out that most humans’ values actually do not match these aloof ideas, after which the unstoppable superintelligence could go on to act against most people’s will forever. If the alignment were to people’s actual intentions, we would need some way to aggregate these very different intentions. While idealistic solutions such as a U.N. council or AI-powered decision aggregation algorithms are in the realm of possibility, there is a worry that superintelligence’s absolute power would be concentrated in the hands of very few politicians or CEOs. This would of course be unacceptable for, and a direct danger to, all other human beings.
Dismantling the time bomb
If we cannot find a way to at the very least keep humanity safe from extinction, and preferably also from an alignment dystopia, AI that could become uncontrollable must not be created in the first place. This solution, postponing human-level or superintelligent AI for as long as we haven’t solved safety concerns, has the downside that AI’s grand promises, ranging from curing disease to creating massive economic growth, will need to wait.
Pausing AI might seem like a radical idea to some, but it will be necessary if AI continues to improve without us reaching a satisfactory alignment plan. When AI’s capabilities reach near-takeover levels, the only realistic option is that labs are firmly required by governments to pause development. Doing otherwise would be suicidal.
And pausing AI may not be as difficult as some make it out to be. At the moment, only a relatively small number of large companies have the means to perform leading training runs, meaning enforcement of a pause is mostly limited by political will, at least in the short run. In the longer term, however, hardware and algorithmic improvements mean a pause may become more difficult to enforce. Enforcement between countries would be required, for example with a treaty, as would enforcement within countries, with steps like stringent hardware controls.
In the meantime, scientists need to better understand the risks. Although there is widely shared academic concern, no consensus exists yet. Scientists should formalize their points of agreement, and show where and why their views deviate, in the new International Scientific Report on Advanced AI Safety, which should develop into an “Intergovernmental Panel on Climate Change for AI risks.” Leading scientific journals should open up further to existential risk research, even if it seems speculative. The future does not provide data points, but looking ahead is as important for AI as it is for climate change.
For their part, governments have an enormous role to play in how AI unfolds. This starts with officially acknowledging AI’s existential risk, as has already been done by the U.S., U.K., and E.U., and setting up AI safety institutes. Governments should also draft plans for what to do in the most important conceivable scenarios, as well as how to deal with AGI’s many non-existential issues such as mass unemployment, runaway inequality, and energy consumption. Governments should make their AGI strategies publicly available, allowing scientific, industry, and public evaluation.
It is great progress that major AI countries are constructively discussing common policy at biannual AI safety summits, including one in Seoul from May 21 to 22. This process, however, needs to be guarded and expanded. Working on a shared ground truth on AI’s existential risks and voicing shared concern with all 28 invited nations would already be major progress in that direction. Beyond that, relatively easy measures need to be agreed upon, such as creating licensing regimes, evaluating models, tracking AI hardware, expanding liability for AI labs, and excluding copyrighted content from training. An international AI agency needs to be set up to oversee their implementation.
It is fundamentally difficult to predict scientific progress. Still, superhuman AI will likely impact our civilization more than anything else this century. Simply waiting for the time bomb to explode is not a feasible strategy. Let us use the time we have as wisely as possible.