OpenAI’s Ilya Sutskever Has a Plan for Keeping Super-Intelligent AI in Check


OpenAI was established with a commitment to develop artificial intelligence (AI) that serves the greater good of humanity, even if AI eventually surpasses human intelligence. Despite a recent commercial turn, most visibly with the introduction of ChatGPT, the company remains dedicated to addressing the challenges posed by increasingly powerful AIs. The Superalignment research team, formed in July, is actively working on strategies to manage future superhuman AI systems, which are expected to be both highly capable and potentially dangerous.

Leopold Aschenbrenner, a researcher at OpenAI involved in the Superalignment project, emphasizes the rapid approach of Artificial General Intelligence (AGI) and the need for effective control methods. OpenAI has allocated a substantial portion of its computing power to this critical research initiative.

In a newly released research paper, OpenAI details experiments in which a less advanced AI model guides the behavior of a more intelligent one without degrading the stronger model's capabilities. The study focuses on the supervision process: today, human feedback is used to refine the behavior of models like GPT-4, but as AI surpasses human intelligence that feedback becomes less reliable, motivating interest in automating the supervision loop.

The researchers ran a control experiment using GPT-2 to train GPT-4, which initially reduced the capabilities of the stronger model. They tested two proposed remedies: bootstrapping through a sequence of progressively larger models to limit the performance loss, and an algorithmic adjustment to GPT-4's training that let it follow the weaker model's guidance without a significant drop in capability. The latter proved more effective, though the researchers acknowledge that these methods do not guarantee flawless behavior from the stronger model and consider them a preliminary step for future research.
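The algorithmic adjustment can be sketched as a loss that blends the weak supervisor's labels with the strong model's own confident ("hardened") predictions, so the student is not forced to imitate the supervisor's mistakes. The following is a minimal illustrative sketch, not OpenAI's implementation: the function name `weak_to_strong_loss`, the blending weight `alpha`, and the toy data are all assumptions made for the example.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, targets):
    # Mean cross-entropy between target distributions and predicted probs.
    return -np.mean(np.sum(targets * np.log(probs + 1e-12), axis=-1))

def weak_to_strong_loss(strong_logits, weak_labels, alpha=0.5):
    """Illustrative auxiliary-confidence-style loss (hypothetical sketch).

    Mixes the weak supervisor's labels with the strong model's own
    hardened (argmax, one-hot) predictions, so the strong model is
    penalized less for confidently disagreeing with the weak labels.
    """
    probs = softmax(strong_logits)
    hardened = np.eye(probs.shape[-1])[probs.argmax(axis=-1)]  # one-hot self-labels
    targets = (1 - alpha) * weak_labels + alpha * hardened
    return cross_entropy(probs, targets)

# Toy example: 3 samples, binary classification.
strong_logits = np.array([[2.0, -1.0], [0.2, 0.1], [-1.5, 1.5]])
weak_labels = np.eye(2)[[0, 1, 1]]  # weak supervisor's one-hot labels
print(weak_to_strong_loss(strong_logits, weak_labels, alpha=0.5))
```

With `alpha=0` this reduces to ordinary training on the weak labels; raising `alpha` lets the strong model increasingly trust its own predictions where they conflict with the weaker supervisor.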

Dan Hendrycks, director of the Center for AI Safety, applauds OpenAI’s proactive approach to the challenges of controlling superhuman AIs and emphasizes the need for sustained, dedicated effort over many years to address this complex issue.

Other news

US Chief Justice: AI won’t replace judges but will ‘transform our work’ – 03.01.2024

In the Federal Judiciary’s year-end report, US Chief Justice John Roberts addressed the potential impact of AI on the judicial system. In particular, he aimed to quell concerns about the obsolescence of judges in the face of technological advancements.


Microsoft Copilot gets new features to help edit AI-generated images

Microsoft’s AI-powered chatbots and assistants, part of the Copilot family, are receiving significant upgrades.
