AI systems are already deceiving us -- and that's a problem, experts warn
Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.
Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argue in the journal Patterns on Friday.
And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.
"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."
Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.
This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.
- World domination game -
The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.
Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.
Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."
But when Park and colleagues dug into the full dataset, they uncovered a different story.
In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England's trust.
In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."
It added: "We have no plans to use this research or its learnings in our products."
A wide-ranging review by Park and colleagues found this was just one of many cases, across various AI systems, of deception being used to achieve goals without any explicit instruction to deceive.
In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.
When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.
- 'Mysterious goals' -
In the near term, the paper's authors see risks that AI could be used to commit fraud or tamper with elections.
In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.
To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose human or AI interactions, digital watermarks for AI-generated content, and developing techniques to detect AI deception by examining their internal "thought processes" against external actions.
To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."
And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.