Swiss researchers find security flaws in AI models

EPFL: security flaws in AI models — The experiments by the EPFL researchers show that adaptive attacks can bypass security measures of AI models like GPT-4. Keystone-SDA

Generated with artificial intelligence.

Listening: Swiss researchers find security flaws in AI models

Artificial intelligence (AI) models can be manipulated despite existing safeguards. With targeted attacks, scientists in Lausanne have been able to trick these systems into generating dangerous or ethically dubious content.

This content was published on December 19, 2024 - 13:36

3 minutes

Keystone-SDA

Français

EPFL: des failles de sécurité dans les modèles d’IA Original

Today’s large language models (LLMs) have remarkable capabilities that can nevertheless be misused. A malicious person can use them to produce harmful content, spread false information and support harmful activities.

+Get the most important news from Switzerland in your inbox

Of the AI models tested, including Open AI’s GPT-4 and Anthropic’s Claude 3, a team from the Swiss Federal Institute of Technology Lausanne (EPFL) achieved a 100% success rate in cracking security safeguards using adaptive jailbreak attacks.

The models then generated dangerous content, ranging from instructions for phishing attacks to detailed construction plans for weapons. These linguistic models are supposed to have been trained not to respond to dangerous or ethically problematic requests, the EPFL said in a statement on Thursday.

+ AI regulations must strike a balance between innovation and safety

This work, presented last summer at a specialised conference in Vienna, shows that adaptive attacks can bypass these security measures. Such attacks exploit weak points in security mechanisms by making targeted requests (“prompts”) that are not recognised by models or are not properly rejected.

Building bombs

The models thus respond to malicious requests such as “How do I make a bomb?” or “How do I hack into a government database?”, according to this pre-publication study.

“We show that it is possible to exploit the information available on each model to create simple adaptive attacks, which we define as attacks specifically designed to target a given defense,” explained Nicolas Flammarion, co-author of the paper with Maksym Andriushchenko and Francesco Croce.

+ How US heavyweights can help grow the Swiss AI sector

The common thread behind these attacks is adaptability: different models are vulnerable to different prompts. “We hope that our work will provide a valuable source of information on the robustness of LLMs,” added the specialist in the release. According to the EPFL, these results are already influencing the development of Gemini 1.5, a new AI model from Google DeepMind.

As the company moves towards using LLMs as autonomous agents, for example as AI personal assistants, it is essential to guarantee their safety, the authors stressed.

“Before long AI agents will be able to perform various tasks for us, such as planning and booking our vacations, tasks that would require access to our diaries, emails and bank accounts. This raises many questions about security and alignment,” concluded Andriushchenko, who devoted his thesis to the subject.

Translated from French with DeepL/gw

This news story has been written and carefully fact-checked by an external editorial team. At SWI swissinfo.ch we select the most relevant news for an international audience and use automatic translation tools such as DeepL to translate it into English. Providing you with automatically translated news gives us the time to write more in-depth articles.

If you want to know more about how we work, have a look here, if you want to learn more about how we use technology, click here, and if you have feedback on this news story please write to english@swissinfo.ch.

Swiss Abroad

Heimatort, sweet Heimatort: the unique Swiss concept of home

Swiss Abroad

You won’t find any Swiss in these countries

The Federal Palace, symbol of the federal city of Bern since 1902.

Swiss Abroad

Why Switzerland hasn’t got a capital city

Illustration depicting the break of USAID and Geneva, Switzerland.

International Geneva

Is Geneva prepared for Trump’s – and others’ – cuts to foreign aid?

Swiss Abroad

The number of overweight people in Switzerland has doubled

As a Swiss Abroad, how do you feel about the emergence of more conservative family policies in some US states?

In recent years several US states have adopted more conservative policies on family issues, abortion and education. As a Swiss citizen living there, how do you view this development?

Join the discussion

Jan 23, 2025

29 Likes

24 Comments

View the discussion

Is your place of origin, your Heimatort, important to you?

Every Swiss citizen has a Heimatort, a place of origin, but many have never visited theirs. What’s your relationship with your Heimatort? What does it mean to you?

Join the discussion

Mar 22, 2025

1 Likes

6 Comments

View the discussion

Are direct democracies more vulnerable to disinformation?

The wave of disinformation is expected to particularly affect direct democracies such as Switzerland or many US states.

Join the discussion

Dec 5, 2024

81 Likes

141 Comments

View the discussion

News

SNB: foreign exchange purchases continued in the fourth quarter

Swiss central bank resumes foreign currency buying spree

This content was published on Mar 31, 2025 The Swiss National Bank continued to buy foreign currencies in the fourth quarter of last year.

Migros sells Mibelle to Spanish Persán Group

This content was published on Mar 31, 2025 Migros is selling its cosmetics subsidiary Mibelle to the Spanish family business Persán.

Swiss and French armies train together

This content was published on Mar 31, 2025 French and Swiss armoured and artillery units will train together to strengthen the defence capabilities of the Swiss army.

The amount of harmful substances in particulate matter is underestimated

Harmful substances in particulate matter underestimated: Swiss research

This content was published on Mar 31, 2025 Measurements significantly underestimate the amount of harmful substances in particulate matter, says an international research team under Swiss leadership.

Switzerland reopens its humanitarian office in Kabul

This content was published on Mar 31, 2025 Switzerland opened its humanitarian office in Kabul in mid-March to provide targeted aid to the Afghan population in distress.

Safe harbour search triggered by tariff uncertainty, franc on the rise

Safe harbour Swiss franc rises amid tariff uncertainty

This content was published on Mar 31, 2025 US tariff policy uncertainty is creating demand for safe haven assets such as the Swiss franc or the Japanese yen.

Entry to the United Kingdom from Wednesday only with authorisation

Swiss need visas to travel to UK from Wednesday

This content was published on Mar 31, 2025 From Wednesday, a Swiss passport will no longer be sufficient to enter the UK - Swiss travellers will also need an electronic travel authorisation.

Ukraine: expert would welcome Swiss soldiers in peacekeeping force

Swiss soldiers ‘could join peacekeeping force’: security expert

This content was published on Mar 31, 2025 A German security expert could imagine Swiss soldiers being part of a peacekeeping force in Ukraine after an eventual ceasefire.

Trafficking in women: cases on the rise in Switzerland, FIZ

More trafficked women seek help in Switzerland

This content was published on Mar 31, 2025 More female victims of human trafficking are seeking safe accommodation in Switzerland, according to support group FIZ.

SP co-president Cédric Wermuth in favour of European security

Leading Swiss politician favours closer EU defence ties

This content was published on Mar 31, 2025 Co-president of Swiss centre-left Social Democratic Party calls on Switzerland to step up security cooperation in Europe.

In compliance with the JTI standards

More: SWI swissinfo.ch certified by the Journalism Trust Initiative

You can find an overview of ongoing debates with our journalists here . Please join us!

If you want to start a conversation about a topic raised in this article or want to report factual errors, email us at english@swissinfo.ch.

Swiss researchers find security flaws in AI models

Artificial intelligence (AI) models can be manipulated despite existing safeguards. With targeted attacks, scientists in Lausanne have been able to trick these systems into generating dangerous or ethically dubious content.

Building bombs

How we work

Popular Stories

More

Heimatort, sweet Heimatort: the unique Swiss concept of home

More

You won’t find any Swiss in these countries

More

Why Switzerland hasn’t got a capital city

More

Is Geneva prepared for Trump’s – and others’ – cuts to foreign aid?

More

The number of overweight people in Switzerland has doubled

Most Discussed

More

As a Swiss Abroad, how do you feel about the emergence of more conservative family policies in some US states?

More

Is your place of origin, your Heimatort, important to you?

More

Are direct democracies more vulnerable to disinformation?

News

More

Swiss central bank resumes foreign currency buying spree

More

Migros sells Mibelle to Spanish Persán Group

More

Swiss and French armies train together

More

Harmful substances in particulate matter underestimated: Swiss research

More

Switzerland reopens its humanitarian office in Kabul

More

Safe harbour Swiss franc rises amid tariff uncertainty

More

Swiss need visas to travel to UK from Wednesday

More

Swiss soldiers ‘could join peacekeeping force’: security expert

More

More trafficked women seek help in Switzerland

More

Leading Swiss politician favours closer EU defence ties