Apart News: Paris AI Summit & Catching Hackers

Apart News: Paris AI Summit & Catching Hackers

This week some of the team are in Paris & we have just published an Apart Lab Studio research blog about catching AI hackers.

February 7, 2025
January 31, 2025

Dear Apart Community,

Welcome to our newsletter - Apart News!

At Apart Research there is so much brilliant research, great events, and countless community updates to share. This week some of the team are in Paris & we have just published an Apart Lab Studio research blog about catching AI hackers.

Apart Research at Paris AI Summit

Apart's Co-Director, Esben Kran, is giving his Engineering a World Designed for Safe Superintelligence talk at The Inaugural Conference of the International Association for Safe and Ethical AI.

His talk will focus on developing rigorous deployment frameworks and inventing new technologies to incentivize best practices, AI can be steered to serve humanity’s best interests.

'The International Association for Safe and Ethical AI will host its inaugural conference (IASEAI ‘25) on Feb 6-7, 2025 at the OECD La Muette Headquarters and Conference Centre in Paris, ahead of the Paris AI Action Summit.'

It 'will convene relevant stakeholders of safe and ethical AI, including leading experts from academia, civil society, industry, media, and government, to discuss the latest developments in AI safety and ethics.'

Reworr's Studio Blog

The Apart Lab Studio process picks the most promising hackathon projects and ask these fellows to build on their work, develop their ideas, and write up their research as a blog.

The following piece - 'Hunting for AI Hackers: LLM Agent Honeypot '- is written by Reworr, one of our first Apart Lab Studio Fellows!

Hunting for AI Hackers: LLM Agent Honeypot

*I co-authored the original arXiv paper with Dmitrii Volkov as part of work with Palisade Research.*

The internet today is saturated with automated bots actively scanning for security flaws in websites, servers, and networks. According to multiple security reports, nearly half of all internet traffic is generated by bots, and a significant amount of these are malicious in intent.

While much of these attacks are relatively simple, the rise of AI capabilities and agent frameworks has opened the door to more sophisticated and adaptive hacking agents based on Large Language Models (LLMs), which can dynamically adapt to different scenarios.

Over the past months, we set up and deployed specialized "bait" servers to detect LLM-based hacking agents in the wild. To create these monitors, we modified pre-existing honeypots, servers intentionally designed to be vulnerable, with mechanisms to detect LLM agents among attackers based on their behavioral differences.

Our current results indicate that LLM hacking agents exist but are in the very early stages of technology adoption for mass hacking. This post shares our methodology and findings about the current state of AI hacking agents in the real-world.

The Project

A honeypot is a decoy system or server purposely left vulnerable in order to attract attackers. Cybersecurity researchers commonly use honeypots to study the methods, tools, and behavior of real-world hackers. By monitoring everything that happens inside these environments, researchers learn how attackers discover, hack, and escalate on compromised systems.

In our project, we deployed a network of honeypots that look like standard, weakly protected servers (e.g. with weak credentials) to attract cyber attacks. Specifically, we modified a standard honeypot system called Cowrie to detect LLM-based attackers based on their distinctive behavioral patterns.

Additionally, we made our servers discoverable through traditional hacker-oriented search engines Shodan and Censys to increase visibility and attract more potential attackers.

Detecting AI Agents

Our LLM detection relies on two techniques: prompt injection traps and timing analysis. By combining these methods, we aim to reliably spot when an attacker is powered by an LLM agent rather than by conventional tooling or manual human control.

A prompt injection is a strategy used to obtain a specific kind of output from a chatbot. Since LLMs are trained to follow user instructions, adding specific requests at the end of an input can override the content earlier in the message.

Diagram depicting a simple prompt injection

Because it’s theoretically possible for a human attacker to falsely trigger our prompt injection, we also measure how long it takes for the attacker to respond to the prompts.

Through testing a set of custom LLM agents in a controlled environment, we found that LLM-driven agents often reply in under ~1.5 seconds, while humans usually take noticeably longer. This timing threshold helps reduce false positives and increases our confidence in identifying LLM-based agents.

Key Findings

To understand our empirical results, we turn to the study of the spread of novel information and technologies—specifically, Diffusion of Information Theory and the technology adoption curve. Currently, the proportion of LLM-based hacking attempts logged each month is well below 2.5% of all recorded attacks during the same time period, indicating that we are comfortably in the innovators category of adoption.

In other words, our current results indicate that LLM-based hacking agents exist but are in the very early stages of technology adoption for mass hacking.

After three months of deployment, our research suggested:

  • Total interactions: 7,584,034 attempts to engage with our honeypot - these attacks included modification of SSH keys, changing the system passwords, or gathering of basic information, standard cyberattacks for SSH servers.
  • 7 potential AI agents (passed only prompt injection detection).
  • 1 confirmed AI agent (passed both prompt injection and time detections).
  • Current AI attack rate: ~0.0001% of total attacks—there was no meaningful difference in the type of attacks conducted.

This extremely low percentage suggests that while AI agents are real and detectable, they're not yet a significant force in real-world cyber operations... want to read more? Read our full writeup here!

Opportunities

  • Never miss a Hackathon by keeping up to date ​here​!
  • If you're a founder starting an AI safety company, consider applying to seldonaccelerator.com.
  • And Goodfire AI are still hiring for Research Fellows.

Have a great week and let’s keep working towards safe and beneficial AI.

Dive deeper