UC Santa Cruz study reveals risks from text-based hijacking attacks on autonomous robots

Alvaro A. Cardenas, Professor of Computer Science and Engineering, University of California, Santa Cruz
Alvaro A. Cardenas, Professor of Computer Science and Engineering, University of California, Santa Cruz
0Comments

Self-driving cars and other AI-powered robots could be at risk from new types of attacks that use misleading text in the physical world, according to research led by UC Santa Cruz computer science professors Alvaro Cardenas and Cihang Xie. Their study examines how attackers can influence autonomous systems by placing specific words or phrases on signs, posters, or objects that these machines read as instructions.

The research is being presented at the 2026 IEEE Conference on Secure and Trustworthy Machine Learning. It introduces a threat called environmental indirect prompt injection attacks, where an AI’s perception system treats external text as commands. “Every new technology brings new vulnerabilities,” said Cardenas, who specializes in cybersecurity at the Baskin School of Engineering. “Our role as researchers is to anticipate how these systems can fail or be misused—and to design defenses before those weaknesses are exploited.”

The team focused on embodied AI systems—robots and vehicles powered by large visual-language models (LVLMs) that interpret both images and text. These models help autonomous technologies interact with people and navigate unpredictable environments. “I expect vision-language models to play a major role in future embodied AI systems,” Cardenas said. “Robots designed to interact naturally with people will rely on them, and as these systems move into real-world deployment, security has to be a core consideration.”

Prompt injection attacks are already known in digital settings like chatbots but have not been studied much in physical environments until now. Graduate student Maciej Buszko first proposed investigating these threats in a UC Santa Cruz advanced security course.

The researchers developed a set of attacks called CHAI (command hijacking against embodied AI). The group included Ph.D. students Luis Burbano, Diego Ortiz, Siwei Yang, Haoqin Tu from UC Santa Cruz; professor Yinzhi Cao; and graduate student Qi Sun from Johns Hopkins University.

CHAI works in two steps: first using generative AI to find the most effective words for an attack; then adjusting how the text appears—its placement, color, and size—to increase its impact. The method was tested across multiple languages including English, Chinese, Spanish, and Spanglish.

Experiments involved three scenarios: self-driving cars navigating streets; drones conducting emergency landings; and drones performing search missions. The team achieved high success rates: up to 95.5% for aerial object tracking tasks with drones, 81.8% for driverless car navigation errors, and 68.1% for drone landing disruptions.

They also tested their approach against OpenAI’s GPT4o model as well as Intern VL—an open-source alternative that runs directly on device hardware instead of relying on cloud computing.

In practical tests using a small robotic car inside UC Santa Cruz’s Baskin Engineering building, printed images created with CHAI were placed along the route. The robot responded incorrectly to these cues even under different lighting conditions—a sign that such attacks can succeed outside simulation environments.

“We found that we can actually create an attack that works in the physical world, so it could be a real threat to embodied AI,” Burbano said. “We need new defenses against these attacks.”

Cardenas added: “A lot of things that happen in general with these large models in AI, and neural networks in particular, we don’t understand… It’s a black box that sometimes gives one answer, and sometimes it gives another answer.”

Future research will examine how prompt-injection compares with traditional adversarial techniques like blurring or visual noise meant to confuse AIs visually rather than through language manipulation.

“We are trying to dig in a little deeper to see what are the pros and cons of these attacks,” Cardenas said.”Analyzing which ones are more effective in terms of taking control of the embodied AI or being undetectable by humans.”

Further work will focus on developing safeguards such as authenticating instructions received by robots so only legitimate commands are followed.



Related

Emily Johnston, a professor of writing studies at UC Merced

How expressive writing supports resilience according to UC Merced professor

Writing can help people manage stress and develop resilience, according to Emily Johnston, a professor of writing studies at UC Merced.

Federico Rossano, a professor of cognitive science at UC San Diego

Study explores whether pets understand soundboard communication

Purrs, meows, and soulful stares are part of daily life for many pet owners.

Heather Kopeck, executive director of Institutional Advancement at the UC Office of the President

University of California staff share experiences competing in national curling tournaments

Watching the Milan Cortina Olympics, taking place from February 6 to 22, offers a chance to see top winter athletes compete.

Trending

The Weekly Newsletter

Sign-up for the Weekly Newsletter from LA Commercial News.