
On the Security of AI Scientist

Introduction

On August 13, 2024, a paper was released that was omnipresent on my X (formerly Twitter) timeline. That same day I took a plane and used the time to read it. The paper in question is "The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery". It points in a conceptually incredible direction: automating scientific research through artificial intelligence. Using foundation LLMs, the researchers presented some (cherry-picked) papers produced by AI, at a cost of about $15 each, on various sub-fields of ML, such as diffusion modeling, transformer-based language modeling, and learning dynamics (also known as grokking). The process is quite linear and follows this scheme (a rough sketch in code follows the list):

  1. Idea generation
  2. Experiment Iteration
  3. Paper write-up.
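
To make the three stages concrete, here is a minimal Python sketch of how such a loop might be orchestrated. This is my own illustration, not the paper's actual code: the function names and their bodies are placeholders standing in for the LLM calls the real system makes.

```python
# Hypothetical sketch of the AI Scientist loop; the functions below are
# placeholders for LLM-driven steps, not the paper's implementation.

def generate_ideas(topic: str, n: int) -> list[str]:
    # In the real system, an LLM proposes and filters research ideas.
    return [f"idea {i} about {topic}" for i in range(n)]

def run_experiments(idea: str) -> dict:
    # In the real system, the LLM writes, runs, and iterates on experiment code.
    return {"idea": idea, "result": "placeholder metrics"}

def write_paper(idea: str, results: dict) -> str:
    # In the real system, the LLM drafts a full write-up of the results.
    return f"Draft paper for '{idea}' with results: {results['result']}"

def run_pipeline(topic: str, num_ideas: int = 3) -> list[str]:
    papers = []
    for idea in generate_ideas(topic, num_ideas):   # 1. Idea generation
        results = run_experiments(idea)             # 2. Experiment iteration
        papers.append(write_paper(idea, results))   # 3. Paper write-up
    return papers

if __name__ == "__main__":
    for paper in run_pipeline("diffusion modeling"):
        print(paper)
```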

I strongly recommend reading that paper in its entirety first, and then coming back to this post.

What I Think

I'm not qualified to comment on the paper itself. I'm not an ML researcher, and I don't feel qualified to judge the feasibility or quality of the papers the LLM produced. What I want to offer instead is my perspective and my personal take on its security, which I'll address in more detail below.

We are still at the dawn of what could be the automation of AI research. For an interesting read, I recommend "SITUATIONAL AWARENESS: The Decade Ahead" by Leopold Aschenbrenner for a taste, perhaps closer to science fiction, of the future scenarios that might await us (and afterwards the video by Sabine Hossenfelder for a proper reality check).

Personally, I'm quite bearish on the actual implementation of the paper's main idea with the current architecture of LLMs. This caution is supported by the observations of Yann LeCun, one of the pioneers of AI, in this post.
It's important to note, as the paper reports in one of its final points:

"we do not recommend taking the scientific content of this version of The AI Scientist at face value. Instead, we advise treating generated papers as hints of promising ideas for practitioners to follow up on. Nonetheless, we expect the trustworthiness of The AI Scientist to increase dramatically in the coming years in tandem with improvements to foundation models. We share this paper and code primarily to show what is currently possible and hint at what is likely to be possible soon."

Security Concerns

Having highlighted this important point, I want to analyze the paragraph on safe code execution. Writing code is a fundamental and extremely important phase in producing a paper, yet the authors state up front that no sandbox was applied (not a great start). Here is an excerpt:

"For example, in one run, The AI Scientist wrote code in the experiment file that initiated a system call to relaunch itself, causing an uncontrolled increase in Python processes and eventually necessitating manual intervention. In another run, The AI Scientist edited the code to save a checkpoint for every update step, which took up nearly a terabyte of storage. In some cases, when The AI Scientist’s experiments exceeded our imposed time limits, it attempted to edit the code to extend the time limit arbitrarily instead of trying to shorten the runtime."

Moreover, it is mentioned that the system occasionally imported unknown or non-existent Python libraries while writing code, further increasing the risk of malicious code execution. At the end of that paragraph, the authors accordingly emphasize the importance of sandboxing the system, restricting Internet access, and limiting storage usage.
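
One concrete mitigation for the unknown-import problem is to statically inspect the generated code before running it and refuse anything that pulls in modules outside an explicit allowlist. The following is my own illustrative sketch using Python's standard ast module, not something the paper implements; the allowlist contents are an assumption.

```python
import ast

# Illustrative allowlist; the real set would depend on the experiment templates in use.
ALLOWED_MODULES = {"numpy", "torch", "math", "json", "matplotlib"}

def unknown_imports(source: str) -> set[str]:
    """Return top-level modules imported by `source` that are not on the allowlist."""
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found - ALLOWED_MODULES

generated_code = "import torch\nimport totally_made_up_package\n"
print(unknown_imports(generated_code))  # {'totally_made_up_package'} -> refuse to run
```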

These issues highlight the importance of implementing robust security measures when working with agentic AI systems that generate and execute code. Some additional, not-so-obvious precautions are also worth considering, such as hard-limiting the resources the generated code can consume, sketched below.
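As an illustration (my own sketch, not the authors' setup), the generated experiment script could be run in a separate process with a wall-clock timeout, a CPU-time limit, and caps on how much it can write to disk and how many processes it can spawn, using only the Python standard library (the resource module is POSIX-only, and the specific limits here are arbitrary examples).

```python
import resource
import subprocess
import sys

def _apply_limits():
    # Runs in the child process just before the generated script starts (POSIX only).
    resource.setrlimit(resource.RLIMIT_CPU, (60, 60))           # at most 60 s of CPU time
    resource.setrlimit(resource.RLIMIT_FSIZE, (10**9, 10**9))   # ~1 GB per file written
    resource.setrlimit(resource.RLIMIT_NPROC, (64, 64))         # cap how many processes it can spawn

def run_generated_experiment(path: str) -> subprocess.CompletedProcess:
    """Run a generated experiment script with hard resource limits and a wall-clock timeout."""
    return subprocess.run(
        [sys.executable, path],
        preexec_fn=_apply_limits,  # apply the rlimits in the child before the script runs
        timeout=300,               # raises subprocess.TimeoutExpired after 5 minutes
        capture_output=True,
        text=True,
    )
```

Network isolation, which the authors also recommend, is better enforced at the OS or container level than from inside Python itself.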

Final Thoughts

The promise of research automation is enormous and, on balance, incredibly net-positive. However, we must remember that, unfortunately, bad actors exist in the world.
In the research world, the paper emphasizes that such a cost-effective tool could flood reviewers with a very large number of lower-quality papers. Moreover, since the tool could also be used by reviewers, the authors warn of inevitable biases in the paper evaluation phase (a problem I don't believe will be solved in the near future).
In the real world, on the other hand, such a tool could be used to conduct potentially dangerous biotech research, or even to create extremely harmful malware.
In conclusion, while the AI Scientist represents an exciting step towards the automation of scientific research, it is essential to proceed with caution, ensuring that the development of these technologies is guided by solid ethical principles and robust security measures.