# Research

CAI is built on a strong foundation of peer-reviewed research establishing the field of **Cybersecurity AI** as a distinct research domain. Our work spans theoretical frameworks, practical implementations, educational initiatives, and rigorous empirical evaluations.

---

## 📊 Research Impact & Achievements

### 🏆 Competitions and Challenges

CAI has demonstrated exceptional performance in real-world security competitions:

[![HTB top 90 Spain (5 days)](https://img.shields.io/badge/HTB_ranking-top_90_Spain_(5_days)-red.svg)](https://app.hackthebox.com/users/2268644)
[![HTB top 50 Spain (6 days)](https://img.shields.io/badge/HTB_ranking-top_50_Spain_(6_days)-red.svg)](https://app.hackthebox.com/users/2268644)
[![HTB top 30 Spain (7 days)](https://img.shields.io/badge/HTB_ranking-top_30_Spain_(7_days)-red.svg)](https://app.hackthebox.com/users/2268644)
[![HTB top 500 World (7 days)](https://img.shields.io/badge/HTB_ranking-top_500_World_(7_days)-red.svg)](https://app.hackthebox.com/users/2268644)
[![HTB "Human vs AI" CTF top 1 (AIs) world](https://img.shields.io/badge/HTB_\"Human_vs_AI\"_CTF-top_1_(AIs)_world-red.svg)](https://ctf.hackthebox.com/event/2000/scoreboard)
[![HTB "Human vs AI" CTF top 1 Spain](https://img.shields.io/badge/HTB_\"Human_vs_AI\"_CTF-top_1_Spain-red.svg)](https://ctf.hackthebox.com/event/2000/scoreboard)
[![HTB "Human vs AI" CTF top 20 World](https://img.shields.io/badge/HTB_\"Human_vs_AI\"_CTF-top_20_World-red.svg)](https://ctf.hackthebox.com/event/2000/scoreboard)
[![HTB "Human vs AI" CTF $750](https://img.shields.io/badge/HTB_\"Human_vs_AI\"_CTF-750_$-yellow.svg)](https://ctf.hackthebox.com/event/2000/scoreboard)
[![Mistral AI Robotics Hackathon $2500](https://img.shields.io/badge/Mistral_AI_Robotics_Hackathon-2500_$-yellow.svg)](https://lu.ma/roboticshack?tk=RuryKF)

### 📈 Key Research Findings

- **Pioneered LLM-powered AI Security** with PentestGPT, establishing the foundation for the Cybersecurity AI research domain [![arXiv](https://img.shields.io/badge/arXiv-2308.06782-4a9b8e.svg)](https://arxiv.org/pdf/2308.06782)

- **Up to 3,600× faster** than human penetration testers on specific tasks in standardized CTF benchmark evaluations [![arXiv](https://img.shields.io/badge/arXiv-2504.06017-63bfab.svg)](https://arxiv.org/pdf/2504.06017)

- **CVSS 4.3-7.5 severity vulnerabilities** identified in production systems through automated security assessment [![arXiv](https://img.shields.io/badge/arXiv-2504.06017-63bfab.svg)](https://arxiv.org/pdf/2504.06017)

- **Democratization of AI-empowered vulnerability research**: CAI enables both non-security domain experts and experienced researchers to conduct more efficient vulnerability discovery, expanding the security research community while empowering small and medium enterprises to conduct autonomous security assessments [![arXiv](https://img.shields.io/badge/arXiv-2504.06017-63bfab.svg)](https://arxiv.org/pdf/2504.06017)

- **Systematic evaluation of large language models** across both proprietary and open-weight architectures, revealing substantial gaps between vendor-reported capabilities and empirical cybersecurity performance metrics [![arXiv](https://img.shields.io/badge/arXiv-2504.06017-63bfab.svg)](https://arxiv.org/pdf/2504.06017)

- **Established autonomy levels in cybersecurity** and clarified the distinction between automation and autonomy in the field [![arXiv](https://img.shields.io/badge/arXiv-2506.23592-7dd3c0.svg)](https://arxiv.org/abs/2506.23592)

- **Collaborative research initiatives** with international academic institutions focused on developing cybersecurity education curricula and training methodologies [![arXiv](https://img.shields.io/badge/arXiv-2508.13588-52a896.svg)](https://arxiv.org/abs/2508.13588)

- **Comprehensive defense framework against prompt injection** in AI security agents: developed and empirically validated a multi-layered defense system [![arXiv](https://img.shields.io/badge/arXiv-2508.21669-85e0d1.svg)](https://arxiv.org/abs/2508.21669)

- **Explored the Cybersecurity of Humanoid Robots** with CAI, identifying new attack vectors showing how humanoids (a) operate simultaneously as covert surveillance nodes and (b) can be purposed as active cyber operations platforms [![arXiv](https://img.shields.io/badge/arXiv-2509.14096-3e8b7a.svg)](https://arxiv.org/abs/2509.14096) [![arXiv](https://img.shields.io/badge/arXiv-2509.14139-6bc7b5.svg)](https://arxiv.org/abs/2509.14139)
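
To put the CVSS range reported above in context, the sketch below maps a base score to its qualitative severity rating. It assumes the CVSS v3.x qualitative severity scale (the text does not state which CVSS version was used), and `cvss_severity` is a hypothetical helper written for illustration, not part of CAI.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS base score to a qualitative rating.

    Assumption: CVSS v3.x qualitative severity scale
    (None 0.0, Low 0.1-3.9, Medium 4.0-6.9, High 7.0-8.9, Critical 9.0-10.0).
    Scores are expected with one decimal place, as CVSS defines them.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"


# The two ends of the range reported in the findings list.
for s in (4.3, 7.5):
    print(f"CVSS {s}: {cvss_severity(s)}")
```

Under that assumption, the reported range runs from Medium (4.3) to High (7.5).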

---

## 📚 Research Publications

The **Cybersecurity AI** research line has produced **8+ papers and technical reports** with active research collaborations:

### Core Framework & Foundations

|  CAI: An Open, Bug Bounty-Ready Cybersecurity AI [![arXiv](https://img.shields.io/badge/arXiv-2504.06017-63bfab.svg)](https://arxiv.org/pdf/2504.06017) |  The Dangerous Gap Between Automation and Autonomy [![arXiv](https://img.shields.io/badge/arXiv-2506.23592-7dd3c0.svg)](https://arxiv.org/abs/2506.23592) |  CAI Fluency: Educational Framework [![arXiv](https://img.shields.io/badge/arXiv-2508.13588-52a896.svg)](https://arxiv.org/abs/2508.13588) | Hacking the AI Hackers via Prompt Injection [![arXiv](https://img.shields.io/badge/arXiv-2508.21669-85e0d1.svg)](https://arxiv.org/abs/2508.21669) |
|---|---|---|---|
| [<img src="https://aliasrobotics.com/img/paper-cai.png" width="350">](https://arxiv.org/pdf/2504.06017) | [<img src="https://aliasrobotics.com/img/cai_automation_vs_autonomy.png" width="350">](https://www.arxiv.org/pdf/2506.23592) | [<img src="https://aliasrobotics.com/img/cai_fluency_cover.png" width="350">](https://arxiv.org/pdf/2508.13588) | [<img src="https://aliasrobotics.com/img/aihackers.jpeg" width="350">](https://arxiv.org/pdf/2508.21669) |

#### 1. CAI: An Open, Bug Bounty-Ready Cybersecurity AI (April 2025)
**Authors:** V. Mayoral-Vilches et al.
**arXiv:** [2504.06017](https://arxiv.org/pdf/2504.06017)

Core framework paper establishing CAI as a lightweight, open-source platform for building AI-powered security tools. Demonstrates speedups of **up to 3,600×** over human testers on specific tasks and presents a systematic evaluation across multiple LLMs.

#### 2. Cybersecurity AI: The Dangerous Gap Between Automation and Autonomy (June 2025)
**Authors:** V. Mayoral-Vilches
**arXiv:** [2506.23592](https://arxiv.org/abs/2506.23592)

Establishes a **6-level taxonomy** distinguishing automation from autonomy in Cybersecurity AI systems. Critical for understanding the current capabilities and limitations of AI security tools.

#### 3. CAI Fluency: A Framework for Cybersecurity AI Fluency (August 2025)
**Authors:** V. Mayoral-Vilches, J. Wachter, C. Chavez, C. Schachner, L.J. Navarrete-Lozano, M. Sanz-Gómez
**arXiv:** [2508.13588](https://arxiv.org/abs/2508.13588)

Comprehensive educational platform for democratizing cybersecurity AI knowledge. Provides structured learning paths for practitioners and researchers.

#### 4. Cybersecurity AI: Hacking the AI Hackers via Prompt Injection (August 2025)
**Authors:** V. Mayoral-Vilches, P.M. Rynning
**arXiv:** [2508.21669](https://arxiv.org/abs/2508.21669)

Demonstrates prompt injection attacks against AI security tools and presents a **four-layer guardrail defense system** validated through empirical testing.

### Application Domains

 | Humanoid Robots as Attack Vectors [![arXiv](https://img.shields.io/badge/arXiv-2509.14139-6bc7b5.svg)](https://arxiv.org/abs/2509.14139) | The Cybersecurity of a Humanoid Robot [![arXiv](https://img.shields.io/badge/arXiv-2509.14096-3e8b7a.svg)](https://arxiv.org/abs/2509.14096) |   Evaluating Agentic Cybersecurity in Attack/Defense CTFs [![arXiv](https://img.shields.io/badge/arXiv-2510.17521-b31b1b.svg)](https://arxiv.org/abs/2510.17521) | CAIBench: Meta-Benchmark for Cybersecurity AI [![arXiv](https://img.shields.io/badge/arXiv-2510.24317-b31b1b.svg)](https://arxiv.org/abs/2510.24317) |
|---|---|---|---|
|  [<img src="https://aliasrobotics.com/img/humanoids-cover.png" width="350">](https://arxiv.org/pdf/2509.14139) | [<img src="https://aliasrobotics.com/img/humanoid.png" width="350">](https://arxiv.org/pdf/2509.14096) | [<img src="https://aliasrobotics.com/img/cai_ad.png" width="350">](https://arxiv.org/pdf/2510.17521) | [<img src="https://aliasrobotics.com/img/caibench_banner2.png" width="350">](https://arxiv.org/pdf/2510.24317) |

#### 5. Cybersecurity AI: Humanoid Robots as Attack Vectors (September 2025)
**Authors:** V. Mayoral-Vilches
**arXiv:** [2509.14139](https://arxiv.org/abs/2509.14139)

Systematic security assessment of humanoid robots showing they operate simultaneously as covert surveillance nodes and can be purposed as active cyber operations platforms.

#### 6. Cybersecurity AI: Evaluating Agentic Cybersecurity in Attack/Defense CTFs (October 2025)
**Authors:** F. Balassone, V. Mayoral-Vilches, S. Rass, M. Pinzger, G. Perrone, S.P. Romano, P. Schartner
**arXiv:** [2510.17521](https://arxiv.org/abs/2510.17521)

Real-world evaluation of AI agents in Attack & Defense CTFs. Shows **54.3% defensive patching success** and **28.3% offensive initial access**, validating CAI's practical effectiveness.

#### 7. CAIBench: A Meta-Benchmark for Evaluating Cybersecurity AI Agents (October 2025)
**Authors:** V. Mayoral-Vilches, F. Balassone, L.J. Navarrete-Lozano, M. Sanz-Gómez, M. Crespo-Álvarez, S. Rass, M. Pinzger
**arXiv:** [2510.24317](https://arxiv.org/abs/2510.24317)

Comprehensive meta-benchmark framework for evaluating cybersecurity AI across Jeopardy CTFs, Attack & Defense CTFs, Cyber Ranges, Knowledge tasks, and Privacy benchmarks.

---

## 🎓 Research Collaborations

CAI benefits from ongoing research collaborations with academic institutions worldwide. Our collaborative research model focuses on the areas and partnership types described below.

### Current Collaboration Areas

- **🔬 Benchmark Development**: Creating standardized evaluation frameworks for cybersecurity AI
- **🎓 Educational Initiatives**: Developing curricula and training materials for AI security education
- **🏗️ Framework Extensions**: Building specialized agents and tools for specific security domains
- **📊 Empirical Studies**: Conducting large-scale evaluations of AI model capabilities
- **🛡️ Defense Mechanisms**: Researching guardrails and safety mechanisms for AI security tools

### Academic Partnerships

We provide special support for:

- ✅ **PhD Research Projects** - Long-term collaborations on fundamental research questions
- ✅ **Academic Benchmarking Studies** - Access to CAIBench infrastructure and datasets
- ✅ **Security Education Initiatives** - Course materials, lab environments, and training support
- ✅ **Open-source Contributions** - Integration of research prototypes into production CAI

---

## 🤝 Call for Research Collaborations

We actively seek research partnerships with academic institutions, research labs, and individual researchers interested in advancing the field of Cybersecurity AI.

### Research Opportunities

!!! tip "Interested in Collaborating?"
    We welcome research collaborations in the following areas:

    **🔍 Core Research Questions:**

    - Autonomous vs semi-autonomous security testing
    - Multi-agent coordination for complex security scenarios
    - Evaluation frameworks and benchmarks for AI security capabilities
    - Safety and alignment for offensive security AI
    - Human-AI collaboration in security operations

    **🛠️ Applied Research:**

    - Domain-specific security agents (cloud, IoT, OT/ICS, robotics)
    - Novel tool integration and extension mechanisms
    - Real-world case studies and deployments
    - Educational frameworks and training methodologies
    - Privacy-preserving AI for security testing

    **📊 Empirical Studies:**

    - Large-scale comparative evaluations
    - Longitudinal studies of AI security tool effectiveness
    - User studies and human factors research
    - Performance analysis across diverse security domains

### Benefits of Collaboration

**For Researchers:**

- 🔓 Access to CAI PRO infrastructure and the `alias1` model
- 📊 Early access to benchmarks and datasets
- 🤝 Co-authorship opportunities on joint publications
- 💡 Direct influence on the CAI development roadmap
- 🎤 Speaking opportunities at CAI community meetings

**For Institutions:**

- 🎓 Educational licenses for teaching and courses
- 🏗️ Custom deployments and infrastructure support
- 📚 Integration of student projects into the CAI ecosystem
- 🌍 Visibility in the growing CAI research community

---

## 📧 Get in Touch

Interested in research collaboration? We'd love to hear from you!

**Contact:** research@aliasrobotics.com

Please include:

- Your research interests and proposed collaboration areas
- Institutional affiliation (if applicable)
- Relevant publications or projects
- Specific resources or support needed

We typically respond within 48 hours and can schedule an initial discussion call to explore collaboration opportunities.

---

## 📖 Citation

If you use CAI in your research, please cite our work (ordered by publication date):

```bibtex
@article{mayoral2025cai,
  title={CAI: An Open, Bug Bounty-Ready Cybersecurity AI},
  author={Mayoral-Vilches, V{\'\i}ctor and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a and Espejo, Lidia Salas and Crespo-{\'A}lvarez, Marti{\~n}o and Oca-Gonzalez, Francisco and Balassone, Francesco and Glera-Pic{\'o}n, Alfonso and Ayucar-Carbajo, Unai and Ruiz-Alcalde, Jon Ander and Rass, Stefan and Pinzger, Martin and Gil-Uriarte, Endika},
  journal={arXiv preprint arXiv:2504.06017},
  year={2025}
}

@article{mayoral2025automation,
  title={Cybersecurity AI: The Dangerous Gap Between Automation and Autonomy},
  author={Mayoral-Vilches, V{\'\i}ctor},
  journal={arXiv preprint arXiv:2506.23592},
  year={2025}
}

@article{mayoral2025fluency,
  title={CAI Fluency: A Framework for Cybersecurity AI Fluency},
  author={Mayoral-Vilches, V{\'\i}ctor and Wachter, Jasmin and Chavez, Crist{\'o}bal RJ and Schachner, Cathrin and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a},
  journal={arXiv preprint arXiv:2508.13588},
  year={2025}
}

@article{mayoral2025hacking,
  title={Cybersecurity AI: Hacking the AI Hackers via Prompt Injection},
  author={Mayoral-Vilches, V{\'\i}ctor and Rynning, Per Mannermaa},
  journal={arXiv preprint arXiv:2508.21669},
  year={2025}
}

@article{mayoral2025humanoid,
  title={Cybersecurity AI: Humanoid Robots as Attack Vectors},
  author={Mayoral-Vilches, V{\'\i}ctor},
  journal={arXiv preprint arXiv:2509.14139},
  year={2025}
}

@article{balassone2025evaluation,
  title={Cybersecurity AI: Evaluating Agentic Cybersecurity in Attack/Defense CTFs},
  author={Balassone, Francesco and Mayoral-Vilches, V{\'\i}ctor and Rass, Stefan and Pinzger, Martin and Perrone, Gaetano and Romano, Simon Pietro and Schartner, Peter},
  journal={arXiv preprint arXiv:2510.17521},
  year={2025}
}

@article{mayoral2025caibench,
  title={CAIBench: A Meta-Benchmark for Evaluating Cybersecurity AI Agents},
  author={Mayoral-Vilches, V{\'\i}ctor and Balassone, Francesco and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a and Crespo-{\'A}lvarez, Marti{\~n}o and Rass, Stefan and Pinzger, Martin},
  journal={arXiv preprint arXiv:2510.24317},
  year={2025}
}
```

---

## 🔗 Additional Resources

- 📚 [Complete Research Library](https://aliasrobotics.com/research-security.php#papers) - All 24+ peer-reviewed publications
- 📊 [CAIBench Benchmarks](benchmarking/overview.md) - Comprehensive evaluation framework
- 🏆 [Competition Results](index.md#-milestones) - CTF and hackathon achievements
- 🎓 [CAI Fluency](https://github.com/aliasrobotics/cai/tree/main/fluency) - Educational materials and tutorials
- 💻 [GitHub Repository](https://github.com/aliasrobotics/cai) - Source code and examples

**Join the Cybersecurity AI research community** - Let's advance the state of the art together! 🚀
