Hackbot R&D

I’m a researcher working at the intersection of artificial intelligence and computer security. At Dreadnode, I train and evaluate the hacking capabilities of agents. I’ll be working in this field until we see hacking’s move 37.
Shane Caldwell

RL Needed LLMs Because Agency Requires Priors

We tried RL once. It didn’t work. I’m confident it will this time.

August 19, 2025 · 26 min · 5458 words · Shane Caldwell

GPT-5 is Good, Actually: The Agony and Ecstasy of Public Benchmarks

An attempt to explain why benchmarks are either bad or secret, and why the bar charts don’t matter so much.

August 17, 2025 · 17 min · 3488 words · Shane Caldwell

The Religious Devotion of Haskell

A brief(ish) anecdote and investigation into the religious devotion of Haskell programmers.

July 1, 2024 · 10 min · 2024 words · Shane Caldwell

The Input Sanitization Perspective on Prompt Injection

So, you mixed user input and instructions.

July 2, 2023 · 23 min · 4772 words · Shane Caldwell

Deep Reinforcement Learning for Security: Toward an Autonomous Pentesting Agent

A manifesto on RL in cybersecurity, from when deep RL was the thing.

April 28, 2020 · 30 min · 6178 words · Shane Caldwell

An ML Eng's Review of OSCP

Because you shouldn’t try to automate anything you can’t do yourself.

April 26, 2020 · 16 min · 3343 words · Shane Caldwell