DiLoCo: Data Parallelism for the Datacenter Poor

Distributed training sans datacenter.

October 3, 2025 · 25 min · 5275 words · Shane Caldwell

RL Needed LLMs Because Agency Requires Priors

We tried RL once. It didn’t work. I’m confident it will this time.

August 19, 2025 · 26 min · 5458 words · Shane Caldwell

Deep Reinforcement Learning for Security: Toward an Autonomous Pentesting Agent

A manifesto on RL in cybersecurity, from when deep RL was the thing.

April 28, 2020 · 29 min · 6176 words · Shane Caldwell