RL | Shane Caldwell

DiLoCo: Data Parallelism for the Datacenter Poor

Distributed training sans datacenter.

We tried RL once. It didn’t work. I’m confident it will this time.

A manifesto on RL in cybersecurity, from when deep RL was the thing.