Hackbot R&D

The practical realities of devastatingly high communication costs in training.
Looking at the data and letting it look back at us.
Optimizing the training of a Llama 3.2 1B model so we can pretrain in a day without going broke.
So, you mixed user input and instructions.
A manifesto on RL in cybersecurity, from when deep RL was the thing.
Because you shouldn’t try to automate anything you can’t do yourself.