Hackbot R&D

Looking at the data and letting it look back at us.
Optimizing the training of a Llama 3.2 1B model so we can pretrain in a day without going broke.
If you contribute a public benchmark, are you giving free capability to your competitors?
So, you mixed user input and instructions.
A manifesto on RL in cybersecurity, from when deep RL was the thing.
Because you shouldn’t try to automate anything you can’t do yourself.