“Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior” by Sam Marks
by LessWrong (Curated & Popular)
Release date: 2025-10-10 14:15:40
Length: 04:06