“Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior” by Sam Marks

od LessWrong (Curated & Popular)

  • 2025-10-10 14:15:40Datum izdaje
  • 04:06Trajanje