LessWrong Curated Podcast
Total duration:
14 h 18 min
AI companies aren’t really using external evaluators
07:42
EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024
06:42
What’s Going on With OpenAI’s Messaging?
06:30
Language Models Model Us
29:05
Jaan Tallinn’s 2023 Philanthropy Overview
00:51
OpenAI: Exodus
84:44
DeepMind’s “Frontier Safety Framework” is weak and unambitious
07:20
Do you believe in hundred dollar bills lying on the ground? Consider humming
11:29
Deep Honesty
15:22
On Not Pulling The Ladder Up Behind You
14:29
Mechanistically Eliciting Latent Behaviors in Language Models
80:59
Ironing Out the Squiggles
18:59
Introducing AI Lab Watch
02:42
Refusal in LLMs is mediated by a single direction
17:07
Funny Anecdote of Eliezer From His Sister
04:06
Thoughts on seed oil
34:23
Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer
12:36
Express interest in an “FHI of the West”
05:44
Transformers Represent Belief State Geometry in their Residual Stream
23:51
Paul Christiano named as US AI Safety Institute Head of AI Safety
02:18
[HUMAN VOICE] "How could I have thought that faster?" by mesaoptimizer
03:02
[HUMAN VOICE] "My PhD thesis: Algorithmic Bayesian Epistemology" by Eric Neyman
13:07
[HUMAN VOICE] "Toward a Broader Conception of Adverse Selection" by Ricki Heicklen
21:49
[HUMAN VOICE] "On green" by Joe Carlsmith
75:13
LLMs for Alignment Research: a safety priority?
20:46
[HUMAN VOICE] "Social status part 1/2: negotiations over object-level preferences" by Steven Byrnes
50:08
[HUMAN VOICE] "Using axis lines for good or evil" by dynomight
12:17
[HUMAN VOICE] "Scale Was All We Needed, At First" by Gabriel Mukobi
15:04
[HUMAN VOICE] "Acting Wholesomely" by OwenCB
27:26
The Story of “I Have Been A Good Bing”
22:39
The Best Tacit Knowledge Videos on Every Subject
14:44
[HUMAN VOICE] "My Clients, The Liars" by ymeskhout
13:59
[HUMAN VOICE] "Deep atheism and AI risk" by Joe Carlsmith
46:59
[HUMAN VOICE] "CFAR Takeaways: Andrew Critch" by Raemon
09:10
[HUMAN VOICE] "Speaking to Congressional staffers about AI risk" by Akash, hath
24:14
Many arguments for AI x-risk are wrong
20:03
Tips for Empirical Alignment Research
39:53
Timaeus’s First Four Months
11:55
Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”
07:43
[HUMAN VOICE] "And All the Shoggoths Merely Players" by Zack_M_Davis
21:40