Joe Carlsmith Audio
Total duration: 30 h 17 min
On attunement (44:14)
On green (75:13)
On the abolition of man (69:22)
Being nicer than Clippy (47:30)
An even deeper atheism (25:12)
Does AI risk "other" the AIs? (13:15)
When "yang" goes wrong (21:32)
Deep atheism and AI risk (46:59)
Gentleness and the artificial Other (22:39)
In search of benevolence (or: what should you get Clippy for Christmas?) (52:52)
Empirical work that might shed light on scheming (Section 6 of "Scheming AIs") (28:00)
Summing up "Scheming AIs" (Section 5) (15:46)
Speed arguments against scheming (Sections 4.4-4.7 of "Scheming AIs") (15:19)
Simplicity arguments for scheming (Section 4.3 of "Scheming AIs") (19:37)
The counting argument for scheming (Sections 4.1 and 4.2 of "Scheming AIs") (10:40)
Arguments for/against scheming that focus on the path SGD takes (Section 3 of "Scheming AIs") (29:03)
Non-classic stories about scheming (Section 2.3.2 of "Scheming AIs") (24:34)
Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs") (22:54)
The goal-guarding hypothesis (Section 2.3.1.1 of "Scheming AIs") (19:11)
How useful for alignment-relevant work are AIs with short-term goals? (Section 2.2.4.3 of "Scheming AIs") (09:21)
Is scheming more likely if you train models to have long-term goals? (Sections 2.2.4.1-2.2.4.2 of "Scheming AIs") (09:01)
"Clean" vs. "messy" goal-directedness (Section 2.2.3 of "Scheming AIs") (16:44)
Two sources of beyond-episode goals (Section 2.2.2 of "Scheming AIs") (21:25)
Two concepts of an "episode" (Section 2.2.1 of "Scheming AIs") (12:08)
On "slack" in training (Section 1.5 of "Scheming AIs") (07:12)
Situational awareness (Section 2.1 of "Scheming AIs") (09:27)
Why focus on schemers in particular? (Sections 1.3-1.4 of "Scheming AIs") (31:17)
A taxonomy of non-schemer models (Section 1.2 of "Scheming AIs") (11:20)
Varieties of fake alignment (Section 1.1 of "Scheming AIs") (17:54)
Full audio for "Scheming AIs: Will AIs fake alignment during training in order to get power?" (373:17)
Introduction and summary of "Scheming AIs: Will AIs fake alignment during training in order to get power?" (56:32)
In memory of Louise Glück (21:22)
On the limits of idealized values (60:14)
Predictable updating about AI risk (63:14)
Existential Risk from Power-Seeking AI (shorter version) (55:03)
Problems of evil (35:42)
Seeing more whole (52:26)
Why should ethical anti-realists do ethics? (53:29)
Is Power-Seeking AI an Existential Risk? (201:02)
On sincerity (95:02)