Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs")

Joe Carlsmith Audio

Content provided by Joe Carlsmith. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Joe Carlsmith or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://vi.player.fm/legal.

11 months ago · Duration 22:54

This is section 2.3.1.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Text of the report here: https://arxiv.org/abs/2311.08379
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Chapters

1. Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs") (00:00:00)

2. 2.3.1.2 Adequate future empowerment (00:00:33)

3. 2.3.1.2.1 When is the “pay off” supposed to happen? (00:01:22)

4. 2.3.1.2.2 Even if the model’s values survive this generation of training, will they survive long (00:04:20)

5. 2.3.1.2.3 Will escape/take-over be suitably likely to succeed? (00:08:16)

6. 2.3.1.2.4 Will the time horizon of the model’s goals extend to cover escape/take-over? (00:10:05)

7. 2.3.1.2.5 Will the model’s values get enough power after escape/takeover? (00:11:56)

8. 2.3.1.2.6 How much does the model stand to gain from not training-gaming? (00:13:23)

9. 2.3.1.2.7 How “ambitious” is the model? (00:16:38)

10. 2.3.1.3 Overall assessment of the classic goal-guarding story (00:21:43)

56 episodes