Non-classic stories about scheming (Section 2.3.2 of "Scheming AIs")
This is section 2.3.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?”
Text of the report here: https://arxiv.org/abs/2311.08379
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Chapters
1. Non-classic stories about scheming (Section 2.3.2 of "Scheming AIs") (00:00:00)
2. 2.3.2 Non-classic stories (00:00:36)
3. 2.3.2.1 AI coordination (00:00:55)
4. 2.3.2.2 AIs with similar values by default (00:05:57)
5. 2.3.2.3 Terminal values that happen to favor escape/takeover (00:07:51)
6. 2.3.2.4 Models with false beliefs about whether scheming is a good strategy (00:11:59)
7. 2.3.2.5 Self-deception (00:13:33)
8. 2.3.2.6 Goal-uncertainty and haziness (00:15:46)
9. 2.3.2.7 Overall assessment of the non-classic stories (00:18:19)
10. 2.4 Take-aways re: the requirements of scheming (00:20:08)
11. 2.5 Path dependence (00:20:51)