
Content provided by Daniel Bashir. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Daniel Bashir or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://vi.player.fm/legal.

Davidad Dalrymple: Towards Provably Safe AI

1:20:50

Episode 137

I spoke with Davidad Dalrymple about:

* His perspectives on AI risk

* ARIA (the UK’s Advanced Research and Invention Agency) and its Safeguarded AI Programme

Enjoy—and let me know what you think!

Davidad is a Programme Director at ARIA. He was most recently a Research Fellow in technical AI safety at Oxford. He co-invented the top-40 cryptocurrency Filecoin, led an international neuroscience collaboration, and was a senior software engineer at Twitter and multiple startups.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (00:36) Calibration and optimism about breakthroughs

* (03:35) Calibration and AGI timelines, effects of AGI on humanity

* (07:10) Davidad’s thoughts on the Orthogonality Thesis

* (10:30) Understanding how our current direction relates to AGI and breakthroughs

* (13:33) What Davidad thinks is needed for AGI

* (17:00) Extracting knowledge

* (19:01) Cyber-physical systems and modeling frameworks

* (20:00) Continuities between Davidad’s earlier work and ARIA

* (22:56) Path dependence in technology, race dynamics

* (26:40) More on Davidad’s perspective on what might go wrong with AGI

* (28:57) Vulnerable world, interconnectedness of computers and control

* (34:52) Formal verification and world modeling, Open Agency Architecture

* (35:25) The Semantic Sufficiency Hypothesis

* (39:31) Challenges for modeling

* (43:44) The Deontic Sufficiency Hypothesis and mathematical formalization

* (49:25) Oversimplification and quantitative knowledge

* (53:42) Collective deliberation in expressing values for AI

* (55:56) ARIA’s Safeguarded AI Programme

* (59:40) Anthropic’s ASL levels

* (1:03:12) Guaranteed Safe AI —

* (1:03:38) AI risk and (in)accurate world models

* (1:09:59) Levels of safety specifications for world models and verifiers — steps to achieve high safety

* (1:12:00) Davidad’s portfolio research approach and funding at ARIA

* (1:15:46) Earlier concerns about ARIA — Davidad’s perspective

* (1:19:26) Where to find more information on ARIA and the Safeguarded AI Programme

* (1:20:44) Outro

Links:

* Davidad’s Twitter

* ARIA homepage

* Safeguarded AI Programme

* Papers

  * Guaranteed Safe AI

  * Davidad’s Open Agency Architecture for Safe Transformative AI

  * Dioptics: a Common Generalization of Open Games and Gradient-Based Learners (2019)

  * Asynchronous Logic Automata (2008)


Get full access to The Gradient at thegradientpub.substack.com/subscribe
