Content provided by LessWrong. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by LessWrong or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://vi.player.fm/legal.
“A Rocket–Interpretability Analogy” by plex
1.
4.4% of the US federal budget went into the space race at its peak.
This was surprising to me, until a friend pointed out that landing rockets on specific parts of the moon requires very similar technology to landing rockets in Soviet cities.[1]
I wonder how much more enthusiastic the scientists working on Apollo were, with the convenient motivating story of “I’m working towards a great scientific endeavor” vs “I’m working to make sure we can kill millions if we want to”.
2.
The field of alignment seems to be increasingly dominated by interpretability (and obedience[2]).
This was surprising to me[3], until a friend pointed out that partially opening the black box of NNs is the kind of technology that would let scaling labs find new unhobblings by noticing ways in which the internals of their models are being inefficient and having better tools to evaluate capabilities advances.[4]
I [...]
---
Outline:
(00:03) 1.
(00:35) 2.
(01:20) 3.
The original text contained 6 footnotes which were omitted from this narration.
---
First published:
October 21st, 2024
Source:
https://www.lesswrong.com/posts/h4wXMXneTPDEjJ7nv/a-rocket-interpretability-analogy
---
Narrated by TYPE III AUDIO.