How to make your CPU as fast as a GPU - Advances in Sparsity w/ Nir Shavit
#ai #sparsity #gpu
Sparsity is awesome, but only recently has it become possible to properly handle sparse models at good performance. Neural Magic does exactly this, using a plain CPU. No specialized hardware needed, just clever algorithms for pruning and forward-propagation of neural networks. Nir Shavit and I talk about how this is possible, what it means in terms of applications, and why sparsity should play a much larger role in the Deep Learning community.
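As a concrete illustration of what "pruning" means in this conversation, here is a minimal sketch of unstructured magnitude pruning in plain NumPy. This is the textbook technique, not Neural Magic's actual algorithm; it just shows how a dense weight matrix becomes a sparse one.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until the target sparsity is reached.

    Plain unstructured magnitude pruning -- the simplest of the techniques
    discussed in the interview, shown here only for illustration.
    """
    k = int(weights.size * sparsity)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0  # keep only large-magnitude weights
    return pruned

# Example: prune a random 512x512 layer to 90% sparsity
w = np.random.randn(512, 512).astype(np.float32)
w_sparse = magnitude_prune(w, sparsity=0.9)
print(f"sparsity: {np.mean(w_sparse == 0):.2%}")  # ~90.00%
```

The point of sparse inference engines is that, once most weights are zero, a clever CPU kernel can skip the zeroed multiplications entirely instead of computing them.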
Sponsor: AssemblyAI
Link: https://www.assemblyai.com/?utm_sourc...
Check out Neural Magic: https://neuralmagic.com/
and DeepSparse: https://github.com/neuralmagic/deepsp...
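For a flavor of what this looks like in practice, below is a minimal sketch of CPU inference with DeepSparse's Pipeline API. The task name and the SparseZoo model stub are illustrative assumptions, not verified values, so check the repository above for current usage.

```python
# Sketch of running a sparse model on a plain CPU with DeepSparse.
# The task and model stub below are hypothetical placeholders --
# consult https://github.com/neuralmagic/deepsparse for real stubs.
from deepsparse import Pipeline

pipeline = Pipeline.create(
    task="text-classification",
    model_path="zoo:nlp/sentiment_analysis/...",  # a pruned model from SparseZoo
)

print(pipeline("This sparse model runs surprisingly fast on a plain CPU."))
```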
OUTLINE:
0:00 Introduction
1:08 Sponsor: AssemblyAI
2:50 Start of Interview
4:15 How Nir's company, Neural Magic, was founded
5:10 What is Sparsity about?
9:30 Link between the human brain and sparsity
12:10 Where should the extra resources that the human brain doesn't have go?
14:40 Analogy for Sparse Architecture
16:48 Possible future for Sparse Architecture as the standard architecture for Neural Networks
20:08 Pruning & Sparsification
22:57 What keeps us from building sparse models?
25:34 Why are GPUs so unsuited for sparse models?
28:47 CPU and GPU in connection with memory
30:14 What does Neural Magic do?
32:54 How do you deal with overlaps in tensor columns?
33:41 The best type of sparsity to execute on a CPU
37:24 What kind of architecture would make the best use out of a combined system of CPUs and GPUs?
41:04 Graph Neural Networks in connection with sparsity
43:04 Intrinsic connection between the Sparsification of Neural Networks, Non Layer-Wise Computation, Blockchain Technology, Smart Contracts and Distributed Computing
45:23 Neural Magic's target audience
48:16 Is there a type of model where it works particularly well, and a type where it doesn't?
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannick...
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n