Transformer Memory as a Differentiable Search Index: memorizing thousands of random doc ids works!?
Andrew Yates and Sergi Castella discuss the paper "Transformer Memory as a Differentiable Search Index" by Yi Tay et al. at Google. This work proposes a new approach to document retrieval in which a transformer memorizes document ids during training (or "indexing"). For retrieval, a query is fed to the model, which then autoregressively generates relevant doc ids for that query.
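As a rough illustration of the setup described above, both indexing and retrieval can be framed as text-to-text pairs whose target is a doc id string. The sketch below only builds such pairs and trains no model. The names `Seq2SeqExample` and `build_examples`, the toy documents, and the doc ids are all hypothetical, not taken from the paper or the episode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Seq2SeqExample:
    source: str  # text fed to the model
    target: str  # doc id the model should generate, as a string


def build_examples(docs, queries):
    """Build training pairs from docs ({doc_id: text}) and queries ([(query, doc_id)])."""
    # "Indexing" pairs: the model sees document text and must emit that document's id.
    indexing = [Seq2SeqExample(text, doc_id) for doc_id, text in docs.items()]
    # "Retrieval" pairs: the model sees a query and must emit a relevant document's id.
    retrieval = [Seq2SeqExample(query, doc_id) for query, doc_id in queries]
    return indexing + retrieval


# Toy data with arbitrary (random-looking) doc ids; purely illustrative.
docs = {
    "4017": "Transformers rely on self-attention over token sequences.",
    "9203": "BM25 is a classic lexical ranking function.",
}
queries = [
    ("how does self-attention work", "4017"),
    ("lexical ranking baseline", "9203"),
]

examples = build_examples(docs, queries)
for example in examples:
    print(f"{example.source!r} -> {example.target}")
```

At retrieval time, a model trained on pairs like these would be given only the query and asked to decode a doc id token by token.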
Paper: https://arxiv.org/abs/2202.06991
Timestamps:
00:00 Intro: Transformer memory as a Differentiable Search Index (DSI)
01:15 The gist of the paper, motivation
04:20 Related work: Autoregressive Entity Linking
07:38 What is an index? Conventional vs. "differentiable"
10:20 Indexing and Retrieval definitions in the context of the DSI
12:40 Learning representations for documents
17:20 How to represent document ids: atomic, string, semantically relevant
22:00 Zero-shot vs. finetuned settings
24:10 Datasets and baselines
27:08 Finetuned results
36:40 Zero-shot results
43:50 Ablation results
47:15 Where could this model be used?
52:00 Is memory efficiency a fundamental problem of this approach?
55:14 What about semantically relevant doc ids?
60:30 Closing remarks
Contact: [email protected]