Zeta-Alpha-E5-Mistral: Finetuning LLMs for Retrieval (with Arthur Câmara)
In the 30th episode of Neural Search Talks, we have our very own Arthur Câmara, Senior Research Engineer at Zeta Alpha, presenting a 20-minute guide on how we fine-tune Large Language Models for effective text retrieval. Arthur discusses the common issues with embedding models in a general-purpose RAG pipeline, how to tackle the lack of retrieval-oriented data for fine-tuning with InPars, and how we adapted E5-Mistral to rank in the top 10 on the BEIR benchmark.
## Sources
InPars
- https://github.com/zetaalphavector/InPars
- https://dl.acm.org/doi/10.1145/3477495.3531863
- https://arxiv.org/abs/2301.01820
- https://arxiv.org/abs/2307.04601
Zeta-Alpha-E5-Mistral
- https://zeta-alpha.com/post/fine-tuning-an-llm-for-state-of-the-art-retrieval-zeta-alpha-s-top-10-submission-to-the-the-mteb-be
- https://huggingface.co/zeta-alpha-ai/Zeta-Alpha-E5-Mistral
NanoBEIR
21 episodes
All episodes
- AGI vs ASI: The future of AI-supported decision making with Louis Rosenberg (54:42)
- EXAONE 3.0: An Expert AI for Everyone (with Hyeongu Yun) (24:57)
- Zeta-Alpha-E5-Mistral: Finetuning LLMs for Retrieval (with Arthur Câmara) (19:35)
- ColPali: Document Retrieval with Vision-Language Models only (with Manuel Faysse) (34:48)
- Using LLMs in Information Retrieval (w/ Ronak Pradeep) (22:15)
- Designing Reliable AI Systems with DSPy (w/ Omar Khattab) (59:57)
- The Power of Noise (w/ Florin Cuconasu) (11:45)
- Benchmarking IR Models (w/ Nandan Thakur) (21:55)
- Baking the Future of Information Retrieval Models (27:05)
- Hacking JIT Assembly to Build Exascale AI Infrastructure (38:04)
- The Promise of Language Models for Search: Generative Information Retrieval (1:07:31)
- Task-aware Retrieval with Instructions (1:11:13)
- Generating Training Data with Large Language Models w/ Special Guest Marzieh Fadaee (1:16:14)
- ColBERT + ColBERTv2: late interaction at a reasonable inference cost (57:30)
- Evaluating Extrapolation Performance of Dense Retrieval: How does DR compare to cross encoders when it comes to generalization? (58:30)