
Content provided by Real Python. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Real Python or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://vi.player.fm/legal.

Leveraging Documents and Data to Create a Custom LLM Chatbot

1:08:12
 

How do you customize an LLM chatbot to address a collection of documents and data? What tools and techniques can you use to build embeddings into a vector database? This week on the show, Calvin Hendryx-Parker is back to discuss developing an AI-powered, Large Language Model-driven chat interface.

Calvin is the co-founder and CTO of Six Feet Up, a Python and AI consultancy. He shares a recent project for a family-owned seed company that wanted to build a tool for customers to access years of farm research. These documents were stored as brochure-style PDFs and spanned 50 years.

We discuss several of the tools used to augment an LLM. Calvin covers working with LangChain and vectorizing data with ChromaDB. We talk about the obstacles and limitations of capturing documentation.
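
For listeners new to the idea, here is a minimal sketch of the retrieval step behind this kind of chatbot: turn documents into vectors, find the ones closest to a question, and place them in the prompt. It is not the code from the episode and does not use LangChain or ChromaDB. The word-count "embedding", the function names, and the sample documents are all assumptions made for illustration; a real system would use a learned embedding model and a vector database.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Assemble a prompt that grounds the model in the retrieved text."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical snippets standing in for research PDFs or conference pages.
DOCS = [
    "Corn hybrid trials showed higher yield with early planting.",
    "Soybean seed treatment reduced disease in wet fields.",
    "The conference keynote starts at nine in the morning.",
]

if __name__ == "__main__":
    print(build_prompt("When does the keynote start?", DOCS))
```

The prompt built this way would then be sent to an LLM. Swapping the toy `embed` function for a real embedding model and the sorted list for a vector store is where tools like the ones discussed in the episode come in.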

Calvin also shares a smaller project that you can try out yourself. It takes the information from a conference website and creates a chatbot using Django and Python prompt-toolkit.

This episode is sponsored by Mailtrap.

Course Spotlight: Command Line Interfaces in Python

Command line arguments are the key to converting your programs into useful and enticing tools that are ready to be used in the terminal of your operating system. In this course, you’ll learn their origins, standards, and basics, and how to implement them in your program.

Topics:

  • 00:00:00 – Introduction
  • 00:02:21 – Background on the project
  • 00:03:51 – Complexity of adding documents
  • 00:09:01 – Retrieval-augmented generation and providing links
  • 00:13:46 – Updating information and larger conversation context
  • 00:18:08 – Sponsor: Mailtrap
  • 00:18:43 – Working with context
  • 00:21:02 – Temperature adjustment
  • 00:22:07 – Rally Conference Chatbot Project
  • 00:26:20 – Vectorization using ChromaDB
  • 00:32:49 – Employing Python prompt-toolkit
  • 00:35:07 – Learning libraries on the fly
  • 00:37:38 – Video Course Spotlight
  • 00:39:00 – Problems with tables in documents
  • 00:42:30 – Everything looks like a chat box
  • 00:44:26 – Finding the right fit for a client and customer
  • 00:49:05 – What are questions you ask a new client now?
  • 00:51:54 – Canada Air anecdote
  • 00:56:20 – How do you stay up to date on these topics?
  • 01:01:03 – What are you excited about in the world of Python?
  • 01:03:22 – What do you want to learn next?
  • 01:04:58 – How can people follow your work online?
  • 01:05:31 – IndyPy
  • 01:07:13 – Thanks and goodbye

Show Links:

Level up your Python skills with our expert-led courses:

Support the podcast & join our community of Pythonistas


