DoK Talks #138 - Build your own social media analytics with Apache Kafka // Jakub Scholz

Content provided by Bart Farrell, discovered by Player FM and our community. Copyright is owned by the publisher, not Player FM, and audio is streamed directly from their servers. Press the Subscribe button to receive updates in Player FM, or paste the feed URL into other podcast apps.

Apache Kafka is more than just a messaging broker. It has a rich ecosystem of different components. There are connectors for importing and exporting data, different stream processing libraries, schema registries and a lot more.

The first part of this talk explains the Apache Kafka ecosystem and how its components can be used to load data from social networks and analyze it with stream processing and machine learning.

The second part shows a demo running on Kubernetes that uses Kafka Connect to load data from Twitter and the Kafka Streams API to analyze it.
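To give a feel for the kind of analysis such a demo might perform, here is a minimal sketch of a running hashtag count over a stream of tweet texts. It is plain Python with no Kafka dependency and is not taken from the talk: the function names `hashtag_counts` and `latest`, the hashtag pattern, and the sample tweets are all assumptions made for illustration. In a real pipeline the tweets would arrive through a Kafka topic fed by Kafka Connect, and the aggregation would be expressed with the Kafka Streams API.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical pattern: a hashtag is '#' followed by word characters.
HASHTAG = re.compile(r"#(\w+)")


def hashtag_counts(tweets: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (hashtag, running_count) every time a hashtag is seen.

    Each yielded pair is an update to a per-key running total, similar in
    spirit to the changelog a stream processor emits for a grouped count.
    """
    counts: Counter = Counter()
    for text in tweets:
        for tag in HASHTAG.findall(text.lower()):
            counts[tag] += 1
            yield tag, counts[tag]


def latest(updates: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Collapse a stream of updates into the latest value per key."""
    table: Dict[str, int] = {}
    for key, value in updates:
        table[key] = value
    return table


if __name__ == "__main__":
    # Made-up sample data, standing in for tweets read from a topic.
    sample = [
        "Running #Kafka on #Kubernetes",
        "Stream processing with #kafka",
    ]
    print(latest(hashtag_counts(sample)))
```

The generator keeps state per hashtag and emits an update on every occurrence, so a consumer can either watch the updates as they happen or, as `latest` does, keep only the most recent total for each key.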

After this talk, attendees should better understand the advantages of the Apache Kafka ecosystem, with a particular focus on Kafka Connect and the Kafka Streams API. They should also be able to use these components on top of Kubernetes.


Jakub works at Red Hat as a Senior Principal Software Engineer. He has long-term experience with messaging and currently focuses mainly on Apache Kafka and its integration with Kubernetes. He is one of the maintainers of the Strimzi project, which provides tooling for running Apache Kafka on Kubernetes. Before joining Red Hat, he worked as a messaging and solution architect in the financial industry.


The key takeaway of this talk is that Apache Kafka is more than just a messaging broker. It is a platform and an ecosystem of components that can be used to solve complex tasks when dealing with events or processing data. The talk demonstrates this by loading tweets from Twitter and processing them with different parts of the Kafka ecosystem. The whole talk and its demos run on Kubernetes using the Strimzi project, so it also shows how to run all of these components on top of Kubernetes with the help of a few simple YAML files.

235 episodes