Through in-depth theoretical explanations of Kafka Connect and detailed hands-on training you can apply immediately on the job, this course will help you grow into an expert at building the Kafka Connect-based data integrations and data pipelines the field needs.
What you will learn!
Core Mechanisms of Kafka Connect's Key Components
Understanding CDC (Change Data Capture) and practical application techniques
Understanding MySQL data replication and CDC, and how to apply them in practice
Core Mechanisms and Features of the Debezium CDC Source Connector
Inter-RDBMS Data Integration Using Debezium CDC Source Connector
Know-how for building a Debezium Connect-based integration system
Setting up and running JDBC-based source and sink connector environments
Application of various SMT classes for message conversion
Managing connectors using the Connect REST API
Utilizing the Schema Registry and integrating it with Connect
Managing the Schema Registry using its REST API
Kafka Connect in Practice,
From principles to real-world application, with confidence!
Kafka Connect lets you build real-time data integrations between diverse systems easily, quickly, and reliably through pre-built connectors, without writing custom integration code.
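For illustration, registering a pre-built connector is just a REST call with a JSON configuration. The sketch below uses the FileStreamSource connector that ships with Kafka; the worker address, file path, and topic name are placeholder assumptions, not values from the course:

```bash
# Register a simple file source connector via the Connect REST API.
# localhost:8083 is the default Connect REST port; the file path and
# topic name are illustrative placeholders.
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "file-source-demo",
    "config": {
      "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
      "tasks.max": "1",
      "file": "/tmp/demo.txt",
      "topic": "demo-topic"
    }
  }'
```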
Many companies overseas have already adopted Kafka Connect, and in Korea, as it is used for integration between heterogeneous data systems and for building enterprise data pipelines, demand for engineers with practical Kafka and Kafka Connect skills keeps growing. Unfortunately, learning materials for Kafka Connect are still scarce: the books, materials, and lectures available cover only basic, superficial information, which makes it hard to develop the level of practical skill the field demands.
Detailed mechanism explanations
Practical-level, varied examples
Issue resolution covered as well
This course covers Kafka Connect with a depth and practicality you will not find in any other course or book. Through detailed explanations of the mechanisms of Kafka Connect's core components and numerous hands-on examples covering various data integrations and operational management with Connect, we will help you grow into the Kafka Connect expert the field needs.
Most companies' critical data systems are RDBMSs, and real-time integration of physically separated databases through CDC (Change Data Capture) is the clear trend. CDC is an excellent integration technique that can move large volumes of data in real time, without delay, while minimizing the load on the source system. The Debezium connector is the most widely used CDC solution for integrating data between different RDBMSs through Kafka Connect.
Many companies are looking for engineers who can handle CDC-based integration. In this lecture, through detailed theory and hands-on practice, we explain the mechanisms of CDC and the Debezium connector, environment setup and application methods, and the various issues that can arise when applying Debezium in the field, along with their solutions.
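To give a feel for the hands-on sections, the sketch below registers a minimal Debezium MySQL source connector. All hostnames, credentials, and database/table names are placeholder assumptions, and the property names follow the Debezium 1.x series (they differ slightly in 2.x):

```bash
# Minimal Debezium MySQL source connector (Debezium 1.x property names).
# Connection details and include lists are illustrative placeholders.
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "mysql-cdc-source",
    "config": {
      "connector.class": "io.debezium.connector.mysql.MySqlConnector",
      "tasks.max": "1",
      "database.hostname": "localhost",
      "database.port": "3306",
      "database.user": "cdc_user",
      "database.password": "cdc_pass",
      "database.server.id": "10001",
      "database.server.name": "mysql01",
      "database.include.list": "inventory",
      "table.include.list": "inventory.customers",
      "database.history.kafka.bootstrap.servers": "localhost:9092",
      "database.history.kafka.topic": "schema-changes.inventory"
    }
  }'
```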
We provide detailed explanations and hands-on training on the core building blocks of Kafka Connect, including the Connect cluster, connectors, SMTs (Single Message Transforms), and converters, to a level where you can use them freely.
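As a small example of how an SMT fits in, transforms are declared directly in a connector's configuration. The fragment below (with an assumed transform alias and topic prefix) uses Kafka's built-in RegexRouter to strip a server prefix from topic names:

```json
{
  "transforms": "rename",
  "transforms.rename.type": "org.apache.kafka.connect.transforms.RegexRouter",
  "transforms.rename.regex": "mysql01\\.(.*)",
  "transforms.rename.replacement": "$1"
}
```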
Through connectors applicable to production RDBMS environments, such as the SpoolDir source, JDBC source/sink, and Debezium source connectors, we cover internal mechanisms and a wide range of hands-on applications, helping you build a practical Kafka-based data integration system.
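As one sketch of the file-based exercises, a SpoolDir CSV source connector configuration looks roughly like the following; the paths, topic, and option values are assumptions, and the exact property names come from the community spooldir plugin and may vary by version:

```json
{
  "name": "spooldir-csv-source",
  "config": {
    "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector",
    "tasks.max": "1",
    "topic": "spooldir-orders",
    "input.path": "/home/user/spool",
    "finished.path": "/home/user/spool/finished",
    "error.path": "/home/user/spool/error",
    "input.file.pattern": ".*\\.csv",
    "csv.first.row.as.header": "true",
    "schema.generation.enabled": "true"
  }
}
```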
The Debezium CDC source connector is covered in depth. We give you a detailed guide to building real-time integration between physically separate RDBMSs in a production environment by combining Debezium CDC with the JDBC sink connector.
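The sink side of such a pipeline can be sketched as follows: a JDBC sink connector consumes the Debezium topic, and Debezium's ExtractNewRecordState SMT flattens the change-event envelope so rows can be upserted into the target PostgreSQL database. Connection details and topic/table names are placeholder assumptions:

```bash
# JDBC sink consuming a Debezium CDC topic into PostgreSQL.
# The unwrap SMT flattens Debezium's change-event envelope.
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "postgres-jdbc-sink",
    "config": {
      "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
      "tasks.max": "1",
      "topics": "mysql01.inventory.customers",
      "connection.url": "jdbc:postgresql://localhost:5432/targetdb",
      "connection.user": "pg_user",
      "connection.password": "pg_pass",
      "insert.mode": "upsert",
      "pk.mode": "record_key",
      "delete.enabled": "true",
      "auto.create": "true",
      "transforms": "unwrap",
      "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
      "transforms.unwrap.drop.tombstones": "false"
    }
  }'
```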
We explain in detail how schema data is transmitted and centrally managed through Avro and the Schema Registry together with Connect, with particular attention to schema compatibility, which matters greatly in practice. Through this, you will learn how to build the efficient enterprise data integrations and data pipelines required in the field by combining Connect with the Schema Registry.
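Wiring Connect to the Schema Registry comes down to converter settings in the worker or connector configuration. A typical fragment looks like the following sketch, assuming a local registry at the default port:

```json
{
  "key.converter": "io.confluent.connect.avro.AvroConverter",
  "key.converter.schema.registry.url": "http://localhost:8081",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.schema.registry.url": "http://localhost:8081"
}
```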
You will learn how to create, modify, delete, and manage the key elements of Connect and the Schema Registry through their REST APIs.
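For reference, these are a few of the standard REST calls involved; ports 8083 (Connect) and 8081 (Schema Registry) are the usual defaults, and the connector name is a placeholder:

```bash
# Connect REST API: list, inspect, restart, and delete connectors.
curl http://localhost:8083/connectors
curl http://localhost:8083/connectors/mysql-cdc-source/status
curl -X POST http://localhost:8083/connectors/mysql-cdc-source/restart
curl -X DELETE http://localhost:8083/connectors/mysql-cdc-source

# Schema Registry REST API: list subjects and view global compatibility.
curl http://localhost:8081/subjects
curl http://localhost:8081/config
```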
We provide our students with a course textbook of about 200 pages, which we hope will support your study of Kafka Connect.
We use Ubuntu Linux 20.04 on an Oracle VirtualBox VM as the Kafka server OS. Although the server runs Linux, it runs inside a virtual machine, so the environment can be set up on both Windows and macOS.
VirtualBox can be installed on almost all Windows/macOS environments. However, VirtualBox does not run on the newer Apple Silicon (M1) Macs, so on those machines you must install Ubuntu using a virtualization tool such as UTM. If you have an M1 Mac, please make sure you can install Ubuntu in such a virtual environment before enrolling.
For Kafka, we use Confluent Kafka Community Edition 7.1.2 rather than Apache Kafka.
Confluent is a company founded by the original creators of Kafka, and it provides an enterprise Kafka distribution with additional performance and convenience features for corporate customers. It is 100% compatible with Apache Kafka while offering a wider range of Kafka modules and integrated binaries. With Confluent you can use Kafka, a powerful distributed system, in a more elastic and scalable form, reducing the burden of infrastructure construction and maintenance and helping you develop faster.
Although file-based integration, such as the SpoolDir source connector, is also covered in hands-on exercises, most of the Connect exercises center on data integration between RDBMSs.
In particular, many exercises use the same MySQL DB as both source and sink, while others use MySQL as the source and PostgreSQL as the sink. The versions used in the exercises are MySQL 8.0.31 and PostgreSQL 12.
A full lab environment setup may require a PC with 20-30 GB of free storage and 4 GB or more of RAM.
Q. Why should I learn Kafka Connect?
Kafka Connect is a core component for Kafka-based data integration. Many companies that have already adopted Kafka use Kafka Connect to build large-scale data pipelines easily and efficiently.
Kafka Connect is used to interconnect heterogeneous data systems, including major RDBMSs such as Oracle, MySQL, and PostgreSQL; NoSQL stores such as MongoDB and Elasticsearch; and DW systems such as Redshift, Snowflake, Vertica, and Teradata, through more than 120 different connectors.
Heterogeneous data systems can be easily connected and integrated through Kafka Connect without writing custom code, and its adoption keeps growing in many companies thanks to advantages such as lower integration software costs under the Community license and CDC-based real-time integration of large data volumes without delay.
If you master Kafka Connect through this course, you will take a real step toward becoming the kind of Kafka expert companies are looking for.
Q. Do I need to take the previous lecture, Kafka Complete Guide - Core?
Taking the previous lecture, Kafka Complete Guide - Core, would be helpful, but it is not required. If you have a solid grasp of Kafka basics such as the broker, producer, and consumer, and have hands-on experience sending and reading Kafka messages, you can take this course without difficulty.
Q. Do I need to have RDBMS experience to take this course?
Yes. This course requires at least three months of RDBMS experience.
You can complete most of the lecture exercises with just the basics of creating and altering RDBMS tables and columns. However, without some hands-on RDBMS experience you may find the exercises difficult, even though the lecture explains CDC and RDBMS replication in detail.
Who is this course right for?
Anyone who wants to understand the internal mechanisms of Kafka Connect and apply them in practice
Data engineers or architects who want to build a data pipeline and understand CDC-based data architecture
DBAs or system administrators who need to operate the JDBC or Debezium CDC connectors
DW developers considering ETL and DB integration through real-time synchronization of operational databases
Developers and architects considering CDC-based data integration when designing microservice-based architectures
What do you need to know before starting?
Basic knowledge of Kafka Broker, Producer, and Consumer
More than 3 months of RDBMS development or operation experience
Students: 23,090 ∙ Reviews: 1,060 ∙ Rating: 4.9 ∙ Courses: 12
(Former) Encore Consulting
(Former) Oracle Korea
Freelance AI consultant
Author of Python Machine Learning Complete Guide
All 147 lectures (24hr 35min) are provided.