Apache Kafka Syllabus
Introduction to Apache Kafka
Apache Kafka is a distributed streaming platform that enables real-time data feeds and stream processing. This module introduces Apache Kafka, covering its core architecture, features, and use cases in data streaming and integration.
Setting Up Apache Kafka
Learn how to install and configure Apache Kafka. This section covers the setup process, including Kafka brokers, ZooKeeper configuration, and cluster setup. Explore how to manage and monitor Kafka clusters for optimal performance.
Kafka Architecture and Components
Discover the architecture of Apache Kafka, including its key components such as producers, consumers, topics, partitions, and brokers. Learn about Kafka’s messaging model and how data is handled within a Kafka cluster.
Producing and Consuming Messages
Gain insights into producing and consuming messages in Apache Kafka. Learn how to create and manage Kafka producers and consumers, handle message serialization and deserialization, and work with Kafka topics and partitions.
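As a rough illustration of the serialization and deserialization step mentioned above, the sketch below turns a record into bytes and back using JSON. The function names `serialize_value` and `deserialize_value` and the sample record are hypothetical, chosen only for illustration. It assumes JSON as the wire format, whereas the course outline also covers Apache Avro.

```python
import json


def serialize_value(record: dict) -> bytes:
    """Encode a record as UTF-8 JSON bytes (Kafka message values are byte arrays)."""
    # sort_keys gives a stable byte representation for the same record.
    return json.dumps(record, sort_keys=True).encode("utf-8")


def deserialize_value(payload: bytes) -> dict:
    """Decode UTF-8 JSON bytes back into a Python dict."""
    return json.loads(payload.decode("utf-8"))


# Hypothetical sample record, not taken from the course material.
order = {"order_id": 42, "status": "shipped"}
payload = serialize_value(order)
restored = deserialize_value(payload)

print(payload)
print(restored == order)
```

In a real application, functions like these would typically be handed to the producer and consumer clients as their value serializer and deserializer.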
Managing Kafka Topics and Partitions
Understand how to manage Kafka topics and partitions. Learn about topic creation, configuration, and partition management. Explore how to balance data across partitions and optimize topic settings for performance.
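To make the idea of balancing data across partitions more concrete, here is a minimal sketch of key-based partitioning: the key is hashed and the result is taken modulo the partition count, so the same key always lands on the same partition. The function name `choose_partition`, the sample keys, and the partition count of 6 are assumptions for illustration. The sketch uses CRC32 as a stand-in hash; Kafka's default Java partitioner uses a murmur2 hash instead.

```python
import zlib
from collections import Counter


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index using hash(key) mod partition count."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is a stand-in hash for this sketch, not Kafka's actual algorithm.
    return zlib.crc32(key) % num_partitions


# Hypothetical keys: 1000 user identifiers spread over 6 partitions.
keys = [f"user-{i}".encode("utf-8") for i in range(1000)]
distribution = Counter(choose_partition(k, 6) for k in keys)

for partition in sorted(distribution):
    print(partition, distribution[partition])
```

Printing the distribution gives a quick view of how evenly a given set of keys spreads; a skewed key set would show up as one partition receiving far more messages than the others.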
Kafka Streams and Processing
Learn about Kafka Streams, Kafka’s stream processing library. Explore how to build real-time data processing applications using Kafka Streams. Understand stream processing concepts, such as stateful and stateless operations, windowing, and aggregations.
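The windowing and aggregation concepts named above can be sketched without any Kafka dependency. The example below counts events per key in fixed-size, non-overlapping (tumbling) windows. It is plain Python rather than the Kafka Streams API (which is a Java library), and the function name `tumbling_window_counts` and the sample events are hypothetical.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_size_ms):
    """Count events per (window_start, key) for fixed, non-overlapping windows.

    events: iterable of (timestamp_ms, key) pairs.
    """
    counts = defaultdict(int)
    for timestamp_ms, key in events:
        # Align the timestamp down to the start of its window.
        window_start = timestamp_ms - (timestamp_ms % window_size_ms)
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical event stream: (timestamp in milliseconds, event type).
events = [
    (1_000, "click"),
    (4_500, "click"),
    (5_200, "view"),
    (9_999, "click"),
    (10_000, "click"),
]

result = tumbling_window_counts(events, 5_000)
for (window_start, key), count in sorted(result.items()):
    print(window_start, key, count)
```

The running `counts` dictionary is the state that makes this a stateful operation; a stateless operation, such as filtering out "view" events, would need no such bookkeeping.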
Monitoring and Troubleshooting
Discover how to monitor and troubleshoot Apache Kafka. Learn about Kafka’s metrics, monitoring tools, and logging. Explore techniques for diagnosing and resolving common issues, ensuring data integrity, and maintaining cluster health.
Security and Access Control
Learn about security features and access control in Apache Kafka. Explore authentication mechanisms, authorization policies, and encryption. Understand how to secure Kafka communications and control access to data and resources.
Integration with Other Systems
Explore how to integrate Apache Kafka with other systems and technologies. Learn about Kafka Connect, which allows integration with various data sources and sinks. Understand how to use Kafka for data ingestion, ETL processes, and event-driven architectures.
Performance Tuning and Best Practices
Learn about performance tuning and best practices for optimizing Apache Kafka. Explore techniques for improving throughput, reducing latency, and managing resources. Understand best practices for configuring and maintaining Kafka clusters for high availability and reliability.
1. Introduction to Big Data and Apache Kafka
- Introduction to Big Data
- Big Data Analytics
- Hadoop Basics
- HDFS
- MapReduce
- Hive
- HBase
- Need for Kafka
- What is Kafka?
- Kafka Features
- Kafka Concepts
- Kafka Architecture
- Kafka Components
- ZooKeeper
- Where is Kafka Used?
- Kafka Installation
- Kafka Cluster
- Types of Kafka Clusters
- Configuring Single Node Single Broker Cluster
Hands-on:
- Kafka Installation
- Implementing Single Node-Single Broker Cluster
2. Kafka Producer
- Configuring Single Node Multi Broker Cluster
- Constructing a Kafka Producer
- Sending a Message to Kafka
- Producing Keyed and Non-Keyed Messages
- Sending a Message Synchronously & Asynchronously
- Configuring Producers
- Serializers
- Serializing Using Apache Avro
Hands-on:
- Working with Single Node Multi Broker Cluster
- Creating a Kafka Producer
- Configuring a Kafka Producer
- Sending a Message Synchronously & Asynchronously
3. Kafka Consumer
- Consumers and Consumer Groups
- Standalone Consumer
- Consumer Groups and Partition Rebalance
- Creating a Kafka Consumer
- Subscribing to Topics
- The Poll Loop
- Configuring Consumers
- Commits and Offsets
- Rebalance Listeners
- Consuming Records with Specific Offsets
Hands-on:
- Creating a Kafka Consumer
- Configuring a Kafka Consumer
- Working with Offsets
4. Kafka Internals
- Cluster Membership
- The Controller
- Replication
- Request Processing
- Physical Storage
- Reliability
- Broker Configuration
- Using Producers in a Reliable System
- Using Consumers in a Reliable System
- Validating System Reliability
- Performance Tuning Kafka
Hands-on:
- Create a topic with multiple partitions and a replication factor of 3, and run it on a multi-broker cluster
- Demonstrate fault tolerance by shutting down one broker and serving its partitions from another broker
5. Kafka Cluster Architectures & Administering Kafka
- Use Cases - Cross-Cluster Mirroring
- Multi-Cluster Architectures
- Apache Kafka’s MirrorMaker
- Other Cross-Cluster Mirroring Solutions
- Topic Operations
- Consumer Groups
- Dynamic Configuration Changes
- Partition Management
- Consuming and Producing
- Unsafe Operations
Hands-on:
- Topic Operations
- Consumer Group Operations
- Partition Operations
- Consumer and Producer Operations
6. Kafka Monitoring and Kafka Connect
- Considerations When Building Data Pipelines
- Metric Basics
- Kafka Broker Metrics
- Client Monitoring
- Lag Monitoring
- End-to-End Monitoring
- Kafka Connect
- When to Use Kafka Connect?
- Kafka Connect Properties
Hands-on:
- Kafka Connect
7. Kafka Stream Processing
- Stream Processing
- Stream-Processing Concepts
- Stream-Processing Design Patterns
- Kafka Streams by Example
- Kafka Streams: Architecture Overview
Hands-on:
- Kafka Streams
- Word Count Stream Processing
8. Integration of Kafka With Hadoop, Storm, and Spark
- Apache Hadoop Basics
- Hadoop Configuration
- Kafka Integration with Hadoop
- Apache Storm Basics
- Configuration of Storm
- Integration of Kafka with Storm
- Apache Spark Basics
- Spark Configuration
- Kafka Integration with Spark
Hands-on:
- Kafka Integration with Hadoop
- Kafka Integration with Storm
- Kafka Integration with Spark
9. Integration of Kafka With Flume, Cassandra, and Talend
- Flume Basics
- Integration of Kafka with Flume
- Cassandra Basics such as KeySpace and Table Creation
- Integration of Kafka with Cassandra
- Talend Basics
- Integration of Kafka with Talend
Hands-on:
- Kafka Demo with Flume
- Kafka Demo with Cassandra
- Kafka Demo with Talend
10. Kafka Administration
- Setting Up and Configuring a Multi-Node, Multi-Broker Cluster with a Multi-Node ZooKeeper Ensemble
- Configuring Apache Kafka Security
- Configuring High Availability and Consistency for Apache Kafka
- Configuring Apache Kafka for Performance and Resource Management
- Viewing Apache Kafka Metrics
- Working with Apache Kafka Logs
11. Course Deliverables
- Workshop style coaching
- Interactive approach
- Course material
- Hands-on practice exercises for each topic
- Quiz at the end of each major topic
- Tips and techniques on Confluent Certified Developer for Apache Kafka (CCDAK)
Training
Basic Level Training
Duration: 1 Month
Advanced Level Training
Duration: 1 Month
Project Level Training
Duration: 1 Month
Total Training Period
Duration: 3 Months
Course Mode: Available Online / Offline
Course Fees: Please contact the office for details