Apache Kafka Syllabus

Introduction to Apache Kafka

Apache Kafka is a distributed streaming platform that enables real-time data feeds and stream processing. This module introduces Apache Kafka, covering its core architecture, features, and use cases in data streaming and integration.

Setting Up Apache Kafka

Learn how to install and configure Apache Kafka. This section covers the setup process, including Kafka brokers, ZooKeeper configuration, and cluster setup. Explore how to manage and monitor Kafka clusters for optimal performance.

Kafka Architecture and Components

Discover the architecture of Apache Kafka, including its key components such as producers, consumers, topics, partitions, and brokers. Learn about Kafka’s messaging model and how data is handled within a Kafka cluster.

Producing and Consuming Messages

Gain insights into producing and consuming messages in Apache Kafka. Learn how to create and manage Kafka producers and consumers, handle message serialization and deserialization, and work with Kafka topics and partitions.
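
As a small illustration of the serialization and deserialization step mentioned above, the sketch below converts a record to bytes on the producer side and back on the consumer side. It uses only the Python standard library and does not talk to a broker; the function names `serialize` and `deserialize` and the sample record are assumptions made for this example, not part of any Kafka client API.

```python
import json


def serialize(record: dict) -> bytes:
    """Producer side: turn a record into the bytes that would be sent to a topic."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def deserialize(payload: bytes) -> dict:
    """Consumer side: turn the received bytes back into a record."""
    return json.loads(payload.decode("utf-8"))


# Made-up sample record for illustration only.
order = {"order_id": 42, "status": "shipped"}
payload = serialize(order)       # bytes, as a broker would store them
restored = deserialize(payload)  # equal to the original record
```

In a real client the same pair of functions would typically be supplied to the producer and consumer as their value serializer and deserializer.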

Managing Kafka Topics and Partitions

Understand how to manage Kafka topics and partitions. Learn about topic creation, configuration, and partition management. Explore how to balance data across partitions and optimize topic settings for performance.
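
The idea of balancing data across partitions can be sketched with a toy key-to-partition mapping: hash the message key and take the result modulo the partition count, so the same key always lands on the same partition. The example below is a simplified model, not Kafka's own partitioner; `zlib.crc32` is a stand-in hash chosen for this sketch, and `choose_partition` and the key names are hypothetical.

```python
import zlib
from collections import Counter


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition index (crc32 is a stand-in hash for this sketch)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 3
keys = [f"user-{i}" for i in range(1000)]  # made-up message keys

# Count how many keys fall on each partition to inspect the spread.
load = Counter(choose_partition(k, NUM_PARTITIONS) for k in keys)
```

Inspecting `load` shows how evenly the keys are spread; a heavily skewed count would suggest a poorly chosen key.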

Kafka Streams and Processing

Learn about Kafka Streams, Kafka’s stream processing library. Explore how to build real-time data processing applications using Kafka Streams. Understand stream processing concepts, such as stateful and stateless operations, windowing, and aggregations.
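
The difference between stateless and stateful operations, and the idea of windowing, can be previewed without Kafka at all. The sketch below is plain Python, not the Kafka Streams API: `split_words` is a stateless step, `windowed_counts` keeps state per fixed 10-second window, and the events and names are assumptions made for illustration.

```python
from collections import defaultdict

# (timestamp_seconds, text) events; the data is made up for illustration.
events = [
    (1, "kafka streams"),
    (4, "kafka connect"),
    (12, "kafka"),
]

WINDOW_SECONDS = 10


def split_words(event):
    """Stateless: the output depends only on the current event."""
    ts, text = event
    return [(ts, word) for word in text.split()]


def windowed_counts(events, window_seconds):
    """Stateful: keeps a running count per (window start, word)."""
    counts = defaultdict(int)
    for event in events:
        for ts, word in split_words(event):
            # Fixed, non-overlapping windows: 0-9 s, 10-19 s, and so on.
            window_start = (ts // window_seconds) * window_seconds
            counts[(window_start, word)] += 1
    return dict(counts)


result = windowed_counts(events, WINDOW_SECONDS)
# result[(0, "kafka")] == 2 and result[(10, "kafka")] == 1
```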

Monitoring and Troubleshooting

Discover how to monitor and troubleshoot Apache Kafka. Learn about Kafka’s metrics, monitoring tools, and logging. Explore techniques for diagnosing and resolving common issues, ensuring data integrity, and maintaining cluster health.

Security and Access Control

Learn about security features and access control in Apache Kafka. Explore authentication mechanisms, authorization policies, and encryption. Understand how to secure Kafka communications and control access to data and resources.

Integration with Other Systems

Explore how to integrate Apache Kafka with other systems and technologies. Learn about Kafka Connect, which allows integration with various data sources and sinks. Understand how to use Kafka for data ingestion, ETL processes, and event-driven architectures.

Performance Tuning and Best Practices

Learn about performance tuning and best practices for optimizing Apache Kafka. Explore techniques for improving throughput, reducing latency, and managing resources. Understand best practices for configuring and maintaining Kafka clusters for high availability and reliability.

Apache Kafka Syllabus

1. Introduction to Big Data and Apache Kafka

  • Introduction to Big Data
  • Big Data Analytics
  • Hadoop Basics
    • HDFS
    • MapReduce
    • Hive
    • HBase
  • Need for Kafka
  • What is Kafka?
  • Kafka Features
  • Kafka Concepts
  • Kafka Architecture
  • Kafka Components
  • ZooKeeper
  • Where is Kafka Used?
  • Kafka Installation
  • Kafka Cluster
  • Types of Kafka Clusters
  • Configuring Single Node Single Broker Cluster

Hands on:

  • Kafka Installation
  • Implementing Single Node-Single Broker Cluster

2. Kafka Producer

  • Configuring Single Node Multi Broker Cluster
  • Constructing a Kafka Producer
  • Sending a Message to Kafka
  • Producing Keyed and Non-Keyed Messages
  • Sending a Message Synchronously & Asynchronously
  • Configuring Producers
  • Serializers
  • Serializing Using Apache Avro

Hands on:

  • Working with Single Node Multi Broker Cluster
  • Creating a Kafka Producer
  • Configuring a Kafka Producer
  • Sending a Message Synchronously & Asynchronously

3. Kafka Consumer

  • Consumers and Consumer Groups
  • Standalone Consumer
  • Consumer Groups and Partition Rebalance
  • Creating a Kafka Consumer
  • Subscribing to Topics
  • The Poll Loop
  • Configuring Consumers
  • Commits and Offsets
  • Rebalance Listeners
  • Consuming Records with Specific Offsets

Hands on:

  • Creating a Kafka Consumer
  • Configuring a Kafka Consumer
  • Working with Offsets

4. Kafka Internals

  • Cluster Membership
  • The Controller
  • Replication
  • Request Processing
  • Physical Storage
  • Reliability
  • Broker Configuration
  • Using Producers in a Reliable System
  • Using Consumers in a Reliable System
  • Validating System Reliability
  • Performance Tuning Kafka

Hands on:

  • Create a topic with a partition count and replication factor of 3 on a multi-broker cluster
  • Demonstrate fault tolerance by shutting down one broker and serving its partitions from another broker
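
The fault-tolerance exercise above can be pictured with a toy model before running it on a real cluster. The sketch below is a deliberate simplification and not the controller's actual election logic: it assumes a hypothetical replica assignment across brokers 0 to 2 and picks, for each partition, the first replica that sits on a live broker.

```python
def elect_leaders(assignment, live_brokers):
    """Pick, for each partition, the first replica that is on a live broker."""
    leaders = {}
    for partition, replicas in assignment.items():
        live = [broker for broker in replicas if broker in live_brokers]
        leaders[partition] = live[0] if live else None
    return leaders


# Hypothetical assignment: 3 partitions, replication factor 3, brokers 0-2.
assignment = {0: [0, 1, 2], 1: [1, 2, 0], 2: [2, 0, 1]}

before = elect_leaders(assignment, live_brokers={0, 1, 2})
after = elect_leaders(assignment, live_brokers={0, 2})  # broker 1 shut down

# before == {0: 0, 1: 1, 2: 2}
# after  == {0: 0, 1: 2, 2: 2}  (partition 1 is now served by broker 2)
```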

5. Kafka Cluster Architectures & Administering Kafka

  • Use Cases - Cross-Cluster Mirroring
  • Multi-Cluster Architectures
  • Apache Kafka’s MirrorMaker
  • Other Cross-Cluster Mirroring Solutions
  • Topic Operations
  • Consumer Groups
  • Dynamic Configuration Changes
  • Partition Management
  • Consuming and Producing
  • Unsafe Operations

Hands on:

  • Topic Operations
  • Consumer Group Operations
  • Partition Operations
  • Consumer and Producer Operations

6. Kafka Monitoring and Kafka Connect

  • Considerations When Building Data Pipelines
  • Metric Basics
  • Kafka Broker Metrics
  • Client Monitoring
  • Lag Monitoring
  • End-to-End Monitoring
  • Kafka Connect
  • When to Use Kafka Connect?
  • Kafka Connect Properties
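
For the Lag Monitoring topic listed above, the core arithmetic is simple: per partition, lag is the latest offset in the log minus the offset the consumer group has committed. The sketch below shows that calculation on made-up numbers; `consumer_lag` is a hypothetical helper, and treating a partition with no committed offset as offset 0 is an assumption of this example.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Lag per partition: latest log offset minus the group's committed offset."""
    return {
        partition: log_end_offsets[partition] - committed_offsets.get(partition, 0)
        for partition in log_end_offsets
    }


# Made-up offsets for a topic with three partitions.
log_end = {0: 120, 1: 95, 2: 300}
committed = {0: 120, 1: 90, 2: 250}

lag = consumer_lag(log_end, committed)  # {0: 0, 1: 5, 2: 50}
total_lag = sum(lag.values())           # 55
```

A lag that keeps growing over time usually means consumers are not keeping up with producers.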

Hands on:

  • Kafka Connect

7. Kafka Stream Processing

  • Stream Processing
  • Stream-Processing Concepts
  • Stream-Processing Design Patterns
  • Kafka Streams by Example
  • Kafka Streams: Architecture Overview

Hands on:

  • Kafka Streams
  • Word Count Stream Processing

8. Integration of Kafka With Hadoop, Storm, and Spark

  • Apache Hadoop Basics
  • Hadoop Configuration
  • Kafka Integration with Hadoop
  • Apache Storm Basics
  • Configuration of Storm
  • Integration of Kafka with Storm
  • Apache Spark Basics
  • Spark Configuration
  • Kafka Integration with Spark

Hands on:

  • Kafka Integration with Hadoop
  • Kafka Integration with Storm
  • Kafka Integration with Spark

9. Integration of Kafka With Flume, Cassandra, and Talend

  • Flume Basics
  • Integration of Kafka with Flume
  • Cassandra Basics such as KeySpace and Table Creation
  • Integration of Kafka with Cassandra
  • Talend Basics
  • Integration of Kafka with Talend

Hands on:

  • Kafka Demo with Flume
  • Kafka Demo with Cassandra
  • Kafka Demo with Talend

10. Kafka Administration

  • Setting Up and Configuring a Multi-Node, Multi-Broker Cluster with Multi-Node ZooKeeper
  • Configuring Apache Kafka Security
  • Configuring High Availability and Consistency for Apache Kafka
  • Configuring Apache Kafka for Performance and Resource Management
  • Viewing Apache Kafka Metrics
  • Working with Apache Kafka Logs

11. Course Deliverables

  • Workshop style coaching
  • Interactive approach
  • Course material
  • Hands-on practice exercises for each topic
  • Quiz at the end of each major topic
  • Tips and techniques on Confluent Certified Developer for Apache Kafka (CCDAK)

Training

Basic Level Training

Duration : 1 Month

Advanced Level Training

Duration : 1 Month

Project Level Training

Duration : 1 Month

Total Training Period

Duration : 3 Months

Course Mode :

Available Online / Offline

Course Fees :

Please contact the office for details

Placement Benefit Services

Provide 100% job-oriented training
Develop multiple skill sets
Assist in project completion
Build ATS-friendly resumes
Add relevant experience to profiles
Build and enhance online profiles
Supply manpower to consultants
Supply manpower to companies
Prepare candidates for interviews
Add candidates to job groups
Send candidates to interviews
Provide job references
Assign candidates to contract jobs
Select candidates for internal projects

Note

100% Job Assurance Only
Daily online batches for employees
New course batches start every Monday