About Kafka
Apache Kafka is an open-source stream-processing platform, written in Scala and Java, originally developed at LinkedIn and later donated to the Apache Software Foundation.
Kafka® is used for building real-time data pipelines and streaming applications. It is horizontally scalable, fault-tolerant, extremely fast, and runs in production at thousands of companies.

Course Contents
The course covers the following topics:
- Understanding the principles of messaging systems
- Understanding messaging systems
- Peeking into a point-to-point messaging system
- Publish-subscribe messaging system
- Advanced Message Queuing Protocol (AMQP)
- Using messaging systems in big data streaming applications
- Kafka origins
- Kafka's architecture
- Message topics
- Message partitions
- Replication and replicated logs
- Message producers
- Message consumers
- Role of Zookeeper
- Kafka producer internals
- Kafka Producer APIs
- Producer object and ProducerRecord object
- Custom partitioners
- Additional producer configuration
- Introduction
- Use Cases
- Architecture
- Components of Kafka - Broker, Producer, Consumer, Topic, Partition
- Ecosystem
- Kafka vs Flume
- First Things First
- Installing a Kafka Broker
- Broker Configuration
- General Broker
- Topic Defaults
- num.partitions
- log.retention.ms
- log.retention.bytes
- log.segment.bytes
- log.segment.ms
- message.max.bytes
- Hardware Selection
- Kafka in the Cloud
- Kafka Clusters
- How Many Brokers
- Broker Configuration
- Operating System Tuning
- Virtual Memory
- Disk
- Networking
- Production Concerns
- Garbage Collector Options
- Datacenter Layout
- Colocating Applications on Zookeeper
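The topic defaults listed above map directly to entries in the broker's `server.properties` file. A minimal illustrative fragment (values are examples for discussion, not recommendations):

```properties
# Illustrative broker topic defaults; tune for your workload
num.partitions=6
log.retention.ms=604800000        # retain messages for 7 days
log.retention.bytes=-1            # no size-based retention limit per partition
log.segment.bytes=1073741824      # roll a new log segment at 1 GiB
message.max.bytes=1000000         # cap on the size of a single message (~1 MB)
```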
- Getting Started With Clients
- Zookeeper
- Single-node Kafka
- Hands-On - Setting Up
- Multi-node Kafka
- Hands-On - Multi Node Setup
- Console Producer & Console Consumer
- Hands-On - Producer & Consumer
- High Availability & Performance
- Producer overview
- Constructing a Kafka Producer
- Sending a Message to Kafka
- Serializers
- Custom Serializers
- Serializing using Apache Avro
- Using Avro records with Kafka
- Partitions
- Configuring Producers
- acks
- buffer.memory
- compression.type
- retries
- batch.size
- linger.ms
- client.id
- max.in.flight.requests.per.connection
- timeout.ms and metadata.fetch.timeout.ms
- Old Producer APIs
- Performance tuning
- Serialization
- Message Delivery Semantics
- Replication
- Log Compaction
- Quotas
- Hands-On
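As a rough sketch of how the producer configuration options above fit together, the snippet below builds a `java.util.Properties` object using the standard Kafka producer config keys. It only constructs the configuration (no broker required); in a real application you would pass it to `new KafkaProducer<>(props)`. The broker address and tuning values are illustrative assumptions, not recommendations.

```java
import java.util.Properties;

public class ProducerConfigSketch {
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");                 // wait for all in-sync replicas
        props.put("retries", "3");                // retry transient send failures
        props.put("linger.ms", "5");              // batch up to 5 ms before sending
        props.put("batch.size", "16384");         // per-partition batch buffer in bytes
        props.put("compression.type", "snappy");  // compress batches on the wire
        props.put("max.in.flight.requests.per.connection", "1"); // keep ordering with retries
        return props;
    }

    public static void main(String[] args) {
        // Print one setting to show the config was assembled
        System.out.println(producerProps().getProperty("acks"));
    }
}
```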
- KafkaConsumer Concepts
- Consumers and Consumer Groups
- Consumer Groups and Partition Rebalance
- Creating a Kafka Consumer
- Subscribing to Topics
- The Poll Loop
- Commits and Offsets
- Automatic Commit
- Commit Current Offset
- Asynchronous Commit
- Combining Synchronous and Asynchronous commits
- Commit Specified Offset
- Rebalance Listeners
- Seek and Exactly Once Processing
- But How Do We Exit?
- Deserializers
- Configuring Consumers
- fetch.min.bytes
- fetch.max.wait.ms
- max.partition.fetch.bytes
- session.timeout.ms
- auto.offset.reset
- enable.auto.commit
- partition.assignment.strategy
- client.id
- Standalone Consumer - Why and How to Use a Consumer Without a Group
- Older consumer APIs
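The consumer configuration options above can likewise be sketched as a `java.util.Properties` object using standard Kafka consumer config keys. As with the producer sketch, this only assembles the configuration; a real application would pass it to `new KafkaConsumer<>(props)`. The broker address, group name, and tuning values are illustrative assumptions.

```java
import java.util.Properties;

public class ConsumerConfigSketch {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("group.id", "demo-group");              // hypothetical consumer group
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "false");   // commit offsets manually instead
        props.put("auto.offset.reset", "earliest"); // start from the beginning if no committed offset
        props.put("fetch.min.bytes", "1024");       // wait for at least 1 KB per fetch
        props.put("session.timeout.ms", "10000");   // broker evicts consumer after 10 s silence
        return props;
    }

    public static void main(String[] args) {
        // Print one setting to show the config was assembled
        System.out.println(consumerProps().getProperty("group.id"));
    }
}
```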
- Cluster Membership
- Replication
- Request Processing
- Produce Requests
- Fetch Requests
- Other Requests
- Physical Storage
- Partition Allocation
- File Management
- File Format
- Indexes
- Compaction
- How Compaction Works
- Deleted Events
- When Are Topics Compacted
- Broker Configs
- Hands-On
- Producer Configs
- Consumer Configs
- Consumer groups
- Hands-On
- API Design
- Producer and Consumer APIs (Java)
- Hands-On Producer & Consumer API
- Message format
- Log
- Hands-On
- Managing Topics
- Decommissioning nodes
- Data mirroring
- Data centers and Racks
- Monitoring
- Security
- Authorization and ACL
- REST API
- Hands-On
- Overview
- Confluent Platform vs Apache Kafka
- Kafka Streams
- Kafka Connectors
- Confluent Platform Hands-On Use Cases
- Millions of Messages per second
- How to Handle Them with Kafka?
- IoT Hands-On Use Case
- Kafka with Spark
- Hands-On
- Kafka with Flume (for Hadoop/Hbase/Hive)
- Hands-On
- IoT Realtime Streaming Data via Kafka
- Using Kafka in Big Data Applications
- Managing high volumes in Kafka
- Appropriate hardware choices
- Producer write and consumer read choices
- Kafka message delivery semantics
- At least once delivery
- At most once delivery
- Exactly once delivery
- Big data and Kafka common usage patterns
- Kafka and data governance
- Alerting and monitoring
- Useful Kafka metrics
- Producer metrics
- Broker metrics
- Consumer metrics
- An overview of securing Kafka
- Wire encryption using SSL
- Steps to enable SSL in Kafka
- Configuring SSL for Kafka Broker
- Configuring SSL for Kafka clients
- Kerberos SASL for authentication
- Steps to enable SASL/GSSAPI in Kafka
- Configuring SASL for Kafka broker
- Configuring SASL for Kafka client - producer and consumer
- Understanding ACL and authorization
- Common ACL operations
- List ACLs
- Understanding Zookeeper authentication
- Apache Ranger for authorization
- Adding Kafka Service to Ranger
- Adding policies
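Wire encryption for the broker, as covered above, is configured through `server.properties`. A minimal illustrative fragment, where all paths and passwords are placeholders to be replaced with your own keystore details:

```properties
# Illustrative SSL settings for a Kafka broker; paths and passwords are placeholders
listeners=PLAINTEXT://0.0.0.0:9092,SSL://0.0.0.0:9093
ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
ssl.truststore.password=changeit
ssl.client.auth=required            # require client certificates (mutual TLS)
security.inter.broker.protocol=SSL  # encrypt broker-to-broker traffic too
```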
- Best practices
- Latency and throughput
- Data and state persistence
- Data sources
- External data lookups
- Data formats
- Data serialization
- Level of parallelism
- Out-of-order events
- Message processing semantics
- Integrating Kafka with Streaming Applications
- Introduction to Kafka Streams
- Using Kafka in Stream processing
- Kafka Stream - lightweight Stream processing library
- Kafka Stream architecture
- Integrated framework advantages
- Understanding tables and Streams together
- Maven dependency
- Kafka Stream word count
- KTable
- Use case example of Kafka Streams
- The Confluent Platform
- Introduction
- Installing the Confluent Platform
- Using Kafka operations
- Using the Schema Registry
- Using the Kafka REST Proxy
- Using Kafka Connect
- Using Kafka with Confluent Platform
- Introduction to Confluent Platform
- Deep dive into the Confluent architecture
- Understanding Kafka Connect and Kafka Stream
- Kafka Streams
- Moving Kafka data to HDFS
- Gobblin architecture
- Kafka Connect
- Flume
- Apache Kafka Connect API
- Kafka JDBC Connector
- Kafka ElasticSearch Connector
- Spark Streaming with Kafka IOT Use-case Demo
Have Questions?





